The Data Stack Show - The PRQL: What Part of the Data Stack Will Be Commoditized Next?
Episode Date: November 19, 2021
On this week's PRQL, Kostas and Eric preview their upcoming conversation with Ciaran Dynes of Matillion. ...
Transcript
Welcome to The Data Stack Show PRQL.
We started out a couple of weeks ago recording debriefs where Kostas and I would just go
back and forth talking about the show that we just recorded and some of the ideas that
the guests brought up.
We thought it would be great to actually publish these before the upcoming show to give you
a preview of what we're going to talk about. So we just recorded an episode with the head of product from Matillion,
and it was really, really interesting. Kostas, you built an ETL company,
and I'm interested in your take on ETL versus ELT. And I'll give a little teaser from the show. Our guest said that
ETL should never have existed, which was a pretty strong statement, especially considering sort of
the landscape of the market. But what's your take as the founder of an ETL company?
Yeah, yeah. It's a very interesting conversation that we had on this
topic, and I think what is really important for people to understand and
keep from this conversation is that ELT, in the end, is an architecture, right? It's
how you architect your data infrastructure, let's say.
So some things, especially in the future,
as we move closer to this data cloud platform,
or whatever we want to call it,
will become much more transparent in a way.
And you can see that especially, I mean, we keep saying that we first load the data in the data warehouse and then we do the transformations.
But when we move to a data lake architecture, things become a little bit different.
But again, it's the idea of loading the data on S3 and then running some kind of process to transform the data. And based on that, you can tell that all the things that people were doing in the past
with Databricks, doing ETL from there, were actually ELT, right?
Because the data pretty much had to be loaded on S3 and then you would write your pipeline
on Spark, execute it, and do whatever transformations you want to do there.
It makes a lot of sense.
I mean, I think this kind of evolution from ETL to ELT had to happen.
And the catalyst for that was, of course, the separation between processing and storage that happened with data warehouses.
And that's another thing that we are discussing
in this episode with Ciaran.
So yeah, it's good to hear about more companies
that have embraced it and have built products on it.
Yeah.
Okay.
So another question for you,
something that we touched on
and we didn't actually use the term commoditization,
but we talked about the sort of infinite possible scale of the modern cloud data warehouse,
lakehouse, whatever you want to call it, and how in many ways the commoditization of that
has brought about innovation, new architectures, et cetera.
What do you think will be commoditized next?
Which part of the data stack or which tooling do you think will be commoditized next?
That's an interesting question.
I would say ELT.
I would say that data pipelines in general
will become some kind of commodity.
And there is a good reason for that.
It's something that needs to be commoditized
so we can embrace the next wave of innovation.
Keep in mind that that's the value of commoditizing something.
It makes something so available
that people can go
and innovate on top of it without having to work on it themselves, right? That's why we see with
technology this recurring theme of, oh, now computing became a commodity, now,
I don't know, human interactions became a commodity, all these things. And after having
processing and storage commoditized,
I think that when we also get data movement commoditized,
we will be ready to start focusing on the next wave of innovation
when it comes to data.
And so let's see also what's going to happen with visualization,
to be honest, because we had one cycle that closed
a couple of years ago
with the acquisition of Looker and all these companies
and the acquisition of Tableau.
But I think we are about to enter a new cycle in this market.
And I think that we are going to see more and more interesting companies
and products in this space.
I agree.
I used to joke around.
I did a lot of work in a company with the CTO, and I was in marketing, and we were trying
to connect the stack and ran into all the problems there.
And we sort of did a thought experiment one time that was basically sort of every business
has somewhat unique logic, but business models tend
to fall into a couple of high-level categories. And we thought, wouldn't it be so cool if you
basically had a Terraform script that was 80% of your starting point for pipelines and
tooling that you could just run? And it's like, hey, here's your basic subscription B2B SaaS sort of architecture, and then you can go from there. And so that sort of
commoditization is interesting, because if you just assume all of that and abstract it away from
thinking about your business, it really opens up a lot of headspace to do interesting
things or think about interesting things.
Yeah, yeah, 100%.
I think the next one to two years
are going to be super, super interesting
in terms of new products
and even new categories.
So yeah, it's going to be...
I mean, I feel lucky that I'm in this space
at this point in time.
Cool.
And also there's a nice hot take on CDC
in the episode. So keep your
ears out for that and
join us in the next episode when we talk to
the head of product from Matillion. And we
will catch you on the next episode of The Data Stack Show.