The Data Stack Show - Data Debrief: Will Enterprise Build The Future of Data Tooling?
Episode Date: October 22, 2021
On this week's Data Debrief, Eric and Kostas dig more into the topic of data tooling. ...
Transcript
Welcome to the second show debrief for the Data Stack Show.
Last week, we had Brian Liu from our team join us.
He wasn't available this week, so it's just Kostas and I.
I'm going to start with the controversial take here, Kostas, and I want your opinion on it.
And this is just off the cuff.
We don't prepare for this, which maybe will make it even better.
But when I think about the terms, and actually let me caveat this.
I think Scott is correct in many ways in that the future architecture looks very different than sort of your standard data lake, data warehouse paradigm that we see today.
But there is still, I think, a lot of ambiguity around what does it actually look like for a company to build a data mesh or a data fabric. I think at enterprise scale, they're facing a lot
of problems that a lot of companies in the mid-market and especially
smaller businesses don't face.
And so to some extent, I don't know if I have collected enough information to really feel
confident that I could say, okay, if I was going to go out and try and architect a data
mesh or a data fabric, here's what I would do because it's still a little bit fuzzy.
So, I mean, Scott was very helpful in explaining that.
But I think a lot of companies just still live in the data lake, data warehouse paradigm,
and it's hard to break out of that.
Yeah, absolutely.
And I think there's a very good reason why all these new paradigms start from the enterprise
space.
Because by definition, the large enterprises, each one of them is like a very different use case.
And even if you have a product,
you probably have to customize it a lot.
So when you have like this kind of vague architectures,
they are like much easier to implement
in the enterprise space
because you will have to customize anyway
compared to productize this
and put it out there for like the SMBs
or the mid-market to use it.
Okay.
And I think that's one of the interesting things
with the data market, let's say in general,
is that we see like with SaaS and with clouds,
there was like this concept of we will go out there,
we'll get like a traditional problem that enterprise has.
We will introduce it to mid market and SMBs.
We are going to innovate really fast and build like an amazing product, gathering like feedback from these.
And then we will go up market and sell it to the large enterprises.
And that's how we are going to make money, right?
I think that this kind of like motion is going to be reversed when it comes to data, and we have like examples like that.
Think about Databricks, think about Confluent and Kafka.
You will see more technologies coming out from the large enterprises or like the big companies, and then they are going like to be spread into like the mid-market and the SMBs.
And there is a good reason behind that, and that's because these companies have the need to process and work with this data today, right?
And that's like what we see also with many startups, right?
That's why you see like so many data-related products coming out from founders that were working in Uber before or they were working in Netflix, like companies that had to solve problems at a scale that the market is going to face tomorrow, right?
Yep.
So I think that's like something very interesting
that we are going to see.
And don't forget that, okay,
like the data market is still a very, very young market.
We are still solving basic problems,
setting out there like the foundations.
And many of these like things will see
how they are going to be implemented like in the near future, I think.
And they will become much more concrete.
Do you think that machine learning is the first wave of that?
Like in terms of like, and I'm thinking back on just both guests who sat on the show and then my own opinions, which, that relationship is really good when guests on
the show influence my opinions. That's the right direction of influence. But we talk, I mean,
there's still a lot of companies trying to figure out analytics, but we've talked with multiple
guests who have said, okay, all the analytics stuff is being standardized and commoditized,
right? So it's not too far in the future where that will be
templatized. And the next thing is machine learning. And then of course, with like
BigQuery ML, we had someone from Continual on the show, where now a lot of companies who
previously just didn't have the resources to do interesting things with ML are now having,
like they have services that can automate
a huge amount of that for them.
I mean, even the concept of writing SQL,
basic SQL and training a model
makes a lot of that stuff accessible
to people who just weren't able to do that before.
So do you think that that's the first iteration
of like practical trickle down from the enterprise
in terms of data products?
I mean, or one early example?
Oh yeah, absolutely. And I think that we're not even there yet with machine learning,
like how many companies out there they are using like infrastructure and they have the scale of,
I don't know, like Netflix when it comes to recommendation systems or Amazon, right? We are still, I think, I agree with you, the first iteration, like the first wave is
the ML products. But even there, I think we need a little bit more time until we see like a much
broader adoption. And I think you can see that also like with like some signals that you can
get for that. It's like, okay, let's consider one very fresh new technology out there
that is very related to that problem, which is feature stores, right?
We had at some point someone from Tecton here talking about feature stores.
Even there, something that has been implemented already,
we are not talking about an architecture like a data fabric.
We are talking about like an actual software system that you can take it and deploy it.
The majority of the people don't even know what it is and why you need it.
And the reason for that is obvious because they didn't have the need.
And that's like one of the challenges of like trying to be ahead of your time and time the market and be a category creator and all these things.
It's hard and it takes time and you have to be patient and obviously have a lot of money to burn.
Yeah.
The last thing, which this is totally random, but since this is our debrief, we haven't had a good take on crypto in a while.
We need to ask one of our, we were on a run there for a while where we were getting really
interesting predictions about crypto.
So maybe I'll try and sneak that into another.
Yeah.
And actually, I think that with the data meshes and data fabrics and all this decentralization when it comes to data, I think we should also chat a little bit with people that are coming from crypto, because they are the definition of decentralized architectures, and obviously there's data involved there too.
So I think we should, yeah, let's do it.
Let's do it.
Right.
Thanks for joining us for the show debrief, and we'll catch you on the next episode of the Data Stack Show.