The Data Stack Show - Data Debrief: Will Enterprise Build The Future of Data Tooling?

Episode Date: October 22, 2021

On this week's Data Debrief, Eric and Kostas dig more into the topic of data tooling. ...

Transcript
Starting point is 00:00:00 Welcome to the second show debrief for the Data Stack Show. Last week, we had Brian Liu from our team join us. He wasn't available this week, so it's just Kostas and I. I'm going to start with the controversial take here, Kostas, and I want your opinion on it. And this is just off the cuff. We don't prepare for this, which maybe will make it even better. But when I think about the terms, and actually let me caveat this. I think Scott is correct in many ways in that the future architecture looks very different than sort of your standard data lake, data warehouse paradigm that we see today.
Starting point is 00:00:54 But there is still, I think, a lot of ambiguity around what does it actually look like for a company to build a data mesh or a data fabric. I think at enterprise scale, they're facing a lot of problems that a lot of companies in the mid-market and especially smaller businesses don't face. And so to some extent, I don't know if I have collected enough information to really feel confident that I could say, okay, if I was going to go out and try and architect a data mesh or a data fabric, here's what I would do because it's still a little bit fuzzy. So, I mean, Scott was very helpful in explaining that. But I think a lot of companies just still live in the data lake, data warehouse paradigm,
Starting point is 00:01:31 and it's hard to break out of that. Yeah, absolutely. And I think there's a very good reason why all these new paradigms start from the enterprise space. Because by definition, the large enterprises, each one of them is like a very different use case. And even if you have a product, you probably have to customize it a lot. So when you have like this kind of vague architectures,
Starting point is 00:01:55 they are like much easier to implement in the enterprise space because you will have to customize anyway compared to productize this and put it out there for like the SMBs or the mid-market to use it. Okay. And I think that's one of the interesting things
Starting point is 00:02:11 with the data market, let's say in general, is that we see like with SaaS and with clouds, there was like this concept of we will go out there, we'll get like a traditional problem that enterprise has. We will introduce it to mid-market and SMBs. We are going to innovate really fast and build like an amazing product, gathering like feedback from these. And then we will go upmarket and sell it to the large enterprises. And that's how we are going to make money, right?
Starting point is 00:02:43 I think that this kind of like motion is going to be reversed when it comes to data, and we have like examples like that. Think about Databricks, think about Confluent and Kafka. You will see more technologies coming out from the large enterprises or like the big companies, and then they are going to be spread into the mid-market and the SMBs. And there is a good reason behind that, and that's because these companies have the need to process and work with this data today, right? And that's what we see also with many startups, right? That's why you see so many data-related products coming out from founders that they
Starting point is 00:03:32 were working at Uber before, or they were working at Netflix, like companies that had to solve problems at a scale that the market is going to face tomorrow, right? Yep. So I think that's like something very interesting that we are going to see. And don't forget that, okay, like the data market is still a very, very young market. We are still solving basic problems, setting out there like the foundations. And many of these like things will see
Starting point is 00:04:03 how they are going to be implemented like in the near future, I think. And they will become much more concrete. Do you think that machine learning is the first wave of that? Like in terms of like, and I'm thinking back on just both guests who sat on the show and then my own opinions, which, that relationship is really good when guests on the show influence my opinions. That's the right direction of influence. But we talk, I mean, there's still a lot of companies trying to figure out analytics, but we've talked with multiple guests who have said, okay, all the analytics stuff is being standardized and commoditized, right? So it's not too far in the future where that will be
Starting point is 00:04:46 templatized. And the next thing is machine learning. And then of course, with like BigQuery ML, we had someone from Continual on the show, where now a lot of companies who previously just didn't have the resources to do interesting things with ML are now having, like they have services that can automate a huge amount of that for them. I mean, even the concept of writing SQL, basic SQL and training a model makes a lot of that stuff accessible
Starting point is 00:05:13 to people who just weren't able to do that before. So do you think that that's the first iteration of like practical trickle down from the enterprise in terms of data products? I mean, or one early example? Oh yeah, absolutely. And I think that we're not even there yet with machine learning, like how many companies out there they are using like infrastructure and they have the scale of, I don't know, like Netflix when it comes to recommendation systems or Amazon, right? We are still, I think, I agree with you, the first iteration, like the first wave is
Starting point is 00:05:52 the ML products. But even there, I think we need a little bit more time until we see like a much broader adoption. And I think you can see that also like with like some signals that you can get for that. It's like, okay, let's consider one very fresh new technology out there that is very related to that problem, which is feature stores, right? We had at some point someone from Tecton here talking about feature stores. Even there, something that has been implemented already, we are not talking about an architecture like a data fabric. We are talking about
Starting point is 00:06:25 like an actual software system that you can take it and deploy it. The majority of the people don't even know what it is and why you need it. And the reason for that is obvious because they didn't have the need. And that's
Starting point is 00:06:43 like one of the challenges of like trying to be ahead of your time and time the market and be a category creator and all these things. It's hard and it takes time and you have to be patient and obviously have a lot of money to burn. Yeah. The last thing, which this is totally random, but since this is our debrief, we haven't had a good take on crypto in a while. We need to ask one of our, we were on a run there for a while where we were getting really interesting predictions about crypto. So maybe I'll try and sneak that into another. Yeah.
Starting point is 00:07:13 And actually, I think that with the data meshes and data fabrics and all this decentralization when it comes to data, I think we should also chat a little bit with people that are coming from crypto because they are the definition of decentralized architectures and obviously there's data involved there too. So I think we should, yeah, let's do it.
Starting point is 00:07:35 Let's do it. Right. Thanks for joining us for the show debrief and we'll catch you on the next episode of the Data Stack Show.
