The Data Stack Show - The PRQL: Is It Viable to Manage Integrations Open Source?

Episode Date: January 21, 2022

Eric and Kostas preview the upcoming show featuring Douwe Maan of Meltano. ...

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show prequel. We just recorded an episode with Douwe, who is the CEO of Meltano, and it's a really interesting platform. I'm going to start off with a somewhat provocative question for you, Kostas. So in the world of data, and especially in the world of data integration, one of the problems is maintenance of integrations, right? I mean, that's something you literally work on every single day when you're not with me on the podcast, of course, which is the most important thing that you do, but it's a huge
Starting point is 00:00:40 problem, especially as the number of, you know, sort of business tools, infrastructure tools, et cetera, grows. The stack is growing in complexity. We've talked about that on the show. Do you think it's viable to manage those open source, the integrations, right? I mean, do you have this tension between sort of closed source where you have very tight control over what's going on, and you can force prioritization. Whereas if you have, you know, a couple hundred connectors,
Starting point is 00:01:12 like we talked about with Douwe, managed open source, there's inevitable neglect. So do you think it's viable? Yeah, I think it is. Actually, to be honest, I was very excited to see the birth of Singer because it looked like, let's say, a way to solve this problem of maintaining an open set of integrations at the end out there. At the beginning, it didn't go that well, mainly because of external reasons, which is like the company that invented it got acquired, and priorities changed, and all that stuff. But we see a kind of renaissance right now of Singer,
Starting point is 00:01:54 with companies like Meltano trying to create some kind of governance around it. If they succeed, I think it's going to be, I wouldn't say like unstoppable, but I think it's going to be like a very unique moat out there compared like to the companies who are trying like to maintain everything like in closed source. Right. Yeah. So now it's not easy mainly because the problem with like maintaining integrations is that it's integration. Like it's a little bit like a fit zone project or product at the end, right?
Starting point is 00:02:26 So figuring out exactly how you can govern this and maintain quality and all that stuff, I don't think it's solved as a problem yet, but I think we are getting closer to that. So yeah, I'm very super excited to see like what's happening in this space. And of course with Meltano and like all the people coming from GitLab that have a ton of experience with open source. I think that the
Starting point is 00:02:48 right people like to try and tackle this problem. So I'm super excited to see how this is going to evolve. I agree. I think, and I want your opinion on this. We're close to time here, but when we think about the data stack increasing in complexity and we hear, you know, we, you know, kind of laugh about abstract concepts like the data mesh, you know, and other ways that people are trying to sort of, you know, frame the way that we think about these, the challenge of all of these different components and changing and, you know, all that sort of stuff, the package management type paradigm for the data stack that Douwe mentioned is I think one of the most compelling answers to,
Starting point is 00:03:32 compelling tactical answers to sort of the challenge of trying to have a framework to think about the complexity as it relates to, I'm trying to actually make all this stuff work together in my day-to-day job. What do you think? Okay. I'm a bit biased mainly because I'm coming like from a, like I have like a data, sorry, an engineering background. So, and I'm also like a big, let's say proponent of like not trying to reinvent the wheel. Right. So if I have like to, let's say if I have in order to solve a problem, let's say I have two options. One is like to use a metaphor or try to invent a new word. I prefer the metaphor, right? Like for me, it's much more valuable and much more like productive to be like, okay, there are disciplines out there that are dealing with the same problems for like decades.
Starting point is 00:04:25 Why are we trying to create new paradigms instead of trying to get the paradigms that we have experienced that they are working and try to adapt them to what we are doing here? And that also helps with communication because what we forget is that when we create new products, it's not just like the software that we're building, right? We also need to build the language around it, educate the people, help the people understand like, what we are doing.
Starting point is 00:04:51 And if you're talking about like, data engineers, yeah, like, package management, like, makes much more sense. It's much easier, like, to understand what this thing is about than, like, a data mesh. Like, what is data mesh? I don't know. Like, it's a mesh of data. Yeah, like, and I don't want to say that data mesh is not something meaningful, right?
Starting point is 00:05:08 Sure, sure. It is. But what I'm trying to emphasize here is how much more difficult is it to communicate what the data mesh is compared to something like package management, right? I'm going to interpret that as you agreeing with me, which makes me feel great. I always agree with you. Always. All right, well, if you want to hear someone far more qualified to tackle these subjects, definitely keep an eye out for the next episode that we just recorded with Douwe from Meltano. And we'll also get to hear a lot about how Meltano was
Starting point is 00:05:44 born inside of GitLab, which is really fun and super interesting. So we'll catch you on the next show.
