The Data Stack Show - The PRQL: How Does Composability in Data Infrastructure Differ at Different Levels of Abstraction? Featuring Pedro Pedreira of Meta

Episode Date: November 27, 2023

In this bonus episode, Eric and Kostas preview their upcoming conversation with Pedro Pedreira of Meta. ...

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show prequel. This is a short bonus episode where we preview the upcoming show. You'll get to meet our guest and hear about the topics we're going to cover. If they're interesting to you, you can catch the full-length show when it drops on Wednesday. Kostas, this week's conversation is with Pedro from Meta. And wow, what a lot to talk about. Velox, of course, is a huge topic. And his work on that and usage of that inside of Meta,
Starting point is 00:00:40 which is fascinating. So, an execution engine that does a ton of stuff inside of Meta. But Pedro's really an expert in so many different things: databases, you know, sort of architecture of data infrastructure, so much to talk about. The thing that I'm really interested in is this concept of composable. It has been a marketing term in the data space as it relates to sort of, let's say, higher-level vendors that you would purchase to, you know, sort of handle data in an ingestion pipeline or an egress pipeline, right? But Pedro has a really interesting perspective on this concept of composable at a lower level of data infrastructure.
Starting point is 00:01:32 And the execution engine is really sort of a foundation for that. And so I think this can actually help us sort of cut through some of the marketing noise that the vendors are creating with higher-level tooling and help us understand, you know, at the infrastructure level, what does composable mean? So I think that would be an awesome subject to cover. Yeah, I think, okay, I think you put it in the right way, because I think the difference here, when we are talking about composability, is on what level of abstraction we're talking about.
Starting point is 00:02:09 Like the vendors you're talking about, I think they are talking more about composability of, let's say, features in a way, or functionality, like the user wants. Composability, when it comes to what we were discussing with Pedro, is a little bit more fundamental and has to do more with how software systems are architected. And I'm sure people that listen to us,
Starting point is 00:02:35 they know the value of being able to build a system, a software system that has some kind of separation of concerns between its modules. So you have much more agility and flexibility in building, updating, having people that are dedicated to different areas and, in general, have something that scales much more easily in terms of building. We are not talking about processing scalability.
Starting point is 00:03:01 Now, traditionally, this was never the case with database systems, though. Database systems were kind of like big monoliths in a way, and for some very good reasons. It has a lot to do with how hard it is to build such a system. So when we're talking about the composability that we'll be discussing with Pedro, it's more about that. How there are some new architectures and some new components coming out that actually allow a developer to pick different libraries and build databases or data processing systems, let's say, in general. And that's a very important thing in the industry because traditionally building database systems has been extremely hard.
Starting point is 00:03:47 Exactly because there was almost zero concept of reusability of libraries or software or whatever. You pretty much had to do everything from scratch. And that made the whole process of building these systems really hard. And very risky, also from a venture point of view, right? So we are entering like a new kind of era when it comes to these systems, where with technologies like Arrow, for example, and Velox, we start seeing, let's say, some fundamental components
Starting point is 00:04:21 that you find in every system out there provided in a way that you can take and integrate and build your own system. So these are the things that we are going to talk about when we're talking about composability with him. But there's much, much more. Actually, we're going to talk a lot about some very basic and important concepts when it comes to data processing.
Starting point is 00:04:45 And by the way, Pedro has been working for 10 years at Meta, in data infrastructure. So he has seen a lot in these past 10 years. There are so many things that have changed and that were built. So we're going to talk a lot about the evolution, things that 10 years ago were innovative and today need to be rethought. And that's a hint, because he's going to announce some very interesting things also about some
Starting point is 00:05:14 changes and some updates in some very important systems out there. So Velox is a very interesting project, but it also involves some amazingly experienced people. We started with Pedro today. Everyone should listen; you're going to enjoy it and get a glimpse of the future of what is coming. And hopefully we are going to have him back, and also more people related to this technology, to talk more about that stuff in the future.
Starting point is 00:05:48 All right, that's a wrap for the prequel. The full-length episode will drop Wednesday morning. Subscribe now so you don't miss it.
