The Data Stack Show - The PRQL: What Old Tech Concepts Were Borrowed to Build the Data Lake House?

Episode Date: January 7, 2022

Eric and Kostas preview the upcoming show as they talk about data lakes and data warehouses and why these are important. ...

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show prequel, where we discuss the upcoming episode after we just recorded it with the guests. And boy, do we have a guest with a pretty significant resume. So I'm just going to roll a couple of big names off here. LinkedIn, he was there from 2011 to 2014, right? So very formative in terms of technology. He went to Box after that. Uber, where he actually was a creator of Hudi, which we're going to talk about a ton in the episode. Confluent, and then now he's actually working on Hudi full-time as part of the Apache Foundation. So just a mind-blowing resume and so many things to talk about.
Starting point is 00:00:50 Kostas, I'm interested though. So we've talked a lot about data warehouses on the show. And I think this is the first time we're going to talk with someone who has developed a sort of industry standard technology in the data lake space. What questions are you going to ask him about the data lake since we sort of have the inventor as a captive audience here? Yeah, I'll tell you. But before that, I have to add that he also started from Oracle.
Starting point is 00:01:21 Don't forget Oracle. Oh, that's right. It might be a dinosaur, but still, I mean, he has done a lot of work actually, like with the very core products. One thing that was interesting, which now that you mentioned that and thinking back on the show we just recorded, the way that he related sort of old technology paradigms to the way you think about developing new technologies was fascinating. And he's like, okay, if we borrow this concept from here and this concept from here, that was just really interesting. Yeah, yeah.
Starting point is 00:01:58 Okay. I don't want to say too much about the episode, but it was an amazing conversation that we had. It really helped me understand much better what a data lake is or why we need it. And also to understand that a lake house is not just another marketing buzzword because we are trying to create a new category of products. Yeah.
Starting point is 00:02:23 So I think we got a very, very good definition of what a data warehouse is, what a data lake is, and what a lake house is, where we stand in terms of the development of all this and how mature this technology is. But I think from a technical perspective, the most interesting part of the conversation was, in order to build the lake house, what kind of, as you said about like the older technology or whatever, what kind of like elements we need to borrow from there
Starting point is 00:02:57 and how we need to figure out the right trade-offs to adopt them in whatever we build over the file system in the case of the data lakes and the lake houses, right? And we had like, I think, the peak probably of this conversation was around transactions. So we had like a very interesting conversation of like why transactions matter,
Starting point is 00:03:19 why we need transactions over a file system. And that's like a big differentiation between the data lake and the lake house. What are the differences there between the different technologies like Iceberg, Hudi, Delta Lake? Each one makes different compromises and like different kinds of trade-offs.
Starting point is 00:03:39 But this part, it's like probably one of the most important elements that the lake house has that differentiates it compared to the data lake. So, yeah, it's an amazing conversation. I feel like I understand, that I can explain now, what a data lake is. That's great. Well, obviously you're going to want to listen to this episode because Kostas and I are like two kids who, you know, just left a Star Wars movie on opening night and we're all giddy with excitement because we had such a great conversation. So if that's any indication, there's lots there.
Starting point is 00:04:19 So be sure to catch the next episode with Vinoth from Hudi. And we'll catch you on the next one.
