Disseminate: The Computer Science Research Podcast - Yifei Yang | Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries | #48

Episode Date: March 18, 2024

In this episode, Yifei Yang introduces predicate transfer, a revolutionary method for optimizing join performance in databases. Predicate transfer builds on Bloom joins, extending its benefits to multi-table joins. Inspired by Yannakakis's theoretical insights, predicate transfer leverages Bloom filters to achieve significant speed improvements. Yang's evaluation shows an average 3.3× performance boost over Bloom join on the TPC-H benchmark, highlighting the potential of predicate transfer to revolutionize database query optimization. Join us as we explore the transformative impact of predicate transfer on database operations.

Links: CIDR'24 Paper | Yifei's LinkedIn | Buy Me A Coffee | Listener Survey

Hosted on Acast. See acast.com/privacy for more information.

Transcript
Starting point is 00:00:00 Hello and welcome to Disseminate the Computer Science Research Podcast. I'm your host, Jack Wardby. The usual reminder that if you do enjoy the show, please consider supporting us through Buy Me a Coffee. It really helps us to keep making the show and to keep the lights on. So yeah, please do consider supporting us there. You can find links to that in the show notes and on our socials if you do fancy buying us a coffee. One more announcement: we also have a listener survey out at the moment. So if you do feel like giving us some
Starting point is 00:00:51 feedback and helping shape the future of the podcast, please do consider filling that out. You can also find that on our socials and I'll drop a link to that in the show notes as well. So please do give some feedback, all feedback welcome, and please be candid with it. So yeah, please check that out. Cool. On to today's podcast. And it gives me great pleasure to say I'm joined today by Yifei Yang from the University of Wisconsin, Madison, who will be taking us on a journey through query processing. We're going to be talking about his 2024 CIDR paper, Predicate Transfer, Efficient Pre-Filtering on Multi-Join Queries. Yifei, welcome to the show. Hi, thank you for the introduction.
Starting point is 00:01:29 Hi, nice to meet you, Jack. Likewise. I'm really excited about this conversation today. I love query processing, so I'm really excited to learn some more about it today. So before we do dig into and talk about query processing and all of the things on predicate transfer, tell us a little bit more about yourself and how you ended up in research and specifically how you became interested in data management and the query processing side of things. Okay, sure. So I'm currently a third year PhD student at the University of Wisconsin-Madison. And my current interest of the research is generally broad query optimization on OLAP, in other words, analytical query processing
Starting point is 00:02:10 for relational databases. So before I got to this stage, I first came into contact with databases when I was an undergraduate. Actually, at that time I didn't know what research I should do — I just knew I wanted to do some research. So I just reached out to professors, and finally, just luckily, got taken on by a professor who took an interest in graph databases.
Starting point is 00:02:43 So at that time I didn't know any details about relational databases — I just had a very brief research stint on graph databases. But then I realized, well, if I can make everything faster — just like when you're in a running competition — you get excited when you make your query processing faster and faster. And then I kind of shaped my interest around making things faster, no matter whether it's query processing or relational data processing, or other things like general distributed systems or analytical tasks on top of them.
Starting point is 00:03:20 So when I stepped into the master's program, I shifted my focus to some latency-aware optimization fields. I got a chance to work with a professor on Polystore, which is execution over multiple databases with multiple models — relational, graph, XML, documents, things like that — because I had a little background in graph processing, which is maybe the reason he wanted to take me on. And during this time I was also preparing my PhD applications. So at that point, over the summer, I got in contact with my current advisor, who is interested in traditional analytical processing research.
Starting point is 00:04:13 And then I got on board. That's the basic story. It's a long story, right? Yeah, that's nice. Yeah, I mean, just going back to something you said at the very start of the answer — that you kind of knew you were attracted to research even from an undergraduate level — what was it about research that was so appealing to you?
Starting point is 00:04:33 Yeah, actually the reason at first was something very trivial. I just felt that if I started working when I finished undergrad or my master's, I just didn't know things deeply enough. For example, I didn't know what team I should work on, or what technology or what kind of area fit me most. So I wanted to discover what I like, and during that process I found that databases were something really interesting. So actually another reason is that I didn't want to start working at that time. Okay, so those things combined led me to apply for the PhD program.
Starting point is 00:05:15 yeah i can totally i can totally relate to that as well yeah it's kind of similar sort of thing and from my sort of education as well i was kind of like i'm not ready to go into industry or anything yeah i want to keep keep seeing how things go but yeah also as well something that you said about elsewhere and about sort of um making things faster like running faster making things faster it gives you a nice kind of that dopamine hit right it gives you a nice sort of thrill when it's you kind of making stuff faster so yeah can definitely um can uh can definitely get on board with that as well so let's talk about how you've made query processing faster anyway so but before we do that i guess we need to set a little bit of background context for for the listener so i mean kind of build it from the ground up here for us kind of what are joins and these are really important in query processing so yeah kind of
Starting point is 00:05:58 what is a join, and what's the family of joins that goes with it, like multi-joins? And I guess also — it's in the title of the paper we're going to be talking about — what's pre-filtering as well? So yeah, give us some background, set the scene for us. Yeah, sure. So when you store data in a database in a relational format, basically you just have tables. And you don't want to store all the data in one giant table, because you would get many, many rows and many columns, so people will usually decompose all the information into many smaller tables. For
Starting point is 00:06:30 example, you can have a table storing a person's information, another table storing the scores for each person, and another one storing, for example, contact information, and so on. That information is split so that each table is responsible for one particular category. And when you want to query, you probably need information from more than a single table. For example, you want to find the score of a person given a name or a phone number. You need to somehow
Starting point is 00:07:07 combine information from multiple tables to get the rows or information that you want. So in this case, you need to basically connect the information in the two tables — that's what we call a join. So that's the high-level definition. At a more concrete level, the join requires you to specify at least two tables, and you can specify some predicate on that. If you do not specify
Starting point is 00:07:33 any predicate, you will have a Cartesian product, where each row of one table will be connected with every row of the other table. People usually do not do that
Starting point is 00:07:43 because it does not make any sense in most cases. People need some predicate to specify what kind of information they want. So basically, maybe in one table you store a person ID, and in another table you also have a person ID referencing the person ID of the first table. Then, when you want to combine the information, you basically check, for each row in one table, whether there is a row in the other table with the same person ID. If there is, you know the information in these two rows points to the same person, and you can do some further processing on that. This is called a natural join. Yeah, and basically, our research is
Starting point is 00:08:25 on top of this natural join. There can be many more complex joins, like outer join or semi-join, but they're basically extensions of this. So this natural join specifies the connection
Starting point is 00:08:38 between one pair of tables — basically two tables. And sometimes you need information from more than two tables. Whenever you need three tables, what you do is basically join the first and second tables first, getting an intermediate table, and then you join the intermediate table with the third table — you get two joins, and this is what we call a multi-join. Okay, nice, cool. Yeah, I mean, I can definitely see the need for filtering.
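To make the join and multi-join idea concrete, here is a small illustrative sketch in Python — not code from the episode or the paper, just a toy hash join over made-up tables, chained once to show a multi-join:

```python
# Illustrative sketch only: a natural (equi-)join written as a hash join,
# and a three-table multi-join built by chaining two of them.
# Table names and rows are made up for the example.

def hash_join(left, right, key):
    """Join two lists of dict rows on `key` by building a hash table on `left`."""
    table = {}
    for row in left:                      # build side (ideally the smaller input)
        table.setdefault(row[key], []).append(row)
    out = []
    for row in right:                     # probe side
        for match in table.get(row[key], []):
            out.append({**match, **row})  # merge matching rows
    return out

people = [{"person_id": 1, "name": "Ada"}, {"person_id": 2, "name": "Bob"}]
scores = [{"person_id": 1, "score": 95}, {"person_id": 3, "score": 70}]
phones = [{"person_id": 1, "phone": "555-0100"}]

# Two-table join: only person_id 1 matches.
print(hash_join(people, scores, "person_id"))

# Multi-join: join the intermediate result with a third table.
intermediate = hash_join(people, scores, "person_id")
print(hash_join(intermediate, phones, "person_id"))
```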
Starting point is 00:09:06 So if we've got two pretty sizable tables and then we end up with a Cartesian product, that could be a huge, huge table, right? So yeah. And also just to kind of recap on the multi-join there, it's the scenario where we want to join information from multiple tables and we would join two and then we get
Starting point is 00:09:25 an intermediate table, and we join that table with, say, the third table — and I guess the process could go on for n tables, depending on how many tables were part of your query or what computation you wanted to do. Cool. So with that scene set, can you tell us where predicate transfer fits into this story and give us the elevator pitch of your work? Yeah. So when you observe some benchmark queries, for example TPC-H, or some customer queries — maybe you can just focus on a single join. Let's say both of the tables have a million rows, for example.
Starting point is 00:10:05 And you can observe that maybe in the result you only get 100 rows, which is very small, because you have a very selective predicate applied on each of the tables. So what we observed is that most of the rows in the base tables — the tables being joined — will not contribute to the final result. For example, many rows of the first table, even after its filter is applied, do not have a match in the other table, also after its filter is applied. So in this case, if you simply join these tables first, you will probably get a very large intermediate result,
Starting point is 00:10:51 and it only gets filtered down as you join more and more tables — only at the end are those unnecessary rows filtered out. So what we hope to do is filter these rows out — maybe not entirely, but almost, maybe 90% of them — before any join occurs. Because a join is complex: if you join many rows, you will have a lot of overhead, but if you join small inputs, with pre-filtering applied, you get a very cheap computation.
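As a rough sketch of that observation (my own toy illustration, with made-up table sizes): if you pre-filter each input down to the rows whose join keys can possibly match, the join itself only ever sees a small input:

```python
# Toy illustration of pre-filtering before a join: most base-table rows never
# contribute to the result, so dropping them first keeps the join input small.

def prefilter(rows, key, other_rows):
    """Keep only rows whose join key appears on the other side (an exact pre-filter)."""
    keys_on_other_side = {r[key] for r in other_rows}
    return [r for r in rows if r[key] in keys_on_other_side]

orders    = [{"cust_id": i % 1000, "order_id": i} for i in range(100_000)]
customers = [{"cust_id": i, "name": f"c{i}"} for i in range(5)]   # very selective side

small_orders = prefilter(orders, "cust_id", customers)
print(len(orders), "->", len(small_orders))   # 100000 -> 500: the join now sees far fewer rows
```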
Starting point is 00:11:15 Okay, nice. Cool. So if I'm understanding this correctly, there is a scenario in which, even when you do some filtering, it sometimes isn't actually that... what's the word? So you do the filtering, then you do your join, and you still end up with not many matches even though you've applied your filter — so the filter kind of hasn't really had that much of an effect. Is that part of the problem there? Okay, of course. Yeah, I'm getting it
Starting point is 00:11:39 now. Now I can see where predicate transfer is going to fit in here: okay, we're going to try and improve this situation and make sure that we're a bit more effective with this. Okay, cool. So I know from your paper that Bloom joins — another type of join — are important in your work here. So tell us what Bloom joins are and how they work, and that can, I guess, lead us into how the predicate transfer algorithm works.
Starting point is 00:12:06 Okay, yeah. So this depends on the join algorithm — for the natural join, for example, there are several algorithms. The most popular two are hash join and merge join, and among these, hash join is the most popular. With a hash join, basically you build a hash table on the smaller side of the joining tables, and then you probe the rows of the larger table one by one against the hash table to check whether there is a match. If there is a match, you emit one output row. The Bloom filter is used to optimize this hash join. Since writing to the hash table, and accessing or probing the hash table, is a little bit expensive, people introduced the Bloom filter, which
Starting point is 00:12:52 is a very lightweight structure. Its size is much smaller than a hash table, and its writes and accesses are also much faster. So basically the usage pattern is the same: you build your Bloom filter on the smaller side, just as you would build the hash table, and then you probe each row of the larger side of the join against this very small Bloom filter. This technique does not guarantee that everything it keeps is a real match, but it will filter out most of the rows
Starting point is 00:13:28 that will not have a match. So it has no false negatives, but it does have a small false positive rate. That's good enough, because it executes so fast — we do not care whether everything unnecessary gets filtered out or not, because after this step we will run the regular hash join anyway, no matter what. Okay, cool.
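For a feel of what a Bloom join step looks like, here is a minimal sketch assuming a simple hash-based Bloom filter — an illustration only, not any system's actual implementation:

```python
# Minimal Bloom-join sketch: build a small Bloom filter on the build side,
# use it to discard most probe-side rows that cannot match, then hash join
# the survivors as usual. No false negatives; a few false positives are fine.
import hashlib

class BloomFilter:
    def __init__(self, nbits=1 << 16, nhashes=3):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = bytearray(nbits // 8)

    def _positions(self, value):
        for i in range(self.nhashes):
            h = hashlib.blake2b(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(h[:8], "little") % self.nbits

    def add(self, value):
        for p in self._positions(value):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, value):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(value))

def bloom_prefilter(build_rows, probe_rows, key):
    """Same shape as a hash join: build on the small side, probe the large side."""
    bf = BloomFilter()
    for row in build_rows:            # the pass that would otherwise build the hash table
        bf.add(row[key])
    # Keep only probe rows that *might* match; the regular hash join runs afterwards.
    return [row for row in probe_rows if bf.might_contain(row[key])]
```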
Starting point is 00:13:52 Yes, I can see why the no-false-negatives property is really important, because you don't want to discount anything that actually is a match, and we can allow for a small number of false positives. I always get my false positives and false negatives the wrong way around, so forgive me if I've said that the wrong way around — but yeah.
Starting point is 00:14:10 And then you can have a few of those, because we can kind of clean them up afterwards. Okay, cool. So tell us about the predicate transfer algorithm in a little bit more depth, because I know there are two phases to it. So walk us through it — maybe give us an example, that could be useful. The floor is yours to explain this, Yifei. Yeah, sure. So we started from a very traditional academic algorithm called the Yannakakis algorithm. This one goes back maybe 20 or 30 years, and it is used on
Starting point is 00:14:43 acyclic queries. Actually, people may not be familiar with whether a query is cyclic or acyclic. Basically, acyclic means there's no triangle or loop when you're joining the tables, so the join graph will just be a tree structure — a join tree. So this algorithm — the details are a bit complex, but basically when we run this
Starting point is 00:15:07 algorithm, it will do a bunch of semi-joins over the join tree, from the leaf nodes to the root and then from the root back to the leaves, in a two-pass way; all the transitions are semi-joins. And after this kind of magic processing, it guarantees that if a row remains, it will contribute to the final join result. It filters out the other rows, which are unnecessary. The details of why are a little bit complex.
Starting point is 00:15:33 Sorry, if I can just jump in real quick — what's a semi-join? Oh, sorry. A semi-join is something like this: you join two tables, but you only want to keep the columns of one side. So, for example, you have two tables — maybe the columns are A and B in the first table, and C and B in the second table. After the natural join, you only want to keep A and B. So it's similar to an EXISTS clause, where, for the first table,
Starting point is 00:16:11 if there is a row that has a match in the second table, we keep that row — we only keep rows of that table, and we don't care about anything in the second table. We just check whether a specific row in the first table has a match in the second table. That's a semi-join. Okay, cool. So I interrupted you there — continue with your explanation, sorry. No, it's okay. Sorry, I forgot to bring up this basic
Starting point is 00:16:38 definition. Yeah. So basically, with this kind of two-pass semi-join processing, in a magic way, people can prove that any row that will contribute to the result is preserved, and if a row is not going to contribute, it will be discarded. Yeah, that's the traditional algorithm.
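Here is a rough sketch of that two-pass idea, simplified to a chain of tables that all join on the same column name (the real algorithm works over arbitrary join trees with per-edge keys):

```python
# Simplified Yannakakis-style reduction for a chain T1 - T2 - ... - Tn, all joined
# on the same column `key`: a forward pass of semi-joins (leaf to root) followed by
# a backward pass (root to leaf). After both passes, every surviving row can
# contribute to the final join result.

def semi_join(left, right, key):
    """Keep rows of `left` that have at least one match in `right` (left's columns only)."""
    right_keys = {r[key] for r in right}
    return [row for row in left if row[key] in right_keys]

def yannakakis_reduce_chain(tables, key):
    tables = [list(t) for t in tables]
    for i in range(1, len(tables)):                  # forward pass
        tables[i] = semi_join(tables[i], tables[i - 1], key)
    for i in range(len(tables) - 2, -1, -1):         # backward pass
        tables[i] = semi_join(tables[i], tables[i + 1], key)
    return tables
```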
Starting point is 00:17:01 And as we mentioned, a join is very expensive, and there's no exception for semi-joins — they're also expensive. So basically, our idea is to apply this lightweight Bloom filter to replace the semi-join, so that we can still filter out most of the rows. We don't need the guarantee — we don't need to guarantee that 100 percent of the unnecessary rows are filtered out. We still filter out most of the unnecessary rows, but we are efficient in this filtering phase. And after that, we do the regular joins. The regular joins then start from very small base table inputs — maybe a little bit larger than the reduced tables of the Yannakakis algorithm, but there will not be a large difference. The joins after that
Starting point is 00:17:47 will always be very efficient. Nice, cool. So yeah, I guess we could maybe dig into the actual steps a little bit more. I know that in your paper you give an example with one of the queries from TPC-H, so maybe
Starting point is 00:18:04 we could talk through how the algorithm would work on that example, maybe. Yeah, maybe. Yeah, I think we have a favorite — we always use that one, nothing special — which is Q3, actually. So Q3 basically involves only three tables:
Starting point is 00:18:20 the smallest is customer, the middle one is called orders, and the largest is lineitem. So basically it's a chain structure to join: you join customer with orders first, and then with lineitem. So it's definitely acyclic, so the Yannakakis algorithm would work on it, and the tree structure is also simple — it's just a chain. So what we do is, let's take the starting point at the
Starting point is 00:18:50 customer table — say the leaf is the customer table and the top, the root, is lineitem — and we go from customer to orders to lineitem in the forward direction during the transfer. So I'll introduce the details of the forward transfer next. Basically, you start from the smallest table, customer. You build a Bloom filter on the join key between customer and orders, and then that Bloom filter is used to filter the rows of the orders table — orders will be reduced, maybe to 10%, for example. And we use the reduced, 10x smaller orders table to build another
Starting point is 00:19:31 Bloom filter, on the join key between orders and lineitem, and this Bloom filter is used to filter the lineitem table, reducing lineitem by maybe another 10x. Yeah — and that's the forward pass. When we do the backward pass, in the reverse direction, it's a similar thing: we start from the lineitem table already reduced
Starting point is 00:19:51 in the forward pass — so, for example, 10x smaller than the original size. We build a Bloom filter on the join key between lineitem and orders, and this Bloom filter is used to filter the already-reduced orders table — maybe you get another 2x reduction, right? Then you start from the 20x smaller orders table.
Starting point is 00:20:14 You build another Bloom filter on the join key between customer and orders, and this Bloom filter is used to filter the customer table, which may shrink by maybe 5x. And that's basically it — it's done for all the tables on both sides. After that, you can do the regular joins over those much smaller tables. So we've made our problem massively smaller by doing these two passes. That's awesome.
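Putting the walkthrough together, here is a sketch of the two passes on Q3-shaped data. It is only illustrative: the column names are the usual TPC-H ones, and plain Python sets stand in for the Bloom filters, so the reductions here are exact rather than approximate:

```python
# Sketch of predicate transfer on a Q3-shaped chain: customer -> orders -> lineitem.
# For readability this uses Python sets as a stand-in for Bloom filters; the real
# technique builds an actual Bloom filter at each step, so the reductions are
# approximate (no false negatives, a few false positives).

def transfer(from_rows, from_key, to_rows, to_key):
    """Build a 'filter' on from_rows[from_key] and keep the to_rows that pass it."""
    keys = {r[from_key] for r in from_rows}           # stand-in for building a Bloom filter
    return [r for r in to_rows if r[to_key] in keys]  # stand-in for probing it

def predicate_transfer_q3(customer, orders, lineitem):
    # Forward pass, smallest table first: customer -> orders -> lineitem.
    orders   = transfer(customer, "c_custkey",  orders,   "o_custkey")
    lineitem = transfer(orders,   "o_orderkey", lineitem, "l_orderkey")
    # Backward pass, in reverse: lineitem -> orders -> customer.
    orders   = transfer(lineitem, "l_orderkey", orders,   "o_orderkey")
    customer = transfer(orders,   "o_custkey",  customer, "c_custkey")
    # The regular joins now run over the reduced tables.
    return customer, orders, lineitem
```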
Starting point is 00:20:43 Can you tell us a little bit more about the implementation here, then? Like, how easy was this to implement in practice, what did you implement it in, and how easy was it to do, I guess? Yeah, so if you have this idea in the back of your mind, it's probably not hard, but when you go into a real system, it gets a little bit tricky. So we have a code base called FlexPushdownDB. This is a code base I control — I know the code, so maybe it's a little bit easier for me to work in. Basically, this code base is similar to maybe open-source
Starting point is 00:21:17 ones like DuckDB, like SQLite, Postgres, some other databases. So it has joins and Bloom filters. Basically, we start from the original execution — we already have a query plan, right? We add a phase before this execution phase, called the pre-filtering phase. And in the pre-filtering phase, we scan the base tables one by one, and we construct those Bloom filters and apply those Bloom filters in the same direction as the example that we just went through, right? Customer, orders, lineitem, and go back.
Starting point is 00:21:58 And then we store the reduced tables somewhere in memory, just as intermediate results, and use these intermediate result tables as the new inputs of the execution phase. So the execution phase does not need to scan the original tables — it will just scan those reduced tables produced by our pre-filtering phase. And then the two phases are connected and it all goes smoothly. Nice.
Starting point is 00:22:27 So obviously you knew the code — so how long was the implementation effort, though? Because you said it's a simple idea, but as soon as you actually go to put it down in code it can be quite challenging, right? It's never as straightforward as you think, and there's always bugs and whatnot. So yeah, was it a long implementation effort? What was that like? It's not too long.
Starting point is 00:22:43 Basically, you have to do some preparation: you have to implement your Bloom filter first, that's one thing, right? Okay, right. Yeah, and once you have these kinds of tricks, you won't end up with very long code. You just need to order the tables and specify which table the first Bloom filter is built on, then which table is the next one to use it, and after that, which table builds the second Bloom filter — we just need to specify those orders. In our paper, the order is always that predicate transfer goes from the smaller table to the larger table. So in our Q3, it's customer, orders, lineitem, because the tables are sorted by size.
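A sketch of what specifying that order might look like — the helper and field names here are hypothetical, not the actual code base's API:

```python
# Sketch of scheduling predicate transfer: order the base tables by size and
# transfer along each join edge from the smaller table to the larger one.
# Function and field names here are hypothetical, not FlexPushdownDB's API.

def transfer_schedule(tables, join_edges):
    """tables: {name: row_count}; join_edges: [(table_a, table_b, key)]."""
    order = sorted(tables, key=lambda name: tables[name])     # smallest first
    rank = {name: i for i, name in enumerate(order)}
    forward = []
    for a, b, key in join_edges:
        small, large = (a, b) if rank[a] < rank[b] else (b, a)
        forward.append((small, large, key))                   # build on small, probe large
    forward.sort(key=lambda edge: rank[edge[0]])
    backward = [(large, small, key) for small, large, key in reversed(forward)]
    return forward + backward

sizes = {"customer": 150_000, "orders": 1_500_000, "lineitem": 6_000_000}
edges = [("customer", "orders", "custkey"), ("orders", "lineitem", "orderkey")]
print(transfer_schedule(sizes, edges))
```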
Starting point is 00:23:25 So basically, in the implementation, you need to specify the order. You can have your own customized order — that does not matter, but you just need to specify it, and then everything goes smoothly, because you just call the library to build a Bloom filter on some column, then call the library to probe the filter with some table on some column, and you keep doing this. I like it. Great. So let's talk about some results then. Well, first of all, maybe, how did you go about evaluating predicate transfer? What was your experimental setup like? What did you compare it against? Yeah — what was that evaluation like?
Starting point is 00:24:04 Yeah. So basically we started from the very famous TPC-H benchmark. And at the time I submitted the paper, the prototype was not mature enough — our transfer technique was only applied to single-threaded execution, so we hadn't finished parallel or distributed execution. So we only measured on our smaller data sets — scale factor 1 and 10, which are the one-gigabyte and ten-gigabyte data sets — using a single thread. And in that case, we also only compared with ourselves.
Starting point is 00:24:40 We didn't compare with other existing open-source or commercial systems — that would be too much work, right? We wanted to get the paper done as quickly as possible. We had already figured out that we got a very large gain, and we just thought that was good enough — maybe we should try to submit first. Yeah, okay, that's a good evaluation, awesome stuff. So yeah, I guess let's talk some numbers then — what was the performance like? Yeah, so the performance really depends on how many tables you have in the join, right? If you have many, many joins, the transfer will make a very large impact, because a single predicate can be transferred to everywhere
Starting point is 00:25:26 in the join tree. But if you have just two tables joined, it will behave similarly to a traditional Bloom join, because you don't have the possibility to transfer any further. So we observed that on average — and by average I mean geomean, we use the geometric mean to measure it — the speedup can be 3x to 3.5x. But on some very large queries, like Q2, Q5, or Q21, it can be more than 10x, up to 60x, for a single query execution, because sometimes they have eight tables in the join tree, or they even have a cycle.
Starting point is 00:26:04 A cycle is something really interesting, because the traditional Yannakakis algorithm cannot guarantee optimality there. But we also do not need to require this optimality — we just want to filter as much as we can. So actually, in a cyclic case, which is Q5, we do a very good job compared to Yannakakis: we even filter out more rows than the traditional algorithm does, because there's no guarantee, but we can follow a lot more edges than Yannakakis does. We follow every edge in the graph, but Yannakakis has to break the cyclic graph into a tree somehow. So we have more opportunities.
Starting point is 00:26:38 So yeah, the high-speedup queries can get a gain of up to 60x, but some small queries just have a speedup between 1x and 2x. Oh, because — yeah, so it's very much related to the size of the tables, the number of tables, and whether the query is cyclic or not. A few things have been bouncing around in my mind as we've been chatting about this: the order in which you choose to actually do this. You were saying at the moment it's a manual process where you use a simple heuristic, in the sense of smallest first or biggest first or whatever.
Starting point is 00:27:28 Is that decision at the moment the human inputting it, or are there scenarios in which other types of orders might be better, depending on the type of query? I see. It basically follows the traditional wisdom, which is that you usually want to build your hash table on the small table and probe with the larger one. So we adopt the same strategy: we want to build the Bloom filter on the smaller one and probe with the larger one. Following this trick, we just order the tables by size, from the smallest to the largest.
Starting point is 00:28:00 And actually, we have some experiments showing that maybe there's not much difference whether you start from the smallest or the largest. Oh, really? Yeah — either way, you have to scan each table once, whether you build or probe.
Starting point is 00:28:15 And for a Bloom filter, building and probing cost almost the same, whereas a hash table is different, because building a hash table is much more expensive than probing a hash table. Yeah. Awesome stuff. Yeah.
Starting point is 00:28:51 So yeah, that's fascinating. Also, can this principle of using Bloom filters be applied to other joins — I don't know, like worst-case optimal joins or things like that? Could you apply the same sort of principle to those types of algorithms and use it as a generic pre-filtering step? Yeah, that's a fantastic question, actually. We've got this question many times, including at the conference
Starting point is 00:29:38 and in some offline discussions. I think worst-case optimal join can be the future, actually, but I'm not sure whether it will get as successful as today's binary joins, because those are very well optimized. To combine predicate transfer with worst-case optimal join, I feel a very naive approach is just to keep the two phases separate: you still do the transfer to reduce your table sizes first, and then, on the new, smaller tables, you do the worst-case optimal join. But I think there can be opportunities to couple these two techniques more tightly, instead of isolating them in two steps, and we probably want to dig further into that — but I couldn't really give you a very concrete answer. Definitely one for future research — definitely an interesting one. And also, I just kind of
Starting point is 00:29:48 introduced worst case optimal joins there without describing it
Starting point is 00:29:50 so maybe you could actually give us a better description of what worst case
Starting point is 00:29:54 optimal joins are fill the listener in on that as well
Starting point is 00:29:57 I'm not an expert on that — we have a student we work together with;
Starting point is 00:30:02 I know the idea from him. So I can only tell some high-level things that I understand. So basically, in a binary join, you have to join two tables to produce some intermediate results first
Starting point is 00:30:16 and use that result to join another table, the third one. But with a worst-case optimal join, you only want to produce a row as a whole once that row has gone over all the tables for probing, to check whether there is a match. So you only produce a row when you can guarantee that this row will be in the final result. So, for example, say there are three tables. If you start from the first table, you check whether they match in
Starting point is 00:30:45 the second table, and if they do, you keep checking whether they match in the third table. Only when all the conditions are met do you emit that row, so there are essentially no intermediate results. And in terms of complexity, it has a very nice, strong property on cyclic queries. For example, on triangle queries you can show the complexity is smaller than for binary joins: the complexity will be N to the power of 1.5, if we assume every table has N rows, whereas with a traditional binary join
Starting point is 00:31:20 it's N squared, something like that. The proof I don't fully understand, but I just remember this result — that's probably it. Well, that's a better description of it than I could give, so no, that's awesome. Cool.
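For a very rough picture of that attribute-at-a-time style, here is a toy sketch of a worst-case-optimal-style join specialized to the triangle query R(a,b), S(b,c), T(a,c) — my own illustration, not how any particular system implements it, and the formal worst-case guarantee also depends on how the intersections are computed:

```python
# Toy sketch of a worst-case-optimal-style triangle join:
# R(a, b) join S(b, c) join T(a, c). Instead of producing a big intermediate
# result, it extends one attribute at a time and only emits rows that match
# in all three tables.
from collections import defaultdict

def triangle_join(R, S, T):
    R_by_a, S_by_b, T_by_a = defaultdict(set), defaultdict(set), defaultdict(set)
    for a, b in R: R_by_a[a].add(b)
    for b, c in S: S_by_b[b].add(c)
    for a, c in T: T_by_a[a].add(c)

    out = []
    for a in R_by_a.keys() & T_by_a.keys():      # candidate values for a
        for b in R_by_a[a] & S_by_b.keys():      # extend with b
            for c in S_by_b[b] & T_by_a[a]:      # extend with c
                out.append((a, b, c))
    return out

R = [(1, 2), (1, 3)]
S = [(2, 4), (3, 5)]
T = [(1, 4), (1, 9)]
print(triangle_join(R, S, T))   # [(1, 2, 4)]
```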
Starting point is 00:31:49 Right, yeah — also, with predicate transfer it kind of feels to me pretty much like a win-win in most scenarios, but are there any cases or situations in which you think its performance might be suboptimal, essentially, or any place where it's probably not going to give you that much of a win? Yeah, definitely there should be. Predicate transfer is just used to pre-filter unnecessary rows, so a very simple counterexample is a query that doesn't have unnecessary rows.
Starting point is 00:32:04 You know that most of the rows will make it into the final result — even though the result may be very large, you don't care; in this query there aren't many unnecessary rows. In this case, predicate transfer may incur extra overhead but filter out only a very small number of rows. Yeah. So the thing is,
Starting point is 00:32:25 in our code base — maybe our code is not fully optimized — we have found that the Bloom filter is always so fast that even though you do this extra work, it won't incur too much overhead. But we have also found that in some commercial systems, like DuckDB, the
Starting point is 00:32:39 Bloom filter is not a negligible overhead compared to the hash join, and in that case the suboptimal case will be more obvious, I think, and we should optimize something for it.
Starting point is 00:32:57 is obviously, it's non-negligible, right? There is some cost there. So I guess there are scenarios in which that, I mean, it maybe is very much implementation dependent as well. You can probably have a really efficient implementation of it as well, high performance implementation. But yeah, that's interesting to know as well.
Starting point is 00:33:11 Cool. Yeah, I guess so. I kind of guess with that then, where do you, you said this earlier on in the chat that this was a very sort of initial stage sort of project kind of putting the paper together. It was kind of done on kind of quite a quick time scale and so where do you go next with it what's what's the sort of next on the research agenda for this project i mean it seems like there's a lot of opportunities there so yeah yeah yeah we got the same thing actually yeah we thought there are many new problems worth
Starting point is 00:33:39 digging further into. So for me — because, you know, when I submitted the paper it was just single-threaded execution, right? — a natural extension is to support parallelism and distributed execution, which is what I'm working on. Yeah, I need to deal with some bugs and performance bottlenecks before I get everything done. Yeah, and some other students are working on, as I just said, how to do this transfer adaptively to avoid
Starting point is 00:34:05 some unnecessary overhead. Yeah. And some others are working on whether predicate transfer will affect traditional join ordering — some interesting things. And some others are also brainstorming about predicate transfer combined with
Starting point is 00:34:22 worst-case optimal join. So there are many interesting ideas. Yeah, that's sure. It's going to spawn, I'm sure, quite a nice amount of research, this paper. So that's fantastic. Look forward to seeing the fruits of it
Starting point is 00:34:36 as it develops over the coming years. Cool. So yeah, I guess kind of going into some sort of more kind of general questions now, I guess. And as sort of a software engineer or a data engineer, how do you think this paper, what impact do you think it can have? What impact do you think your findings can have kind of going forward? Yeah, so I think it will change fundamentally the structure of the query execution.
Starting point is 00:35:06 Yeah, so I think it will fundamentally change the structure of query execution. Currently it's basically just a one-phase execution, where you just join the tables in your execution phase. But now I feel it may work well to add another pre-filtering phase before this execution phase, to reduce the tables first. There are already some pre-filters in systems, but not this kind of pre-filtering. For example,
Starting point is 00:35:28 you prune data blocks based on some coarse-grained index, like min-max — you also pre-filter something there, so that can also be seen as a kind of pre-filtering. So maybe you add this between the coarse-grained filtering
Starting point is 00:35:44 and the real execution — you plug this extra phase into the real system when you execute the query. And for a real system, this will be a very large change to its internal engine. I can imagine
Starting point is 00:35:59 they would have to put many people to work on that if they wanted to support it. And they would also need to care about whether there are suboptimal cases, and fix those. And yeah, I'd say it's a little bit non-trivial for a commercial system to adopt, but I think people will be more inclined to look into it
Starting point is 00:36:23 and to see whether it will happen in the future. Yeah, I mean, if the performance numbers are like, oh, this is definitely going to give us a massive speedup, it's worth the engineering effort, right? To get it into the system and deal with all of those problems that might come from it from an engineering standpoint. But yeah, do you know of any initial interest from any commercial systems exploring this technique, or is it still too early for that, do you think, at the moment? Honestly, I don't know. Maybe I should go talk with someone about these numbers. Yeah, yeah. All right, cool.
Starting point is 00:36:58 Yeah, awesome stuff. So yeah, I guess, working on this project then, it seems like there have been some really interesting findings — I mean, from the outside looking in. But from the inside, working on the project, what is the most interesting insight or lesson that you got from working on it? Yeah, there's one lesson which is not so closely related to the technical part, but it's a very interesting story. Yes — so this project started with a seminar, where someone who is also a co-author of this paper
Starting point is 00:37:32 presented the Yannakakis algorithm — that's the origin of the project. And then my advisor had a very good feeling about it. I'm not sure how he had this feeling, but he just thought that we had an opportunity to use it,
Starting point is 00:37:48 because he thought it is a very elegant, fancy algorithm, although it was invented much earlier — like 20 years ago. But my thought was: well, it was invented 20 or even 30 years ago,
Starting point is 00:38:01 but people haven't deployed it to this day — I was very concerned about whether it was worthwhile to investigate it. And we had many long discussions on that, and finally we figured out a way. So I think the lesson is that I shouldn't be too pessimistic
Starting point is 00:38:22 about old algorithms. Maybe we can somehow find a way to take the theory and bring it into industry, or into a practical implementation, using some other techniques — just like combining Yannakakis with Bloom filters.
Starting point is 00:38:39 At that time, I didn't realize that, so I was wary of this project at the beginning, but in the end I got interested in it. Yeah, you know, you've been convinced, right? That's it, you've come around. Yeah, yeah, totally convinced — after some evaluation. Yeah. But yeah, that's definitely a really interesting point, about not discounting something that's maybe more theoretical and older — it's like a hidden gem, almost, right? You've kind of brushed off the dust and jazzed it up for the 21st century. Yeah, cool. Um, yes, I guess, kind of related to this question — you touched on the origin story of the project there, that it came from a
Starting point is 00:39:21 seminar and it grew out of that. Was it plain sailing all the way? I know you obviously had a bit of skepticism about the idea in general, but were there any bumps in the road, or things you tried along the way that failed — some dead ends — or was it just plain sailing? I guess everything went smoothly once we settled on the detailed algorithm — for example, transferring from the smallest to the largest. I guess there were some little things I got blocked on, for example for some cyclic queries at the beginning. I didn't know how to handle those, because at that time we hadn't settled on a specific order,
Starting point is 00:40:03 for example from the smallest to the largest. So say you have a triangle of joining tables: if you go clockwise or counterclockwise, there will be infinite transfers, right? You don't know where to stop, because there will always be a filter sent somewhere that needs to filter something, and then you use the reduced table to do another transfer, and you keep doing this forever. But yeah — then we settled on the order,
Starting point is 00:40:29 which is always from the smallest to the largest. And then there's no clockwise or counterclockwise cycle anymore, because there will always be a table that is the smallest, and there will be no incoming filters for it — because it is the smallest, it only sends out filters and doesn't receive a filter from the other tables. And then this guy, having this
Starting point is 00:40:50 terminal sort of place — yeah, it stops the infinite loop, right? So yeah, yeah, that's cool. Um, yeah, so I guess now is probably a good time to also give some airtime to your other research, what you've worked on as part of your PhD so far. So yeah, what else have you got — irons in the fire? What else are you working on? Yeah, so before that, I mainly worked on something like pushdown in a storage-disaggregation architecture — so, cloud databases.
Starting point is 00:41:21 I've preferred working on cloud databases for all of that. So in the cloud, the storage and the compute are always disaggregated, so the network can be a bottleneck. Basically, people want to run some of the computation in the storage layer so that the traffic over the network will be smaller. And what I do, on top of this basic setup, is some further improvement. For example, if you have some cached data in your compute layer,
Starting point is 00:41:50 you probably do not want to push all of your computation down, because you have some cached data you can reuse. So we proposed a hybrid solution: use the cached data for local processing as much as possible, and push down when there is a miss in the cache. That's one project, and it was already finished in my first year.
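A rough sketch of that hybrid idea — purely illustrative, with made-up cache and storage interfaces rather than FlexPushdownDB's actual API:

```python
# Sketch of hybrid caching + computation pushdown in a storage-disaggregated setting:
# process cached blocks locally, and push the filter down to storage only for the
# blocks that miss the cache. All interfaces here are hypothetical.

def scan_with_hybrid_pushdown(block_ids, cache, storage, predicate):
    rows, missing = [], []
    for block_id in block_ids:
        block = cache.get(block_id)
        if block is not None:
            rows.extend(r for r in block if predicate(r))   # local processing on cached data
        else:
            missing.append(block_id)
    if missing:
        # Push the predicate down: storage filters before sending data over the network.
        rows.extend(storage.scan_filtered(missing, predicate))
    return rows

class FakeStorage:
    """In-memory stand-in for the remote storage tier."""
    def __init__(self, blocks): self.blocks = blocks
    def scan_filtered(self, block_ids, predicate):
        return [r for b in block_ids for r in self.blocks[b] if predicate(r)]

blocks = {0: [1, 5, 9], 1: [2, 6], 2: [7, 8]}
cache = {0: blocks[0]}                      # only block 0 is cached locally
storage = FakeStorage(blocks)
print(scan_with_hybrid_pushdown([0, 1, 2], cache, storage, lambda x: x > 5))  # [9, 6, 7, 8]
```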
Starting point is 00:42:13 And another one is to enhance the pushdown architecture — we want to figure out whether there are more operators that can be executed in storage. Before this, people would mostly just execute selection — which is filtering — plus projection and aggregation, which are, how to say it, the operators that easily give you gains but are not complex to implement. People would see those as the low-hanging fruit. Yeah, the easy wins, right? Yeah, yeah, the easy wins.
Starting point is 00:42:37 Yeah, the easy wins. Even beyond the easy wins, we're trying to figure out whether there are more opportunities to explore. Yeah. And we've got some other ideas.
Starting point is 00:42:54 But that's too much detail on that, I guess. Cool, yeah. So now you're kind of spending some time with the distributed execution. I imagine that's fun to debug when things go wrong. And there's, yeah but um cool and yeah so just just a few more questions now um and the the the first one or the penultimate one is is about your creative process there so you maybe you alluded to it earlier on and about the seminar process you kind of have in your group and but how do you approach like kind of personally as a group idea generation and then once that kind of is doing your own sort of filtering to to choose
Starting point is 00:43:32 which ones to to work on yeah i would say um my myself as a researcher probably i'm not very good at creating very fancy ideas i have to say what What I created, just some small piece of ideas, and my advisor will say, well, it's a very good idea, but it's just a small piece of optimization on existing tricks, so it may not work well to wrap as a whole paper, for example.
Starting point is 00:43:58 So, basically, what I worked on is just a discussion with my advisor and some other professors. They gave me some hint that I think I'm not good at creating new things from scratch, but good at refining
Starting point is 00:44:13 new things, or refining semi-finished ideas into full-fledged ideas — into the complete thing. For example, with this predicate transfer one: we were talking at the time about Yannakakis, and the thinking was maybe we can optimize the semi-joins, or optimize the rounds of the Yannakakis transfer —
Starting point is 00:44:46 because previously you would do a forward and a backward pass, so maybe we can optimize these rounds to get a better result, to make it practical in a real system. But then I proposed: just use Bloom filters, and let's combine them. That's a very good example, I feel. And for some others, basically, they gave me the hints
Starting point is 00:45:03 and I pick what i feel most interesting or most reliable ones yeah that's nice we have many ideas yeah yeah that's it yeah i guess using them as a sounding board and like leaning on their experience to sort of fine-tune your ideas and and then yeah but i also i actually i get where you're coming from on the kind of not like kind of reinventing something for oring something from scratch, but taking an existing invention and iterating on it and refining it. I mean, that's definitely an important part of the creative process as well, I think, is like standing on the shoulders of giants, right,
Starting point is 00:45:41 and building on this prior knowledge and then just tweaking it a little bit and coming up with something really cool and creative. So yeah, I think that's awesome. So yeah, last question now. So what's the one thing you want the listener to take away from this chat today? I see. Wow, that's a pretty good question, right?
Starting point is 00:46:02 Yeah. So I think this smart technique, predicate transfer, can be a starting point for many other projects, and I really hope there will be more people in academia and industry
Starting point is 00:46:12 who join us to explore the very large design space of these tricks, and let's see whether it will lead to a fundamental change in execution in industry. I hope it can happen, just like
Starting point is 00:46:29 worst-case optimal joins — some people say they will be the future, in maybe a decade, for example — and I want our technique to be the same kind of case. Yeah, well, I hope so. I mean, it looks like it to me as well, so fingers crossed. But yeah, everyone's been told now. Yeah, great stuff. So yeah, I guess that's a wrap then, Yifei. Thank you so much for coming on the show.
Starting point is 00:46:50 It's been a fascinating chat. I've learned a lot. It's been a fun wander through query processing. So yeah, thanks for coming on. And just to the listeners, if you want to support podcasts, please consider making a donation through Buy Me A Coffee, and another reminder that we have a survey out at the moment, so go and fill it in and have your
Starting point is 00:47:10 say on how we can improve the podcast, and yeah, awesome stuff, and we'll see you all next time for some more awesome computer science research. Thank you.
