Coding Blocks - Hierarchical Data cont’d – Path Enumeration and Closure Tables
Episode Date: June 29, 2015
It's that time again. This week we answer a question, Allen registers for school, Joe reads some numbers, Michael breaks out the survey results, and Joe cringes at the thought of bidets. It's time for episode 29! And we thought, what better to talk about than to continue our discussion on hierarchical data solutions.
Transcript
You're listening to Coding Blocks, episode 29.
Subscribe to us and leave us a review on iTunes, Stitcher, and more using your favorite podcast app.
Visit us at CodingBlocks.net where you can find show notes, examples, discussion, and more.
And send your feedback, questions, and rants to comments at CodingBlocks.net.
Follow us on Twitter at CodingBlocks.
Head over to Facebook at Facebook.com slash CodingBlocks.
Or head to www.CodingBlocks.net and find all our social links there at the top of the page.
And with that, welcome to CodingBlocks. I'm Allen Underwood.
I'm Joe Zach.
And I'm Michael Outlaw.
And let's go ahead and get started with some news.
First, we have some new reviews and...
Yep, big shout out to Sean Lee, Mike Mike, and Dead Iron.
Thank you guys very much, and thanks for mentioning the bidet.
I've been trying to forget about that.
And Mike, that was well done, holding us ransom for the review. That was a good job.
Well now, don't give everyone else ideas. What are you doing? Don't spread that.
And Sean Lee, also, I think that was from Stitcher, right?
Yes.
They were saying that they were very selective about the podcast they listened to, and they
really enjoyed the algorithm one we did.
So thank you very much for the kind review and for putting us into your highly selective
list of listening material.
Yeah.
That's awesome.
Very nice.
Awesome.
We've said it before, and we'll say it again.
Thank you guys for taking the time to write those reviews. We really do appreciate it.
Yep.
So let's get into this. I thought this would be interesting. So we've done a couple of surveys over the past couple of episodes.
Let's start with, I'm going to quiz you two guys. Regarding the first one we did, it was regarding the frequency of the podcast,
which would you prefer most?
And we had a couple options in there,
you know, one hour to hour and a half
versus shorter episodes
that were released more often
in like 10 to 15 minute length segments
or a show that was around once a week
but 30 minutes in length.
Which one of those do you think fared well? What about you, Joe? Just gut feel?
Yeah, I'm going to guess once a week, hour long, like This American Life.
Yeah, that wasn't one of the three that I said, but okay.
But yeah, I'm thinking the once a week was probably pretty high up there. And only because I had seen the original one, I was surprised how many people like the format that we have right now, which was longer.
So once I tabulated everything, and took some of the free-form answers and broke them down into one of those categories, right, just to have a smaller set of items to talk about, it broke down into the current format being tied with shows released once a week at 30 minutes in length.
Interesting.
So the one hour, you both mentioned once a week at an hour.
Well, I said once a week.
I didn't specify the hour.
Okay.
Well, that one was a fairly popular write-in.
I guess it was an oversight on our part that we didn't put that in as one of the default choices.
But it was a popular write-in.
However, it was just as popular as the shorter episodes that were 10 to 15 minutes in length.
Interesting.
So I thought that was an interesting take, that you guys, our audience, were mostly consistent. They mostly fell into two very popular camps, and then the ones that didn't fall into those two popular camps fell into two other consistent camps. It was an even divide among those.
So I guess now we need to figure out what our takeaway from that is, whether we're going to start trying to do some more frequent episodes or whatnot.
Well, it sounds like, sorry Joe, but it sounds like people aren't going to be upset if there were some more. But a lot of people did like the current format too, so we'll have to figure something out.
Yeah, we're tied at 43%, you know, so it's a big drop-off to number three.
Yeah, the first two were tied at 43%. Sorry, I didn't get into that. And then the other two options, one of those was just a popular write-in; those were each roughly six and a half percent. So it was a big drop.
So let's get into the second poll, our survey, which was years of programming experience. You guys have any guesses as to which one you think?
There were five options.
Less than one year, one to two years, two to five years, six to ten years, or 11 plus
years of programming experience.
If you're listening along, you can play along with us and say your vote out loud while these two guys ponder.
I honestly think that probably one to two years was the most popular.
And to be fair, I did not look at these results at all.
Thank you for not.
I have no idea.
Thank you for not cheating like Joe did.
But I'll tell you why.
Because most of the people who write in to us are the ones that, and I think it might
be those people that are more curious because they haven't been in the field yet, but that's why... So you're saying
one to two. One to two would have been the majority.
Joe, what do you think is the
most popular? I could have
flipped the coin. I really couldn't have predicted
anyway, but I was surprised to see
what the answer was.
Okay. Well, yeah.
He totally cheated.
Yeah, the majority of our audience is 11 plus years.
Really?
Yeah, the second most popular bucket was the two to five year mark.
So it was 33% for 11 plus years and 24% for the two to five years.
The one to two year range that you thought would be most popular was the absolute lowest.
Well, it was actually tied for the lowest, at 13%, with less than one year.
You know what's awesome about that to me?
It actually excites me because that means that there's a few things going on.
One, people are genuinely looking to better their programming experience, even though they have 11 plus years of experience, right?
Like, that's amazing.
And the other thing that I find really awesome about that is that you people who are listening right now are actually finding what we're doing valuable enough to listen and, with that much experience, go up there and take our polls.
So, that's flattering and awesome all at once.
Yeah. And that poll obviously hasn't been available as long as the one regarding the frequency of the show, so it'd be interesting to come back and see how it looks as time goes on.
But yeah. Awesome.
Thank you very much for going and filling in those surveys, guys.
That's excellent.
I say guys, but gals as well. So everyone, thank you.
So, next up on the list.
So right before we started recording tonight, we got an email from Lou, and he's got an interesting situation, because he's later in his career and he's getting ready to go back to school from scratch, essentially going through a four-year school to get into programming. And he's kind of wondering, you know, is this a bad thing? Is this something that we've seen? I mean, do people actually come out as junior-level programmers later in their career?
Here's my take on it, and both these guys will also fill in on this. I don't think there's ever a time that's too late to get into something if you have a passion for it and you truly have a drive to do it. The only thing that I think would really ever hold anyone back is if you financially can't afford to take that step back.
It sounds like he's fortunate enough that he can.
Yeah, if you can, do it.
Good for him.
My take on it is this. There are never too many people who do a fantastic job at anything. So if you come out of this thing and you become an excellent programmer, you hone your skills, you're thorough, you think through problems well, you do all that, there's room for you. And it doesn't matter what your age is, if you're somebody who's easy to get along with, you communicate well, and you become skilled at your craft.
Absolutely, man. And I will say, you mentioned not getting into the field until closer to when you graduate. My recommendation would be to try and start getting work a little bit earlier on, even if it's a junior-level type thing, or go to one of those sites we mentioned previously, like topcoder.com or Project Euler or some of those things. Yeah, Euler. So do some of those things, because you would be shocked at how much better you'll get and how much faster it'll happen if you get involved in doing that kind of stuff. And then you can put those kinds of things on your resume, or even put code into a GitHub project.
So I'll let these guys carry on with this, but that's my two cents on it.
Yeah.
Yeah.
I mean, so as a 21-year-old, I will say that, you know, my take on it is that it's definitely...
I feel like there's never a wrong time if you feel like you need to reinvent your life.
I don't feel like there's ever a wrong time to do that.
If you feel like you need to do it, if it's something that you want to do,
if it's something you're passionate about, go for it.
What frustrates me, though, and the problem is that, you know, there have been people that have put data together to back it up. And I'm not saying I've bothered to investigate how accurate the data is or whatever, but here's an example that I quickly Googled that bothers me. It's maybe a well-known quote from Mark Zuckerberg.
And he says,
I want to stress the importance of being young and technical.
Young people are just smarter.
Why are most chess masters under 30?
I don't know.
Young people just have simpler lives. We may not own a car. We may not have family. Simplicity in life allows you to focus on what's important.
And I kind of understand what he's getting at, but it's frustrating to say that, like, well, if you're not in your 20s, then you can't solve technical problems.
Yeah, that's not so much an issue of getting a late start as it is just not being that age anymore. And it kind of implies that people are going to age out of the peak of their career early.
Well, and that's kind of like, going back to Lou's email, though. I mean, he was talking about companies being willing to hire someone in their upper 30s to early 40s; that was the age range that he had questioned.
And that's why it bothers me.
Like when you see, here's the head of a major dot-com company, right,
making a statement like that, and you're just like, wow.
Yeah, what's he say behind closed doors?
Exactly. I'm sure the official HR legal answer is that,
no, we'll take anyone.
But he's kind of vocally saying,
being quite public about it.
I will say, if you take it a step further than that, I have seen and heard of companies where, when you start getting closer to retirement age, when you're in your 50s, then it becomes a question, right? Like, do we want to bring on this guy that we know is basically going to be leaving in three years? That's always a question. I think the age range that he's talking about, that he'd be in, that's not usually the question.
But there are companies, like the Facebooks of the world, that think that, hey, the younger generation is where it's at, probably for several reasons. One, they're cheaper. Two, they don't have all the other things; they don't have families that are in the way, so they can work 16-hour days, and they think it's cool that there's a PlayStation 4 in the break room, right?
I've definitely been in those environments where they're like, oh my God, they let us spend the night here, and, right, you know, we got Xbox.
Yeah. And so, I mean, there are definitely places you're going to walk into that all have different vibes. But again, I will go back to: there is always a place for somebody who is good at what they do and fits on a team well, because there are plenty of people out there who are not good at what they do and don't meld into a team properly.
And I agree, and that's why I worded it the way I did, that if it's something you're passionate about and you want to reinvent your life, you know, go for it. I mean, don't just do something half-heartedly because you think that it might be better. It should be something that you're truly interested in, so that you'll have that drive to want to continue bettering yourself.
I mean, I look at it this way.
I've been doing this career for quite a while,
and I still study new topics all the time.
So just because I'm not submitting homework doesn't mean I'm not still studying.
Right. Has to be something you're passionate about.
Yep. And also, I want to stress the importance of networking, you know, especially with a resume that's going to look a little different than the people you're competing with: being able to go out there and shake hands and get to know people at meetups, and also being able to show something, some sort of side project or something on GitHub or somewhere
else just to kind of show that you do have a passion for this and that you're for real.
That's a really good point.
I mean, even someplace like Stack Overflow, right, where you get cred for answering questions
or that kind of stuff.
Yeah, the importance of networking can never be stressed enough. It really is underutilized by a lot of people, especially in the tech industry.
So yeah, excellent, excellent email. Excellent question.
Hopefully we've given you something to walk away with positively on that.
And the next thing I wanted to do, going into the whole education thing: it's been a few years, because I am also 21, since I took the course on algorithms back in college. And I don't know if you guys know about this, but there are sites out there where you can actually take university-level courses for free. And one of the sites out there is called Coursera.
And there is an Algorithms Part One course that I'm going to take, that starts in July, I think in just a few days. So we're going to leave a link in the show notes for anybody else who wants to join me and maybe see how well they do in algorithms. It's going to go over things like Big O notation.
What's also important to note, though, is that for those that aren't familiar with Coursera, this is a Stanford University class.
It's major.
And you also get, it's not like you get college credit for taking it,
but you actually get a certificate of completion and that kind of stuff.
Like it's real.
You have to do assignments, they get graded, you'll have quizzes, and you basically watch videos
throughout the week. And I think they said that it's about six hours a week of coursework and it
lasts six weeks. So anybody else who wants to polish up on their algorithm skills, uh, you know,
come join me on this. And like I said, we'll leave the link in the show notes.
And one cool thing about it too is, any of you guys who are interested in interviewing at the Microsofts, the Googles, the Amazons of the world, something like this is pretty much crucial. So, you know, that's it.
Yeah.
Great.
All right.
Also, I wanted to mention, and I got the notes out of order a little bit here, but NDepend 6 is out. We actually just heard.
So I think they fixed a couple of your gripes, Alan.
They did, which is awesome. So, one of the two gripes that both Outlaw and myself shared was theming. NDepend 5 did not honor your theme. You could change it to dark, you could change it to whatever you wanted; it didn't care. It was just going to stay with its white theme. And then the other thing was the whole double-clicking on the column dividers in a grid; it would just collapse everything down to nothing. They fixed that as well.
So these guys listen. I mean, can you say that about a lot of companies out there writing software that you use day to day? No. It's cool. It's a coder tool written by a coder, you know, so he eats his own dog food, and that's really cool.
Yeah, that's most excellent. So if you guys haven't checked it out and you need a static analysis tool to tell you how good or bad your code may be and areas to improve upon, definitely check out NDepend. And if you're curious to hear our previous review and conversation about this product, you can go back to episode 15; it was titled NDepend, on how good your code is.
Yep. Very nice.
Okay, so I think that brings us to the topic.
So this week we're talking about
other solutions for storing hierarchical data.
In episode 28, we mentioned two, adjacency lists and nested sets.
So we're going to be probably comparing back
to those a little bit.
So feel free to go back an episode
if you want to catch up on that.
And this time we're talking about two new strategies for solving the same problem or
similar problems, uh, path enumeration and closure tables.
Yep.
So let's go ahead and get rolling on this one.
Path enumeration is also referred to as materialized path.
I've seen that written in a few places.
So those two terms both refer to the same thing. And this one's actually fairly simple. If you think about a breadcrumb, that's essentially what this is. So if you have a hierarchy of items, which again, continuing on the hierarchical path here, then essentially the way the data is stored is it will be your root node, slash the sub node, slash the sub node, slash the sub node. And it doesn't have to be slashes. It could be periods; I've seen them pipe-delimited. The delimiter is not important. It's just the fact that it's basically stored in a string: parent slash child slash grandchild slash great-grandchild, etc.
Yeah. And so the concept's super easy. I mean, you guys have probably seen this stuff written out. It's very similar to what you see in your Windows Explorer, right? If you go to your drive, you see C: and then a folder, and then another folder. It just subs all the way down.
You know, in all of our prepping for the show, that analogy that you just came up with, Windows Explorer, never once came up, and it is the most beautiful way to visualize what a materialized path or path enumeration is. It is exactly what it is: C colon slash Program Files slash Microsoft Visual Studio 12 slash Common slash IDE.
Okay, so I've gone there a few times. So if we imagine a database representing our file system, basically, then I would have a record for, say, my hosts file, and associated with that I would have a field that contains the full path to that item. And that's really nice, because I can say things like: select star from files where path like 'C:\Program Files%'.
Yep.
Right?
Yep.
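The LIKE-based descendant query they're describing can be sketched with SQLite from Python. The `files` table, its columns, and the sample paths are all made up for illustration; the point is just how a materialized-path column gets queried:

```python
import sqlite3

# In-memory database with a path-enumeration ("materialized path") table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, path TEXT)")
conn.executemany(
    "INSERT INTO files (name, path) VALUES (?, ?)",
    [
        ("Program Files", "C:/Program Files"),
        ("etc",           "C:/Windows/System32/drivers/etc"),
        ("hosts",         "C:/Windows/System32/drivers/etc/hosts"),
        ("notepad.exe",   "C:/Windows/notepad.exe"),
    ],
)

# Every descendant of C:/Windows is one LIKE away: no joins, no recursion.
rows = conn.execute(
    "SELECT name FROM files WHERE path LIKE 'C:/Windows/%' ORDER BY name"
).fetchall()
print([r[0] for r in rows])  # ['etc', 'hosts', 'notepad.exe']

# The reverse direction mentioned next (rows *ending* in 'hosts') needs a
# leading wildcard, which an ordinary B-tree index can't help with.
rows = conn.execute("SELECT path FROM files WHERE path LIKE '%/hosts'").fetchall()
print(rows[0][0])  # C:/Windows/System32/drivers/etc/hosts
```

The prefix query can use an index on `path`; the leading-wildcard query forces a full scan, which is the indexing complaint raised a moment later in the conversation.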
And then also, you could even take it a step further. Now, here's where it kind of sucks, though, because indexes don't play nicely with this. But you could actually say where path like '%hosts', right? And then it would go get you all the paths that have a hosts file as the very last string in there. So you could do that.
Again, SQL indexes usually aren't smart enough to be able to reverse up like that. You'd actually need a full-text index or something to make that fully efficient. But yeah, I mean, it's really easy to query.
Well, but see, here's the thing. And I was really hoping that maybe one of you two guys would be able to talk me into it, because I've never been a fan of this. Anytime I've seen it in a table for any reason, where the purpose was to just flatten out the hierarchy, I've never been a fan of it. You say that it's easy to query, and I would agree that it would be easy to write the query. But from an optimization or a performance point of view, you're taxing the server, because it's doing a bunch of string manipulation to do that search. So while the query might be easy to write, the work that the server is doing isn't as easy as other hierarchical data models that we've talked about.
Yep. It's not normalized at all. And there's no integrity. There are no referential integrity checks. It's a string.
So my question to you guys then is: talk me into why I would want to use path enumeration.
I won't.
Yeah.
Yeah, I mean, it is convenient.
I've done it.
You know, that's fair.
I asked for it.
Yeah, there have been times I've done this before. I consider it an anti-pattern in databases, but there have been times when I've popped a comma-delimited list into a field for convenience, and sometimes that's all you need to do. And it's not great. It's not scalable. It's a lot of extra data. There's no integrity. Indexes are kind of yucky. But it's quick and dirty.
Well, and I'll give you another thing. If you take it a step further, when you start talking about database architecture and how it works behind the scenes, if this list gets big, now you're going to have your data splitting across page files. Like, your performance can really start taking a major hit.
Oh, and what if you decide to change the middle of that structure?
Yeah, and that's...
You lop off one of those tree nodes, and now you have to go back and rewrite all those strings.
So here's why I will not ever talk anybody into doing this one. If you look at it from the perspective of maybe a product catalog in an e-commerce situation, let's say that you have 10,000 products that all existed under some subcategory, and all of a sudden you're going to get rid of that subcategory and you're going to put it under another one. Typically, the way these things are maintained is through triggers. So if you change what category it belongs to, it's going to go back through and automatically update all those fields. It is horrible, because now you made that one change where you just moved all those products from 1.3.
This is you trying to talk me into it?
No, no, no. I'm telling you why I would never.
Oh.
So if you moved all those 10,000 to something else, it's now going to update all 10,000 of those products. And it's going to have to re-query to find out the entire structure for each one of those to then update it. And the problem is, okay, that sounds bad enough, but when you start introducing that into something like a transactional system, you start locking those tables, and so everybody else that's trying to access that data is now shut out.
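The rewrite-everything cost of moving a subtree is easy to see in a sketch. The `products` table and the numeric `1/3`-style paths are invented for the example; with 10,000 rows the same UPDATE touches all 10,000:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, path TEXT)")
# A few products living under category path 1/3.
conn.executemany(
    "INSERT INTO products (path) VALUES (?)",
    [("1/3/101",), ("1/3/102",), ("1/3/103",)],
)

# Move the whole 1/3 subtree under category 2: every descendant row must be
# rewritten with string surgery, inside one transaction, holding locks the
# whole time.
cur = conn.execute(
    "UPDATE products SET path = '2/3/' || substr(path, length('1/3/') + 1) "
    "WHERE path LIKE '1/3/%'"
)
print(cur.rowcount)  # 3 -- one string rewrite per descendant
print([r[0] for r in conn.execute("SELECT path FROM products ORDER BY path")])
# ['2/3/101', '2/3/102', '2/3/103']
```

In a normalized parent-id design the same move would be a single-row update; here the work grows with the size of the subtree.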
So, I don't know if we gave an actual concrete example, but let's just say for simplicity, you have three records, okay? Record one, record two, and record three. And one is the parent, two is a child of one, and three is a child of two. So your path enumeration might look like one slash two slash three on record three, and record two would be one slash two.
Yeah.
And record one would just be one.
Correct.
Right. So there's this column that could be on the same table as those records, or it might be in some other table. But the point is that all of that data would be in that one column. So kind of like how we talked about adjacency lists, where every record knew its immediate parent, in path enumeration every record knows its entire lineage. That is the one and only pro that I can come up with.
Oh, there's actually a really good one. A huge advantage over nested sets and adjacency lists is you can actually have multiple lineages. So if you break this item out from its lineage, you can actually have a breakout table and have multiple lineages. So in a file system example, like we talked about, you could have linked files or soft links. In an e-commerce example, if you had a gaming mouse, you could have that under electronics, computers, accessories, mice, and you could also have it under toys and games.
Okay, okay. That's cool. There's another example.
All right, so fine. Multiple parents, and every child knows its entire lineage.
But what absolutely sucks about it is, let's say that you do have this lineage and it's five deep, right? And you want to know, instead of just the ID of the middle category, the name of that category. You now have to parse that string.
Oh, right.
You have to split that string on whatever the delimiter is and say, okay, now I have to assume that's an ID. And I say assume because there's no referential integrity there.
Right.
And now you're going to have to say, okay, that was supposed to be a number. Let me try and join over to this category table to get it, and hope that nothing blows up.
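That parse-and-hope step looks something like this in application code. The `categories` table and the ID-based path string are hypothetical, and notice that nothing in the schema guarantees each token is a real category ID:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [(1, "Electronics"), (2, "Computers"), (3, "Mice")],
)

path = "1/2/3"  # a path-enumeration string stored on some record

# Split the string, *assume* each token is a category id (no foreign key
# enforces it), and look each one up to build the human-readable trail.
ids = [int(tok) for tok in path.split("/")]
names = []
for cid in ids:
    row = conn.execute("SELECT name FROM categories WHERE id = ?", (cid,)).fetchone()
    names.append(row[0] if row else "???")  # a broken link fails silently

print(" > ".join(names))  # Electronics > Computers > Mice
```

If a category were deleted, the path string wouldn't complain; you'd just get a hole in the lineage, which is exactly the referential-integrity complaint being made here.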
See, here's, in my mind, and this is true for any type of code that you're ever going to run into in your life: if you're doing any kind of serious string manipulation in order to do something, like you have to manipulate some string, and you're running a regular expression against it to start getting out tokens so that you can decide what you need to do, or what actions to take, that's a serious code smell. Something's wrong. And that's what path enumeration feels like to me. Just like you said, you've got to have some regular expression to break that apart into its separate tokens so that you can get, what's the third ancestor above this particular record? It feels wrong.
Yeah, and it is. I would never, ever, ever recommend doing this.
So when would you... but that's the point, is that there must be some value to it, or else it wouldn't be a thing, right? No one would do it. It wouldn't exist unless somebody found value in it. Is there a particular use case where path enumeration will just rock your world?
I'll give you a few. If you don't have a database system that supports something like a CTE or any kind of hierarchical data set type stuff, and MySQL is an example, you can basically get away with doing this. Because let's think about it in SQL Server in the olden days, before you did have CTEs.
The olden days.
The olden days, back in 7. The way that you would do this kind of thing, a lot of times this is what you'd see, because there was no built-in way to do it, and the only way to do what you needed to do was to write some sort of stored proc that would loop over things with a cursor, usually, and then insert into some temporary table to get your hierarchy.
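For contrast, the recursive CTE approach referenced here, the modern replacement for those cursor-and-temp-table stored procs, walks a plain adjacency list at query time. SQLite supports `WITH RECURSIVE`, so a sketch (with an invented `categories` table) looks like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO categories VALUES (?, ?, ?)",
    [(1, None, "Electronics"), (2, 1, "Computers"), (3, 2, "Mice")],
)

# A recursive CTE expands the hierarchy at query time. Each row stores only
# its immediate parent, so referential integrity holds and nothing needs
# rewriting when a node moves.
rows = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM categories WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, c.name, t.depth + 1
        FROM categories c JOIN tree t ON c.parent_id = t.id
    )
    SELECT name, depth FROM tree ORDER BY depth
""").fetchall()
print(rows)  # [('Electronics', 0), ('Computers', 1), ('Mice', 2)]
```

At the time of this episode MySQL had no CTE support (it arrived later, in 8.0), which is why path enumeration or closure tables were the workarounds there.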
So, Joe, I think what he's telling us, and you tell me if this is what you took away from that, Joe, is that this is an outdated way to do it.
Yeah, if only there was a way to solve this problem and store these associations but maintain referential integrity.
Woo-hoo!
All right.
So is that your way of telling me that we should segue off of this?
I think that was a sagoo.
Yeah, I think so too.
Well, before we jump into that, though, let me take one quick moment to say that, you know, you've probably heard me ask for this before, but if you haven't already, we'd greatly appreciate it if you would take the time to go to Stitcher or iTunes, wherever you prefer to find your podcasts, and drop us a review. It helps others to find us, and we greatly appreciate that. And when you take the time to do that, we actually see the results of it, and it really makes us feel good when we read some of those reviews.
Yeah, it's huge.
So please do. And to make it easier on you, you can go to www.codingblocks.net/review and then pick your poison: iTunes, Stitcher, whatever. So please do that.
Also, I do want...
We don't touch on this very much, but our show notes: we take pride and put a lot of blood, sweat, and tears into doing the show notes, and we try to make them work whether you're using an iPhone or an Android device or a tablet of some sort. Like, I know you guys use the podcast app on iOS.
Yeah, the built-in iOS podcast app.
I use Pocket Casts on Android.
And if you go to the show notes on any one of the episodes, we've got it set up to where you can click the links easily to go look at whatever we're talking about.
Yeah, you can read the full-on show notes straight from the podcast that you download.
Yeah, you don't even have to go up to the website. We're not trying to tell you to stay away from the website, but we want to make it easy on you. We want you to have the information.
Yeah, so you can go there. And these polls and these surveys that we mentioned, you'll actually have the links there in the actual show notes as well. So if you want to be able to participate but you forget or whatever, you know, while you're driving, just go... I'm kidding, not while you're driving. You can go to the show notes, and literally we'll have the links there and everything else. So if you weren't aware that you can see those directly in your pod listener, you probably can; just explore a little bit. And for those that do like to hit up the site, if you haven't already, be sure to leave some feedback on each episode, too. You can go down there and leave your comments directly on a particular show.
Yep.
All right, so to kind of bring us back to the topic we were just talking about, what's it called? Path enumeration. Crap enumeration? Oh, yeah. Path. Sorry.
Yeah, we're not a fan of this one, but you basically associate the breadcrumbs with your record node, and it's easy for querying, but it's gross to maintain, and it just feels dirty.
No, it's not easy to query. It's easy to write the query.
It's easy to write the query. I like that. There's a difference.
There is. Okay.
All right, so that moves into: if only there was a better way. And there is. Coincidentally enough, one that works with any database system out there. I don't care what it supports. We're talking to you, MySQL. The closure table method.
Wait, is this a JavaScript thing?
It could be. So let's see if I can somewhat put this into words. I want to go three levels deep, like what Michael said a little while ago, just to keep things simple. Starting off, level one is the top, level two is a child of level one, and level three is a child of level two.
All right. The way a closure table works is this.
It's off by itself.
And I'm going to,
in the same vein that,
that,
uh,
Michael told me on the nested set,
I'm going to go ahead and throw in this additional information on this one as
well,
though,
in a closure table,
you have an
ancestor column a descendant column and then you don't have to but you add in like a depth or a
level column and that's easy for querying but essentially what you've got now is this this
table is going to be every single combination of every record and its associated descendants or ancestors
that that exist period so first off let's do the big o because we actually had somebody
ask about big o notation on the nested set model and we went looking for it i think out
he's gonna go look for it while i while i talk through this i forgot to i forgot to even mention that at the start of the show so with here's here's the one downside
to a closure table is given big o notation it could actually be o n squared all right that is
absolute worst case scenario which almost never happens because that means everything would always
be at the very bottom level but here's how it. But that's a pretty bad one, though.
It would be bad,
but it doesn't ever happen that way.
O of N squared is bad,
but it won't typically ever happen that way.
It's usually quite a bit less than that.
So just to make sure I understand it,
we're talking about basically storing
the association between my node
and everything in the hierarchy.
So if we're looking at the 1, 2, 3 example,
that means I've got a row in there for 1, 3,
a row in there for 1, 2,
and a row in there for 2, 3.
Well, let's take it all from the top
because you missed a few.
So you're going to have 1, 1 to itself
with a level of 0.
Then 1, 2, level of 1, because that's his child; then 1, 3, level of 2, because it's his grandchild. But then you're also going to go down to the next one.
So hold on, because so far,
we've only described the hierarchy
for one part of the node,
and he's already added three records.
Where the ancestor is one
Right, so all these were ancestor 1. All right, now you've also got to do this for ancestors 2 and 3. So now you're also going to have a record for 2, 2, level of 0, because it's itself, and 2, 3, level of 1, because that's its child.
so now wait we had
3 records now we're at 5
now we're at 5 records
now for that bottom record number 3
you've also got 3 3
level 0 so now you've got
6 records for that
1 child all the way
at the bottom
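Just to make those six records concrete, here's a small sketch in Python with SQLite. The table and column names (tree_paths, ancestor, descendant, level) are only illustrative, not anything official from the show:

```python
import sqlite3

# Closure table for the simple chain: 1 is the top, 2 is a child of 1,
# and 3 is a child of 2. Every ancestor/descendant combination gets a row,
# including each node paired with itself at level 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree_paths (ancestor INT, descendant INT, level INT)")
rows = [
    (1, 1, 0),  # node 1 to itself
    (1, 2, 1),  # 2 is a child of 1
    (1, 3, 2),  # 3 is a grandchild of 1
    (2, 2, 0),  # node 2 to itself
    (2, 3, 1),  # 3 is a child of 2
    (3, 3, 0),  # node 3 to itself
]
conn.executemany("INSERT INTO tree_paths VALUES (?, ?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM tree_paths").fetchone()[0]
print(count)  # 6
```

Three nodes in a straight line, six rows total, exactly as counted above.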
Now it sounds bad, right?
It does sound a little bit bad, but here's the deal. The reason for O of n squared: if you assume that everything was always at the bottom, that's why you get n squared, because basically it would have to do it for everything up the chain. You're not always going to have everything down at the bottom level, so that's not going to be it. It's definitely going to run into a decent number of records, but let's talk about why it actually is cool this way.
First off, you're storing... typically, if you're doing GUIDs, you should probably just go shoot your server right now, but typically you're storing integers, right, for referential lookups. So querying these things is uber fast, incredibly fast.
Inserting is actually pretty easy.
And assuming that you don't have a crazy number of hierarchy that's 100 levels deep,
then the inserts are also going to be pretty quick as well.
That's the problem, it seems like, with the closure table, though,
is that it really is, if you have a crazy tall tree, then this is going to
break down from a performance point of view.
No, it probably won't break down except for writes. Your reads are going to be still just incredibly fast, but your writes could run into an issue.
But again, keep in mind, this table is not unlike the adjacency list
or some of these other methods we've talked about.
This is usually a standalone table
that is literally ancestor, descendant, level.
It's three columns.
So, and these columns are typically integers,
not always.
If you have some sort of key that's not a surrogate key,
then it might be something like a
GUID, or it could be a string or something. But let's say, for simplicity's sake, that all your records were using integer indexes, right? So basically you have three columns, and they're all numbers. So searching this, even if this was an incredibly tall table, is going to be stupid fast, because you can index it. You can index it based off ancestor, off descendant, and then you could also tack a level onto each one of those. And now you can query this thing twenty which ways from Sunday, and it's going to be just fast.
So, just to put this into different words, right, in case it's not already clear.
So what this is doing is it's storing every path from each node to each of its descendants.
Correct.
Right?
So there's a couple things that I really like about the closure table.
Number one, like you mentioned, simple integers,
assuming you're using integers for your ID columns, that you
can easily index.
I like the fact that you can get the referential integrity there.
That's awesome, right?
I like that, like the nested set model, you can easily get an entire tree. Yep. Right?
Where this one has an advantage over the nested set model, though, is that you can, without having to do any kind of crazy math,
you can just easily get, here's the hierarchy.
Yep.
Right?
Where you basically had three columns, an ancestor column,
a descendant column, and a depth column.
Yep. So whatever your, uh,
whatever node that you want the tree down from, you can just easily query, say, select star from whatever your closure table name is where ancestor ID equals the node that you're interested in.
Yep.
The node on the tree that you're interested in, and then you can get all of it from there.
Yep. It's real easy. And, just to give some examples of
what Mike was just talking about. So one of the things that's kind of cool is let's say that you
want all the descendants of something. Let's assume that we named our closure table tree paths. The query is select all from tree paths where ancestor equal N, whatever the number was.
That gives you all its descendants because you're saying, hey, give me everything where I am the ancestor, right?
Super easy.
And then if you wanted to order it, you basically say, hey, order by the level.
Done.
Now you actually have a nice little thing that says,
all right, so here was my child, my grandchild, my great grandchild, whatever. Super easy to do.
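A sketch of that descendants query, with the illustrative tree paths naming and the 1, 2, 3 chain again:

```python
import sqlite3

# "Give me everything where I am the ancestor", ordered by level so the
# result reads self, child, grandchild.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree_paths (ancestor INT, descendant INT, level INT)")
conn.executemany(
    "INSERT INTO tree_paths VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 2, 1), (1, 3, 2), (2, 2, 0), (2, 3, 1), (3, 3, 0)],
)
descendants = conn.execute(
    "SELECT descendant, level FROM tree_paths WHERE ancestor = ? ORDER BY level",
    (1,),
).fetchall()
print(descendants)  # [(1, 0), (2, 1), (3, 2)]
```

Note the level 0 self-row comes back too; filter on level greater than 0 if you only want strict descendants.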
Now let's take the inverse. Let's say, hey, we want all the ancestors of this particular thing. Select all from tree paths where descendant equal N. So where I am a child, give me everything.
Because there's a relationship to every single record
walking all the way up the tree down to whatever that item is. It's very easy, and then you could again just order by the level; in this case you'd probably do level descending to get it going in that order up from you. So again, super easy. Now here's an interesting one, though. Let's say
you're going to insert an entry.
And so Joe and I were talking about this beforehand. And one of the things that you'll
see a lot online are comment systems, right? Like a forum or something. Because if,
when you have a commenting system, right? Like the first person who comments, he's comment number
one. If somebody replies, well, that's a child of that first comment. And if somebody replies to him, then that's a child of that comment. So, so that's
why a lot of times you'll see that. Well, one of the things that's interesting is, all right,
so you insert that comment into your comment table, right? And let's say it's comment 10.
Well, the way that you would insert this into your closure table now, because now it needs to
be able to relate itself to every single comment
that was up above it in the hierarchy, you'd say insert into tree paths, and then you'd say your ancestor, descendant, and level. Now, instead of just a values-type insert, you're going to say select ancestor, and then whatever that new comment ID is, so 10, and then do level plus one, from tree paths where descendant equal whatever the immediate parent to that comment was. And then that is literally going to insert the records for all the parent records all the way up. And then you'd also have to insert itself as well, so you'd say insert into tree paths 10, 10, 0, right?
So it's pretty simple.
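As a sketch of that insert pattern, here's a hypothetical new node 4 replying to node 3 (rather than the comment-10 example): one insert-select links it to every ancestor of its parent, and one plain insert adds the self-row.

```python
import sqlite3

# Start from the existing 1 -> 2 -> 3 closure table (illustrative names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree_paths (ancestor INT, descendant INT, level INT)")
conn.executemany(
    "INSERT INTO tree_paths VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 2, 1), (1, 3, 2), (2, 2, 0), (2, 3, 1), (3, 3, 0)],
)
new_id, parent_id = 4, 3
# Copy every row where the parent is the descendant, bumping level by one,
# so the new node is linked all the way up the chain in one statement.
conn.execute(
    "INSERT INTO tree_paths (ancestor, descendant, level) "
    "SELECT ancestor, ?, level + 1 FROM tree_paths WHERE descendant = ?",
    (new_id, parent_id),
)
# And the node-to-itself row at level 0.
conn.execute("INSERT INTO tree_paths VALUES (?, ?, 0)", (new_id, new_id))
ancestors = conn.execute(
    "SELECT ancestor, level FROM tree_paths WHERE descendant = ? ORDER BY level",
    (new_id,),
).fetchall()
print(ancestors)  # [(4, 0), (3, 1), (2, 2), (1, 3)]
```

The new node ends up related to itself, its parent, its grandparent, and the root, all from two inserts.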
Now, here's the thing.
Like we said, if this is a really tall tree or a really deep hierarchy,
that could be 100 records if you're 100 levels deep, right?
Oh, actually, it's much worse than that.
I just pulled some numbers for fun.
Oh, yeah, it would be worse than that.
Yeah, it's much worse.
And this is where that N squared comes in.
So basically, in our simple example of 1, 2, 3 with the depth of 3, there were 6 records.
Now, if we had 10 records with that same depth structure, so 10 is a child of 9, is a child of 8, is a child of 7, then you end up with 220 records.
Woo!
100 records.
But that's not N squared yet.
Well, it's actually
pretty close.
So if you had 100 records, right?
Now we're looking at 10,102.
Actually, wait.
10 would be worse than N squared.
What did you say it was?
N is 10.
And we end up with 220 records.
Yeah, that's worse.
Yeah.
And that's where like...
The depth.
Yeah, because I was going to ask you,
if we had to decide...
Okay, so we talked about the nested set model
in the last one.
Hold up, though.
What did you say?
What was that other one after 10?
1,000, yes.
Oh, good God.
He said, oh, for 1,000? Oh geez, what that would be, I have no idea. Let me do some math. I'm going to say a million... uh, half a million. It's basically n times n plus one, divided by two. It's just a basic summation. But yeah, we're looking at 500,500. 500,500.
Easy for you to say.
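For the record, that summation is easy to sanity-check. This assumes the worst case being described: a single linear chain, every node sitting under all of the ones above it.

```python
# Worst case for a closure table: an n-deep linear chain, where each node
# gets one row for itself plus one for every ancestor above it.
# That's 1 + 2 + ... + n = n * (n + 1) / 2 rows.
def closure_rows_linear_chain(n: int) -> int:
    return n * (n + 1) // 2

print(closure_rows_linear_chain(3))     # 6, matching the 1 -> 2 -> 3 example
print(closure_rows_linear_chain(1000))  # 500500
```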
Okay.
So here's the part.
Before we go too much further.
All right.
Does anybody have a tree that's 1,000 levels deep?
Right?
Yeah.
If you do, we need to talk.
Yeah, that's 1,000 items with a depth of 1,000.
Yeah.
Okay.
Yeah.
So before we
start bashing on the closure table,
I did want to come back with
one comment that you haven't said yet.
And that is that closure
tables allow for multiple parents.
Yes, much like the other one that we were
talking about. The path enumeration.
Yeah, you can do as many as you want.
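A quick sketch of the multiple-parents point, with made-up values: node 3 here has both node 1 and node 2 as direct parents, something a strict tree model can't represent.

```python
import sqlite3

# A closure table over a DAG instead of a tree: node 3 has two parents,
# so it simply gets one level-1 row per parent (illustrative names/values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree_paths (ancestor INT, descendant INT, level INT)")
conn.executemany(
    "INSERT INTO tree_paths VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 2, 0), (3, 3, 0), (1, 3, 1), (2, 3, 1)],
)
parents = conn.execute(
    "SELECT ancestor FROM tree_paths "
    "WHERE descendant = 3 AND level = 1 ORDER BY ancestor"
).fetchall()
print(parents)  # [(1,), (2,)]
```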
So now let's bash on closure table.
Okay.
Because it could be actually much worse than this.
So in the last episode, we talked about adjacency list and nested set models, right?
And I think we, at least Alan and I, are big fans of the nested set model.
Joe's kind of indifferent.
Is that fair to say?
Right.
So my question, though, would be, I'm going to
make you pick now, Joe.
Because you don't really care about adjacency list versus nested set model, but closure table versus nested set model?
I'm putting you on the spot.
Yeah, if it's a true hierarchy, I'm definitely going nested sets. It's easy to query.
It's easier to maintain. But if I
need a true graph, if I need cycles, then I think that closure tables is my solution.
Wait, you said the nested set is easier to maintain?
Well, I think he means in terms of, like, the batch process that's going to do it.
Yeah, well, yeah. You know, easier is tough to define, but to me it's just a lot less data and it's more human-understandable. It's like every time you make a change, it's not 100 inserts or updates.
No, so the nested set model is...
Yeah, you can't do that with a nested set model.
Yeah, they're both horrible.
Yeah.
Thanks, Joe.
You know, let's just screw it.
Let's just do path enumeration.
It supports cycles.
It's simple.
And string parsing isn't that hard.
And that's probably why it's a thing. That concludes this episode. Thanks all for listening!
Right, right. No, so let me hear your take on it, Alan. Same question to you.
Here's my thing: I do closure pretty much just about every time. The only thing that would drive me away from it is if I knew I was going to have a tree depth of greater than, I don't know, 50. If you got there, then it starts becoming a thing,
because here's the deal. If it's performance that you need, but it's transactional, I don't think there's even a question: you have to do the closure. If it's something that you can do in batch, like we talked about with the nested set, we've both done them before, they are super fast to query and actually very easy to write queries against, unless you're doing inserts and updates and whatnot. My thing is, space is really cheap nowadays, right? And SSDs are on the market, and people are using them in servers.
servers so well let me let me clear up one thing because when we're talking about the big o SSDs are on the market and people are using them in servers.
Well, let me clear up one thing because when we're talking about the big O notation,
we're talking like O of N squared rows for closure table, right?
That was the worst case in terms of the number of rows.
So when we're talking about big O notation, we're not talking about in terms of complexity to create it or to query it, right?
So that comes back to a whole nother conversation, right?
Good point. So if we're talking about in terms of space, since that's what you're talking about, then the nested set model is way better. It's going to be O of n.
Yeah, it's going to be whatever, it's one to one. You have this many rows, you've got this many rows. But it does not support multiple hierarchies, and you can get cyclic, uh, issues.
No, no. You can have multiple hierarchies within a nested set table.
What you can't have is multiple parents from a node.
Multiple parents.
My bad.
So with both closure tables and nested set models, multiple hierarchies can coexist in that same table.
But the parents.
Except closure table will allow multiple parents.
Parents.
And here's one other good thing.
So we talked about before, like, if you're trying to batch process something like a nested set model, and there's some sort of, you know, something references a parent that it shouldn't have, you get into the circular dependency. It's going to crash everything out, right? You don't necessarily have that same issue over here. You build those records on the fly, and then it's done. So querying it, I don't think you run into that same circular breakdown that you would with a nested set model.
But here's my thing. Space is cheap. The nested set model, or not the nested set, the closure table is easier to insert records into, easier to delete from, easier to update, and super stupid easy to query. And it's a very small table, and you don't have to batch process it. Like, in my opinion, it wins pretty much across the board except for space. If you're talking about space, and space is at a premium, then this may not be your key. But in just about every other situation, it seems like it wins.
Okay, so I'm going to put you back on the spot again.
Let's say you have 500,000 records.
That's a lot of records.
You were talking about a tree, if your tree was 1,000 deep.
But what we're talking about here is it doesn't necessarily have to be a deep tree,
right?
So you have 500,000,
whether that's, um, 500,000 products, or, when we talked about nested set models, we talked about, uh, employee hierarchies. So it could be 500,000 employees. You have 500,000 records. Which one?
If there are multiple parents, then there's no question, right? Like, you basically gotta go that way. But let's say it's five hundred thousand. What do we say the level is, five deep? Unknown. You gotta have at least some sort of cap. You're walking into it. You don't know. You have some process.
You're either going to create a process that's going to
look at some data and flatten it into a closure table
or it's going to flatten it into a nested set model.
So the first thing I would do is I'd find out how many
levels deep it was.
And then after that, I would say, I mean, I would probably... and this is...
Go back to your little calculator that you had there, Joe.
What's 500,000?
No, no, but that's levels deep.
That's not fair.
So in all fairness, like truly what I would do.
No, no, no, that was supposed to be based on the number of rows, not based on the depth.
Oh, come on.
You said an employee.
Are you going to have 500,000 people and they all report to the person above them?
Yeah, that's not going to happen.
That's not gonna happen. No, no. But what I'm saying is, uh, for the O of n squared, n wasn't the depth, was it? It was the number of rows.
It's the number of rows, but it usually assumes that the deeper you go, the more rows it creates. But at any rate, honestly, trying to solve that problem, what I would do is, I would probably try to do the closure table,
find out how many records it ends up with overall,
given whatever depth, unknown depth and unknown number of reports
at whatever level.
I would do that.
If it only hit a couple million rows, I'd probably stick with it.
500,000 squared is going to be 250 billion.
Yeah, that's not going to happen.
Yeah, I have the number about it.
I just don't know how to say it.
So, actually, I got it.
Okay.
125ity billion.
Zero, zero, zeroity, deep, zippity.
I feel like a Googleplex is going to come in.
You can't Google how to say this number.
But, I mean, seriously, that's like one of those things to where you don't know what's going to fall out until you do it.
So the way that I would approach it is I would probably attempt to close your table first, find out, hey, what's the damage, right?
What's the amount of space this is taking up on disk?
After I index it, how fast does it run?
Because, let's be real, right, any database... This is where I feel like the advantage to nested set models, assuming you don't have multiple parents, this is where the advantage to the nested set model comes in. Because I don't have to know all that. I don't have to take the time to say, like, let me figure out how many levels there are, let me do a test run and see, like, how long did this thing run, how many rows did it create, how much space does that take up. In fairness, that would be another thing that I would take into consideration as well: hey, is this something that doesn't change? Is it not transactional? If it's not transactional, maybe then that's the way to go, because it's going to be faster. Who cares if it takes two minutes for it to create all the records? However, with the closure table, if it is transactional, that's going to be the first thing I look at, because I can't...
Wait, there are some additional requirements, but I wasn't specifying that, right? So, I mean, I like the idea of being able to use straightforward SQL to create, update, delete, and query these things on the fly, right? That's the beauty of the closure table. Um, whereas with the nested set, it's not. You basically do have to create some sort of process to update these things. Now, that being said, there is something really nice to be said for, hey, you have a one-to-one relationship with every single record, right? That's excellent, fast, and easy to actually just look at and kind of see what's going on.
Yeah. I mean, what about you, Joe? Did you give up on all of it? Are you doing a document database structure now?
Yeah, I'm actually looking at other strategies for solving hierarchical data.
I mean, there's...
And this particular show...
Is it called Solr?
We haven't...
Well, I mean, there you go.
That's actually one thing.
But we haven't even...
We weren't even going to talk about it
on this particular show.
But there's always something
like a document database, right?
To where you can literally store whatever you want as subnodes of something,
and then you don't even have to worry about any of it.
We should talk about search engines.
That'd be a good show, I think.
Yeah, I think it would be.
I mean, well, let me... so you teased it a moment ago about the big O for adjacency list and nested set model. And so what I wanted to say was that, coincidentally,
bigocheatsheet.com was the tip of the week in that episode.
And bigocheatsheet.com has all of the operations for adjacency list already spelled out in Big O notation.
Nice. So normally when you talk about big O notation, you usually use N, but these were not.
So, for example, storage for an adjacency list was O of |V| plus |E|, as an example.
But then there were other ones that were kind of easy,
like adding a vertex was O of one or adding an edge was O of one, right?
So you're not going to get much better than O of one.
No.
Now, this is where my research became more difficult, as it related to a nested set model.
What was it in terms of big O, uh, notation for nested set model?
And I couldn't find anything out there.
But if you think about what a nested set model creates and I'm,
you know,
I'm open to feedback,
but as I was thinking about it,
what I came to in my mind that it,
that a nested set model actually ends up being is an n-ary tree or a k-ary tree, depending on which one you prefer to call it, right? It's not a binary tree, but it's an n-ary or k-ary tree, which is essentially the same as a binary tree, except they can have multiple nodes at each level. Well, in a binary tree... okay, let's be clear. In a binary tree, you have at most two nodes, uh, two children per any given parent, right? And the left is always lower and the right is always higher. So in an n-ary tree, you know, zero, one, and two are valid, but it could also be three, four, five, six, you know, whatever. There's no set number of how many nodes any one branch could have on it, right?
And I couldn't find any place in Big O documentation, you know, notation, where this was documented. What I did find, and this was also already on the Big O cheat sheet, is that, specific to binary search trees, the operations, for access, search, insertion, and deletion, are more often than not O of log n, right? For the worst-case scenarios, those same operations are O of n, right? So that's in terms of binary search. Now, both of those scenarios are not bad, right? The, you know, worst-case scenario is you do one for every node. Yes. Yeah. It's not O of 1, but at the same time, it's definitely not O of n squared either.
Right.
But that's just storage.
But that's for a binary tree.
Right. If you have taken the time to do calculations to calculate what its O notation is, then maybe you have a better answer than me.
That was the best I could come up with is that it's got to be somewhere close to that.
I know it's not the same as binary, but I feel like that's going down a good path.
But it really depends on...
But that's the tough part too, though,
because we're talking about SQL, right?
It's not a programming challenge.
Well, but these could be
in-memory structures too.
They don't have to necessarily be...
You could have your nested set model just as an in-memory structure, you know, in your program, right? You could represent your closure table or your path enumeration or your adjacency list or your binary search tree, like, you could represent all of these things as just in-memory objects. Yeah, I don't know how it translates,
though. I mean, I get what you're saying, like, if you're going to do that programming-wise, how you could actually do that search. But when you think about how databases work, right, where they index things so that they know how to quickly find particular elements, like with the nested set model, you're just saying, hey, where it's between this and this. And so I don't really know how to put that into... I don't know how it equates, is what I'm trying to get at.
Like that's a very fast operation in a database.
And I don't know the complexity because I'd have to,
I'd have to dig way deeper than what I ever have to find out exactly how the
database is storing that,
that information and how it would equate to doing it in memory and something
like C sharp or Java or whatever.
So I don't know.
Nested sets are stupid fast and they're highly efficient.
I don't know what the big O notation for it is.
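For reference, the "between this and this" query being alluded to looks roughly like this, with hand-assigned left/right values for the same 1, 2, 3 chain. A sketch, not anything from the episode:

```python
import sqlite3

# Nested set model: each node stores left/right bounds, and a subtree is
# everything whose left value falls inside the parent's bounds. The bounds
# here are hand-assigned for the 1 -> 2 -> 3 chain.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INT, lft INT, rgt INT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [(1, 1, 6), (2, 2, 5), (3, 3, 4)],
)
# The whole subtree under node 1 (itself included): one indexable range scan.
subtree = conn.execute(
    "SELECT id FROM nodes WHERE lft BETWEEN "
    "(SELECT lft FROM nodes WHERE id = 1) AND "
    "(SELECT rgt FROM nodes WHERE id = 1) ORDER BY lft"
).fetchall()
print(subtree)  # [(1,), (2,), (3,)]
```

That single range predicate on an indexed column is why the reads are so fast, whatever the formal big O works out to be.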
Yeah.
So,
so I guess I'm putting the request out there that if anybody else has an
answer as to the big O notation for nested set model, I would love to know that.
Yeah.
So we were actually asked that question on episode 28.
So if you go to www.codingblocks.net slash episode 28, it's down there at the bottom of the page.
If you know what it is, you can reply directly to him. We... I don't think we have yet. So, uh, anyways, yeah, that was an interesting thing that came out of that particular episode.
Yep. But anyways, that pretty much wraps up that section of it. We've pretty much gone over in depth all the hierarchical data structures that maybe we're aware of.
Not all.
There's a bunch, but they start to get weirder
after this.
Those are the primary ones that we've looked at.
There's other combinations
where you combine a nested set model with an
adjacency list or whatever.
There's some permutations that you can get into.
Yeah.
Now it is time for our tip of the... what? Tip of the week? The weeks? Oh, and I do love, somebody left us a review and actually called us out for that. That was awesome.
Oh yeah, well, like, you don't call us out every time. Well, you know. We try to get into this. It's funny.
We need to change that.
Anyways, all right.
I like it just the way it is.
Tip of the week.
All right.
So I'm going first this time.
This is Joe.
My tip, we actually got feedback on that too, saying our names more often.
Tip of the week.
So mine comes from Scott Hanselman again. He had a nice tip a couple days ago about using or setting up Chrome incognito mode as your browser default in Visual Studio.
So when you hit debug, it pops up Chrome in incognito mode, which is going to give you a new session.
It's going to clear all your cookies, all that sort of stuff.
So it's a nice way to get a fresh browser
and a fresh session when you're debugging.
Well, that's really cool.
Huh.
All right.
You mean you don't use IE8?
Right.
All right.
So my tip of the week,
I don't remember what I was doing that required, like, large data sets,
but I was...
That's always the best way to start a story.
I mean, that's...
So there I was.
Yeah, right?
I mean, I don't know.
Anyways, I ended up coming across several resources for creating data sets.
And one of them that I thought was really cool was Amazon.com
has exposed some public large data sets.
Things like NASA NEX.
There's the Human Genome Project.
Common Crawl Corpus.
So this corpus of web crawl data is composed of over 5 billion web pages.
So these are all data sets that are freely available.
So if you're trying to mock something up and you,
and you need some data to be able to test some things out with,
this is a fairly cool little resource where you can go and get real data for
free.
So that's on a aws.amazon.com slash creepy public dash data dash sets.
What's creepy?
I said,
it sounds creepy.
It's, I mean, it's kind of cool.
One of them is the U.S. Census data.
Yep, from 2000, right?
1980, 1990, and 2000.
Yep, so pretty cool stuff if you ever need some real fake data
or some real data to play with for mocking some stuff up.
All right, and so here's a Git one for you.
But I'm going to warn you to use this sparingly.
I was kind of nervous about even talking about this one.
But let's say you've done all your commits,
and let's assume that you haven't already pushed anything.
Nothing has been exposed
out in the wild. You've done all your commits locally while you've been working on something
and now you're ready to bring in the latest version of some remote branch. Let's just assume
it's master and you're ready to package your unit of work up and send it on its way.
Well, you can do a rebase and your pull all at one step
so that you can grab the latest from that remote branch, in this case master,
and replay your commits on top of it to make, when you do your final push
to the remote repository, it looks like just one simple little clean
bubble of commits rather than seeing your commits interspersed through other commits that other team members may have done
And the command is this: git pull, dash dash rebase equals true, and then origin and master. So: git, space, pull, space, dash dash rebase equals true, space, origin, space, master. And that'll do it.
That's actually really cool.
And then you would do your push,
and then that way when all your team members see your commit that you made when you do your pull request,
it'll just look like one clean set of commits that have been built on top of it.
Now I say, oh God, please use this sparingly. For example, use it smartly, okay? Let's say that you're working in some branch and you have already pushed that branch to the origin. Then you don't want to do this. Right. Okay?
You only want to do this if your stuff has never seen the light of day anywhere else.
Because if you don't, then it's going to end up erasing things that were previously public.
And so there'll be no...
Well, not necessarily erasing.
I wouldn't word it that way.
Okay.
What could happen is, let's say that Alan and I are working on a branch, and we're sharing that branch on some remote repository. And so we've already got a bunch of stuff in there, and Alan has based commits off of things that are already there. If I were to come back and do some rebasing on it,
it would change his history
of what he's been creating commits off of.
But there's a bigger problem too, though,
is that if you have already done this push
in this example that we're talking about,
you couldn't even,
if I did this git pull rebase,
git pull rebase true or master command and then tried to push something that has already been published, that push is going to fail because it's going to tell me I have to do a pull first in order to make it match because there's already been history that's been pushed remotely.
Interesting. been history that's been that's been pushed remotely interesting so that's why i say like you you definitely want to use it sparingly and only when history has not already been
published externally yeah oh and uh one thing that's important to know is when you rebase
it doesn't squash all your commits into one. So any commit messages you make that might be somewhat...
You better mean them.
Yeah, might be a little offensive.
You know, like, I can't believe this.
Are you telling me that you used whatthecommit.com to get your best commit messages?
I might have gone one up on that.
But yeah, just know that all your commit messages will still be there
even if you thought they were just going
to be your local copy.
The random commit for the evening is a secret.
Nice.
Yep, that's
I think we've hit it all, right?
That was our tips
of the week.
Wrap it up with a little summary. We hope you've enjoyed the last two shows
about different ways of storing hierarchical data.
Last episode, episode 28,
we talked about adjacency lists and nested sets.
Just to do a real quick recap on those,
adjacency lists is when your node stored the parent ID,
which was simple but scaled poorly.
Nested sets, we were just talking about that,
but each node keeps track of its place
in a hierarchical tree, and it's fast to query,
harder to maintain.
The two we talked about this time, path enumeration,
you basically store the breadcrumbs in simple text
and you parse it out, which is pretty gross.
And then there's closure tables, which is similar,
but with data integrity and can potentially,
but maybe not end up in lots of extra rows.
Yeah, so with that, subscribe to us on iTunes, Stitcher,
or more using your favorite podcast app.
Be sure to leave us a review as well on those platforms.
Tell a friend or three.
Make sure you spread the word.
We really do appreciate that. Yep, and contact
us with a question or a topic. Leave your name and however you'd like to be mentioned on the show. And visit us at codingblocks.net, where you can find our show notes, examples, discussions, and more.
Yeah, and send your feedback, questions, and rants to comments at codingblocks.net, and make sure to follow us on Twitter at CodingBlocks. And that is it. We'll be back soon with another episode. Soonish.
To celebrate the launch of version six of NDepend, they are offering a 20% off discount through the end of July 2nd. I know that's only a few days away, but if you head over to www.ndepend.com and you enter in the offer code NDependV6CodingBlocks, all one word, then you can get 20% off the latest version of their static analysis tool.