Programming Throwdown - Trees
Episode Date: May 12, 2021In another duo episode, Jason and Patrick give an in-depth introduction to trees, their many types, approaches and functions, and their importance in modern programming. Also, peppered throug...hout the episode are the games, books, tools, and ideas that have currently piqued their interest.This episode touches on the following key topics and ideas:00:00:17 Avoiding drama at work00:07:10 News: C++20 (7:10)00:09:37 News: Play Co-op Diablo II in the browser00:12:58 Wreckfest00:15:07 Kaboom00:17:45 The future of remote work00:24:46 Jason’s Book of the Show: Debt: The First 5000 Years00:27:08 fractional-reserve banking00:31:30 DeFi, distributed finance00:33:08 Patrick’s Book of the Show: Harry Potter and the Sorcerer's Stone, the Illustrated Edition00:35:49 (Ad) Audible00:37:05 Jason’s Tool of the Show: Vagrant00:41:04 Patrick’s Tool of the Show: Zach Gage Games00:45:03 (Ad) ConfigCat00:46:03 feature flags00:47:03 Trees: why are they important? 00:49:43 The divide and conquer approach00:51:34 The agglometric approach00:55:57 Choosing the right tree and algorithm00:57:56 Keeping trees balanced01:01:10 binary trees01:02:52 binary trees and machine learning01:05:28 b-trees01:10:04 spatial trees: the k-d tree01:16:50 k-d trees and multidimension01:18:42 quadtrees and octrees01:21:44 r-treesResources mentioned in this episode:BooksDebt: The First 5000 Years, by David Graeber https://amzn.to/3uKEoe9Harry Potter and the Sorcerer's Stone, The Illustrated Edition, by JK Rowling https://amzn.to/2R6ILSsGamesDiablo II browser game http://clouddiablo.com/Wreckfest https://www.thqnordic.com/games/wreckfestZach Gage Games http://stfj.net/ToolsVagrant https://www.vagrantup.com/Kaboom https://replit.com/kaboomArticlesArticle on C++20: https://oleksandrkvl.github.io/2021/04/02/cpp-20-overview.htmlThe debate over remote work: https://www.bbc.com/news/technology-56771539Get ConfigCat: https://configcat.com/Get Audible: http://www.audibletrial.com/programmingthrowdownIf you’ve enjoyed this podcast, you can listen to more programming news and updates like this one on Programming Throwdown’s website: https://www.programmingthrowdown.com/You can also follow Programming Throwdown on Facebook | Apple Podcasts | Spotify | Player.FM You can also help support Programming Throwdown through our Patreon. ★ Support this podcast on Patreon ★
Transcript
Discussion (0)
programming throwdown episode 112 trees take it away jason, everybody. So this is going to be a pretty cool episode.
Another duo episode, as we talked about a little earlier.
We got some awesome folks.
Literally, the company is called Awesome Pros.
So check them out if you're interested for your podcast.
But we got some awesome folks helping us out on the editing recording side.
So we're able to produce more content, which is super exciting.
And we've had a bunch of ideas. We have a whole list of ideas lined up of stuff that we want to
chat about. And today we're going to be talking about trees, computer trees, although there is
that whole thing about planting a million trees. Have you heard about that? On YouTube? I thought,
you know, that's funny. I thought the same thing. People are going to assume we're talking about
fundraising for planting trees. Yeah. Honestly, I don't know very much about it. I thought, you know, that's funny. I thought the same thing. People are going to assume we're talking about fundraising for planting trees.
Yeah. Honestly, I don't know very much about it. I mean, it sounds cool. Sounds like, you know,
planting trees sounds like it's pretty, you know, universally like a good thing to do.
But no, this is about computer trees. Yeah, we're going to dive really deep into that.
But first, we'll talk a little bit about avoiding drama at work. So dramatic, you know, this,
so there is, you know, a lot of sources of drama at work. There are, you know, political things,
there are various sort of social situations, or like, I don't know, social phenomena. I mean,
there's also just, you know, other than the macro stuff, there's also just like what we call like office politics. You know, this person, you know, is in a bind and they kind of backstab this other person.
And then, you know, there's all of this drama. And I'm here to tell you to avoid the drama.
And we're give some tips on how to avoid drama at work. You know, maybe we'll just ping pong
back and forth. I mean, my biggest thing, my biggest piece of advice that really like it's, it's actually, you kind of influence people just by being yourself,
right? And so you don't really have to go out of your way to tell people how you feel about things
that are like, besides the work you're doing, right? So I mean, you know, just by being,
you know, yourself and having opinions that you're proud of, you know, and all of that, you know, that that's kind of in my opinion, that's kind of enough.
Right. I mean, you don't then have to, like you might see a lot of things and feel compelled to say, oh, like I want to argue with this or I want to go back and forth, you know, and the way that you kind of think about things is already being expressed just through you being yourself.
You don't also have to kind of take the extra step.
Yeah, I think I, what you're saying is like over time, sort of being consistently acting in a manner which
is true to those things that you think, then you kind of like express them into your work
environment.
Yeah, exactly.
And another thing too is, you know, there was a situation a really long time ago where,
you know, someone had done something that I felt was not right.
Now, of course, like if there is issues with, you know, harassment or anything like that, you know, of course you need to speak up.
I mean, that is extremely important.
But I'm just saying, I mean, this wasn't anything like that. But it's something where I kind of felt really compelled to kind of, you know, start
kind of a discussion that that was probably not gonna be very productive. And I decided, you know,
no, I don't really want to kind of get into this kind of stuff at work. And then actually, what
ended up happening is, and this took a little bit of it took, it took a while, you know, it's not something that happened overnight. But but eventually, that person, you know, ended up kind of losing a lot of their
authority and their and their seniority and things at work. And the thing I realized from that
experience was, you know, at that moment, I felt like, you know, there was this kind of real attack
on me personally, and that I had to stand up for myself. But then, you know, there was this kind of real attack on me personally and that I had to stand up for myself.
But then, you know, in hindsight, it was obvious that, you know, this person kind of had a problem with everybody.
It really had nothing to do with me.
I was just one of the people that this person was interacting with.
And now there is sort of a tragedy of the commons.
Like you could you could say to yourself, well, if no one does anything, then this sort of toxic
person might just continue being toxic forever.
And so you do have to kind of balance that out.
But at the same time, I think it's so easy to go the other way and kind of get into a
really big battle that is ultimately just going to take a lot of your time and energy
and resources and will just not really be productive for anybody.
This is going to come across sounding a lot like self-help.
Yeah. Well, I mean, I think it's a...
Actually, I saw a study that said the number two reason why CEOs get fired
is actually like public scandal, which the number one reason is performance.
So, I mean, that's kind of a no-braer, right? People get fired for performance, especially at that high
of a level. But, you know, the number two reason is effectively, you know, some type of like public
faux pas or something. And that's huge when you think about all the other reasons someone could
get fired. I mean, they could get fired for like maybe they had some financial problem, like they had a financial error.
And so like, you know, the SEC gets involved.
There's all these other reasons why a CEO could get terminated.
But that was number two. Right.
And so. So, yeah, I think it's a really poignant topic.
And I guess the biggest thing is, is just to always just keep a level head and kind of keep your cool.
And especially, you know, at work, I think I also don't really understand why people argue
political stuff over the internet. But if you're going to do it, at least do it over the internet,
not at the water cooler or something. At least do it pseudonymously and not in person.
Yeah, exactly. Yeah. All right. So that's that's my rant about avoiding drama at work. It's so important and it takes a lot of fortitude. But I think at the end, you're always going to look back and say, I'm really proud of myself for having that kind of reserve. Right?
Makes sense. Yeah, that's hard to follow up, man. I don't know. That was a heady opening topic.
All right. We'll jump into the news then. The news is much more benign.
Okay. No, good stuff. Good stuff. Yeah. Yeah. A drama. My completely unrelated to drama is my
first link is C++20 new features with example. This article that I found shows almost all of
the new C++20 core features and gives a little like code snippet
for how each of them works.
Now, this is obviously like gonna make sense
to a subset of the subset of a subset of our audience.
I always find this like it's so useful
because to me, I always, someone was showing,
you know, recently we went to C++ 17 at work.
And although, you know, that's an upgrade
and still a few years old,
it takes a while for these things to kind of roll out if you're not familiar with C++17 at work. And although you know, that's an upgrade and still a few years old, it takes a
while for these things to kind of roll out if you're not familiar with C++. And someone was
like listing off the new features. And all I'm doing is like, I don't even know what that means.
Like, what was it? There was like the ternary operator. And I'm like, I don't even know what
the or the trigraph. We knew what the ternary operator is. But there's something like the
trigraph operator is been deprecated in C++17. i don't know what that is like go look it up is that the thing
with the question mark and the colon that's the ternary operator that one's still good okay but
there's a trigraph operator which i i don't even remember what it is now it's like question mark
angle bracket angle bracket or something and i have i didn't even know it was a thing, which is why it's been deprecated.
Anyways, so seeing a list like this was super helpful for me to be able to realize like what
these named things actually meant. Because I know it means something very specific and language is
very important. And the C++ standards committee, I'm sure worked very hard to make sure it makes
sense, but it doesn't make sense to me. And so seeing the code examples
allows me to quickly discount 90% of them
as not useful to me.
I mean, it helps me to understand
what these new features are.
I'm not a big fan of jumping in
and using all of the new features,
but it's definitely helpful
to kind of see them written down.
So I really appreciated this article.
I will have a link in the show notes
because it's on someone's GitHub page
and it's really long.
So I'm not going to say it out loud.
But if you're used to C++,
you're interested in seeing
what new features are in 20,
check out the show notes
and definitely visit this link.
What was the version that added lambdas?
That was the last big feature.
13 or 11?
I think 11.
Oh, it's that old?
Now lambdas and then,
well, actually one that came out,
it might be in 20
or it might be in 20 where it's not experimental anymore,
is the file system library.
That was another really important one.
Ah, 17.
Yeah, okay.
Those are the two I remember where I was like,
oh, we totally needed this for such a long time.
Cool.
All right, my news is play Co-op Diablo 2 in the browser.
So there's always this interesting kind of march towards, you know, doing just more and more exotic things in the browser.
So this is, you know, I know a while back MAME was ported to the browser.
This is using like Emscripten and these other technologies like WebAssembly.
And so, you know, one of the challenges, though, for something like this is,
you know, the network stack, right? Because you can't really run a server, for example,
in the browser and have other people, or maybe you can. Let me think about that.
Actually, you kind of can using WebRTC and stuff like that. But it's definitely not a one-to-one
mapping between, you know, these low-level kind of Windows sockets or Unix sockets and what you can do in the browser.
And so, you know, that's kind of been an area where you haven't seen a lot of development.
But here they actually have Diablo 2.
And I'm assuming they, you know, don't have the source code or anything like that.
So this is, you know, just using the diablo 2 binary you are able to actually play
in the browser and it's able to talk to a diablo 2 uh you know battle.net server and you can
actually play with other people co-op online without even having to to really do it download
anything you just visit a website which is pretty wild i mean i'll have to do a bit of research into
how this actually works my guess is maybe they hacked the like DOS box emulator. And so
and so they somehow wrote a translation layer, where when you try to create a Windows socket,
it uses uses WebRTC, which is just pretty amazing, or WebSockets or something like that.
Pretty amazing. Check it out. It's clouddiablo.com. We'll have a link in the show notes.
But I thought that was pretty wild achievement there.
That is really awesome. And now I'm going to get depressing. I remember playing Diablo 2 when it
came out and I was not a super young person when it did and playing it, really enjoying it,
playing a lot. Do you know when Diablo 2 came out?
You know, I've never, I played Diablo 1, just the demo. I only played the demo though. So I've
never been into the whole Diablo thing, but yeah, go ahead. I only played the demo though. So I've never been into the
whole Diablo thing, but yeah, go ahead and tell us when did it come out? 2000. So it's 21 years
older. It will be 21 years old in June. Oh man, that's nuts. Yeah, that's right. I remember right
around, uh, that would be senior year of high school for me. Oh, now you just, oh no, no,
no. People are doing the math in their head, man. Yeah, I'm going to hit the big 4-0
in, I guess, 10 months or something.
Oh, okay.
Yeah.
Diablo 2 had, if I remember correctly,
Diablo 2 was the one that actually had
like an overland map, right?
Because the one I played was just,
you're in a town and there was a dungeon
and that was it.
But then 2 added like a whole world, right?
Yeah, like you walk around
and then you go into dungeons.
Yes. Yeah. Yeah, the Diablo around and then you go into dungeons. Yes.
Yeah.
Yeah, the Diablo games are actually pretty fun.
I enjoy them.
I don't know.
For me, there's like a whole layer of complexity,
but I never really got into it as like,
you know, number crunching and min-maxing
your character build-outs.
Like I never got competitive like that.
I always just sort of like wandered around
and like, you know, smashed a lot of stuff.
And I always thought that was fun.
But I never got super, super into it. But a lot of people really did. And for me,
games that allow both kind of players are super awesome. There's a lot of recent games that people get really into that I don't feel support casual or semi casual play. Yeah, yeah, I totally agree
with that. You know, the games I've been getting into lately have been racing games and i've been considering buying a steering wheel oh the challenge is um if you buy a steering wheel
you uh you also they all come with the pedals and i feel like you can't really use the pedals if
you're standing up and i have a standing desk so it's like it's like it's starting to spiral out
of control it's like okay first i need to buy a bucket seat and then I need to buy, you know, with the hydraulics to make it react to you. Yeah. Yeah. So I'm not sure what to do.
I'm a little bit of analysis paralysis, but I think maybe I did see there's some steering wheels
that have like basically like a grip. So like the harder you grip, the faster the car goes.
And so I might try and pick one of those up. Interesting. I think you need to like come up with a custom
like DDR style map for...
Oh, that would be awesome.
I've been playing this game Wreckfest.
I just want to give a shout out to Wreckfest.
They're not a sponsor or anything,
but Bugbear is a company
and they made this game Wreckfest.
It's really interesting.
So you have 24 racers.
20 are what you'd expect.'re just in cars they're racing
around the track etc the other four are in giant heavy school buses and their job is to go around
the track backwards and wreck like the people in in the top three positions so like so you'll be
driving it could be an oval or whatever but like you'll be driving around the track and all of a sudden like a school bus will just crush you.
So it's kind of like the blue turtle in Mario Kart.
Are they computer controlled or like it's like an ASM, like some players are playing
one and some players the other.
Yeah, no, they're all real people.
So it's kind of like if the blue shell was sentient.
It's just crushed.
Oh man. It is really, really fun. Highly recommend Wreckfest. so it's kind of like if the blue shell was sentient it's just crushed oh oh man it's really
really fun highly recommend wreck fest i actually you know i never buy cosmetic stuff like just buy
a new skin for your car whatever i never really done that but i actually went and bought a bunch
of cosmetic stuff in wreck fest i don't even use it i just i put so many hours into that game that
was like these guys deserve more than the 20 bucks or whatever it was.
Interesting.
I'll have to check that out.
So your news article was Diablo 2 in the browser, talking about browser games, but on a completely
different level.
This was something I came across that I thought was like super cool.
And one of those things, which we have said probably 500 times in the podcast, which is
like, I wish this was around when I was learning to program.
And that's Kaboom.
And the URL is replit.com slash Kaboom.
And so I guess Replit is, they have a free tier,
but it's also like a paid service,
like IDE in the cloud,
where you can run code and do development and stuff.
I can't vouch for it, I haven't tried that.
But this Kaboom thing is they were able to host
a JavaScript game development library.
If you do nothing else,
search it and watch their opening video,
which is hand-drawn MS Paint pixel art
with just a hilarious run-up.
And the audio is made by them.
The drawing is made by them.
And it's not meant to be the next blockbuster AAA title game engine.
It's literally like, oh, I want to drop a character here.
And then it's like, you know, draw the sprites for your character's animation sequence.
And you draw them in a little pixel editor.
And then, you know, draw the wall tile and you draw the wall tile.
And it manages the assets and your video game thing and
it's kind of not really like a definitely not a no code way of building it but just sort of this
like right at your fingertips like everything sort of kept right there and i imagine for complex
games it gets out of hand really fast but for just making super simple like a side scrolling engine
and uh doing ascii descriptions of your text level, like equal signs for the
walls and plus signs for where you want item pickups and doing those kinds of things in a
very simple environment, but something that really gets you into programming and allows you to make
these. I think they look pretty cool, retro, nostalgic looking games. I thought this was a
really cool thing. Kaboom.js. I actually wonder,
this was a way to try it easily in the browser, which is how I was able to play around with it
for a few minutes. But they may have, I'm not a big JavaScript person, so I don't know
how people pick up JavaScript libraries normally. But it looks like they also have,
you don't need to go only on Replit. They have a way to do it if you want to just host it yourself
and try it. they have it available as
well at kaboomjs.com oh that's awesome i'll have to check that out and so so the idea is you you
draw the map in ascii but then you for each like ascii character you have a picture attached to it
or something yeah you just like has like the assets right there and you say like invent a new asset
and then it just gives you a little pixel editor and asks you to draw it oh very cool you have to check this out all right my topic is
on remote work and so so patrick and i are both uh well everyone is working remote but you know
at some point um offices are going to reopen and i don't know patrick if you've what your plan is or
you know or anything like that.
But in my case, I'm permanently remote.
So this is the real deal forever and ever.
Oh, ever and ever.
That's a big commitment.
I mean, who knows?
But yeah, I mean, I'm basically a remote employee permanently.
And so this whole like, oh, maybe remote wasn't a good idea. Article is definitely scary to somebody who's really there's no going back.
But yeah, I think it's an it's a really interesting debate.
And honestly, I don't have a lot of answers or insights, really.
I mean, I think that so far being remote has been really great.
I think we might have talked about this.
I think we talked about this.
We talked with Kevin.
But yeah, we both have loved being remote.
But then the question is, when people come back into the office, what's the hybrid solution?
That's really what it is, where it's maybe there's four people who are all sitting next to each other
and another four people who are spread out around the globe. How do you keep the social environment
healthy and fair and all of that in
that kind of world. And I think that's, that's what this article is kind of positing. And I
don't know if there are really any answers yet. Yeah, I think that's going to be a hard question
for people like what is the future look and then the debate we were even having at work the other
day is, is this a kind of one and done thing? And like, it just slowly over time returns back to
normal? Or like, is this going to keep creeping up, there will be at least for some while people's
concern that it may come back, or that, you know, the next version, you know, the in in Silicon
Valley, I guess the whole open office densely packed people real estate at a super high premium
is kind of very incompatible with the view that there might be reasons to not be super close to
people.
And so it's very interesting to see how we'll run a sort of real life experiment in the various ways of solving the problem over the next couple of years. Yeah, that's another really good point.
I mean, a lot of companies in the Valley will have to double their office space, right? Because
the density, do you think they're going to have laws around the density? Or do you think it'll
be mostly like self governance? I mean, laws is one of those things that's tough, right? Like at best
case, I think laws are sort of slower, slower moving, slower to catch up, which makes sense.
I actually think that, you know, too fast lawmaking, you might end up in a problem.
And so I think what will happen is the employees in some ways will sort of express what they're
comfortable or not comfortable doing. And especially if there's a, like, I think what will happen is the employees in some ways will sort of express what they're comfortable or not comfortable doing.
And especially if there's a, like, I think Silicon Valley had gotten quite homogenous
in the approach.
And I think this will shake it up and allow for some experimentation to take place, which
I think traditionally has been where you see a lot of innovation happen, right?
And so I think we might be entering an era where you see some interesting innovation around this and a lot of people trying things and the ones that will succeed
will attract people to them away from the ones that are on the other end of the spectrum.
Yeah, that makes sense. I mean, one thing that I don't know if I mentioned this on the show, but
one thing I've been thinking a lot about is, like, is it possible and like, sort of safe from a privacy standpoint to like decouple your kind of work
team and your social team like like you know look at we work for example right so we work is this
company or this this idea where you know they have a big office and you might go and rent one cubicle
in this giant office and the person next to you and a person on the other side of you
work for totally different companies you're just sharing this like you know a physical space right
and so you know if you wanted to go start like a ping pong club you could do it with the people
around you even though you know the 12 of you work for 12 different companies and so i wonder if that
is going to if that idea is really going to fly where,
you know, it's, it's like you kind of have a group of people and you have a whole relationship with
them, but then you also have like your work team, which is spread out throughout the whole globe.
And it's sort of like two different social networks, right?
Not to get back into the avoidant trauma work you're talking about before,
but I think it's been interesting. This thing you're saying a little bit reminds me of something
I've thought some about, which is the different way different people treat their sort of work
team. And some people, the team becomes like a extension of their family, which makes sense.
I mean, you spend, used to spend, you know, 40 hours a week minimum with those people basically.
And in some ways that's more awake time than, you know, 40 hours a week minimum with those people, basically. And in some ways,
that's more awake time than, you know, you often spend with other members of your family.
And so for some people, they sort of really bring those people in and treat them as family. Other
people want to maintain a separation, right, between their sort of work teammates, and they
can be sort of work friends with them. But there's a difference between a friend and a work friend,
and they maintain that pretty solidly. And then, of course, people all along the spectrum. And then
of course, people who have no interest in sort of the social aspects at work. And I've always
thought it's interesting that how a person approaches that problem impacts how the work
evolves, right? So how the manager treats the team and their personal life separation probably colors how they view people
on their team's treatment of work stuff. That is, that was a really vague way of saying it. But like,
if you come to work and your managers are very like super, everyone's my friend, let's hang out
all the time. And you want to maintain a stronger separation that creates a bit of awkwardness.
And so I actually think this thing you're saying, Jason, where like they get separated for ill or good, I'm not sure. But at least it's sort of evens out that some if people
realize I'm not going to be at work all the time, or we enter a hybrid or a mostly remote work,
then people really understand that separation. And it's sort of somewhat forced in. And you can
still have fun with the people at work, but it's a little bit different. And I think that will
even out the people who maybe felt trepidation or awkwardness with those relationships before. Yeah, that's a really good call out. And
so then as a thought experiment, what happens to the other people like the people who, you know,
always want to go golfing with the boss or something, but now the boss is, you know,
1000 miles away. Maybe maybe people have that personality will will just won't be remote, right? They won't be remote employees.
Yeah, I don't know. Maybe they seek out a team or environment where,
for some reason or another, they sort of need everyone together, or that's a culture thing,
or whatever. Yeah, yeah, it's interesting. Yeah, that's pretty wild. Yeah, I think,
yeah, I'm really curious to see what's going to happen. I wonder to what degree like technology
can help with something like this. Yeah. I mean, it's hard to know because we're not really in that
situation yet. Like it's hard to kind of project what that's going to look like and what the social
feelings are going to be, you know? I mean, I, for one, welcome our telepresence humanoid robot desk mates. Oh, man. All right. Cool. So on that note,
book of the show. My book of the show is Debt, the First 5,000 Years. I have to confess,
I haven't read this yet. I just got it. But I'm always looking out for books on finance. Finance
or just the history of finance is just super interesting to me. And somebody at work sent me a ping maybe a couple of weeks ago. And they said,
hey, I read this book. I think it's right up your alley. And so I'm just catching up with... I
finished Anti-Fragile. And so I'm just kind of getting caught up with my current books and
jumping into this new book. So I haven't read it yet, but it does seem really cool. I think that if you think about it, the whole thing around
debt and credit and lending is really, really fascinating. In my case, I try not to have
any debt if possible. I mean, there's things that, there's large purchases that you have to pay over
time, like you buy a house or something like that. But, you know, on the credit card and some of these things, you know, I try not to have debt, but at the same time, I don't preload anything either. So the credit card is super, super useful because I can make a payment on something that has zero balance using something that has zero balance, using something that has zero balance, and then pay it later.
And so I think... And then, of course, if for whatever reason I didn't pay it, well,
now it's like there's debt. So it's like you can't really have a credit card without debt.
So it's unavoidable if you want to have that level of convenience.
And you can ask yourself... I mean, there's a lot of really fundamental questions
about debt that I have. And I think this book will clear a lot of them up. One of them is, and Patrick, tell me your take on this, but if something is backed by the government, I still don't quite understand why they could charge interest. You know what I mean? if you're not actually taking any risk. Because in effect, the person who's paying the loan
as a citizen of a country that's ultimately backing the loan is actually taking the risk
and giving you interest. So that's just one thing. But debt is really fascinating to me.
And I think this book is really going to explain a lot of stuff.
Yeah, I think I'm going to opt out of getting into an in-depth fractional reserve banking
discussion and interest just now
like in here i this is gonna avoiding drama i don't want so okay a couple of things so one thing
is is uh yeah i look at it just just super objective i don't have like a horse in the race
per se i just try to learn as much as possible but the other thing is is what is fractional reserve
because you mentioned that a while back and i always meant to ask you off the air, and I never got a chance. So what is fractional reserve banking? What does that mean?
So you kind of were bringing it up by saying, you were admitting the realization that the bank
borrows money from the government at some level before lending out money. So the naive way of viewing a bank, and I might mess it up, I'm going to try.
The naive way is that the bank has somebody come in and say, I'm going to give you $100.
And then they say, cool, we're going to pay you 0.001% interest on your $100. Then they go find a
mortgage customer who says, oh, I want to use $100 to buy a house.
And you go, okay, cool.
I'm going to loan out my $100 to you that I got from this other person.
And then you can use it to pay the person you're buying the house from.
And then what I now have is someone who believes the $100 is at the bank, but in actuality
it hasn't.
It's been given to someone to buy a house and that person is making payments. And then the hope is that when the payments come in,
that the person doesn't come and ask for their money back before it's there.
So you can't loan out money that you don't have unless you have a way of borrowing it. So
the question is how much of the amount of money
that you have loaned out, your obligations, your liabilities for a bank, do you need to have in
reserve? And that fraction that you need is sort of governed by the government saying like, you
have to have this. Now, the difference is that you can also sort of borrow money. But what happens with the whole fractional reserve banking,
if banks borrow money from each other or from the government
and keep a small amount in a reserve,
and say someone takes out a loan at bank A
and then deposits it in bank B and starts earning interest
and then takes it out again and puts it somewhere else.
Now, if you ask each of the banks,
each of the banks will say like,
oh, I have $100 deposited times three banks. But what if you've just gone in a circle?
You actually only ever had $100. Oh, and so fractional reserve banking is pointing out this
thing that you can sort of artificially create money by not requiring that banks fully back
the deposits. Oh, I get it. Oh, it's interesting.
And so the government can sort of, by controlling what percentage of liabilities are needed to being
held and by the interest rate at which they're willing to loan new money and do these audits,
they can sort of have a, I guess, not a direct, but an influence over sort of how much total money is circulating out in the environment.
So by lowering the interest rates from the government, they can encourage the banks to lend out money because they can lower their rates, which are like a delta on top of the government's rate.
And so they can encourage people to take loans by making interest rates lower and lower, and thereby cause sort of inflation, right? There's more apparent money in the system.
I probably didn't get that 100% right, but that's the gist.
Yeah, yeah. No, it totally makes sense. I mean, basically, if the government has a really high
interest rate, then you could invest in that. I don't know what you would call that,
investing in, I don't know, the government or something. Yeah.
You buy treasuries. Yeah. Yeah. Yeah. Right. No, I know like the vehicle,
but it's like, what are the treasuries actually doing? It's like building bridges or something.
They're a bond backed by the government. So it's a government taking out a loan from you.
Oh, I got it. So it's basically just going to whatever the government pays for. It could be anything. Yes. And so, and so, yeah. And, the higher the interest rates go, the more people who
are going to do that. And that means there's less people who want to invest in other private things.
And so I think right now the interest rates are very low. And so people are always looking for
things to invest in. So real estate and other things. Yep. Okay. Yeah. That's going to get,
if we keep going down this path, another few steps, it's going
to get dramatic and political.
Okay.
All right.
Let's not do that.
The other thing I was going to bring up here that's interesting.
So I've read the first sort of 50 pages of this book.
Well, listen to the equivalent of about the first 50 pages of this book, because I have
this book on Audible, but I haven't made it through.
It's definitely not the most exciting read, at least in the first few pages.
But I also am intrigued by this concept.
And it also plays into the thing
that we've been a common theme here on the show,
which is the cryptocurrency and blockchain
and this concept of DeFi, this distributed finance.
And how would you allow on the blockchain
people to borrow money by depositing one asset and taking money out of
some sort of contract and going and using that money and then paying back that debt over time?
And how would you handle that if it needed to be done all via algorithm and not by human
intervention or judgment? Wow, that's actually in the book?
No, no, no. I don't think so. I'm just saying this conversation we're having about fractional
reserve banking, inflationary versus deflationary politics, and the common theme of
cryptocurrency leads to a very interesting topic if anyone wants to go sort of look in that and
hasn't yet about this distributed finance. I don't think it's in the book. I haven't seen anything
about crypto derivatives. I mean, I'm sure it exists, but it's definitely not a popular thing.
Oh, the book was written in 2011. So yeah,
there's zero chance it's in there. Oh, yeah. Oh, there you go. I think crypto derivatives would be
a pretty interesting thing. But then the thing is, how do you do that? Because derivatives have
to trade faster, presumably, have to trade at higher cadence than the original thing.
And crypto is so slow. I mean, you have to somehow make the algorithm, that whole process
be much faster. My book of the show is not well suited to audio edition in the format I'm going
to recommend it. But that is Harry Potter and the Sorcerer's Stone, the illustrated edition.
So we've talked about before, both Jason and I have children, and my children are now old enough
to sort of start introducing them to Harry Potter and they're getting interested in reading.
And so I figured to to read to them, I would read Harry Potter and the Sorcerer's Stone.
And then I decided to go in and pick up the illustrated edition just to kind of keep it more engaging for them, unless at least until we saw how it went.
And oh, man, I harry potter i don't
even know how long ago i'm like long ago uh when back like when they were first coming out it also
it came out around the time diablo came out oh did it i think so yeah correct me if i'm wrong
i'll look it up but going back through it again and reading it and then like also seeing the
pictures are so well done in this book these books books aren't cheap, like I've gotten used to like really cheap books. These books are just very well
done though the drawings, even the pages that don't have sort of full color illustration have
like texture to the page. And it's not just a black and white. Small knit is like there are
some places where the text is a little hard to read against the background. But just like thematic
and sort of, you know, nice to look at and just a really nice like thematic and sort of you know nice to look
at and just a really nice book to read and and sort of go page by page through i've really enjoyed
that so harry potter and the sorcerer's stone came out in 1997 we're gonna keep going uh so
so if you've not if you're interested in harry potter or if you have children or if you've never
read it before or if you just really like pictures, you know, I don't think it's probably everyone who's into
this has probably seen these before. But anyway, my recommendation is Harry Potter and the
Sorcerer's Stone. So it's not a small book, right? So how does that work? Is there a picture every
few pages? Or is there one every page? Or what's the... So the I mean, the book is both large in
sort of height and width and then also
it's sort of normally deep so it's a little bigger than a normal book but you would not want to carry
this book to the beach and read it on the beach oh got it or whatever like it's a bigger book
and then you know also and there's you know so every page has like some sort of like thematic
texture drawing to it if i recall but then yeah it's sort of every other page or every third page pair will have like some picture related to the thing sort of like off to the side or
in the corner. So is it a bridged or is it like actually the whole book?
No, it's the full thing.
Wow, that's cool. Yeah, I find out about so many good products on this show.
This is like, what is that? Oprah who does the Christmas best things
or things you have to buy?
Yeah, that's right.
Yeah, very cool.
I'll definitely be picking that up.
So I alluded to it.
And I actually do have Harry Potter
and the Sorcerer's Stone on Audible,
but it's obviously not illustrated,
but it is well narrated.
And it's a very pleasant way to consume the book.
But if you've never tried Audible before,
you can check it out and help sponsor the show
by going to audibletrial.com slash programmingthrowdown.
And there they'll do a free one month trial of Audible,
which you can pick sort of any book that's in Audible,
download it and you get to keep that book.
And then you can cancel the trial
if you decide you don't want to keep it
or keep going and get a book a month.
I've done that for a while.
You also get access to their sales and I've built up quite a collection. But honestly,
I really enjoy it. I don't have a bookshelf at home with dozens and dozens and dozens of physical books. It's just not a thing I do anymore. But I do enjoy actually scrolling
through my Audible list of books and picking out my recommendations for this show.
Yeah, very cool. And for folks who either already have an Audible account or are interested in that, they can
support us on Patreon.
So Patreon and Audible are how we can keep this show going and how we were able to get
more content.
I mean, this episode exists because of your support.
And so we really appreciate that.
You can reach us at patreon.com slash programming throwdown.
It's time for tools of the show.
My tool of the show is Vagrant.
Have you ever used Vagrant, Patrick?
Or are you new to Vagrant?
I feel like I know the word, but I'm not thinking of what it is right now.
Okay, so Vagrant is, it's similar to Docker.
We talked about Docker on an earlier episode,
but it is for virtual machines.
So Vagrant is a way to spin up a virtual machine
and to initialize it with a set of parameters
and a set of scripts.
So for example, I use Vagrant a lot
when I'm testing new versions of Eternal
Terminal. So before I release a new version, which is kind of a big process, it goes to all the
Debian and Fedora, like official, like the package systems, and it runs through a million different
tests. People will download it. I want to make sure that at least like all the tests run and build and everything and so i have a set of vagrant scripts and so the way this works is similar to
docker you know you write a vagrant file and in that file you say what the virtual machine should
be like should it be ubuntu should it be windows should it be you be Fedora or OpenSUSE or whichever one?
And then you specify the things like how many CPUs should the virtual machine have, etc.
And you can specify scripts.
So in my case, the script goes to GitHub, grabs the latest eternal terminal release,
and builds it and runs all the tests.
And then kind of exits, right? and builds it and runs all the tests and then exits.
All of that is nice because, for example,
there was a GitHub issue the other day where someone said,
I don't know how you pronounce this.
It's OpenSUSE, but that variant of Linux wasn't compiling.
I just pointed them to Vagrant.
I said, hey, install Vagrant,
run Vagrant up in this directory,
and you'll see it totally builds and works.
And it turns out that the person
who actually maintains the OpenSUSE package,
who isn't me, that person,
there was some error there.
I don't actually know the details,
but I was able to just get anybody
to just go on
there and run and build. It's just super convenient. Now, you might say to yourself, well,
if you have Docker, why do you need this, right? And the answer is, there is that layer in between
Docker and the virtual machine, right? There is the whole, the Docker, like Docker Daemon, right? And so Docker
tries to, you know, be really good about, you know, giving you a lot of freedom and flexibility
when you're doing things like building packages, or especially you're doing things which involve
low level networking. You know, that can give you results that aren't really natural. And sometimes
you really need like a bare bones virtual machine. And so Vagrant
is kind of like the Docker for that kind of stuff. So if you need that, check it out. It's super,
super useful, especially if you're deploying software on a bunch of different OSs. You know,
I literally have just one script that I run and I can test every OS. And when the script finishes,
I know like every OS works. So it's just a one liner. And so I'm a
big fan of Vagrant. Nice. Maybe, you know, what is the difference between something like Vagrant
and VirtualBox? Oh, yeah. So under the hood, Vagrant uses VirtualBox. I think of Vagrant
is just like a scripting library for VirtualBox. That's a good way of looking at it. I see. I see.
Okay. The other thing it does is it maps
like a folder on your computer to a virtual machine. So if I do like vagrant up in a certain
folder or with a certain vagrant file, it has a UUID for that file. And if the machine already
exists in virtual box, then it won't have to like totally destroy it and bring it back up and stuff
like that. Nice. My tool of the show, i actually had one and then i switched it to a collection because
indecisiveness i was playing a new game by uh zach gauge games and that was good sudoku so i
played sudoku for a while i got off and on in books but i never really got into it just didn't
feel the same like i couldn't use my pencil to kind of make notes
on iPhone or, you know, on an Android phone.
I use an iPhone, but I'm sure it works the same.
And so, you know, I kind of never really did it,
got into it on mobile.
But then I tried this good Sudoku
and I didn't realize it was by the Zack Age Games,
but I tried it and they just like the kind of UI
they built up around playing Sudoku.
It's not exactly the same
as how i would do it on pencil and paper it plays a little faster but that's actually what i want
and i just really thought it was as very well done and then i realized like i was searching this you
know zach gage games this is the maker of of good sudoku and then i'm like oh i actually like a lot
of these games so they're the people that made Ridiculous Fishing,
Really Bad Chess.
So Ridiculous Fishing is like you're a dude on a fishing boat and then you use like the motion sensor in your phone
to drop a line down into the depths of the ocean,
catch a fish and then sort of reel it back in.
It's ridiculous.
Did you reach the end of that game?
No, uh-uh.
Oh, okay.
Do I need to?
I'll spoil the ending after you explain it.
No, don't spoil it. No, you can tell me offline. Okay, okay. Do I need to? I'll spoil the ending after you explain it. No, don't spoil it.
No, you can tell me offline.
Okay, all right.
I won't spoil it.
I won't spoil it.
No, I need like a week.
Give me a week.
I'll go do it.
Now I know there's an ending.
The ending's actually pretty good.
Oh, okay.
So really bad chess is, you know,
I know Jason is like a master chess person,
so we're not going to ignore him for a second.
But I'm really bad at chess, actually. So in in this game they don't even try to make an attempt that the sides white
and black are even it does effectively randomly place uh chess pieces on your side of the board
so you might have like five queens and the other person has all pawns or something where you go in
one king or something you know and then you try play. So the expectation is just to like be a little more goofy and a little free and make it not have
to be this super serious chess game. Anyway, so if you've never checked out any of these games,
good Sudoku, really bad chess, ridiculous fishing, some are available on iOS, some on the Play Store,
go check them out. They're actually this, I didn't know anything, I need to go learn more
about this company. But it turns out right before the show just doing research it turns out actually really
into these games now i'm going to follow them and watch for new ones to come out oh very cool yeah
the really bad chess sounds awesome do you play online with people or is it against the computer
you know i i played it once and i i don't do online playing so i actually don't know if you
i'm sure there's probably like this based on it. There's a way to do it multiplayer. It's just for me, I never, it always feels embarrassing, even though it's anonymous. Like, I don't want to lose to real people. I don't care about losing to the west coast i actually set up a meetup group called uh sports for people who are
really bad at sports okay and uh it was great because you know you kind of self-select for
people who aren't going to take it real seriously and it actually i was worried that like somebody
would see that title and be really good at ping pong and then just kind of embarrass us and like
have to deal with it but no actually like everyone who showed up genuinely was really bad at sports
and we all had a really fun time.
Oh, that's cool.
It does not look like really bad chess has multiplayer.
It looks like it's only single player.
Oh, okay.
It kind of makes sense
because I think that could be more frustrating.
I think it's supposed to be more like,
it's kind of like puzzly, right?
Like they sort of know like,
here's the best you could do
with this crappy set of pieces.
And if you played multiplayer and one person got like a decidedly a good advantage against you and they win so what yeah exactly like like like teasing apart the computer's
kind of strategy is part of the fun and a part of it's kind of gone if you go online
hey everybody we have a new sponsor and that is configCat. I'm going to tell you a little bit
about what ConfigCat is. It is a feature flag service. You can easily use flags in your code
with ConfigCat libraries for Python and nine other platforms. Toggle your feature flags visually on
the visual dashboard. Hide or expose features in your application without redeploying your code.
Set targeting rules to allow you to control who has access to those new features.
ConfigCat allows you to get the features out faster, test in production, and do easy rollbacks.
With ConfigCat's simple API and clear documentation, you'll have initial proof of concept up and running in just minutes.
Train new team members in minutes easily
so you don't have to pay extra for growing teams.
With a simple UI, the whole team can use it effectively.
Whether you're an individual or a team,
you can try it out with the forever free plan.
Release your features faster and with less risk with ConfigCat.
Check them out today at configcat.com.
Cool. Yeah.
For people who don't know what feature flags
are, it's, you know, imagine you have something out in production and you want to change some
kind of behavior. So you might say, you know, now use this other database, but then you do that and
then it broke in and now you've broken it for everybody. And that's, that's kind of a nightmare.
So, so what you want to do instead is you want to try something out on like a few people
and then slowly kind of add more and more and more people.
And one way to do that is to have a feature flag.
Have a flag that says, if this is part of my test group of people, then switch to the
new database.
Otherwise, use the old database.
And slowly that if becomes true for more and more people.
And so ConfigCast is a service that kind of handles a lot of that for you.
So they can like handle deciding like who gets the feature and who doesn't.
And then you can go in their UI and change all of that and slowly, slowly get it to 100%.
And they take a lot of that burden off you.
So check them out.
They're great.
All right.
So we could jump into
trees why are trees important why are trees important so so uh just a quick story you know
a lot of search indexes and e-commerce sites so if you go to amazon or you go to google or you go
to any of these websites you know they need to serve content to you really quickly.
And the way that you can do that, especially in a place where you're adding a lot of content, it's very dynamic.
You're going to be using hash tables, which we'll talk about another time, and trees and especially trees.
And so for this reason, you could say money grows on trees.
If you work for Amazon, money literally grows on B trees. And so that's why it's so important.
If you're going to be doing queries, insertions, if you're going to be doing partitioning of data,
you are going to use trees a lot and they are extremely important. They're going to come up
all the time. I think it's also one of the first things people learn in computer science. And you get this, if you look on the internet,
you get the, why if I interview at X tech company, do they ask me about binary trees or about linked
lists? And I somewhat, I hear, and I don't want to get into the drama or the politics or whatever around all of that. But what I'll say is that, you know, being able to
think through trees and their trade-offs and, um, so it's just like, it's part of, you know,
kind of like the problem solving, even if like Jason and I were talking before. So some of these
trees we're going to talk about, I definitely use and use them all the time, but the sort of basic
vanilla binary tree that most people
are introduced to first, like I actually have trouble coming up with exactly how I would use
that one. And it's not clear that it has a ton of use. But as a learning tool or reasoning tool
or thinking tool or like a minimum backstop, like I got to at least do better than this,
it actually serves a really good purpose. Yeah, yeah, totally. I think that,
I mean, the thing that trees teach people at a fundamental level is the idea of sort of divide
and conquer and also like agglometric thinking. So we'll jump into like how trees are built.
And hopefully, like, we can use that to kind of motivate why they're so important to know about. So, you know, imagine if you have, you know,
some set of numbers and you need to put this set of numbers into a tree and we'll get into why
you'd want to do that later. But imagine, you know, your task is sort of doing this. You have
this binary tree and you want to, you know, you want to fill it full of these numbers and you
want the tree to be pretty balanced and stuff like that. So there's two kind of ways to do this. One way is a sort of divide and conquer. So you can imagine
saying, well, what I'll do is I'll, I'll start with all my numbers. So I kind of, in a way have
like a single node that just has my entire set of numbers in it. And then what I'll do is I'll just split the set into two halves.
You know, one half is the bottom 50% of numbers, right? The bottom half of the numbers,
if I was to order them, like maybe you'll sort them and then cut that sorted list in half.
And then say, you know, I'll send that half down the left side of the tree. And I'll send the
bigger half down the right side of the tree. And maybe i'll take the node that's right in the on the median and make that my node so you know imagine like you have 20 as
the root of the tree and everything to the right of 20 is bigger everything's left to 20 is smaller
and so now you can repeat that process again on the left and right side right you have a little
less than half the size of your original set, but that process stays
the same. And if you were to just go to that left branch that has that half of numbers and just look
at it in isolation, you actually have the same problem as you did in the very beginning, just
with fewer numbers. And so trees are a really good way to explain recursion and to, you know,
deal with all of that phenomena and to cover a lot of other phenomena in computer science, right?
So trees are kind of, you can even think of like the call graph when you're doing recursion and
we're doing computation as sort of this tree, like the history of all the functions
you've called. So yeah, trees are kind of ubiquitous. And so it kind of like introduces
you to a lot of other areas of computer science. Now, another way to build trees
is what we call agglometric or bottom up. Well, to do the same thing in like an agglometric way,
you might say, I'm going to take all my numbers, I'm going to sort them, and then I'm going to group them into,
you know, 16 different groups. And then for each of those groups, I'm going to,
you know, pick a number out and put it and call it like the leaf node of that group. And then I'm going to, you
know, do that and then kind of go bottom up and then start kind of merging some of these mini
trees together. So now it's like I have a bunch of these smaller trees. I'm going to kind of merge
them, merge them, merge them. Then when I'm done, I have one big tree. And that's kind of the
agglometric way of assembling these trees. And so that is also,
if you look at, for example, merge sort or something like that, you actually see kind of
both of these things, right? So in merge sort, you start by kind of splitting a list into halves and
then halving it, halving it, halving it, just these successive halving algorithm. And then at some point,
you reach a point where it's just all just lists of size one.
And then you start kind of merging, merging, merging.
And so, you know, Merge Short actually has both,
you know, a divide and conquer approach
and then an agglometric approach
to kind of bring everything back together.
But when it comes back together, it comes back sorted.
Oh, I didn't know that had a name yeah
yeah so so it's uh so then then you have this question of like how do you go about doing that
divide right like if it's if it's small you know like for example you know you have 100 items and
you're just putting it into some binary tree well then you can you can kind of what we talked about
sort it and then split it and you'll have this like perfectly balanced tree
and it'll look really nice.
But in practice, you're dealing with like massive,
massive data sets.
And so you can't really do that.
It doesn't really make sense.
You might even have things that are multi kind of modal
or multi-objective.
So there might be different ways to split data.
You know, maybe you have a list of people. And those people have, you know, ages, they have
genders, they have other characteristics. And you want to have some binary tree that says,
you know, that ultimately, like like results in different equally sized buckets of
people. So your binary tree might say something like, well, let's first look at everyone who's
over 18 is on one side and everyone who's under 18 is on the other side. And it's everyone who
has been to my website in the past day is on one side. And so in this kind of decision tree,
at the end of all of these decisions,
at the end of asking all of these questions, you now have this, you know, end state, which
hopefully doesn't have too many people in it. And so, you know, whether you're doing a decision
tree like that, or you're doing just a, you know, a sorted tree, but of a really large data set.
You can't try every possible way to construct that tree.
It's just too expensive.
I think it's like MP hard, like coming up with the best possible decision tree.
And so that's where you end up with trying to do some of these approximations. So you might try, you know, at every node in the tree,
you might try 50 or 100 different ways of splitting it. Now, there might be thousands,
but you're just going to try 50. And because you're trying 50, you might not, you know,
get the best way of splitting, but hopefully the best one out of 50 is decent. And then when you
go to all these children that you've created from the
split, you're going to do this again. And you're going to pick 50 at random, a different 50 ways
of splitting. And if there was a really good way of splitting that you missed, hopefully you catch
it at the next level. And so this is how a lot of these decision trees or binary trees over a really large data sets are kind of assembled.
There have been a lot of other like search based approaches, but it's actually really hard to beat the just a greedy way of doing it.
So to back it off of back to the kind of like the basic stuff i think jason was already jumping ahead that
we are at least me i tend to think of trees as storing like strictly orderable things in the
data like numbers or strings uh and one of the other things to point out is that you may not
be exposed to all of them all up front so if you are experiencing them over time having these things
that jason's talking about and figuring out how you split how do you insert new data or remove
data or modify data plays a part too and what tree and algorithm you might select the other thing
that i that has come into play for me is that I often think about first about trees as being sort of
individual allocated nodes with pointers between them. I work in C++. So that's how I think. But
they don't have to be. Actually, in fact, in some cases, in many cases, it may be better served to
have an array of all of your data and then order the stuff in the array where
like the zero item in the array is your root node. And then the, you know, first and second
are the children nodes. And then you can do sort of what level are you on? And where in the level
are you and then you can do the visitation of the nodes and the search by simply going to the right
place in the array. And that has efficiencies for keeping your data,
you know, more close to each other, more local,
and that's more cache efficient.
And we're going to get into some other things
where that plays a role as well.
But I think it starts off when you're first introduced trees
in my experience, like it's very rigid.
They're always binary trees.
In fact, they're normally always binary search trees,
and you always divide them up in a certain way.
But actually, the sort of body of all kinds of trees that are used in computer science is quite big.
And they often share interesting similarities, but also very interesting differences.
Yeah, yeah, totally makes sense.
Yeah, a lot of people have invented different trees for different specific use cases. And a lot of it comes down to, you know,
being able to split the data in a way that more often than not splits evenly.
Right?
Because what you don't want is,
and you'll hear this like trees that are balanced or unbalanced, right?
So imagine if you have a binary search tree over numbers,
but it's completely unbalanced.
So the root node has, let's say,
you have the numbers one to 100.
The root node has a one.
It has just one child to the right.
That's a two.
And you just do this all the way to 100.
Now you actually have a linked list, right?
You used a tree idea, but you ended up with a linked list.
And you have all the performance problems of a list, right? You used a tree idea, but you ended up with a linked list and you have all the
performance problems of a list, right? If you were expecting something to be really, really fast,
you're not going to get that, you're not going to meet that expectation, right?
And so you want these trees to be balanced, but you also, as we talked about before, like you
can't spend a lot of time building it because you're dealing with such large volume, right?
So you can't look at all the items, do many, many passes.
And so a lot of these structures that we'll get into have really nice propensity to just naturally break data down into ways that are even, you know, for a specific type of problem.
I mean, yeah, I think the algorithms for keeping them balanced are probably
slightly difficult to communicate over audio.
Yeah.
But what are some of those algorithms if people want to look them up?
Yeah, it's a really good point.
So in addition to like choosing the right tree, and people are inventing new types
of trees all the time, you know, no matter what you do, especially if data is coming in and going
out, you know, the trees will become unbalanced, either when you're building it or when you're,
you know, modifying it. And so there is, you know, there are special kind of trees called AVL trees
and red black trees. And for things like bee trees, there are
ways to rebalance them. But as Patrick said, I mean, you definitely take a look at if you search
for any of these trees on Wikipedia or other places, you'll see really nice kind of visuals
of how they're rebalanced. But I remember from undergrad, rebalancing trees is one of the hardest
things that I have to implement
and i just remember having so many issues because you have to kind of project yourself into different
parts of the tree and like think okay at this point of the tree you know i i am like like i'm
at this node in the tree but because i'm in the middle of this rebalancing, I know this, this, and this. And so I have to do that. And so that stuff gets really, really difficult. So yeah, I think in
general, if you're going to use trees, try to find a library on the web. Definitely take the time to
understand how they work. But if you're going to put something into production, there should be a
really good implementation of a variety of different trees that's publicly available.
Looking at binary trees specifically, a binary tree is one where you only have two children.
And so you might say to yourself, well, why would I have that restriction?
Why would I, why would I, you know, have that restriction, right? Like what's, what's, why would I limit myself, right? But there is one thing that binary
trees can do really, really well. And that is like upper or lower bound search, right? So for example,
imagine if you have a set of numbers and you're asking the question, you know, what is the biggest number in my set, you know, less than 30, right?
That question requires you to do kind of an in-order search. So you have to kind of look for
30. If you find 30, well, I guess I said less than. So yeah, even if you find 30, you still
have to kind of go backwards and find the biggest number less than
30. So you have to kind of march in sorted order through that structure, right? And a binary tree
is really nice because since there's only two children, you know that the child to the left
is smaller. And assuming there's no duplicates in the tree, the child to the left is smaller
and the child to the right is bigger.
And that's always true.
And you can also do this trick
where if you take any point in the tree,
if you take the child to the left
and then keep going to the right
as many times as you can,
you're going to get the number
that is smaller than the one you're currently at,
but bigger than all the other ones. And so that's
a unique feature to binary trees. And I think that is one of the biggest reasons. Another reason why
people use binary trees is in machine learning. We talked about there's in machine learning,
you're typically not using a tree to store a lot of numbers or something like that. You know, you're using a tree
to the structure of the tree to store decisions. And so at the end of the leaf or the end of a
binary tree, if you've searched through it, you have kind of a list of constraints and you can
use that as a way to learn things, right? So in that case, it's actually pretty difficult to split multiple ways, depending on what kind of decision you're making.
So it's decisions like, you know, this number should be greater than X or, you know, this binary number should be set to true.
These are all binary decisions like thresholds and binary operators.
There might be times where you have some kind of categorical thing.
So day of the week, it could be seven choices and you should have seven children.
But even in those cases, it's kind of hard to calculate whether that is a good decision to put in your tree or not.
Versus if you have a binary yes-no decision, it's pretty easy to calculate.
Yes, I want to take that decision and put it here in the tree.
And so for that reason, a lot of machine learning will use binary trees when they're making these decision trees.
You'll also hear in machine
learning the term random forest. At the time, I thought random forest was this really complicated
thing because I'd heard where it was used. And so I thought, oh, one of these days, I really want
to learn random forest. Random forest is very simple. All it is is if you have a big data set and you're planning on building a decision tree,
instead what you can do is you can split that data set
in different ways and build in different decision trees.
And because of that randomness I talked earlier about
where you're randomly trying different splits,
even if the distributions of the data
or the type of the data is the same in each of those groups, you're still going to end up with
different decision trees. And then once you have all of these decision trees, when you want to make
a decision in the future, you can actually ask all of them. And then you can either do some kind of voting, or you can come up with, but one of them is a B tree.
And so B trees say that each node can, and there's some variance here, but basically each node can contain multiple values. And then the, however many values it has, you have that many children
plus one. So let's say you had two values stored there instead of just one,
then things before the lowest value, you keep them sorted. And then the thing below the lowest value,
you would have reference to the next child there. Between the two values, you would have a child.
And to the right, the bigger, the biggest, larger than the two of the bigger, you would have a child to your right.
So in that case, you had two values in the node
and three children.
And you could see this expanding to various sizes,
four, five, six, seven.
And there's a lot of varieties and some techniques
for how you keep them balanced.
But the interesting thing about here
is when you visit a node,
rather than simply making a single decision,
do I use this value? Do
I go to left or right? You sort of can consult multiple values. And then also, by doing that,
you make your tree shallower. So you don't have as many levels of children as you would if you
used a binary tree because you're putting more data in the node. Now, the other thing that
I already hinted at as well, where this starts to matter, and this is one of those things where
when I'm doing interviews with people, and we do some sort of, you know, questions, and we're
working through a problem, and then I'll often ask them, you know, oh, what is sort of the runtime?
And they start scratching their head, and this is different for everyone, but I always express to them,
look, I'm actually not that interested
in sort of the academic asymptotic runtime
because it turns out like in my life,
most of the time you're dominated by other factors.
So I'm looking for you to just express to me
like which parts of this algorithm
about how many times would this take?
Where are your bottlenecks?
And so one of the things that B-trees really excel at
is if you imagine, which was true,
maybe a lot more true five, 10 years ago
than is true today.
But if you go back 10 years, RAM was really small.
Like CPU caches weren't really like they are today.
And things were stored on sectors on hard drives.
And if you were going to
go access something on the hard drive, the hard drive was going to spin up, move the little arm
out to the right spot on the moving piece of metal with magnetized bits on it and read in a whole
bunch of data in a sector and bring that into memory. And that was an extremely slow thing.
And so if you were going to do an on disk search for
data, you wanted to bring in sort of as much in one go as you can. And if you imagine a binary tree,
where at each block, you have one, and then the next block is who knows where, each time you go
to the next node, you are spending a lot of time waiting to get the next node into memory, or
getting into the CPU. And these B-trees became very valuable by
instead saying, we're going to store a whole bunch of data in the sector, and then we're going to
analyze all that data, increase our chances at any given node of finding ours, or at least
cutting down the search set dramatically so the number of different nodes we have to visit is as
small as possible. So for things like file system and file system metadata, for things like databases and indexes and databases, these B-trees get used heavily because you can tune the number of children and you can do sort of more work per node visitation.
And if node visitation at a certain spot,
that you know how to keep it ordered and organized and well balanced. And so there's a whole body of
work around bee trees, but it's enormously valuable in this sort of broadening the scope
out from just one value and two children to allowing a more general form and realizing
that that may play better with an
actual computer architecture. Yeah, really well said. Yeah, that really covers it. And so,
yeah, as Patrick said, I mean, pretty much any database that you look at is going to have a
B-tree implementation in it. I mean, all of them, they all use B-trees in some way, shape, or form. So now we'll move on to spatial trees.
And as we motivated earlier, spatial trees often aren't functionally that different in terms of their idea, but they're just taking advantage of some domain knowledge.
In this case, the fact that it's operating over some space or some field, right?
And so the simplest one of these that I know of is a KD tree. I don't actually know where the name
KD came from, but effectively it's a binary tree where every split is done on a single axis. And so if you imagine like, so like picture in your
mind, you know, a graph, like just with an X and a Y, right, graphing some function. If you just
have a line, right, you can describe a line just by looking at one of the two letters. So if you
look at just a horizontal or a vertical line, you can do that just by looking
at one of the two variables, right? So if I have a line like y equals x or a parabola like y equals
x squared, I need to be thinking about y and x at the same time because they kind of relate to each
other. So y equals x squared, you know, as x is increasing, y is really increasing. And so you
have this like parabola
right this line that like just shoots upwards right but if i have a just a vertical line
that's just like x equals two and i want all the y's right that's a vertical line or if i have a
horizontal line that would be you know like y equals four and so and i just want all the x's
that are all points on that line.
So vertical and horizontal lines are kind of the easiest to express, right? And so KD tree uses this
to, um, you know, to, to have a very simple division of, of, of the space. So what a KD tree,
KD tree will do when you're building one is you'll look over all the points and let's just keep it at
two dimensions just to keep it simple so you'll look over at all these points and then you'll
calculate the spread in other words you calculate the difference between the smallest point and the
biggest point on that dimension so if you were imagining like uh you know a bunch of points but
but they're all kind of around the same height, but there are
some that are really far to the left, some that are really far to the right. You have this really
large horizontal spread, but not a very large vertical spread. And so whichever dimension has
the largest spread, that one's going to be chosen as your split dimension. So you're going to choose
that dimension, and then you're going
to split it right in the middle. Now, when I say in the middle, I don't mean like in the middle of
the extremes, but you're going to split it such that half the points go on the left and half the
points go on the right. And so while you're computing the spread, you can also be, you know,
finding that middle, finding that median median point so if you imagine the
root node the root node is going to look at your entire data set find which dimension has the
largest spread and then slice that dimension such that half the points go to one child and half the
points go to the other child and then all that root node needs to keep track of is which dimension it was split on and where in that dimension the split was. Now, then each of the children will do
the same thing. And so you'll keep doing this until you get to a node where there aren't that
many children. At that point, you'll say, well, there's no point in me splitting again. I'm just going to have like just a list of points at that level.
And so KD trees are super, super important
because there's a really, really cool trick you could do with them.
In machine learning, you have this challenge, right?
Where let's say you're Amazon, right?
And someone types in batteries. And so you need to
return back a list of items that you think that person would be interested in,
right? But the problem is Amazon has like a billion items, right? Or maybe even if you take
it even more extreme, let's say someone just goes to amazon.com. They don't even put a search.
So you need to show them some items you think they would like. And there's just so many items,
probably a billion SKUs in Amazon's catalog, right? So you can do this trick. Let's say you had a vector, a vector, just like a set of numbers. Let's say you had a set of numbers to describe a person, right? And so
you know, that might be sort of like some way of capturing like their history and the things
you think they like. And you were to sort of what's called embed that or create just like
a single point for that person. And then you also do this for all your different products.
So for all your different products, you have kind of like a point or a vector, a list of numbers or a to this person point that I've created. And it turns
out if you use a KD tree, you can do that very, very, very quickly. Things like nearest neighbor
search can happen extremely quickly in a KD tree because, for example, look at that first split.
That first split might be like so far away from where the person point is that you could say anything on the other
side of that split, there's no way that any of those points are my nearest neighbor. And so that's
right away, that's half the data you don't have to look at. Right. And so you can kind of take that
sort of intuition that basically, you know, you can kind of follow the KD tree down to the,
you know, final part of the KD tree that has your person in it, your person point in it,
and look at the products there. And then maybe look at like some of the products,
you know, higher up in the tree, but you definitely don't have to look at the entire tree.
You can ignore 99% of the tree. And so this is how Amazon
does at the end of the day, this is how they can show you stuff so quickly, even though they have
so many items. And so there's a lot of work that goes into like the embeddings and how you create
the tree and everything, but ostensibly, you know, they're, they just have a giant KD tree
that they've implemented. And when you come to Amazon, they find the nearest
things to you in that tree. That's how it works. So to highlight one difference and then talk about
one final thing, I think when Jason is describing like this nearest neighbor search on multi
dimensions, which we sort of call these spatial trees, because that tends to be one way of
thinking about it. But when we're talking up front about B trees and binary trees, we're talking about
a single number line, basically, where all of your elements live on. And when we talk about
multidimensional, it's having at least two or more dimensions. And the problem there becomes,
there isn't a strict way of viewing, like, I'm just going to go in dimension one and then dimension two.
Like, that turns out to be really inefficient.
So both Katie, Jason, and things I'm going to talk about and some others try to divvy up the space in a way that moves back and forth between the dimensions.
And as Jason was saying, using the spread, you might divide on the X line, the X line, but then you'll go by the Y line, right? And you're sort of intermixing the two dimensions and trying to sort of interleave so that you
can get to your answer a little bit faster.
And when you're traversing, the other interesting thing is you may end up having to go through
multiple branches in order to find the nearest neighbor.
So once you go down one branch, you know how far away the split was from you.
And if your answer ends up being further away
than that split was,
you actually have to go visit the stuff
on the other side of that split.
And that's true as well for a variety of these facial trees
and doing that nearest neighbor search.
And the other thing that both his KD tree explanation
and my, oh man, I guess I did
this out of order. Anyways, the thing I'm about to talk about also support, which is very similar,
is for querying ranges. Or if you think about in 2D, it would be rectangles or boxes, right? So if
you drew a rectangle and said, I want everything inside of this rectangle, they need to be efficient
at doing that. And it's a little hard to describe over audio, but I mean, both of them are well-suited
to being able to do that.
So Jason talked about splitting the data
on one of the axes at each level.
So instead of looking at your data
and sort of splitting based on what the data says,
instead, if you come up with predefined splits
based on your space,
you come up with a different spatial
tree called a quad tree for two dimensions or an octree for three dimensions. I don't know what it
would be for four dimensions. I've never heard of someone doing that, but maybe you could. And so
if you imagine in two dimensions having, let's just say a square, right? And you divide that
square once in the middle up and down and once in the middle left and right then you end up with four quadrants of that space and now you see why it's called a
quadri so the root node would be the base sort of square then the first set of children there
would be four and one would be the upper left one would be the upper right one would be the bottom
left and one would be the bottom right and what would be the bottom left, and one would be the bottom right. And what that does is that sort of dimension mixing I was talking about, you get it,
right? So as you go, you know, child 0, 1, 2, 3, you're actually traversing both dimensions sort
of intermixed. But the quad tree, what it does is say, if I have a set of points scattered around in my square, I'm going to keep subdividing my space until I get, you could say, just one thing in each square.
So the disadvantage is that as you're cutting down and down and down, if you have two points really close to each other, you might need a lot of splits before you get to the split that separates the two.
And so you could have a lot of depth before you get to the split that separates the two. And so you could have a lot of
depth, right? But in traversing, it actually ends up working really similar to how the KD tree does.
The advantage of a quad tree is that I can figure out where in the tree it goes from the start. So
if I know the level I want to go to, like let's say level five split, I can compute in closed form which traversal of the
tree I would need to place that element in the tree there. The other nice thing is that if you
have dynamic data where your data is moving around, so imagine like someone shoots a laser
beam out of your spaceship in your game and you have your world game split up in a quad tree or
if it's in three dimensions, an octree, which is the same thing, but with eight children, if you imagine that laser bolt moving through space and needing to do these nearest
neighbor queries, which is what you would want to do if you wanted to see like if I hit something
in my current box, then this traversal you can figure out in sort of a more direct closed form
computation and do the management in a sort of much cheaper way rather than reshuffling and potentially needing to re-split like you would need to for a KD tree or other kinds of spatial
trees that would be used. So they're similar in actually the way you use it. And once you start
looking at algorithms between them, like once you figured it out once, it kind of ends up being
pretty similar for all of them and how to do these range searches and nearest neighbor searches. I find that these quad trees and octrees are used a lot in games for dynamic data. And the quad tree,
KD tree, there's also one more that's worth mentioning, I guess, which is like an R tree,
which is taking a group of data and drawing a rectangle around it, and then building a tree
of rectangles over multiple rectangles. And I won't go into it more than that.
And you'll see that also used a lot in sort of mapping data,
geographical or geographic information systems, GIS systems.
We'll use these a lot in databases and stuff.
Also, I think KD trees can also be used in video games.
I've seen them used for like splitting up a level
and understanding which polygons you want to draw or not, because the level itself, like the environment tends to
be static. And so you can a priori before you're like when you're making your level, you can do
the split and have it stored so that when you're trying to figure out which polygons to render,
you just traverse this pre-made tree rather than having to build it at runtime.
Yeah, totally. Yeah, I think, yeah, you really hit the nail on the head. I think KD trees and R trees, the structure of the tree is dependent on the data. So, you know, as Patrick said, if
your data is like really wide, you might have a lot of vertical splits before you start seeing
some horizontal splits. If data is really
narrow, it's going to be the opposite. And yeah, in the case of the archery, you know, if there's a
part of the space that's very dense, I mean, imagine building an archery of the US, right?
You're going to have like a ton of rectangles around like New York City, San Francisco,
Los Angeles, right? Because I mean, imagine you're building an archery on like
over like businesses or addresses in the US, right? There's just a lot more addresses in a tiny space
in New York City, right? And so the structure of the tree will, you know, even all the way up to
the decisions made at the very top will be influenced heavily by the distribution
of the data versus like in the case of a quad tree, you know, now there might be nodes that
exist or don't exist because of the data, but the structure, like the, the way that the splits are
done is totally immutable and it has nothing to do with, with the data. Right. And so, you know,
that's sort of like a double-edged sword,
right? On one hand, if you're not splitting in an informed way, then you're not going to be very
efficient. And so if you're doing something just once, like if you're creating a tree off some
data that's not changing and then using it a lot of times like that's what amazon is doing you know amazon will create this giant tree and that might take a lot of time but then they're going to swap it out with
the old tree and then they're going to use that new tree you know billions and billions of times
and so they you know they can afford to have this sort of frozen set, right? And so they'll use like a KD tree or an R tree
or something like that.
But for games, the content is moving all the time.
I mean, there's, as Patrick said,
like if you have people walking around on a map,
shooting laser beams and all these things,
then you can't have the structure of the tree
dependent on these things that are
constantly moving around. And so for games,
they'll use these octrees and quad trees. And my guess is,
and I think you alluded to this Patrick is, you know,
games will probably need both because the level doesn't move unless you're
playing like Minecraft or something, but the other, the level doesn't,
the geometry doesn't change.
And so that would be much better served by something like a kd tree or r
tree and there's actually more like exotic trees for certain domains there's like a bsp tree there's
uh you know vantage point trees and so they'll use something like that and but then for actually
storing all of the dynamic content they'll use something like an oak tree. That was a lot of fun.
Yeah, we totally got to geek out on trees.
If we missed any trees,
let us know.
Pine trees, oak trees.
Yeah, that's right. Birch.
Minecraft taught me a lot about trees.
I've been playing that a lot with the kids.
I learned that birch is a thing.
I don't quite know the order, the big O notation of that birch is a thing. So I don't quite know like the order,
the big O notation of a birch.
We'll have to figure that one out.
But thank you so much, everyone,
for all of your support.
Let us know what you think about the duo episodes.
I mean, we're bringing them back.
We still have a lot of really awesome interviews lined up.
So, you know, looking forward to that. But we'll be, you know, get looking forward to that.
But we'll be kind of intermixing some of these, you know, duo episodes in there and definitely give us some feedback.
We really appreciate all the feedback you have been giving and just keep it up.
Let us know what you think about the show.
We definitely look at all of them.
We've gotten a lot of our show ideas from listeners. This was one that we really wanted
to do ourselves, but we've taken a lot of really great show topics from listeners. And that's all
because of you and continuing support and feedback. So we appreciate it. Yeah. Thanks, everyone.
See you later. music by eric barn dollar programming throwdown is distributed under a creative commons attribution
share alike 2.0 license you're free to share copy distribute, distribute, transmit the work, to remix, adapt the work, but you
must provide an attribution to Patrick and I and share alike in kind.