Coding Blocks - Caching Overview and Hardware
Episode Date: August 12, 2016. In this episode we give a general overview of caching, where it's used, why it's used, and what the differences in hardware implementations mean in terms we can understand. This will be foundational to understanding caching at a software level in an upcoming episode. There's also something about the number 37 that may be the […]
Transcript
Ready in five, four, three.
You're listening to Coding Blocks, episode 45.
Subscribe to us and leave us a review in iTunes, Stitcher, or more using your favorite podcast app.
Visit us at codingblocks.net where you can find shoutouts, examples, and more.
Send your feedback, questions, and rants to comments at codingblocks.net.
Follow us on Twitter at Coding Blocks or head to www.codingblocks.net
and find all our
social links there at the top of the page.
You were supposed to mix it up, man. I'm pretty sure
that tonight it was supposed to be dub
w-dub. I think that's what I did last
time. I thought you did w-dub
maybe. We're going to say dub-dub-w.
Okay, there you go.
You guys sound
so old right now. Nobody does the dub, dub, dubs.
It's a new generation.
What do they do now?
Nothing.
Well, we haven't got their introduction,
so I don't even know who's talking right now.
That's right.
It's just codingblocks.net.
You can do that too,
but we'll put the dub, dub, dub on there for you
because we are old school.
If we were cool, we'd have a.io.
You know, I've thought about that. Oh yeah, we could call it Codeo. Anyways, with that, I'm Alan Underwood.
I'm Joe Zack.
And I'm Michael Outlaw. Or am I? Yeah.
And first, we wanted to thank you for your patience. We had a couple of rough audio issues the last couple episodes, so we appreciate you guys sticking with us. And we've already started stumbling on each other a little, so we're off with a bang. But thanks for sticking with us, and you guys are great. And hopefully you can hear Joe this time. That would be a plus, I think.
Yeah, so sorry about that. The audio unfortunately wasn't up to our normal standard of what we would prefer.
Yeah.
So, you know, we want to start off with some reviews here.
We want to start off on a high note here.
I think we should.
All right.
So we had some amazing reviews as always.
We love it when you guys do this.
It definitely puts a smile on our face.
And so, unfortunately, I have to read some of these again, so you're going to have to bear with me if it's your name that I totally butcher. But I think I picked the easy ones, so I'm sorry, Alan. So, in iTunes: Hedgehog, which we actually got twice, which was interesting. And then, I don't know, I think I should just call this one Thomas... ImTheGormasAI.
That sounds pretty good, I guess.
Button1992, Jonah, John Lee, UndeadCodeMonkey (love it), ZMcKinnon, HillsideCruiser, DibJibJub, and D. Do Rose. I think I got those at least in the ballpark. Correct, maybe?
Yeah, that wasn't terrible. So on ours, I've got the Stitcher reviews,
and we've got pch teaspoon, maybe TSP. Not sure.
Rafael.
Tablespoon.
Yeah.
No, TSP, man.
That's teaspoon.
What a horrible acronym.
Anyway, we've got Rafael H, CK 142, the middleman 124 and local juiced.
And by the way, I am now the Californian motorcycle driving dude with a headscarf.
I read that.
I did read that somewhere.
And, you know, the funny part about that, though, and I don't mean this as a negative to the author of that comment, but when I hear Alan talk, no part of his voice reminds me of California.
Come on, y'all, you know I'm from California.
I mean, but then at the same time, I don't know. Maybe we all sound like... well, no, I guess we don't all sound like we're from California, because you were the only one that got the comment. But yeah, I totally won't be that guy, though.
Okay, yes. So, just so you guys are in on it: we got a review, thank you Local Juice, that basically compared Alan, or the mental image of Alan, to a California motorcycle-riding guy with a headscarf. Unfortunately, Alan's head might have outgrown that scarf a little bit after reading the review, but we do appreciate it.
After reading the review? No way, man. He outgrew it before he ever read that.
I don't think you guys heard, but I got a four on the Stack Overflow thing.
Oh, right.
There was that.
There was that.
Yeah.
Again, thanks, everybody, for the reviews.
They seriously, seriously make our days.
Did you read this whole thing, though? Like, where he says
I picture Alan as a
Californian motorcycle dude with
thick eyes. That part is important
and a headscarf.
So, you know,
I laughed
for five minutes after I read that.
It was awesome. It was good. Now
I understand why he said,
you know, he made some comment about like, well, thank me after you read it or something like that.
You know, read it before you thank me.
Now I understand.
Well, feel free to show me some love, guys.
You know, my head's shrinking over here.
I don't know.
I don't know if I want to know what you guys think I look like and what vehicle I'm driving on the freeway in California.
I think that that should actually be the poll.
No, not a poll.
Everybody should leave their comments.
If you haven't seen our picture on our thing,
there should be a thing describing it.
This is a bad idea.
We shouldn't.
Well, I mean, you keep saying "thing" so much, I'm trying to figure out, what is he talking about?
I was thinking leave a comment on this particular episode, but now I'm actually scared.
I can tell. Like, I don't mind telling you, after some of the feedback we got, I don't want to know what people think, because that could not go in a good place. That could go wrong quick.
So what we're saying is a caption contest. Oh, God. Here we go. We're going to turn our site into Reddit. Oh, God. All right.
The internet is cruel.
So we got a little other piece of news here.
So there was this amazing article that I saw.
What was it?
Yesterday that came out.
And I don't know, Joe, if you had already seen this, but do you like SSDs, Joe?
I love them.
Well, how would you like to have a 60-terabyte SSD? You heard me.
That would be fantastic.
60 terabytes. This beast from Seagate is coming out, I think it said, in 2017.
Yeah, 2017. And it is supposed to be for business use.
But it got me excited because I was like,
oh man, that means that it's not too far out then
before larger capacity drives like this come out for consumers.
And here's the best part of this thing too,
is that a 60 terabyte, a single drive,
60 terabyte drive dwarfs not only all the other SSDs that are out there,
but every hard disk that's out there too.
Yeah.
Yeah, that's amazing.
The part that I'm excited about is you just recently got the Samsung 950 Pro, right?
Oh my God, it's amazing.
So for anybody out there that is not up to date on SSDs and all that kind of stuff, the 950 Pro Samsung is amazing because it's the M.2 form factor.
But the reason why it's so impressive, though, is it uses PCIe lanes.
So it is super fast.
I think it's got like 2.5 gigabytes per second reads.
And I don't even remember what the writes are.
They're like 1.5 gigabytes per second.
But it's stupid fast.
It's like five times faster than regular SSDs that are already fast, right?
It does depend on which one you get, though. The 512 does outperform the 256, which, if you think about it, makes sense. Just imagine it from a size point of view, right? Same physical size, but one of them holds double the data of the other. That means on the one that's double the capacity, all the bits are crammed into less space, so it makes sense that it would be able to read and write faster from that point of view. But at the same time, when I see that, I'm like, well, darn, that's thinking of it in a mechanical sense, where it's like you've got to go further. But whatever, man. It is true, though, that the larger-capacity one is faster than the smaller one.
Well, temporal locality matters. It sounds like we've been doing some research on caching and hard disks.
It really does, and we're going to be diving into that. But we still have a few more pieces of news. I was going to say, the exciting part is Samsung's not the only one doing the M.2 PCIe stuff now. Other manufacturers are joining the game, so hopefully the price of those things will start dropping soon.
Yeah. So, just to follow up on those numbers, though: here's the 512-gig version versus the 256-gig. These are obviously both from Samsung. Sequential read... how do I even say this... 2,500 megabytes per second for sequential read. It's insane. And then on the 256-gig model, it's 2,200 megabytes per second.
So a difference of 300.
Yeah.
And then on the sequential writes,
it's 1500 megabytes per second for the 512
and 900 for the 256.
So, you know, not half, but also a lot. Stupid fast, either which way,
just crazy fast. Yeah. I mean, I'm totally in love with that thing. It is amazing.
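To put those sequential throughput numbers in perspective, here's a quick back-of-the-envelope calculation using the figures quoted above; the SATA comparison line is an assumed ballpark for a typical 2016-era SATA SSD, not a number from the episode:

```python
# Rough transfer times for a 10 GB file at the quoted sequential speeds.
# The 950 Pro figures (MB/s) are the ones mentioned above; the SATA line
# is an assumed ballpark for comparison.
speeds_mb_s = {
    "512GB 950 Pro read": 2500,
    "256GB 950 Pro read": 2200,
    "512GB 950 Pro write": 1500,
    "256GB 950 Pro write": 900,
    "typical SATA SSD read": 550,  # assumption, not from the episode
}

file_size_mb = 10 * 1024  # a 10 GB file

for label, speed in speeds_mb_s.items():
    seconds = file_size_mb / speed
    print(f"{label:>22}: {seconds:5.1f} s for a 10 GB file")
```

So the jump from SATA to M.2 PCIe is roughly the difference between waiting around 19 seconds and waiting around 4 for the same file.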
So, the next piece. One thing that we mentioned last episode, I think Outlaw just kind of randomly threw out Ready Player One as being a good book, right?
Yes.
So I went online, and I was like, okay, I need the Kindle version, and I started reading it. Well, unfortunately for me, I fall asleep when I read at night. So I was getting 10 pages in every night, and I'll finish that book in the year 2050. So I tried to go give Amazon my money and buy the Audible version, but for whatever reason, I could not get it to take my payment. So I was like, you know what, I'm going to check my library. So here's a pre-show tip for you guys: check your library. A lot of libraries now have a digital section where you can download audiobooks for free, and you can even download Kindle books for free. You can get all kinds of stuff for free from your local library. So I tried to give Amazon my money, they rebuked me, and so I was able to download the audiobook for free. It's amazing. If you have not listened to or read Ready Player One, man, it's really enjoyable, especially if you grew up in the era of the '80s playing video games like we did.
Man, it is a ton of fun.
Wesley Crusher for President 2016.
I'm writing him in.
Awesome.
And then the last thing: I mentioned this on Twitter and got quite a bit of Twitter love on this one. We decided that in an upcoming episode, because a lot of you guys really enjoyed when we went through the book How to Be a Programmer (beginner, intermediate, advanced, all those sections) and we got a lot of good feedback on that, we're going to pick up Clean Code. Each one of us got a copy. We're going to spend a couple episodes on that here in the near future. And then we'll be doing a giveaway.
So,
you know,
be prepared when a new episode hits,
you know,
listen to it because probably within a couple of weeks of that,
we'll be doing giveaways of at least three copies of this book.
So yeah.
Yeah.
Yeah.
So if this is a book that you were already interested in, this is a good opportunity for you to be lazy and just wait.
Yep. Clean Code by Robert Martin, right?
Yeah, Uncle Bob.
Uncle Bob. All right. I don't like... I don't like saying Uncle Bob.
You don't? I mean, I'm sure he likes it, but it just feels awkward, doesn't it?
I don't know. Maybe he does. It sort of rolls off the tongue, though, man. Uncle Bob. I haven't read one way or the other whether he likes it, so maybe we're assuming.
Well, that's his Twitter.
Oh, it is. Yeah, you're right, unclebobmartin. So he must like it.
Or either that, or it's grown on him. Maybe it started out in the beginning, like, that's the thing, maybe in the beginning it was kind of like, wait, why are you calling me that? And then over time it was like, okay, fine, whatever, I'll accept it.
I'm uncomfortable with it, like most things, but I'm just going to go ahead and say it now.
No, Uncle Mike,
I don't want to hear it.
No, Uncle Outlaw,
none of that.
What about Grandpa?
Oh, now those are fighting words.
Come on, California.
Yeah, yeah, bring it over here. Let's go. I'll show you grandpa.
That's amazing. So what's our topic of the evening? Something about turtles?
Yeah... no, no turtles. Are you guys familiar with the turtle story?
No, not at all.
Turtles all the way down?
No.
Yeah. I don't remember the whole story offhand, but basically it had to do with a preconceived notion, or a myth, that somebody thought the world was basically balanced on the back of a turtle, and that's how we all managed to kind of, you know, stay afloat. And somebody asked, you know, what's underneath the turtle?
And the person who gave the original answer,
oh, it's turtles all the way down.
And it reminded me of caching,
which is our topic of the night,
where basically there are so many different layers
of caching that it just blows my mind.
It's incredible to think that this whole world of computing is really so heavily built around caching, down to the smallest level.
And so it reminded me of that little story.
I'll find a link for that.
I don't get the turtles all the way down, though. Like, what do they mean? A turtle on top of a turtle on top of a turtle on top of a turtle?
Well, like, there's cache on top of cache on top of cache.
Yeah, it's totally irrational to
think that there's turtles all the way down I mean
it doesn't answer the question so it's
a it's irrational but yeah it's just the idea
that you know it just keeps
on going down forever
Cache upon cache upon cache.
I kind of feel like, until I hear this story, though... Joe, tell me a story.
All right. Looking... searching.
Looking, searching.
Unfortunately, it reminds me of Horton Hears a Who.
I don't know if you guys ever saw it.
I have little kids.
I remember that one.
Yeah.
It reminds me of that.
We're a planet floating around in somebody else's backyard.
So: caching, or short-term memory.
Yes.
So, I mean, what is it? What is caching? Why is it important? What do we use it for? I mean, why are we even talking about it?
Yeah, and there's a couple definitions out there,
but the shortest one, so one that was my favorite,
is one that I wrote, so it's probably crappy,
but it's basically,
I think it was just storing a subset of information
for faster retrieval.
So it's a smaller portion of the whole data
of everything you're working with.
And you save that smaller portion in a smaller, easier-to-access spot.
Yep.
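That definition, a subset of the whole, kept somewhere faster, is easy to sketch in code. Here's a minimal, hypothetical read-through cache in Python: the dictionary is the small fast spot, and `slow_lookup` stands in for the full, slow data source (disk, database, network):

```python
import time

def slow_lookup(key):
    """Stand-in for the slow, authoritative data source."""
    time.sleep(0.01)  # simulate latency
    return key.upper()

cache = {}  # the small, fast subset of all the data

def cached_lookup(key):
    if key in cache:          # cache hit: skip the slow source entirely
        return cache[key]
    value = slow_lookup(key)  # cache miss: pay the full cost once...
    cache[key] = value        # ...then remember the result
    return value

print(cached_lookup("codingblocks"))  # miss: goes to the slow source
print(cached_lookup("codingblocks"))  # hit: answered from the dictionary
```

Every caching layer discussed in this episode, from the browser down to the CPU, is some variation of that hit-or-miss logic.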
Sure, I'll take that definition. And it sounds so simple, right? I mean, really, that's what it is. It's such a simple concept.
Well, and also, as he was saying that, you kind of have to wrap your head around it, because he said storing a subset of information for faster retrieval, and it's the subset part. Because then I was thinking, like, well, if the browser caches, it could be caching the entire web page. And it could be, but that entire web page is only a subset of the internet.
That's all true, and all of those are examples where caching does come into play. And actually, that's a great example there with the web request. Like, just using the internet, using something like Chrome: you type in, you know, www.codingblocks.net, and just the different layers of caching involved is mind-blowing to me. Your browser caches data, right? You know this if you've ever had to clear your cache or clear your cookies. But also, just the DNS name is cached, right? So your computer keeps a local cache of common websites, or websites that you've gone to before, so it doesn't have to go look that up and do that action over and over again.
Well, actually... your ISP does it too.
In the case of Chrome, though, since you specifically mentioned that: Chrome has its own DNS management, so that wouldn't be... but I understand what you're saying. Your computer does keep a cache of DNS entries, is what you're getting at. If you've ever flushed your cache, then you know all about it. Good old ipconfig.
Then the ISP, like Comcast or Bright House or whoever your internet provider is, does the same thing. They keep copies of the data that's commonly requested closer to you, so they don't have to go all the way to Coding Blocks or Google, and they can just serve that up from a closer location, like physically closer, in most cases. And even then, once we
you know, forward that request on, you may not even be going to our CodingBlocks.net server,
you could be going to a content delivery network like
Akamai or CloudWatch. Azure
has got one too. And even
then, we do a lot of
caching. Well, I was going to say also
CloudFlare. CloudFlare,
CloudFront, Azure's got its own.
Well, I was thinking it's CloudFlare
specifically because it's a free
one. So it's really easy for a
lot of people to use. And we were using it, but I think we stopped.
It was actually slowing us down horribly.
But, you know, I mean, yeah, the thing is, though, it's the same.
What he's saying is like there's caching used at every layer of a web page coming down.
Like we're only talking about a web page, right?
Like anything, if you go to codingblocks.net,
all this stuff is happening all the way down.
And actually it's sort of in reverse order, right?
You've got your CDN, then you've got your ISP,
you've got your DNS, your browser.
No, he had it right. This is the simple case of making a web request, right? So that web request starts on your computer, in your browser. The browser looks at its cache. And then, as a result of that, it says, okay, let me do a name lookup: do I already have this DNS entry cached? If so, I'll use that. Otherwise, I'm going to go query a DNS server, which, who knows if it's got a cache or not. And then, after I do that, I'm going to start making the web request outside of this computer, which goes to the ISP, who might have it cached. And if not, now I'm going to actually try to get to the website. But that website might be fronted by a CDN, such as CloudFlare. And if that CDN doesn't have it, then I'm going to actually go to the website. And then once I actually hit the website, the website itself might have its own caching layers that it's doing.
So instead of making additional queries to the database
over and over and over for maybe static content
or content that doesn't change often,
maybe that content is already cached at the web server level.
And real quick, I want to back up on the CDN thing, because this was new to me a couple years ago. Most of the other things that we just mentioned are fairly common for programmers, but if you've never heard of a CDN, all it really is is a place to put things that don't change much, that you want to be closer, typically, to whoever's consuming them.
That's not exactly the way I would phrase it.
Well, let me finish my thought.
Okay. Then we can revisit.
Here I go. So, like, for instance, images on a website, right? Like,
if you have a bunch of images on your website, typically you don't want to serve those up from your own server, because then you're using your server to serve up things that really don't change much. It doesn't make much sense to take processing power from your server that you might need for more important things. And additionally,
when you do content delivery networks, they can be cached closer to regions of the world.
So all your images you could have in Europe so that when somebody requests your website from
Europe, all the images are served up from a server closer to them, or if they're in Asia, something closer to them. And so you put your images out there on these fronted networks so that when
somebody makes a request to get that image, it's not actually coming all the way back to your server.
It's being served up from what they call these edge nodes that are closer to where they are.
And so that cuts down on the amount of network traffic that you have direct to your server. It cuts down on the amount of processing your server's doing like Apache
or anything else. Like if it's having to serve up a file, it's kind of wasting time.
And so that's one of the ways that you can do that. So that's a content delivery network.
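One concrete way a server tells CDNs and browsers which responses are safe to keep is the `Cache-Control` response header. This is a hypothetical sketch; the paths and max-age values are arbitrary examples, not recommendations:

```python
# A hypothetical rule for choosing cache headers per request path.
# "public" lets shared caches (CDN edge nodes) keep the response;
# max-age is how many seconds it may be reused without revalidating.
def cache_headers(path):
    if path.endswith((".png", ".jpg", ".css", ".js")):
        # Static assets: let edge nodes and browsers keep them for a day.
        return {"Cache-Control": "public, max-age=86400"}
    # Dynamic pages: always revalidate with the origin.
    return {"Cache-Control": "no-cache"}

print(cache_headers("/img/logo.png"))
print(cache_headers("/episode/45"))
```

That one header is effectively the contract between the origin server and every cache sitting between it and the user.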
Well, specifically as it relates to the edge caching, though: because it's physically closer to the client, there's less latency involved in making the request and retrieving the response. Now, here's why I say your definition of it being content that doesn't change isn't entirely accurate: my initial experiences with CDNs were all about video. And in the case of video, we would push content out to multiple CDNs, and the CDN's job was, as soon as they got it, to blast it out to all of their edge nodes. But that video, they were only keeping maybe, I forget how it was set up, let's say a minute or two's worth of video. So maybe I'm doing a horrible job of explaining this. Let's say I was streaming a live event, okay? Like some sporting event, right? The Olympics. Okay, the Olympics are a great example. That was never one of them, but let's just use it as an example.
So if we were streaming the Olympics, right, that might be the entirety of that stream, as a live event, let's say an hour-long event, right? But we would be pushing it up in segments. The encoders were pushing up segments of this live stream. So think of a bunch of individual movie files, right? And those were getting pushed out to the CDN, which the CDN would then push out to the edge nodes. But that CDN would only keep, let's say, 10 of those segments, 10 of those movie files. And now this might be some dated information, but if you were doing live video streaming for iOS, the way the QuickTime wrapper would work is you would basically say, okay, you're going to get this number of streams coming in, but the names would constantly roll from the encoder, right? And so basically it would kind of just loop around. Let's say it started off as segments 1 through 10, right? So it'd go 1, 2, 3: your iOS device starts playing from 1 and then goes to 2, 3, 4. And then when it gets to 10, if you were to actually look at the playlist file in a text view, you could see the segments changing, but they would change back in order. So then the first one listed would be 11. So maybe it starts off in the file, if you look at the metadata of the file, as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. And then 11, 2, 3, 4, 5, 6, 7, 8, 9, 10. Then 11, 12, 3, 4, 5, 6, 7, 8, 9, 10. You see what I'm saying? 11, 12, 13, 4, 5, 6, 7, 8, 9, 10.
And so that way, depending on where the user came in, they could, let's say, start from whatever the oldest segment was. So, like, where we finished off, 4 would be the oldest one. But the point is that that content was aged off of the CDN edge somewhat quickly. So it wasn't exactly what you said about content that's not going to change often.
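The rolling-window behavior described there, where the playlist only ever lists the most recent N segments and older ones age off the edge, can be modeled with a bounded deque. This is a simplified sketch of the idea, not the actual HLS playlist format:

```python
from collections import deque

WINDOW = 10  # the edge only keeps the 10 most recent segments

# With maxlen set, old entries fall off automatically as new ones arrive.
playlist = deque(maxlen=WINDOW)

# The encoder keeps pushing new segments as the live event runs.
for segment_number in range(1, 15):
    playlist.append(f"segment-{segment_number}.ts")

# After 14 segments, only 5..14 remain; 1..4 have aged off the edge.
print(list(playlist))
```

A viewer joining late starts from the oldest segment still in the window, which is exactly the "depending on where the user came in" behavior described above.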
It depends on the usage of what you're doing. Specific to video, at least, it's going to change if it's a live stream. Now, if it's not a live stream, then that's a whole 'nother story.
Yeah, I mean, I was thinking more like images. Like, you'll even see
JavaScript files from Google or something, right? You'll see static.google, actually I think it's static.google.com, if you look at a lot of their stuff, and that's their content delivery network. So basically, instead of hitting their Google servers, they just have this file sitting out there that just shoots back the text. But at any rate, that was a little tangent. I just kind of wanted to, for anybody that's never really heard of a CDN, that's what it is.
Yeah. I mean, to your point, there's a whole...
Well, no, that's not exactly the same thing. I was thinking of, like, the hosted libraries that Google has for, like, Angular or jQuery or something.
I mean, they're probably using a CDN.
Yeah, behind the scenes. But I was thinking it was more exposed.
Because I know what you're thinking of. You're thinking of CloudFront. CloudFront is for AWS. Google also, like if you use their Google Maps APIs, behind the scenes it'll make calls out to static.google.com for images, and those are their CDN-type files. So yeah, I mean, it's pretty cool. If you've never heard of it, check it out. But back to our regularly programmed schedule.
So why don't we just cache everything, then? Because, so far, the way we've said it, it sounds amazing. It sounds like there's already a ton of it. And, you know, why wouldn't we just do it all the time, for everything, ever?
In general, the faster it is, the more expensive
it is. And the more expensive it is, the smaller it is. And as you mentioned with temporal locality, the size and physical location of these things matter. So it matters that your RAM is close to your CPU. You wouldn't want it in, say, Africa, or even across the room from you, because it really does have a big impact.
And it's kind of funny because we usually don't think about stuff like that in computing.
You just think about networking connections.
You don't really think about the physical location of the server.
But when you're talking about performance, that stuff really has a big impact.
Now, just to be clear, though, when you say that fast is expensive, we're talking dollars?
Yes, right? Yes. Yep.
And one of you guys threw a quote in here somewhere, and I'm trying to find it. We might actually pull it out later. What are the two hardest things in programming?
Oh, right. Well, yeah, I mean, that was going to come up later. But yeah: how to cache things and how to name things. And I know I'm messing up the quote, but that's the gist of it. The two hardest things in computer science are cache invalidation and naming things.
And off-by-one errors.
All right.
That's not hard. That's just everyday life.
...were the two hardest things. Forget it.
Yeah.
And cache invalidation is a good point. We're not going to talk about it too much tonight, because it's a whole other subject. But a lot of the caching things we're going to talk about tonight are actually situations where we do have a good way of knowing when information is old. If you're talking about, say, a distributed application, or a situation where you've got multiple nodes of a NoSQL database, then just knowing when your data is out of sync is a big problem.
This is another reason why we don't use cache everywhere.
Just mostly everywhere.
Yeah, and so for tonight, we actually wanted to take a look from a different perspective than we normally come from on the show.
We wanted to talk about these things from the bottom level up.
And we thought this is kind of a cool way of thinking about things.
And particularly, there's a really well-known set of numbers, commonly referred to as "latency numbers every programmer should know." And we've got a link here in the show notes that points to a gist on GitHub. But these numbers have actually been floating around the internet for a long time. I think somebody from Google came up with them; I don't know exactly where they came from. They've been around for a long time, and if you Google for this, you're going to see them everywhere. But we found a nice source that I really like, and the comments on it are really good, so we'll have that in the show notes. What we're going to do is actually start at the very bottom, the very bottom turtle, if you will, and work our way up.
What do you want to cover, some of these latency numbers first?
I think we should at least talk about a couple of them, yeah. Some of them are really hard to even... I wish that at least some of us hadn't looked at these. It would have been interesting to take a guess at them.
Yeah, I took a little snapshot of the ones that I thought were the most relevant, because I think the list itself has around 20 items. So I kind of snipped out the ones that I thought were the most important and got some relative speeds. And then I've actually taken some notes that compare these things to numbers that we can actually relate to. It basically translates these things to seconds, minutes, hours, and so it's a little bit easier to talk about.
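For reference, here are the commonly circulated figures from that list. These are approximations that vary by hardware generation, so treat them as orders of magnitude rather than measurements:

```python
# "Latency numbers every programmer should know": approximate values
# from the widely circulated list; real hardware will differ.
latency_ns = {
    "L1 cache reference":              0.5,
    "branch mispredict":               5,
    "L2 cache reference":              7,
    "mutex lock/unlock":               25,
    "main memory reference":           100,
    "read 1 MB sequentially from RAM": 250_000,
    "round trip within datacenter":    500_000,
    "read 1 MB from SSD":              1_000_000,
    "disk seek":                       10_000_000,
    "packet CA -> Netherlands -> CA":  150_000_000,
}

for what, ns in latency_ns.items():
    # Scale everything relative to an L1 hit to make the gaps visible.
    print(f"{what:<35} {ns:>16,.1f} ns  ({ns / 0.5:,.0f}x an L1 hit)")
```

Seeing everything expressed as multiples of an L1 hit is what makes the "turtles all the way down" picture click: each layer is orders of magnitude slower than the one below it.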
Well, let's go for it. Give us some of them.
All right.
Well, let's go ahead and start at the very bottom. So, the L1 cache. The L1 cache is really, really... it's the quickest thing we have, and it basically sits right next to the CPU. You can stack data in there; the CPU knows how to get it. And this is actually something you can't even modify with assembly; this is just total hardware. The CPU loads stuff in, takes it out, and it operates ridiculously fast. Like, 0.5-nanoseconds fast. And I know what you're thinking. So I looked this up: 0.5 nanoseconds is roughly the amount of time it takes light to travel six inches. So it's the speed of light for six inches.
Wow.
Yeah. So that's 0.5. So in one nanosecond, light travels roughly one foot. Why can't I say this? It's one foot of light traveling. Help me out here, guys.
No, you're doing a great job. This is like when I'm trying to say some of the names for the reviews. I'm enjoying this part. Tell me the story of how light travels.
Wait, light travels twelve inches in one nanosecond? That's impressive.
Okay, all right, I am totally confused now. But anyway, it's really freaking fast. This is how long it takes light to travel six inches.
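The arithmetic being fumbled there is straightforward: light covers about 30 cm, roughly a foot, per nanosecond, so in 0.5 ns it travels about 15 cm, or roughly six inches. A quick check:

```python
C = 299_792_458  # speed of light in a vacuum, m/s
NS = 1e-9        # one nanosecond in seconds

distance_1ns_m = C * 1 * NS                    # ~0.30 m, roughly one foot
distance_half_ns_in = C * 0.5 * NS / 0.0254    # meters converted to inches

print(f"light in 1.0 ns: {distance_1ns_m:.3f} m")
print(f"light in 0.5 ns: {distance_half_ns_in:.1f} inches")
```

So at L1-cache timescales, even the physical distance a signal can travel becomes part of the design budget.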
Right. Yes, exactly. Okay.
Hey, and to put this in terms of what we were just talking about: expensive. All right, so the new Skylake, or new-ish, the Skylake i7-6700K processor. The level-one cache, which is what we're talking about right here, it has four slots of 32 kilobytes.
Right.
So it's a small cache. It's a tiny cache.
All right. And this is a brand new, high-performance processor, right?
Right. And that's the 4 x 32K?
Yes, 4 x 32K for instructions, but it also has an L1 data cache that is the same 4 x 32K.
So that's K, right? Kilobytes? Nobody's used that as a measurement since, like, the '80s, right? So when we talk about it being expensive, that's what he means. Like, if they could put a gig of that stuff on here, you'd have a computer that was basically doing quantum computing, right?
Well, I was going to ask: why can't I just get my whole motherboard to be L1 cache? You know, you can have a little corner for the CPU, but I want the rest of it to just be L1 cache, and just everything operate that fast.
By the way, when was the last time you guys ever heard
of kilobytes used for anything?
My internet connection sometimes.
So, I mean, putting that,
that's almost hard to imagine, right?
You can create a text document with very little in it
and it'll be bigger than 32K.
Right.
So that's kind of to put it in perspective there.
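To sanity-check that back-of-the-napkin physics, here's a quick sketch using the vacuum speed of light and the episode's rough 0.5 nanosecond L1 figure:

```python
# How far does light travel in the 0.5 ns quoted for an L1 cache hit?
SPEED_OF_LIGHT_M_PER_S = 299_792_458      # meters per second, in a vacuum

l1_latency_s = 0.5e-9                     # 0.5 nanoseconds
distance_m = SPEED_OF_LIGHT_M_PER_S * l1_latency_s
distance_inches = distance_m / 0.0254     # 1 inch = 2.54 cm

print(f"{distance_inches:.1f} inches")    # → 5.9 inches, i.e. roughly six
```

And at a full nanosecond you get roughly a foot, which is the "one foot of light" everyone was fumbling toward.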
And also, if you did get, say, a whole motherboard made out of this L1 cache material, it also wouldn't be nearly as fast as this 0.5 nanoseconds, because of physical proximity. A big part of that speed is it being really close, like on the molecular level, to the CPU. So the physical size is actually a limitation here.
Yep.
You know, I wonder, now I'm kind of curious
because you said four times 32K, right?
Yep.
That's 128K.
That's a Commodore 128 back in the day.
Yeah, so I'm super curious now.
I'm trying to find.
Okay, let's say... hold on. I don't want to tell you what this is just yet. And let me... we're not going to use any fancy text editor, we're just going to go straight vi on this bad boy. So let's vi, call it that, paste that bad boy in, and save it. And then... now, what did you say the number was? A hundred and twenty-eight?
Right, a hundred and twenty-eight kilobytes.
So my curiosity was: is the preamble to the United States Constitution larger than all of the L1 cache on a Skylake processor? And it... well, it registered in at three hundred and twenty... well, no, that would be in bytes, so maybe it's not so bad. Okay, I take that back.
That would depend on your encoding, right?
Yeah. That's why I was like... because at first I thought it was automatically showing me it in K, and I was like, oh, wait a minute, that's amazing, it's actually bigger. But no, it's not. All right. That was in bytes: three hundred and twenty-eight bytes. So it's a lot smaller.
Well, that didn't make it as much fun as I was hoping for. Okay, can we erase that part?
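For anyone who wants to re-run that experiment without firing up vi, here's a rough Python equivalent. The preamble text below is the standard wording; the exact byte count depends on encoding and any trailing whitespace, which is exactly the confusion that came up on air:

```python
# Is the Preamble to the US Constitution bigger than the 128 KB of
# L1 instruction cache on a Skylake i7-6700K? (Spoiler: not even close.)
preamble = (
    "We the People of the United States, in Order to form a more perfect "
    "Union, establish Justice, insure domestic Tranquility, provide for the "
    "common defence, promote the general Welfare, and secure the Blessings "
    "of Liberty to ourselves and our Posterity, do ordain and establish "
    "this Constitution for the United States of America."
)

size_bytes = len(preamble.encode("utf-8"))  # byte count depends on encoding
l1_instruction_cache = 4 * 32 * 1024        # four 32 KB slices = 131,072 bytes

print(size_bytes, l1_instruction_cache)     # a few hundred bytes vs 131,072
```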
We can, but we're not going to. Where's the undo button on this show?
Too lazy. Dang. All right, so what's the next number that you had dog-eared, uh, or earmarked?
Yep. So the next one I wanted to call out is the L2 cache. And, I mean, all this,
we should mention that all these numbers
are like rough approximations.
Things are different based on all sorts of factors,
including your computer, the temperature, the humidity.
I mean, who knows?
All sorts of factors.
But the next thing I want to mention was the L2 cache.
And some processors, some computers do go up to L3, L4,
and there's all sorts of other stuff.
But I did want to mention the L2 specifically because as far as I know, all modern computers have them.
But also, I just want to show how big of a jump it was.
And the L2 speed clocks in at 7 nanoseconds.
So 14 times slower.
14 times slower, yeah, just for being a little bit further out on the chip. And to also put that one into perspective using the Skylake processor, this one is quite a bit bigger. It has four blocks of 256 kilobytes, so 1,024 kilobytes total. Right. So, I mean, it's almost 10 times the amount that they put on the L2 than they had on the L1.
Yep. And just like L1, this is not stuff that you can mess with in assembly.
This is all managed by the CPU itself.
So you're out of it.
Yep.
It has its own custom algorithms.
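None of that is programmer-visible, but to make "the CPU has its own custom algorithms" a bit more concrete, here's a toy model of a direct-mapped cache lookup. Real L1 and L2 caches are set-associative and far more sophisticated, so treat this purely as an illustration of the idea, not as how Skylake actually behaves:

```python
# Toy direct-mapped cache: each address maps to exactly one slot, keyed by
# a tag; two lines that share a slot evict each other.
LINE_SIZE = 64      # bytes per cache line (typical for x86)
NUM_LINES = 512     # 512 lines * 64 bytes = 32 KB, the size of one L1 slice

cache = {}          # slot index -> tag currently resident in that slot
hits = misses = 0

def access(address):
    """Record a memory access; return True on a hit, False on a miss."""
    global hits, misses
    line = address // LINE_SIZE    # which 64-byte line the address falls in
    slot = line % NUM_LINES        # direct-mapped: only one possible slot
    tag = line // NUM_LINES        # distinguishes lines sharing that slot
    if cache.get(slot) == tag:
        hits += 1
        return True
    cache[slot] = tag              # miss: evict whatever was there
    misses += 1
    return False

# Walking memory sequentially is cache-friendly: each 64-byte line costs
# one miss, and the other 63 byte accesses hit.
for addr in range(64 * LINE_SIZE):
    access(addr)

print(hits, misses)   # → 4032 64
```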
Out of curiosity, did you have an L3 on this list?
Because I don't see it on this page.
So I'm assuming probably not, right?
Nope.
It may be in the numbers that I pulled these from originally. I can check that, but I don't think so.
Okay. Well, I didn't see it. Yeah, there's no L3 or L4 in the GitHub article that Joe was referring to.
So here's what I want to point out: on the i7-6700K processor,
if you just look at regular retail-type specs on it,
it says it has an 8-megabyte cache.
That is referring to its L3 cache, which, it's awesome to go ahead and tout that, because that's kind of a cool spec on your processor, but that is your slowest cache by far. So we've got to figure, if your L1 was traveling the same as the speed of light over six inches, right, that's pretty impressive. Then the L2 was already 14 times slower, which makes me believe that the L3, being that they put on quite a bit, right, like eight times the amount of the L2 cache, is probably another order of magnitude slower, like maybe 20 times slower. This is just a guess, but, you know.
Yeah, I guess it depends on the chip, but yeah, I think that's a good approximation.
I did also want to mention, I should have written this down, Jeff Dean and Peter Norvig, who both are pretty big wigs over at Google.
These are the guys who originally published these numbers.
Cool.
So what was the next big number you had on there then?
Yeah, so skipping the cache, I wanted to talk about RAM, main memory.
Hey, just one question, though, and maybe this is going to sound stupid,
but it's still on the same die, though.
Yep.
So going back to Joe's comment about proximity, like if your memory was on the other side of the room, how that would increase the latency. But this is on the same die as the cores and the L1 cache, yet it's slower.
It's a slower access speed. It might be a different type of memory, and might also be, I mean, a different construction, I should say. I don't know if you've looked at a CPU lately, but that little thing is just a piece of magic. I don't even understand how that thing can work, because when you look at it, none of it makes sense. It's just a solid thing, and yet there's supposed to be something inside of it that does some stuff.
Yeah, it's pretty amazing when you think about it. It's literally a black box. But yeah, I mean, I guess that's the whole point, though, is just moving it, you know, a few millimeters, when it's that small and electricity is going through it, can make a huge difference.
Right. Well, now, L4 is a different type of, uh, material. I don't know. It's not listed on the 6700K, so I don't know.
Yeah, I think it even goes on from there. Like, different architectures can go further and further. I found some pretty cool pictures, but it's just over my head. One of the pictures I'm looking at right now has got a shared L3 cache, so it's actually got all of these cores, and they share this L3 cache. And so it is a lot bigger, but it's something that they can use to kind of pass information between each other.
Interesting. Pretty cool idea.
Cool. So you said your next one was memory, right?
Yeah, so this is RAM.
And they had a couple numbers in there, so I wasn't really sure what to use.
They talked about having memory access, and then they talked about reading a megabyte sequentially.
I just don't know what memory access means.
Does that mean the overhead associated with reading anything?
Or is that, you know, I wasn't sure which number was what I typically think of
when I think about reading from memory.
But in either case, it was a big jump, which is why I wanted to call it out.
So depending on whether it's memory access or reading a megabyte, which is huge,
for memory, it's the difference between 100 nanoseconds and 250,000 nanoseconds.
That's a pretty big jump.
Even though a nanosecond is still hard to really conceptualize because it's,
what, 10 to the negative 9?
Yeah.
Wait, you're talking about the difference between reading from the memory
versus L1 or L2?
Well, yeah.
And also, even with the memory, there were two numbers for memory.
It was kind of confusing.
So it's somewhere between 100 nanoseconds and 250,000 nanoseconds,
which just compared to the L2, it's either 14 times slower or 35,000.
Not 20, 35,000 times slower.
Well, if I got to take my pick, I want the 100 nanosecond main memory.
So I think to your point, Joe, looking at this chart,
I think the main memory reference is just getting to that slot.
You know, if you think about like a hash lookup type thing,
I think it's just getting to that slot.
But then to actually read a megabyte sequentially from memory,
which is as fast as you can do it, because if it's sequential in memory, then that means it's actually, you know, all in line: 250,000 nanoseconds. So we got confused about that.
Sorry. No, no, you're good. Go ahead. What were you saying?
Uh, it's random access memory. So I think that the way the RAM is physically laid out, it doesn't really matter about sequential or not. But that was kind of confusing to me too. I'm just kind of showing my ignorance. I don't really know how this stuff works, but maybe that's why I'm so fascinated by it.
Yeah. Well, maybe this is why we are not in the hardware business and we are software developers.
I mean, if you look at it, and this is a little bit out there, but if you look at benchmarks on, like, SSDs, I think there's a site called SSD Benchmark that has a lot of good information on it. But they'll show you the difference between sequential reads and random reads and that kind of stuff. And so I think it really does boil down to: if you have a smaller file, typically it's stored sequentially on the disk, and I would imagine the same thing happens in RAM. But as soon as something grows over a certain size, it can't cram it all into contiguous memory slots, so then it starts pushing it around on the drive wherever it can. And so I think that's what we're looking at here. This read-one-megabyte-sequentially-from-memory means, like, literally, if I have a contiguous block of memory and I grab it all out, it's 250,000 nanoseconds. And that, compared to your L1 cache reference at 0.5 nanoseconds, is on the order of a 500,000 times difference.
Right. Yeah, it's incredible. That's pretty massive when you think about it.
And we think of... well, no, go, go.
We think of RAM as being fast, right?
Yeah, I was getting ready to say, do you guys remember the days when they were like, hey, you want to upgrade your computer? Add some more RAM to that thing. Right? Like, that's what the thing was, you know, ten years ago. Now they're like, hey, add an SSD, because that's what's really slowing you down. Who knows what it's going to be in the future?
But you're making... we're going to get there... you're making the point about sequential, though. Normally, sequential reads and writes are faster than random reads and writes.
Yeah, that's what I was saying.
Because, I don't know.
For me, the analogy is always easiest to think about a spinning hard drive
where you can actually see, or not see it,
maybe see it in your mind, but not see it in real time.
But you can envision the head moving across the platter, right?
And if it's able to just read everything it needs
or write everything it needs sequentially,
then it's not having to move back and forth.
But if it's doing a bunch of random reads or writes
because the disk is fragmented, for example,
then you're losing time and latency
just for it to move the head to the correct spot.
Yep, it'd be bouncing all over the place.
You'd see that arm moving back and forth and back and forth
as opposed to just getting everything in one row.
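That same sequential-versus-random intuition shows up even in high-level code. Here's a rough, machine-dependent demonstration: summing identical values in sequential order versus shuffled order. The absolute timings will vary wildly between machines, and Python adds plenty of its own overhead, so take this as a sketch of the effect rather than a proper benchmark:

```python
# Spatial locality sketch: same data, same total, different access order.
import random
import time

N = 1_000_000
data = list(range(N))

sequential_order = list(range(N))
shuffled_order = sequential_order[:]
random.shuffle(shuffled_order)

def walk(order):
    """Sum data[] in the given index order, returning (total, seconds)."""
    start = time.perf_counter()
    total = 0
    for i in order:
        total += data[i]
    return total, time.perf_counter() - start

total_seq, t_seq = walk(sequential_order)
total_shuf, t_shuf = walk(shuffled_order)

# Identical work either way; the shuffled walk just jumps around in memory.
print(f"sequential: {t_seq:.3f}s  shuffled: {t_shuf:.3f}s")
```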
Now, you mentioned the SSD benchmark,
and I'm not sure if this is what you were referring to,
but I did find ssd.userbenchmark.com.
I'm not sure if that's the one that you were referring to.
No.
It looks like a wealth.
I'd never heard of this site,
and I only searched for it
because you said something just now and uh there's some very interesting comparisons here for ssds
yeah there's a link in the show notes yeah we'll definitely put that in there
Um, so what was your next one up? So we've already determined that just reading sequentially, the fastest type of read from memory, is 500,000 times slower than your L1 cache reference.
What's the next one on the list?
Yeah, this one was surprising to me.
The next one is network.
However, there's a couple of caveats there because, once again, this is kind of confusing to talk about because there are a couple entries for network.
But this in particular is actually sending data. So this is just, you know, stuff going out on the wire, not getting it back in. Oh, no, you know, I'm sorry, I ended up writing something different down. We do have that number in the set of numbers I captured, but the one that I wrote down here is actually the local data center. So it's the amount of time that it takes to go out over your network card, your NIC, and come back, basically within a local data center.
And what was really surprising to me is that this was faster than hard drives,
SSD or spinner.
That is interesting.
I see that.
Well, that could make sense to me, though. Especially, like, what kind of networking are we talking about here? What I'm getting at is, are we talking about fiber connections, or are we talking about, you know, twisted pair?
Yeah, it's got to be fiber, because they're talking about inside a data center.
Well, even if you were talking about twisted pair, like Ethernet...
I'm calling it on that. Yeah, I'm calling that as well.
Can you? I know that, like, the routers usually have fiber interconnections to get those 10 gigabit speeds from router to router, if you're talking about, like, a serious enterprise router.
You can buy server motherboards that have 10 gigabit NICs on them.
Really?
Yeah. So, yeah, you can get twisted pair with 10 gigabits.
Now, the distance that that'll travel is probably not very far.
But, yeah, so the data center trip for send and receive is faster than an SSD.
But that's faster than a sequential read of one megabyte on the SSD.
So we're talking about
just a simple request, I think, a round trip.
Wait, was it faster than the SSD or just faster than a hard drive? I thought he said hard drive at first, right?
Both. Yeah. Yeah. So this is actually
not related to caching. I was just really surprised by this. But it does kind of give you
an indication of just how fast network can be if you've got these guys in the right spot. So you
can see how using something like an application server, you know, like a Redis
or something to cache something in your data center could be really beneficial because
it really is comparable to hard drive speeds.
And it's only, if you've got the same numbers I do here, it's only twice, it's two times
slower than the read from memory. So that's really not that slow, right?
Oh, wow. So I just found this 10-gigabit-per-second PCIe card over standard Cat 6a cabling.
Yep. And there's your RJ45 connection right there.
Yep. I'm sure the length of that run can't be very far at all, but still.
I mean, that's impressive for a twisted pair set of wires.
So now we've seen that the data center, and I'm imagining that's just like a packet send and receive.
Like that's probably nothing major, right?
Yeah, probably not a megabyte.
It didn't actually say, I don't think, the size.
Right.
So probably like in an ideal world, jumbo frames are not on.
Yeah.
There's going to be all kinds of,
all kinds of things coming to play there.
So what was your next number that you had marked down?
Yeah.
So two times slower than my memory and 1 million than L1.
So we're getting the big numbers.
And the next was SSD.
Well, we already have to abbreviate our numbers. So that gives you an idea of just how much larger our numbers are getting.
Right.
Oh, yeah.
Oh, yeah.
Yeah, I was actually, I spent some time trying to find like almost like a,
you know, like a planetary, like a solar system picture of these guys
just so you can get a sense of scale for how far out these bigger items are.
But I couldn't really find a good visualization that I liked.
So you just said, though, that the SSD was next on the list, and it's two times slower than the round trip to the data center. But this is for a one meg read, sequential, right?
Yep.
But, I mean, in fairness, though, if we're talking about 10 gigabit per second Ethernet, then, okay, sure, you know, I could see that, because not even that Samsung 950 Pro writes at 10 gigabits, right? Or reads at 10 gigabits, unless you were talking about RAID. But you're talking about, like, a single drive here.
Yep.
But, I mean, that's still impressive, though. So
what this one came to was one million nanoseconds, which is 1 millisecond.
Yeah.
1 millisecond.
That's crazy.
Yep.
And 2 times slower than network, so it's 4 times slower than RAM and 2 million times slower than L1 cache.
That is unreal, man.
So now I don't feel so great about that 950 Pro that I just bought.
Can I trade that in?
I just want like a truckload of L1.
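Pulling the episode's numbers into one place makes the multipliers easy to verify. These are the rough, Jeff Dean-style approximations being quoted throughout, not measurements from any particular machine:

```python
# The rough latency numbers from the episode, all in nanoseconds, plus the
# "times slower than L1" multiplier computed for each one.
latency_ns = {
    "L1 cache reference":               0.5,
    "L2 cache reference":               7,
    "Main memory reference":            100,
    "Read 1 MB sequentially from RAM":  250_000,
    "Round trip in same data center":   500_000,
    "Read 1 MB sequentially from SSD":  1_000_000,
    "Read 1 MB sequentially from disk": 20_000_000,
    "Packet CA -> Netherlands -> CA":   150_000_000,
}

l1 = latency_ns["L1 cache reference"]
for name, ns in latency_ns.items():
    print(f"{name:34} {ns:>13,.1f} ns   {ns / l1:>13,.0f}x L1")
```

That last column reproduces the figures quoted on the show: L2 is 14x, the data center trip is 1,000,000x, the SSD is 2,000,000x, the spinner is 40,000,000x, and the internet round trip is 300,000,000x.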
Well, you know, it's kind of funny.
Like, well, one of the things I was thinking about,
especially comparing the SSD and the spinner,
is we can see, and we'll talk about in a second,
the spinner is 20 times slower than the SSD.
But that doesn't mean that going and getting a solid-state drive
is going to make your computer feel 20 times faster
because there are so many layers of caching
that you're only improving basically the slowest parts of that
where the cache is missed and you have to go and fetch that from the hard
drive.
All right.
Now I see a return to Amazon in my future.
Yeah, I don't feel so good about that thing.
Yeah.
So what was the next one on your list?
So we just hit the SSD.
What goes down from that one?
You have
something, Michael? Well, I was going to say
because you already covered spinning. Yeah,
no, I thought we just did. Okay,
let's do spinning. Okay,
Yeah. So first point is, get your boss to buy you an SSD, because the spinning hard drive is 20 times slower than the solid state drive, and 40 million times slower than L1.
Says the guy who uses a spinning hard drive.
Yeah. Boss, if you're listening, you need SSD.
But yeah, as we just mentioned,
it doesn't mean you're going to be 20 times more productive because there's all sorts of other things
like the RAM, like the caches going on,
like your network stuff.
And so, yeah, getting a SSD
isn't going to make you 20 times more productive
But an interesting side story here, because we've actually done this. When we stood up our website, you know, we were kind of low on RAM, remember? And we ended up basically telling it to use the SSD underneath it as swap for memory. And that kind of tells you how good that stuff is now. Yeah, it's quite a bit slower than the memory that's directly on the motherboard, but these things have come so far. We've had page files on hard drives forever in Windows, because back in the day you couldn't have eight gigs of RAM or whatever, and so it would constantly page things in and out of memory to your hard drive, and things would be just crazy slow. Nowadays, you can basically do the same thing with an SSD, and you really don't even notice the performance difference.
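That "a 20x faster drive doesn't make a 20x faster computer" point is essentially Amdahl's law. Here's a minimal sketch; the 30 percent disk-bound figure is purely hypothetical, just to show the shape of the math:

```python
# Amdahl's law: speeding up one component only helps the fraction of
# total time actually spent in that component.
def effective_speedup(disk_fraction, disk_speedup):
    """Overall speedup when only `disk_fraction` of time is disk-bound."""
    return 1 / ((1 - disk_fraction) + disk_fraction / disk_speedup)

# Hypothetical workload: 30% of wall time waiting on disk, SSD is 20x faster.
print(round(effective_speedup(0.30, 20), 2))   # → 1.4, nowhere near 20
```

Even with an infinitely fast drive, a workload that is 30 percent disk-bound tops out around 1.43x overall.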
So you cover these numbers,
20 times slower than an SSD,
40 times slower than the L1 cache.
This just goes back.
I'm sorry, 40 million.
How did I forget that?
40 million times slower than L1 cache.
This just goes back to my supporting argument that I just want a truckload of L1 cache and I'll just create one giant RAM disk out of it.
And I'll just never shut down my computer.
I'll have it on a UPS, and that way I'm safe.
Yeah, you got no persistence there.
So, yeah, don't turn that baby off or you're screwed.
Yeah.
Well, hey, so let's just take a moment for, if you don't mind,
before we continue on with this awesomeness,
because you know what, dear listener?
If you haven't already put a smile on our face,
today would be a great opportunity for you to do that and head to www.codingblocks.net
slash review. And you'll find links to either iTunes or Stitcher, whichever one's your favorite
and do leave us a review. We greatly, greatly, greatly, super-duper appreciate it.
Even if you want to say how Californian Alan is,
if you want to talk about his thick eyes and his headscarf,
it will definitely make him giggle.
But we would greatly appreciate it.
And if you have already left us a review and you're like, oh, not this part of the show again... we so super appreciate those reviews. We go back and look at them, and it just puts a smile on our face every time we see it.
And it is really your way to give back to us.
I mean, more than anything, that is our one ask. So, yeah, we've had people say, like, hey, you know, do you have a Patreon? I'd give you money for a beer. But hey, consider this your way of giving us that beer. Until we get the Patreon up, and then you'll have to do both.
Show me the money.
Wow.
Wow.
Just saying.
This took a turn quick, guys, and I am sorry.
I did not see that coming.
I probably should have.
You've changed, man.
I accidentally set that up, and I apologize now.
I ain't too proud to beg.
Yeah, I see.
So so while we're while we're in this like little break for a moment here,
you guys, I hope haven't cheated and I would like to go over the results of
the last poll.
Hold on.
Let me get there because no, because this one, this one, I think, struck a nerve with some people.
So the question was, what is the hardest part of landing the technical job?
Right?
So your options were the job requirements, dealing with recruiters, the technical interview, racism, sexism,
ageism, any discrimination of some form, location, location, location,
or just getting the call back about the job, period.
Now, Alan, you're not cheating, are you?
I'm not.
You seem really busy on your computer over there.
I'm playing with a link that we're going to share in a minute.
All right.
Which one of those do you think is the one?
You know, I'm honestly going to go with the location.
Location, location, location.
Yeah, I think that's like the biggest barrier for people.
Like, you know, people try to live where they can afford a bigger house or whatever.
And that's usually counter to actually finding a job in tech.
So that's,
that's,
that's mine.
Okay. Oh, okay. I think... okay, I misunderstood you. You're saying that you think that they'll go to a place that they can afford, so that they can get the most bang for their buck, which might be a place more remote than where the jobs are.
Okay, I'm with you now.
Yes, Joe, what's your pick?
Yeah, you know, part of me thinks the ages and, sexism, racism, but I think that I'm going to have to go with location.
Oh, really?
You're both going to take this?
Okay.
So then price is right rules.
You guys know it.
Okay.
Give me a percentage that you think that that one was the winner by,
and if you go over, it's an automatic loss.
Thirty-three. Thirty-three percent from Alan. And he's sure this time, because notice, last time he stumbled and he changed his vote, and he was like, oh man, I'm so close. What say you, Joe?
Twenty-seven percent.
Twenty-seven percent. All right. Well, here's the deal. So, again, you're both wrong, doggone it, but this time Joe's more right.
Oh, but wait, wait, wait. Was it location? I'm pretty sure he cheated.
What came out of that discussion was that the hardest part about getting the job was that you just had to get called back, period. And that's why it was so important to have those bullet points on your resume, such as, you know, here's my information that you can find out about me on Stack Overflow, here's my GitHub, things like that, right? Just because that'll help your resume to stand out, so that someone can see information about you before they even bother to talk to you. And that was the number one choice by the listeners: just getting the callback, period.
Now, this is why I say that I think Joe might have cheated there. Joe, didn't you? Come on, admit it.
Nope, I'm in touch with our listeners, right?
That's okay. There you go. No, he was cheating, because location, location, location was twenty-seven point one percent. It was number two. Well, it was tied for number two with the technical interview, which was also 27.1 percent, which is why they're tied. And I forgot to mention that just getting the callback, period, won at 29.2 percent.
Oh, so they were all super close, those three.
Yeah, those three were definitely the top ones. Now, here was one I was actually kind of surprised about.
The racism, sexism, ageism, discrimination of some form was nothing.
But in all fairness, you know, I mean, we didn't put on there, hey, say whether you're white, Caucasian, or Black, or Asian...
Well, I mean, the color of your skin has nothing to do with your age or your sex.
So we didn't do male, female, white, Black, we didn't do any of those. So, you know, if everybody that took it was a white male, or 99 percent of the people, you know...
Okay, fair enough. But, I mean, we're missing some data points there. But even still, though, okay, yes, I totally understand. But that's why I added into it just discrimination of some form.
It doesn't matter what the discrimination is.
It could just be of any form.
And that way, because ageism can affect anyone.
It doesn't matter.
Totally.
Totally.
Right?
Good point.
But I was actually really surprised to see that that one didn't get any pick.
So, you know, that's also a very subjective one, too, though. Because with the other ones, you know, you only might have a feeling that that might have been what kept you from getting a job after you've left an interview or something, right?
Right.
Whereas, you know, just getting a call... like, if there's a job that you know you want, right, and you can't get the callback, then you're going to know right away, like, that's the hard part that I'm facing right now, is just getting them to call me back.
Right, right.
And then, so, along that line, we also talked about the favorite meetup ever, or at least, you know, my favorite meetup ever, which I went back and dug up the information on, because I was so super curious and wanted to share it. So this was with the Atlanta JavaScript meetup, and I can provide a link to the specific one, but the title of it was Hacking Interviews. And it was with Nick Larson, who was the developer from Stack Overflow, and Sam Lawrence was the interviewee for the meetup. And it was fantastic, and we can include that information in the show notes if you're curious to see it.
I'm surprised the ageism, sexism, racism wasn't higher.
Yeah, it might say more about demographics,
but yeah, that's the poll results.
Do you guys have an idea for the next poll?
Oh.
I've got a list here.
I've got one if you want me to shoot it out there.
Yeah, yeah, go ahead.
Okay, so here's the question.
Don't answer it.
Okay.
If you were stranded on a... If I do answer it, though,
then I get to taint the jury, right?
Yeah, which you may want to do.
Okay.
Just want to make sure I understand the rules.
Sorry.
Yeah, you can cheat if you don't...
Oh.
It is cheating.
Let me clarify.
So the question is, if you were stranded on a desert island,
and for some weirdo reason,
you could only take one programming book with you
to better your skills on the desert island,
which one would it be?
And we'll have a couple options there,
but we'll have stuff like common books,
like Clean Code, Code Complete, Design Patterns,
Pragmatic Programmer, maybe a couple others.
But which programming book is basically,
do you think is the most important?
That's a good one.
Which one has the most pages?
Because if I'm stuck on an island,
I'm going to need to start a
fire. So I'm going to need some kindling. Because my main concern is going to be getting off that
island. You know what, though? This is good fodder, though, because this particular one might decide
what our next purchase is in our next giveaway. So think about this one a little bit. And don't
just do it because you've already got it at home, right? Pick one that you really think is the most important.
I wanted to bring this up, too. Um, you know, this is episode 45, so episode 43 was a couple episodes ago, and we had talked about imposter syndrome,
syndrome and apparently that was another topic that really hit well with people.
They, too, had experienced this, and we weren't the only ones.
We weren't alone.
But there was this great feedback that we actually got today,
this great comment that I guess being a guy I just didn't have the perspective of
that this commenter did.
So Carol wrote in. And, you know, we had said that the Wikipedia article found this affected women more than men, or at least that's what the study it cited found for women, I believe. I forget exactly how it was worded. And Carol imagines that it hits women more than men just because of the environments that they grew up in
and their abilities being questioned from statements,
from simple statements such as, well, girls can't do math,
to various other stereotypes that are constantly being thrown at girls, even when people might not realize that they're doing it. And, you know, I thought, well, that's a great point that I just didn't have the perspective to even consider. And I thought that's, you know, a good reason why maybe it was more often seen in women than men.
Or at least in that study.
Yeah, I wanted to point out too, she actually left a really great comment on episode 42.
So yeah, Carol is killing it.
And yeah, you guys can comment too and read some really intelligent and awesome comments as well on our website.
Yep.
And that's www.codingblocks.net slash episode 42. And then her recent one was on episode 43, my bad. So, yeah, definitely go up there, check those out, respond. We'll do that as well. We just haven't, because we're preparing for tonight's podcast. We will get to it. And then, so, with that, let's get back to these numbers.
So, next. We've already gone 40 million times slower than the L1 cache.
Yeah, like, how can it get any slower than that?
Well, there's the internet. So we talked about the data center, and it was really quick, surprisingly so. So another
number they tacked on here at the end is basically a rough gauge of internet speed. And I think they
went California to the Netherlands and back. And there's some numbers on that you can read an
article, but they just basically wanted to give you an example. And all sorts of stuff comes into
play here. We talked about CDNs, we talked about ISPs, we talked about all sorts of stuff that
really messed with this.
But just for a broad picture
of what's going on,
we're looking at 150 million nanoseconds.
I feel like we should stop measuring in nanoseconds.
Yes.
And yeah, we're going to get to that.
So that is 7.5 times slower than spinning disk,
which is still
pretty awesome, but 300 million times slower than L1. Yeah, that's insane. And, and this is actually
easier to put into reference of something that we understand, right? It's 150 milliseconds. So if I'm correct, and I always
jack this up, that is
basically
10 of those would give us a second, right?
I didn't know we were going to do math.
I can remember this.
All right, so one second.
This seems odd, though, because if I go into Google and I say,
hey, 150 million nanoseconds in minutes or something like that,
I'm getting crazy results.
Yeah, so 150 milliseconds is roughly one-tenth of a second. Well, 100 milliseconds is one-tenth of a second.
So I feel better that I didn't do that totally wrong.
But, yeah, so to do that request, just one packet from California to the Netherlands and then back to California, was a tenth of a second, a little slower than a tenth of a second. And that is 300 million times slower than the L1 cache.
And we went from top to bottom of this just to give you a perspective of it.
But the importance of it is, is really what?
Like, why does any of this matter?
Like, we kind of need to explain what that really is, right?
Yeah, we need to bring it back to caching.
Well, hold on, though.
Was that, for the internet one, did they say what the speed was for that?
Are we talking about Google Fiber?
Are we talking about my internet connection where we're measuring K?
No, they didn't give any specifics.
Yeah, and there's no way even for them to really know.
I guess you could do a trace route to see what it's hitting,
but it's really hard to tell even what's happening there.
So it's just a real rough approximation.
But what it does tell you is that if you can skip that call to the internet
and hit RAM instead,
you're going to be saving, you know,
150 million nanoseconds or whatever it is.
So in other words,
this is a gauge on how to determine
what parts of your application
you should focus on for improving.
Could be.
And so going back to our how to be a programmer series, right?
When we had talked about, you know, when to refactor and like what parts of it
to look for improvements for, right?
And, you know, if it was going outside of the computer, right, a network call,
then that might be something that you'd want to focus on
first in terms of
caching it locally
because you're going to get a much
larger return
on that investment. And
here it is in numbers. If
you're going across the internet for that
request, it's
300 million times slower than the L1.
Yep. Yep.
Yep.
Yeah, and so a lot of ways,
this is really sideways to the whole caching conversation now that I think about it,
but it does show you just how important these numbers are
and how important caching is
because it lets you avoid a lot of these bumps,
which really do have astronomical figures.
I mean, I just looked it up.
Earth is roughly 365 million miles away from Jupiter. So if the L1 cache is one mile, then the internet is Jupiter, which I can't even see, right?
Right. Well, it's huge. Yeah, it's the biggest planet in our solar system. Well, if you can't see it, does it still exist? I mean, I'm doubtful. Yeah, I'm going to question that. I think somebody drew something pretty.
Hey, but you had the more relatable terms too.
Yeah, and so this is probably a much better, much easier way to talk about this.
We probably should have started with it.
Maybe, I don't know.
But yeah, somebody translated these numbers to times that are actually like, you know,
something I can deal with as a human being.
So if we said that the L1 cache, specifically the L1 cache, the smallest unit of time we're talking about here, was one second, then we're talking about five days for RAM.
That's crazy. Yep. That data center that we are all amazed is so quick is 11 days.
And that was for the network access within the same data center.
Yep.
Yep.
Solid state, roughly four times faster, 23 days.
So we're almost up to a month.
That's it.
I'm definitely returning that 950.
That's done.
Yep.
Well, hey, you know what? Any SSD is going to be a lot better than 15 months for a spinning drive.
You know what? I'm done with computers. I'm just turning it off.
So we just jumped from 23 days for the SSD to 15 months over here, right? Wow.
But he hasn't even gotten to the grand finale. Yes. No, that packet that went from California to the Netherlands and back: 10 years. The internet.
Yep. Wow. Yeah, so that's incredible, and those really are just astronomical numbers.
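Working backward from the ratios quoted earlier, the big humanized figures check out. A small sketch using only the episode's own multipliers on the "one L1 hit = 1 second" scale:

```python
# Work backward from the ratios quoted in the episode, on the scale where
# one L1 cache hit takes "1 second". Only the episode's own multipliers are
# used here: the internet round trip is 300 million times slower than L1,
# and 7.5 times slower than a spinning disk seek.
L1_SECONDS = 1
internet = 300_000_000 * L1_SECONDS   # scaled internet round trip
disk = internet / 7.5                 # scaled spinning disk seek

print(internet / (60 * 60 * 24 * 365))   # ~9.5  -> roughly the "10 years" quoted
print(disk / (60 * 60 * 24 * 30))        # ~15.4 -> roughly the "15 months" quoted
```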
Yeah, if you can skip those internet calls and hit memory instead, and even better, if your processor can get the information it needs out of those L1 and L2 caches instead of memory, then you are just, it's just incredible.
It's amazing how much time you can save.
Yeah, that's really cool.
Yeah, and with those kind of numbers, it's not so much about performance.
It's a matter of whether you can even do it at all. You know, we're not talking about this, you know, your app behaving slowly or not, when we're talking about, you know, 150 million
times slower, we're talking about whether something is doable at all. Yeah, totally.
You know, I don't see it in our show notes, but I think we should talk a little bit about so
we just talked about the differences in the speeds between L1, L2, RAM, SSD, spinning disk, all of that. But what makes it work, right? Like, the processor has these algorithms built into it so that it knows how to put things into the cache and then get things out of the cache.
And then when something, there's, what's it called, for L1 and L2, or L3 and L4. But when you get into RAM or SSD, that's not the processor.
No, no, it's not the processor. But I'm talking specifically about the on-die, the L1, L2 stuff, right?
Sorry. So those are algorithms built into that. But the whole point of the cache is it's only useful as long as it's keeping stuff that needs to be accessed a lot, right? So if you're putting things into it, you want to put the things that are most relevant, that need to be accessed the most. Those need to be in your L1, right?
Well, now we're getting into the different algorithms for caching.
Yeah, we're not going to talk about the algorithms, but I just want to, like we talked about, we are in an upcoming episode.
An upcoming episode. That's why I don't want to get too deep into it. But I think to put it into perspective, we need to understand: okay, so L1's fast, but okay, cool. I mean, what does that mean? I guess the flow of it is kind of important. And I think, Joe, you had a link?
Um, yeah, I've got a couple of nice things we're going to have in the Resources We Like section.
You're talking about the visualization?
Yeah, yeah, we've got one cool one. And what it is is a bunch of blocks that represent the L1 cache, the L2, main memory, and hard drive. And you can click each of these boxes, and it gives you a little floating square that shows you how fast it takes to travel from that location to the processor, and it actually will flow through each. So you click on L1, it's almost instant. L2 flows to L1 a little bit faster. Main memory, noticeable: two, then one, then CPU. Man, when you hit that hard drive, it really hits home just how slow disk access is.
And for those following along in the show notes, this is the overbite.com.au link that you'll find that has this fun little cache example that he's talking about.
And I've been clicking on it for 15 minutes now. It's addictive, right? I really, it's cool to watch. I mean, seriously, it is one of those stress relievers.
Um, but yeah. And one thing I read a little bit about caches
because I still have a hard time really understanding
what goes on in the processor.
That's kind of like college stuff that we read about,
took the test, and forgot about.
But some of the examples I saw for what kind of things happen in the L1 cache, that it's really good at, is stuff like for loops, so incrementing indexes, stuff like that. Numbers that it takes, increments, and caches or stores again, and repeats. And things like addition, things that computers are just fantastic at. Those kinds of things do great in those caches, because they go in there, they stay there, you operate on them, you put them back. You take one back into the CPU, add another one, drop it back in, do some conditional ANDing, ORing, even string comparisons where you're comparing two sets of binary to see if they match. Those things are just fantastic for cache.
They just fly right through. Yeah. And I, the important part of this whole cache thing that
I really wanted to touch on without going into any specifics, let's go around is the L1 is right
there for the CPU to grab stuff from just as fast as it possibly can, right? So it can get input to there as fast as it wants. The L2 is where you're going to have things that maybe
aren't accessed quite as much as what's in the L1. So that's why it's your L2 cache. So things
get put in there. And those are probably hit a lot. One of the articles that I think both Joe and I had read was from ExtremeTech. And it had
a great explanation of this is the whole point of cache is when the CPU goes to grab that data,
it needs to get a hit most of the time. Whether or not it gets it from L1 or L2 is somewhat
irrelevant. But the point is, the more you can have in cache, the faster it can operate.
And so that L1 is stupid fast, and the L2 is also incredibly fast, but we already said it's 14
times slower. But if you can get a good 90% hit rate of everything that processor needs
in one of those first two caches, you're screaming along. As soon as you get outside
those two caches, now you're going
into memory, where a lot of stuff is cached as well. But now you're talking about orders
of magnitude slower. So anytime it misses a hit in one of those two caches, it's having to go to
way slower memory spots to bring that stuff in. So the flow of this is super important. And that's
kind of why we went through all this was to describe things that maybe we don't think about much.
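That hit-rate point can be sketched with the standard average memory access time (AMAT) calculation. The latency figures below are illustrative approximations, not numbers from the episode:

```python
# Average memory access time (AMAT) for a two-level cache, illustrating why
# the overall hit rate matters more than which level a hit lands in.
# The latency figures are illustrative approximations, not exact numbers.
def amat(l1_hit, l2_hit, l1_ns=0.5, l2_ns=7.0, ram_ns=100.0):
    """Weighted average latency: hits in L1, then L2, then fall through to RAM."""
    miss_to_l2 = 1.0 - l1_hit               # fraction that misses L1
    miss_to_ram = miss_to_l2 * (1.0 - l2_hit)  # fraction that misses both
    return (l1_hit * l1_ns
            + miss_to_l2 * l2_hit * l2_ns
            + miss_to_ram * ram_ns)

print(amat(0.90, 0.90))  # ~2.08 ns: 90% in L1, 90% of the rest in L2
print(amat(0.50, 0.50))  # ~27.0 ns: poor locality is an order of magnitude worse
```

The second case shows the "miss penalty" dominating: a few percent more misses and RAM's latency swamps the average.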
Yeah, and so in this episode, we wanted to give you a good baseline for what these numbers are and what frameworks, like say Node or ASP, how that factors in, what SQL Server does with memory and how it
speeds things up by kind of keeping stuff in RAM instead of disk, some tricks for local hard drives
and things that they play with, caching servers, even like the WordPress site we use has, you know,
super cache plugins, something or other that just basically prevents extraneous processing, right? So there's all sorts of different reasons for caching. And
we want to talk a little bit about some of the algorithms that we use to kind of maximize the
benefit and minimize those misses, and some data structures, and just some other cool stuff. I
wanted to kind of go here from a bottom-up approach so we could have a good baseline for talking about it
in one of our future episodes.
Yeah, totally.
All right.
So with that,
maybe we should get into Alan's favorite part of the show.
It's the tip of the week.
We're actually closer to week than the last, I don't know, 12 episodes.
I mean, I don't know why you got to judge.
All right.
So who's up first this go around?
I think that's going to be Joe.
Yep.
I woke up at six this morning and I put my show notes in.
It was terrible.
Wait, what? Yeah, I woke up too early and you know it was either
play a video game or do my show notes.
So wait, wait, seriously, if you wake up in the morning, you'll put on a video game?
Well, oh yeah, man. If you're going to be awake, you might as well play, right? You don't just lay there in misery hoping to go back to sleep?
No, man, I got to get the dogs out first thing. So if I wake up, or they wake me up, they need to go out. I don't want any accidents. And once I'm up, I'm up.
Wow. Wow. Okay, now you know why I'm falling asleep. You have kids, dude. What are you talking about?
Dude, my kids somehow make it to school. I'm not sure how. I have kids. They somehow get
to school. No, I mean, seriously, like if I wake up at like five in the morning, I'll just be mad
about it for two hours. I'm not going to get out of bed. I'm going to lay there and be mad about
the fact that I'm not sleeping till seven. Yeah. Yeah. In fairness, I'm kind of the same way.
In fairness, I'm the same way. If you know, I'll just
if I could sleep, then
like, I really think that Sleeping
Beauty had it right, where she just,
like, spent her life sleeping. You know, that
sounds nice, to sleep. It wouldn't be such a
horrible thing, you know,
but you know, even today
by the time my boys
had left for school, well, I was up. They just didn't see me. And then,
no, no, no, it wasn't like I was hiding from them. I was just, you
know, up and doing my thing, but they didn't see me, you know, by the time
they had gone downstairs. And so when they did see me,
I was coincidentally walking downstairs, and my oldest was like,
hey, did you just wake up?
I'm like, thanks.
Thanks.
Wow.
Tough crowd.
All right, so you woke up.
You put the show notes in.
What did you throw in here for this?
Oh, yeah.
Sorry.
Sidetrack.
Yep.
So this is an audiobook I just listened to that was really good. The author's been making his rounds on a couple podcasts, so you
probably heard some snippets, or at least him talking about a few things, if you listen to shows
like Reply All, or maybe Invisibilia, or Note to Self. I forget where all he's been, where I've heard it.
But anyway, he takes a bunch of computer science problems and algorithms for solving those problems and applies them to real life.
And the book is called Algorithms to Live By.
And he uses things like the traveling salesman problem, and talks about how that might apply to, say, finding your life partner and dating; and
the halting problem, when to quit; how to organize your sock drawer. Talks about sorting algorithms,
talks about caching, talks about just all sorts of, you know, hard computer
science problems, and applies those to all sorts of things in real life. Even,
you know, finding the best parking spot, or not necessarily the best parking spot, but
at which point you should just give up and take the next one you see.
And it's just a really interesting, uh, quote unquote read. And I think that, uh, if you like
the show, you're going to love this book. Hmm. It sounds stressful to me. No, it's fun. You know,
it's like computer science algorithms. He describes the problem and then talks about how you can apply it to your closet or sock drawer or bookshelf.
All right.
I will probably pick this up.
What's awesome is that a lot of times it's the laziest approach.
So, for example, the bookshelf idea, if you've got something like, say, a you know, a file drawer, something, you know,
papers, receipts, something like that, rather than sorting those things, you might as well just put
them on top of the pile whenever you need one, fish it out of there, put it right back on the top
of the pile, because you're more likely to need things that you've looked at recently. And so
older things will kind of find their way down to the bottom of the pile. And the newer stuff and
the stuff you're actually dealing with more often is going to be on top.
And this is actually the most common caching strategy,
which we'll be talking about soon: LRU, or least recently used,
where items that aren't accessed often drop out of the cache.
And the stuff that you're doing more often just kind of sits at the top.
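The pile-of-papers strategy maps directly onto LRU eviction. Here's a minimal sketch of an LRU cache (the class and the capacity here are made up for illustration): touching an item moves it to the top of the pile, and the item at the bottom falls out when the pile is full.

```python
from collections import OrderedDict

# A minimal least-recently-used (LRU) cache sketch -- the "pile of papers"
# strategy: touching an item moves it to the top, and when the pile is full,
# the item at the bottom (least recently used) falls out.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)   # touched: back to the "top of the pile"
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now the most recently used
cache.put("c", 3)       # evicts "b", the least recently used
print(cache.get("b"))   # None
print(cache.get("a"))   # 1
```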
So don't organize your drawer.
Don't organize your files.
Yeah, all these years I've just been hyper efficient. I feel like these may be great things to improve your efficiency in life, but also decrease your spouse's approval of things.
Oh yeah. No. Am I remembering this incorrectly? The traveling salesman
problem, if I remember right, that's that
one is there's technically
not an answer for it. It's like
so hard.
Yeah, especially like when we're talking about large numbers like that.
Yeah, that's the one that, you know, they've gotten very good at trying to come up with decent solutions for, but there is no truly correct answer for it.
Yeah, I mean, this is why, like, Google Maps does a good estimation of it. But as the distance and the number of places grow, then it becomes exponentially harder.
Yeah, it's exponentially more difficult to solve the problem.
And, you know, in fairness,
maybe this is why Apple Maps isn't so good.
Yeah, because it can't handle more than three points on a map.
I don't know.
Maybe it's two points.
Yeah, well, actually, a huge portion of the book
is dedicated to problems like that, problems that are basically exponential, that are hard, if not impossible, to solve perfectly.
However, if you take a rough approximation, if you use one of these computer-sciencey algorithms that's
used to dealing with uncertainty in real-life situations, then you can dramatically
improve your results. And so the traveling salesman problem, it's NP.
It's not solvable as far as anyone can tell for, you know, without going just brute force
and looking at everything.
However, you can get pretty damn good by looking at a small subset of data.
And the book dives into all sorts of things, all sorts of problems like that, like dating
and when to stop looking for a partner and just take the next best one.
And it won't necessarily find you the best life partner.
Chances are it'll be pretty good.
So the most direct solution for the traveling salesman would be to try all permutations, right?
Yep.
And then if you go that approach, right?
In big O notation, it's on the order of O of N factorial.
Oh, good Lord.
The factorial of the number of cities that are involved, right?
Wow.
And one of the earliest applications of this
solved the problem in O of N squared times two to the Nth.
Jeez, dude, that's a lot of processing.
Like this is considered, you said NP, but you know, it's an NP hard problem.
Yeah, that's the one I was thinking of.
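For a concrete sense of why brute force doesn't scale, here's a sketch of the O(n!) approach: try every possible ordering of the cities. The distance matrix is made up for illustration.

```python
from itertools import permutations

# Brute-force traveling salesman: try every permutation of cities, O(n!).
# Fine for a handful of cities, hopeless beyond that -- which is the point.
# The distance matrix here is made up for illustration.
def shortest_tour(dist):
    n = len(dist)
    best_len, best_tour = float("inf"), None
    for perm in permutations(range(1, n)):        # fix city 0 as the start
        tour = (0,) + perm
        length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 8],
    [10, 4, 8, 0],
]
print(shortest_tour(dist))  # (23, (0, 1, 3, 2))
```

At 4 cities that's 6 permutations; at 20 cities it's already over 10 to the 17th, which is why the dynamic programming and approximation approaches matter.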
Hey, but wait a second.
There is no answer.
We need to go back to something that joe just said right there sorry it no no no that's impressive information but can you imagine
you're having to talk with your future you know partner and you know like hey do you remember how
we met oh totally like the first three just weren't, you know, they were too hard.
Yeah.
Well, you know what's funny is humans naturally kind of do this.
That's what you do when you're dating, right?
Like, you know, you date a few people when you're younger, and at some point you get around marriage age,
and it basically, you know, those first couple are gone, right?
Those bridges are burned.
There's no going back.
So you end up kind of, you know, I don't know that I should continue.
I feel like this needs to be burnt to the ground before anybody.
No, we are finding out just how romantic Joe is.
And I think there's value in that.
Yeah. And the magic number is 37%. You should basically date 37% of what you, you know, think your entire dateable pool is, and then just pick the next one that's better than any of those you've dated before, and done.
Oh my God, this is amazing. By the way, I have a feeling we're going to be the cause of a few car accidents, because people are going to have to laugh about this.
Like this is wow.
Yeah.
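The 37% rule is the classic optimal stopping result (the secretary problem), and it's easy to check with a simulation. A sketch, with made-up candidate scores:

```python
import random

# Monte Carlo check of the 37% rule (optimal stopping / secretary problem):
# skip the first ~37% of candidates, then take the first one better than all
# of those. This finds the single best candidate about 37% of the time.
def pick(candidates, skip_fraction=0.37):
    cutoff = int(len(candidates) * skip_fraction)
    best_seen = max(candidates[:cutoff], default=float("-inf"))
    for score in candidates[cutoff:]:
        if score > best_seen:
            return score
    return candidates[-1]   # ran out of options: settle for the last one

random.seed(42)
trials = 10_000
wins = 0
for _ in range(trials):
    pool = random.sample(range(1000), 100)   # 100 candidates, distinct scores
    if pick(pool) == max(pool):
        wins += 1
print(wins / trials)   # roughly 0.37, i.e. about 1/e
```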
All right.
So I guess I'll follow that up with my tip of the week,
which is not nearly as interesting as Joe's 37%.
No,
no.
37%.
Just to find the next that you need to drop down.
Well,
it was of your dateable pool. It was 37% of your dateable pool find the next that you need to drop down. Well, it was of your dateable pool.
It was 37% of your dateable pool.
Oh,
that's amazing.
I think I had that number,
right?
I could be wrong.
Yeah.
Um,
so let's say,
let's say for example,
have you ever found yourself in a situation where you get some text and it
already has,
um,
the non-printable characters in it, where you can see, like, a code representation
of that non-printable character, and you actually want something to translate that so that you get the actual thing. So let's say, for example,
you have some text, a body of text, and it has a bunch of backslash-n's in it for new lines, right?
But you actually want something that would replace those backslash-n's, right? Well,
I found this great question and answer on Stack Overflow where you can use Sublime to replace
those with the actual character that you want, by turning on the regex search-and-replace feature in Sublime.
And in my case, what I wanted to replace was,
get your propeller hats on and get ready for your keyboards,
ampersand pound XD semicolon,
ampersand pound XA semicolon.
They look like models of Scions, by the way.
They are.
Those are definitely Scion models,
the XA and the XD.
So, but, you know,
because what was happening in my particular case was,
let's say if you were looking at
within SQL Server Management Studio,
and you're looking at the execution plan of some query, and there's one in particular that you wanted to look at,
and so you, if you, unfortunately, like let's say it was a stored procedure that might have had a bunch of queries in it.
And you're like, oh, here's this one problematic one.
Which specific query is that?
Let me just read that, right?
But in the execution plan, you can't really see the query that nicely.
And even on a large monitor, it could, depending on the size of that query, eventually truncate the text, right? So if you
were to right click on where it shows like query number 10 of the batch, you can say show execution
plan as XML, and it'll show you, you can see all of the execution plan, and you can find your
particular query that way. And then as one of the attributes of one of the
XML nodes, you can see the query, but then the query is munged, let's say, because all of the
new lines and all the tabs and all of the non-printable characters are going to be translated
into, you know, XML safe text for the attribute, right?
So that's where you're going to get these ampersand pound XD, you know,
semicolon, and so on.
And so what was happening is in my case, I wanted to be able to take that query
and actually see which one it was because in this, this was actually a
really large stored procedure. So it had a truckload of queries in it, and I was trying to find the exact one in the original stored procedure. And all I had
was the execution plan as XML that I was using to find it. And so
this little Sublime trick turned out to be quite a nice way of getting it back to what the original was. So you could put ampersand pound
XD semicolon, ampersand pound XA semicolon in the find, and then in your replace I could put
backslash n, backslash t, and it would actually replace it with a new line and a tab, which is
what those characters, those original Scion characters, represented. Very nice.
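The same decoding can be done outside an editor. A sketch using Python's standard library `html.unescape`, with a made-up query for illustration: `&#xD;` is a carriage return, `&#xA;` a newline, and `&#x9;` a tab.

```python
import html

# Decode the XML character references that "show execution plan as XML"
# produces for CR/LF and tabs. The query string here is made up.
# &#xD; = carriage return, &#xA; = newline, &#x9; = tab.
munged = "SELECT *&#xD;&#xA;FROM Orders&#xD;&#xA;&#x9;WHERE Id = 10"

readable = html.unescape(munged)
print(readable)   # prints the query with real line breaks and a tab restored
```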
Yeah, I actually do this daily with Sublime
for all sorts of reasons.
I think you said it means I'm
notepad. Oops, sorry.
Yeah, I think it just means I'm copying and pasting a lot of code
around, which is not a good thing.
Yeah, I do this all the time.
But you said you do it in notepad.
Notepad++. I didn't get it.
Oh, okay. Sorry.
Yeah, so notepad++, you can do the same exact thing.
You can hit the – there's either a regex or they also have an extended set,
but you can do something similar.
I find myself doing that a lot.
Because where I thought you were going was just Notepad,
because what I have been able to do in the past with Notepad
is if you can select the text, which may include non-printable characters, then you can use it as your find, right? But you can't use it as your
replace, because how are you going to type that in? And that's where this little trick using Sublime,
and/or Notepad++ as you just mentioned, could help you out. Yeah, that's beautiful. All right. So I guess mine now,
this one is, this one kind of grew out of the fact that our show notes are usually a pain point.
Like we've had people ask us about, you know, Oh, you guys just record. And then, you know,
you publish. No, we don't like, we typically mix our tracks together and all kinds of stuff.
Well, one of the pain points is always getting the show notes together. Cause if you've ever
gone to our site or looked at the show notes on an episode, like we really
try and take, you know, we go through a decent amount of effort to make sure that we're providing
you guys with something that you can go to the show notes and actually go click a link and go
look at the things that we talk about or whatever. Right. So one of the things that we all sort of
like is Markdown because it's a real nice
shorthand for doing this stuff. And I stumbled across this collaborative Markdown. So if you
think of like a Google Doc, where you could have five people in there messing around at the same
time, I found this collaborative Markdown tool to where you could have three, four or five people
in there doing Markdown on the same document at the same time. And that link is www.hackmd.io, so hack-MD, like medical doctor, dot io. And it's really neat.
Or Markdown? Or Markdown, yes. Oh, by the way, somebody on Slack today shared an app on the Mac that is called
MacDown, and I was like, oh, that's sweet. That's,
that's really nice. But, um, but yeah, this is a really cool little utility. It's free. You go
online, you can either sign in as a user or you can just sign, you know, say, Hey, I'm a guest.
It'll give you a URL that you can share with other people and everybody could be playing around in
the same document at the same time. And it's, it's a pretty nifty little thing. So that is my tip for the week. Yeah. And, and for those curious about the,
you know, if you want to get more meta about the show here for a moment, you know, the,
the show notes that you mentioned, we actually modeled that after the way broadcasters do their rundown for, you know, like a news program, right?
So that's where the pain point that Alan mentioned comes from, because we aren't using the expensive
software that the broadcaster might be using.
And so we're trying to mock that up in Google Docs.
Yeah.
I mean, the biggest problem is when you copy and paste out of, like, Google Docs, sometimes
it wants to carry along the HTML with it, and I mean, it just turns into a nightmare, right?
You know, you end up spending an hour going through and doing this stuff. But I think we've actually
found a way now where we can do it fairly quickly, which means that we can give you episodes
at the same three-week interval that we always have, but it'll just be easier for us.
So wait, it doesn't benefit them? Yeah, not really, but it's a cool utility if you ever need it. So with that, be
sure to subscribe to us on iTunes, Stitcher, or more using your favorite podcast app. And like I said
before, be sure to give us a review. You can find some helpful links at www.codingblocks.net
slash review.
Yep.
And you can visit us at codingblocks.net where you find our show notes, examples, discussions, and more.
And send your feedback, questions, and rants to codingblocks.net slash slack
because they're much smarter than I am and much better at responding.
You know, Joe, what we should have done here is tested Alan and just made him close his lid there on his computer.
Because he said he had this portion of the show memorized.
So we'll have to do that for the next episode.
Let him do the closing since he has it all memorized.
Thank you for listening to Coding Blocks.
Now he's got his warning.
He can like study for the next few weeks.
That's right.
Yeah, there was one
other thing. Oh, yeah. And so if you want the show notes
for this episode, it's
codingblocks.net slash episode
45. So
that's it. That is
it.