Coding Blocks - Designing Data-Intensive Applications – Leaderless Replication

Starting point is 00:00:00 You're listening to Coding Blocks, but you already knew that. This is episode 162, and welcome aboard again. Find us on iTunes, Spotify, Stitcher if you haven't already subscribed. Oh, and please keep all arms and vehicles inside the... Crap. Inside the arms. Ah, jeez. Okay, I'm just gonna...

Starting point is 00:00:20 This is why I stick to the script. You were the one that didn't want to stick to the script, Joe. Oh, busted. Hey, wait up. Pull out the dirty laundry there. Well, I mean, you brought it up. In front of the kids? Not in front of the kids. All right, you're right. I like it when mommy and daddy fight. Oh boy. All right, who's this third guy? So I guess I'll go say, yeah, go ahead and visit us at CodingBlocks.net, where you can find our show notes, examples, discussions, and more. Send your feedback, questions, and rants to comments at CodingBlocks.net. Follow us on Twitter at CodingBlocks or head to www.CodingBlocks.net, and you can find all our social links at the top of the page. And with that, following the script is Alan Underwood.

Starting point is 00:01:03 Oh, no, man. Like, who has the CDO now? We just thought that my CDO was bad, but apparently Alan's is worse, because he couldn't not skip over that. Yeah, gotta watch, gotta watch Wapner. Is that Wapner? Oh, yes. You just speed ran the intro there. Hey, who are you? Joe Zach. Yes. Who are you? Michael.

Starting point is 00:01:29 Outlaw. Outlaw. There we go. This episode is sponsored by educative.io. Learn in-demand tech skills without scrubbing through videos, whether you're just beginning your developer career, preparing for an interview, or just looking to grow your skill set. So last episode, we talked about single and multi-leader replication. And we told you about the reasons you would want replication, like failure tolerance,

Starting point is 00:01:55 scalability, geolocation, and single-leader replication is a lot easier. I'm telling you, by the way, so you don't have to go back and listen. This is the recap. If you just want to start here, awesome. Did you want to summarize four hours in like 30 seconds? Is that what you just said? Okay. Yeah, so skip those old episodes. You're all caught up. Everything we've talked about in 161 episodes, you're with us. Cool. All right, so easy interface and interface. That's right. Now you're all caught up. Did want to mention that having multiple leaders sounds great on paper. It gets you better high availability, participation, and you can spread out more.

Starting point is 00:02:36 So potentially performance as well. But it's more complicated and much more likely to go wrong. So if you can avoid it, the general guidance is to not do it. And now we're going to go into wacky land, and we're going to talk about leaderless replication. I want to do one last reminder, which is we're still only talking about the use case where 100% of your data can fit on one machine or one node.

Starting point is 00:03:03 Or in the case of replication, all the data fits on every single node. All of it. Yeah, but it's still the same full set of data replicated on every node. Right. It's a copy. So, yeah. I mean, we might as well just call this like anarchy replication because that would be about the same meaning, right?

Starting point is 00:03:26 I think. Yeah. Yeah. Yeah. All right. So, you know, we like to always say thank you to those that took the time to leave us a review. So, huh. I was going to say this one and then I realized that's not what I thought it said.

Starting point is 00:03:41 But I guess I've already, you know, started. So here we go. So from iTunes, we have Tunzer. So thank you, Tunzer. I like the confused look on your face. Well, at first I thought it was like dyslexia set in or something. I transposed some of the things, so I thought it was Tuners, and I'm like, oh, I got this one, I know how to say the word tuners. And I'm like, wait, that S is in the wrong place. We read good. It threw me off. Right. Yeah, that was from the iTunes review there. Nice. All right, well, we totally just blew through the intro

Starting point is 00:04:29 of the show, 100% S tier. So now we can get on to the meat. Yeah, let's do it. Or meat-like products, whatever you want. Yeah, move along. What would that be called? Like Incredible? Impossible. Here's the tofu of the show right here. It was the impossible intro. And now we're into the tofu of the show. Yeah. So talk about leaderless replication.

Starting point is 00:05:00 So when you have leaders and followers, we're still going to kind of mention a few things that we've mentioned in other episodes, just to make it a little easier, because, yeah, it's been two weeks since you listened to the last episode, right? So when you have leaders and followers, the leader is responsible for making sure that the followers get operations in the correct order. And that's really important because you have causal data, where something has to happen for something else. Like an order has to be placed before you can ship it. Or, you know, if you're incrementing numbers or making kind of changes like that,

Starting point is 00:05:30 then it's very important that you get things in the right order. Debits need to happen before credits. If that's how it happened in the real world, I'm at a bank. What if we just let every replica take writes? Made them all equal. It'd be a crazy world. Crazy world.

Starting point is 00:05:49 It's definitely easier to reason about. Forget the complications that might be involved with multi-leader or with replication in general. It's just easier to think about that.

Starting point is 00:06:05 If you had a, you know, a computer science class homework project where you're like, hey, write some program to where you're only going to have one leader who's going to take the writes, but you're going to distribute reads, you know,

Starting point is 00:06:18 that's a complicated problem, but it sounds a lot easier to think about replicating the one set of changes from the one place than when you say, hey, all of these replicas can now take writes, and now figure out how to merge those things together and make sure that there are no conflicts, or if there are, how you're going to deal with them.

Starting point is 00:06:39 There's something more elegant about leaderless. It just seems simpler. You don't have special roles. You don't have topology diagrams. The devil's in the details. There are some restrictions on what you can do, and there are some things you have to give up. But overall, I mean, if there were no pros and no cons and you had to kind of pick one, I think this is the one you would want because it's less crazy. Well, I think it was going back, if I remember right, in this portion of the book, I'd have

Starting point is 00:07:07 to go back and find the exact quote, but I think this is another one of those times of those crazy kids in the 70s, like they were so far ahead because if I remember right, this was like leaderless was kind of like the way that they thought about things and then for some reason they stopped. They were like relational databases and SQL took over. And then they stopped thinking about things in terms of like leaderless, you know, problems.

Starting point is 00:07:34 And now here we are in this world to where that's all we think about. And it wasn't really until Amazon kind of like brought this back into everyone's mind with their use of an internal data system called Dynamo. And then that's what kind of inspired everybody else to start thinking about it again. And then other systems started getting created that they now refer to as Dynamo-like. Yeah, but we have to call these out because one of them has a really cool name. So it's not Riak. It's not Cassandra. It's Voldemort.

Starting point is 00:08:11 Hey, you're not supposed to say that. Yeah, you can't say it. I don't know how we were supposed to say it. That which shall not be named. But I think, though, going back to what you said, and I don't remember if they touched on it in this chapter or not, you know, they said that, yes, it was popular and then relational databases came in and destroyed it. And now it's come back in the past several years. But I want to say a lot of this is based around transactions, right, like ACID-type transactions and all that. I think that's why the relational stuff got popular. And this still,

Starting point is 00:08:47 this doesn't fit every single use case, right? This doesn't fit necessarily the whole order shipping type thing, like what you were talking about earlier, right? Because I don't know that any of these databases support transactions on a level like, hey, you wrote to this table, then this table, then this table, right? Like, I don't think it does that. So keep in mind when we talk about this, you still have to know the use case for where this is going to fit for you. Well, I think that's where Joe's going to hate this. So Joe brought it up earlier, so blame Joe, of a new term, a new-to-us term, called NewSQL. And from the very brief thing that I read about it,

Starting point is 00:09:29 I think that's where they were going, where the purpose of NewSQL was to bring in that type of ACID compliance guarantees on top of these, you know, quote, NoSQL-type systems. Right. Which, I mean, leaderless isn't necessarily NoSQL, but I don't know, for some reason, in my mind, I kind of coupled the two together. Yeah, I don't know of any of these leaderless ones

Starting point is 00:09:59 that are not NoSQL. Right. Yeah. If you remember... Go ahead. Oh, go ahead. I was going to say it's a good use of a double negative. Go ahead.

Starting point is 00:10:09 Yeah, that's right. Yeah, and I was going to say, too, if you remember when we talked about that, we did a whole episode talking about relational versus document databases. And part of the benefit of going with relational databases is that the database and the query optimizer are able to kind of put things together in a smart way. And so you don't necessarily have to know up front how you're going to use your database. You have to break it apart into pieces that make sense, that model your world. But with document databases, you kind of have to know how your data is going to be used because things like joins

Starting point is 00:10:44 don't work very well, and transactions, breaking these into multiple parts. You kind of want to centralize that stuff, so your things are happening in more atomic writes that kind of model a thing. That's less flexible: if you need to make changes to that stuff or break it apart, it's more painful. And so I think every leaderless solution that we're talking about here, I kind of think of as being in that world. These are not necessarily document databases. Like Cassandra's considered, what do you call it, like wide column or whatever. So it's columnar. Wait, you know what?

Starting point is 00:11:18 I'm just going to stop because I'm going to get it wrong. But I did want to say it's not technically a document database. And so there are trade-offs there, but everything we're talking about with leaderless here, we're talking about NoSQL databases. I did want to call out one thing, though. So in the previous episode, I had mistakenly started talking about Kafka as an example of, um, what was that? Oh, shoot, I forgot what it was. It was multi-leader.

Starting point is 00:11:47 Yeah, as a multi-leader example. And then Alan correctly corrected me in saying, no, no, no, no. If you think about it, if you think about the individual partitions, Kafka is definitely a single leader for a given partition strategy. And I had this epiphany, like, oh, crap. Yeah, Alan's right. Yeah, that would be the case. As I said, I'm rereading this book, right? So as I was going back through and rereading this specific chapter, again, I noticed that in the first portion of that chapter where we were discussing single leader,

Starting point is 00:12:25 they in fact referenced Kafka as a single leader example for that. So I was like, Oh man, it was like right there in the book and I had forgotten about it the first time. There's a lot, there's a lot to grok here. So to kind of walk it back a little bit,

Starting point is 00:12:40 we said, what if there were no leaders, no followers, just everybody is equal. Yeah. Equal. Then what? If you need to do your writes, you can write to any of them. And the problem there is, what if the one you try to write to is down? And so the general strategy here is that you're going to want to write to several of your replicas at once, which sounds a little goofy at first. We should go ahead and prepare people with their propeller

Starting point is 00:13:14 hats because we're going to talk a lot about W, R, and N as part of this episode, because this portion of the chapter was all about your W plus your R greater than N. Yeah. And to say what those are, it's the number of writers that you do, the number of nodes that you write to from a client perspective, the number of nodes that you read from your R and the total number of nodes. And so specifically here, when I say you need to write to several replicas, I'm saying you're going to configure that database to say, Hey clients, we need you to write to this many nodes whenever you want to make a single change. Well, hold up, let's back up. So you started on this. Yes. So what we started here was there are two, there's at least a couple of ways to do this whole writing to multiple replicas. You can either do it like what you just said, Jay-Z, which is

Starting point is 00:14:13 the client says, hey, I know that there's five nodes, I need to write to three of them, or the client can write to a coordinator node that will then try and write to the other nodes. So it can either be on the client itself or it can be on some sort of server or service that distributes those writes itself. So those are like the two common ways to do it. Now, since you brought up the coordinator, though, because immediately you would think to yourself like, oh, well, doesn't that mean that he becomes the single one that you're writing to? And he's just figuring out how to replicate the data out then? But they specifically called out the coordinator, why that is not the case, and I'm trying to remember that part now, because the coordinator,

Starting point is 00:15:03 there was like no guarantee as to who he might write to or when he might do the write. If I remember correctly, do you remember that portion of it? It's totally out of your hands. Yeah. Like,

Starting point is 00:15:15 like you're, you're, you're giving to him. I'll, I'll look for the exact way that they phrased the coordinator. Yeah. I might write to these five this time and those five the next time, and it's out of your client's hands. But the net result is that whenever you do any writing, instead of writing to a single leader, even when we had multiple leaders, you still only wrote to one leader and it was in charge of kind of spreading things out.

Starting point is 00:15:36 But in this case, you literally do write to multiple replicas. And we'll talk about those numbers and the ratios and why that happens here in a minute. But I just want to kind of get the general philosophy of writes out here. When you're talking about single leader and multi-leader, you write to a leader and it replicates out. When you're talking about leaderless replication, you write to multiple nodes. Which still sounds goofy to me. Yeah, it's weird. But if you think about it, it's a simple approach to the problem, right? I mean, it really is.

Starting point is 00:16:10 Now, the interesting thing about this is they say, hey, how do you keep these operations in order, right? Like if I write three things out and I need to go to three different nodes each time, how do you keep them in order? You don't. You don't. You don't. You just write them and that's it.

Starting point is 00:16:29 Yeah, there's a couple of advanced techniques for cases where you really need to do something and you can kind of play a couple tricks there that we'll talk about a little bit. But for the most part, the best you can do with a leaderless database is to not have data that you update. So you can get around it, you can work around it,

Starting point is 00:16:46 you can deal with consequences. But if you would just have like log only type data, like analytics data, click stream type stuff, this is perfect. You just write, you don't care as long as the data eventually shows up on all the replicas. So what you just write, write, write, write, write, you write to multiple. That's the price you're paying, and everything ends up fine. It's when you have to make updates or changes or dealing with causal relationships where something has to happen to change the status of something that happened before that you get in trouble. And it's worth noting that the reason why we're talking about the ordering and why there is no enforcement of the order here is when we were talking about single leader or multi-leader replication,

Starting point is 00:17:26 it was necessary, right? Like when it would try and push those writes to their replicas, they needed to do it in order to ensure that it was handling the transactions and all that kind of stuff properly. Right. So,

Starting point is 00:17:40 so, you know, just be aware that that's why it's important to know that this doesn't enforce that. Which is also why the coordinator came up: you might think that if you hand things to a coordinator to blast out the changes, then that makes him the single writer for it. He is basically like a single-leader setup at that point, because I'm going to write to him, and he's going to replicate it out. But you could give him three things to write, and there's no guarantee that that coordinator is going to write them in the order that you gave them. And that is why the coordinator in a leaderless replication setup is not considered to be a single-leader replication strategy.

Starting point is 00:18:34 Okay. Because of the ordering. Yeah. Because specifically in the book it says, unlike a leader database where there's any kind of leader, the coordinator does not enforce a particular ordering of the writes. Okay. All right. So into the next section where we talk about these multiple writes and multiple reads, what do you do if your client or your coordinator isn't able to write to all the nodes or multiple nodes?

Starting point is 00:19:09 What if one of them is down or two of them are down? Yep. And we got an important concept here, an important vocabulary word that we're going to mention a lot. And that is quorum. And you can think of a quorum as being the minimum number of nodes that need to be in agreement for something to be accepted. So you can imagine almost like a meeting of Congress or some big committee might say, in order for this law to pass, you need 60% votes in the affirmative. And there can be senators or people that are not there that day or abstain or vote the opposite, but you need to have some sort

Starting point is 00:19:43 of percentage in order for this thing to count. And that's kind of what we're talking about with a quorum here. We say you've got 10 nodes; in order to have a quorum, we'll say maybe you have to have six that confirm that write for it to count. And that's why you write to multiple, because one of those might be down. But as long as you achieve that number that you're aiming for, that percentage that you're aiming for, we say that you've got quorum and we say that that write took. And specifically, if I remember right, that configuration setting that you would set for the number of writes, it's not, hey, pick three nodes and write to those three. It was write to all the nodes that you can write to, but you only need to get back three successful responses for this thing to be considered successful. And so this is where the book keeps going on with the math of W plus R greater than N, because in an ideal scenario, you would want the number of write nodes plus the number of read nodes.

Starting point is 00:20:45 If that's greater than the total number of nodes, then you're guaranteed that there's going to be overlap there, so that when your reads happen, at least one of those reads is going to come from a node that has the new data, because your reads are also going to come from multiple sources too. So you're going to read from as many as you can. And then the client miraculously, you know, never take for granted the drivers that you use,

Starting point is 00:21:15 like the next time you're like, hey, I've got to use this, you know, PostgreSQL driver to read, or this JDBC driver to read from this data source. You'll never take those things

Starting point is 00:21:25 for granted again. So that thing reads from as many as it can and then miraculously figures out that, oh, this is the one that has the later data. That's going to be the ultimate result that I use from the thing, right? And if you had a total of five nodes, for example, and you had the reads and the writes both set to be three, which they said that typically, you know, you would set W equal to R in those situations, then, because three plus three is six, which is greater than five, you're guaranteed that at least one of those nodes that you wrote to will be one of the nodes that you read from. Yeah, so just to recap that,

Starting point is 00:22:11 what we're saying here is the general strategy is when you write, you write to every node and you need confirmation from some number, which we call a quorum, saying that this has been accepted by at least so many. And at that point, you consider the write done and you can stop listening. You don't care about the return. And then when you read any data, you ask to read from all of the nodes.
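As a rough illustration of the write path just recapped, here is a minimal Python sketch. The `Replica` class, its `up` flag, and `quorum_write` are hypothetical names made up for this example, not the API of Dynamo, Cassandra, or any real driver. It also assumes the writes are sent one after another; a real client would more likely send them in parallel and stop waiting once W acknowledgements arrive.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica; `up` simulates whether the node is reachable."""
    name: str
    up: bool = True
    store: dict = field(default_factory=dict)

    def write(self, key, value, version):
        if not self.up:
            return False  # a node that is down never acknowledges
        self.store[key] = (version, value)
        return True


def quorum_write(replicas, key, value, version, w):
    """Send the write to every replica; it counts as successful if at least `w` acknowledge."""
    acks = sum(1 for replica in replicas if replica.write(key, value, version))
    return acks >= w


# Five nodes, two of them down: three acknowledgements are still possible.
replicas = [Replica(f"node{i}") for i in range(5)]
replicas[3].up = False
replicas[4].up = False

ok = quorum_write(replicas, "customer:michael", "v12 data", 12, w=3)
print(ok)
```

Note that in this sketch a write that misses its quorum is not rolled back on the nodes that did accept it; the caller only learns that the quorum was not reached.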

Starting point is 00:22:33 And when you get that R number back of reads, that's when you can say, okay, I've got my read back. I can move on. And the question is then, you know, how many do you need for reading and writing? And like I said, the trick is your number of writers that you wait for and the number of readers that you wait for needs to be greater than the total number of nodes, because that's how you make sure that you get the latest data. And by having this overlap, that's how you make sure that you've got the almost perfect shot at getting the real up-to-date data. But so just to be careful here, that is when we talk about having your readers plus writers greater than the number of nodes, right? When we talk about that particular equation, that is if you want to, almost like what you said, Joe, if you want to guarantee that you're going to get back fresh data, that does not have to be

Starting point is 00:23:40 how you configure things. They even speak about that in the book, meaning that let's say that you care more about fast writes and you don't care as much about the reads and getting back the freshest data possible. You might make your quorum on writes really low and then make your reads higher or maybe keep it low too, just because you only want that acknowledgement of that write, that it happened, right? Now, this comes with the trade-off of fault tolerance, right? Like if you say that I only want one writer to confirm, then that means that your data only went to one replica, but your latency is ultra low, right? I think this was the scenario, correct me if I'm wrong, but I think this was the scenario where they said where you would do that kind of configuration is because if you, okay,

Starting point is 00:24:31 so going back to a common theme in this book is you need to understand your data and the usage patterns of how you use that. And if your data usage pattern is going to be such that, say, 98% of the time, 99% of the time, you're going to be doing reads, then in that scenario, if I recall, that was the given scenario where they were saying, like, you might not care to do the writes to a quorum of those because you're going to do it so little that who cares? And you could say, I never want a reader to get older data. So I would say read from every single node then. Make sure you get every response and make sure that the response that you eventually return to the customer is the newest one. But if you set your R to be equal to every node, then you cannot sustain a single node outage without failing the entire system. You're as slow as the slowest response that you get there, too. Yeah. Right. So what you'd probably do in that situation, just if we were going to be hypothetical here,

Starting point is 00:25:48 let's say that you had 11 nodes available in that situation, you might make it to where your writes go to three nodes and your reads go to nine. And then that way you guarantee that you're always going to be reading from one of them that has it right. So your writers can happen that way so that you're guaranteed to get the latest data outside of a few caveats, which I don't think we go into that because there are some very weird edge cases to where it doesn't happen. But that's the thing. It would still have to be, it would still have to like, even in that case where there it's

Starting point is 00:26:22 unbalanced, where R and W are unbalanced, it's still, in order to have that quote guarantee, then R plus W would still have to be greater than N. Right. And that's why I'm saying if you care about the reads, if you care about read speed though, or,

Starting point is 00:26:38 or correctness, you might make that one higher than your writes, right? It all depends on what you're trying to do. If you're trying to make your reads faster or your writes faster, then you might sway it in one direction over the other, right? Because really what you're doing is what Outlaw said earlier: when you say, hey, I need to get three reads back to say it's okay, your client actually has to wait and aggregate to get three okays back before it'll say it's okay. Otherwise it's going to say, hey, I failed, you know,

Starting point is 00:27:06 now your application has to do something else. Yeah. Or maybe sloppy reads are like, you know, there was actually a term for like later, I think we're going to get to where it was referred to as sloppy quorums. And there was another strategy called hinted handoff. Yep. Yeah.

Starting point is 00:27:25 Yeah. Yeah. And so the trick to mitigating stale data, like we mentioned here, is that you read from multiple and the replicas keep a version number of the data. So there's a couple of different ways to get that number. But basically you can think of it as like a version number. So you get the data back and it says Michael's customer record, this is version 11. And you look at the data from another replica and, oh, it's like, I've got version 12. Well, now I know I should go with the 12 since that one's been updated. And this node that gave me the 11, obviously it hasn't been synced to yet.
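The version comparison described here could look something like the following sketch. The `(version, value)` tuples and the function name `pick_latest` are assumptions made for illustration; real systems may use timestamps or version vectors rather than a single counter.

```python
def pick_latest(responses, r):
    """Return the (version, value) pair with the highest version.

    `responses` holds one (version, value) tuple per replica that answered.
    Raises if fewer than `r` replicas answered, i.e. the read quorum was missed.
    """
    if len(responses) < r:
        raise RuntimeError(f"read quorum not met: {len(responses)} < {r}")
    return max(responses, key=lambda resp: resp[0])


# Two replicas still hold version 11; one has already received version 12.
responses = [
    (11, "Michael, old address"),
    (12, "Michael, new address"),
    (11, "Michael, old address"),
]
print(pick_latest(responses, r=3))  # (12, 'Michael, new address')
```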

Starting point is 00:27:56 So I should go with the most recent data and move on. Yeah. So that's super important. What he just said, that's where that equation came into play, right? So if you wrote to a certain number of nodes and you're reading from a certain number of nodes and that readers plus writers is greater than the total number of nodes, then you've got overlap between what's being read and what was written to. So you might get some stale data from a couple of those nodes, but you're going to at least get the newest data from one of the nodes. And so when the client compares that, it's going to say, OK, I got version 12. I'm going to toss out version 11. And then that's what your application uses.
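To see why that equation produces overlap, the sketch below brute-forces every possible set of written nodes against every possible set of read nodes. The function name `always_overlaps` is made up for this example, and it assumes the idealized case where a write lands on exactly W nodes and nothing else changes in between; the edge cases the hosts mention are out of scope.

```python
from itertools import combinations


def always_overlaps(n, w, r):
    """True if every possible set of w written nodes shares at least one
    node with every possible set of r read nodes, out of n nodes total."""
    nodes = range(n)
    return all(
        set(written) & set(read)  # an empty intersection is falsy
        for written in combinations(nodes, w)
        for read in combinations(nodes, r)
    )


print(always_overlaps(5, 3, 3))  # True:  3 + 3 > 5
print(always_overlaps(5, 2, 3))  # False: 2 + 3 = 5, so the two sets can miss each other
```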

Starting point is 00:28:34 Yeah, that's pretty cool. We just came up with a system or, you know, other people came up with a system where you can get the latest copy of data without having to worry about any sort of leaders. And so each node is kind of independent, and it's very fault tolerant and available because some number of nodes can go down and you can still get your data and you're still getting back the most recent copy of the data. Like that's pretty good. It's not perfect. It's not a hundred percent,

Starting point is 00:28:58 but it's pretty dang close and it works out really well for most applications. Hey, I just thought of something, though. In our example that we gave, wouldn't that be backwards? Because we had said that if you wanted to put the emphasis on the reads that you might have, you would read out of nine, but you'd only write to three.

Starting point is 00:29:18 If you wanted the freshest, not speed, if you wanted the most up-to-date always. You'd have more readers because then you're guaranteed to read from nodes that had the writes on them. If you want performance, you do less readers so you get less latency, but that's at the cost of staleness. Yeah, so that example, I would say you want faster writes than you want reads. So you can't worry about writing that. And if you want faster reads at the cost of slower writes, then you would go with nine writers and three readers.
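The trade-offs being argued over here can be tabulated with a small sketch. The function `quorum_profile` and its dictionary keys are hypothetical names for this example; the 11-node numbers are the ones used in the conversation, plus two extremes for contrast. Latency is not modeled, only which guarantees each setting keeps.

```python
def quorum_profile(n, w, r):
    """Summarize what a given (n, w, r) choice buys you."""
    return {
        "fresh_read_guaranteed": w + r > n,   # the W + R > N rule
        "writes_survive_down_nodes": n - w,   # nodes that can be down while writes still reach quorum
        "reads_survive_down_nodes": n - r,    # nodes that can be down while reads still reach quorum
    }


for n, w, r in [(11, 3, 9), (11, 9, 3), (11, 11, 1), (11, 1, 1)]:
    print((n, w, r), quorum_profile(n, w, r))
```

Both (11, 3, 9) and (11, 9, 3) keep the overlap guarantee; they differ in which side waits on fewer acknowledgements and which side tolerates more unavailable nodes.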

Starting point is 00:29:51 Because remember, the slowest connection is basically going to be your bottleneck there. Well, I guess where I was going with that is that in my example that I gave, where I was saying, you know, your application is going to do 98% reads, then you wouldn't want your read count to be higher than your write count. Because in that scenario, you're trying to emphasize your reads because they're 98% of your use case. So you want your reads to be fast and lower latency, but you're willing to take the hit on the writes. So your writes would be the nine and your reads would be the three in those same numbers. Unless you care about making sure that you have the newest data, and then you'll want your readers... Nope, that would still be the thing. That would still, because your writes, or because you're writing, if the numbers are nine and three.

Starting point is 00:30:42 Oh, I'm with you. I'm with you. Nine plus three would still be greater than whatever your nodes were. You'd still get the same guarantee, but now it's exactly what you said a moment ago about the fewer nodes, so you'd have lower latency. Right. Okay. Yeah, it's like a race. Because remember, you still read from all of them. You just say, like, take the – whenever I get that quorum number for the reads, that's where I can say done. So if I said we've got 11 nodes, I say, you know what?

Starting point is 00:31:08 Write to all 11 and don't consider that write written until you get an okay from all 11. Then there's problems with that, of course. But you can have one reader. And so you can say read from all, and whatever one's the fastest, that's what I return, which is a nice read experience. It's like the Stack Overflow type setup, right? They don't get that many new questions and answers every day, but they got people just slamming them all day trying to read their stuff, right? That's a good, I think, even though I know they use SQL Server for that kind of stuff. I mean, people are going to be like, oh my God, you're so stupid.

Starting point is 00:31:48 Because prior to reading this book, I didn't realize that's how some of these systems worked. That you were going to read from a bunch of nodes or write to a bunch of nodes. And you're like, hey, I'm going to multicast this out, maybe. That's how it works. I don't know. But I'm going to blast out this write

Starting point is 00:32:05 to all 100 nodes and if 25 of them say that they did it good, that's good enough for me. Or whatever your numbers might be. Right? I never would have... You think about something like that and then you think about all the different

Starting point is 00:32:21 web services that are out there for just storage, for example, like, I mean, I'm not saying that this is how they're doing the storage behind the scenes, but you know, I mean, sure they have similar kind of problems. So, I mean, it's been this, you know, we've, we can't rave enough about this book. Um, cause it really has been an eye opener for me to like think about some of these problems and that I don't get to work on in my day job, you know, that, uh, that I might not have thought about otherwise. So I guess this would be a good opportunity to say, uh, if you haven't read this book and you would like a copy of this book, you need to leave a comment on this episode. You can find it at

Starting point is 00:33:00 www.codingblocks.net slash episode 162. Or in your podcast player, there'll be a link in there that you can click and it'll take you there. You know, but... First tangent of the night. Here we go. Here we go. The latest Apple podcast player, you don't see that

Starting point is 00:33:20 anymore. Have you seen that? I heard that they did something and there's a way to get around it. I don't remember what it is. In the podcast player you can get around it or as a publisher you can get around it? As the person who publishes it, there's a way to get around it. I know how to do it as a publisher, but it would be dirty. Yeah, I don't remember. I want to say it had something to do with the fact that there's HREF tags

Starting point is 00:33:45 and they're not parsing them properly. I don't remember what it was. We need to look into that, though, because we try to make it easy for people to be able to go to this stuff. I'm just going to say it because I know that there's a lot of other people that listen to this show that also podcast. I mean, we talked about Jamie Taylor and all of his, uh, various things last episode. Uh, I think, I think he like has like 18 different podcasts or something. It's

Starting point is 00:34:11 insane. Um, I exaggerate a little, but it was like 16 or 17. Um, in our podcast hosting provider, they actually have a separate section for iTunes now. And what I suspect is happening is because the thing that we're putting in there is just like a blurb instead of like the whole show notes. That's why that's what Apple is taking, I suspect. Um, but it's weird because Apple used to be,

Starting point is 00:34:40 used to get it directly from our feed. So yeah, whatever. Yeah. And that's behind the scenes in Coding Blocks. And, yeah, now we bring you back to our regularly scheduled program. How to keep your data in sync. Yeah, and so, you know, we talked about how you write to many

Starting point is 00:34:57 and then you read from many. And when you read, you take the latest. Well, what about that data that is old? So as the client, you know that you just got old data back from nodes one, three, and four. One strategy for keeping this data in sync is to have the client go update one, three, and four and say, hey, I got something newer from two, five, and seven, whatever. You should get this data if you don't already have it by the time you get this. Now, isn't that just flipping the problem on its

Starting point is 00:35:31 head about, like, how you would do, like, when I read that part, I was like, I mean, well, yeah, sure, sure, you could do that. That's like, totally, like, that's, yeah, I guess I wouldn't have thought that that's how it was done. But yeah, now that I read it, I can't unread it. Yep. So that's one strategy here. And you can have your database only do this. They call it read repair. And the idea is whenever that data is read and only when that data is read, you can have the client say, oh, hey, there's a mismatch.
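A minimal sketch of read repair as described above, assuming each replica stores a version number next to every value and that the highest version wins. The `Replica` class, the `read_with_repair` function, and the in-memory dictionaries are hypothetical stand-ins for real network calls.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    data: dict = field(default_factory=dict)  # key -> (version, value)


def read_with_repair(replicas, key):
    """Read a key from the given replicas and push the newest copy to stale ones."""
    # A replica that has never seen the key answers with version 0.
    responses = [(rep, rep.data.get(key, (0, None))) for rep in replicas]
    newest = max((resp for _, resp in responses), key=lambda r: r[0])
    for rep, resp in responses:
        if resp[0] < newest[0]:
            rep.data[key] = newest  # the client repairs the stale replica
    return newest[1]


n1 = Replica("n1", {"k": (1, "old")})
n2 = Replica("n2", {"k": (2, "new")})
n3 = Replica("n3", {"k": (1, "old")})
n4 = Replica("n4", {"k": (1, "old")})  # not part of this read

value = read_with_repair([n1, n2, n3], "k")
print(value, n1.data["k"], n4.data["k"])
```

Only the replicas that took part in the read get fixed. The fourth replica here keeps its old copy until some later read happens to include it.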

Starting point is 00:36:02 But because of this quorum, I know that at least one of these has the most recent data. And now I'm going to fix the rest of you. And they did call out specifically, though, that in this type of situation, depending on how often your data is read or that particular piece of data might be read, you could technically have stale data that lives out on your nodes for a very long time. Right. So if you never read that old data from a client app, then it doesn't know that it's old on those other replicas. So it never gets updated. So it's an interesting problem. Or worse, that old data is never read, and the new systems that have the newer version,

Starting point is 00:36:45 something, they die, they, you know, they crash and burn, whatever, something happens to them. I guess the data just don't exist. Yeah, you're stuck with the old data now. Um, yeah, it's, it's all weird, right? But the idea is that eventually, over a long span of time, the data will be consistent. But they don't say eventually five minutes, eventually one hour. DynamoDB is almost 10 years old now. There could be data out there that has never been read, never been fixed up. Maybe they might do something else, implementation details. But it's just kind of crazy to think that's a valid strategy. It really, it really goes back to, you know,

Starting point is 00:37:25 knowing your data and your use cases. And that if you were going to go with a system that uses read repair as the strategy, then it's, it works for you because you know, it's a high read situation. And so it's likely to be corrected by the client. But if it's for archival purposes and you're looking at using a system that uses read repair, you need to rethink your strategy. It was kind of my takeaway from it. Archival type use cases would not be where you would want to use read repair. If you're doing a backup or a data dump, then you're probably not doing a read in the traditional sense that a client would do. Maybe you're just copying files on disk or something like that would be a case where you could really miss some data. Well, I want to be careful with the choice of words there

Starting point is 00:38:11 because when I was talking about archival purposes, I didn't mean like you were taking a backup of the database. I just meant like, you know, you were you were going to think of an example here. Wayback machine. You're storing data for a long time. Say what now? Archive.org or the Wayback Machine for the web. Yeah, but I mean, maybe like, let's picture, I don't know, the thing that came to mind was like portions of your resume. Like if you had your resume as like portions that were in, you know,

Starting point is 00:38:42 each a separate document or row in a database, right? Like stuff that you did from 10 years ago isn't going to get written often, right? Or maybe not even get read often because like who's going to go that far back, right? And so if you did go back and say, oh, you know what? I just remembered some project that I worked on from 10 years ago. I'm going to go add that to the project. But if it's on a separate, you know, if it's literally like a separate row and, you know, cause you know, a reader of that isn't going to get to it until they got to

Starting point is 00:39:11 like the eighth page of your resume. Cause your resume is way too verbose. Then, uh, you know, in this kind of system, right? Like that isn't going to be highly read. So therefore it might not get, um, I mean, I'm not trying to say that your resume wouldn't be extremely exciting. That probably came across as rude and I apologize. Uh,

Starting point is 00:39:34 not you, but your friend's resume. Then I mean, you know, I wouldn't even talk about your friend. I was talking to Respect, respect. I get it.

Starting point is 00:39:46 Yeah. Some guy that you saw at the coffee, you know, at Starbucks, that guy's resume is probably not going to get read that often. At least not the eighth page of it. So there is a solution to this, though, right? Like, instead of having this read thing that has to go update these things as it gets it, there's this other thing called anti-entropy, which basically is just a background task that will go try and sync that data up with the other replicas. Which sounds very similar to what the, um, you know, single and multi-leader replication stuff would do too. You know, I mean, it's kind of what you just always imagine, or at least I always imagined. Like, this was already a thing. Like, why is it going to have such a negative name about it?

Starting point is 00:40:27 Why is it anti-anything? No, come on. This is the way. This is the Mandalorian. And the other thing is the anti-way. Because the other thing is just weird. But, I mean, it works. Yeah.
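A minimal sketch of an anti-entropy background pass, assuming each replica is a dictionary of versioned values and the higher version wins. The names `sync_pair` and `anti_entropy_pass` are hypothetical. Comparing every key on every pair of replicas, as this does, would be far too chatty for a real system, and a delete would need its own versioned marker (a tombstone), which this sketch does not model.

```python
def sync_pair(a, b):
    """Copy the newer version of every key in both directions between two replicas."""
    copied = 0
    for key in set(a) | set(b):
        version_a = a.get(key, (0, None))
        version_b = b.get(key, (0, None))
        if version_a[0] > version_b[0]:
            b[key] = version_a
            copied += 1
        elif version_b[0] > version_a[0]:
            a[key] = version_b
            copied += 1
    return copied


def anti_entropy_pass(replicas):
    """Sync every pair of replicas once and return how many records were copied."""
    copied = 0
    for i in range(len(replicas)):
        for j in range(i + 1, len(replicas)):
            copied += sync_pair(replicas[i], replicas[j])
    return copied


# Each replica maps key -> (version, value).
replicas = [
    {"a": (2, "new")},
    {"a": (1, "old"), "b": (1, "x")},
    {},
]
anti_entropy_pass(replicas)
second_pass = anti_entropy_pass(replicas)  # nothing left to copy
print(replicas[2], second_pass)
```

Unlike read repair, nothing here waits for a client to read the data, so records that are never read still get brought up to date.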

Starting point is 00:40:39 And I'm a little upset that somebody ticked off the next one. Because the database that should not be named only uses read repair, which is interesting. So Voldemort, they only use. Yeah, I'm going to show up. But that's pretty interesting that they like they were just like, no, we're not doing this background testing. Well, I mean, I joke about the anti-entropy, but in reading about it, that actually sounds a lot more complicated. The read repair sounds much easier. If you're going to read from whatever your R is configured to, let's go back to our five-node example, so R and W are both three. You're going to read from the three nodes, and if any one of those needs to be updated or two of those need to be updated, then fine. You know, no big deal. Trying to figure out these things that like where to thinking in a relational database kind of way that,

Starting point is 00:41:47 that when you take that away from it, then this anti-entropy makes it sound like it'd be way more complicated to find what are the new pieces and is it new because it changed or is it new because it was deleted and it has a tombstone marker? Like, you know, and not to mention, like you said,

Starting point is 00:42:03 the ordering for the anti-entropy is not guaranteed at all. So you can imagine a strategy where it's just like, uh, all the nodes are constantly just querying each other like normal clients. So, hey, um, go fish, you know, like, do you have, uh, do you have record, uh, one two two? Okay, cool, I got it. Okay, you got it. All right, whatever, next. You know, so, I don't know, I like the idea of these kind of nodes just sitting around lazily talking to each other. You know, of course, random is one way to do it. You could also go very procedurally and vet the whole database, depending on what you want. But I just kind of

Starting point is 00:42:37 like the idea of these nodes sitting around playing Go Fish, talking about the weather. Yeah, I mean, it's kind of like, as you were describing that, it kind of reminded me of the all-to-all drawing where, like, every node was talking to every node, you know. Yeah, but you got to schedule that. You got to make sure it's not too chatty. So it's definitely more complicated, no doubt about it. And the read repair just sounds so cool that you should probably just go with that, right? Uh-huh. All right. So we're kind of, we're going to jump over a few of the things that we had in the notes here, cause we've already talked about a lot of it. So the quorum for the reading and the writing, um, the one thing I do want to touch back on though, is what we

Starting point is 00:43:17 mentioned is that a common way for doing this leaderless thing is to make sure that your number of writers plus your number of readers is greater than the number of nodes. And then that will ensure that you get fresh data back and that you're also writing in a way that is fault tolerant. Now, the interesting thing here is, and Joe pointed this out in the notes, is let's say that you have 10 nodes and you have five readers and five writers. It's very possible that because you don't have overlap, you're not greater than the number of nodes. You could have written to nodes one through five, but then when you went to read, you read from node six through 10. And so you get old data.
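A brute-force sketch of the overlap argument above, assuming a write can land on any set of w nodes and a read can hit any set of r nodes. The function name `overlap_guaranteed` is made up for illustration. It only checks the counting argument and ignores the other edge cases that can still produce stale reads.

```python
from itertools import combinations


def overlap_guaranteed(n, w, r):
    """True if every possible write set of size w shares a node with every read set of size r."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


# 10 nodes with w = 5 and r = 5: writes to nodes 0-4 and reads from 5-9 never meet.
# Bumping w to 6 makes w + r greater than n, so some node is always in both sets.
print(overlap_guaranteed(10, 5, 5), overlap_guaranteed(10, 6, 5))
```

The check fails exactly when w + r is less than or equal to n, which is why the rule has to be strictly greater than.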

Starting point is 00:44:00 That's why I was calling out earlier that it's specific that it not be greater than or equal. It has to be greater than. Greater than. But I do want to clarify one bit of terminology with, like, it's not the number of writers or readers, because that makes it sound like it's a task off on some other service. It's the number of successful writes or the number of successful reads or confirmed, you know, whatever your choice word.

Starting point is 00:44:24 Like, you know, it's meaning, meaning, and the reason why I want to call that out is because when you say like a writer or reader, it makes it sound like it's some other service on some job in the data center. But really, this is on the client. The client is making this determination of I got enough writes, confirmed or successful writes. And I got enough successful reads that I can move on. You know what I like is if you would explain this to a DBA, like 10 years ago, like, you know,

Starting point is 00:44:53 the way we're talking about now and said, what we do is you send to every node and then you read from every node. It's great. They, they would have had a heart attack right there because, like, it's terribly inefficient, and it is, but the deal is you get to scale out. So this is not something you would ever do with a single node. This is something that only makes sense for larger scale type things. I mean, I think we've joked about this in the past too,

Starting point is 00:45:14 but um or and if we haven't then uh hey new to you um you remember you remember like go back 20 years, you know, or, or more 25 years, you remember like how setting up SSL on your, your server, like that was a big deal. You only flipped to HTTPS when you very specifically needed to do something like authentication. And then you immediately came back because the, the burden on the server to encrypt the traffic for the number of concurrent requests was just too great. So it's like, hey, if you don't need to be encrypted, then get out of there, right? And so to your point about going back 10 or more years to talk to a DBA and be like, hey, check this idea out. What I'm going to do is I'm going to write to all of them. Right.

Starting point is 00:46:07 Like, you know, of course, cause of course they would flip because, you know, I mean, we've just gotten, fortunately, like as things have progressed in time, right? Like processing has gotten better. Network speeds have gotten better. Latencies have gotten lower. So we can start to take advantage of some of that, you know, and maybe that's why i like things from the 70s were like now i get it now i can now i can actually implement it because we got it

Starting point is 00:46:31 jeff bezos heard this and he's like wait a minute we got a database system here where you write all 100 of the data over and over again to every computer every time and you every time you read you get all the data if we charge people by the storage, by the compute and by the network traffic, they'll just pay it. Yeah, that's right. That's right.

Starting point is 00:46:53 And they, and they never actually remove any of the test stuff that they had out there. Beautiful. And this stuff can, they don't even have to query it to fix it. They can be just like chatting in the background and just passing data back and forth all day long. Yeah, yeah.

Starting point is 00:47:07 Let's do this. And that's how you got AWS. Speaking of those who shall not be named since you named him. Because, you know, he kind of looks like, doesn't he kind of look like? You see some of the pictures? He kind of looks like it, right? Have you seen? So there's the story about he's going to be one of the first to ride in one of the space trips where he's not an astronaut. There's a petition.

Starting point is 00:47:38 I don't know if you've seen this, to not allow him re-entry into Earth's atmosphere. It's got over 120,000 signatures to not allow him to re-enter. That's amazing. It's so funny. It's mean. It's cruel. But it's also a little funny. If you rearrange

Starting point is 00:47:59 the letters of the person who created that petition, it actually spells his ex-wife's name. Tom Riddle, Voldemort. Sorry, I couldn't resist. What are those called where you can rename it? An anagram. There you go. Yeah, it's not true.

Starting point is 00:48:14 So back on to the serious stuff. One of the cool things that you get if you do this reads plus writes is greater than the nodes. If you take the number of nodes and you divide it by two and then round down. So if you had nine nodes, you divide by two, you got four and a half, just round it down to four. That's the number of failed nodes you can tolerate in that situation and still be able to do your writes and your reads. So that's pretty nice.

Starting point is 00:48:45 And just having that simple equation makes it to where you kind of know, you know, how available your system can be. And we mentioned about how you could kind of tweak those numbers for your workload to make what makes sense. But kind of the standard general advice is if you don't have a special reason, just go about half. So take those nodes, add one, divide by two. There you go. There's your R and N, or R and W rather, reads and writes for the quorum. Successful reads and writes.
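A small worked example of the "add one, divide by two" rule and the number of failed nodes it tolerates, assuming an odd number of nodes as in the nine-node example. The function names are hypothetical.

```python
def majority_quorum(n):
    """Smallest w (and r) so that two quorums of that size must overlap."""
    return n // 2 + 1  # for odd n this equals (n + 1) / 2


def tolerated_failures(n, w, r):
    """How many nodes can be down while both reads and writes still reach quorum."""
    return n - max(w, r)


for n in (3, 5, 9):
    q = majority_quorum(n)
    print(n, q, tolerated_failures(n, q, q))
```

For nine nodes this gives a quorum of five and four tolerated failures, matching the divide-by-two-and-round-down figure. For an even node count the majority quorum is larger relative to n, so the tolerated count comes out one lower than that shortcut suggests.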

Starting point is 00:49:16 I wonder, though, like what would be realistically like take an Amazon.com, for example, you know, if they were on a system like this, for example, then how many would they have? Right. Because, you know, you think of an amazon.com, right? Yeah. It would have to be a lot, right? And so what would you do for your number of reads and writes there? If you had a thousand and you're like, OK, you need to have back five hundred and one successful reads, five hundred and one successful writes, like everybody's home Internet connection would be just saturated with read and write requests to Amazon. Even if, you know, you really weren't doing much to shop from Amazon at that time. Obviously, it points out that, at least in my mind, that you're not going to have obscene numbers of these things. I just went and looked.

Starting point is 00:50:24 It looks like the most you can have for a standard AWS account without making a phone call is 50 per region. 50 nodes in Dynamo. Wait a minute. Hold up. That's a great call out. I'm so glad this was said, too. The Dynamo that we were talking about before is not the Dynamo that Amazon makes publicly available through AWS. That's where the author called it out. It's very confusing

Starting point is 00:50:52 that Dynamo, and when we say that that system that shall not be named and Riak and what was the other one? Not Cassandra. Cassandra. Oh, it was Cassandra. That they're Dynamo-like, that's referring to an in-house system that Amazon has that they don't make publicly available. But to confuse things, as an AWS service,

Starting point is 00:51:18 they have a completely different system called DynamoDB, which is leader-based. Oh, that's interesting. I missed that. It's good that we're not confused. If we ever finish this chapter on replication, the next one is on partitioning, which starts getting into splitting up your data so it's smaller and can fit across multiple nodes. Once you bring that into play, the numbers start getting really big

Starting point is 00:51:45 and, yeah, you can spread things out a little bit better. But remember, we're still focused on just whole data per replica, or node, per machine. Well, virtual machine. Anyway, it's so hard to talk about stuff accurately. It is. Wait, we were talking about things accurately? I gotta go back and reread. Hold on, I'll be back. We're supposed to. So, uh, you know, everything we talked about, we mentioned that there's ways for things to go wrong. The author explicitly lists five edge cases and, uh, kind of a category, which, the category with one of those five is basically like, and there's a bunch more we're not going to go into details of. And so when I wrote the notes,

Starting point is 00:52:27 I kind of figured like, you know what, this is a complicated topic. I think it's, you, you can imagine cases where things go wrong, you know, nodes coming down, coming up as you're going along,

Starting point is 00:52:37 uh, nodes coming up as you're querying. Just, uh, you can imagine all sorts of things. And, uh, yeah.

Starting point is 00:52:43 Or reads happening as the writes are happening, like that. Yeah, there's, there's so many different things. So, like, we didn't want to go into crazy detail on them. Yeah, and what if you can't get a quorum? You say you write, and you say, I need at least three to accept this, uh, this write, and you don't get it. After two minutes, you just got two sitting out there. What do you do? Do you try to walk that back? Do you just go with it? Like, well, you know, what do you do? Yeah, so, and all that gets into really specific implementation details. There's things that databases have to decide what to do with that. And so we start getting out of general rules we can talk about like we have so far, and so we're just going to skip it. It goes into like five plus or minus 30. Things that can go wrong.

Starting point is 00:53:27 Plus or minus 30? 30, yeah. Exactly. About. Yeah. At least we got a range. I mean, it could have been worse. Within a few standard deviations.

Starting point is 00:53:36 Yeah. Here's a laundry list of problems that might happen. And, you know, you're guaranteed to get one of them. Here's a good question. How do you know how stale your data is? So we talked about if you've got this reader fix-em-up strategy. I forget what

Starting point is 00:53:56 it's called already. Read repair. Fix-em-up's good. I like that one better, honestly. Fix-em-up strategy. You can imagine having a Prometheus or a dashboard or something that kind of keeps track of how far behind your nodes are. Because that's the question you would want to know. It's like, I've got 11 nodes here. How are we doing?

Starting point is 00:54:14 How similar are they? How different are they? Well, that's a really hard question to answer, especially if you're doing that read repair strategy. Because you don't really know that there's a problem until you see one. Yeah. I mean, they, they talked about the fact that in the single leader, or even in the multi leader replication, there's this ordering. So, you know, that if you did write number 10 on your leader, then if, if your replica

Starting point is 00:54:39 over here is on write number seven, it's three behind, right? Like, that's easy to do. In this leaderless world, I don't know, like Joe said, maybe it'll never be updated because it's never going to get read again. Or even if it is read again, there's no guarantee that the writes happen in the same order. So you can't just look at the write-ahead log to be like, oh, this is where you're supposed to be and you're not, so you're this far behind and done. Yeah. They did say that there was an algorithm that somebody's come up with. I don't remember exactly what it was, but they said the worst part is right now, at least in

Starting point is 00:55:15 the leaderless world, it doesn't seem to be a priority to expose those types of metrics, right? So it's kind of on you to come up with your own way to make this happen, which is a little unfortunate. The way I remember that was a little bit different, which was that there was research in that area, but there wasn't an answer. Yeah. I think they said it was reasonable, right? It would give you a guesstimate, a decent guesstimate, but it's not implemented on most of the systems. Yeah, so the paper that the book referenced referred to PBS

Starting point is 00:55:48 or probabilistically bounded staleness, which is almost what it sounds like. Basically, what they do is they do random sampling and they figure out how good the samples are. It's almost like sampling animal populations. So they go out for an hour and they say, okay, we counted 20 deer so multiply by the number of

Starting point is 00:56:04 miles in the state and, hey, we got 20,000 deer in the state. And so that's the kind of sampling that's done for animal populations. And that's kind of what they're doing here is they basically take those random samples, extrapolate, and say, hey, pretty good or pretty bad. And it's not

Starting point is 00:56:19 great. And from what you can tell, most of the databases that we're talking about don't really implement that. Like we mentioned, it's still kind of, uh, it's not experimental, but it's just not widely used. But one thing that is widely used that I don't think I put in the notes is that one thing you could do is just keep a total counter of writes per replica. And then I could say, hey, replica number one, how many you got? Uh, 1013. All right, you, number two, how much you got? 1015. Like, well, they're about two off. I don't know which two, but we're about two off. So we have a pretty good, we have a 0.2 variance on this thing. Yeah, it's pretty good. Now if you were to, you know, ask again

Starting point is 00:57:01 a second later, it could be wildly different. You know, it depends on how much variation, you know, or how much data you had coming in. It's kind of funny, but yeah, it's an interesting problem. I wonder, though, in that scenario that you described, I mean, yeah, that works fine if all nodes are created equal at the same time from the start. But if you have, say, three nodes already running and you're like, you know what? I want to bring two more on. Then do they start doing writes from the beginning of time or do they just take a snapshot of what was already on one of the others and start it and

Starting point is 00:57:34 start from there. And then in which case they're going to get, does that count as the first write? Does that count as write one or do you like not count that one? And then the first write that they actually get is count one. And so they're going to look like they're thousands behind. You know, I don't think the book goes into, uh, that strategy. So I would assume it's kind of like leaders, which is what he said, basically takes a snapshot, and once it gets far enough along, you can say, okay, you're ready to come in now. But how do you know if it's far enough along to put in, right? That's the what

Starting point is 00:58:01 we're talking about. This episode is sponsored by Educative.io. Educative.io offers hands-on courses with live developer environments, all within a browser-based environment with no setup required. With Educative.io, you can learn faster using their text-based courses instead of videos. Focus on the parts you're interested in and skim through the parts you're not. So I have a confession. I've been doing a little bit of TypeScript lately, uh, and that's technically TypeScript because I've got the .ts extension, but I've basically just been doing JavaScript because it kind of works for the most part, and I've been ignoring all the really cool things that TypeScript has to offer. And we're bringing this language in for a reason, right?

Starting point is 00:58:44 because it's great. So I went and looked on Educative, and they have a whole path dedicated to TypeScript and a total of seven courses that deal with it. So I started one, and I'm already learning. And what's really nice is because I know JavaScript pretty well, I'm able to scroll through really fast through the things I know. So I skipped the chapters on var and const and what, and jumped down into things I didn't know,

Starting point is 00:59:07 which the first thing I smacked into was declare, which I had no clue. Do you know what declare does in TypeScript? No, you should take this course. What was really nice is stuff where you can run, you know, we mentioned these little playgrounds where you can kind of code around and

Starting point is 00:59:21 like answer, you know, try to solve the problems they give you. But what's really cool too, is also had some um non-happy path situations where like hey try to run this code you do it and it breaks and gives you an error message and then they go on to explain why that error why that might be counterintuitive so it's just a really great way to learn like i said it was so nice to skip over the things that i don't need to learn because this is something that's so familiar so it's been really painful for me

Starting point is 00:59:44 to try and watch a video on TypeScript, for example, because I feel like I know so much of it that I'm tuned out by the time I get to the parts that I don't. So that's been really nice. Yeah. I mean, the,

Starting point is 00:59:55 the learning environment that educative.io sets up for you is, uh, very easy to, for you to be able to tailor the experience to what you want by allowing you to just easily skim through, you know, the, the whole blocks of, of, you know, text. You're like, Nope, don't need to even read that. I can just move on and play in the playground that I want. And, you know, in the, um, in the sponsor tease, I had mentioned, you know, like, Hey, if you're using, uh, you could use educative.io as like part of your interview prep preparation, interview prep preparation.

Starting point is 01:00:29 Is that too many preps? And they have a new course called Decode the Coding Interview. So you can stop grinding through endless practice questions and get straight to real-world examples. And all within your browser there, uh, your browser, there are no need to switch to your IDE or download some SDK or install some special package or whatever. Educative has the combined, uh, you know, has combed through the most commonly asked interview questions at top tech companies and has crafted a set of scenarios for you to learn from. And the courses are available in Python, Java, C++, and JavaScript. I mean, that sounds perfect, right? For an interview prep,

Starting point is 01:01:12 like that's exactly what you want is somebody else has already done the hard work to come up with like, here's the questions you're likely going to get asked. Oh, and I forgot to say too, they've also introduced the, you know, there's their best-selling Grokking the Interview prep series as well. It has system courses like Grokking the System Design Interview and Grokking the Coding Interview. Well, now tell us about the new one, Jay-Z. Yep, newest one, Grokking Machine learning interview focuses on the system design side of machine learning have you heard about um

Starting point is 01:01:48 ml ops is what the word I was looking for machine learning operations so that like the actual implication of these systems because it's really tough especially as good things get big and this of course is designed all around really getting to the heart of that and you'll actually go through real world systems such as the ad prediction system.

Starting point is 01:02:08 It's the only course of its kind on the internet. Yeah, so go ahead and visit educative.io slash codingblocks to get an additional 10% off an Educative unlimited annual subscription. You'll have unlimited access to their entire course catalog. But go ahead and hurry because they don't run these deals very often. That's educative.io slash coding blocks to start your subscription today. All right, everybody, it's that time of the show where we pause to ask you, if you wouldn't mind, why did you get like your deep, serious, like, hey, baby. That's right.

Starting point is 01:02:46 Did you notice like how intimate all of a sudden he got with it? I mean, I'm asking for an intimate thing here. It got weird. I'm asking for an intimate thing here. Okay. I'll put on the soft voice. I'll put on the soft voice. It can be our voice.

Starting point is 01:02:58 Yeah. That's right. So if you haven't had a chance yet and you would like to give back to the show, please do. Consider leaving us a review by going to codingblocks.net slash review. We have a couple of links there that will help you get to a good place to write one of these reviews. And as always, we appreciate them. They put smiles on our faces. And yes, we love it when we get those.

Starting point is 01:03:21 Thank you. And back to our regularly scheduled programming. I can't unhear it, man. I feel like you should be like a nighttime radio DJ from like the 70s or something. You're listening to the sweet, smooth sounds of WJZZ. You missed your calling. Hey, don't think I can't do this. I might switch careers.

Starting point is 01:03:55 You probably should, man. I think we found your calling. I think we did. I got the bass. I got the bass. Yeah, right? So I'm sure I can make that happen. I just got to smoke a few cigarettes, and then I'll have that raspy, bassy voice. Right, that's what we wanted to hear.

Starting point is 01:04:07 You're listening to the sound of WJZZ. You're not going to like the salary, though. Well, I don't know. With that kind of voice, maybe. That's right. Mike Wazowski, I need you to submit your paperwork. All the parents just got that reference. Yes.

Starting point is 01:04:31 All right. So what about it's time for my favorite portion of the show. Survey says. All right. A few episodes back, we asked, do you want to run your own business? And your choices were, heck yeah, Shark Tank, here I come. Or, I don't know, that sounds like a lot of work. Or, I already do, and the boss is a real jerk.

Starting point is 01:05:01 So this is, what, 162? So, Jay-Z, you are up. Oh boy. Okay, uh, geez. Um, "I don't know, that sounds like a lot of work," at 34%. Stole my pick. I'm going to go with "I don't know, that sounds like a lot of work," 35%. Oh, that's dirty. That's dirty. It's fun now. I hope we both lose. There's going to be some fighting.

Starting point is 01:05:37 We're going to play Call of Duty later, and somebody's going to get sniped. That's right. We're going to turn on friendly fire, and it's going to get ugly. Um, okay. So Joe says, "I don't know, that sounds like a lot of work," for 34%. And Alan says,

Starting point is 01:05:55 I don't know. It sounds like a little bit more work for 35%. And the winner is neither. Oh, come on. Wow, okay. Heck yeah, Shark Tank, here I come. All right. Hey, congrats.

Starting point is 01:06:11 44% I like it. We were being Debbie Downers here. Yeah, you were. Yeah, you were. I like it. So if you want to complain to Alan and Joe, you can find them on Slack. You'll never guess Joe's name because it changes constantly.

Starting point is 01:06:28 That is great. Now, we're easy to find on Slack. We're just at our name, at Alan, at Joe, at Michael. Yeah, so you can complain to them and be like, hey, why would you think that we wouldn't be optimistic and want to work for ourselves? Now, that said, you guys weren't

Starting point is 01:06:51 far off because I don't know, that sounds like a lot of work, was 42% of the vote. Oh, so it was real close. Yeah, it was. What was the, heck yeah, what was its percent? 44. I thought I said that. Did I not? It was 44, 44. 2%. Okay. I still like it. I like the positivity. Yeah. Yeah.

Starting point is 01:07:15 Yeah. I mean, it just really says like a lot of, uh, go-getters that we have. Um, and, and I was really surprised too, though, because I know like from like Slack conversations and you know, the conversations and the virtual meetups that we've done and whatnot, there are several listeners that do work for themselves. So I was actually surprised that that wasn't more of the vote than… They were too busy to take the survey. Probably. That checks out. Yeah, right.

Starting point is 01:07:43 Their boss wouldn't give them the time off to do it. Exactly. So for today's survey, we thought we'd ask, because, you know, there's this interesting story that came out. You know, TikTok has always been, you know,

Starting point is 01:08:00 there's been some controversy about it since its inception. It seems like, or maybe that's just, you know, from the news sources that I read or listen to. And there is a new one that came out about TikTok and collecting biometric data. So it made us question, like, hmm, do you have TikTok installed? And your choices are: heck yeah, I love those videos, or nope, no way, never. That's it. Pretty simple.

Starting point is 01:08:28 I think I'm going to be upset by the answers on this one. It's wildly popular, so I don't want to taint the jury pool there, but I think the cards are stacked against you. A little bit popular.

Starting point is 01:08:44 What is it with every iOS app now? Like, hey, can this app access all your other data for all your other apps on your phone? I'm like, this app does nothing. Why would it need access to everything else on my phone? Well, that's because Apple, with the latest iOS, they started exposing those things to make the users more aware. I know. it's ridiculous, man. I don't like phones.

Starting point is 01:09:09 I'm going to get me a flip phone back. I think so. Well, I mean, they got smart flip phones. So, you know, you get one of those.

Starting point is 01:09:16 I want a dumb flip flip. I don't install mobile apps anymore. I just use websites. So they have to buy that crap from Google. I do the same thing. I seriously do. I, I tried to avoid installing apps if i can i feel like the average user uh phone user installs zero apps per month really yeah well i'd imagine the the first month you get the phone you install

Starting point is 01:09:37 like a thousand and then after that you're like well i can't even find the ones i have so yeah oh man what was the name of this phone it's gonna it's gonna kill me now because there was like these commercials that you would see on like some of the news stations like uh it was a where it was a service never mind i can't remember it was basically like a serve it was a a cellular service for, um, kind of catering towards like older, the older generation where,

Starting point is 01:10:12 where, you know, like, uh, I don't know. I'm just gonna make up a name, like go call or something like that, you know?

Starting point is 01:10:19 And it would be like, they would just give you a flip phone and here you go. Like you never saw that any of those commercials on like, it seems like all the news stations would have them in the middle of the day no but i figure like that's where alan's going to end up pretty soon i'm not far off like he alan would be the type of person to move into like one of those 55 and up type neighborhoods that take care of like all the yard and everything like alan would be the type of person to move into it way sooner than he needed to.

Starting point is 01:10:46 Just cause he's like, no, this sounds like luxury. I don't have to take care of the lawn. You get my trash for me. No. Yeah. Let's do that.

Starting point is 01:10:54 Hey, you're making me sound lazy. I don't mind paying my guy to cut the grass. I don't, you don't mind to pay him. That's right. I mean, I'll work to make that money and then I won't to pay him so I don't have to do it.

Starting point is 01:11:07 I didn't say that you didn't have to pay for it in those communities. I mean, you know, decided for you. All right. Well, now back to our regularly scheduled program, and we continue with Joe. All right. So, when things don't write. Well, you know, what happens if you try to do a write and you don't have enough nodes for a quorum? You know, we mentioned earlier it's kind of up to the database to decide what to do with it, based on what it's trying to do. So what do you do? Basically, you return

Starting point is 01:11:42 an error if you can't get a quorum, or you take it, right? Those are your only two options. You say, no, back it out, or, we'll take it and figure it out. I don't know. Come back later. And if we choose to go on operating, and we've chosen to say we require a quorum, we didn't get it, but we kept it anyway, that's what we call a sloppy quorum, because it didn't meet our minimum and we

Starting point is 01:12:07 did it anyway. You know, what was interesting here too is they brought in this term that they didn't mention anywhere else, and they kind of snuck it in, and they were saying, even if you didn't meet the total number,

Starting point is 01:12:23 maybe it wrote to nodes that weren't the home nodes. And that was the first time that they mentioned it in the entire section, which means that maybe there's like these primary nodes that you have and maybe some secondary, I don't know. But then they likened it to, hey, what if you get locked out of your house? You might go over and knock on your neighbor's door and be like, yo,

Starting point is 01:12:45 can I crash on your couch? And they'll be like, yeah, but as soon as your house gets unlocked, you're out of here. So they brought this up, and like I said, they didn't really mention it anywhere else, so I don't know where that came from, but they also wrapped that in there with the sloppy quorum. Yeah, I totally missed that. For like home, doing some Googling here. Yeah, it was odd, but you know, whatever. That was in the gray box. Don't read

Starting point is 01:13:17 the gray boxes. You got to read the gray boxes. That's where all the good stuff is, that's where the tofu is. But so there was another thing that they said about this that was interesting: by doing this, you can be more available, but it does come at the cost of consistency, because if for some reason your standard node goes down and these other ones, these temporary ones, don't ever get, you know, updated back across, then you lose that data, right? You'll never come back up to a consistent data state. Yep. So what about concurrent writes?
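
Looking back at the sloppy quorum and the "crash on your neighbor's couch" analogy from a moment ago, here is a minimal sketch of how that could work. Everything in it (the `Node` class, `sloppy_write`, `hand_off`, the `hints` list) is a hypothetical illustration, not the API of any particular database. It assumes the write still needs `w` acknowledgments, but lets a fallback node stand in for a home node that is down and hold the write until the home node comes back.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)
    # Writes held on behalf of a home node: (home_node_name, key, value)
    hints: list = field(default_factory=list)


def sloppy_write(key, value, home_nodes, fallback_nodes, w):
    """Write to the home nodes; borrow fallback nodes if too few are up."""
    acks = 0
    for node in home_nodes:
        if node.up:
            node.data[key] = value
            acks += 1
    # Each unreachable home node may get one stand-in (the "neighbor").
    missed = [n for n in home_nodes if not n.up]
    spares = [n for n in fallback_nodes if n.up]
    for home, spare in zip(missed, spares):
        if acks >= w:
            break
        spare.hints.append((home.name, key, value))
        acks += 1
    return acks >= w


def hand_off(fallback_nodes, nodes_by_name):
    """Once a home node is reachable again, move held writes back to it."""
    for spare in fallback_nodes:
        remaining = []
        for home_name, key, value in spare.hints:
            home = nodes_by_name[home_name]
            if home.up:
                home.data[key] = value
            else:
                remaining.append((home_name, key, value))
        spare.hints = remaining
```

If `hand_off` never runs, or the fallback node is lost first, the home node never sees the write, which is the consistency cost mentioned above.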

Starting point is 01:14:13 So, you know, we mentioned leaderless. So what happens if we have two different clients trying to write the same exact data with two different values? So we mentioned the version number, like kind of a logical clock thing. Basically, if the client can tell us the version that they're trying to write, because they just read version 16 and now they're writing an update to it. So here you go, this will be your new 17.

Starting point is 01:14:36 That's one way that the nodes can detect that there's a problem, because, hey, I got 17 and you've got 17 and the values are different. Sounds kind of unlikely, but depending on your use case, you know, for different kinds of applications it could be a bigger problem or not. But it does happen, and it's something that you have to make a decision about. These databases can't just leave that unhandled. They have to pick something to do about it, and there's a couple of strategies, and we're going to, I don't know that that like,

Starting point is 01:15:08 uh, maybe we're going to get to this part though. Cause, um, there was a thing that we haven't discussed yet, which was the, um, always messed up this acronym conflict free.

Starting point is 01:15:23 I think it was called con conf that you can hear me say it is. Yeah. Okay. Well then I'll stop. Okay. Yeah. So, um,

Starting point is 01:15:32 yeah, we'll get there. We'll get there very soon actually. Uh, so, uh, yeah, before we get there,

Starting point is 01:15:38 maybe we deleted the acronym, but the section is coming up. Um, but the first strategy that we talked about, and, well, but before that, just to address your concern there: basically, there is a way to kind of chain things together that are dependent, and so it's not completely unsolvable. I guess where I was going with this is, when I was reading this portion of the book, imagine, and again, because relational databases are just so ingrained in my head when I think about data and everything. So as I was

Starting point is 01:16:12 reading this portion of the book, I was like, imagine if you were to work with your data team and you were to say, okay, every time I read a row from this database, you're going to return back the row version number to me. And if I go to write, if I go to, you know, do an update, I'm going to say, hey, here's the row version that I know of for that thing, you know, and maybe, as a client, I'm always aware of what that version

Starting point is 01:16:46 number was. So in this example you gave, you talked about the logical clock, right? And the point I was trying to make is that there are data structures in some of these systems that are meant for dealing with these types of conflict resolution problems. They're like specific data types. I think it was Riak specifically that was called out as having them, and the acronym was CRDT. CRDT, yeah. But it stood for conflict-free replicated data types, if I recall.

Starting point is 01:17:27 And I guess that data type would have these kinds of things in it automatically so that you as a client don't have to know, right? In order to handle that type of thing? Yes, that's part of a strategy called happens-before, which is kind of an evolution of the first one. But I would say we can get to that in a sec. Oh, okay, sorry. Yeah, so I had removed that part from the notes, but I added it back. So, it's inside baseball. But yeah, it's back in there.
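
To make "conflict-free replicated data type" a little more concrete, here is a small sketch of one of the simplest well-known CRDTs, a grow-only counter. This is an illustration chosen for this transcript, not an example from the episode or the book, and the `GCounter` class is a hypothetical name rather than any database's API. Each replica only bumps its own slot, and merging takes the per-replica maximum, so merges can happen in any order, or repeatedly, and every replica ends up with the same total.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by per-slot max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments seen from that replica

    def increment(self, amount=1):
        # A replica only ever changes its own slot.
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Taking the max per slot makes merging safe to repeat and reorder.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)
```

The point of the design is that the client never has to pick a winner: two replicas that incremented concurrently can simply exchange state and merge.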

Starting point is 01:18:08 So the kind of simplest case is we say two clients went and wrote version number 17. And remember, these things don't necessarily know that there's a problem, right? So the problem is discovered sometime in the future when a client reads and says, Hey, I asked all of you to give me the data for this entity. And I got two different version 17s that had different values. So I've noticed that there's a problem. So how do we choose which one's the right one? Well, if all you have is version number 17 and the data, you have nothing else to go on.

Starting point is 01:18:43 So the client's going to have to pick one and say, well, okay, crap, one just gets picked at this point. Now, if you have any extra data that could be used here, like perhaps a timestamp or something, then that client could use that information to say, oh, you know what? This one actually came in an hour after the first one. So we're going to update everybody with this number 17. And so this is the one that wins. And if this data matters or whatever, that's just too bad, because our goal here is to be consistent, not correct. Sounds crazy in computer science terms, but that's what we're talking about. And that is the last write wins scenario, right?

Starting point is 01:19:38 And that's hard to say, because clock time, as we've talked about, is not easy to do across replicated systems. Well, this is where that row version idea that I was talking about comes into play. Because in the case of this system, as a client, if you're very aware of what that version number is, and one write concurrently happens to one node, and another write happens to another node, and maybe like a third one read from one of those, like, I'm trying to think how that shows you the original idea that I had. But basically, the idea was that one of those is going to have a later time or a later version number. And so the system would know that that was the last version number, or, you know, that that's supposed to be the last write. And again, that's just me trying

Starting point is 01:20:30 to rethink, like, hey, how could I do this in this other system? But I'm pretty sure that, if I remember right, the conflict-free replicated data types were mentioned a couple of times, and the first time, I thought, was in this portion of the book. Yeah, it was mentioned here, and they have a whole set of data types that help deal with these conflicts. Okay. I remember Cassandra had another kind of technique where they recommended basically saying every write is immutable and you generate a UUID here for each write. So each one is unique and you can kind of tell them apart.
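
The "two different version 17s" situation and the last write wins tiebreak described above can be sketched roughly like this. The `Versioned` record and both function names are made up for illustration, and the timestamp is assumed to be whatever wall-clock time the writer reported, which is exactly the part that is hard to trust across machines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    version: int
    value: str
    timestamp: float  # wall-clock seconds reported by the writer (assumed)


def find_conflict(replies):
    """Return True if replicas disagree on the value for the same version."""
    seen = {}
    for reply in replies:
        # setdefault stores the first value seen for a version, then
        # returns it for comparison against later replies.
        if seen.setdefault(reply.version, reply.value) != reply.value:
            return True
    return False


def last_write_wins(replies):
    """Pick the highest version; break ties with the latest timestamp."""
    return max(replies, key=lambda r: (r.version, r.timestamp))
```

For example, given a version 17 written at `100.0` and another version 17 written at `3700.0`, `last_write_wins` keeps the second and the first is silently discarded: consistent, but not necessarily correct.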

Starting point is 01:21:09 And one trick you can do with UUIDs, which I'm not sure if they're using it here, but you can make sure that they're always increasing. So if you get that UUID from like a centralized server or something, so you know, for example, that that's always an increasing value, then you can take a look at the two UUIDs. This one's greater than that one, so it came later. So let's do that. Of course, the obvious problem there is sometimes depending on which parts of the document you're changing, it may not make sense to take the later one.

Starting point is 01:21:34 Maybe you want to take the earlier one. Maybe you want to merge them both. That's something we'll get to in a minute here. But either way, either the system's going to make a decision and it could be wrong, or it's got to throw some sort of conflict out and have a manual intervention by a, you know, developer, DBA, or someone who's going to have to go in there and say, I want this and not that. I mean, you know, kind of going back to some of the examples there from the previous section, you know, if the three of us

Starting point is 01:22:00 are working in, say, a Google document all at the same time, putting it together. Like maybe we're writing a book and I change the title at the same time that Joe changes the first sentence of the intro paragraph. That's an example where you don't necessarily want to get rid of my change just because, you know, yours might technically be like a millisecond later, and by a last write wins strategy, okay, sure, fine, it's the last write. But you know, again, that goes back

Starting point is 01:22:32 to like knowing what your data usage pattern is going to be and which one of these scenarios is going to work for your need. Yep. And then we get back into this whole happens-before relationship and concurrency. And we've talked about this in the past, the causal relationships. But they basically say there's ways to know whether or not these writes are concurrent. And it all boils down to if one write knows about another write. So there's three possible states: if you have two writes that are trying to happen to the same exact data, then either A happened before B, B happened before A, or A and B happened at the same time. And if they happen at the same time, then they're considered concurrent.

Starting point is 01:23:25 And now you got to figure out, well, which one wins. It's easy when A happened before B, because you just say, hey, well, B was a later value. I'm going to take that one.

Starting point is 01:23:38 When it comes in concurrently, now you have conflict resolution that you have to go after. Also, they'd say that it was easier when there was that causal relationship to know what the ordering was supposed to be. So one of the examples that was given in the book was that if B is an update statement, then you know that A had to happen first as an insert, because B is dependent on the data being inserted

Starting point is 01:24:04 and that's why it's trying to do an update. But if both of them were insert statements, then there is no such relationship there to know that one had to happen before the other. Right. And this is where they get into some of the merging things that you could do with this data. So you could do the last write wins, which we talked about a minute ago. Or maybe you just say, hey, union all the data together, right? Like if it's some sort of collection.

Starting point is 01:24:37 And I think they gave an example of like a shopping cart. We're not going to go into that thing because it tied our minds up in knots just trying to read it and look at the image. But I could summarize it. I'll summarize it this way. Because I,

Starting point is 01:24:52 I had a hard time with that, with that section, with the shopping cart example. But do either of you use like a, a shopping app for your favorite, like for your preferred grocery store, for example? No.

Starting point is 01:25:07 Oh really? Kind of. Kind of. I go to my preferred grocery store. Okay. Weirdos. Um, I don't know these people,

Starting point is 01:25:13 but I live in the year 2021 and we have this great ability to where, uh, we can use an app to do all of our grocery shopping and stuff just gets magically, uh, appears in our car. When we happen to drive by, they just throw it into your car. It's really cool. Um, but anyway, where I was going with that, those, I was thinking like you could get in a situation where like, Hey, maybe I'm, um, looking at it from the computer. Like, okay, this is the bulk of what I need,

Starting point is 01:25:41 blah, blah, blah, blah, blah. Oh, wait a minute. Do I have something? Let me go look in the refrigerator. So you, then you take your phone, you go walk to the refrigerator. Like, you know what? I need to add some eggs to it, blah, blah, blah. So the point is, is that you had multiple grocery items that were being inserted into your shopping cart from multiple different devices. And no one device might have the entire list at that particular time until it's like, oh, I tried to write it and now I'm going to merge all that together because I see that this other device added eggs to the list and I only knew of bacon and milk.
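
The two-device grocery list just described can be sketched as a plain set union. This is a deliberately simplified illustration (the function name `merge_carts` is made up), and it assumes items are only ever added.

```python
def merge_carts(*carts):
    """Union the item sets seen by each device into one cart."""
    merged = set()
    for cart in carts:
        merged |= set(cart)
    return merged


# The laptop only knew about bacon and milk; the phone added eggs.
laptop_cart = {"bacon", "milk"}
phone_cart = {"eggs"}
combined = merge_carts(laptop_cart, phone_cart)
```

Here `combined` contains all three items. A plain union like this has no way to express that an item was removed, so deletions need extra bookkeeping.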

Starting point is 01:26:16 And now I'm going to merge all that together. And where the problem comes up with that, which kind of, they didn't call it out specifically, but last episode, if you recall, we talked about an, an issue where Amazon would, um, like there was, there was this bug that Amazon had where you would add items to the shopping cart and then you would delete items and then items would magically reappear sometimes. And they didn't call it out specifically in this portion that, that bug is this portion, if I recall correctly, but it definitely did make me remember that idea. And they called out then in the case that when you're only adding in items, then it's easy to just keep merging the collections together. Not a problem, right? But if you were going to do a deletion then you had to that's where you

Starting point is 01:27:06 absolutely had to have tombstone markers, so that the data still stayed in the collection, but as a known delete, right? Otherwise, you didn't know that it was supposed to be gone. Yeah, and when we say tombstone, we basically say, like, we don't just delete the item from the cart. Actually, you keep a record that says this item had been added and then it was removed. So if you get, like, you know, a late kind of syncing of data here, then we know that, you know, we should have deleted it. And now for your mind-melt section: you re-add the item to your cart. Yeah, what do you do then? Right, then you have another version on it, right? That's the only way to do it. Yeah. So yeah, at any rate, that whole section, I highly recommend, if you get this

Starting point is 01:27:51 book, just, you know, sit down for a 30-minute meal and be fresh in the morning when you read that section, is my advice to you. Like, it's not a bad section. And no, yeah, it's only a couple pages long, but if you're not used to working in that type of system, in that type of environment, it's a bit of a mind-melting kind of thing. Like, wait, what? Why? Yeah, trying to keep it all in sync. And that's why we don't want to talk about it. It's almost even more confusing when we go to here. Yeah, well, I think that would have been harder to describe. Which, by the way, the multiple device thing was really good. But if we had tried to replicate

Starting point is 01:28:31 what they talked about in the book, it would have been worse than us drawing diagrams through our talking. So then the last little section that we had here on leaderless was version vectors, and this is kind of interesting. So basically, if you take a record that exists on multiple replicas and you get that version number for that same record across those replicas, that collection of those versions is called a version vector. And this is where they go back. Like, Riak apparently does some interesting things all the way around in this whole leaderless thing,

Starting point is 01:29:07 because they have what they call dotted version vectors. And what that means is they send the vectors back to the clients when the values are read. And then when they go to do the writes, they send all those versions back at the same time as well. And then that's how, and I mean, they didn't get into the details, and probably for good reason, but it does some sort of comparison to know, you know, hey, yeah, this one's good to update here. Right? Like it helps

Starting point is 01:29:36 with its own conflict resolution. I mean, this kind of goes back to the idea that I was describing earlier, where, if you were just thinking in like a SQL Server world and you're thinking about the row versions, right? If that is part of the identifier that is included back with the data, and Riak apparently has data types where I'm assuming that that's abstracted away from you, you don't even realize that it's there. Then when you as a client try to send that data back, the server can deal with, like, hey, I just got multiples of these. And I can look at those version numbers to see, like, hey, if I'm going to go with a last write wins strategy, which one do I want to use?

Starting point is 01:30:17 And, and they actually, if I remember right, and this, this was the portion of the book where there was also, you know, so far we were talking about like, um, the version numbers as it might relate to a single system, but there was also, uh, a,

Starting point is 01:30:35 um, identifier for the actual server itself. And I think that this was, um, I don't remember if this was specific to Riak or if it was just elsewhere, but basically as part of that version vector you might not only just have an ID or a version

Starting point is 01:30:50 ID for the document or row or whatever, however you want to refer to it, but also for the host that it was on as well, so that you could kind of know, oh yeah, this is the version from there, and that's why it's got that score.

Starting point is 01:31:08 Yeah, I think they said that in Riak, they actually convert that. They take it, and they sort of, I don't know that they said they hash it, but they turn it into a string by combining some of those pieces. And then that allows them to understand, hey, is this data an overwrite, or is it some sort of concurrent thing? And it also, they talked about creating these sibling data sets or updates, and this would allow them to be able to merge these siblings properly by using these vectors. So, again, there's probably a whole bunch going on behind the scenes there, and I'm guessing each database system does it a little bit differently.

Starting point is 01:31:43 So, you know, that's about as deep as we went into that. So it sounds like, from what we're hearing so far, that Riak is pretty stellar, right? It sounds like. It sounds pretty awesome, right? If you had to just pick a number off the top of your head, because I'm asking you, then where do you think it falls on a DB engines ranking?
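
As a rough illustration of how a version vector can tell an overwrite from a concurrent write, here is a minimal sketch. It is not Riak's dotted version vector implementation; the dictionary layout (replica name to counter) and the function names `compare` and `merge` are assumptions made for this example.

```python
def compare(a, b):
    """Classify two version vectors (dict: replica name -> counter)."""
    replicas = set(a) | set(b)
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in replicas)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "concurrent"  # neither saw the other: keep both as siblings
    if a_ahead:
        return "a_after_b"   # a has seen everything b has: a overwrites b
    if b_ahead:
        return "a_before_b"  # b has seen everything a has: b overwrites a
    return "equal"


def merge(a, b):
    """Combine two vectors after their sibling values have been merged."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}
```

With `{"n1": 2, "n2": 1}` against `{"n1": 1, "n2": 1}` the first is simply newer, while `{"n1": 2}` against `{"n2": 1}` is concurrent, which is the case where something has to merge the siblings.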

Starting point is 01:32:09 I've barely heard of it. I, so I would say probably pretty low. I mean, what, what's the number between that we got? Like what's the max number that we got? Number is three 50.

Starting point is 01:32:22 Well, there's a bunch of geez, what is this? I guess they get tied at some point. I'm going to say they're down there 90. I would put Cassandra at top 20 though, probably.

Starting point is 01:32:36 Cassandra, that sounds reasonable. You're right on Cassandra. Cassandra is 11, but Riak, there's two versions of it. There's a key-value version that is 66, and then there's a time series version of it that is 258. Way down. Yeah. So I mean, it seemed to get a lot of love in this book, but maybe it's not as, you know, once I saw the time series and the key-value parts, I'm like, oh, maybe it's not quite as general purpose. It's like a

Starting point is 01:33:11 little bit more, because, you know, when I think of like key-value kind of things, then a Redis, for example, is what I think about in that kind of thing. Like, you know, caching type things. So maybe it's not like a general purpose kind of database, but it does have some really slick features about it. Yeah, so when I think Cassandra, I think like columnar, I think like analytics, like really great at, you know, fast writes, leaderless, so I know it's used a lot in that space. Everything I learned about Riak, it sounds really great.

Starting point is 01:33:48 It sounds like, you know, a Dynamo competitor, but it's also a Mongo competitor, which is just ferociously, you know, if you're talking about, obviously, you know, I said there's two different kinds of Riak. There's the key or the document type, but as soon as you say, like, document DB, I start thinking Mongo. Well, I mean, it is so popular. It is by far not even close.

Starting point is 01:34:12 The most popular document database there is. Like not even close. Yeah, so I don't know if that's part of it or what the deal is. Maybe Riak is only for super scale. Okay, but Cassandra is a wide column database, but wide column is not the same thing as columnar storage.

Starting point is 01:34:35 Yeah, I looked it up to double check it. So they also consider it columnar. So wide column kind of boils down to being sparse. So you can have rows with different numbers of columns, but the columns do line up. And so that's kind of like the trick and distinction there with wide column. But I'm far from an expert in Cassandra. I just spent a couple of hours

Starting point is 01:34:54 playing with it and was just looking at my notes. But column storage database would be something like analysis services, for example. Or what was that competitor that we looked at that was Alan? Druid.

Starting point is 01:35:07 Druid, yeah. You know, those, they're different, right? Because those are OLAP databases, right? And Cassandra is not. But the columnar is more about how the data is stored so you can query it faster, right? Yeah. I mean, it's interesting, but I thought that wide column did not equal columnar.

Starting point is 01:35:40 And if you want to tell me I'm wrong and leave a comment so that you can have a chance to win the book, then you can go to www.codingblocks.net slash episode 162. Yeah, just to walk it back. So I have one spot. So if you look at DB Engines, it does mention it as columnar. But then I did a little bit more Googling, and it gets crazier. They call it a column family store. Wait, they call Cassandra columnar?

Starting point is 01:36:07 So DB Engines does. But I don't want to say anything because there's some very subtle distinctions. And so it's like, basically, don't listen to anything I have to say about this. I mean, that's what I'm tripping on. I'm not even looking at the same thing you're seeing because I like, I'm not seeing where it refers to as columnar. And that's why I'm like, wait, are we talking about the same thing? Yeah. So I don't know where I saw that now. So yeah. Wide column, wide column store.

Starting point is 01:36:35 Well, I'm trying to, I'm trying to find out if they even have a columnar storage on here so that I could see like how they would refer to it. So like, for example, if I were to search for analysis services, that doesn't even show up in here. Is there any? They do have Druid. Druid? Is Druid in there?

Starting point is 01:36:52 And it's called a multi-model. So wide column stores, they basically say, because a record can have billions of columns, like not necessarily a fixed, they say it's seen as a two-dimensional key value store. So basically you have your key to get to the main record and then you can look up other things within the record. So it's pretty interesting. Yeah, it shares characteristics of being

Starting point is 01:37:20 schema-free with document stores, but the implementation is different. And it's not to be confused with column-oriented storage in some relational systems. This is an internal concept for improving the performance of an RDBMS for OLAP workloads and stores the data of a table, not record after record, but column by column. So very much like what you said earlier, OLAP is more like it's tied to columnar storage as opposed to this where it has its own. Yeah.
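The two ideas being contrasted here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row keys, column names, and the `lookup` helper are made up for the example, and it says nothing about how Cassandra or any OLAP engine actually stores data on disk.

```python
# Wide-column idea: a "two-dimensional key-value store".
# The outer key finds the row; the inner key finds a column within it.
# Rows are sparse: each row only carries the columns it actually has.
wide_rows = {
    "user:1": {"name": "Ada", "email": "ada@example.com"},
    "user:2": {"name": "Lin", "last_login": "2021-06-21"},
}


def lookup(rows, row_key, column):
    """Return the value at (row_key, column), or None if either is missing."""
    return rows.get(row_key, {}).get(column)


# Column-oriented storage of a fixed-schema table:
# the same records, laid out column by column instead of record after record.
table_rows = [("Ada", 36), ("Lin", 41), ("Sam", 29)]
names, ages = (list(column) for column in zip(*table_rows))

# An analytic query such as an average only has to touch one column.
average_age = sum(ages) / len(ages)

print(lookup(wide_rows, "user:2", "last_login"))
print(lookup(wide_rows, "user:1", "last_login"))
print(names, ages, average_age)
```

The first structure is about a flexible, sparse data model; the second is about physical layout for fast scans, which is why the two terms are easy to mix up.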

Starting point is 01:37:49 And Cassandra is not. Right. Right. Okay. Right. Row oriented. But yeah. So yeah, it's really tough to like we need a Cassandra expert on here.

Starting point is 01:38:00 So what, they don't even talk about it? Columnar, they don't even have, like, on db-engines.com, they don't even list, like, columnar storage databases, unless it's listed as something else that I'm not seeing. Like even for Druid, they referred to it as multi-model. And where was Druid again? Because they gave an explanation as to... 119, or no, 100. They called it relational DBs and time series DBs, which, it is very much a time series. It's odd that they even listed it as relational, to be honest with you, but they definitely describe it as an OLAP database. Okay, I don't know. I don't

Starting point is 01:38:43 understand the internet anymore. Um, none of this makes sense. Yeah. It gets really sticky. And like I said, leave us a review and you can explain it to me. And by doing so, not only do I get to benefit from it by learning from your amazing knowledge, but you get a chance to win the book, because this is an amazing book that everybody should have on their bookshelf. I would dare to say that next to your copy of the Gang of Four, you would have Designing Data-Intensive Applications. I feel that strongly about the book. And, uh, if you have read it and you disagree and want to fight me, um, you can meet me out after school at 3:15 in the parking lot.

Starting point is 01:39:34 No, man, it's a gas station. Yeah, right. Okay, so, you know, we will have obviously some links to resources we like. Surely this book will be one of them. I know the guy that does the notes, so I'll make sure that that is the case. And with that, we head into Alan's favorite portion of the show. It's the tip of the week. All right. And I'm stealing a tip here from cb.show slash tips. So thank you, Mike RG, provider of so many great jokes, for this, uh,

Starting point is 01:39:57 for this tip, um, borrowing, you can have it back. Uh, it's a list on GitHub of awesome falsehoods. And the falsehoods we're talking about here are ones that programmers believe. Falsehoods in this case are things that are commonly believed by people

Starting point is 01:40:15 that maybe aren't necessarily into that domain very much but are actually not true. And so I looked at a couple of these and the first one I clicked on was music. And so I'll give you a couple of examples here is, uh, music can be written down, right? Like we've all seen sheet music, right?

Starting point is 01:40:32 We've all seen MIDI data. We've seen all sorts of stuff, but when it comes down to it, there's not a great way to capture every nuance. And, you know, you like, you can pluck a string with your finger.

Starting point is 01:40:44 You can pluck it with your, uh, pick. You can bang it with your finger. You can pluck it with your pick. You can bang it with your hand. You can hit it with a tin can. You can do all sorts of stuff in order to kind of make these sounds. So it's not fully reproducible. Oh, I can even, I can even do it better than that. The timing, the timing of the thing. Okay.

Starting point is 01:41:01 It's 4/4. Okay. What does that mean? Like that, that tells me how many, how many counts to give for a given note, but like how fast is, how fast am I? One, two, three,

Starting point is 01:41:10 four, one, two, three, four, or am I one, two, three.

Starting point is 01:41:13 And they try to give you like, Hey, here's the, uh, you know, this is moderately fast or whatever, but there's still no, like,

Starting point is 01:41:20 you know, dead and they, they'll give you like, you know, Hey, you could set your, uh, metronome to this setting. But even there's times where, um, I think I was talking with you about this earlier,

Starting point is 01:41:30 JZ, where some systems you can like play along, uh, with the music. And like, if you're listening to the music as you're watching this automated system, like, roll through it, it will either finish before or after the song that you're actually listening to play. So there is no, there isn't, you know, uh,

Starting point is 01:41:51 like a, a correct, you know, one way to, that we all agree on, like here's the, the tempo, you know,

Starting point is 01:42:00 as a, as a mathematical fact, right? Yeah. You can take a song and, you know, set your metronome to it and it sounds great until it doesn't because the original artist didn't play along with the metronome. So is it really 120 beats per minute

Starting point is 01:42:13 or is it however fast the drummer played, you know? Yeah. Here's another one. So, you know, there's more than one variant of the metric system. So I jumped to the science category here and this is just something where, like, you might think, like, there's imperial and there's metric. And so you go and build a system not realizing that, for people that are deeper into

Starting point is 01:42:31 machine industrial design, there's actually two different metric systems, MKS and CGS. I don't know what that is. But it's just kind of cool to go in there and see like, hey, this is a domain I'm interested in. Maybe I'll make an app here. And whoa, apparently I don't know... Finding something about fonts here. Uh, did you know... actually, I know all these things. I guess I know a lot about fonts. Yeah, wow, I'm like a font genius here. I'm sorry. Well, I might not be. No, this is everything. Yeah, you already know everything about fonts apparently.

Starting point is 01:43:10 All right, let me find another one. Do I? Falsehoods about Bitcoin. Oh my gosh. That you can get rich on it. Oh yeah. Did you guys see this story where there was a kid from Georgia?

Starting point is 01:43:25 I say kid. I think he was in his 20s. He invested $20 in some random crypto, one that I'd never heard of. But he'd been playing it in the crypto market for the past six to eight months. And he'd invested a total of $20 into some random crypto, went to bed, wakes up the next morning, $1.4 trillion is in his Coinbase account. Did you see that story? He wasn't able to withdraw any of it.

Starting point is 01:43:56 Exactly for him, no. That's terrible. Yeah, that right there would put a lump in your throat, I think. You wake up and you're the richest single person in the world, all because you were smart enough to invest $20. That's crazy. Yeah. It would have been amazing if you could have actually withdrawn any of it.

Starting point is 01:44:14 That would have actually made me smile really big. Here's an example I found that's pretty good. And this is one that programmers will be more likely to be familiar with. But what's the format of an IP address? You might be tempted to say, well, it's between, uh, 0 and 255, you know, four times, separated by dots. Like, well, that's IPv4. Uh, IPv6 is different. What about CIDRs? Those represent ranges. Also, IPv6 has, uh, all sorts of cool rules about how you can condense it. So, like, zeros don't have to be rendered, so it's totally valid to have colons next to colons because you've compressed it. And you can even replace repeated zeros, and there's all sorts of rules around it. And all those are valid.

Starting point is 01:44:54 So if you've got a form that takes an IP address and you don't take in all the valid formats, then that's somewhere where someone who doesn't know as much about IPs might think that the simple stuff that they're used to seeing is okay. And so this is just a list of links to these domains that have these odd subtleties that matter if you're working in them that you might take for granted. Email addresses are another good one, by the way. Yeah, and zip codes. Like once you get out of the U.S., like even, like, how many numbers are there in a zip code? You might say five, but now it's much more common in the year 2020, 2021, to see the dashes and the next four

Starting point is 01:45:28 that they get on another level. And who knows what they'll add in the future? It's just really hard to know. But they have an awesome one on here. I see ISO 8601, our friend. And they specifically call out that string formatting of dates is hard. True. Very hard. True.
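One way to avoid hand-rolling these rules is to lean on a library that already knows them. The sketch below uses Python's standard `ipaddress` module; the `classify` helper and the sample strings are hypothetical, chosen only to show dotted IPv4, CIDR ranges, and compressed IPv6 being accepted while a malformed value is rejected.

```python
import ipaddress


def classify(text):
    """Return a short label for text, or None if it is not a valid IP form."""
    try:
        address = ipaddress.ip_address(text)
        return f"IPv{address.version} address"
    except ValueError:
        pass
    try:
        # strict=False also accepts ranges whose host bits are set.
        network = ipaddress.ip_network(text, strict=False)
        return f"IPv{network.version} network"
    except ValueError:
        return None


samples = [
    "192.168.0.1",                              # classic dotted IPv4
    "10.0.0.0/8",                               # CIDR range
    "2001:0db8:0000:0000:0000:0000:0000:0001",  # fully written IPv6
    "2001:db8::1",                              # same address, zeros compressed
    "256.1.1.1",                                # out-of-range octet
]
for sample in samples:
    print(sample, "->", classify(sample))

# The long and compressed IPv6 spellings parse to the same address.
same = ipaddress.ip_address(samples[2]) == ipaddress.ip_address(samples[3])
print(same)
```

A form that validates with a simple "four numbers and three dots" pattern would reject three of the four valid samples above.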

Starting point is 01:45:46 Very true. Yep. All right. Well, uh, now that I don't know what I thought I did know, um, thanks Mike RG for pointing that out to us and we can pass along that to

Starting point is 01:46:01 everybody else and make them feel worse about their day too. But, uh, how about one that I don't know if you know. So this came up because in conversation I was talking with a friend because he wanted to do a cron job, but he really only needed to do it like one time and he wanted to do it at a specific time and that was one and done, right? And I was like, well, you know, there's actually a better way to do that there. You can use an at command. Have you ever used these or you guys familiar with this? So, uh, if you don't already have it,

Starting point is 01:46:35 you can do a sudo apt install at, just literally the letters A-T. And what you can do then is you can say at, and a time, and then give it a command, and it's basically like a one-time cron job. Whatever time you give it, it's going to run it. And you can give it, like, specific human readable kind of times, which still blows my mind because it's so much easier, because you could say like at 8:03 PM and then a command, and it figures that out. And I'm like, what? We just talked about ISO 8601 and how dates are hard.

Starting point is 01:47:11 Like how did they figure that out? How did they know? But it does. And, and there's like some other commands that go around it. Like, um, there's atq.
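As a rough sketch of the one-time scheduling described here, the snippet below drives `at` from Python. It assumes `at` is installed and its daemon is running; `build_at_argv`, `schedule_once`, and the example command are hypothetical names introduced only for illustration.

```python
import subprocess


def build_at_argv(when):
    """Build the argument list for `at`, e.g. "8:03 PM" -> ["at", "8:03", "PM"]."""
    # Splitting on whitespace passes each part of the time spec as its own
    # argument, the same way a shell would for `at 8:03 PM`.
    return ["at", *when.split()]


def schedule_once(command, when):
    """Queue `command` to run once at `when`.

    `at` reads the job's commands from standard input, so the command text
    is fed to it through `input=`. Raises CalledProcessError if `at` fails.
    """
    subprocess.run(build_at_argv(when), input=command + "\n", text=True, check=True)


# Only show the command line here; nothing is scheduled by this line.
print(build_at_argv("8:03 PM"))
```

Calling `schedule_once('echo "one and done" >> /tmp/at_demo.log', "8:03 PM")` would queue the job. From a shell, `atq` lists pending jobs, `atrm` followed by a job ID removes one, and in the common `at` implementation `at -c` followed by a job ID prints what that job will run.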

Starting point is 01:47:21 So if you have a queue of things, you can see what that queue is. And then with the at command, you can do, um, I want to say it's like a dash I to where you can inspect what the, um, uh, content of that job is. So at a given time, and you could even do like, if you wanted to pass it a file, you could say like, hey, at 8 PM, uh, dash f and then point it to a file, which could be a script of like, you know,

Starting point is 01:47:49 a bunch of things that you want to have happen. If you, if you need to do that, then there's some, there's some other commands that can go around it to like manage that queue. So if you wanted to remove something from it, um, uh,

Starting point is 01:48:02 I mentioned atq, but if you wanted to remove a particular job from that queue, you could use atrm, all as one concatenated, uh, you know, quote word, it's not really a word, but you know, one command, um, and you'd give it the ID. So in that case, it would be important to have looked at the output from atq so that you can do the, um, the atrm. And in fact, I think if you wanted to see the output, um, if you wanted to like verify the given command, um, that at command that I mentioned,

Starting point is 01:48:34 you had to know from atq what your ID is in order to know it. And you'll be able to see, like, it'll show you all of the environment variables and, you know, the whole profile that that shell is going to run under, you know, everything it would do. The same as if you had logged in and you were running, like, your .bashrc or your .zshrc or whatever, you know. You would see all of that as part of it. Right. Um, and you know, there's at.allow and at.deny, uh, users that you could add to it. And then there's a batch version of that, which is basically, there's a batch command that's like a shortcut to the at command with the dash b parameter. So I know that this is a lot to take in and I'm saying, like, at a lot of times with a lot of stuff, but I'm going to include a link to a Linuxize article that talks about this

Starting point is 01:49:33 and like all the cool things that you can do with it. But yeah, if you ever need like a one, one and done. So by the way, I, cause I use this all the time, every time that we, you know, a little bit of behind the scenes things here. So when we publish an episode, yeah, sure, you can go in and use your content management system and schedule the post and whatever. But we flatten our feed so that we're not constantly hitting the database. So it's just, you know, so that's rendered out one time. And so everything is pointing to that rendered out feed, right, that's already been flattened. So anytime I'm publishing an episode, I will say, okay, you know, publish. I'll tell the content management system, hey, make it public

Starting point is 01:50:27 at this particular time. And then I set an at command to flatten the feed at this other time after the article has been published. It gets used regularly. In fact, I've even scripted it around, you know, but whatever. Beautiful. All right, I've got a couple here. The first one, we've talked about Kotlin a ton of times on the show.

Starting point is 01:50:56 We're all fans of it. It's beautiful. Like, it's just an excellent language. Well, I know that we've talked about some of these online playgrounds that you can use for, like, C# and JavaScript. Like if you ever need to do anything, well, there's one for Kotlin that they actually host themselves, and you can go to play.kotlinlang.org and you can play around with Kotlin without actually having to go get an IDE or anything. Just try it out. They've got examples that you can click on and it'll show you some stuff.

Starting point is 01:51:27 They've even got like a little learning type thing there. So a fantastic way to get your hands dirty with it and see if it's something that you're interested in. Oh man, this is actually really cool. Uh, I didn't realize this, but the, um, it's actually like our friends at educative where when you want to learn something like there's a, there's a, a block there that you can like actually,

Starting point is 01:51:55 if you go to the examples portion on the Kotlin playground, you can edit the block and then click the play button and you'll see it happen there in your browser. It's really cool, isn't it? It is really slick. Yeah, this is,

Starting point is 01:52:10 this is good stuff. I'm going to say that they borrowed that idea from educative cause I'm sure this is probably one of the best ways to learn some of the nuances of the language. Like if you're, if you're trying to come from something else, because what I was just saying is you click those examples on the left. It'll have like,

Starting point is 01:52:31 they have a section for like flow control. So if you want to do when or loops or ranges, they've got all the code there on the page and you can just play with it and see what happens. So it's, it's a great way to get your head wrapped around their syntax and, and all the little bits there. So, yeah, highly recommend checking that out.

Starting point is 01:52:49 If you've never looked at Kotlin, we love it. You can even introduce errors and it will, like, show you the lines of code that erred. Oh, nice. So they've got like a little compiler in there. So here's one for anybody out there doing Dockerfiles to build Docker images and all that kind of stuff. So I happen to be working on a project that's got a lot of Maven dependencies. And it's brutal. Like, downloading those dependencies can take a long time, being nice. And so when you go to do something like a Docker build, how you set up your Dockerfile is super duper important.

Starting point is 01:53:36 So I'll have a link in the show notes here to the COPY instruction inside a Dockerfile. And what I want to call out here is, to me, one of the most important things that's really easy to overlook: in the bottom of that section, they have a little note block. And it says this, the first encountered COPY instruction will invalidate the cache for all following instructions from the Dockerfile if the contents of the source have changed. So if a single file has changed, it will invalidate the entire Dockerfile cache

Starting point is 01:54:15 of the layers after that, and it'll have to rebuild them all. So I was curious because I saw that you shared this earlier and, and I guess what surprised me was that, um, I guess what surprised me is that you were surprised by, because like if you had any line change, it doesn't necessarily have to be a copy command, right?

Starting point is 01:54:40 In Docker, in the Dockerfile, if, let's say you had a 37-line file, right? If line 7 changed, anything about line 7 changed, whether it's copying files in or you changed the name of the script, then all lines after line 7 are now invalidated, right? I'm not talking about changing the line. I'm talking about the actual source contents change. Right.

Starting point is 01:55:03 And so in this case, you're copying a file in to it, and that would cause it to invalidate, which would invalidate everything below it. And it's not surprising when you think about how Docker layers work, but if you're doing something like building an application, like using Maven inside your Docker build, it's going to create a target directory, right?

Starting point is 01:55:31 So you automatically change the file structure of the system, especially if you map the volume to get those things in. And so anytime you do a copy dot and then space dot slash, your file contents are changing every single time. And you may not know it. You may not realize it. So there's two important call outs here. One, be aware of what copy in the source files.

Starting point is 01:56:11 Right. Which meant a whole bunch of COPY statements, like a lot of lines of that. Copy the POM, then install the dependencies, then do your code. Yeah. Yeah. It's a whole lot of stuff. And the thing that stinks is when you're in development mode, you're changing that stuff quite a bit, right? Which means that you have to

Starting point is 01:56:31 remember, it's not so bad that you're going to have to rebuild these things anyways, but you'd have to remember to go back into your Dockerfile and say, oh yeah, I added a new folder with a new POM with new sources. And so you can't forget that. So the second part of my tip here is you can leverage a .dockerignore file. So it's very similar to a .gitignore file in the way that the expressions work. So what you can do is you can say, omit all the target files, omit all the additional directories and things that you don't need. And then that way you can just say copy dot space dot slash, which says copy everything from my root directory in, but ignore all the targets, ignore all these.

Starting point is 01:57:19 And essentially what I did is I made sure that I kept all the proper directories that have POM files in them and any source directories. And then that way I'm guaranteed I'm only getting the things I care about to actually build the applications. So, okay, go ahead and finish. So that's basically it. What I was going to say with that is that, I mean, sometimes it's just, like, relative to the things that you've already been working on, and like, you know, I was like, yeah, that part didn't surprise me, because we are already doing this elsewhere, you know, this is behind the scenes, but in our applications. And this is why maybe you didn't see this or know about this, because you've been in the Kotlin land while I've been in .NET land.

Starting point is 01:58:28 And we have this other .NET land where we do have a large Dockerfile where we are doing all of these COPY statements like you described, but also doing them in specific orders of like, well, this one doesn't change as often. And then things that do change more often are near the end of that COPY statement list, so that you can have like the strategy of like, I mean, it does suck because like you are taking the hit of, you know, trying

Starting point is 01:58:57 to like manually optimize this Docker thing, but you know, it's so that you can get the advantage of the caching. But doing some of that .NET development, I have, you know, with these Docker builds, I have, like, been hit where it's like, oh, sometimes it's really fast. And then sometimes it's like, oh, I changed something a little early up in there. And so, boom, you know, I take a hit. But the key part for me was I didn't want to manage. Like, you know, I think I had seen the .NET version of it where you got a bunch of different copies in there. I didn't want a hundred of those things because the project that we're working on is growing and growing and growing. And I didn't want to have to remember,

Starting point is 01:59:37 oh, I added this folder. Okay. It crashed when I went to deploy it. Okay. I didn't add it in the Dockerfile. I wanted to be like, now granted what I do lose in the way that I'm doing it is what you're saying is if there's one project in there that changes less frequently than others, I don't get the to think that maybe I just generate this Dockerfile and I also generate the .dockerignore files sort of on the fly. No, that wouldn't work too well. Only because then you lose all the caching, but it could make it to where anytime you add new projects it builds things in a proper order. Um, so I don't know, man. You could look at metadata, if you use something like SonarQube or any of those that tell you, like, which files or which projects change the most. You could probably use something like that to generate a good one, but I don't know. I think part of the takeaway there, though, is the difference in the use case is that, if I understood correctly, too,

Starting point is 02:00:54 a lot of the things that you want to cache, though, are things that you're downloading. So it's not like you already have them necessarily. Right. Like, you've got to go and get them. So it'd be like, you know, let's take either of these situations out of the equation and say, like, it'd be like trying to cache your node_modules directory, right? It's very much like that. You're not going to know all those things in advance necessarily to be like, okay, copy this folder from the node_modules, now copy that one, now copy that one, now copy that, you know.

Starting point is 02:01:23 I mean, technically could you go figure it out? Yeah, maybe. You know, it sounds really tedious. I don't want to do it. Um, so maybe that's the, exactly, it's not exactly like that. I mean, to break it down so that people get the context of, like, what you just said with the npm thing, right? Like the way that you get those modules downloaded is you do an npm install, right? And then that would bring them all down locally. I do have a step like that, but that builds a base image

Starting point is 02:01:52 that then my next image will use. And then that way, all those dependencies are cached already in that. And if nothing changes in there, then that image never gets rebuilt. I don't take that hit anymore. But the part that I was talking about was very similar to your .NET one, which is I have a bunch of projects and I

Starting point is 02:02:11 don't want to have to call out every single POM, just like you have to call out every SLN or every csproj file. I don't want to call them all out. I just want it to intelligently include all my POM files and all my source files. Not necessarily in any great order, but just make sure they're all there so that when I do a build, it'll have everything it needs. So it may not be in the perfect order, but assuming not much changes, it should cache and stay good. So just know that the COPY statement does matter a lot, but also you can use the .dockerignore, which will limit the files that actually get pulled in when you do your COPY statement,

Starting point is 02:02:57 right? Like anything that your copy does will have already ignored everything that was in that .dockerignore. Like any of your local build artifacts you would want to exclude. Totally. Those are going to change, but the source behind it might not change. Right. So, yeah, the important thing is we learned .dockerignore.
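The caching behavior discussed in this tip can be illustrated with a toy model in Python. To be clear about the assumptions: this is not Docker's actual implementation, the prefix-based ignore check is a simplification of real .dockerignore pattern matching, and the instruction strings, image tag, and file names are all hypothetical.

```python
import hashlib


def context_digest(files, ignored_prefixes=()):
    """Hash the (path, content) pairs a COPY would see, skipping ignored paths."""
    digest = hashlib.sha256()
    for path in sorted(files):
        if any(path.startswith(prefix) for prefix in ignored_prefixes):
            continue
        digest.update(path.encode() + b"\0" + files[path].encode() + b"\0")
    return digest.hexdigest()


def layer_keys(instructions, files, ignored_prefixes=()):
    """Return one cache key per instruction; each key chains on its parent."""
    keys = []
    parent = ""
    for instruction in instructions:
        digest = hashlib.sha256((parent + instruction).encode())
        if instruction.startswith("COPY"):
            # Copied content is part of the key, so a changed file changes it.
            digest.update(context_digest(files, ignored_prefixes).encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


def cache_hits(instructions, before, after, ignored_prefixes=()):
    """True for each layer whose key is unchanged between two builds."""
    old = layer_keys(instructions, before, ignored_prefixes)
    new = layer_keys(instructions, after, ignored_prefixes)
    return [a == b for a, b in zip(old, new)]


instructions = ["FROM maven:3", "COPY . ./", "RUN mvn package"]
before = {"pom.xml": "<project/>", "src/App.kt": "fun main() {}"}
# A local build drops an artifact into target/ but leaves the sources alone.
after = {**before, "target/app.jar": "binary"}

print(cache_hits(instructions, before, after))
print(cache_hits(instructions, before, after, ignored_prefixes=("target/",)))
```

Without the ignore rule, the new build artifact changes the COPY layer's key and every layer after it; with `target/` ignored, all three keys stay the same, which is the effect the .dockerignore file is meant to achieve.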

Starting point is 02:03:14 Hey, cool. So, with that, we hope you've also learned a bunch of stuff, too, because I know I have. And if you enjoyed the show, and even if you didn't enjoy the show, subscribe anyways. I mean, let's be honest. Just subscribe.

Starting point is 02:03:29 And you can find us wherever you like to find your podcasts. And if there's some place that you like to find podcasts and we aren't there, reach out to Alan and complain to him. That would be my recommendation.

Starting point is 02:03:42 I have a good SLA. Yeah. Oh God, I forgot his SLA. Reach out to Joe. Reach out to Joe. Trouble. Tweet us? I don't know. At any rate, as Alan said earlier in his

Starting point is 02:04:00 smooth jazz voice there for his late night radio DJ hosting job that he's going to start pretty soon. Thank you for listening. Yeah. If you haven't already, we would greatly appreciate it if you left us a review. You can find some helpful links at www.codingblocks.net slash review.

Starting point is 02:04:19 Hey, and while you're up there at the site, make sure you do check out the show notes examples. We have all kinds of discussion. And remember, if you leave a comment on this show, you'll have an opportunity to win a copy of the book. And, uh, yeah,

Starting point is 02:04:31 we got a Twitter. Um, you won't see a lot of, of, uh, hot takes. You're not going to see, uh,

Starting point is 02:04:36 lots of, uh, threads like you see on Twitter, but we, we will see our, uh, good retweets, funny jokes,

Starting point is 02:04:43 and, sometimes pictures of pets. Yeah, so check it out. That's all that matters. Check it out.

Coding Blocks - Designing Data-Intensive Applications – Leaderless Replication

We wrap up our replication discussion of Designing Data-Intensive Applications, this time discussing leaderless replication strategies and issues, while Allen missed his calling, Joe doesn't read the ...gray boxes, and Michael lives in a future where we use apps.
