Coding Blocks - Designing Data-Intensive Applications – Data Models: Relationships

Starting point is 00:00:00 You're listening to Coding Blocks, episode 124. Subscribe to us and leave us a review on iTunes, Spotify, Stitcher, and more using your favorite podcast app. And check us out at CodingBlocks.net where you can find show notes, examples, discussion, and a whole lot more. Send your feedback, questions, and rants to comments at CodingBlocks.net, follow us on Twitter at CodingBlocks, or head to www.CodingBlocks.net and find all our social links there at the top of the page. With that, I'm Alan Underwood.

Starting point is 00:00:29 I'm Joe Zack. And I'm Michael Outlaw. This episode is sponsored by Datadog, the monitoring platform for cloud-scale infrastructure and applications allowing you to see inside any stack, any app, at any scale, anywhere. And Educative.io, level up your coding skills quickly and efficiently whether you're just starting, preparing for an interview, or just looking to grow your skill set. And Clubhouse, the fast and enjoyable project management platform that breaks down silos and brings teams together to ship value, not just features. All right. And today we are

Starting point is 00:01:14 continuing on with chapter two of Designing Data-Intensive Applications, a book we've been talking about for a while. And today we're going to be focusing in on the differences between document and relational models. And we kind of talked about this a little bit last episode, but we're going to knock things out of the park this time. In case you're following along with the book, we're actually starting with the section on many-to-one and many-to-many relationships. Before we get into that fun part, we'd like to thank those that took the time to leave us a review. From iTunes, we have Campfires.

Starting point is 00:01:49 Oh, man. I even practiced this one. Dang it. Amize. No, that's wrong. Amize. No, that's wrong. I don't.

Starting point is 00:02:04 You're doing good. Go ahead. Okay. Am I? 776. I'm so sorry. Jozak Alan Outlaw, which I want to point out only my name is spelled correctly in that one. That is good.

Starting point is 00:02:18 That's fine. I love that one. S.K. Metzger, Napalm 684, and Dingus the First. That's my new title for, uh, my name pronunciation. Oh, and, uh, you know, Jozak Alan Outlaw wrote in a question. I bet you didn't know that. As we do. Yeah, as we do. So I was curious as to what your take would be on this, because his goal would be to create mobile apps and then eventually move on to Universal Windows apps.

Starting point is 00:02:59 And he's read some C# books, right? And he's becoming more comfortable with the language. But he's beginning to realize that he understands the syntax and the constructs of the language, right? But he doesn't feel comfortable boasting that he knows C# or using the .NET library because of – well, okay, so he says, do you guys think that I need to learn more .NET and then create applications targeted to Universal before going to learn Xamarin? So before he gets into the mobile thing, does he stay just in the Windows environment? Or go ahead and go to Xamarin first? Then he also adds: would he be able to boast of being a .NET developer without knowing all of the classes in the .NET library? Oh, yeah. So I counted.

Starting point is 00:03:53 Two full questions there. Do you remember when I counted the number of classes in .NET? I tried to figure out how many were just in the System namespace and it was like 4,000. Right. And that was deduplicated. So I got rid of all the ToStrings and stuff. I mean there's a ton. I would definitely not worry about getting to know everything in the standard

Starting point is 00:04:07 library. So, yeah, I mean, his first question, you guys think I need to learn .NET and then create applications, blah, blah, blah. So learn the fundamentals, right? Learn the data structures, learn the data types, like your native ones, your structs, all that kind of stuff, learn that.

Starting point is 00:04:27 And then from there, just do what you need to do, right? Like, that's my take on it, because let's be honest. Okay. So there's .NET Framework, which, you know, the three of us know and love and have used for years. And then there's .NET Core, which changed the game completely because the namespaces are no longer the same. Some of the things that lived in one namespace over here no longer exist over here

Starting point is 00:04:51 or are in a different namespace altogether. And then let's go ahead and bolt on one more. You have the .NET Standard framework, or it's not even a framework, but the .NET Standard set of libraries that's sort of a bolt-on to allow things to be more cross-framework, cross-platform compatible. And guess what? Those are all different namespaces. So my take is learn the constructs of the language.

Starting point is 00:05:15 Learn all the structs and the data types and understand why and when you'll use those, and then build what you want to build. That's when you've become a .NET developer. Uh, yeah, I would agree with some of that. Like, basically my take on it was, you know, if building a mobile app is the thing that you want to do, then just go start building a mobile app. Like, yeah, you know, you don't have to start with the desktop apps or the Universal Windows apps if that's not your first goal, right? And then through the practice of creating that mobile app, you're going to figure it out. Now, I do think, by going that route, though, because now you're mixing worlds of .NET and Xamarin with iOS or Android or whatever,

Starting point is 00:06:09 if it was a first-time developer who hadn't done anything else, then that's where it would be like, hmm, that might add a little extra complication to it. A lot of extra complication. Yeah, I was trying to be nice. So that's the thing, right? Like I don't think your goal should be, hey, I'm a .NET developer. Your goal should be whatever it is you want to create. It sounds like, like in his case, if it was me, right?

Starting point is 00:06:31 Like in his case, he's wanting to create mobile apps. And guess what? There are a whole slew of problems that come along with writing a mobile app, right? If you're writing something for a phone, it's very different than writing a website. It's very different than writing a desktop app because now you have connectivity issues. You have I need to sync data at different times. And like there's all kinds of things that come into play there. So if that's truly where your passion is, go learn that.

Starting point is 00:06:58 And then instead of saying I'm a .NET developer, say I'm a mobile app developer. Right. Because now I understand the problems in that particular space. And who cares what the underlying language is? Because in all honesty, if you know how to do the app in that space, it doesn't matter if it's C#. It doesn't matter if it's Swift. It doesn't matter what it is; you understand the paradigm there. And somebody put that in a comment recently, and it really made me happy. But you actually understand that space. And now you understand basic programming. You can go do it in whatever language is available.

Starting point is 00:07:32 That's at least my thinking on it. Yeah. Okay. Go ahead, Joe. I was going to say, too, when you're starting with a new paradigm, a paradigm for apps, I'd aim for a really simple mobile app first. Like, if you've got a dream business to, you know, make the new Uber of pizza or whatever, I wouldn't go for that dream app first. I'd do something real simple there, get it out there, publish it, call it done, move to the next.

Starting point is 00:07:55 Like I would step it up in that way. And if you need to Google something for the language, it's going to happen. But I would just try to take an extra minute to look a little bit deeper than the first Stack Overflow answer. So if you see, you know, it says, hey, use this namespace, take a minute to go click on that namespace or find that namespace in the documentation and see what else is there. Just because, you know, you're kind of new to that paradigm. Yeah.

Starting point is 00:08:14 And if you're on Stack Overflow looking at answers, don't always look at the first one. Scroll down the page because oftentimes the accepted answer is the first person who answered it in a way that solves somebody's problem. It doesn't mean it was the best answer. So keep looking. Yeah. That's the thing. The really good answer isn't always the approved or, you know, selected answer. It may not even be upvoted that much because it was at the very bottom of the page, right?

Starting point is 00:08:40 Or we've seen times where it is the more upvoted answer and still not the approved one. Right. So, yeah, I mean, I guess my whole thing here is don't worry about calling yourself a .NET developer. Be worried about what it takes for you to become a good developer, meaning you understand the data structures and when to use them and all that stuff. And then go build what you want to build. Right? Like the rest of that stuff just comes along for the ride.

Starting point is 00:09:10 Yeah. And I remembered as we were talking about this how I was going to pronounce that name. I think it's supposed to be pronounced Amiza. I think you've got it like five different ways. You've hit one of them. Yeah, I think you probably did it right somewhere in there. Yeah. Well, thank you for recognizing. But that was the one that I wanted to say and forgot. And now that I got that out there, I covered all my bases.

Starting point is 00:09:27 See what I did? That's awesome. Hey, by the way, like we don't get that many questions. Like if you guys like that, shoot us a question. Like we always send us comments or whatever. But if anybody has a question, like we love. Michael Outlaw just wrote in. Yeah, yeah.

Starting point is 00:09:42 Outlaw just wrote in. But no, no. I mean, that's kind of fun and it's nice to – I mean we get those – we definitely get those in person more than we do – I know what my question would be. What would you – oh, man. How do you – Who's your favorite? Why is it Joe?

Starting point is 00:10:01 Cool. So this is the last time that we will announce this because I will actually be there after this episode. So I'm going to be speaking at NDC London on real time streaming or near real time streaming with Kafka and SQL Server and Elasticsearch and a bunch of other stuff. So if you are going to be at NDC London, please come say hi to me. If you're in the London area and you guys want to hook up for lunch or dinner or something, let me know. I will have a little bit of free time before some of the

Starting point is 00:10:31 conference stuff. So definitely hit me up. Would love to meet some of you folks. And that sweet new swag that you hinted at last time is in. Yeah, so we got some hats. We got some toboggans and stuff. So maybe Toboggans? That's what I call them. You call them beanies.

Starting point is 00:10:46 Is that not what they are? Skullcap. I don't know. I think toboggans are sleds. Are they? Yeah. Are they also hats? I don't know.

Starting point is 00:10:53 I think they're hats. Define – hold on. I think a skullcap or a beanie. Define toboggan. Oh, my gosh. They're toboggans. Boom. Get some.

Starting point is 00:11:02 They become toboggans. Yes. A long, flat bottom – oh, that's also a sled. But, yeah, I think it's also a hat. It's the kind of hat you wear on a sled. There you go. So, at any rate, we've got –

Starting point is 00:11:12 I don't know, man. My search for toboggan just came up with sleds. So, we've got at least five flavors of hats, right? We have two trucker caps with different colors, and we've got three different color variations of these beanies, skull caps. Yeah, because even Wikipedia is saying toboggan is a sled. Well, Wikipedia is not right. Yeah, I found a better source that talks about the distinction. It's very interesting.

Starting point is 00:11:40 Okay, cool. See? I want to learn. So I'm going to be packing up some of the swag and bringing it with me. So I'll probably be handing some out during the talk. And then I'll probably, like anybody that I meet prior to the talk, obviously you'll take precedence. So if there's anything special you want, you know, just let me know. And hey, Outlaw, can we get some pictures of this up on this episode?

Starting point is 00:12:02 So if anybody has any special requests, they can be like, yo, I want option number three. I will absolutely get some pictures out there. Yes. I like it. I won't commit to only the episode though. Okay, cool. All good. Because I'll probably like throw some out on Slack, you know, maybe some tweets.

Starting point is 00:12:17 I don't know. I might get cray cray. You don't know. I like it. Where I might put these messages, these pictures at. Well, if you do look at the website at this particular episode in order to see those pictures, then you can go ahead and leave a comment for a chance to win the book. I see what you did there.

Starting point is 00:12:33 The book, Designing Data-Intensive Applications. That, sir, is called a segue. Yep. But actually, I'm going to throw another, I don't know, monkey wrench in the works because I got more news for you. I'm going to be at South Florida Software DevCon. They changed the name of South Florida Code Camp to this. And I'm going to be there. And, actually, I'm planning on stealing Alan's slides and doing a poor job of it.

Starting point is 00:12:58 So, if you're in the neighborhood, you should stop by. Hey, which slides are you taking? Which one are you doing? No, I'm kidding. I'm doing a slightly different version of streaming architecture by example because I'm going to focus on the front end too. So we're going to talk about GraphQL subscriptions and a little bit of Kafka. So I'm going to try and kind of show the

Starting point is 00:13:14 full story and show you what it's like on the front end and the back. It will be good. Joe Zack has done some nice work on that. And as if that isn't going to be enough, all three of us will be at the Orlando Code Camp March 28th

Starting point is 00:13:29 where you could definitely have your chance to kick all of us in the shins, but mostly aim for Joe. Just ask Mike. I bite back, so that's going to be a problem. That implies that you've bitten me. That was weird. Yeah, that is kind of odd.

Starting point is 00:13:45 So awkward. All right. Well, let's move on. Let's talk about document databases. All right. So we kind of take some things for granted. We kind of assume a little bit of knowledge on your part about relational databases. We probably shouldn't do that, but we tend to talk about this stuff a lot.

Starting point is 00:14:04 So hopefully we're not going too crazy here. And if we are, you know, stick with it because maybe it'll turn out cool anyway. But so a lot of times when people talk about relational databases and relating data, we talk about normalization, which I'm so used to dealing with and have been doing for such a long time

Starting point is 00:14:24 that I had a hard time really kind of trying to figure out how to say it in just a sentence. But I came up with one that took me way too long. Did you hear how I pronounced it 6-N-F last episode? Because if you want to talk about anyone who has trouble pronouncing it, I vaguely remember that. Nominization? Yeah, I just could say sixth normal form. Yeah. Oh, yeah, that's right. All right, so what have we got for the description here? Yeah, not going to dive into the different levels of it, but basically the idea behind normalization, in like a kindergarten

Starting point is 00:15:00 crayons like high level view is associating meaningful data with a key and then relating data in different parts of the system by keys rather than their values. So if I've got a user record whose name is Joe, I don't go around and say user Joe, user Joe, user Joe throughout my code, throughout my queries and wherever I'm interacting with the data. I'm going to use some sort of unique identifier, like a number or a GUID that represents that person. And then if you go and change the name Joe to Joseph, then you don't have to go and change all these other places that we reference it. So that key provides us a little bit of

Starting point is 00:15:37 abstraction there. And it buys us a couple other things too that we're going to get to here in a minute. Basically, the main point of normalization is that we're separating our meaningful data, data that actually means something and reflects something in the real world. We're adding a key to that and then joining things and relating things by those keys. So I would like to expand on this a little tiny bit just because of what it – We thought you might. Yeah. Yeah. So I agree with what you're saying and that you have the key and all that stuff.
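
To make the "relate by key, not by value" idea concrete, here is a minimal sketch using Python's standard-library sqlite3 module. The table and column names (users, comments, user_id) and the sample comments are assumptions made up for illustration; they do not come from the book or the episode.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),  -- the key, not the name
        body TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Joe')")
conn.executemany(
    "INSERT INTO comments (user_id, body) VALUES (?, ?)",
    [(1, "First!"), (1, "Nice episode."), (1, "Toboggans are hats.")],
)

def comments_with_names(conn):
    """Join each comment back to the single user row that holds the name."""
    return conn.execute(
        "SELECT u.name, c.body FROM comments c "
        "JOIN users u ON u.id = c.user_id ORDER BY c.id"
    ).fetchall()

before = comments_with_names(conn)

# Renaming Joe to Joseph touches exactly one row in one table.
rows_changed = conn.execute(
    "UPDATE users SET name = 'Joseph' WHERE id = 1"
).rowcount

after = comments_with_names(conn)
```

Every comment row still stores only the number 1, so the rename shows up on all three comments without any of them being rewritten.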

Starting point is 00:16:12 But there's a couple of things that I think might help it clear up for people that don't live in this world. When we talk about relating keys, the whole purpose is to keep a distinct list of data in that one spot, because that's the whole point of the relation, right? So by saying not user Joe, not user Joe, if Joe is one, then that means if you ever want to refer back to Joe in all your data in every other table, you're going to use one, right? And the whole point of that is there's one instance of Joe that exists in that user table. So that's the reason why I wanted to expand on it because the notion is not just, Hey, you've got these keys that you're using all over the place. The whole point is you don't duplicate data in a normal form database, right? Like what is it? Third normal form, I think is

Starting point is 00:17:03 what you typically go to if you're, if you're trying to get it down to like really good stuff. Yeah. And then that's where it was getting crazy with sixth normal form because you end up with tables that are just almost like all. Actually, doesn't it get worse as you go? Wait, is it higher or lower? I can't even remember.

Starting point is 00:17:18 This is going back to. Well, sixth normal form, you would basically have like, you know, it's just like tables of IDs. You break out. Yeah. Like even dates, you're breaking apart dates and all that stuff. It gets crazy. But yeah, so third normal is pretty normal. What you typically see is third or fourth. Yeah, that sounds a little complicated: third normal is normal. But yeah, the key here is you have separate records

Starting point is 00:17:42 to represent each thing, right? So I guess the easiest way to think about this: if you went to Facebook and you saw a bunch of posts from Joe Zack, you could think of, hey, every time it wants to show his name up there, it went back to that user table to pull his name out, right? It wasn't saved on that comment, it wasn't saved on that post. Right. It's always going to look it up back at that user table. And that's normalization. All right.

Starting point is 00:18:12 And I think that it kind of sounds complicated if you're not familiar with the concept. You know, someone maybe not familiar with relational databases hears, like, third normal form, sixth normal form. It sounds intimidating, but really, once you've got that main principle down and understand the reasons behind it, it almost kind of comes intuitively. Like I don't know any developers who could really define the difference between third normal form and second normal form. But most of them who have been working with databases for a while are just going to kind of do that naturally.

Starting point is 00:18:39 Yeah. Well, I was about to say, how about we give an example, and then I see that the next thing you had there was an example. Yeah, we did the Facebook one, but I wrote one here where basically we've got a database table representing purchases on an e-commerce website like Amazon, where you would have an order ID that represents the order, and then a product ID that represents the products that are part of that order that were purchased. And those products also represent the products that are for sale in the store. So the products are represented by a product ID. And that product ID applies whether it's in the store for sale or is associated to those orders via some sort of lookup table that combines order ID and product ID to say that this order is comprised of these products.
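
The order and product example above can be sketched as three tables, again with Python's sqlite3 module. The names (orders, products, order_products) and the sample rows are assumptions for illustration only. The lookup table holds nothing but pairs of keys, which is what makes the relationship many-to-many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
    -- Lookup table: one row per (order, product) pair.
    CREATE TABLE order_products (
        order_id INTEGER NOT NULL REFERENCES orders(id),
        product_id INTEGER NOT NULL REFERENCES products(id),
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO products VALUES (10, 'Keyboard'), (11, 'Mouse'), (12, 'Monitor');
    INSERT INTO orders VALUES (500, 'alice'), (501, 'bob');
    INSERT INTO order_products VALUES (500, 10), (500, 11), (501, 10);
""")

def products_in_order(order_id):
    """Which products make up this order?"""
    rows = conn.execute(
        "SELECT p.name FROM order_products op "
        "JOIN products p ON p.id = op.product_id "
        "WHERE op.order_id = ? ORDER BY p.id",
        (order_id,),
    )
    return [name for (name,) in rows]

def orders_containing(product_id):
    """The same lookup table answers the question in the other direction."""
    rows = conn.execute(
        "SELECT order_id FROM order_products "
        "WHERE product_id = ? ORDER BY order_id",
        (product_id,),
    )
    return [order_id for (order_id,) in rows]

keyboard_orders = orders_containing(10)
order_500 = products_in_order(500)
```

The product row for the keyboard exists once, whether it is sitting in the store catalog or appearing in two different orders.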

Starting point is 00:19:21 I think it might be helpful, though, if explained like what a denormalized version would be. Like if you didn't normalize it, like what would that record look like? So you might say like, oh, well, you'd have the order number and the product description and the product SKU and the product price. All of that would be like in one big column, right? Hey, is that further down when we talk about the NoSQL implementation? Yep. Yeah.

Starting point is 00:19:48 We'll talk about it a little bit, but I love what you're saying. Yeah. It's basically, yeah, the opposite there is kind of storing it like a document where we have an order and it's comprised of all the data, which is actually specific to that order. And, you know, order is kind of a bad example, because if you just reference the product ID in your order-ID-to-product table and then that price changes, you don't want the price that the customer paid to change in that order. So typically you actually take a snapshot with things like orders.
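
Here is one way the "take a snapshot" point might look in practice: a small sketch, assuming a hypothetical order_items table that copies the price at purchase time instead of only pointing at the product. Prices are stored as integer cents, and all names and values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price_cents INTEGER NOT NULL
    );
    CREATE TABLE order_items (
        order_id INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES products(id),
        price_paid_cents INTEGER NOT NULL  -- snapshot, deliberately duplicated
    );
    INSERT INTO products VALUES (10, 'Keyboard', 4999);
""")

def place_order(order_id, product_id):
    """Copy the current catalog price into the order line at purchase time."""
    conn.execute(
        "INSERT INTO order_items (order_id, product_id, price_paid_cents) "
        "SELECT ?, id, price_cents FROM products WHERE id = ?",
        (order_id, product_id),
    )

place_order(500, 10)

# The store raises the price after the purchase.
conn.execute("UPDATE products SET price_cents = 5999 WHERE id = 10")

price_paid = conn.execute(
    "SELECT price_paid_cents FROM order_items WHERE order_id = 500"
).fetchone()[0]
current_price = conn.execute(
    "SELECT price_cents FROM products WHERE id = 10"
).fetchone()[0]
```

The order still references the product by key for things like the name, but the amount the customer paid is frozen on the order line.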

Starting point is 00:20:19 Let's use Alan's comment example. It could be a Facebook or any kind of forum comment or whatever, right? what you might also display as the username and then a timestamp of when he did it. And then the full comment that he might've made, whereas the normalized version, you're not going to have, you're going to remove his name and his display name from that record. And instead you would just have an ID that would point back to a user table that you would retrieve that stuff from. And then that way, as maybe he might change his name, Joe does this all the time in Slack, right?

Starting point is 00:21:21 Where he'll be like Joe to the Z, and then he'll be like Joe Kotlin Zack, or whatever, you know, depending on the day of the week. Right. And the way all of those other messages get updated is because they're all pointing to that user record for Joe Zack, so they can see, oh, he changed, this is his new display name. And so all the past ones get updated as well. So I think I can quickly put this into concrete terms. So we're talking about the denormalized version. Basically, you're storing exactly what you see, right? So if you're looking at that Facebook post and it says Joe Zack and it

Starting point is 00:21:54 has a comment on it, it has a timestamp, all that's in one record, right? That's all saved in a document, just like a snapshot of the page, right? If we're talking about the normalized form in the database, you might have a comments table that has comment IDs on it, like one through a billion for Facebook or quintillion. I don't even know. They've made up new numbers for Facebook. They have. So you're going to have that comment ID. You'll have the actual comment that somebody typed in, and then you're going to have a user ID one for Joe, right? So that's what the record would look like. And then if you want to look

Starting point is 00:22:30 up what Joe's name was for that page, now you're going to have to go over to the user table and look up user ID one, and next to it, it'll have first name, last name, Joe Zack, right? So that's how they're different, right? In the denormalized form, you've got one record with all the data in it. In the normalized form, you've got multiple tables that you've got to go to, to go piece all the data together. And that may sound a little bit worse at first when you think about having to go put this stuff back together, you know, putting Humpty back together again, maybe hitting nine tables to get a blog post and all the articles. Like it seems much simpler to store the content of the article and the comments all in one thing, and then it's only one call to get that out.
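
The two record shapes described above can be put side by side in plain Python dictionaries. This is only a sketch: the field names and the timestamp value are made up, and real document and relational stores obviously do more than a dict lookup.

```python
# Denormalized: one record that looks like what is shown on the page.
comment_document = {
    "comment_id": 1,
    "display_name": "Joe Zack",
    "timestamp": "2020-01-20T10:00:00Z",  # made-up value
    "text": "Great episode!",
}

# Normalized: the comment row only carries the user's key.
users_table = {
    1: {"first_name": "Joe", "last_name": "Zack"},
}
comments_table = {
    1: {"user_id": 1, "timestamp": "2020-01-20T10:00:00Z", "text": "Great episode!"},
}

def render_from_document(doc):
    """One read: everything needed is already in the record."""
    return f"{doc['display_name']}: {doc['text']}"

def render_from_tables(comment_id):
    """Two reads: the comment row, then the user row it points to."""
    comment = comments_table[comment_id]
    user = users_table[comment["user_id"]]
    return f"{user['first_name']} {user['last_name']}: {comment['text']}"

from_document = render_from_document(comment_document)
from_tables = render_from_tables(1)
```

Both paths produce the same line of text; the difference is how many places had to be visited to build it.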

Starting point is 00:23:06 It's going to be super fast, and there's all sorts of benefits there, and that's all true. However, you miss out on flexibility if you store things that way. So, for example, in the comment example, you see a blog post and all the comments associated with it. That's great for viewing on the website. But what if I want to click on the user who left a comment or a user who left a comment and see every comment they've ever made? Like, oh, well, in that case, we have to go load every blog post we've ever had and then scan the comments for anything that person has ever done and then go pull those out. And so that's a lot of extra work. And so what was really convenient for one use case is terrible for another.
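
The "show me everything this user has ever commented" case can be sketched both ways. In the sketch below, the posts list stands in for a document store with comments embedded in each post, and a sqlite3 table stands in for the normalized version; the names, the index, and the sample data are all assumptions for illustration.

```python
import sqlite3

# Document-style storage: comments live inside their blog post.
posts = [
    {"post_id": 1, "title": "Data models", "comments": [
        {"author": "joe", "text": "Nice"},
        {"author": "alan", "text": "Agreed"},
    ]},
    {"post_id": 2, "title": "Relationships", "comments": [
        {"author": "michael", "text": "Hmm"},
        {"author": "joe", "text": "Keys!"},
    ]},
]

def comments_by_author_documents(author):
    """Has to walk every post and every embedded comment."""
    return [
        c["text"]
        for post in posts
        for c in post["comments"]
        if c["author"] == author
    ]

# Relational storage: comments are their own table, indexed by author.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL,
        author TEXT NOT NULL,
        body TEXT NOT NULL
    );
    CREATE INDEX comments_by_author ON comments (author);
""")
for post in posts:
    conn.executemany(
        "INSERT INTO comments (post_id, author, body) VALUES (?, ?, ?)",
        [(post["post_id"], c["author"], c["text"]) for c in post["comments"]],
    )

def comments_by_author_sql(author):
    """Asks the comments table directly; posts are never loaded."""
    rows = conn.execute(
        "SELECT body FROM comments WHERE author = ? ORDER BY id", (author,)
    )
    return [body for (body,) in rows]

joe_from_documents = comments_by_author_documents("joe")
joe_from_sql = comments_by_author_sql("joe")
```

With two posts the scan is harmless, but its cost grows with the total number of posts, while the indexed query only depends on how many comments that one author has.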

Starting point is 00:23:40 Yeah. And another example to pile on there, just to take that even a step further: let's say that you have a Facebook post and all the replies on it. Right. If that's in a denormalized form, it might store all those comments on that one post. Right. But that means that if anybody wants to add a new comment, you've got to load all that stuff back out, update the entire thing, and shove it back in. Right. In a relational, normalized world, you just add another comment to that comment table and tell it, hey, the parent post was this number up here. Right. So there's all kinds of efficiencies that happen in different ways, depending on what you need. Yeah. I want to point out, too, that a lot of times you'll see this stuff used together in various
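
The difference in write paths can be sketched with a dictionary of JSON strings standing in for a document store and a list of rows standing in for a comments table. This models only the load, modify, and store-back path described above; some document databases also offer in-place append operations, which this sketch ignores. All names here are hypothetical.

```python
import json

# Hypothetical document store: key -> serialized post with embedded comments.
document_store = {
    "post:1": json.dumps({
        "title": "Data models",
        "comments": [{"author": "alan", "text": "First"}],
    }),
}

def add_comment_to_document(key, comment):
    """Load the whole post, append, and write the whole post back."""
    post = json.loads(document_store[key])
    post["comments"].append(comment)
    payload = json.dumps(post)
    document_store[key] = payload
    return len(payload)  # size of what was rewritten

# Hypothetical comments table: each comment is its own small row.
comment_rows = [
    {"id": 1, "parent_post_id": 1, "author": "alan", "text": "First"},
]

def add_comment_row(parent_post_id, author, text):
    """Append one row that points at its parent post by key."""
    row = {
        "id": len(comment_rows) + 1,
        "parent_post_id": parent_post_id,
        "author": author,
        "text": text,
    }
    comment_rows.append(row)
    return row

rewritten_size = add_comment_to_document("post:1", {"author": "joe", "text": "Second"})
new_row = add_comment_row(1, "joe", "Second")
```

The document write grows with the size of the post and all of its existing comments, while the row insert stays the same size no matter how long the thread gets.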

Starting point is 00:24:27 different ways. That's what kind of what this book's all about. And so we're going to be getting all this stuff, but a common approach for WordPress and dealing with this exact situation is to have a relational database that has all the relational bits broken up and normalized and is flexible. And so you could run reports and do whatever you need. And then they'll have a caching layer that is basically a key value that acts more like a document model where we say, okay, whenever this blog post is updated or a comment is added, go ahead and update this hash table or

Starting point is 00:24:54 this, uh, cache data store with the big block of data. And then whenever a user comes in, shortcut: skip the relational data and go hit this. And so our main use case, people hitting the website, is going to be fast, because that happens so much more often than these reports. But we still have the backing of the relational database whenever we need to run those infrequent reports. So combining these things in smart ways is really where the magic is. Yep. And the key here is what he just said.
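
The combined approach described above (normalized tables as the source of truth, plus a key-value cache holding the assembled page data) might be sketched like this. It is not how WordPress or any particular caching plugin is implemented; the dict-based cache and every function name are assumptions for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        author TEXT NOT NULL,
        body TEXT NOT NULL
    );
    INSERT INTO posts VALUES (1, 'Data Models: Relationships');
    INSERT INTO comments (post_id, author, body) VALUES (1, 'alan', 'First');
""")

cache = {}  # hypothetical key-value cache: post id -> JSON document

def build_post_document(post_id):
    """Assemble the denormalized view from the relational tables."""
    title = conn.execute(
        "SELECT title FROM posts WHERE id = ?", (post_id,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT author, body FROM comments WHERE post_id = ? ORDER BY id",
        (post_id,),
    ).fetchall()
    return {"title": title, "comments": [{"author": a, "body": b} for a, b in rows]}

def refresh_cache(post_id):
    cache[post_id] = json.dumps(build_post_document(post_id))

def add_comment(post_id, author, body):
    """Write to the relational tables, then rebuild the cached document."""
    conn.execute(
        "INSERT INTO comments (post_id, author, body) VALUES (?, ?, ?)",
        (post_id, author, body),
    )
    refresh_cache(post_id)

def view_post(post_id):
    """Page views read the cached document and skip the joins."""
    if post_id not in cache:
        refresh_cache(post_id)
    return json.loads(cache[post_id])

first_view = view_post(1)
add_comment(1, "joe", "Second")
second_view = view_post(1)
```

Reports and ad hoc queries can still go straight to the comments table, while the far more frequent page views pay for a single key lookup.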

Starting point is 00:25:47 There's no one size fits all, right? More importantly, these technologies exist because they solve very specific problems, right? And there is crossover between where you'd use them. But chances are, if you start seeing pain points in one versus the other, that's when you're going to start going, hmm, maybe I need to look at this other solution. And that's what this chapter is all about, is exploring those lines and what helps us make those decisions. So, I mean, this kind of goes along the lines of something you already said, Alan, but the next point we had here was just that normalization reduces redundancy and improves data integrity. Yep. You have one copy of Joe Zack, right? Yeah. Yeah. Which is kind of the example that you were giving where instead you would have the one table for the comments instead of reloading

Starting point is 00:26:19 that whole thing every time. You know, an interesting thought here is I wonder if these document databases are becoming way more popular nowadays because storage costs are so dirt cheap. I mean, you know, back in the day you had a relational database. Part of it was, hey, we didn't want five terabytes of data because that was hyper expensive, right? Nowadays it's almost like, ah, whatever, buy some more hard disks, put them in there, you know? Yeah, that's true. And for stuff like reporting, a lot of times it makes sense. Sure, it's slow to go out and query and do all those kind of analytical things we want to do.

Starting point is 00:26:53 But maybe we can live with that happening once a night or once an hour or something. So it's all about tradeoffs. And maybe you're okay with that. And if you are a small business getting started and you're not really sure how many customers you're going to have or you're trying to save money, then document databases are a great way to get up and running fast that makes it easy for app development, and you can always pivot later if you need to. I kind of feel like technology is just circular, though. And like, if you missed it the first time it came around, you're like, oh, I didn't get on that document database bandwagon. Like,

Starting point is 00:27:22 wait long enough, it'll come back around and you can catch it that time. Yeah, definitely. I will say, too, to caution based off what Joe said right there, a lot of people do tend to lean towards the document database because it maps really easily to the code that people write, like developers.

Starting point is 00:27:39 That's not necessarily what you want. Again, depends on your use case. Right. And hopefully we'll get into some of that here. Well, I mean, part of what I was thinking, like when you asked that question, I was like, well, I mean, you have things like LAMP that kind of make it easy, right? But then it's like, well, did things like that make it more popular, or did that become a thing because it was already popular, which is where I'm leaning? What about the MEAN stack?

Starting point is 00:28:04 I mean, that was the entire flip, right? That was MongoDB and Express. So it started with the document database up. But wasn't the M in LAMP Mongo? No. LAMP was Linux, Apache, Mongo, PHP, I thought. No, no, no. MySQL.

Starting point is 00:28:20 Oh, you're right. MySQL. Yeah. So the MEAN stack was Mongo. Yeah, you're right. Mongo, Express, Angular, Node. Yep. Yeah.

Starting point is 00:28:29 Yep. Yeah. So one thing the book points out is that as your product matures, your data gets more complicated. So maybe we start out with a blog post and then we add comments and then we add the ability to upvote comments. And then we add the ability to have smiley faces or frowny faces or other emojis with comments. And now we add the ability to attach a GIF with the comments. And so you can imagine. Oh, I don't like what I wrote.

Starting point is 00:28:52 Now I need the ability to soft delete my comment. Yeah, exactly. Or edit or hide or whatever. So as your project grows, your app becomes more complicated because it's working and people are using it. The data model gets more complicated. And that's fine for relational. It's kind of built for these complex relationships. But the document model doesn't really do so well. You have to move that model to either, you basically have to migrate that data to new models, or you have to be able to support multiple different use cases of kind of different

Starting point is 00:29:25 versions of data so it puts more work on you the developer as time goes on so what was easy in the beginning is hard as things move on but the inverse is also true right like so the sql model might hold up really well or not sql the relational model might hold up really well for these complex use cases but your performance may suffer, right? Yeah. And it's complicated in the beginning. If you're normalized, you're going to have like eight tables to do something stupid like get a blog post. Right.

Starting point is 00:29:53 And maybe multiple queries. Yep. So relationships are really kind of the key distinction here. It's the thing that makes document databases easier to scale. It's what keeps their data tightly located in the same location, so high locality. And it's what makes RDBMSs so flexible. So that's really kind of the key distinction. If you try and decide whether to go with a relational database or a document database,

Starting point is 00:30:20 then you need to understand your relationships. And those document databases that we mentioned struggle as things get more complicated. As the relationships do. I mean, like one of the use cases that has always driven me crazy is, let's say that, like, let's go back to the whole post thing and names because it's easy. You have 10,000 posts that Joe Zack did, right? And let's say that you stored his name in there on each one in a document database, Joe Zach.

Starting point is 00:30:47 All of a sudden, tomorrow, he wants to change it to Jay-Z the Evangelist, right? Now, that update's not easy, right? Especially depending on how complex your document model was. If you kept all your comments nested in one document in a document database, you've now got to traverse all those things and find all the places where Joe Zach showed up and change him to Jay-Z the evangelist, right? And so updates aren't easy, right? Like they can be extremely heavy on the writes, extremely heavy on the reads and all that kind of stuff. Whereas in a relational database, you update one record and everything's fixed, right? It's magic.
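
The rename scenario just described can be sketched in a few lines. This is only an illustration, assuming a hypothetical `users`/`posts` schema in an in-memory SQLite database and a plain Python list standing in for a document store; none of these names come from the book or the show.

```python
import sqlite3


def rename_author_demo():
    """Compare how many records a rename touches in each model (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            title TEXT NOT NULL
        );
    """)
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'Joe Zack')")
    conn.executemany(
        "INSERT INTO posts (user_id, title) VALUES (1, ?)",
        [(f"Post {n}",) for n in range(10_000)],
    )

    # Normalized relational model: the name lives in exactly one row.
    cur = conn.execute("UPDATE users SET name = 'Jay-Z the Evangelist' WHERE id = 1")
    rows_touched = cur.rowcount

    # Denormalized documents: every post carries its own copy of the name,
    # so every copy has to be found and rewritten.
    documents = [{"title": f"Post {n}", "author": "Joe Zack"} for n in range(10_000)]
    docs_touched = 0
    for doc in documents:
        if doc["author"] == "Joe Zack":
            doc["author"] = "Jay-Z the Evangelist"
            docs_touched += 1

    # Every post now resolves to the new name through the join.
    names = conn.execute(
        "SELECT DISTINCT u.name FROM posts p JOIN users u ON u.id = p.user_id"
    ).fetchall()
    conn.close()
    return rows_touched, docs_touched, names
```

The same cost would show up in a relational database that stored the name denormalized on every post row, which is the caveat raised right after this.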

Starting point is 00:31:24 Relational assuming that it's normalized. Assuming it's normalized, right? Not a denormalized. Because denormalized has the same problem in both the relational database world and in the document database world. You can do them both that way. Sorry, I was changing my name in Slack. Is it the evangelist now? Sort of. The JZ the revenger? So Slay the Spire, my favorite game in the world, just came out in the new version. So it's got an E at the end, too, so I'm the Slay the Spire evangelist.

Starting point is 00:31:54 Very nice. Yeah. Properly awkward, just like me. It's great. So one thing that's really important, too, and we're going to kind of talk about this a little bit after the break here, but document database designers have to be a lot more careful about their decisions because those decisions are harder to change. So in a relational database, you can move columns around, you can do things, you can run queries. It's still a pain in the butt depending on your uptime and your different restrictions, but it's nothing compared to what

Starting point is 00:32:29 you have to do in a document database and so it's up to you as the developer and the maintainer of this data model to understand the use cases and what needs to happen in these migrations so you know what was easy in the beginning gets harder and harder as things go on and as changes are made. Do you think that's where people run into problems where they're like, you know, maybe a document database gives them a bad taste and they'll feel forever burnt by it. Yeah.

Starting point is 00:32:56 That integrity. Yeah. It's hard. Like, you know, we talked about the posts with the comments and the comments have users that have made the, you know,

Starting point is 00:33:02 next thing that you're storing the user's image and then you have to update the image. And yeah, I could definitely see that that lack of redundancy or actually having redundancy and losing that integrity is really rough if you move to a database, a document database, especially if you're used to those relational things existing. Well, I just meant, I mean, that was great examples, but I was referring to the previous point, though, about, like, deciding to change your model. Like, you know, oh, I don't want this field there anymore, or, you know, I want to change the type, or I want to rename it. The point was that, you know, relational is basically enforcing the schema on write, whereas with the document, you're enforcing

Starting point is 00:33:52 the schema on read. So it's easy, and you get away with stuff until it's like, oh, I want to make this change. Oh, God, now I got to find every place that's using that. Yeah, you end up with one of two things, right? Like in the relational world, you're forced to make that schema change, and you're forced to make the code changes to support that schema change, right? Mostly.

Starting point is 00:34:18 You could have the same runtime errors with relational code that you can with a document. But here's the big difference. The document is usually more forgiving in that you can with a document. But here's the big difference. The document is usually more forgiving in that you can just add as many properties as you want to it, right? Or remove. But the problem is your application code can get really complicated and really messy trying to keep up with every revision of that that you've done to support 12 versions back, right?
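
To make the "support 12 versions back" problem concrete, here is a minimal sketch of schema-on-read. The three document shapes and the `schema_version` field are assumptions invented for this example, not something the book prescribes.

```python
def display_name(doc):
    """Read a user document written under any of three hypothetical schema versions."""
    version = doc.get("schema_version", 1)  # oldest documents carry no version field
    if version == 1:
        return doc["name"]  # v1: a single "name" string
    if version == 2:
        return f'{doc["first_name"]} {doc["last_name"]}'  # v2: two flat fields
    if version == 3:
        name = doc["name"]  # v3: a nested "name" object
        return f'{name["first"]} {name["last"]}'
    raise ValueError(f"unknown schema_version: {version}")


def migrate_to_v3(doc):
    """Rewrite an older document into the v3 shape so readers can drop old branches."""
    version = doc.get("schema_version", 1)
    if version == 3:
        return doc
    if version == 1:
        first, _, last = doc["name"].partition(" ")
    else:
        first, last = doc["first_name"], doc["last_name"]
    return {"schema_version": 3, "name": {"first": first, "last": last}}
```

Every reader needs a branch per version until the old documents are migrated, which is the maintenance trade-off being described here.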

Starting point is 00:34:43 So there's a trade-off, right? Like there's a real trade-off and it could get really hard to maintain your app if you're not making clear, concise decisions to migrate data away from the old patterns and move it into the new patterns versus, hey, I'm just going to make my app support all the old versions plus the new versions, right? It almost feels like you should just always use both. Yeah, maybe that's what it should be. Like, you have your document database for your reads, but that thing is populated by your relational, which is used for your

Starting point is 00:35:19 transactions and writes and whatnot. I don't know, that'd be crazy talk, too. Well, documents are great for caching, and they're great for some other things. Like sometimes you don't have a schema, or sometimes you just kind of accept whatever. So if you're dumping log files, for example, into some sort of persistent data store, and some logging sources have different fields because, you know, it doesn't make sense to have them in others, or they're in slightly different formats, it's great to be able to throw that stuff into a document database over a relational database because of those different formats. And that's a strength there where you can support many different things without even knowing what you're getting in.

Starting point is 00:35:51 And it can be up to your query language or application to kind of decide how to show that. Well, let's get into these next bullets that you've got here because I think it's going to point out some of the things that will help you understand why you would go one or the other. Yeah, we're mostly going to be focusing on many-to-one and many-to-many relationships, and it's just what it sounds like. If I'm a person on Facebook, I have many friends. And that's probably another bad example, dang it, because it's also a many-to-many relationship, or like maybe I'm friends with Outlaw. Outlaw is friends with me and friends with Alan. So there's these kind of mixed

Starting point is 00:36:29 up relationships where things can point to each other. They can be recursive. They can be all sorts of kind of funky. And that's the kind of stuff that relational databases do really good at and document databases not so much. Well, you could still go with your Facebook example though and say like there's one of you, but there might be multiple comments that you've made.

Starting point is 00:36:50 Right. Yeah. Right? Yeah, that's good. So that's your one-to-many in your Facebook example. Right. Yeah. Although I'm kind of digging the Slack one better.

Starting point is 00:36:58 I think we should go with the Slack one. So several different comments, but the friends are a many-to-many because I can have one person who's also a friend of me and also a lot of other people. And so things are just kind of crazy in the way that we represent those relationships in the databases a little bit different. But there are a couple of particular benefits that the book came up with, like consistent styling and spelling for meaningful values. So like an example here might be if you have a user status, it could be inactive or, I don't know, married, single, whatever, unknown. If you want to change, you spell unknown wrong. You spell it uncowen. Then you only have one spot to do it.

Starting point is 00:37:37 But if that was a string stored in a document database, you'd have to update that everywhere that had it, which is a big batch job all of a sudden. If I remember right, this portion in the book, they were talking about like the consistent spelling and styling would be was used for locations, if I remember right. So like, yeah, pay sure. Let's go back to the Facebook example. Right. And like you could do your check ins.

Starting point is 00:37:59 Right. All of those locations, you know, you might want to have like a global list of, you know, well, I didn't mean to use the word global there, no pun intended. But you might have one table that contains all of the locations in the world, and then that check-in is just a pointer to one particular location. And then that way, if the location decides to change the spelling, would that still be – I mean, if you checked into Constantinople and then it became Istanbul. Istanbul. So that's actually where documents are more interesting, and we should talk about that in a second

Starting point is 00:38:41 because I have several things that I want to mention. Yeah, I guess if it's like a record of an event, like maybe that's where you would want to like, whatever. So let's go ahead and do it real quick then because this brings up the thing that Joe touched on for a second earlier is he was talking about orders, right? And then products. You have an order. You had five products on the order. They all have a price associated about orders, right? And then products. You have an order, you had five products on the order. They all have a price associated with them, right? In a relational database world, you're going to have order one, product one,

Starting point is 00:39:13 order one, product two, order one, product three, right? And if you do it right, those products that were associated with the order, you're going to have an order item table with the product number and the price that was charged at the time. Because if you naively set the order ID and the product ID and just assumed that was it, when that product price changes somewhere else, then you're going to change what your order was today. So today you bought the thing for 10 bucks, but tomorrow it goes down to five. It's going to look like you paid five bucks for it tomorrow.
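
The order item idea can be sketched as follows. This is a hedged example with a made-up schema (`products`, `order_items`, `price_cents_charged`) in an in-memory SQLite database, showing the naive join drifting after a price change while the stored price does not.

```python
import sqlite3


def order_total_after_price_change():
    """Return (naive_total, snapshot_total) in cents after a product's price drops."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            price_cents INTEGER NOT NULL
        );
        CREATE TABLE order_items (
            order_id INTEGER NOT NULL,
            product_id INTEGER NOT NULL REFERENCES products(id),
            price_cents_charged INTEGER NOT NULL,  -- copied at purchase time
            PRIMARY KEY (order_id, product_id)
        );
        INSERT INTO products VALUES (1, 'Widget', 1000);   -- 10 bucks today
        INSERT INTO order_items VALUES (1, 1, 1000);       -- order placed today
        UPDATE products SET price_cents = 500 WHERE id = 1; -- 5 bucks tomorrow
    """)

    # Naive: look the price up through the product, so history changes with it.
    naive_total = conn.execute(
        """SELECT SUM(p.price_cents)
           FROM order_items oi JOIN products p ON p.id = oi.product_id
           WHERE oi.order_id = 1"""
    ).fetchone()[0]

    # Snapshot: use the price that was recorded on the order item itself.
    snapshot_total = conn.execute(
        "SELECT SUM(price_cents_charged) FROM order_items WHERE order_id = 1"
    ).fetchone()[0]

    conn.close()
    return naive_total, snapshot_total
```

Copying the price onto the order item is a deliberate bit of denormalization: the row records what happened at a moment in time rather than pointing at the current state.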

Starting point is 00:39:49 So this is where document storage is actually really good, right? For that snapshot in time thing. So instead of having a relational database where you're kind of faking this stuff, instead think about it like an invoice page that you would get, right? You're going to have the order number on the page. You're going to have a list of the items, the amount of money that was charged for them, the tax that was charged for them,

Starting point is 00:40:12 the shipping that was charged for them. It's a snapshot in time. It doesn't change, right? So similar to your Istanbul, not Constantinople thing, if you were to check in today and it's Istanbul and they all of a sudden decide that they want to change it to Constantinople tomorrow, it shouldn't be any different because you checked into Istanbul. You didn't check into Constantinople and that's an app decision, right? Like that, you as a developer, you as the business person making that decision has to come up with, what do I care about? Do I care about this location? Do I do care? Do I care about what somebody actually checked into at the time? And that's where it's a,

Starting point is 00:40:48 it's a decision on, do I use a normalized data form or do I use a denormalized data form as more of a snapshot? Yeah. I mean, I was kind of thinking like that, like if it's supposed to store like a, you know,

Starting point is 00:40:59 if you're recording a moment in time, then you probably want to record like that exact moment down to the detail. You might even include the item description in your product example. Totally. You know, because maybe, you know, part of the description was like, you know, included it was a bundle of some sort. And then later you end up changing what that bundle is or whatever. Or even better, there might be, or not better, but an alternate reason why you might include all that product detail stuff is you might want to apply some deep learning later and say, hey, what was it that caused my sales to drop 50% or what was it that caused my sales to go up 50%? If you have all that data directly on those documents, then it can say, oh, well, we see that there were three new bullet points that were added to the product description.

Starting point is 00:41:46 This might be what caused it to be different, right? Now, that said, though, it did sound kind of like you were making the case at the time, though, especially with the product example, the e-commerce example, that you would rather prefer a document database model over the relational model. But you could still do it all in relational and be just fine. There would be no problems with that because like one example where you might want, you might say, well, maybe I do want this in a relational is in the case of as it comes time to make specific updates to the order as you're shipping specific items, right?

Starting point is 00:42:29 Instead of having to like load up the entire document with all of the line items in the order, just to update the status of one of those line items to say it's shipped, right? It might be easier in a relational world. Yeah. Depends on like, you know, how big are your average orders, right? Which I think that was something that we talked about in an episode or two ago.

Starting point is 00:42:47 Yep. And again, there's also less conflicts. Like if maybe someone's updating their name or shipping address on the order while someone else is updating the price on a product, it would stink if they had to update the same document and possibly get a conflict there where they're passing the whole object back to save. Whereas in a relational database, you're both modifying just a small piece of that data. Yep. Yeah, I think it's becoming clear to everybody that there is no clear answer. Yeah. Clear as mud.

Starting point is 00:43:14 I definitely default to a relational database. If I'm starting with an app, even though I know it's easier in some cases to go with a document database, my default is still relational database unless I think I'm going to get a ton of data. So the relationship, that relational database isn't going to be feasible or there's some sort of other really good case. And I want to mention too, one thing we didn't really talk about is that if you're doing like a distributed system or you're integrating multiple systems, you're usually going to pass messages around, which are going to be a snapshot of data from one system to another. And you're not going to send like 12 different records and tables. You're going to consolidate that into some sort of shared object that everyone knows how to read and understand.

Starting point is 00:43:53 So we're going to pass that as a message. So that's another case where now, you know, if we had a relational database and we're communicating between different systems, we've got my C sharp code that's taking my data from 10 tables and putting it into you know an object for the user user interacts with it we save it back to those 12 tables now we take that out and that's probably gonna do some sort of object relational uh what's called an orm mapper object relational mapper and then on the way out to the next system we're gonna have to put it back through an object relational mapper again if we get any sort of changes like say that the order is shipped and shipping system sends back a message we might need to have some object relational mapping again in

Starting point is 00:44:32 order to take that object that comes from the shipper in order to update however many tables it needs to. So there's a lot of translation that happens. So it really is a headache. But it's flexible. True. No ambiguity, too. So we mentioned, you know, if you reference the user Joe all over the place, so, you know, maybe delete all comments by the username Joe. Potentially you could delete someone else who has the same name that's actually a different person. So by having a separate key, it makes it really easy to identify which one we're talking about. So, you know, maybe there's multiple Jozaks. I can't really think of a great example here where we might have this.

Starting point is 00:45:10 Multiple JZs. There are two that we know of. There are two JZs in the world. We know of two JZs, and we need to update one of them to be KotlinJZ. Yeah. And, you know, we can't update both of them.

Starting point is 00:45:27 We can't just say like update users where name equals Jay-Z. Yep. And so the key fixes that. Yeah. Yep. So there's, yeah, there's no ambiguity about that.

Starting point is 00:45:38 It's for sure. You're updating the one key that you're passing, uh, updating meaningful values. Easy. Just like we mentioned. So if we need to change, you know, Jay-Z's home state from New York or Florida or whatever,

Starting point is 00:45:48 then there's only one spot to do that. And everywhere that needs to know that is going to have that reflected automatically. Localization support. I thought this one was cool. I don't know. I don't really tend to think about this too much. It was kind of a neat idea

Starting point is 00:46:01 where maybe you have like a status table of user statuses, inactive, active, married, single, whatever. Then you might want to support multiple languages like Spanish, French, English. And so you might have either separate columns there for the different languages, or you might have just a separate table that can swap those values

Starting point is 00:46:17 in, but they have the same shared key. But the meaningful data behind those essentially changes or what is reflected in the UI is changed there. Okay. So I've seen that's something you definitely don't get that with a document database. I don't know that I've ever thought to solve my localization problems like that. Yeah.

Starting point is 00:46:38 Yeah. Well, I haven't had to think about localization in a long time. So. Yeah. And then and but. So, okay. And then, and, but so, okay. So there was that,

Starting point is 00:46:49 but then also when I originally read this, the way I, I was thought you might be going about this is because we've talked about localization as it relates to document databases and as being an advantage to documents because the data was localized there too. So that's where I thought you were going. And I'm like, wait, what?

Starting point is 00:47:04 Oh yeah. I'm talking about, yeah. Language. Yeah. Language support. Yeah. Okay.

Starting point is 00:47:09 Yeah. And the way I've usually done that is, um, in the past is, uh, we'd have some sort of big resource file. So you would have a string in like everywhere that says like resource ID or something.

Starting point is 00:47:19 So all your tables that would need that would have a key to this other document. And what's nice about having it all in one kind of document or one big file is you can send that whole big file over to a translation service to say, hey, I need Russian. And so they would go through this big document and translate all your sentences, all your words, everything. And so you don't have to go and then take

Starting point is 00:47:36 that and throw that back into the hundreds or so tables that need it. Right. The last tip they gave was better search. And I didn't really understand it, but I don't know. Yeah. I mean, yeah, it makes sense here. In a relational database, you can have parent-child data relationship in the same table, right?

Starting point is 00:47:58 So it's easy enough to think about something like an organization, right? So there's the head of the company, then their subordinate, then their subordinate, whatever, right? Because it's basically going to be, hey, my parent is this one, is this one, is this one. So it's easy to query that to get it. Whereas in a document world, that's not so easy. You're going to store the entire hierarchy in one document. Yeah, that's kind of weird. And how are you going to get to level five of that thing, right? Like that's where things get complicated in the document world, whereas in the relational world, it just sort of makes sense. Yeah.
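
One way to picture the parent-child table is a self-referencing `manager_id` column walked with a recursive query. The sketch below assumes a hypothetical five-level `employees` table in SQLite; recursive common table expressions are standard SQL, but the table and names are made up for illustration.

```python
import sqlite3


def chain_of_command(employee_name):
    """Walk from one employee up to the head of the company via manager_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            manager_id INTEGER REFERENCES employees(id)  -- parent row, same table
        );
        INSERT INTO employees VALUES
            (1, 'CEO', NULL),
            (2, 'VP', 1),
            (3, 'Director', 2),
            (4, 'Manager', 3),
            (5, 'Engineer', 4);
    """)
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, manager_id, depth) AS (
            SELECT id, name, manager_id, 0 FROM employees WHERE name = ?
            UNION ALL
            SELECT e.id, e.name, e.manager_id, c.depth + 1
            FROM employees e JOIN chain c ON e.id = c.manager_id
        )
        SELECT name FROM chain ORDER BY depth
        """,
        (employee_name,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Calling `chain_of_command("Engineer")` starts at level five and follows each parent pointer upward, without loading the whole hierarchy as one nested document.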

Starting point is 00:48:41 I did have one other thing to add here. We talked about keys on users. Just one thing to mention here for people who aren't that familiar with normalized databases and all that, it doesn't necessarily mean that there's just one key per record. So I'll go to the localization thing. Cause I think it fits in well, or maybe it does.

Starting point is 00:49:02 So you might have a languages table that would have English, French, Spanish, whatever. Right. And let's say it was one, two, three, one English, two French, three Spanish. Then you might have another table that your status table, like he said, right. Married, single, complicated. So one, two, three there, you might actually in that table that has the, the statuses, you might also have another column for the language ID in there. So you can have what's called a composite key and a normalized database so that basically instead of status being just its own ID, you're going to have a combination of the status ID and the language ID, and those together will give you the name, right? So just be aware, having a key doesn't mean that there's only ever one key in a table. That's not, and actually in some database practices, that is frowned upon, right? Like they don't want you to create just basically what people call them are meaningless keys, surrogate keys.
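
The composite key described here can be sketched like this. The `languages` and `status_labels` tables, their IDs, and the translated labels are all assumptions made for the example.

```python
import sqlite3


def status_label(status_id, language_id):
    """Look up a status label by the composite key (status_id, language_id)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE languages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE status_labels (
            status_id INTEGER NOT NULL,
            language_id INTEGER NOT NULL REFERENCES languages(id),
            label TEXT NOT NULL,
            PRIMARY KEY (status_id, language_id)  -- the two columns together are the key
        );
        INSERT INTO languages VALUES (1, 'English'), (2, 'French'), (3, 'Spanish');
        INSERT INTO status_labels VALUES
            (1, 1, 'Married'), (1, 2, 'Marié'),       (1, 3, 'Casado'),
            (2, 1, 'Single'),  (2, 2, 'Célibataire'), (2, 3, 'Soltero');
    """)
    row = conn.execute(
        "SELECT label FROM status_labels WHERE status_id = ? AND language_id = ?",
        (status_id, language_id),
    ).fetchone()
    conn.close()
    return row[0] if row else None
```

A user row would store only `status_id`; the UI picks the `language_id` at display time, so the same key shows up as a different label per language.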

Starting point is 00:50:08 There's probably a holy war discussion on this that we could... I like it. Surrogate. Surrogate for the win. So here's the deal, right? Like we should at least point this out. You will have database purists say that you should never create a surrogate key, right? You should never have one, two, three, four, five for your key IDs. And they will say you should always use a meaningful key. And then that ultimately has led to companies building databases where the key for a user is a social security number typically stored in clear text. That is horrible. Don't do it. So there's some sort of

Starting point is 00:50:51 middle ground somewhere that needs to be found. But the key is don't use very sensitive information for keys for your data, right? If there is something that makes data unique easily and it's obvious and it's not some sort of personally identifiable information, sure, go for it, right? But there's absolutely nothing wrong with using a surrogate key either if it makes sense. So, I don't know. I took a database class in college and they recommended using social security numbers for users because it just makes sense. Right. It's a natural key.

Starting point is 00:51:24 And a lot of people would use email addresses as keys. And guess what changes a lot and what people lose a lot of, right? Like you change a job, you no longer have that email address. Your company gets bought. You no longer have the same email address, right? Like don't use things that can change as your natural key and don't use sensitive information as your key and i mean there are straight up holy wars on this subject so um yeah i just it's worth talking about surrogate keys are perfectly fine yeah yeah i had a i had a i found a like a comment that i

Starting point is 00:51:59 thought well this is crazy i'm kind of shifting gears for a moment, but we had talked about MySQL, you know, I don't know, a little moment ago. Somewhere. Yeah. There was a quote. I never realized this, but there was a quote in the book where there was talking about like related to like with the relational database, like one of the advantages with the relational database is if you did need to change Joe Zack, for example, right? In the document example that we gave, you might have to go and pull out every comment. Like it would be difficult to find every comment that Joe Zack made because he decided to change his name to J to the z right and then um but if it was in a database then it'd be real quick because you know you're just altering a simple alter statement you know that one value boom done milliseconds

Starting point is 00:52:54 it's done. Dude, you have not done SQL in a minute. You don't alter data. You alter a table; you update data. I'm sorry. I'm sorry. Yeah, it's been a minute. Yeah. Oh, okay. Yeah, I'm sorry, because they were talking about, like, altering the table because you wanted to add a column to it. That's what threw me off. I'm sorry. All right. Sorry. Cool. Yeah, so I was giving a bad example there with this. I am paying attention there. The document DB, at any rate, um, they were saying, like, okay, so, but you know that in a SQL world, Postgres, SQL Server, DB2, whatever, you wanted to

Starting point is 00:53:30 alter a table just real quick, milliseconds to add a new column to it, right? Except that's not true for MySQL. And I didn't know this, but they say that MySQL is a notable exception. It copies the entire table on alter, which means that it

Starting point is 00:53:46 can take minutes or even hours of downtime altering a table if it's large enough. That's really interesting. Hey, you know, there are various, I'm sorry to interrupt, but there are various tools that exist to work around it. But yeah, I didn't know that. I didn't know that either. You know what else does that though? If you alter a table in SSMS, so SQL Server Management Studio and hit save there, it actually does a select into. You know, I learned this early in my career. Oh, you mean rather than writing the query yourself, using the wizard to do it? Yeah.
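
For anyone who mixed up the two statements, here is a small sketch of the difference: ALTER changes the shape of the table, UPDATE changes the values in it. It uses an in-memory SQLite database and a made-up `users` table, and it says nothing about how fast either statement runs on MySQL or SQL Server.

```python
import sqlite3


def alter_vs_update():
    """Return (column_names, row) after one schema change and one data change."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("INSERT INTO users VALUES (1, 'Joe Zack')")

    # ALTER TABLE changes the schema: a new, initially NULL column appears.
    conn.execute("ALTER TABLE users ADD COLUMN home_state TEXT")

    # UPDATE changes data: the values stored in existing rows.
    conn.execute(
        "UPDATE users SET name = 'Jay-Z the Evangelist', home_state = 'Florida' "
        "WHERE id = 1"
    )

    columns = [info[1] for info in conn.execute("PRAGMA table_info(users)")]
    row = conn.execute("SELECT id, name, home_state FROM users").fetchone()
    conn.close()
    return columns, row
```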

Starting point is 00:54:14 So if you are one of those people that's always right-clicking and designing a table in SQL Server Management Studio and you hit save there, it's actually writing all the data to a new table to make that stuff happen. So if you want it to happen in milliseconds like what Outlaw just said, write the alter statement. It's much safer. Well, it's safer and quicker. The next chapter of this book I want to mention, it dives into the

Starting point is 00:54:40 underlying structure of SQL Server and databases like it. I like it. I think it's probably maybe my favorite chapter. I don't know, though. I don't know. They're all pretty good. All pretty good. This episode is sponsored by Datadog, a monitoring platform for cloud-scale infrastructure and

Starting point is 00:54:56 applications. Datadog provides dashboarding, alerting, application performance monitoring, and log management in one tightly integrated platform so you can get end-to-end visibility quickly. Visualize key metrics, set alerts to identify anomalies, and collaborate with your team to troubleshoot and fix issues fast. And they have integrations for just about any technology you would want. We've talked about Kubernetes.

Starting point is 00:55:23 They have integrations for Kubernetes. Kafka or Kafka, whichever you prefer, they've got your favorite, whichever one you decide to call it, they've got integrations for that. Name a database platform. They've got it. Whatever you want to do, they've got an integration for you that can help you to monitor that platform and be informed of what's going on in your environment. And I always like to point out their blog because I think it shows a lot about the product and also the cool things you can do with it that may not necessarily kind of spring to mind when you think about the product. And so I wanted to specifically mention

Starting point is 00:55:59 the article on, what's it called? Basically extracting metrics from your logs for viewing historical trends and actually tracking against SLOs, service level objectives, which we talked about before. So go check that blog post out and just see how they wrote it up and the visualizations that they used, just pulling from like plain old log files. It's a really cool blog post, and it shows you a lot about what they could do for you. Yeah, and we'll include a link to that in the show notes. And, you know, while you're there, go to Datadog. You can try it yourself today by starting a free 14-day trial and also receive a free Datadog t-shirt when you create your first dashboard. So, yeah, again, head over to www.datadog.com slash coding blocks to see how Datadog can provide real-time visibility

Starting point is 00:56:46 into your application. Again, that's www.datadog.com slash coding blocks to sign up today. All right. So I would like to ask you to leave us a review because we really love it. It's really important to us. It helps us grow the show. It helps us keep our egos afloat in times of trouble. So we're really relying on you for a lot here. We try to make it easy for you. So if you go to codingblocks.net slash review, we've got all the links there and we'll help guide you to a place. You don't have to install anything crazy, places like Stitcher or Podchaser. You just sign up for a basic account. So if you could do that for us, we'd really appreciate it. Now, is keeping our egos afloat really the reason? No.

Starting point is 00:57:30 I don't think so. That's not the reason. Hey, we've done this bag 124 times. We've got to come up with new things. That's a lot to ask. Okay. All right. Well, we're about to get into my favorite portion of the show, but you know, how about if we like give a joke?

Starting point is 00:57:46 Do you want a joke? I do want a joke. All right. So I got, I got several, I got a backlog here. And this will be in large thanks, large part due to Mike RG and Arlene who hooked me up with a bunch of jokes. So Arlene said, you know, Hey, don't be worried about your, you know, because we just had Christmas, right? And like smart TVs, there's that whole story about the smart TV spying on you.

Starting point is 00:58:17 Did you hear about that? Oh, yeah. Yeah. Okay. There were lots of articles on that. Yeah. The FBI even came out and said something about it. Yeah.

Starting point is 00:58:24 So Arlene wrote in, she's like, Hey, don't worry. Don't be worried about your smart TV or smartphone spying on you. Your vacuum cleaner has been gathering dirt on you for years. That's awesome. All right.

Starting point is 00:58:40 So with that, we head into my favorite portion of the show. Survey says. All right. Let's see. Back a few episodes, we asked, with the new year coming, what kind of coding resolution do you plan on setting? I plan to learn dot, dot, dot. And your choices were a new language, like Rust, Go, or low-code.

Starting point is 00:59:11 No, seriously, low-code. A new JavaScript framework like React or Angular, but probably ExtJS. Or infrastructure things like Docker or Kubernetes. Is virtual PC still a thing? Or higher level concepts like machine learning and AI so I can prepare myself for Skynet. Or more about an OS, maybe a new OS, or just get better with a current one.

Starting point is 00:59:42 Or streaming data solutions like Kafka or Kinesis or Kafka. Kafka. Depends. Remember, it depends on where you're from. What part of the country are you from? Or search solutions like Elastic or Azure Search. I don't know. I need to Google it some more.

Starting point is 01:00:06 Or algorithms. I need to go back to basics. How does Bellman Ford work again? Or data structures, because I want to go way back to basics. Or lastly, all about cloud services. I hear AWS is a thing. So, Joe, how about you go first? Which one do you think is it?

Starting point is 01:00:33 You know, this stuff, there's a lot of really good ones in here. But I think that I'm going to say infrastructure. I think this is the year of infrastructure. Okay. Infrastructure. Docker, Kubernetes, virtual PC. Okay. Virtual PC.

Starting point is 01:00:47 Yep. Well, what's your percent? 26%. 26. Okay. That's pretty high. I'm going to, I really don't know. There's some good ones in here.

Starting point is 01:01:03 I'm going to say higher level concepts like machine learning and AI so I can prepare myself for Skynet. I'm going to go with 14%. All right. Kubernetes or virtual PC at 26% versus higher level concepts like machine learning and AI at 14%. How about a joke first? And we'll be back after these messages. Why, this is from Mike RG, why do programmers prefer dark mode? They haven't seen the light.

Starting point is 01:01:50 So they can see sharp? One thing. Because light attracts bugs. Oh, man. I like it. And Joe wins. Really? Yeah.

Starting point is 01:02:02 Infrastructure, things like Docker and Kubernetes was the top answer. More than 26%. Yes. It was 48% of the vote. Wow. Okay. So my talk at Orlando Code Camp is going to be on Kubernetes. Yeah.

Starting point is 01:02:18 Changing it up. Yeah. Wow. Yeah. Yeah. Awesome. But it makes sense, though. I mean, like.

Starting point is 01:02:30 I honestly, so I thought it was going to be one of those two, but machine learning is such a hot topic. I thought that one would have crawled up. Is it number two on the list? It is not. It is not. No. Somebody's going to learn low code. I'll tell you what number two is.

Starting point is 01:02:40 Low code. 'Cause I've got my finger on the pulse. Uh, it is algorithms. No, we said it was low code. Yeah, it's low code. Yeah, no, it was learning a new language. Yeah. It was number two.

Starting point is 01:02:55 Now, the higher level concepts like machine learning and AI was number three. How many percentage? Uh, 36% for, um, the new language and 33 for higher level. So it was almost predominantly everything was in those three categories. Wait, these percents aren't adding up. I was going to say,

Starting point is 01:03:19 'cause that's almost a hundred. Was it a multi-select? Oh, it was, it was. That's why. Okay. I was like, wait a minute.

Starting point is 01:03:30 What? My maths aren't good. It's been a while since I've been in a math class, but I could have swore that basic addition I was decent at. Okay. But still, those three are the top three. Those three are the top three. Okay, cool. I like it.

Starting point is 01:03:47 So, yeah, I'm definitely going to do a Kubernetes talk then at Orlando CodeCamp. And I will submit that talk tomorrow. So, you know, my New Year's resolution was basically to dedicate time and kind of pick ahead of time what I was going to spend time and focus on. I started off this year with Kubernetes. I'm like, all right, that's it. I'm going to do like five hours of dedicated Kubernetes practice and research. And I did, and it's awesome. And unfortunately, five hours is not enough to be an expert in it.

Starting point is 01:04:13 Apparently, there's a lot more. So I don't know. I'm going to throw another two at it and see what happens. Five barely makes you functional. I know cube cuddle. Hey, that's 40% right there. That's a 40% increase. That's hilarious. All right. Um, all right. Well, what do you say? Uh, another joke.

Starting point is 01:04:34 Sure. So, um, this one is also from Arlene and she says that a developer accused of unreadable code refuses to comment. Very good. All right. So today's survey should be no surprise. This is the highly anticipated survey of the year. We said we would do it in the last episode

Starting point is 01:05:05 Which keyboard do you use? Or I could say it like keyboard and say keyboard. So which keyboard do you use? And your choices are, and now some of these, you know, we're just going to like splat it so that it's like, you know, everything within that company, because some of the variations don't really matter. Right. Right. So anything Code keyboard. Right. So Code keyboard splat.

Starting point is 01:05:33 I don't care which individual model. Same for the Das Keyboard. ErgoDox. And then these I will have slight differences for: the Kinesis Advantage splat, so, you know, I don't care if you have the Kinesis one versus the Kinesis two carbon, that doesn't matter. The Kinesis Freestyle splat. Or the Apple Magic Keyboard or the Apple Aluminum Wired Keyboard. I wasn't sure if there was like a real difference there. I was kind of thinking like maybe there is.

Starting point is 01:06:11 Battery life, man. I've heard horrible things about the Magic Keyboard. Yeah, well, I was actually kind of torn about how to word this one because there's the Apple Aluminum Wired ones. But then based off of the same aluminum one, there was a wireless version of it that had like replaceable batteries. And then there's the Magic Keyboard that you can't replace the battery in. Ah, okay. So I don't know. That's good enough. If you have the old school aluminum Apple one, then just pick the wired one.

Starting point is 01:06:45 You know, 'cause that's kind of the same. Uh, the Microsoft Sculpt ergonomic keyboard, which, um,

Starting point is 01:06:55 I believe someone's a fan of that one. It looks like a manta ray. It's amazing. Then there's the Microsoft Surface keyboard, which if you are a fan of the Apple keyboard and want something like that for a Windows platform, welcome to the Microsoft Surface keyboard. And then we had a write-in from Kevin who commented because he found on the resources page where we had recommended the Microsoft Sculpt keyboard. I think that was a recommendation from Alan in the past. And he was like, oh, you should check out the Ultimate Hacking Keyboard.

Starting point is 01:07:31 And then this is where I ended with, okay, so fine. So maybe I hit all the big ones that you hear a lot of talk about, but I figure like, okay, fine, there'll be a category for other mechanical keyboards. So if you're like, but hey, I have a Corsair mechanical keyboard, you didn't include that specific model, or you didn't give me the Cherry switches or the brown switches. Yeah, yeah. You know, you didn't include my Razer mechanical keyboard. Then it's like, okay, fine, you're in that category, and it'll be good enough, right? Or you have some other chiclet keyboard that didn't get included, so there's a category for that. And then this one I did for Alan: there's the other other keyboard. The other other. There you go. So that's the category for if you, let's say, you like

Starting point is 01:08:23 have some kind of quiet keyboard, you know, like, oh dude, I love my whisper quiet keyboard that came with my Dell, right? You know, that's what that one is. It's neither a chiclet nor is it a mechanical. And then the one that came with my laptop, because ain't nobody got time for carrying around fancy keyboards. So I'm kind of looking forward to this survey. I'd like to see how ridiculously priced some of the responses are going to be, because I'm going to have forgotten some. Somebody's going to be like, but Outlaw, you forgot this keyboard that I love. Right, yeah. You could only get it on like a Kickstarter.

Starting point is 01:09:07 You had to be like one of the first 500 people to get it. It was $1,000. But here's what you can do with it. Every key is independently movable on your desk. Hey, and I don't know if you guys are a part of the gear channel in Slack. But you should be. There is a lot of keyboard love that has happened in there over the years, man. Like, dude, some people

Starting point is 01:09:27 have some wicked setups. I can't wait to talk about this one in more detail. But I don't want to like, I feel like I've already given too much talking about one of those in particular. But yeah. So, one last joke

Starting point is 01:09:44 before we wrap up this section. How about that? You feeling up to it? I like me some jokes. So Mike RG shared this one with me. This is just good advice for anyone, by the way. Do not, I repeat, do not use beef stew as a password. Okay.

Starting point is 01:10:09 It's not stroganoff. Oh my gosh. Get out of here with that. That's way funnier than it should have been. Your delivery was spot on, man. So thank you, Mike and Arlene, for sharing this with us. Wonderful.

Starting point is 01:10:32 This episode is sponsored by Educative.io. So every developer knows that being a developer means constant learning, new frameworks, languages, patterns, and practices. But there's so many resources out there, where should you go? Meet educative.io. Educative.io is a browser-based learning environment allowing you to jump right in and learn as quickly as possible without needing to set up and configure your own local environment. The courses are full of interactive exercises and playgrounds that are not only super visual, but more importantly, they're engaging.

Starting point is 01:11:06 And the text-based courses allow you to easily skim the course back and forth like you would in a book or blog article. There's no need to scrub through hours of video just to get to the parts you care about. Now, here's the thing. All of their courses have free trials and a 30-day return policy, so there's no risk to you. You can try any course. You can use our special link. You can get 10% off the course. And if you don't like it, hey, there's the 30-day return policy. But they've also introduced subscriptions. So now, sure, you could go and get that 10% off of one course, or you can get an additional 10% off of the subscription,

Starting point is 01:11:45 which is basically like getting a discount on every course that they have by going to our special link, educative.io slash coding blocks, where you can learn more about the great subscription options that they have. And I've mentioned several times the course that I took, Grokking the System Design Interview, that lines up really well with this book, but I didn't really talk much about how I used that course. And I thought it was kind of interesting. I did talk to someone about it recently. What I did with those examples, like a Pastebin or a Twitter, is I would read through the description of the product, which is like a Twitter or Pastebin or GitHub or something that I already knew.

Starting point is 01:12:20 And I would think for a few minutes about how I would design that system. Then I would read how they approach the problem in this course because they break it down into different sections and different services. And then I would go back through and try to explain the architecture

Starting point is 01:12:35 after I knew how it worked in my own words. And I found that to be a really effective tool and it gave me a lot of perspective on these services and a much bigger appreciation for it. And I know in this particular course, Grokking the System Design Interview, you can actually access some of the chapters just for free. Just open it, you don't even have to create an

Starting point is 01:12:54 account, I don't think. So I definitely recommend checking it out if that's something you are even remotely interested in. And they've got Instagram and TinyURL available for free. Very nice. So you can start your learning today by going to educative.io slash coding blocks. That's educative, E-D-U-C-A-T-I-V-E dot I-O slash coding blocks and get 10% off any course or an additional 10% off a subscription. All right. So now we get to talk about something really exciting that we haven't gotten to or we didn't discuss in the other section, which is a big benefit of relational databases that a lot of people forget about and I definitely forget about, which is the query optimizer.

Starting point is 01:13:42 And just like we mentioned before, there are legitimate reasons for having denormalized data in your relational database. Like we talked about having a snapshot of the data that you place with your order because it's important for historical reasons. There's also other reasons you might do it, like for example, performance, or when the value really matters,

Starting point is 01:13:59 like you might store the ultimate value of your order that was placed, add it up once, never to be added up again, so that every time you run reports on orders, you don't have to sum up all those products. And you don't want to have to deal with weird things like penny problems or whatever, if any sort of sale affected the ultimate price. It's very important that those pennies match when you're dealing with the credit card agencies or whoever is getting those payments. So it's important for security. It's important for reporting. It's just important sometimes to denormalize your data. And I'm going to say this is not the usual use case. So even though we do call it out, I would say, I don't know, maybe 5% of your use cases are going to deal

Starting point is 01:14:44 with denormalized data unless you're doing something pretty funky. Think it's more? Depends on the hammer and nail problem. If you're one of those people that works at a company where they're like, we got SQL server, that's our tool, that's what we're doing, then you might have read-only databases you've created that are nothing more than denormalized tables, right?

Starting point is 01:15:09 So I don't know. I think that calling out a percent is rough on this because I've definitely worked at spots where everything happens in SQL Server. They've got reports. They've got whatever, right? Like it's all there, and there are processes built around things to denormalize data at certain points of the day and all kinds of stuff. If you say the word materialized or denormalized at least once a day, then, hey, you're my people. Fist bump.
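Show-notes sketch: the "add it up once" idea above can be illustrated with a small SQLite example. This is only a minimal sketch under stated assumptions: the table names (`orders`, `order_items`), the column names, and the `place_order` helper are hypothetical and not from the episode or the book. Prices are stored as integer cents to sidestep the penny problems mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        total_cents INTEGER          -- denormalized: summed once at checkout
    );
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(id),
        product_name TEXT,           -- snapshot of the name at purchase time
        unit_price_cents INTEGER,    -- snapshot of the price at purchase time
        quantity INTEGER
    );
""")

def place_order(conn, order_id, items):
    """Store line-item snapshots and the precomputed total in one transaction.

    `items` is a list of (product_name, unit_price_cents, quantity) tuples.
    """
    total = sum(price * qty for _, price, qty in items)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
            (order_id, total),
        )
        conn.executemany(
            "INSERT INTO order_items VALUES (?, ?, ?, ?)",
            [(order_id, name, price, qty) for name, price, qty in items],
        )
    return total

place_order(conn, 1, [("keyboard", 12999, 1), ("keycap set", 2550, 2)])

# Reports read the stored total instead of re-summing the line items.
stored_total = conn.execute(
    "SELECT total_cents FROM orders WHERE id = 1"
).fetchone()[0]

# The normalized data still agrees, which is the invariant you must maintain.
recomputed = conn.execute(
    "SELECT SUM(unit_price_cents * quantity) FROM order_items WHERE order_id = 1"
).fetchone()[0]
```

The trade-off is the usual one for denormalized data: reads get simpler, but every write path now has to keep the copy consistent.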

Starting point is 01:15:37 Well, but you might even keep it in SQL Server, but then you might have a process that runs on some batch, maybe, or whatever, that would take that out, flatten it, and send it off to, you know, build a search engine index that's outside of your system.

Starting point is 01:16:02 Oh, maybe, but I've totally seen it where you're not shipping that data off. You're putting it in another table, right? And then you index the who you out of that table, the who is that a technical term? That is a technical term. We ran across that one. So you index that thing eight ways from Sunday to get it to work, right?

Starting point is 01:16:19 So, I mean, yeah, totally. There are situations where people do that. They ship the data off and use it in other places. But I've seen way too many times where they're like, no, no, the database is my tool. This is where it shall live. That's where it lives. Well, it is a good tool. And one reason that's a good tool and another good reason that's a really good tool is the query optimizer that most modern big relational databases have.

Starting point is 01:16:49 And the reason that it's important is because let's consider the converse first. If you didn't have a query optimizer, which we'll define in a minute, then the performance of the query you're running might vary wildly based on how you happen to write that query. Like the table you select from and then the tables that you join to might have a big performance impact depending on whether you, say, use the bigger table first or the smaller table first. Also things like the order that your where clauses, the conditions and filters and things like that, get applied. And even when the sorting gets applied in your query. You can imagine if you're writing a SQL query, you change the order of items in your where clause, and the performance was dramatically different. Like that would be

Starting point is 01:17:36 the world we were living in if we didn't have a query optimizer. And what the query optimizer does is it's kind of a part of applications, like a little module that runs. And whenever you send it a query, it's responsible for taking a look at the metadata and statistics that it keeps about your data. And it's supposed to figure out the best way to execute this query in order for it to be the most efficient in terms of resources and speed. And anyone who's worked with... And what? In theory. In theory, yeah. And anyone who's worked with databases for a while knows sometimes you have to give a little nudge or a little help

Starting point is 01:18:10 or a little hint here or there in order to get it to go right. But overall, it does a pretty good job. In most cases, you don't have to worry about it. And what's really cool about that is that your data changes. Sometimes tables grow at different rates, like table A may grow much faster than table B. Sometimes you add new indexes to table C, or sometimes you add columns. And so you can imagine if you had to go back to each of the queries that you've ever written, any query that could ever run the system, and take a look at how it runs because you added a new column. Like that would be insane, but that would be the world you were living in. You'd have to be evaluating stuff even with no schema changes.
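Show-notes sketch: one way to watch an optimizer at work is SQLite's `EXPLAIN QUERY PLAN`. The example below is a minimal sketch, not a claim about how SQL Server or Oracle behave; the `orders` table, its columns, and the index name `idx_orders_customer` are made up for illustration. It checks two things discussed above: reordering the conditions in a where clause does not change the chosen plan, and adding an index changes the plan for a query whose text never changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, total_cents) VALUES (?, ?, ?)",
    [(i % 50, "open" if i % 3 else "closed", i * 100) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable plan steps (last column of EXPLAIN QUERY PLAN)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Same question, conditions written in a different order.
q1 = "SELECT total_cents FROM orders WHERE customer_id = 7 AND status = 'open'"
q2 = "SELECT total_cents FROM orders WHERE status = 'open' AND customer_id = 7"

# With no index available, the only option is a full table scan.
plan_before = plan(q1)

# Add an index. The query text stays exactly the same.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

plan_after_q1 = plan(q1)
plan_after_q2 = plan(q2)

print("before index:", plan_before)
print("after index: ", plan_after_q1)
print("reordered:   ", plan_after_q2)
```

The exact wording of the plan text varies between SQLite versions, so treat the printed output as something to read rather than to parse.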

Starting point is 01:18:46 Just if one of the tables grew out of sync with the others, you might have to take a look at how things are running and figure out if you need to rejigger your query in order to run well. I'd never actually thought about it that way, but that's a really good point of why the optimizers

Starting point is 01:19:01 are such a big deal: queries you've already written that you hopefully don't have to think about again, right? Yep. The query optimizer is basically a generic tool that you just kind of get for free on all your queries. And yeah, sometimes you have to nudge it. It doesn't always work great. Sometimes you have to know more about it than you want to. But overall, it does its job pretty well. And for the big players like the SQL Servers and Oracles, that's kind of their secret ingredient. That's the magic that's made them survive over the years, because it performs really well and it has, you know, some advanced features that also

Starting point is 01:19:36 perform really well. But, you know, you made that comment that you would hope reordering your where clause shouldn't affect your performance. But I've definitely seen cases where it's like, oh, hey, if you were to put what you have in the predicate there as part of the join clause, it would make it better. Or maybe vice versa, like all of a sudden your query performs way better. Yep. It's not the case there's like an or. An or in the join? Right.

Starting point is 01:20:11 Yeah, because sometimes I feel like when your system gets to the point where you have to like really understand what the optimizer is doing in order to like to make that thing perform, then that should be a code smell that something's wrong. Or that you're at the outer reaches of that technology. Right. That's what I'm saying. Yeah.

Starting point is 01:20:32 That's the code smell. Yeah. That's definitely the point. Have you guys ever felt proud that you tricked the query optimizer? No, I hate it every time. Yeah. I mean, after you spent four hours trying to figure out why it wasn't working, then you're like, wow, I wrote a query so complicated that their algorithm that they've probably spent millions of dollars on can't make this good. I win.

Starting point is 01:20:58 Yeah. Yeah. And there's a lot of catches if you deal with one, like I know SQL Server pretty well. So parameter sniffing is a common problem where different arguments kind of cause use cases to go a little crazy. Also, stored procedures tend to be compiled when you save them. And sometimes the plan that it saves in order to run as a performance optimization ends up getting out of date. And so you have to kind of refresh those. And there's all sorts of tips for that.

Starting point is 01:21:24 Then, Outlaw, the one I was thinking of is if you do an OR clause in an inner join. If you do that in the ON, a lot of times it performs terribly. It's kind of a known terrible thing. But if you UNION that thing, which seems like it would be so much slower, it tends to run so much better when you have a lot of data. Yeah, I mean, the fact of the matter is the SQL language has gotten so big and full-featured that you can do stuff that, I mean, how would you write an optimization engine that could handle everything, right? It's not just looking at the query, it's also examining statistics on tables. It's examining how much data is on the tables. It's looking at the indexes available for tables. So it's having to create this entire matrix of things to try and come up with, okay, I think this is the best order to run these things in. Right. And sometimes it's not going to get it right. And then that's when you have to

Starting point is 01:22:19 work magic. And that's when like to outlaw's point a second ago, you might have pushed it past what it should be doing now, right? And that's when you might be looking at other technologies to assist you. And it's funny, too, is if you have dev and prod environment, something might run great on dev because the ratios of data are different. And then you go out to prod and it runs terribly because the shape of that data and the size of those data don't match dev rather. Yeah, everything is fast for small n. Dude, you could even have the exact same data in two places and simply have a fragmented index in one of them and your query goes from half a second to an hour.
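Show-notes sketch: the OR-in-a-join rewrite mentioned a moment ago looks roughly like this. It is a minimal sketch with hypothetical `customers` and `leads` tables, run on SQLite rather than SQL Server, and it only demonstrates that the two forms return the same rows. Whether the UNION form is actually faster depends on the engine, the indexes, and the data volume, so measure before assuming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, phone TEXT);
    CREATE TABLE leads (id INTEGER PRIMARY KEY, email TEXT, phone TEXT);

    INSERT INTO customers VALUES
        (1, 'a@example.com', '555-0001'),
        (2, 'b@example.com', '555-0002'),
        (3, 'c@example.com', '555-0003');

    INSERT INTO leads VALUES
        (10, 'a@example.com', '555-0001'),  -- matches on email and phone
        (11, 'x@example.com', '555-0002'),  -- matches on phone only
        (12, 'c@example.com', '555-9999'),  -- matches on email only
        (13, 'z@example.com', '555-8888');  -- matches nothing
""")

# Form 1: a single join with OR in the ON clause.
or_join = """
    SELECT c.id, l.id
    FROM customers c
    JOIN leads l ON c.email = l.email OR c.phone = l.phone
"""

# Form 2: one simple equality join per condition, combined with UNION.
# UNION removes duplicates, so a pair that matches on both columns appears
# once. That is only safe because we select the keys identifying each pair.
union_join = """
    SELECT c.id, l.id FROM customers c JOIN leads l ON c.email = l.email
    UNION
    SELECT c.id, l.id FROM customers c JOIN leads l ON c.phone = l.phone
"""

or_rows = sorted(conn.execute(or_join).fetchall())
union_rows = sorted(conn.execute(union_join).fetchall())
print(or_rows == union_rows, or_rows)
```

The intuition behind the rewrite is that each half of the UNION is a plain equality join that can use its own index, while the OR form can make it harder for an optimizer to pick a single good access path.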

Starting point is 01:23:05 There's just so many variables at play. Just know that the optimizer is there to help you. It's not a simple thing. And once you run into it, you'll know because you'll spend hours on it. You've convinced me. Document database, all the things. I'm sold. So relational databases like Microsoft or Oracle has put a ton of time into this optimizer and everyone benefits like forever on.

Starting point is 01:23:31 And so even if you mess with stuff and you remember the times you mess with it, at least me personally, I would put that as like the 1% of the time. I don't even have to worry about that stuff most of the time. But when I do, it's terrible. Don't get me wrong, it's finicky. But the point is that you've got this one great tool that applies to everything in the database, and it works out really well. And with the document database, because so much of the design and that work and knowing about your domain and your use cases is on you, you can't have just a general, generic optimizer that can operate in the same way. Sure, you can have indexes, you can have

Starting point is 01:24:05 statistics you could have some smarts under the covers but it just doesn't have the same level of flexibility and the ability to change those plans so much because at the end of the day you're mostly just kind of you're either fetching stuff by key and getting the whole document back or you're doing some basic kind of map reduce type things where you're kind of splitting up and searching through data at all, kind of a very simple level compared to the types of relationships that are supported in relational databases. That's really interesting. I mean, it's a great point, too.

Starting point is 01:24:33 Yeah, and that's something I never really considered. I just kind of think like, you know, before reading this book, to me, it was always like, does it need to scale? Then document. Does it need relationships? Then relational. But now I kind of have a little bit better understanding of the reasons behind those things. It makes sense to me. So the next section is on how to choose document for,

Starting point is 01:24:53 uh, versus RDBMS. And we've kind of been talking about this all the time, but I tried to kind of boil it down into a couple, um, somewhat simple rules. So document databases really shine when you need schema flexibility. Like we talked about the logging example

Starting point is 01:25:08 where maybe you've got different sources kind of logging in and you don't want to break or prevent or try to fit data that doesn't necessarily match between two different systems, especially if you've got a tool that can handle that well.

Starting point is 01:25:22 You know one thing I've always thought about and I don't know why this particular thing is always in my head when it comes to, to document databases is I think about product catalogs and I think about motherboards, you know, how many connections. So random it is,

Starting point is 01:25:38 but here's the thing. Like if you ever go to shop at what you guys both did to build a computer, you're looking for certain things like Thunderbolt ports or whatever, right? That's stuff that gets added every year. Like every motherboard that comes out has different connectors. You know, it went from USB one to USB two to three to USB C to Thunderbolt to whatever. And motherboards are just something that they're constantly bolting new technology onto. And in a relational database world, that's a bit of a nightmare unless you're doing the EAV schema. But in a document database, you could literally just say,

Starting point is 01:26:18 here's my, you know, here's a new property. I have Thunderbolt. Yes. Right. And it's, and that kind of thing. I don't know why it's always been that, but I guess it's because motherboards are the biggest pain in the butt to decide on when you're trying to build a computer because you want certain functionality, right? Like you want a fast bus,

Starting point is 01:26:39 you want this, you want that, whatever. So at any rate, yeah, that's the one that seems better. Like, especially if you're trying to build a PC, you need to be able to look up and say, does this have X USB-C or whatever? But

Starting point is 01:26:49 imagine being like an Amazon or a B&H, and now you sell motherboards and cameras and bicycles and skateboards. And so that's the place where a document database really shines, because you've got these things that don't match up. And if a user searches and says, you know, what's a good example? Like mSATA, right? I shouldn't get bicycles. We all agree. It should just be whatever products do have mSATA, those should be returned. And that works really great. If you had a relational database that tried to map out all those relationships between every possible combination of things, that would be miserable. It's never going to work. So I can say with some confidence that the search on Amazon is using some sort of document

Starting point is 01:27:29 database or a search engine. I'm glad you said the search part because when you were talking about that at first, I'm like, well, okay, if you pictured the product page, you probably do want to have some consistency and know what fields are there. I'm expecting there's going to be a title for this product. I'm expecting that there's going to be a description. I'm expecting there's going to be a, you know,

Starting point is 01:27:49 a price, you know, things like that. But all of these other attributes, which is, you know, all they are that you're describing, we're just tacking on stuff.

Starting point is 01:27:58 Like when I think about the power of just being able to like add those in on the fly, then to me, that's like, um, like faceted navigation or guided navigation. I've heard it termed in so many different ways. But as you're searching, like Newegg is great about this, right? You go there, you search for something, and then there'll be all the facets that you could like possibly change the results by, you know, like, you know, checkboxes. And it's like, oh, hey, did you want 1080p?

Starting point is 01:28:28 Because we have 8 billion of them that are 1080p. But if you're looking for like 8K, well, we have two. So if you click on this magic button, boom, we're going to filter it right down to just the two that are 8K. But to Joey's point, like that doesn't make – resolution doesn't make sense for a skateboard, right? So now you're like, oh, but no, I want to use skateboard. And you're like, oh, did you want a birdhouse skateboard or a blind skateboard? You're like, well, of course I want a birdhouse, right? Well, I think to take it just one little tiny step further in terms of just why it matters so much when we're talking about the document database versus the relational, though,

Starting point is 01:29:06 is in this whole schema flexibility. Like I said, Thunderbolt, right? Like it's just something that's recently been added to new motherboards. You might have that checkbox on the side of the page that you say, hey, yeah, Thunderbolt matters to me, right? In the document database world, you didn't have to go back in and update every one of the old motherboards in the system to say Thunderbolt. No, you could just have a little toggle in your app that says, hey, give me everything where Thunderbolt has something that says yes or it has a number greater than zero. And then you get them back. Right. In the relational world, you might have to do something sort of special or you're going to have to modify your queries to do something special. It says, hey, where exists this attribute, blah, blah, blah, blah, blah.
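Show-notes sketch: here is the Thunderbolt example in miniature, once as documents and once as a relational key/value (EAV) table with a "where exists" query. Everything in it is assumed for illustration: the field name `thunderbolt_ports`, the `product` and `product_attribute` tables, and the sample products. The documents are plain Python dicts standing in for whatever a real document database would store.

```python
import sqlite3

# Documents: older products simply lack the newer field. No backfill needed.
products = [
    {"id": 1, "title": "Older motherboard", "usb_ports": 6},
    {"id": 2, "title": "Newer motherboard", "usb_ports": 8, "thunderbolt_ports": 2},
    {"id": 3, "title": "Skateboard", "brand": "Birdhouse"},
]

def with_thunderbolt(docs):
    """IDs of documents whose thunderbolt_ports field exists and is > 0."""
    return [d["id"] for d in docs if d.get("thunderbolt_ports", 0) > 0]

doc_ids = with_thunderbolt(products)

# Relational EAV variant: fixed columns in one table, everything else as
# (product_id, name, value) rows with the value stored as text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE product_attribute (product_id INTEGER, name TEXT, value TEXT);
""")
for doc in products:
    conn.execute("INSERT INTO product VALUES (?, ?)", (doc["id"], doc["title"]))
    conn.executemany(
        "INSERT INTO product_attribute VALUES (?, ?, ?)",
        [(doc["id"], k, str(v)) for k, v in doc.items() if k not in ("id", "title")],
    )

# The "where exists this attribute" query from the discussion above.
eav_ids = [
    row[0]
    for row in conn.execute("""
        SELECT p.id
        FROM product p
        WHERE EXISTS (
            SELECT 1
            FROM product_attribute a
            WHERE a.product_id = p.id
              AND a.name = 'thunderbolt_ports'
              AND CAST(a.value AS INTEGER) > 0
        )
        ORDER BY p.id
    """)
]
print(doc_ids, eav_ids)
```

Both approaches avoid touching old rows when a new attribute shows up. The difference is where the awkwardness lands: the EAV version needs a subquery and a cast per attribute, while the document version pushes the "field may be missing" handling into application code.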

Starting point is 01:29:51 Right. So there is a difference in how you have to handle that data in two different places. But see, this is where it gets a little messy, though, in this description, because even in the example that you just gave with the motherboard, right? If in that kind of document world, now you've got to go through each one of those documents to, okay, I've got 8 billion products that are all in documents. And Alan said he cares about the ones that have Thunderbolt 2. Now I got to go and query each one of those to look for that one specific field in the document. And those are the only ones I want to get back. So I view these documents as a great way to feed an index, right? Because then you can have just like arbitrary data that's going to come in and it figures out how it's going to index it. But for the actual search

Starting point is 01:30:50 mechanism itself, I don't assume that it's keeping it in that document format though. Go ahead. I was going to say the next chapter specifically dives into this particular use case. But one way to kind of approach that with a relational database is say we're going to have a simple table with keys and values like string, string, and then we can group those things together. Problem is with such a wide variety of things, you basically, potential results, you have to do a lot of work when you're selecting that stuff in order to find the ones that have the most commonality in order to kind of show in that in that facet of navigation. But what document databases like search engines like Elastic or Solr or Lucene does

Starting point is 01:31:29 is basically store that stuff out in indexes. So when you bring your data in, it stores it as a whole document and it retrieves it as a whole document, but it maintains a separate set of data about those documents that kind of keeps those facets, those facets separated. What am I saying? Those faceted separately and searchable separately, and then it's able to cross correlate those really fast.

Starting point is 01:31:53 Right. So it's highly optimized. Right. So it's almost like indexing at the row level rather than the column. Right. So, so for the Thunderbolt two type thing, you'd actually have an index that was Thunderbolt two and then under it, it would have a link to all the documents that had it, right?
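
A minimal sketch of that reverse (inverted) index idea, assuming toy documents and a plain in-memory dictionary. It is not how Elasticsearch or Lucene lay things out internally, just the shape of the lookup: each field and value pair points at the set of document IDs that contain it.

```python
from collections import defaultdict

# Hypothetical documents keyed by ID; fields vary from document to document.
docs = {
    101: {"type": "motherboard", "thunderbolt": 2, "brand": "Acme"},
    102: {"type": "motherboard", "brand": "Acme"},
    103: {"type": "motherboard", "thunderbolt": 2, "brand": "Globex"},
}


def build_inverted_index(documents):
    """Map each (field, value) pair to the set of document IDs that have it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for field, value in doc.items():
            index[(field, value)].add(doc_id)
    return index


index = build_inverted_index(docs)

# "Thunderbolt 2" is one index entry with links to every matching document.
thunderbolt_2 = index[("thunderbolt", 2)]

# Cross-correlating two facets is a set intersection, not a scan of all documents.
acme_with_thunderbolt_2 = thunderbolt_2 & index[("brand", "Acme")]

# Facet counts for the sidebar fall out of the same structure.
brand_counts = {key[1]: len(ids) for key, ids in index.items() if key[0] == "brand"}
print(sorted(thunderbolt_2), sorted(acme_with_thunderbolt_2), brand_counts)
```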

Starting point is 01:32:07 So it's the whole reverse index thing, but I guess that's what I'm saying, right? Like the way that you would approach these things from a relational database standpoint versus a document database are very different, and the performance implications can be massive, right? Like trying to run that same query against a relational database, you're going through potentially hundreds of millions of rows to get to that stuff in an index that's not necessarily optimized for that kind of search. Whereas in the document world, it can be totally tailored to it and kind of easily in some cases. So yeah, again, it's a use case based thing. And that's why we're not saying choose one over the other. It's, Hey, there's use cases for both of these. So you're saying pick document database. Always.

Starting point is 01:32:53 If you're writing a search engine, document database is a great way to go when you couple it with those indexes. Yep. And that is true. That is absolutely true. But search engines are also really rigid and they don't join well. And so they're only really good at presenting the facets of, basically, the kind of general shape of that index. While each document may have varying fields in it, you still are generally going to want some sort of index that represents like one type of entity. So you don't want to necessarily have products in there and customers in the same index, because it doesn't really make sense to search those things together. And so things get a little squirrelly and you can't join them together very easily because of the underlying indexing model. But trade-offs all

Starting point is 01:33:33 over the place. But see, I'm looking for something that's going to scale. So for consulting, just call 1-800-CODING-BLOCKS and we will Mix-a-Lot. Kick them nasty thoughts. Wow. Document databases scale really well. So like the Elastics of the world, the Mongos scale really well because of that locality that we talked about before where data is stored close together. So if I've got a gigantic

Starting point is 01:33:58 index like Amazon's every order ever made, and maybe I've got a thousand nodes with data on those documents, then each document is like a one atomic thing that gets pulled out. And so I can easily add more nodes to that because ultimately it doesn't have to share the data. It doesn't have to break up that data into multiple different services and different servers. And it doesn't have to go looking for that. It basically has like one sort of key to get that item or it's doing some sort of map reduce job in order to kind of sift through all that data and find based on whatever type of thing.
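
A small sketch of why whole documents with one key spread across nodes so easily. The `ShardedStore` class and `node_for` helper are hypothetical names, and the hash-modulo placement is an assumption chosen for brevity; real systems tend to use consistent hashing or a fixed shard count so that adding a node does not relocate most keys.

```python
import hashlib


def node_for(key, node_count):
    """Pick a node from a stable hash of the document key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class ShardedStore:
    """Each node is a plain dict standing in for a separate server."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def put(self, key, document):
        # The whole document lives on exactly one node.
        self.nodes[node_for(key, len(self.nodes))][key] = document

    def get(self, key):
        # One key is enough to find the one node that holds the document.
        return self.nodes[node_for(key, len(self.nodes))].get(key)


store = ShardedStore(node_count=4)
store.put("order-42", {"customer": "u1", "items": ["skateboard"]})
print(store.get("order-42"))
```

Because a read never has to combine rows from several nodes, no node needs to know what the others hold.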

Starting point is 01:34:32 So it's able to split those things up really nicely because of that locality. Hey, so to explicitly call this out, we are saying that document databases scale well. Relational databases typically do not. Right. Without some sort of additional application layer on top of it that handles sharding and that kind of stuff. But just by the very nature, RDBMSs are typically not what you're looking at for horizontal infinite type scaling. Yeah, and it's kind of funny. There's a lot of blending. Like SQL Server has kind of some document-type capabilities and some support for JSON, stuff like that, and some of the document databases have some relational-type features now built in. So it's definitely a scale

Starting point is 01:35:14 it just uh you can kind of make some judgments based on the type of underlying kind of data structures underneath um which is all about the next chapter. We'll be talking about like LSM trees and B trees and how that underlying data structure difference makes a huge amount of difference. And if you're designing a RDBMS from scratch or you're doing a document database, you've got some decisions to make there because they both have major trade-offs that bubble all the way up and change how you experience those products. Now, somebody's going to take issue with this, though, and say, like, well, hold on now. I mean, we've pointed out the Stack Overflow, for example, architecture in the past.

Starting point is 01:35:56 And you're like, they get a ton of traffic. And they're on SQL Server. What do you mean it doesn't scale? But while that's true, they also have a lot of other architecture in front of it or beside it to assist it, like caching in front of it

Starting point is 01:36:16 or elastic indexes that you would hit to search. So I think what we're saying when we say the scalability is like, does it, like, horizontally scale to handle a billion users? Or is it, well, you can have a primary and a backup, right, in the case of like a SQL Server? Yeah, it's really tough. And you pretty much can't say, you know, yes or no to something

Starting point is 01:36:46 that doesn't scale. It kind of doesn't really make a lot of sense because if you start making trade-offs in SQL Server, you can scale really well if all you care about is reading, for example. All you care, you can have tons of read replicas, no problem. You can scale that all day long and support tons of concurrent users, but you sacrifice the ability to have data and the ability to make updates. And so you could say it doesn't scale well, but at the same time, if you need the same features from relational databases and you're trying to build that into a document database,

Starting point is 01:37:13 you're going to start suffering some of the drawbacks that relational databases have because you're just building those drawbacks into your application now. So everything's got trade-offs. And when I typically think of something scaling well or not, it deals with how easy it is to scale and how flexible and what my options are for scaling in well but i i kind of i i was trying to phrase it in the terms of like

Starting point is 01:37:38 you can't, you can only have, like, usually, like in a SQL Server world, a primary and a backup. You're limited to how many instances you could have. They have clusters. I'm not as familiar with them. But even then, you still do. It's not like a document database. You just throw another node at it and you're like, hey. Yeah, that's what I mean.

Starting point is 01:38:07 You have another node. It's not, like, something like a SQL server is not infinitely horizontally scalable. Right. Right? You can't just keep adding on another database to it. Right. Now, there are some little differences here and there. Like Postgres, for example, you can have multiple nodes to serve the reads, but you're still limited to one for the writes.
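
The many-readers, one-writer shape can be sketched like this. `ReplicatedStore` is a hypothetical toy, the replication step is an explicit method call rather than the streaming a real database does, and deletes are ignored; the point is only that reads spread out while every write funnels through one place.

```python
import itertools


class ReplicatedStore:
    """One primary takes all writes; reads rotate across replicas."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [dict() for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Adding replicas does nothing for write capacity: there is one writer.
        self.primary[key] = value

    def replicate(self):
        # Stand-in for replication; real systems ship changes continuously.
        for replica in self.replicas:
            replica.update(self.primary)

    def read(self, key):
        # Reads scale out, but may be stale until replication catches up.
        return self.replicas[next(self._next_replica)].get(key)


store = ReplicatedStore(replica_count=3)
store.write("answer", 42)
stale = store.read("answer")   # replicas have not seen the write yet
store.replicate()
fresh = store.read("answer")
print(stale, fresh)
```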

Starting point is 01:38:30 Right. And going back to the Stack Exchange performance thing, it's such a perfect example of what we've been – I don't even want to say we're beating around the bush in this episode, is there are different use cases that serve these things. And it can be all the same application, right? So the Stack Exchange thing, and we'll probably have a link in the description for it, they have SQL servers that are their primary stuff where all the writes go. But behind it, they have caching servers, the Redis servers that are there. They also have three Elasticsearch servers that are running. So they obviously take the data from their SQL server. They turn it into search indexes because people search a lot to get data out of Stack Overflow. That's going to be

Starting point is 01:39:18 served up from the Elasticsearch servers, right? Which are probably not even going to get hit most of the time because they're probably going to hit Redis first, right? So they've got tiers or layers to where you first make a search. Hey, did this pop up recently? Hit Redis, right? Bring it out of the cache. If it didn't hit Redis and it wasn't in the cache, then go to Elasticsearch. Hey, all right, excellent. Now I've got everything here. If I go write a new comment, that's getting written to SQL Server. And then that's going to get built back out into Elastic and then go up through Redis and all that, right? So what scales is that they've augmented their various technologies with the things that answer the

Starting point is 01:40:03 use cases they need, right? Being able to do fast writes, they're going to SQL server. Being able to do fast reads, they're using Elasticsearch and they're using Redis. And they all serve different purposes, right? And that's kind of what we've been trying to get at this entire episode is there is no one size fits all. If there was, everybody would use it, right? I think Azure would love you to think that Cosmos DB is the one size fits all thing because they tell you, hey, it's infinitely scalable. You can get results back in microseconds or whatever. I can't tell you whether or not that's true or not. But my guess is if that was the case, then nobody would use anything else. And it's got different interfaces

Starting point is 01:40:48 and storage engines too, so you have to make those trade-offs. It's tempting to just kind of answer with the pros of each different approach, but realistically if you're talking about a real application, decisions you make about those storage engines and interfaces is going to have an impact on some of those numbers.
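
The Stack Exchange style read path described a moment ago (cache first, then the search index, with writes going to the relational store) could be sketched roughly as below. Plain dictionaries stand in for Redis, Elasticsearch, and SQL Server, the `TieredStore` name is hypothetical, and the index update is done synchronously here purely as a simplifying assumption.

```python
class TieredStore:
    """Dicts stand in for the three tiers; this is not any product's real API."""

    def __init__(self):
        self.primary = {}       # stands in for SQL Server: where writes land
        self.search_index = {}  # stands in for Elasticsearch: built from primary
        self.cache = {}         # stands in for Redis: recently read results
        self.hits = {"cache": 0, "search": 0}

    def write(self, key, value):
        self.primary[key] = value
        self.search_index[key] = value  # assumption: re-indexed immediately
        self.cache.pop(key, None)       # drop any stale cached copy

    def read(self, key):
        if key in self.cache:           # tier 1: did this pop up recently?
            self.hits["cache"] += 1
            return self.cache[key]
        if key in self.search_index:    # tier 2: fall back to the search index
            self.hits["search"] += 1
            self.cache[key] = self.search_index[key]
            return self.cache[key]
        return None


store = TieredStore()
store.write("question-1", "Use an index.")
first = store.read("question-1")   # served by the search tier, then cached
second = store.read("question-1")  # served by the cache
print(first, second, store.hits)
```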

Starting point is 01:41:04 Yep. So document databases are flexible in what they can store, often called schemaless or schema-on-read, which we kind of talked about, like the product example there, where sometimes that could be a really good thing. The downside is we're not enforcing any sort of integrity there. So if the developers make mistakes or bad data gets in there or some sort of required field isn't in there, we don't have those kind of checks that are really good and exist and are very common in relational databases. I want to mention too that document databases tend to have poor support for joining multiple documents. And when I say that, it's because a lot of the information that you would want to join on is kind of buried down in that document. So if I want to join my user document

Starting point is 01:41:51 against my product document, then I'm probably going to be joining on, say, like a user key that's buried somewhere in that user object, which means I have to fetch that object and extract that user in order to see if it even joins at all. So it just means I have to do a lot of work where that's the kind of stuff that the SQL databases or relational databases can do really quickly. A lot of times, even just looking at indexes without even getting anywhere near the actual data for those rows, it can just keep that stuff in a smaller spot and reduce the workload that has to be done. And like we mentioned several times, document databases require a lot of upfront work when designing

Starting point is 01:42:29 because you have to know about those use cases and you don't have this generic query optimizer to kind of help you out on the basic use cases. And so you kind of have to think about every little design decision, like do we put the usernames associated with the comments associated with the blog post, or do we want those to be separate documents and only keep a list of comment IDs and fetch those later? Like, those are things that you have to really spend time thinking about, which kind of sounds fun.
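
To make that design decision concrete, here is a sketch of the two shapes for a blog post with comments. All of the document contents and field names are invented for illustration; the embedded form reads in one fetch, while the referenced form needs follow-up lookups that the application has to do itself.

```python
# Option 1: embed comments (and the commenter's name) inside the post document.
embedded_post = {
    "id": "post-1",
    "title": "Data Models",
    "comments": [
        {"user_name": "alan", "text": "Nice."},
        {"user_name": "joe", "text": "Agreed."},
    ],
}

# Option 2: keep only comment IDs on the post and store the rest separately.
referenced_post = {"id": "post-1", "title": "Data Models", "comment_ids": ["c1", "c2"]}
comments = {
    "c1": {"user_id": "u1", "text": "Nice."},
    "c2": {"user_id": "u2", "text": "Agreed."},
}
users = {"u1": {"name": "alan"}, "u2": {"name": "joe"}}


def render_embedded(post):
    # Everything needed is already in the one document.
    return [f"{c['user_name']}: {c['text']}" for c in post["comments"]]


def render_referenced(post, comments_by_id, users_by_id):
    # Each comment and each user is another lookup: a join done by hand.
    lines = []
    for comment_id in post["comment_ids"]:
        comment = comments_by_id[comment_id]
        user = users_by_id[comment["user_id"]]
        lines.append(f"{user['name']}: {comment['text']}")
    return lines


print(render_embedded(embedded_post))
print(render_referenced(referenced_post, comments, users))
```

The trade-off shows up on update: renaming a user touches one record in the referenced form but every embedded comment in the other.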

Starting point is 01:42:57 It does require a lot more thought up front. Yeah, I like thinking. And RDBMSs, what they're great at is just those relationships, like the many-to-ones and many-to-manys. And as we said, data tends to get more complicated, more highly connected over time. So they're really flexible, they're really powerful, but the downside is that they don't scale very well. And in part it's mainly due to the fact that they are so flexible that it's hard to split up that data and still have those kind of same optimizations

Starting point is 01:43:28 and that same model work so well for it. So who won? RDBMS or document? Who wore it better? I think we already agreed that we were going to just use document databases for everything, so I guess document database. You know the funny thing about this is, I don't know if you guys get these in your feeds for news or whatever, like Postgres is constantly popping up in my feed for being the RDBMS that supports document databases.

Starting point is 01:43:58 Right. Because they allow you to basically mix them, right? So they have a JSON format type or a JSON data type that you can actually search and index and do all the things that you can kind of do with something like a MongoDB all within your relational world. Which is kind of getting the best of both worlds until you start talking about scaling, right? So, yeah. And how many, or I should say, I will place a bet that every single NoSQL database, every NoSQL document database has a SQL query language that they called SQL and they had a press release out

Starting point is 01:44:36 that says Mongo now supports SQL. Or, you know, whoever: Elasticsearch supports SQL, Kafka supports SQL. They all have their own fun. Jira has SQL, right, if you're looking for tickets. They all support that kind of syntax, but that doesn't mean that they have all the benefits of those relationships and joining and flexibility that RDBMSs have. Yep. So I'm going to say that RDBMS won here because that's what we all really want deep down.

Starting point is 01:45:03 it's just that we can't have it. Because it doesn't scale very well. But if someone invented an RDBMS that scaled perfectly, easily, no problem. You could just throw another node in and rebalance somehow and it all just worked. We would all just use that. Come on. Okay. So, I'm going to backpedal on that.

Starting point is 01:45:28 I'm going to say SQL, the language, won, and the storage technology is. No, no, no, no. Yeah. You can't change the rules, man. We're not talking about SQL as a concept, as a language. We're talking about relational databases versus document databases, and we don't care about the languages you use to query it. Okay, so in that regard, I will say that the concept of relational has won because every single tool on the planet, like Joe said, has tried

Starting point is 01:45:55 to bolt on some sort of SQL language to allow for relationships, right? Big data, document data storage, whatever it is, they all want to deal with the relationships. Well, I don't like this because now if I'm going last, like we're all going to like, you know, agree. So that can't be the thing. Are you trying to agree with this? Are you saying relational won? Well, OK, so that's what I want to say, because, yeah, like what Joe said, if they were able to figure out a way to make relational infinitely horizontally scalable

Starting point is 01:46:34 like, hands down, like, why wouldn't you do it? I mean, think about it. You've got, like you said, the capabilities of Postgres, and SQL Server added support for JSON. So, I mean, it's like, well, why wouldn't you have your data like, hey, this is the data that I need in a relational format over here in these tables. And, oh, but I needed this JSON blob over here for this other reason. Like, yeah, of course you would. But? But you guys already picked relational. So document databases win. Because if you say document databases,

Starting point is 01:47:09 databases suddenly got the ability to join really well, and that's compelling to realize you still have to put a lot of effort into, to know how to organize your data in order for it to make sense and be useful. So yeah, it's still, still a pain. Relational is definitely more flexible. And the primary reason people end up going to the document world are two, twofold, right?

Starting point is 01:47:31 Scalability and it more mentally maps out well to what the developer's code is in the first place. So weird to say that the relational is more flexible than the document though, right? Because typically when you think about it, you're like, oh, I don't know what the structure of my data is going to be. So I'm just, you know, like that's the advantage of having the document database. You're like, ah, I don't know what it's going to be yet. I'm just, I got a dev and you know, that's all I care about right now.

Starting point is 01:47:54 Dev, dev, dev, dev, dev, code, code, code, code. Right. Like, but we're saying that the relational is more flexible, which seems so counterintuitive. Hey, but this is coming from three guys that have played in all the worlds, right? Like we've spent our fair share of time in multiple of these things, the relational, the document databases, just the paradigms of like search engines and stuff like that. Paradigms? The paradigms. It's a pair of diggums. And the thing – I mean the fact of the matter is can we all agree that where relational seems to fall down the most is performance, right? That's where you start running into problems.

Starting point is 01:48:34 Wait, wait, wait, wait, wait, wait. Performance? What do you mean? Describe that. Query performance. As things get bigger. As things get bigger. Okay.

Starting point is 01:48:43 Your performance degrades not linearly but typically logarithmically, right? Like all of a sudden you hit a threshold and your query – yeah, it went from one second to ten minutes. And it's like, whoa, what just happened? If you have – but in that example, though, you're not doing just a simple like select where my key equals this one simple key. Could be. I mean, it seriously could be something really stupid. Like, we've seen query performance degrade massively as you hit certain thresholds, and none of them really make any sense.

Starting point is 01:49:15 On the flip side, in the document world, you run into some really hard problems to solve that a simple relationship would fix, right? You can spend a lot of time trying to solve that problem. And that's, that's where it's like, man, I can do it all in the relational world, but I'm going to have some performance hits and I'm not going to be able to

Starting point is 01:49:37 scale it. And in the document world, you just got some things that are just almost impossible to solve without just writing some complex code to make it all happen. That's why we need you, listener, to go read this book. Right. Or just make a new one that just combines these two in a good way. And so if you're interested in writing a data store that does things the right way, then this is the right book for you, especially when we start talking about B-trees and LSM trees.

Starting point is 01:50:07 Super exciting. It sounds exciting. This episode is sponsored by Clubhouse. Clubhouse is a fast and enjoyable project management platform that breaks down silos and brings teams together to ship value, not features. Let's face it, slow, confusing user experiences are so last decade. Clubhouse is lightning fast, built for today's software teams with only the features and best practices you need to succeed and nothing more. Yeah, and here's just a few of the highlights about Clubhouse. They have flexible workflows. You can easily customize workflow states for

Starting point is 01:50:50 teams or projects of any size. You have advanced filtering. You can quickly filter by project or team to see how everything is progressing. And you have sprint planning. Set your weekly priorities with iterations and let Clubhouse run the schedule. Clubhouse integrates with the tools you love too. They tie into existing tools, services, and workflows. So you can get notifications or create a story in Slack or update the status of a story with a pull request or preview designs from Figma links. Or you can even build your own integration with their APIs and a lot more. And Clubhouse is an enjoyable collaboration tool as well.

Starting point is 01:51:28 You can easily drag and drop things in their user interface. It's got a dark mode, which I know you guys love, emoji reactions, which I really love, and a lot more. So when you're done doing your best work and your team is just clicking, then life is good. Clubhouse has recently made all core features completely free for teams with up to 10 users, and they're offering CodingBlocks listeners two additional free months on any paid plan with unlimited users and access to premium features. Give it a try at clubhouse.io slash CodingBlocks. That's clubhouse.io slash coding blocks.

Starting point is 01:52:06 Alright, so we've talked a whole lot about relational versus document databases and we will obviously have some resources we'd like in here. And now it's time for my favorite part of the show. It's the tip of the week. And I'm actually going first this time and it's good because I got like 30 tips here. You've been going first for like the last several episodes. Okay, yeah. Like don't make it out like it's new.

Starting point is 01:52:32 No, it's not new. All right, but I am going first. So on topic with this show, so this is where our closing statement a little while ago where Joe said that relational won and I said relational won and Outlaw decided not to say it. This is where relational actually does win. So this is why I think it does is because if you look at like big data tools, there's a tool called Presto DB that is worth checking out because it's really interesting, more or less. So what this thing does behind the scenes, and it's used by Facebook and a lot of companies, it will take data from heterogeneous data sources, right? You could take data from an S3 bucket.

Starting point is 01:53:20 You could take data from a relational database. You could take data from files, right? And then treat them like they're database tables and do queries against them and join them and group by them and do all kinds of craziness. So we are literally talking about data that's not even located in the same data center and join it like it's in a database. These are the kind of tools that people create because relationships matter in data. And this is like, there was a need for it. And this allows you to do it. It's a distributed SQL query engine for big data. So worth checking out. A lot of tools use it behind the scenes: Amazon's AWS Athena, which works on their S3 buckets and all that kind of stuff. This is

Starting point is 01:54:03 what it's using behind the scenes. So really, really cool stuff. Um, I have no idea how the magic works. My guess is it writes a lot of this stuff to disc and then does a lot of hashing, but I've never looked into the specifics, but it's really cool. And it does some powerful stuff. Um, all right. So now that I'm done drooling over that, so here's just some usability tips that I've picked up. So I might've mentioned that I switched over to iOS from Android. What? No, I don't think that was made. Okay. I don't think that was said. So it's public knowledge now. It's been a, I'm a 50, 50 guy here. Like 50% of me loves it. And 50% of me loathes it. And, and I'm going to give you some tips of things that I've learned along the way that have made it less painful for me. Um, so one of the things is applications that you need to use things from Google Drive.

Starting point is 01:55:07 So an example is I have a KeePass database for my work passwords and that kind of stuff. Right. And I store that thing on Google Drive. Well, I downloaded this application. It's called Strongbox, and the app is actually really nice. But here's the problem. If you want to give an application access to Google Drive, Google scares the ever-living heck out of you

Starting point is 01:55:33 because they're like, hey, this application will have the ability to see all of your documents. So if you have anything up there with like special information in it, just know that they could go look at it. You know, hopefully you're downloading apps from reputable companies and that doesn't matter, but still it's terrifying.

Starting point is 01:55:51 So I say when you're talking about one file that has all your database, all your credentials, your credentials, and God knows what else you're storing up there. Right? So I quickly said no, because I panicked. Um,

Starting point is 01:56:03 but here's where things are interesting. That was something requesting access to Google Drive, and Google has to allow that token to do those things on your behalf, right? I found out that there is an application that comes with iOS called Files that has access to Google Drive. So if you turn on the connection between Files and Google Drive, then what you can do is from any application that needs access to a document in Google Drive, you can basically point it at your files on your local, and it can look at Google Drive for you. So the only thing you're having to trust is Apple. And I'm way more comfortable trusting Apple with trying to do this than some random developer out

Starting point is 01:56:51 there. Right? So, so basically what I'm saying is use the files application to proxy your access to something like Google drive. Um, you can probably, and you can even do the offline mode thing with it too, which was really nice. So you can say, hey, make this thing available offline. So really helpful. Now I can open this thing and not worry about it and I don't have to give anybody else access to my Google Drive. I got a really simple solution to that problem, by the way. What's that? Just create a LastPass, a work, using your work account, create a LastPass account, and then just share that with your personal. Yeah, I want to do that.

Starting point is 01:57:32 I want to keep them completely separate. I didn't want any LastPass touching. Like, I don't, I'm weird about that. Like, I like to keep my work stuff 100% separate from my personal. Like, I don't even want to touch on the same apps. I don't know why. Yeah, I'm weird like that. But they are touching the same apps on your phone. No, no, no. No, these things are sandboxed. Totally separate. They're not touching each other.

Starting point is 01:57:52 That's one of the reasons I went to iOS. All right. So the next thing that I learned about today that was mind-blowing to me that I didn't know about, have you guys ever pinned a tab in Chrome? Yes. Okay. I was all by myself. Yeah. I had no idea. So you can pin tabs in Chrome. So, like, we're using G Suite at work now for our email and that kind of stuff.

Starting point is 01:58:16 And so you basically always need your email open, and you always need, like, your calendar open. I didn't know about this pinning thing. I always opened up Chrome and then went to the two. I don't have to do that anymore. I can pin the tab. It keeps them off to the side, nicely tucked away. And every time I open up Chrome, they're there for me now.

Starting point is 01:58:32 I love it. So if you don't know about that, right click a tab in Chrome, say pin, and it'll stick it over on the left for you. And when you reopen Chrome, it should still be there. Love it. It's an itty-bitty little icon. It is. It's so beautiful.

Starting point is 01:58:46 I like it. Oh, really? That's the thing I hate about it. That's why I don't use it. Dude, your Gmail icon is the only thing that shows up, and your calendar icon is the only thing. I know what those look like. I'm happy with them. Yeah, I know.

Starting point is 01:58:58 I like the fact that when it's not pinned, then you would see an email count. Yeah, I don't care about that. Mine's just so high that it doesn't matter. Somebody's not practicing inbox zero very well, Alan. There's no such thing. I'm not pointing any fingers, but I feel like you people on this show that have inbox zero have too much time on your hands. All right, so, the last thing. So we've talked about this. I don't know about on the show in the past, but I think we have. So we've talked about something like when you're doing multi-factor authentication, typically you have to scan a QR code that will seed something for Google Authenticator or something like that. So that when you go to log in, like GitHub is one

Starting point is 01:59:43 of them that you can set up MFA for. So one of the things we've talked about in the past, I think even Leo Laporte has discussed it, right? When there's a QR code reader, they'll take a picture of the QR code and save it somewhere, which is really crazy because now you're saving the seed image that somebody else could scan and potentially get your tokens, right? That kind of bothers me. It always has. I haven't done it because I just don't like it. It's like writing down your password on a piece of paper and leaving it somewhere. Don't like it. Here's what's killer. So I'd always use the Google Authenticator app on Android and iOS. No more. I shall do it no more. And here's why. The Microsoft Authenticator app for both iOS and Android is better in two ways. One, if you scan in the QR code, it actually syncs

Starting point is 02:00:35 to your thing. So if you switch to Android or go back to iOS or go anywhere else, your seeded values are there. Isn't this the exact same problem, though, why you don't like saving the QR code? They're secured in some sort of encrypted storage somewhere. I'm not worrying about it. The other thing, too, is – You hope it's secure. Man, look, dude, if I'm going to worry about that –

Starting point is 02:00:58 No, no, no, no, no, but no, no, wait. It's better than me taking a picture and storing it in Google Photos. It's the same – okay, but it's the same problem, though. Either you're going to trust. I mean, it sounds like I sound so defensive here. You do. But it's the same problem because if you're going to say like, hey, it's Microsoft that's doing it and I trust them. They're going to securely store that thing.

Starting point is 02:01:26 It's the exact same problem. If you did store it in your Google drive, your Google files, or, you know, if you were to like, however else you tried more, like to save that though, into like some other trusted provider, it's being stored in an encrypted vault. So I'm better with that. I mean, you would assume, I would, I assume, and maybe wrongly, that whatever is in your Google Drive is also encrypted on their disks and their data center. Like you can't just go look at it. No, but I guess what I'm getting at

Starting point is 02:01:56 is if you have it just in your photos on your phone, because you snapped a picture of the freaking thing, and you don't go out of your way to make sure that that's deleted off your device, off whatever photo cloud provider that you're hooked into and all that, it's on you to make sure that's all good. This, it's literally taking the seed and storing it in their own, probably Azure, vault behind the scenes. But I'll take it a step further, though, and this is the reason why I like it even better than Google Authenticator. Even though the backup's a big part of it, the other one is you can't just open it. So with Google Authenticator, you open the app and it's there. There's nothing that makes anybody authenticate to it before you open up Google Authenticator. With the Microsoft one, you

Starting point is 02:02:41 actually have to authenticate to the app first before it'll show you any of the seeded values, which is really nice because if somebody gets a hold of your Google Authenticator and happens to know any of your passwords to get into anything, they're in, right? So it's another layer of protection. So I have a link to that. It's much better. It's much better. It still seems like it's the same problem though.

Starting point is 02:03:04 Like either way, you're trusting somebody. You're either trusting somebody with your example of holding your QR codes, or you're trusting that the way Microsoft is storing this data is secure. Which I do. Either way, it's a trust thing. But it's almost like LastPass. You're trusting that LastPass is encrypting your data properly and keeping it in a trust no one environment, right? Yeah. And so you've got to

Starting point is 02:03:31 lay that trust somewhere. There's always the chicken or the egg, right? And what I'm saying is I much more trust them to store that securely than me taking a picture on my phone and trying to put it away somewhere. See, the mere fact that this is backed up is almost like one reason why I don't like it. Oh, I freaking love it. I love the idea that you have to authenticate to it before you can see it. That part I love. I hate the fact that it is backed up somewhere because I view those one-time codes as those should not, by design, nothing else should have that.
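
A minimal sketch of why the seed behind that QR code is the whole secret, assuming the authenticator apps discussed here follow the standard time-based one-time password scheme (RFC 6238 with HMAC-SHA1, 30-second steps, 6 digits). The function name `totp` and the sample seed are illustrative only; the seed is the RFC's published test key, not anything from the episode. Anyone who holds the same seed and a clock computes the same codes, which is the concern with both a saved QR photo and a cloud backup.

```python
import hashlib
import hmac
import struct


def totp(seed: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time code from a shared seed (RFC 6238 style)."""
    # The moving factor is the number of whole time steps since the Unix epoch.
    counter = unix_time // step
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the last `digits` decimal digits, left-padded with zeros.
    return str(value % 10 ** digits).zfill(digits)


# Hypothetical demo seed: the ASCII test key published in RFC 6238.
demo_seed = b"12345678901234567890"

# Two "devices" with the same seed and the same time agree on the code.
phone_code = totp(demo_seed, 59)
copied_seed_code = totp(demo_seed, 59)
print(phone_code, copied_seed_code)
```

Nothing in the calculation is tied to a particular phone, so protecting the seed (wherever it is stored) is what protects the codes.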

Starting point is 02:04:04 Until you switch phones and then you're like, oh, man, now I got to go try and figure out where to get this thing reseeded from. And then I got to reset it on everybody else who's using it. That drives me crazy. Every time I have to reseed something, I've got to go reset up every single device that needs those codes. Right? So this is beautiful. That's it. It's perfect.

Starting point is 02:04:26 You should download it. Everybody should download it. It's amazing. Trust Microsoft. All right. All right. Um, okay.

Starting point is 02:04:36 Well then with that craziness, that awesomeness is what he meant to say. Oh, did I mispronounce that? I'm sorry. Well, we know how I am with proper nouns. So, hey, you know what?

Starting point is 02:04:53 If you're looking for something to give me for a Secret Santa and you're like, oh, man, I'm late. Joe Recursion Joe is here to save you with the most ultimate gift for me, obviously. So this company, Varianto, maybe? That's how you pronounce it? Varianto colon 25, which is an unfortunate name because you can't just Google that because then Google will assume that's a host name and a port. So, yeah. Why? I don't know.

Starting point is 02:05:24 That's the name of the company, though. But, uh, he sent me a link to a Git deck of playing cards, and I was like, oh, that is so amazing. Like every card on it, you know, just picture your normal, you know, playing cards, right? You know, two through ace, right? But like each card has like a Git comment on it, like a, you know, a Git command and, you know, what the command does. That is hilarious. And I found it by searching for Git playing cards. Yeah.

Starting point is 02:06:07 But that's so cool. That is awesome. So I was like, oh, I can't not share the Git playing cards. I feel like you need to buy them. Yeah, yeah, well, I'm kind of waiting on, like, you know, one of my friends to, like, you know, for a late, uh, Secret Santa gift. I don't know, like, hint, hint. Uh, you know. You know what, Alan? I love your authenticator app suggestion. It was the most amazing suggestion ever. Brown-nosing will get you everywhere. I'll have a link

Starting point is 02:06:38 to that. Thank you, Joe Recursion Joe. That's awesome. I've got two tips and I'm verifying one of them first. All right. Great. That worked. Okay.

Starting point is 02:06:50 So I'm going to change the order here from what's in the notes based on what we were just talking about. Gmail tip for you people that have a hard time keeping to inbox zero. Alan. Whatever. And you just want to declare email bankruptcy, which is what I really do. I say inbox zero, but what I really do is declare bankruptcy like once a month. What you

Starting point is 02:07:12 can do in Gmail is type in before and then put a date. So like 2019-01-01 is the beginning of the year. So it's a great time to do this. So you say anything before 2019-01-01, you know, basically more than a year out. And the next thing, here's the cool trick: is colon unread.

Starting point is 02:07:30 So any emails that you have that were unread, so notifications, junk, spam, whatever that are in your inbox that are older than a year, just type that filter right there and you'll get them. It'll select all of them. And you can just hit that archive button. You don't even have to delete. You can just archive them, and that number will go down from like 97,000 to like 100. Boom, email bankruptcy. I feel like you're looking at my inbox. How'd you know that number? It's so nice that you could do that every month or whatever you do. Like, if you haven't read it, if you haven't clicked, you're not going to scroll back to page 200 to look at your unread emails, right? Man, you don't know.
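
The filter described above can be written out as a Gmail search string. This is only a sketch: the helper name `bankruptcy_query` is made up for illustration, the `in:inbox` term is an assumption based on "that are in your inbox", and the date is written with slashes because that is the form Gmail's search help documents for `before:` (the episode says it with dashes).

```python
from datetime import date


def bankruptcy_query(cutoff: date) -> str:
    """Build a Gmail search string for unread inbox mail older than `cutoff`."""
    # before: takes a date; is:unread limits the results to unread messages.
    return f"in:inbox is:unread before:{cutoff:%Y/%m/%d}"


# The cutoff mentioned in the episode: the start of 2019.
print(bankruptcy_query(date(2019, 1, 1)))
```

Pasting the printed string into the Gmail search box, selecting all matching conversations, and archiving them is the manual step the tip describes.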

Starting point is 02:08:07 That's why I'm out. All right. So that's pretty cool. Now, the second tip is a new font that just came out today released by JetBrains. It's a monospace font, so everything lines up character by character and it supports ligatures. It looks really nice. It reminds me a lot of Fira Code, which I think I gave as a tip a week.

Starting point is 02:08:30 Another time. Um, it's just a little bit, I don't know, different. I don't know really how to describe it, but the page that they used to describe this font is like how all fonts should be described.

Starting point is 02:08:43 If you just look at it, if you click that link, and we'll have it in the show notes, they animate the things that make this font different from other fonts. Like they show how they increase the height to make different characters more intelligible. They even have a little thing if you scroll down showing the weights of the font and how they're different. And so you can move your mouse to the left or right

Starting point is 02:09:03 and have it increase or decrease the weight and do it in italics too so you can see how the font changes as the font weight increases. Dude, I love that shit. It's just really put together well. I was going to say the same thing. Yeah, man. JetBrains. Gosh, they're so good. Between this and Kotlin, it's

Starting point is 02:09:19 like they're just crushing it in so many different places that it makes it so if you're going to make a font or a new language, it's like you have to at least be that good. They've raised the bar. And that kind of sucks for the rest of us. That's really good. That is so good, man. I can't believe how many ligatures it supports, too.

Starting point is 02:09:38 I don't know if you saw that. It is 130. I can't tell. It cycles too fast. 138 ligatures. All right. 138. And it supports 143 different languages, like human languages, French, Afrikaans, Finnish, whatever.

Starting point is 02:09:54 So it's great. And a lot of these, like they're comparing against Fira Code. Yeah. And Consolas, which is the Visual Studio default, right? Yeah. And that's a great font, too. So it's cool to see you do it. And the way they animated it, it kind of explains the differences between those other popular choices and why they thought this one had a place to exist in the world, too.

Starting point is 02:10:17 So it's just fantastic. And I love JetBrains. The distinctiveness of symbols. You will never again confuse an L. Is that a lowercase L or an uppercase I? Or is it a one? And the zero. That's right.

Starting point is 02:10:30 Man, I can't tell you how many fonts just drive me crazy with zeros. Oh, man. Like the default where it's like, oh, well, if it's capital O, it's just a little bit fatter. Yeah, it's fatter. It's not as skinny. Yeah, zero skinny. You're like, what? So check this out.

Starting point is 02:10:46 So if you scroll down to the comparison section, I've never seen this type of control on a web page before. It's got one font. It's got basically a text, a block of code. And on the left is JetBrains Mono, and the right is Consolas. And you can drag a slider to the left or to the right to change the font. Now, I'm used to seeing that sort of thing for images, but I've never seen that done with fonts.

Starting point is 02:11:05 So it's just kind of funny to kind of scroll it back and forth like you would if you were comparing a filter on Snapchat or something. It's just cool. Maybe I haven't gotten to this part yet. Yeah, I don't see the comparison thing. Where is it? I would say about one-third of the way through the page. Oh, that's way up there.

Starting point is 02:11:20 Right after the numbered bullets. Oh, oh, oh. Yeah, isn't that cool? I mean, I think it's just an image in the background, so I don't think it's doing anything crazy. Originally, I thought maybe it was actually doing it. Look for the word comparison. You'll see it's truly about a third of the way down the page. There's a little box on the right with some arrows and a line.

Starting point is 02:11:39 Oh. Dude, look, regardless of how good the font is, this page is just worth spending time on. I know. I thought I might design my own font one day, but I'm like, well, I'm not doing all this stuff for it. Yeah, man. This is awesome.

Starting point is 02:11:54 Yeah, it's totally free. It's open source. So if you want to fork it and do, I don't know, some font stuff, then you can do it. Wow. I'm totally changing to this just because

Starting point is 02:12:03 I want to support them. Yeah. I want to go to there. Excellent tip, man. All right. And, all right, well, Joe just showed us all up with that. Those stupid authenticator tips. Hey, and you didn't steal that

Starting point is 02:12:24 from anybody, did you? No. What? An original tip from the Joe Zack. How about that? Yep. I saw it on Reddit. Did you really?

Starting point is 02:12:37 Yeah. That's hilarious. All right. Well, we hope you've enjoyed the show. Uh, this was, you know, we're getting through this Designing Data-Intensive Applications book. We'll figure it out eventually. Yeah. Um, in the meantime, we'll have a bunch of links to, you know, the tips and the resources for this episode. And, uh, you know, be sure to subscribe to us in case, like, a friend happened to point you in the direction of the show or they,

Starting point is 02:13:08 you know, let you listen on their device or whatever. Uh, you can find us on iTunes, Spotify, Stitcher, and more, uh,

Starting point is 02:13:14 using whatever your favorite podcast app is. And, uh, while you're out there on the internets looking for us, you can find us at www.codingblocks.net slash review, where you can find some helpful links to leave us a review if you haven't already. We would

Starting point is 02:13:27 greatly appreciate it. Yep, and while you're up there at codingblocks.net, check out all our show notes, examples, discussions, and more. We have copious notes. And send your feedback, questions, and rants to the Slack because it's awesome and there's a lot of good jokes in there, apparently.

Starting point is 02:13:43 You can follow us on Twitter at CodingBlocks or head over to CodingBlocks.net and find all our social links at the top of the page. We hope. Man. That hits a little. We had some WordPress issues.

While we continue to dig into Designing Data-Intensive Applications, we take a step back to discuss data models and relationships as Michael covers all of his bases, Allen has a survey answer just for... him, and Joe really didn't get his tip from Reddit.