Coding Blocks - Designing Data-Intensive Applications – Data Models: Relational vs Document

Starting point is 00:00:00 You're listening to Coding Blocks, episode 123. Hey, one, two, three. Hey, subscribe to us and leave us a review on iTunes, Spotify, Stitcher, more using your favorite podcast app. And check out CodingBlocks.net where you can find show notes, examples, discussion, and like 122 other episodes. True that. Send your feedback, questions, and rants to comments at CodingBlocks.net.

Starting point is 00:00:22 Follow us on Twitter at CodingBlocks or head to www.codingblocks.net and find all our social links there at the top of the page. With that, Happy New Year. I'm Alan Underwood. I'm Joe Zack. And I'm Michael Outlaw. This episode is sponsored by Datadog, the monitoring platform for cloud-scale infrastructure and applications allowing you to see inside any stack, any app at any scale, anywhere. And educative.io, level up your coding skills quickly and efficiently, whether you're just starting, preparing for an interview,

Starting point is 00:00:59 or just looking to grow your skillset. And About You, one of the fastest growing e-commerce companies headquartered in Hamburg, Germany, that is growing fast and looking for motivated team members like you. All right. Well, as we always like to do, let's,

Starting point is 00:01:18 uh, we want to say thanks to everybody who took the time out of their busy day to leave us a review. And so, uh, from iTunes, we have Boulder dudeude333, ThePank1, and I'm not going to be fooled by this because he said that it's supposed to be pronounced like fish. So he'd probably want me to say Fish26, but I'm not falling for it. I'm pretty sure it's Fizzich26.

Starting point is 00:01:43 I think that's... who are you going to trust? You're going to take his word for it or mine? Good. I was so hoping you hadn't read it. That's so awesome. Yeah, I did. And I liked ThePank1, because I was like, oh, wait a minute, is that supposed to be like the penguin? Right. Oh yeah, yeah, penguin. Yes. Very good stuff. All right, so we actually have a short bit of news this time. We don't have an entire episode going off on the Stack Overflow, um, you know, problems that we ran into. So in this case, reminder: I'll be at NDC London on January 31st talking about real-time streaming using SQL Server, Kafka, and some other stuff. So if you are there or if you're in the London area, hit me up. I would love to meet some people out there while I have the opportunity to be out there, and I might even have some swag with me.

Starting point is 00:02:41 So, you know, definitely hit me up. Outlaw, don't we have some swag coming here pretty soon? I'm hoping it'll be here in time for your trip. I don't want to give away anything in case we don't. Yeah, if we don't, we don't, but I may have some. So at any rate, yes, definitely reach out to me and we'll try and set up a time to all get together. And then, Joe, what you got?

Starting point is 00:03:03 So I'll be down in South Florida at CodeCamp on February 29th. It's not a joke. I guess it's just one of those years. It's on leap year, South Florida. And I'm going to talk on streaming architectures by example. We're going to be talking about like Kafka down at the bottom, all the way up to GraphQL subscriptions at the top. And hopefully that's going to be a cool talk and let me know, kick me in the shins, you know, whatever. And one last one, because I don't have it

Starting point is 00:03:30 in here, and we should, we just talked about it a second ago, is I believe all three of us will also be down at Orlando Code Camp March 28th. Yes, yep. Oh, I was thinking it was the South Florida one. Nope, South Florida is down in Davie

Starting point is 00:03:47 slash Miami area, and I'll be all by myself for that one. Okay. Yep. So, uh, we'll get this in the show notes here, but yeah, definitely, um, if you are in the area, come hang out, say hi to us, and, you know, enjoy a day of learning for nothing. Alan and Joe in the shins. Kick us in the shins. No, kick Joe in the shins. I got two of them. First two people get free kicks. Everybody else

Starting point is 00:04:16 has got something coming to them. It's a risk you got to take. Also, we're giving away a book. So drop a comment down in the comment section on the website and you might win. On this episode, which will be codingblocks.net slash episode 123. One, two, three.

Starting point is 00:04:38 Yep. And now we're jumping in here with data models, which the book specifically calls out as being one of the most important pieces of software. Data model or a data model? I think we already have the data. I say data. By the way, I believe I can't

Starting point is 00:04:59 remember who it was now, but somebody called us out and said, hey, let's mention the portion of the book that we're looking at. So this is starting in Jim. Jim. Jim Hummelson, right? Yeah. So, yes, we are. This is starting in Chapter 2.

Starting point is 00:05:15 So, yes, it took us a little while to get through Chapter 1 because there was a bunch of meat there. So this is starting in Chapter 2. This is the very beginning of it. So if you're trying to follow along, this will help you out. Yeah, good call. Oh yeah, so data models, which is being one of the most important pieces of

Starting point is 00:05:34 software. Is that right? Yeah, that's what they said. That's what they said in the book. Part of developing software. Okay, that's it. The phrasing 100% of it. All sorts of developing software. I tell you, for so many years and honestly, I'm still kind of there, I tend to think of kind of database and persistence first. It's just kind of an old, bad, maybe not, I don't know, habit of mine. So I was happy to see that particular sentence.

Starting point is 00:05:59 Like, yeah, it is pretty dang important. And they say the reason it is is because a lot of times it dictates how the software is written, right? So when you're thinking about these data models, whether or not you're using a relational database system or something, it's going to drive how you do your application stuff, right? Well, maybe more. I'm sorry. But maybe more importantly, though, I think we've talked about this before, but they also mentioned that it dictates how you even think about the problems that you're trying to solve. And I think we've talked about this in the past as it relates to like which language did you start your, you know,

Starting point is 00:06:39 developing career with or your programming adventures with, because that often led to like how you would think about the problems, right? Like if you started in an object-oriented language, like a C++ or C# or Java, then, you know, you thought about things in these object kind of ways. Whereas if you came from like a, um, you know, maybe something that was a little bit more free, like a Python or a JavaScript, you know, then you weren't held to those same kind of constraints in the way you thought about how to solve the problem to begin with. Yep. I was going to say, when I'm designing an application, if it's totally from scratch, I just got an idea, I tend to think persistence first. So if I'm coming up with a business idea or app idea, I might start thinking, well, I'm going to want to go with a relational database on this because my data is going to be kind of small.

Starting point is 00:07:25 So I'm not going to hit some of the limitations and constraints about it. It's simple. I can work fast and I'm familiar with the tools for it. And then from there, I'll go on and maybe choose a language or a platform or a framework or other things. But I still kind of tend to think of the data as being, I don't want to say the heart or the center because we talked about how that's a bad idea when we did the architecture series. But it's still just such a big part of the decision for me that ends up influencing more factors down the line, at least for me. So remind me towards the end, I'm going to ask a question about that when we get to the end in regards to designing with the database first. Cause I think,

Starting point is 00:08:06 I think we might have some interesting answers to my question. Yeah. Okay. I'll try to remember that. No promises. I don't know that I'll remember it either. So if we get there and I remember, awesome.

Starting point is 00:08:18 If not, it's going to be like in Star Wars spoilers. Right. We'll try to remember to come back to it. Yes. Um, so one of the things that they say about this, and the reason why it's

Starting point is 00:08:31 super important, not only because of how you think about the problem and the fact that it dictates how you write the software, but it's also because typically your software is just layers of, of more data models stacked on top of each other. Right. Like one thing, um, obfuscating the one beneath it. Right. Yeah. We, we kind of made a comment about that. Um, or I think I made a, uh, some kind of comment to that,

Starting point is 00:08:59 maybe the last episode or the episode before where I was talking about like how, you know, if you recall where I was saying like, Hey, imagine if you took everything you know today and you went, you know, back in, you got in the, um, Doc Brown's DeLorean and you went back in time, right? Like you still couldn't like immediately jump to where we are today. You'd still have to like go through all those lessons learned and create all of those, um, you know, those different abstractions on top of one another, right, to get to where we are today. Yeah. Which is a really difficult thing, right? Like, I mean, even,

Starting point is 00:09:32 even us three and probably a lot of people listening to the podcast, if you've been doing software development for any amount of time, there's some stuff you can read a hundred times, but until you actually go to do it, you don't truly understand what it means when, when you start hitting all those different pieces, right? Back to back. I was just thinking even in regards to how much more complicated chips are today versus what it would have been like, what I imagine it might have been like to do any type of hardware, low-level, what we would call like a low level hardware programming, you know, from decades past. Right. And, uh, you know,

Starting point is 00:10:11 now it's like even the chips have gotten so complicated to where you think that you're interacting directly with the chip, but it's actually an abstraction that's sitting on top of the chip itself that it's, it lives in on the chip, but it's, it's like a chip in a chip kind of thing, you know, specific to like the Intel processors,

Starting point is 00:10:32 you know, and that's why I'm saying like, you know, there's no way you're going to get to that level of difficulty. Like, you know, you're going to have to go through iterations to get there. And that's what,

Starting point is 00:10:42 like we, as a, as a, as a, as a being have like gone through decades of iteration until we've like, Hey, you know what? We can make this easier. And now we'll have this other like abstraction on top of the chip,

Starting point is 00:10:53 you know, to deal with the registers and the hyper threading and whatnot. And, you know, you just call that, that API that we'll give you and trust that, you know, we're going to take it from there.

Starting point is 00:11:01 Well, that's kind of the next point that they get to is this, this whole abstraction. The reason we have them is because we write code that represents objects that represent the real world or the real problem that we're trying to solve, right? And that's the whole reason why we have these layers, you know? So you have your chipset and it has its registers and everything that you have to do. But then when you go into write stuff to use that, you're going to write things in a way that makes sense for the problem you're trying to solve. Right.

Starting point is 00:11:28 And then it just keeps stacking more layers on top of that until you finally get to the application that somebody is using. That's kind of like the story of technology throughout history. Like you mentioned kind of going back in time with knowledge. If you drop me back in like 1800, like, hey, Joe Zack, Kotlin master here, and I'm ready to show you guys the future. I don't know. I need some buckets of silicon, I guess. Maybe somebody fetch me some electricity. Oh, yeah, I'd be in trouble.

Starting point is 00:11:55 You took the time machine too far back. Yeah. Yeah, they would be so disappointed. They'd be like, you can't do anything, can you? But I've got this cool thing called Kotlin. It is funny. Like, if we were to go back that far in time, I would have no discernible skill that would be worth anything for that time period. I would just be like, all right, well, I guess I'm going to die over here in the corner, because I'm not going to be able to teach you guys a game called basketball.

Starting point is 00:12:20 That's the only thing that's really going to transfer here. Uh, that's amazing. I'll be like, this chair is uncomfortable, and all I do is sit all day. Everything more comfortable? It's a rock. Yeah, doomed. Uh, so I don't know what we're talking about anymore, but, uh, something about JSON. Yeah. So we have objects, right? These things that we code, these, you know, real-life objects that we try and put to code, but then these things have to get translated in some sort of format where they get persisted, right? We got JSON, XML, relational tables, graph databases, like all kinds of things. And then there's a layer on top of that where the people that built the storage engine had to determine how to get that data on disk and in memory so that it supports things that you need, right? So maybe it's either fast writes or maybe it's searchable or maybe it's fast lookups or whatever, right?
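
As a rough illustration of that translation step, here is a minimal sketch of one in-memory object being persisted two ways: as a single JSON document and as rows in two relational tables. The `Person` and `Position` classes, the table names, and the sample values are all hypothetical and chosen only for the example; they do not come from the episode or the book.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Position:
    title: str
    company: str


@dataclass
class Person:
    # Hypothetical application object; the fields are illustrative only.
    person_id: int
    name: str
    positions: List[Position] = field(default_factory=list)


def to_document(person: Person) -> str:
    """Document model: the whole object tree becomes one JSON string."""
    return json.dumps(asdict(person), sort_keys=True)


def to_relational(conn: sqlite3.Connection, person: Person) -> None:
    """Relational model: the same object is split across two tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people "
        "(person_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS positions "
        "(person_id INTEGER REFERENCES people(person_id), title TEXT, company TEXT)"
    )
    conn.execute(
        "INSERT INTO people VALUES (?, ?)", (person.person_id, person.name)
    )
    conn.executemany(
        "INSERT INTO positions VALUES (?, ?, ?)",
        [(person.person_id, p.title, p.company) for p in person.positions],
    )


person = Person(
    1,
    "Example Person",
    [Position("Developer", "Example Co"), Position("Architect", "Other Co")],
)

# One self-contained document, nested positions included.
document = to_document(person)

# The same data as rows; reading it back needs a join.
conn = sqlite3.connect(":memory:")
to_relational(conn, person)
rows = conn.execute(
    "SELECT p.name, s.title FROM people p "
    "JOIN positions s ON s.person_id = p.person_id ORDER BY s.title"
).fetchall()
```

The application code above never sees how SQLite lays those rows out on disk, which is the further storage-engine layer the hosts mention.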

Starting point is 00:13:21 So we've already talked about multiple abstractions for your simple little application that you're trying to do, right? Like you might even just have a form that you're trying to fill out, but there's all these layers that live under it just to make that stuff happen. There's so much we take for granted. And that's the one thing that like going through this book has just made me think just how much more we take for granted. And like when you watch something like a, a Star Trek, for example, and you know how often they'll use, you know,

Starting point is 00:13:51 like voice dictation and they're in, in like AI to do everything for them. Right. Just like when we get to that point, like how much more we will be taking for granted. Right. Right. Yeah.

Starting point is 00:14:05 We're already getting there. Well, the crazy part is, so we just talked about like where it's storing things on disk, but then if you take it to the extreme, you were talking about the chips earlier, Outlaw, uh, all that stuff then has to be converted into electrical pulses.

Starting point is 00:14:18 Yep. Um, or electrical currents. Um, so in some cases, if it's being transmitted by light, it could be light pulses. If it's getting stored on tape, it could be magnetic. There's so much going on from the top

Starting point is 00:14:33 down and it's all just additional layers, right? All APIs that are being interfaced with from your application all the way down to the hardware layer and then the hardware on how it's actually going to transmit that stuff and store it. I've heard the recommendation. I think it was from Scott Hanselman. It basically said, like, learn your layer of abstraction and one below it. That's a really good point, honestly. It's almost like the world in any version of whatever you want to call the technology, right?

Starting point is 00:15:03 Because, you know, everything's complex around us, right? Not just things related to computers. But, you know, it's basically like an inverted, like an upside down triangle, right? Or, you know, like a cone, basically, right? And each layer is building on top of it, getting a little bit wider and more complex as you keep building, adding more things on it. But at the very bottom of that bottom point, like what's being done is like, you know, depending on your perspective, you might say, oh, that's simple, right?

Starting point is 00:15:35 You're just flipping a bit, right? Well, what does it take to flip that bit? I mean, you're going to your point about the electrical current and all that, right? Like, no, man, making electricity, that's not simple, right? And it's like, okay, well, we made electricity, so now it's simple. You know, it's like, yeah, so that point is, like, ultimately going to become very, very small, but you keep building more complex things on top of it once you solve it.

Starting point is 00:15:58 Like, look at the wheel, right? The idea of building the wheel, that's pretty simple. But now look at all the things we do with it, right? There's not just cars and planes and trains, but like all the different ways that like, uh, even the rollers that are used on assembly lines to like, you know, move a box from one person to the next person without anyone having to touch it. Right. Like things like that, that are all based on that same kind of concept. Yeah, I mean, we've talked about it before. Programming is very much like anything else, like building a house or anything. There's your base things that you have to do and you have to build layers on.

Starting point is 00:16:35 The biggest difference is we realize we screw up a lot of stuff in programming and we tear out the floors a lot, right? Which I don't think a lot of people really like doing that in their homes, but that's probably the equivalent of it. Yeah, but remember, we're not supposed to refer to it as building houses anymore, right? That was what we learned from – I forget what it was. I think it was the pragmatic programmer that we learned that from where they said to refer to it as gardening. Ah, that's right. That's right.

Starting point is 00:17:00 That everybody makes the mistake of referring to programming as building a house, and they use that analogy because it's something that people can relate to. But really, the better analogy would be gardening. I don't feel so bad about my node_modules folder anymore. Like, all it's doing is just showing me that lineage. Like, if I go drive my car, I don't need to see the, you know, or I don't get to see all the little various pieces of, like, I don't know, fire and chemistry and, like, uh, oil refining that goes into such a little trip. Uh, it's all kind of abstracted and hidden from me. But, uh, node_modules, once you hit that JavaScript layer, it just kind of, uh, explodes. Well, I love to think

Starting point is 00:17:38 about how complex cars have gotten. Like, I'm a car fan. I love cars. Right. Um, but you know, you go back and you look at a car from the sixties, right? Look at how few wires there are in that car. Oh yeah. There's hardly any wires in it. Now go look at a modern car, you know, and we don't even have to go to the electric cars. Like, we could just stay on the normal combustion engine cars and just look at how much more complex they are. I mean, you know, you take a car from the sixties, you could crawl into the engine bay to work on it and stand in the engine bay to work on it and still have plenty of room. Like, there could be two or three of you working on it while you're steering it, right? You open up the hood on a modern car, you got to take a tire off to change a light bulb.

Starting point is 00:18:26 Man, you know what? This is actually sort of tangential to development because this kind of reminds me of an article I read not terribly long ago where they said that newer cars are super expensive to work on. So, for instance, a lot of these cars that have proximity sensors around them, right? Like backup sensors and people getting into your lane sensors and all that. They said the problem is these cars were built with all this technology, but the maintenance bays that have existed for years at companies that work on these cars aren't big enough to maintain them. They're not set up to deal with the technology that's being put into these cars. And so it can be super expensive to work on these things because they don't have a place to do it.

Starting point is 00:19:18 So keep that in mind. When you're building software, think about the maintenance of it because it actually matters, right? And how people have to maintain it. Because if you don't do it in a way that makes sense for other developers following you, it can be expensive and problematic. So I don't know, it's sort of a decent parallel in true hardware, right? But getting back to this, one of the things that they say in the book, and we know is true, is that complex applications, maybe not even so complex of applications, they all have many layers, right? Like we've talked about before in C#, like naively when you first start playing with Web API, and we've talked with people on our Slack channel and stuff about this. We'll get questions about, Hey, how would you do this?

Starting point is 00:20:05 A lot of people, when they start doing Web API, they'll code all their code right there in the Web API endpoint, right? And that seems great at the time until they realize, well, I need to share this with something that's not making a web request, right? And so eventually what you do is you move that code out of your Web API and you put it into more of a central application layer or something, right? And then your web API can use it and anything else that needs to use it can use it. So even in simple applications, you start building those layers so that you have APIs talking to APIs. I mean, I faced a real world example of this not too long

Starting point is 00:20:40 ago where I had some code that needed to be refactored, but the code itself was written for an executable. It was just a normal command line EXE. And, you know, that we, it was one of the tools that we use. And, but I needed to refactor it. And it was like, well, okay, this code is all in the EXE. I can't write a unit test to verify how it currently works and then verify that my refactoring doesn't change anything. So I got to move this out into another library so that it can become an API to the EXE so that I can then have a way to verify how it works and then verify that I didn't break it when I refactor it, right? Yep.
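
The two stories above, the Web API endpoint and the command-line EXE, follow the same pattern, which can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the hosts' actual code: `order_total_cents`, `cli_main`, and `handle_request` are hypothetical names, and the "web" entry point is just a function that takes a JSON string so the sketch stays self-contained.

```python
import argparse
import json
from typing import List


def order_total_cents(item_cents: List[int], tax_percent: int) -> int:
    """Core logic (hypothetical): no HTTP and no argv, so any caller can use it."""
    if tax_percent < 0:
        raise ValueError("tax_percent must be non-negative")
    subtotal = sum(item_cents)
    tax = subtotal * tax_percent // 100  # integer cents, rounded down
    return subtotal + tax


def cli_main(argv: List[str]) -> str:
    """Thin command-line layer: parse arguments, then delegate to the core."""
    parser = argparse.ArgumentParser(description="Order total (hypothetical tool)")
    parser.add_argument("item_cents", type=int, nargs="+")
    parser.add_argument("--tax-percent", type=int, default=0)
    args = parser.parse_args(argv)
    return str(order_total_cents(args.item_cents, args.tax_percent))


def handle_request(body: str) -> str:
    """Thin web-style layer: decode JSON, delegate to the core, encode JSON."""
    payload = json.loads(body)
    total = order_total_cents(payload["item_cents"], payload.get("tax_percent", 0))
    return json.dumps({"total_cents": total})


# Because the logic lives in one plain function, a unit test can pin down
# the current behavior before any refactoring starts.
assert order_total_cents([1000, 550], 10) == 1705
```

Both entry points stay a few lines long, and the test at the bottom exercises the logic without starting a web server or spawning a process.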

Starting point is 00:21:28 Just, and the thing is, the whole purpose of these layers is to hide the complexity under it, right? That's really what it is. Your layer, like when you created this new library, that was there just to serve as the API for that program to call it, right? Like everything is truly just about simplifying things as you move further in your stack. Yep.

Starting point is 00:21:50 Yep, programming is just all about abstraction. Sounds pretty cool. Isn't there some kind of joke about that too? About like, I think we've referenced it before, but like what do you need? What does your program need? Another abstraction or something like that? Or like what makes it more?

Starting point is 00:22:05 I'm sure there are jokes. It's just like, yeah. I was hoping as I said it, that maybe one of you guys would remember it and like fill in the blanks for me, but I got nothing. I barely remember what I ate for breakfast.

Starting point is 00:22:18 So, um, which is funny because I can remember how to program. I just can't remember any of the other stuff around it. So I don't know if it's just garbage in, some garbage out. I don't know. It's not so much when you get something new in, something's got to go out. Yeah, a lot tumbles out.

Starting point is 00:22:33 I feel like more falls out than comes in, though. Yeah, well, by breakfast for sure. So, yeah, one of the reasons that they say that these abstractions even exist is to allow different groups of people to work together, right? Like you might have a data team working with the application team or you might have an ops team working with, you know, your application team, something like that, right? But these abstractions allow different groups to work together. Yep. And there are many different types of data models. We're talking about a lot of that coming up here.

Starting point is 00:23:06 And they all have kind of different usages and basically different pros and cons and different needs that they were designed for. And you've got to kind of know what those pros and cons are and understand what you're getting into, especially with persistence. A lot of times you end up with one of these things that stays around for a long, long time. People don't tend to change their persistence very often, although as we've said before, it definitely does happen. But these tend to be big kind of cornerstone pieces of your architecture. Yeah. I mean, when you say that it does happen, it's probably like once a decade kind of change

Starting point is 00:23:39 there. Yeah. Do you think that that would be a fair statement or agree or disagree? More often, less often? I used to agree. Okay. You think the data model changes more often than that? Yeah.

Starting point is 00:23:52 It seems like, I guess, I don't know, maybe just the positions we've been in the past year, year and a half. It feels like it's happened way more than it ever has in my career, right? So I think it depends on, I take it back. I think now that cloud is becoming more prevalent in, in every business's strategy, that these things are probably prone to change a little bit more going forward for companies that are trying to migrate to the cloud. Maybe.

Starting point is 00:24:23 So, so if you picked, so you're saying that if you picked Oracle today, you might pick Postgres next year and then change your mind and go to MySQL the following year and then get crazy and go to DB2 because, you know, why not? Or right, like if you're moving to Azure, maybe you're going to use Cosmos DB because that's their infinitely scalable database architecture and you don't want to have to deal with the problems that you ran in with your on-prem database server. So again, I think if you're running your same application the same way that you've always run it and you never plan on changing

Starting point is 00:25:02 it, the chances are you're never going to change that data model under it, right? Yeah, that's why I said the once a decade thing. I think that once you get your application working, whatever data model you settled on, that thing is going to be the beast for a long time, right? Like it's going to be rare. You'll change. It'll change. It's just a rare day that it'll change. That's why I was referring to it as like the once a decade kind of thing. Like I'm hoping that changes based off this podcast. So for one, for one reason alone or, or this book, the fact that we're covering this book is I feel like for so long people have tended to use whatever database system or whatever storage system that they've used for everything, right? Because they've bought it, they've got it, they have people that know that stuff. And so they start trying to squeeze everything out of it

Starting point is 00:26:00 that it wasn't even intended for, right? We've talked about it in the past, like search engines. We did an entire series or episode on Elasticsearch and the fact that search engines exist for a reason. Well, a lot of people will try and cram that into a database. So what I'm hoping is maybe you're not replacing your existing storage system, database system or whatever, but maybe you start thinking about how you augment it with your application, with various needs of your application in mind. So maybe that's more what I'm hoping for. See, I don't know, man. I disagree because I think that if you took the time to go with Cosmos DB, you picked

Starting point is 00:26:41 that one, for example. Once you buy into some platform, right? Cosmos is the Azure platform, if I remember correctly. You're not going to suddenly or easily say, hey, you know what? We're switching to AWS Aurora, right? Right. Because there's buy-in. Once you get into one of those cloud platforms,

Starting point is 00:27:04 there's kind of like lock-in. I called it buy-in. I meant to say buy-in. Once you get into one of those cloud platforms, there's kind of like lock-in. I called it buy-in. I meant to say lock-in. You can easily get locked into that one thing if you use the technology specific to that one cloud stack. But going back, though, is what I'm saying is maybe you're not changing out the entire back. But when you, when you decide that you need to have this fancy little search app or search feature in your application, instead of trying to cram it all into Azure or into Cosmos or into Oracle or whatever your existing database system was, you say, Hey, we're trying to make a search system. Let's use the right storage and data model for the job. Right? So I guess that's where I'm going is, yeah, you probably aren't going to get your entire system.

Starting point is 00:27:48 It doesn't make sense. But maybe you augment it with a different storage model. I think polyglot persistence is big time on the rise. And a lot of that's like just the cloud, because it's gotten easier and microservices are more common. So people are having multiple databases, and they've kind of agreed to pay that piper, and now they're starting to realize the benefits. Like 10 years ago, 20 years ago, something like that, uh, you know, if you met somebody at a meetup or something and they said, uh, we do some SQL and we've got a graph database here on the side, you'd be like, well, what are you, Facebook? Like, Mr. Fancy Pants. You know, it'd be kind of a notable

Starting point is 00:28:22 experience but now more and more organizations and smaller and smaller organizations are kind of moving to like multi data model paradigms, I think. Yeah, and I'm hoping that they do. And I'm hoping it's all intelligent thinking on, hey, what is the right tool for the job? Right. That's what I'm hoping happens. So I don't know. We'll see. So data models do have a huge impact on how you write your applications. And this is just what we were talking about. So thinking about how these data models are used and what they're used for
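
The "augment, don't replace" idea can be sketched roughly like this: a relational table stays the system of record, and a separate search-oriented structure is kept alongside it for lookups by word. This is only a toy under stated assumptions. `ProductCatalog` and its methods are hypothetical, and the in-memory inverted index merely stands in for a real search engine such as Elasticsearch.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, List, Set


class ProductCatalog:
    """Hypothetical catalog: SQLite is the record store, a dict is the search index."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
        # Inverted index: lowercase word -> ids of products containing it.
        self.index: Dict[str, Set[int]] = defaultdict(set)

    def add(self, product_id: int, name: str) -> None:
        # Write to the system of record first, then update the search model.
        self.db.execute("INSERT INTO products VALUES (?, ?)", (product_id, name))
        for word in name.lower().split():
            self.index[word].add(product_id)

    def search(self, query: str) -> List[str]:
        """Return names of products containing every word in the query."""
        words = query.lower().split()
        if not words:
            return []
        ids = set.intersection(*(self.index.get(w, set()) for w in words))
        if not ids:
            return []
        placeholders = ",".join("?" for _ in ids)
        rows = self.db.execute(
            f"SELECT name FROM products WHERE id IN ({placeholders}) ORDER BY id",
            sorted(ids),
        ).fetchall()
        return [name for (name,) in rows]


catalog = ProductCatalog()
catalog.add(1, "Red Mechanical Keyboard")
catalog.add(2, "Blue Mechanical Pencil")
catalog.add(3, "Red Notebook")
matches = catalog.search("red mechanical")
```

The cost the hosts call "paying that piper" shows up in `add`: every write now has to keep two models in sync.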

Starting point is 00:28:59 will help you make decisions that might actually ease your road ahead. Yep. And, uh, yeah. So, um, one thing that we've, go ahead.

Starting point is 00:29:12 Yeah, no, no, the, we missed that line. You're about to talk about it. Uh, I wasn't,

Starting point is 00:29:17 I was, but Outlaw is. Oh yeah. I wanted to make sure that we've called out that it can take a lot of time and effort to master just a single data model, which is kind of the point that we were talking about before. Yeah. A lifetime for some data models. Yeah. Seriously.

Starting point is 00:29:35 Yeah. It's a it's a full time job. So, yeah, when we talk about persistence, the old argument used to be that it was like crazy complex. And so you needed to be getting a lot out of it. And I think it still is true. It's just that we can see that there's companies and organizations that have been getting a lot out of having multiple different kinds of data stores. And, I mean, think about it. Like, if you question that statement about it taking a lot of time and effort, I mean, there are people that have made their entire career just on something like SQL Server, for example.

Starting point is 00:30:04 Oh, yeah. Right? And a lucrative career, right? Yeah. I mean, like, how many people have you ever met in your life? They were like full-time Oracle. That's all they did was Oracle database.

Starting point is 00:30:14 Right. Yeah. So yeah, it's crazy. Yeah. And there's still, there's a lot to know. There's new features that come out all the time.

Starting point is 00:30:21 There's intricacies and there's new challenges that make that challenging. Yeah, it's just hard. Data's hard. It all is. So that takes us into, you know, we're talking about a couple data models today or tonight, but specifically relational versus document. So I think it's probably pretty easy to say that like everybody, like relational is the one that everybody's going to know the best, right? Like that's the one, you know, based off of SQL,

Starting point is 00:30:56 like you probably learned that one before you learned of anything else, right? I would think so. I would definitely think so. I used to think so. You used to think so? Ooh. Okay. Yeah, I just, I don't know what, like, I think now it's kind of easy to use a NoSQL database with JavaScript, so I

Starting point is 00:31:15 wonder if there's, like, kind of new articles that are kind of Mongo-faced or if that's changing, basically. Because, like, if you want to work with a relational database, until containerization came along, you kind of had a pretty lengthy setup process. It was a pain in the butt to even get your environment set up.

Starting point is 00:31:35 So it'd be like, hey, write your first web app. Steps 1 through 11 are setting up your web server and database. Hey, but in fairness, it wasn't much different for setting up Mongo or anything else either, right? It was still a bit of a process to set them up. Yeah, but they came around at a time when things were getting easier. Okay. And so people were getting used to just kind of hitting curl or sometimes they have like a one-line command in order to install Mongo. And it would be defaulted with passwords.

Starting point is 00:31:59 And now containers came around now. And so they were just able to jump into that kind of easier user experience because they were more modern and just kind of evolved later but my bet though is that like relational would still be something that most people have heard about more often than not because coming out even from a school point of view right like you're going to talk about normalization it's going to be a subject that's going to come up in a classroom. Yeah. Hey, do,

Starting point is 00:32:27 do either of you guys, did you read in the book when SQL, or the relational model, was first proposed by Edgar Codd? I did, but I didn't know if I was supposed to say it. So you can say it. It's shocking to me how long ago this actually was.

Starting point is 00:32:44 If I remember right, it was 1970. That is correct. 1970. And this wasn't the SQL language. This was more about the relational model, right? So you had relations, which in SQL Server, or not SQL Server, in SQL are known as tables. And then you had the unordered tuples, and those were your rows, right, with your columns. So that whole thing, the idea for it was introduced 40 years ago. 50. 50.

Starting point is 00:33:18 50 years ago. Oh, my gosh. 50. Oh, my God. Yeah. So you know what the crazy part is?

Starting point is 00:33:27 And this is what they said in the book: people doubted it would even work. That is awesome. If you think about that, that's okay. We still doubt if it works.

Starting point is 00:33:38 Yeah. They're like, this is not going to be efficient. And so people started doing it. I'm like, Oh crap. Okay. This is,

Starting point is 00:33:44 this is the way. This is the way. And then, in 2020, or, you know, 2000, people were like, you know what, yeah, maybe this isn't the best, most efficient way. But you know, that's what's so crazy. This is another thing that they pointed out in the book: SQL itself, its dominance has been around since the mid-80s, which means that it's been going on for over 30 years or right at 30 years, which in any kind of software is an absolute eternity. Most things don't hang around for five years, let alone 30. So that's pretty incredible. Can I get a redo real quick? Hold on.

Starting point is 00:34:21 This is the way. That's not better. But hold on, though, because, I mean, it's not like we're talking about a particular piece of software that's lasted. At this point, a better comparison would be to say, like, okay, C is still a thing, and C was originally started in year blah, blah, blah. Right? Like, you know what I mean? But I think what we're talking about here is the whole notion of tables and rows and columns, and the fact that that is still the predominant storage system used by most applications since the mid-80s,

Starting point is 00:35:07 like you can't point at a programming language that's been that popular since then. I mean, sure, C++ and C have been around, but that's probably not the go-to. Java's going to be more popular. C Sharp's, you know, more popular. JavaScript's probably more popular than both of them. So it's just... I mean, it depends on the thing, though,

Starting point is 00:35:23 and that's what I'm saying. Like, C came out in 1972, so it's almost as old as SQL. Right. Right. And it's still, you know,

Starting point is 00:35:33 widely used for like operating system development. Right. Right. Cause it's super low level. Right. So depending, I mean, depending on your case,

Starting point is 00:35:42 right. But in almost every application that you'll use anywhere, there's probably a relational database system behind the scenes using it or being used by it. Right. And that's pretty crazy when you think about it. Remember when Microsoft was, I don't know how to say this, experimenting with the idea of making the file system use a relational database under the covers? Was that Longhorn, really? Yeah, that was the code name for their SQL-based file system.

Starting point is 00:36:19 I thought it was great at the time. I was like, oh, I know how to do it. It still makes sense to me. You can search and join. It just seemed like a cool way of doing things. They ultimately ended up abandoning it. But I think they kind of brought some pieces of it forward.

Starting point is 00:36:32 It's crazy. I think that was Windows 8 era. That was a long time ago. No, it has been a while back. It was pre-Vista. Yeah. Wow. Wow. Good times.

Starting point is 00:36:47 So, bringing it back in. So the origins for the idea of kind of relating these entities together has its basis back in business data processing and specifically transaction processing. And we're going to be talking a lot about differences in transactions and analytics later. But when we say transaction here, what we're talking about is basically putting data in and then being able to get that data out. Kind of like a transaction you might have with a bank or some sort of exchange with a person.

Starting point is 00:37:15 Or a guarantee that if you put something in that it's there. Yeah, that you're able to get it back out. Yeah. You're able to do things like, you know, fail, like fail the whole process. So, like, if you've got five things to do, you can wrap them in a transaction. And if one fails, then you're able to, you know, back everything out should you choose.
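
The "wrap it in a transaction and back everything out if one step fails" idea can be sketched in a few lines of Python. This is only an illustration using the standard-library sqlite3 module; the accounts table, the names in it, and the run_steps helper are made up for the example and are not from the episode or the book.

```python
import sqlite3


def run_steps(conn, steps):
    """Apply every step inside one transaction; undo all of them if any fails."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            for sql, params in steps:
                conn.execute(sql, params)
        return True
    except sqlite3.Error:
        return False


# Hypothetical schema: balances are not allowed to go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

# Both steps succeed, so both are committed together.
ok = run_steps(conn, [
    ("UPDATE accounts SET balance = balance - ? WHERE name = ?", (30, "alice")),
    ("UPDATE accounts SET balance = balance + ? WHERE name = ?", (30, "bob")),
])

# The second step violates the CHECK constraint, so the first step
# (the credit to bob) is backed out as well.
failed = run_steps(conn, [
    ("UPDATE accounts SET balance = balance + ? WHERE name = ?", (500, "bob")),
    ("UPDATE accounts SET balance = balance - ? WHERE name = ?", (500, "alice")),
])

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After this runs, only the first transfer is reflected in balances; the failed attempt leaves no partial credit behind, which is the guarantee being described.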

Starting point is 00:37:34 Yep. And they call out that there's been a number of these competing storages over the years. So, a couple of them that they called out were the network and the hierarchical models from the 70s and 80s. And then, and this was funny, I didn't realize this, but apparently like object databases were a thing in the late 80s and early 90s, which this was before my programming time. So, you know, it's not surprising that I didn't know it. But I'd always heard that Postgres was an object-oriented database. That's when I first heard about Postgres.

Starting point is 00:38:09 I think it's got some origins there, and they kind of dropped them over time. Yeah, I think it is in your time, and you might not have realized it, Alan, because I definitely remember even like in the mid-late 90s hearing references to object databases. Yeah, I mean, I don't even know what they would have been. I hadn't really gotten started programming hardcore until probably right around 99, 2000, somewhere around there. I had imagined that

Starting point is 00:38:39 them calling it that was based around basically inheritance, so being able to inherit a table and also having properties. So you would have maybe methods associated with your tables and you can inherit those tables and those methods and enhance them with new columns and new functionality

Starting point is 00:38:52 and override and stuff like that. Which is one of the things that Postgres does do that things like SQL Server and Oracle don't: you can inherit a table, right? You could have a people table and then a manager's staff, whatever.

Starting point is 00:39:05 You could do that, which is kind of crazy. I thought they moved away from that, though, in Postgres. Yeah, it's not something that they necessarily show a lot. I don't know if you can still do it, but that's definitely what its origins were. So it's powerful, but I would imagine it also leads to a crazy amount of code on several sides of it. So, here's one that, like, I don't know if it ever counted as an object database, but it always made me think about it whenever the object databases would come up. But do you guys remember Lotus Notes? Oh, yeah.

Starting point is 00:39:39 I hated Lotus Notes. Yeah, a lot of people did. Yeah. Most people. Most people. It might be fair to say most people, but definitely a lot sounds fair. But I always wonder

Starting point is 00:39:49 and I don't remember exactly how it was classified, as an object database or not, but that was the one that I was... I know it was IBM's baby. Well, originally it was Lotus's baby.

Starting point is 00:40:06 Oh, which IBM bought. Yeah, that's right. Interesting. Well, here's a type of database that I had never worked with for sure. XML databases. Isn't that crazy? Wouldn't you just shoot yourself if that was even a thing that you had to mess with? I was trying to imagine what that would look like.

Starting point is 00:40:23 There's like XPath and things like that. And there are schemas, and there's a lot of things that are kind of in common with a database. So I guess it's not unheard of to think that you would be able to kind of search the XML database as if it were like one big document or something. Uh, no, thank you. And what they call out in the book, and this is true: there have been a ton of competitors over time, but still relational databases are kind of the thing, right? Like, they are not the only thing now. There are definitely other purpose-built type things, and they even call it out in the book: basically anything that you use on the internet, maybe even in an application on your system, more than likely has some sort of relational database behind it somewhere. And that's saying a lot. Yeah, I think

Starting point is 00:41:17 it's still the default choice. Like, if you don't tell me anything about your app, you just say, we're going to start an app tomorrow, what should I use? It's going to be, well, all right, I guess, you know, I mean, for me, SQL Server or Postgres. You know, that's just what I go with kind of by default. Like, if there's a reason I need to split it up or scale or build or fetch stuff differently, then, you know, that's another story, but that's still my default. So, curious real quick: you said SQL Server or Postgres. Any reason MySQL's not in that running? You know, I kind of got a bad taste in my mouth. I used to work with MySQL a lot back in the day, and it just had some awkward defaults back then,

Starting point is 00:41:53 like the MyISAM table structure or whatever the format for the tables was bad. And there were a lot of things you could do to kind of work around those, but I was just kind of, I was using MySQL in an era when it was kind of painful to do it. And like Postgres had just kind of been starting to upstage it. And so I don't even really know what MySQL is like now. But I just kind of had it in my mind that Postgres was my other choice there nowadays. Yeah, I'm kind of with you there on the MySQL. Cause I too used it in a time period where, like, you had to really want to use it. Like it was, you know, the,

Starting point is 00:42:33 the tooling just wasn't there. So it was a little bit more painful to use. You could do some amazing things with it. I'm not trying to discount that from it, but it was a little bit more difficult. But I think even now, though, if I had to choose, you know, without a doubt, like I would go to Postgres because, I mean, if you're looking for a free relational database that is fully feature rich, you know, on par with some of the ones that you would pay money for, I mean, Postgres is there, right? I don't know that you could say the same for MySQL. But, I mean, your mileage may vary.

Starting point is 00:43:11 But I was curious, though, and I looked it up for Lotus Notes, because that's a database system we've never, I don't, that might be the first time in our six years of recording that we've ever brought that up, but it's considered a document database. Really? Which makes a lot of sense, too.

Starting point is 00:43:31 As I was reading it, I was like, yep, yep, I can see that. It's been a long time. I don't miss those days. Yeah. I just looked at a quick comparison, and this is kind of a funny tagline comparison. MySQL is known as the world's most popular open source database. That's their thing. Postgres' thing that it's kind of known for is the world's most advanced open source database.

Starting point is 00:44:00 That's kind of interesting. From everything that I remember going through when looking at them, it seems like Postgres was more often the one that enterprises would adopt if they were looking for an open source one, right, over MySQL. It seemed like MySQL would be picked up by application developers doing everything. But if it was actually a big business doing it, they wanted Postgres, which was always kind of interesting to me.

Starting point is 00:44:30 Yeah. It's interesting, though. So maybe I should reevaluate and check back in. Let us know in the comments and maybe we'll in the book. I'm trying to remember, what was the name of that? It was a tip of the week that I think Joe had in the past one or two, maybe three episodes back for, oh, db-engines.com. Oh, yeah. And it was episode, it was the reliability episode. So it was episode 120. And I was curious to see like, oh, wait, how does it compare?

Starting point is 00:45:01 Because, you know, like if you go to their homepage, MySQL is the DBMS of the year 2019. I was like, all right, well, I guess more people are going to pick MySQL over Postgres. And here's the thing, though, and it's still higher if you go to the rankings page, even if you're not looking at their best of the year, the rankings page, Oracle 1, MySQL 2, SQL Server 3, Postgres 4. Number 5 is the first time you get to a document database, MongoDB. Right. And then the next one on the list is DB2, which is, again, back to relational.

Starting point is 00:45:36 So like most of the top 10 that you'll see there are relational databases. So it's pretty interesting. Like we said, they're really popular today. I'm sorry. There's only three in the top 10 that aren't relational. Yep. It's crazy. SQL, no.

Starting point is 00:45:53 Redis, Elastic, and Mongo are the three. So. Yes, I guess most people would pick MySQL is the point. This episode is sponsored by Datadog, a monitoring platform for cloud-scale infrastructure and applications. Datadog provides dashboarding, alerting, application performance monitoring, and log management in one tightly integrated platform so you can get end-to-end visibility quickly. Visualize key metrics, set alerts to identify anomalies, and collaborate with your

Starting point is 00:46:25 team to troubleshoot and fix issues fast. Now, hey, have you ever thought about using a Datadog for a service like this? How about monitoring your DevOps workflows and pipelines? So if you are an Azure DevOps user, you can monitor your workflows and pipelines with Datadog to verify that your builds are actually working and that the deployments are happening the way you expect that they would be happening. I mean, that was, saw that blog article. I was like, oh, that is a crazy, awesome use case for Datadog that would have never dawned on me. But you're like, oh, hey, outlaw, you know, Azure isn't my thing. We're all on AWS.

Starting point is 00:47:05 Don't worry. There's another article that you might like. I'll include some links to these. But for monitoring Amazon EKS on Fargate with Datadog. Like, how about that? That's so awesome that you could like monitor your entire Kubernetes cluster with like serious visibility into what is going on inside of all

Starting point is 00:47:27 those nodes too with Datadog. And it's funny that you mentioned the blog. I had an article I wanted to mention too, which was actually all about tagging theory or tagging strategies, basically for observability purposes. So services like AWS or like Kubernetes, all the major cloud providers give you the ability to tag things, like basically key-value pairs. And they've got a really great and insightful read about basically some suggestions for how to tag your architecture so you can use it effectively when you're trying to track down problems and just see what's going on there. And so definitely recommend checking that out. Yep. So try it yourself today by starting a free 14-day trial and also receive a free Datadog t-shirt when you create your first dashboard. Yep. So again, head over to www.datadog.com slash coding blocks to see how Datadog can provide

Starting point is 00:48:21 real-time visibility into your application. Again, that's www.datadog.com slash coding blocks to sign up today. All right, and now it's time for me to please ask you for a review. And we try to make it easy for you. We know it's a pain, but you know how much this means to us. If you're a first-time listener, then just know that we love reviews and we need them. And you're going to be hearing us ask you about them a lot because they mean so much to us. And we really appreciate when we get them.

Starting point is 00:48:52 And please consider it. Go to codingblocks.net slash review. We try to make it easy for you. You can find links there to review us either on iTunes or Stitcher or Podchaser or whatever your choice is. We appreciate them all and thank you very much. And with that, we head into my favorite portion of the show, Survey Says. All right. So let's see.

Starting point is 00:49:20 A couple episodes back, we asked, what is the single most important piece of your battle station? Because if you recall, we had just talked about our builds in the episode before, I believe. So your choices were the keyboard, of course. Johnny Five, need input. Or the mouse. My clicking game is on point. Get it? Or the monitor, I see dead pixels. Or obviously not the peripherals. It's all about the tower of power.

Starting point is 00:49:57 Or the chair, or should I say the CEO chair. And lastly, it's all about the desk. Nothing else matters if it's sitting on a TV tray. All right. So, Alan, how about you go first? What's your pick and your percent? This one's a hard one, man. Like, I could see it being one of two things. Actually, probably one of three or four things here.

Starting point is 00:50:23 I'm going to go with it's all about the tower of power, and I'm going to say 35%. Okay. Uh, that's probably right. So can I just choose the same thing? Hey, you can, you can Price is Right him.

Starting point is 00:50:47 We can either win together or lose together? Yeah, I'm fine with that. We will share or lose this victory together. You can outdo him with the percent. You want to go to 36%? No, no. I don't feel good about that. I have friends on the show.

Starting point is 00:51:08 I think it's 35%. Are you going to pick the exact same number? Yeah. That's no fun. No, one of you needs to be higher. No, this is really fun. This is 2020. That's what it is.

Starting point is 00:51:19 Yeah, I'm changing things. I'm being direct. We're breaking the rules. I'm pretty sure that the Price is Right rules, you can't do that. You can't pick the same. Hey, we got no Bob Barker here. Yeah, you do. He retired anyways.

Starting point is 00:51:29 Did you not hear the way I announced this section? We absolutely do. You get a happy Gilmore's here in a second. Even though he wasn't the guy that would say that. All right. Well, so both of you are going to be boring and pick the same answer of, obviously, it's not the peripherals. It's all about the tower of power. And you're both going to pick the lame answer of 35%.

Starting point is 00:51:53 Right? Yep. I have that right? All right. And if we're both right, we have to do a spinoff according to Price is Right rules. The Showcase Showdown. Yeah, no, you're right.

Starting point is 00:52:06 Showcase Showdown. All right, well, you're both wrong. You both lose. Was it the monitor? No. Chair? No. Do you want to keep guessing?

Starting point is 00:52:19 Wow. The keyboard. No. Really? The keyboard. Of course, it's the keyboard. No.

Starting point is 00:52:26 How many times did we talk about mechanical keyboards versus chiclet keyboards? All the opinions that people have about those, of course, that was going to rank high. No. How high was it? It was 28% of the vote. Oh, okay. So this is all spread out then. We've been hacked.

Starting point is 00:52:45 What's number two then? So number two and three is a tie at 24%. And that's going to be the tower of power and the monitor. Very good. Wow. Yeah. The monitor and the tower of power. Well, now I'm scared because we've been hacked.

Starting point is 00:53:04 So if you get any emails about diet pills or anything, it's not actually for me. If you get any friend requests from me and we're already friends, don't accept it because I've been hacked. Well, if the worst of the spam email that you're getting from us is diet pills, then you're probably all right. Yeah, you do it. Because there's much worse. Yeah, man. There's some scary ones. That's really good.

Starting point is 00:53:30 I was not expecting that. No, that's a surprise. All right. Well, how about for this episode's survey? Since we're talking about data models, we ask, which data model do you prefer? And your choices are. Hold on. I almost feel like, because it's keyboards, hey, our next survey, we won't usurp this one, our next survey is going to be which keyboards do you use,

Starting point is 00:53:53 right? Like, oh, the one they ship, the cheap... Now, huh. Can we do that one now? I need to know. I know, it's actually bothering me now. Like, there's some things I don't know about keyboards, apparently. No, no, but it will be our next survey. We must do it. So this one. All right. Sorry. Sorry to interrupt.

Starting point is 00:54:13 Back to our regularly scheduled program here. Outlaw, I take it you're making notes for the next survey. Yeah, I am. We never have a survey ready ahead of time. I know, this is amazing. But this actually, I'm going to die trying to wait on these answers because we're talking probably another two months. Yeah, what's going to happen, though, is people have already heard this survey idea and everybody's going to, like, just write it in as their comment, or there'll be a whole

Starting point is 00:54:43 slack conversation about it. And yeah, that's where it'll happen. Leave comments on this episode. If you'd like, that's fine too. We're still going to have the survey. Yeah. All right.

Starting point is 00:54:53 All right. So for this episode's survey, since we're talking about data models, we ask, which data model do you prefer? And your choices are the relational model. I love many, many joins. Six normal, blah, blah, blah, blah. Six, I can't even say it. I can't even say the words. Sixth normal form, all the things. Or the document model. I'll worry about the data

Starting point is 00:55:21 structure when I read it. Or graph model. It just sounds cool. Like, oh, you're still using relational data models? That's cute. Or polyglot persistence. I'll use what I think makes sense for the use case. I love that you shamed somebody out of a relational data model. Oh, that's so yesterday.

Starting point is 00:55:47 That's cute. 1970 called. What was his name again? Edgar called. He wants his data back. Yeah. Got a Mr. Dijkstra on the phone for you. Oh, man.

Starting point is 00:56:05 You sound angry. Sounds angry. This episode is sponsored by educative.io. Every developer knows that being a developer means constantly learning new frameworks, languages, patterns, and practices. But there's so many resources out there. Which one should you choose? Meet Educative.io. Educative.io is a browser-based learning environment allowing you to jump right in and learn as

Starting point is 00:56:34 quickly as possible without needing to set up and configure your local environment. The courses are full of interactive exercises and playgrounds that are not only super visual, but more importantly, they're engaging. And the text-based courses allow you to easily skim back and forth in the course, kind of like a book. So there's no need to scrub through hours of video to get to the parts you care about. And incredibly, all their courses have free trials and offer a 30-day return policy. So there's no risk to you.

Starting point is 00:57:02 Now, here's even better though. So when we mentioned in the past that they have introduced subscriptions, they've extended the offer. So I don't know how much longer this offer is going to last. So get it while it's while the getting's good. But yeah, they have 50% off their annual subscriptions. So I mean, you could kind of think of it as like, Hey, you could go to educative.io slash coding blocks and you could get 20% off of a single course if that's the way you want to go. Or you can get 50% off of every course by signing up for a subscription during this limited time offer. And, oh, by the way, I should mention too that when you do lock in for that subscription, you're locked in at that discounted price for as long as you remain a subscriber. That's really nice.

Starting point is 00:57:49 I want to mention, too, that they actually make a lot of updates to their courses. So I've noticed that a few system design problems in my favorite course, Grokking the System Design Interview, have been added since I looked. And I can see that they just added one for Ticketmaster now that I haven't seen. So that's really cool. And I also want to mention that you're able to preview many of the chapters. So it looks like seven out of 31 of the sections in this particular course that I really enjoy are previewable. So you can get a sense of what all those other sections are like. And as I've mentioned before, and as we talk about on the show pretty often now that we're discussing the book, there's a lot of big systems here like Twitter, YouTube, or Netflix, or designing a web crawler that are just highly relevant to the kind of stuff that we're talking about lately. So you should definitely give that a look if that's something you're interested in.

Starting point is 00:58:38 Yep, it's a great way to learn. So start your learning today by going to educative.io slash coding blocks. That's educative, E-D-U-C-A-T-I-V-E dot I-O slash coding blocks and get 20% off any course. So let's get into the discussion about NoSQL, right? I think we've covered document databases enough. I'm sorry, relational databases enough. And now we want to do a deep dive on Lotus Notes. And it's only fair, right?

Starting point is 00:59:13 This is the big debut. It will be the next database. It's the big debut on coding blocks for Lotus Notes. You're welcome. You're welcome. So, yeah, NoSQL is the latest competitor to relational databases, or is it? We'll find out here in a little bit. But here's an interesting thing about it. And it always bugged me, the name of NoSQL.

Starting point is 00:59:34 And I'm glad that they actually elaborated on it in the book because I truly hate the term NoSQL. It was originally intended as a catchy Twitter hashtag for a meetup about open source distributed non-relational databases. Okay. I love how that, like I loved how that was, how it caught on as a term, right?

Starting point is 00:59:56 Like, right. I guess we can't live without Twitter now. Like that, that's what the world has come to. Like our, our terminology is now coming from hashtags that derive on Twitter. So, you know.

Starting point is 01:00:08 You know, remember Weight Watchers, or, you know, you've heard of the company Weight Watchers. It's pretty obvious what they do. If you're not familiar with the company by the name, they're rebranding to WW. Really? Yeah. That's terrible. It works better on social media. Ah, interesting.

Starting point is 01:00:28 So www.ww.com. Yes. Terrible. So what they ended up doing with NoSQL, though, is they sort of repurposed what NoSQL was. So now instead of just NoSQL, it stands for Not Only SQL. So they said, what'd they say?

Starting point is 01:00:53 They retroactively, like, repurposed the name for that. So yeah, I mean, whatever. NoSQL basically means anything that's not a relational SQL data type thing, right? Which doesn't, the part about that that is crazy is it doesn't indicate the type of storage, right? Like when you talk about SQL, you're talking about tables, rows, and columns.

Starting point is 01:01:19 When you're talking about NoSQL, it could be anything from a search engine like Elasticsearch to a key value pair thing to a JSON document storage to BSON document storage. It just means that it's not a relational database. And it's funny and awkward how many non-relational databases, so how many NoSQL databases have some sort of SQL language variant that works for them. Almost all of them. Yeah, almost all of them. Yeah. Almost all of them. Yep. Wasn't there something else that we discussed recently where,

Starting point is 01:01:49 um, I'm trying to remember what the, what the name of it's called. I keep thinking of anagram, but that's not it. Uh, where it was, they made up the,

Starting point is 01:01:59 the name for it after the fact, like everybody thought it was a, some abbreviation where every letter meant something, but it really wasn't. And they made one after the fact. And there's an actual term for that. And I can't even remember what the, the,

Starting point is 01:02:15 the normal form is called. Cause it's not an anagram and it's not an abbreviation. Yeah. It's going to bug me now. Uh, you know what's interesting about this? So talking about these NoSQL databases and what you just said, Joe, about the fact that they all have a SQL type thing on them,

Starting point is 01:02:35 when you start looking at big data implementations, if you dig into the tools around Hadoop or AWS, Data Lakes, or anything like that, almost every single one of them has some sort of foundational tool that is SQL to be able to interact with all of them, right? So it's here to stay for a long, long time. So they said, hey, so why are people creating these NoSQL solutions, right? And this is kind of interesting.

Starting point is 01:03:08 One was a need for greater scalability than what traditional RDBMSs could do. So, you know, we've talked about this whole vertical versus horizontal scaling. A lot of times with your Oracles, your SQL Servers, those types of database systems, if you wanted to run faster, you needed more hardware on the same machine, right? Faster processors, more RAM, more space, faster drives, whatever. But then you kind of hit a limit, and there's a couple reasons for that, which I don't really go into too much, but just having transactions and a transactional nature is part of that. And also because relational databases, you kind of take your data and you split it up into a

Starting point is 01:03:51 normal form. So you've got like your addresses over here and your people over there and your orders over there in separate tables that you can then bring back together. If you run a sort of query on something that you've scaled, and you've split that information, you've had to put that information in different spots. And so if you've got to go and, for every query, execute and find information from multiple servers, you're introducing a lot of latency and a lot of overhead. And like, what if one of them's down? But a lot of times you'll see in document databases in particular, and this isn't for all of them, but for most of them it's very common, they actually store the data all together.

Starting point is 01:04:28 So the document is stored and retrieved whole. There's no putting it back together. And what that means is if you've got 10 different servers and the information lives on one or potentially multiple if it's replicated, then you can go and fetch it from one spot and you go and put it in one spot. And it makes a big difference when it comes to getting that information back out.

Starting point is 01:04:52 Yeah, and that's why you can have these very large data sets, right? So if you have a database that has a few tables, like you just said, maybe users and orders, you got 20 million users and you got 100 million orders. That's a lot of joining that has to happen, right? To bring that data back and to be able to sort it. And hey, if you just want the first 100, guess what? You're going to get the first 100, but it has to do that join and sort everything to know what that first 100 was, right? So it's a super expensive operation. Whereas in that distributed model, like what he was just talking about, you know, hey, you want customers and orders.

Starting point is 01:05:30 Well, you might have a customer and the orders might just be attached to that customer in that same data blob there. So yeah, that whole need for greater scalability was to be able to handle super large data sets and also fast writes. So you're writing across multiple machines. You have the IOPS available across all those machines. Now,

Starting point is 01:05:54 to be fair though, you could solve that same situation in a relational database. Like you didn't have to go document to do it. If you were to like, I mean, it might not be pretty, but you could like have one column of data where it's like, okay, hey, here's an XML blob or a JSON blob that is like all of your orders for that. So you get that one record that is the customer record. And there's this other column that you're going to have to interpret, right? The big difference between what you're saying, yes, totally.

Starting point is 01:06:30 You could absolutely do that, right? You have a text blob or some binary blob or something that's available in that data row, right? You could do that, but now it's up to your application to be able to do anything with it. Whereas if you're using one of these, you know, NoSQL solutions, it handles all that garbage for you. Right. Like, well, but kind of, and that's why I made the joke in the survey that like, if you go with the document model, right? Like with the relational, you're guaranteeing the structure of the data on the writes,

Starting point is 01:07:02 but on the document model, you have to enforce the structure of the data as you read it. And so either way, if you like did this super gross thing that I said that you could do where you created a column and, you know, had the big JSON blob or the XML blob in it, you know,

Starting point is 01:07:21 that's where you're kind of guaranteeing or enforcing that, like, hey, when I read this column, I'm going to enforce what I think the structure of that should be at the time. But what you didn't solve there, though, was the scalability, right? Like this whole notion that, hey, that server no longer can handle the capacity. Now you're scaling out. So you fix the thing where you're not necessarily joining the data, but you're not fixing the thing where the data set grows so large that it just performs like garbage on that one server. Right. So that was part of the equation.
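To make the "big JSON blob in a column" idea a little more concrete, here is a minimal Python sketch using only the standard library's sqlite3 and json modules. It is an illustration under stated assumptions, not anything from the book or the show: the customers table, the orders_json column, and the read_orders helper are all hypothetical names. The point is that the database accepts whatever text it is handed, so the structure gets enforced when the application reads it.

```python
import json
import sqlite3

# Hypothetical schema: one row per customer, with every order crammed
# into a single TEXT column as a JSON blob.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, orders_json TEXT)"
)
conn.execute(
    "INSERT INTO customers (id, name, orders_json) VALUES (?, ?, ?)",
    # The second order is missing its total; the database does not object.
    (1, "Jane Doe", json.dumps([{"order_id": 100, "total": 25.0}, {"order_id": 101}])),
)


def read_orders(conn, customer_id):
    """Schema-on-read: validate the blob's structure at read time."""
    row = conn.execute(
        "SELECT orders_json FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return []
    valid = []
    for order in json.loads(row[0] or "[]"):
        # Enforce the structure we *think* an order has; skip anything else.
        if (
            isinstance(order, dict)
            and isinstance(order.get("order_id"), int)
            and isinstance(order.get("total"), (int, float))
        ):
            valid.append(order)
    return valid


orders = read_orders(conn, 1)
```

In this sketch the malformed order is silently dropped at read time; a real application might log it or raise instead. Either way, the check lives in application code rather than in the table definition.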

Starting point is 01:07:51 Well, but that's but also running on multiple servers, though, like that's not necessarily tied to the data model. Right. So you could partition the data and potentially have it like one big table and then kind of split it up amongst multiple nodes. But I think that when you do something like that, if you just kind of cram all the data into like a column, you start to lose some of the benefits of relational database. Like you're not able to relate that data as easily,

Starting point is 01:08:16 you're not able to query as easily. And NoSQL databases, like document databases, have mechanisms, like they're kind of built around this notion of like MapReduce, which is just kind of designed to deal with multiple nodes and kind of running these things in parallel and putting it all back together. So it was set up with that idea from the get-go. Which, by the way, for the Postgres conversation we had earlier,

Starting point is 01:08:40 that's actually one of the real big boons for it: a lot of times people that want an RDBMS plus a NoSQL solution, they'll go with Postgres because it has really good support for doing document queries within its own columns. So if you have a JSON column, then you can query that thing. So, you know, it is kind of interesting. So the lines have blurred a little bit because the relational database systems out there, the people that have made them have looked at it and said, oh, people need this other stuff too.

Starting point is 01:09:13 You know, let's add these features on. So there's a lot of bolt-ons to the RDBMS world now. Yeah. Have we talked about NoSQL versus SQL before? The question's come up a lot. We've been asked that a lot in the Slack groups, and it's always a hard thing to answer in a few sentences because people are like, hey, should I use Mongo or should I use Postgres?

Starting point is 01:09:37 And it's like, man, what are you trying to do? And we'll get into some of the differences of them here in a minute because I think it's worth talking about. As a matter of fact, I think wrapping up this episode, we're going to sort of talk about the use cases and where they come in. So we'll touch back on that. I mean, NoSQL has definitely come up more than one time. Yeah.

Starting point is 01:10:00 It comes up a lot. And we're going to be talking about a lot more kind of coming up because as we're comparing different data models and different types of applications. But like what Al said about like whether you enforce your schema on read or on write is really important. And it's also as much of a weakness in some ways as a strength in others. So like sometimes you have highly varied data. And one example I've heard given before, it's like if you're like an Amazon and you've got a camera that has like these 150 different specifications and sizes and information about it, and you've got a bicycle, which has this 150 other completely disparate set of, you know, points of data, the only things they have in common are like a picture and a name and a price, then it's good that you can kind of like store these things

Starting point is 01:10:45 without having such a rigid schema. And frankly, some of the properties might have different kinds of constraints on them. So maybe like a camera, you know, they have a size property, but it only really makes sense in millimeters. And the bicycle has much bigger parts. And so you're dealing with inches or feet or something. And so it's kind of good to be able to break that down at a more granular level.
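As a rough illustration of the camera-versus-bicycle example, here is a small Python sketch of what two such product documents might look like. Every field name and value below is made up for the example; the only thing taken from the discussion is that the products share a name, a price, and a picture, while the rest of their attributes (and even their units) differ.

```python
# Hypothetical product documents: only name, price, and picture are shared.
camera = {
    "name": "Camera",
    "price": 499.0,
    "picture": "camera.jpg",
    "specs": {"sensor_width": {"value": 36, "unit": "mm"}, "megapixels": 24},
}
bicycle = {
    "name": "Bicycle",
    "price": 350.0,
    "picture": "bicycle.jpg",
    "specs": {"frame_size": {"value": 21, "unit": "in"}, "gears": 18},
}
catalog = [camera, bicycle]

COMMON_FIELDS = ("name", "price", "picture")


def listing(product):
    """Pull out just the fields every product is assumed to have."""
    return {field: product[field] for field in COMMON_FIELDS}


def spec_names(product):
    """Each document carries its own attribute names; no shared columns needed."""
    return sorted(product.get("specs", {}))


listings = [listing(p) for p in catalog]
```

In a single relational table, each of those per-product attributes would need its own mostly-empty column (or a separate attribute table), which is the column explosion the hosts go on to mention.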

Starting point is 01:11:07 And that's something that doesn't really fit very well into relational database because you end up getting this like columnar explosion. You know, it's funny. That thing that all of what you just said reminds me of that article I wrote several years ago now. It's been like three or four years ago about the entity attribute value schema and having multiple types

Starting point is 01:11:27 of products in a product database. It's still one of the most hit articles on our site, which is crazy because I wrote it a long time ago. But I think I'm going to revisit it because with a lot of the things that we've learned over the years, like I've even answered some of the questions that have come up on that post and basically said that, you know, people are like, hey, did you finish? Like, no, you know, I saw something shiny somewhere else and I haven't gotten back to it. But basically what it boils down to is I think it'd be a mixture of the two, right? When it comes back, right? Like there's probably some good pieces for the relational model

Starting point is 01:12:05 and there's some good pieces for a document type model. So yeah, anyways. I'll include a link to it. And it was called the database schema for multiple types of products. It was the article title. Hey, I wanted to take a moment real quick

Starting point is 01:12:20 because it was going to drive me nuts if I did not remember what the name of that word was. So an acronym is the first word I was trying to remember, right? Like if you make up a word that's basically built from other words, like IBM, you know, International Business Machines, right? But there was something that we talked about just recently, which is what the NoSQL is, where they changed it to be Not Only SQL, so they reverse-engineered that, which is a backronym. A backronym, that's awesome.

Starting point is 01:12:53 Yeah. Interesting. I swear PHP was personal homepages when I first heard about it, and then it became like PHP Hypertext Processor or something. Maybe it was PHP that was the one that we were talking about. I can't remember. I can't remember.

Starting point is 01:13:07 I can't remember either, but that's also driving me nuts. These are the things that keep me up at night. Right? Unfortunately, these things don't keep me up. It is. Yeah, the recursive definition was a backronym. A backronym.

Starting point is 01:13:21 A backronym. All right, so on to the second. So what's the need? The second one here is there's a big desire for FOSS. If you ever see FOSS, that's free and open source software, right? As opposed to the very expensive commercial database systems that a lot of companies use, the Oracles, the SQL Servers, all that. Like those things have a price tag on them. Very comma, very comma expensive. Yes. Maybe another comma. And they're getting more so as time goes on.

Starting point is 01:13:50 And you know, man, I wonder what you guys think. I half wonder if it's because they're just trying to push everybody into the web services, right? Because being completely honest, it's way cheaper to go with something like Azure Cosmos DB than it is to buy a bunch of on-prem licenses for SQL Server nowadays. So I don't know. I'm curious if that's the marketing strategy there. Cause they're like,

Starting point is 01:14:16 Hey, if we can just keep it all up here, we can keep our own telemetry. We can find out what everybody's doing. I don't know. I don't know, man. I mean,

Starting point is 01:14:24 maybe, but then I'm like thinking too, like, Oracle, for example. I mean, Oracle has been such a major and dominant player in the database world for how long now? Ever. And yeah, and like I'm trying to think like, okay, well, what is there? Wow. How do they fit into that scenario that you just... Don't they have a cloud? I don't know if they do. They do have cloud offerings, but, well, yeah, I don't know. I mean, Oracle's even more expensive than SQL Server. So, I mean, like I've seen the price tags on these things. They're not for the faint of heart. So the next reason that people

Starting point is 01:15:08 want these NoSQL things is there's these specialized query operations that just don't work well in a relational model, right? Like if you're doing some application things, they just don't map well. And another one is the shortcomings of relational models. So like what Joe Zack was just talking about where you have, you know, products that have just crazy amounts of metadata that are associated with them, trying to keep a schema up to date. Every time a new product hits the market that has a new piece of metadata that doesn't fit with what you already have, like trying to change that schema all the time is not only difficult, but it's also error prone, right? Like it's really easy to blow up your system when everything's not in sync properly. So those are the needs that they had there. They mentioned, too, that different applications have different needs and may require different data models, which I thought was a nod to polyglot persistence, where you might have the same data in two different databases and you've got to sync those.

Starting point is 01:16:13 The reason you do that is that those different databases have different purposes and different needs and different formats, different data storages. You might have data in a search engine for searching and also a relational database for its transactional abilities. So that makes a lot of sense. Yeah, I was going to specifically call out search indexes as, you know, a big reason why you might have multiple copies of the data. Yep, and that's a huge one. And that's actually one of the things I want to point out: in the book, they actually said different applications will have different data models. But that's where I feel like things are changing a little bit, whereas it might be the same application, just different needs in the application are using different data models. Right. Like you could totally have an application that does orders. Right.

Starting point is 01:16:59 And maybe its primary persistence layer is the database because it's transactional. But when you go to look up those orders later, that could be backed by a search index, right? Or something like that. So that's where I feel like even just slices of an application now can pull from different places. But then keeping those things in sync behind the scenes can be a bit of a pain. Well, even, I don't know if this would fit. Because same application, your e-commerce site example that you were giving, but I was thinking like,

Starting point is 01:17:30 well, you might have the products in a search index. And placing the order might be done in a transactional relational database. I mean, being perfectly clear on this, if you look at how some of the big companies do it, right? So you're absolutely right. Your product catalog might be in a search index. Your orders might also be in a search index. Ultimately, the transactions might happen in a database, but they're probably going to a persistent queue first, right? So I mean,

Starting point is 01:18:02 there's all kinds of data models happening just for this one thing that feels very simple to you, but behind the scenes is hyper complex because of how everything needs to happen every step along the way. Well, even going along the lines of like the multiple copies of the same data, to add onto that idea, even the concept of like, okay, well, you know, from the user's point of view, yeah, they might hit a search index to get the products, but from an administrative point of view, uh, to manage that product catalog, you might be hitting, uh, you know, a relational database to, you know, for the edits or whatnot, or, you know, I'm excluding the queuing mechanism that you're, you know, that

Starting point is 01:18:45 layer of complexity, but yeah. Yeah. These things aren't simple. When you start getting into this, it's very much a lot to manage. So we're going to move into the section here, which they called the object relational mismatch. And I thought this was pretty interesting. So they said in the book, and I don't know if this is true or not. I haven't really ever thought about it that much, but they said most applications today are written in some sort of object oriented programming language. I don't know if that's true or not, but let's assume that it is. I mean, Java is so popular. Why would you not agree with that? What was so popular? Java. It's so popular.

Starting point is 01:19:25 Why would you not agree with that? I don't know. I guess like when I think about this stuff nowadays, I mean, you have functional programming languages. You have your Perls, your procedural programming languages of the world. I don't know. I guess like when I heard that, I was like, maybe, maybe not. I don't know. I live in an object-oriented world, but I don't know if everybody does. So I don't know if you heard about, like, VB.NET was pretty popular. It was and still is apparently. I don't always do a good job of it. I don't think any of us do, but that's fair. So what they said about this, this is one thing that is near and dear to our hearts.

Starting point is 01:20:09 We've talked about this several times in the past. There's typically a translation layer that's required to map your data from its storage engine to your object-oriented world. And in this case, they're talking about relational databases. So you're talking about ORMs, right? You have your database tables, and then that's somehow got to make its way into objects that you can interface with in your application. Yeah, that kind of stinks, right? It's like objects translate to this other model, and then we translate back into objects. So from an app developer's perspective, it's kind of like, okay, so why am I doing this? But for anyone else, from a data perspective, who's working with that data in the database and doing more data-centric things, then it makes more sense for the format. It's just been frustrating and it's

Starting point is 01:20:53 been a point of frustration for a long time yeah and and the crazy and they call it here and it's sort of interesting because this goes back to almost like an electric type of area, is they call it an impedance mismatch. Just this whole notion of taking stuff from a relational model and putting it in your object model. And the thing to me that stinks about it is a lot of times, and Joe, you alluded to this earlier, is you think about the database first. So you

Starting point is 01:21:25 create that stuff and then that sort of, you're thinking about your application. So you designed your database tables a certain way and then that flows out to your application. Well, ultimately what happens is things change in your database and they no longer match what you're actually trying to model in your application anymore. And that, to me, is where it all goes sideways, right? Like this thing that used to mean something now means something else, and there's no good way to break that. So you didn't think about speakers then when you thought about the impedance mismatch? Oh, absolutely.

Starting point is 01:22:02 Ohms? Yeah. It's electronics, right? It's electric currents. No, no, no, just speakers, though. Ohms. Yeah. It's electronics, right? It's electric currents. No, no, no. Just speakers, though. Just speakers. I was thinking about all the subwoofers that I can possibly fit.

Starting point is 01:22:14 Did you ever try and hook up multiple speakers to the same amplifier and you were like, why is this not working the way I thought it would? Right? Did you? Yeah, absolutely did. Are you kidding me? Run them in parallel or in different series? Yeah, absolutely. Yeah, I don't know about the problems that you're describing, but most speakers, those problems. Yeah. So yeah, the thing about this is like they talk about

Starting point is 01:22:41 you've got these frameworks out there and you've probably heard of several of these. So you got Active Record, which is a big Ruby one, Hibernate, which is a Java one, Entity Framework, which is our C# favorite. There's a ton of these, right? And they're all there basically to reduce the boilerplate and to help reduce the amount of code that you actually have to write to use the stuff, but none of them fully fix the problem that there is this mismatch between a database and your objects. Yeah. I mean, this is going back to the, you know, maybe not even going back, but just, you know, how like specific to Entity Framework, right? Like some people absolutely hate it because of the ways that it, you know, can go retrieve data if you're not careful about how you write the code that is using Entity Framework under the covers, right? And you could say that it's because of this

Starting point is 01:23:37 mismatch in like how you're trying to represent the data in memory versus how it was represented in the relational database. You know, that's causing the issues there. Yeah. It's not an easy, no, it's complicated. I mean, it really isn't. It's just, I think as long as you have this, where your database drives your application or whatever, you're always going to run into these things and you can model things pretty close. But going back to our domain driven design discussions before, it's not necessarily what you want to do anyways. Right. Just because something's named customer in the database doesn't mean that's what you should be naming it in your application, right? Like if you're writing something for, let's call it the shipping department, right?

Starting point is 01:24:31 Or actually maybe not even shipping department, let's call it a vendor, right? A customer means something different for a vendor than it does for your customer service department, right? So just because the table's named something over here doesn't mean that's how it should be represented in your application. And then that means you're going to be building more layers on top of that to hide the fact that it was even called customer in the database, you know? So I don't know, man, it just gets into a really nasty place, but these ORMs, at least in my opinion, are pretty necessary to make your applications not completely garbage. Well, just because you end up kind of creating your own anyway.
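For a sense of what "creating your own" translation layer tends to look like, here is a minimal hand-rolled mapping sketch in Python, using only sqlite3 and dataclasses from the standard library. It is not how Entity Framework, Hibernate, or Active Record work internally; the customer table, the Buyer class, and the find_buyer helper are hypothetical names chosen to show a table called customer being exposed to the application under a different, domain-specific name.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Buyer:
    """What the (hypothetical) vendor-facing code calls a row of `customer`."""
    buyer_id: int
    display_name: str


def row_to_buyer(row):
    # The translation step: relational columns in, application object out.
    return Buyer(
        buyer_id=row["id"],
        display_name=f'{row["first_name"]} {row["last_name"]}',
    )


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Jane', 'Doe')")


def find_buyer(conn, buyer_id):
    """Hide the table name and column layout behind a domain-level call."""
    row = conn.execute(
        "SELECT id, first_name, last_name FROM customer WHERE id = ?", (buyer_id,)
    ).fetchone()
    return row_to_buyer(row) if row is not None else None


buyer = find_buyer(conn, 1)
```

Even this tiny version has to know both shapes, the table's and the object's, which is the mismatch being described; an ORM mostly automates this boilerplate rather than removing the gap.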

Starting point is 01:25:12 Right. So there was the statement that you weren't sure about where most applications written today are in object-oriented programming languages. And I joked about VB.NET. So I thought, oh, this might be a fun little thing to go back to. Excuse me. The TIOBE Index for December of 2019. So extremely current.

Starting point is 01:25:40 Extremely current. And the top 10 programming languages. Now, don't go and cheat. Don't go look at it. Is that what you're doing? Don't go do that. Stop. Not yet.

Starting point is 01:25:57 I'm almost there. Stop. Stop. But what do you think number one is? It's got to be VB.NET. I mean, of course. I mean, I'm not going to fault you for selecting that choice. But Joe? VB.NET? No, guys, I told you earlier it's Java. It's clearly Java. So top 10 goes like this: Java, C, Python, C++, C#, Visual Basic .NET, JavaScript, PHP, SQL, and number 10 is Swift. So out of that top 10, according, well, I mean, this is the world-renowned TIOBE Index.

Starting point is 01:26:40 Visual Basic .NET? There's no way. Out of that top 10, though, only, what, three of them? Let's see. C and SQL. So just two of them are not object-oriented. Well, PHP isn't technically object-oriented, right? They kind of bolted it on like 20 years ago. Yeah. I mean, 20 years ago

Starting point is 01:27:08 is long enough that I would call it. I need to rewind my statement on Visual Basic. Maybe it is. It may be because I don't write desktop applications, that's why I don't see it as higher. Maybe. And TIOBE is not, right, not necessarily used to that, but just the whole list in general, a lot of it's so far from my experience in the world that I have a hard time with it. But maybe they're looking at some stuff. Delphi is number 12. Yeah, number 12 is not. No, there's no way that's right. I mean, I don't know why you would not choose this. Obviously, this is correct. Yeah, I just can't see it.

Starting point is 01:27:59 All right. So, again, we poo-poo on TIOBE. I'm going to have a link to the ultra-accurate world-renowned TIOBE Index for December 2019. And you can, you know, make your own judgments. That's beautiful. All right. So in the book, they talk about like the difference between a relational implementation of like LinkedIn's resume type thing versus a NoSQL implementation of the same type thing. And I figured, you know, let's not copy everything out of the book. Let's talk about like something that we've all sort of worked with, which is like orders, right? Customers and orders and that kind of thing. And talk about some of the good things for relational databases,

Starting point is 01:28:51 the bad things, and the good and bad for NoSQL. What do you guys think? I mean, I definitely like the scalability. I like the locality of the data, so you'll be able to kind of fetch the most common stuff with your data back. So you're talking about the NoSQL approach? Yeah, specifically about NoSQL. And the reason why I started with the SQL there is because I'm so used to relational that it's like not even fair to be like, well, that's just because that is the

Starting point is 01:29:21 way. It's important to call it, because I don't know that we've said this yet, but when you refer to that locality, if we're talking about the document model, there was the example that was given earlier about you have the customer record and maybe that document, or I should say customer document, not record. But that document includes all of the orders along with it. Right. And that's the locality: those orders being already there, coming along for the ride for free when you query and get that single customer document. So maybe we should back up for people that haven't messed with NoSQL or document databases or anything like that in the past. We're talking about in a document database, basically every record is a document. And if you want to visualize that thing, if you're comfortable with JSON, that's probably a good one. So you might have this object and let's say this is the customers table.

Starting point is 01:30:26 That document is going to represent each one of us. Like each one of us would have a record, right? There's going to be a Michael record and Alan and Joe. And then in that, there's going to be a property called orders, right? And that's going to have an array of other documents under it, right? So Michael's going to have 50 orders. I'm going to have 60 and Joe's going to have 70. And all the details of those orders will be in line in that document as nested JSON type documents.
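Here is one way to picture that in code: a small Python sketch, with made-up names and data, that puts a nested customer document next to the same information split across relational tables. The customers, orders, and order_items tables and the load_customer helper are assumptions for illustration only. The document comes back whole, while the relational version has to be joined and stitched back together.

```python
import sqlite3

# Document model: one customer document with the orders nested inline.
customer_doc = {
    "name": "Michael",
    "orders": [
        {"order_id": 1, "items": [{"product": "keyboard", "quantity": 1}]},
        {"order_id": 2, "items": [{"product": "mouse", "quantity": 2}]},
    ],
}

# Relational model: the same data normalized into three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY,
                     customer_id INTEGER REFERENCES customers(id));
CREATE TABLE order_items (order_id INTEGER REFERENCES orders(id),
                          product TEXT, quantity INTEGER);
INSERT INTO customers VALUES (1, 'Michael');
INSERT INTO orders VALUES (1, 1), (2, 1);
INSERT INTO order_items VALUES (1, 'keyboard', 1), (2, 'mouse', 2);
""")


def load_customer(conn, customer_id):
    """Join the tables and reassemble the nested shape by hand."""
    rows = conn.execute(
        """
        SELECT c.name, o.id, i.product, i.quantity
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        JOIN order_items i ON i.order_id = o.id
        WHERE c.id = ?
        ORDER BY o.id, i.rowid
        """,
        (customer_id,),
    ).fetchall()
    doc = {"name": None, "orders": []}
    orders_by_id = {}
    for name, order_id, product, quantity in rows:
        doc["name"] = name
        if order_id not in orders_by_id:
            orders_by_id[order_id] = {"order_id": order_id, "items": []}
            doc["orders"].append(orders_by_id[order_id])
        orders_by_id[order_id]["items"].append(
            {"product": product, "quantity": quantity}
        )
    return doc


rebuilt = load_customer(conn, 1)
```

The rebuilt dictionary ends up equal to customer_doc, but only after two joins and a regrouping loop; with the document model that shape is simply what is stored.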

Starting point is 01:30:51 Right. So that's sort of an easy way to visualize it. If XML is your thing, just think about it as XML with nested nodes. Right. But basically everything's stored in one place as opposed to the relational database where you're going to have a customer's table, an orders table, an order items table, and a bunch of other tables that have to be joined together. So hopefully that paints the picture here. And it's really good for, say, like web apps where you want to go get the customer, you want to get the orders. It's kind of entity focused. It's terrible for writing reports where sometimes you want to know how many people bought this product on a holiday because of a coupon.

Starting point is 01:31:31 What are our top ten selling products? Right. Yeah. Because those are all nested within a bunch of customers, within the orders, within the order details. Yep. Yeah, it's really tough. And I wanted to point out, too, we didn't really talk about it, but a lot of the niceties that you have with relational databases, like things like indexes that make it fast and perform

Starting point is 01:31:48 to look up stuff, a lot of that still applies to NoSQL document databases. They can still keep indexes. They can still do little tricks and optimization, caching, performance optimizers, things like that. A lot of those things do carry over. It's just this kind of fundamental kind of structural difference that makes all the difference. Yep. And so a couple of other things to tack on here for me, like looking at the NoSQL approach, they already said like some of the niceties is you want to go get all the orders

Starting point is 01:32:22 for that one customer. You just grab that one customer and you got them all right. Um, another nice part about document databases, if you think about them, that document is a snapshot in time. There is no relationship anywhere, right? So let's say that you place an order today and you're at your house in Florida, right? That address of where it shipped to is a snapshot of that moment in time. There's no lookup to an address location in another table. So you don't have to worry about, Hey, well, if somebody goes and changes that, is it going to change where it says that I shipped this thing to? Cause I really shipped it to Florida. I didn't ship it to Georgia. Right? So, so to me, like when I look at

Starting point is 01:33:06 a lot of NoSQL or document storage solutions, a lot of times it's, am I trying to snapshot something? Am I trying to get back a hierarchy of things all at once? You know, those are really important things to me to think about. You know, our buddy Santosh in Orlando does a great presentation on NoSQL. I'm going to see if I can find a copy of it online that we can point to. Oh, excellent.

Starting point is 01:33:34 Then going to the relational side of things, so not even talking about scalability, right? Let's just assume in the NoSQL world scalability is one of the reasons why you want to use it, right? If all of a sudden you have way more customers than you did previously and it needs to perform well, that's one of the beauties of most NoSQL implementations is, hey, you can just add 10 more

Starting point is 01:33:58 servers and magic happens, more or less, right? It's more to manage, but it works. But in the relational world, the thing that's nice is, hey, I need to go look up information about this customer, right? I can look at an order number and it's probably going to have the customer number on it. I can go look up the customer information and I have it right now. And like Joe said, if I want to do reports, hey, what were the top selling products this month? That's really easy to do because all you got to do is go to the order details table and basically do a sum on the quantity and all that. And you're good. But then you have the scalability problems and you also have the issue of, hey, how do you store snapshot data?
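The "top selling products" report is a good place to see the trade-off in code. Below is a minimal Python sketch, again with invented data and hypothetical names (an order_details table and a top_products helper), that answers the same question both ways: a GROUP BY over a flat table, and a walk over every customer document to dig the items out of the nesting.

```python
import sqlite3
from collections import Counter

# Relational side: the report is a SUM over one flat table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_details (order_id INTEGER, product TEXT, quantity INTEGER);
INSERT INTO order_details VALUES
    (1, 'keyboard', 1), (1, 'mouse', 2), (2, 'mouse', 3), (3, 'monitor', 1);
""")
top_sql = conn.execute(
    """
    SELECT product, SUM(quantity) AS units
    FROM order_details
    GROUP BY product
    ORDER BY units DESC, product
    LIMIT 2
    """
).fetchall()

# Document side: the same items, nested inside each customer's orders.
customers = [
    {"name": "Alan", "orders": [
        {"order_id": 1, "items": [{"product": "keyboard", "quantity": 1},
                                  {"product": "mouse", "quantity": 2}]},
    ]},
    {"name": "Joe", "orders": [
        {"order_id": 2, "items": [{"product": "mouse", "quantity": 3}]},
        {"order_id": 3, "items": [{"product": "monitor", "quantity": 1}]},
    ]},
]


def top_products(customers, n):
    """Walk every customer, order, and item to total the units sold."""
    units = Counter()
    for customer in customers:
        for order in customer["orders"]:
            for item in order["items"]:
                units[item["product"]] += item["quantity"]
    # Same ordering as the SQL: most units first, ties broken by name.
    return sorted(units.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


top_docs = top_products(customers, 2)
```

Both paths give the same ranking here. Real document databases offer aggregation features so you would not usually loop in application code like this, but the work still has to reach into every customer document, which is why the entity-focused layout is awkward for cross-cutting reports.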

Starting point is 01:34:40 Right. I mean, I know all three of us have run into this in the past where it's like, oh, well, we need to track changes in the system. So what do you do? You end up creating a bunch of archive tables, or if you've got something like SQL Server, you've got temporal tables or something like that. You just jump through a lot of hoops to do stuff that it's really not designed to do. Yeah. How many times were you given a bug that, uh, was like,

Starting point is 01:35:07 this shouldn't happen. Look at the state and you go. And by the time you track it down, you're like, well, this wasn't the state of it at that time. This happened, this happened, and this happened.

Starting point is 01:35:14 And that's why it looks like this now. Yep. And that's, that's the thing, right? Like when you talk about a relational database system, the purpose of it typically is transactions and truly relational lookup data, right? When you start breaking outside of that world to where

Starting point is 01:35:33 you don't want that stuff to change, right? Like you want to know that, you know, maybe if it was a woman and when she placed that order two years ago, she was Jane Doe and now she's Jane Smith. Should that have changed? Right? Like if you go look her up now, are you going to be able to find Jane Doe if you changed her name to Jane Smith? And there's only one record of truth.
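The Jane Doe question can be sketched with Python's built-in `sqlite3` module. The two-table schema below is hypothetical, and the `name_at_order` column is just one assumed way of keeping a snapshot next to the normalized lookup; it is not something the hosts or the book prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: one row per customer, orders point at it by id.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER REFERENCES customers(id),
        name_at_order TEXT  -- assumed snapshot column, copied at order time
    );
    INSERT INTO customers VALUES (1, 'Jane Doe');
    INSERT INTO orders VALUES (100, 1, 'Jane Doe');
    """
)


def order_names(conn, order_id):
    """Return (current customer name, name stored on the order)."""
    return conn.execute(
        """
        SELECT c.name, o.name_at_order
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        WHERE o.id = ?
        """,
        (order_id,),
    ).fetchone()


# Two years later the customer record is updated in place.
conn.execute("UPDATE customers SET name = 'Jane Smith' WHERE id = 1")

current_name, name_at_order = order_names(conn, 100)
print(current_name)    # what the join returns today
print(name_at_order)   # what was true when the order was placed
```

Without the extra column, the join is the only answer you get, and the old name is gone from the single record of truth.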

Starting point is 01:35:58 So there's a whole lot of things that you have to consider when you're talking about, do we use a relational database system for this or do we use a NoSQL implementation? And by the way, we haven't even talked about some of the other cool things, right? Like we're talking about document storage right now. There's also graph databases. There's key value databases. There's search indexes, right?

Starting point is 01:36:21 Like all of those are NoSQL solutions. And this book goes really deep on all this. So stay tuned. So that leads me back to the question that I told you to remind me about, but I actually typed it in at the end so I'd remember. So you said that you typically think about the database first when you go to create an application. Now that all of us have worked in multiple storage systems, do you still find yourself

Starting point is 01:36:50 doing that from the database perspective? I think, you know, we talked about the educative course on the system design interview. I think that that particular course maps well to the way I kind of think about things where like I start with a problem, I start slicing it up, I start thinking about the use cases. And very quickly, I end up thinking about data. But I don't even want to say databases anymore. I think about it in terms of more of services or systems. So I might like if I'm drawing an architectural diagram, there's a good chance I'm going to end up with a relational database. But there's also a good chance I'll end up with SQL in there or

Starting point is 01:37:25 maybe like some Redis cache or some other things I'm kind of thinking like high level about kind of a larger application. And so it's, I don't, I would say that now in the last couple years, I don't think about the database first, but I still think about my data

Starting point is 01:37:42 services first. Okay, interesting. What about you, Outlaw? I mean, I'm thinking about, like, which cool framework I get to play with. So, you know, is this going to be a React thing, an Angular thing? Is this a Python thing?

Starting point is 01:37:56 Django. Yeah, I don't think about it in terms of the database first. Yeah, I guess, like, if it's something that I'm doing on the side, you know, just to play around with something, like I'm definitely just thinking about it more, you know,

Starting point is 01:38:13 like, hey, what stack do I want to play with to learn with? Like, I don't even care what the data is. And if it's something for professional needs,

Starting point is 01:38:21 then I probably already have the data somewhere; that problem is already solved for me. So I don't even have to think about it like that. It's nice just to hear how we came to the question because I kind of assume you said, hey, how do you do a Twitter? And that's how I kind of approach it. But your perspective is more of like, hey, I want to do something cool. How do I want to do it? Yeah, it's interesting. I think for me, when I do that stuff nowadays, I'm tending to think about more what the data is going to be used for. So, so in the past, I would have definitely thought about

Starting point is 01:38:57 the database design first, because that's what I did for years and years. Right. And now I think more about how's this going to be used? So does this need to be in a search index? Or does this need to be in a relational table? Or do I need to snapshot the data? And so a document storage, so I'm always now thinking about how, like what is going to be the use of this thing?

Starting point is 01:39:21 So I'm way over-complicating the systems before I even set them up. This episode is sponsored by About You. About You is one of the fastest growing e-commerce companies in Europe, headquartered in Hamburg, Germany. The online fashion store is currently live in 10 European markets with more than 8 million app installs and 15 million active users on its platform, which handles more than 300 million API calls per day. In 2018, About You reached a company valuation of more than $1 billion, moving up to the exclusive circle of European unicorns. This could only be achieved by the excellent work of About You's tech teams.

Starting point is 01:40:08 One third of their employees are developers and come from over 40 different nations, which truly enriches the teamwork of the company. What they all have in common is that they're highly driven by the passion to develop the best product on the market. About You also has an award-winning organizational model that allows developers to switch teams, ensuring constant learning and developer fulfillment. About You builds its software in-house with leading technologies like Laravel, Node.js, and TypeScript on the

Starting point is 01:40:37 server side, Vue.js and React on the client side, and Flutter for mobile applications. Besides a variety of free drinks and fresh fruits, About You offers free language courses and helps new employees in the relocation process if they move from abroad. Moreover, developers get free tickets to About You's own organized conference, Code.Talks, one of the biggest tech conferences in Europe. The conference that is taking place in Hamburg is visited by more than 1,500 developers. Furthermore, About You offers a well-structured onboarding process with a buddy system that provides access to e-learning tools such as laracast.com and egghead.io.

Starting point is 01:41:18 When starting at About You, you have the choice between different hardware setups as well, like MacBook or Windows Notebook, and the kind of IDE that you want to work with. About You is growing fast and is constantly hunting for new and motivated team members. About You currently has positions available for full stack, front end, Dart, Flutter developers, a quality assurance engineer, a project manager, as well as other exciting leadership opportunities. So does this sound good to you? Apply now at aboutyou.com slash job. They're looking forward to hearing from you. Again, that's aboutyou.com slash job to apply now. All right. So now I just want to quickly mention some resources we like, of course, the book Designing Data-Intensive Applications, and we're giving away a copy. So make sure to leave a comment there and maybe you'll win. And obviously the TIOBE index is

Starting point is 01:42:25 going to be in there. Super accurate, TIOBE. I mean, everybody wants to read that one. So with that, I guess we'll head into Alan's favorite portion of the show. It's the tip of the week. I do love this portion. I didn't love it

Starting point is 01:42:41 tonight, though, because I haven't done any coding in two or three weeks. So, yeah. Two or three. But I did find some things. And part of this came up during the show. So, the whole tangent, or not tangent, but the whole thing where we were talking about the MySQL versus Postgres and you guys were talking about tooling and how it was so bad on MySQL back in the day. Well, I feel like because I've played with it in more recent than I guess what you guys have,

Starting point is 01:43:12 the MySQL tools, at least the freely available open source ones, are way better than the Postgres ones because pgAdmin is about what you get out of Postgres and it's severely lacking. So my first tip of the week is if you are working with multiple database systems or if you're working with Postgres or anything that's sort of weak in the tool area, check out JetBrains DataGrip. It is really good. Like super good. I know Outlaw, you've used it a lot. I think you kind of are a fan of it. Yeah? No. Yeah.

Starting point is 01:43:45 I mean, it's still, um, it's hard. Like, you know,

Starting point is 01:43:51 I've done SQL server for so many years and use SQL management studio for so many years that that's like, that's your friend. I mean, you know, your friend, like you can go drinking with your friend and you know, if you get into trouble,

Starting point is 01:44:03 your friend's going to bail you out, you know. But then along comes DataGrip and you're like, oh, hey, that's kind of cool. You know, I like new cars and it's a new car, right? It's a new car with a ton of functionality. So it's got all the new wiring that we were talking about earlier. Right. So yeah, DataGrip's nice.

Starting point is 01:44:29 If you plan on doing something in Postgres or something, check it out. I think it's 99 bucks or is it one 99? I don't know. Um, I'll look it up. How's that? Yeah.

Starting point is 01:44:39 At any rate, it's probably worth it because fighting with your database system is not that great. And then, so the next thing I want to do is, I think it was Mike RG in Slack. He actually posted this the other day because he's heard me and Joe Zack talking about Kafka and, you know, our pains and whatever. There is another GUI out there for it that allows you to look at topics and all this kind of stuff.

Starting point is 01:45:11 So I'll have a link to that. I can't even pronounce what it is. So what did you say it was for? Kafka? Because I didn't hear it, if you did. Yes. For Kafka.

Starting point is 01:45:22 Okay. And then I wanted to give a shout out to a little project that Joe Zack had put together that's actually really cool. So what he did is he had wrapped a lot of the Kafka admin calls with GraphQL. So he's got a graphical UI to where you can start typing in the query. And Joe, I don't remember everything that it does, but I want to say that you can pull back a list of topics. You can pull back data in the topics. Yeah, like 100% of the admin functionality, the admin client specifically, you can get via GraphQL. And you're also able to fetch the messages too via subscriptions.

Starting point is 01:46:06 Yeah, so super, super cool stuff. So there'll be a link in there for his as well because it's worth checking out. I think last time I looked at it with him, the only thing he said is some things are slow because he's querying a bunch of metadata from all the different brokers or whatever. But, I mean, there may not be an easier way to find out what's going on with a bunch of low-level data than this type thing. So I would definitely check that out. Heck yeah. And by the way, JetBrains is $200. Datagrip is $200.

Starting point is 01:46:39 Okay, $200. Still probably worth it, just a heads up. Well, that's actually for organizations, though. I'm sorry. For individual use, then it would only be $90. Okay. There you go. At 90 bucks.

Starting point is 01:46:51 Absolutely. Get it. No question. At 200, tell your company to buy it. Right. And hey, free tip here. If you sign up for the mailing list, we frequently have JetBrains licenses to give away and DataGrip as well.

Starting point is 01:47:04 Yep. All right. So for my tip of the week, have you ever had a need to do any geodesic measuring, distance measuring? Just this morning. Really? No. Oh, I thought you were serious. No, totally not.

Starting point is 01:47:23 Oh, man, he's going to steal my thunder here. Do you know what it is when I say it, when I said it? Nope. Okay. Okay. So I sound super smart because I said that word. I said the words, but I didn't know that that's what it was called either. But we've all heard the phrase like as the crow flies.

Starting point is 01:47:42 Yeah. Okay. So basically geodesic distance is when you want to measure the distance between two points and you just want to measure it as a straight line. You don't care about the actual roads that you have to travel on. Right. And so I had this recent need where I was like, Hey, what is the point to point distance from where I'm at to this other location. And I didn't realize this is a thing in Google Maps and has been since 2014. So what you can do is you can go in and like put in your, the addresses that you're interested in and you can right click on the first point and say, so let's say that you like, uh, said map, you know, from your house to, uh,

Starting point is 01:48:27 some store, I don't care. And, and so you have both points showing up on the map. So you can right click on the first point and select measure distance, and then right click on the second point and select distance to here. And Google will draw a straight line from the two points and it'll show you in the bottom, there'll be, uh, the bottom of the map, it'll actually show you the actual distance. But on that straight line though, you'll see, uh, depending on the scale may vary, you know, if it's like here every 10 miles or every five miles or every one mile, you know, but there'll, there'll be like hash mark for some kind of representation with, you know, and it'll tell you like 10 mile, 30 mile or 10 mile, 20 mile, 30 mile, et cetera. So I thought that was super cool. Um, yeah. And then, um, a tip from, uh, you know,

Starting point is 01:49:17 Mike RG, cause of course we would have another tip from Mike RG. Um, any tip that we give is there's probably like at this point, I want to say there's a greater than 90% chance that he hit us up on Slack and was like, hey, have you seen this? So we all know how I like Git. I might have talked about it once or twice. So he sent me this article that's titled How to Undo Almost Anything with Git. And it's like a cheat sheet of here's all the commands to undo different things, right? Like you want to fix your last commit message? Well, here's the git commit amend command that you would need to use to do that, right?

Starting point is 01:50:03 You know, now I'll share this. And as I've said with similar kinds of, you know, Git recommendations in the past, you know, be careful about changing shared history, right? Always keep that in mind. In my mind, if the history has already been shared and others are using it, then it is not cool to go and change that. Right. But I'll include that link in there because I thought it had some really neat tips and it was all in one great place. So I'll include that article. All right.
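Going back to the as-the-crow-flies tip: for anyone who wants that number in code rather than in Google Maps, here is a minimal Python sketch using the haversine formula. It assumes a perfectly spherical Earth with a mean radius of about 6,371 km, so it is an approximation of a true geodesic on an ellipsoid, and it is not a claim about how Google Maps computes its figure. The function name and the sample coordinates are illustrative only.

```python
import math

# Assumption: treat the Earth as a sphere with the mean radius below.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine formula: a is the squared half-chord length between the points.
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-center coordinates, used only as sample input.
atlanta = (33.749, -84.388)
orlando = (28.538, -81.379)

distance = haversine_km(*atlanta, *orlando)
print(f"{distance:.0f} km as the crow flies")
```

For most "how far is that store in a straight line" questions the spherical approximation is plenty; survey-grade work would use an ellipsoidal model instead.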

Starting point is 01:50:39 Well, I forgot about the tip of the week. So I'm playing off this time. Do y'all remember Yahtzee? Absolutely. Yahtzee, the video game reviewer, right? Oh, wait, the video game.

Starting point is 01:50:58 Are you thinking the game? Get out of here. Isn't that what you said? I'm talking about Yahtzee. So when YouTube was still kind of young, there was a young video game reviewer with a terribly foul mouth that used to do video game reviews. And they were animated. Anyway, it was a popular YouTube video that kind of went viral way back in the day. And I guess he's been doing all sorts of awesome stuff ever since that I just kind of missed out on. But I had a bunch of Audible credits to spend at the end of the year that I had kind of forgotten about.

Starting point is 01:51:32 And so I went through and I just kind of picked up a couple books. And I recognized the voice of a book I was listening to called Will Save the Galaxy for Food. It was funny, kind of like a Douglas Adams kind of, it's like a Ringworld, or sorry, Discworld or Hitchhiker's Guide to the Galaxy kind of humor, like very funny. And I recognized the voice and I looked at the author and it was Yahtzee, the video game reviewer from way back. Oh, whoops, that's not his voice. Well, no, that was the video that auto-played when I go to the Yahtzee channel on YouTube. This person has, I believe it's an Australian accent and very foul. Also very funny. So anyway, the book is really creative.

Starting point is 01:52:19 It's really funny. He actually reads his own audio book. I've been really enjoying it. I'm still very early into it, but it kind of scratches that kind of like old school Douglas Adams, kind of Terry Pratchett kind of itch that I've had for a long time. So I was very happy with that. And I think Galpragman in particular would enjoy it. So I'm going to try and get him hooked on it too.

Starting point is 01:52:40 And if that sounds like something you're interested in, then you should take a tech break and give it a listen. I like it. All right. So this episode we talked about data models, and we've got a lot more coming, so stay tuned. Yep. And don't forget to leave a comment if you want

Starting point is 01:52:58 a chance to win the book because it is a fantastic book so go up there and do that alright well with that be sure to subscribe to us on iTunes, Stitcher, Spotify, or more using your favorite podcast app in case if a friend happened to point you into the direction of a specific episode or you're listening on their device. And if you haven't already, we would love if you would leave us a review. You can find some helpful links at www.codingblocks.net

Starting point is 01:53:24 slash review. Yep. And while you're up there at codingblocks.net, check out our show notes, examples, discussions, and more. Send your feedback, questions, and rants to the Slack channel and also on Twitter at CodingBlocks, or head over to codingblocks.net, and if we get it fixed, you'll see all the social links at the top of the page. Sorry about that. It definitely wasn't me. I just happened to be the last one to touch it. I don't know what's going on.

Coding Blocks - Designing Data-Intensive Applications – Data Models: Relational vs Document

We're comparing data models as we continue our deep dive into Designing Data-Intensive Applications as Coach Joe is ready to teach some basketball, Michael can't pronounce 6NF, and Allen measured some geodesic distances just this morning.
