Coding Blocks - Specialize or Bounce Around?

Starting point is 00:00:00 You're listening to Coding Blocks, episode 153. Subscribe to us on iTunes, Spotify, Stitcher, and more using your favorite podcast app. And hey, leave us a review if you can. We'd appreciate it. And we've got a website, codingblocks.net, where you can find show notes, examples, discussion, and a whole lot more. And you can send your feedback, questions, and rants to comments at codingblocks.net. And you can follow us on Twitter at Coding Blocks, or you can head to www.codingblocks.net

Starting point is 00:00:29 and find all our social links there at the top of the page. With that, my name is Alan Underwood. I'm Joe Zach. I thought it was like a late response there. Like it was like a you know, I tell you a UDP joke, but you might not get it or anything. Oh, right.

Starting point is 00:00:50 Yep. Now I'm Michael Outlaw with the on-time response. Yes. This episode is sponsored by Datadog, the cloud-scale monitoring and analytics platform for end-to-end visibility into modern applications. And DataStax, the open multi-cloud stack for modern data apps built on open source Apache Cassandra. All right, I'll do the intro here. So we are coming around in this episode basically because we've all been crazy busy and we have things to talk about, but we didn't have time to research a ton.

Starting point is 00:01:23 So we're going to join back around the water cooler and talk about things that are on our minds in the development world. And with that, before we get into that, this is going to be a rant, a rant session. Like, so what's up with Bob? Right. Yeah. I mean, I can't promise a rant. I can't not promise a rant.

Starting point is 00:01:47 You'll have to stick around to find out. It's been, it's been about a month of, of crazy. So, um, at any rate, before we jump into our water cooler topics that we have,

Starting point is 00:01:57 we like to first thank those that leave us a review. So I think, uh, Mike, you got this? Oh, sure. All right.

Starting point is 00:02:06 So on iTunes, we had Peter B. Sadface, Jackafus, and 1T. Underscore. Okay. That's assumed. But the underscore is in the wrong place. It should have been one underscore T. I don't know. It was hanging off the end. How do you know? Maybe it was a member variable. It should have been underscore one T. Oh, that's true. That's true. Although some people hate, you know,

Starting point is 00:02:36 doing variable names like that. True. Well, thank you for leaving those reviews. Those all three were amazing. So really appreciate it. So I could have put this at the tips of the week, but you know, some people aren't going to hang around for that. And so I wanted to put this up front because it might help some people out. So I recently had my wife's computer just started blue screening like crazy. And the errors that were showing up made it look like it was a problem with the drive. So as you do, you know, I'm not just going to get a drive. If I got to tear all of it open, then I'm going to upgrade. I got a motherboard, CPU drive, all that garbage,

Starting point is 00:03:18 get everything in there. And it's still blue screening. And I'm like, man. And my wife tells me, well, the video card was doing something weird. Like the screen was glitching and stuff. I was like, oh man. All right. So I'm going to try and get a video card. You can't buy video cards right now. I don't know if you've looked, they're all out, right? Like Bitcoin mining or crypto mining plus the pandemic, they're wiped out. At any rate, I did have a video card laying around, stuck it in there, was still getting a blue screen. It was the RAM. So there's a program, I'll have a link in the show notes here, called MemTest86. It's super popular. It's real easy to get running with. You download the program, you say, install it on a USB. Then you boot to the USB. This thing,

Starting point is 00:04:04 after just a couple minutes had thousands of memory errors, right? And I was like, okay, that looks like that might be the problem. So ordered new RAM, got it, went ahead and ran MemTest86 on it too, just to make sure I wasn't going to be going into the same problem, right? Everything went perfect. So just what I'm calling out here is a lot of times, and this happened with my machine as well years ago, test your memory. If you start getting sudden blue screens for no apparent reason, it might be that you got an update in your OS that's now touching memory locations on those memory sticks that it never used before. There could be all sorts of reasons that you start getting it.

Starting point is 00:04:49 But check your RAM. It takes probably an hour or two to run this thing. Just do it. And then at least you can sleep peacefully and then know that it's probably a USB driver or something. So just one point of clarification there though, I would assume that in today's age of operating systems with, uh, uh, what's it called the, the memory address randomization layout. There's a name for it. I can't remember, but yeah, your operating system, regardless of the update, if it's a modern operating system, is randomly using all of it, I would bet. You would

Starting point is 00:05:26 think, but it's like I said, it was really weird. It was like after a Windows update or something, all of a sudden she started getting blue screens. Like, did you move the machine at all? Just curious. Had it been recently, like, had it been jostled? Hi, thank you for calling Coding Blocks technical support. Uh, yeah, have you tried turning it off and on? Right. Have you recently moved the computer? So frustrating, dude. But yeah, I mean, honestly, it's worth trying.

Starting point is 00:05:54 So do that. And then, Joe Zach, I think you got something in here, or maybe that's you, Outlaw, on it. No, I just want to mention we got some really great comments on the last episode, which was about Python. About virtual environments. There was some of that. So we got some really good feedback from people that spend a lot of time with Python.

Starting point is 00:06:13 So that was really good. Just wanted to encourage you to check out those comments. And we love any feedback. So if we say something that frustrates you and drives you nuts and you want to set the record straight, then awesome. That's a great place to do it. If you want to just tell us, you know, we're awesome, then, you know, hey, I'll take that too. Right?

Starting point is 00:06:33 Yeah. Yeah. I mean, it's funny. That's usually when we do get the comments, when we say something, people are like, no, they're wrong. My people. My people. That's right.

Starting point is 00:06:44 They're wrong. And that's totally fine. I mean, it's good for discussion. And honestly, usually we learn something and so does everybody else, right? So yes, totally, totally open to that. This is a real-time conversational-based test-driven development where we just say some stuff and we're testing to see if you're listening. Right. Yeah, no revisions.

Starting point is 00:07:05 Right. And I can't go back and fix anything. How many times has it been something where you're like, oh, I meant to say that, but now it's like 10 minutes later. Oh, this is how I should have explained it or whatever. And it's. Oh, dude. Look, Joe Zach and I know for sure, because we've edited videos and done stuff like that,

Starting point is 00:07:19 man, you could second guess yourself to death editing a video. Like you can take a 15 minute clip that you plan on publishing. It could take you 10 hours to do so. Cause you're always like, Oh, I should have added this, should have revised that. And then eventually you're like,

Starting point is 00:07:34 I don't care. Ship it. Yeah. Outlaw commented on the audio quality on my Kinesis keyboard review. He's like, Hey man, did you? And I was like,

Starting point is 00:07:44 dude, shut up. I don't want to stop it. The way you're saying that, though, you're not really painting me in the best light there, man. I come across as a jerk in this story. No, no, not a jerk. I was just like, yo.

Starting point is 00:07:55 I was like, yo, man, what's up with that audio? That's crap. Have you heard it? Did you use that? I thought you used noise reduction. I was like, dude. It's one of those things that after you've been at it for like eight hours, you're like, I don't care. Yeah, I always get a lot of hiss when I record with this microphone on Windows. I, the recorder comes out great, I think, but, uh, I give up. Now mine was, I

Starting point is 00:08:20 was trying to tweak it to be some sort of like studio style. And then I jacked up the audio and I was like, there's no going back now. Yeah, done. Yeah, whatever. That's what happens when you experiment. So at any rate, now. Did you try to noise reduce? No, just kidding. Hey, hey.

Starting point is 00:08:37 So, yeah. So getting into this one, one of the topics that I don't know, man, like I'm just, I really want to open this up for conversation because I have known developers that are like C sharp or die. Right. Like, or Java or die, or I had a good friend. You're not gonna believe this one. Wait for it. But a good friend of mine, love him to death, Perl or die. Now that was back then, not today, you know, in fairness, but you know, yeah.

Starting point is 00:09:15 And, and so I wanted to open this up for conversation because personally I will work in whatever language makes sense, right? Like if, if an application is written fully in Go, I'm going to do it in Go, right? Like if I've got to add a library to it, I'm not going to try and figure out some sort of way to do like a COM wrapper. I'm going to be like, well, I guess I'm going to dip my feet into Go, right? If, if something's written in Python, like I just recently did. Okay. I want to pick up Python. I originally went down the 3.9 route and then they're like, well, our app is in 2.7. I said,

Starting point is 00:10:02 okay, I will go back to 2.7, whatever. Right? Like I just don't care. I'm going to use whatever language makes the most sense. But I do get it if you say, hey, I'm a Java developer. I'm going to know everything there is to know about Enterprise JavaBeans. I'm going to know everything about every library, every framework. I'm going to know everything about Spring, Spring Boot, everything, right? Like, I'm going to know it all. There's value in that, too. So I'm curious what you guys' take on this is like, what,

Starting point is 00:10:29 how do you feel about it? Do you, do you feel like people should be, you know, like a hero, a champion of their language, or should they be a champion of being a developer for, for whatever it is that needs to be done?

Starting point is 00:10:42 Do I raise my hand? No. If you raise your hand, you get kicked off the podcast. Wow. That got dark quick. Um, so, so my opinion on this topic and I've shared this, I think I've shared it on this show before and I've definitely shared it with friends in the past though. It was like, you know, uh, I, I personally consider myself a software developer, not a C sharp developer, for example, or Python or Kotlin or insert language here. Um, and so by that, what I mean is that, you know, whatever the language is that is needed to solve the problem, then that's what I'm going to use, right?

Starting point is 00:11:29 And I think even between the three of us, you know, like there's this whole, you know, the whole, because, you know, if we get tasked with doing anything web app related, like, okay, there's some HTML, there's some JavaScript, there's some CSS. Oh my gosh, maybe it's Sass. Then you have whatever your backend language is, it's going to be a Java or Kotlin or Perl or C Sharp or whatever. And then, oh my gosh, I got to go query a database. So it's going to be T-SQL. It's going to be PL/SQL. It's going to be whatever, some other thing, right?

Starting point is 00:12:14 So I don't even think that the three of us could even say that we would be like champion one language. That's not even a thing for us, right? I don't know. Joe? So my answer here is to just do what feels good to you. I think there's a lot of value in going really deep on something we've talked about,

Starting point is 00:12:33 T-shaped developers, and I think that's great. If you really want to do that, there's a lot of value to that, to being an expert. The thing you have to be careful with is those platforms die, so you can't get too attached to any one technology because sometimes like something just comes along and just knocks it down. And then, you know, it can be hard to kind of pivot your career,

Starting point is 00:12:52 especially if you've done some branding or like a website or a channel or something around that technology, then that can be a big loss. So, you know, I always say, yeah, Silverlight, Flash, you know. Coding blocks.Silverlight. Yeah, yeah. You know, at the time, we thought it would never go away. So, yeah, I always think be careful of that.

Starting point is 00:13:15 But aside from that, like, I was just thinking, like, this week, I worked in, and, you know, I don't want to sound like a, I don't know, like a name dropper or something. But legitimately, I've worked in JavaScript, Kotlin, Python, PowerShell, Bash, did a lot with YAML. I'd worked in Mongo a little bit. I did some stuff with Elasticsearch, SQL Server. Like that's, and it's not even the end of the week. You know, like I bounce around a lot. And I think that there's room in the world for people that bounce around a lot.

Starting point is 00:13:42 And it is kind of scary, though, because I can go on a stream and be like, I'm going to make something new in Python. And even though I've been working with Python on and off for several months now, I'm like, wait, is it underscore underscore init equals main? I forget. Let me just Google it. There's real basic things I just don't have that muscle memory for. And that's kind of scary.
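For anyone else who has to Google it: the idiom being reached for here compares `__name__` to the string `"__main__"` (not `__init__`). A minimal sketch follows; the `main` function and its return value are made up purely for illustration.

```python
def main() -> str:
    # Hypothetical entry point, used only for illustration.
    return "hello from main"


# True only when this file is run directly (python thisfile.py),
# not when it is imported as a module from somewhere else.
if __name__ == "__main__":
    print(main())
```

Importing the file from another module leaves `main` available to call without triggering the `print`.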

Starting point is 00:14:00 If I go to start a new Kotlin project, like setting up log4j, setting up this or that, those things are things that I'm really missing in my tool belt. Or someone who's been working in Java or Kotlin or something for years, they've probably set up new projects a hundred times, and so that's nothing to them. I know how to maintain these apps. I can bring along everything I know about software engineering, like abstractions and good and, you know, debugging and whatever, all that stuff transfers really well. But there's things about individual languages that you're just going to hurt on if you bounce around a lot like that. And I feel that pain, you know, all the time. It's like embarrassing when I don't know, like, does Python have a

Starting point is 00:14:37 switch statement? I forget. I've only been working with it for, you know, months or, you know, Kotlin. I'm like, ah, what's, how do I do the Kotlin ternary stuff? I've only been working with it for years now. And that's ridiculous, but I know it's there. I know there's something similar. And so I, you know, that's what feels good to me. I've, uh, I think everyone kind of has a role to in, in kind of managing your career and the way that you move. And I think that when things come up at work, you have a choice to say like, Hey, let me grab that or let me take a look. Or that's someone else's thing. I'm not going to be effective at that, so someone else should do it.
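On the two questions raised above: Python had no switch statement when this was recorded (a `match` statement arrived later, in Python 3.10), and a dictionary lookup is one common stand-in. Python's counterpart to a ternary is the conditional expression; Kotlin similarly has no `?:` ternary and instead lets `if`/`else` be used as an expression. A small sketch in Python, with hypothetical function names and values:

```python
def describe_status(code: int) -> str:
    # Dictionary lookup standing in for a switch statement;
    # the second argument to .get() plays the role of "default".
    names = {200: "ok", 404: "not found", 500: "server error"}
    return names.get(code, "unknown")


def parity(n: int) -> str:
    # Conditional expression: Python's version of `cond ? a : b`.
    return "even" if n % 2 == 0 else "odd"


print(describe_status(404))
print(parity(7))
```

The dictionary approach only covers simple value-to-value mapping; anything with per-branch logic usually ends up as an `if`/`elif` chain or, on 3.10 and later, a `match` block.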

Starting point is 00:15:10 And it seems like some people kind of gravitate towards one or the other. So as far as I can tell, the world takes all types. And so that's my answer is just do what feels good. Don't brand around it. I do have a question, though, because you hit on something that I think is really interesting. I like that. You know that you're going to set up Serilog or log4net. You know that if you're going to do dependency injection, there's a handful of things there. You get a chance to learn and see all the bits that exist around a language. So you kind of know what's out there. Do you think there's benefit in staying deep in something,

Starting point is 00:16:01 at least for some period of time, just so that you're aware of that? Like you said, like you've built up that muscle in C sharp. And so when you go over to Python, you know, Hey, there's probably a logging, you know, hook for this language. There's probably a DI hook for this language. So is there value in sticking with something for a while just so that you learn how to be a better software developer before you start jumping from platform to platform? I remember being a ColdFusion developer and kind of having a lot of habits that I kind of

Starting point is 00:16:38 grown up with as a developer because I had kind of grown up on that platform. And so the ways I interacted with databases and everything were kind of very much grown out of the patterns that they had. And it wasn't until I started like looking at other languages, like C sharp and Ruby on Rails and stuff. And I started seeing like ORMs and things that did things in a different way that even when going back to ColdFusion, it changed how I thought about it and changed how I did things.

Starting point is 00:16:59 And I think for the better, you know, hopefully. So I definitely think there's something to kind of changing your perspective in order to learn new patterns. Because you can get kind of set in your ways if you're doing something for years. And there's a definite danger. And maybe you'll never know there's a better way to do things because you're so used to just, you know, typing in the stuff that comes with the muscle memory. Yeah, there is something both, uh, refreshing and frustrating about it. I mean, like I,

Starting point is 00:17:26 I think that a lot of what you said, Jay-Z, like when, when you were talking a moment ago about, uh, like forgetting little things in language, I was like, Oh my God,

Starting point is 00:17:34 I'm not the only one. Like, no, you're not. I'm so bad about that. Like, you know, if you asked me to spin up a brand new,

Starting point is 00:17:41 uh, you know, project in a particular language, I might like, wait, if it's not C sharp, hold on. Um, what you're like, I like, there are little things that I'll forget and I have to like go in, I'll have to go and check, you know, like, Hey, I want to pass back multiple values. Cause you know, Alan loves it when I do that. How would I do that? You know,

Starting point is 00:17:59 like whatever. Um, so, so there's definitely value in going deep if you, if you want now, you know, so, so from the individual perspective, totally agree, do what you want to do. And, and, you know, whatever, whatever you enjoy, whatever makes you happy, that's ultimately what's going to matter the most. And if, if, if staying in one particular language or bouncing around all of them is what feels right, then fine. And you know what, you're probably going to change your mind during the course of your career a couple of times anyway. So, you know, whatever,

Starting point is 00:18:28 you know, right now you might be focused in on one and eventually you'll start bouncing around and then you might get tired of that and go back to just focusing on one. But from a company perspective, we've also talked about it from their point of view too, where like even in the last episode, um, episode, uh, what was it? 152 about virtual environments. The, see what I did there? Oh, let's go. You know, Google, I think it was like Python where we can and C or C++ where we must, right?

Starting point is 00:18:59 Like, you know, they had a preference. And we've seen like, take Amazon, for example, where if they were coming out with a new service, all the libraries for that new service were Java first. They were heavily Java focused. And you could only imagine that anything coming out of Microsoft is probably going to support Kotlin first and then C sharp. Oh wait, I got that backwards. So here's a followup question for you guys. I don't know that we've ever even talked about this,

Starting point is 00:19:32 you know, in the surveys and stuff that we've seen about favorite languages, right? Like I want to say Rust was up there at the top or whatever. Has there, has there ever been a language that frustrated you or, or a language... Objective-C? Okay. So here, here's my question. And I think that you might be able to more fully answer that. Is it the language itself that frustrates you or is it the tooling and the ecosystem around that language that frustrates

Starting point is 00:20:07 you? Because for me personally, while I don't love Java because it's so verbose, the language doesn't frustrate me. C sharp doesn't frustrate me. JavaScript, none of them frustrate me. Like, a language is a language. It's the tooling and the ecosystem that have always bugged me. Yeah. I mean, like I was, you know, partly joking about Objective-C when I, when,

Starting point is 00:20:29 when I said it, the reason why it came to mind though, was that it was, there were parts, there were things about the syntax that were so different to me. They were like, just so foreign looking, like,

Starting point is 00:20:44 you know, compared to like any other language. Like you look at, you look at C sharp or Java and depending on like how much context I might give you, you might not be able to tell what the language is. Right. Similarly, like, you know, if, if I gave you a, well, no, I was going to say Java and Python, but you, you'd definitely be able to tell the difference. But I mean, even with some, some JavaScript though, you wouldn't be able to, some, some JavaScript, depending on like how, how well, uh, you know, you know,

Starting point is 00:21:14 depending on your linting style, like you might not be able to tell it from, uh, uh, um, well, I guess for the, if you're using lets, you might be able to see it, but you know what I'm saying? Like the point is, is that like a lot of those languages, they all are kind of similar, but there was just some parts of Objective-C that were so different to me that it just always felt weird. Now, granted, it's been over a decade since I did anything with Objective-C, and to your point about the tooling, definitely a thousand percent. The tooling was awful, you know, back in like the, you know, 2008, 2009, 2007, like that kind of

Starting point is 00:21:55 timeframe for Xcode, you know, trying to, trying to step through the debugger and even see the current value of a variable. And it's like, no, to do that you gotta like print it out to a thing. Like, you're like, this is the most basic of things that I expect from my IDE, and I can't see this? Like, what? Right. I mean, the one that springs to mind for me that I might have had frustration with was C++. But that's because you have so much control over everything that if you screw something up, you know, you can easily shoot yourself in the foot. But it wasn't ever a problem with the language. It was just the fact that you had to know that if you're doing memory pointers and that kind of stuff, you just had to know about it, right? So I don't know.

Starting point is 00:22:45 I don't know. What about you, Jay-Z? Any language that ever just really bothered you? Or was it the tooling and environment ecosystem, the package management, whatever? Honestly, the only language that really I kept feeling I should like but just kept irritating me was Python. And we talked about that last episode. And don't get me wrong, I like Python. I'm, you know, I spend my leisure time learning Python better and, you know, whatever, but there's

Starting point is 00:23:11 things about the language, just like the, you know, the way you use like the length function and pass in the variable rather than object dot length or whatever. Just there's some standards and some things that were done that just like drive me up a wall because it's different than every other language that I use frequently. And I'm constantly doing the wrong thing, which is just frustrating because it feels, I think it's one of those things where it feels so good that the things that trip me up feel terrible. It's just like frustrating.
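The length complaint above refers to Python's built-in `len()`, which takes the object as an argument instead of being called as a method on it. Under the hood, `len(x)` delegates to the object's `__len__` method, so user-defined types can opt in. A minimal sketch, where the `Playlist` class is hypothetical and exists only for this example:

```python
class Playlist:
    # Hypothetical container, used only to show how len() finds a length.
    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self) -> int:
        # len(playlist) calls this method behind the scenes.
        return len(self._songs)


playlist = Playlist(["intro", "verse", "outro"])

print(len("hello"))    # built-in str
print(len([1, 2, 3]))  # built-in list
print(len(playlist))   # user-defined class via __len__
```

So the "object dot length" style does exist in a sense, it is just spelled `__len__` and normally reached through the built-in function.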

Starting point is 00:23:37 So yeah, Python definitely frustrates me, but you know, that's not a knock on Python. You know, I frustrate me too, whatever. That's interesting.

Starting point is 00:23:47 I didn't realize that you were frustrated with Python. I don't think I got that takeaway from last episode. So that's, that's interesting. Yeah. Yeah. No. All right.

Starting point is 00:23:56 Cool. Well, that was a, that was my, my first water cooler topic. What do we get up next? Yeah. I do want to say that like,

Starting point is 00:24:05 at least with like Xcode in recent years, it definitely got a lot better, you know? So I would imagine now that like, you know, Swift, especially I would imagine would be a much like a pleasure to, to,

Starting point is 00:24:19 to write in, especially compared to that, you know, 15 years ago of objective C, you know, I mean, tool tooling can make or break anything, right?

Starting point is 00:24:30 Like straight up, you know how, um, you're like, there's questions like this that always pop up, right? Like which language is best, right?

Starting point is 00:24:39 Uh, C or C++ or Python or Java, right? Like, you know, everybody's always kind of like, oh my God, you know.

Starting point is 00:24:52 If you're writing an operating system, use C. If you're writing a complex application where execution speed is extremely important, use C++. If time to market is key, but execution speed is not important, use Python. And if your boss told you, do it in Java or you're fired,

Starting point is 00:25:09 do it in Java and then look for a better workplace. And the comments can be left at slash... Well, that joke came from Arlene, so ping her. Okay, yeah. Send that to Slack. As I throw Arlene under the bus. That's awesome.

Starting point is 00:25:31 Oh, man. I will say, as far as Java goes, there are a lot of tools. You're going to learn Java. Any Apache project, pretty much. You're going to learn about some Java concepts, like how they do their logging, how the JVM works, you know, heap all that stuff. Like it's going to happen. Yeah. You know, that's interesting too, because the whole, like, uh, this,

Starting point is 00:25:55 the single language or, or pick one or, you know, choose one a lot of times it might be driven by, I'm using Apache Kafka. Everything's in Java. Do you really want to blaze that trail to go create something that doesn't exist in your favorite language? Maybe you do. But maybe if you want to go the easy road and learn off the people that have gone before you, you're probably going to go with Java, right?

Starting point is 00:26:26 Like it's, that drives a lot of my decisions in terms of what I'm using. So there's certain kinds of categories that you should like, you know, depending on where you want to go with your career, you would choose to hit or not. Like, uh, it's kind of like everyone should probably know JavaScript or like, you know, it's going to be hard to avoid JavaScript completely unless you're like a back-end embedded developer or something, and that's all you ever want to do. But if you want to do back-end embedded stuff, then you're looking at, uh, having to know either C++ or Go or Rust. If you're looking at doing like, um, enterprise development, uh, back-end services type stuff, like you're going to learn Java or C++, I'm sorry,

Starting point is 00:27:04 uh, C sharp. If you're going to do data science, you're looking at Python or R. So it's kind of like picking the technologies like, hey, I want to do front-end and data science. Well, here are your choices. You can learn all of them or you can kind of pick one from each category and really specialize in it depending on

Starting point is 00:27:20 where you want to go. Speaking of where you want to go, Yeah. It's a good point. Speaking of where you want to go, who is hiring remotely now? I kind of have the idea to look up. I've been hearing a lot of people who've been switching jobs in the last year or so. And a lot of them have been to really big companies

Starting point is 00:27:39 that they wouldn't have been able to switch to two years ago, even because those companies didn't have remote hiring policies. And so, you know, if they did, it was an exception. And so I went and looked up and tried to find a list of prestigious companies that have recently opened the floodgates to remote workers.

Starting point is 00:27:55 And I threw in a couple of the companies that have always been remote, just that I thought were kind of notable. And let me check Twitter. Cause I put this out on Twitter to see if anyone has anything that I missed. But in the show notes, we are going to have links to every company here that we're mentioning.

Starting point is 00:28:14 And so if you want to do some browsing, just look around a little bit and we've got some stuff aggregated for you here. And you probably should not do this on your work computer. Now you can't visit codingblocks.net on your work computer because you might get one of these links.

Starting point is 00:28:31 Just don't click the links. Don't click the links. Yeah. For those that don't know, if you're new to the industry, all three of us do security software for a living, and it scans stuff like this. So if you click links like this at work, and your work has a policy set up to look for links like this, you might be noteworthy

Starting point is 00:28:53 in the IT department. You're gonna get flagged on every one of them. Right. Now, well, no, no. So go to codingblocks.net, but just be careful when you click them. Don't, don't click them on your work computer is what we're saying. Well, this would be a good time to tell you, though. If you're listening to the application or the podcast through your podcast app on your phone, then you likely can see the show notes directly in your podcast app. Oh, that's a good point. And you can click it there. Yes, that's probably not a terrible idea.

Starting point is 00:29:20 Unless it's a work-sponsored phone, then. Don't do it. It's a work-sponsored phone. You're hopeless at that point. I don't know what to tell you. You're already using a work phone for – Or even your personal phone on your work Wi-Fi. That's probably not a good idea either.

Starting point is 00:29:33 Yeah, that's true. There's so many reasons not to do anything with your work type thing. But at any rate, go ahead, Joe. Into the good stuff. Two things and then three things before we get to the good stuff real quick. One, I want to mention that there's exceptions for everything. Some of the companies that we say no

Starting point is 00:29:51 are maybe they have exceptions or they have maybe they allow it next month or something after we record. And also subsidiaries can have completely different rules. If you look at like GitHub's hiring versus LinkedIn's hiring versus Microsoft company hiring, it could be way different. Individual teams like the Windows team and the Azure team, you know, which I'm sure are bigger than one individual team, but those groups are going to have different hiring policies and whatever.

Starting point is 00:30:18 So it's not really enough to say that one company does or doesn't. But we do have a list of like, I don't know, maybe 20 or 15 links here that are worth perusing if you are thinking about making a step to like a big or prestigious company. And I think it's worth considering that if you haven't done that in your career yet. Because having a big name on your resume can open so many doors down the line. So it might be worth giving a shot if you're, you know, suddenly in a position where you want to be switching jobs and these things have become available. Because, like, once you have, like, an Apple or Facebook or something, like, I mean, it just does really great for the rest of your career. So it's something to consider.

Starting point is 00:31:01 And it's been Apple. So there are three companies that I looked and they are, you know, three FAANG companies that I saw don't seem to allow remote working. They're not hiring remote workers. Now all three of them have remote policies for COVID, but I couldn't find a single remote job listed.

Starting point is 00:31:20 That was like for programming for these three companies. And that was Apple, Netflix, and Google. That surprises me on Netflix. Honestly, I don't know why. Oh, man. So, Netflix, their CEO

Starting point is 00:31:33 is straight up anti working from home. Interesting. Yep. He's been interviewed. He's got some quotes talking about how it's just not as good and how it's, you know, basically all the downsides of it. So I don't see Netflix really going that route anytime soon. They're the most antagonistic that I found. Apple,

Starting point is 00:31:52 like, they're hard to even browse their jobs. Honestly, it's like, do I have to create an account? And like they have the Apple store stuff too. It's kind of mixed in there. So it's just kind of overwhelming.

Starting point is 00:32:01 So it's really hard for the developer program to get the certificate so that you can sign your request to query it. Even to try and search for their jobs, it's kind of hard to do without signing up for an account and logging in and stuff. So I just thought that was kind of a bad experience overall. I mean, you know, if Apple finds us, great. But that was just kind of strange to see a company that kind of makes it difficult to find out if they're hiring or not. I'm really surprised about Google, though, because I could have sworn that there was a story that came out last year with them announcing that they were going to. I know as part of the pandemic that Facebook had made big strides to going remote. And I forget what the percentage was, but by some time and within the next few years, a large portion of their workforce was going to be remote, if I recall.

Starting point is 00:32:55 Yeah, Facebook definitely did that. Yeah, and so they're one of the ones on the list. Let me mark that off. Yeah, Facebook has a lot of remote jobs open, and I've been contacted by many recruiters letting me know that Facebook is hiring and looking for remote workers. So they're definitely one of the ones. The search has a really easy way to just click remote so you can see all the engineering jobs that are remote. And yeah, just double-checked Google. There's no way to search for remote jobs. I want to say if you search remote, you'll find like one, and it's for something else. It's like remote controls

Starting point is 00:33:25 or something. Before we get into some of the FAANG companies, I've got to mention two of our sponsors for this episode also have remote careers. Datadog and DataStax both have career pages with remote listings.

Starting point is 00:33:42 So that's pretty awesome. Very cool. I've got several buddies now that are working for Microsoft remote now. And initially it was just the people I knew that were kind of in consulting-ish or, you know, whatever the type of roles. Not anymore.

Starting point is 00:33:58 I know a couple people now that are doing remote development full-time jobs for Microsoft. Very cool. I mean, Microsoft has really worked hard towards putting together the tools necessary to even be able to do that, right? Like Teams. Teams is not as good as Slack, but it's better than a lot of options out there.

Starting point is 00:34:20 And it would make, I mean, my kids use it for school. Like, it makes sharing documents and interacting and that kind of stuff really easy. And so they've definitely been investing in what I believe they saw when the pandemic started, that everybody's going to be in a remote world, you know. Yep. And they're so big too. Like, you could work for Microsoft and work for GitHub, which by the way is 100% remote. 100%? They have a remote.

Starting point is 00:34:48 So, their development team is remote, I should say. Interesting. You know, I'm sure that somebody has an office somewhere and, you know, whatever, sales, who knows. But, yeah, they have a big thing about their remote culture on their website. But I didn't know that before looking for it. But yeah, just Microsoft, like, hey, GitHub, LinkedIn, Xbox, Azure, Windows, Office, SharePoint. They have a huge consulting arm. It's just huge. You know, SQL Server, both sides, making SQL Server and also, you know, people writing the tools around it and stuff. So yeah. Amazon. Amazon, went to their site, they got 10,000 openings. God. And that's not

Starting point is 00:35:29 like warehouse jobs, it's just freaking Amazon jobs. Are we talking about it? Oh, it's 11,000 now. Open jobs, and they're IT related? Or it's all IT. If I filter it out to remote and software development, they are hiring for 165 remote software developers right now. That's a lot of remote positions that you could go apply for right now. Right. Amazon has a really rough, uh, you know,

Starting point is 00:35:57 interview process. So you're going to want to do some problems on like LeetCode or something beforehand and read up on that. But if you're thinking about making that jump in 2021, here's a great option for you. That's a lot of open seats. And, uh,

Starting point is 00:36:11 we actually know somebody who, uh, I didn't, I haven't told you all yet. I'll tell you later, but, uh, they just got a job with Amazon,

Starting point is 00:36:17 uh, development from Nebraska. Cool. Remote. So very cool. Cheers. You know who you are and it pays well um yeah especially years

Starting point is 00:36:28 three and four, man. All right. So, I mentioned GitHub. GitLab, same thing. Two big Git companies, just saying, have fully remote development teams

Starting point is 00:36:42 if you knew anybody that liked Git. Yep. Twitter is one of those companies that changed their policies and now they're hiring remote developers, so that's pretty cool. And I've got a couple here that have just, I guess GitHub and GitLab probably too, but they have always been remote developers. So Confluent, we've talked a lot about with Kafka, Elastic.

Starting point is 00:37:07 Mongo has a lot of jobs that are open. Netlify, Heroku. And I should mention too, like all of these things, like if you're a front-end developer, I hope you don't feel excluded because I'm talking about like Elastic and Mongo and Kafka because they all have websites, they all have documentation, they all have tools that have user interfaces that are all built in web.

Starting point is 00:37:27 So, you know, these are not back-end distributed jobs. It's a mix of everything. Datadog, you know, they're a huge company and they do a lot of monitoring APM type stuff. They have a lot of visualizations. They've got a really strong front-end engineering team. So I hope you don't feel left out there. I do have a couple, I've got three links here to remote game jobs in case you're looking for getting into game development, game studios,

Starting point is 00:37:58 you're looking for remote workers. RemoteLeaf and RemoteOK.io are two sites that I found that have job listings that are somewhat curated for remote development companies. And I added one in here because I didn't see it in your list. And that was just baffling to me, because no developer would be complete without Stack Overflow, and they have remote work. And we actually, Outlaw

Starting point is 00:38:27 And I, I believe it was just us. I don't think Joe was there. Correct. Sat in on a meetup where they did talk about their interview process, which was really interesting, right? Like,

Starting point is 00:38:40 uh, I mean, we don't want to dive into it too deep because I think we actually talked about it a little bit during our interview. Yeah. But one of the interesting things that they said is they would throw a problem out there and see how people start implementing it. And then they would change the rules a little bit to see, oh, well, the requirements just changed. What are they going to do?

Starting point is 00:39:03 Right. Did they back themselves into a corner? How are they going to fix that? Like, it's very much an evolution of, okay, the interviewee did this, let me see if I could throw a wrench in there. Okay, they did that, now let me see if I can make a left turn up here. Right? Like, it was just really cool to hear. So yeah. So, uh, sorry, go ahead. I was just going to add to it. Like, it was one of our favorite, we've described it as one of our favorite meetups. Yeah. Because it was a very unique, uh, take on the meetup in general because they were basically doing like a for-real interview, but we got to watch it.

Starting point is 00:39:46 Like we were, it was almost like we went to a small off Broadway play and, you know, it was like the technical interview off Broadway and, and, you know, but periodically during the, during the meetup,

Starting point is 00:40:00 they would like interact with the audience there. Yeah. It was done really well. Yeah. How would you like that to be your interview? Like, hey, you want to do it at a meetup? We're doing this live. Oh my gosh. No. Can I poll the audience? It'd be like Who Wants to Be a Millionaire, right? Like, you know, can I,

Starting point is 00:40:23 can we do a 50 50? Can we kill two of the answers? Oh, my gosh. Can I phone a friend? Oh, my gosh. Who should do that on YouTube? So, I'll end with a question that I always like to ask people when they're talking about jobs or whatever. I strongly believe, I've said for a long time, that you should always know the top three companies that you would want to work for today if you lost your job or had to switch. Because then you can try to manage your car, your career, and try to kind of line yourself up with

Starting point is 00:40:57 those things if those are, you know, really who you want to work for. And if they're not who you really want to work for, find someone that you do really want to work for and keep that in mind. And now I want to expand it and say, like, you know, if you're the kind of person that wants to, you know, work in an office and whatever, so fine. You should know your top three in your local area that you could go work for and go apply for tomorrow. Now you should also, as a backup plan B, you know, we just listed like 20 companies. Go pick three that you think are the most interesting and just low key and in the background,

Starting point is 00:41:26 maybe follow them on Twitter, follow their engineering blogs, whatever, start learning about them. So that if you decide that you want to make a big leap in your career and try to do something, make a big move, then you know what to kind of stock up and get that ammo in your back pocket.

Starting point is 00:41:41 Today's episode of Coding Blocks is sponsored by Datadog, a software-as-a-service-based monitoring and analytics platform for cloud-scale infrastructure, applications, logs, and more. Datadog uses machine-learning-based algorithms to detect errors and anomalies across your entire stack, which reduces the time it takes to detect and address outages and helps promote collaboration between data engineering operations and the rest of the company.

Starting point is 00:42:09 And if there's any company that you want to trust to know how to detect your errors and your anomalies, it's going to be Datadog because they have an article for everything. They have technology. They've got a plugin for everything that you want built into the tool. Like, there's like 400 plus of these things just built right into it. And I promise you, like, when I say they're everywhere and they know everything about this stuff, like there's a

Starting point is 00:42:36 Kubernetes podcast. What was it? Actually, it's called the Kubernetes Podcast from Google, episode 137. They were just on talking about the container report. So their expertise is out there. And you should trust them for this stuff. Yeah, absolutely. I love that episode. Actually, that's where I found the container report that we talked about in this episode. And we talked about Datadog a lot in this episode because

Starting point is 00:43:08 they are really highly relevant. They answer a really important problem, which is seeing what the heck is going on. And they end up saving you a lot of money and saving you problems and saving you time. Time is money. So, it's a

Starting point is 00:43:24 no-brainer to me, but, uh, you know, well, like I say, they got a cool t-shirt, so maybe I'm a bit biased. Yeah. And hey, you want to get a cool t-shirt? Let me tell you how. Here's your secret. Shh. Tell only all your friends. Go to datadoghq.com slash codingblocks today to start

Starting point is 00:43:44 your free 14-day trial. And if you start that free 14-day trial and you install Datadog's agent, Datadog will send you a free t-shirt. And it's a super cute t-shirt. It's got the Datadog logo on it, which, you know, if you're a cat person, I'm sorry, but it's going to have a dog on it. But you're going to like it either way. Cause it's awesome.

Starting point is 00:44:09 Again, start your free Datadog trial today to start monitoring in real time. Listeners of this podcast will receive a free t-shirt, just like we mentioned, once you install the agent and create one dashboard. So again, that URL you want to go to is datadoghq.com slash codingblocks, datadoghq.com slash codingblocks, to get started today. And remember, only tell all of your friends and family and people you meet on the street.

Starting point is 00:44:39 Just everyone. Only tell everyone. All of them. And check out the careers page. Just saying. Yeah. Good stuff. Hey there.

Starting point is 00:44:48 We could use your review. Really bad. Audible, new platform, just came up with podcasts. They're doing podcasts now. And they've got a way to do reviews. If you are an Audible listener, then you know how important reviews are for, you know, when you find books and stuff. And it's equally as important for us to find, you know,

Starting point is 00:45:08 new people on the podcast. So that's huge for us. That's, you know, that's how we, as a podcast, survive. That's our air, our oxygen.

Starting point is 00:45:16 So, uh, if you go to codingblocks.net slash review, we try to make it easy for you. We try to give you links to places that will let you find the podcast on those platforms and leave reviews. And so if you like what you're hearing, then take a minute and leave us one of them five-star babies, because we love this. All right. And with that, we head into

Starting point is 00:45:37 my favorite portion of the show, Survey Says. All right. So a few episodes back, we asked, I mean, it was the holiday time. So this question is definitely gonna make more sense if you frame it in that mindset. What's your favorite Christmas movie? And your choices were, It's a Wonderful Life. I like the classics. Or A Charlie Brown Christmas. That poor, poor Christmas tree. Or Frosty the Snowman with a corncob pipe and a butt on his nose.

Starting point is 00:46:14 Or wait a minute. Or A Christmas Story. Only I didn't say fudge. Or How the Grinch Stole Christmas, because my heart is three sizes too small. Or The Nightmare Before Christmas. It's what I imagine happens when Jack White meets Tim Burton. Or Rudolph the Red-Nosed Reindeer. An awkward story about being different enough to be important. Or Die Hard. It is a Christmas movie. Welcome to the party,

Starting point is 00:46:47 pal. All right. So, uh, thanks to, to Tucko. Uh, we know that Alan would go first because this is an odd numbered episode.

Starting point is 00:46:58 This one's hard, man, because like all of these are great. Right. But I'm going to say that the overwhelming majority is going to say A Christmas Story. Only I didn't say, um, and I'm gonna go, it was painful. Was like, just get it out. Say it already. Right.

Starting point is 00:47:30 I think I'm going to go with 25% because there's a lot of choices here. Okay. 10% on. Did you pick mine? I mean, that's the right answer. Yeah. So 10%. Price is right on me. $1.

Starting point is 00:47:49 A Christmas Story, 10%. Okay, so Joe goes A Christmas Story, 10%, and Alan goes A Christmas Story, 25%. Is that correct? Yep. Survey says you're both wrong. No. Come on. Nobody picked yours. Of course it's gonna be Die Hard. No. Are you kidding me? I don't believe it. It's not a Christmas movie. It is a Christmas

Starting point is 00:48:17 movie. And you're making stuff up. It takes place during Christmas. I need to see the results. Listen, it was 57% of the vote was Die Hard. Really? 57? A Christmas Story I would have assumed would have been definitely a strong contender for second place. It was third. Wow. The Grinch.

Starting point is 00:48:40 The Grinch. No, you would have thought, right? Nightmare Before Christmas, of course. Charlie Brown was second. No. Right? No, man. No, that doesn't even make sense. Well, I could say 57% of you are cooler than me, apparently. But how Nightmare Before Christmas wasn't, you know, in the top three, I will never understand. I've never seen it. Honestly, it was in the bottom three. I know. Right. Oh my gosh.

Starting point is 00:49:06 It's the only one that I can tolerate, that and Christmas Story, A Christmas Story. Yeah. I don't know how you could think that, how you could say that you can't tolerate Die Hard. Come on. Oh yeah.

Starting point is 00:49:17 Cause it's not a Christmas movie. It is a Christmas movie. Listen, it is a Christmas movie. And let me tell you something. Like, let me point out this. Cause think about it this way.

Starting point is 00:49:30 And the next time you watch Die Hard, you're going to be like, your mind's gonna be, like, blown. Like, you'll never be able to see the movie in the same light again. The movie takes place on Christmas Eve. Now let that sink in for a second, because that means that all those people worked on Christmas Eve. And then after, they had a Christmas party at the office on Christmas Eve. So it was bad writing.

Starting point is 00:50:04 No! That was not the takeaway, Alan. That was the takeaway. Bad writing. But I haven't seen this movie in so many years that I don't even remember what Bruce Willis looks like in the movie. Oh, my God. Alan, I could practically quote the movie from the start. Do you want to just?

Starting point is 00:50:23 Hey, look, man. I know you watch it at least 12 times a year. I haven't seen it since probably 1990. So yeah, it's been a minute. I know. What? How do you, do you not watch Christmas movies at Christmas? What kind of monster are you? Oh, by the way, "it's been a minute": there was a whole discussion on that over in Slack about how, hey, what is this? Is this some new hipster young thing? Like, okay, boomer. Wait, what was the hipster young thing? Nobody had seen Die Hard?

Starting point is 00:51:00 No, no. "It's been a minute." Like, when something's been a long time, you say it has been a minute. So yeah, that was actually a really entertaining conversation. Nobody's seen Die Hard, nobody believes it's a Christmas movie, so we're good there. Oh my God. All right. Well, then you're going to be disappointed when you see part two, which opens up with, you know, Christmas music as well. And they both, one and two, end on the same Christmas song. Man, I'm going to have to watch it, like, maybe this weekend. I don't know.

Starting point is 00:51:33 All right. You definitely should, but this is not the time of year to watch a Christmas movie. That would just be awkward. So, are we saying that there were parts of Forrest Gump that were, like, in the 80s, so is that an 80s movie? Yeah, I would say. Yeah, sure. I like your logic. Maybe. Sure. Or 90s movie, whatever. Life is like a box of chocolates. Yeah. Oh God. Now do we all have to do like a Forrest Gump impersonation? I think we do.

Starting point is 00:52:08 Because Alan started it. I am offended. I like sauteed shrimp and peppered shrimp. Oh, my gosh. I want to hear Joe's. Come on, Joe. Ow, something bit me. That was my favorite scene.

Starting point is 00:52:26 That was pretty good. I didn't expect it. That was pretty good. All right. I got some big shoes to fill. I like your shoes. I'm in my first pair of shoes. Mama said they's my magic shoes.

Starting point is 00:52:46 She said, they take me anywhere. That's pretty good. All right. Well, uh, okay. I'm going to change the topic cause this is going off the rails.

Starting point is 00:53:00 So instead, Hey, here's a fun one for you. What's the opposite of a formaldehyde? Oh, no, man. None.

Starting point is 00:53:15 You hide that. I don't know. Casualdejekyll. Oh my gosh. I like it. Thank you. Klaus. It's awesome. All right.

Starting point is 00:53:31 So for this episode, then we're talking about, you know, what were we talking about? I forgot now, you know, languages and working from home. So I guess this question has nothing to do with it. I thought it did when I started that. It does after this. Oh, sorry. This is going to make sense later. Hear me now and understand me later.

Starting point is 00:53:53 Yes. All right. So today's episode survey will be, when you start a new project, in regards to the storage technology, do you go with what you know? It's not even a debate. Or try something new.

Starting point is 00:54:11 Might as well learn something. Or seriously consider the options. Deliberate, debate, try not to hate. Or storage. You're funny. This episode is sponsored by DataStax. If you've done curbside pickup from a major retail store, checked Pinterest, or watched a movie on Netflix, you're already a Cassandra user. Why not make something amazing with Cassandra yourself? Okay, so we haven't gotten there yet, but we're going to talk about a lot of data storage technologies in this episode, and Cassandra is one of them. And truly, if you need a scalable data storage mechanism, Cassandra is one of the best out there. But maybe what's not so fun is trying to keep that thing running and chugging along and staying alive. And that is where DataStax can help you out.

Starting point is 00:55:08 Yeah. So as Alan was saying, DataStax Astra does the heavy lifting of managing the infrastructure, serverless scaling operations, creating your data access APIs so you can focus on the code that matters to you. Astra automatically provides standard developer-friendly APIs like REST, GraphQL, schemaless JSON documents, even a native SQL query language, which is their own Cassandra Query Language. It's the easy button for a scale-out, always-on database as a service that spans the globe. I've had my eye on Cassandra for a long time. I'm still really trying to wrap my head around it because it's a whole other class of database than I'm used to dealing with. Wide column, highly available. One of the phrases I've seen used about DataStax particularly is they're viral from day one, which is a really cool notion and something I've been wanting to explore for a long time.

Starting point is 00:56:03 So keep your eye out for a stream. I really want to get that done soon. But I do want to mention when you first sign up with DataStax, they're so user-friendly for developers. One of the first things you do when you create an account with your GitHub account or Google or email and password is there's a drop-down that says, are you just learning Cassandra? Are you building an app or are you migrating an app? And based on that, they cater that beginning experience for you. So for me, obviously,

Starting point is 00:56:31 I picked learning Cassandra and it immediately threw me into really nice getting started guides that showed me basically what the heck was going on and how I can really get my hands on it and start moving immediately, which is just really exciting for me. So, I mean, one of these days, I just got to get some time on a Saturday and just get in there and learn what's going on. Yeah, and we'll get into use cases of when you might want to use Cassandra. But one of the things that I love about DataStax Astra here

Starting point is 00:57:03 is the multi-cloud, multi-tenant, or dedicated clusters on AWS, Azure, or Google Cloud. Like, pick one. Pick three. Whatever you want. They're there. They're there for your needs. And, by the way, you can get started in five minutes or less. No credit card needed. Five gig free database.

Starting point is 00:57:28 Go to datastax.com slash codingblocks to sign up today and get $300 free credit with promo code coding blocks. That's pretty amazing, right? So again, that's datastax.com slash codingblocks. Sign up today and get your free $300 credit, but you got to use the promo code coding blocks. All right. So we're back. And now that thing that we did in the past that we said you'd understand in the future, we're here now. We're in the future. It was like 60 seconds ago. Remember? Yeah, back then.

Starting point is 00:58:09 So of late, I've had some internal musings about this. And I can honestly say that the three of us have spent a lot of time dealing with storage technologies and what they're really good at and what they're not so good at. And you can spend a lot of time trying to work around what they're not good at, right? Like, we did an episode on search engines. I don't know when. I'm sure Outlaw knows the number off the top of his head. But one of the things that we said is they're really good for search, right? Like, really good for reading, not so fast at writing because it has to index things and do all kinds of stuff. Right? So maybe all you know is Elasticsearch. All you've ever done is write search applications. So it makes sense. And then you get this new project handed to you and you

Starting point is 00:59:19 have to do relational data. And a lot of people will just say, well, I know Elasticsearch. I'm going to figure out a way to make it work in here, right? Because that's where I've spent all my time. That's what I'm going to do. And vice versa, I've seen where people are like, well, I know SQL Server. We'll make it do it. And I just kind of wanted to bring up, like, if you're going to be working on a project that's going to have any kind of substantial amount of lifetime or data to it, and it's not just some little project that you're spinning up to make something happen, it's worth knowing about the different types of storage engines out there just so you can make a decision that will benefit the project and you and the organization as a whole as time goes on. Right. So I don't know,

Starting point is 01:00:16 you guys will talk about the different types here in a second. You guys have any follow-up thoughts on that? I think it's kind of like the languages, you know, we said there's different categories. And so kind of knowing the ones, or kind of having a favorite or two from the ones that you're most interested in, is really interesting. So I was actually just going through the categories and saying, like, hmm, which ones am I most familiar with here and there? And so it was just kind of interesting, and especially when we talk about things like Presto or whatever, that kind of tie things together. I was just kind

Starting point is 01:00:47 of thinking about what that means here. I think it's easy for all of us, whether it be the language that you want to code in or the storage technology you want to use, it's easy for us to, uh, you know, I have a hammer and everything's a nail, right? And so, you know, by that, I mean, like, I'm going to use C# to solve every problem I ever have, or I'm going to use SQL Server for everything. You know, so I think that the onus is on us to, you know, try to recognize if we are doing that and, you know, to be aware of the pitfalls that come with that. Because then you're going to try to code your way around problems that might not exist had you picked the right technology for your particular use case. And so you need to evaluate what your use case is carefully to make sure that you understand, like, you know, yes, this would be the technology of choice for that thing.
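
As a rough illustration of the earlier point that search engines read fast but write slower because they have to index things, here is a minimal sketch of a toy inverted index in Python. It is not how Elasticsearch is implemented; the class name TinySearchIndex and the sample documents are made up for the example. The only idea it tries to show is that every write does extra bookkeeping so that a search becomes a few set lookups.

```python
from collections import defaultdict


class TinySearchIndex:
    """Toy inverted index: extra work on write buys cheap lookups on read."""

    def __init__(self):
        self.docs = {}                    # doc_id -> original text
        self.postings = defaultdict(set)  # term -> set of doc_ids containing it

    def add(self, doc_id, text):
        # Write path: tokenize the text and update one posting set per distinct term.
        self.docs[doc_id] = text
        for term in set(text.lower().split()):
            self.postings[term].add(doc_id)

    def search(self, query):
        # Read path: intersect the posting sets of the query terms (AND search).
        terms = query.lower().split()
        if not terms:
            return []
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)


index = TinySearchIndex()
index.add(1, "remote jobs for backend developers")
index.add(2, "remote jobs for frontend developers")
index.add(3, "office jobs for backend developers")

hits = index.search("remote backend")
print(hits)
```

Asking this structure a relational question, such as "which companies have more than two remote listings," would mean writing the grouping and joining by hand, which is the kind of coding around the tool that the hammer-and-nail comment warns about.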

Starting point is 01:01:57 One thousand, hundred thousand percent agree with what you just said. If you start feeling like you're creating an inordinate amount of code to make something happen and it feels unnatural, it probably is, right? Maybe there's not something that meets your exact need. Maybe there is. But at any rate, so let's go through some of the categories here. So first, I want to mention that I did, I love this website. I've been there several times over the years. It's db-engines.com and they have a ranking. Like you can just go to the ranking and they list them all out, like from the most popular to the least popular. And there are so many of these things like on this page, they have 334 storage engines listed on this one page.

Starting point is 01:02:50 That's a lot. I don't think by any stretch you should be going down to the bottom of the list to picking something unless you just want a side project. But, you know, it's cool to know that there's a lot out there. So first up is probably the one that has been around that most people are familiar with the longest. And that's the relational database. Now, I put a couple asterisks next to this thing simply because they used to, like even on the site here, this dbengines.com, it used to just be listed as a relational database. Now, when you go there, rightfully so, almost every one of them says relational comma multi-model. And the reason they do this, and this is important, it's because most people have done just what Outlaw said a minute ago, which is, I have a hammer, everything's a nail.

Starting point is 01:03:48 If you look at SQL Server, it has functionality to be a search engine. You can turn on full text search indexing and it can be a search engine. It's not a scalable search engine, but it's got one. You can do graph databases in SQL server. It has the ability to do graph database type stuff in there. It's got the ability to do things like JSON stuff. Like there are so many different things that they've crammed into it because people have said over time, I pay for SQL server. I want to use it for everything, right? And so that's what they've done. That's actually the case also with PostgreSQL, probably Oracle, and a couple other

Starting point is 01:04:33 things that they threw in here. I've also got MySQL on that list. A couple other ones that I threw in here are some of the cloud-provided ones. So there's Google BigQuery. There's Azure Cosmos DB. You're going to hear that one a lot because they've tried to make that the Swiss army knife of databases in the cloud. So yeah, man, relational databases. If you are dealing with data that is related and you need lookup data that ties to 10 different tables or something, that's your tool. Outlaw, you look like you're about to say something. Well, yeah.

Starting point is 01:05:10 Thank you for recognizing. I didn't even raise my hand. You didn't, right? I see these cues. I forgot. That's what threw it off. I should have raised my hand. Right.

Starting point is 01:05:21 Yeah. You were talking about how they're constantly adding things in and it becomes a Swiss army knife. Like, I think it was the 2017 version of SQL Server that added in support for R and Python directly in your code. Like you could write a stored procedure or code that would call out to Python and pass in the results. It was actually kind of neat, especially if you're thinking about it from a machine learning kind of perspective, if you wanted to, you know, do anything with that. But I never saw anything, like, real world with it, though. It seemed kind of like, that's cool that you can do that, right? I don't know that you should do that. Yeah, you asked for it, we gave it to you. Now, you mentioned the JSON support

Starting point is 01:06:15 though. PostgreSQL actually has some pretty cool JSON support in it. Very good JSON support, as a matter of fact. Like amazingly good. Yeah, it doesn't feel like it's necessarily an afterthought. Like it was well approached. Probably, you know, don't use it for all your JSON needs. Probably not. So is SQL Server still y'all's go-to for relational? Hmm. Hmm. It would have been a couple years ago, like five years ago, right? All right.

Starting point is 01:06:52 It hasn't gotten any worse. No, no. It's getting better. Here's where I'll say. So I think I know what Outlaw is going to say. On SQL Server, the reason it would be a go-to for me is all the amazing tooling built around it. If you need to move data, you got SSIS. If you need extended tools, you've got Redgate tools, right? Like there's so much built around SQL Server because it's a paid product that people are willing to pay for,

Starting point is 01:07:25 that there's a lot of things out there for it, right? But now, Outlaw, I'll let you take what you were about to say because I think I know where you're going to go. Yeah, I mean, I've had a love affair for the past few years now with PostgreSQL. So there's nothing else I would even consider if I, I mean, it's free. So right there, like, and there's a huge community around it. So, you know, unless, unless you had some reason why that wasn't going to be like,

Starting point is 01:07:54 if it was too much or something like, you know, maybe if it was like an IoT kind of application or something where you needed a super light footprint and maybe you just wanted like a SQLite database or something

Starting point is 01:08:07 like that. But otherwise, yeah, for any kind of real lifting, you know, I would consider PostgreSQL well before I would go to SQL Server. What about you, Joe?

Starting point is 01:08:18 Yeah. I mean, that's where I'm at. I am so much more comfortable in SQL Server though. Like, I definitely think in SQL Server first.

Starting point is 01:08:26 And so it pains me to say that, but I don't really understand that SQL Server licensing. I would do a managed service for, if I was doing like a cloud kind of product that had a backend, I would use a managed service for a database anyway. So, you know,

Starting point is 01:08:41 I'm kind of tempted to go SQL Server, but Postgres, I don't know, it just seems like it kind of is a natural fit there. And I know it's free. So, yeah, I'm torn, but I would probably go Postgres, I guess. And you can run it on any platform, and it would also be available in any cloud platform that you'd want too. Now that's if you want relational data. There's SQL Server too, though. Yeah. Cloud SQL has SQL Server, Azure. Yeah, yeah, yeah. But it's the three big.

Starting point is 01:09:12 I know, but it sounded like that's why you were saying you would go SQL Server, though. Maybe I misunderstood, but it sounded like you were saying that was one of the things you liked about going with SQL Server, that it would be available in cloud platforms. And I'm saying, like, so is PostgreSQL. But also, like, you know, yeah, what version was it where they introduced the Linux capability for SQL

Starting point is 01:09:36 Server? I think it was 2017. Okay. Yeah, you know, but the point is, it's still a relatively new thing for SQL Server, but it's not for Postgres. You know, that's a well-oiled machine by now in Postgres. There is one more thing I want to call out about relational databases in

Starting point is 01:09:58 general. I'm not saying completely because there are exceptions, but they typically don't scale horizontally, right? And that is a very, very big thing. If you know that you're going to be dealing with terabytes of data that you need to do relationally, it may not work out great, right? And that's one thing to call out. Now go ahead, Outlaw. Asterisk,

Starting point is 01:10:27 right? There's an asterisk on that statement because, again, my love of Postgres. Like, you know, that point about horizontal scaling is true for the writes, but not for the reads,

Starting point is 01:10:40 which still can matter. So, yeah, it depends on what your application is. This is where it goes back to knowing your use case, right? If your use case is an application where, you know, it's going to be heavy reads, like, take a look at, you know, codingblocks.net. It's my favorite website. It's the most popular one according to Alexa. You know, we get a bajillion concurrent users, and so it's heavy reads, right, and limited writes to it. So depending on your use case, you know, it might be fine. And honestly, who am I kidding? I'm going to try and run my database in Kubernetes for as long as I can.

Starting point is 01:11:26 I'm there. I'm sold. I was actually just scrolling through the Datadog data report for 2020. Did you know that Postgres is the third most popular container image? Oh, that's really interesting. Number three. No, I didn't. They have a lot of data at Datadog.

Starting point is 01:11:44 What's one and two? Nginx and Redis. Okay, that makes a lot of sense. Yeah, I'll just blast those real quick. It's crazy to me how many of them are database related. I guess not. I don't know. I have completely thoughts about why.

Starting point is 01:12:00 But Elasticsearch is number four. Then MySQL, number six. RabbitMQ and Calico are tied. MongoDB is number seven. Number eight. Kafka. Yeah. That was a lot.

Starting point is 01:12:17 A lot of data storage engines there. I think all of them, except for NGINX, all of them have been storage related. Yeah. Next is GitLab and Jenkins. They're tied. Interesting. And the last one is Vault.

Starting point is 01:12:33 That's HashiCorp's Vault. It's super interesting to see how many were data-related. I was wondering, is that because the app stuff is just so fractured? It's like different languages and stuff. It's like you might have a company that's Python, Ruby, C#, Java, but they're all using Postgres. You know, I think that's exactly what it is.

Starting point is 01:12:55 I mean, if you boil it down to front end or server type technologies that are serving up things, there's so many different flavors of it. So you're not going to get a strong focus on a particular one, right? Like you probably have Flask and Django type containers that are running, and that right there splits your group in half,

Starting point is 01:13:22 right? Whereas everybody's got to store that stuff somewhere. So it kind of makes sense that those data storage engines would be high up on that list, I would think. And I should clarify, too, this is a top, they call it off-the-shelf images. So this means you are straight up running the image from Docker Hub. Right. And for something like Python, you're going to do a FROM python,

Starting point is 01:13:42 and you're going to add all your stuff on top of it. So that's probably a huge part of that, too. So that's definitely something. But it's interesting. And they've got a really great graph here on the survey report. Yeah, Datadog's good at that stuff. Yeah. And did you know, like, we're on a Datadog kick now.

Starting point is 01:14:03 No, forget it. Just go to their blog. They've got a topic for everything, and I'll leave it at that. Right. Get you a shirt while you're at it. They really do. All right.

Starting point is 01:14:15 So we talked about the relational databases, right? And the fact that you're going to use them for relational data, reads, writes, transactions. If you need transactional data, they're really good at it. Most of them, right? Meaning that you need to write an order, then you need to write order details, and you need to write payment details before you can say something's done. They have transactions that basically say, hey,

Starting point is 01:14:40 if this failed in the middle, roll it back. Don't keep any bad, dirty data in there. Roll it back, right? So that's what we're talking about there. So relational databases are really good for that. Lots of use cases. The next one up that I've got here is search. So search engines: Elasticsearch, Splunk, Solr, Azure Search, AWS CloudSearch. There's a bunch of them. This is something where you write data, the engine indexes that data so that you can search it effectively to get stuff back out.
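
A minimal sketch of the order, order details, and payment rollback described just above, using Python's built-in sqlite3 module as a stand-in for whatever relational database you run. The table names, the place_order helper, and the payment_ok flag are all made up for illustration; nothing here comes from the episode itself.

```python
import sqlite3


def make_db():
    # Hypothetical three-table schema, kept in memory for the example.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
        CREATE TABLE order_details (order_id INTEGER NOT NULL, item TEXT NOT NULL);
        CREATE TABLE payments (order_id INTEGER NOT NULL);
    """)
    return conn


def place_order(conn, customer, items, payment_ok):
    """Write the order, its details, and the payment as one all-or-nothing unit."""
    try:
        # "with conn" commits if the block finishes and rolls back if it raises.
        with conn:
            cur = conn.execute(
                "INSERT INTO orders (customer) VALUES (?)", (customer,)
            )
            order_id = cur.lastrowid
            conn.executemany(
                "INSERT INTO order_details (order_id, item) VALUES (?, ?)",
                [(order_id, item) for item in items],
            )
            if not payment_ok:
                # Simulate a failure in the middle of the unit of work.
                raise RuntimeError("payment failed")
            conn.execute("INSERT INTO payments (order_id) VALUES (?)", (order_id,))
        return True
    except RuntimeError:
        return False


def count(conn, table):
    # Table names are fixed strings in this sketch, so the f-string is safe here.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

If the payment step fails, the order and order detail rows written a moment earlier disappear with it, which is the "don't keep any bad, dirty data" behavior being described.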

Starting point is 01:15:13 It is highly optimized for reading and truly using search capabilities to get it back, right? And a lot of times these things are horizontally scalable. Elasticsearch, I don't know about Solr. Splunk definitely is. Azure Search does it for you behind the scenes. AWS CloudSearch does it for you behind the scenes. But yes, not great for updating documents and large amounts of data. Fantastic at searching large amounts of data that you've indexed. I thought, am I wrong? I thought AWS CloudSearch was based on Solr. It might be on the Lucene engine.

Starting point is 01:15:57 Very much, very possible that it is. There is AWS Elasticsearch and AWS CloudSearch, and they may both be Elasticsearch under the hood. Yeah, CloudSearch is their own. You don't really know what's behind the scenes, right? You just use their interfaces. Yeah, I mean, it probably is using Lucene or Solr behind the scenes, but yeah. It's a managed service, right?

Starting point is 01:16:25 It's a managed search engine for you. Whereas the other ones you're going to have to stand up and maintain and keep running. It is Solr. In the AWS developer forums, there's a question about, is it Solr based? Is CloudSearch Solr or based on Solr? And an Amazon employee responds back, it does use Solr as the underlying engine.

Starting point is 01:16:51 Okay. And he references an article in that CloudSearch FAQ. Cool. So I guess my key point here is if you are doing something like a search engine, I'll never forget years and years ago, somebody reached out to me. I don't even remember why me specifically, but it was random from the internet. Somebody was building a WordPress plugin site that was more searchable. So if you've ever had to search for WordPress plugins, their search is annoying. Like you can't sort by the most downloads or you used to couldn't or the highest rated or, hey, only give me the highest rated that has more than three ratings, right?

Starting point is 01:17:33 Like there was no great way to sort through the plugins. And so this guy decided he wanted to create a website. Let's call it WordPress plugin search.com or something. And what he did is he crammed all the data into a MySQL database. And guess what? Just like what we said earlier, relational databases are good at some things, not so great at others. So search was one of the things that it struggled with because you're not indexing the data eight ways from Sunday, right? So if he had used a search engine, put this stuff in there, and indexed it once a day, then his search would have been super fast. He would have had crazy quick aggregates,

Starting point is 01:18:17 paging, all that kind of stuff, right? That's what search engines bring to the table. But if you're trying to do something like update the number of page hits, every time somebody clicks a page, a search engine is not where you want to try and do those updates, right? It's not made for that kind of stuff. I don't know if you guys have anything else you want to add on to these. I mean, specific to the, you know, I mean, I don't want to bash on that one. We did, we did do a whole deep dive on search-driven apps. That was episode 83, and we covered a lot of the technologies there. Yep.
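
To make the "index it once, search it fast" idea from the plugin story a little more concrete, here is a toy inverted index in Python. It is only a sketch of the general technique, not how Elasticsearch or Solr are actually implemented, and the plugin records, field names, and helper functions are invented for the example.

```python
from collections import defaultdict

# Made-up plugin records standing in for the WordPress plugin data.
PLUGINS = [
    {"id": 1, "name": "fast cache", "downloads": 900, "rating": 4.8, "ratings": 12},
    {"id": 2, "name": "cache buddy", "downloads": 5000, "rating": 4.9, "ratings": 2},
    {"id": 3, "name": "seo helper", "downloads": 300, "rating": 4.1, "ratings": 40},
]


def build_index(docs):
    """Map each word in a plugin name to the set of plugin ids containing it."""
    index = defaultdict(set)
    for doc in docs:
        for word in doc["name"].lower().split():
            index[word].add(doc["id"])
    return index


def search(index, docs_by_id, term, min_ratings=0):
    """Look the term up in the index, filter by rating count, sort by downloads."""
    hits = [docs_by_id[doc_id] for doc_id in index.get(term.lower(), set())]
    hits = [doc for doc in hits if doc["ratings"] > min_ratings]
    return sorted(hits, key=lambda doc: doc["downloads"], reverse=True)


index = build_index(PLUGINS)  # the "index it once a day" step
docs_by_id = {doc["id"]: doc for doc in PLUGINS}
most_downloaded = search(index, docs_by_id, "cache")
well_reviewed = search(index, docs_by_id, "cache", min_ratings=3)
```

The lookup never scans every record; it jumps straight to the ids listed under the term. The flip side is the one mentioned above: a change to a record means touching the index too, which is why frequent small updates like page-hit counters are a poor fit.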

Starting point is 01:18:54 All right, so the next category I have up is document databases. And you might have heard of this one, Mongo. We've got Amazon's DynamoDB. It's actually AWS DynamoDB. And then Azure Cosmos DB showing up again, and then CouchDB. And really what this is, is just what it sounds like. You're storing an object. So if you're familiar with JSON, you've got a definition of an object, right? If it's a person, it has a first name, last name, and, you know, an age, we'll say. You store that entire document as a record over there. It's not broken out into separate fields or anything like that. If you ever want to come back and update that document, it's different from a relational database. Let's say, just as an example, that you had an employee thing, right? And all three of us report to the same manager.

Starting point is 01:19:52 In a relational database, you can just say, hey, they changed managers from manager one to manager two, right? In a document database, you don't have those luxuries. You've got to go find all those documents, pull them out, and update each one of them individually to point to it. And a lot of times you save the data denormalized too, because it just makes more sense. It's faster, right? You don't have the same type of luxuries typically that you do in a relational database. And it's not as flexible. You know, in the relational database, you can kind of get this from that, join this, that, whatever, no big deal. And the query optimizer will figure it out. There's NoSQL.
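
Here is a rough sketch of the manager example: one UPDATE statement on the relational side versus walking every denormalized document on the other. The relational half uses Python's sqlite3 module, and the "document store" is just a list of dicts, which is an assumption for illustration rather than any real document database API. The schema and helper names are hypothetical.

```python
import sqlite3


def make_relational():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE managers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE employees (
            name TEXT NOT NULL,
            manager_id INTEGER NOT NULL REFERENCES managers(id)
        );
        INSERT INTO managers VALUES (1, 'Manager One'), (2, 'Manager Two');
        INSERT INTO employees VALUES ('Alan', 1), ('Joe', 1), ('Michael', 1);
    """)
    return conn


def reassign_relational(conn, old_manager_id, new_manager_id):
    """One statement; the database finds and updates every matching row."""
    with conn:
        cur = conn.execute(
            "UPDATE employees SET manager_id = ? WHERE manager_id = ?",
            (new_manager_id, old_manager_id),
        )
    return cur.rowcount


def make_documents():
    # Each employee document carries its own embedded copy of the manager.
    return [
        {"name": name, "manager": {"id": 1, "name": "Manager One"}}
        for name in ("Alan", "Joe", "Michael")
    ]


def reassign_documents(docs, old_manager_id, new_manager):
    """Find every document embedding the old manager and rewrite each one."""
    touched = 0
    for doc in docs:
        if doc["manager"]["id"] == old_manager_id:
            doc["manager"] = dict(new_manager)
            touched += 1
    return touched
```

Both versions end up touching three employees here. The difference is who does the work: the database engine in the first case, your application code in the second.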

Starting point is 01:20:32 We've talked about this before. There's document databases. Like you really, you kind of have to understand how your data is being used in order to arrange that in a smart way because it's kind of on you. But it's really convenient, especially if you have a lot of JSON. Well, sorry, I didn't mean to interrupt. No, you're good. It's kind of interesting that you say, though, that they're not flexible because it really depends. Because we've talked about this before, right?

Starting point is 01:20:56 I think it's part of the Designing Data-Intensive Applications series, because the main difference between a relational database and a document database is when the schema is going to be enforced. Because in a document database, you know, that JSON that you mentioned, one document to the next might not have all the same fields. And so in a relational database, the schema is enforced when you write to the table, right? Like you can't just decide to add a new column and, like, insert into some fictitious

Starting point is 01:21:33 column that doesn't exist yet and expect that to work. Also, if there's a column that's NOT NULL, you can't exclude it unless there's a default constraint on that column in a relational database. But in a document DB, kind of similar to what Joe said, the onus is on you as the caller to enforce whatever schema you choose to use as part of that. Right. So when you read that document, you're going to do the checking to see that it matches what you, you know, expect to have. And, you know, you'll make the decisions how to deal with it at that time.
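
The schema-on-write versus schema-on-read distinction can be sketched in a few lines of Python. The relational side uses sqlite3 with a NOT NULL column; the document side is a plain dict checked by the reader. The people table, the REQUIRED mapping, and both helper functions are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (first_name TEXT NOT NULL, age INTEGER NOT NULL)"
)


def insert_person(conn, first_name, age=None):
    """Schema on write: the database rejects a row that breaks the schema."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO people (first_name, age) VALUES (?, ?)",
                (first_name, age),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised here because age is NOT NULL and no default exists.
        return False


# Schema on read: the store accepted anything, so the reader does the checking.
REQUIRED = {"first_name": str, "age": int}


def missing_fields(doc):
    """Return the required fields that are absent or have the wrong type."""
    return [
        field
        for field, expected_type in REQUIRED.items()
        if not isinstance(doc.get(field), expected_type)
    ]
```

With the table, a bad row never gets in. With the documents, a bad record gets in just fine, and the decision about what to do with it is deferred to whoever reads it later.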

Starting point is 01:22:15 So it kind of is, you know, more flexible in some ways. Well, this is where, you know, looking at order systems in the past, a lot of times they would be done in relational databases, and when people designed the shopping carts or whatever, they wouldn't think about what some of the implications were, right? So for instance, let's say that we set up Joe Zack as a customer. He's got his address as, you know, 123 Main Street or whatever. And if you do that in a relational way, when he orders a product last year and he was at

Starting point is 01:22:58 123 Main Street, that's all good. It's going to get to his house. It's going to say it shipped and it was there. If you kept this in a relational world and now he moves to 345, you know, Second Street, if you update that record, it updates the history of where things happened, right? And people a lot of times didn't think about that in the relational world. This is where something like a document database can really shine, because you can save the entire order as it happened, basically as a snapshot in time, right? So here's the order, here's the order details attached to that same document. Here's the address, here's the billing, the shipping, all of it never changes, right? It doesn't matter if

Starting point is 01:23:40 somebody goes in and updates his information in the address book, that order's a snapshot. So, you know, knowing those use cases matters. And again, like Joe said, knowing how you're going to use that data really drives how you're going to store and set up those schemas in your application. And a lot of times it's up to your application to have the smarts of how it is. Like for me, that's the biggest challenge. You go into a database, you look at a table, you say, hey, this table requires these five columns, right? You go into a document database. The only way you're going to find that stuff is if you start looking at application code typically, right? Unless you start doing some metadata querying to say, hey, what are all the fields that show up in this particular storage area? But it is a different use case.
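
The address example can be sketched with plain Python dicts: one order that only points at the customer record, and one that copies the customer details in at order time. The customer data mirrors the example in the conversation, and the helper names are hypothetical.

```python
import copy


def order_by_reference(customer_id, item):
    """Normalized style: the order only points at the customer record."""
    return {"customer_id": customer_id, "item": item}


def order_as_snapshot(customers, customer_id, item):
    """Document style: copy the customer details into the order as they are now."""
    return {"item": item, "customer": copy.deepcopy(customers[customer_id])}


def shipping_address(customers, order):
    """Resolve the address an order reports, whichever style it was stored in."""
    if "customer" in order:
        return order["customer"]["address"]
    return customers[order["customer_id"]]["address"]


customers = {"joe": {"name": "Joe Zack", "address": "123 Main Street"}}
referenced = order_by_reference("joe", "keyboard")
snapshot = order_as_snapshot(customers, "joe", "keyboard")

# A year later the customer moves and the address book is updated in place.
customers["joe"]["address"] = "345 Second Street"
```

After the move, the referenced order reports the new address, so the history has silently changed, while the snapshot still says where the package actually went. A relational design can get the same result by copying the address onto the order row; the document model just makes that the natural default.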

Starting point is 01:24:30 I'm sorry. I'm just all hung up thinking if I want to go managed service or Kubernetes for my hypothetical application here. Kubernetes. How much pain do you want to endure? Not much. There's two things. How much pain and how much cost? Because I've actually heard people say that Azure Cosmos DB is amazing.

Starting point is 01:24:49 You can run up a bill with that thing pretty quick, right? It's proprietary, though. I just have a hard time with that. I'd like to take this moment to remind you of k8s.af so that as you're contemplating, you know, Kubernetes versus managed, you know, there's a lot of greatness about Kubernetes. I'm not going to knock it, but you know, there's also some horrific stories about it as well. Ah, whatever. Ah, whatever.

Starting point is 01:25:26 You can't listen to them haters. But, but no. I can, I can, because I tell you what, just recently bit me. What recently bit me is that apparently there was a known bug that if you used in a label, a hashtag or a pound sign or whatever you choose to, whichever

Starting point is 01:25:46 you do that, and depending on what your age is, you might call it a hashtag or you know, might call it a pound sign. And it brought down the cluster though, just because that one character was in the label of a

Starting point is 01:26:02 pod disruption budget. Like that is, that is, that is crazy. That is so comical to think that like a single character could bring down a cluster. So yeah,

Starting point is 01:26:17 that's definitely a k8s.af story right there. That's power. No, I'm sorry. There should never be... Just save yourself from panic and never put spaces in names of anything.

Starting point is 01:26:32 Don't use special characters. Yeah. You know, I mean, if you have to, find underscores. If for some reason you can't do that, find dashes. And keep your names short, too. I'm not saying that it was... Truth. Like, I don't even know how that character got in there. I don't know, you know, anything about it. I'm assuming that it was some automation tool, because it already seems like it would be bad, right? Because you would think that that would be treated as an inline comment in YAML, but I don't think YAML supports an inline comment.

Starting point is 01:27:11 And by the way, the cluster he's talking about crashing was actually a Google Kubernetes Engine cluster, right? So we're talking about the company that basically wrote the book on Kubernetes. Yeah, that was impressive work. I wasn't going to throw him under the bus, man. Like, wow. I mean, it would probably happen in Azure. What is it? A.K.

Starting point is 01:27:35 I can't even remember. Azure Kubernetes. A.K.E.? A.K. I can't remember now. I don't remember. Yeah, it crashed the entire cluster, but whatever. All right, so let's move along here because I think I am dragging on these.

Starting point is 01:27:48 So we got key value stores. Those are truly what they sound like. You've got Redis, Amazon DynamoDB, Cosmos DB, again, the Swiss army knife of everything. Memcached and etcd. And if you didn't know, etcd is actually used in Kubernetes behind the scenes for storing values, key value pairs. It's really, you know, hey, you have a value and then you have another value assigned with it. That's as simple as that. The beautiful benefit to these is they typically scale pretty well. If you're talking about something like Redis, you know, that's an in-memory thing.

Starting point is 01:28:31 You probably aren't scaling that. But Amazon DynamoDB, you can kind of throw whatever you want at it and it'll scale. The next one is one that I think me and Joe Zack had fallen in love with a while back, and I've taken a long hiatus from it: graph databases, man. These things are amazing. If you load them up, right? Yeah.

Starting point is 01:28:53 And they make really hard things really easy. Uh, but you know, there's definitely some downsides and, and how you have to work with them is definitely different for me, but, uh, it's just,

Starting point is 01:29:02 it's just really cool to be able to solve some really hard problems. Like, let me see all the people who have this and are associated with that and also have that. And zippy-doo-dah, there you go. And that's something in SQL that would be like many levels of nesting and taking forever to run. And a graph database basically is just set up for that. Right. And the big difference there is, like I mentioned earlier, SQL Server supports graph databases. You could actually set it up to be a graph database. But the big difference is it's almost like with Elasticsearch or any of the search engines, you have to know how you need to put that data in there to be able to set it up to do those relationships.
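
For a feel of the "people associated with that who also have this" kind of question, here is a small breadth-first traversal over an in-memory adjacency list in Python. It only sketches the shape of the query; it is not how Neo4j or SQL Server's graph features work internally, and the names, skills, and helper functions are invented.

```python
from collections import deque

# Made-up "knows" relationships and attributes.
KNOWS = {
    "alan": ["joe", "michael"],
    "joe": ["alan", "sam"],
    "michael": ["alan"],
    "sam": ["joe", "pat"],
    "pat": ["sam"],
}
SKILLS = {
    "alan": {"sql"},
    "joe": {"graph", "sql"},
    "michael": {"postgres"},
    "sam": {"graph"},
    "pat": {"graph"},
}


def within_hops(graph, start, max_hops):
    """Everyone reachable from start in at most max_hops relationship steps."""
    seen = {start}
    found = set()
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if distance == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.add(neighbor)
                queue.append((neighbor, distance + 1))
    return found


def connected_with_skill(graph, skills, start, max_hops, skill):
    """People near start in the graph who also have the given skill."""
    return {
        person
        for person in within_hops(graph, start, max_hops)
        if skill in skills.get(person, set())
    }
```

In SQL, each extra hop tends to mean another self-join or a recursive query. In a graph model the hop count is just a parameter, which is roughly why these questions feel easy there.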

Starting point is 01:29:41 Right. Like you can't just take a relational database and say, oh, here it is, point these at these, and it's all going to work because it's not quite as simple. Although they tried to make it that way with SQL Server. But some of these are Neo4j. I think that's the most popular one by a pretty long shot. Azure Cosmos DB again, Dgraph. And again, I put my SQL Server in here as one of the ones that I know that they have that feature in there. The next one up, another storage type technology, is

Starting point is 01:30:13 time series databases. So this is a database where every entry in there is basically timestamped. They are expecting you to get data in because you want to track it over time, right? So think not necessarily logs, but a lot of times metrics or anything like that that you want to see over time. These are the types of things that you're going to be looking at. And some of the popular ones are InfluxDB, kdb+. I've never heard of that one. And Prometheus. Prometheus is used a lot of times for tracking metrics and applications, right?
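
A tiny sketch of what "every entry is timestamped so you can track it over time" buys you: grouping raw points into fixed time buckets and averaging them. This is plain Python for illustration only. It is not the InfluxDB or Prometheus query model, and the sample points and the downsample helper are made up.

```python
from collections import defaultdict

# (unix_seconds, value) pairs, e.g. a CPU percentage sampled over time.
POINTS = [(0, 10.0), (30, 20.0), (60, 40.0), (119, 60.0), (120, 5.0)]


def downsample(points, bucket_seconds):
    """Average the values that fall into each fixed-width time bucket."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[bucket_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


per_minute = downsample(POINTS, 60)
```

Because every record carries a timestamp, "average per minute" or "last five minutes" is the default question rather than something bolted on, and that access pattern is what these engines are built around.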

Starting point is 01:30:49 Like that kind of stuff. Grafana. Yeah, you'll use Grafana with Prometheus a lot to be able to visualize that stuff. And then, so here's wide column databases. This one's interesting. We'll be talking about one of our sponsors here in a minute, DataStax, Cassandra. Cassandra is one of them, HBase, Azure Cosmos DB again, and Google Bigtable was probably the OG of all of these, I think, because as a matter of fact, I believe that Cassandra

Starting point is 01:31:21 was built based off the white paper that Google had put out on their Bigtable setup. But the wide column one, I was reading about it and really, I don't know the best way to put it other than it's similar to columnar storage, except you can store multiple columns per data file. And so it gives you a little bit more flexibility. So these are sort of in a thing all their own. And a lot of these have a tendency to be able to scale like crazy, right? Like that's Cassandra's claim to fame: you can scale that thing just stupid and you're basically guaranteed a hundred percent availability, right? And it's fast. So all pretty interesting things to know about those. And then the columnar file formats. Oh, I'll let somebody else grab this because it looks like somebody added it here at the end. Yeah, I just threw that in there. That was something that I just think is really cool. Like,

Starting point is 01:32:24 you can do some really cool stuff with just flat files if you've got a tool that you can use to query them, like we talked about Drill or, what was the other one we always talk about? Presto. Yeah. And so columnar file formats are just a cool way of serializing your data so that it can be read and

Starting point is 01:32:40 used really easily. And so we've got Parquet, ORC, and I keep hearing a lot about Arrow, which sounds really interesting, but it's a little bit more involved than just the format. But yeah, it looks neat. So I've got my eye on it.

Starting point is 01:32:54 One of the things when we were first looking at these things that I didn't realize at the time, I don't know why, but when we talk about these file formats, Parquet, ORC, Arrow, whatever, it's storing lots of records in a single file. And then within that file, the records are broken up into columns for their indexing more or less. And a lot of times they'll also have metadata in the files so that you can get quick stats on the files. And that's how tools like Drill and PrestoDB and all that can query

Starting point is 01:33:31 these things so fast. So it's interesting. You take 500 gigs of data and stick it into a Parquet file, or not 500 gigs, 500 megs of data, put it into a Parquet file. And the way that it's stored in that file is what makes it fast to read. But wasn't it, is it ORC that I'm thinking of? There was something that Uber created to work with Parquet files, because Parquet files are not, you can't update those easily in place.
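
Here is a deliberately simplified Python sketch of the two ideas just described: records regrouped into columns inside chunks, plus per-chunk metadata that lets a reader skip data it cannot need. Real Parquet and ORC files add encodings, compression, and much richer statistics, so treat the chunk layout and both functions as assumptions for illustration, not the actual file format.

```python
def to_column_chunks(rows, chunk_size):
    """Split rows into chunks and store each chunk column by column with min/max stats."""
    chunks = []
    for start in range(0, len(rows), chunk_size):
        part = rows[start:start + chunk_size]
        columns = {name: [row[name] for row in part] for name in part[0]}
        stats = {name: (min(values), max(values)) for name, values in columns.items()}
        chunks.append({"columns": columns, "stats": stats})
    return chunks


def count_at_least(chunks, column, threshold):
    """Count values >= threshold, skipping chunks whose max is already too small."""
    matched = 0
    chunks_scanned = 0
    for chunk in chunks:
        _low, high = chunk["stats"][column]
        if high < threshold:
            continue  # decided from metadata alone; the column data is never read
        chunks_scanned += 1
        matched += sum(1 for value in chunk["columns"][column] if value >= threshold)
    return matched, chunks_scanned


# Eight made-up order rows with amounts 10, 20, ..., 80.
ROWS = [{"id": i, "amount": i * 10} for i in range(1, 9)]
CHUNKS = to_column_chunks(ROWS, chunk_size=4)
result = count_at_least(CHUNKS, "amount", 50)
```

The query only reads the one column it cares about and only the chunks whose stats say they might match. The same layout is also why in-place updates are awkward: changing one record means rewriting column chunks and their stats.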

Starting point is 01:34:03 And Uber specifically came out with a technology where they were keeping the metadata separate from the other files so that you could get high-level stats about it. But then they had another technology that they were using to do their write spec. Do you remember? What was the name of that? Hudi. Hudi. That's what it was. Yes.

Starting point is 01:34:23 H-U-D-I. So ORC is actually another flat file format that came, I believe, after Parquet, that's supposed to be more efficient. I think there are some tools out there that are way faster if using the ORC format versus Parquet, but Parquet is kind of the standard, like maybe the,

Starting point is 01:34:44 you know, the one that most things out there support. And it's interesting, too, because we talked about all this, but similar to the conversations we had related to Designing Data-Intensive Applications, there are other storage technology type applications, like a Kafka, that aren't considered a database per se, but they kind of are. When you really start to talk about indexing, then it's like, what's the difference at that point, right? Because, you know, Kafka is kind of a unique animal in this regard because you can go without a schema, so it could be like a document

Starting point is 01:35:36 database or you can have a schema right and you can but that's a separate technology, right? It is. It is. Yeah. But there is like a deterministic way of how this particular file is going to be written to which partition is it going to be written out to, right? All of the behind the scenes of how it's writing is very database-like. And in fact, I think there is like a RocksDB that's used in my – am I thinking of it wrong? Streams uses RocksDB to manage its storing of keys for joining and whatnot. So that's right because that's where Kafka gets confusing because there's Kafka and there's Kafka Streams, and Kafka Streams is an API to work with Kafka. Right, right. So, hey, because we just had this conversation here, I thought it was worth adding this other data storage type on here, and that's queues, right? So Kafka is a queue.

Starting point is 01:36:40 It's not a database or anything. And RabbitMQ is another one that goes into that. It's also just a queue. There is a big difference between the two. RabbitMQ is a true traditional type queue where something goes in, it gets popped off, and then it's gone. Kafka is more of a continual transaction log that you can purge things off of over time, and it's hyper scalable, can handle, you know, trillions of writes depending on the setup. But yeah, the website that I went to, DB-Engines, failed me. They didn't have queues on there.
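
The "popped off and gone" versus "continual log" distinction can be sketched with two small Python classes. These are toy models of the two ideas only. They are not the RabbitMQ or Kafka client APIs, and the class and method names are invented.

```python
from collections import deque


class PopQueue:
    """Traditional queue: a consumed message is removed for everyone."""

    def __init__(self):
        self._items = deque()

    def publish(self, message):
        self._items.append(message)

    def consume(self):
        return self._items.popleft() if self._items else None


class AppendLog:
    """Log style: messages stay put and each consumer tracks its own offset."""

    def __init__(self):
        self._records = []
        self._offsets = {}

    def publish(self, message):
        self._records.append(message)

    def consume(self, consumer):
        offset = self._offsets.get(consumer, 0)
        if offset >= len(self._records):
            return None  # this consumer has caught up
        self._offsets[consumer] = offset + 1
        return self._records[offset]
```

With the queue, whoever consumes a message first is the only one who ever sees it. With the log, a second consumer, or the same one after resetting its offset, can read the whole history again until the records are eventually purged.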

Starting point is 01:37:19 These fall in there, and they're another really important type of storage mechanism to know about because they come in handy for different use cases. So I just picked up on what the difference was between Arrow and Parquet, like what the differentiator was. So Parquet is encoded and meant for long-term storage on disk and files, and it's great for, you know, querying column stuff too. But a big part of that is how it's encoded for writing and for long-term storage. Arrow is optimized for in-memory

Starting point is 01:37:51 columnar operations. So you take a hit on the storage because it's not laid out for that very well, but it's much faster for reading and performing operations. So if you can do stuff in memory, it seems like a really good format. So that seems really interesting. I think I'll have to kind of chew on that a little bit. So smaller,

Starting point is 01:38:08 so, so smaller amounts of data then in comparison, I would think that, or if you're willing to pay more for storage. Uh, I mean, here's one thing to know about these three file formats is they basically grew out of the need for HDFS,

Starting point is 01:38:26 which is the Hadoop file system, because a lot of times, I mean, if you think about it like this, we said that relational databases, there's a falling over point where they just don't scale well, right? They don't, the performance will fall over after you hit something. Well, HDFS is dirt cheap because you can store files out there forever for almost nothing. But if you want a place where you can put your data that you can do mining on later, you typically put it in one of these file formats because Hadoop has all kinds of tools where you can query these files like they're databases and do all kinds of mining and stuff. I mean, Outlaw and Joe did the video a while back of Apache Drill where they were doing this stuff, right?

Starting point is 01:39:12 Like it's almost like magic, but at any rate, yes. So that is another thing. If you're dealing with long-term storage and you want to be able to go mine data that you have from historical stuff, those are great formats for that kind of thing. I mean,

Starting point is 01:39:26 tools like Drill and Presto, they're just, it's amazing that they work and that they work as well as they do because you can have like two different technologies and you can query them with one thing and it'll just like, you know, combine those results, like join them like a database. How's that even work, man? Like you're gonna query a CSV file and a Parquet file and a PostgreSQL database to get this query result and somehow it's like, it's just

Starting point is 01:39:59 magic. It's totally amazing. And then if you mix it in with the thing that you said from Uber, the Hudi stuff, then it's like a whole nother layer of magic, right? Like, wait, I got data updating in these flat files, sort of real-time-ish. I don't know the world.

Starting point is 01:40:14 I don't know if we've linked to this before, but we're about to, because Uber has some amazing, you know, stuff in their blog. Oh, if you've never followed the Uber engineering blog, there's just some fantastic, awesome articles in there about some of the things that they've done, and like from machine learning to data storage, just architecting, like how they were going to do it in, in like long-term plans.

Starting point is 01:40:45 And like, if you follow some of these articles, you could, you could go back in time and then see like how they would say like, here's what we're doing today, but here's our long-term plan for how we're going to get to there. And the things that they were doing to contribute back to, um, you know, in, in, back to the community in the way of open source. You know what's so funny about that, Outlaw? So, um, I basically hate social media, right, because I feel like you can't really say much out there without getting in a fight. But the irony here is I saw somebody the other day, I don't know what the thing was about, why this guy was basically attacking Uber, but he basically said that they're not a tech company.

Starting point is 01:41:26 They're a transportation company. And I looked at that and I was like, I want to reply so bad because Uber is only as good as they are because of all the technology things that they have created and invested in. Right. Like, I mean, their entire business is built around being able to track somebody. Like, would you be happy if you ordered an Uber and it didn't show up on your phone showing that, hey, he's three minutes away, he's two minutes away, he's driving down the road right now? No, you wouldn't be happy. They are absolutely a technology company and their core business that is driven by that technology is transportation. Right. But but yeah, it's just one of those things that shocks me that if you don't know, if you've never looked at their engineering blog, if you don't realize how much data they have to crunch every day, it's easy to say, oh, no they just, it's just people driving around. That's all it is. I mean,

Starting point is 01:42:28 some of the articles that were so great because like you look at their architecture and the thought that they've put behind it because they were saying like, okay, some of our data needs to be, you know, going back to going back to the start of this episode, we were talking about like knowing your use case, right. And the language and, you know, or the storage technology that you need to use. And like some of their articles, they'll talk about like, okay, we need some data that's written that needs to be queryable for machine learning purposes.

Starting point is 01:42:57 Some of it that needs to be used for marketing purposes. Some of it that needs to be used for this other developer purpose. Here it is for applications, blah, blah, blah, blah, all the different use cases and everything. And like how they were,

Starting point is 01:43:08 you know, the different challenges that they have with that, but also trying to keep like a centralized, like, you know, a master record of the doc of the, of that data. Right.

Starting point is 01:43:21 Oh, it's impressive. Yeah, it really, there's this juicy. I forget at the time when they wrote some of these articles, they were processing over 20 petabytes a day, right? Well.

Starting point is 01:43:32 Petabytes. And I'm sure. 2018, it was 100 plus petabytes. So, yeah. I mean, they're just growing like crazy, man. So, yeah, it's cool stuff. So, at any rate, they're a great example of know the storage technologies you need to make some of the decisions you got to do, right? Because they don't use the same thing for everything, right? They're using Hadoop files for their data lakes.

Starting point is 01:44:00 They're using near-line Cassandra stuff for other things. They're using Postgres. They're using everything, right? Different use cases. So anyway. All right. I'm done with that. All right.

Starting point is 01:44:12 Well, then, uh, wait, I was going to say, I looked at Uber to see if they were one of the companies that was hiring remotely. Uh, neither them nor Lyft are, uh,

Starting point is 01:44:21 listing remote jobs. Really? Yeah, man. All right. Sorry. Go ahead. I'll,

Starting point is 01:44:26 I mean, you know, I was just going to ask one last question related to this, you know, water cooler episode, which was, you know, what did the cannibal get when he was late for dinner?

Starting point is 01:44:48 Burned. A cold shoulder. Thank you again, Klaus. That's really good, Klaus. You're good at this. You're good at this, Klaus. Yeah, he should be like a comedian. Oh, that's good. Um, okay. So, uh, yeah, we'll have a bunch of resources to things that we've mentioned in this episode

Starting point is 01:45:09 in the resources we like section. And with that, we head into Alan's favorite portion of the show because Joe doesn't have a favorite. We've tried to establish this before. I have a favorite. Alan has a favorite. Joe's just like, whatever. So Alan's favorite

Starting point is 01:45:26 portion of the show. It's the tip of the week. Yeah, and I got guilted into finding two. So I was like, you only have one? I was like, oh man. All right. So my, go ahead. You're going to say something. Okay. So my first one grew out of a need in my Python app of being able to pipe my Google credentials into my Kubernetes pods so that they could authenticate, right? Because the last thing you want to do is be passing around usernames and passwords and all kinds of stuff, you know, when you're trying to authenticate to Google Cloud Storage or anything, right? So if you use the gcloud tool, I think it's, if you download it and you run it, it's like gcloud auth login.

Starting point is 01:46:19 If you run that command, it'll prompt you for your username. Then after you, it'll open up a browser, you'll go in, you'll authenticate with your Google credentials, and then it'll basically give you back a token that you need to paste in. And assuming everything matches, it'll store a JSON file on your system that is your credentials, right? It's got some encrypted stuff in there for it to be able to talk to Google, which means every time you go to run a gsutil or a gcloud command from your local machine, you don't have to put in your username and password anymore. It always goes and uses that credentials file. Well, if you have pods that are going to be running some Python or anything else, right? Anything that needs to talk

Starting point is 01:47:05 to those Google services. It's kind of a pain if you don't have a way to get those credentials in there and you don't really want to be copying your file into a bunch of pods. Anyways, there is a plugin, if you're using minikube, called gcp-auth. I'll have a link here. And if you run that and you're running minikube, it will already mount that credential file into pods, so automatically your pod, if you have any programmatic access or APIs that are calling GCP services, it'll automatically use that credential. This is perfect for local development. So that actually saved me a ton of time. The next thing, well, hey, can I make a comment? You can. Um, because like you mentioned about like, uh, you know, for local development and, and passing that into your,

Starting point is 01:47:56 your pods and everything, but you could also be tempted to like use your, um, to build a, build a Docker container that you might just have it in there that you might pass in as an environment variable and then you'd set an ARG or something like that. But you have to be careful when you're doing that kind of thing,

Starting point is 01:48:17 especially if that's a Docker container that you plan to share. Because, for example, if you use the ARG instruction in your Dockerfile, you're actually going to persist those credentials in the resulting Docker image there. So you really want to, again, know your use case, right? What you're trying to do. So yeah, definitely. Anytime you're dealing with credentials, you want to find the best possible way to not be passing it around everywhere. Right. And so this makes it a whole lot easier.
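Editor's sketch: from inside a pod, a Python app can confirm that a credentials file was mounted without logging any secret. This assumes the file is exposed through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which is the variable Google's client libraries look for; whether a given minikube setup populates it that way is an assumption to verify. The function names are hypothetical.

```python
import json
import os
from pathlib import Path


def find_credentials_file(env=None):
    """Return the path named by GOOGLE_APPLICATION_CREDENTIALS, or None."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if path and Path(path).is_file():
        return Path(path)
    return None


def describe_credentials(env=None):
    """Summarize what was found without printing any secret material."""
    path = find_credentials_file(env)
    if path is None:
        return "no credentials file found"
    data = json.loads(path.read_text())
    # Report only the non-secret "type" field of the JSON file.
    return f"credentials file present (type={data.get('type', 'unknown')})"
```

Calling `describe_credentials()` at startup gives a quick answer to "did the mount work?" before the first call to a Google service fails with a less obvious error.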

Starting point is 01:48:57 And then this one, this is a tip from micro G it was in the, uh, the Slack channel tips and, uh, tips and something. If you're not part of the Slack community, you really should be because we've got some amazing people in there. Tips and tools. Tips and tools. If you go to codingblocks.net slash Slack, you can auto-sign yourself up and get into the channel. We have a few people in there. Microgi just happens to be one of the awesome ones,

Starting point is 01:49:29 and he's got a search engine for Calvin and Hobbes. So that's my other tip. Maybe we'll put a smile on your face. So if you click that and go to it, then you can find some good funnies. So, yeah. All right. And mine is going to be that report. I actually referenced this earlier.

Starting point is 01:49:47 Datadog, they publish reports every once in a while about various things. We talked about the serverless one, I think, a while back. They just updated one of their articles with data from 2020. And it includes some really good stats and some visualizations and, of course, links back to more detailed data. But as you know, the Kubernetes runs in half of container environments. That's awesome. So that took me a minute to kind of parse, but there are some services I kind of forgot

Starting point is 01:50:12 about now that just run in containers. You could throw a container at the Azure container service, for example. It's not Kubernetes, but it runs, you know, maybe there's Docker Swarm still around, Mesos or whatever else would be easy to run in your containers. But yeah, Kubernetes is in half,

Starting point is 01:50:28 and if you look at the graph, it's going up. That's cool. There's a couple other points I just wanted to share just because they're cool. This one is one of my favorites. A majority of Kubernetes workloads are underutilizing CPU and memory. So in Kubernetes, you can set a request and a limit. The limit will, uh, you know, prevent you from going over. Requests, though, kind of allocate stuff and say

Starting point is 01:50:52 like, hey, hold on to this for me because I'm going to need it. 49% of containers use less than 30% of the requested CPU. So that's money that you're paying and holding on, reserving and not using. 45% of containers use less than 30% of the requested memory. So we're over-provisioning it as kind of an industry here. So I thought it was really interesting. And if you go check out this report, there's other stuff there. They highlight like 11 big items. I just read two of those there that I just thought were really interesting.

Starting point is 01:51:24 So that's really good. And it's again food for thought if you're dealing with these environments. So similar to how we would over-provision on-prem hardware, we're over-provisioning our pods. Exactly. Right. And so here's the cool part about what you're saying there, though, is for anybody that doesn't understand how this works, when you are requesting your pod

Starting point is 01:51:46 you can tell it the resources you want, right? Like the RAM and the CPU. If you are underutilizing that by a lot, then that means that other pods can't be scheduled on that same node, and a node is a virtual machine you're paying for that has a certain amount of resources. So basically you're overpaying for the VMs that your cluster uses that you could be, you might even be able to cut down your, based off what they're saying right here, you could cut down the number of nodes you have potentially by a third, right?
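Editor's sketch: the cost argument above is simple arithmetic. The scheduler packs pods by what they request, not by what they use, so oversized requests translate directly into extra nodes. The numbers below are hypothetical and the model is deliberately simplified: it packs by CPU request only and ignores memory and any capacity the node reserves for itself.

```python
import math


def pods_per_node(node_cpu_m, request_cpu_m):
    """How many pods fit on one node when packing by CPU request alone."""
    return node_cpu_m // request_cpu_m


def nodes_needed(pod_count, node_cpu_m, request_cpu_m):
    """Nodes required to schedule pod_count identical pods."""
    per_node = pods_per_node(node_cpu_m, request_cpu_m)
    if per_node == 0:
        raise ValueError("a single pod's request exceeds the node's CPU")
    return math.ceil(pod_count / per_node)


def utilization(used_cpu_m, request_cpu_m):
    """Fraction of the requested CPU that is actually used."""
    return used_cpu_m / request_cpu_m
```

For example, 24 pods that each request 1000 millicores on 4000-millicore nodes need `nodes_needed(24, 4000, 1000)`, which is 6 nodes. If each pod really uses about 300 millicores (`utilization(300, 1000)` is 0.3) and the request is trimmed to 500, the same pods fit on `nodes_needed(24, 4000, 500)`, which is 3 nodes.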

Starting point is 01:52:21 49%. Yeah. It's a little fishy, but yeah, somewhere fit and call 50%. There you go. Cost savings. A couple thoughts there.

Starting point is 01:52:29 One is that a node, it could be virtual, but it could also likely just be a physical machine. Yeah. Depending on. I guess I'm talking about in the cloud world, right? Even in the cloud world, though, you could have like dedicated hardware. Oh, yeah, they might very well. I doubt it, yeah they might very well i doubt it but they might um and so so then the other thought though that came to mind

Starting point is 01:52:51 is that I wonder if, because there's a, there, correct me if I'm wrong, but I thought it was like considered a best practice to set the request and the limit to the same because of problems with, um, uh, what's it called? Oh, dynamic allocation where it's having to size up and down all the time. Cause the problem that you run into is like, let's say,

Starting point is 01:53:15 let's say that you say, Hey, I'm going to request, um, uh, 10, 10 gigs of Ram, but it could burst up to a hundred gigs. So the limit would be a hundred gigs.

Starting point is 01:53:26 And then you get enough pods scheduled on that to where like, what if they were to all, you know, try to reach their limits at the same time? Like you would get into these weird errors that you're like, suddenly a pod failed and you don't know why it failed. And the reason, but you later determined that the reason it failed is because it tried to allocate memory that it couldn't, even though it was scheduled on a node that it should have, but because they all tried to schedule, tried to, you know, they all bursted at the same time, say, then you get this problem. Whereas if you set your request and your limit to be the same, then you're guaranteed that your pod is going to have what it needs and anything that

Starting point is 01:54:06 is on that node is also going to be guaranteed to have what it needs, and so you don't run into that burst problem. So over-provision for what your needs are at the peak and then just deal with the fact that it's never running at peak? Well, I mean, yeah, still try to like provision it, I guess, right? But yeah, I did it. I mean, the thing is, the thing is, the takeaway from this is that Datadog is awesome, right? Because, I agree, because this goes to show like how many times do we harp on this while going through the DevOps Handbook, right? Observability, metrics, right? Like that kind of thing matters. Like you wouldn't know this had Datadog, like this is literally their bread and butter, right?
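Editor's sketch: the burst problem described above comes from pods being placed by their requests while being allowed to grow to their limits. The toy check below compares the two sums against a node's memory. The 10 GiB request and 100 GiB limit are the numbers from the conversation; the 64 GiB node, the dictionary keys, and the function names are hypothetical.

```python
def total(pods, key):
    """Sum one field (request or limit) across all pods."""
    return sum(pod[key] for pod in pods)


def burst_risk(pods, node_memory_gib):
    """Compare what pods asked for with what they are allowed to grow to."""
    requested = total(pods, "request_gib")
    if requested > node_memory_gib:
        raise ValueError("requests alone exceed the node; these pods would not all fit")
    worst_case = total(pods, "limit_gib")
    return {
        "requested_gib": requested,
        "worst_case_gib": worst_case,
        # True when all pods hitting their limits at once would exceed the node.
        "can_run_out_of_memory": worst_case > node_memory_gib,
    }
```

Six pods with a 10 GiB request and a 100 GiB limit fit a 64 GiB node on paper (60 GiB requested) but could try to use 600 GiB together. Setting the limit equal to the request makes the worst case equal to the requested total, which is the trade-off discussed above: no surprise failures, at the price of reserving for peak.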

Starting point is 01:54:55 And providing you with these kinds of stats, providing these types of visualizations and just having that kind of data is, you know, that, that is what they know how to do. And they're amazing at it. And, you know, because of those metrics and that capability, then you're able to see like, Oh, Hey, we're using, uh, you know, less than 30% of what we requested. Maybe we scale it back. We, you know, we scale back our pods to use like 50% of what we're currently using. And that way we still have a little overhead for when we do need to burst,

Starting point is 01:55:26 but not as much. Right. And maybe we just have more pods instead, you know, or, you know, we, we,

Starting point is 01:55:32 we, uh, have more horizontal scaling, uh, rather than trying to, you know, vertically scale them, which, did you know that there are like horizontal versus vertically scalable

Starting point is 01:55:44 pods? I don't think I've ever messed with that. I was reading about it. It's weird. Like it gets things go plaid. Um, any rate, I was,

Starting point is 01:55:58 we'll sidebar that. But, um, I did want to comment, too, going back to Minikube. Like, if you've never used Minikube, I'm going to give you one reason why you should right now. And you're going to be like, oh, that's such a great reason. I should definitely do it. They have the emojis right there.

Starting point is 01:56:17 Like, when you start it up, you see, like, all the great little emojis. Like, that's a little bit of happiness right there in your life. I will tell you. I will warn you. If you decide, hey, I want to check out this Minikube thing. And you're one of those people that has Docker Desktop running. And you decided to provision the heck out of Docker because you're going to run all the things in Docker. Just know Docker is running in a VM. If you said I have a laptop with 32 gigs of RAM, I'm going to give 24 gigs of that to Docker and I'm

Starting point is 01:56:45 going to give it half of my CPUs. That's taken. So when you go to start up Minikube, it also creates a VM. So you can't try and provision another 24 gigs of RAM and half your CPUs to that because they're taken. So typically when I'm running Minikube, I shut down Docker Desktop, but that's okay because Minikube allows you to act like a Docker daemon as well. And you can run all the same Docker commands and everything will work the same as Docker in Minikube. So just be aware of that. Asterisk. Asterisk. Okay. What about?

Starting point is 01:57:38 Because, correct me if I'm wrong, but on Windows, though, that's not necessarily the case if you're using WSL2 because it will use Hyper-V, right? That still creates a, I believe with Hyper-V, it's still creating you a virtual machine behind the scenes that has those resources. I thought the point was that it wasn't going to immediately consume and reserve them all. It would just use them as needed with Hyper-V. That was the big advantage of Hyper-V over the previous implementation. Yeah, I have no clue. I know it does create a VM behind the scenes because you can open up Hyper-V Manager and go look at it. And I think it, I mean, at least in the past it did, but I haven't done the WSL2 route. So maybe it's not.

Starting point is 01:58:13 Well, somebody can write in and correct me on that as well as my inline YAML comment that turned out to be totally doable. All right, let's do it. All right. So with that, you know, it's time for my tip of the week. I got to tell you, I was curious, so I ordered on eBay both a chicken and an egg.

Starting point is 01:58:37 We're going to see which one comes in first. Which one comes first? That's a gift from Klaus. He's so hilarious. Where does he come up with this stuff I don't know he shares it with us and we're fortunate enough to have him share that with us so

Starting point is 01:58:53 for my tip of the week I have a fairly newish to you article from last summer for tips and tricks for running Strimzi with kubectl. So if you're using Kafka in your Kubernetes environment and you're using the Strimzi operator for that, then, uh, they've got like a little, uh, short, it's very short,

Starting point is 01:59:20 uh, you know, set of, um, you know, tips and tricks for it. And the thing that caught my attention first, I was like, oh man, I never want to forget this thing, was that it had a table of all the shortcuts and the long names and what those resources belong to. So if you wanted to do like a, let's say you already have kubectl aliased as k, and you wanted to say k get Kafka Connect, you could just say k get kc instead. But then you're like, well, wait,

Starting point is 01:59:51 what if I want to get a connector instead of Kafka Connect? Then it's like k get kctr. Right. Um, so I, I loved those, but there were also like other little tricks that they had in there as well for dealing with it. Obviously, you know, again, for the Strimzi operator, but yes,

Starting point is 02:00:10 I'll include that link. That's the most excellent. And Strimzi, for those not in the know, is running Kafka as an operator in Kubernetes. So. And maybe one day I'll be able to talk these guys into doing a deep dive on Kubernetes, but so far I'm losing that battle. Or Kafka.
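Editor's sketch: the short names from the tip can be captured in a small lookup when kubectl commands are assembled from a script. Only the two short names mentioned in the episode (`kc` and `kctr`) are included; the long resource names they map to, the helper name, and the `kafka` namespace in the usage note are assumptions for illustration, and the article's own table is the authority on the full list.

```python
# Short names mentioned in the episode, mapped to assumed long resource names.
STRIMZI_SHORT_NAMES = {
    "kc": "kafkaconnect",
    "kctr": "kafkaconnector",
}


def kubectl_get(resource, namespace=None):
    """Build a `kubectl get` argument list, expanding known short names."""
    full_name = STRIMZI_SHORT_NAMES.get(resource, resource)
    command = ["kubectl", "get", full_name]
    if namespace:
        command += ["--namespace", namespace]
    return command
```

`kubectl_get("kc", namespace="kafka")` returns an argument list that can be handed to `subprocess.run`. On the command line the short name can be typed directly, which is the point of the tip.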

Starting point is 02:00:27 we do a whole episode on virtual environments and I think next episode let's spend a good two hours talking about event time processing time, clock time windowing

Starting point is 02:00:44 sessions and all the fun stuff that you get in streaming systems. Let's do it. That's actually not a terrible topic. Let me get a picture here. Right. That's what I was going to say. Right there is the problem. You have to virtually draw a picture in your mind.

Starting point is 02:01:00 Wait. I'm so good at describing these images. Are you kidding? Imagine a ski slope. This is called skew. All right. Well, so we hope you enjoyed

Starting point is 02:01:14 the episode. If you haven't already, if you like, maybe a friend gave you a link to an episode or you're listening into it on their device, you can subscribe to us

Starting point is 02:01:24 on whatever platform you choose to find your podcast. iTunes, Spotify, and Stitcher are the big ones. But, you know, if you find a platform that we are not on that is your favorite, let us know and we will correct that. And, hey, if you want to leave us a review, because, you know, as Joe mentioned, we really do appreciate those. They always put a smile on our face. You can find some helpful links at www.codingblocks.net slash review.

Starting point is 02:01:55 Hey, and what episode is this? This is 153. I thought we just did 155. No, according to math, we did 152 beforehand. But let me confer with the math of a chicken. I've done some things. Okay, so it might be. We might have

Starting point is 02:02:16 done 155 and then came back to do 153. I had our surprises in store. Alright, so we're on 153. So what I was going to say here is if you head to codingblocks.net slash episode 153, no spaces, no underscores, none of that garbage. Like Joe said, keep it simple. You can find the amazing show notes, all the links and stuff that you should not click at work. They'll be in here. so definitely go check that out and hey if you

Starting point is 02:02:47 have any questions or rants or any of that kind of stuff, uh, send them to Slack. Uh, go to codingblocks.net slash slack and, and join up there and at Joe with all of your complaints. Don't ask me, it's not even Joe. Well, I don't even know what his is anymore. His has changed so many times. That's right. There's a reason I camouflage. Hiding in plain sight. Make sure to follow us on Twitter at CodingBlocks or TikTok or head over to

Starting point is 02:03:15 CodingBlocks.net and find all the social links at the top of the page. We don't actually have a TikTok. We should grab it. I'm not putting TikTok on anything, man. We just need to squabble on the name. Spies. We're grab it. I'm not putting TikTok on anything, man. We just need to squabble on the name. Spies. Oh, we do need to.

Starting point is 02:03:28 We're at war. I'm going to let Joe do that. Joe, that's your to-do item. All right, TikTok. Here it comes. You can install it, get all the spyware that comes with it, and let us know. That's right. That's right.

It's been a minute since we last gathered around the water cooler, as Allen starts an impression contest, Joe wins said contest, and Michael earned a participation award....