Two's Complement - Boring is Awesome

Starting point is 00:00:00 I'm Matt Godbolt. And I'm Ben Rady. And this is Two's Complement, a programming podcast. Hey, Ben. Hey, Matt. I hear you've been thinking about databases. A little bit. I have some ideas about databases.

Starting point is 00:00:26 So when you mean database, you mean like SQL databases, like server databases, right? Yeah, Postgres. As opposed to the more colloquial... MySQL, you know, Oracle if anyone still uses that. I don't know. I'm pretty sure someone's still using Oracle. There's got to be at least one or two people. Oracle seems like they have money,

Starting point is 00:00:46 so there has to be someone somewhere that's using Oracle. What about SQLite? Does that fit into your general pantheon of databi? That is the plural of database, I'm sure. Yes, but in a more interesting way. Oh, okay. But yes, it does. So what's your beef with databases?

Starting point is 00:01:01 I don't... Because I'm assuming it's a beef, because you otherwise wouldn't have mentioned it. I have a minor beef with databases in that I think that databases should be used to store relational data. If you don't have relational data, you should strongly consider not using a database. That is my beef with databases. Because I see many, many situations where people do not have relational data, and they're like, we have data.

Starting point is 00:01:36 Where should I put this data? I'll put it in a database. And that creates a whole host of problems. So to be clear here, you're talking about also relational databases, not like the Nosgul style. Right, yes. Which, you know, I think that was a Lord of the Rings thing, wasn't it? The Nosgul? Yeah, uh-huh.

Starting point is 00:01:54 The Ringwraiths. Sorry, I completely caught you off guard there. That is why. NoSQL. Sorry for the listeners who aren't playing along with the stupid pronunciation thing. NoSQL databases, which are much more like just data dumps, with no relational component, or with only, you know, complicated layering to get document... Traditional RDBMS, yes. Things, yes. And your observation is that some data that is non-relational ends up in a database because it's convenient, probably? Yes. Or it's just there.

Starting point is 00:02:25 It's familiar, I think, is a lot of what it comes down to. Right? Right. People, they know how to configure them. They know how to use them. They know how to interact with them. They're comfortable with SQL. You know, they have operational teams that can support them.

Starting point is 00:02:41 There are lots of vendors that'll give you one. You know, you've been using Postgres for 10 years and it's, like, super comfy. Right. It's like a warm blanket, right? Right. What's wrong with that? Well, the problem is, I mean, it gets back to the old programmer adage of use the right tool for the right job, right? And just because a tool is familiar doesn't necessarily mean it's the right thing to do. Now, I'm going to have a slightly hard time blaming anyone for using, you know, we had this phrase at Prevco. We said boring is awesome.

Starting point is 00:03:11 And, you know, using boring technologies, old technologies that work really well and are well trodden and have all the bugs sort of wrung out of them already. I'm going to have a really hard time complaining that somebody is using one of those to solve a problem, because it's like, yeah, okay, you want to actually get this thing working and you don't want to be a technology fetishist and just use the newest, coolest NoSQL thing or whatever. Sure, fine, respect. But

Starting point is 00:03:37 there are lots of times when you could get even more boring than a relational database and put it in, oh, I don't know, a file? I was going to say, what could be more boring than a database? Yes. Are you telling me a file? Yes.

Starting point is 00:03:51 So what is this boring axis that you're referring to? What is boring a proxy for in this particular instance? What are you getting from this thing? Yeah, what are we getting from boring? Well, the first thing we're getting from boring is that there's a large community. The tool that you're using has been used in a lot of different contexts. It has very nice documentation, has a large community around it that can help you use it.

Starting point is 00:04:15 A lot of the bugs have been worked out of it. It's been used in production environments in a lot of different ways. And it's just a generally reliable tool. So that's like Postgres definitely ticks the box of boring in that respect because it just works. Everyone understands that you can advertise for a job and say, hey, if you've got Postgres experience, someone can come in and you know roughly where you stand with that. Yep.

Starting point is 00:04:40 But you're saying that even more, like before there was Postgres, before there was any database, there was a file. Yeah. File systems. File systems are real boring. They're even more boring than databases. You put file systems on your resume and people are going to go like, what? Right?

Starting point is 00:04:58 Like, it's super boring. But it turns out, if you don't have relational data, files are a great way to store data. You can write them to the file system and read them. Right. So give me an example of the kind of data you're thinking about when you're very specifically saying, like, non-relational data. Give me an example, because I want to sort of get... Yeah. Well, anything binary, first of all, right?

Starting point is 00:05:27 Whenever I see people putting, like, JPEGs as binary blobs into a database, I'm thinking, like, okay, I get that you have a data storage device of some kind and you want to put the stuff in there. But, you know, is that really the best place to put it? Now again, if you have other relational data around it, you know, you have relational data that needs to refer... Maybe, you know, maybe it's your avatar in your forum system and

Starting point is 00:05:58 everything else is relational, you just have the PNG stored in there. But, right, it's not like you can do select png_decode of blah in your... and, like, actually do something with it, the data. It's just a... But I mean, that's why databases have blobs, after all. That is what these are meant for, right? But you're saying that, like, if that's all you're doing... Yeah, maybe you need to think carefully. If primarily what you're doing is storing images, do you really need a relational database to store your images? I think probably you don't. And there are also other reasons why... You know, it's obviously important to think about the context of things that can be used. Scalability is a big concern in a lot of different contexts.

Starting point is 00:06:40 You can get scalability through relational databases. But if you're using the relational database simply as a mechanism for scalability, what about an object store? Right. You don't necessarily need to use files, or you can't use files, because it's like, okay, well, this has to run on multiple computers, they can't all share the same file system without some NFS that we don't want to do. Yes. S3 types, yeah. Yes. What about a file store? Again, you don't have relations in your data. Don't use a relational database. Use something simpler. So, you know, there's stuff like that. So, like, for example, if you're building internal tools, right? Some internal website, something like that. The number of people that will ever use this tool in its entire lifetime is like two dozen, right? Right.

Starting point is 00:07:30 Do you really need to run that on anything more than, like, a single server or a virtual machine or something where, you know, you can just write files and read files and back those files up and it'll be fine? I don't think that you do. And if you can do that, then you get a whole bunch of other stuff for free. We had talked on an earlier episode about the importance of manual testing, being able to run things locally and do what the users do exactly on your local development workstation and be able to reproduce the steps that they take and troubleshoot things in the same way that they will. And I think that's extremely important. So important, in fact, that I would be reluctant to add any technology to a project that hindered me in that effort.

Starting point is 00:08:14 One of those things can be a relational database. If you have to have a database loaded with all the data, with all the right schemas loaded up on your machine, then you can't have a lot of automation around creating that. It's harder. It's harder to get. It's hard. You're right. It's a barrier to doing it.

Starting point is 00:08:32 It's not impossible, and, like, virtualization technology and other things have come along to make it a bit easier, but it's not straightforward compared to here's a file. But the question is, will you? Right. Can you and will you are two... Yes, I can make a Docker container for my database, and I can write all the scripts and tools to load all the things into it, and I can integrate that in with all my projects, so when I fire up the server, whatever it is, it also fires up the database, and it tears it down properly so I'm

Starting point is 00:09:03 not leaking Docker containers, and my laptop crashed. Well, you had 35 instances of Postgres running. That's why it crashed. All those things. You can solve all of those problems and take the time to solve all those problems. A lot of people don't. They just say, well, I'll just run the database myself. Or everyone points to the dev database, which you just assume runs somewhere. Because they don't want to take... Oh, is someone else using dev right now? Because I'm

Starting point is 00:09:28 running my tests against it. Oh, sorry. Yeah, you know. Yeah, we've all been there. Yeah. So I think that's what people generally do, because they don't want to take the time to automate it. But I think that there's an even better option there, which is: why do we need a database for this thing? Right. So you mentioned files. That makes sense to me. I remember that at Prev-Prevco, the sort of internal link shortener that was hacked together used exclusively a file, an append-only text file, to store, you know, space separated, here's the short URL, here's the long URL. And it just loaded it into memory every time it started back up again. And there you are. That's a simple enough thing if it fits into memory.

Starting point is 00:10:10 And that's a perfect, I think, example of what you're talking about. There's no relational aspect to this whatsoever. It can all live in memory. And so it was implemented as a JavaScript, like, map, literally from the short URL to the long URL. And it was, yeah, as I say, a text file. But what if your data is bigger than that or needs to be indexed more than that?
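A link shortener along those lines can be sketched in a few lines. This is only an illustration in Python, not the original tool (which the speakers describe as JavaScript); the LinkStore class, its method names, and the one-pair-per-line file layout are assumptions made for the example.

```python
from pathlib import Path


class LinkStore:
    """Append-only, space-separated text file mirrored in an in-memory dict."""

    def __init__(self, path):
        self.path = Path(path)
        self.links = {}  # short URL -> long URL
        if self.path.exists():
            # Replay the whole log on startup; a later line for the
            # same short URL overwrites an earlier one.
            with self.path.open(encoding="utf-8") as log:
                for line in log:
                    short, _, long_url = line.rstrip("\n").partition(" ")
                    if short and long_url:
                        self.links[short] = long_url

    def add(self, short, long_url):
        if " " in short or "\n" in short or "\n" in long_url:
            raise ValueError("short names and URLs must fit on one line")
        # Append to the file first so it stays the source of truth.
        with self.path.open("a", encoding="utf-8") as log:
            log.write(f"{short} {long_url}\n")
        self.links[short] = long_url

    def resolve(self, short):
        # Returns None when the short URL is unknown.
        return self.links.get(short)
```

Startup cost grows with the size of the file, since every line is replayed, which is the trade-off discussed later in the conversation.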

Starting point is 00:10:30 You get an awful lot for free with something like a SQL database. Hey, I need to look up by this field or this other field. And now I can write the code to do that, but it's kind of easier if I just let the database manage that bit and I can create two indices now, right? I mean, if you're looking things up by various fields, that sort of smells a little bit like relational data to me, in which case I would say.

Starting point is 00:10:51 Not necessarily. Like, for what it, you know, take my URL shortener, for example. What if I wanted to say, okay, well, what are the short URLs that lead to this big URL? Like essentially an inverted index, right? Right. Obviously, I can write the code just to have two maps, one that goes from short to long and one that goes from long to short. But I'm basically making a database. And if anyone needs anything more, like, hey,

Starting point is 00:11:15 what about it has a user field in it as well, right? And that could be who created it. And I said, well, find all the ones created by me. Now you're definitely straying into relational areas here. Yeah, but I mean, the other aspect here is that if you're making a URL shortener for an internal application, then the lifetime of this application

Starting point is 00:11:32 is going to have megabytes of data at the most, which means you can do just about anything that you want, right? I'm sort of just making the point that if you start down a road where you end up writing all of these things

Starting point is 00:11:46 and then one day you're like, oh no, it doesn't fit in memory. Well, this is annoying. How are we going to make it scale? How are we going to... If one day it's never going to fit in memory, okay, sure, of course. But there are certain categories of things where you can be like, the number of short URLs at this company... Well, you say that.

Starting point is 00:12:02 We actually did hit this problem, which is why I kind of bring this up. There was an API, a server could generate the short URLs, which meant that you could very quickly churn through and create thousands of them, which was fine until it then took, you know, days for the machine to restart because it had to read through terabytes of this text file. But it was, you know, again, it wasn't a big problem. And yeah, okay, adding layers in your software could mean you could switch it out later on and put something else there. I'm just saying it's...

Starting point is 00:12:28 I mean, it's a hell of a lot easier to go from a set of flat files to a database than it is to go the other way. Correct. Correct. But the API that a database gives you is one that is sort of sort and search and find and reorder and limit and all these things, which you might not need in a relational data store, but you still want to be able to do those things, aggregate stuff. And you end up writing that yourself.

Starting point is 00:12:52 The file doesn't give you that. So the database is both the storage mechanism and the querying mechanism for that data. Whereas if it's a file, it's just a storage mechanism. And then it's up to you to kind of layer everything else, which is probably a feature, but I just want to sort of talk about... Yeah. I mean, certainly if you find yourself building your own indexing system into flat files, it's probably time to move on to something else. Right. But maybe not a relational database. Perhaps maybe not a

Starting point is 00:13:18 relational database. And one of the places where I definitely see people abusing relational databases is with messaging-based systems. Right. Oh my. With a message or an event-based system, people using the database as a terrible message queue, where they're writing things in and reading things back out and trying to time those things. Select star from this where ID is greater than or equal to the last ID I got from you. Exactly. Retrying. Oh no. Uh-huh. Did I get any new rows in this table? Did I get any rows in this table? And that is a huge dysfunction, right? That sounds... Yeah, that's definitely somebody running around with a hammer thinking everything else is a nail at that point. Like, well, I got my database, what else can I do with it? Right.

Starting point is 00:14:01 And the thing is that quite often systems devolve into, or evolve into, sort of more event-based systems, because people want things faster. They want them in real time. They want to update automatically, right? Like, I want my web page to update automatically. I want my report to update automatically. And so, like, they sort of evolve into these systems over time, and people don't stop and sort of reevaluate and be like, all right, well, while we have, like, you know, gigabytes of data and not terabytes or petabytes of data, maybe now should be the time where we take the leap to go from something that is relational to something that is more event-based.
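As a rough illustration of the event-based direction, an in-process message bus can be very small. This is a minimal sketch, not any particular library's API: InMemoryBus, the "prices" topic, and start_price_logger are hypothetical names. The assumption is that a production implementation talking to a real broker would sit behind the same subscribe and publish methods, so the rest of the code cannot tell which one it has been given.

```python
from collections import defaultdict


class InMemoryBus:
    """Single-process stand-in for a message bus.

    publish() calls every subscriber of the topic directly, so
    producers and consumers can be wired together in one process
    for local runs and tests.
    """

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver synchronously to each consumer of this topic.
        # Topics with no subscribers are silently ignored.
        for callback in list(self._subscribers.get(topic, ())):
            callback(message)


def start_price_logger(bus, seen):
    """Example consumer: needs only an object with subscribe()."""
    bus.subscribe("prices", seen.append)
```

A fake like this will not reproduce broker-specific behavior such as ordering across nodes, redelivery, or back-pressure, so it covers the application logic rather than the messaging system itself.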

Starting point is 00:14:39 Right. And there are lots of tools for that. One of the things that I will say is that I personally think it is generally easier to create the sort of in-memory slash stub slash fake implementations of most message systems than it is to reproduce all of SQL. Right. You alluded earlier to SQLite, which is possibly the one exception to this, right? And it's a great tool and you should use it if you find yourselves in these situations. But generally, if you've got N consumers and M producers, and you just want to tie them together, like, you can do that in memory for a single node to test locally pretty easily, right? So you can have your real producer-consumer that talks to, you know, Kafka or RabbitMQ or ZeroMQ or, you know, your message bus of choice, whatever. And you can have an alternate implementation of that that is not talking to any of those services, just runs entirely in memory, and as soon

Starting point is 00:15:38 as it receives a message it sends it to all the consumers, because they're all in the same process. And that makes it really easy to run things locally and test locally in a pretty realistic way. Like, obviously, you're not going to be able to, like, tease out all the weird things about your messaging system by doing that. But, you know, you can do most things. And so that sort of particular dysfunction, I think, is a bad one when people don't sort of take a beat to just say, like,

Starting point is 00:16:01 maybe we should switch. The thing you just said there about SQL is an interesting one, though, because whenever I have used SQL, and usually it's with something like SQLite, so I'm actually using a file somewhere, for some of the reasons that you said, you know, I don't need a server or stuff, I end up having to wrap all of my object interactions with the database in a very high level abstract API so that I can test them, because there's no way in heck I'm going to test the SQL query itself. Or maybe I am, but there's only so much you can do and be sure that you've done the right thing there. So, right. And I guess the traditional solution to this is to have

Starting point is 00:16:42 an orm which then maps your objects into a database and you kind of assume the ORM just works. And then you test the objects or the ORM mapped objects and the interactions with those and just assume. But yeah, having to sort of stub out something that looks like SQL doesn't sound very testable. Yeah. Well, it's just you wind up with this sort of mock magic approach where it's like,

Starting point is 00:17:07 okay, and maybe you do the thing where you, you know, you do make it till you fake it, right? So you like run it against the real database

Starting point is 00:17:15 and you make sure that it really works and then you stub out the parts that were interacting with the real database using the data that you got back. The results that you got back. Yeah, right. Basically, it's like, you know,

Starting point is 00:17:25 write your test and don't mock anything out. Connect to the real database and then copy-paste and then edit. You know, shrink it down. You know, all that good stuff. You can do those techniques and that's fine. It's like a kind of a brittle test. Brittle is not the right word.

Starting point is 00:17:39 I mean, it's a reliable test in that it will give you the same behavior every single time. You're not going to get any weird effects where it's like, I ran it and it failed and I ran it again and it passed. It's like, no, you're getting the opposite of that with that approach, which is good. But if you ever change your mind about what you want that SQL to be, you have to go through the whole process again and basically take the mocking out and then redo it and put it back in. It's a bit more like the thing we discussed with Clare, with the sort of, like, golden test, the acceptance testing, except there isn't an obvious place to put an automated system such as the test that she was talking about. Yeah. Because you have to talk to the database

Starting point is 00:18:14 and then you do this manual process of, like, getting the wheat from the chaff. And yeah. Yes, we talked about SQLite, although I sort of glossed over that a little bit. But SQLite is an intermediate kind of form, because it has, yeah, some benefits. It certainly doesn't have the drawbacks of needing a central database server with all the Docker-y thing or the dev instance or whatever. It is just a local file on disk. So what are your feelings about that? I think that's a really good intermediate thing. There was a project that I worked on, oh God, when was that? I want to

Starting point is 00:18:53 say it was like 10 years ago, but I don't even remember. But basically we had made the conscious choice to stick to, like, very generic, you know, ANSI SQL, basically to say, we are going to be able to work with any database, not just Postgres, not just something else, basically only for testing, right? So that we could run against SQLite, and you could bring the whole system up with SQLite and be very confident that when you moved over to Postgres or MySQL or whatever we were using for production... We were using Postgres in production.
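The idea of keeping the SQL generic so the system can come up against SQLite locally might look something like the following. This is a sketch under stated assumptions, not the project being described: the users table, add_user, user_names, and open_test_db are invented for illustration, and only Python's standard sqlite3 module is used.

```python
import sqlite3

# Plain, portable DDL: no vendor-specific types or extensions.
SCHEMA = (
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, "
    "name VARCHAR(100) NOT NULL)"
)


def add_user(conn, user_id, name):
    cur = conn.cursor()
    # "?" is sqlite3's placeholder style. Other DB-API drivers may
    # use a different one (for example "%s"), so a real project
    # would need to account for that difference.
    cur.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
    conn.commit()


def user_names(conn):
    cur = conn.cursor()
    cur.execute("SELECT name FROM users ORDER BY name")
    return [row[0] for row in cur.fetchall()]


def open_test_db():
    """Bring up a throwaway database with no server to install."""
    conn = sqlite3.connect(":memory:")
    conn.cursor().execute(SCHEMA)
    return conn
```

The functions take a connection rather than creating one, so the same code paths could in principle be pointed at a production database connection, subject to the placeholder caveat noted in the comments.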

Starting point is 00:19:24 It would just work. It would just work. It would just work, right? And obviously, you know, there are cases where you can find different database vendors interpret things in different ways, and you run into problems. But for the most part, that was a pretty good solution. And honestly, I feel like this was long enough ago, I don't know if that's a realistic solution anymore. I feel like there might be people that are like,

Starting point is 00:19:45 yeah, if you're going to use Postgres, there's no way you can write standard SQL and have it actually get the value of Postgres that you want. Like, okay, maybe that's true. That's what I was going to say. The value there, you know, as soon as you start down the road of like stored procedures to update things,

Starting point is 00:19:59 which of course you typically would only do if you're starting to take the benefit of, like, maybe some of the more relational things in the database, because you have to atomically update three tables or something like that. In which case, maybe we've moved out of the part that you were talking about, where you're, like, misusing relational databases to store non-relational information. That's like, no, that's a valid use of a database. If it's database-like RDBMS stuff, fine, go, knock yourself out. But is it just a file store of JPEGs or, you know, URL shortener?

Starting point is 00:20:27 Even a URL shortener thing is more on the fence. But yeah. Is this top of mind because of things that you're thinking about at the moment, or is this just something that came to you? I mean, it sort of touches on something that I think we were going to maybe talk about in a different podcast, but maybe this will be the blend of these two things. Which is, like, the project Makefile. You know what I mean? Where I personally

Starting point is 00:20:55 think that... Oh, who said this? This is probably not... I'm just going to attribute everything on this podcast that I can't remember who said it to Michael Feathers. And then I'll be, like, right 16% of the time. I've gotten a lot of wisdom in my career from Mr.

Starting point is 00:21:11 Feathers. He's a wonderful person. But he said, you know, code is a way you treat your coworkers. Yes. I think that was him. It probably is him.

Starting point is 00:21:21 And one of those aspects, I think is if you want to bring people onto a project, right? You want people to help you, fundamentally. You have to help them help you, right? You have to do things for them to make it easy for them to contribute. You can't just push it all on them and be like, well, if you're a real programmer, you would just read through all these things and figure out how it works. Or, you know, read my partially up-to-date documentation that I wrote three years ago or whatever it is. Right.

Starting point is 00:21:50 You have to create an environment that is welcoming and friendly and easy to use. Otherwise, they're either not going to work on it or they're going to be forced to work on it and they're going to hate you, right? Or they're going to hate the code. They probably won't hate you. Right. Not you, although they will grumble. They might grumble about you a little bit. But mostly, they'll just hate the code, right? They'll hate the thing that they're doing, which is not good. It's just like not filling up the coffee machine or leaving your smelly lunch in the fridge. This is a bad thing that you can do to

Starting point is 00:22:18 your coworkers, and you should not do this. And so, one of the aspects of this, I think, is you should be able to check out a repository and run a simple command and do all the things that we have talked about on these podcasts over many times. You should be able to run it locally and manually test it. You should be able to run the tests and verify that they pass. You should be able to deploy it.

Starting point is 00:22:41 You should be able to build an artifact that is deployable. You should be able to do all of these things. And there's not that many. It's like maybe half a dozen, right? It's like run the system, run the tests, build a deployable artifact, deploy the artifact, right? If you can do those things, then you can do most things that software engineers need to do. And you should automate all of those things. How do you automate all those things? That's another question. The way that I've been doing it in recent years is by using Make. Because Make is a tool that is good at resolving dependent tasks, sometimes in parallel. And it's ubiquitous, like basically any Linux environment that you're going to be in is going to have Make. And yeah, Makefiles aren't the easiest thing in the world to write, but they're actually not

Starting point is 00:23:25 crazy hard to read. Like, well, if you've already got one and you sort of understand how targets work, they're not crazy hard to read. And if you're working in a compiled language, you might want to use Make or CMake, but you might want to use Make to do some stuff anyway, so you probably have them all there anyway. So it's not... Now, can you do this with shell scripts? Absolutely, you can. I have. It works great. Can you do it with other tools?

Starting point is 00:23:49 Sure. Again, applying boring is awesome, I would go for a more boring tool here, because there are definitely some boring solutions. But that is a thing that I think is important. So to answer your question of why is this top of mind for me, it's because I've had a few projects recently that have had data that was marginally relational, and certainly not very big, that depended on a relational database. That was like... Got it. I figured there was a wound here. Oh, and the instructions in the README are like, install Postgres, load these schema, you know, create these tables by loading the schema in, and then configure the Postgres URL to this, and then you can start the system.

Starting point is 00:24:32 And you're like, no, make. I want to do make test. And if it needs Postgres, then fine. Maybe even... It can bring up, you know, Docker, whatever, something, or Podman. But there should be no manual steps in this. That's the critical thing. Exactly. Anytime that the... Yeah, I think, you know, well, you and I agree on this very, very strongly, right? Every project that I've worked on... And I've had so much positive feedback from people that are saying, like, I love it when it's your project

Starting point is 00:25:00 because I just do git clone and then I type make, and the compiler itself even gets installed on my computer and just works. I'm like, yes, that's how it should be. If I need a magical version of GCC because I need this particular flag, then I will arrange for that to be on your computer as a result of typing make, as opposed to here's a list of sudo apt-get install crap that you have to do first. Like, that should not ever be allowed. And yeah, I mean, there's a variety of open source projects that I've worked on that all have a similar thing, and I think it's a big bit. And in fact, actually, someone raised a bug recently because it's one of the things that stopped working. But mostly I can point people at, say, Compiler Explorer and say, yeah, you know how you get it running locally? Make.

Starting point is 00:25:44 And it'll churn away for a bit, and then you go to port 10240, and then you've got your own local install of it. And people are like, oh, I was expecting there to be more. No, it's just that, because that's all you need. Again, it's broken right now, apparently, but yeah, working on it. I think it's a valuable and important thing. And as an API, you can go far worse than Make, as you say. I mean, NPM sort of does it for the JavaScript community at some point, and there's Mavens and things and whatever. But Make can run those.

Starting point is 00:26:18 I usually have a Makefile at the top of my project that maybe even runs CMake that then runs Ninja for all. But you don't have to know that. If you're just saying, no, make the project, I don't care. It's like, well, there's layers and layers of things going on. You don't need to know about it. Hey, Conan's being installed in a Python virtualenv on your machine, and we're installing all the dependencies through Conan, right?

Starting point is 00:26:33 But again, you don't need to know that. It just works for you. And it's all done through the magic of make. Yep. And that serves two really important purposes, actually. One is that it is this sort of like, you know, code is a way you treat your coworkers thing. But the other thing is, is that it is an absolutely correct form of documentation, right? Like, how do you configure and build and deploy the system? Well, it's all here. I'm 100% sure it's correct because we use it every day, all day. It's how my CI runs. It's how my deploys run. Exactly.

Starting point is 00:27:09 It's how I run locally. Right. So not only can you read that to figure out how it all works, but you can confidently change it and know, like, oh, if I make this change here, everyone's going to do it like this with this version of the code. There's no, like, separate, oh well, there's the build, but then there's the code, and you have to keep them in sync, and if you roll back one you've got to roll back the other. It's all together. It's all in one place, and it all works. And so there's huge value, sort of documentation and coordination value, in automating those things. And to me, this is sort of one of those things

Starting point is 00:27:45 you just have to choose to do, right? Like we've kind of talked before about like, you know, we're wizards, we can do anything. What you choose to do in the sea of all possible things is going to determine a lot about what your working environment is like and what you're able to do and what you're not able to do. I don't think anyone, I don't think any of our listener listening to this podcast right now. I'm reliably informed that we have at least two, actually. Now, I was talking to somebody other than our respective spouses. Right.

Starting point is 00:28:18 So, both of our listeners would agree, I think, that any of these things that we've talked about on this podcast are possible to do, right? It's just a question of should you do them? And I think that you kind of just have to start with the decision of like, yes, I'm going to automate this stuff entirely so that you can just type make. And yes, that might lead me down some strange paths where I'm building tools to make sure that it is possible to do this. But if you make the decision to do it, then everything sort of will follow along from that if you're committed to it. Just like we said earlier as well, if you can start from the beginning in that way, it's harder than – sorry, it's easier than retrofitting it later. So if you're like – well, it's just always been the case.

Starting point is 00:29:04 You type make and it gets everything. We started with just hello world and, you know, we've got the compiler and we got the thing building. And then, oh, we added a dependency on a third-party library. Okay, we're going to make sure that that comes down as part of the Makefile. Yeah. And you sort of incrementally put it on rather than trying to sort of retrofit it. Yeah, it's easier to do those kinds of things. But again, I think you're right. It's an effort of will on your own part that you have to make that decision.

Starting point is 00:29:31 This is going to be worth it. I'm going to take a hit early on. And I mean, once you've done it a few times, it's not even a hit. It's just a way of life, right? It's sort of the same, the Tao of a new project. As you go, you know, new directory. The very first thing I do is

Starting point is 00:29:46 vi Makefile, and I'll paste in... It's worth saying there's a really nice little pattern that we've picked up along the way, both of us. Well, I've picked it up from you, but I think you picked it up from Jake McCreary, who picked it up from someone else, of having a help target that sort of greps itself out of the Makefile and, with a bit of awk and sed and magical things, kind of makes an auto help page for your Makefile. And so maybe your default target is that as well, so if you type make, it just says, hey, these are the things you can do. And you're like, that's great. Yeah. But yeah, so I'll paste that snippet into my Makefile, and then I'll just have a make echo target that just says hello

Starting point is 00:30:25 world, and then, you know, start from there. Yeah, that help file thing, I think, is nice. Like, you know, it just sort of gives you that half dozen, here are the things you can do as a developer, and it sort of gets people started. The other thing about this is, not only is this something that, you know, if you started early, you know, so you get that momentum going, it's easier. If you started early, there are actually lots of situations where you can, you know, tap into the power of laziness in order to get people to do the right thing. And a great example of this, I think, is continuous deployment. So, if on day one you've, you know, followed my advice and said, the first thing you do is deploy. So deploy your hello world that does basically nothing and have it automatically deploy whenever you push to the

Starting point is 00:31:12 main branch, then it will be difficult to not have production in sync with the main branch because it's going to do that automatically whenever you deploy. And people will just orient their behaviors around that from the start. They'll be like, well, if we push to the main branch, it's going to deploy. So how do we make sure that that doesn't break? Well, I know. I'll write a test.
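
[Editor's sketch] The self-documenting help target described a little earlier is normally a few lines of grep, awk and sed inside the Makefile itself. The same idea is shown here in Python so the logic is easy to follow. This is only a minimal sketch: the "## description" comment convention, the function names and the example targets are assumptions for illustration, not something the speakers spell out.

```python
import re

# Assumed convention (not confirmed in the episode): a target documents
# itself with a trailing "## description" comment on its rule line.
HELP_LINE = re.compile(r"^([A-Za-z0-9_.-]+)\s*:[^#=]*##\s*(.+)$")


def extract_help(makefile_text: str) -> list[tuple[str, str]]:
    """Return (target, description) pairs for every documented target."""
    entries = []
    for line in makefile_text.splitlines():
        match = HELP_LINE.match(line)
        if match:
            entries.append((match.group(1), match.group(2).strip()))
    return entries


def format_help(entries: list[tuple[str, str]]) -> str:
    """Render the pairs as an aligned, two-column help page."""
    if not entries:
        return "No documented targets."
    width = max(len(target) for target, _ in entries)
    return "\n".join(f"{target.ljust(width)}  {text}" for target, text in entries)


# Hypothetical Makefile contents, used only to exercise the sketch.
EXAMPLE_MAKEFILE = """\
.DEFAULT_GOAL := help

help: ## Show this help
\t@./print-help

test: deps ## Run the unit tests
\t./run-tests

deps:
\t./install-deps

deploy: test ## Deploy the current commit
\t./deploy
"""

if __name__ == "__main__":
    print(format_help(extract_help(EXAMPLE_MAKEFILE)))
```

Undocumented targets such as deps are skipped on purpose, so the help page only lists the handful of things a developer is expected to run.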

Starting point is 00:31:33 Or I know, I'll do this other thing or whatever. You've got smart people. They'll figure it out. But if you start with that philosophy, it actually becomes the easy thing to do to do it right, as opposed to this extra step that you have to take. But you have to start there, or you have to very quickly get there. Because if you go in later and it's like, well, we're going to deploy to production every time you push to the main branch, you'll get 100 very valid reasons why that's a bad idea, right? And why you should not do that. And that's actually interesting. You reminded me of a couple of issues I've seen in the last

Starting point is 00:32:10 couple of weeks, which have both come down to projects not auto-pushing their latest version, and then later on somebody accidentally, or, you know, as a side effect, pushing a newer version of the project and breaking other things, because it was like a relatively significant number of changes that got rolled out to a system. And you're like, no, if it's pushed every time you push, then we'd find out a lot earlier, and it would be causally linked with the thing that you had just done, as opposed to, but I just did this thing. How on earth can that affect this other thing? Oh, I picked up two weeks' worth of changes in one go. Ah. It's shocking how much, and I mean, if you talk to anybody that was into like lean systems and the lean stuff like, you know, 10 years ago or whatever, they'll tell you this obviously, but it's like, it's

Starting point is 00:32:57 shocking how much queuing theory there is in software development management and process and stuff, right? Like if you understand queuing theory really well, you can start to see those things in how developers push out changes, right? And, you know, the whole Toyota Production System and all that sort of fed into all this stuff. This is what the cool kids were talking about like 10 years ago. Really? I'm not one of those. The post-Agile people, some of the Agile refugees that were like, you know, why are we all talking about stand-ups and cards and things? I just want to build stuff. But yeah, queuing up changes, like a perfect example of this is exactly what you're talking about: the longer you queue up changes, the more cost there is to actually deploying those changes. And that happens in multiple dimensions.

Starting point is 00:33:55 One is that you've lost context, right? The people who made the changes just have slept since then. And they just don't have the sort of top of mind knowledge that they would have had if it was like, all right, I just built this thing and now I'm going to deploy this thing. Hey, it broke. That's probably the thing that I just changed.

Starting point is 00:34:15 I know exactly what's going on. And the cache is all still warm, right? Like it's all up there. The other thing is that you can unfortunately sometimes defer those bugs for your coworkers,

Starting point is 00:34:27 which not only have they slept since then, they're not you, which means they don't know anything about this change that's going on. Which is what happened to me, yeah. Exactly, exactly. So that can increase the cost. And the other thing is that you get errors on top of errors, right? So somebody checks in a change that breaks something. Somebody goes and makes another change

Starting point is 00:34:45 and they look at your code and they go, okay, well, apparently that's how it works now. I'll do that. And they're doing something wrong. And then they make two of those things that are wrong. And that just sort of compounds on top of each other until the thing finally hits the real world. And then that whole chain of things breaks

Starting point is 00:35:01 because we were building wrong on top of wrong this whole time and you never knew it. So those, I mean, those are just some sort of basic ways, but it's this general problem of, if you're queuing up changes to your system, you're taking on a lot of risk, and you've got to be really careful that that risk is actually worth it. Sometimes it is. Sometimes you can't just do things where it's like, yeah, literally every change just goes right to prod. You know, there are situations where that can't happen. Lots of situations where that can't happen. But understanding that your goal should always be to shrink it. And to also just recognize if you can't do this, well, here are some of the problems that you're going to encounter. You're

Starting point is 00:35:39 going to encounter the problem like you saw today. Okay, how do we deal with that problem when it happens? One of the things that I have advocated for a long time is that git revert is not a personal insult. Reverting commits is something that you should take advantage of, right? Like it's not, you're not, you will have a much more complicated operational process if you have this mentality of everything that everyone has ever committed to this repository must be either fixed or remain pristine or never get rolled back, like your life will be made so much easier if you just sort of have a meeting where we all come together and be like, all right, everyone in this room are all going to agree. If I revert your commit, it's because I love you. And I want you to be able to go on vacation and not have to worry that the code that you've committed to the repository is perfect and unassailable in all ways. You can leave the building and go home to your family and loved

Starting point is 00:36:34 ones. And if I see that you've made a mistake, as we all do, I'm just going to revert it. And I'm going to tell you that when you come in the next day, be like, yep, Ben reverted my commit. Thank you for reverting my commit. That means I can fix this now at my own leisure and not have to be woken up at 2 o'clock in the morning by pages or interrupted by dinner saying, hey, Ben, you committed a bunch of bad code and then you left the building and now we need you to fix it right now. It's like, why can't you just – I love this. Why can't you just revert it? I think this is a brilliant analogy, yeah. Because there is – you're right.

Starting point is 00:37:01 I mean, isn't it a funny social issue that, yeah, I do feel guilty reverting someone else's change. It's like, you know, somehow a bad reflection on them, when it's like, no, it is a pragmatic thing that I'm doing to buy us back the stability that we had before. And unless their

Starting point is 00:37:19 change was required for operational reasons, then often, as you say, it's like, well, okay, you can come back in tomorrow and you can revert the revert, and then you can fix whatever issue it was, and then, yeah, no harm done. And I'd like to think that if someone reverted one of my changes, I wouldn't feel put upon. But, you know, I do like the, if I revert your change, it's because I love you, and I want you to have a lovely evening without me, or a vacation, or whatever. Yes, whatever it is. I want you to be happy. I want you to be uninterrupted in your life, and I'm just going

Starting point is 00:37:56 to revert your change, and then we'll talk about it tomorrow, right? Or right after lunch, whatever it might be. And it's one of these things of like, I feel like if you can adopt some of these things. We've talked before on this podcast about like, you know, the reason that I got so interested in engineering practices, agile engineering practices in particular, is because I sort of realized it's like, if there's certain things that you do and you do them well, there's a whole host of other things that you don't need to do, right? And I feel like this is an example of that, where it's like, if you get comfortable with this as a team, as an organization, where it's like, yeah, when we commit, it goes right to master. It goes to the main branch, gets deployed to production. That's just how everything works. If we run into problems, we revert the commit, and then we've got a reverted commit, and then that gets deployed, and now

Starting point is 00:38:46 the problem is fixed, right? If you do that, you don't have the queuing problems. You don't really have to worry that much about versioning and keeping old versions. You have a nice thing of depending on the

Starting point is 00:39:01 particulars of your project, not every project is going to be able to do this, but you can get into situations where it's like, you know, unless you find yourself very often needing to roll back and your deployment system, however it is, doesn't let you just roll back to a particular commit, you can rerun that SHA. But there's a whole bunch of versioning things that you probably also don't have to worry that much about. Your solutions to those can be significantly simpler because you're just

Starting point is 00:39:38 reverting commits instead of, oh, I need to roll back to version 1.27, and then where is version 1.27? I don't know. I stored it in Artifactory or whatever. We've got to fetch it from Artifactory. There's just a bunch of stuff you don't have to build. So, I think, and again, not every project is going to

Starting point is 00:39:53 be able to do this. This is not a universal solution. But I think the main thing is just sort of thinking in these terms and trying to, like, simplify things in these ways. You'd be surprised at what clever solutions you can come up with if you just embrace the philosophy of it, right? Start with the philosophy. Be like, how do we get as close to this as we can? Work back from that. So how do we get to that from databases? I feel like somewhere along the line, I know there is a link, but you kind of switched gears, you know,

Starting point is 00:40:19 add another thing. I did, I did. But that's a great thing. But yeah, so my understanding is that we got there from, like, if you don't have to fire up a giant database or run against a big database, then that enables you to have the kind of self-contained, hermetic project where you just clone the project and type make, and you can run all the tests, you can do all the deployment, you can do everything within that world without having some exogenous dependency. An exogenous, unnecessary dependency, exactly, on a database. Exactly. Just trying to make sure that we've got the trail. Yeah, tie all these ranty pieces together. That's a good idea. Yeah, no, that's exactly right. It's, you know,

Starting point is 00:41:04 everyone sort of agrees that simpler is better. And all we disagree on is, what does it mean to be simpler? Some people would say, like, why are you, you know, writing your own code to scan a file to query things that you could just throw into a relational database? Isn't it simpler to just write a little bit of query instead of writing 100 lines of code? And my argument a lot of the time, and again, this is very context sensitive, but a lot of the time is, no, I'd rather write 100 lines of code than have a database. Because if I have to maintain a database, then I can't do all these other things.

Starting point is 00:41:33 And the other things are more valuable to me. Yes, exactly. The other things are more valuable to me than saving myself 100 lines of code, right? I'll just write the 100 lines of code. It'll be fine. And then that means that when you clone my repository and type make run, the system comes up and you can use it just as a user would, with no special stuff to have to make it work.
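
[Editor's sketch] To make the "100 lines of code instead of a database" trade-off concrete, here is a minimal Python sketch of querying a flat file by scanning it. The episode doesn't say what the data looks like, so the CSV layout, the column names and the scan and query helpers are all hypothetical. It only suits non-relational data that is small enough to read in one pass.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def scan(lines: Iterable[str]) -> Iterator[Row]:
    """Yield each CSV record as a dict keyed by the header row."""
    yield from csv.DictReader(lines)


def query(lines: Iterable[str],
          where: Callable[[Row], bool],
          order_by: str) -> List[Row]:
    """Filter and sort by scanning: a stand-in for SELECT ... WHERE ... ORDER BY."""
    matching = (row for row in scan(lines) if where(row))
    return sorted(matching, key=lambda row: row[order_by])


# Hypothetical data; in a real project this would be a file that lives
# in the repository next to the code.
EXAMPLE_DATA = """\
symbol,exchange,lot_size
XYZ,LSE,50
DEF,NYSE,10
ABC,NYSE,100
"""

if __name__ == "__main__":
    rows = query(io.StringIO(EXAMPLE_DATA),
                 where=lambda row: row["exchange"] == "NYSE",
                 order_by="symbol")
    for row in rows:
        print(row["symbol"], row["lot_size"])
```

Because the data is just a file, the same code runs unchanged on a fresh clone, in CI and in production, with no server to start and no migrations to coordinate.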

Starting point is 00:41:55 And when I deploy it, I know exactly how it's going to work, because I don't have to coordinate the deployment of the software with the deployment of a database, or write database migrations that go from one thing to the other thing, or any of that, because I have my hundred lines of code to replace all of that. Yeah. Cool. Well, I think that is databases fully covered. We need to come up with a better ending than that. Maybe we could stop a bit earlier than this and I'll just do some magic editing, because that seemed like a natural

Starting point is 00:42:25 end point. Or maybe we just put this into it and then everyone can see how rubbish we are at finishing things. How bad are we at endings? We are this bad at endings. You've been listening to Two's Complement, a programming podcast by Ben Rady and Matt Godbolt.

Starting point is 00:42:45 Find the show transcript and notes at twoscompliment.org. Contact us on Twitter at twoscp, that's at T-W-O-S-C-P. Theme music by Inverse Phase, inversephase.com.
