Two's Complement - Boring is Awesome
Episode Date: July 18, 2021
Ben and Matt think boring things are good, and provide a few examples. Databases, for example, are boring...but even more boring options exist! Matt explains how boring tools make it easy to automate ...local development tasks on his funny side project. Ben reverts your commits because he wants you to be happy.
Transcript
I'm Matt Godbolt.
And I'm Ben Rady.
And this is Two's Complement, a programming podcast.
Hey, Ben.
Hey, Matt.
I hear you've been thinking about databases.
A little bit.
I have some ideas about databases.
So when you mean database, you mean like SQL databases, like server databases, right?
Yeah, Postgres.
As opposed to the more colloquial...
MySQL, you know, Oracle if anyone still uses that.
I don't know.
I'm pretty sure someone's still using Oracle.
There's got to be at least one or two people.
Oracle seems like they have money, so there has to be someone somewhere that's using Oracle.
What about SQLite?
Does that fit into your general pantheon of databi?
That is the plural of database, I'm sure.
Yes, but in a more interesting way.
Oh, okay.
But yes, it does.
So what's your beef with databases?
I don't...
Because I'm assuming it's a beef,
because you otherwise wouldn't have mentioned it.
I have a minor beef with databases in that I think that databases should be used to store relational data.
If you don't have relational data, you should strongly consider not using a database.
That is my beef with databases.
Because I see many, many situations where people do not have relational data,
and they're like, we have data.
Where should I put this data?
I'll put it in a database.
And that creates a whole host of problems.
So to be clear here, you're also talking about relational databases, not like the Nazgûl style.
Right, yes.
Which, you know, I think that was a Lord of the Rings thing, wasn't it?
The Nazgûl?
Yeah, uh-huh.
The Ringwraiths.
Sorry, I completely caught you off guard there.
That is why.
NoSQL.
Sorry for the listeners who aren't playing along with the stupid pronunciation thing. NoSQL databases, which are much more like just data dumps, with no relational component, or with only, you know, complicated layering to get document...
Traditional RDBMS, yes.
...things, yes. And your observation is that some data that is non-relational ends up in a database because it's convenient, probably?
Yes. Or it's just there.
It's familiar, I think, is a lot of what it comes down to.
Right?
Right.
People, they know how to configure them.
They know how to use them.
They know how to interact with them.
They're comfortable with SQL.
You know, they have operational teams that can support them.
There are lots of vendors that'll give you one.
You know, you've been using Postgres for 10 years
and it's, like, super comfy. Right. It's like a warm blanket, right?
Right. What's wrong with that?
Well, the problem is, I mean, it gets back to the old programmer adage of use the right tool
for the right job, right? And just because a tool is familiar doesn't necessarily mean it's
the right thing to do. Now, I'm going to have a slightly hard time blaming anyone for using, you know, we had this phrase at Prevco.
We said boring is awesome.
And, you know, using boring technologies, old technologies that work really well and are well trodden and have all the bugs sort of wrung out of them already.
I'm going to have a really hard time complaining that somebody is using one of those to solve a problem, because it's like, yeah, okay, you want to actually get this thing working, and you don't want to be a technology fetishist and just use the newest, coolest NoSQL thing or whatever. Sure. Fine. Respect.
But there are lots of times when you could get even more boring than a relational database and put it in, oh, I don't know, a file?
I was going to say, what could be more boring than a database?
Yes.
Are you telling me a file?
Yes.
So what is this boring axis that you're referring to?
What is boring a proxy for in this particular instance?
What are you getting from this thing?
Yeah, what are we getting from boring?
Well, the first thing we're getting from boring
is that there's a large community.
The tool that you're using has been used in a lot of different contexts.
It has very nice documentation, has a large community around it that can help you use it.
A lot of the bugs have been worked out of it.
It's been used in production environments in a lot of different ways.
And it's just a generally reliable tool. So that's like Postgres definitely ticks the box of boring in that respect
because it just works.
Everyone understands that you can advertise for a job and say,
hey, if you've got Postgres experience, someone can come in
and you know roughly where you stand with that.
Yep.
But you're saying that even more, like before there was Postgres,
before there was any database, there was a file.
Yeah.
File systems.
File systems are real boring.
They're even more boring than databases.
You put file systems on your resume and people are going to go like, what?
Right?
Like, it's super boring.
But it turns out, if you don't have relational data, files are a great way to store data. You can write them to the file system and read them.
Right. So give me an example of the kind of data you're thinking about when you're very specifically saying non-relational data. Give me an example, because I want to sort of get...
Yeah. Well, anything binary, first of all, right? Whenever I see people putting, like, JPEGs as binary blobs into a database, I'm thinking, okay, I get that you have a data storage device of some kind and you want to put the stuff in there. But, you know, is that really the best place to put it? Now, again, if you have other relational data around it, you know, you have relational data that needs to refer...
Maybe it's your avatar in your forum system and everything else is relational, and you just have the PNG stored in there.
Right. It's not like you can do SELECT png_decode of blah in your query and actually do something with the data.
But, I mean, that's why databases have blobs, after all. That is what these are meant for, right? But you're saying that if that's all you're doing...
Yeah, maybe you need to think carefully. If primarily what you're doing is storing images, do you really need a relational database to store your images? I think probably you don't.
And there are also other reasons why, you know, it's obviously important to think about the context of things that can be used.
Scalability is a big concern in a lot of different contexts.
You can get scalability through relational databases.
But if you're using the relational database simply as a mechanism for scalability, what about an object store, right? You don't necessarily need to use files, or you can't use files, because it's like, okay, well, this has to run on multiple computers, and they can't all share the same file system without some NFS that we don't want to do.
Yes. S3 types.
Yeah. Yes. What about a file store? Again, you don't have relations in your data. Don't use a relational database. Use something simpler. So, you know, there's stuff like that. So, like, for example,
if you're building internal tools, right? Some internal website, something like that.
The number of people that will ever use this tool in its entire lifetime is like two dozen, right?
Right.
Do you really need to run that on more than anything but like a single server or a virtual machine or something where, you know, you can just write files and read files and back
those files up and it'll be fine?
I don't think that you do.
And if you can do that, then you get a whole bunch of other stuff for free.
We had talked on an earlier episode about the importance of manual testing, being able to run things locally and do what the users do exactly on your local development workstation and be able to reproduce the steps that they take and troubleshoot things in the same way that they will.
And I think that's extremely important.
So important, in fact, that I would be reluctant to add any technology to a project that hindered
me in that effort.
One of those things can be a relational database.
If you have to have a database loaded with all the data, with all the right schemas loaded
up on your machine, then you can't have a lot of automation around creating that.
It's harder.
It's harder to get.
It's hard.
You're right.
It's a barrier to doing it.
It's not impossible, and virtualization technology
and other things have come along to make it a bit easier,
but it's not straightforward compared to: here's a file.
But the question is, will you?
Right.
Can you and will you are two different things. Yes, I can make a Docker container for my database, and I can write all the scripts and tools to load all the things into it, and I can integrate that in with all my projects so when I fire up the server, whatever it is, it also fires up the database, and it tears it down properly so I'm not leaking Docker containers and my laptop crashed.
Well, you had 35 instances of Postgres running.
That's why it crashed.
All those things.
You can solve all of those problems and take the time to solve all those problems.
A lot of people don't.
They just say, well, I'll just run the database myself.
Or everyone points to the dev database, which you just assume runs somewhere.
Oh, is someone else using dev right now? Because I'm running my tests against it. Oh, sorry. Yeah, you know, we've all been there.
Yeah. So I think that's what people generally do, because they don't want to take the time to automate it. But I think that there's an even better option there, which is: why do we need a database for this thing?
Right. So you mentioned files. That makes sense to me. I remember that at Prevco, the sort of internal link shortener that was hacked together used exclusively a file, an append-only text file, to store, you know, space-separated, here's the short URL, here's the long URL. And it just loaded it into memory every time it started back up again.
And there you are.
That's a simple enough thing if it fits into memory.
And that's a perfect, I think, example of what you're talking about.
There's no relational aspect to this whatsoever.
It can all live in memory.
And so it was implemented as a JavaScript, like, map,
literally from the short URL to the long URL.
And it was, yeah, as I say, a text file.
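The append-only file approach described here can be sketched in a few lines. This is a minimal illustration in Python, not the original tool (which, per the conversation, was a JavaScript map); the class name `FileLinkStore`, the file name, and the example URLs are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


class FileLinkStore:
    """Short-URL store backed by an append-only, space-separated text file."""

    def __init__(self, path):
        self.path = Path(path)
        self.links = {}  # short name -> long URL, held entirely in memory
        if self.path.exists():
            # Replay the log on startup; a later line wins over an earlier one.
            with self.path.open(encoding="utf-8") as log:
                for line in log:
                    short, _, long_url = line.rstrip("\n").partition(" ")
                    if short and long_url:
                        self.links[short] = long_url

    def add(self, short, long_url):
        if " " in short or "\n" in short or "\n" in long_url:
            raise ValueError("short names and URLs must each fit on one line")
        # Append to the file first, then update the in-memory map.
        with self.path.open("a", encoding="utf-8") as log:
            log.write(f"{short} {long_url}\n")
        self.links[short] = long_url

    def resolve(self, short):
        return self.links.get(short)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "links.txt"  # hypothetical file name
        store = FileLinkStore(path)
        store.add("ce", "https://example.com/compiler")
        store.add("docs", "https://example.com/internal/docs")
        # Simulate a restart: a fresh object rebuilds the map from the file.
        restarted = FileLinkStore(path)
        return restarted.resolve("ce"), restarted.resolve("missing")
```

Startup cost grows with the length of the log, which is the restart problem that comes up later in the conversation.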
But what if your data is bigger than that
or needs to be indexed more than that?
You get an awful lot for free
with something like a SQL database.
Hey, I need to look up by this field or this other field.
And now I can write the code to do that,
but it's kind of easier
if I just let the database manage that bit
and I can create two indices now, right?
I mean, if you're looking things up by various fields, that sort of smells a little bit like relational data to me, in which case I would say.
Not necessarily.
Like, for what it, you know, take my URL shortener, for example.
What if I wanted to say, okay, well, what are the short URLs that lead to this big URL?
Like essentially an inverted index, right?
Right. Obviously, I can write the code just to have two maps,
one that goes from short to long and one that goes from long to short.
But I'm basically making a database.
And if anyone needs anything more, like, hey,
what about it has a user field in it as well, right?
And that could be who created it.
And I said, well, find all the ones created by me.
Now you are definitely straying into relational areas here.
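To make the "I'm basically making a database" point concrete, here is a rough sketch of what hand-rolled indexing looks like once you want reverse lookups and a creator field. The class `LinkIndex` and its method names are hypothetical, invented for this illustration.

```python
from collections import defaultdict


class LinkIndex:
    """In-memory links with hand-maintained secondary indexes."""

    def __init__(self):
        self.records = {}                       # short -> (long_url, user)
        self.shorts_by_long = defaultdict(set)  # long_url -> {short, ...}
        self.shorts_by_user = defaultdict(set)  # user -> {short, ...}

    def add(self, short, long_url, user):
        # Re-pointing an existing short name means un-indexing the old record
        # first. This bookkeeping is what a database index does for you.
        if short in self.records:
            old_long, old_user = self.records[short]
            self.shorts_by_long[old_long].discard(short)
            self.shorts_by_user[old_user].discard(short)
        self.records[short] = (long_url, user)
        self.shorts_by_long[long_url].add(short)
        self.shorts_by_user[user].add(short)

    def resolve(self, short):
        record = self.records.get(short)
        return record[0] if record else None

    def shorts_for(self, long_url):
        return sorted(self.shorts_by_long.get(long_url, ()))

    def created_by(self, user):
        return sorted(self.shorts_by_user.get(user, ()))
```

Every new lookup field adds another map and another place where an update can leave the indexes out of sync, which is roughly where a relational store starts to earn its keep.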
Yeah, but I mean, the other aspect here is that
if you're making a URL shortener
for an internal application,
then the lifetime of this application
is going to have megabytes of data
at the most,
which means you can do
just about anything that you want, right?
I'm sort of just making the point
that if you start down a road
where you end up writing
all of these things
and then one day you're like, oh no, it doesn't fit in memory. Well, this is annoying. How are we going to make it scale? How are we going to...
If one day it's never going to fit in memory, okay, sure, of course. But there are certain categories of things where you can be like, the number of short URLs at this company...
Well, you say that. We actually did hit this problem, which is why I kind of bring this up. There was an API server that could generate the short URLs, which meant that you could very quickly churn through and create thousands of them, which was fine until it then took, you know, days for the machine to restart because it had to read through terabytes of this text file. But it was, you know, again, not a big problem. And yeah, okay, adding layers in your software could mean you could switch it out later on and put something else there.
I'm just saying it's –
I mean, it's a hell of a lot easier to go from a set of flat files to a database than it is to go the other way.
Correct.
Correct.
But the API that a database gives you is one that is sort of sort and search and find and reorder and limit and all these things,
which you might not need in a relational data store,
but you still want to be able to do those things,
aggregate stuff.
And you end up writing that yourself.
The file doesn't give you that.
So the database is both the storage mechanism
and the querying mechanism for that data.
Whereas if it's a file, it's just a storage mechanism.
And then it's up to you to kind of layer everything else,
which is probably a feature,
but I just want to sort of talk about...
Yeah. I mean, certainly if you find yourself building your own indexing system into flat files, it's probably time to move on to something else, right? But maybe not a relational database.
Perhaps maybe not a relational database.
And one of the places where I definitely see people abusing relational databases is with messaging-based systems, right?
Oh my.
With a message- or event-based system, people use the database as a terrible message queue, where they're writing things in and reading things back out and trying to time those things.
SELECT * FROM this WHERE id is greater than or equal to the last ID I got from you.
Exactly. Retrying.
Oh no.
Uh-huh. Did I get any new rows in this table? Did I get any rows in this table? And that is a huge dysfunction, right?
Yeah, that's definitely somebody running around with a hammer thinking everything else is a nail at that point. Like, well, I got my database, what else can I do with it?
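For readers who have not met the database-as-message-queue pattern, this is roughly what the polling loop being described looks like. It is a sketch of the anti-pattern, not a recommendation, using Python's built-in `sqlite3` module; the `events` table and the function names are made up for the example, and it uses a strict `id >` comparison where the spoken version says "greater than or equal".

```python
import sqlite3


def make_queue_table(conn):
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
    )


def publish(conn, body):
    conn.execute("INSERT INTO events (body) VALUES (?)", (body,))
    conn.commit()


def poll(conn, last_seen_id):
    """One polling pass: fetch rows newer than the last id this consumer saw."""
    rows = conn.execute(
        "SELECT id, body FROM events WHERE id > ? ORDER BY id", (last_seen_id,)
    ).fetchall()
    new_last = rows[-1][0] if rows else last_seen_id
    return [body for _, body in rows], new_last


def demo():
    conn = sqlite3.connect(":memory:")
    make_queue_table(conn)
    last = 0
    empty, last = poll(conn, last)   # nothing yet: a wasted query
    publish(conn, "order-created")
    publish(conn, "order-shipped")
    batch, last = poll(conn, last)
    again, last = poll(conn, last)   # nothing new: another wasted query
    conn.close()
    return empty, batch, again, last
```

A real consumer would run `poll` in a loop with a sleep in between, so latency and database load both depend on a polling interval that somebody has to guess.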
Right. And the thing is that quite often systems devolve into, or evolve into, sort of more event-based systems, because people want things faster. They want them in real time. They want them to update automatically, right? Like, I want my web page to update automatically. I want my report to update automatically. And so they sort of evolve into these systems over time, and people don't stop and sort of reevaluate and be like, all right, well, while we have, like, you know, gigabytes of data and not terabytes or petabytes of data, maybe now should be the time where we take the leap to go from something that is relational to something that is more event-based.
Right.
And there are lots of tools for that.
One of the things that I will say is that I personally think it is generally easier to create the sort of like in-memory slash stub slash fake implementations of most message systems than it is to reproduce all of SQL.
Right. We alluded earlier to SQLite, which is possibly the one exception to this, right? And it's a great tool and you should use it if you find yourselves in these situations. But generally, if you've got N consumers and M producers, and you just want to tie them together, you can do that in memory for a single node to test locally pretty easily, right? So you can have your real producer-consumer that talks to, you know, Kafka or RabbitMQ or ZeroMQ or, you know, your message bus of choice, whatever. And you can have an alternate implementation of that that is not talking to any of those services, that just runs entirely in memory, and as soon as it receives a message, it sends it to all the producers, because they're all in the same process,
and that makes it really easy to run things locally and test locally
in a pretty realistic way.
Like, obviously, you're not going to be able to, like,
tease out all the weird things about your messaging system by doing that.
But, you know, you can do most things.
And so that sort of particular dysfunction, I think, is a bad one
when people don't sort of take a beat to just say, like,
maybe we should switch.
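The in-memory alternative to a real message bus can be very small. The sketch below is one possible shape for it, assuming a simple topic and callback interface; `InMemoryBus` and `OrderReporter` are hypothetical names, and a real adapter for Kafka, RabbitMQ or ZeroMQ would need to expose the same two methods for the swap to work.

```python
class InMemoryBus:
    """Stand-in for a real message bus: same process, synchronous delivery."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # Deliver immediately to every consumer subscribed to this topic.
        for callback in list(self._subscribers.get(topic, ())):
            callback(message)


class OrderReporter:
    """Example consumer; it depends only on subscribe(), not on any broker."""

    def __init__(self, bus):
        self.seen = []
        bus.subscribe("orders", self.seen.append)


def demo():
    bus = InMemoryBus()
    first, second = OrderReporter(bus), OrderReporter(bus)
    bus.publish("orders", {"id": 1})
    bus.publish("unrelated", {"id": 2})  # no subscribers, silently dropped
    return first.seen, second.seen
```

As noted in the conversation, this will not surface ordering, redelivery, or network quirks of a real broker, but it lets the whole system run in one process on a laptop.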
The thing you just said there about SQL is an interesting one, though, because whenever I have used SQL, and usually it's with something like SQLite, where I'm actually using a file somewhere, for some of the reasons that you said, you know, I don't need a server or stuff, I end up having to wrap all of my object interactions with the database in a very high-level abstract API so that I can test them, because there's no way in heck I'm going to test the SQL query itself. Or maybe I am, but there's only so much you can do and be sure that you've done the right thing there. And I guess the traditional solution to this is to have an ORM, which then maps your objects into a database, and you kind of assume the ORM just works.
And then you test the objects or the ORM mapped objects
and the interactions with those and just assume.
But yeah, having to sort of stub out something
that looks like SQL doesn't sound very testable.
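One way to picture the "high-level abstract API" idea: application code talks to a small repository interface, and only one class knows any SQL. The sketch below assumes a two-method interface (`save` and `find`) and uses Python's built-in `sqlite3`; all class and function names are invented for the illustration.

```python
import sqlite3


class SqliteLinkRepo:
    """The only place that knows SQL (SQLite dialect in this sketch)."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS links ("
            "short TEXT PRIMARY KEY, long_url TEXT NOT NULL)"
        )

    def save(self, short, long_url):
        self.conn.execute(
            "INSERT OR REPLACE INTO links (short, long_url) VALUES (?, ?)",
            (short, long_url),
        )
        self.conn.commit()

    def find(self, short):
        row = self.conn.execute(
            "SELECT long_url FROM links WHERE short = ?", (short,)
        ).fetchone()
        return row[0] if row else None


class FakeLinkRepo:
    """In-memory stand-in with the same two methods, for fast tests."""

    def __init__(self):
        self.rows = {}

    def save(self, short, long_url):
        self.rows[short] = long_url

    def find(self, short):
        return self.rows.get(short)


def shorten(repo, short, long_url):
    """Application logic under test: refuse to overwrite an existing name."""
    if repo.find(short) is not None:
        return False
    repo.save(short, long_url)
    return True


def demo(repo):
    return (
        shorten(repo, "ce", "https://example.com/a"),
        shorten(repo, "ce", "https://example.com/b"),
        repo.find("ce"),
    )
```

Running the same `demo` against both implementations checks the application logic with the fake and the SQL with an in-memory SQLite database, which is about as much as this style of test can promise.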
Yeah.
Well, it's just you wind up with this sort of mock magic approach where it's like, okay, and maybe you do the thing where you, you know, you do make it till you fake it, right? So you run it against the real database and you make sure that it really works, and then you stub out the parts that were interacting with the real database using the data that you got back.
The results that you got back.
Yeah, right.
Basically, it's like, you know,
write your test and don't mock anything out.
Connect to the real database
and then copy-paste and then edit.
You know, shrink it down.
You know, all that good stuff.
You can do those techniques and that's fine.
It's like a kind of a brittle test.
Brittle is not the right word.
I mean, it's a reliable test
in that it will give you the same behavior
every single time.
You're not going to get any weird effects where it's like, I ran it and it failed, and I ran it and it passed.
It's like, no, you're getting the opposite of that with that approach, which is good.
But if you ever change your mind about what you want that SQL to be, you have to go through the whole process again and basically take the mocking out and then redoing it and putting it back in.
It's a bit more like the thing we discussed with Claire, with the sort of golden test, the acceptance testing, except there isn't an obvious place to put an automated system such as the test that she was talking about there, because you have to talk to the database and then you do this manual process of separating the wheat from the chaff.
Yeah.
Yes, we talked about SQLite, although I sort of glossed over that a little bit. But SQLite is an intermediate kind of form, because it has some benefits. It certainly doesn't have the drawbacks of needing a central database server with all the Docker-y thing or the dev instance or whatever. It is just a local file on disk. So what are your feelings about that?
I think that's a really good intermediate thing. There was a project that I worked on, oh God, when was that? I want to say it was like 10 years ago, but I don't even remember. But basically we had made the conscious choice to stick to very generic, you know, ANSI SQL, basically to say,
we are going to be able to work with any database, not just Postgres,
not just something else, basically only for testing, right?
So that we could run against SQLite and you could bring the whole system up
with SQLite and be very confident that when you moved over to Postgres
or MySQL or whatever we were using for production,
we were using Postgres in production.
It would just work.
It would just work, right?
And obviously, you know, there are cases where you can find that different database vendors
interpret things in different ways, and you run into problems.
But for the most part, that was a pretty good solution.
And I honestly, I feel like this was a while enough ago.
I don't know if that's a realistic solution anymore, honestly.
I feel like there might be people that are like,
yeah, if you're going to use Postgres,
there's no way you can write standard SQL
and have it actually get the value of Postgres that you want.
Like, okay, maybe that's true.
That's what I was going to say.
The value there, you know,
as soon as you start down the road of like stored procedures
to update things,
which of course you typically would only do
if you're starting to take the benefit of like
maybe some of the more relational things in the database
because you have to atomically update three tables or something like that.
In which case you kind of maybe we've moved out of the part that you were talking about where you're like misusing relational databases to store non-relational information.
That's like, no, that's a valid use of a database.
If it's a database doing, like, RDBMS stuff, fine, go, knock yourself out.
But is it just a file store of JPEGs, or, you know, a URL shortener? Even a URL shortener thing is more on the fence. But yeah. Is this top of mind because of things that you're thinking about at the moment, or is this just something that came to you?
I mean, it sort of touches on something that I think we were going to maybe talk about in a different podcast, but maybe this will be
the blend of these two things.
Which is like the project Makefile.
You know what I mean? Where
I personally
think that
Oh, who said this? This is probably not...
I'm just going to attribute everything on this podcast that I
can't remember who said it to Michael Feathers.
And then I'll be, like, right 16% of the time.
I've gotten a lot of wisdom in my career from Mr. Feathers.
He's a wonderful person.
But he said,
you know,
code is a way you treat your coworkers.
Yes.
I think that was him.
It probably is him.
And one of those aspects,
I think is if you want to bring people onto a project, right?
You want people to help you, fundamentally.
You have to help them help you, right?
You have to do things for them to make it easy for them to contribute.
You can't just push it all on them and be like, well, if you're a real programmer, you would just read through all these things and figure out how it works.
Or, you know, read my partially up-to-date documentation that I wrote three years ago or whatever it is.
Right.
You have to create an environment that is welcoming and friendly and easy to use.
Otherwise, they're either not going to work on it or they're going to be forced to work on it and they're going to hate you, right?
Or they're going to hate the code.
They probably won't hate you.
Right. Not you, although they will grumble.
They might grumble about you a little bit. But mostly, they'll just hate the code, right? They'll hate the thing that they're doing, which is not good. It's just like not filling up the coffee
machine or leaving your smelly lunch in the fridge. This is a bad thing that you can do to
your coworkers, and you should not do this. And so, one of the aspects of this, I think,
is you should be able to check out a repository
and run a simple command
and do all the things that we have talked about
on this podcast many times.
You should be able to run it locally and manually test it.
You should be able to run the tests and verify that they pass.
You should be able to deploy it.
You should be able to build an artifact that is deployable.
You should be able to do all of these things. And there's not that many. It's like maybe half a dozen, right? It's like run the system,
run the tests, build a deployable artifact, deploy the artifact, right? If you can do those things,
then you can do most things that software engineers need to do. And you should automate
all of those things. How do you automate all those things?
That's another question. The way that I've been doing it in recent years is by using Make.
Because Make is a tool that is good at resolving dependent tasks, sometimes in parallel. And it's ubiquitous, like basically any Linux environment that you're going to be in is going to have Make.
And yeah, Makefiles aren't the easiest thing in the world to write, but they're actually not crazy hard to read. Like, if you've already got one and you sort of understand how targets work, they're not crazy hard to read.
And if you're working in a compiled language, you might want to use Make or CMake, but you might want to use Make to do some stuff anyway, so you probably have them all there anyway. So it's not...
Now, can you do this with shell scripts?
Absolutely, you can.
I have.
It works great.
Can you do it with other tools?
Sure.
Again, applying boring is awesome.
I would go for a more boring tool here because there are definitely some boring solutions.
But that is a thing that I think is important.
So to answer your question of why is this top of mind for me, it's because I've had a few projects recently that have had data that was marginally relational, and certainly not very big, that depended on a relational database. That was like...
Got it. I figured there was a wound here.
Oh, and the instructions in the README are like, install Postgres, load these schema, you know, create these tables by loading the schema in, and then configure the Postgres URL to this, and then you can start the system.
And you're like, no, make.
I want to do make test.
And if it needs Postgres, then fine.
Maybe even...
It can bring up, you know, Docker, whatever, something, or Podman. But there should be no manual steps in this. That's the critical thing.
Exactly.
Yeah, I think, you know, well, you and I agree on this very, very strongly, right? Every project that I've worked on, I've had so much positive feedback from people saying, like, I love it when it's your project, because I just do git clone and then I type make, and the compiler itself even gets installed on my computer and it just works. I'm like, yes, that's how it should be. If I need a magical version of GCC because I need this particular flag, then I will arrange for that to be on your computer as a result of typing make, as opposed to here's a list of sudo apt-get install crap that you have to do first. That should not ever be allowed.
And yeah, I mean, there's a variety of open source projects that I've worked on that all have a similar thing, and I think it's a big benefit. In fact, actually, someone raised a bug recently because it's one of the things that stopped working. But mostly I can point people at, say, Compiler Explorer and say, yeah, you know how you get it running locally? make. And it'll churn away for a bit, and then you go to port 10240 and then you've got your own local install of it. And people are like, oh, I was expecting there to be more. No, it's just that, because that's all you need. Again, it's broken right now, apparently, but working on it. I think it's a valuable and important thing.
And as an API, you can go far worse than Make, as you say.
I mean, npm sort of does it for the JavaScript community at some point,
and there's Maven and things and whatever.
But Make can run those.
I usually have a Makefile at the top of my project that maybe even runs CMake that then runs Ninja for all.
But you don't have to know that.
If you're just saying, no, make the project. I don't care.
It's like, well, there's layers and layers of things going on.
You don't need to know about it.
Hey, Conan's being installed in a Python virtualenv on your machine,
and we're installing all the dependencies through Conan, right?
But again, you don't need to know that.
It just works for you.
And it's all done through the magic of make.
Yep.
And that serves two really important purposes, actually. One is that it is this sort of like, you know, code is a way you treat your coworkers thing. But the other thing is, is that it is an absolutely correct form of documentation, right? Like, how do you configure and build and deploy the system? Well, it's all here. I'm 100% sure it's correct because we use it every day, all day.
It's how my CI runs.
It's how my deploys run.
Exactly.
It's how I run locally.
Right.
So not only can you read that to figure out how it all works, but you can confidently change it and know like, oh, if I make this change here, everyone's going to do it like this with this version of the code.
There's no like separate
like oh well there's the build but then there's the code and you have to keep them in sync and
if you roll back one you got to roll back the other it's all together it's all in one place
and it all works and so there's huge value sort of documentation and and coordination value
in automating those things and this is i mean me, this is sort of one of those things
you just have to choose to do, right? Like we've kind of talked before about like, you know,
we're wizards, we can do anything. What you choose to do in the sea of all possible things
is going to determine a lot about what your working environment is like and what you're
able to do and what you're not able to do. I don't think anyone, I don't think any of our listener listening to this podcast right
now.
I'm reliably informed that we have at least two, actually.
Now, I was talking to somebody other than our respective spouses.
Right.
So, both of our listeners would agree, I think, that any of these things that we've talked
about on this podcast are
possible to do, right? It's just a question of should you do them? And I think that you kind of
just have to start with the decision of like, yes, I'm going to automate this stuff entirely so that
you can just type make. And yes, that might lead me down some strange paths where I'm building tools to make sure that it is possible to do this.
But if you make the decision to do it, then everything sort of will follow along from that if you're committed to it.
Just like we said earlier as well, if you can start from the beginning in that way, it's harder than – sorry, it's easier than retrofitting it later.
So if you're like – well, it's just always been the case.
You type make and it gets everything. We started with just hello world and, you know, we've got the compiler and we got the thing building, and then, oh, we added a dependency on a third-party library. Okay, we're going to make sure that that comes down as part of the Makefile.
Yeah.
And you sort of incrementally put it on rather than trying to sort of retrofit it. Yeah, it's easier to do those kinds of things.
But again, I think you're right.
It's an effort of will on your own part
that you have to make that decision.
This is going to be worth it.
I'm going to take a hit early on.
And I mean, once you've done it a few times,
it's not even a hit.
It's just a way of life, right?
It's sort of the same, the Tao of a new project.
As you go, you know, new directory.
The very first thing I do is vi Makefile, and I'll paste in... It's worth saying there's a really nice little pattern that
we've picked up along the way, both of us. Well, I've picked it up from you, but I think you
picked it up from Jake McCreary, who picked it up from someone else: having a help target that
sort of greps itself out of the Makefile and, with a bit of awk and sed and magical things, kind of
makes an auto help page for your Makefile. And maybe your default target is that as
well, so if you type make, it just says, hey, these are the things you can do. And you're like, that's
great.
Yeah.
But yeah, so I'll paste that snippet into my Makefile, and then I'll
just have a make echo target that just says hello
world, and then, you know, start from there.
Yeah, that help file thing, I think, is nice. Like, you know, it
just sort of gives you that half dozen "here are the things you can do as a developer,"
and it sort of gets people started. The other thing about this is, not only is this something
that, you know, if you start early, you get that momentum going and it's easier. If you start early, there are actually lots of situations where you can,
you know, tap into the power of laziness in order to get people to do the right thing,
which is, and a great example of this, I think, is continuous deployment. So, if on day one,
you've, you know, followed my advice and say, the first thing you do is deploy. So deploy your hello world that does basically nothing
and have it automatically deploy whenever you push to the
main branch, then it will be difficult to not
have production in sync with the main branch because it's going to
do that automatically whenever you deploy. And people will just orient
their behaviors around that from the start.
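To make the self-documenting help target mentioned a little earlier more concrete, here is a minimal sketch of the idea. It is written in Python rather than the grep, awk and sed snippet referred to in the conversation, which the episode does not show. It assumes a hypothetical convention where a target documents itself with a trailing `## description` comment; the names `makefile_help`, `format_help` and `EXAMPLE_MAKEFILE` are made up for illustration.

```python
import re

# Assumed convention (not from the episode): "target: deps ## description".
HELP_LINE = re.compile(r"^([A-Za-z0-9_.-]+):.*?##\s*(.+)$")


def makefile_help(makefile_text: str) -> list[tuple[str, str]]:
    """Return (target, description) pairs for every documented target."""
    entries = []
    for line in makefile_text.splitlines():
        match = HELP_LINE.match(line)
        if match:
            entries.append((match.group(1), match.group(2).strip()))
    return entries


def format_help(entries: list[tuple[str, str]]) -> str:
    """Render the pairs as an aligned help page, one target per line."""
    width = max((len(name) for name, _ in entries), default=0)
    return "\n".join(f"{name.ljust(width)}  {text}" for name, text in entries)


# Hypothetical Makefile content, used only to exercise the sketch.
EXAMPLE_MAKEFILE = """\
help: ## Show this help
\t@echo "see above"
echo: ## Print hello world
\t@echo "hello world"
build: deps
\t@echo "undocumented, so not listed"
"""

if __name__ == "__main__":
    print(format_help(makefile_help(EXAMPLE_MAKEFILE)))
```

Targets without a `##` comment are simply left out of the listing, so the help page only advertises the handful of entry points a developer is meant to use.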
They'll be like, well, if we push to the main branch, it's going to deploy.
So how do we make sure that that doesn't break?
Well, I know.
I'll write a test.
Or I know I'll do this other thing or whatever.
You've got smart people.
They'll figure it out.
But if you start with that philosophy, it actually becomes the easy thing to do to do it right, as opposed to
this extra step that you have to take. But you have to start there, or you have to very quickly
get there. Because if you go in later, it's like, well, we're going to deploy to production every
time you push to the main branch. You'll get 100 very valid reasons why that's a bad idea,
right? And you should not do that.
And that's actually interesting. You reminded me of a couple of issues I've seen in the last
couple of weeks, which have both come down to projects not auto-pushing their latest version,
and then later on somebody accidentally, or, you know, as a side effect, pushing a newer version
of the project and breaking other things, because it was a relatively significant number of changes that got rolled out to a system.
And you're like, no, if it's pushed out every time you push, then we'd find out a lot earlier, and it would be causally linked with the thing that you had just done, as opposed to: but I just did this thing.
How on earth can that affect this other thing?
Oh, I picked up two weeks' worth of changes in one go.
Ah. It's shocking
how much, and I mean, if you talk to anybody that was into lean systems and the lean stuff
like, you know, 10 years ago or whatever, they'll tell you this, obviously. But it's
shocking how much queuing theory there is in software development management and
process and stuff, right? Like, if you understand
queuing theory really well, you can start to see those things in how developers push out changes,
right? And, you know, the whole Toyota Production System and all that sort of
fed into all this stuff. This is what the cool kids were talking about, like, 10 years ago.
Really? I'm not one of those.
The post-agile people, some of the agile refugees that were
like, you know, why are we all talking about stand-ups and cards and things? I just want to
build stuff. But yeah, queuing up changes, a perfect example of this is exactly what you're talking about: the longer you queue up changes, the more cost there is to actually deploying those changes.
And that happens in multiple dimensions.
One is that you've lost context, right?
The people who made the changes have just slept since then. And they just don't have the sort of top-of-mind knowledge that they would
have had if it was like, all right, I just built this thing and now I'm going to deploy this thing.
Hey, it broke. That's probably the thing that I just changed. I know exactly what's going on.
And the cache is all still warm, right? Like, it's all up there.
The other thing is that you can unfortunately sometimes defer those bugs to your coworkers,
who not only have slept since then, they're not you, which means they don't know
anything about this change that's going on.
Which is what happened to me, yeah.
Exactly, exactly.
So that can increase the cost.
And the other thing is that you get errors on top of errors, right?
So somebody checks in a change that breaks something.
Somebody goes and makes another change
and they look at your code and they go,
okay, well, apparently that's how it works now.
I'll do that.
And they're doing something wrong.
And then they make two of those things that are wrong.
And that just sort of compounds on top of each other
until the thing finally hits the real world.
And then that whole chain of things breaks
because we were building wrong on top of wrong
this whole time and you never knew it. So, I mean, those are just some sort of basic ways, but it's
this general problem: if you're queuing up changes to your system, you're taking on a lot of
risk, and you've got to be really careful that that risk is actually worth it. Sometimes it is. Sometimes
you can't just do things where it's like, yeah, literally every change just goes right to prod.
You know, there are situations where that can't happen. Lots of situations where that can't
happen. But understanding that your goal should always be to shrink it. And to also just recognize,
if you can't do this, well, here are some of the problems that you're going to encounter. You're
going to encounter the problem like you saw today. Okay, how do we deal with that problem when it happens?
One of the things that I have advocated for a long time is that git revert is not a personal insult. Reverting commits is something that you should take advantage of, right?
You will have a much more complicated operational process if you have this mentality that everything that everyone has ever
committed to this repository must be either fixed or remain pristine or never get rolled back.
Your life will be made so much easier if you just sort of have a meeting where we all come
together and say, all right, everyone in this room, we're all going to agree: if I revert your
commit, it's because I love you. And I want you to be able to go on vacation and not have to worry that the code that you've committed to the repository is perfect
and unassailable in all ways. You can leave the building and go home to your family and loved
ones. And if I see that you've made a mistake, as we all do, I'm just going to revert it. And I'm
going to tell you that when you come in the next day, be like, yep, Ben reverted my commit. Thank
you for reverting my commit. That means I can fix this now at my own leisure and not have to be woken up at 2 o'clock in the morning by pages or interrupted at dinner by someone saying, hey, Ben, you committed a bunch of bad code and then you left the building and now we need you to fix it right now.
It's like, why can't you just –
I love this.
Why can't you just revert it?
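For anyone who has not used it, the mechanics being described are small. The sketch below is one possible way to script it, not something shown in the episode: `git revert` creates a new commit that undoes an earlier one and leaves history intact, and reverting that revert commit later re-applies the original change. The helper names `revert_command` and `revert`, and the `repo_dir` parameter, are hypothetical; the `--no-edit` flag is a real git option that accepts the default commit message.

```python
import subprocess


def revert_command(sha: str) -> list[str]:
    """Build the git command that adds a new commit undoing `sha`."""
    # --no-edit keeps git's default "Revert ..." message instead of opening an editor.
    return ["git", "revert", "--no-edit", sha]


def revert(sha: str, repo_dir: str = ".") -> None:
    """Run the revert in `repo_dir`; raises CalledProcessError if git fails."""
    subprocess.run(revert_command(sha), cwd=repo_dir, check=True)


# Reverting a revert is the same operation applied to the revert commit's own SHA:
# revert("<sha of the revert commit>") brings the original change back to be fixed.
```

Because the revert is itself an ordinary commit, it flows through the same push-to-main deployment path as any other change.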
I think this is a brilliant analogy, yeah.
Because there is – you're right.
I mean, isn't it a funny social issue that, yeah, I do feel guilty reverting someone else's change?
It's like, you know, somehow a bad reflection on them, when it's like, no, it is a pragmatic thing that I'm
doing to buy us back the stability that we had before. And unless their change was required for operational
reasons, then often, as you say, it's like, well,
okay, you can come back in tomorrow and you can revert the revert, and then you can fix whatever issue it
was, and then, yeah, no harm done. And I'd like to think that if someone reverted
one of my changes, I wouldn't feel put upon. But, you know, yeah, I do like the
"if I revert your change, it's because I love you, and I
want you to have a lovely evening without me, or a vacation, or whatever."
Yes, whatever it is. I want you to be happy. I want you to be uninterrupted in your life, and I'm just going
to revert your change, and then we'll talk about it tomorrow, right? Or right after lunch,
whatever it might be. And it's one of these things where, I feel like, if you can adopt some of these things...
We've talked before on this podcast about, you know, the reason that I got so interested in engineering practices, agile engineering practices in particular, is because I sort of realized that if there are certain things that you do and you do them well, there's a whole host of other things that you don't need to do, right?
And I feel like this is an example of that, where if you get comfortable with this as a
team, as an organization, where it's like, yeah, when we commit, it goes right to master.
It goes to the main branch, gets deployed to production. That's
just how everything works. If we run into problems, we revert the commit, and then we've
got a reverted commit, and then that gets deployed, and now the problem is fixed, right?
If you do that, you don't have the queuing problems. You don't really have to worry that much about versioning and keeping old versions.
Depending on the particulars of your project, and not every project is going to be able to do this, you can get into situations where, unless you find yourself very often needing to roll back and your deployment system, however it works, doesn't let you just roll back to a particular commit, you can rerun that SHA.
But there's a whole bunch of versioning things that you probably also don't have to worry that much about. Your solutions to those can be significantly simpler because you're just reverting commits instead of, oh, I need to roll back to version 1.27, and then where is version 1.27? I don't know. I stored it in Artifactory or whatever. We've got to fetch it from Artifactory.
There's just a bunch of stuff you don't have to build. So, I think,
and again, not every project is going to be able to do this. This is not a universal solution. But I think the main thing is just sort of thinking in these terms and trying to simplify things in these ways. You'd be surprised at what clever solutions you can come up with if you just embrace the philosophy of it, right? Start with the philosophy. Be like, how do we get as close to this as we can? Work back from that.
So how did we get to that from databases? I feel like somewhere along the line, I know there is a link, but you kind of switched gears, you know, onto another thing.
I did, I did.
But that's a great thing. But yeah, so my understanding is that we got there from: if you don't have to fire up a giant database or run against a big database, then that enables you to have the kind of self-contained, hermetic project where you just clone the project and type make, and you can run all the tests, you can do all the deployment, you can do everything within that world without having some exogenous dependency.
An exogenous, unnecessary dependency, exactly.
On a database.
Exactly.
Just trying to make sure that we've got the trail.
Yeah, tie all these ranty pieces together. That's a good idea. Yeah, no, that's exactly right.
You know, everyone sort of agrees that simpler is better, and all we disagree on is: what does it mean to be simpler?
Some people would say, like, why are you, you know, writing your own code to scan a file to query things that you could just throw into a relational database? Isn't it simpler to just write a little bit of query instead of writing 100 lines of code?
And my argument a lot of the time, and again, this is very context sensitive, but a lot of the time is, no, I'd rather write 100 lines of code than have a database.
Because if I have to maintain a database, then I can't do all these other things.
And the other things are more valuable to me.
Yes, exactly.
The other things are more valuable to me than saving myself 100 lines of code, right?
I'll just write the 100 lines of code.
It'll be fine.
And then that means that when you clone my repository and write make run,
the system comes up and you can use it just as a user would
with no special stuff to have to make it work.
And when I deploy it, I know exactly how it's going to work
because I don't have to coordinate the deployment of the software
with the deployment of a database or write database migrations
that go from one thing to the other thing, or any of that, because I have my hundred lines of code to replace all of that.
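As a rough illustration of the "write the code to scan a file" side of that trade-off, here is a minimal sketch, assuming the data lives in a small CSV file checked into the repository. The file contents, column names and the `scan` and `python_projects` helpers are all hypothetical and not taken from the episode.

```python
import csv
import io
from typing import Callable, Iterable, Iterator


def scan(lines: Iterable[str], predicate: Callable[[dict], bool]) -> Iterator[dict]:
    """Yield every CSV row (as a dict keyed by header) that satisfies `predicate`."""
    for row in csv.DictReader(lines):
        if predicate(row):
            yield row


# Hypothetical data standing in for a flat file stored next to the code.
EXAMPLE_CSV = """\
name,language,stars
alpha,python,12
beta,rust,40
gamma,python,7
"""


def python_projects() -> list[str]:
    """Roughly: SELECT name FROM projects WHERE language = 'python'."""
    rows = scan(io.StringIO(EXAMPLE_CSV), lambda r: r["language"] == "python")
    return [r["name"] for r in rows]


if __name__ == "__main__":
    print(python_projects())
```

The query is a linear scan with no index, which is the point of the trade being described: a few more lines of code in exchange for a project with no database to install, migrate, or deploy alongside it.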
Yeah. Cool. Well, I think that is databases fully covered.
we need to come up with a better ending than that
maybe we could stop a bit earlier than this and i'll just do some magic editing because
that seemed like a natural
end point. Or maybe
we just put this into it and then everyone can see
how rubbish we are at finishing things.
How bad are we at endings?
We are this bad at endings.
You've been listening to Two's Complement,
a programming podcast by Ben Rady and
Matt Godbolt.
Find the show transcript and notes at twoscomplement.org.
Contact us on Twitter at twoscp, that's at T-W-O-S-C-P.
Theme music by Inverse Phase, inversephase.com.