Two's Complement - Running Programs
Episode Date: September 12, 2025
Matt and Ben discuss running in production; from running processes in screen to battling systemd configuration files. Ben sketches out daemonization rituals while Matt channels Tolkien to explain proc...ess hierarchies. Our hosts discover that Ansible playbooks are just bash scripts with better PR, and everyone still Googles journalctl syntax.
Transcript
I'm Matt Godbolt.
And I'm Ben Rady.
And this is Two's Complement, a programming podcast.
So we planned comprehensively, as always.
And today's topic is going to be signals.
and processes.
Yeah.
And that is the sum extent of our planning.
We said those words out loud.
And I said, yes, and hit record.
And then we continue talking about it during the intro.
Uh-huh.
And we're here.
So why is that top of mind for you?
Is there a reason why you are worrying about this right now?
There's a reason that I'm worried about this right now,
which is that I'm always worried about this,
because I see part of my job as a software engineer
is making sure that the software that I write actually runs
and does what it's supposed to do.
I know that there are lots of places in the world
where as a software engineer,
you're expected to write code,
and then there's another group or team or organization
or outsourced company
that is responsible for actually taking that software
and running it on computers
and making sure
that it continues to run on those computers
and that it delivers the value
that it is intended to do.
And in some cases, those things are
like very separated.
Right. You might just make a PR to a function.
You change a function, your tests pass,
you check it in, and then you have literally no idea
how it ends up serving people's requests
or whatever it is your company does, yeah.
Right, right.
And then on the other end of that spectrum,
I think you can have situations
and I have definitely been in these myself
where it is like, no, we're building this
for the very first time.
There's no infrastructure team.
There's you.
And you are going to compile your code.
You are going to SCP your code onto a server somewhere.
And then you are going to run a screen
and then exec that program in the screen.
And now you can post in a Slack channel
or some other place.
Hey, we've deployed to production.
And by production, you mean, yes, the only reason it didn't quit is because I'm still
running it in a screen.
Yes, exactly.
I do Ctrl-A, Ctrl-D in the screen, and now our production environment is...
Everything's fine.
Yes.
Everything's fine.
What's your logging strategy?
Oh, we log back in and we reattach.
You check the screen.
See what happens.
Yeah.
Okay.
That does work, but I can understand why, yeah, you might want something a little more.
Yes.
Well, and those are the two ends of the spectrum, I think, if we are going to simplify
it down to a spectrum. And, you know, I think that in your career,
and I have done a lot of this as a software engineer, you can kind of hop to the left,
I don't know, side of that spectrum and say, all right, well, okay, obviously I don't want to run it
in a screen. What else could I do? And then you start learning about systemd and things like runit
and supervisord and things like that.
Old school nohup, right?
Right.
And then log out and then you're done, right.
Exactly.
And then of course, you know, you start moving into distributed environments in the cloud.
You learn about, you know, Kubernetes and Elastic... what does ECS stand for?
I forget.
Container store?
No, container service.
Container service.
Yeah.
Yeah.
Or you've got, what's the, what's the...
HashiCorp thing, Nomad.
Nomad, similar things.
All these things which are like orchestration setups that say,
hey, you just tell me through some mechanism what you would like to have running
and I'll find a place to run them and run them in a particular controlled way.
And then that part, the deployment and running part, is taken out of your hands.
It's done by a framework.
But presumably, yeah, go on.
But all these things are accomplishing what is fundamentally the same goal,
which is: I have produced software and I want
it to run on a computer, or maybe multiple computers. Or maybe not multiple computers; it's
like, oh, this needs to run on exactly one, right? Like,
something is consuming a queue and there had better only be one of them at a time or bad things are
going to happen, right? So I think all of that is kind of encompassed in this topic
of like, I'm trying to run a program and how do I actually make sure that
that is happening the way that I want.
And I think that we could even structure this from sort of the bottom up, right?
So we started with screen, and I'm just running screen and now I've got a process and it's
executing.
Even screen is one level too far from: I literally run the process and it's there and
I'm watching it.
And I'm watching it.
I'll Ctrl-C it, which, you know, is also valid.
But it gives us a sort of starting point.
Like what happens when you fire up a process and why is that not okay?
Yeah, right, right. Yeah, that's great. I love this. Okay, so when you do that, you're like, all right, my plan for deployment now is I'm going to SSH onto the production server or EC2 instance or whatever you've got, and I'm going to SCP my bits up there.
Yeah, let's not get into packaging and deployment. That's even more public. Let's leave it at that. Some magical process happens and you have the bits that you need on that machine.
Yes, I have my executable bits on the machine and then I'm just going to run it. Well, now what you have is a process
that's a child of an sshd process.
Probably a child of the shell that you ran it on.
Oh, yeah, no, you're doing it.
I mean, if you're going to, no, you're absolutely right.
So if you've got the tree, it's like, okay,
it's a child of Bash, and then Bash is going to be a child of sshd,
and then that's going to be a child of the parent SSH server,
and then that's probably going to be a child of init, right?
Which nowadays is probably systemd.
Yeah, Thorin, son of Thrain, son of...
It's going to be your program, son of Bash, son of sshd child process, son of
sshd parent process.
Right, right.
So yeah, got it.
Yes, that makes sense.
All right.
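As a rough illustration of that "son of, son of" chain, here is a minimal sketch that walks a process's ancestry. It assumes a Linux-style /proc filesystem is available; the function names `parent_of` and `ancestry` are made up for this example.

```python
import os


def parent_of(pid: int) -> int:
    """Return the parent PID of `pid` by reading /proc (Linux-only assumption)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("PPid:"):
                return int(line.split()[1])
    raise ValueError(f"no PPid field for pid {pid}")


def ancestry(pid: int) -> list[int]:
    """Walk from `pid` up through its parents until the chain runs out."""
    chain = [pid]
    while True:
        try:
            parent = parent_of(chain[-1])
        except OSError:
            break  # no /proc entry (not Linux, or the parent is not visible)
        if parent <= 0:
            break  # we have reached the top of the tree (init, usually PID 1)
        chain.append(parent)
    return chain


if __name__ == "__main__":
    # Prints something like "4242, son of 4100, son of 3990, son of 1".
    print(", son of ".join(str(p) for p in ancestry(os.getpid())))
```

Run from an interactive SSH session, the chain would be expected to pass through the shell and the sshd processes before reaching PID 1.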
So if you naively, or maybe not naively, but you just sort of
have enough knowledge to be dangerous, you're like, oh, I've got the ampersand operator
in Bash that I could put at the end of that. Because it's like, okay,
cool, the production server is running on my laptop. And if I put my laptop to sleep... or, you know,
the SSH session, the client, is on my laptop. The server is a server. But it's like, all right,
I started it on the server. Now I want to go home. I need to close my laptop lid and I need to leave.
Well, what exactly is going to happen if I close this lid? Like, I don't want it to stop, right?
So you're like, okay.
Well, let's talk about what happens in that situation. Yeah, okay. Just to be
absolutely clear. Right. So let me read this back to you. So you're saying you're running, as
described, the production binary, having SSHed into a machine, and you've closed your laptop lid.
All right. So assuming you just close your laptop lid and nothing shut down
nicely, it just literally suspended. I don't actually know exactly what your laptop will do in this
situation. But let's just assume it disappears off the network instantaneously, which is also completely
reasonable if you go into like a tunnel on the train on the way home, that kind of thing. Right.
Then eventually the TCP connection between your computer and the SSH daemon
on the remote end will time out. There'll be a keepalive that's missing, probably, or some
other heartbeat mechanism will go down, and the SSH daemon will say, hey, that person's gone
now, it's time to clean up their session. It will, I think, kill the Bash process, and the
Bash process then will kill all of the children that it knows about, something like that.
Or there's some... yeah, so this is kind of it, right?
So what is?
Yeah, signals and processes, right?
Yeah.
I mean, I know that the result will be, my program will die.
Exactly how it dies,
I'm not 100% sure, but that's what would happen.
Eventually, maybe five or six minutes later when the SSH daemon times out your connection and
says this person's not there anymore, it kills the process tree through some mechanism,
and then, yeah, you get a phone call
as you've got onto the train
telling you that the production system is down,
please fix it.
Exactly, exactly right.
And this is maybe where we troll our listener
into posting the right answer on the internet to this,
because I would suspect what probably happens
is that the SSH daemon kills, like, the process group.
Of course, yeah, because Bash becomes a process
group controller or whatever the name is. A leader. Process group leader, that's right.
Yeah, there's my Stevens book... I haven't got it here, no. But yeah.
Okay, it's probably going to send a SIGTERM to that process group, and so every process in the process group is going
to receive that TERM signal and then hopefully gracefully shut down. I don't know if it follows it up
with like a SIGKILL at some point or not. Maybe it does, maybe it doesn't. I'm not exactly sure.
No, no, but that would seem reasonable, yeah, so that you don't end up with loads of
processes that just decided not to kill themselves. And frankly, I think Bash will probably do the right
thing in that circumstance.
So day one, we try to deploy like this.
We close our laptop lid, we go home, we get the unfortunate call, and then we rush home and then
we open the laptop lid back up and then we rerun the process. All right, well, I can't do that. So
an enterprising person might say, okay, what I'm going to do
is I'm going to use the bash ampersand command
because I know that that will put a process
into the background, right?
And so I'm going to do that next time.
I'm going to run.
I'm going to do my deploy.
I'm going to put an ampersand on the end, right?
And then I'm going to like,
now it's running in the background
and now I shouldn't have to worry about this.
Although if I were to do that with like a process
like we were just talking about,
the very first thing I would notice
is that my shell prompt comes back
and then immediately loads of junk
from my log file is now appearing
over top of what I'm running.
So even before we get into processes,
there's like a pragmatic thing.
So what I would probably do is redirect output to,
you know, ~/log.txt
and then put the ampersand on the end.
Right. So good.
And now it's in the background.
And I think we're great.
And I, you know, I'm tailing that log file for a bit.
And that's safe because that's a separate process.
Yeah.
And now I close the laptop lid and get on the plane,
train, whatever,
any mode of transport. What happens now?
Right. Well, I think what happens, and I think this because I've had
this burn me from time to time, is that yes, you redirected standard out, but you did not redirect standard
error. And so there is actually still... the daemon has a file handle that it thinks it needs to be
writing back to your thing. And so you put this in the background and you do this again and it breaks
again. It does exactly the same thing all over again.
Well, I think there's more than one reason. So yes,
first of all, standard error isn't going anywhere useful.
Right.
The second thing here is that although it is in the background, it's still a child of Bash.
So it's got you coming both ways. And maybe thirdly, standard input is still potentially connected to the console, the terminal, something.
I'm waving my hands a lot here because I'm less sure about it.
But I certainly know that if you try and read from the console, you'll get one of the even more esoteric signals about like, hey, yeah, you can't.
You're not connected to it right now.
And then you'll get stopped.
And so you'll see in Bash, stopped input required or something weird like that.
So all of those would defeat you and you end up with a dead process.
Right.
Right.
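The three problems listed here (standard error, standard input, and still belonging to the shell's session) can all be addressed at launch time. The following is a minimal sketch using Python's `subprocess` module rather than the Bash syntax being discussed; `launch_in_background` and the log path are hypothetical names chosen for the example.

```python
import os
import subprocess
import sys
import tempfile


def launch_in_background(argv, log_path):
    """Start argv with no terminal attachments; return the Popen handle."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,  # never try to read from the terminal
            stdout=log,                # the equivalent of "> log"
            stderr=subprocess.STDOUT,  # the equivalent of "2>&1"
            start_new_session=True,    # child calls setsid(): its own session
        )
    finally:
        log.close()  # the child keeps its own copy of the descriptor


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "log.txt")
    child = launch_in_background(
        [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"],
        path,
    )
    child.wait()
    print(open(path).read())
```

Because the child is in a new session it has no controlling terminal, so a hangup on the launching terminal is not delivered to it.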
So this is where you started investigating all of the various options that you can pass to SSH when you run this because you're like, I'm going to make a script.
I'm going to make a script that works, and I'm just going to run the script, and it's going to do my deploy, and then I'm going to trust that it works.
And you start learning that, okay, well, I need to do the option that, like, doesn't read from standard in, because I don't want the standard in problem.
And then I've got to make sure that I redirect standard out and standard error.
You're saying this thing in the background.
Just to be clear, these are options you pass to SSH or to Bash?
SSH.
Oh, I see.
So now we're not going to run Bash at all.
We're just going to run the executable directly.
Well, so I'm thinking of the world where it's like, you do a thing,
you copy the bits up to the machine, uh-huh, and then you have a separate SSH call
where you're passing the command that you want to run as an argument. Right, so you're no
longer running an interactive session, you're just going to... right, that makes sense. Okay, then that takes
Bash out of the equation, which helps us a bit in this context. Although there is still
another Bashian solution that I think
I see people go for, which is you type disown
in Bash, which says,
push this thing and make it not
a child of this process anymore.
And that probably,
probably
might solve the problem most of the time.
Yes. Except you've left a big
rake in the grass
for that, because there are other processes
on the system that might wish to get rid of
that apparently now orphaned
process. So
that's what nohup
is for. It deals with the hangup, and there's some other things that it does.
And then there's daemonization and other bits and pieces, which I'm sure we'll get to in a second.
But let's, let's put that to one side and let's go down the rabbit hole that you've described,
which is that like, I'm now going to run SSH on my computer.
Yeah.
And rather than just SSH, I'm going to pass it /path/to/my/executable with all the redirects and things set, and try and run it from a server and have it live on the remote machine with all of the pipes and
things, stdin, stderr and stdout, all connected to sensible places.
So yes, yes. Go ahead. Sorry, that's where I cut you off.
So you do that. And then you should, I believe, be able to SSH in separately and do like a pstree
and see that the parent of this process is now 1,
because it is disconnected from what it was doing before.
Right.
From the process group that it was in before.
And at that point, you maybe have something where you can close your laptop and have it
hang out.
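The one-shot SSH invocation described here might be assembled roughly as follows. This is a sketch under assumptions: the host name and paths are invented, `remote_start_command` is a hypothetical helper, and `ssh -n` plus `nohup` with all three streams redirected is one common combination rather than necessarily the exact flags the hosts had in mind. The function only builds the command; it does not run it.

```python
import shlex


def remote_start_command(host, executable, log_path):
    """Build an ssh argv that starts `executable` detached on `host`."""
    remote = (
        f"nohup {shlex.quote(executable)} "
        f"> {shlex.quote(log_path)} 2>&1 < /dev/null &"
    )
    # -n redirects ssh's own stdin from /dev/null, so the local side
    # does not hold the session open waiting for input.
    return ["ssh", "-n", host, remote]


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    argv = remote_start_command("prod1", "/opt/app/server", "/var/log/app.log")
    print(shlex.join(argv))
```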
Now, hopefully you sent your logs somewhere sensible and you don't fill up the disk with logs.
You can pipe it into syslog, which is something that I do when I'm
trying to punt on this problem entirely.
You know what?
There's already a log rotation system on this machine,
and it's called syslog.
So I'm just going to pipe all my logs into that.
Quite possibly you already have log aggregation set up for that
so that you can go and read it on like a website and all that kind of thing
as well.
Maybe you do.
I mean,
but yeah,
if you're considering that option,
you probably don't because you probably don't have any other
infrastructure to lean on.
Right.
Right.
Yeah.
Yeah.
Okay.
So that seems reasonable.
Yeah.
Yeah.
So what do you do after this?
So you do this, you finally can go home now.
You can shut your laptop and go home.
And you're like, surely we can make this better than this.
What do we do next?
Yeah, right.
So I still have...
Making the systemd job is one. Well, see, I was thinking of another thing.
So there is a process... well, process is a terribly overloaded term.
There is a sequence of things you can do on a POSIX system to become a daemon.
Mm-hmm.
Special incantation. You've got to sacrifice something. That's correct, yes. There's a pentagram
involved. And not a Damon, also, because Matt Damon is the only Damon. So, aside here: as you recall,
one of the first folks at the company you still work at was also called Matt and was not me,
and we were discussing various long-lived processes that we were designing a system to use,
and the obvious name was the Matt Daemon.
To be pronounced Matt Damon, obviously.
Right, right.
But we never did it.
Anyway, daemonization is... let's not get into politics.
Becoming a daemon, as I understand it, is a multi-step process.
The first thing you need to do is fork, which gives you a new process, a shiny new process.
Then you call something called setsid, which says,
I would like to become the session leader for this new process that's been created,
because only a process group... and I'm doing this from memory, so listener, please,
all bets are off; this is not necessarily correct.
So take this as a massive pile of me hallucinating all of this.
We are.
Yeah, let us know.
Yes.
So you fork.
The child process then does setsid to become a session leader in its own new group.
And then if I remember rightly, you have to fork again
to then dissociate yourself from any last tendrils that that previous process had.
And now you're running and you are completely in the clear.
It's something like that.
It's some weird sequence of events,
which means that you have lost all connection with the previous process.
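The fork, setsid, fork sequence recalled here can be sketched in a few lines. This is a simplified illustration of the classic double fork, not a complete daemon library: the function name `daemonize` is chosen for the example, and real implementations often also reset the umask and close other inherited file descriptors. The first fork is needed because setsid fails for a process that is already a process group leader; the second ensures the survivor is not a session leader and so cannot reacquire a controlling terminal.

```python
import os
import sys
import time


def daemonize():
    """Classic double fork. Returns only in the final daemon process."""
    sys.stdout.flush()
    sys.stderr.flush()
    if os.fork() > 0:
        os._exit(0)  # original process exits, so the shell gets its prompt back
    os.setsid()      # become leader of a new session with no controlling terminal
    if os.fork() > 0:
        os._exit(0)  # session leader exits; the grandchild carries on
    os.chdir("/")    # do not pin whatever directory we were started from
    # Point stdin, stdout and stderr at /dev/null.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)


if __name__ == "__main__":
    daemonize()
    time.sleep(30)  # stand-in for real work; visible in ps, absent from jobs
```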
And so when you run some system process and you pass it, like, dash dash D or dash D,
sorry, and it immediately returns and disappears, apparently, like, hey, did it do anything?
But you run ps and it's still running. That's the kind of process that it's been through.
And you know, you can type jobs and it won't be there. It's like completely lost from you.
And I didn't realize, with the thing you were just talking about, and the penny is dropping now,
that some of the flags that you were talking about finding for SSH to set it up correctly might be the ones that
effectively have the same side effect.
But having just written something that is a daemon, if you go back to the
systemd conversation we were having last time, something became a daemon and I went through
that process.
So it's somewhat top of mind.
And even though I had a daemonization thing, you can still choose, I think, with
systemd, which we're going to get to, to say either systemd runs the process and does that for
it in its own container, or it's expecting it to run in that particular way.
And so it can babysit different types of processes,
if I remember rightly.
Okay, let's go back to what you said about systemd
because that sounds like a useful thing to know about.
Right.
So just to put the problem in context,
systemd is a solution to a problem.
What's the problem?
Well, so here's the problem.
You've written your script.
See the last conversation we had about it as to what solution it might be.
What problem it is?
What problem are we creating by
solving another problem? Right, yeah. I think, actually, is that a thing? I feel like I've said
this before on the podcast, I don't remember. The difference between computer science and
software engineering, do we know this one? Computer science is solving problems
with computers. Software engineering is solving the problems that you create when
solving problems with computers. And yes, that checks out. The math checks
out for that, yeah. And so what problems are we both solving and creating by
using systemd? Well, so you write your bash script, it deploys your thing, you shut your laptop, and then
you wait five minutes, you open it back up, and then you SSH back in and it's still running, and you're
like, all right, I think I maybe believe that this is going to work. And you go home, and the next day you
come in and it's still running. Cool. And then three days later it crashes, and you're like, you know what
would have been super cool is, instead of me getting a phone call in the middle of the night because it
crashed, if it had just restarted. Well, I mean, wouldn't it be cool if it hadn't crashed would be
the first thought you'd have, but at three in the morning, you probably just want to go,
oh, for God's sake, just restart the thing.
Just restart it, please.
I'll fix it tomorrow, but can we please not call me, because I have to SSH back in and
rerun the script again or whatever.
Right, right, right.
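The "just restart it" wish is simple enough to sketch directly, which also shows what a supervisor is taking off your hands. The following is a toy restart loop, assuming that a non-zero exit code means a crash; `supervise` and its parameters are hypothetical, and a real supervisor such as systemd handles far more (logging, dependencies, rate limiting).

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, delay_seconds=1.0):
    """Run argv, restarting it when it exits non-zero.

    Returns the number of times the program was started.
    """
    starts = 0
    while True:
        starts += 1
        exit_code = subprocess.call(argv)
        if exit_code == 0:
            return starts  # clean exit: nothing to restart
        if starts > max_restarts:
            return starts  # give up; time for that phone call after all
        time.sleep(delay_seconds)  # brief pause so a crash loop does not spin


if __name__ == "__main__":
    crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print("started", supervise(crashing, max_restarts=2, delay_seconds=0.5), "times")
```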
So you're like, I just, I want this to restart.
And then you Google and you're like, well, maybe I should run this in systemd, right?
And so you wind up making a whole systemd job definition.
And you, I forget, where do you put it?
You put it in /etc something, right?
Or is it, yeah.
So there's, I don't even remember now.
So, I mean, my understanding is in the beginning, there was init.
And init is effectively the first thing that the kernel executes as a user process.
And it then decides what to do.
And back in the mists of time, there were like run levels and it was all like clever
directory structures and things like that.
And it just fired up the right sequence of daemon processes, one of which would be,
you know, an sshd so you could log into the machine, or a getty that would actually let you
type on the console to get into the machine. And that was it. And then after that, you're off to the
races. And systemd is the new init. And instead of it being a set of essentially shell scripts
that get run to fire things up in the right order (again, I'm probably glossing over and
missing loads of bits of context here), it's a sort of more principled approach where
you have units that say, like, I would like this thing to run, please.
I would like this to be true under these circumstances.
And it depends on these other things that also need to be either running or at least
have started before me.
And so instead of having essentially numbered directories with, you know, 40...
Yeah, rc.d or rc1.d, rc2.d, something.
Yeah, those were the run levels, I think, which was slightly different because it's single user mode
versus multi-user mode.
But this is more like, hey, what sequence do I need to run things in and shut them down
in, in order for my system to come up. And systemd does that kind of the right way by actually
tracking dependencies, which again was expensive and caused me problems in our last
conversation, but is the right approach and the correct thing to do. And so that's what
systemd is. It's like the overarching orchestrator of a computer and all of the processes
that are running on it. And so yes, to make something run in systemd, you put a file in the right
magical place. You issue the correct incantation to systemd to go and notice that that file is
there. And then what? I'm looking at you because I thought you might have just done this and
you could answer the question. Well, you need to reload systemd. Yeah, there's like daemon-reload
or something. That's the magical incantation that says, hey, systemd, look through your
configuration files, something has changed. Yes. Please do the needful
now. And then it should start up. And then you're using something like journalctl to look at the logs of the
thing to make sure that it started. Which is, I think for most people, when Linux systems particularly
moved from init to systemd, the biggest frying pan to the side of the head was, where are all my
chuffing logs? They used to be in /var/log, whatever, and that's burnt into my mind. They are text files
in /var/log/blah, and systemd stopped that. And now there are a few logs in /var/log, but nowadays you have
to interact with it through... and it has a binary log file format, as I understand it, under the hood, and
you have to learn journalctl, which I still haven't learned, and I still Google the
same thing over and over and over again and type in the thing that it tells me to do. Which is, note to
self, don't do that. Taking a note there: don't do that, make a cheat sheet for it, stick it to my monitor
like all the other cheat sheets I have.
Yeah, so that was, but that was like the,
that broke most people, I think,
because I didn't have to interact with adding
and removing demons from my system.
That's what, you know,
my package management system did.
But whenever something went wrong,
I'm like, where the hell's the log file?
Anyway, so,
it's a magical program called journalctl.
Okay.
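In the spirit of the cheat sheet mentioned above, here is a small sketch that collects the usual incantations as argument lists. The unit name `myapp.service` is hypothetical, and the selection of commands is an assumption about what is most often needed; the function only builds the commands and does not run them.

```python
import shlex


def service_commands(unit):
    """Return argv lists for the usual install-and-inspect sequence."""
    return {
        # Ask systemd to re-read unit files after adding or editing one.
        "reload_units": ["systemctl", "daemon-reload"],
        # Start the unit now and also on every boot.
        "start_now_and_on_boot": ["systemctl", "enable", "--now", unit],
        "status": ["systemctl", "status", unit],
        # Follow this unit's log output as it arrives.
        "follow_logs": ["journalctl", "-u", unit, "-f"],
        # Show this unit's logs from the current boot only.
        "logs_since_boot": ["journalctl", "-u", unit, "-b"],
    }


if __name__ == "__main__":
    for name, argv in service_commands("myapp.service").items():
        print(f"{name:24} {shlex.join(argv)}")
```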
I feel like this is like,
like, I want to go to the next level now.
It's like, okay, cool.
We're going to run this on like two computers
because we discovered.
Let's finish the thought.
The reason it crashes.
It got OOM killed.
Right, right, right.
Let's just finish the thought there.
So very concretely, you would install the binary to a known good location, which you probably
were anyway.
It wasn't just your home directory, hopefully.
Yeah.
Pick a user that you're going to run it as.
Yes, that's true.
Might be root, might not.
Yeah, let's hope it avoids being root if it can.
Yeah.
But then, yeah, you make a little text file that, it looks TOML-ish to me, that systemd
config-ish file,
that says, hey, I need these things.
I provide these things, which you often don't have to do.
This is how I'm going to be started up.
This script needs to run before I run.
This script needs to run after I run.
There's a few customization points you've got like that.
And you can say what you're wanted by as well.
So in this instance, you probably say, I'm wanted by multi-user.target,
which is like a magical sort of target that says, hey, when it becomes a multi-user system,
the fifth, whatever, run level five,
then this is, I am saying that I am wanted by it,
which is a way of you kind of going the other way around
from the usual dependency saying,
it depends on me.
Right, you're joining the dependency tree there.
Yeah.
So now when you start,
when you reboot the machine,
your service will come back up.
And then you can have some policies about retrying,
restarting it,
maximum number of times to restart,
how often to wait between,
how long to wait between them,
those kinds of things.
And then effectively,
it runs itself after that.
So that's what we do.
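To make that "little text file" concrete, here is a sketch that renders a minimal unit file. Everything specific in it is an assumption for illustration: the description, the executable path, the user name, and the choice of `Restart=on-failure` with a five second `RestartSec`. Unit files are INI-like, so the sketch leans on Python's `configparser`; `render_unit` is a hypothetical helper. Locally written units conventionally live under /etc/systemd/system.

```python
import configparser
import io


def render_unit(description, exec_start, user):
    """Render a minimal systemd service unit as text."""
    unit = configparser.ConfigParser(interpolation=None)
    unit.optionxform = str  # keep directive names case-sensitive
    unit["Unit"] = {"Description": description, "After": "network.target"}
    unit["Service"] = {
        "ExecStart": exec_start,
        "User": user,               # run as a dedicated user, not root
        "Restart": "on-failure",    # restart when the process crashes
        "RestartSec": "5",          # wait five seconds between attempts
    }
    unit["Install"] = {"WantedBy": "multi-user.target"}
    out = io.StringIO()
    unit.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical service; the paths and names are placeholders.
    print(render_unit("My application", "/opt/myapp/server", "myapp"))
```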
Yeah.
Yeah.
And so your installation process
is copy the binary bits up and make sure that this systemd configuration is there.
And then obviously, if you want to restart it, there are processes for restarting: service
restart and all that kind of good stuff.
Yeah, systemctl.
Yeah, I still use service, space, service name, restart.
There's almost certainly a hundred ways to do it.
Honestly, I still want to go var run, blah, or whatever the whole thing was.
I actually don't know what the command is, but it just comes out of my fingers when I need to say,
make that thing run again.
But yeah, service space, name of things, space restart is now what I've learned to do.
But okay.
Okay.
So that's where we are.
Right.
So now we're good, right?
We know that the process is being appropriately managed by a piece of software that's designed to start it up at the right time and keep it running.
It also has some handling for, like, if it does output to standard out, it'll go to a well-defined log place inside this journalctl thing.
If it crashes, it will restart it.
If you reboot the machine, it'll come
back up with it, if you set that to be so. Everything is wonderful. So what's next? Right, so what's next
is that you discover that the thing just crashes every four or five days because it's running out
of memory, because it needs to run on more than one computer. It is too big. So you have to now run it on
multiple computers. And you're assuming, whatever work, you've ruled out the "there's a memory leak"
type... Yes, it's not a memory leak, yeah. It's just too much data. Yeah, too much. Yeah.
So what do we do now, then?
So now we need to run it on multiple computers.
And so, like, one thing you might reach for here is Ansible?
I was going to say, I'd probably be duplicating the line in the script: scp, space, ssh machine,
service whatever restart, and just do "for host in".
"For host in host list", yes.
And just do the exact same thing.
I would do, right, at least to start with, right?
That's the V-0 of anything.
Yes.
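That version zero, the same copy-and-restart pair repeated per host, could be sketched like this. The host names, paths and unit name are invented, `deploy_commands` is a hypothetical helper, and the use of `sudo systemctl restart` on the remote side is an assumption. The sketch only builds the commands so it is safe to run anywhere.

```python
import shlex


def deploy_commands(hosts, local_binary, remote_path, unit):
    """Return, in order, the scp and ssh argv lists a shell loop would run."""
    plan = []
    for host in hosts:
        # Copy the new bits up, then bounce the service on that host.
        plan.append(["scp", local_binary, f"{host}:{remote_path}"])
        plan.append(["ssh", host, f"sudo systemctl restart {shlex.quote(unit)}"])
    return plan


if __name__ == "__main__":
    hosts = ["app1.example.com", "app2.example.com"]  # hypothetical host list
    for argv in deploy_commands(hosts, "./server", "/opt/myapp/server", "myapp.service"):
        print(shlex.join(argv))
```

The hosts are visited one after another, so total deploy time grows linearly with the length of the list.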
Well, okay, let's deploy it to the two computers I know about
right now and just do the same thing on both of them. And that is probably what I would do.
And then I would have the thing where I would try to deploy it and there'd be some package or some
configuration, like, oh, we've got to increase the maximum size of the receive buffers on the network.
And so now I've got to go and change that configuration. I've got to change it, and I've already
scaled this out to like 10 computers now. Like, every month for the last, you know, 10 months I've been
just adding another host to the list of hosts. Yeah.
And now it takes like, you know, three minutes just to iterate through all of them.
And I'm like, oh, and I have to remember to log in and set all these settings every time I add a new host, and it's getting worse and worse and worse.
Okay.
So we've now gone firmly outside of signals and processes.
And now this is like the setting up of the machine here is what you're talking about, which is well valid.
And if you think of, you know, the systemd configuration unit file, whatever we just said, as being part of this machine configuration, then
it does make sense to talk about some of the other things that you might need that machine to have set up, like packages.
And, as you say, system settings.
So let's segue into that.
Let's do it.
Yeah, okay.
So you've decided that now, okay, I need to retire this bash script.
It's served me well, but it's time to move on to something where I don't have to build all the stuff myself and make sure that it works and troubleshoot it all.
So I'm going to try to use Ansible, let's just say.
What is Ansible and what makes something able to be "ansed", which is presumably what it means?
Well, first you have to have pants, and you can have ants in your pants.
That would be pants.
And then, Pansible.
That's going to be the fork of Ansible.
Okay.
Ansible.
So Ansible is honestly a tool that I have only used sometimes.
I sort of wind up making the jump
from the shell script to, like, Terraform.
That's usually what I do. I'm like,
all right, I'm going to go and I'm going to have
something like Nomad manage these,
or I'm managing the cloud.
So at that point,
you jump straight out into sort of an orchestration environment
as opposed to "I'm controlling individual machines",
because that's the other thing in here,
that host list and provisioning of those machines.
We're assuming that these machines exist
and you haven't got to, like, make them appear in EC2.
But let's go through what Ansible is.
But real high level, Ansible is:
you write a playbook, and I think that playbook is pretty much in YAML, and it's got the steps that you want to perform. And there's a lot of sort of baked-in things, like, oh, I need to copy this artifact from this place to this place, cool. I need to create a configuration file here, cool. I need to restart systemd, cool. It can do all those things for you, and there's lots of baked-in tools in Ansible to do the typical system management things. You can install packages, you can create users,
you know, because hopefully, like you said, we weren't running this thing as
root, so we had a dedicated user for it. When I'm setting up a new machine I need to make that
user, I need to make sure they don't have a password, that they have the right SSH keys, you know,
all those kinds of wonderful things. So you have some, you know, script or some playbook that you
run, you know, as root, because it needs to be able to do all these things. But then it sort of
sets up the environment, and then subsequent deploys and things, you know,
the program can run as a user and it doesn't need to... Got it, right. That makes sense. So it is
essentially a canonicalization, is that a word, of the steps that you need to do. The
playbook, I mean, that's a good name for it, right? It replaces the playbook which is, you know,
the Google Doc that you have that says, remember, when you create a new machine, here's the 25 steps
you have to do, and you kind of roll your eyes and do them.
And it's like, well, let's automate this.
And it does it in a principled way, with a bunch of
support functionality that lets you do, like, "add user" rather than having to go through
whatever steps
you'd actually have to take to add the user, which I forget these days.
I think one of the things that I have had difficulty in getting my head around when looking
at these sets of tools.
And only because you've mentioned Terraform, one thing I like about something like Terraform is that you kind of describe the end state.
Yeah.
And Terraform's responsible for getting whatever the current state is to the end state.
Yeah.
Whereas with things like Ansible, as I understand it, you have to be very careful to either be idempotent, so you can run the same thing twice and it doesn't re-add another user if there is one already called that thing,
or you just have to not run that step again.
You know, like, hey, once we add that user, don't try and do it again.
And then you kind of go, like, well, now I want to change the user to have a different, you know, full name or a different shell or whatever.
You're like, ah, now I have to run the change command.
Right.
I can't just change the add.
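To make the idempotency point concrete, here is a minimal sketch in Python. It is not how Ansible implements its user handling; the `User` record, the `ensure_user` function, and the in-memory `users` dictionary standing in for the machine's user database are all hypothetical names invented for illustration. The idea is only that a step which checks the current state first can be run twice safely, and can turn a later edit into a change rather than a second add.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class User:
    """Hypothetical record for the properties we care about."""
    name: str
    full_name: str
    shell: str


def ensure_user(users: dict, wanted: User) -> str:
    """Make `users` contain `wanted` and report what had to be done.

    `users` is an in-memory stand-in for the machine's user database.
    A real step would run an add-user or modify-user command instead
    of assigning into a dictionary.
    """
    current = users.get(wanted.name)
    if current is None:
        users[wanted.name] = wanted  # no such user yet: add it
        return "added"
    if current != wanted:
        users[wanted.name] = wanted  # user exists but differs: change it
        return "changed"
    return "unchanged"  # already in the wanted state: do nothing


users = {}
svc = User(name="svc", full_name="Service Account", shell="/bin/bash")

first = ensure_user(users, svc)   # first run adds the user
second = ensure_user(users, svc)  # second run is a no-op
third = ensure_user(users, replace(svc, shell="/usr/sbin/nologin"))

print(first, second, third)  # prints: added unchanged changed
```

Running the same step twice leaves one user, not two, and editing the wanted shell produces a change instead of an error about a duplicate user.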
And Unix systems are so, so complicated.
I can't actually imagine how you could write a general-purpose, like, "make my system look this way" thing. Except, for at least one listener, somebody is currently shouting "Nix!" into the void as they're walking along.
And I know that Nix solves this in a very cool way, and I'm very excited by it.
But I don't have any personal experience with it other than someone demoing to me and me
going, wow, that is super cool.
Yeah, I've heard those same things about Nix.
Yeah, Nix seems to be like a kind of a mind virus that people get. Not in a bad way necessarily, that does sound pejorative. But, like, because once you get it, I think you're like, oh my gosh, this is how everything should always be done.
Yeah. And that's great, and you proselytize it to everybody, and then most people's eyes glaze over and you're like, that seems great. And then you just log back onto the machine and just go sudo apt install, bam, and you're like, there, we're done. Anyway, back to... oops, I've just banged my... sorry, editing Matt, I've just whacked the microphone stand. So where were we?
So I was sort of saying that there's a difference between sort of prescriptive, run these things in order, and maybe they're idempotent or maybe they can adapt and say, like, well, if there's a user already there, don't re-add it, that kind of feeling, versus the Terraform thing where you just say, I should like this to be the end state. Here is a list of users the machine has to have, with the properties that users have.
Right.
And then Terraform goes behind the scenes and goes, well, look at what users I've got. Oh, now I'll make a plan. A plan is add three users, delete one user. And it presents it to the user: this is what I'm going to do.
Yeah. Have you ever actually used Terraform to do that type of system administration before?
Not on a system, no. I've only ever done it with infrastructural components.
Yeah.
So yes, that is true. I've never used it for... I don't know if it can do that, actually.
I don't know that it does.
You're right. Yeah, now I say it. But where I was going with that was less Terraform specifically, but, like, the framing is either outcome or steps.
And, you know, it's nice to supply the outcome.
But yeah, I don't know if something does exist.
And my only interaction with things like that is with Packer, where I always start from scratch, an empty image, and then run the sequence of steps to make an image that looks the way I want it to.
So I never go back to it and kind of go, hey, I want that image, but slightly different.
So, yeah.
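The "outcome" style described above, where you state the end state and a tool works out a plan, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Terraform's actual algorithm, and as noted in the conversation it is unclear whether Terraform manages users at all. The function names `make_plan` and `apply_plan`, and the dictionaries describing users, are hypothetical.

```python
def make_plan(current: dict, desired: dict) -> list:
    """Diff two states and return the actions that turn `current` into `desired`.

    Both arguments map a user name to a dict of properties. Nothing is
    changed here; the plan is only a list of (action, name) pairs that
    can be shown to a person before anything is applied.
    """
    plan = []
    for name in sorted(desired):
        if name not in current:
            plan.append(("add", name))
        elif current[name] != desired[name]:
            plan.append(("change", name))
    for name in sorted(current):
        if name not in desired:
            plan.append(("delete", name))
    return plan


def apply_plan(current: dict, desired: dict, plan: list) -> dict:
    """Return a new state with the plan applied; `current` is left untouched."""
    result = dict(current)
    for action, name in plan:
        if action == "delete":
            del result[name]
        else:  # "add" and "change" both end with the desired properties
            result[name] = desired[name]
    return result


current = {
    "alice": {"shell": "/bin/bash"},
    "olduser": {"shell": "/bin/sh"},
}
desired = {
    "alice": {"shell": "/bin/zsh"},
    "bob": {"shell": "/bin/bash"},
    "carol": {"shell": "/bin/bash"},
}

plan = make_plan(current, desired)
for action, name in plan:
    print(action, name)

after = apply_plan(current, desired, plan)
print(make_plan(after, desired))  # an empty plan: nothing left to do
```

The design point is that the person only ever edits `desired`. Whether a given edit turns into an add, a change, or a delete falls out of the diff, which is the difference from a step list where you have to pick the right command yourself.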
Yeah. Yeah. Yeah, maybe that's the... I feel like this podcast is like the rough draft of a conference talk, because it's like, imagine that you want to run a program. What do you do? And you just sort of work up from the bottom.
And then I think that's like... When was the last time you gave a conference talk? Come on, it's your turn.
Oh, it's been a long time. I'm probably overdue.
Very much part of last week's conversation: the reason I was looking into that was because I was avoiding writing several conference talks that I have to give in about a month's time. And a week has passed since we last spoke. Now I'm giving away all of our secrets, although much longer will have passed in real time, and I'll probably have given the conference talk by the time I've released this. So, listener, you can be the judge of whether it was any good or not. But yeah, I have done no work on it at all. So, oops. But yeah, this is a rough draft of a conference talk on it. It is: so you want to deploy a service.
So, you know, exactly. Yeah, you want to run some software, right? How are you going to do it? And I feel like the punchline of this is like, okay, and now we're migrating this all to the cloud and we're going to use Terraform. We're going to use GCP or... Or maybe you have, like, you know, a lot of companies, I feel like, these days have essentially an internal cloud. They're still using Terraform, but they're using tools like Nomad, and they have their own physical servers, and they have an infrastructure team that's managing it all. And this maybe leads us back. This is how you get this. Okay, this is the whole website. This is how you get into the state where you're just like, yeah, I just, like, changed one function with some unit tests and pushed a PR, and I have no idea where it goes.
Yeah, that's exactly right.
Yeah.
Uh-huh.
Yeah.
Well.
And now the circle is complete.
And now the circle is complete.
Yeah.
I think we've, we've probably reached a good spot then.
Yeah.
It's good to know these.
I think, like all of these, everything we talk about, really, certainly everything that I hold dear that we talk about on this podcast, is all about finding the right level of abstraction, knowing that there's a level beneath you. In this case, you know, maybe your level of abstraction is those cloud tools that we've just been talking about and the services they're on.
But knowing enough about the level beneath you to say, like, okay, I do know that there are processes that run, and that something is taking care of the input and output for those processes and making sure the right signals get to them at the right time, and not the wrong things, like me logging out.
I know that it exists, and maybe I could sketch something, but I don't necessarily know it off the top of my head. And then you should know, beneath that, that something exists, right? Beneath that layer, we know that there is a systemd, and I don't know how that works. But it's always good to have a decent understanding of the level beneath where you're working and then be aware of the layer below that.
Right. Know vaguely what to Google.
Right. Or ask ChatGPT.
Or ask your favorite. Yeah. Ask your favorite. Yeah. Yeah. And so I think this plugs into that kind of mindset completely.
Yeah. Yeah. It's kind of like, know how the cloud works, and then know where to look when it doesn't work.
Mm-hmm.
Mm-hmm.
Yeah.
Like, honestly, the only downside to this is that in those environments, I feel like, where you have those, you know, a million layers of abstraction between you and the physical server, if you're an old fuddy-duddy like us and you're like, can't I just SSH in?
It's like, no, you can't have root.
It's like, why?
I know exactly what to do.
I know exactly how to fix this problem, and now I'm going to have... okay, fine, sure.
Yeah. Well, and of course the irony is they can probably give you root, but it's not even on the real computer, because it's several layers of virtualization away from the machine that's actually running.
You talk about it, it's like, it's running in a container service. There's no root to give you. Like, you can't get there from here.
Right. Yeah, yeah, yeah. Cool.
All right, friend. Well, this has been great.
Yeah, we jammed it. We did it. Not bad for winging it.
Yeah. Listener, you can let us know. Post a comment somewhere. I mean, some people watch this on YouTube, and that's where I see most of the comments. And then otherwise tweet at us, or the Hachyderm.io Mastodon-y thing, or just email us.
But we'd love to hear
what you think and what we're doing
right and wrong. We've never really asked that.
We just do this for us.
This is just our excuse to catch up, isn't it?
Yeah, that's true.
Cool. All right, friend.
Well, have yourself a great weekend.
And I'll speak to you soon.
Until this.
You've been listening to Two's Complement, a programming podcast by Ben Rady and Matt Godbolt.
Find the show transcript and notes at www.twoscomplement.org. Contact us on Mastodon. We are @twoscomplement@hachyderm.io.
Our theme music is by Inverse Phase. Find out more at InversePhase.com.
I don't know.