Screaming in the Cloud - Helping Securing the Python with Mike Fiedler
Episode Date: December 5, 2024

In this episode of Screaming in the Cloud, Corey Quinn is joined by AWS container hero and security engineer at the Python Software Foundation, Mike Fiedler. They delve into the intricacies of Python's ecosystem, discussing the evolution of PyPI, its significance, and the ongoing battles against security threats like account takeover attacks and typo-squatting. Mike sheds light on his role in maintaining the security and reliability of the Python Package Index, the importance of 2FA, and the collaborative efforts with security researchers. Corey and Mike also explore the challenges and philosophies surrounding legacy systems versus greenfield development, with insights on maintaining critical infrastructure and the often-overlooked aspects of social engineering.

Show Highlights
(0:00) Introduction
(0:47) The Duckbill Group sponsor read
(1:21) Breaking down the Python nomenclature and its usability
(5:49) Figuring out how Boto3 is one of the most downloaded packages
(6:43) Why Mike is the only full-time security and safety engineer at the Python Software Foundation
(9:53) How the Python Software Foundation affords to operate
(14:17) Mike's stack security work
(16:14) The Duckbill Group sponsor read
(16:57) Having the "impossible job" of stopping supply chain attacks
(21:00) The dangers of social engineering attacks
(24:44) Why Mike prefers to work on legacy systems
(33:30) Where you can find more from Mike

About Mike Fiedler
Mike Fiedler is a highly analytical, forward-thinking Information Technology professional. His broad-based background includes systems administration and engineering in global environments.
Mike is technically astute and versatile, with the ability to quickly learn, master, and leverage new technologies to meet business needs, and has a track record of success in improving performance, stability, and security for all infrastructure and product initiatives. Mike is also bilingual, speaks English and Hebrew, and he loves solving puzzling problems.

Links
Mike’s Mastodon: https://hachyderm.io/@miketheman
Mike’s Bluesky: https://bsky.app/profile/miketheman.com
Mike’s Python Software Foundation blog posts: https://blog.pypi.org/
The Python Package Index Safety & Security Engineer: First Year in Review: https://blog.pypi.org/posts/2024-08-16-safety-and-security-engineer-year-in-review/

Sponsor
The Duckbill Group: duckbillgroup.com
Transcript
social engineering or social acumen, I think is very important to exercise because at the end of
every security pipeline or security process are humans.
Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined today by longtime friend and
first-time guest somehow, Mike Fiedler, who is, among other things, an AWS container hero, which is far from
the most interesting thing about him. His day job is the PyPI safety and security engineer at the
Python Software Foundation. Mike, thank you for joining me. I'm surprised you could find the time.
I thought you had people all busy fixing dependency problems. Thanks for having me,
Corey. It's great to be on. I also cannot believe that I've never
been on before, but it's great to be here. This episode is sponsored in part by my day job,
the Duckbill Group. Do you have a horrifying AWS bill? That can mean a lot of things.
Predicting what it's going to be, determining what it should be, negotiating your next long-term
contract with AWS, or just figuring out why it increasingly resembles a
phone number, but nobody seems to quite know why that is. To learn more, visit duckbillgroup.com.
Remember, you can't duck the Duckbill bill. And my CEO informs me that is absolutely not our slogan.
I have to start here. It's a little bit of a confusing ecosystem, to put it gently. I use
Python when I want to get work done. It's very much not something that I've ever really peeked
behind the curtains on. It's just there. I type pip install or poetry install or one of the other
five ways of installing dependencies, all of which hate each other. And that sort of gets the job
done and I move on with my life. What is PyPI versus Python, the software language versus the Python Software Foundation
versus a big snake?
Fair enough.
Let's try and disambiguate by starting kind of from the beginning.
Python, the language, is a standards-driven open source language invented in 1991, I want
to say, and it's been around for about 30 years. And it's been in
constant evolution. There's a bunch of really committed core developer contributors that have
been volunteering their time and effort to develop the Python language ever since. That is the tool
that most people use to get their work done. Most people who have been around for a little
while are familiar with the mass migration of Python 2 to 3,
which for almost all of us consisted almost entirely
of replacing print followed by a space and quotation marks
with print, open parentheses without a space,
and then the quotation marks.
Yeah, the 2 to 3 migration was a challenging one for everybody
to the point where amongst the core developers,
I think there's a tacit agreement of there will be no Python 4
because it hurt the world so much to make the 2 to 3 migration.
That said, that doesn't mean that the language doesn't evolve and things don't change.
It just changes at a slower pace and kind of has a long tail of support.
I believe five years for every Python major revision is supported now.
It's kind of impressive on some level. Well, who uses Python? Everyone. Freaking everyone uses
Python for something somewhere. It's probably one of the most approachable, user-friendly languages
out there. You can read the code and it does almost exactly what you would expect most of
the time. Yes, yes, yes. You can do obfuscated Python if you want to be particularly obnoxious or, you know, edit someone else's code
until it turns out that way if you want. But it's very approachable. And on some level, it's like,
is this actually how it works? It feels like it's too easy. Yeah, I think that's one of the things
that has led to Python's major adoption across the world. At GitHub Universe earlier this year, they announced that
according to their kind of measurements, Python has overtaken JavaScript as the most popular
language on GitHub. With the caveat of they broke out TypeScript separately from JavaScript,
as I recall the internet drama of the hour. Oh, I do not recall that, but I'll take the win.
Hey, you do super well when they kneecap the competitor. It goes great. Works out for me. And stats, right? But the TIOBE index, which has its own measurement
of language popularity, has been tracking Python for a while, and it has been number one for a
number of years. We'll drop a link to that in some show notes. But the thing remains that we've put
Python on the moon, sorry, on Mars, on the Mars helicopter.
So that is powered by Python. Things all over the planet, whether that be industrial machinery or
tools that build industrial machinery or your doctor's office or games you want to play with
your kid. So all of these things are built in Python. So it's quite pervasive, largely because it is that approachable, but also because there's a multitude of user extensions or projects, aka dependencies, libraries, packages.
There's lots of names for the same thing that exist.
So you can kind of pick up a well-formed Lego block that does most of what you're doing, plug it in,
and keep on going and doing what you're doing.
Yeah, for those who heard of Boto3,
that is the AWS SDK
or extension for Python.
There's a reason you don't have
to wrap every request
with its own SigV4 signing
or wind up constructing
your own HTTP
response parser.
You just tell it to do the thing,
and it does the thing, and it's glorious.
Yeah, so Boto3 is one of the most widely downloaded packages
in the Python ecosystem
because it's so widely used to interact with AWS.
That could be on servers.
It could be on CLI tools.
It could be inside Lambda functions, which run Python,
that want to re-interact back with AWS APIs.
So I have to ask, this probably ties back to what you do for your day job.
How do you know that it's one of the most widely downloaded packages?
So I work on PyPI, which is an acronym for the Python Package Index at pypi.org.
You mentioned that you would pip install or poetry install or use one of the many
other tools to download
and install Python. I wave my hands in the air and
fret until someone comes and fixes it for me, like a
toddler. Hey, whatever gets the job done,
right? And so
I asked this at a data science
networking event of like, okay,
when you run pip install inside your
Jupyter notebook, where do you think that comes from?
And they were like, I don't know, it just works. Jupyter. Duh. Yeah. You already said it
was on Mars. What's the problem here? Could be on Jupiter. But I was like, okay, I work on the
thing that gives you those packages reliably and kind of securely every single time. I'm not the
only one who works on it, but I'm the only one who works on it full time. You are the safety and
security engineer.
It feels like, well, we only need one problem solved, which always feels like a weird
role to have existing in isolation.
But it's interesting as well that they have someone devoted full time to caring about
this, because on some level, it doesn't sound that hard.
It's basically a big static website that gets recompiled whenever someone updates something,
which is probably frequently. OK, that adds a little bit of complexity.
But where's the hard part?
Does that require a full-time security person?
Oh, wait, it links to arbitrary third-party things that wind up effectively turning into
remote code execution.
You can trick people into including a line that says import some magic string.
OK, this starts to be a little more interesting. Even as I'm thinking
about this, like, oh, that's why you're there. How is there only one of you?
Yeah. So the index itself, PyPI, has been in existence in one form or another for about 20
years. It's gone through a bunch of different hands, rewrites, and kind of re-imaginations.
The reason it's called an index is because it used to only be an index, just a page, an HTML page with links to other people's storage of software.
And over time, because folks don't have hosting capacity or didn't want to kind of keep those packages up in perpetuity, we moved over to a hosting mode where anyone across the Internet can upload a package to PyPI.org
and other folks can pip install and download it
without having to worry about where is this from.
It feels like the original version could have just been like a script
that runs on Cron every five minutes.
It grabs all the packages and creates an index file.
And I'm sure, to your lasting shame,
that script was no doubt originally written in Perl.
So it needed to be fixed immediately. And here we are. I'm not going to go down that path because I don't, I'm not as
familiar with the super legacy code, but most of it that I have seen, it has been written in Python.
You do eat your own dog food. Yes, you got to. But the permutations of this have kind of evolved
and it's been largely volunteer-driven: volunteer contributors,
volunteer admins, for this massive service that basically underpins a non-small part of the
internet and the operations. Now, there are plenty of mirrors out there, and mirroring software that
you can set up for your own corporation and your enterprise to say, we would like to, you know, keep
our own copy of the
index locally. So that way, if there's any problem upstream, we still have all those packages.
And most folks should do that because then you are owning your own availability. But for the
vast majority of open source consumers out there, which are most people, there's not a need to build
your own index. So we make sure that it is up and running.
We can't do that on our own without a lot of generosity because the Python Software
Foundation is a non-profit, and my role in particular is funded through generous donations
from folks like Amazon Web Services and then next year through the Alpha Omega Foundation.
But the ability to invest in keeping critical infrastructure up and running is not an
easy task that you can just hand wave and say some company will handle this.
No, and even the raw cost of this would bankrupt most folks. I don't know where you actually host
these things. It somehow never occurred to me to look at, okay, when you're downloading from the
local mirror, where is that mirror resolving to exactly? Because if it's one of the major cloud providers,
egress fees are expensive. Yeah, if I download a gig, it'll be nine cents and I'm not much of a
user, but you know, I have a dumb provisioning process that redownloads every single time the
container runs. Oops. And there's a lot of idiots like me out there doing the exact same thing. This can
become a massive denial of wallet attack if you're not, I guess, conscious about how these things are
supposed to work. So we are conscious about it. And again, we've gotten some great donations.
So our infrastructure is donated largely by Amazon Web Services and Google Cloud through
their open source contribution program. But the actual egress that you are consuming when you are re-downloading the same package again and
again is a donation from Fastly through their Fast Forward program for open source and non-profit
users. So we are achieving close to a 99.95% cache hit ratio through the Fastly network.
So most of the calls of the, I don't know, 60,000 requests per second that are happening against PyPI's APIs are going to hit a Fastly cache and never phone home all the way back
to us.
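As a rough illustration of the arithmetic behind those figures, here is a minimal sketch. It assumes the "99.95" quoted above is a percentage and that the roughly 60,000 requests per second is a steady rate; the function name is hypothetical and this is not PyPI's or Fastly's tooling.

```python
def origin_requests_per_second(total_rps: float, cache_hit_ratio: float) -> float:
    """Estimate how many requests per second miss the cache and reach the origin."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_rps * (1.0 - cache_hit_ratio)


# Figures quoted in the conversation, treated as approximate.
total_rps = 60_000
hit_ratio = 0.9995  # 99.95% of requests answered from the CDN cache

origin_rps = origin_requests_per_second(total_rps, hit_ratio)
print(f"about {origin_rps:.0f} requests/second reach the origin")
```

Under those assumptions, only a few dozen of the 60,000 requests each second would travel all the way back to the origin servers.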
Which is phenomenal.
If you can service a request as close to the customer or user
as possible, go ahead and do it.
Sorry, they're not customers.
They're users.
If they're not paying you, they are not customers.
That is going to be a horrible revelation for a whole bunch of freemium model startups
someday.
Yes, we do not charge money for usage.
There is potentially some paid thing for corporations and organization features that is kind of
in the works, but that is not live yet.
So yeah, we exist currently 100% on kind of inbound donations.
The ability to do so, again, is made possible by our generous donors, but it's also made
possible by folks like myself and other contributors talking about what we're doing and publicizing
the ability.
And at the end of each year, for our impact report at the Python Software Foundation, I try to give them some
stats about how much we've grown in our usage over the years. And it's kind of doubling every year,
which is crazy to think about. Yeah, that's right. It was Daniel Stenberg who first really drew my
attention to this problem years ago. He is the original author of curl, you know, that library that's used by freaking everything every time you need to make a web request. And it turns out he was doing most of this as a labor of love and trying to figure out, like, what do you do for a day job? It's like, that is absurd. Like the old XKCD of the entire tower of blocks built on one thing maintained by someone in Nebraska, it's very much that type of approach. And, oh, right, you have to make a little bit of noise, especially when you have something like this, in order to get the resources and attention it needs. Because otherwise, look at all the browser extensions that are abandonware and then get purchased for some paltry sum of money by malware authors, and they just go ahead and inject
whatever they want. Everyone's browser still has that configured. That's a terrifying threat
vector. It really is. And the notion of that exists in the universe of PyPI, but it's less
so about projects being abandoned. And the things that we've tried to protect against are account
takeover attacks. So we referenced Boto, right? Boto is
managed by the release team who kind of uploads new versions to PyPI.org on schedule. And every
single one of the folks who has access to that project could become a phishing target. And if
they can get phished, they can get their accounts kind of compromised, and then someone else could be uploading Boto,
and that's a problem.
So in order to kind of prevent that, we enacted a couple of years ago and finally forcefully put our foot down in 2024 to require 2FA for all user accounts.
So a second factor for authentication prevents the casual account takeover attacks.
Again, no security is perfect, but we move
forward on our progress to getting more secure solutions out there so that the universe will
be a little more secure every single iteration we go at it.
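To make "a second factor" concrete, here is a minimal sketch of one common kind: a time-based one-time password (TOTP, RFC 6238) built only from the standard library. The conversation says only "2FA", so choosing TOTP here is an assumption for illustration. This is not PyPI's implementation, and the secret below is a made-up demo value.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time password using HMAC-SHA1."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time // step)  # number of time steps since the Unix epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, at_time: Optional[float] = None) -> bool:
    """Compare a submitted code against the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at_time), submitted)


demo_secret = b"hypothetical-shared-secret"  # illustration only, never hard-code real secrets
current_code = totp(demo_secret)
print(current_code, verify(demo_secret, current_code))
```

The point of the design is that a phished password alone is not enough: the attacker would also need the shared secret, or a code generated within the current 30-second window.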
I guess a question I probably should have asked a little bit earlier in this conversation is when
you talk about security, especially when you're talking about something that I perceive to be relatively low-level infrastructure,
my naive assumption at first was,
oh, you're just the person that makes sure
that the web server stays patched
and, you know, doesn't have the SSH port
just hanging out there,
flapping in the breeze for everyone to connect to.
Sounds like what you're doing is a lot more up the stack.
It is.
I do kind of work throughout
the different layers of the stack.
You do that too, though.
Well, we do have our director of infrastructure and an infrastructure engineer who work on not just the PyPI universe. They work on everything Python Software Foundation. So, you know, they can dedicate a fraction of their time to PyPI. So I'll collaborate with them on certain parts of the stack. But in the kind of application side and all the way up to the client side, that's kind
of where I live.
And I try to find new ways to prevent the problems from happening.
So problems happen.
Folks do register for an account.
It's a free service.
They'll upload some malware.
They'll abuse the service. And we also partner with a variety of volunteer security researchers.
And these are folks who may have their own security research companies, but a lot of them
are just an email that have kind of expressed some interest and can report back to us,
hey, this new thing that somebody uploaded 10 minutes ago looks really smelly, and here's why.
And then we've developed the kind of messaging and pingback and workflows to get that information into a PyPI admin's hands as soon as possible
to be able to get the context, react quickly, and take that down, put it in a quarantine state or take it down completely to prevent anybody from accidentally
falling for a, you know, Discord message that says, here, you should run this pip install
command and that'll solve all your problems because that's a huge vector. Here at the
Duckbill Group, one of the things we do with, you know, my day job is we help negotiate AWS
contracts. We just recently crossed $5 billion of contract value negotiated.
It solves for fun problems such as how do you know that your contract that you have
with AWS is the best deal you can get?
How do you know you're not leaving money on the table?
How do you know that you're not doing what I do on this podcast and on Twitter constantly
and sticking your foot in your mouth. To learn more,
come chat at duckbillgroup.com. Optionally, I will also do podcast voice when we talk about it.
Again, that's duckbillgroup.com. It seems like there's very much a taking for
granted of all of these things. Like the idea of a supply chain attack feels like it's this esoteric,
very remote, very obscure thing
until it happens to you
and you realize just how easy it is
for something like that to happen.
You basically have an impossible job.
Yeah, I mean, every job,
if you frame it like that, is impossible.
But I think the hope that we have
is that by raising the visibility
of potential supply chain attacks, we get people
interested in, okay, well, what do I do about that? So we can go into that in a bit. But there's also
the like, it won't happen to me, it will happen to somebody, and it might not happen to you,
it'll happen to somebody near you, and then they'll move laterally and get you, right? So this is
something that attackers love to do is find some sort of injection point
and then move laterally. So it's like, even if I can kind of get you to install a Discord thing,
guess what? I'm now Corey Quinn on Discord and I can ask other people to do more serious things.
So just because I got a Discord token doesn't mean that that's the end of the attack.
Let me ask you this, and feel free to tell me that it is not something you can discuss,
and that will be fine. But let's say I have an evil, nefarious idea that I want to trade in
the latest version of cryptocurrency, AWS credits. So what I'm going to do is I'm going to publish a
package, Boyo3, just change the T to a Y, a one-key-off
typo. Given the scale of it, I'm sure people do it all the time. And then I'm going to upload a
direct clone of Boto3 with one minor change, in that it just takes any environment credentials
or keys it finds and sends them to me. Where does that break down as I start down that path, or does it?
Conceptually, it does not break down. The mechanism that you would use in order to get
that Boyo3, that is what we would call a typosquatting attack. So you are squatting on a name
and hoping that somebody typos that. For any company or any individual who is installing
packages, we often recommend using a hash-based lock file from a tool like Poetry or an extension called
pip-compile or pip-tools or any number of the other tools out there to generate a,
these are known hashes to this file. GitHub's Dependabot and other kind of dependency update
tools recognize these lock files, and they will go out and check out and do updates for you.
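The idea behind a hash-based lock file can be sketched in a few lines: record a digest of the exact file you expect, and refuse anything that does not match. This is a minimal illustration under assumed names (`sha256_hex`, `matches_pinned_hash`, and the byte strings are all hypothetical), not how pip, Poetry, or pip-tools implement it.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a downloaded artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_hash(data: bytes, pinned_hashes: set) -> bool:
    """Accept an artifact only if its digest is one recorded in the lock file."""
    return sha256_hex(data) in pinned_hashes


# Stand-ins for real package files; a lock file would store the digest, not the bytes.
genuine = b"pretend these are the bytes of a wheel file"
pinned = {sha256_hex(genuine)}
tampered = genuine + b" plus something malicious"

print(matches_pinned_hash(genuine, pinned))   # True
print(matches_pinned_hash(tampered, pinned))  # False
```

Because the digest is tied to the file contents, a different upload under a similar or even identical name would fail the check rather than install silently.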
So as long as you get that first package in correctly, spelled Boto3, you are unlikely
to ever get Boyo3 because you are not typing pip install Boyo3. But for the offhand one-off people
who are doing so, or if you are using that in a Dockerfile command that does not have hashes
and you did a typo, yeah, that's going to be a problem. So what we try to do in that situation
is once it's reported to us, we get it off the index. And we often prohibit that name from
further use because it is now known to be malware. So even if you got it, the next time you run your
pip install command,
even with the typo, it should fail. And at that point, you should hopefully notice.
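One way to picture why a name like Boyo3 is suspicious is to measure how few keystrokes separate it from a well-known name. The sketch below uses plain edit distance. The function names and the short "popular" list are assumptions for illustration, and this is not the rule set PyPI admins or the volunteer researchers actually use.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def possible_typosquat_targets(candidate: str, popular: list,
                               max_distance: int = 1) -> list:
    """List popular names the candidate is suspiciously close to, excluding exact matches."""
    name = candidate.lower()
    return [p for p in popular
            if p.lower() != name and edit_distance(name, p.lower()) <= max_distance]


popular_names = ["boto3", "requests", "numpy"]  # hypothetical watch list
print(possible_typosquat_targets("boyo3", popular_names))
print(possible_typosquat_targets("boto3", popular_names))
```

A check like this only flags candidates for a human to review; plenty of legitimate packages have names that sit one character away from another.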
We also take these takedowns and report some of them out to a malicious-package database or an advisory
database that folks can use with something like pip-audit to see, am I using any of these files? Known
problems. The vast majority of the typosquats
don't make it in there
because it's just a lot of noise.
At some point they may,
but they don't today.
The final kind of piece
is that we have these
wonderful volunteer researchers
who will try and profile the attackers
and kind of build up a
this is what this looks like
and update their scanning rules
to kind of listen to the PyPI feed
and see, is this a new one that looks like that? Yeah. Okay. Let's report it. And that makes our
lives as admins a little easier because we've seen this kind of attack before. We know this reporter
that we can kind of match things a lot faster. There's lots of focus on when you're chasing
these things down. You have that aspect of it. The social engineering attack that you alluded to a few minutes ago, about impersonating someone on
Discord, that was wild. Back when I was Freenode network staff many years ago, we had
someone pretending to be Matthew Prince, CEO of Cloudflare, trying to get us to mention in
passing what the origin was of something that Cloudflare was protecting. Yeah, that doesn't
seem like the sort of thing that he would actually ask
because he's not, you know, an idiot.
Imagine that.
So that took a fair bit of thought there
because I'm used to running stuff
that no one really cares about.
When you're talking about running something
that everyone uses,
on some level, everyone has to care about it,
but most people don't even know that it exists.
Yeah, that's, I think,
one of these kind of interesting problems, right?
I know you had done an episode on remote development environments, right? So using remote development environments is
a way to sandbox your development environment, your code, to say, you know what, there's nothing
in this sandbox that shouldn't be there already. So if I am installing untrusted code, it can only
affect whatever's in there.
Now, again, that might be more than you want, but at least it's sandboxed.
There's other methods that are around social engineering that are fascinating to think about.
I do remember one company I was at, at a company all hands, the CFO stood up in front of the entire
company and said, I will never text anyone to
wire me money. So if you ever see a text from me for a wire, that is not me. Do not trust it.
Report it immediately to our IT team. And it's like that kind of social engineering or social
acumen, I think is very important to exercise because at the end of every security pipeline
or security process are humans.
And we're just trying to do the best we can.
We've seen that here at the Duckbill Group, a targeted spear phishing attack where a bunch of messages from someone pretending to be Mike, you know, my business partner and the CEO, were sent to three
or four people here (not me, suspiciously), asking to have, I think it was, iTunes
gift cards sent.
And we'd already spoken to the team.
It's like, great.
You know how normally,
if he wants to do something weird,
we'll talk about it in Slack.
Yeah, if you can get access to our Slack as him,
congratulations.
You probably earned it at that point.
But just some, this is Mike.
It's my new phone number, my personal phone.
Yeah, sure it is, buddy.
Sure.
I think that that's a far overlooked part
of all security is, you know, does this make sense,
right? Did what I just get make sense to me? Should I verify it? Should I blindly accept it?
And I think that that's the same thing that we should apply to software packages and things we
install is, do I know where this came from? You know, a little bit farm to table, as it were.
Let's look at the ingredients that I'm about to consume.
If I'm doing a one-off BS script in a cloud environment, that's just trying something out.
Maybe it's not that critical, but if I'm building my team, my project, my company's infrastructure,
let's take a more critical eye to what it is we're doing and, you know, secure our environment. Maybe we should not be able to make network calls
externally to untrusted sources. That's an old school, you know, outbound firewall rule that is
pretty easy to kind of conceptualize. And, you know, maybe we should apply things like that to
secure our teams. So there's a variety of different tactics in play, all to kind of make security a little bit easier for folks.
I want to make sure that whatever is on the index is relatively trustworthy. So, you know,
there are well over half a million packages and, you know, 12 million releases of those packages
out there today. I can't monitor all of them, but we have a great network of folks who are doing
that. One other topic that
I wanted to get into with you. It's something that you said that I thought was just shockingly
objectionable enough to be worth talking to you about here. And how did you phrase it? Specifically,
that you were talking about legacy systems and that you found it to be more satisfying to work
in legacy environments than do greenfield development. Now, before we dive in,
yes, of course, legacy is that condescending engineering term for it makes money somehow.
But where do you come from on this, given that you work for a nonprofit, you definitionally do
not make money with it? Yeah, I've worked at a variety of different companies throughout my 30
years in this ridiculous industry that we all call work. And something has been true. The word legacy is
often used, as you said, kind of as disparaging as this is bad. But legacy is the thing that
is left behind, right? Software that is not useful gets replaced. Software that is useful
is still around. Sometimes people don't like it, and that's a different problem. But it's still useful. Otherwise, it wouldn't be there. And working on systems that are useful, I find the most satisfaction because somebody is deriving value out of this. channel to a B2B integration or a website feature that is only used by 3% of our clientele,
it's still useful. If it wasn't useful, we should get rid of it. And with a legacy system,
you have a lot of the pieces in play that you already need to make that money or to get that
goal. Whereas greenfield development is far more
aspirational and unknown. You're going to hit new and interesting bugs. I'd like to hit well-debugged
systems and kind of evolve them over time, as opposed to saying, let's jettison this thing and
start a brand new stack because you're going to spend more time debugging things you had already
debugged. One thing that I've always had a keen appreciation for that I think transfers
is as a consultant,
you have to have respect for what came before.
There's a reason things are the way that they are.
There are constraints that may have been non-obvious.
And whenever I'm dealing with existing code bases,
okay,
why was it written like this?
There's probably a reason that I'm not immediately seeing before I go in and
start ripping everything out, maybe read through it once or twice, just to get a vague idea of what it is
you're about to do here. Because what you're doing now works, sort of. And you want to make it do
something different, maybe understand what it's doing now so you don't inadvertently leave it in
pieces on the floor. Yeah, I think Martin Fowler goes a lot into that in his book Refactoring. And I think where a lot of people
fall on this whole "legacy systems are bad" thing is, I don't understand it enough, or it doesn't feel
good in my hand, right? Or the modules aren't constructed the way I would like them to be.
Guess what? You can refactor it. You can change it. You can modify things to make them nicer. Sandi Metz, the author
of Practical Object-Oriented Design in Ruby, had a phrase that I really liked. I don't know if that
was the origin, but it was, make the change easy and then make the easy change. Sometimes making
the change easy is actually very hard, but if you make the thing that you want to do, like the second part, easy to change,
then next time you need to change it, it will likely be easier to change. So constant refactoring
is something that a lot of teams that I've worked with have kind of bemoaned. The product managers
are never going to let us have time to refactor. It's like, that's part of the work. It's not like
a ticket to refactor.
It's, you're touching this module. You need to bake in an extra X amount of time to refactor
to make the easy change. And part of that too, is there's the idea that when I talk to customers
all the time, or clients as we call them when we're actually a professional services company,
we've never really seen, except in extreme situations, someone rewriting an
application for cost purposes. Someone did that once when they were on, once upon a time, Google
App Engine, realizing Bigtable charged per write. They were hemorrhaging 800 grand a month for what
became a quarter million dollar AWS bill. Great. So get off of that and move. Sure. Most of the
time it doesn't happen that way. So they'll care about
cost and do a refactor in the next version of the architecture. So you can bake that in,
but almost never are they going to do a rewrite just to save money, nor should they, frankly.
Well, it's one of those things that I've had to make this decision a few times throughout my
career of like, when is the right time, right? If it's for cost, well, what else could we be doing?
And is this costing us more? How many engineers are we going to have work on this? How much
downtime? How many outages are we willing to take? All of them.
Until it's done. And how much will that cost us? Oh, wait, that number is much higher than just
living with the cost of an inefficient system for
another few years.
Yeah, okay, let's not do it yet, right?
But let's set ourselves a marker to say, in two years or whatever their time frame is,
let's reevaluate these pieces and see if they are still true and if they exceed the threshold
now.
And are we willing to invest this amount of money on renovating a house, basically, right?
You have to do a gut reno and pull pieces out and move walls. That's pricey. And where are you going
to live in the meantime? So all of those things have to be true in order for you to actually make
a fiscal argument for a refactor. But if you are constantly doing a little bit of refactoring every pull request to
say, ah, this module is sharing enough code with three other modules, let me pull out a helper
function and reuse it. Oh, great. Now we have the right abstraction. We have not prematurely
abstracted. We have the right abstraction and it costs us less to maintain it because we've
written the test around the helper function
instead of the top level function.
So writing the tests is even easier.
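To make the helper-function point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and was not discussed in the episode: the three callers (`download_key`, `project_url`, `same_project`) stand in for "three other modules" that each used to inline the same logic, and `normalize_name` is the helper pulled out of them. The test targets the helper directly rather than the top-level functions, which is what makes it cheap to write.

```python
import re


def normalize_name(name: str) -> str:
    """Shared helper (hypothetical): lowercase a project name and
    collapse runs of '-', '_' and '.' into a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Three illustrative callers that previously each carried their own copy
# of the normalization logic and now reuse the helper instead.
def download_key(name: str) -> str:
    return f"downloads:{normalize_name(name)}"


def project_url(name: str) -> str:
    return f"/project/{normalize_name(name)}/"


def same_project(a: str, b: str) -> bool:
    return normalize_name(a) == normalize_name(b)


def test_normalize_name() -> None:
    # One small test around the helper covers the behavior all three
    # callers depend on.
    assert normalize_name("Friendly_Bard") == "friendly-bard"
    assert normalize_name("My..Package") == "my-package"


if __name__ == "__main__":
    test_normalize_name()
    print(project_url("Flask.SQLAlchemy"))
```

The abstraction here is extracted only after the duplication already exists in several places, which is the "not prematurely abstracted" condition described above.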
The challenge, of course, is the other side is this does act as a form of technical debt.
The CDK is great when you're trying to build something out in a reasonable recurring way.
But where it falls down, at least for me, is, okay, I built and deployed a thing.
Now I don't have to touch it again for two years.
Great.
Now I do it and it is screaming and whining at me every step of the way when all I'm trying
to do is change a single line of code, maybe even a text string.
Oh, that version of Node or Python is deprecated.
We won't support that in Lambda.
Oh, your versions of everything are ancient, update them.
Oh, now you have critical breaking dependencies.
It's like, I just want to change a single line of code.
I don't want to refactor everything to be modern, but I'm faced with no choice when that happens.
It's very frustrating. That is a frustration I have had myself. And one way I found to combat that is to take kind of this approach of if it hurts, do it more often. So, you know, in this
case of like, you do it more often, so that way it doesn't hurt as much. So for me, I've got Dependabot running and pushing updates and have a CI/CD pipeline for
some of my projects that I trust enough that I can just merge and deploy.
And if it passed CI, I am happy enough with it.
So then you can automate pushing that button as well.
And then you can just have stuff flow through your system.
And then when you come back to it two years later, it's relatively fresh.
That's a good way of doing it. The challenge, of course, I'm doing this across a bunch of
different languages simultaneously. Everything requires its own approach. I've been using asdf
as a version manager, which is universal. But then that conflicts with a bunch of things in some
particular Python and other language workflows. And all of it's a disaster. Nobody's happy. I'm
starting to love ephemeral dev environments for this explicit purpose, but then nothing I need
is local there. And I need to install the universe again. And computers are hard. That's why we have
jobs. People are harder. That's why you have jobs. People are very hard. People can be great and the
worst, right? Like they fall on the spectrum of everything. Software is the same. I think one of the reasons we have so many package management tools is because we're
standards driven as opposed to one tool opinion driven.
The standard is the PyPI index will have an HTML page that has all the files, versions,
and all the hashes contained in them.
That's a standard.
You can build your own tool if you don't like it based on those standards.
And that's how we get tools that it's like, well, it does what I want it to do. And it's like,
great, it will work. And we commit to kind of having that contract with our consumers.
So that way, if you want to use Poetry, if you want to use pip, if you want to use something else,
that's your prerogative, but that's also your problem. There are no guarantees. So anybody saying,
I'm going to guarantee that this won't conflict, they're probably trying to sell you something.
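As a rough illustration of "build your own tool based on those standards," the sketch below reads a project page shaped like the HTML index described above: one anchor per file, with the hash carried in the URL fragment. The sample page, the `files.example.invalid` host, and the names `SimpleIndexParser` and `parse_simple_page` are assumptions made up for this example, and it parses an inline string rather than fetching anything from PyPI. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

# Made-up sample of a per-project index page: each <a> is one file,
# and the URL fragment carries "<hash algorithm>=<digest>".
SAMPLE_PAGE = """<!DOCTYPE html>
<html><body>
<a href="https://files.example.invalid/demo-1.0.tar.gz#sha256=aaa111">demo-1.0.tar.gz</a>
<a href="https://files.example.invalid/demo-1.0-py3-none-any.whl#sha256=bbb222">demo-1.0-py3-none-any.whl</a>
</body></html>"""


class SimpleIndexParser(HTMLParser):
    """Collects filename, URL and hash for every anchor on the page."""

    def __init__(self) -> None:
        super().__init__()
        self.files = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            url, fragment = urldefrag(self._href)
            algorithm, _, digest = fragment.partition("=")
            self.files.append({
                "filename": "".join(self._text).strip(),
                "url": url,
                "hash_algorithm": algorithm or None,
                "hash": digest or None,
            })
            self._href = None


def parse_simple_page(html: str) -> list:
    parser = SimpleIndexParser()
    parser.feed(html)
    parser.close()
    return parser.files


if __name__ == "__main__":
    for entry in parse_simple_page(SAMPLE_PAGE):
        print(entry["filename"], entry["hash_algorithm"], entry["hash"])
```

A real installer would go on to download each file and verify it against the listed hash; that step is left out here to keep the sketch short.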
I really want to thank you for taking the time to speak with me about all this.
If people want to learn more, where's the best place for them to find you?
I am on Mastodon on the Hachyderm site. I'm @miketheman on Hachyderm. I'm miketheman.com on Bluesky. And you can also
read all the blog stuff I write on PyPI at blog.pypi.org. And we will include links to all
of that in the show notes. Thank you so much for finally agreeing to sit down with me and suffer
the slings and arrows. I appreciate it. Absolutely. Mike Fiedler, the security engineer at the Python
Software Foundation.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice.
Whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice,
along with an angry comment, including a screed on why we should be using Rust instead.
Please be sure to link to your PowerPoint deck.