Screaming in the Cloud - Helping Securing the Python with Mike Fiedler

Starting point is 00:00:00 social engineering or social acumen, I think is very important to exercise because at the end of every security pipeline or security process are humans. Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined today by longtime friend and first-time guest somehow, Mike Fiedler, who is, among other things, an AWS container hero, which is far from the most interesting thing about him. His day job is the PyPI safety and security engineer at the Python Software Foundation. Mike, thank you for joining me. I'm surprised you could find the time. I thought you had people all busy fixing dependency problems. Thanks for having me, Corey. It's great to be on. I also cannot believe that I've never

Starting point is 00:00:45 been on before, but it's great to be here. This episode is sponsored in part by my day job, the Duckbill Group. Do you have a horrifying AWS bill? That can mean a lot of things. Predicting what it's going to be, determining what it should be, negotiating your next long-term contract with AWS, or just figuring out why it increasingly resembles a phone number, but nobody seems to quite know why that is. To learn more, visit duckbillgroup.com. Remember, you can't duck the Duckbill bill. And my CEO informs me that is absolutely not our slogan. I have to start here. It's a little bit of a confusing ecosystem, to put it gently. I use Python when I want to get work done. It's very much not something that I've ever really peeked

Starting point is 00:01:31 behind the curtains on. It's just there. I type pip install or poetry install or one of the other five ways of installing dependencies, all of which hate each other. And that sort of gets the job done and I move on with my life. What is PyPI versus Python, the software language versus the Python Software Foundation versus a big snake? Fair enough. Let's try and disambiguate by starting kind of from the beginning. Python, the language, is a standards-driven open source language invented in 1991, I want to say, and it's been around for about 30 years. And it's been in

Starting point is 00:02:07 constant evolution. There's a bunch of really committed core developer contributors that have been volunteering their time and effort to develop the Python language ever since. That is the tool that most people use to get their work done. Most people who have been around for a little while are familiar with the mass migration of Python 2 to 3, which for almost all of us consisted almost entirely of replacing print followed by a space and quotation marks with print, open parentheses without a space, and then the quotation marks.
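To make that print change concrete, here is a small sketch in Python 3 syntax. The function name and greeting text are placeholders chosen for illustration, not anything from the episode.

```python
# Python 2 treated print as a statement:
#     print "Hello, world"
# Python 3 made print an ordinary function, so the parentheses are required.


def greet(name):
    """Build a greeting and print it with the Python 3 print() function."""
    message = "Hello, " + name
    print(message)  # a function call, not a statement
    return message


if __name__ == "__main__":
    greet("world")
```

Real migrations also touched string and bytes handling, integer division, and renamed standard library modules, so this shows only the most visible change.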

Starting point is 00:02:37 Yeah, the 2 to 3 migration was a challenging one for everybody to the point where amongst the core developers, I think there's a tacit agreement of there will be no Python 4 because it hurt the world so much to make the 2 to 3 migration. That said, that doesn't mean that the language doesn't evolve and things don't change. It just changes at a slower pace and kind of has a long tail of support. I believe five years for every Python major revision is supported now. It's kind of impressive on some level. Well, who uses Python? Everyone. Freaking everyone uses

Starting point is 00:03:11 Python for something somewhere. It's probably one of the most approachable, user-friendly languages out there. You can read the code and it does almost exactly what you would expect most of the time. Yes, yes, yes. You can do obfuscated Python if you want to be particularly obnoxious or, you know, edit someone else's code until it turns out that way if you want. But it's very approachable. And on some level, it's like, is this actually how it works? It feels like it's too easy. Yeah, I think that's one of the things that has led to Python's major adoption across the world. At GitHub Universe earlier this year, they announced that according to their kind of measurements, Python has overtaken JavaScript as the most popular language on GitHub. With the caveat that they broke out TypeScript separately from JavaScript,

Starting point is 00:03:56 as I recall the internet drama of the hour. Oh, I do not recall that, but I'll take the win. Hey, you do super well when they kneecap the competitor. It goes great. Works out for me. And stats, right? But the TIOBE index, which has its own measurement of language popularity, has been tracking Python for a while, and it has been number one for a number of years. We'll drop a link to that in the show notes. But the thing remains that we've put Python on the moon, sorry, on Mars, on the Mars helicopter. So that is powered by Python. Things all over the planet, whether that be industrial machinery or tools that build industrial machinery or your doctor's office or games you want to play with your kid. So all of these things are built in Python. So it's quite pervasive, largely because it is that approachable, but also because there's a multitude of user extensions or projects, aka dependencies, libraries, packages.

Starting point is 00:04:55 There's lots of names for the same thing that exist. So you can kind of pick up a well-formed Lego block that does most of what you're doing, plug it in, and keep on going and doing what you're doing. Yeah, for those who heard of Boto3, that is the AWS SDK or extension for Python. There's a reason you don't have to wrap every request

Starting point is 00:05:15 with its own SigV4 signing or wind up constructing your own HTTP response parser. You just tell it to do the thing, and it does the thing, and it's glorious. Yeah, so Boto3 is one of the most widely downloaded packages in the Python ecosystem

Starting point is 00:05:34 because it's so widely used to interact with AWS. That could be on servers. It could be on CLI tools. It could be inside Lambda functions, which run Python, that want to re-interact back with AWS APIs. So I have to ask, this probably ties back to what you do for your day job. How do you know that it's one of the most widely downloaded packages? So I work on PyPI, which is an acronym for the Python Package Index at pypi.org.
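For a sense of what Boto3 spares you on the signing side mentioned a moment ago, here is a rough sketch of one step of AWS Signature Version 4: deriving the signing key by chaining HMAC-SHA256 over the date, region, and service. The function names and the credential values are placeholders for illustration. A full signer also has to build a canonical request and a string to sign, which this sketch leaves out.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive a SigV4 signing key scoped to one day, region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


if __name__ == "__main__":
    # Placeholder values only; never hard-code real credentials.
    key = derive_signing_key("EXAMPLE-SECRET", "20250101", "us-east-1", "s3")
    print(key.hex())
```

With Boto3 installed and credentials configured, none of this appears in your own code, which is the point being made above.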

Starting point is 00:06:02 You mentioned that you would pip install or poetry install or use one of the many other tools to download and install Python. I wave my hands in the air and fret until someone comes and fixes it for me, like a toddler. Hey, whatever gets the job done, right? And so I asked this at a data science networking event of like, okay,

Starting point is 00:06:20 when you run pip install inside your Jupyter notebook, where do you think that comes from? And they were like, I don't know, it just works. Jupyter. Duh. Yeah. You already said it was on Mars. What's the problem here? Could be on Jupyter. But I was like, okay, I work on the thing that gives you those packages reliably and kind of securely every single time. I'm not the only one who works on it, but I'm the only one who works on it full time. You are the safety and security engineer. It feels like, well, we only need one problem solved, which always feels like a weird

Starting point is 00:06:48 role to have existing in isolation. But it's interesting as well that they have someone devoted full time to caring about this, because on some level, it doesn't sound that hard. It's basically a big static website that gets recompiled whenever someone updates something, which is probably frequently. OK, that adds a little bit of complexity. But where's the hard part? Does that require a full-time security person? Oh, wait, it links to arbitrary third-party things that wind up effectively turning into

Starting point is 00:07:16 a remote code execution. You can trick people into including a line that says import some magic string. OK, this starts to be a little more interesting. Even as I'm thinking about this, like, oh, that's why you're there. How is there only one of you? Yeah. So the index itself, PyPI, has been in existence in one form or another for about 20 years. It's gone through a bunch of different hands, rewrites, and kind of re-imaginations. The reason it's called an index is because it used to only be an index, just a page, an HTML page with links to other people's storage of software. And over time, because folks didn't have hosting capacity or didn't want to kind of keep those packages up in perpetuity, we moved over to a hosting model where anyone across the Internet can upload a package to PyPI.org

Starting point is 00:08:06 and other folks can pip install and download it without having to worry about where is this from. It feels like the original version could have just been like a script that runs on cron every five minutes. It grabs all the packages and creates an index file. And I'm sure that originally, to your lasting shame, that script was no doubt written in Perl. So it needed to be fixed immediately. And here we are. I'm not going to go down that path because I don't, I'm not as

Starting point is 00:08:29 familiar with the super legacy code, but most of it that I have seen has been written in Python. You do eat your own dog food. Yes, you got to. But the permutations of this have kind of evolved, and it's been largely volunteer driven: volunteer contributors, volunteer admins, for this massive service that basically underpins a non-small part of the internet and the operations. Now, there are plenty of mirrors out there and mirroring software that you can set up for your own corporation and your enterprise to say, we would like to, you know, keep our own copy of the index locally. So that way, if there's any problem upstream, we still have all those packages we have.

Starting point is 00:09:11 And most folks should do that because then you are owning your own availability. But for the vast majority of open source consumers out there, which are most people, there's not a need to build your own index. So we make sure that it is up and running. We can't do that on our own without a lot of generosity because the Python Software Foundation is a non-profit, and my role in particular is funded through generous donations from folks like Amazon Web Services and then next year through the Alpha Omega Foundation. But the ability to invest in keeping critical infrastructure up and running is not an easy task that you can just hand wave and say some company will handle this.

Starting point is 00:09:53 No, and even the raw cost of this would bankrupt most folks. I don't know where you actually host these things. It somehow never occurred to me to look at, okay, when you're downloading from the local mirror, where is that mirror resolving to exactly? Because if it's one of the major cloud providers, egress fees are expensive. Yeah, if I download a gig, it'll be nine cents and I'm not much of a user, but you know, I have a dumb provisioning process that redownloads every single time the container runs. Oops. And there's a lot of idiots like me out there doing the exact same thing. This can become a massive denial of wallet attack if you're not, I guess, conscious about how these things are supposed to work. So we are conscious about it. And we're again, we've gotten some great donations.
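The "denial of wallet" arithmetic is easy to sketch. The nine cents per gigabyte is the figure quoted above; the function name, the workload numbers, and the 30-day month are assumptions made up for illustration, and real cloud pricing is tiered and varies by provider and region.

```python
def monthly_egress_cost(gb_per_run: float, runs_per_day: int,
                        price_per_gb: float = 0.09, days: int = 30) -> float:
    """Estimate the monthly egress bill, in dollars, for repeated downloads."""
    return gb_per_run * runs_per_day * days * price_per_gb


if __name__ == "__main__":
    # Hypothetical workload: a container that re-downloads 1 GB of
    # dependencies on every start, 1,000 starts per day.
    cost = monthly_egress_cost(gb_per_run=1.0, runs_per_day=1000)
    print(f"Estimated egress: ${cost:,.2f} per month")
```

The bill lands on whoever serves the bytes, not on the person running the careless provisioning script, which is why the hosting arrangement matters so much for a free public index.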

Starting point is 00:10:37 So our infrastructure is donated largely by Amazon Web Services and Google Cloud through their open source contribution program. But the actual egress that you are consuming when you are re-downloading the same package again and again is a donation from Fastly through their Fast Forward program for open source and non-profit users. So we are achieving close to a 99.95% cache hit ratio through the Fastly network. So most of the calls of the, I don't know, 60,000 requests per second that are happening against PyPI's APIs are going to hit a Fastly cache and never phone home all the way back to us. Which is phenomenal. It's, if you can resolve, if you can service a request as close to the customers or user

Starting point is 00:11:23 as possible, go ahead and do it. Sorry, they're not customers. They're users. If they're not paying you, they are not customers. That is going to be a horrible revelation for a whole bunch of freemium model startups someday. Yes, we do not charge money for usage. There is potentially some paid thing for corporations and organization features that is kind of

Starting point is 00:11:42 in the works, but that is not live yet. So yeah, we exist currently 100% on kind of inbound donations. The ability to do so, again, is made possible by our generous donors, but it's also made possible by folks like myself and other contributors talking about what we're doing and publicizing the ability. And at the end of each year, for our impact report at the Python Software Foundation, I try to give them some stats about how much we've grown in our usage over the years. And it's kind of doubling every year, which is crazy to think about. Yeah, that's right. It was Daniel Stenberg who first really drew my

Starting point is 00:12:23 attention to this problem years ago. He is the original author of curl, you know, that library that's used by freaking everything every time you need to make a web request. And it turns out he was doing most of this as a labor of love and trying to figure out, like, what do you do for a day job? It's like, that is absurd. Like the old XKCD of the entire tower of blocks built on one thing maintained by someone in Nebraska is very much that type of approach. And, oh, right, you have to make a little bit of noise, especially when you have something like this, in order to get the resources and attention it needs. Because otherwise, look at all the browser extensions that are abandonware and then get purchased for some paltry sum of money by malware authors, and they just go ahead and inject whatever they want. Everyone's browser still has that configured. That's a terrifying threat vector. It really is. And the notion of that exists in the universe of PyPI, but it's less so about projects being abandoned. And the things that we've tried to protect against are account takeover attacks. So we referenced Boto, right? Boto is managed by the release team who kind of uploads new versions to PyPI.org on schedule. And every single one of the folks who has access to that project could become a phishing target. And if they can get phished, they can get their accounts kind of compromised, and then someone else could be uploading Boto,

Starting point is 00:13:46 and that's a problem. So in order to kind of prevent that, we enacted a couple of years ago and finally forcefully put our foot down in 2024 to require 2FA for all user accounts. So a second factor for authentication prevents the casual account takeover attacks. Again, no security is perfect, but we move forward on our progress to getting more secure solutions out there so that the universe will be a little more secure every single iteration we go at it. I guess a question I probably should have asked a little bit earlier in this conversation is when you talk about security, especially when you're talking about something that I perceive to be relatively low-level infrastructure,

Starting point is 00:14:27 my naive assumption at first was, oh, you're just the person that makes sure that the web server stays patched and, you know, doesn't have the SSH port just hanging out there, flapping in the breeze for everyone to connect to. Sounds like what you're doing is a lot more up the stack. It is.

Starting point is 00:14:40 I do kind of work throughout the different layers of the stack. You do that too, though. Well, we do have our director of infrastructure and an infrastructure engineer who work on not just the PyPI universe. They work on everything Python Software Foundation. So, you know, they can dedicate a fraction of their time to PyPI. So I'll collaborate with them on certain parts of the stack. But in the kind of application side and all the way up to the client side, that's kind of where I live. And I try to find new ways to prevent the problems from happening. So problems happen. Folks do register for an account.

Starting point is 00:15:21 It's a free service. They'll upload some malware. They'll abuse the service. And we also partner with a variety of volunteer security researchers. And these are folks who may have their own security research companies, but a lot of them are just an email that have kind of expressed some interest and can report back to us, hey, this new thing that somebody uploaded 10 minutes ago looks really smelly, and here's why. And then we've developed the kind of messaging and pingback and workflows to get that information into a PyPI admin's hands as soon as possible to be able to get the context, react quickly, and take that down, put it in a quarantine state or take it down completely to prevent anybody from accidentally

Starting point is 00:16:05 falling for a, you know, Discord message that says, here, you should run this pip install command and that'll solve all your problems because that's a huge vector. Here at the Duckbill Group, one of the things we do with, you know, my day job is we help negotiate AWS contracts. We just recently crossed $5 billion of contract value negotiated. It solves for fun problems such as how do you know that your contract that you have with AWS is the best deal you can get? How do you know you're not leaving money on the table? How do you know that you're not doing what I do on this podcast and on Twitter constantly

Starting point is 00:16:42 and sticking your foot in your mouth. To learn more, come chat at duckbillgroup.com. Optionally, I will also do podcast voice when we talk about it. Again, that's duckbillgroup.com. It seems like there's a very, there's very much a taking for granted of all of these things. Like the idea of a supply chain attack feels like it's this esoteric, very remote, very obscure thing until it happens to you and you realize just how easy it is for something like that to happen.

Starting point is 00:17:12 You basically have an impossible job. Yeah, I mean, every job, if you frame it like that, is impossible. But I think the hope that we have is that by raising the visibility of potential supply chain attacks, we get people interested in, okay, well, what do I do about that? So we can go into that in a bit. But there's also the like, it won't happen to me, it will happen to somebody, and it might not happen to you,

Starting point is 00:17:35 it'll happen to somebody near you, and then they'll move laterally and get you, right? So this is something that attackers love to do is find some sort of injection point and then move laterally. So it's like, even if I can kind of get you to install a Discord thing, guess what? I'm now Corey Quinn on Discord and I can ask other people to do more serious things. So just because I got a Discord token doesn't mean that that's the end of the attack. Let me ask you this, and feel free to tell me that it is not something you can discuss, and that will be fine. But let's say I have an evil, nefarious idea that I want to trade in the latest version of cryptocurrency, AWS credits. So what I'm going to do is I'm going to publish a

Starting point is 00:18:21 package, Boyo3, just change the T to a Y, a one-key-off typo. Given the scale of it, I'm sure people do it all the time. And then I'm going to upload a direct clone of Boto3 with one minor change in that it just takes any environment credentials or keys it finds and sends them to me. Where does that break down as I start down that, or does it? Conceptually, it does not break down. The mechanism that you would use in order to get that Boyo3, that is what we would call a typosquatting attack. So you are squatting on a name and hoping that somebody typos that. For any company or any individual who is installing packages, we often recommend using a hash-based lock file from a tool like Poetry or an extension called

Starting point is 00:19:07 pip-compile or pip-tools or any number of the other tools out there to generate a "these are the known hashes for this file" list. GitHub's Dependabot and other kind of dependency update tools recognize these lock files, and they will go out and check and do updates for you. So as long as you get that first package in correctly, spelled Boto3, you are unlikely to ever get Boyo3 because you are not typing pip install Boyo3. But for the offhand one-off people who are doing so, or if you are using that in a Dockerfile command that does not have hashes and you did a typo, yeah, that's going to be a problem. So what we try to do in that situation is once it's reported to us, we get it off the index. And we often prohibit that name from

Starting point is 00:19:56 further use because it is now known to be malware. So even if you got it, the next time you run your pip install command, even with the typo, it should fail. And at that point, you should hopefully notice. We also take these takedowns and report some of them out to a malicious package database or an advisory database that folks can use with a tool like pip-audit to see, am I using any of these files with known problems? The vast majority of the typosquats don't make it in there because it's just a lot of noise.
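The hash-based lock file idea described above boils down to comparing a downloaded file's digest against a pinned value before trusting it. The sketch below shows that check in isolation using only the standard library. The function names and the fake package file are invented for illustration; in practice the check is done for you by tooling such as pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) with hashes produced by something like `pip-compile --generate-hashes`.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path: Path, allowed_hashes: set[str]) -> bool:
    """True only if the file's digest is one of the pinned hashes."""
    return sha256_of(path) in allowed_hashes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "example_pkg-1.0.tar.gz"  # stand-in file
        artifact.write_bytes(b"original release contents")
        pinned = {sha256_of(artifact)}  # what a lock file would record

        print("untouched file accepted:", matches_pinned_hash(artifact, pinned))

        artifact.write_bytes(b"tampered release contents")
        print("tampered file accepted:", matches_pinned_hash(artifact, pinned))
```

A pinned hash does not tell you the package was trustworthy when you first locked it. It only guarantees that what you install later is byte-for-byte what you locked, which is why getting that first install spelled correctly still matters.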

Starting point is 00:20:27 At some point they may, but they don't today. The final kind of piece is that we have these wonderful volunteer researchers who will try and profile the attackers and kind of build up a this is what this looks like

Starting point is 00:20:41 and update their scanning rules to kind of listen to the PyPI feed and see, is this a new one that looks like that? Yeah. Okay. Let's report it. And that makes our lives as admins a little easier because we've seen this kind of attack before. We know this reporter, so we can kind of match things a lot faster. There's lots of focus on when you're chasing these things down. You have that aspect of it. The social engineering attack that you alluded to a few minutes ago about impersonating someone on Discord, that was wild. Back when I was Freenode network staff many years ago, we had someone pretending to be Matthew Prince, CEO of Cloudflare, trying to get us to mention in

Starting point is 00:21:17 passing what the origin was of something that Cloudflare was protecting. Yeah, that doesn't seem like the sort of thing that he would actually ask because he's not, you know, an idiot. Imagine that. So it was, that took a fair hit of thought there because I'm used to running stuff that no one really cares about. When you're talking about running something

Starting point is 00:21:35 that everyone uses, on some level, everyone has to care about it, but most people don't even know that it exists. Yeah, that's, I think, one of these kind of interesting problems, right? I know you had done an episode on remote development environments, right? So using remote development environments is a way to sandbox your development environment, your code, to say, you know what, there's nothing in this sandbox that shouldn't be there already. So if I am installing untrusted code, it can only

Starting point is 00:22:03 do what it can only affect whatever's in there. Now, again, that might be more than you want, but at least it's sandboxed. There's other methods that are around social engineering that are fascinating to think about. I do remember one company I was at, at a company all hands, the CFO stood up in front of the entire company and said, I will never text anyone to wire me money. So if you ever see a text from me for a wire, that is not me. Do not trust it. Report it immediately to our IT team. And it's like that kind of social engineering or social acumen, I think is very important to exercise because at the end of every security pipeline

Starting point is 00:22:43 or security process are humans. And we're just trying to do the best we can. We've seen that here at the Duckbill Group, a targeted spear phishing attack where a message from someone pretending to be Mike, you know, my business partner and the CEO, was sent to three or four people here, not me, suspiciously, asking to have, I think it was iTunes gift cards sent. And we'd already spoken to the team. It's like, great. You know how normally,

Starting point is 00:23:06 if he wants to do something weird, we'll talk about it in Slack. Yeah, if you can get access to our Slack as him, congratulations. You probably earned it at that point. But just some, this is Mike. It's my new phone number, my personal phone. Yeah, sure it is, buddy.

Starting point is 00:23:19 Sure. I think that that's a far overlooked part of all security is, you know, does this make sense, right? Did what I just get make sense to me? Should I verify it? Should I blindly accept it? And I think that that's the same thing that we should apply to software packages and things we install is, do I know where this came from? You know, a little bit farm to table, as it were. Let's look at the ingredients that I'm about to consume. If I'm doing a one-off BS script in a cloud environment, that's just trying something out.

Starting point is 00:23:55 Maybe it's not that critical, but if I'm building my team, my project, my company's infrastructure, let's take a more critical eye to what it is we're doing and, you know, secure our environment. Maybe we shouldn't make, maybe we should not be able to make network calls externally to untrusted sources. That's an old school, you know, outbound firewall rule that is pretty easy to kind of conceptualize. And, you know, maybe we should apply things like that to secure our teams. So there's a variety of different tactics in play, all to kind of make security a little bit easier for folks. I want to make sure that whatever is on the index is relatively trustworthy. So, you know, there are well over half a million packages and, you know, 12 million releases of those packages out there today. I can't monitor all of them, but we have a great network of folks who are doing

Starting point is 00:24:44 that. One other topic that I wanted to get into with you. It's something that you said that I thought was just shockingly objectionable enough to be worth talking to you about here. And how did you phrase it? Specifically, that you were talking about legacy systems and that you found it to be more satisfying to work in legacy environments than do greenfield development. Now, before we dive in, yes, of course, legacy is that condescending engineering term for it makes money somehow. But where do you come from on this, given that you work for a nonprofit, you definitionally do not make money with it? Yeah, I've worked at a variety of different companies throughout my 30

Starting point is 00:25:19 years in this ridiculous industry that we all call work. And something has been true. The word legacy is often used, as you said, kind of as disparaging as this is bad. But legacy is the thing that is left behind, right? Software that is not useful gets replaced. Software that is useful is still around. Sometimes people don't like it, and that's a different problem. But it's still useful. Otherwise, it wouldn't be there. And working on systems that are useful, I find the most satisfaction because somebody is deriving value out of this. channel to a B2B integration or a website feature that is only used by 3% of our clientele, it's still useful. If it wasn't useful, we should get rid of it. And with a legacy system, you have a lot of the pieces in play that you already need to make that money or to get that goal. Whereas greenfield development is far more aspirational and unknown. You're going to hit new and interesting bugs. I'd like to hit well-debugged

Starting point is 00:26:32 systems and kind of evolve them over time, as opposed to saying, let's jettison this thing and start a brand new stack because you're going to spend more time debugging things you had already debugged. One thing that I've always had a keen appreciation for that I think transfers is as a consultant, you have to have respect for what came before. There's a reason things are the way that they are. There are constraints that may have been non-obvious. And whenever I'm dealing with existing code bases,

Starting point is 00:26:58 okay, why was it written like this? There's probably a reason that I'm not immediately seeing before I go in and start ripping everything out, maybe read through it once or twice, just to get a vague idea of what it is you're about to do here. Because what you're doing now works, sort of. And you want to make it do something different, maybe understand what it's doing now so you don't inadvertently leave it in pieces on the floor. Yeah, I think Martin Fowler goes a lot into that in his book Refactoring. And I think where a lot of people fall on this whole legacy systems are bad are, I don't understand it enough, or it doesn't feel

Starting point is 00:27:33 good in my hand, right? Or the modules aren't constructed the way I would like them to be. Guess what? You can refactor it. You can change it. You can modify things to make them nicer. Sandi Metz, the author of Practical Object-Oriented Design in Ruby, had a phrase that I really liked. I don't know if that was the origin, but it was, make the change easy and then make the easy change. Sometimes making the change easy is actually very hard, but if you make the thing that you want to do, like the second part, easy to change, then next time you need to change it, it will likely be easier to change. So constant refactoring is something that a lot of teams that I've worked with have kind of bemoaned. The product managers are never going to let us have time to refactor. It's like, that's part of the work. It's not like

Starting point is 00:28:23 a ticket to refactor. It's, you're touching this module. You need to bake in an extra X amount of time to refactor to make the easy change. And part of that too, is there's the idea that when I talk to customers all the time, or clients as we call them when we're actually a professional services company, we've never really seen, except in extreme situations, someone rewriting an application for cost purposes. Someone did that once when they were on, once upon a time, Google App Engine, realizing Bigtable charged per write. They were hemorrhaging 800 grand a month for what became a quarter million dollar AWS bill. Great. So get off of that and move. Sure. Most of the

Starting point is 00:29:02 time it doesn't happen that way. So they'll care about cost and do a refactor in the next version of the architecture. So you can bake that in, but almost never are they going to do a rewrite just to save money, nor should they, frankly. Well, it's one of those things that I've had to make this decision a few times throughout my career of like, when is the right time, right? If it's for cost, well, what else could we be doing? And is this costing us more? How many engineers are we going to have work on this? How much downtime? How many outages are we willing to take? All of them. Until it's done. And how much will that cost us? Oh, wait, that number is much higher than just

Starting point is 00:29:42 living with the cost of an inefficient system for another few years. Yeah, okay, let's not do it yet, right? But let's set ourselves a marker to say, in two years or whatever their time frame is, let's reevaluate these pieces and see if they are still true and if they exceed the threshold now. And are we willing to invest this amount of money on renovating a house, basically, right? You have to do a gut reno and pull pieces out and move walls. That's pricey. And where are you going

Starting point is 00:30:12 to live in the meantime? So all of those things have to be true in order for you to actually make a fiscal argument for a refactor. But if you are constantly doing a little bit of refactoring every pull request to say, ah, this module is sharing enough code with three other modules, let me pull out a helper function and reuse it. Oh, great. Now we have the right abstraction. We have not prematurely abstracted. We have the right abstraction and it costs us less to maintain it because we've written the test around the helper function instead of the top level function. So writing the tests is even easier.
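As an illustration of the incremental refactor described above, here is a minimal Python sketch. Every name in it (`normalize_name`, `register_package`, `lookup_owner`) is hypothetical and invented for this example; none of it comes from a codebase discussed in the episode. The normalization rule is borrowed from the one PEP 503 describes for project names, used here only as a convenient piece of shared logic.

```python
import re
from typing import Dict, Optional


def normalize_name(raw: str) -> str:
    """Hypothetical shared helper: the one place the normalization rule lives."""
    # Collapse runs of '-', '_' and '.' into a single '-', then lowercase.
    return re.sub(r"[-_.]+", "-", raw.strip()).lower()


def register_package(registry: Dict[str, str], raw_name: str, owner: str) -> str:
    """Top-level function 1: stores an owner under the normalized name."""
    key = normalize_name(raw_name)
    registry[key] = owner
    return key


def lookup_owner(registry: Dict[str, str], raw_name: str) -> Optional[str]:
    """Top-level function 2: reuses the helper instead of keeping its own copy."""
    return registry.get(normalize_name(raw_name))


# Tests target the helper directly, so they stay small and need no setup.
assert normalize_name("My_Package") == "my-package"
assert normalize_name("my..package") == "my-package"

# The callers then only need a light check that they are wired to the helper.
registry: Dict[str, str] = {}
register_package(registry, "My.Package", "alice")
assert lookup_owner(registry, "my_package") == "alice"
```

The design choice being sketched is that the detailed edge cases are tested once, at the helper, rather than repeated through each top-level function that happens to call it.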

Starting point is 00:30:50 The challenge, of course, is the other side is this does act as a form of technical debt. The CDK is great when you're trying to build something out in a reasonable recurring way. But where it falls down, at least for me, is, okay, I built and deployed a thing. Now I don't have to touch it again for two years. Great. Now I do it and it is screaming and whining at me every step of the way when all I'm trying to do is change a single line of code, maybe even a text string. Oh, that version of Node or Python is deprecated.

Starting point is 00:31:15 We won't support that in Lambda. Oh, your versions of everything are ancient, update it. Oh, now you have critical breaking dependencies. It's like, I just want to change a single line of code. I don't want to refactor everything to modern, but I'm faced with no choice when that happens. It's very frustrating. That is a frustration I have had myself. And one way I found to combat that is to take kind of this approach of if it hurts, do it more often. So, you know, in this case of like, you do it more often, so that way it doesn't hurt as much. So for me, I've got Dependabot running and pushing updates and have a CI/CD pipeline for some of my projects that I trust enough that I can just merge and deploy.

Starting point is 00:31:55 And if it passed CI, I am happy enough with it. So then you can automate pushing that button as well. And then you can just have stuff flow through your system. And then when you come back to it two years later, it's relatively fresh. That's a good way of doing it. The challenge, of course, is I'm doing this across a bunch of different languages simultaneously. Everything requires its own approach. I've been using asdf as a version manager, which is universal. But then that conflicts with a bunch of things in some particular Python and other language workflows. And all of it's a disaster. Nobody's happy. I'm

Starting point is 00:32:25 starting to love ephemeral dev environments for this explicit purpose, but then nothing I need is local there. And I need to install the universe again. And computers are hard. That's why we have jobs. People are harder. That's why you have jobs. People are very hard. People can be great and the worst, right? Like they fall on the spectrum of everything. Software is the same. I think one of the reasons we have so many package management tools is because we're standards-driven as opposed to one-tool, opinion-driven. The standard is the PyPI index will have an HTML page that has all the files, versions, and all the hashes contained in them. That's a standard.
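To make the index standard mentioned above concrete, here is a rough Python sketch of a tool reading such a page. It assumes the layout described in PEP 503, where each file is an anchor tag whose `href` ends in a `#<hashname>=<hashvalue>` fragment. The sample page, the `files.example.invalid` host, the `demo` project, and the short hash values are all made-up placeholders, and `SimpleIndexParser` and `list_files` are hypothetical names rather than part of pip or any other real tool.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlparse

# Made-up stand-in for a project page on a package index (not real data).
SAMPLE_PAGE = """
<!DOCTYPE html>
<html>
  <body>
    <a href="https://files.example.invalid/packages/demo-1.0.0.tar.gz#sha256=abc123">demo-1.0.0.tar.gz</a>
    <a href="https://files.example.invalid/packages/demo-1.0.0-py3-none-any.whl#sha256=def456">demo-1.0.0-py3-none-any.whl</a>
  </body>
</html>
"""


class SimpleIndexParser(HTMLParser):
    """Collects one record per <a href=...> link found on the page."""

    def __init__(self):
        super().__init__()
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Split "https://.../file.whl#sha256=..." into URL and fragment.
        url, fragment = urldefrag(href)
        algo, _, digest = fragment.partition("=")
        self.files.append({
            "filename": posixpath.basename(urlparse(url).path),
            "url": url,
            "hash_algo": algo or None,
            "hash": digest or None,
        })


def list_files(page_html):
    """Return the file records found in an index page's HTML."""
    parser = SimpleIndexParser()
    parser.feed(page_html)
    return parser.files


for record in list_files(SAMPLE_PAGE):
    print(record["filename"], record["hash_algo"], record["hash"])
```

Because the page format is the contract, a sketch like this needs nothing beyond the standard library, which is part of why so many independent installers can exist side by side.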

Starting point is 00:33:00 You can build your own tool if you don't like it based on those standards. And that's how we get tools that it's like, well, it does what I want it to do. And it's like, great, it will work. And we commit to kind of having that contract with our consumers. So that way, if you want to use Poetry, if you want to use pip, if you want to use something else, that's your prerogative, but that's also your problem. There are no guarantees. So anybody saying, I'm going to guarantee that this won't conflict, they're probably trying to sell you something. I really want to thank you for taking the time to speak with me about all this. If people want to learn more, where's the best place for them to find you?

Starting point is 00:33:36 I am on Mastodon on the Hachyderm site. I'm at miketheman on Hachyderm. I'm at miketheman.com on Bluesky. And you can also read all the blog stuff I write on PyPI at blog.pypi.org. And we will include links to all of that in the show notes. Thank you so much for finally agreeing to sit down with me and suffer the slings and arrows. I appreciate it. Absolutely. Mike Fiedler, the security engineer at the Python Software Foundation. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you hated this podcast, please leave a five-star review on your podcast platform of choice.

Starting point is 00:34:17 Along with an angry comment, including a screed on why we should be using Rust instead. Please be sure to link to your PowerPoint deck.
