Screaming in the Cloud - Kubernetes and OpenGitOps with Chris Short

Starting point is 00:00:00 Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Let's face it. On-call firefighting at 2 a.m. is stressful.

Starting point is 00:00:35 So there's good news and there's bad news. The bad news is that you probably can't prevent incidents from happening. But the good news is that Incident.io makes incidents less stressful and a lot more valuable. Incident.io is a Slack-native incident management platform. It allows you to automate incident processes, focus on fixing the issues, and learn from incident insights to improve site reliability and fix your vulnerabilities. Try Incident.io to recover faster and sleep more. systems. I've got five bucks on DNS, personally. Why scroll through endless dashboards while dealing with alert floods, going from tool to tool to tool that you employ, guessing at which puzzle pieces matter? Context switching and tool sprawl are slowly killing both your team

Starting point is 00:01:36 and your business. You should care more about one of those than the other. Which one is up to you? Drop the separate pillars and enter a world of getting one unified understanding of the one thing driving your business, production. With Honeycomb, you guess less and know more. Try it for free at honeycomb.io slash screaming in the cloud. Observability, it's more than just hipster monitoring. Welcome to Screaming in the Cloud. I'm Corey Quinn. Coming back to us since episode two, it's always nice to go back and see the where are they now type of approach. I am joined by senior developer advocate at AWS, Chris Short. Chris, been a few years. How's it been?

Starting point is 00:02:23 Corey, we have talked outside of the podcast, but it's been good for those that have been listening. I think when we recorded, I wasn't even like, when was season two? What year was that? I think episode two was pre-pandemic and the rest. Oh, so yeah, I was at Red Hat maybe. Yeah, you were at Red Hat stuff back when you got to work on open source stuff as opposed to now where you're not within a thousand miles of that stuff, right? Actually, well, no. So to be clear, I'm on the EKS team, the Kubernetes team here at AWS. So when I joined AWS in October, they were like, hey, you do open source stuff. We'd like that. Do more. And I was like, wait, do more? And they were like, yes, do more. I was like, okay. So

Starting point is 00:03:05 since joining AWS, I've probably done more open source work than the three years at Red Hat that I did. So that's kind of like, it's an interesting point when I talk to people about it because the first couple of months are like, you know, my friends are like, so are you liking it? Are you enjoying it? What's going on? Do they beat you with reeds? All the questions people have about companies. Right. I get a lot of random questions about Amazon and AWS that I don't know the answer to. When I started telling people I fixed Amazon bills, I had to quickly pivot that to AWS bills. People started asking me, well, can you save me money on underpants?

Starting point is 00:03:45 Fine, get the Prime credit card, knock 5% off the bill, so there you go. But other than that, no, I can't. No. Yeah. And like, I had to call my bank this morning about a transaction that I didn't recognize and it was from Amazon. And I was like, that's weird. Money should flow one direction, and that's the wrong direction from my employer. Like, what is going on here? It shouldn't have been on that card kind of thing. And I had to explain to the person on the phone that I do work at Amazon, but under the web services team. And he's like, oh, so you're in IT? And I'm like, no, it's actually this big company that it's a cloud company.

Starting point is 00:04:20 And they're like, oh, okay. Yeah, the cloud. Got it. So it's interesting talking to people about, I work at Amazon. Oh, my son works at an Amazon distribution center. Cool. I know about that, but very little. I do this. "Oh, your son works at an Amazon distribution center. Is he a robot?" is normally my next question on that. But yeah, it's neither here nor there. So you and I started talking a while back. We both write newsletters that go to a somewhat similar audience. You write DevOps-ish. I write Last Week in AWS. And recently, you also started EKS News. Because yeah, the one thing I look at when I'm doing these newsletters every week is, you know what I want to do? That's right. Write more newsletters. So you are just a glutton for punishment. And yeah, welcome to the addiction,

Starting point is 00:05:09 I suppose. How's it been going for you? It's actually been pretty interesting, right? Like, we haven't pushed it very hard. We're now starting to include it in things. Like, we did Container Day. We made sure that EKS News was on the landing page for Container Day at KubeCon EU. And, you know, it's kind of just grown organically since then. But it was one of those things where it's like, internally, this happened at Red Hat, right? When I started live streaming at Red Hat,

Starting point is 00:05:37 the ultimate goal was to do our product management. Like, here's what's new in the next version thing. Do those live. So anybody could see that at any point in time, anywhere on Earth, the second it's available. Similar situation to here, this newsletter actually is generated as part of a report my boss puts together to brief our other DAs or developer advocates, you know, our solutions architects, the whole nine yards about new EKS features. So I was like, why can't we just flip that into a weekly newsletter? You know, like I can pull from the same sources you can. And what's interesting is he only does the meeting biweekly.

Starting point is 00:06:14 So there's some weeks where it's just all me doing it. And he ends up just kind of copying and pasting the newsletter into his document and then adds on for the week. But that report meeting for that team is now getting disseminated to essentially anyone that subscribes to EKS.news. Just go to the site. There's a subscribe thing right there. And we've gotten 20 issues in and it's gotten rave reviews, right? I have been a subscriber for a while. I will say that it has less Chris Short personality to it than DevOps-ish does, which I have to assume is by design. A lot of the Duckbill Group's marketing these days is no longer in my voice, rather intentionally, because it turns out that being a sarcastic jackass and doing half-billion-dollar AWS contracts

Starting point is 00:07:01 tend not to be the most congruent thing in the world. So, okay, we're slowly ameliorating that. Yeah, it's professional voice versus snarky voice. Well, and here's the thing, right? Like, I realized this year with DevOps-ish that, like, if I want to take a week off, I have to do like what you did when your child was born. You hired folks to, like, do the newsletter for you, or I actually don't do the newsletter. It's binary, hire someone else to do it or don't do it. So the way I structured this newsletter was that any developer advocate on my team could jump in and take over

Starting point is 00:07:37 the newsletter so that if I'm off that week or whatever may be happening, I, Chris Short, am not the voice. It is now the entire developer advocate team. I will challenge you on that a bit because it's not Chris Short voice, that's for sure, but it's also not official AWS brand voice either. It is clearly written by a human being who is used to communicating with the audience for whom it is written. And that is no small thing. Normally when, oh, there's a corporate newsletter, that's just a lot of words to say it's bad. This one is good.

Starting point is 00:08:15 I want to be very clear on that. Yeah. I mean, we have just like DevOps-ish, we have sections just like your newsletter. There are certain sections. So any new what's new announcements, those go in automatically. So like that can get delivered to your inbox every Friday. Same thing with new blog posts about anything containers related to EKS. Those will be in there.

Starting point is 00:08:38 Then Containers from the Couch, our streaming platform, essentially for all things Kubernetes. Those videos go in. And then there's some ecosystem news as well that I collect and put in the newsletter to give people a broader sense of what's going on out there in Kubernetes land. Because let's face it, there's upstream and then there's downstream. And sometimes those aren't in sync. And that's normal. That's how Kubernetes kind of works sometimes. If you're running upstream Kubernetes, you are awesome. I appreciate you. But I feel like that would cause more problems than it's worth sometimes. Thank you for being the trailblazers the rest of us can learn from your misfortune. Yeah, exactly right. Like, please file your bugs accordingly. EKS is interesting to me because I don't see a lot of it, which is

Starting point is 00:09:24 probably going to get a whole lot of "wait, what?" moments. Because wait, don't you deal with very large AWS bills? And I do. But what I mean by that is that EKS, until you're using its Fargate expression, charges for the control plane, which rounds to no money, and the rest is running on EC2 instances running in a company's account. From the billing perspective, there is no difference between "we are running massive fleets of EKS nodes" and "we're managing a whole bunch of EC2 instances by hand." And that feels like an interesting allegory for how Kubernetes winds up expressing itself to cloud providers.

Starting point is 00:10:03 Because from a billing perspective, it just looks like one big single-tenant application that has some really strange behaviors internally. It gets very chatty across AZs when there's no reason to, and whatnot. And it becomes a very interesting study in how to expose aspects of what's going on inside of those containers and inside of that Kubernetes environment

Starting point is 00:10:23 to the cloud provider in a way that becomes actionable. There are no good answers for this yet, but it's something I've been seeing a lot of. It's like, oh, I thought you'd be running Kubernetes. Oh, wait, you are, and I just keep forgetting what I'm looking at sometimes. So that's an interesting point. The billing is kind of like, yeah, it's just compute, right? And my insight into AWS and the way I start thinking about it

Starting point is 00:10:43 is always from a billing perspective. It's great. It's because that means the more expensive a service is, the more I know about it. It's like IAM, what is that? Like, oh, I have no idea. It's free. How important could it be? Professional advice: do not take that philosophy, ever. Security, it matters. Oh my God, you are all stars. Your IAM policy should not be. I digress. Yeah. Anyways. So two points I want to make real quick on that. One, we've recently released an open source project called Karpenter, which is really cool in my purview because it looks at your Kubernetes file and says, oh, you want this to run on an ARM instance. And you can even go so far as to say, right,

Starting point is 00:11:25 here's my limits, and it'll find an instance that fits those limits and add that to your cluster automatically, run your pod on that compute as long as it needs to run, and then if it's done, it'll downsize eventually kind of thing, your cluster. So you can basically just throw a bunch of workloads at it, it'll auto detect what kind of compute you will need and then provision it for

Starting point is 00:11:51 you, run it and then be done. So that is one way folks are probably starting to save money running EKS, is to adopt Karpenter as your autoscaler as opposed to the inbuilt Kubernetes autoscaler, because this is instance-aware, essentially. So it can say like, oh, your massive ARM application can run here because, you know, thank you, Graviton. We have those processors in-house. And, you know, you can run your ARM64 instances.

Starting point is 00:12:20 You can run all the Intel workloads you want. And it'll right-size the compute for your workloads. And it'll look at one container or all your containers, however you want to configure it. Secondly, the good folks over at Kubecost have OpenCost, which is the open source version of Kubecost, basically. So they have a service that you can run in your clusters that will help you say, hey, maybe this one node's too heavy. Maybe this one node's too light. And, you know, give you some insights

Starting point is 00:12:52 into Kubernetes spend that are a little bit more granular as far as usage and things like that go. So those two projects right there, I feel like will give folks an optimal savings experience when it comes to Kubernetes. But to your point, it's just compute, right?
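
As a rough illustration of the instance-aware selection described above, here is a minimal Python sketch. It is not Karpenter's actual algorithm or API. The instance names, sizes, and prices are invented for the example, and `pick_instance` is a hypothetical helper that simply returns the cheapest catalog entry matching a workload's architecture and resource requests.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceType:
    name: str
    arch: str          # "arm64" or "amd64"
    vcpus: int
    memory_gib: int
    hourly_usd: float  # invented price, for illustration only


@dataclass(frozen=True)
class PodRequest:
    arch: str
    vcpus: int
    memory_gib: int


# Hypothetical catalog: these are not real instance names or prices.
CATALOG = [
    InstanceType("arm.small", "arm64", 2, 4, 0.03),
    InstanceType("arm.large", "arm64", 8, 16, 0.12),
    InstanceType("x86.small", "amd64", 2, 4, 0.04),
    InstanceType("x86.large", "amd64", 8, 16, 0.16),
]


def pick_instance(
    pod: PodRequest, catalog: Sequence[InstanceType] = CATALOG
) -> Optional[InstanceType]:
    """Return the cheapest instance that fits the pod, or None if nothing fits."""
    candidates = [
        inst
        for inst in catalog
        if inst.arch == pod.arch
        and inst.vcpus >= pod.vcpus
        and inst.memory_gib >= pod.memory_gib
    ]
    # min() with default=None avoids raising on an empty candidate list.
    return min(candidates, key=lambda inst: inst.hourly_usd, default=None)
```

With this catalog, a request for an ARM workload needing 4 vCPUs and 8 GiB lands on the larger ARM entry, because the smaller one has too few vCPUs. A real autoscaler also has to consider packing many pods onto one node, availability, and spot pricing, all of which this sketch ignores.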

Starting point is 00:13:10 And that's really how we treat it kind of here internally is that it's a way to run compute, Kubernetes or ECS or any of those tools. A fairly expensive one because ignoring entirely for a second the actual raw cost of compute, you also have the other side of it, which is in every environment, unless you are doing something very strange or pre-funding as a one-person startup in your spare time, your payroll costs will and should exceed your AWS bill by a fairly healthy amount. And engineering time is always more expensive than services time. So, for example, looking at EKS, I would absolutely recommend people use that rather than rolling their own because get out of that engineering space where your time is free. I assure you from a business context, it is not. So there's always that question of what you can do to make things easier for people and do more of the heavy lifting. Yeah. And to your rather cheeky point that there's 17 ways to run a container on AWS, it is answering that question, right? Like those 17 ways, like how much of this do you want to run

Starting point is 00:14:12 yourself? You could run EKS distro on EC2 instances if you want full control over your environment. And then I run IoT Greengrass core on top within that cluster so that I can run my own Lambda function runtime so I'm not locked in. Also DynamoDB local so I'm not locked into AWS, at which point I have gone so far around the bend no one can help me. Pro tip, don't do that. Just don't do that. But to your point, we have all these options for compute and specifically containers because there's a lot of people that want to granularly say, this is where my engineering team gets involved. Everything else you handle.

Starting point is 00:14:51 If I want EKS spot instances only, you can do that. If you want EKS to use Karpenter and say only run ARM workloads, you can do that. If you want to say Fargate and not have anything to manage other than the container file, you can do that. It's how much does your team want to manage? That's the customer obsession part of AWS coming through when it comes to containers, because there's so many different ways to run those workloads, but there's so many different ways to make sure that your team is right-sized based off the

Starting point is 00:15:30 services you're using. I do want to change gears a bit here because you are mostly known for a couple of things. The DevOps-ish newsletter, because that is the oldest and longest thing you've been doing at the time that I've known you. EKS, obviously. But when prepping for this show, I discovered you are now co-chair of the OpenGitOps project. Yes. So I have heard of GitOps in the context of, oh, it's just basically your CI/CD stuff

Starting point is 00:15:56 is triggered by Git events and whatnot. And I'm sitting here going, okay, so from where you're sitting, the two best user interfaces in the world that you have discovered are YAML and Git. And I just have to start with the question, who hurt you? Yeah, I share your sentiment when it comes to Git. Not so much with YAML, but I think it's because I'm so used to it.

Starting point is 00:16:19 Maybe it's Stockholm Syndrome, maybe the whole YAML thing. I don't know. Well, it's no XML. We'll put it that way. Thankfully, yes. Because if it was, I would have way more like just template files laying around to build things. And rage. Don't forget rage. And rage. Yeah. So GitOps is a little bit more than just Git and IaC, infrastructure as code. It's more like Justin Garrison, who's also on my team. He calls it infrastructure as software because there's four main principles to GitOps. And if you go to opengitops.dev, you can see them. It's version one. So we put them on the website right there on the page. You have to have a declared state and that state has to live somewhere. Now it's called GitOps because Git is probably the most full-featured thing

Starting point is 00:17:07 to put your state in, but you could use an S3 bucket and just version it, for example, and make it private so no one else can get to it. Or you could use local files. Copy of, copy of, this thing, restored, parentheses, use this one, dot final, dot doc, dot zip. You know, my preferred naming convention. Ah, yeah, wow, okay. Everything I touch is terrifying. Yes. Geez, I'm sorry. So first, it's declarative. You declare your

Starting point is 00:17:35 state. You store it somewhere. It's versioned and immutable, like I said. And then pulled automatically. Don't focus so much on pull, but basically software agents are applying the desired state from source. So what does that mean when, you know, the fourth principle, continuously reconciled, is implemented? That means those software agents that are checking your desired state are actually putting it back into the desired state if it's out of whack, right? So you're talking about agents running persistently on instances, validating a checkpoint on a cron. How is this meaningfully different than a Puppet agent running in years past? I learned to speak publicly by being a traveling trainer for Puppet, same type of model. And in fact, when I was at Pinterest, we wound up having a fair bit, like that was their entire model, where they would have the Puppet code live in an S3 bucket that was then

Starting point is 00:18:29 copied down, I believe via Git, and then applied to the instance on a schedule. That sounds like this was sort of an early-days GitOps. Yeah, exactly. Right. Like, so it's, I like to think of that as a component of GitOps, right? DevOps, when you talk about DevOps in general, there's a lot of stuff out there. There's a lot of things labeled DevOps that maybe are or maybe aren't sticking to some of those DevOps core things that make you great. Like the stuff that Nicole Forsgren writes about in books, you know, Accelerate is on my desk for a reason because there's things that good, well-managed DevOps practices do. I see GitOps as an actual implementation of

Starting point is 00:19:17 DevOps in an open-source manner because all the tooling for GitOps these days is open-source and it all started as open-source. Now you can get like Flux or Argo. Argo specifically, there's managed services out there for it. You can have Flux and not maintain it through an add-on on EKS, for example, and it will reconcile that state for you automatically. And the thing I like to say about GitOps specifically is that it moves at the speed of the Kubernetes audit log. If you've ever looked at a Kubernetes audit log, you know it's rather noisy with all these groups and versions and kinds getting

Starting point is 00:19:54 thrown out there. So GitOps will say, oh, there's an event for said thing that I'm supposed to be watching. Do I need to change anything? Yes or no? Yes? Okay, go. And the change gets applied or, hey, there's a new Git thing. Pull it in. A change has happened in Git. I need to update it. You can set it to reconcile on events, on time. It's like a cron or it's like an event-driven architecture, but it's combined. How does this survive the stake through the heart of configuration management? Because before I was doing all this, I was, I believe, a T-shaped engineer. You're broad across a bunch of things, but deep in one or two areas, and one of mine was configuration management.

Starting point is 00:20:35 I wrote part of SaltStack once upon a time due to a bunch of very strange coincidences all hitting at once. I taught people how to use Puppet, but containers ultimately arose, and the idea of immutable infrastructure became a thing. And these days, when we're doing full-on serverless, well, great. I just wind up deploying a new code bundle to the Lambda function that I wind up caring about, and that is an immutable version replacement. There is no drift because there is no way to log in and change those things other than through a clear deployment of this is the new version that goes out there.

Starting point is 00:21:06 Where does GitOps fit in to that imagined pattern? So configuration management becomes part of your approval process, right? So you now are generating an audit log essentially of all changes to your system through the approval process that you set up as part of your, how do you get things into source and then promote that out to production. That's kind of the beauty of it, right? Like, that's why we suggest using Git, because it has functions like pull requests and issues and things like that, that you can say, hey, yes, I approve this, or hey, no, I don't approve that, we need changes. So that's kind of natively happening with Git and GitLab, GitHub, whatever implementation of Git. There's always some kind of...

Starting point is 00:21:48 Jithub is, I believe, the pronunciation. Jithub. Yeah, that's what I'm... Today I learned. Okay. Exactly. That's one of the things that I do for my lasttweetinaws.com Twitter client that I built because I needed it. And if other people want to use it, that's great. That is now deployed to 20 different AWS commercial regions simultaneously.

Starting point is 00:22:07 Wow. And that is done via, because it turns out that that's a very long-executing for loop if you start down that path, I wound up building out a GitHub Actions matrix, sorry, a Jithub Actions matrix job that winds up instantiating 20 parallel builds of the CDK deploy that goes out to each region as expected. And because that gets really expensive with native GitHub Actions runners, that's like 36 cents per deploy, and I don't know how to test my own code, so every time I have a typo, that's another quarter in the jar. Cool, but that was annoying for me,

Starting point is 00:22:41 so I built my own custom runner system that uses Lambda functions as runners running containers pulled from ECR. It runs in parallel, in less than three minutes every time I commit something, between when I press the push button and when it is out and running in the wild across all regions,

Starting point is 00:22:58 which is awesome and also terrifying because as previously mentioned, I don't know how to test my code. Yeah, so you don't know what you're deploying to 20 regions sometimes. Right, but it also means I have a pristine, recomposable build environment because I can just automatically have that go out. The fact that I'm either merging a pull request or doing a direct push because I consider main to be my feature branch,

Starting point is 00:23:18 as whenever something hits that, all the automation kicks off. That was something that I found to be transformative as far as a way of thinking about this, because I was very tired of having to tweak my local laptop environment to, oh, you didn't assume the proper role and everything failed again and you broke it. Good job. It wound up being something where I could start developing on more and more disparate platforms. And it finally is what got me away from my old development model of everything I build is on an EC2 instance.

Starting point is 00:23:47 And that means that my editor of choice was Vim. I use VS Code now for these things, and I'm pretty happy with it. Yeah. So, you know, I'm glad you brought up CDK. CDK gives you a lot of the capabilities to implement GitOps in a way that you could say, like, hey, use CDK to declare I need four Amazon EKS clusters with this size, shape, and configuration, go. Or even further, connect these EKS clusters to RDS instances and load balancers and everything else. But you put that state into Git, and then you have something that deploys it automatically upon changes. That is infrastructure as code. Now when you say, okay, main is your feature branch,

Starting point is 00:24:30 things happen on main, if this were running in Kubernetes across a fleet of clusters globe-wide in 20 regions, something like Flux or Argo would kick in and say, there's been a change to source main and we need to roll this out. And it'll start applying those changes. Now, what do you get with GitOps that you don't get with your configuration? I mean, can you roll back if you ever have, like, a bad commit that's just awful? I mean, that's really part of the process

Starting point is 00:24:56 with GitOps, is to make sure that you can, A, roll back to the previous good state, B, roll forward to a known good state, or C, promote that state up through various environments. And then having that all done declaratively, automatically, and immutably in version control with an audit log, that I think is the real power of GitOps in the sense that like, oh, so-and-so approved this change to security policy XYZ on this date at this time. And for an auditor, you just hand them a log file and like, here's everything we've ever done to our system. Done. Right? Like you could get to that state if you want to, which I think is kind of the idea of DevOps, which says take all these disparate tools and processes and procedures and culture changes, culture being the hardest part to adopt in DevOps.
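
One way to picture the roll back, roll forward, and audit log ideas above is an append-only revision store. The Python sketch below is a toy under stated assumptions, not how Git, Flux, or Argo work internally. `StateStore`, `Revision`, and their methods are hypothetical names. A rollback is recorded as a new revision, so history is never rewritten and the log still shows who approved what.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Revision:
    number: int
    state: Dict[str, object]
    approved_by: str
    message: str


class StateStore:
    """Append-only history of declared states (a stand-in for a Git repo)."""

    def __init__(self) -> None:
        self._revisions: List[Revision] = []

    def commit(self, state: Dict[str, object], approved_by: str, message: str) -> Revision:
        # Copy the dict so later edits by the caller cannot alter history.
        rev = Revision(len(self._revisions) + 1, dict(state), approved_by, message)
        self._revisions.append(rev)
        return rev

    def head(self) -> Revision:
        return self._revisions[-1]

    def rollback(self, to_number: int, approved_by: str) -> Revision:
        # Re-commit an earlier state instead of deleting newer revisions.
        target = self._revisions[to_number - 1]
        return self.commit(target.state, approved_by, f"rollback to revision {to_number}")

    def audit_log(self) -> List[str]:
        return [f"r{r.number} {r.approved_by}: {r.message}" for r in self._revisions]
```

Committing a good state, then a bad one, then calling `rollback(1, ...)` leaves three entries in `audit_log()`, with the head state equal to the first revision. That is the property an auditor cares about: nothing disappears, including the mistake.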

Starting point is 00:25:50 GitOps kind of forces a culture change where you can't do a CAB with GitOps. Those two things don't fly. You don't have a configuration management database unless you absolutely... Oh, you have a CAB now, but they're all in the comments of the pull request. Right, exactly. Like, don't push this change out until Thursday after this other thing has happened, kind of thing. Yeah, like that all happens in GitHub. But it's very democratizing in the sense that people don't have to waste time in an hour-long meeting to get their five minutes in. Right. DoorDash had a problem. As their cloud-native environments scaled and

Starting point is 00:26:26 developers delivered new features, their monitoring system kept breaking down. In an organization where data is used to make better decisions about technology and about the business, losing observability means the entire company loses their competitive edge. With Chronosphere, DoorDash is no longer losing visibility into their application suite. The key? Chronosphere is an open-source compatible, scalable, and reliable observability solution that gives the observability lead at DoorDash business confidence and peace of mind. Read the full success story at snark.cloud slash chronosphere. That's snark.cloud slash c-h-r-o-n-o-s-p-h-e-r-e. So would it be overwhelmingly cynical to suggest that GitOps is the means to implement what we've

Starting point is 00:27:17 all been pretending to have implemented for the last decade when giving talks at conferences? I wouldn't go that far. I would say that GitOps is an excellent way to implement the things you've been talking about at all these conferences for all these years. But keep in mind, the technology has changed a lot in the, what, 11, 12 years of the existence of DevOps now. I mean, we've gone from, let's try to manage whole servers immutably to, oh, now we just need to maintain an orchestration platform and run containers. That whole compute interface, you go from SSH to a Dockerfile. That's a big leap, right? Like, you don't have bespoke sysadmins.

Starting point is 00:28:04 You have, like, a platform team. You don't have DevOps engineers. They're part of that platform team or DevOps teams, right? Like, which was kind of antithetical to the whole idea of DevOps, to have a DevOps team. You know, everybody's kind of in the same boat. What's changing in GitOps and Kubernetes land is, like, a platform team that manages the cluster and its state and health in production, essentially. Then you have your developers deploying what they want to deploy in whatever namespace they've been given access to, with whatever rights they have.

Starting point is 00:28:38 So now you have the potential for one set of people, the platform team, to use one set of GitOps tooling. And your applications teams might not like that, and that's fine. They can have their own namespaces with their own tooling in it. Like Argo, for example, is preferred by a lot of developers because it has a nice UI with green and red dots, and they can show people, and it looks nice. Flux, it's command line based. And there are some projects out there that kind of take the UI of Argo and try to run Flux underneath that. And those are cool kind of projects, I think, in my mind. But in general, right, I think GitOps gives you the choice that we missed somewhat in DevOps implementations of the past because it was, oh, we need to go get cloud.

Starting point is 00:29:27 Well, you can only use this cloud. Oh, we need to go get this thing. Well, you can only use this thing in-house. And there's a lot of restrictions sometimes placed on what you can use in your environment. Well, if your environment is Kubernetes, how do you restrict what you can run, right? You can't have an easily configured, say, no open source policy if you're running Kubernetes. So it becomes, you know- Well, that doesn't stop some companies from trying.

Starting point is 00:29:55 Yeah, that's true. But the idea of like enabling your developers to deploy at will and then promote their changes as they see fit is really the dream of DevOps, right? Like same with production and platform teams, right? I want to push my changes out to a larger system that is across the globe. How do I do that? How do I manage that? How do I make sure everything's consistent? GitOps gives you those ways, with Kubernetes-native things like Kustomizations, to make consistent environments that are robust

Starting point is 00:30:27 and actually going to be reconciled automatically if someone breaks the glass and says, oh, I need to run this container immediately. Well, that's going to create problems because it's deviated from state and it's just that one region, so we'll put it back into state. But if you're dueling banjos at some point, you'll try doing something manually, and it gets reverted automatically.

Starting point is 00:30:48 I love that pattern. You'll get bored before the computer does, always. Yeah, and GitOps is very new, right? When you think about the lifetime of GitOps, I think it was coined in like 2018, so it's only four years old. I prefer it to ChatOps, at least, as far as implementation and expression. ChatOps was a way to do DevOps. I think GitOps gives you a... Well, ChatOps is also a way to wind up giving whoever gets access to your Slack workspace root in production, but that's neither here nor there.

Starting point is 00:31:14 Yeah, we all like to pretend that that's not a giant security issue in our industry, but that's a topic for another time. Yeah, and that's why GitOps also depends upon you having good security and good authorization and approval processes it enforces. Yeah, who doesn't have one of those? Yeah, if it's a solo operation kind of deal, like in your setup, your case, I think you've kind of got it right, right? Like, as far as GitOps goes. To be clear, we are 11 people, and we do have dueling pull requests and all the rest, but most of the stuff I talk

Starting point is 00:31:46 about publicly is not our production stuff, so it really is just me. Just as a point of clarity there, the 11 people here do not all, the rest of them do not just sit there and clap as I do all the work most days. Right. No, I'm sure they don't. I'm almost certain they don't clap for you. No. No, they try to talk me out of it. Yeah, exactly.

Starting point is 00:32:02 So the setup that you, Corey Quinn, have implemented to deploy to these 20 regions is kind of very GitOps-y in the sense that when main changes, it gets updated. Where it's not GitOps-y is what if the endpoint changes? Does it get reconciled? The piece you're probably missing is that continuous reconciliation component, where it's constantly checking and saying, this thing out there is deployed in the way I want it, you know, the way I declared it to be in my source of truth. Yeah, when you start having other people getting involved, there can... Yeah, that's where regressions enter. And it's like, well, I know where things are, so why would I change the endpoint? Yeah, it turns out not everyone has the state of the entire application in their head. That really should live in, you know, Git or S3.
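
The continuous reconciliation idea described here can be sketched in a few lines. This is a minimal illustration, not how Flux or Argo actually implement it: the `State` type, the `diff` and `reconcile` functions, and the resource names and image tags are all hypothetical, and a plain dictionary stands in for both the Git source of truth and the live environment.

```python
from typing import Dict, List, Tuple

# Hypothetical model: resource name -> declared spec (here, an image tag).
State = Dict[str, str]


def diff(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return the (action, name) pairs needed to make actual match desired."""
    actions: List[Tuple[str, str]] = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Run one reconciliation pass, mutating actual, and report what changed."""
    actions = diff(desired, actual)
    for action, name in actions:
        if action == "delete":
            del actual[name]
        else:
            # Both create and update copy the declared spec into the live state.
            actual[name] = desired[name]
    return actions


# Source of truth as it might be declared in Git (illustrative values only).
desired = {"web": "web:1.4", "api": "api:2.0"}
# Live state after someone "broke the glass" and changed things by hand.
actual = {"web": "web:1.4", "api": "api:2.1-hotfix", "debug": "busybox"}

first_pass = reconcile(desired, actual)   # reverts the manual changes
second_pass = reconcile(desired, actual)  # nothing left to do
```

In a real controller this pass would run on a timer or in response to change events, which is why a manual change keeps getting reverted until the source of truth itself is updated.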

Starting point is 00:32:45 Yeah, exactly. When I think about interactions of the past coming in as a new DevOps engineer to work with developers, it's always been, well, developers have access to prod or they don't. And if you're in that environment where you're trying to run a multi-billion dollar operation and your devs have a direct or one dev has direct access to prod because prod is in his brain. That's where it's like, well, now wait a minute.

Starting point is 00:33:10 Prod doesn't have to be only in your brain. You can put that in the code base and now we know what is in your brain, right? Like you can almost do, if you document your code well, you can have your full life cycle right there in

Starting point is 00:33:25 one place, including documentation, which I think is the best part too. So, you know, it encourages approval processes and automation over this one person has an entire state of the system in their head. They have to go in and fix it. And what if they're not on call or in Jamaica or on a cruise ship somewhere kind of thing? Things get difficult. Like, for example, I just got back from vacation. We were so far off the grid, we had satellite internet. And let me tell you, it was hard to write an email newsletter where I usually open 50 to 100 tabs. There's a little bit of internet out of California way. Yeah. It's always weird going from like, especially after the pandemic, I have gigabit symmetric here and going even to re:Invent where I'm trying to upload a bunch of video and whatnot. And the conference Wi-Fi was doing its thing and well, Verizon 5G was

Starting point is 00:34:16 there, but spotty and yeah, usual stuff. Yeah. It's amazing to me how connectivity has become so ubiquitous. To the point where when it's not there anymore, it's what do I do with myself? Same story about people pushing back against remote development of, oh, I'm just going to do it all on my laptop because what happens if I'm on a plane? It's, yeah, the year before the pandemic, I flew 140,000 miles domestically, and I was almost never hamstrung by my ability to do work. And my only local computer is an iPad for those things. So it turns out that is less of a real-world concern for most folk. Yeah, I actually ordered the components to, uh, upgrade an old NUC that I have here and turn it into my, like, this is my remote

Starting point is 00:34:58 code server that's going to be all attached to GitHub and everything else. That's where I want to be: have Tailscale and just VPN into this box. Tailscale is transformative. Yes. Tailscale will change your life. That's just my personal opinion. That's not AWS's opinion or anything. But yeah, when you start thinking about your network as it could be anywhere, that's where Tailscale really shines. Tailscale makes the internet work like we all wanted to believe that it worked.

Starting point is 00:35:28 WireGuard is an excellent open source project and Tailscale consumes that and puts an amazingly easy to use UI and troubleshooting tools and routing and all kinds of forwarding capabilities and makes it kind of easy, which is

Starting point is 00:35:43 really, really, really kind of awesome. And Tailscale and Kubernetes have a good story. Yeah, networking and easy don't belong in the same sentence, but in this case, they do. Yeah, and trust me, the Kubernetes story and Tailscale, there is a lot of there there. I understand you might want to

Starting point is 00:35:57 not open ports in your VPC maybe, but if you use Tailscale, that node is just another thing on your network. You can connect to that and see what's going on. Your management cluster is just another thing on the network where you can watch the state. But it's all, you're connected to it continuously through Tailscale. Or, you know, it's a much lighter-weight kind of meshy VPN, I would say, if I had to sum it up in one sentence. That was not on our agenda to talk about at all. Anyways. No, no. I love how different topics we talk about on these things. We'll have to have you back soon to talk again. I really want to thank you for being so generous

Starting point is 00:36:36 with your time. If people want to learn more about what you're up to and how you view these things, where can they find you? Go to chrisshort.net. So Chris, short. I'm 6'4", so remember it's short.net. And you will find all the places that I write. You can go to devopsish.com to subscribe to my newsletter, which goes out every week this year. Next year, there'll be breaks. And then finally, if you want to follow me on Twitter, Chris Short, @ChrisShort on Twitter, all one word. So you'll see two S's. Like, it's okay. There's two S's there.

Starting point is 00:37:11 Links to all of that will, of course, be in the show notes. It's easy for people to do the clicky, clicky thing as a general rule. Clicky things are easier than the wordy things, yes. Says the Kubernetes guy. Yeah, says the Kubernetes guy. Yeah, you like that, huh? Like I said, Argo gives you a UI. Thank you so much for your time. I really do appreciate it.

Starting point is 00:37:30 Thank you. This has been fun. If folks have questions, feel free to reach out. I am not one of those people that hides behind a screen all day and doesn't respond. I will respond to you eventually. I'm right here, Chris. Come on, come on. You're calling me out in front of myself. It might take a day or two, but I will respond. I promise. Thanks again for your time. This has been Chris Short, Senior Developer Advocate at AWS. I'm Cloud Economist

Starting point is 00:37:55 Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, and if it's YouTube, click the thumbs-up button. Whereas if you've hated this podcast, same thing. Smash the buttons, five-star review, and leave an insulting comment that is written in syntactically correct YAML

Starting point is 00:38:14 because it's just so easy to do. If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started. This has been a HumblePod production.

Starting point is 00:39:04 Stay humble.
