Screaming in the Cloud - Kubernetes and OpenGitOps with Chris Short
Episode Date: July 14, 2022
About Chris
Chris Short has been a proponent of open source solutions throughout his over two decades in various IT disciplines, including systems, security, networks, DevOps management, and cloud native advocacy across the public and private sectors. He currently works on the Kubernetes team at Amazon Web Services and is an active Kubernetes contributor and Co-chair of OpenGitOps. Chris is a disabled US Air Force veteran living with his wife and son in Greater Metro Detroit. Chris writes about Cloud Native, DevOps, and other topics at ChrisShort.net. He also runs the Cloud Native, DevOps, GitOps, Open Source, industry news, and culture focused newsletter DevOps’ish.
Links Referenced:
DevOps’ish: https://devopsish.com/
EKS News: https://eks.news/
Containers from the Couch: https://containersfromthecouch.com
opengitops.dev: https://opengitops.dev
ChrisShort.net: https://chrisshort.net
Twitter: https://twitter.com/ChrisShort
 Transcript
    
                                         Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at the
                                         
                                         Duckbill Group, Corey Quinn.
                                         
                                         This weekly show features conversations with people doing interesting work in the world
                                         
                                         of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
                                         
                                         for which Corey refuses to apologize.
                                         
                                         This is Screaming in the Cloud.
                                         
                                         Let's face it.
                                         
                                         On-call firefighting at 2 a.m. is stressful.
                                         
    
                                         So there's good news and there's bad news.
                                         
                                         The bad news is that you probably can't prevent incidents from happening.
                                         
                                         But the good news is that Incident.io makes incidents less stressful
                                         
                                         and a lot more valuable. Incident.io is a Slack-native incident management platform.
                                         
                                         It allows you to automate incident processes, focus on fixing the issues, and learn from
                                         
                                         incident insights to improve site reliability and fix your vulnerabilities. Try Incident.io to recover faster and sleep more.

                                         systems. I've got five bucks on DNS, personally. Why scroll through endless dashboards while
                                         
                                         dealing with alert floods, going from tool to tool to tool that you employ, guessing at which
                                         
                                         puzzle pieces matter? Context switching and tool sprawl are slowly killing both your team
                                         
    
                                         and your business. You should care more about one of those than the other. Which one is up to you?
                                         
                                         Drop the separate pillars and enter a world of getting one
                                         
                                         unified understanding of the one thing driving your business, production. With Honeycomb,
                                         
                                         you guess less and know more. Try it for free at honeycomb.io/screaminginthecloud.
                                         
                                         Observability, it's more than just hipster monitoring. Welcome to Screaming in the Cloud. I'm Corey Quinn.
                                         
                                         Coming back to us since episode two, it's always nice to go back and see the "where are they now" type of approach.
                                         
                                         I am joined by senior developer advocate at AWS, Chris Short.
                                         
                                         Chris, been a few years. How's it been?
                                         
    
                                         Corey, we have talked outside of the podcast,
                                         
                                         but it's been good for those that have been listening. I think when we recorded, I wasn't
                                         
                                         even like, when was season two? What year was that? I think episode two was pre-pandemic and
                                         
                                         the rest. Oh, so yeah, I was at Red Hat maybe. Yeah, you were at Red Hat stuff back when you
                                         
                                         got to work on open source stuff as opposed to now where you're not within a thousand miles of that stuff, right? Actually, well, no.
                                         
                                         So to be clear, I'm on the EKS team, the Kubernetes team here at AWS. So when I joined AWS in October,
                                         
                                         they were like, hey, you do open source stuff. We'd like that. Do more. And I was like, wait,
                                         
                                         do more? And they were like, yes, do more. I was like, okay. So
                                         
    
                                         since joining AWS, I've probably done more open source work than the three years at Red Hat that
                                         
                                         I did. So that's kind of like, it's an interesting point when I talk to people about it because the
                                         
                                         first couple of months are like, you know, my friends are like, so are you liking it? Are you
                                         
                                         enjoying it? What's going on?
                                         
                                         Do they beat you with reeds? All the questions people have about companies.
                                         
                                         Right. I get a lot of random questions about Amazon and AWS that I don't know the answer to.
                                         
                                         When I started telling people I fixed Amazon bills, I had to quickly pivot that to AWS bills.
                                         
                                         People started asking me, well, can you save me money on underpants?
                                         
    
                                         Fine, get the Prime credit card, docks 5% off the bill, so there you go. But other than that, no, I can't. No. Yeah. And like, I had to call my bank this morning about a transaction that I didn't
                                         
                                         recognize and it was from Amazon. And I was like, that's weird. Why would that... Money should flow

                                         one direction, and that's the wrong direction, from my employer. Like, what is going on here?
                                         
                                         It shouldn't have been on that card kind of thing.
                                         
                                         And I had to explain to the person on the phone that I do work at Amazon, but under the web services team.
                                         
                                         And he's like, oh, so you're in IT?
                                         
                                         And I'm like, no, it's actually this big company that it's a cloud company.
                                         
    
                                         And they're like, oh, okay. Yeah, the cloud. Got it. So it's interesting talking to
                                         
                                         people about, "I work at Amazon." "Oh, my son works at an Amazon distribution center." Cool. I know

                                         about that, but very little. I do this. "Oh, your son works at an Amazon distribution center? Is he a

                                         robot?" is normally my next question on that. But yeah, it's neither here nor there.
                                         
                                         So you and I started talking a while back. We both write newsletters that go to a somewhat
                                         
                                         similar audience. You write DevOps’ish. I write Last Week in AWS. And recently, you also started
                                         
                                         EKS News. Because yeah, the one thing I look at when I'm doing these newsletters every week is,
                                         
                                         you know what I want to do? That's right. Write more newsletters. So you are just a glutton for punishment. And yeah, welcome to the addiction,
                                         
    
                                         I suppose. How's it been going for you? It's actually been pretty interesting, right? Like,
                                         
                                         we haven't pushed it very hard. We're now starting to include it in things. Like, we did Container
                                         
                                         Day. We made sure that EKS News was on the landing page for Container Day at KubeCon EU.
                                         
                                         And it's kind of just grown organically since then. But it was one of those things where it's like, that EKS News was on the landing page for Container Day at KubeCon EU.
                                         
                                         And, you know, it's kind of just grown organically since then.
                                         
                                         But it was one of those things where it's like,
                                         
                                         internally, this happened at Red Hat, right?
                                         
                                         When I started live streaming at Red Hat,
                                         
    
                                         the ultimate goal was to do our product management "here's what's new in the next version" thing,

                                         and to do those live,

                                         so anybody could see that at any point in time,
                                         
                                         anywhere on Earth, the second it's available. Similar situation to here, this newsletter actually is generated as part of a report my boss puts together to brief our other DAs or developer advocates,
                                         
                                         you know, our solutions architects, the whole nine yards about new EKS features. So I was like, why can't we just flip that into a weekly newsletter?
                                         
                                         You know, like I can pull from the same sources you can.
                                         
                                         And what's interesting is he only does the meeting biweekly.
                                         
    
                                         So there's some weeks where it's just all me doing it.
                                         
                                         And he ends up just kind of copying and pasting the newsletter into his document and then adds on for the week. But that report meeting for that team
                                         
                                         is now getting disseminated to essentially anyone that subscribes to eks.news. Just go to the site.
                                         
                                         There's a subscribe thing right there. And we've gotten 20 issues in and it's gotten rave reviews,
                                         
                                         right? I have been a subscriber for a while. I will say that it has less Chris Short
                                         
                                         personality to it than DevOps-ish does, which I have to assume is by design. A lot of the
                                         
                                         Duckbill Group's marketing these days is no longer in my voice, rather intentionally,
                                         
                                         because it turns out that being a sarcastic jackass and doing half-billion-dollar AWS contracts

                                         tend not to be the most congruent thing in the world. So, okay,

                                         we're slowly ameliorating that. Yeah, it's professional voice versus snarky voice. Well,

                                         and here's the thing, right? Like, I realized this year with DevOps’ish that, like, if I want to take

                                         a week off, I have to do, like, what you did when your child was born. You hired folks to, like, do

                                         the newsletter for you, or I actually don't do the newsletter.
                                         
                                         It's binary, hire someone else to do it or don't do it.
                                         
                                         So the way I structured this newsletter was that
                                         
                                         any developer advocate on my team could jump in and take over
                                         
    
                                         the newsletter so that if I'm off that week or whatever may be happening,
                                         
                                         I, Chris Short, am not the voice. It is now the entire
                                         
                                         developer advocate team. I will challenge you on that a bit because it's not Chris Short voice,
                                         
                                         that's for sure, but it's also not official AWS brand voice either. It is clearly written by a
                                         
                                         human being who is used to communicating with the audience for whom it is written.
                                         
                                         And that is no small thing.
                                         
                                         Normally, "oh, there's a corporate newsletter" is just a lot of words to say it's bad.
                                         
                                         This one is good.
                                         
    
                                         I want to be very clear on that.
                                         
                                         Yeah.
                                         
                                         I mean, just like DevOps’ish, we have sections, just like your newsletter.

                                         There are certain sections.

                                         So any new "What's New" announcements, those go in automatically.
                                         
                                         So like that can get delivered to your inbox every Friday.
                                         
                                         Same thing with new blog posts about anything containers related to EKS.
                                         
                                         Those will be in there.
                                         
    
                                         Then Containers from the Couch, our streaming platform, essentially, for all things Kubernetes.
                                         
                                         Those videos go in.
                                         
                                         And then there's some ecosystem news as well that I collect and put in the newsletter to give people a broader sense of what's going on out there in Kubernetes land.
                                         
                                         Because let's face it, there's upstream and then there's downstream. And sometimes those
                                         
                                         aren't in sync. And that's normal. That's how Kubernetes kind of works sometimes. If you're running upstream Kubernetes, you are awesome. I appreciate you. But I feel like
                                         
                                         that would cause more problems than it's worth sometimes. Thank you for being the trailblazers;

                                         the rest of us can learn from your misfortune. Yeah, exactly right. Like, please file your

                                         bugs accordingly. EKS is interesting to me because I don't see a lot of it, which is

                                         probably going to get a whole

                                         lot of "wait, what?" moments, because, wait, don't you deal with very large AWS bills? And I do.
                                         
                                         But what I mean by that is that EKS, until you're using its Fargate expression,
                                         
                                         charges for the control plane, which rounds to no money, and the rest is running on EC2 instances running in a company's account.
                                         
                                         From the billing perspective, there is no difference between we are running massive
                                         
                                         fleets of EKS nodes and we're managing a whole bunch of EC2 instances by hand.
                                         
                                         And that feels like an interesting allegory for how Kubernetes winds up expressing itself
                                         
                                         to cloud providers.
                                         
    
                                         Because from a billing perspective, it just looks like one big single-tenant application
                                         
                                         that has some really strange behaviors internally.
                                         
                                         It gets very chatty across AZs when there's no reason to, and whatnot.
                                         
                                         And it becomes a very interesting study
                                         
                                         in how to expose aspects of what's going on
                                         
                                         inside of those containers
                                         
                                         and inside of that Kubernetes environment
                                         
    
                                         to the cloud provider in a way that becomes actionable.
                                         
                                         There are no good answers for this yet,
                                         
                                         but it's something I've been seeing a lot of.
                                         
                                         It's like, oh, I thought you'd be running Kubernetes.
                                         
                                         Oh, wait, you are, and I just keep forgetting what I'm looking at sometimes.
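To put rough numbers on the billing point above, here is a minimal sketch of how small a flat control plane charge is next to the EC2 nodes it manages. The hourly rates, the 730-hour month, and the function names are illustrative assumptions, not figures quoted in the conversation.

```python
HOURS_PER_MONTH = 730  # common approximation used for monthly billing math


def monthly_cost(hourly_rate: float, count: int = 1) -> float:
    """Return the monthly cost of `count` resources billed at `hourly_rate`."""
    return hourly_rate * HOURS_PER_MONTH * count


# Assumed, illustrative rates (not taken from the episode).
CONTROL_PLANE_HOURLY = 0.10  # flat fee per cluster
NODE_HOURLY = 0.192          # one EC2 worker node


def control_plane_share(node_count: int) -> float:
    """Fraction of the total bill attributable to the control plane."""
    control_plane = monthly_cost(CONTROL_PLANE_HOURLY)
    nodes = monthly_cost(NODE_HOURLY, node_count)
    return control_plane / (control_plane + nodes)


if __name__ == "__main__":
    for n in (3, 50, 500):
        share = control_plane_share(n)
        print(f"{n:>4} nodes: control plane is {share:.2%} of the bill")
```

Under these assumptions the control plane share shrinks toward zero as the node count grows, which is why the bill ends up looking like plain EC2 usage.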
                                         
                                         So that's an interesting point.
                                         
                                         The billing is kind of like, yeah, it's just compute, right?
                                         
                                         And my insight into AWS and the way I start thinking about it
                                         
    
                                         is always from a billing perspective. It's great. It's because that means the more expensive a service is,
                                         
                                         the more I know about it. It's like, "IAM, what is that?" Like, "Oh, I have no idea. It's free.

                                         How important could it be?" Professional advice: do not take that philosophy, ever. Security,

                                         it matters. Oh my God, you are all stars. Your IAM policy should not be. I digress.
                                         
                                         Yeah. Anyways. So two points I want to

                                         make real quick on that. One, we've recently released an open source project called Karpenter,
                                         
                                         which is really cool in my purview because it looks at your Kubernetes file and says, oh,
                                         
                                         you want this to run on an ARM instance. And you can even go so far as to say, right,
                                         
    
                                         here's my limits,
                                         
                                         and it'll find an instance that fits those limits
                                         
                                         and add that to your cluster automatically,
                                         
                                         run your pod on that compute
                                         
                                         as long as it needs to run,
                                         
                                         and then if it's done,
                                         
                                         it'll downsize your cluster eventually, kind of thing.
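The instance-aware selection described above can be sketched roughly as follows. This is not the autoscaler's actual algorithm or API; the catalog entries, instance names, prices, and the `pick_instance` helper are all hypothetical and only illustrate "find the cheapest instance that fits the pod's architecture and limits".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    arch: str            # "arm64" or "amd64"
    vcpus: float
    memory_gib: float
    hourly_price: float


@dataclass(frozen=True)
class PodRequest:
    arch: str
    vcpus: float
    memory_gib: float


# Hypothetical catalog; names and prices are placeholders, not real offerings.
CATALOG: List[InstanceType] = [
    InstanceType("arm-small", "arm64", 2, 4, 0.04),
    InstanceType("arm-large", "arm64", 8, 16, 0.15),
    InstanceType("x86-small", "amd64", 2, 4, 0.05),
    InstanceType("x86-large", "amd64", 8, 16, 0.19),
]


def pick_instance(pod: PodRequest,
                  catalog: List[InstanceType] = CATALOG) -> Optional[InstanceType]:
    """Return the cheapest instance matching the pod's arch and limits, or None."""
    candidates = [
        inst for inst in catalog
        if inst.arch == pod.arch
        and inst.vcpus >= pod.vcpus
        and inst.memory_gib >= pod.memory_gib
    ]
    # min() with default=None handles the case where nothing fits.
    return min(candidates, key=lambda inst: inst.hourly_price, default=None)


if __name__ == "__main__":
    choice = pick_instance(PodRequest(arch="arm64", vcpus=4, memory_gib=8))
    print(choice.name if choice else "no instance fits")
```

A real autoscaler also has to consider packing several pods per node, availability, and scale-down, none of which this sketch attempts.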
                                         
                                         So you can basically just throw a bunch of workloads at it, it'll auto detect what kind of compute you will need and then provision it for
                                         
    
                                         you, run it and then be done. So one way folks are probably starting to save money running

                                         EKS is to adopt Karpenter as your autoscaler as opposed to the inbuilt Kubernetes autoscaler
                                         
                                         because this is instance-aware, essentially.
                                         
                                         So it can say like,
                                         
                                         oh, your massive ARM application can run here
                                         
                                         because, you know, thank you, Graviton.
                                         
                                         We have those processors in-house.
                                         
                                         And, you know, you can run your ARM64 instances.
                                         
    
                                         You can run all the Intel workloads you want.
                                         
                                         And it'll right-size the compute for
                                         
                                         your workloads. And it'll look at one container or all your containers, however you want to
                                         
                                         configure it. Secondly, the good folks over at Kubecost have OpenCost, which is the open source

                                         version of Kubecost, basically. So they have a service that you can run in your clusters that will help you say,
                                         
                                         hey, maybe this one node's too heavy.
                                         
                                         Maybe this one node's too light.
                                         
                                         And, you know, give you some insights
                                         
    
                                         into Kubernetes spend
                                         
                                         that are a little bit more granular
                                         
                                         as far as usage and things like that go.
                                         
                                         So those two projects right there,
                                         
                                         I feel like will give folks
                                         
                                         an optimal savings experience
                                         
                                         when it comes to Kubernetes.
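As a rough illustration of the "this node's too heavy, this node's too light" insight, here is a minimal sketch that labels nodes by how much of their CPU is requested. It does not use OpenCost's API; the thresholds, node names, and the `classify_nodes` helper are assumptions made up for the example.

```python
from typing import Dict


def classify_nodes(requested_cpu: Dict[str, float],
                   allocatable_cpu: Dict[str, float],
                   low: float = 0.30,
                   high: float = 0.85) -> Dict[str, str]:
    """Label each node by the ratio of requested CPU to allocatable CPU."""
    labels: Dict[str, str] = {}
    for node, capacity in allocatable_cpu.items():
        if capacity <= 0:
            # Skip nodes that report no usable capacity.
            continue
        utilization = requested_cpu.get(node, 0.0) / capacity
        if utilization > high:
            labels[node] = "too heavy"
        elif utilization < low:
            labels[node] = "too light"
        else:
            labels[node] = "ok"
    return labels


if __name__ == "__main__":
    # Hypothetical cluster: CPU cores requested by pods vs. cores available.
    requested = {"node-a": 7.5, "node-b": 0.5, "node-c": 2.0}
    allocatable = {"node-a": 8.0, "node-b": 4.0, "node-c": 4.0}
    for node, label in classify_nodes(requested, allocatable).items():
        print(f"{node}: {label}")
```

A cost tool would typically weigh memory, storage, and price per node as well; CPU requests alone are used here only to keep the sketch short.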
                                         
                                         But to your point, it's just compute, right?
                                         
    
                                         And that's really how we kind of treat it here internally: it's a way to run compute, Kubernetes or ECS or any of those tools.
                                         
                                         A fairly expensive one because ignoring entirely for a second the actual raw cost of compute, you also have the other side of it, which is in every environment, unless you are doing something very strange or pre-funding as a one-person startup in your spare time, your payroll costs will and should exceed your AWS bill by a fairly healthy amount.
                                         
                                         And engineering time is always more expensive than services time. So, for example, looking at EKS, I would absolutely
                                         
                                         recommend people use that rather than rolling their own because get out of that engineering
                                         
                                         space where your time is free. I assure you from a business context, it is not. So there's always
                                         
                                         that question of what you can do to make things easier for people and do more of the heavy lifting.
                                         
                                         Yeah. And to your rather cheeky point that there's 17 ways to run a container on AWS,
                                         
                                         it is answering that question, right? Like those 17 ways, like how much of this do you want to run
                                         
    
                                         yourself? You could run EKS Distro on EC2 instances if you want full control over your environment.

                                         And then I run IoT Greengrass Core on top within that cluster so that I can run my own Lambda function runtime

                                         so I'm not locked in. Also DynamoDB Local so I'm not locked into AWS, at which point I have gone
                                         
                                         so far around the bend no one can help me. Pro tip, don't do that. Just don't do that.
                                         
                                         But to your point, we have all these options for compute and specifically containers because
                                         
                                         there's a lot of people that want to granularly say,
                                         
                                         this is where my engineering team gets involved.
                                         
                                         Everything else you handle.
                                         
    
                                         If you want EKS Spot Instances only,

                                         you can do that.

                                         If you want EKS to use Karpenter
                                         
                                         and say only run ARM workloads,
                                         
                                         you can do that. If you want to say Fargate and
                                         
                                         not have anything to manage other than the container file, you can do that. It's how much
                                         
                                         does your team want to manage? That's the customer obsession part of AWS coming through when it comes
                                         
                                         to containers, because there's so many different ways to run those workloads, but there's also so many different ways to make sure that your team is right-sized based off the
                                         
    
                                         services you're using. I do want to change gears a bit here because you are mostly known for a
                                         
                                         couple of things. The DevOps’ish newsletter, because that is the oldest and longest thing
                                         
                                         you've been doing in the time that I've known you. EKS, obviously. But when prepping for this show,
                                         
                                         I discovered you are now co-chair
                                         
                                         of the OpenGitOps project.
                                         
                                         Yes.
                                         
                                         So I have heard of GitOps in the context of,
                                         
                                         oh, it's just basically your CI/CD stuff
                                         
    
                                         is triggered by Git events and whatnot.
                                         
                                         And I'm sitting here going,
                                         
                                         okay, so from where you're sitting,
                                         
                                         the two best user interfaces in the world
                                         
                                         that you have discovered are YAML and Git.
                                         
                                         And I just have to start with the question, who hurt you?
                                         
                                         Yeah, I share your sentiment when it comes to Git.
                                         
                                         Not so much with YAML, but I think it's because I'm so used to it.
                                         
    
                                         Maybe it's Stockholm Syndrome, maybe the whole YAML thing. I don't know.
                                         
                                         Well, it's no XML. We'll put it that way.
                                         
                                         Thankfully, yes. Because if it was, I would have way more, like, just template files laying around to build things.
                                         
                                         And rage. Don't forget rage.
                                         
                                         And rage. Yeah. So GitOps is a little bit more than just Git and IaC, infrastructure as code.
                                         
                                         It's more like, Justin Garrison, who's also on my team, he calls it
                                         
                                         infrastructure as software because there's four main principles to GitOps. And if you go to
                                         
                                         opengitops.dev, you can see them. It's version one. So we put them on the website right there
                                         
                                         on the page. You have to have a declared state and that state has to live somewhere. Now it's
                                         
                                         called GitOps because Git is probably the most full-featured thing
                                         
    
                                         to put your state in, but you could use an S3 bucket
                                         
                                         and just version it, for example, and make it private
                                         
                                         so no one else can get to it.
                                         
                                         Or you could use local files.
                                         
                                         Copy of, copy of, this thing, restored, parentheses,
                                         
                                         use this one, dot final, dot doc, dot zip.
                                         
                                         You know, my preferred naming convention.
                                         
                                         Ah, yeah, wow, okay.
                                         
                                         Everything I touch is terrifying.
                                         
                                         Yes, geez, I'm sorry. So first, it's declarative. You declare your
                                         
                                         state, you store it somewhere. It's versioned and immutable, like I said. And then pulled automatically.
                                         
                                         Don't focus so much on pull, but basically software agents are applying the desired state from source.
                                         
                                         So what does that mean when it's, you know, the fourth principle is implemented, continuously reconciled.
                                         
                                         That means those software agents that are checking your desired state are actually putting it back into the desired state if it's out of whack, right?
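To make the "continuously reconciled" idea concrete, here is a minimal sketch in Python. It is not how Flux or Argo are implemented; the names `desired`, `actual`, `diff`, and `reconcile` are illustrative assumptions, and the state is reduced to a flat dictionary so the comparison is easy to follow.

```python
from typing import Dict, Optional, Tuple

State = Dict[str, str]


def diff(desired: State, actual: State) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {key: (actual_value, desired_value)} for every key that has drifted."""
    drift = {}
    for key in desired.keys() | actual.keys():
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = (have, want)
    return drift


def reconcile(desired: State, actual: State) -> State:
    """Return a copy of the actual state converged onto the desired state."""
    converged = dict(actual)
    for key, (_, want) in diff(desired, actual).items():
        if want is None:
            # The resource is not declared in source, so it is removed.
            converged.pop(key, None)
        else:
            # The resource is missing or different, so the declared value wins.
            converged[key] = want
    return converged


# Declared state, as an agent might read it from a versioned source (hypothetical values).
desired = {"replicas": "3", "image": "web:1.4.2"}
# Live state after someone changed things by hand.
actual = {"replicas": "5", "image": "web:1.4.2", "debug-pod": "running"}

actual = reconcile(desired, actual)
```

A real agent would run this comparison repeatedly, on a timer or in response to events, so that any manual change is pulled back to what source declares.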
                                         
                                         So you're talking about agents running it persistently on instances, validating a checkpoint
                                         
                                         on a cron. How is this meaningfully different than a Puppet agent running in years past? I learned to
                                         
                                         speak publicly by being a traveling trainer for Puppet, same type of model. And in fact, when I
                                         
                                         was at Pinterest, that was their entire model, where the Puppet code would live in an S3 bucket that was then
                                         
                                         copied down, I believe via Git, and then applied to the instance on a schedule. That sounds like
                                         
                                         this was sort of an early-days GitOps.
                                         
                                         Yeah, exactly. Right. Like, so it's, I like to think of that as a component of GitOps, right? DevOps, when you talk about DevOps in general, there's a lot of stuff out there. There's a lot of things labeled DevOps that maybe are or maybe aren't sticking to some of those DevOps core things that make you great. Like the stuff that Nicole Forsgren writes about in books, you know, Accelerate is
                                         
                                         on my desk for a reason because
                                         
                                         there's things that good,
                                         
                                         well-managed DevOps practices
                                         
                                         do. I see
                                         
                                         GitOps as an actual implementation
                                         
                                         of
                                         
    
                                         DevOps in an open-source
                                         
                                         manner because all the tooling for GitOps
                                         
                                         these days is open-source and it all started
                                         
                                         as open-source. Now you can get like Flux or Argo. Argo specifically, there's managed services out there
                                         
                                         for it. You can have Flux and not maintain it through an add-on on EKS, for example, and it will
                                         
                                         reconcile that state for you automatically. And the thing I like to say about GitOps specifically
                                         
                                         is that it moves at the speed of the Kubernetes audit log. If you've ever looked at a Kubernetes
                                         
                                         audit log, you know it's rather noisy with all these groups and versions and kinds getting
                                         
    
                                         thrown out there. So GitOps will say, oh, there's an event for said thing that I'm supposed to be
                                         
                                         watching. Do I need to change anything? Yes or no? Yes? Okay, go. And the
                                         
                                         change gets applied or, hey, there's a new Git thing. Pull it in. A change has happened in Git.
                                         
                                         I need to update it. You can set it to reconcile on events, on time. It's like a cron or it's like
                                         
                                         an event-driven architecture, but it's combined. How does this survive the stake through the heart of configuration management?
                                         
                                         Because before I was doing all this, I was, I believe, a T-shaped engineer.
                                         
                                         You're broad across a bunch of things, but deep in one or two areas,
                                         
                                         and one of mine was configuration management.
                                         
    
                                         I wrote part of SaltStack once upon a time
                                         
                                         due to a bunch of very strange coincidences all hitting at once.
                                         
                                         I taught people how to use Puppet, but containers ultimately arose,
                                         
                                         and the idea of
                                         
                                         immutable infrastructure became a thing. And these days, when we're doing full-on serverless, well,
                                         
                                         great. I just wind up deploying a new code bundle to the Lambda function that I wind up caring
                                         
                                         about, and that is an immutable version replacement. There is no drift because there is no way to log
                                         
                                         in and change those things other than through a clear deployment of this is the new version that goes out there.
                                         
    
                                         Where does GitOps fit in to that imagined pattern? So configuration management becomes part of your
                                         
                                         approval process, right? So you now are generating an audit log essentially of all changes to your
                                         
                                         system through the approval process that you set up as part of your, how do you get things into
                                         
                                         source and then promote that out to production. That's kind of the beauty of it, right? Like,
                                         
                                         that's why we suggest using Git, because it has functions like pull requests and issues and things
                                         
                                         like that, that you can say, hey, yes, I approve this, or hey, no, I don't approve that, we need
                                         
                                         changes. So that's kind of natively happening with Git and GitLab, GitHub, whatever implementation of Git.
                                         
                                         There's always some kind of...
                                         
    
                                         Jithub is, I believe, the pronunciation.
                                         
                                         Jithub.
                                         
                                         Yeah, that's what I'm...
                                         
                                         Today I learned. Okay.
                                         
                                         Exactly.
                                         
                                         That's one of the things that I do for my lasttweetinaws.com Twitter client that I built because I needed it.
                                         
                                         And if other people want to use it, that's great.
                                         
                                         That is now deployed to 20 different AWS commercial regions simultaneously.
                                         
    
                                         Wow.
                                         
                                         And that is done via... because it turns out that that's a very long-to-execute for loop if you start down that path.
                                         
                                         I wound up building out a GitHub Actions matrix, sorry, a Jithub Actions matrix job that winds up instantiating 20 parallel builds of the CDK deploy that goes out to each region as expected.
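The fan-out described here, one deploy per region running in parallel instead of a long sequential for loop, can be sketched in Python. This is not a GitHub Actions matrix and not the actual build; the `deploy` function is a hypothetical stand-in for whatever a single region's deploy step does, and the region list is shortened for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# A few AWS region names, chosen only for illustration.
REGIONS = ["us-east-1", "us-west-2", "eu-west-1", "ap-southeast-2"]


def deploy(region: str) -> str:
    """Hypothetical stand-in for one region's deploy step."""
    return f"deployed to {region}"


def deploy_all(regions: List[str]) -> Dict[str, str]:
    """Run one deploy per region concurrently and collect the results by region."""
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        # pool.map preserves input order, so results line up with regions.
        return dict(zip(regions, pool.map(deploy, regions)))


results = deploy_all(REGIONS)
```

With a sequential loop the total time is the sum of every region's deploy; with the fan-out it is roughly the time of the slowest one.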
                                         
                                         And because that gets really expensive with native GitHub Actions runners,
                                         
                                         that's like 36 cents per deploy,
                                         
                                         and I don't know how to test my own code,
                                         
                                         so every time I have a typo, that's another quarter in the jar.
                                         
                                         Cool, but that was annoying for me,
                                         
    
                                         so I built my own custom runner system
                                         
                                         that uses Lambda functions as runners running containers
                                         
                                         pulled from ECR
                                         
                                         that, oh, it runs in parallel
                                         
                                         less than three minutes every time I
                                         
                                         commit something between I press the
                                         
                                         push button and it is out
                                         
                                         and running in the wild across all regions,
                                         
    
                                         which is awesome and also terrifying because
                                         
                                         as previously mentioned, I don't know how to test my
                                         
                                         code. Yeah, so you don't know
                                         
                                         what you're deploying to 20 regions sometimes.
                                         
                                         Right, but it also means I have a pristine, recomposable build environment
                                         
                                         because I can just automatically have that go out.
                                         
                                         The fact that I'm either merging a pull request or doing a direct push
                                         
                                         because I consider main to be my feature branch,
                                         
    
                                         as whenever something hits that, all the automation kicks off.
                                         
                                         That was something that I found to be transformative
                                         
                                         as far as a way of
                                         
                                         thinking about this, because I was very tired of having to tweak my local laptop environment to,
                                         
                                         oh, you didn't assume the proper role and everything failed again and you broke it. Good
                                         
                                         job. It wound up being something where I could start developing on more and more disparate
                                         
                                         platforms. And it finally is what got me away from my old development model of everything I build is
                                         
                                         on an EC2 instance.
                                         
    
                                         And that means that my editor of choice was Vim.
                                         
                                         I use VS Code now for these things, and I'm pretty happy with it.
                                         
                                         Yeah.
                                         
                                         So, you know, I'm glad you brought up CDK.
                                         
                                         CDK gives you a lot of the capabilities to implement GitOps in a way that you could say, like, hey, use CDK to declare I need four Amazon EKS clusters with this size, shape, and configuration, go.
                                         
                                         Or even further, connect these EKS clusters to RDS instances and load balancers and everything else. But you put that state into Git, and then you have something that deploys it automatically upon changes.
                                         
                                         That is infrastructure as code.
                                         
                                         Now when you say, okay, main is your feature branch,
                                         
    
                                         things happen on main,
                                         
                                         if this were running in Kubernetes across a fleet of clusters globe-wide in 20 regions,
                                         
                                         something like Flux or Argo would kick in and say,
                                         
                                         there's been a change to source main
                                         
                                         and we need to roll this out
                                         
                                         and it'll start applying those changes
                                         
                                         Now, what do you get with GitOps that you don't get with your configuration? I mean, can you roll
                                         
                                         back if you ever have, like, a bad commit that's just awful? I mean, that's really part of the process
                                         
                                         with GitOps, is to make sure that you can, A, roll back to the previous good state, B, roll forward to a known good state, or C, promote that state up through various
                                         
                                         environments. And then having that all done declaratively, automatically, and immutably,
                                         
                                         in version control, with an audit log, that I think is the real power of GitOps in the sense that like, oh,
                                         
                                         so-and-so approved this change to security policy XYZ on this date
                                         
                                         at this time. And that to an auditor, you just hand them a log file and like, here's everything
                                         
                                         we've ever done to our system. Done. Right? Like you could get to that state if you want to,
                                         
                                         which I think is kind of the idea of DevOps, which says take all these disparate tools and processes and procedures and culture changes,
                                         
                                         culture being the hardest part to adopt in DevOps.
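The roll back, roll forward, and audit log idea from a moment ago can be sketched as an append-only history in Python. This is an illustration under assumptions, not a description of Git internals or of any GitOps tool; `StateHistory`, `Revision`, and the approver names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Revision:
    number: int
    state: Dict[str, str]
    approved_by: str
    note: str


@dataclass
class StateHistory:
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, state: Dict[str, str], approved_by: str, note: str) -> Revision:
        """Append a new revision; earlier revisions are never edited or removed."""
        rev = Revision(len(self.revisions) + 1, dict(state), approved_by, note)
        self.revisions.append(rev)
        return rev

    def current(self) -> Revision:
        return self.revisions[-1]

    def rollback(self, to_number: int, approved_by: str) -> Revision:
        """Roll back by committing an old state again, so the rollback itself is logged."""
        target = self.revisions[to_number - 1]
        return self.commit(target.state, approved_by, f"rollback to revision {to_number}")

    def audit_log(self) -> List[str]:
        return [f"r{r.number}: {r.note} (approved by {r.approved_by})" for r in self.revisions]


history = StateHistory()
history.commit({"security-policy": "strict"}, "alice", "initial policy")
history.commit({"security-policy": "open"}, "bob", "loosen policy")
history.rollback(1, "alice")
```

Because a rollback is recorded as a new revision rather than a deletion, the log handed to an auditor still shows the bad change, who approved it, and who reverted it.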
                                         
    
                                         GitOps kind of forces a culture change where you can't do a CAB with GitOps.
                                         
                                         Those two things don't fly.
                                         
                                         You don't have a configuration management database unless you absolutely...
                                         
                                         Oh, you have a CAB now, but they're all in the comments of the pull request.
                                         
                                         Right, exactly. Like, don't push this change out until Thursday after this other thing has happened,
                                         
                                         kind of thing. Yeah, like, that all happens in GitHub. But it's very democratizing in the sense that
                                         
                                         people don't have to waste time in an hour-long meeting to get their five minutes in, right?
                                         
                                         DoorDash had a problem. As their cloud-native environments scaled and
                                         
    
                                         developers delivered new features, their monitoring system kept breaking down. In an organization where
                                         
                                         data is used to make better decisions about technology and about the business, losing
                                         
                                         observability means the entire company loses their competitive edge. With Chronosphere, DoorDash is
                                         
                                         no longer losing visibility into their application suite. The key? Chronosphere is an open-source compatible, scalable, and reliable
                                         
                                         observability solution that gives the observability lead at DoorDash business confidence and peace of
                                         
                                         mind. Read the full success story at snark.cloud slash chronosphere. That's snark.cloud slash c-h-r-o-n-o-s-p-h-e-r-e.
                                         
                                         So would it be overwhelmingly cynical to suggest that GitOps is the means to implement what we've
                                         
    
                                         all been pretending to have implemented for the last decade when giving talks at conferences?
                                         
                                         I wouldn't go that far. I would say that GitOps is an excellent way to
                                         
                                         implement the things you've been talking about at all these conferences for all these years.
                                         
                                         But keep in mind, the technology has changed a lot in the, what, 11, 12 years of the existence
                                         
                                         of DevOps now. I mean, we've gone from, let's try to manage whole servers immutably to, oh, now we just need to maintain an orchestration platform and run containers.
                                         
                                         That whole compute interface, you go from SSH to a Dockerfile.
                                         
                                         That's a big leap, right?
                                         
                                         Like, you don't have bespoke sysadmins.
                                         
                                         You have, like, a platform team.
                                         
                                         You don't have DevOps engineers.
                                         
                                         They're part of that platform team or DevOps teams, right?
                                         
                                         Like, which was kind of antithetical to the whole idea of DevOps, to have a DevOps team.
                                         
                                         You know, everybody's kind of in the same boat. What's changing in GitOps and Kubernetes land is, like, a platform team that
                                         
                                         manages the cluster and its state and health in production, essentially.
                                         
                                         Then you have your developers deploying what they want to deploy in
                                         
                                         whatever namespace they've been given access to, with whatever rights they have.
                                         
    
                                         So now you have the potential for one set of people,
                                         
                                         the platform team, to use one set of GitOps tooling. And your
                                         
                                         applications teams might not like that, and that's fine. They can have their own namespaces with their
                                         
                                         own tooling in it. Like Argo, for example, is preferred by a lot of developers because it has
                                         
                                         a nice UI with green and red dots, and they can show people, and it looks nice. Flux, it's command line based. And there are some projects
                                         
                                         out there that kind of take the UI of Argo and try to run Flux underneath that. And those are
                                         
                                         cool kind of projects, I think, in my mind. But in general, right, I think GitOps gives you the
                                         
                                         choice that we missed somewhat in DevOps implementations of the past because it was, oh, we need to go get cloud.
                                         
    
                                         Well, you can only use this cloud. Oh, we need to go get this thing. Well, you can only use this
                                         
                                         thing in-house. And there's a lot of restrictions sometimes placed on what you can use in your
                                         
                                         environment. Well, if your environment is Kubernetes, how do you restrict what you can run,
                                         
                                         right? You can't have an easily configured,
                                         
                                         say, no open source policy
                                         
                                         if you're running Kubernetes.
                                         
                                         So it becomes, you know-
                                         
                                         Well, that doesn't stop some companies from trying.
                                         
    
                                         Yeah, that's true.
                                         
                                         But the idea of like enabling your developers
                                         
                                         to deploy at will
                                         
                                         and then promote their changes as they see fit
                                         
                                         is really the dream of DevOps, right?
                                         
                                         Like same with production and platform teams, right? I want to push my changes out to a larger
                                         
                                         system that is across the globe. How do I do that? How do I manage that? How do I make sure
                                         
                                         everything's consistent? GitOps gives you those ways with Kubernetes-native things like Kustomizations to make consistent environments that are robust
                                         
    
                                         and actually going to be reconciled automatically
                                         
                                         if someone breaks the glass and says,
                                         
                                         oh, I need to run this container immediately.
                                         
                                         Well, that's going to create problems
                                         
                                         because it's deviated from state
                                         
                                         and it's just that one region,
                                         
                                         so we'll put it back into state.
                                         
                                         But if you're dueling banjos at some point, you'll try doing something manually, and it gets reverted automatically.
                                         
    
                                         I love that pattern. You'll get bored before the computer does, always.
                                         
                                         Yeah, and GitOps is very new, right? When you think about the lifetime of GitOps,
                                         
                                         I think it was coined in like 2018, so it's only four years old.
                                         
                                         I prefer it to ChatOps, at least, as far as implementation and expression.
                                         
                                         ChatOps was a way to do DevOps.
                                         
                                         I think GitOps gives you a...
                                         
                                         Well, ChatOps is also a way to wind up giving whoever gets access to your Slack workspace
                                         
                                         root in production, but that's neither here nor there.
                                         
    
                                         Yeah, we all like to pretend that that's not a giant security issue in our industry,
                                         
                                         but that's a topic for another time.
                                         
                                         Yeah, and that's why GitOps also depends upon you having good security
                                         
                                         and good authorization
                                         
                                         and approval processes it enforces yeah who doesn't have one of those yeah if it's a sole
                                         
                                         operation kind of deal like in your setup your case i think you kind of got it doing right right
                                         
                                         like as far as get ups goes to be clear we are 11 people and we do have dueling pull requests and
                                         
                                         all the rest but most of the stuff I talk
                                         
    
                                         about publicly is not our production stuff, so
                                         
                                         it really is just me. Just as a point of
                                         
                                         clarity there, the 11 people here
                                         
                                         do not all, the rest of them do not just sit there and clap
                                         
                                         as I do all the work most days. Right. No,
                                         
                                         I'm sure they don't. I'm almost certain they don't
                                         
                                         clap for you. No.
                                         
                                         No, they try to talk me out of it. Yeah, exactly.
                                         
    
                                         So the setup that you,
                                         
                                         Corey Quinn, have implemented to deploy to these 20 regions is kind of very GitOps-y in the sense that when main changes, it gets updated.
                                         
                                         Where it's not GitOps-y is what if the endpoint changes?
                                         
                                         Does it get reconciled? That's the piece you're probably missing: that continuous reconciliation component, where it's constantly checking and saying, this thing out there is deployed in the way I want it, you know, the way
                                         
                                         I declared it to be in my source of truth. Yeah, when you start having other people getting
                                         
                                         involved, there can... Yeah, that's where regressions enter. And it's like, well, I know where things are,
                                         
                                         so why would I change the endpoint? Yeah, it turns out not everyone has the state of the entire
                                         
                                         application in their head. That really should live in, you know, Git or S3.
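                                         The continuous reconciliation component described here can be sketched roughly as follows. This is a minimal illustration in Python, not how Flux or Argo CD actually implement it. The `State` type and the `diff`, `reconcile`, and `run` names are hypothetical, and plain dictionaries stand in for the source of truth and for what is actually deployed.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical stand-in: resource name -> declared value (image tag, endpoint, ...)
State = Dict[str, str]
Drift = Dict[str, Tuple[Optional[str], Optional[str]]]


def diff(desired: State, actual: State) -> Drift:
    """Map each drifted resource to (live value, declared value)."""
    drift: Drift = {}
    for name in desired.keys() | actual.keys():
        want, have = desired.get(name), actual.get(name)
        if want != have:
            drift[name] = (have, want)
    return drift


def reconcile(desired: State, actual: State) -> Drift:
    """Bring `actual` back in line with `desired` and report what changed."""
    drift = diff(desired, actual)
    for name, (_have, want) in drift.items():
        if want is None:
            del actual[name]     # not declared in the source of truth: remove it
        else:
            actual[name] = want  # missing or hand-edited: restore declared value
    return drift


def run(load_desired: Callable[[], State], actual: State,
        cycles: int, interval_s: float = 0.0) -> None:
    """Re-check on every cycle, not only when the main branch changes."""
    for _ in range(cycles):
        reconcile(load_desired(), actual)
        time.sleep(interval_s)
```

                                         A deploy that only fires when main changes corresponds to calling `reconcile` once per commit. The `run` loop is the part that notices a hand-edited endpoint and puts it back, even when nothing has been pushed.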
                                         
    
                                         Yeah, exactly.
                                         
                                         When I think about interactions of the past coming in as a new DevOps engineer to work with developers,
                                         
                                         it's always been, well, developers have access to prod or they don't.
                                         
                                         And if you're in that environment where you're trying to run a multi-billion dollar operation
                                         
                                         and your devs have a direct or one dev has direct access to prod
                                         
                                         because prod is in his brain.
                                         
                                         That's where it's like,
                                         
                                         well, now wait a minute.
                                         
    
                                         Prod doesn't have to be only in your brain.
                                         
                                         You can put that in the code base
                                         
                                         and now we know
                                         
                                         what is in your brain, right?
                                         
                                         Like you can almost do,
                                         
                                         if you document your code well,
                                         
                                         you can have your full life cycle
                                         
                                         right there in
                                         
    
                                         one place, including documentation, which I think is the best part too. So, you know, it encourages
                                         
                                         approval processes and automation over this one person has an entire state of the system in their
                                         
                                         head. They have to go in and fix it. And what if they're not on call or in Jamaica or on a cruise ship
                                         
                                         somewhere kind of thing? Things get difficult. Like, for example, I just got back from vacation.
                                         
                                         We were so far off the grid, we had satellite internet. And let me tell you, it was hard to
                                         
                                         write an email newsletter where I usually open 50 to 100 tabs. There's a little bit of internet
                                         
                                         out of California way. Yeah. It's always weird going from like, especially after the pandemic, I have gigabit symmetric here, and going even to re:Invent, where I'm trying to upload
                                         
                                         a bunch of video and whatnot. And the conference wifi was doing its thing and well, Verizon 5G was
                                         
    
                                         there, but spotty and yeah, usual stuff. Yeah. It's amazing to me how connectivity has become
                                         
                                         so ubiquitous. To the point where when it's not there anymore,
                                         
                                         it's what do I do with myself? Same story about people pushing back against remote development
                                         
                                         of, oh, I'm just going to do it all on my laptop because what happens if I'm on a plane? It's,
                                         
                                         yeah, the year before the pandemic, I flew 140,000 miles domestically, and I was almost
                                         
                                         never hamstrung by my ability to do work. And my only local computer is an iPad for those
                                         
                                         things. So it turns out that is less of a real-world concern for most folk. Yeah, I actually ordered the
                                         
                                         components to upgrade an old NUC that I have here and turn it into my, like, this is my remote
                                         
    
                                         code server that's going to be all attached to GitHub and everything else. That's where I want
                                         
                                         to be: have Tailscale and just VPN into this box.
                                         
                                         Tailscale is transformative.
                                         
                                         Yes. Tailscale will change your life. That's just my personal opinion. That's not AWS's
                                         
                                         opinion or anything. But yeah, when you start thinking about your network as it could be
                                         
                                         anywhere, that's where Tailscale really shines.
                                         
                                         Tailscale makes the internet work like we all wanted
                                         
                                         to believe that it worked.
                                         
    
                                         WireGuard is an excellent
                                         
                                         open source project and Tailscale
                                         
                                         consumes that and puts an
                                         
                                         amazingly easy to use UI
                                         
                                         and troubleshooting tools and
                                         
                                         routing and all kinds
                                         
                                         of forwarding capabilities
                                         
                                         and makes it kind of easy, which is
                                         
    
                                         really, really, really kind of awesome.
                                         
                                         And Tailscale and Kubernetes have a good story.
                                         
                                         Yeah, networking and easy don't belong in the same sentence,
                                         
                                         but in this case, they do.
                                         
                                         Yeah, and trust me,
                                         
                                         the Kubernetes story and Tailscale,
                                         
                                         there is a lot of there there.
                                         
                                         I understand you might want to
                                         
    
                                         not open ports in your VPC maybe,
                                         
                                         but if you use Tailscale,
                                         
                                         that node is just another thing on your network. You can connect to
                                         
                                         that and see what's going on. Your management cluster is just another thing on the network
                                         
                                         where you can watch the state. But it's all, you're connected to it continuously through Tailscale.
                                         
                                         You know, it's a much lighter weight kind of meshy VPN, I would say, if I had to sum it up in one sentence. That was not on our agenda
                                         
                                         to talk about at all. Anyways. No, no. I love how different topics we talk about on these things.
                                         
                                         We'll have to have you back soon to talk again. I really want to thank you for being so generous
                                         
    
                                         with your time. If people want to learn more about what you're up to and how you view these things,
                                         
                                         where can they find you? Go to chrisshort.net. So Chris, short. I'm 6'4", so
                                         
                                         remember it's short.net. And you will find all the places that I write. You can go to devopsish.com
                                         
                                         to subscribe to my newsletter, which goes out every week this year. Next year, there'll be breaks.
                                         
                                         And then finally, if you want to follow me on Twitter, I'm @ChrisShort on Twitter, all one word.
                                         
                                         So you'll see two S's.
                                         
                                         Like, it's okay.
                                         
                                         There's two S's there.
                                         
    
                                         Links to all of that will, of course, be in the show notes.
                                         
                                         It's easy for people to do the clicky, clicky thing as a general rule.
                                         
                                         Clicky things are easier than the wordy things, yes.
                                         
                                         Says the Kubernetes guy.
                                         
                                         Yeah, says the Kubernetes guy.
                                         
                                         Yeah, you like that, huh?
                                         
                                         Like I said, Argo gives you a UI.
                                         
                                         Thank you so much for your time. I really do appreciate it.
                                         
    
                                         Thank you. This has been fun. If folks have questions, feel free to reach out. I am not
                                         
                                         one of those people that hides behind a screen all day and doesn't respond. I will respond to
                                         
                                         you eventually. I'm right here, Chris. Come on. You're calling me out in front of myself. It might
                                         
                                         take a day or two, but I will respond. I promise.
                                         
                                         Thanks again for your
                                         
                                         time. This has been Chris Short,
                                         
                                         Senior Developer Advocate at AWS.
                                         
                                         I'm Cloud Economist
                                         
    
                                         Corey Quinn, and this is Screaming in the Cloud.
                                         
                                         If you've enjoyed this podcast,
                                         
                                         please leave a five-star review on
                                         
                                         your podcast platform of choice, and if it's YouTube,
                                         
                                         click the thumbs-up button.
                                         
                                         Whereas if you've hated this podcast, same thing.
                                         
                                         Smash the buttons, five-star review,
                                         
                                         and leave an insulting comment that is written in syntactically correct YAML
                                         
    
                                         because it's just so easy to do.
                                         
                                         If your AWS bill keeps rising and your blood pressure is doing the same,
                                         
                                         then you need the Duckbill Group.
                                         
                                         We help companies fix their AWS bill by making it smaller and less horrifying.
                                         
                                         The Duckbill Group works for you, not AWS.
                                         
                                         We tailor recommendations to your business and we get to the point.
                                         
                                         Visit duckbillgroup.com to get started.
                                         
                                         This has been a HumblePod production.
                                         
    
                                         Stay humble.
                                         
