Programming Throwdown - 135: Kubernetes with Aran Khanna

Episode Date: June 6, 2022

00:00:15 Introduction
00:01:03 Aran Khanna and his background
00:05:12 The Marauder's Map that Facebook hated (Chrome Extension)
00:20:11 Why Google made Kubernetes
00:31:14 Horizontal and Vertical Auto-Scaling
00:35:54 Zencastr
00:39:53 How machines talk to each other
00:46:32 Sidecars
00:48:25 Resources to learn Kubernetes
00:52:59 Archera
00:59:31 Opportunities at Archera
01:01:08 Archera for End Users
01:02:30 Archera as a Company
01:05:46 Farewells

Resources mentioned in this episode:

Aran Khanna, Cofounder of Archera:
LinkedIn: https://www.linkedin.com/in/aran-khanna/
Website: http://arankhanna.com/menu.html
Twitter: https://twitter.com/arankhanna

Archera:
Website: https://archera.ai/
LinkedIn: https://www.linkedin.com/company/archera-ai/
Twitter: https://twitter.com/archeraai

Kubernetes:
Website: https://kubernetes.io/
Documentary: https://www.youtube.com/watch?v=BE77h7dmoQU

If you've enjoyed this episode, you can listen to more on Programming Throwdown's website: https://www.programmingthrowdown.com/
Reach out to us via email: programmingthrowdown@gmail.com
You can also follow Programming Throwdown on Facebook | Apple Podcasts | Spotify | Player.FM
Join the discussion on our Discord
Help support Programming Throwdown through our Patreon
★ Support this podcast on Patreon ★

Transcript
Starting point is 00:00:00 Hey everybody, so you might have heard the term Kubernetes or clustering or these kind of terms, containers and pods and kind of wondered, what is that? You know, that is a whole other universe, especially if you're in university, you might have a Kubernetes cluster at your university, but you're just a person who's using it. I bet your email server is probably running on a Kubernetes cluster you might not even know. And so we're going to really dive into what all of these things are and unpack them, which I think is going to be really, really exciting and super valuable. And to do that, we have Aran Khanna on the show. Thanks for coming on the show, Aran. Hey, thank you so much for having
Starting point is 00:01:00 me, Jason and Patrick. It's great to be here. Cool. Great. Yeah. So Aran is the co-founder of Archera, which is a startup that focuses on cloud optimization. And he is an expert on Kubernetes. And he's going to kind of really unpack all of it for us. But before we get into that, you know, Aran, like, how did you get started in tech? And what's been your journey like? Yeah, well, I can give you kind of the short version. I was born and raised here in Seattle around the time of, you know, the 90s boom in Microsoft, and then the subsequent boom in Amazon and AWS coming online, just as I was sort of coming up. But ironically, I had nothing to do with tech, didn't want to work in it. I was actually very into biology and synthetic
Starting point is 00:01:40 biology, biotech, as it's now called. And I was doing an internship my senior year before I was going off to college in that space in a biotech startup working on algae, trying to engineer them to create biofuels. And my cycle times on my experiments were like three weeks and the algae would all die, the fungus would come in, like, it was just such a nightmare. You have to unpack this for us. So I've heard about this, but I think I've heard about this from like a Deus Ex video game or something. This is like a real thing. So yeah, so this is a real thing where you actually are working on algae that creates fuel. Yeah, explain that for us. It's fascinating. It was really interesting. I mean, the world of
Starting point is 00:02:20 biotechnology in the last two, three decades has really exploded thanks to computers and then a lot of the new techniques. I'm sure you guys have heard of CRISPR and things like that. But the net is now that we have genome sequencing at such a cheap level, thanks to all the innovation in that space, we can actually not only read the genome, but understand what pieces are doing and then targeted almost like programming, go splice out pieces of them, put in your own kind of genes that you want that organism to express. Obviously, it's much easier with single-celled organisms like bacteria, and then let them multiply. And you can get these edits essentially that are functional, that are driven by you as the scientist. But to me, I didn't realize it at the time,
Starting point is 00:03:02 that was just an instance, a very complicated instance of programming. It was like the equivalent of, you know, writing everything in assembly back in the day. But the cycle times, you know, even when you're writing an assembly, you could click a button and the thing would execute. And even if the clock speed was slow, you'd get a result by the end of the day. With this sort of quote unquote programming these cells, it would take weeks, even with these single celled organisms for the things to culture and grow and for results to come back. If the batch was bad, if you made a single-point error or something like that, the equivalent of missing a semicolon, you'd have to wait two weeks to go
Starting point is 00:03:33 figure that out and then come back and do the experiment again. It was a nightmare. But during my lunch breaks, I would go upstairs and talk to the guys working in Java on the genome sequencing and the computational side, and they were just tearing through. They know, they would get through 20 experiments a day, because all they had to do was hit the enter button, and watch the program run and execute and analyze the data. And so as I went to college, I'm like, look, you know, if I want to do something meaningful, build interesting things, what's a better way to do it, you know, a way where each iteration takes two to three weeks, obviously, it's much better now, you know, a decade maybe later, or something where I can click a button, get a result immediately, and then distribute it to the whole world over the internet. And that's actually how
Starting point is 00:04:13 originally a lot of my passion and the interesting projects I worked on came out. I taught myself JavaScript and took the intro CS class in college, had really nothing to do with touching the servers for a number of years, but wrote a lot of really interesting Chrome extensions around digital privacy, one of which actually got me fired from Facebook, which is a fun story. But I ended up doing a lot of interesting stuff on the data visualization and the JavaScript side before, you know, realizing there was so much interesting stuff going on on the server side. And that's when I got into working in the cloud vendors, understanding really the nuts and bolts of how the back end of these systems, these large web systems worked. And then really ultimately got into machine learning and Kubernetes and all the interesting things that I'm sure we're going to talk about later in the episode.
Starting point is 00:04:54 Very cool. So, okay, so you're doing the Algae thing as like a high school project? This is pre-university. Yeah, pre-university, high school internship, essentially. Cool. And then went to university, studied CS, and then you started off doing front-end work. And so yeah, let's dive into that. So you built a Chrome extension that got you fired. So what exactly is going on there?
Starting point is 00:05:17 Yeah, that's a fun story. So originally, like I was saying earlier, the big issue with working in that synthetic biology field was the fact that, you know, one, it took so long to get iterations out. And then two, it was really difficult to share my work with the world, which is why I went to the other side of the spectrum and really started to dive into, you know, front end technologies. I loved getting something that users could touch and feel in their hands. And, you know, the last mile there is actually getting an application together. And luckily at the time, Chrome, Firefox, all of these browsers made it really easy without essentially having any experience on the server side to go and build a functional,
Starting point is 00:05:52 interesting web app. So I was playing around, you know, hacking on projects on the side during my time in college. And this was around 2015. I started to see and I was actually taking a privacy class at school at the time, taught by the former head of the FTC's technology division. So we were talking a little bit about privacy in class. And Facebook Messenger at the time was taking off around Harvard, the school I was at. And it was basically the primary way, 2014, 2015, for me and all my friends to communicate with each other.
Starting point is 00:06:22 One of the interesting things that I recognized was that every time a message was sent from a mobile device, by default, a location would be attached to that message, whether it was sent in a one-on-one chat, a group chat, what have you. And so I started to think, well, you know, if I just go through this backlog of messages sitting here in my browser and just plotted them on a map, would that be enough data to actually, you know, essentially dock someone or de-anonymize their location history based on this weird default in Facebook Messenger? So this is, wait, I want to make sure I get this clear. So this is someone who's messaging you or are you somehow able to get other people's messages?
Starting point is 00:07:01 Well, if they message you or any group that you are in, by default, that message would have location metadata attached to it, unless they'd gone in and turned off the default, essentially. So, you know, having the skills that I newly acquired to go build Chrome extensions, go build JavaScript apps, instead of doing this with a piece of paper and a pencil and writing a blog, I decided, well, what better way to not only test this out, but to let users see for themselves if this is something they should go and turn off than to build a Chrome extension that just sits in your browser, sucks in all that data when you go to Facebook, and then plots it on a map. And I cheekily called it the Marauders map because I was using the internal
Starting point is 00:07:37 builds to track my friends around Harvard, the same way Harry Potter tracked his friends with the Marauders map around Hogwarts. know, Hogwarts. And, you know, released it, I was actually completely independently accepted for a Facebook internship on the News Feed ranking team that summer. And honestly, I thought this would be a benefit to the users, it would educate users, it would give them insight into, you know, what these defaults were doing with their data and give them the ability to understand if they wanted to turn it off. So I thought this was a net benefit to users, released it, it went viral, got, you know, over 120,000 downloads on the Chrome extension store, the Chrome web store. And then Facebook reached out and said, you know, please deactivate
Starting point is 00:08:17 it. And I did left up the code is open source, obviously, because, you know, started as an open source research project to start with. And then the day before my internship was supposed to start, I'd complied with everything, taken it down, etc. You know, the VP of engineering and the head of HR called me a lowly intern and said, Hey, you know, we're going to rescind your internship because you didn't act in the best interests of Facebook. Shouldn't be surprising now was pretty surprising in 2015. We all know now what the cultural issues at Facebook were. But you know, the experience there was really interesting, because I was actually motivated, I could see that the impact was made, I could see the product decision was changed. And even though the, you know,
Starting point is 00:08:53 I had to go and scramble and find a job afterwards, the experience really stuck with me. And I actually did a number of additional projects, such as building a Chrome extension to suck in the Venmo history, that's all public and, you know, build a map of transactions projects, such as building a Chrome extension to suck in the Venmo history that's all public and, you know, build a map of transactions and, you know, do some other interesting de-anonymization projects while I was working on the privacy research side before I moved over into machine learning research and then the cloud. Oh, I see. So, so yeah, what was your, how did you get your toe in the water on the machine learning side? Was that a, like a subsequent internship or what?
Starting point is 00:09:25 Yeah, actually, this experience, you know, with Facebook, after being fired, I had to scramble and find a job. I ended up landing at a small startup with a Carnegie Mellon professor who was working on an open source deep learning framework called MX net, which eventually, you know, as we went through that, you know, that lifecycle, it became the open source Apache deep learning framework competing with TensorFlow and PyTorch. Before that, actually, right before working at the Facebook internship, I worked at Microsoft on the Azure team. I had some experience there. Then, lo and behold, after about a year and a half of developing MXNet with this team, I was going to go back and join them. AWS acquires that team and I end up going to AWS with them to build out the SageMaker suite of tools, as well as continue some of that open source work and the deep learning research I was doing with that professor.
Starting point is 00:10:17 Oh, very cool. So I see. So you started. So that was a pretty big transition, right? So you go from working on front end, you know, JavaScript and, you know, kind of Chrome extensions all the way into MXNet, which is kind of the guts of like TensorFlow operations. I mean, you're talking like, like really low level, like C and all sorts of like SIMD and kind of OpenCL. Wow. That was a pretty big transition. How did you, this is really interesting. You know, how did you build, you know, so you know how to program, you have the basics,
Starting point is 00:10:52 you've never done anything CUDA or OpenCL or whatever. What was day one like, you know, how did you build that muscle? Right. Day one was pretty brutal, but I will say that, you know, the year before I had transitioned had transitioned from the general CS track to more of the systems track. So in school, I had taken classes that were taught in C. Even our intro class was actually taught in C, but taken much deeper classes that were oriented around C programming. I took the operating systems class at school, which was really helpful for parallel computation and then really complex C programs that you had to put together like virtual memory. So coming in, I had battle scars. I knew that this was a different programming paradigm, like managing memory is nonsense when you're only operating in a JavaScript world, right? So having some of that coming in was obviously very useful. And then I actually
Starting point is 00:11:40 didn't purely study CS in school. I studied math as well. So having some of that pure math background is really helpful for the machine learning side because it wasn't just the low level programming, especially in a team of like it was three, four of us really working on it at the time in the startup. It was having the ability to go from that low level all the way to the high level. Like what is the mathematics that we need to implement here? Say, and how can we make it more efficient? What's kind of the net outcome? So having a lot of that framework was very helpful. And I think going in blind without
Starting point is 00:12:09 that low level understanding of memory management and, you know, programming in C, which obviously is very akin to what you're doing in CUDA, as well as the depth of understanding of, you know, real analysis and dynamical systems and, you know, measure theory and probability and statistics, at least basic statistics, very difficult to get your hands around what is this actual deep network that you're trying to implement. So luckily, I had both of those pieces, and then obviously took months and months of, you know, working in the GitHub, getting my PRs denied again and again and again, and reading and taking courses to really get to a point where I could be dangerous, so to speak. So it was a process, I'll tell you that. But you know,
Starting point is 00:12:50 at the time, I basically had lost my job. And I was trying to figure out what to do next. So it was a good use of time that summer, I will say. Yeah, totally. That makes sense. Yeah, I think MXNet is really interesting, because it's a difficult market to crack. I mean, TensorFlow has been around for a long time. PyTorch is really taking off. But the thing I liked about MXNet was that it wasn't, well, for a while, at least, it was really like a consortium of different kind of folks. It wasn't like a single sort of like owner or like single controller like there was for TensorFlow. And I think PyTorch actually, in my opinion, does a better job of kind
Starting point is 00:13:31 of getting feedback from the community. But I personally feel like TensorFlow, it's very clear that there is a single controller there and it's very difficult to sort of, you know, move the thinking in TensorFlow. Yeah, it's really interesting because that was my first real foray into open source projects, which I think obviously is very apt when talking about Kubernetes, which I think has some of the dynamics that I saw in that market,
Starting point is 00:13:57 obviously at a much later stage in its evolution. But I would fully agree. I think, you know, on the spectrum, when we were starting, there was TensorFlow, which was really a closed ecosystem in many ways, even though it was in or, you know, in theory, open source, you could see the PRs that get accepted and rejected. And it was pretty clear what the zeit FAIR and how independent it's been, I think it's been much more amenable to outside contribution, to community contribution. In fact, when we were at AWS incorporating MXNet there, we created Onyx with the PyTorch folks as that open model inter-exchange. And that was something that explicitly TensorFlow was not into. So, you know, I've been through some of those quote unquote, open source battles before in that world. And,
Starting point is 00:14:50 you know, I think it's interesting to see now that PyTorch has so much greater adoption, I think, because of the fact that it's been nimble and driven by the community versus TensorFlow, which, you know, really, for the longest time, just stuck with its declarative only approach, even though everyone's like, No, no, I want imperative. I want something like what MXNet offers or really what PyTorch ended up offering at the end to most deep learning scientists. Yeah, totally. And so now the third evolution of your kind of skill set is then going from, you know, this machine learning and vectorization and BLAS and all these things to services, you know, to clusters, Kubernetes.
Starting point is 00:15:29 And so what was that transition like? And what sort of motivated that? And then how did you build that set of skills? Yeah. So when we came in our MXNet team to AWS, essentially, we were in research mode. We're a bunch of researchers building an open source framework. And, you know, Amazon is a product shop. So're a bunch of researchers building an open source framework. And Amazon is a product shop. So it was eight of us and they're like, well, you guys got to go build a team and
Starting point is 00:15:49 ship some products. This was before SageMaker, which is what our team ended up shipping, as well as the suite of managed machine learning services and tools around that, like the model marketplace and DeepLens, which I actually pitched and led the team for. But generally, the whole ecosystem was pretty immature on the AWS side for machine learning. But generally, the whole ecosystem was pretty immature on the AWS side for machine learning, and we were the ones who were supposed to correct that. So with that kick in the butt, we went from being essentially an open source shop
Starting point is 00:16:14 with a heavy bend on research to trying to productionize some of this. And I think that's where I really started to cut my teeth on what does it take to essentially get research projects, which a lot of these were at the time in 2016, particularly around the deep learning side, vision, the language, speech, synthesis, et cetera, and put that into a productized state that's at the bar of an AWS service. That was a lot of learning and talking with customers, going through iterations with some of our biggest customers, we thought were going to be great fits for managed machine learning on AWS because they had all their data sitting there.
Starting point is 00:16:50 Taught me so much about not just how to build services, but also how customers use them in unintended ways, the ways that you have to be really thoughtful about things like pricing, which is really what the company I'm working on now is doing, trying to help folks optimize AWS, Azure, Google service pricing, and then really be thoughtful about the performance semantics because people will use things in crazy ways at crazy scales that you cannot really comprehend until you actually see it done. So really going through from becoming, being a researcher and doing this, you know, vectorized machine learning stuff, low level implementations of deep learning kernels through to actually building teams and launching services.
Starting point is 00:17:29 I think that was the AWS training that I take with me now and really look to apply in building products widely, particularly focused on the space of cloud infrastructure. Cool. That makes sense. Yeah. And so I think that, you know, the experience that you had is one that can resonate with a lot of people where you get something kind of working, you know, the experience that you had is one that can resonate with a lot of people where you get something kind of working, you know, on your desktop. And now you have 1000 people who need it, you know, and if it's a library, I think you're fine, because you could post it on GitHub, you make a release and people download it, and it scales that way. But if it's a service, that's where it becomes really unclear. And I think you can kind of get in trouble in both directions. In one direction, you spin up a
Starting point is 00:18:14 huge database and you spin up your own virtual private cloud and you do all of these things. There was an article about Fast, this company Fast. There's some kind of like FinTech. Actually, they recently closed, but they had built all this infrastructure. And really, they could have run the entire company on like one EC2 instance, you know, because they just they really didn't need all this. They didn't have the kind of customers. Now, part of it, I mean, there's more to that story. I think there was a little bit of fibbing around how many customers they really did have and everything. But you see this where you can kind of over-engineer things,
Starting point is 00:18:47 and now you're paying this really hefty bill every month, and most of that is wasted. On the flip side, the biggest fear folks have is our page goes on Hacker News. Some random person posts our page on Hacker News, and a whole bunch of people check it out. And the experience they get is like just the spinning beach ball or something because our one server has completely died and blown up. So then there's this third orthogonal axis where maybe you don't even use that many resources, but just the way you've designed things has caused you to get an extraordinary AWS bill. Like what I'm thinking of is cases where you have AWS Lambda, like trigger loops or, you know, triggers itself. Recursive Lambda. We see that a lot, actually. It's a
Starting point is 00:19:38 really interesting pattern. Yeah. And you end up with like a hundred K Lambda bill or something. So, you know, we'll get to that third axis. I think there's a lot of content there, but looking at the first two, you know, Kubernetes was really designed to be able to handle those two gracefully. And so I feel like that was kind of the big motivation behind something like Kubernetes, but I'd love to know, you know, you probably know a lot about the sort of history. And so, I know that at Google, they had Borg and some of these things internally. And so, what was the motivation for Google to make Kubernetes? And do you have any sort of inside baseball? Oh, my God, do I ever? And it's something that everyone can go and watch right
Starting point is 00:20:23 now on YouTube, because there was this amazing documentary two parter on the history of Kubernetes that was put out, I think, late last year, early this year, we can add it to the show notes, I can send it to you afterwards. But yeah, everything I'm about to say just comes from that it is so well done. It is like professional, it's incredible. They talked to everyone who was part of the team. But, you know, it has a very interesting history. Because think at the time it wasn't clear that, well, first of all, they wanted to get into the cloud market. Amazon was already very well established. Kubernetes was one of these weird kind of internal projects that took a lot of fighting, as you see in the documentary, to actually get approved. And then to get the open source piece of it approved was another whole battle that they had to wage.
Starting point is 00:21:03 And I think the documentary does a really good job of going through that. But it wasn't a kind of a straight linear path. It took a lot of dings along the way. And, you know, took some instances, as you can see in the documentary, where the engineering team is like, look, we're just gonna, you know, with or without leadership, we're gonna have to go and build this thing, you know, and let the code speak for itself. So there's a lot of kind of interesting sub stories in there, but generally came out of Google, there was a strong bend from the team to make it open source from the start. And building a lot of the community around it, I think was, you know, kind of an uphill battle in the early days for them, especially because of the fact that they didn't do this as like an Apache open source,
Starting point is 00:21:41 it was Google, it was owned by them, But there was an open source bend to it. Eventually, the CNCF and some of these more neutral organizations took it over. But it was a very interesting history where a number of Google engineers internally in a response to AWS and kind of the dominance that they were having in the compute cloud ideated on this project, had a lot of kind of back and forth battles to get it out the door as an open source version that anyone could use a board. And maybe it's worth stepping back and explaining what that system did. It essentially took a lot of these commodity machines running in the data center and using this new innovation of containers, which is essentially within a VM, within a virtual machine, on an
Starting point is 00:22:18 actual machine, another layer of abstraction using cgroups, namespaces, and Linux to really isolate sets of processes from each other within the operating system. And using that new construct that had started to come out, I think this was in the early 2010s and started to become popular. This is a way to actually take that construct that people are using to run generic applications on any kind of VM, particularly Linux, you could make a Windows container and things like that were running on those VMs, and actually orchestrate them at scale. So instead of just getting the ability to go build a container and run it on a VM, which was, you know, great in isolation for giving you that homogenous environment to go execute an application, you were also able to coordinate these applications across many VMs
Starting point is 00:23:06 into much more robust, distributed applications that Kubernetes as a control plane made it really easy to manage as it started to mature. Yeah, that makes sense. So cool. Yeah, we'll definitely have to watch a documentary. But at a high level, what you know, what Kubernetes is doing is if folks remember, we had an episode earlier on Docker. But just to give a really quick recap, we have virtual machines, which are pretty heavy, right? I mean, they have the whole operating system installed. I mean, the operating system alone might be like five gig, right? And you have to, it's very hard to sort of pass these around.
Starting point is 00:23:41 And you don't, like with a virtual machine, you have sort of the state of the hard drive, but you don't really know how you got there. So the nice thing about containers is that you have this Docker file, the script, which tells you, okay, the first line is, you know, start with a, you know, default installation of Ubuntu or something. And then the second line is, you know, go ahead and apt get these, you know, 10 zillion packages. And that third line is, you know, start these services. And the fourth line is maybe expose this MySQL port so that I can access it from the outside. And so anybody anywhere can run that Docker file and get a MySQL server kind of up and running. And the nice thing is because a Docker file has a reference to either another Docker file
Starting point is 00:24:30 or some base image, the Docker file itself and the container doesn't need to hold the entire OS in it, just needs to hold sort of the delta. So that's sort of Docker in a nutshell. Now we're faced with this issue of, okay, I have this way to create a container, but if I want to create, you know, a hundred of them and I want all hundred of them to be
Starting point is 00:24:50 in some sort of load balance or where now I have, you know, a hundred servers up and running and they're all able to handle requests and I want to be able to change that number from a hundred to 200, you know, there's a ton of complexity around that. And Kubernetes was designed to sort of handle all of that, you know, communication between all of these containers. So can you dive into a little bit of the sort of glossary? Cause I'm a little fuzzy on this stuff that there's, I think there's pods, right? There's a Kubernetes cluster. Well, there's containers as well. So, you know, you have maybe starting at the base level and moving up might be helpful. So the container, I think you did an awesome job of explaining.
Starting point is 00:25:26 That's exactly right. And people love containers because they're modular, easy to essentially recreate and great, you know, common homogenous places to run applications. So that's at the lowest level you have containers. One level above that, you have kind of sets of containers that are logically connected. And generally those are put into pods within Kubernetes. Those pods can then be grouped together within a namespace within Kubernetes. So this is all very abstract, we're not even touching the nodes and the instances and all that stuff yet. So you have containers, you have these logical groupings of containers in pods, you have the
Starting point is 00:26:03 namespace that is an organizational unit that can contain a set of pods for, say, an application. You have the SQL pods and the website serving pods that sit there together in a namespace for application A, for example. Or if you have a machine learning group that's using a bunch of GPU pods that can sit in application namespace B or something like that. From there, you actually can think about the whole cluster, which is really the set of nodes being managed by the Kubernetes agent on those nodes, plus the specific set of nodes
Starting point is 00:26:39 that are the master, which has essentially the requirement to coordinate all of the nodes and the pods. And it uses this very interesting distributed database, collect CD to do that and keep track of state in a way where if any piece of that cluster fails, the abstraction of the namespaces, the pods, the containers is still maintained, you know, even if the nodes are potentially unstable or unreachable or something like that. So I think that's at a high level kind of the anatomy of a cluster. And then you can get into kind of the specific pieces like load balancing, if you want to stick load balancers in front of specific pods,
Starting point is 00:27:17 like web serving pods, in case you have a spike in traffic or something like that, you want to make sure it all doesn't go to one machine to one pod. So those things are constructs in Kubernetes as well. Persistent volume claims, which is really attaching a disk into containers and pods, right? So they can have access to hard disk and not just the VM resources that technically aren't supposed to store state, they can be stateless, so to speak. And then you have these higher order operational abstractions like load balancers, where you have, you know, horizontal, sorry, not load balancers, rather autoscalers, where you have a horizontal autoscaler, vertical pod autoscaler, and a cluster autoscaler, which all serve different needs in terms of, you know, as you were alluding to earlier, making sure that
Starting point is 00:28:01 the number of nodes, the number of containers, the resources allocated to each container, all keep pace with the amount of load being experienced by applications in the cluster. So I know I went through quite a glossary there. Maybe it's worth stepping back and diving into each of those pieces a little bit. Yeah, there's a lot. Yeah, totally. Let's start with the autoscaler because I think something that's really fascinating. So actually, I take it back. Let's start with the load balancer because we kind of need to start there. So, you know, people, I'm sure some folks out there have used things like NGI and X and these other things that are independent of Kubernetes, where you can say, you know, if I get a request on this port, then 10% of them go to this IP address, you know, 90% go to this IP address. And so that seems
Starting point is 00:28:46 pretty straightforward. What does Kubernetes offer? Is it still like you bring your own like NGINX or is it built into Kubernetes? How does that work? Yeah, so generally nowadays, you know, you'll have services managed Kubernetes services, you don't have to roll the whole thing and manage it yourself. So you'll have things like EKS and on AWS, AKS on Azure, GKE on Google Cloud Engine. And you're able to, just using those, actually have access to the native load balancers within those cloud vendors. So you don't have to go and roll your own NGINX load balancer container and spin it up and manage it. You can just say, look, give me the Azure load balancer, the Google load balancer, the AWS load balancer for this set of resources. So that's generally nowadays, given the maturity and all the managed service in the space, how folks will leverage those things.
Starting point is 00:29:34 It's completely abstracted away as it should be, in my opinion. But fundamentally, it's not robust, obviously, than rolling it yourself, because this is our big managed services from these massive hyperscale cloud providers that manage traffic for Netflix and things like that. So they really scale up. But fundamentally, it's the same thing. As I get more volume, I'll report out on the volume I'm getting. So you can use that for intelligent decisions downstream. And I will segment that volume in a way that is not necessarily overloading any given resource within the cluster. Got it. Okay, that makes sense. So I see. So the individual nodes
Starting point is 00:30:13 have some way of telling the load balancer, you know, I'm in trouble, or, you know, I have a ton of work or I'm available. And then the load balancer then can, you know, send things to the right. So a pod is a group of machines, and the load balancer then can, you know, send things to the right. So a pod is a group of machines and the load balancer is always sending it to the same pod, but just to a different machine in the pod. Is that how it works? So it depends, right? Generally a pod is a group of containers, not machines. The machines are the actual nodes. So you can have multiple pods running on a machine with multiple containers within them. And the load balancer can be configured in different ways, but generally it could send it to, you know, multiple pods or a single pod that's scaling up or scaling down. There's kind of nuance in how you configure it.
Starting point is 00:30:57 It's kind of an open playground. is you'll have something like a vertical pod autoscaler and the load balancer would be sending to maybe two different pods or one pod that's scaling up, scaling down. Got it. Okay, so this is a good time to transition to that. So what is autoscaling and what is horizontal and vertical autoscaling? What does that distinction mean? Yeah, so horizontal autoscaling and vertical autoscaling, I think are pretty simple concepts when you think about it outside of like all this mess of containers and namespaces. Horizontal just means add more computers and vertical just
Starting point is 00:31:36 means make the computer bigger. It's kind of the way I think about it. And those things are the same whether you're managing VMs or containers or whatever. Now, there's specific in Kubernetes horizontal and vertical pod autoscalers that you can put in to your cluster. And then you have a cluster autoscaler underneath all of that, which is actually managing the nodes. Because fundamentally, even if you have a vertical pod autoscaler that says, as this node becomes busy, you know, bump up the number of CPU cores it has. It might actually physically run out of the CPU cores on the machine, the node that it's running on. So it might need to add more nodes or make the node bigger. So you have a cluster autoscaler actually underneath all of these that are very specific to the cloud provider
Starting point is 00:32:14 and can add and remove nodes based on the aggregate demand in the cluster. And then you have the horizontal pod autoscaler, which is just adding more containers, adding more pods as the load goes up. And generally the way to think about it is if you have a very parallelizable workload, like, you know, serving web requests where a lot of this stuff can be done in parallel, it's great to use something like a horizontal pod autoscaler and just have more replicas to load balance that load between. If you have something that is very serial and cannot be
Starting point is 00:32:43 parallelized easily, it might just be worth throwing, you know, a bigger node at it. Something that, say, requires a ton of memory to compute and is, you know, has these big spikes, like a machine learning workload that you need to load all the data in and then, you know, process it in some way all sitting on one node. Worth using the vertical pod autoscaler for that. But again, these things don't operate in isolation. They're happening together with multiple applications on a single cluster. And then the nodes under the hood need to actually react to it. And that's why we have a cluster autoscaler as well. It's adding, removing, and right-sizing nodes. And you have a lot of interesting projects that are coming out just in the last few years, like Carpenter from AWS, that's supposed to help
Starting point is 00:33:21 solve this problem and make sure that you're getting the most optimal nodes from both the cost and performance perspective added and removed from your EKS cluster based on the demand of the HPA, horizontal pod autoscaler, and the VPA, the vertical pod autoscaler. Cool. Yeah. So horizontal autoscaling seems relatively simple. The way I would imagine it is you could look at, I guess, the delay in between the time a request arrives and when it can be processed by one of these nodes and then just continue to add more nodes. That seems to, it seems like I could understand how you could build a, like a Kalman filter or something to like, you know, or a PID controller or something to like use the horizontal auto scaling. Vertical seems really difficult, right? Because you have to, it's like if you run out of memory, it's almost too late at that point to say,
Starting point is 00:34:11 oh, I need more memory. Or, you know, it's like you might be halfway through the process and say, oh, let me start the whole thing over again with twice as many CPUs, but then you didn't really save anything because you already were halfway done. So like, how does the vertical auto scaling actually work? I mean, what sort of signals do you get back to actually do that? Yeah, so I mean, it actually is difficult. In fact, one of the big production use cases we run internally is a lot of Argo workflows, which is another layer of orchestration on top of Kubernetes for specific ETL workflows.
Starting point is 00:34:45 And some customers will just send us a ton of data and we have to scale up sort of dynamically the pods based on that. And it's difficult. We got a lot of crash back off loops and things like that when, you know, the pod runs out of memory or can't initialize. So it's, you know, I will sitting here with the experience of having run this thing. It's not a solved problem. Like, I'll tell you that the easiest thing to do is over provision, essentially for the peak in advance. And VPA is great at scaling things down, I found, but not so great at scaling things up, as you alluded to. And, you know, I have the scar tissue of a lot of failed clusters, and, you know, data that didn't populate on a backfill to prove that it's not a solved problem. And, you know, it's hard to predict the future, especially when you have large spikes in your workload requirements. If you have something
Starting point is 00:35:29 like an easy sort of scale up, that's very linear and predictable, that's different than, you know, spikes of data coming in at arbitrary points, which I don't think there's any system, you know, short of just over provisioning to really handle that effectively. Scaling down is a different story, right? And great to save money, but if that comes at the expense of performance for a critical customer, that's a trade-off you as an engineer have to make. Yeah, that makes sense. I'm going to jump in here and interrupt our interview today to talk about our sponsor, Zencastr. Zencastr is an all-in-one podcast production suite that gives you studio quality audio and video without needing all that technical know-how. It records each guest locally, then uploads the crystal clear audio and video
Starting point is 00:36:16 right into the suite, so you have high quality raw materials to work with. Jason and I have been using Zencastr for programming Throwdown for a while now and it's a huge upgrade over the way we used to do things. It's so much easier and more seamless to have everybody join a Zencastr room and get individual audio streams for each participant which allows editing and mastering to go much more quickly. It just also feels like a better experience for all involved. I'm so happy that we have this new solution instead of the way we used to do things back when we first started. If you would like to try Zencast or to make your own podcast, you can get a free trial by going to zen.ai slash programming throwdown. That's zen.ai slash programming throwdown.
Starting point is 00:37:02 Back to the podcast. I remember a long time ago, but there was this machine learning job that I would run at a different company and we would always crash the first time because of the auto scaler. So basically, sometimes you crash twice. And so I think in the end, we ended up putting some kind of constraint saying,
Starting point is 00:37:23 don't even bother starting this job unless you have 20 gigs of RAM or something like that. But yeah, that comes to the, I think, to the other nice point of Kubernetes, which is when you're launching one of these pods, you can hard code what the memory and CPU requirements are. And the system, in theory, depending on how it's configured, will have to go and get those for you before you get the pod in. So that's generally how you, as a practitioner, as you were saying, get around those issues. You're not saying like, start this with an arbitrary amount and figure it out for me. You can say like, give me this much, otherwise it won't work, which I think is the model that most folks, if they're engineers and working on this, end up using instead of,
Starting point is 00:37:58 you know, trying to stick their finger in the air and tune it. Yeah, totally makes sense. Very cool. So, okay. So, so one thing we should talk about is, you know, how do machines communicate with each other? So, you know, you say I have this auto scaling group. And so at any given time, I have so many instances of this container on this pod. And so do people say, you know, I want instance 14 on this pod or like how to, you know, so if I have a a if I have a auto scale, you know, memcache server, and then I have another auto scale, you know, node, you know, back end. How do those things communicate with each other? I mean, is there like a special DNS or something or what happens here? I think you're leading towards the idea of a service mesh, maybe so we can get into that a little bit or I really don't know. I'm asking total ignorance here. I have no idea. So so yeah, well, so yeah, I guess maybe drilling down and up the stack, we can maybe start at the bottom this time instead of the most abstract, you know, these machines are naturally networked in the data center to some degree, right? They're all sitting right when you spin them up on a virtual private cloud, let's say like a VPC,
Starting point is 00:39:05 which is what the AWS construct is for this, where they have specific ingress and egress rules around the machines. So maybe they can talk out to the public internet, but only a few things can talk into them. So just from a security perspective, you have that first boundary around all the machines that are in the Kubernetes cluster. So then within that cluster, you then have the individual machines that are all networked to each other. When you initialize Kubernetes, it actually creates essentially communication between all of those nodes within the VPC. And there's a lot of complexity on how you can limit that and how you can orient specific applications and their networking semantics when you spin them up on certain pods. But generally, those machines are all networked together.
Starting point is 00:39:49 Can you dive into that a little bit? I mean, so how do the machines discover each other? I mean, how does that work? Yeah. So generally, when you use something like, you know, COPS, if you're hand rolling the Kubernetes deployment yourself, or use EKS, a lot of that is just kind of handled for you in many respects, especially nowadays. I'd say AKS, GKE, all of those specific services, just adding the marginal node, very easy. You don't have to worry about discovery or anything like that. It's provisioned for you. The Kubernetes agent is added to it. It's all kind of done seamlessly in the back nowadays. There was, you know, back in the day when you're hand rolling it, you could still have to run into issues where you might have to go in, debug and reconnect things to the cluster. But now it's
Starting point is 00:40:31 pretty seamless. What's interesting is now on top of it, you have these things like Istio, which is a very popular service mesh. And essentially when you have a distributed microservices architecture, like a ton of different pods, like you were saying, talking to each other within Kubernetes, having a control plane between those services, because this is one level of abstraction above even networking the nodes together in the cluster, right? These are networking the pods and services together. You need a control plane where you can essentially write in metrics and discover sort of the
Starting point is 00:41:04 certificates between these services. So then they can talk to each other within the cluster itself. And usually this is done by adding this kind of proxy, like Envoy, which is a sidecar that's deployed with each application. And then those all talk to each other, well, technically talk to the control plane and then talk to each other. And then it allows the actual applications on top of these machines, like the pods and the applications running within them to then all talk to each other and discover each other.
Starting point is 00:41:30 So Istio is probably by far the most popular, you know, control plane service mesh for this, but there's a number of options out there. And, you know, again, you have to do the interesting thing about Kubernetes is abstracting away all the low level stuff, but it still has to provide essentially the same primitives, those low-level things like EC2 instances do. So you have this new realm of software for things that AWS provided by default, like service discovery and things like that.
Starting point is 00:41:55 Got it. So if I, let's say I have a service, let's try and like use an example here because I'm still trying to wrap my head around it. So let's say I have a node, let's start with like something really simple where i have a node server on ec2 instance and then i manually spin up another ec2 instance and run a memcache server and then my node server like you know hard codes let's say the ip address of the memcache server and asks for cache hits when it doesn't get it it it goes and hits a database. So it just
Starting point is 00:42:25 has to be another EC2 instance, right? So it's all hard-coded IP addresses. And so that makes sense. I can understand how that works, right? Yeah. Elastic IP1 talks to Elastic IP2, they never change. Exactly. Yeah. So if we want to move that to Kubernetes, so now my node server, what would I put in that location field for the memcache server? Is there like a, yeah, like, I mean, do I hard code the load balancer IP or something? Or like, do I ask it's CEO for that? I mean, like, what actually happens there? Yeah, so generally, you would ask, you know, the way this works is you would ask Istio for it because it manages the discovery of the services and within the cluster, that is. And if you have to do any service to service communication within the cluster, again, it could be living on any node.
Starting point is 00:43:16 So the IP, the load balance or all of that might change. And Istio and a service mesh generally is what keeps track of this in a distributed system, right? Especially where the application could actually be bouncing around between multiple machines. So that is sort of the main paradigm, at least in production, that folks will use to enable this service-to-service communication within Kubernetes. And I think it actually works across application clusters as well, but that's getting into another layer of complexity. In general, having this layer
Starting point is 00:43:49 of misdirection through a service mesh is what allows you to go and discover and talk to services no matter where they may be located. Say the pod they were on fails and they had to migrate, this will keep track of all of that for you. Got it. So when my node code starts my Memcached client, before it can do that, it needs to have an Itzio, there's an Itzio like Python, sorry, Itzio node library or something. And it can somehow use that and ask Itzio, hey, I need the Memcached server. And maybe we assign a name to it. I need hey i need the memcache server and maybe there's we assign a name to it you know i need foobar's memcache server nitzio will come back with some kind of either a domain name or an ip address and i guess to your to your earlier point you can kind of register callbacks or something at it's you know and say hey if if foobar you know cash server dies
Starting point is 00:44:41 or it needs to get swapped out or something let let me know. And while your program is running, Istio might ping you and say, hey, I have a new IP address. And then you know you have to blow away your Memcache client and create a new one with the new address. Yeah. And actually, the reason Istio can do that is what it's functionally doing is literally sucking in and logging all of the network traffic in and out of the cluster. So that's also an interesting leverage point. For example, in AWS, if you're just using EC2 instances to do this, you would have to go, if you want to do log tracing or network monitoring, things like that, you'd have to go and add something to each instance that is running the application. In the Istio case on Kubernetes, you install it once, it has access
Starting point is 00:45:25 to all the cluster traffic. And then you can add things like, you know, APM or network monitoring, et cetera, on top of that in a really seamless, easy way. And just to dig in one step, the reason that the nodes can all discover each other is, and even discover Istio, is they have this Envoy proxy running on each of them. And that actually proxies the request to the right place, determined based on what Istio is saying is kind of the right place to route that within the cluster. So you have these two sort of components working together, one with this global view and one locally on each node
Starting point is 00:45:56 to functionally make that dance happen and make those routes work. Oh, interesting. So, oh, that is interesting. So this Envoy proxy might actually handle the changing of the IP addresses for you. So you might not even have to regenerate your client, because the individual packets that are asking to get sent to this address, that resolution can change. Yeah. Sorry, I said it was deployed on each node. I meant it's deployed alongside each service that starts in the cluster and services running on the VM. But yes, that proxy is really the thing that's doing that work of kind of running. Oh, I see. Okay. Another thing you mentioned was a sidecar. What's a sidecar? Yeah. I mean, the simplest way to think about it is it's essentially something that's deployed alongside a container and an application. I'm getting into the details here.
Starting point is 00:46:46 It essentially allows you to do things, bundle some service like the proxy alongside the sets of containers that you're deploying for an application. It's called a sidecar because it's not part of the main kind of payload and application workload, but these are these sort of add-ons that should be deployed alongside them to make them work. In the case of Envoy, the proxy, it makes them actually network correctly within the cluster. Got it. And so Sidecar is also some kind of Docker container. And so every time you run your container on Kubernetes, it starts by Docker running all of the Sidecars and then it Docker runs your container all on the same VM.
Starting point is 00:47:28 Yeah. Within the same pod, it's essentially just another container that again, kind of not defined by you generally, it's the service like a Envoy proxy, something like that, that runs alongside the application container, which is defined by you in the Kubernetes pod. And just weird terminology, but it's just another kind of prepackaged container that runs in the pod to help your application out. Got it. So if I wanted to like, for all of my pods, for all of my containers, I wanted to get everything out of var log and put it on a database or something. So I would create a sidecar that does that
Starting point is 00:48:05 and then attach that sidecar to all these pods. Yeah, I'm sure that's one way you could do it. You know, again, probably a better way, maybe do it in the application itself, but you know, to each their own. That's your castle in the sky to build, you know, just give you the bricks and all to do it.
Starting point is 00:48:21 Got it. Cool, cool. That makes sense. Yeah, this is super cool. And so what are some good resources for people who want to learn Kubernetes? Yeah, so I think that probably the best resources I found are kind of the tutorials on the official Kubernetes website. They have a whole list of those. And particularly, I think, you know,
Starting point is 00:48:43 the cloud vendors themselves, like AWS and, and Google with GKE, they have very great specific resources about spinning up and creating running toy applications with their specific managed services. I'd actually recommend starting there, particularly for folks who are more interested in kind of production use cases, because of the fact that, you fact that most production use cases of Kubernetes right now are being done through these cloud managed services. People aren't really rolling their own clusters and running cops anymore. As I alluded to throughout this episode, they have way, way easier ways to provision and manage these things now with managed services. So I would say, you could go to the official kubernetes.io site, the Cloud Native Computing Foundation, learn a lot there about the semantics of Kubernetes, how the paradigms
Starting point is 00:49:30 of deploying and managing application life cycles work. But for actually going and building a project and doing something, you know, putting out something like a website or service in the world, I'd go to the Google Kubernetes Engine or AWS Elastic Kubernetes Service blogs and documentation and just start hacking from that, because that's just a great way to get started and to really be on the right platform to scale up, if the application is something that you want to scale up. Yeah, that makes sense.
Starting point is 00:49:57 Actually, I want to double click on that because that's a bit counterintuitive. A lot of people might think, let me start on my own computer and get it working on my own computer and then put it on the cloud. But to your point, they've handled a lot of this sort of infrastructure that is, you know, independent of any particular Kubernetes application. And to go to a place that's handled all of that infrastructure is actually going to be much, much easier than you starting a Kubernetes cluster in your house. And so, yeah, so that's a bit counterintuitive, but definitely, you know, go on one of these public cloud providers. And let's say, you know, people are concerned about cost, especially when they're learning. What's, like, roughly, what would it cost for somebody,
Starting point is 00:50:41 you know, assuming they don't do anything really expensive, to learn Kubernetes using Amazon or AWS? Yeah, so, and this is literally where our company works. But generally, you know, you can run stuff locally; there's like minikube and a number of projects to actually run Kubernetes commands and a mini cluster on your MacBook or whatever. But generally, the point of this, as we've been talking about, is to actually manage multiple nodes and to get the thing working in a distributed environment, which really isn't possible running on your local machine, particularly with all the networking and all that stuff that you get in the data center.
Starting point is 00:51:18 The good news is that AWS is actually, frankly, quite cheap. If you want to do a small cluster, you don't even have to manage your own master node. That's the beauty of these, specifically the managed service versions like EKS and GKE. You don't have to manage that central node and the database that keeps track of everything. They host that for you. So you're really only left paying for some small overhead, a few bucks a month for the control plane, that master node, and then whatever VM resources you consume. And you can stick to, like, the T2 micro kind of very small VM instances and spend maybe, call it, 20, 25 bucks a month max on this with a few nodes. So you can actually get that distributed application going. So not super expensive. And if you're a student, AWS, Google, they all offer free compute credits. So you could
Starting point is 00:52:04 probably get $500 in credits and run the thing for a year, a year plus, with no issues if you're only running a few small nodes in the cloud. And that is the way that you'll develop an application that is running 2,000 nodes as well. So you get great experience, hands-on experience that is highly portable to large-scale scenarios, for that 20, 30 bucks a month. And again, free compute credits, if you're a student, from these cloud vendors make it a really low barrier to entry to start hacking on this stuff.
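Editor's note: as a rough back-of-the-envelope, here is a tiny Python sketch of that monthly cost. The numbers only mirror the ballpark figures from the conversation and are assumptions, not price quotes; check current EKS or GKE pricing for your region before relying on them.

```python
# Back-of-the-envelope for the small managed cluster described above.
# All rates below are illustrative assumptions, not current pricing.
HOURS_PER_MONTH = 730

control_plane_monthly = 5.00   # "a few bucks a month" of control-plane overhead (assumed)
node_hourly = 0.0104           # roughly a t3.micro-class on-demand rate (assumed)
node_count = 3                 # a handful of small worker nodes

nodes_monthly = node_hourly * node_count * HOURS_PER_MONTH
total_monthly = control_plane_monthly + nodes_monthly

print(f"worker nodes: ~${nodes_monthly:.2f}/month")   # ~ $22.78
print(f"total:        ~${total_monthly:.2f}/month")   # lands in the 20-30 dollar range
```

Swap in your provider's actual control-plane fee and instance rates; on some providers the control plane for a single small cluster may be free or credited, which only lowers the total.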
Starting point is 00:52:41 Very cool. Awesome. Yeah, I think that makes a ton of sense. I think there's a lot of really great resources here. I know Patrick's updated the show notes, so folks can definitely check that out as well. Cool. So yeah, I think we did a really good job of covering the high levels here and getting folks what they needed to get started. So thanks so much for that, Aran. Oh, of course. And if you want to, you know... sorry, go ahead. Oh, I was going to say,
Starting point is 00:52:59 so now I want to dive into Archera. So imagine, you know, now you've built this Kubernetes cluster, you're scaling up, you hit the front page of Hacker News. Amazing. You have tons of customers. You're building a thriving business on Kubernetes. And then you get hit with this $100,000 bill or something for the month. And you're like, oh my God, what happened here? How do we deal with this? I can tell you on a much smaller scale. You know, I basically built a Google Photos for my family.
Starting point is 00:53:34 Like, I built my own Google Photos when I was in between jobs. And it has like a little Android app and a site and everything. And I actually got hit with some surprise bills. I mean, nothing massive like that or anything. But I remember... I'm trying to remember exactly what happened. I'm totally drawing a blank here. But basically, I ended up getting hit with like a $200 bill. It was something... Oh, okay, I got it. So I was using Datadog, which we've talked about in the past, big fans of Datadog. But I didn't really know what I was doing. And so Datadog was logging like literally everything, you know. It's like, if the OS, you know, allocated some extra memory to a swap buffer or something, Datadog had it, right?
Starting point is 00:54:10 And so I ended up getting hit with this, like, two or three hundred dollar bill, and it actually took a long time for me to figure out where the money was going, because the Datadog thing was like a two or three click install. And so in the grand scheme of building this whole site, you know, I didn't really think much of it. And so I kept looking at my own code saying, oh, maybe I'm storing the photo like a thousand times or something. And it's like, okay, no, it's not S3. I mean, maybe it's like too many Lambdas. No, that's not it. And then finally I figured out, oh, it's this Datadog sidecar, I think, or agent or something, which is just, you know, costing me a ton of money. And so I can't imagine what it's like running a
Starting point is 00:54:51 business and getting hit with that times a thousand. And so it's really cool that you spun up this effort to try and help folks with that. And so why don't you dive into, like, what inspired you to start Archera and what the company does? Yeah, so really, the idea for Archera came from my time at both Azure and AWS launching kind of the SageMaker services, particularly at AWS. I don't know if many listeners have played around with GPUs on the cloud, but the costs are enormous, probably 5 to 10x the commodity compute, like, you know, the T2, T3 memory and compute machines that generally you're using for web applications and the kind of common web or cloud use cases.
Starting point is 00:55:34 So incredibly expensive. And, you know, what I saw in the ecosystem in 2017, when we were starting to think about what Archera would be, was the fact that there was great visibility: the tooling to show you where that spend went, say in Datadog or something like that, if it happened last week or last month, was pretty great. But the problem was that at the end of the day, a lot of the recommendations on how to act on, you know, where that spend is going were very nebulous. They'd give you five different recommendations that were often in conflict with each other. And then actually automating this thing, basically no one was doing it, especially the subset of recommendations that weren't
Starting point is 00:56:19 application impacting and may not even need an engineer to approve them. So I kind of saw this gap where visibility was great. There were a lot of visibility tools out there, like CloudHealth, Cloudability, CloudCheckr, Cloud XYZ, and they were great at providing visibility, but failed to maximize savings. And from the global view that we got at AWS and Azure of, I think at the time, almost $300 billion of aggregate public cloud spend, it was estimated that about 33% of that was going to waste because of this gap between visibility and action. So what we started to think about is how do we create a platform
Starting point is 00:56:59 where you can take that visibility, which frankly we view as a commodity, with auto-tagging and classifying costs, but then actually build on it to build really detailed forecasts, and use those forecasts to then govern the management of the cloud where it matters and automate the commitment purchasing, which is really this layer that we haven't talked about, that's even underneath the EC2 and VM nodes in this stack from containers and Kubernetes all the way down to EC2. And then uniquely automate the commitment management and then insure the commitment. So if you don't use that capacity, we guarantee to buy it back from you.
Starting point is 00:57:38 So instead of just providing visibility, we try and take the visibility that exists today and extend it with automation, and use the predictability that that automation creates to actually share risk with our customers, to put skin in the game and actually assure them that if their Kubernetes cluster scales down and we've committed to a bunch of capacity for them, we'll actually take those commitments off their books so they're not left paying anything. So that was really the model that we started Archera with. And we're about two and a half years old now, have a number of large public companies, including Fortune 500s,
Starting point is 00:58:11 that we're working with, as well as tons of fast-growing startups. And even in the toy example of running a Kubernetes cluster on the cheap, as cheap as possible, we have really small companies that we've worked with where we'll say, hey, say you want to experiment with this small T3-node cluster for three months. The rack rate on that, if you're just running it on demand, is say $5 an hour. We can do a three-year commitment for all of those nodes and get you a rack rate of say $2 an hour, much, much cheaper. But instead of you holding onto it for a full three years, after those three months when your project is done, you can either opt to keep that commitment month to month, or give it back to us immediately, no questions asked. And we take a small percentage of the kind of delta in savings to help us offset that risk and give us sort of a revenue model, helping customers essentially unlock this new optionality and the much deeper discounts that we're able to provide with this strategy that we've innovated.
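Editor's note: to put rough numbers on that example, here is a small Python sketch. The $5 and $2 hourly rates are just the illustrative figures from the conversation, and the fee percentage is a made-up placeholder, not Archera's actual pricing.

```python
# Sketch of the commitment buy-back example above.
# The hourly rates come from the conversation's hypothetical; the 20% cut of
# the savings delta is a made-up placeholder, not Archera's real fee.
HOURS_PER_MONTH = 730

on_demand_rate = 5.00    # $/hour running the nodes on demand
committed_rate = 2.00    # $/hour with a three-year commitment
project_months = 3       # the project only actually needs three months
fee_on_savings = 0.20    # hypothetical percentage taken from the savings delta

hours = project_months * HOURS_PER_MONTH
on_demand_cost = on_demand_rate * hours
committed_cost = committed_rate * hours   # commitment handed back after month three
savings = on_demand_cost - committed_cost
fee = savings * fee_on_savings

print(f"on demand for {project_months} months: ${on_demand_cost:,.0f}")
print(f"committed rate, then bought back: ${committed_cost:,.0f}")
print(f"gross savings: ${savings:,.0f}; net after fee: ${savings - fee:,.0f}")
```

The point of the buy-back is the last line: the customer keeps most of the delta without carrying the remaining years of the commitment on their books.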
Starting point is 00:59:01 Cool. Yeah, that makes sense. So what is Archera's status as far as that goes? I mean, actually, it's been kind of interesting lately. There's been a lot of news about hiring freezes, right? There's, you know, Meta has this hiring freeze and Uber is claiming that hiring is a privilege. And so where's Archera there? Are you hiring? Are you hiring for interns, full-timers, all of the above? None of the above? We're hiring for engineers, both intern and full time. So definitely reach out on our website, archera.ai. And yeah, we're looking for people who are anywhere from, you know, a little bit of experience but wanting to learn, all the way to, like, five or ten years of experience running services and applications in the cloud.
Starting point is 00:59:57 So really the whole spectrum. Cool. And if somebody, let's say, is going through that intro to CS, like we talked about at the beginning, and they're just ramping up in their career, what is something they can do on the side to give them the type of tools they would need to ace an Archera interview? Yeah. So I would say that the biggest thing is side projects. We test a lot of practical skills. So for example, if you have spun up and used a mini Kubernetes cluster to host an application before, if you have gone and built a, you know, we have front
Starting point is 01:00:30 end roles as well. So, gone and built a website with React or Vue.js, things like that translate to really hard, practical skills that come out in interviews. And we do a lot of very practical interviews, because we want people to hit the ground running on day one, no matter what skill level they are. So having that on your, you know, kind of back burner that you can pull from to speak to those project experiences is always really helpful. And then obviously there's this kind of standard coding interview stuff, but that's pretty generic and, I think, maps across all companies, like making sure you have that data structures
Starting point is 01:01:03 and algos review session before you go and do an interview, something like that. Yeah, that makes sense. And then for an end user, if someone is just getting started with Kubernetes, is Archera... does it have sort of a free tier at the moment? Does it have an option for those folks, or is it really, like, not the right place for them?
Starting point is 01:01:17 Well, we have a free tier. And as I said, you know, even though we work with really large companies, we work with a lot of small startups as well. You know, we work with companies that were two people, and we have a very generous free tier; you know, until you're spending like a million dollars a year, we're basically not billing you. And, you know, you can get started with cost visibility and with this cloud insurance
Starting point is 01:01:41 stuff, which is very unique and which, I think, with zero time from engineering or SRE, can save you a lot of money. Because the interesting thing is, below the VM layer of the stack, you have these commitments that can basically change how that VM is billed, but have no application impact. So just automating that piece, providing insurance, and helping you get more aggressive on that takes, say, four hours of work
Starting point is 01:02:04 max: plugging the thing in, evaluating it, maybe even having a conversation with one of our salespeople, and then clicking the button. It's really quick, and it's all self-service as well, if you want to do that. We support Kubernetes, and our goal is to make it as easy as possible to get high savings with the lowest risk possible, no matter what cloud vendor you're running. Very cool. I'll have to check this out. This is awesome. Yeah, this is great. So actually, before we close up here, what is something that is unique about working at Archera? So what is something that really stands out?
Starting point is 01:02:37 Yeah, so you know, I think we grew mainly during the pandemic. So we're a remote-first company, but we have a really great culture of getting together in person now, especially now that COVID is waning. So we've had some awesome meetups in Austin, Vancouver and Seattle. And we're a really fun company that, I think, has what I like to say is the ideal kind of hybrid experience, where you get the flexibility of being able to work from home. We have offices in kind of three major areas, Austin, Seattle, and Vancouver, but we get together a lot. And it's just an incredibly fun place from a kind
Starting point is 01:03:17 of openness-to-innovation perspective. We try and bring a lot of that Amazonian culture, where anyone can create a document of what project they want to work on and what new kind of SKU or product customers would like. And we have a process for approving that and getting that into the workstream really quickly. So we love a lot of bottom-up innovation. Those are, I think, a few of the interesting things that we have at Archera that make it a really fun place for engineers to work. Cool. So give me an example of an in-person thing that you did that was really fun. Oh, well, just maybe two weeks ago, we did a big mini golf tournament at a bar here in Seattle called Flatstick, which is a mini golf bar. So we have fun
Starting point is 01:03:56 stuff like that going on all the time. You know, we're trying to make it a really fun thing for people to come into the office. Nothing is mandatory, but, you know, people come in because they love seeing each other and hanging out at least once or twice a week. And, you know, I think the fact that we're a small team and super innovative, like, people use the whiteboards all the time, which is a great sign to me. People are always ideating and discussing stuff. And, you know, we love that sight, guys; we never want that to leave the company. Oh, very cool. That is awesome. I might have to hit you up for some advice on Austin things to do. Oh, yeah, we've got quite a list there. Very cool. This is awesome. So if folks, you know, are interested in a career at Archera, is it
Starting point is 01:04:37 archera.ai slash careers, or? Yeah, I think archera.ai, and then you go to careers. Yeah, archera.ai slash careers, that should have the latest link. Great. Well, yeah, this is amazing. You know, folks out there, I mean, we scratched the surface. I mean, Kubernetes is an incredibly powerful tool, but I think we did an amazing job, especially you, Aran, did an amazing job, of kind of covering at a high level the way Kubernetes works, the way you can take an app that you have up and running maybe on your MacBook and be able to put it in the cloud at scale. So if you do get hit with Hacker News number one or something, it doesn't blow up your infrastructure. And when that does happen and you need to keep track of your spend, and you want to, you know, have more confidence about your spend and how to improve it, you can check out Archera. And it sounds like, you know, for most folks out there, you can jump on the free tier, you can try it out
Starting point is 01:05:37 for, you know, a trial period. And then if that looks like it works out for you, then you can continue using it. So it's sounding like an amazing product. Amazing time here talking about Kubernetes. Thank you so much, Aran, for your time. Thank you so much, Jason and Patrick. I really appreciated you having me on. This was a lot of fun. Cool. Excellent. And for folks out there, thanks again for supporting us on Patreon and through Audible. We really appreciate that. Thanks so much for your questions and comments. We've been getting a ton of email, and actually a ton of people messaging Programming Throwdown on Messenger, with stories about how they got into various things,
Starting point is 01:06:14 how they got into programming, how they started learning about Node.js and Next.js from our chat with Guillermo. I'm sure we'll get some messages maybe six months or a year from now about how people got into Kubernetes thanks to this podcast. And it's so special when we get those emails. It's really, you know, honestly, it's why Patrick and I have been doing this for so many years. It's really special. And thank you so much for that, for that support out
Starting point is 01:06:40 there. And we will catch everybody on the next episode. See y'all later. Programming Throwdown is distributed under a Creative Commons Attribution-ShareAlike 2.0 license. You're free to share, copy, distribute, transmit the work, to remix, adapt the work, but you must provide an attribution to Patrick and I, and share alike in kind.
