Programming Throwdown - 135: Kubernetes with Aran Khanna
Episode Date: June 6, 2022

00:00:15 Introduction
00:01:03 Aran Khanna and his background
00:05:12 The Marauder's Map that Facebook hated (Chrome Extension)
00:20:11 Why Google made Kubernetes
00:31:14 Horizontal and Vertical Auto-Scaling
00:35:54 Zencastr
00:39:53 How machines talk to each other
00:46:32 Sidecars
00:48:25 Resources to learn Kubernetes
00:52:59 Archera
00:59:31 Opportunities at Archera
01:01:08 Archera for End Users
01:02:30 Archera as a Company
01:05:46 Farewells

Resources mentioned in this episode:

Aran Khanna, Cofounder of Archera:
LinkedIn: https://www.linkedin.com/in/aran-khanna/
Website: http://arankhanna.com/menu.html
Twitter: https://twitter.com/arankhanna

Archera:
Website: https://archera.ai/
LinkedIn: https://www.linkedin.com/company/archera-ai/
Twitter: https://twitter.com/archeraai

Kubernetes:
Website: https://kubernetes.io/
Documentary: https://www.youtube.com/watch?v=BE77h7dmoQU

If you've enjoyed this episode, you can listen to more on Programming Throwdown's website: https://www.programmingthrowdown.com/

Reach out to us via email: programmingthrowdown@gmail.com

You can also follow Programming Throwdown on Facebook | Apple Podcasts | Spotify | Player.FM

Join the discussion on our Discord

Help support Programming Throwdown through our Patreon

★ Support this podcast on Patreon ★
Transcript
Hey everybody, so you might have heard the term
Kubernetes or clustering or these kind of terms, containers and pods and kind of wondered,
what is that? You know, that is a whole other universe, especially if you're in university,
you might have a Kubernetes cluster at your university, but you're just a person who's
using it. I bet your email server is probably running on a Kubernetes cluster and you
might not even know it. And so we're going to really dive into what all of these things are and unpack
them, which I think is going to be really, really exciting and super valuable. And to do that, we
have Aran Khanna on the show. Thanks for coming on the show, Aran. Hey, thank you so much for having
me, Jason and Patrick. It's great to be here. Cool. Great. Yeah. So Aran is the co-founder of Archera, which is a startup that focuses
on cloud optimization. And he is an expert on Kubernetes. And he's going to kind of really
unpack all of it for us. But before we get into that, you know, Aran, like, how did you get
started in tech? And what's been your journey like?
Yeah, well, I can give you kind of the short version. I was born and raised
here in Seattle around the time of, you know, the 90s boom in Microsoft, and then the subsequent
boom in Amazon and AWS coming online, just as I was sort of coming up. But ironically, I had
nothing to do with tech, didn't want to work in it. I was actually very into biology and synthetic
biology, biotech, as it's now called. And I was doing an internship my senior
year before I was going off to college in that space in a biotech startup working on algae,
trying to engineer them to create biofuels. And my cycle times on my experiments were like three
weeks and the algae would all die, the fungus would come in, like, it was just such a nightmare.
You have to unpack this for us. So I've heard about this, but I think I've
heard about this from like a Deus Ex video game or something. This is like a real thing. So yeah,
so this is a real thing where you actually are working on algae that creates fuel. Yeah,
explain that for us. It's fascinating. It was really interesting. I mean, the world of
biotechnology in the last two, three decades has really exploded thanks to computers and then
a lot of the new techniques. I'm sure you guys have heard of CRISPR and things like that. But
the net is now that we have genome sequencing at such a cheap level, thanks to all the innovation
in that space, we can actually not only read the genome, but understand what pieces are doing
and then, in a targeted way, almost like programming, go splice out pieces of them, put in your own kind of genes
that you want that organism to express. Obviously, it's much easier with single-celled organisms like
bacteria, and then let them multiply. And you can get these edits essentially that are functional,
that are driven by you as the scientist. But to me, I didn't realize it at the time,
that was just an instance, a very complicated instance of programming.
It was like the equivalent of, you know, writing everything in assembly back in the day.
But the cycle times, you know, even when you're writing in assembly, you could click a button
and the thing would execute.
And even if the clock speed was slow, you'd get a result by the end of the day.
With this sort of quote unquote programming these cells, it would take weeks, even with
these single celled organisms for the things to culture and grow and for results to come back. If the batch was bad, if you made a single-point error
or something like that, the equivalent of missing a semicolon, you'd have to wait two weeks to go
figure that out and then come back and do the experiment again. It was a nightmare.
But during my lunch breaks, I would go upstairs and talk to the guys working in Java
on the genome sequencing and the computational side, and they were just tearing through. You know, they would get through 20 experiments a day, because all they had to do
was hit the enter button, and watch the program run and execute and analyze the data. And so as
I went to college, I'm like, look, you know, if I want to do something meaningful, build interesting
things, what's a better way to do it, you know, a way where each iteration takes two to three weeks,
obviously, it's much better now, you know, a decade maybe later, or something where I can click a button, get a result
immediately, and then distribute it to the whole world over the internet. And that's actually how
originally a lot of my passion and the interesting projects I worked on came out. I taught myself
JavaScript and took the intro CS class in college, had really nothing to do with touching the servers
for a number of years, but wrote a lot of really interesting Chrome extensions around digital privacy, one of which
actually got me fired from Facebook, which is a fun story. But I ended up doing a lot of interesting
stuff on the data visualization and the JavaScript side before, you know, realizing there was so much
interesting stuff going on on the server side. And that's when I got into working in the cloud
vendors, understanding really the nuts and bolts of how the back end of these systems, these large web systems worked.
And then really ultimately got into machine learning and Kubernetes and all the interesting things that I'm sure we're going to talk about later in the episode.
Very cool.
So, okay, so you're doing the algae thing as like a high school project?
This is pre-university.
Yeah, pre-university, high school internship, essentially.
Cool.
And then went to university,
studied CS, and then you started off doing front-end work. And so yeah, let's dive into
that. So you built a Chrome extension that got you fired. So what exactly is going on there?
Yeah, that's a fun story. So originally, like I was saying earlier, the big issue with working
in that synthetic biology field was the
fact that, you know, one, it took so long to get iterations out. And then two, it was really
difficult to share my work with the world, which is why I went to the other side of the spectrum
and really started to dive into, you know, front end technologies. I loved getting something that
users could touch and feel in their hands. And, you know, the last mile there is actually getting
an application together. And luckily at the time, Chrome, Firefox, all of these browsers made it really easy without
essentially having any experience on the server side to go and build a functional,
interesting web app. So I was playing around, you know, hacking on projects on the side
during my time in college. And this was around 2015. I started to see and I was actually taking
a privacy class at school at the time, taught
by the former head of the FTC's technology division.
So we were talking a little bit about privacy in class.
And Facebook Messenger at the time was taking off around Harvard, the school I was at.
And it was basically the primary way, 2014, 2015, for me and all my friends to communicate
with each other.
One of the interesting things that I recognized was that every time a message was sent from a mobile device, by default,
a location would be attached to that message, whether it was sent in a one-on-one chat,
a group chat, what have you. And so I started to think, well, you know, if I just go through
this backlog of messages sitting here in my browser and just plotted them on a map, would
that be enough data
to actually, you know, essentially dox someone or de-anonymize their location history based on
this weird default in Facebook Messenger? So this is, wait, I want to make sure I get this clear.
So this is someone who's messaging you or are you somehow able to get other people's messages?
Well, if they message you or any group that you are in, by default, that message would
have location metadata attached to it, unless they'd gone in and turned off the default,
essentially. So, you know, having the skills that I newly acquired to go build Chrome extensions,
go build JavaScript apps, instead of doing this with a piece of paper and a pencil and writing
a blog, I decided, well, what better way to not only test this out, but to let users
see for themselves if this is something they should go and turn off than to build a Chrome
extension that just sits in your browser, sucks in all that data when you go to Facebook, and then
plots it on a map. And I cheekily called it the Marauder's Map because I was using the internal
builds to track my friends around Harvard, the same way Harry Potter tracked his friends with
the Marauder's Map around Hogwarts. And, you know, I released it. I had actually, completely independently, been
accepted for a Facebook internship on the News Feed ranking team that summer.
And honestly, I thought this would be a benefit to the users, it would educate users, it would
give them insight into, you know, what these defaults were doing with their data and give
them the ability to understand if they wanted to turn it off. So I thought this was a net benefit to users,
released it, it went viral, got, you know, over 120,000 downloads on the Chrome extension
store, the Chrome web store. And then Facebook reached out and said, you know, please deactivate
it. And I did. I left the code up; it's open source, obviously, because, you know, it started as an open
source research project to start with. And then the day before my internship was supposed to start, after I'd complied with everything,
taken it down, etc., you know, the VP of engineering and the head of HR called me, a lowly intern, and
said, hey, you know, we're going to rescind your internship because you didn't act in the best
interests of Facebook. Shouldn't be surprising now; it was pretty surprising in 2015. We all know now
what the cultural issues at Facebook were. But you know, the experience
there was really interesting, because I was actually motivated, I could see that the impact
was made, I could see the product decision was changed. And even though, you know,
I had to go and scramble and find a job afterwards, the experience really stuck with me.
And I actually did a number of additional projects, such as building a Chrome extension to
suck in the Venmo history, that's all
public and, you know, build a map of transactions and, you know, do some other interesting
de-anonymization projects while I was working on the privacy research side before I moved over
into machine learning research and then the cloud. Oh, I see. So, so yeah, what was your,
how did you get your toe in the water on the machine learning side? Was that a,
like a subsequent internship or what?
Yeah, actually, this experience, you know, with Facebook, after being fired, I had to scramble
and find a job. I ended up landing at a small startup with a Carnegie Mellon professor who was
working on an open source deep learning framework called MXNet, which eventually, you know, as we
went through that, you know, that lifecycle, it became the open source Apache deep learning framework competing with TensorFlow and PyTorch.
Before that, actually, right before working at the Facebook internship, I worked at Microsoft on the Azure team.
I had some experience there.
Then, lo and behold, after about a year and a half of developing MXNet with this team, I was going to go back and join them.
AWS acquires that team and I end up going to AWS with them to build out the SageMaker suite of tools, as well as continue some of that open source work and the deep learning research I was doing with that professor.
Oh, very cool.
So I see.
So you started.
So that was a pretty big transition, right? So you go from working on front end, you know, JavaScript and, you know, kind of Chrome extensions all the way into MXNet,
which is kind of the guts of like TensorFlow operations. I mean, you're talking like,
like really low level, like C and all sorts of like SIMD and kind of OpenCL.
Wow. That was a pretty big transition. How did you, this is really interesting. You know,
how did you build, you know, so you know how to program, you have the basics,
you've never done anything CUDA or OpenCL or whatever. What was day one like, you know,
how did you build that muscle? Right. Day one was pretty brutal, but I will say that,
you know, the year before, I had transitioned from the general CS track to more of the systems track.
So in school, I had taken classes that were taught in C.
Even our intro class was actually taught in C, but taken much deeper classes that were oriented around C programming.
I took the operating systems class at school, which was really helpful for parallel computation and then really complex C programs that you had to put together like virtual memory. So coming in, I had battle scars. I knew that this was a different programming
paradigm, like managing memory is nonsense when you're only operating in a JavaScript world,
right? So having some of that coming in was obviously very useful. And then I actually
didn't purely study CS in school. I studied math as well. So having some of that pure math background is really helpful for the machine learning
side because it wasn't just the low level programming, especially in a team of like
it was three, four of us really working on it at the time in the startup.
It was having the ability to go from that low level all the way to the high level.
Like what is the mathematics that we need to implement here?
Say, and how can we make it more efficient?
What's kind of the net outcome?
So having a lot of that framework was very helpful. And I think going in blind without
that low level understanding of memory management and, you know, programming in C, which obviously
is very akin to what you're doing in CUDA, as well as the depth of understanding of, you know,
real analysis and dynamical systems and, you know, measure theory and probability and statistics, at least basic statistics, very difficult to get your hands
around what is this actual deep network that you're trying to implement. So luckily, I had
both of those pieces, and then obviously took months and months of, you know, working in the
GitHub, getting my PRs denied again and again and again, and reading and taking courses to really
get to a point where
I could be dangerous, so to speak. So it was a process, I'll tell you that. But you know,
at the time, I basically had lost my job. And I was trying to figure out what to do next. So
it was a good use of time that summer, I will say. Yeah, totally. That makes sense. Yeah,
I think MXNet is really interesting, because it's a difficult market to crack.
I mean, TensorFlow has been around for a long time.
PyTorch is really taking off.
But the thing I liked about MXNet was that it wasn't, well, for a while, at least, it was really like a consortium of different kind of folks.
It wasn't like a single sort of like owner or like single controller
like there was for TensorFlow. And I think PyTorch actually, in my opinion, does a better job of kind
of getting feedback from the community. But I personally feel like TensorFlow, it's very clear
that there is a single controller there and it's very difficult to sort of, you know, move the
thinking in TensorFlow. Yeah, it's really interesting
because that was my first real foray into open source projects,
which I think obviously is very apt
when talking about Kubernetes,
which I think has some of the dynamics
that I saw in that market,
obviously at a much later stage in its evolution.
But I would fully agree.
I think, you know, on the spectrum,
when we were starting, there was TensorFlow, which was really a closed ecosystem in many ways, even though it was, you know, in theory, open source; you could see the PRs that get accepted and rejected.
And it was pretty clear what the zeitgeist was there. Whereas with PyTorch, coming out of FAIR and how independent it's been,
I think it's been much more amenable to outside contribution, to community contribution. In fact,
when we were at AWS incorporating MXNet there, we created ONNX with the PyTorch folks as that
open model interchange format. And that was something that explicitly TensorFlow was not into. So, you know, I've been through some of those quote unquote open source battles before in that world. And,
you know, I think it's interesting to see now that PyTorch has so much greater adoption,
I think, because of the fact that it's been nimble and driven by the community versus
TensorFlow, which, you know, really, for the longest time, just stuck with its declarative
only approach, even though everyone's like, No, no, I want imperative. I want something like what MXNet
offers or really what PyTorch ended up offering at the end to most deep learning scientists.
Yeah, totally. And so now the third evolution of your kind of skill set is then going from,
you know, this machine learning and vectorization and BLAS and all these things to services,
you know, to clusters, Kubernetes.
And so what was that transition like?
And what sort of motivated that?
And then how did you build that set of skills?
Yeah.
So when we came in our MXNet team to AWS, essentially, we were in research mode.
We're a bunch of researchers building an open source framework.
And, you know, Amazon is a product shop. So're a bunch of researchers building an open source framework. And Amazon is
a product shop. So it was eight of us and they're like, well, you guys got to go build a team and
ship some products. This was before SageMaker, which is what our team ended up shipping, as well
as the suite of managed machine learning services and tools around that, like the model marketplace
and DeepLens, which I actually pitched and led the team for. But generally, the whole ecosystem
was pretty immature
on the AWS side for machine learning,
and we were the ones who were supposed to correct that.
So with that kick in the butt,
we went from being essentially an open source shop
with a heavy bent toward research
to trying to productionize some of this.
And I think that's where I really started to cut my teeth
on what does it take to essentially get research projects,
which a lot of these were at the time in 2016, particularly around the deep learning side,
vision, the language, speech, synthesis, et cetera, and put that into a productized state that's at the bar of an AWS service. That was a lot of learning and talking with customers,
going through iterations with some of our biggest customers, who we thought were going to be great fits for managed machine learning on AWS because they
had all their data sitting there.
Taught me so much about not just how to build services, but also how customers use them
in unintended ways, the ways that you have to be really thoughtful about things like
pricing, which is really what the company I'm working on now is doing, trying to help
folks optimize AWS, Azure, Google service pricing, and then really be thoughtful about the performance
semantics because people will use things in crazy ways at crazy scales that you cannot
really comprehend until you actually see it done.
So really going from being a researcher doing this, you know, vectorized
machine learning stuff, low level implementations of deep learning kernels through to actually building teams and launching services.
I think that was the AWS training that I take with me now and really look to apply in building
products widely, particularly focused on the space of cloud infrastructure.
Cool. That makes sense. Yeah. And so I think that, you know, the experience that you had
is one that can resonate with a lot of people
where you get something kind of working, you know, on your desktop. And now you have 1000 people who
need it, you know, and if it's a library, I think you're fine, because you could post it on GitHub,
you make a release and people download it, and it scales that way. But if it's a service, that's where it becomes really unclear. And I
think you can kind of get in trouble in both directions. In one direction, you spin up a
huge database and you spin up your own virtual private cloud and you do all of these things.
There was an article about Fast, this company Fast. They were some kind of, like, fintech.
Actually, they recently closed, but they had built all this infrastructure.
And really, they could have run the entire company on like one EC2 instance, you know,
because they just they really didn't need all this.
They didn't have the kind of customers.
Now, part of it, I mean, there's more to that story.
I think there was a little bit of fibbing around how many customers they really did have and everything. But you see this where you can kind of over-engineer things,
and now you're paying this really hefty bill every month, and most of that is wasted.
On the flip side, the biggest fear folks have is our page goes on Hacker News. Some random person
posts our page on Hacker News, and a whole bunch of people check it out.
And the experience they get is like just the spinning beach ball or something because our one
server has completely died and blown up. So then there's this third orthogonal axis where
maybe you don't even use that many resources, but just the way you've designed things has caused you to get an
extraordinary AWS bill. Like what I'm thinking of is cases where you have AWS Lambda, like trigger
loops or, you know, triggers itself. Recursive Lambda. We see that a lot, actually. It's a
really interesting pattern. Yeah. And you end up with like a hundred K Lambda bill or something.
So, you know, we'll get to that third axis. I think there's a lot of content there, but looking at the first two, you know,
Kubernetes was really designed to be able to handle those two gracefully. And so I feel like that was
kind of the big motivation behind something like Kubernetes, but I'd love to know, you know,
you probably know a lot about the sort of history.
And so, I know that at Google, they had Borg and some of these things internally. And so,
what was the motivation for Google to make Kubernetes? And do you have any sort of inside
baseball? Oh, my God, do I ever? And it's something that everyone can go and watch right
now on YouTube, because there was this amazing documentary two parter on the history of Kubernetes that was put out, I think,
late last year, early this year, we can add it to the show notes, I can send it to you afterwards.
But yeah, everything I'm about to say just comes from that it is so well done. It is like
professional, it's incredible. They talked to everyone who was part of the team. But,
you know, it has a very interesting history. Because think at the time it wasn't clear that, well, first of all, they wanted to get into the cloud market.
Amazon was already very well established.
Kubernetes was one of these weird kind of internal projects that took a lot of fighting, as you see in the documentary, to actually get approved.
And then to get the open source piece of it approved was another whole battle that they had to wage.
And I think the documentary does a really good job of going through that. But it wasn't a kind of a straight linear path. It took
a lot of dings along the way. And, you know, took some instances, as you can see in the documentary,
where the engineering team is like, look, we're just gonna, you know, with or without leadership,
we're gonna have to go and build this thing, you know, and let the code speak for itself.
So there's a lot of kind of interesting sub stories in there, but generally came out of
Google, there was a strong bent from the team to make it open source from the start. And building
a lot of the community around it, I think was, you know, kind of an uphill battle in the early days
for them, especially because of the fact that they didn't do this as like an Apache open source,
it was Google, it was owned by them, but there was an open source bent to it.
Eventually, the CNCF and some of these more neutral organizations took it over.
But it was a very interesting history where a number of Google engineers internally in
a response to AWS and kind of the dominance that they were having in the compute cloud
ideated on this project, had a lot of kind of back and forth battles to get it out the
door as an open source version of Borg that anyone could use. And maybe it's worth stepping back and explaining what that system did. It
essentially took a lot of these commodity machines running in the data center and using this new
innovation of containers, which is essentially within a VM, within a virtual machine, on an
actual machine, another layer of abstraction using cgroups and namespaces in Linux to really isolate sets of processes from
each other within the operating system. And using that new construct that had started to come out,
I think this was in the early 2010s and started to become popular. This is a way to actually take
that construct that people are using to run generic applications on any kind of VM, particularly
Linux (you could make a Windows container and things like that) that were running on those VMs, and actually orchestrate them at scale.
So instead of just getting the ability to go build a container and run it on a VM, which was,
you know, great in isolation for giving you that homogenous environment to go execute an
application, you were also able to coordinate these applications across many VMs
into much more robust, distributed applications that Kubernetes as a control plane made it really
easy to manage as it started to mature. Yeah, that makes sense. So cool. Yeah,
we'll definitely have to watch a documentary. But at a high level, what you know, what Kubernetes
is doing is if folks remember, we had an episode earlier on Docker.
But just to give a really quick recap, we have virtual machines, which are pretty heavy, right?
I mean, they have the whole operating system installed.
I mean, the operating system alone might be like five gig, right?
And you have to, it's very hard to sort of pass these around.
And you don't, like with a virtual machine, you have sort of the state of the hard drive, but you don't really know how you got there. So the nice thing
about containers is that you have this Dockerfile, the script, which tells you, okay, the first line
is, you know, start with a, you know, default installation of Ubuntu or something. And then
the second line is, you know, go ahead
and apt get these, you know, 10 zillion packages. And that third line is, you know, start these
services. And the fourth line is maybe expose this MySQL port so that I can access it from the
outside. And so anybody anywhere can run that Dockerfile and get a MySQL server kind of up and running. And the nice thing is because a Dockerfile
has a reference to either another Dockerfile
or some base image,
the Dockerfile itself and the container
doesn't need to hold the entire OS in it,
just needs to hold sort of the delta.
So that's sort of Docker in a nutshell.
Now we're faced with this issue of,
okay, I have this way to create a container,
but if I want to create, you know, a hundred of them and I want all hundred of them to be
in some sort of load balancer, where now I have, you know, a hundred servers up and running and
they're all able to handle requests and I want to be able to change that number from a hundred to
200, you know, there's a ton of complexity around that. And Kubernetes was designed to sort
of handle all of that, you know, communication between all of these containers. So can you dive
into a little bit of the sort of glossary? Cause I'm a little fuzzy on this stuff that there's,
I think there's pods, right? There's a Kubernetes cluster.
Well, there's containers as well. So, you know, you have maybe starting at the base level and
moving up might be helpful. So the container, I think you did an awesome job of explaining.
That's exactly right.
And people love containers because they're modular, easy to essentially recreate and great, you know, common homogenous places to run applications.
So that's at the lowest level you have containers.
One level above that, you have kind of sets of containers that are logically connected.
And generally those are put into pods
within Kubernetes. Those pods can then be grouped together within a namespace within Kubernetes. So
this is all very abstract, we're not even touching the nodes and the instances and all that stuff yet.
So you have containers, you have these logical groupings of containers in pods, you have the
namespace that is an
organizational unit that can contain a set of pods for, say, an application. You have the SQL pods
and the website serving pods that sit there together in a namespace for application A,
for example. Or if you have a machine learning group that's using a bunch of GPU pods that can
sit in application namespace B or something like
that.
From there, you actually can think about the whole cluster, which is really the set of
nodes being managed by the Kubernetes agent on those nodes, plus the specific set of nodes
that are the master, which has essentially the requirement to coordinate all of the nodes and the pods.
And it uses this very interesting distributed database, called etcd, to do that and keep track
of state in a way where if any piece of that cluster fails, the abstraction of the namespaces,
the pods, the containers is still maintained, you know, even if the nodes are potentially
unstable or unreachable or something like that.
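To make that container, pod, namespace, node layering concrete, here is a minimal sketch using the official Kubernetes Python client (pip install kubernetes). It assumes a kubeconfig is already set up, for example from EKS, GKE, or minikube, and simply prints the pods in each namespace along with the node each one landed on.

```python
# Sketch only: requires `pip install kubernetes` and a configured kubeconfig.
from collections import defaultdict
from kubernetes import client, config

config.load_kube_config()          # reads ~/.kube/config
v1 = client.CoreV1Api()

# Group every pod by its namespace, remembering the node it was scheduled on
# and how many containers it bundles together.
pods_by_namespace = defaultdict(list)
for pod in v1.list_pod_for_all_namespaces().items:
    pods_by_namespace[pod.metadata.namespace].append(
        (pod.metadata.name, pod.spec.node_name, len(pod.spec.containers))
    )

for namespace, pods in pods_by_namespace.items():
    print(namespace)
    for name, node, n_containers in pods:
        print(f"  {name}: {n_containers} container(s) on node {node}")
```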
So I think that's at a
high level kind of the anatomy of a cluster. And then you can get into kind of the specific pieces
like load balancing, if you want to stick load balancers in front of specific pods,
like web serving pods, in case you have a spike in traffic or something like that,
you want to make sure it all doesn't go to one machine to one pod. So those things are constructs in Kubernetes as well. Persistent volume claims,
which is really attaching a disk into containers and pods, right? So they can have access to
hard disk and not just the VM resources that technically aren't supposed to store state,
they can be stateless, so to speak. And then you have these higher order operational abstractions like load
balancers, where you have, you know, horizontal, sorry, not load balancers, rather autoscalers,
where you have a horizontal autoscaler, vertical pod autoscaler, and a cluster autoscaler, which
all serve different needs in terms of, you know, as you were alluding to earlier, making sure that
the number of nodes, the number of containers, the resources allocated
to each container, all keep pace with the amount of load being experienced by applications in the
cluster. So I know I went through quite a glossary there. Maybe it's worth stepping back and diving
into each of those pieces a little bit. Yeah, there's a lot. Yeah, totally. Let's start with
the autoscaler because I think something that's really fascinating. So actually, I take it back. Let's start with the load balancer because we kind of need to start there.
So, you know, people, I'm sure some folks out there have used things like NGINX and these
other things that are independent of Kubernetes, where you can say, you know, if I get a request
on this port, then 10% of them go to this IP address, you know, 90% go to this IP address. And so that seems
pretty straightforward. What does Kubernetes offer? Is it still like you bring your own like
NGINX or is it built into Kubernetes? How does that work? Yeah, so generally nowadays, you know,
you'll have managed Kubernetes services, so you don't have to roll the whole thing and manage
it yourself. So you'll have things like EKS on AWS, AKS on Azure, GKE on Google Cloud.
And you're able to, just using those, actually have access to the native load balancers within those cloud vendors.
So you don't have to go and roll your own NGINX load balancer container and spin it up and manage it.
You can just say, look, give me the Azure load balancer, the Google load balancer, the AWS load balancer for this set of resources.
So that's generally nowadays, given the maturity and all the managed services in the space, how folks will leverage those things.
It's completely abstracted away as it should be, in my opinion.
But fundamentally, it's more robust, obviously, than rolling it yourself, because these are big managed services from these massive hyperscale cloud providers that manage traffic for Netflix and things like that.
So they really scale up.
But fundamentally, it's the same thing.
As I get more volume, I'll report out on the volume I'm getting.
So you can use that for intelligent decisions downstream.
And I will segment that volume in a way that is not necessarily overloading any given
resource within the cluster. Got it. Okay, that makes sense. So I see. So the individual nodes
have some way of telling the load balancer, you know, I'm in trouble, or, you know, I have a ton
of work or I'm available. And then the load balancer then can, you know, send things to the
right place. So a pod is a group
of machines and the load balancer is always sending it to the same pod, but just to a
different machine in the pod. Is that how it works? So it depends, right? Generally a pod is a group
of containers, not machines. The machines are the actual nodes. So you can have multiple pods
running on a machine with multiple containers within them. And the load balancer can be configured in different ways, but generally it could send it to, you know, multiple pods or a single pod that's scaling up or scaling down.
There's kind of nuance in how you configure it.
It's kind of an open playground. But generally, you'll have something like a vertical pod autoscaler and the load balancer would be sending to maybe two different pods
or one pod that's scaling up, scaling down.
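As a concrete sketch of that managed load balancer idea: on EKS, GKE, or AKS, asking for a Service of type LoadBalancer is what provisions the cloud vendor's load balancer in front of a set of pods. This uses the official Python client; the service name, selector label, and ports are hypothetical.

```python
# Sketch only: assumes `pip install kubernetes` and a configured kubeconfig.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",               # on EKS/GKE/AKS this provisions the cloud LB
        selector={"app": "web"},           # pods labeled app=web receive the traffic
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)

v1.create_namespaced_service(namespace="default", body=service)
```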
Got it. Okay, so this is a good time to transition to that.
So what is autoscaling and what is horizontal and vertical autoscaling?
What does that distinction mean?
Yeah, so horizontal autoscaling and vertical
autoscaling, I think are pretty simple concepts when you think about it outside of like all this
mess of containers and namespaces. Horizontal just means add more computers and vertical just
means make the computer bigger. It's kind of the way I think about it. And those things are the
same whether you're managing VMs or containers or whatever. Now, there's specific in Kubernetes horizontal and vertical pod autoscalers that you can put in to your cluster.
And then you have a cluster autoscaler underneath all of that, which is actually managing the nodes.
Because fundamentally, even if you have a vertical pod autoscaler that says, as this pod becomes busy, you know, bump up the number of CPU cores it has,
it might actually physically run out of the CPU cores on the machine, the node that it's running on.
So it might need to add more nodes or make the node bigger.
So you have a cluster autoscaler actually underneath all of these
that are very specific to the cloud provider
and can add and remove nodes
based on the aggregate demand in the cluster.
And then you have the horizontal pod autoscaler,
which is just adding more containers,
adding more pods as the load goes up. And generally the way to think about it is if you have a very parallelizable workload,
like, you know, serving web requests where a lot of this stuff can be done in parallel,
it's great to use something like a horizontal pod autoscaler and just have more replicas
to load balance that load between. If you have something that is very serial and cannot be
parallelized easily, it might just be worth throwing, you know, a bigger node at it.
Something that, say, requires a ton of memory to compute and is, you know, has these big spikes, like a machine learning workload that you need to load all the data in and then, you know, process it in some way all sitting on one node.
Worth using the vertical pod autoscaler for that.
But again, these things don't operate in isolation.
They're happening together with multiple applications on a single cluster. And then
the nodes under the hood need to actually react to it. And that's why we have a cluster autoscaler
as well. It's adding, removing, and right-sizing nodes. And you have a lot of interesting projects
that are coming out just in the last few years, like Karpenter from AWS, that's supposed to help
solve this problem and make sure that you're getting the most optimal nodes from both the cost and performance perspective added and removed
from your EKS cluster based on the demand of the HPA, horizontal pod autoscaler, and the VPA,
the vertical pod autoscaler. Cool. Yeah. So horizontal autoscaling seems relatively simple.
The way I would imagine it is you could look at, I guess,
the delay in between the time a request arrives and when it can be processed by one of these nodes and then just continue to add more nodes. That seems to, it seems like I could understand how
you could build, like, a Kalman filter or, you know, a PID controller
or something to do the horizontal auto scaling. Vertical seems really difficult, right? Because you have to,
it's like if you run out of memory, it's almost too late at that point to say,
oh, I need more memory. Or, you know, it's like you might be halfway through the process and say,
oh, let me start the whole thing over again with twice as many CPUs, but then you didn't
really save anything because you already were halfway done. So like, how does the vertical
auto scaling actually work?
I mean, what sort of signals do you get back to actually do that?
Yeah, so I mean, it actually is difficult.
In fact, one of the big production use cases we run internally is a lot of Argo workflows,
which is another layer of orchestration on top of Kubernetes for specific ETL workflows.
And some customers will just send us a ton of data and we have to scale up sort of dynamically the pods
based on that. And it's difficult. We got a lot of crash backoff loops and things like that when,
you know, the pod runs out of memory or can't initialize. So, you know, I will say, sitting
here with the experience of having run this thing: it's not a solved problem. I'll tell you that the easiest thing to do is over provision, essentially for
the peak in advance. And VPA is great at scaling things down, I found, but not so great at scaling
things up, as you alluded to. And, you know, I have the scar tissue of a lot of failed clusters,
and, you know, data that didn't populate on a backfill to prove that it's not a solved problem.
And, you know, it's hard to predict the future, especially when you have large spikes in your workload requirements. If you have something
like an easy sort of scale up, that's very linear and predictable, that's different than, you know,
spikes of data coming in at arbitrary points, which I don't think there's any system, you know,
short of just over provisioning to really handle that effectively. Scaling down is a different story, right?
And great to save money, but if that comes at the expense of performance for a critical customer,
that's a trade-off you as an engineer have to make.
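For the horizontal case, the usual mechanism is a HorizontalPodAutoscaler object that scales a Deployment between a minimum and maximum replica count based on a metric such as average CPU. Here is a hedged sketch with the official Python client; the Deployment name and the thresholds are made up for illustration.

```python
# Sketch only: assumes `pip install kubernetes`, a configured kubeconfig,
# and an existing Deployment named "web" in the default namespace.
from kubernetes import client, config

config.load_kube_config()
autoscaling = client.AutoscalingV1Api()

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"
        ),
        min_replicas=2,
        max_replicas=10,
        # Add replicas when average CPU across the pods exceeds ~70%.
        target_cpu_utilization_percentage=70,
    ),
)

autoscaling.create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```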
Yeah, that makes sense.
I'm going to jump in here and interrupt our interview today to talk about our sponsor, Zencastr. Zencastr is an all-in-one podcast production suite that gives you studio quality audio and video without needing all that
technical know-how. It records each guest locally, then uploads the crystal clear audio and video
right into the suite, so you have high quality raw materials to work with. Jason and I have been
using Zencastr for Programming Throwdown for a while now and it's a huge upgrade over the way we used to do
things. It's so much easier and more seamless to have everybody join a Zencastr
room and get individual audio streams for each participant which allows
editing and mastering to go much more quickly. It just also feels like a better
experience for all involved. I'm so happy that we have this new solution instead of the way we used to do things back when we first started.
If you would like to try Zencastr to make your own podcast, you can get a free trial by going to zen.ai slash programming throwdown.
That's zen.ai slash programming throwdown.
Back to the podcast.
I remember a long time ago,
there was this machine learning job that I would run at a different company
and we would always crash the first time
because of the auto scaler.
So basically, sometimes you crash twice.
And so I think in the end,
we ended up putting some kind of constraint saying,
don't even bother starting this job
unless you have 20 gigs of RAM or something like that. But yeah, that comes to the,
I think, to the other nice point of Kubernetes, which is when you're launching one of these pods,
you can hard code what the memory and CPU requirements are. And the system, in theory,
depending on how it's configured, will have to go and get those for you before you get the pod in.
So that's generally how you, as a practitioner, as you were saying,
get around those issues. You're not saying like, start this with an arbitrary amount and figure it out for me. You can say like, give me this much, otherwise it won't work, which I think
is the model that most folks, if they're engineers and working on this, end up using instead of,
you know, trying to stick their finger in the air and tune it.
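What hard coding the memory and CPU requirements looks like in practice is the resources block on a container: requests are what the scheduler must reserve before placing the pod, and limits are the ceiling. A minimal sketch with the official Python client follows; the image name and the sizes are hypothetical, echoing the "need 20 gigs of RAM before the job even starts" idea above.

```python
# Sketch only: assumes `pip install kubernetes` and a configured kubeconfig.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="ml-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="example.com/ml-trainer:latest",   # placeholder image
                resources=client.V1ResourceRequirements(
                    # The scheduler won't place the pod until a node can
                    # reserve this much; no guessing after launch.
                    requests={"cpu": "4", "memory": "20Gi"},
                    limits={"cpu": "8", "memory": "24Gi"},
                ),
            )
        ],
    ),
)

v1.create_namespaced_pod(namespace="default", body=pod)
```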
Yeah, totally makes sense. Very cool. So, okay. So, so one thing we should talk about is, you know, how do machines communicate with each other? So,
you know, you say I have this auto scaling group. And so at any given time, I have so many
instances of this container on this pod. And so do people say, you know, I want instance 14 on
this pod? Or, like, how does it work, you know, if I have an autoscaled, you know, Memcached server, and then I have another autoscaled, you know, Node, you know, back end?
How do those things communicate with each other?
I mean, is there like a special DNS or something or what happens here?
I think you're leading towards the idea of a service mesh, maybe, so we can get into that a little bit. Or, I really don't know, I'm asking out of total ignorance here. I have no idea. So yeah, well, I guess, drilling down and up the stack, we can maybe start at the bottom this time instead of the most abstract. You know, these machines are naturally networked in the data center to some degree, right? They're all sitting, right when you spin them up, on a virtual private cloud, let's say, like a VPC,
which is what the AWS construct is for this, where they have specific ingress and egress
rules around the machines. So maybe they can talk out to the public internet, but only
a few things can talk into them. So just from a security perspective, you have that first boundary
around all the machines that are in the Kubernetes cluster. So then within that cluster, you then have the individual machines that are all networked to each other. When you initialize
Kubernetes, it actually creates essentially communication between all of those nodes
within the VPC. And there's a lot of complexity on how you can limit that and how you can
orient specific applications and their networking semantics when you spin them
up on certain pods. But generally, those machines are all networked together.
Can you dive into that a little bit? I mean, so how do the machines discover each other? I mean,
how does that work? Yeah. So generally, when you use something like, you know,
kops, if you're hand rolling the Kubernetes deployment yourself, or use EKS, a lot of that
is just kind of handled for you in many respects, especially nowadays. I'd say AKS, GKE, all of
those specific services, just adding the marginal node, very easy. You don't have to worry about
discovery or anything like that. It's provisioned for you. The Kubernetes agent is added to it.
It's all kind of done seamlessly in the back nowadays. You know, back in the day when you were hand rolling it, you could still run into
issues where you might have to go in, debug and reconnect things to the cluster. But now it's
pretty seamless. What's interesting is now on top of it, you have these things like Istio,
which is a very popular service mesh. And essentially when you have a distributed
microservices architecture, like a ton of
different pods, like you were saying, talking to each other within Kubernetes, having a
control plane between those services, because this is one level of abstraction above even
networking the nodes together in the cluster, right?
These are networking the pods and services together.
You need a control plane where you can essentially write in metrics and discover sort of the
certificates between these services.
So then they can talk to each other within the cluster itself.
And usually this is done by adding this kind of proxy, like Envoy, which is a sidecar that's
deployed with each application.
And then those all talk to each other, well, technically talk to the control plane and
then talk to each other.
And then it allows the actual applications on top of these machines, like the pods and
the applications running within them to then all talk to each other and discover each other.
So Istio is probably by far the most popular, you know, control plane service mesh for this,
but there's a number of options out there.
And, you know, again, the interesting thing about Kubernetes is it abstracts
away all the low level stuff, but it still has to provide essentially the same primitives
that those low-level things like EC2 instances do.
So you have this new realm of software
for things that AWS provided by default,
like service discovery and things like that.
Got it.
So if I, let's say I have a service,
let's try and like use an example here
because I'm still trying to wrap my head around it.
So let's say I have a Node... let's start with, like, something really simple, where I have a Node server on an EC2
instance, and then I manually spin up another EC2 instance and run a Memcached server, and then my
Node server, like, you know, hard codes, let's say, the IP address of the Memcached server and asks for
cache hits. When it doesn't get it, it goes and hits a database. So it just
has to be another EC2 instance, right? So it's all hard-coded IP addresses. And so that makes
sense. I can understand how that works, right? Yeah. Elastic IP1 talks to Elastic IP2,
they never change. Exactly. Yeah. So if we want to move that to Kubernetes,
so now my node server, what would I put in that location field for the memcache
server? Is there like a, yeah, like, I mean, do I hard code the load balancer IP or something? Or
like, do I ask Istio for that? I mean, like, what actually happens there?
Yeah, so generally, the way this works is you would ask Istio for it, because it manages the discovery of the services within the cluster, that is.
And if you have to do any service to service communication within the cluster, again, it could be living on any node.
So the IP, the load balancer, all of that might change.
And Istio and a service mesh generally is what keeps track of this in a distributed system,
right?
Especially where the application could actually be bouncing around between multiple machines.
So that is sort of the main paradigm, at least in production, that folks will use to enable
this service-to-service communication within Kubernetes.
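The answer here is Istio, but the primitive underneath is worth seeing too: by default, every Kubernetes Service gets a stable cluster DNS name of the form service.namespace.svc.cluster.local, so a client can use that name instead of a hard-coded IP. A small sketch with the official Python client that just lists Services and prints those names; whatever services it shows are simply what happens to exist in your cluster.

```python
# Sketch only: assumes `pip install kubernetes`, a configured kubeconfig,
# and the default cluster DNS domain ("cluster.local").
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

for svc in v1.list_service_for_all_namespaces().items:
    name = svc.metadata.name
    namespace = svc.metadata.namespace
    # The stable DNS name a client (e.g. the Node app above) could use
    # instead of a hard-coded IP; cluster DNS keeps it pointed at healthy
    # pods even as they move between nodes.
    dns_name = f"{name}.{namespace}.svc.cluster.local"
    print(f"{dns_name} -> clusterIP {svc.spec.cluster_ip}")
```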
And I think it actually works across application clusters as well,
but that's getting into another layer of complexity. In general, having this layer
of indirection through a service mesh is what allows you to go and discover and talk to
services no matter where they may be located. Say the pod they were on fails and they had to migrate,
this will keep track of all of that for you. Got it. So when my node code starts my Memcached client,
before it can do that, it needs to have an Istio... there's an Istio, like, Python, sorry,
Istio Node library or something. And it can somehow use that and ask Istio, hey, I need
the Memcached server. And maybe we assign a name to it, you know, I need foobar's Memcached server. Istio will come back with some kind of
either a domain name or an IP address. And I guess, to your earlier point, you can kind of
register callbacks or something with Istio, you know, and say, hey, if foobar's, you know, cache server dies
or it needs to get swapped out or something, let me know. And while your program is running, Istio might ping you and say, hey, I have a new IP address. And then you know you
have to blow away your Memcache client and create a new one with the new address.
Yeah. And actually, the reason Istio can do that is what it's functionally doing is literally
sucking in and logging all of the network traffic in and out of the cluster. So that's also an interesting leverage point.
For example, in AWS, if you're just using EC2 instances to do this, you would have to
go, if you want to do log tracing or network monitoring, things like that, you'd have to
go and add something to each instance that is running the application.
In the Istio case on Kubernetes, you install it once, it has access
to all the cluster traffic. And then you can add things like, you know, APM or network monitoring,
et cetera, on top of that in a really seamless, easy way. And just to dig in one step, the reason
that the nodes can all discover each other, and even discover Istio, is they have this Envoy proxy
running on each of them. And that actually proxies the request to the right place,
determined based on what Istio is saying
is kind of the right place to route that within the cluster.
So you have these two sort of components working together,
one with this global view and one locally on each node
to functionally make that dance happen
and make those routes work.
Oh, interesting.
So, oh, that is interesting.
So this Envoy proxy might actually handle the changing of the IP addresses for you. So you might not even have to regenerate your client, because the individual packets that are asking to get sent to this address, that resolution can change. Yeah. Sorry, I said it was deployed on each node. I meant it's deployed alongside each service that starts in the cluster and services running on the VM. But yes, that proxy
is really the thing that's doing that work of kind of running. Oh, I see. Okay. Another thing
you mentioned was a sidecar. What's a sidecar? Yeah. I mean, the simplest way to think about it
is it's essentially something that's deployed alongside a container and an application. I'm getting into the details here.
It essentially allows you to do things, bundle some service like the proxy alongside the
sets of containers that you're deploying for an application.
It's called a sidecar because it's not part of the main kind of payload and application
workload, but these are these sort of add-ons that should be deployed alongside them to make them work. In the case of Envoy, the proxy, it makes them actually network
correctly within the cluster. Got it. And so a sidecar is also some kind of Docker container.
And so every time you run your container on Kubernetes, it starts by Docker running all
of the sidecars and then it Docker runs your
container all on the same VM.
Yeah.
Within the same pod, it's essentially just another container that again, kind of not
defined by you generally; it's a service like an Envoy proxy, something like that, that
runs alongside the application container, which is defined by you in the Kubernetes
pod. And just weird terminology, but it's just another kind of prepackaged container that
runs in the pod to help your application out. Got it. So if I wanted to like, for all of my
pods, for all of my containers, I wanted to get everything out of var log and put it on a database
or something. So I would create a sidecar that does that
and then attach that sidecar to all these pods.
Yeah, I'm sure that's one way you could do it.
You know, again,
probably a better way,
maybe do it in the application itself,
but you know, to each their own.
That's your castle in the sky to build,
you know, just give you the bricks and all to do it.
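In manifest terms, a sidecar is just a second container in the same pod spec. Here is a hedged sketch of the var/log idea discussed above: an application container and a hypothetical log-shipping sidecar sharing an emptyDir volume. Both image names are placeholders.

```python
# Sketch only: assumes `pip install kubernetes` and a configured kubeconfig.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# A scratch volume that both containers in the pod can see.
shared_logs = client.V1Volume(
    name="logs", empty_dir=client.V1EmptyDirVolumeSource()
)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="web-with-log-sidecar"),
    spec=client.V1PodSpec(
        volumes=[shared_logs],
        containers=[
            # Main application container: writes its logs to /var/log/app.
            client.V1Container(
                name="app",
                image="example.com/web-app:latest",
                volume_mounts=[
                    client.V1VolumeMount(name="logs", mount_path="/var/log/app")
                ],
            ),
            # Sidecar: reads the same directory and ships the logs elsewhere.
            client.V1Container(
                name="log-shipper",
                image="example.com/log-shipper:latest",
                volume_mounts=[
                    client.V1VolumeMount(name="logs", mount_path="/var/log/app")
                ],
            ),
        ],
    ),
)

v1.create_namespaced_pod(namespace="default", body=pod)
```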
Got it.
Cool, cool.
That makes sense.
Yeah, this is super cool.
And so
what are some good resources for people who want to learn Kubernetes?
Yeah, so I think that probably the best resources I found are kind of the tutorials on the official
Kubernetes website. They have a whole list of those. And particularly, I think, you know,
the cloud vendors themselves, like AWS and Google with GKE, they have great, specific resources about spinning up and
creating running toy applications with their specific managed services. I'd actually recommend
starting there, particularly for folks who are more interested in kind of production use cases,
because of the fact that most production use cases of Kubernetes
right now are being done through these cloud managed services. People aren't really rolling
their own clusters and running kops anymore. As I alluded to throughout this episode, they have
way, way easier ways to provision and manage these things now with managed services. So I would say,
you could go to the official kubernetes.io site, the Cloud Native Computing Foundation, learn a lot there about the semantics of Kubernetes, how the paradigms
of deploying and managing application life cycles work.
But for actually going and building a project and doing something, you know, putting out
something like a website or service in the world, I'd go to the Google Kubernetes
Engine or AWS Elastic Kubernetes Service blogs and
documentation and just start hacking from that because that's just a great way to get
started and to really be on the right platform to scale up if the application is something
that you want to scale up.
Yeah, that makes sense.
Actually, I want to double click on that because that's a bit counterintuitive.
A lot of people might think, let me start on my own computer and get
it working on my own computer and then put it on the cloud. But to your point,
they've handled, like, a lot of this sort of infrastructure that is, you know, independent
of any particular Kubernetes setup. And to go to a place that's handled all of that infrastructure
is actually going to be much, much easier than you starting a Kubernetes cluster in your house.
And so, yeah, so that's a bit counterintuitive, but definitely, you know, go on one of these public cloud providers.
And let's say, you know, people are concerned about cost, especially when they're learning. You know, roughly what would it cost for somebody,
assuming they don't do anything really expensive,
to learn, you know, Kubernetes using Amazon or AWS? Yeah, so this is literally where our
company works. But generally, you know, you can run stuff locally, there's like minikube and a
number of projects to actually run Kubernetes commands and a mini cluster on your MacBook or
whatever. But generally, the point of this,
as we've been talking about, is to actually manage multiple nodes and to get the thing
working in a distributed environment, which really isn't as possible, particularly with
all the networking and all that stuff that you get in the data center running on your local machine.
The good news is that AWS is actually, frankly, quite cheap. If you want to do a small cluster,
you don't even have to manage your own master node. That's the beauty of these, specifically the managed service versions like
EKS and GKE. You don't have to manage that central node and the database that keeps track of
everything. They host that for you. So you're really only left paying for some small overhead,
a few bucks a month for the control plane, that master node, and then whatever VM resources you consume. And you can stick to like the T2
micro kind of very small VM instances and spend maybe call it 20, 25 bucks a month max on this
with a few nodes. So you can actually get that distributed application going. So not super
expensive. And if you're a student, AWS, Google, they all offer free compute credits. So you could probably get $500 in credits and run the thing for a year, a year plus, with no issues if you're only running a few small nodes in the
cloud. And that is the way that you'll develop an application that is running 2,000 nodes as well.
So you get great experience, hands-on experience that is highly portable to large scale scenarios
for that 20, 30 bucks a month. And again,
free compute credits if you're a student from these cloud vendors makes it really low barrier
to entry to start hacking on this stuff. Very cool. Awesome. Yeah, I think that makes a ton
of sense. I think there's a lot of really great resources here. I know Patrick's updated the show
notes. So folks can definitely check that out as well.
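For a rough sense of those numbers, here's a back-of-the-envelope sketch. The hourly rate and the control-plane figure are assumptions pulled from the ballpark figures in this conversation, not actual vendor pricing, so check current price lists before relying on them.

```python
# Back-of-the-envelope sketch of the "small learning cluster" cost discussed
# above. All numbers are illustrative assumptions, not vendor quotes.
HOURS_PER_MONTH = 730

node_hourly = 0.0116          # assumed on-demand rate for a t2.micro-class node
node_count = 3                # "a few small nodes"
control_plane_monthly = 5.0   # the "few bucks a month" figure from the conversation

monthly = node_count * node_hourly * HOURS_PER_MONTH + control_plane_monthly
print(f"~${monthly:.0f}/month for a {node_count}-node toy cluster")  # lands in the $20-30 ballpark
```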
Cool.
So yeah, I think we did a really good job of covering the high levels here
and getting folks what they needed to get started.
So thanks so much for that, Aran.
Oh, of course.
And if you want to, you know, sorry, go ahead.
Oh, I was going to say,
so now I want to dive into Archera.
So imagine, you know,
now you've built this Kubernetes cluster, you're scaling up,
you hit front page of Hacker News. Amazing. You have tons of customers. You're building a thriving
business on Kubernetes. And then you get hit with this $100,000 bill or something for the month.
And you're like, oh my God, what happened here? How do we deal with this?
I can tell you on a much smaller scale.
You know, I basically built a Google Photos for my family.
Like I built my own Google Photos when I was in between jobs.
And it has like a little Android app and a site and everything. And I actually got hit with some surprise bills.
I mean, nothing massive like that or anything.
But I remember, I'm trying to remember exactly what happened, I'm totally drawing a blank here. But basically, I ended up getting hit with like a $200 bill. It was something... oh, okay, I got it. So I was using
Datadog, which we've talked about in the past big fans of Datadog. But I didn't really know what I
was doing. And so Datadog was logging like literally everything, you know, it's like if the OS,
you know, allocated some extra memory to a swap buffer or something, Datadog had it, right?
And so I ended up getting hit with this like $200 or $300 bill, and it actually took me a long time
to figure out where the money was going. Because the Datadog thing was like a two or
three click install. And so in the grand scheme of building this whole
site, you know, I didn't really think much of it. And so I kept looking at my own code saying,
Oh, I must, maybe I'm storing the photo like a thousand times or something. And it's like,
okay, no, it's not S3. I mean, maybe it's like too many lambdas. No, that's not it.
And then finally I figured out, oh, it's this Datadog sidecar, I think, or agent or something, which
is just, you know, like costing me a ton of money. And so I can't imagine what it's like running a
business and getting hit with that times a thousand. And so, so it's really cool that you
spun up this effort to try and help folks with that. And so why don't you dive into like,
what inspired you to start Archera and what the
company does?
Yeah, so really, the idea for Archera came from my time at both Azure and AWS, launching
kind of the SageMaker services, particularly at AWS.
I don't know if many listeners have played around with GPUs in the cloud, but the costs
are enormous, probably 5 to 10x the commodity compute like, you know, T2, T3 memory and compute machines that generally you're using for web applications and the kind of common web or cloud use cases.
So incredibly expensive.
And, you know, what I saw in the ecosystem in 2017, when we were starting to think about what Archera would be, was the fact that there was great
visibility: the tooling to show you where that spend is going, say Datadog or something like
that, if it happened last week or last month, was pretty great. But the problem was that, at the end
of the day, a lot of the recommendations on how to act, once you know
where that spend is going, were very nebulous. They give you five different recommendations
that were often in conflict with each other. And then actually automating this thing,
basically no one was doing it, especially the subset of recommendations that weren't
application impacting that may not need an engineer even to approve them.
So I kind of saw this gap where
visibility was great. There was a lot of visibility tools like Cloud Health, Cloudability,
Cloud Checker, Cloud XYZ out there, and they were great at providing visibility, but failed to
maximize savings. And from the global view that we got at AWS and Azure of, I think at the time, almost $300 billion of aggregate
public cloud spend, it was estimated that about 33% of that was going to waste because of this gap
between visibility and action. So what we started to think about is how do we create a platform
where you can take that visibility, which frankly we view as a commodity, and the auto-tagging
and classifying of costs, but then actually build on it to create really detailed forecasts and
use those forecasts to then govern the management of the cloud where it matters and automate
the commitment purchasing, which is really this layer that we haven't talked about that's
even underneath the EC2 and VM nodes in this stack from containers and Kubernetes
all the way down to EC2.
And then uniquely automate the commitment management and then insure the commitment.
So if you don't use that capacity, we guarantee to buy it back from you.
So instead of just providing visibility, we try and take the visibility that exists today
and extend it with automation and
use the predictability that that automation creates to actually share risk with our customers,
to put skin in the game and actually assure them that if their Kubernetes cluster scales down and
we've committed to a bunch of capacity for them, we'll actually take those commitments off their
books so they're not left paying anything. So that was really the model that we started
Archera with. And we're
about two and a half years old now, have a number of large public companies, including Fortune 500s
that we're working with, as well as tons of fast growing startups. And even in the toy example of
running a Kubernetes cluster on the cheap, as cheap as possible, we have really small companies
that we worked with where we'll say, hey, you want to experiment with this T3 node small cluster for three months.
The rack rate on that, if you're just running it on demand, is say $5 an hour.
We can do a three-year commitment for all of those nodes and get you a rack rate of say $2 an hour, much, much cheaper. But instead of you holding onto it for a full three years,
after those three months where your project is done, you can either opt to keep that commitment
month to month, or give it back to us immediately, no questions asked. And we take a small percentage of that delta in savings, which helps us offset that risk and gives us a revenue model, to help customers essentially unlock this new optionality and the much deeper discounts that we're able to provide with this strategy that we've innovated.
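To make that buy-back math concrete, here's a toy calculation using the hypothetical rates just mentioned, $5 an hour on demand versus $2 an hour committed, over a three-month project. The fee percentage is invented purely for illustration and isn't Archera's actual pricing.

```python
# Toy sketch of the commitment math from the conversation. The rates come
# from the hypothetical example above; the fee share is made up.
HOURS_PER_MONTH = 730

on_demand_rate = 5.0      # $/hr, running purely on demand
committed_rate = 2.0      # $/hr, under a longer-term commitment someone else manages
project_months = 3
fee_share = 0.10          # hypothetical cut of the savings delta

on_demand_cost = on_demand_rate * HOURS_PER_MONTH * project_months
committed_cost = committed_rate * HOURS_PER_MONTH * project_months
savings = on_demand_cost - committed_cost

print(f"on demand:  ${on_demand_cost:,.0f}")
print(f"committed:  ${committed_cost:,.0f}")
print(f"savings:    ${savings:,.0f} (fee on delta: ${savings * fee_share:,.0f})")
# After the 3-month project, the remaining commitment is handed back rather
# than sitting on the customer's books for the rest of the term.
```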
Cool. Yeah, that makes sense. So what is Archera's status as far as, I mean, actually, it's been kind of interesting lately. There's been a lot of news about hiring freezes,
right? There's, you know, Meta has this hiring
freeze and Uber is claiming that hiring is a privilege. And so where's Archera there? Are
you hiring? Are you hiring for interns, full-timers, all of the above? None of the above?
We're hiring for engineers, both intern and full time. So definitely reach out on our website, archera.ai.
And yeah, we're looking for people who are anywhere from, you know, a little bit of experience, but want to learn all the way to like five, 10 years of experience running services and applications in the cloud.
So really the whole spectrum.
Cool. And if somebody, let's say, is going through that intro to CS, like we talked about at the beginning, and they're just ramping up in their career, what is something they
can do on the side to give them the type of tools they would need to ace an Archera interview?
Yeah.
So I would say that the biggest thing is side projects.
We test a lot of practical skills.
So for example, if you have spun up and used a mini Kubernetes
cluster to host an application before, or, you know, we have front-end roles as well, so if you have gone and built a website with React or Vue.js, things like that translate
to really hard practical skills that come out in interviews. And we do a lot of very practical
interviews because we want people to hit the ground running on day one, no matter what skill level they are.
So having that in your, you know, kind of back burner that you can pull on to speak
to those project experiences is always really helpful.
And then obviously there's this kind of standard coding interview stuff, but that's pretty
generic.
And I think it maps across all companies, like making sure you have that data structures
and algos review session before you go into an interview, something like that.
Yeah, that makes sense.
And then for an end user,
if someone is just getting started with Kubernetes,
is Archera, does it have sort of a free tier
or does it at the moment,
does it have an option for those folks
or is it really like not the right place for them?
Well, we have a free tier. And as I said, you know, even though we work with really large companies, we work with a lot of small startups as well. You know, we work with companies that were two people, and we have a very generous free
tier, you know, until you're spending like a million dollars a year, we're basically not
billing you. And, you know, you can get started with cost visibility with this cloud insurance
stuff. That's very unique. And I think, with zero engineering or SRE time, it can save you a lot of money, because the interesting thing is, underneath the VM layer of the stack, you have these commitments that can basically change how that VM is billed, but have no application impact. So it's just automating that piece, providing insurance, and helping you get more aggressive on that. In, say, four hours of work max, plugging the thing in, evaluating it, maybe even having a conversation with one of our salespeople, and then clicking the button, it's really quick. And it's all self-service as well. If you want to do that,
we support Kubernetes. And our goal is to make it as easy as possible to get high savings with
the lowest risk possible, no matter what cloud vendor you're running.
Very cool. I'll have to check this out. This is awesome.
Yeah, this is great. So actually, before we close up here, what is something that is unique
about working at Archera? So what is something that really stands out?
Yeah, so you know, I think we grew mainly during the pandemic. So we're a remote first company,
but we have a really great culture
of getting together now, especially now that COVID is waning in person. So we've had some
awesome meetups in Austin, Vancouver and Seattle. And we're a really fun company that I think is
great at offering, what I like to say is, the ideal kind of hybrid
experience where you get the flexibility
of being able to work from home. We have offices in kind of three major areas, Austin, Seattle,
and Vancouver, but we get together a lot. And it's just an incredibly fun place from a kind
of openness to innovation perspective. We try and bring a lot of that Amazonian culture where
anyone can create a document of what project they want to
work on and what new kind of SKU or product customers would like. And we have a process for
approving that and getting that into the workstream really quickly. So we love a lot of bottom-up
innovation. Those are, I think, a few of the interesting things that we have at our chair
that make it a really fun place for engineers to work. Cool. So give me an example of an in-person thing that you did that was really fun. Oh, well, maybe two weeks ago, we did a big mini golf
tournament at a bar here called Flatstick in Seattle. That's a mini golf bar. So we have fun
stuff like that going on all the time. You know, we're trying to make it a really fun thing for
people to come into the office. Nothing is mandatory, but, you know, people come in because they love seeing each other and hanging out at least once or twice a week.
And, you know, I think the fact that we're a small team and super innovative, like people use the
whiteboards all the time, which is a great sign to me. People are always ideating and discussing
stuff. And, you know, we love that side of things, and we never want that to leave the company.
Oh, very cool. That is awesome. I might have to hit you up for some
advice on things to do in Austin. Oh, yeah, we got quite a list there. Very cool. This is awesome.
So if folks do want to, you know, are interested in a career at Archera, is it
archera.ai slash careers, or? Yeah, I think archera.ai and then you go to careers. Yeah, archera.ai slash careers, that
should have the latest link. Great. Well, yeah, this is amazing. You know, folks out there, I mean, we
scratched the surface. I mean, Kubernetes is an incredibly powerful tool, but I think we did
an amazing job, especially you, Aran, did an amazing job, of kind of covering at a high level the way Kubernetes works, the way you can take an app that you have up and running maybe on your MacBook and
be able to put it in the cloud at scale, so if you do get hit with Hacker News number one or
something, it doesn't blow up your infrastructure. And when that does happen and you need to keep track of your spend, and you want to have, you know,
more confidence about your spend and how to improve it, you can check out Archera. And it
sounds like, you know, for most folks out there, you can jump on the free tier, you can try it out
for, you know, a trial period. And then if that looks like it works out for you, then you can
continue using it. So it's sounding like an amazing product. Amazing time here talking about Kubernetes. Thank you so much, Aran, for your time.
Thank you so much, Jason and Patrick. I really appreciated you having me on. This is a lot of fun.
Cool. Excellent. And for folks out there, thanks again for supporting us on Patreon and through
Audible. We really appreciate that. Thanks so much for your questions and comments. We've been getting a ton of email
and actually a ton of people messaging
Programming Throwdown on Messenger
with stories about how they got into various things,
how they got into programming,
how they started learning about Node.js and Next.js
from our chat with Guillermo.
I'm sure we'll get some messages
maybe six months or a
year from now about how people got into Kubernetes thanks to this cast. And it's so special when we
get those emails. It's really, you know, it's honestly, it's why Patrick and I have been doing
those for so many years. It's really special. And thank you so much for that, for that support out
there. And we will catch everybody on the next episode. See y'all later. Programming Throwdown is distributed under a Creative Commons Attribution Sharealike 2.0 license.
You're free to share, copy, distribute, transmit the work, to remix, adapt the work,
but you must provide an attribution to Patrick and I and share alike in kind.