PurePerformance - How not to start with Kubernetes – Lessons learned from DevOps Engineer Christian Heckelmann
Episode Date: March 29, 2021

To k8s or not – that should be the first question to answer before considering k8s. Granted, in many cases k8s is going to be the right choice, but don't just default to k8s because it's hip or cool.

In this episode we have Christian Heckelmann (@wurstsalat), DevOps Engineer at ERT, talking about his journey with k8s, which started with installing k8s 1.9 on bare metal. He gives a lot of great advice based on his presentation "How not to start with k8s", such as: Understand Networking, Don't use :latest, Set Resource Limits, Train The People, Provide Templates, and more.

To get started with Kubernetes we encourage you to look at the YouTube tutorials posted on TechWorld with Nana.

https://twitter.com/wurstsalat
https://docs.google.com/presentation/d/1EL9OYe-1eOPXh6U8SMHnQxs8pcmr01d-uwoWoFnzUaY/edit#slide=id.g5420f4ebeb_0_5
https://www.youtube.com/channel/UCdngmbVKX1Tgre699-XLlUA
Transcript
It's time for Pure Performance!
Get your stopwatches ready, it's time for Pure Performance with Andy Grabner and Brian Wilson.
Hello everybody and welcome to another episode of Pure Performance.
My name is Brian Wilson and as always I have my co-host Andy Grabner today.
Andy, how are you doing?
I'm really good. You're so polite all of a sudden. You made some terrible jokes earlier about me. What's wrong?
What happened? Why are you so polite all of a sudden?
Repentance. I feel bad, you know. Plus, you know, the people on the other side of the
speaker, they only have to know about our beautiful relationship. They don't have to
know about the dark side and all the dirty secrets.
Yeah, yes, yes, yes. That'll come out, like, you know,
10 years from now, when we're both on Skid Row with bad drug habits,
it'll be like how it fell apart.
Anyway, speaking of things falling apart,
I'm going to try an anti-segue.
Speaking of things falling apart, right?
Deployments fall apart all the time.
New technologies fall apart all the time.
A lot of times people go ahead
and try to use the fancy new toy
and there's no good guidelines on how to get started,
no good ways to say, hey, what could we have done
if we only knew or had some experience to build on?
How does this sound?
Am I going in the right direction, Andy?
Do you want to save me?
Yeah, I'm not going to save you.
I'm just continuing your thoughts.
The latest new cool technology, as we all know, is Kubernetes.
We had several people on the show lately talking about Kubernetes and what it all allows us,
kind of getting us to the promised land of enabling every developer to deploy into production
at any point without any problems because it's endless scalable, it performs well, and you don't
have to think about anything else other than writing a couple of lines of YAML code.
Just YAML? Yeah, just YAML. That's the only thing.
But the thing is, no, so for real, we all move in that direction, right? Most of the listeners are probably already on Kubernetes,
some will have to learn it, and there are happy moments and sad moments, and we want to make sure
that you will have more happy moments than sad moments with Kubernetes. And this is why we brought
Christian Heckelmann on the podcast today. Christian from ERT. I think, Christian,
first of all, hi. I want to give you the word.
Hi. Hello.
Christian, we've been talking, we've been working together for, you know, a year or two or even more. I don't remember. I think it was Barcelona.
Yeah, so with all the COVID around, you cannot tell what year it is right now, right? Exactly.
And you've been a part of the journey with, you know, obviously Dynatrace and with Keptn.
But what really today is not about Keptn,
is not about Dynatrace.
It's really about your experience and your work
that you are doing in adopting Kubernetes
for the company you work for
and the things you've learned, the things you love,
the things you don't like so much,
things you would have wished you had known
before you got started.
And you did a great presentation,
which we're probably going to share. You called it "How not to start with Kubernetes". A lot of great wisdom, a lot of great memes in there, and it's more kind of comic-book style.
Yeah, which makes it more enjoyable to learn and work with Kubernetes, right? But in all seriousness,
before we get started christian can you give us a little background to yourself?
Because people can then better relate on who are you, where did you come from, especially professionally,
so that they know why you're actually transitioning over to Kubernetes and what you're doing right now with Kubernetes.
Oh, yeah. So first of all, my current position is DevOps engineer at ERT. So a couple of
years ago I didn't have anything to do with Kubernetes or Docker at all. I
was a database administrator, and before I was a database administrator, for around about six, seven,
eight years, I don't recall at the moment, I was in charge of the network of my former company.
And before that I was kind of a field engineer, out in the field.
And so, this is my second time working for ERT, and when I started at
ERT, we didn't have any Kubernetes, any Docker environments, or anything like that.
We were deploying our services via Puppet, or on IIS with WebDeploy, etc.
And I think it was three years ago or something like that.
It was Kubernetes 1.9 back in the days.
We started to, yeah, have a POC, because one of our architects was asking us: hey, do we want to start with Kubernetes, or something like Docker Swarm or Mesos, for instance? And we did a comparison.
And that was the first mistake we made: we did this kind of submarine project, as I would call it in German.
I don't know if there is the same term in English, because I'm not a native speaker.
And we started that, or my task was to install a Kubernetes cluster which our developers could play around with.
And another guy was taking the Docker Swarm part.
And there the whole journey started, I would say, with Kubernetes
in our company. And as I said, that was the first mistake: not bringing everybody on board building
the platform. Because building a Kubernetes cluster, or, yeah, installing a Kubernetes cluster,
is not the problem, but maintaining it, thinking about all the other stuff
like networking, storage, and so on and so on and so on.
This is quite a hard learning curve,
at least from my perspective.
And so, yeah, I started
going to kubernetes.io, or whatever the domain is,
and then reading through the docs on how to
install Kubernetes, and I provisioned myself three Linux VMs and started with kubeadm and
all the stuff, right? Basically building Kubernetes from scratch. And it was kind of,
okay, it's working. And then we thought about, okay, how are we getting our traffic inside the cluster?
And it was kind of, okay, googling and then searching by myself. So if you're starting, I would
always recommend: if you start using Kubernetes, get help. If you don't have the
knowledge in your company, get external knowledge, consultants or whatever, and start working
with them.
Because a lot of mistakes I made in the past were just because I didn't know better,
or you couldn't find any better solutions on the internet.
So yeah, it's quite a big challenge running
Kubernetes, especially if you're running it
on-prem, right? So running
Kubernetes on-prem is a pain in the...
I just wanted to ask a question about that
because it's an interesting topic.
Andy, as we often
discuss with guests,
a lot of times they're
experimenting on their own, just like you or Christian.
And it's hard to get buy-in until you can prove something out.
So what you just proposed there about getting external help, at least if it's paid, is sort of one of those chicken-and-egg problems.
Because how are you going to get budget for external help if you haven't proven something out, maybe haven't even gotten buy-in yet from your management? Obviously, the success
cases we've always seen on migrations have been with upper management buy-in. But if you're doing this from the
ground up, in the submarine project, how would you
recommend somebody goes about it?
Are there free external resources that you're aware of? I know you mentioned something about
the internet. At this point now, as opposed to when you started,
are there some resources people can leverage
that they wouldn't necessarily have to go to a paid level yet,
that they can start at?
How would someone tackle that from your point of view, Christian?
So when you want to deploy Kubernetes,
there are great provisioners,
so tools you can use to start with, which will
give you the ability to spin up a Kubernetes cluster, quite a stable one,
without the deep knowledge. Like, I stumbled across Rancher, which is an open source project,
even though you can buy professional support from them as well.
But if you want to provision some clusters,
they have quite good documentation on how to do stuff
like configuring
ingress and load balancing on your cluster and so on.
So I would recommend doing something like that, or using
other tools like kOps, VMware Tanzu, or whatever your company is able to do.
Yeah, but it's still, when you're searching on the internet, you find, for one
Kubernetes setup, or whatever you want to achieve,
tons of different documentation and guides.
And when you're taking a look at the CNCF landscape, which is what I'm always doing
if I need to bring in something new, like a new storage provisioner or whatever,
I'm looking at the CNCF landscape and looking at what tools are available there,
which are maybe not officially supported,
but which are projects within the whole Kubernetes ecosystem, right?
Not relying on some weird GitHub repos with some scripts in them.
That's basically what I can tell the people.
And even when you're starting with your Kubernetes,
you have to think about what kind of workload
you want to run on your cluster.
It's not only building a cluster
and then shifting everything from your big monolith
into a container and throwing it into the cluster.
I think this will not make you happy
operating the cluster, right? Because you
will find a lot of side effects here and there.
One of the resources that I found recently
is called TechWorld with Nana. She is an amazing YouTuber on technology topics.
She's actually in Austria, she's from Vienna, and I reached
out to her, and Brian, we may have her on one of the upcoming podcasts, hopefully, once she has time. But
she did an amazing set of tutorials, and she has like several hundred thousand views on these
tutorials. A lot of them are free, most of them are obviously free on YouTube, giving you
like a four-hour or an eight-hour course on Kubernetes, getting started with it.
So we'll add the link to this as well.
But Christian, what I hear from you, and this is also part of your presentation:
don't start with installing Kubernetes from scratch. By now, time has passed, and there
are many options to really get started with Kubernetes.
And maybe even you just go to your favorite cloud provider
and then you just spin up a managed instance there.
Yeah, absolutely.
But even when you're starting on a Kubernetes cluster,
whether it's a GKE or an EKS cluster,
or whatever it's called in Alibaba Cloud, I don't know:
I would say, if you only have to support one kind of application, or your company is
really small and you don't have a lot of dependencies across
your environment, like services that need to talk with services in your on-prem environment, and a
lot of routing and so on, then spinning up an EKS cluster is really easy, right?
But as soon as you have to connect all the services with your legacy applications and so
on, then you still need to figure out: okay, how am I sizing my VPCs,
how am I doing routing there, how am I doing security,
and all of the stuff you need to think about.
And of course, something like load balancing.
So one of the things I learned running Kubernetes on-prem:
if you want to run a load balancer,
you first need to have something like MetalLB in place, so you can
have a floating IP around your cluster, and then set an ingress controller on top to
distribute the traffic inside of your cluster. And if you're running this in a, let's say,
managed system, then you have to think about the costs as well, right? So sure, you can install the AWS LB controller,
and every time you're deploying a service,
it will spin up an entire load balancer for you,
which is great.
But if you have a couple of hundred services
running on your cluster,
then it will get quite expensive,
having 400 load balancers in place. And then you should think
about: okay, I need an ingress controller for them, because not all services need
to have their own load balancer, right?
And so on.
So there are still considerations you need to make when you're running
in the cloud, but it's getting much easier,
because a lot of issues I had were
just because of the fact that
we didn't know how to configure things,
or didn't have best practices in place
for our worker nodes.
Let's say some system settings
on your RKE CentOS machines,
right. So, little story here:
I had the issue that, with CentOS and the XFS file system,
the SLAB unreclaimable cache was filling up, and worker
nodes were crashing all of a sudden.
And this was all because of a lot of mounts which were made by one of our deployments.
We didn't know why.
You have to dig into the kernel crash files and read a lot of kernel bug reports and so on to find out:
okay, when I change the file system to ext4,
we don't have this issue anymore.
And this is what you don't have to think about when you're running
on an EKS cluster or a managed cluster, right?
But you still have to think about other stuff,
like the load balancing I mentioned before, right?
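The load-balancer-versus-ingress tradeoff described above can be sketched in a couple of manifests. This is a minimal illustration, not anyone's production setup: the service and host names are hypothetical, and it assumes an ingress controller (for example ingress-nginx, fronted by a single cloud load balancer or a MetalLB address) is already installed in the cluster.

```yaml
# Sketch: an internal service shares the one ingress controller instead of
# getting its own cloud load balancer. Names are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: orders
spec:
  type: ClusterIP      # no external load balancer is provisioned for this
  selector:
    app: orders
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: orders
spec:
  ingressClassName: nginx   # assumes an NGINX ingress controller exists
  rules:
    - host: orders.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: orders
                port:
                  number: 80
```

With this pattern, hundreds of services can sit behind one load-balanced entry point; only the ingress controller itself needs `type: LoadBalancer` (or a MetalLB IP on-prem).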
Yeah, so, I mean, you bring up a lot of points that, I mean,
I didn't run into each of those, because I'm not as deep down
in the weeds, having to provision and manage these Kubernetes clusters
for a whole organization. But just from the work we've been doing
with Keptn, where we have been providing different ways
to stand up Kubernetes clusters: whether we give people instructions on how
to use EKS and AKS or something like that, that's kind of easy. But then people
are not familiar with Kubernetes, and maybe they want to run it on k3s,
you know, right? They've been using k3s or MicroK8s.
But all of a sudden, and I think this was my wrong perception, I thought
I run a command and have a Kubernetes cluster,
and all of a sudden I can just deploy my apps, and I don't have to take care of anything else, because Kubernetes takes care of everything.
But then I realized that I have to understand: what is an ingress?
How can I make sure that my
ingress is exposing my services to the outside world with SSL, with TLS?
And these are all things that I, coming from my background as a developer, never thought
of.
I never had to think about networking.
I never had to think about security.
And all of a sudden, if you really want to use this platform as a quote-unquote self-service magic tool, you have to take care
of this, right? And this is where I think some of the misconception always comes in. It is challenging.
Yeah, it is challenging. And when you mentioned the developers: so, developers should focus on writing code
and not think about how networking is being done
and so on and so on.
But you need to understand the basic concepts of Kubernetes,
how, for instance, you're calling another service
within your cluster, right?
So what I always see is developers provisioning
or installing a service, deploying a service in the Kubernetes cluster,
defining an ingress, and then calling another service through the ingress.
So the traffic is flowing outside of the Kubernetes cluster,
to the public internet, back to the Kubernetes cluster,
which basically, yeah, is latency, right, for your service.
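That internal-versus-external routing point can be shown concretely. A minimal sketch with hypothetical names: the sibling service is reached through the cluster DNS name Kubernetes gives every Service, so the call never leaves the cluster or touches the ingress.

```yaml
# Sketch: call a sibling service via cluster DNS instead of its public
# ingress URL, so traffic stays inside the cluster. Names are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
  namespace: shop
spec:
  replicas: 1
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
        - name: checkout
          image: registry.example.com/checkout:1.4.2
          env:
            # In-cluster DNS: <service>.<namespace>.svc.cluster.local
            - name: ORDERS_URL
              value: "http://orders.shop.svc.cluster.local:80"
            # NOT https://orders.example.com, which would leave the
            # cluster and come back in through the ingress.
```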
Well, latency and costs.
Costs, yeah, yeah, absolutely. And I also had no clue about that, right? I mean, again, when I started, and now, a year and a
half into it, I completely agree with you. Next time I would start with something
like this: take the time and sit down and go through documentation, some training, because you
can avoid a lot of basic mistakes, like exactly this one.
And you are not aware of these mistakes.
Well, like, you know, we talked about the load balancers.
If I spin up an EKS and I just say load balancer ingress,
oh, that's awesome,
and everything works fine.
And then you get the bill at the end of the month.
Exactly.
Exactly.
And then also other
mistakes.
There's one slide
in my presentation with
this Oprah meme:
everybody gets admin.
This was in the beginning.
I didn't
have a clue about
RBAC and all the security stuff in Kubernetes.
That was something I learned afterwards, after developers deleted entire namespaces. And not just application namespaces, even environment namespaces.
And how to prevent stuff like that.
Or, let's say, if you forget the host field in your ingress, in our case, it will
create an asterisk, a wildcard ingress for you. So all of a sudden, all traffic was
routed to that one deployment, and everyone else was yelling:
what's happening? When I'm calling my URL, I'm getting the other
service, right?
And I could mitigate this using the Open Policy Agent,
for instance. But you also need to think about RBAC,
and don't give everybody admin access on the cluster.
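The wildcard pitfall described here is easy to avoid once you know it exists: always spell out the host. A minimal sketch with hypothetical names; whether a missing `host` becomes a catch-all depends on the ingress controller, which is what Christian ran into.

```yaml
# Sketch: always set the host field. In the setup described above, an
# omitted host turned the rule into a wildcard that swallowed everyone
# else's traffic. Names and the TLS secret are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: reports
spec:
  tls:
    - hosts:
        - reports.example.com
      secretName: reports-tls   # assumes a TLS cert stored in this Secret
  rules:
    - host: reports.example.com  # explicit host, never left empty
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: reports
                port:
                  number: 80
```

An admission policy (OPA Gatekeeper, as mentioned, or similar) can then reject any Ingress that arrives without a host, so the rule is enforced rather than just documented.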
Yeah, because you would also not give everybody full access
to all the production servers all the time.
Yeah, absolutely.
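The "everybody gets admin" fix is standard Kubernetes RBAC. A minimal sketch with hypothetical group and namespace names: bind a team to the built-in `edit` role inside its own namespace instead of handing out cluster-admin.

```yaml
# Sketch: give a developer group edit rights only inside its own
# namespace. Group and namespace names are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-edit
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit            # built-in role: manage workloads, no RBAC changes
  apiGroup: rbac.authorization.k8s.io
```

Because namespaces themselves are cluster-scoped objects, a developer holding only this namespaced binding cannot delete a namespace, which is exactly the horror story described above.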
And I think this is the challenge again.
This is kind of the balance.
We all preach, at least, you know, in the work I do,
we preach about autonomy.
We preach about how we can
give everybody more responsibility,
but still give them enough guardrails so that they cannot make mistakes.
And I think this is the challenging thing to figure out with Kubernetes, too.
It's a great platform if you use it right.
But it's also such a huge platform that not everybody should have access to everything.
But you have to have basic knowledge, and you have to figure out what makes sense
to give into the hands of certain people, and what doesn't make sense, where you instead need to provide some processes, using tools like
continuous delivery tools that can then automate certain things, right?
Yeah. So a big part is
documentation: to have documentation for the developers in place, to have templates in place
they can reuse. Like, I created a Helm template, for instance, which
was basically covering most of our deployments, and doing stuff like, or preventing stuff like,
accidentally exposing a service to the internet when it doesn't need to be exposed to the internet,
right? Or setting resource limits by default on the deployment. Because what I've seen,
and this was also a big learning for me, is dealing with resource limits and requests in
Kubernetes. So on our first cluster, I was always wondering: okay, why is the node
just exploding? Why is it not working anymore? What is happening there?
And then I figured out: okay, the pods are utilizing too much space,
or whatever, too much memory.
Java applications' memory consumption is quite high, right? So what I did, first of all, was set resource limits on namespaces,
so they cannot overutilize
the RAM or CPUs, and I also built this into our template,
so it will not be forgotten anymore when they're deploying something, right?
And yeah, stuff like that. So: documentation, trainings for the developers when they want to start using Kubernetes. Because building a Docker image is quite easy, deploying something to Kubernetes
is quite easy, but does it always make sense? That's the other question, right?
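The namespace-level limits Christian describes map to two standard objects. A minimal sketch; the namespace name and the numbers are illustrative, not recommendations.

```yaml
# Sketch: cap a namespace and give containers default limits so a single
# deployment cannot starve the node. Values are illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
---
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults
  namespace: team-a
spec:
  limits:
    - type: Container
      default:            # applied when a pod spec sets no limits
        cpu: 500m
        memory: 512Mi
      defaultRequest:     # applied when a pod spec sets no requests
        cpu: 100m
        memory: 256Mi
```

The LimitRange also backs the "build it into the template" point: even if a developer forgets limits in their deployment, the namespace defaults kick in instead of the pod running unbounded.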
Yeah, that's actually a good one.
So is Kubernetes necessarily, by default, the right choice for any type of app, or might it be better to think about other platforms?
Or, like, I don't know, it could be that you're just deploying a container
on ECS or Fargate, or maybe serverless is better,
or maybe it's just an old-fashioned VM and you just run it somewhere.
Yeah, so the best example I have here is websites, static websites.
Why should you deploy a static website as a container or deployment in Kubernetes, when you can put it into an S3 bucket and you're good to go, right?
A running container is a constant cost.
Do you run regular training sessions, like, how does this work? Do you do regular training sessions, or do you wish you had regular training sessions with
developers, or how does this work?
Yeah, so, from time to time, when I see something popping up, like, hey, we want to deploy something to Kubernetes, then I'm reviewing the application, or what they are doing there, giving guidance, doing training. It depends on the knowledge of the developers. It all comes down to governance: when somebody is writing a new service, an architect
should look into it and decide: okay, does it make sense to run it on Kubernetes? You shouldn't use
Kubernetes only because you can spin up an environment very fast, without having
to tell anybody else, right? Because what I've seen, when we started using Kubernetes,
was that every developer was
throwing their application into a container and onto the cluster, because they didn't need to go over
the, I don't know, the hurdle.
The hurdle, yeah, whatever. They were trying to avoid having to create a merge request in our Puppet Hiera configuration, right?
And they used Kubernetes to bypass this,
even if the service doesn't make sense at all
to run in Kubernetes, right?
You know what I'm hearing a lot of in this conversation,
and thank you both, I've been quiet because I've been trying to listen and learn.
It sounds as if there's a disconnect with the general marketing of Kubernetes, and I'll use that loosely because there's no Kubernetes company per se, right? But the general marketing of Kubernetes is: pop it in, everyone can be self-sufficient, and
things just run smooth and easy, right? Which most of us at this point know is not as simple
as that. But Christian, as you mentioned, things like training, like documentation: all these things
that for decades people have been trying to move away from. We're obviously not moving anywhere.
Now, the way I see Kubernetes,
and I don't mean to say that Kubernetes is a loss, right?
Because Kubernetes is opening a whole new world
for great other things.
I think there's just still a lot of overhead
and maintenance, scripting and all,
but a different type.
And the benefit of Kubernetes is that as opposed to,
let's say in the old days where you had to stand up your servers,
you had to have a physical location to run it yourself.
You couldn't just pay someone else to do it easily.
You had to pick your network, set up your network,
do all this other kind of stuff.
You can now spend time automating a lot of your deployment process, which gives you back that time
to do these things like the documentation, the training, and these other bits that are still
going to, you know... some of these things are not going to go away. We're still humans, we're
still stupid, meaning we don't have these things programmed into us. And if program Andy crashes,
we can't rely on program Andy being there,
so this has to be documented.
So I guess what I'm trying to say in a long-winded way
is that while not all the things that we all hate
about traditional setups go away,
the critical functions still remain,
yet because of the advancements in
technologies, this gives us the ability to automate and push past a lot of these other
things that we would normally have to do in tandem with the documentation and everything else. So
it's not a net loss. It's still a net win. But I think most people have to come to the realization
that this is not some magic fairyland. If we think back to
Cloud Foundry, the idea was: here's my code. I forget what the haiku was, Andy, you know: here's
my code, deploy it, I don't care where. You still have a whole team of people
maintaining that Cloud Foundry thing, but at least for the developer, on that side, it's a bit more
abstracted. Here, as you're saying, there are a lot of things you still have to know and learn. And I guess I'm just
bursting the bubble that it's not just plug and play. Did plug and play ever come to fruition?
Remember way back when plug and play came along? Or am I dating myself? It's, yeah, it's still
going to be tough, and there are still a lot of things, and that's why I think this guide that you put
together is so awesome, because it covers a lot of these things. Earlier I was asking: are there
any resources out there for people?
Although this isn't a definitive
resource, it gives you a lot of things to say: hey, what
are some checkmarks we should go through?
Anyhow,
I'll shut up now. I've been rambling.
No, that was good.
And I think...
So
what I already told Andy yesterday: when you're building
a Kubernetes environment or Kubernetes cluster, it's not only a platform for the developers
that you're building. In the end, you're building a small data center within your data
center, or within the cloud, right? With all the different dependencies like networking and so
on, storage, blah blah blah. And you need to think about that stuff,
and you need people who take care of that stuff as well.
I mean, I had a little smile on my face when Google announced their Autopilot
feature for Kubernetes last week or this week,
I don't remember which. It takes away a lot of administrative tasks, but it's
also saying: okay, you can now only use this kind of networking provisioner, or CNI plugin, or whatever,
right? Because in the end, you can only control the complexity if you reduce
the number of potential combinations that make it so complex,
right? That's why it becomes an opinionated platform. And we're back to Cloud Foundry,
as I mentioned once in an episode several months ago. Yeah. And I think they had an idea there.
Yeah. I completely agree with you, Brian. And I think this is also what we all try to do:
we try to leverage the new shiny thing,
but then we learn that while it's great and powerful,
in the end it doesn't make us more productive,
or at least not the developers.
Therefore, we need to come up with a very clearly
defined, prescriptive, opinionated path
for doing 80 or 90% of the work.
And for the remaining 10 percent,
yes, we may then need to go and look into some other options outside of our opinionated way.
But I agree. This is also why we invest, you know, so much in standardizing things, whether
it's OpenTelemetry, whether it's the stuff, and now I say it, with Keptn. It's all about
making, in the end, making it easier to get
work done on top of something that is very complex. But we also have to narrow
down the complexity. This is also why, if you look at
what we ended up doing with Keptn, we are really narrowing it down now to say:
if you want to try Keptn, then you take a Kubernetes cluster, and we want you to use,
let's say, Istio as a service mesh.
And in the first iteration of Keptn, we said it is deploy, test, evaluate.
This was super easy and super clear.
Now we went a step further, because we had people that said, well, we need more flexibility. But still, by default, we give this prescriptive approach
and say: this is how we think you are most productive in deploying this particular type
of technology or app or service. And if you use it for 80 percent of the use cases, we think you're going to be fine.
And for the 20%, you can turn some knobs
and you can change our opinion
to be closer to your opinion.
And I think, Brian, we had a similar discussion
in one of the episodes we recorded
that hasn't aired yet as of today,
but I think it was with Baruch, if I'm not mistaken.
Oh yes, the Liquid Software one.
The Liquid Software one, same thing.
By the time people listen to this
it will have aired. It's the previous episode.
But yeah.
Christian,
I know
in the end of your presentation, and we
will share it, you have
a nice conclusion, and as I think Brian highlighted before we started the recording,
there's a nice meme on there.
It says the H in Kubernetes stands for happiness.
But you have a nice summary of points of things that you want to be careful with.
Like don't deploy a production cluster
without a review of a professional.
That makes a lot of sense.
And train the people.
We already covered that.
The templates.
I know you covered it slightly,
but I have a question on templates.
So if you provide templates,
but you allow people to modify them,
templates are just templates.
But if you don't enforce them, do templates alone help you?
Do you need governance or something else on top as well?
Because otherwise people can do whatever they want with the templates.
So when you have a lot of microservices and a lot of developers who are adjusting templates and so on, it's quite hard to get an overview of what they are doing there.
At least that's what I've seen happening.
But it could be a different story in other companies, right?
A company that only deploys one set of software is different from one that has a lot of different systems flying around, legacy systems, and then wants to transition to Kubernetes.
But in my opinion, you still need to have some kind of governance, someone who looks at new services, new deployments, which are then going to Kubernetes clusters.
So developers could play on one cluster if they want, right?
Nobody wants to restrict them from playing around with stuff.
But as soon as the service gets promoted to a higher stage, to an official
integration environment or whatever, then somebody should really look at what they
are doing there and how the service is working.
Normally, in my opinion, this is a classical task for the software architects, right?
So they review the service: what the service is doing, what the communication of the service is, and so on.
Another thing that I want to ask you now, you mentioned earlier some of the things that happened and that shouldn't happen, like no resource limits.
You mentioned access control and somebody was accidentally deleting all the namespaces.
Any other horror story?
Oh,
a lot.
A lot.
I think I've found
every
yeah,
and here again,
I don't have anything.
Was that Farfignitian?
Yeah.
If you step, it literally And here again, a German watch. Was that Farfignutian? Fettnäppchen.
Fettnäppchen, yeah.
Literally translated, it means you're stepping into a puddle of fat.
That means you make a bad step,
and then you do something that you shouldn't do, right?
Are there puddles of fat lying around in Germany
that this came from?
I'll look that up, how that came about. It's just fascinating how that term came about.
Yeah, a lot of Weißwürste lying around in Bavaria. Yeah, sure. So for instance, a classical
thing is the tagging of your Docker images, and using something like the tag latest.
I can remember I was searching around to get a storage provisioner.
Right.
So in the beginning, for our first or second cluster,
I used GlusterFS combined with Heketi to provision persistent volumes.
And yeah, it's a classical copy and paste.
And yeah, it's working.
We have persistent volumes.
But I didn't realize that the deployment of Heketi was using the tag latest.
And node goes up, node goes down.
And all of a sudden, yeah,
it was using, in the end,
the image pull policy Always,
which means every time the pod gets restarted,
it pulls in the latest image of Heketi.
And all of a sudden our storage provisioning
was not working anymore.
And it was kind of, okay, what's going on here?
And then you have to dig into the problem: okay, why is it not working anymore? And so on and so on.
And even developers were using, for instance, a heavily used image, you know, Alpine, and they're using alpine:latest.
And I can remember in Barcelona at Perform, one day there was a vulnerability found in the Alpine image,
for, I don't know, an empty root password or something like that, and they updated the
Alpine image. And I was getting pinged by everyone in our company, a lot of people, kind of,
hey, my service isn't working anymore in Kubernetes. Kubernetes down, Kubernetes down, help, help.
Yeah, world is on fire.
And I was kind of, yeah, but your pod isn't working at all, right?
It's, what should I do?
Kubernetes is working.
So your pod isn't starting up, and have you tried to run the pod locally, for instance? That's also something I've seen a lot, that developers are just using the CI/CD tools to build their
containers and throw them into the cluster, and really developing inside of the cluster instead
of trying, okay, is my Dockerfile building my application? So I have a lot of tickets regarding,
hey, my deployment isn't working,
but your Dockerfile could not be built.
So try it locally, fix it locally,
and then you can avoid these turnaround times
for fixing stuff like that.
And especially because then you become the bottleneck
and you deal with things that you shouldn't deal with,
because these are basic things that should be checked beforehand.
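[Editor's note: the fix Christian describes comes down to two fields in the pod spec. A minimal sketch, with placeholder names and versions, not from the episode:]

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app          # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          # Pin an exact tag (or, even better, a digest) instead of :latest,
          # so a pod restart can never silently pull a different image.
          image: registry.example.com/example-app:1.4.2
          # IfNotPresent avoids re-pulling on every restart; note that with a
          # :latest tag, Kubernetes defaults imagePullPolicy to Always.
          imagePullPolicy: IfNotPresent
```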
Absolutely. Absolutely.
Yeah, I mean, for the latest tag, isn't that something where OPA comes
in, the Open Policy Agent? Can that be used to validate that no latest is used?
Yeah, sure.
And I think it should be possible with OPA,
or however it's pronounced in English.
I don't know.
Policy agent.
OPA.
And you know OPA in German means grandfather.
Yes, my mom had a friend who was a grandmother.
Oma.
Yeah, Oma and Opa.
I knew this from that, but I forgot what it was.
I just know I heard it.
I'm pretending to be cool with you guys.
But I have a question.
Do you also say Oma and Opa, or do you say Großvater and Großmutter?
I'm saying Oma and Opa.
Okay.
But it depends on the region in Germany.
Yeah, same here, I guess.
I mean, Oma and Opa is very Austrian everywhere, I think.
As you know, I'm a little bit Austrian.
Yeah.
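[Editor's note: for listeners wondering what the OPA idea mentioned above looks like in practice, here is a rough, untested sketch of an OPA Gatekeeper admission policy that rejects pods using :latest. It only checks plain pod containers; untagged images, which also default to latest, and workload kinds like Deployment would need extra rules:]

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sdisallowlatesttag          # hypothetical name
spec:
  crd:
    spec:
      names:
        kind: K8sDisallowLatestTag
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdisallowlatesttag

        # Emit a violation for every container whose image ends in :latest.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          endswith(container.image, ":latest")
          msg := sprintf("container %v must not use the :latest tag", [container.name])
        }
```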
It's funny, because, I know this is a total sidetrack, but I just have to say this: a lot of times there'll be music that I like, and I
always want to go to Andy with it, but I'm like, no, it's German, it's not Austrian. And I'm thinking,
like, well, is it close enough? And then I'm like, well, is Canadian close enough to American?
They're totally different. They speak the same language, but it's different
enough. So that's
why I don't bother you with the German music I listen
to. No problem.
So I'm the master of sidetracks
normally. But so
speaking of this latest, just to clarify, right, because
I'm pretty sure I understand what the deal is, but
for people who might not understand what people
are doing, what it sounds like people are doing
is instead of saying which version
they want to get, because a lot of times when you're getting something
from Docker or GitHub or something,
you just say latest, and that's a default tag
that gets applied to whatever the latest push is.
That's not something that people necessarily
even put on their builds.
But latest will always just get the latest.
So the recommendation would always say
use the specific version that you want
instead of just latest,
because obviously if you're always getting latest, as soon as someone updates
it, you're going to get that new one, and you have no idea what you're going to get.
And another one: it's a security issue as well.
If you're pulling some images straight from Docker Hub, you are not aware of what is inside
the image.
Even popular public images could be compromised.
And so I always recommend having a set of base images in your own registry which developers can reuse, which you have scanned and, like, released for official use for your developers.
That's one of the things I would recommend instead of, yeah,
running all the stuff. But as Andy asked me for other mistakes I've seen... and I should rename my
Twitter handle to Grumpy Admin after the show, because I'm only
complaining and complaining and complaining. But Kubernetes is great. Kubernetes is great for automation and so on.
But from time to time, you're thinking about,
okay, classical example:
somebody is deploying an application,
and it's kind of, okay, I've deployed two pods.
And you're kind of, okay,
and where is your auto-scaling configuration?
I've deployed two pods.
It's highly available, and I'm fine with that. And you're kind of, okay, why don't you use all the stuff Kubernetes is there for,
right? Automatically scaling your deployment. It's like deploying two VMs in a data center, and
you're good to go. No, that's not how it should work, right? And here we are again, back to the training story, something like that.
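[Editor's note: the auto-scaling Christian is missing here is a HorizontalPodAutoscaler. A minimal sketch using the autoscaling/v2 API; the names and thresholds are made up, not from the episode:]

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 2             # keep the two pods for availability
  maxReplicas: 10            # but let Kubernetes scale out under load
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale when average CPU use passes 70%
```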
Or health checks in the deployment.
If you're getting an email from one of the developers like, hey, can you restart my service in Kubernetes?
You'll be like, what the...?
Right?
Because this is a core capability and exactly what it's built for.
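[Editor's note: the "please restart my service" ticket usually means the deployment has no liveness probe, so the kubelet never restarts the pod on its own. A hedged sketch of the probe and resource-limit fields in a container spec; the paths, ports, and numbers are placeholders:]

```yaml
containers:
  - name: example-app
    image: registry.example.com/example-app:1.4.2
    ports:
      - containerPort: 8080
    # The kubelet restarts the container automatically when this fails,
    # instead of someone emailing the admin.
    livenessProbe:
      httpGet:
        path: /healthz       # hypothetical health endpoint
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 15
    # Takes the pod out of Service endpoints until it is ready to serve.
    readinessProbe:
      httpGet:
        path: /ready         # hypothetical readiness endpoint
        port: 8080
      periodSeconds: 5
    # The resource limits mentioned earlier in the episode.
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
```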
I understand.
I think it's great that you're complaining because, again,
we always hear about the positive side of Kubernetes,
which there's a lot, right?
But this is the real world.
What are the admins dealing with?
What are the things that people are still... What's the people factor, the human factor of leveraging it?
You can build in whatever guardrails you want.
But you can't, as they say, you can't fix stupid. All right, so you still have that human factor that's going to do things
wrong, not leverage what's built in there. You know, it goes back to way back when I had a job at a
record store. We had a big problem, because there was this island in the middle of the store where
we had, you know, Walkmans and radios that you could buy, but the register was at the front,
and over the register was a gigantic sign
with an arrow that said, Register.
Where to pay?
People would go stand at the other side
of it with their CDs in their hand, willing to buy.
You're like, it's over here,
under the big sign. You can't
pull that out.
Someone's not doing a health check.
What do you say?
How do you deal with that?
That's something you can't program.
I mean, fortunately and unfortunately, right?
Because if we could be programmed,
well, that's a whole different debate we won't get into.
But if we could be programmed as easily as computers,
then we wouldn't be human, I guess.
Yeah, and I have to say,
people are people.
And from time to time, people are... and I see it every
day at the moment, because in front of my window they are rebuilding a bridge, and they
closed the streets around it. And you would not expect how many people are driving inside the
construction site, even big trucks, and then trying to turn around and so on.
But it says, hey, stop here. The road is closed. But people
are still driving through
the construction site. And it's the same with developers. It's the same with
administrators. It's everywhere the same. And I'm not better.
No, no, not at all. We all
make mistakes. It's good to hear about them, though.
Yeah, and then also, there was this
episode, I cannot recall her name, but about CD version two, right? And this is cool, there
are a lot of cool concepts. And I got her point when she said,
okay, we only need one,
or that people are thinking we only need one big Kubernetes environment
and throw everything,
every, I don't know, development, integration,
or production, pre-prod environment
into one Kubernetes cluster.
But as Andreas mentioned in this episode,
how do you validate if you want to,
I don't know,
upgrade your storage provisioner inside the cluster?
If you want to upgrade your ingress controller in the cluster,
that it will work.
So these are central components of your Kubernetes.
Or if you want to upgrade the Kubernetes version at all,
I would say I'm more on the side to test it in advance
on another system before I'm running it on production.
And, you know, it's the whole shiny new world.
It's blinking around, and everybody wants to use it,
as you already mentioned.
But not everything is gold, right?
Yeah,
and I think this whole thing with, you cannot run everything on one cluster,
actually came from the session we had with Kelsey Hightower, I believe. Or at least
I heard it from him that you always need to have environments, you need to have stages, because you need to test these changes on the underlying platform.
And you cannot just do everything in production.
Considering the time, because we have a hard stop with the recording in a couple of minutes,
Christian, I want to do one quick thing with you.
If somebody starts learning Kubernetes today, or tomorrow, because it's late here:
what are the five terms?
Because every technology comes with new terms, right?
What are the five terms everybody needs to understand
so that they know what this is all about?
And I'll start with one, so they know
what I'm getting at.
Mine is kubectl or kubectl.
Everybody needs to know that this is the primary tool that you interact with Kubernetes.
It's a command line interface to Kubernetes.
What else do people need to know about
when they hear it the first time?
So I would say the Kubernetes objects, all of them.
What is an ingress?
What is a service?
What is a deployment?
What is a stateful set, for instance?
What are operators?
So it's the basic terminology, and also how traffic is flowing around in a Kubernetes cluster.
So how to reach other services, for instance, over the internal DNS.
You know what I mean?
Yeah, this is what I would recommend,
at least that the people know about, right?
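[Editor's note: on the internal DNS point, every Service gets a cluster DNS name, so pods reach each other by name rather than IP. A sketch with placeholder names:]

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend            # hypothetical service name
  namespace: shop          # hypothetical namespace
spec:
  selector:
    app: backend           # routes to pods with this label
  ports:
    - port: 80
      targetPort: 8080
# Pods in the same namespace can call it as http://backend,
# pods in other namespaces as http://backend.shop.svc.cluster.local.
```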
And yeah, as you said, kubectl and also how to monitor
their
deployments. Well, that's easy.
Deploy OneAgent, the OneAgent Operator. And I have to say,
this was one of the biggest things, yeah.
So when it comes to monitoring,
when I saw that Dynatrace released the OneAgent,
I was really happy about it, because before it was kind of, how to use Prometheus,
what kind of metrics do I need to pull
from the API server, whatever it is.
And with OneAgent, it's, yeah, it's perfect.
Can I throw another question?
Not a question, but to this idea, Andy, that you're bringing up there.
Now, again, I'm only in theory.
I'm on the pre-sale side.
I'm not dealing with this in real life.
But based on conversations we've had with, let's say, Kelsey and some others, would it also be a good idea for anybody moving into Kubernetes
to be able to answer why they're using Kubernetes?
Like in one or two paragraphs or less,
why are you using Kubernetes?
And if they can't answer that, don't start, maybe.
I mean, that's a little extreme, but...
And I think this is actually a great point
at the end of Christian's presentation, the last point.
Think about how you want to deploy your apps before you start using Kubernetes.
So think about this first, so not the reverse, but really before you even get started, answer that question.
Yes, it's like I said before: you have to think about what workload you want to run on Kubernetes, what you want to achieve, and what advantage you think you want to utilize from Kubernetes
by running this workload on it, right?
All right, Brian, I know we'll have a hard stop.
That's why I think we want to kind of conclude here.
And Christian, it was a pleasure having you.
I know this is not going to be the last.
I know we also have a
lot of other things planned. We will be speaking at the Red Hat Summit, and the Red Hat podcast has
also invited us, so that's going to be great. I always keep telling you, start saying no
at some point, because you have a lot of work to do in your regular life. But I'm still happy that
you often say yes when we ask you. So thank you so much.
You're welcome.
You're welcome.
And it's a pleasure to be on a show where people like Kelsey Hightower and so on were guests as well.
But do you have a pair of sneakers here?
That's the big question.
Not from the Perform 2021.
I have my pair of sneakers from 2020.
So you at least have a pair.
Okay.
But I'm also wearing, and
the podcast listeners will not see it,
but we can take a screenshot.
You've got the nice,
you've got the nice Dynatrace socks as an employee.
I got the crummier ones.
I got the,
the $2 version.
You have it?
Yeah,
I got it.
Those are awesome.
All right.
Well,
thank you very,
very much,
Christian.
Do you have any social media you want people to follow? LinkedIn, Twitter, anything where you're spouting off all of your brilliant observations? Or do you have a LinkedIn?
I have a Twitter account with a very professional handle. It's called at Wurstsalat.
At what a lot?
Wurstsalat, sausage salad in German.
Oh.
That's great.
And then don't expect any
let's say
useful content there.
Okay.
All right. Really, thank you for being
on. Andy, was there anything else or should we go ahead
and wrap up?
No, I think I'm good. Thank you so much, Christian.
That's really it. I'll see you soon.
If anybody has any questions and comments, you can reach
us at pure underscore PT
on Twitter or send us an email at
pureperformance@dynatrace.com. Thank you so much
for listening, everybody. And Christian, thank you so much
for being on. This was very enjoyable.
Andy, as always, thanks for being awesome.
Bye, everybody.
Bye-bye.