PurePerformance - Why DevOps must not mean Devs On Call with Michael Friedrich

Starting point is 00:00:00 It's time for Pure Performance! Get your stopwatches ready, it's another episode of Pure Performance. My name is Brian Wilson and, as always, I have my co-host with me, Andy Grabner. Hi Andy. I went with a Spanish opening last time, so I can't speak more than that in German, except for a couple of words that I might not want to say on the podcast. But Brian, exactly. Well, I'm pretty sure our guest understood what I was just saying, but at least, you know, good try with your opening. Hey, that's awesome. And I've got to say, because we didn't do it

Starting point is 00:01:01 last time, and I think it happened in between, so obviously this is already way in the past now, but I want to congratulate Austria on Italy beating England. Yeah, because that means we lost against the Euro Cup champion. That's the only reason why we got kicked out early.

Starting point is 00:01:20 Exactly. So congrats to that. I know you were rooting for them for that exact reason. And when I saw that was going on, I was like, hey, Andy will be happy for that. So yeah, congrats to that. Thank you. And anyway, yes, I could see our guest is cracking up at the insanity, because what people don't see is Andy pulled out his Austrian flag again and was waving it around. Yes, it's that kind of podcast. So anyway, since we were talking about our guest, Andy, would you like to introduce him?

Starting point is 00:01:50 Yes. Yeah, of course. And I will actually do it in German quickly. Servus, Michael. Danke, dass du hier bist. [Hello, Michael. Thank you for being here.] And I guess I will switch back to English, because otherwise we lose a lot of people. But I would love, Michael, for you to introduce yourself. Michael Friedrich, developer evangelist at GitLab.

Starting point is 00:02:09 I think we crossed paths a couple of months ago, or maybe it's been longer, around our Keptn project and the stuff that you've been doing. But maybe for the world that is listening in, a quick introduction: who you are, what you are passionate about, what you are doing these days, and then we jump into the topic, which I will introduce in a second. Perfect, thanks. Yeah, I'm really passionate about monitoring and all the ops stuff, which brought me into the Keptn project, which you do at Dynatrace, or you sponsored it and donated it to the CNCF. And the path to monitoring is actually

Starting point is 00:02:52 I started studying in Hagenberg at the University of Applied Sciences, Hardware Software Systems Engineering. So I wanted to learn how a computer works, because I didn't really know. And the turbo boost button on the 486, like you get 66 megahertz instead of 33, was like, it's interesting, I have no idea how this works, and my teachers couldn't tell me. So I went to study. And at a certain point, I moved to Vienna in 2005 to write my diploma thesis and do my apprenticeship at Mobilkom Austria, which is a telecom provider.

Starting point is 00:03:37 And so I learned about NFC, but it wasn't cool back then because there were no mobile devices. So for my diploma thesis, I moved into Linux and video streaming and audio streaming stuff, which also brought me into the biggest student dorm in Vienna, doing network administration and also a little bit of monitoring with Nagios back then. It was not really nice to use, but nothing else was there. And I was learning a different world because my studies were pretty much Windows. And when I applied for a job at the University of Vienna, at the computer center, the Zentraler Informatikdienst, I was learning DNS. This is why my nickname is dnsmichi, because I picked it by job or by my role. And I said yes in a meeting around

Starting point is 00:04:29 like, can you add an Oracle backend to Nagios? Because I had installed Nagios before, but I didn't know Oracle. So I learned Oracle is a database. Coming from my C++ background on Windows, I needed to learn how to code C on Linux or Unix, which is an interesting experience. And so the whole learning journey goes on and on and on. And at a certain point, I wanted to do more. And I joined the Icinga project back then on monitoring, being a Nagios fork.

Starting point is 00:05:00 And in 2012, I moved to Nuremberg for that reason, making it part of my development job. And after a while, or like turning back time a little bit into the future, I also figured that teaching something which I learned is super interesting. So I started doing trainings, and this also involved the Git training, later GitLab training. And so I got in touch more with GitLab. And this brought me into, hey, what is a developer evangelist? Oh, that's

Starting point is 00:05:32 different. I can do podcasts, meetings, events, talks, sharing the things I'm passionate about. And I'm not only doing the developer experience stuff, but I'm also focusing on the ops side of life, which brings me back into my monitoring love, basically, and observability and SRE and all the fancy things. Yeah, this is basically my story, which brought me into an all-remote job now. It's been 17 months now, so roughly one and a half years.

Starting point is 00:06:12 Time is running fast, especially throughout the pandemic, of course. I think we adjusted to virtual events somehow. I'm cheering to Linz, to Austria, because this is where I was born some years ago. But yeah, this was a long introduction. It's great. No, it's fascinating, especially, I love your turbo button, because I always wondered why does it go from 33 to 66? I never questioned it. And why is it always just on? Exactly, why would it never just turn on by default?

Starting point is 00:06:50 Maybe heat or energy consumption people were thinking back then? I don't know, but did you find out? You said you researched it or you wanted to research it. Did you get an answer to what the heck was up with it? I think it was actually overclocking. The thing is I learned more than I was asking for.

Starting point is 00:07:05 So I learned how to design a hardware CPU, basically, or embedded hardware. And I figured in my studies that circuit engineering is not something I'm good at. So for a hobby project where you can break things, it's awesome. It doesn't matter because it's cheap. But if a customer gives you 100,000 euros for designing an FPGA or ASIC and it basically doesn't work or you forget something, you can't just fix it in hardware. It's there and you need to throw it away.

Starting point is 00:07:36 And so I decided I'm moving on in the software area, later changing into the network administrator, ops side. And potentially I know too many things which I've learned in the past 15 years. That's also one of the reasons why I said, okay, development is nice, but I'm changing more into the role of an educator. I'm still doing a little development, learning new things, but I'm providing platforms or providing resources so that everyone can learn and also contribute to open source projects, for example. And I think what I appreciate about the longer intro you gave

Starting point is 00:08:23 is it really just highlights how, when you come into this industry, you don't just go into something and then you're there. People should, and most of us do, feel free to move around to what sparks our interest until we find where in IT we want to settle in. I think that really helps highlight, for people who are starting out, what potential there is. So I think even for that alone, it was worth having the extended intro. And now you're the record holder for the intro, so congratulations. Okay, but is it Michael or Michael? What do you want to go with? I don't even know. Go for the English, Michael. Michael, perfect. So Michael, when we talked a couple of months ago

Starting point is 00:09:11 about this episode, I think we were chatting about what can we talk about. Obviously, your passion of educating people, your passion of talking about operations, your passion to reach out to the developer community. And then you shared your dot-com presentation, which I thought was pretty interesting: From Infrastructure as Code to Cloud Native Deployments in Five Minutes. I think that's a cool talk. I looked at the

Starting point is 00:09:36 slides but in general i think what i would love to hear from you as a developer evangelist is how can we enable developers to truly not only think about the next line of code, but think about the next line of code, how it then actually reacts in the system where it ends up living, which is production. So what can we do as evangelists? What can we do as advocates? What can you, Brian and I do to really make sure that the next generation of developers or the current generation of developers really thinks beyond what happens on their workstation, whatever that may be? And I know you have a great talk where you, I think, walk through different steps. But kind of let's dig into this topic because I think we have a lot of folks that need to run systems in production. They need to take care of monitoring.

Starting point is 00:10:26 But if we are educating folks on the very shift left side of the house to really build systems for that, then I think we will help everyone along the value stream. And so we would just love to hear your thoughts. Where do you get started? How do you educate developers? I think you're overwhelmed by the many tools and the many possibilities you have. So when you're starting your journey, someone says, run a Kubernetes cluster. I'm not interested in that because it doesn't really matter where I'm deploying the software right now. I shouldn't be obligated to think about this in the beginning of my journey.

Starting point is 00:11:05 I should have the tools and the abstraction layers which make it easy to, for example, learn a new programming language or learn how to use Docker or learn how to use containers in a relatively simple setup, but provided as a self-contained learning course or something like that. So it's easy to consume and you also can have the five-minute success. So it's like, you want to see something break and then fix it. You don't want to see something green. That's kind of boring. And so one thing I did in the past was like, how do I approach, for example, CI/CD? What is that? Let's start with the CI part, because the CD part is like,

Starting point is 00:11:49 it's still a little afterwards. Of course, you can combine it and there are discussions around it, but when you're starting out, it's too complicated in my opinion. And I'm also leaving out things like Kubernetes, AWS, hyperclouds, whatever. You just want to run it somewhere at a certain point. And you also want to test it. And it potentially should be just the same,

Starting point is 00:12:15 only a different environment, basically. Because if you need to think too much into the ops direction, it's just like it blows your mind and you feel blocked. And this is something I also felt many years ago, not knowing where to start or just saying, hey, I'm not really sure if I can learn Python, for example. But if you have like a mentor or you have a friend or you have a team member who is able to do it

Starting point is 00:12:42 and you can find a way to learn that together, this is a great possibility. And one of the things we used was, I think it was Cloud9, many years ago, as an online collaboration platform. And we could code live. In the end, we were sitting together in a meeting room on the couch, programming on the TV screen. But yeah, we actually tried to do it in that way, and this was fascinating because we didn't need to install anything locally

Starting point is 00:13:19 and there was no like pip install and then it breaks because version 2 or version 3 of Python and so on. And Docker was not really there yet, basically. Vagrant boxes became a thing, made it super easy or super convenient to test something. So you have a base image. You don't need to install VirtualBox and figure out how to install Red Hat in there from an ISO image where you have no idea where to get it.

Starting point is 00:13:51 This is the past, hopefully, but it made it more approachable. And the Vagrant Provisioner installed all the dependencies, and especially when an existing team got a new team member, the onboarding process was like, yeah, here's your notebook. Here is like maybe a document, maybe something. And then read the documentation of the tool,

Starting point is 00:14:18 basically to install the dependencies. And this is if there is documentation. I wrote lots of developer documentation back then because the onboarding was really hard with C++ and stuff like that. But keeping an eye out for documentation, for making the process easier for others, and also providing a platform where it's easy, where you just fire up something, Docker Compose up, or Gitpod, for example, in the cloud as an IDE, or any other tooling. There are like a hundred competitors, I think,

Starting point is 00:14:57 which provide these server runtimes. But try it out in an easy fashion. And even if you don't have access to a high-performing local MacBook or notebook, you can still run it in the cloud somewhere. And sometimes, or oftentimes, it's for free because they have a free tier. And this was one of the mind-changing things in my journey, that I was like, oh, I don't need to focus on the local dev environment. I don't need to learn Docker. I cannot remember the ADD and COPY thing

Starting point is 00:15:35 or what the syntax is to expose a port, HTTP or something. So in the end, making it approachable and also allowing someone who doesn't have a degree or study background or whatever education to really say, okay, I'm interested in that and I want to learn that. And maybe I figure out why it's 66 megahertz from the hardware side, but putting it on the software side and saying, okay, I want to learn how the iOS application works and how the push notification works.

Starting point is 00:16:12 And maybe I can implement that, but I don't need an iPhone for that. I get all the tools somewhere in the cloud. I think there is a rumor around Apple providing Xcode in the cloud or something, which would be perfect. I mean, all the stuff that you're saying is perfect for developer productivity, developer experience, right? From zero to your first deployment in no time, because as you said, getting people up to speed and showing them

Starting point is 00:16:43 how the tools work and which tools to install. If we can solve this problem, and obviously this is all solvable and people have solved it for certain environments and technology stacks, then we just make so many more people more productive because they don't have to worry about all the stuff they shouldn't worry about anymore in 2021. One momentum was also when we moved from source installation

Starting point is 00:17:05 to packaging. So when I started with Nagios, there were no packages. And when you install something and then the libc or whatever is replaced, and then a segfault, and you're sitting in front of it

Starting point is 00:17:20 and want to eat your keyboard because you feel lost. You have no idea. And then someone says, use a debugger. I'm used to Visual Studio front-end debugging. And I'm like, okay, you have a package, you have debug symbols. I didn't know what debug symbols are back then. So I was really learning from my colleague. And this whole thing around packaging is super complex. So I thought like RPM spec file, I learned that because I needed it for the University of Vienna to deploy Icinga back then. Okay, okay, it works. Oh, it doesn't work because macros and specific syntax to learn. Debian packaging is similar.

Starting point is 00:18:15 When you know how it's done and you have someone who taught you how to do it, it feels logical. But the learning barrier, the entry barrier, is really high. In the end, hopefully you find something like fpm from Jordan Sissel, which does it in an automated fashion. It's not beautiful in the sense of an upstream package that I would tell you is nice, but it works. So you need to find the balance of, it works for me now, I have my first success, and I can optimize it later. You just need to keep the optimize-it-later action as a to-do and not just say, okay, I'm never doing it, like too much or too little time.

Starting point is 00:18:57 And once we had the packages, it was super convenient to install them. And then we had like automation workflows with Puppet or Ansible, Chef, whatever tools are out there. And they have gotten better, more approachable. So it was easy for me as a developer to understand what the Puppet Apply would do. So I'm describing how I want to install software in a virtual machine in Vagrant.

Starting point is 00:19:25 And this was like, oh, I can actually use it as a development environment now, but I could also use it as a demo environment for customers or for users who just want to do it, firing it up, and then learn something. We even used it in trainings. Only problem is the hotel Wi-Fi was not so good. So we needed to download 10 gigabytes of images.

Starting point is 00:19:49 But this is like a process in time. And then you see, okay, the download, the provisioning takes too long. Maybe we can just use something in the cloud. Okay, we need internet, but it runs self-contained somewhere else. Yeah, it's interesting. I think I was growing up in my developer experience in the right decade

Starting point is 00:20:14 to have nothing and now see everything grow. And on the other side, I was not on the, sorry to say that, hype train of Kubernetes and containers in the beginning because we didn't have any microservice application in development yet. So I was a little late to the party. And I also was sitting in front of the YAML manifests in Kubernetes and was like, I don't want to maintain that. And then someone said, yeah, but you actually need to install

Starting point is 00:20:55 that. I said, I have ingress controller, CRD, whatever thing. I just want to use it. And I don't even want to write the YAML, to be honest. I want to copy paste it from somewhere. And I don't need to imagine how a developer feels. I feel it myself because I was sitting there,

Starting point is 00:21:21 I think last year or two years ago. Yeah, it looks interesting. Solves totally my problems, but I have no idea where to start. And I have a Udemy course or whatever online course to learn Kubernetes. I didn't finish it because if you're learning alone, it's boring at a certain point. And one of the things I found out is, trainings are nice because you can learn in a small group, like being a trainer and teaching someone else. But still

Starting point is 00:21:55 it's like, it's for customers and enterprises. So training for community members is oftentimes at a meetup or giving a talk or something else. So I started to create my own mini workshops, mixing talks with workshops and stuff like that. And since we cannot meet currently, I made a weekly tech coffee chat into a meetup, which I call the Everyone Can Contribute Cafe. It started out in German as a Kaffee, which is a German word, not an Austrian word. And from there, we met friends and I met people or I know people who are Kubernetes experts.

Starting point is 00:22:41 And I said, well, can we just do a session on Kubernetes in Hetzner Cloud, for example? And we started that earlier this year, and from there I learned so much from seeing things break, seeing different opinions, like vanilla Kubernetes versus k3s and other variants, that my ops and developer heart was like, oh, that's actually not as complicated as I thought, and now I can try it out and even think about

Starting point is 00:23:16 for example, monitoring. How can I monitor workloads in Kubernetes? You have the big picture always in your head, but if you don't try it out, it's always marketing. It will always sell you something, and you can't quite believe it. Maybe it doesn't, you never know if it really solves your problems. I think this reminds me, what you were just

Starting point is 00:23:45 talking about with your Everyone Can Contribute, and I will definitely link to it, of one of the reasons why Brian and I love doing the podcast so much: because we learn so much from all of our guests. Yeah, things that we would otherwise either not have the time for or never thought of, but really, you know, learning topics. And then in your case with Everyone Can Contribute, as you explained with Kubernetes, I think it's also great to see how people then interact with the terminal, like what they do and how everybody works, like the little tips and tricks, the little tools. Oh, this is a command line I didn't even know existed. And it's just fascinating to just learn from others.

Starting point is 00:24:22 It's also interesting to see the trends that occur. We've been doing this podcast for many years now, and when we think about what we see coming in and out, there are these trends, and I'm sure Andy notices them too, because Andy's the summary writer. But the one trend I'm seeing, and I'm hearing it from you as well, and I've been bringing this up on a bunch of the past shows, Andy,

Starting point is 00:24:43 is again what Cloud Foundry was trying to do in the beginning, which was make a system like people are trying to do in Kubernetes now. And Cloud Foundry, maybe they were ahead of the game, maybe it just wasn't robust enough, who knows what the reasons were, but now it seems like everybody's trying to abstract the toughness from Kubernetes and make it so that, as a developer, you could just go ahead, write your code, push it in, and be done. And it's just an amazing thing. We were hearing these concepts from you,

Starting point is 00:25:15 hearing from a bunch of other people, and I think it's a great trend, and it's showing how the community is maturing Kubernetes to where it needs to be. So that when someone, as you were describing, sits down and looks at it, they're like, I, you know, I tried taking a Kubernetes class as well on Udemy.

Starting point is 00:25:36 And, you know, I, I went through it. I typed in the commands, I did stuff and I had a, a notebook open to write all the, you know,

Starting point is 00:25:44 type in all the key commands that I'm going to need to remember later. And I'm like, oh, I still don't even understand why I'm doing all this, especially because I'm not a developer and I'm not deploying code, but I wanted to get familiar with the concepts. I'm like, but this isn't teaching me the concepts of it. So yeah, it's definitely frustrating a lot of times, but it is great to see the growth of it. It's great to have that experience to say, oh yeah, I deployed Kubernetes on bare VMs without AKS or anything like that. Sure, I did that once. how these things go. I was curious, when we're talking about trying to get developers

Starting point is 00:26:27 to embrace all this stuff, how do we also get them to, not only developers, but let's say architects, embrace not just what environment they need, they want to pick to support their product from a code or operational point of view, but from a monitoring and maybe security or other concepts.

Starting point is 00:26:55 So one thing I see a lot lately is people are making a choice to say, hey, we're going to move to Kubernetes or we're going to move to whatever technology, we're going to do it in this weird way. And then afterward, they start looking at, okay, now we need to start considering monitoring this environment

Starting point is 00:27:09 and we need to start thinking about security for this environment. And when they go back and look at it, they realize it's really hard to do those in the environments they chose because they didn't think of it while they were thinking about what they wanted to do. So with the idea of getting developers to think about pipelines, getting to think about all this other great stuff, how do we then get them to start thinking about everything else when they're making these choices?

Starting point is 00:27:34 Is that something you've been running into? Is that coming up in your world? Or are you seeing it as an issue where they're not thinking big picture, they're just focusing on their one area and then later on get stuck again? Yeah, I think you're describing it correctly, or it's what I'm seeing, and I've experienced it myself as well. I think it's about seeing the benefit of, why do I want to add metrics or monitoring? Because your software isn't finished when you release it. You will be working on it. And most oftentimes, it's not only feature development, it's bug fixing. And you will spend time on trying to reproduce the problem,

Starting point is 00:28:22 maybe running your CI-CD pipelines, waiting endless times for everything to finish because works on my machine is not the same as works on the customer's machine or like anywhere deployed. And one key learning for me was I always wanted to see how much memory or CPU my application was consuming because the customer reported that as an agent, it was too much, couldn't afford it. And I was like, yeah, but I don't know.

Starting point is 00:28:53 How can I look into my application? Because, for example, on Windows, it's super complicated to extract the performance counters and the event log and whatever things you want in a similar fashion. Okay, Linux or Unix is also difficult if you don't know how to approach it. Now try to imagine how to do that in a container. This is an additional abstraction layer, more complexity. How to manage Docker? No idea. A Docker check in Nagios, it doesn't solve the problem. And as a developer, you're looking for metrics, for example, or for a trend to improve it.
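The question of how to look into an application's own CPU and memory use can be illustrated with a minimal Python sketch. It assumes the workload is a plain Python function; `measure` and `build_table` are hypothetical names introduced here, and `tracemalloc` only sees memory allocated through Python, so this is a starting point rather than a full profiler.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once and report CPU seconds and peak traced Python memory."""
    tracemalloc.start()
    cpu_start = time.process_time()
    try:
        result = func(*args, **kwargs)
    finally:
        cpu_seconds = time.process_time() - cpu_start
        # get_traced_memory() returns (current_bytes, peak_bytes).
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, {"cpu_seconds": cpu_seconds, "peak_bytes": peak_bytes}


def build_table(n):
    # Hypothetical workload: allocates a list of n squares.
    return [i * i for i in range(n)]


if __name__ == "__main__":
    _, stats = measure(build_table, 100_000)
    print(f"cpu={stats['cpu_seconds']:.4f}s peak={stats['peak_bytes'] / 1024:.0f} KiB")
```

Running the same measurement before and after a code change gives the kind of trend the speaker is asking for, without needing any platform-specific performance counters.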

Starting point is 00:29:35 I want to see when I change the code that this potentially increases the memory consumption because the algorithm is not good enough, or it might lead to certain API requests being fired in a row and then the CPU load goes up because there is a deadlock or I forgot to optimize the build or something like that. And in that regard, with monitoring which runs in a staging environment where the code I've currently written gets deployed, hopefully in five minutes or a little less time, I get feedback around, hey, by the way, when you merge this now into the main branch, memory is going up and customers potentially won't be happy

Starting point is 00:30:26 because you have methods to add annotations, 95th percentile measuring, and all the important things others care about. I can see that and I can react to that, because normally the workflow is, and this is an old workflow, you fix it in production at 3 a.m. when someone calls you or the customer is calling you. And there is always potential that something breaks. Because if there is a daemon crashing or something else, SREs can take care of it, or the ops team, or those being on call. I'm also saying that DevOps doesn't mean that developers need to be on call. It should be a team action.
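The 95th-percentile measurement mentioned above can be sketched with Python's standard library. This is an illustration only: `p95`, `regressed`, and the 10% tolerance are assumptions made here, not something named in the conversation, and the samples could be latencies, memory readings, or any other numeric metric.

```python
import statistics


def p95(samples):
    """Return the 95th percentile of a list of numeric samples."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples, n=100, method="inclusive")[94]


def regressed(baseline, candidate, tolerance=0.10):
    """True if the candidate's p95 exceeds the baseline's p95 by more than tolerance."""
    return p95(candidate) > p95(baseline) * (1 + tolerance)
```

Comparing the p95 of a staging deployment against the main branch's baseline is one way to get the "memory is going up" feedback before the merge instead of at 3 a.m.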

Starting point is 00:31:19 In the end, as a developer, I want to avoid that something strange happens. But in the beginning, I need to learn it, I think, the hard way, like how to add good logs. Ugly stack traces don't help much, or the user won't see. The ones solving the problem in production, they don't care about the Java exception. They don't care about the Ruby stack trace.

Starting point is 00:31:44 They want to see immediately how to fix the problem. So structured logging or context-specific logging is something I need to learn as a developer. I've watched a talk from Nicolas Frankel a while ago who explained it like, how can I optimize the code for logging? And what does a JSON structured log look like? How can I make this more approachable for machines

Starting point is 00:32:11 so that I can store it somewhere in Elastic or Loki or whatever is needed and immediately see that, especially when we move a little bit outside the developer scope and say, hey, we want to do observability, combining metrics, logs, traces, and also seeing things which are not yet known, like the unknown unknowns. How can I ensure that developers can benefit from that? So they can also access the Grafana dashboards, for instance, and learn that the memory increase

Starting point is 00:32:51 is actually tied to specific other things because the tracing said that the database backend queries also had an impact on that. Or the writing to the Redis caching system doesn't work for some reason. So things you cannot reproduce on a local machine, not in your cloud environment to develop, but you need a real system.
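The JSON structured logging discussed above can be sketched with Python's standard `logging` module. This is a minimal illustration, not the approach from the talk that was mentioned; the `JsonFormatter` class, the `context` key, and the example field names are assumptions introduced here.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Context passed via extra={"context": {...}} becomes an attribute on the record.
        payload.update(getattr(record, "context", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def make_logger(name, stream=sys.stdout):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = make_logger("checkout")
    log.info("payment failed", extra={"context": {"order_id": "A-1001", "retryable": True}})
```

Because every line is a self-contained JSON object, a log store can index fields such as the hypothetical `order_id` directly, which is what makes the log approachable for machines as well as for the person fixing the problem.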

Starting point is 00:33:17 And I think the key thing is, how can I attract developers to benefit from that? So making it approachable so that they can install it or that they just get it. Like, you install Keptn as a quality gate, for instance, and attach monitoring to it. It's complex. But make it as easy as possible with the five-minute success and say, ah, that actually helps me prevent production failure and burnout and other things.
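The idea of a quality gate can be sketched in a few lines of Python. This is not how Keptn implements it; it is a simplified illustration in which the metric names, the budget values, and the `evaluate_gate` function are all assumptions made for the example.

```python
# Hypothetical budgets; real values would come from the team's own objectives.
BUDGETS = {"memory_mb": 512.0, "cpu_percent": 75.0}


def evaluate_gate(measured, budgets=BUDGETS):
    """Return a list of violations; an empty list means the gate passes."""
    violations = []
    for metric, limit in budgets.items():
        value = measured.get(metric)
        if value is None:
            # A missing measurement fails the gate rather than passing silently.
            violations.append(f"{metric}: no measurement available")
        elif value > limit:
            violations.append(f"{metric}: {value} exceeds budget {limit}")
    return violations


if __name__ == "__main__":
    # Hypothetical measurements, e.g. collected from a staging deployment.
    staging = {"memory_mb": 640.0, "cpu_percent": 41.0}
    problems = evaluate_gate(staging)
    for problem in problems:
        print(problem)
    print("gate: fail" if problems else "gate: pass")
```

A CI job could run a check like this after deploying to staging and stop the pipeline when the list of violations is not empty, so the developer gets the feedback within minutes of pushing.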

Starting point is 00:33:46 How can we encourage everyone to think about these things if they have never been a problem for them? Because unless you, as you said, get woken up at three o'clock in the morning, why should I care? Why should I care about memory and CPU? Because in my environment, this has never been a problem. If you say we bake this automatically into the development process, right? Like the quality gates. Just as we do test-driven development, we should also do, I don't know, quality gate-driven development, SRE-driven development, ops-driven development. I think we need to bake these basic concepts into our processes, the tools that we will use for delivery, for CI/CD.

Starting point is 00:34:33 They need to just have these checks. And if you are creating new code and your code is consuming far more CPU or memory than what we assume by default, then you should not even be able to deploy anywhere. But my question really is, have you found a way to make monitoring and all these things more attractive to developers by default, or is the only way really to just provide the tooling out of the box as part of the developer experience,

Starting point is 00:35:03 or do we have to let them fail at least once and then they realize it's important? I think that not only selling monitoring for failure, but selling it in a way of saying, hey, you can provide proof in your documentation that this is the best performing application, or this is a business advantage for you. And you might get like an incentive for that because you did provide good work. So like making it a team effort and motivating the team and say, hey, if you increase the performance by, I don't know, 10% every month, there is like a bonus or there's something else in that regard. So making it attractive.

Starting point is 00:35:53 The other thing is, if you're working in open source and you rely on good feedback as well: performance is like everything else that costs someone money, whether that's compute resources or running locally. If that gets better, you get praise, you get good feedback. So this can be a motivating factor as well, if you're seeking praise and good feedback, because normally you get bug reports, which is depressing at a certain point. So, turning the tables a little bit. But getting there is hard. And I'm also not a super fan of writing unit tests. I know that they need to be done, and test coverage too.

Starting point is 00:36:36 But I'm getting blamed by the CI system when I'm not writing enough tests. As a developer who is sometimes a little lazy, I would love some sort of automated testing. I don't know if AI or machine learning is the right way to do it, because there's still failure involved. But if we think about chaos engineering, introducing failure to the system, simulating the failure and seeing how it behaves is an easier approach for me as a developer. I'm saying, hey, I want to battle test this application, but I have no idea how, because maybe I didn't learn it. In my studies, I learned how to define those extreme test cases, like providing a negative number to a positive-only interface, and check how the situation is handled. But if you don't think about this, or you can't,

Starting point is 00:37:35 because the microservices are running on like 10 different nodes and pods and you have no idea how to test that. To be honest, I don't either. So I'm relying on best practices and systems that just break it. And they might be breaking the pod on the left side and the pod on the right side or just the bare metal system.
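The "negative number to a positive-only interface" case mentioned above is the kind of input a machine can generate without a human having to imagine it. The sketch below is plain randomized input testing (fuzzing), not chaos engineering proper, which injects failures into running infrastructure. The `reserve_seats` function and its 1 to 100 range are hypothetical stand-ins for any interface under test.

```python
import random


def reserve_seats(count: int) -> int:
    """Hypothetical positive-only interface: accepts 1..100 seats."""
    if not isinstance(count, int) or isinstance(count, bool):
        raise ValueError("count must be an integer")
    if count < 1 or count > 100:
        raise ValueError("count must be between 1 and 100")
    return count


def fuzz(func, trials: int = 1000, seed: int = 0) -> list:
    """Call func with boundary and random integers and collect findings.

    A ValueError is treated as the documented, clean rejection. Any other
    exception, or an accepted out-of-range result, is recorded as a finding.
    """
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    candidates = [0, -1, 1, 100, 101, -(2**63), 2**63]
    candidates += [rng.randint(-10**6, 10**6) for _ in range(trials)]
    findings = []
    for value in candidates:
        try:
            result = func(value)
        except ValueError:
            continue  # rejected cleanly, as documented
        except Exception as exc:
            findings.append((value, repr(exc)))
            continue
        if not 1 <= result <= 100:
            findings.append((value, f"accepted out-of-range value {result}"))
    return findings
```

Calling `fuzz(reserve_seats)` exercises the boundaries (0, -1, 101, very large integers) alongside random values; an implementation that forgot the range check or divided by its input would show up in the returned findings.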

Starting point is 00:37:56 I don't care. But if there is like the possibility to, I don't know, open the sandbox and see who is fighting with whom. This would be interesting. And this is why I'm also trying to convince people to try specific tooling around chaos engineering, just to not rely on the human factor of writing a unit test,

Starting point is 00:38:22 but to see how I can create a situation a human potentially wouldn't have imagined. Or, on the other side, moving from chaos engineering into the security side: doing security testing based on experience from others, like a vulnerability database or something else. A while ago, I was running a security scanner on the REST API. I think it was Nessus. And this thing did test things I would never have imagined on the REST API. It also crashed the REST API because the implementation was not good. But this was an interesting experience, because this was running on Windows, and we got an insight via a stack guard violation, I think it was. In the end, this was only reproducible on Windows

Starting point is 00:39:21 with some security scanner. And it made the software better, because at some point we figured out what the problem was and we could fix it, and our customers and users were happy because it fixed the Windows portion of it. And it also improved the Linux version. So, having these tools available and also deeply integrated, or at least having the possibility to say, click here, or automate stuff here, and it's there. You don't need to have a thousand tools or each

Starting point is 00:39:54 tool for one use case. We had been running Jenkins in my past job, and at a certain point, yeah, there's the Git plugin, there's the Docker plugin, just whatever plugin, and then you couldn't update the platform for security reasons because the plugins were incompatible. At a certain point I was like, I just want to have it run. And so it was like, we have a Git server there. It's GitLab. Interesting. Let's maybe learn the CI/CD YAML. Could have been a different tool.

Starting point is 00:40:34 We could have looked into GitHub Actions if they had been there at that time. And it's not really the tooling. It's like, what can I do with the Y-level workflows I'm currently in, and trying to look at what others are doing. Like yesterday, we had Anais from Civo Cloud in the cafe, and we also touched base a little around 100 Days of Kubernetes, learning how Kubernetes works and seeing how others are approaching this learning curve. So I like talking too much, jumping between too many topics.

Starting point is 00:41:17 I think introducing chaos engineering, security, monitoring is important, but do it step by step. Don't try to reinvent the whole environment. If there is a platform providing it, it's nice. But if you're trying to learn it, you need to do it step by step. You cannot enable everything and say, hey, problem solved. It's a learning journey. You need to invest time.

Starting point is 00:41:39 You need to convince your manager that you need to invest time, by providing the results and saying, hey, I can prove to you that the next release is more stable and has performance improvements. By the way, here's a chart, a generated PDF report. Have fun. This is my thinking of optimizing the steps. And then you get the time. And I'm here to praise and teach that way.

Starting point is 00:42:01 Now, one question I had on that, though, was are we asking developers to do too much? Because that's a real tall order for developers to start thinking about. As you mentioned, you don't want to write the unit tests, which means you have to then, even if you're going to write a unit test, you have to think about performance considerations.

Starting point is 00:42:17 If you're going to start thinking about security and understanding the scans, would a better approach maybe be to have a more well-rounded team, where you have an actual performance engineer and someone who's in the security realm on the team, who work hand-in-hand? So you write your code, it goes to them,

Starting point is 00:42:36 they start applying the performance engineering practices. Maybe they're even doing the unit tests, because that's more their area of expertise. So a developer who wants to focus on writing and publishing their code can still do that. And while it's important for them to understand what they're writing for and what their goals are, working in a team, they're seeing the results, they're seeing what that team is doing. But I just think scaling to have all engineers

Starting point is 00:42:59 wear all these hats, it's asking a lot of people. To me, it just sounds like a lot for a developer to try to tackle. I mean, it's hard enough to get yourself out of a loop sometimes, you know? Or am I missing it? No, no, it's a trend I'm seeing myself: developers need to learn everything, and at a certain point you're burning out from that, because as a developer, you're basically becoming a DevOps engineer or DevSecOps engineer, whatever you may call it. And the thing is, the persona is just a single person

Starting point is 00:43:33 or it's like a single-person team doing three different things, or like 10 different things. And if it's not clear to management and leadership that this should actually be three people on the team, they easily burn out from that. So I think we are at a certain stage where I... So if you're looking into tracing or OpenTelemetry and things like that, you need to understand the SDK,

Starting point is 00:44:03 you need to learn how to instrument your code. I really like that we are getting better at it, but I'm currently sitting in front of it and saying, I don't want to do that. It's just too complicated. And the problem is we have 25, 50 different programming languages. So maybe it's not available for my language right now. Is there another way, from the outside? Maybe not, because software architecture, software evolved

Starting point is 00:44:35 over time. It's difficult. You cannot always have the in-memory debugger with an LD_PRELOAD or whatever thing gets done. So it gets complicated. And I'm not super confident in saying, okay, developers need to do all the things. I'm seeing it as, you have a hat on, and hopefully you have an SRE team or a team which does all the ops stuff, but together, who's also capable of understanding how the code works and collaborating with developers, with DevOps engineers, with QA teams, basically all people involved in the software development life cycle. Also, why shouldn't the project manager be able to understand CI/CD or understand the security dashboard? So I think there is a

Starting point is 00:45:33 there should be a shift in roles. But teams shouldn't be minimized just because it's optimized, as in, we have optimized, only one person to pay, amazing, while the others are burning out from that. So it's about taking care of mental health and making sure that you're working the 40 hours a week, or whatever the cycle is, and also ensuring that people are taking paid time off or taking vacation. It doesn't make sense for developers to be on call and then, half a year later, you just say, okay, let's do a sabbatical or let's take the other half year off because you're burned out from that.

Starting point is 00:46:18 Hey, Michael, I would love to point people to your DockerCon live conference again. I think I mentioned it in the beginning. I just have it open here on the side all the time. I make notes and changes. I know we could go on for probably much longer, but I would still love for people to really look at the stuff. Because I didn't watch the recording,

Starting point is 00:46:44 but I walked through the slide deck that you had and I just found it fascinating because you covered, you did a great, great job in the slides alone to kind of follow along and understand the story and what you were telling. So just as a reminder, if you want to learn more, I think in our kind of talk format that we had now, we touched on a lot of these topics.

Starting point is 00:47:06 So just as a reminder again, as we're getting to the end here, this is why I'm kind of talking about this now, bringing it up. Kind of closing words, because I know we could probably have another session on like monitoring observability. You brought up open telemetry. We could probably talk hours and hours about that alone. Which brings me and also reminds me

Starting point is 00:47:29 that our colleague, Henrik Rexed, just, as of today, the time of the recording, published his new YouTube channel, Is It Observable? So I can also highlight this: in case people are interested in observability and OpenTelemetry and Prometheus, everything, check out his channel there. But Michael, any final parting words for people that really want to understand how they can maybe change their mind, how they can become better, how they can invest in developer productivity in case they're

Starting point is 00:48:02 not a developer but maybe in the EP team, in the engineering productivity team. Any other resources we should highlight besides those we've already mentioned? I think watching through the archive of the evolution of KubeCon is a good thing, or the Cloud Native Computing Foundation. Not the marketing talks,

Starting point is 00:48:26 the use cases, where you can follow the journey, you can follow the mistakes being made, you can follow what tools are out there. So there is a lot in the CNCF landscape, and focusing on specific building blocks, or like Lego, if you want to call it that, totally helps. So if you're looking at Kubernetes, pick, for example, K3s because it's simpler to start with. If you want to look into monitoring, choose Prometheus, and so on. So try to keep it as simple as possible. And I would say just, yeah, also talk with others.

Starting point is 00:49:08 If you have the possibility. Right now meetups or in-person events are a little complicated, but try to join communities, YouTube channels. You can also join the Everyone Can Contribute Cafe. Just ping me, send me a DM on Twitter, and we can make it happen. Learn from others, and also do some, not pair programming or pair debugging, but group debugging, hallway tracks, something like that. And don't be shy to bring up a topic you're interested in. If there is something new to learn, it's like, oh, it's a new programming language,

Starting point is 00:49:48 a new tool list. We totally should try it out. And in the process of trying it out, we can send back documentation patches, write a tutorial, continue. Learning together is amazing. And this is what I would encourage everyone listening. Brian, do you have something to summarize?

Starting point is 00:50:04 Because I want to say something in the end. No, I got nothing. I've got to say something. First of all, I really love that, Michael, and I'm saying it with the German pronunciation, that we actually grew up in the same city, in Linz.

Starting point is 00:50:20 I have the fortune now that I can walk outside after this recording and go over to the Pflasterspektakel that you probably know. And they have a little fruit festival, obviously on a smaller scale because of the COVID restrictions that are still in place.

Starting point is 00:50:35 That's nice. It was a pleasure having you on the show. You really made me curious in the very beginning about this damn turbo button. It just boom! I remembered it when I had it. But really, thank you so much. I took a couple of other notes, and I want to highlight one. For me, it's very interesting because we talk a lot about SREs and DevOps,

Starting point is 00:51:03 but you made a comment and said, while SREs and DevOps can do a lot of things through automation, they cannot fix bad code, breaking code, especially if there's no evidence that it is the code. This is where it comes back to,

Starting point is 00:51:17 you need to invest in proper structured logging so that you actually give people the right evidence of where to look, that it is actually the code that is broken, and then invest in proper monitoring and making things monitorable so that SREs and DevOps can actually do the job and use automation.
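As an illustration of the structured logging mentioned above, here is a minimal sketch using only Python's standard `logging` and `json` modules. The `JsonFormatter` class, the `build_logger` helper, and the convention of passing fields through `extra={"context": {...}}` are choices made for this example, not something described in the episode.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} (a convention chosen here).
        context = getattr(record, "context", None)
        if context:
            payload.update(context)
        if record.exc_info:
            # Include the stack trace so the evidence points at the code.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Create a logger that writes JSON lines to the stream (stderr by default)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

With this in place, a call such as `build_logger("checkout").error("order failed", exc_info=True, extra={"context": {"order_id": "A-17"}})` inside an `except` block emits a single machine-parseable line that carries the order ID and the stack trace, which is the kind of evidence an SRE can search and alert on.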

Starting point is 00:51:33 So I think the biggest challenge is breaking code, because SREs and DevOps can't fix it, especially if they don't even know that the code is the problem. And I think this sums it up pretty nicely: this is what we need to get into the heads of people.

Starting point is 00:51:48 You need to develop, you need to design, you need to engineer software that is able to run smoothly in an automated way in production. And there's a lot of things we need to do to educate people. And it's great that you use your channel with Everyone Can Contribute. We are trying to do our part with Pure Performance. Henrik is doing his part with Is It Observable?

Starting point is 00:52:10 I hope people learn something from us, as we learn from people that we interview. Awesome. Thanks for having me. Yeah, and you know what, Andy? I just got to add to the Boost thing. I had to get a new Intel Mac recently because I can't get the M1 yet.

Starting point is 00:52:27 And I was noticing on the CPU, it's got the regular clock speed, but then it says boost up to... I was like... I remember registering at the time. I was like, oh, is that the old days? I'm like, I don't think there's a button for that. I think it just has the ability

Starting point is 00:52:42 to self-overclock when it needs to, obviously, if it can overclock without overheating. But they still have it in those Intels. Anyhow. You should probably monitor that. Yeah, but that's just too much trouble. I actually got a hint,

Starting point is 00:53:02 because I keep my laptop closed. I just use monitors for it. And it doesn't cool as well that way. So I took a hint from one of our colleagues and put a bunch of cooling fans on the back of the laptop. And then I turn a little fan on when I need to. But yeah, either way, I sometimes monitor the temperature. But anyway, way off topic. Andy, everybody, thank you so much for listening.

Starting point is 00:53:23 Michael, I probably got that totally, terribly wrong, but I attempted to say it the German way. If it sounded really poor in German, let's just say I did it the Austrian way. Everything is fine. Yes. Thank you so much for being on the show, and for everyone listening, thanks for listening. Andy, I think this is our 139th episode. So we're getting close to 150 episodes. And thanks for listening. If you have any questions, comments, pure underscore DT on Twitter

Starting point is 00:53:54 or pureperformance@dynatrace.com for emails. Thank you, everybody. And talk to you all next time. Bye-bye. Thank you. Bye.
