PurePerformance - 017 Features and Feedback Loops @ Dynatrace

Episode Date: October 25, 2016

Guest Star: Anita Engleder - DevOps Manager at Dynatrace

In this second part of our podcast, Anita gives us more insights into how new features actually get developed, how they measure their success and... how to ensure that the pipeline keeps up with the ever increasing number of builds pushed through it. We will learn more about the day-to-day life at Dynatrace engineering, but especially about the “Lifecycle of a Feature, its feedback loop and what the stakeholders are doing to make it a success”.

Related Links:
Dynatrace UFO: https://github.com/Dynatrace/ufo

Transcript
Starting point is 00:00:00 It's time for Pure Performance. Get your stopwatches ready. It's time for Pure Performance with Andy Grabner and Brian Wilson. Sorry, I'm just trying to start a little more energetic than last time. What are your problems? Still the same thing. It's about an hour later and the coffee's kicked in, so I wanted to start a little more energetic this time. Sorry about that. I know we do goofy things at the beginning, but it's...
Starting point is 00:00:44 No, you do goofy things at the beginning but it's no you do goofy things not we you that's right because i don't think you understand what goofy is um thank you yes uh we do it all for you the listener to try to make it entertaining in our own pathetic way um and there's nothing better than self-deprecating humor, I guess. Anyway, we are back with Anita. Oh, my gosh. I just totally pulled an Austrian move there. I added an R to Anita's name. And that's just going back to, you know, Java, which, yeah, anyhow. It's an inside joke, everybody. So we're back with Anita, and we're going to continue our conversation.
Starting point is 00:01:27 And of course, Andy's here because it's a few minutes later. We're giving away all the inside secrets of how we do the podcast right now. Right. So, yeah, Andy, would you like to use your wonderful summary skills that we just talked about to just in case somebody didn't hear the last episode? What did we speak the last episode what what did we speak about last episode well first of all you should go back and listen to the first episode because it was i think very enlightening on how burned who set up the goal of potentially being able to deploy a change into production within an hour actually transformed the damage waste engineering team over the last couple of years so So now we do wide deployment every two weeks.
Starting point is 00:02:08 So we deploy sprint builds every two weeks into production, but we have the chance because we have an automated deployment pipeline to be able to deploy stuff into production within an hour if necessary. And I think what Anita also told us, it was very interesting, like what the role of her team is, is providing the pipeline, listening to the individual development teams that need to push code through the pipeline. Actually, they are features, so we're organizing feature teams, which now actually brings me to the second part and the second topic of this episode now is – and I want to call it a lifecycle of a feature, and very important, it's feedback loop. Because, Anita, you stressed the fact very, very clear. And also, Bernd said, he said, not only do we want to do continuous delivery or continuous deployment,
Starting point is 00:02:58 we want to also have a continuous feedback loop so that our feature teams actually know if the feature that they built is A, working well, B, is used by users, and C, do we need to maybe optimize it because it's not running as smoothly as possible? So, Anita, enlighten us, please, on what's the lifecycle of a feature? How does this work? What do development teams need to do to actually get these feedback loops in place? How do you help them? And now I stop talking because I want to hear you talk.
Starting point is 00:03:28 A lot of questions. I mean, this needed a long answer, I think. But, yeah, where should I start? What do I have to do in order to get the feature in place or to get the feedback? So the one thing is, first of all, we are in a very lucky situation. We are building a monitoring tool. So our dev teams and our developer, of course, are very familiar with monitoring, the application monitoring,
Starting point is 00:03:56 and how you do this in the right way. So actually they start with the right mindset already and know what's needed to monitor a feature. This is one very, very positive thing. The other thing is there's a big difference compared with the way we worked before. So before we released twice a year, and typically when as developer, when you build here a new feature, you have half a year time to build a feature, and then it's out the year time to build a feature.
Starting point is 00:04:29 And then it's out the door and everybody gets this feature. So it's G8 immediately to all our customers. So this is the typical feature lifecycle when you release twice a year. And we have with continuous delivery a big advantage. So you can think about or you can actually be very creative how you can bring out the feature. For example, you decide, okay, you bring it partly out so you focus on a minimum version or on the key values and make it opt-in for customers and try to find out who opt-in
Starting point is 00:05:03 and how the customers are using that and then go ahead. So you can start with a very minimum version and do not have a very detailed idea how this feature has to look like. Also, you can monitor the performance in production because just a few customers are using that and then start to optimize this. So it's a much, much cooler way from a development perspective to bring a feature out. You can define your own stages. You can define what's next once you have the feedback and do not have a perfect plan for
Starting point is 00:05:37 the whole feature. That's really a great change. And really, I think a change the developers are very happy with to see how the feature is working out there and then decide how to go ahead together with the product management and so on so this is maybe the the first big difference and the first big change can I ask a quick question here because obviously in order for this to work, the architecture of the software and the pipeline also needs to support that. And I think it's hard to say, hey, I have a big monolithic application and now I'm doing this. Because obviously the architecture and the frameworks that you guys have been building enable developers to do exactly that rolling out a feature enabling it for a certain
Starting point is 00:06:25 part of users or doing the opt-in opt-out feature flagging feature toggling ab testing that's something you had to build into the into the software itself yeah that's true it's not built by my team but actually we had very at the beginning the request okay this is a feature i like that customers are opt-in or just like to turn it on for for two early access uh customers and then we actually built this feature toggle framework that everybody can now use so this is how how this gets created so we work with feature toggles with uh feature flags uh so that you as developer can decide how you what stages you like to have and how you can bring this this feature out of course product management is involved in that as well because sometimes it's not
Starting point is 00:07:18 a sometimes it's a feature toggle so we announce this maybe on our blog that there is a new feature out there or there's a new UI screen. Show this switch to this new UI screen and then you can switch back. So there are so many different ways how you can bring a feature out, how you can stage it, and how you can consume feedback to decide what's next to bring the right focus in. I think these feedback loops are really amazing, too, in a way. Hearing the story goes back to a lot of my stories go back to my previous job right um it was a towards the agile was just starting to take off so it was very monolithic uh and towards the end we had a new feature that was being rolled out and it was this really really large feature um you know trying to make a community kind of
Starting point is 00:08:19 section they were they were trying to jump on top of the concepts of facebook like hey instead of just having message boards let's make it a community. And it was this gigantic release that went way, way, way over time, was just played with problems. And we finally got it shipped and people aren't using it because there's a performance issue or maybe they're not using it because they don't like it right performance is fine so maybe it's just because they don't like it so you can do all those toggles and switches i i just it's a much different era now i think and i think it's it's just really amazing how this can all be done now and i love i love hearing about that i also think it's what i what i thought Anita, what you brought up is the blog. So it's kind of also a new way of socializing new features using the direct channel to the end user to say, hey, there's new stuff. This is what we've built so far.
Starting point is 00:09:35 Try it out if you want to. Give us feedback. And obviously on the one side, you get feedback through the tool itself or the software itself because you've built monitoring in. But then also using the other channel of proactively say, hey, give us feedback. feedback through the tool itself or the software itself because you've built monitoring in but then also using the other channel of proactively say hey you know give us feedback we want to work with you because we want to build stuff that you actually need and that actually helps you forward that's definitely definitely cool we also like so when i say we are really development likes our answers forum for for example.
Starting point is 00:10:07 It's for more than one year, it's already official. So everybody can look in this. And we really like to have the discussion with our own customers and developers like to have the discussion there. So this is really cool feedback we get there as well so direct feedback and sometimes changes as well the understanding of our own product so that the customers try to use it somehow in a totally different way we never thought about and sometimes we we we find out okay this is really cool we should think about this and maybe innovate in this area so that's really cool i wonder if there has to be a new term because what you were just saying there is that the customers are interacting somewhat directly with development. And that kind of takes DevOps to a whole different level.
Starting point is 00:10:56 It's, yeah, again, where things are these days is just really, really amazing. And I'm glad to hear we have so much of this going on i think i think the industry calls it user-centric development or customer-centric development or innovation where you basically put the customer in the middle and then ask them what they want and you start building it and then you roll out the first stages and then with feedback from them move it into the right direction i always use the example now the way we used to take pictures years ago. We had our cameras with a 24-picture film and you took pictures and then weeks later
Starting point is 00:11:33 when the film was fully developed and you got the feedback and then you were surprised that the pictures didn't turn out as well as you thought. Now you take a picture, you immediately have the feedback on your iPhone. You can even, or Android, I don't want to dismiss the other parts of immediately have the feedback on your iPhone. You can even – or Android. I don't want to dismiss the other parts of the world that are not using iPhone.
Starting point is 00:11:50 So you take the picture on the phone itself. You can apply filters like color filters. And then you post it on Instagram or Facebook, getting the feedback directly minutes after you developed, quote- unquote, that feature from your friends, if they like it or not. And if they like it, you may take more pictures in the same scene where you are right now. If they don't like it, maybe you are removing it because you don't want to get negative feedback. But, you know, that's kind of the difference now. And that's the cool thing. So continuous innovation, optimization, that's what I call it.
Starting point is 00:12:23 Now, Anita, we know that you do feature flags. What are the technical vehicles that the engineering teams then really get the feedback? I'm sure they're using Dynatrace, obviously, on the one side, but I'm sure there's more what they do. What do they do? So, actually, we continuously innovate on this area. So, they bring up always new ideas, what feedback they do so actually we continuously innovate on this area so they bring up always new ideas what what feedback they like to have but personally i categorize this in in different areas the one is the typical feedback also operations is typically looking at these are performance metrics like cpu memory gc suspension time whatever you you have there this typical metrics then the the other area
Starting point is 00:13:07 is this custom metrics uh development building for example monitor queue sizes of a special queue so that the queue is not too long or that there are not no element not a lot of elements kept out of the queue or something like this is custom metrics we also use here dynatrace to dashboard them to monitor this to alert on them and the other thing we are currently really focused on is log feedback so I have to give you here a bit a bit a background why I call it log feedback and not log analytics or something like this. Before we started this, log was somehow a different method they used because typically developers add log when they start with a feature. So this is very, because it's very simple to add some log
Starting point is 00:14:05 lines, to add some info logs, severe warning or debug logs so that you can turn it off on. But typically they just use it on their local environment or some small staging environment because looking through log files is very time consuming. And the other area where they were used to or where developers are used to work with logs is when something severe is happening and they get somehow an archive from an operations guy or something and then they have to look all through all the log files, try to find out that they have the right log files, that they look on the right time frame, filter out logs that are not the root cause, maybe just a symptom, or where severe logs that were happening weeks ago
Starting point is 00:14:59 and still happening and have nothing to do with the severe issue happening now, and so on. So this was the area where developers were used to work with logs at the beginning when they developed feature or changed something and in this severe war room scenario. But we find out that logs actually are very, very helpful because you have this in a very early stage of a feature when a feature is not really fully ready and the whole metrics, the feedback metrics,
Starting point is 00:15:29 custom measures or so on, are not fully ready. Or actually, sometimes the development and product management have not yet the full idea of what we have really to monitor on this feature. But we have logs. And the tricky thing on logs, sometimes there's really many, many logs. First of all, we have different stages. We have our dev stage, we have staging environment, we have load test, and then we have this big production environments.
Starting point is 00:15:58 When you like to look on all these logs somehow on a daily basis, you invest your whole day maybe and it's not a funny thing so what we did for example and this brings brings me to the example where i said okay sometimes we build something special for it and sometimes we improve our own product and if you know dynatrace we have now log analytics in there and this is also a reason why we have this in there because we found out we need it for ourselves. We need this as a continuous feedback, as a continuous monitoring, so that deaths really are able to continuously have no special metric for this, or if there is something strange and they have to deeper look into that. So log analytics is a very cool, fast feedback loop. They typically implement that and it's very easy to consume that if you
Starting point is 00:17:06 have a cool tooling for that yeah this is an important point i want to want to bring in i know that log analytics is sometimes sometimes always uh boring because it's many files and you have to really combine them and so on no i think it's not i think it's not boring at all actually i think it's great boring at all, actually. I think it's great that you bring it up because I think from, and you said, we actually, based on the need that you brought up, we actually build more log analytics features
Starting point is 00:17:35 into our products. And that's the cool thing of what you said in the beginning. We are so fortunate. We are actually, we are a monitoring company and we use our own products to make the life of a software engineer that works in a DevOps, NoOps-y kind of setting cloud-native. I'm throwing out all the cool words now. So we basically innovated not only the way we develop software, but we also innovated our software that helps us, but also will help our customers in the end.
Starting point is 00:18:11 And I remember in the early days of Dynatrace on the AppMon, one of the coolest features I always thought is actually capturing log messages in the context of a pure path. So instead of looking at thousands and megabytes and gigabytes of log files, you can say, hey, there's a log message that is now coming up. And not only that, but you also see the pure path. You see where it actually happens. And you guys obviously now also built this into our new Dynatrace SaaS and managed offering to put log analytics as one key component into the mix because it just helps. It helps figuring out if something is wrong or not.
Starting point is 00:18:43 Yeah, I think that's a great feature i've always loved that one too because you instead of coming through every single log you would see this error 500 times right and then you could drill down to every single place that that occurred um i've always loved that that yeah so it's awesome and the coolest thing andy was was mentioning already i fully forgot about that because it's somehow natural for us to really have a click away the information how many customers or real users are affected by a special warning log. So as a developer, I see, okay, oh my God, there's something going crazy. I have a lot of warning logs. It's not as expected.
Starting point is 00:19:24 And check how many how many real users are affected of that so is there a response time declaration because of that or something else so just to click away to have this in one tool and then you know okay there's no real user affected i can sit back relax and think about what what's doing next and how i bring maybe a change very fast to production in order to avoid a custom effect so this this is really cool and not having different tools and combining the information and trying to find out that i look on the very same time zone maybe you have a different time zone on different hosts and you have to combine them and always look on the wrong
Starting point is 00:20:01 time so it's was really time-consuming before we had that. And maybe I have to say this now because it's just – I want to say this now. Okay. A little pun to our friendly competition out there. I think one of the cool things is if you see like log statements are coming in and then you can say, what is the end user actually that experienced that? Not only do we see the end user, but we also now see the full path of the end user until they hit that point. To your point, what you said earlier, because you said sometimes you're surprised how our users are actually using our software in a total different unintended way. And because our claim to fame has always been, we want to capture every single transaction all the time, every single click, we can actually say, ah, look at
Starting point is 00:20:50 this. Those people that actually get the warning are those that are using the feature in this way versus the other way. And I think that's the cool thing of having, you know, building monitoring tools and now being able to help development feature teams not only to understand there is a problem, but also understand how end users ended up with that problem. Because maybe it's a totally different path that nobody thought about. And this is only possible if you have all the data available. And that's kind of my little pun to our friendly competition out there. You have to have the POV.
Starting point is 00:21:23 You have to have everything all the time sorry yeah that's true nothing to add andy so um so that means logs logs are critical that's awesome that's great um now the what happens if they are developing a new feature and they see something is totally not right? Then it's obviously great if you can actually deploy changes into production very fast, right, what Bern said. Within an hour, it should theoretically be possible to deploy a change.
Starting point is 00:22:00 Is this then a mechanism that the feature teams are using? If they deploy something and they see, oh my God, hell breaks loose, then they change it? Definitely. We have such a fast lane in order to give development and the whole team the possibility to really react very fast on that. We often decide not to use that. We only do this in an emergency case. So the important point or the most important component actually we have out there is our so-called agent. So this agent is sometimes injected in the customer application.
Starting point is 00:22:39 It means when this type of component is going crazy, we in the worst case affect the customer application. And this is then really an emergency case. And there we have to really be very, very fast in fixing this in order to not affect the customer's application. And this is where we have this goal to bring out a fix within one hour. Of course, this means one hour after the commit and not one hour. Okay, you have a few minutes time to think about a fix and then push it out. Maybe it's then the wrong fix. So it's, of course, important to have within this one hour also a test pipeline to have
Starting point is 00:23:22 the important tests there to guarantee that the quality that is going out in such an emergency case is fine and that do not make the situation worse. Yeah, this is what we are currently trying with bringing a fix out within one hour within an emergency case. Sometimes we decide not to be so fast because, for example, we have also a component we call cluster that has tenant capabilities, means there are a lot of customers on it. And for example, if I see that there is a UI screen
Starting point is 00:23:56 broken for one or two customers and we have a fixed available very fast and we could deploy it immediately, we sometimes decide to do a longer test cycle or to bring it first in a staging environment to look not only if it's fixing this broken UI, to also verify that there is no side effect for the other customers. So sometimes we somehow pause it and look deeper in it
Starting point is 00:24:20 in order to avoid that we affect other customers that are not yet affected or make the situation worse. But we have the option to be very fast when it's needed. That's an important point. And I wanted to ask here, traditionally in the old school, it was always if something was broken, you roll it back. And as CICD has progressed, you had your Googles, your Amazons, you know, all the big players didn't, you know, started abandoning the idea of a rollback. They would roll forward with a fix. It sounds like from what you're describing that within Dynatrace, are we fully on board and practicing a roll forward with a fix or do we still have sometimes a rollback in place yeah it depends on the component when we talk about the component that is not running in the customer application or the actually the agent component we we only besides that we only
Starting point is 00:25:20 do roll forward we do only roll forward on cluster side and all the other components beside the agent. The reason is here as well is that a roll back is something that sometimes can go wrong as well. Think about you have a change in there where you change the database schema or some new values in the database and the old version will not understand that. So maybe a roll back can make the situation really worse. So therefore we don't do this on the cluster side on the agent side we allow also a rollback so means the customer is allowed to install also an older version because we are compatible with old agent version at the moment we have agent versions out there this is a nearly one and a half year old and it's working perfectly so we allow customers to do this in order to help themselves in case they see something in staging or something like this or also in production.
Starting point is 00:26:13 They are allowed to do this. But of course, we try to fix it on the new version because nearly all customers out there are using the new version of the agent. And going back is actually not a solution. And also a good point is here. I remember the time where I started at Dynatrace and typically the severe support tickets that are coming in were agent issues. So when the customer application is affected. And it was always a really time consuming and harrassing when the customer had a very
Starting point is 00:26:44 old agent. Because what the developer had to do here is, first of all, checking out a very old branch, try to remember what the heck this code is doing there, what I implemented two years ago, no idea, and try to find out what side effects his fix would have for for other parts of this of this component so it's really hard to make a decision what's a secure fix if you have an issue on a very very recent version so something you implemented four weeks ago you're typically very fast in providing a secure robust fix that helps because you have it fresh in your mind what this agent or what this component is doing and you're very fast. And this is what I see in a daily business.
Starting point is 00:27:32 We have more than 85% of the agents out there are on a recent version. And when we have an issue on a recent version, the agent teams are very, very fast to really provide high quality fixes. If it's an old version it takes longer yeah so actually and this this is this is interesting too like it just reminds me of always i think the claim to fame from apple is always that they get their people their users very fast on the latest versions of the ios which makes it much easier for them and also for application developers to know that 80 percent of the ios community is it much easier for them and also for application developers to know that 80% of the iOS community is going to be
Starting point is 00:28:09 on the latest version within days after the update. And where on Android, it's sometimes a little challenging when we talk with Android developers. It's always hard to support all these different versions that are out there. So having things on the latest versions makes it a lot easier. Hey, Anita, I have – now the feedback loops from production are amazing, right? And it's needed because basically when you develop a feature,
Starting point is 00:28:36 you need to see how they're using it out there. But what are the feedback loops that you have in the pipeline? Because what I'm always preaching, what I've been preaching over the last years, is you need to inject quality into the pipeline and actually not only look at functional results in making a build fail and use that as a feedback loop, but also injecting what I call architectural checks and performance and scalability checks into the pipeline. So for instance, if you run some unit tests, some integration tests, and all of a sudden you figure out that feature is now allocating 20% more memory
Starting point is 00:29:11 or is making 50% more database calls, then this is also, even though functionally everything is correct, it should not be pushed forward because it will mean the feature is that much more costly out there. Is this also something you do in your pipeline, that you're doing these checks? Yeah, we're doing these checks, but we are doing this not for every commit. It would be too much. What we are having in place
Starting point is 00:29:33 is we have daily load test environments where we're doing that. So we have load simulators where we really have the very same synthetic load and can really say, okay, it's now 20% more. And it's not because the usage is different here or the load is somehow different. And we have also a bit longer running load tests. So this three or four days load environments, this is actually needed because we have our agents there.
Starting point is 00:30:00 And sometimes you see an effect when it's running for days. So we have actually these two environments where we get daily feedback of a change and can daily react on that. And then we have this longer running Lotus environments where we check that. So yes, we do this, but we do not have this implemented in this pipeline that is where every commit is running.
Starting point is 00:30:24 So we do this on a daily basis actually cool yeah and it's it's actually those were lines i mean the brian mentioned we had adam from capital one on the call like recently and he was also talking about the pipeline and and he was basically talking about the problem that they do a lot of load testing at the very end but these load tests run you know they take you know 15 minutes half an hour an hour and if you would execute these load tests on every single comet and you get 10 comets per hour and but the load test takes an hour then obviously it's hard to fit in all of that and there's there's two ways to mitigate that one is they parallelize load tests right right? They use Docker containers to actually spawn up multiple kind of load testing environments in parallel.
Starting point is 00:31:11 But on the other side, what they actually also do is they say instead of always running a full load test, we are shifting left some of these performance checks by just running unit and functional tests, which we do anyway, but then inject tools that can give us metrics like number of database statements, number of objects allocated, because a lot of these performance problems can be detected early on when looking at the execution of these unit and integration tests. And therefore, you can stop bad builds early. That's kind of our story. Stop bad builds early because you can already, by looking at the right metrics in your unit test environment and your functional test environment,
Starting point is 00:31:54 already detect a lot of problems that you would otherwise find in longer load tests. But I totally agree with you, and I like the approach that you guys have, that you have your longer running tests where you can see long time effects um and things like caches um and things like um synchronization problems and and these and things like the queues that kind of compile up over over the course of the time these are things obviously where you need longer running load tests and that's cool that you guys do that as well. Nice. Now I know what your question was pointing on. I also remember the slide you were presenting typically
Starting point is 00:32:30 when you were telling the story. I completely forget about that because I said at the beginning we were already good in continuous integration and actually this is how we do our big tests to monitor as well the the performance of our big tests and if there is a difference uh and um yeah somehow a baselining we are doing there so this is something of course we still do perfect yeah that's awesome and then and now i always bring the example in my presentation these days i always show a picture of our pipeline state UFO, or UFO as we call it in German.
Starting point is 00:33:08 But the pipeline state UFO that is kind of this funky little thing that hangs on in our hallways and basically represents the status of our pipeline, the current sprint, and also the trunk, and the green, the, and the kind of the shades between. And I think that's pretty cool that we actually show the status of our pipelines at any given point in time to our engineering teams in the hallways, right? I always tell the story, if I'm a developer at Dimetraise and I hit the commit button and then I walk over to the coffee machine and then all of a sudden I see that this light goes from green to red, I know I'd rather get my coffee fast and run back because I just broke the build. So that's the fastest feedback loop. And that's what I love so much about it. I think that's also why a lot of people like the UFO idea.
Starting point is 00:33:57 Some other companies I've seen, they have like these emergency lights that you normally see with firefighters. It's like, wee, wee, wee. Br brings it to an extreme but uh you know it's like and you know i do want to mention because we had a chuckle about ufo versus ufo um and i do want to defend our our way of saying it only because it the letters stand for unidentified flying object so it is a three-letter acronym it's not really a word. I just had to go on record with that because, you know. Anyway, one thing I wanted to bring up, you know, in the last episode at the end,
Starting point is 00:34:39 and we're just about at the end again, I wanted to talk about that whole, you know, reaction to the idea of being ready for a release or being release ready every hour, you know, should have come to it. But I wanted to frame it a little bit different, right? So I kind of wanted to, Anita, I wanted to find out, you know, what maybe your reaction and maybe some of your peers reaction was to that announcement. But now that you're there, looking back, how would, how do you see, you know, what that transformation was? Like, was it as tough and crazy and ludicrous as you thought? And did you have to, you know, work through a whole ton or, you know, a whole ton of obstacles and, you know, grand enemies,
Starting point is 00:35:20 you know, if this were a video game kind of of thing or did it actually go a lot smoother maybe than you expected because within the process of getting there maybe you had feedback loops of your own in that process fitting how would you describe that whole experience yeah i can very well remember the the meeting so it was in we call it all deaf meeting so that the whole development is is watching it uh whennd was announcing this goal. And I can very well remember my thoughts, and I can very well remember what I thought when I saw in the faces of my development colleagues. I personally, let's start with what I thought.
Starting point is 00:35:59 I personally thought, okay, it's good to have challenging goals, but we never achieve this one hour so if we have one day we can be happy with that so this was was my personal thought on that um yeah but we were actually used to have challenging goals goals in order to be happy to achieve 50 percent because it's then better at as as our our as others are doing that. So this was my mindset on that. Of course, we were fighting for this, of course. And then I looked in the faces of my development colleagues,
Starting point is 00:36:38 and we also talked after the session at the coffee machine. And actually, the common opinion was like, okay, Bernd is somehow crazy. Everybody is allowed to have dreams. Let's see what the future brings. Yeah, this was actually the reaction on that. But to be honest, I'm really happy that Bernd was really announcing this in a way
Starting point is 00:37:02 we have to do this. There is no excuse. We have to fight for it. We have to find a solution for it. Because two years later, actually one year later, we were already there. So we really found a solution for that. And we really were fighting for that. And the biggest trick we did there is actually we all read the whole continuous delivery
Starting point is 00:37:26 Bibles that are out there and all these Bibles were saying okay the trunk has to be release ready at every time and we never thought that this is possible because we know that it is challenging enough to have every sprint every two weeks something we can deploy and where we can demo all our new features. So this was challenging enough, but we were already good in that. So we did a little trick and said, okay, this was what the outcome of every two weeks, what we need to demo a sprint. This is the base that have to be release ready at every time. And on top of that, we built a very fast build and test pipeline to bring fixes out.
Starting point is 00:38:21 So it means every two weeks we made it release ready and bring these new features out and be fast in fixing this on top of this code layer. So this was the place where we started. And I think as we officially go out with our product, we were in a range of two or three hours bringing a fix out on top of this base. And once we had that, we started with the project that the trunk is always deploy and release ready, at least in our internal stage, and everybody's dependent on that.
Starting point is 00:38:47 And we are good in that as well. At the moment, we nearly every time are somehow ready to deploy directly from the trunk. We don't do this, but we are already there. We could already start with this project. I don't know if we want that. Hopefully Bernd is not hearing that. The next challenging goal.
Starting point is 00:39:06 But yeah, this was actually the feeling. So we did a little bit of a trick. So we don't try to copy the unicorn. So as Andreas is often telling this story. So we made our own whatever, not unicorn, our own little Tony or whatever. And this was working. My little Tony. So the interesting thing, too, is obviously a lot of this was all because of the hard work
Starting point is 00:39:34 and the skill set and the desire and drive for you and all the development teams to get this done. That's obviously probably one of the most important pieces of getting this done. But what I found interesting in, you know, between your talk and our talk with Bernd, I think one of the key drivers for any organization to really get this done is not just having the talent on the ground who can execute this, but it's having the support and the drive from the upper level, from the C-level to say, this is what we're going to do and we're going to support you in doing that. And then that then gives you all the support, the ability, maybe even like saying, hey, some things are going to slow down
Starting point is 00:40:15 while we make this transition, but everyone is supporting this in the entire organization. And I think that's really one of the biggest keys to achieving this kind of thing. Yeah, that's definitely true. Without the focusing that Bernd introduced in all our minds, that we have to focus on being fast, automate everything, no manual touches. Yeah, without that, we would not be there where we are. Because sometimes it meant that we have to postpone features. Sometimes it meant, okay, although everything is breaking our staging environment and we cannot bring a single feature out, everybody's focusing on the process that it's not allowed to demo locally or whatever so this was really really important to have uh to have burned here to have
Starting point is 00:41:07 the management here having the very same mindset not trying to find an easier way yes hey um i know we are almost at the end from our time budget perspective um but i want to i think i want to propose multiple things here first of all i want to propose multiple things here. First of all, I want to propose you come back to another podcast because what we have not touched upon at all, a little bit maybe, but not at all, not that much as people would maybe like as a new episode is the tooling. I think it would be very interesting to talk a little bit of what tools do we actually use to actually automate the whole pipeline so that could be another topic the second thing i want to propose to you now anita is we should i know we're doing a webinar on this
Starting point is 00:41:51 as well in november but i think we should get you on the big stage and we should have you at perform 2017 so this is a shout out now first of all to you hopefully you say yes yeah let's see we'll think about it. Because it's going to be in February in Vegas at our annual conference. So hopefully Vegas buys you over. But I think it would be great to show this to our whole user audience out there. And to conclude all of this, I want to just say one more thing that I know. Like a week or two, three weeks ago, you had a discussion with a guy from a German university, I believe.
Starting point is 00:42:33 I think it was a university. He was a student, and he's doing some research on kind of the maturity level of companies when it comes to CI, CD, and DevOps. And I thought what you were, I know you were very busy, but you took the time and then you did it. And because you also said, not only do I want to do it because I want to help students out there, but I also want to know how we are
Starting point is 00:42:57 and how we relate to other companies that do CI, CD, DevOps, or no ops. And so this is kind of my last question that I have to you. What was your reaction on that? What did you get as a feedback from this guy who is doing the study? Yeah, it was really very, very interesting. So as you mentioned, I was really busy. So we did this talk, I think, last week.
Starting point is 00:43:23 So it was one of the last companies he added to his final work. And, yeah, he had really a lot of questions, a lot of very detailed questions. I think it was because he talked to so many companies already before. And the final note, what I found very interesting was that he said, okay, he has to adjust now his metrics, his model, because we do not suit on this model. We are so high, we need to adjust this to make a place for us at the very top. So this is very, very nice. And I said to him, okay, make it a bit higher so that there's some space
Starting point is 00:44:03 because we want to improve it well. That's rather cool to hear that. I thought so too. And obviously, as I said, I think we can be proud of what we've achieved, but there's still room for improvement. And this is great. And it's just great to know that we have a good team and a great leadership that supports all of this. Brian, I think I'm speechless and I'm happy and I'm done with my questions. I'm done.
Starting point is 00:44:33 I'm done too. I just wanted to say congratulations to Anita and all the teams. Yeah, the whole team especially. It's not one person, but obviously, yeah. But awesome job to you regardless or awesome job for you relating it however you want to take it i know people like to be humble as they should be but um yeah whole team has been great um and i'm just very impressed yeah i've always known we've done things like this but i've never heard the the rich detail. And I, the, what kind of gave me,
Starting point is 00:45:07 gave me that mind blowing moment sort of was just the concept of a monitoring tool using its own self to monitor itself as it's being going through in the improvement phases and all. It's almost like the snake eating itself, but it's or, and, and adding improvements to the tool to improve that monitoring, which then feeds back. It's almost like pointing a video camera at a television monitor that's recording itself, and you get that ultimate feedback loop and Doctor Who type of special effects.
Starting point is 00:45:41 But it's amazing, and I think you're all great, and that's what I love about all this. And I think all of our listeners, if they want to experience on their own, they can just go to Dynatrace.com and sign up for the Dynatrace SaaS free trial because that's basically the product that you guys have been working on. So I think the first thousand hours are on us anyway. So if you want to experience first-class no-ops monitoring, to monitor your own software. And you were talking about releasing updates on the blog. What is that blog for the Dynatrace SaaS updates? And here's a new feature.
Starting point is 00:46:18 Is that a public? It's a public blog. So we use this actually as release notes. So we blog about our improvements and do not provide boring release notes. And what's that URL? It's on the website as well. The URL is still blogroxit.com, but we switch it to blogdynatrace.com very soon. Not sure if you know that there was a renaming from Dynatrace Roxit to simply Dynatrace, so Dynatrace SaaS.
Starting point is 00:46:45 Yep, there's been a lot of talk about all that. It's linked on the website, so you will find it. All right, excellent. And if anybody, anybody who's listening to this has a story of their own for a transformation and would like to share it with us, we'd love to have you on as a guest. Feel free to contact us through either Twitter, hashtag PurePerformance at Dynatrace, or send us an email, pureperformance at Dynatrace, or send us an email, PurePerformance at Dynatrace.com. Always looking for guests.
Starting point is 00:47:09 And, Anita, I would also love to have you back on. I'll second Andy's motion there. Yep. Yeah. See you. Cool. All right. Well, thank you, everybody.
Starting point is 00:47:21 Goodbye. Bye-bye. Bye. Bye. Bye.
