PurePerformance - 003 Performance Testing in Continuous Delivery / DevOps
Episode Date: May 24, 2016
How can you performance test an application when you get a new build with every code check-in? Is performance testing as we know it still relevant in a DevOps world, or do we just monitor performance in production and fix things as we see problems come up? Continuous Delivery talks about breaking an application into smaller components that can be tested in isolation but also deployed independently. Performance testing is more relevant than ever in a world where we deploy more frequently; however, the approach to executing these tests has to change. Instead of executing hour-long performance tests on the whole application, we also need to break these tests down into smaller units. These tests need to be executed automatically with every build, providing fast feedback on whether a code change is potentially jeopardizing performance and scalability. Listen to this podcast to get some new insights and ideas on how to integrate your performance tests into your Continuous Delivery process. We discuss tips & tricks we have seen from engineering teams that made the transition to a more “agile/devopsy” way of executing tests.
Transcript
It's time for Pure Performance!
Get your stopwatches ready, it's an exciting episode of Pure Performance.
I am Brian Wilson and with me today as always is my co-host Andy Grabner.
Say hi Andy.
Hi, how do you know it's going to be exciting?
You announce it and you don't even know what's coming up.
Well I know it's going to be exciting because to me, we have a very exciting topic for today.
One that kind of makes my brain itch every time I think about it, and I think it's something
that will make everyone else's brains itch quite a lot.
Today we're going to be talking about performance testing in the continuous delivery world and
what that looks like, how we go about it,
and what does that mean
for the traditional performance tester?
That's good.
Shall we maybe kick it off a little bit
in the beginning to explain some of the terminology?
Like what is CI, what is CD,
and also the whole thing around DevOps,
which I'm sure is a little topic
that people are hearing and hopefully not only hearing, but also by now, know what it actually
means, or maybe not knowing what it means. Because I actually remember when I got on stage,
and I always keep referring to this episode of my life, two years ago, I got on stage at the
DevOps meetup in Boston. And before I could even open up my mouth, people were pointing fingers at me and saying, hey, because of people like you, and they were not only meaning me in person,
but because of the companies, whether it's a Dynatrace or our competitors or the IBMs and
the HPs of the world, nobody knows what this whole DevOps is about anymore. And what does
it have to do with CI and CD? And because back then, I think we made the mistake in trying to
get DevOps on every single piece of collateral that we created just to get attention. Right. And you keep on saying CI and
CD and let's take a step back and say, CI means continuous integration and CD means continuous
delivery. And I guess the next question is, what do all those even mean?
Right. So a lot of people have probably heard of these terms.
They're quite the buzzwords, kind of like Agile was several years ago.
Everyone in almost every corporation is saying, yes, we need to be there.
We need to be CI.
We need to be CD.
But it's usually slow going and people don't really always completely understand what it
entails and what kind of commitment they're making and what that means for the organization.
Specifically, as we're going to be talking a lot more about today, for the performance teams. But continuous integration, I can just give my very simple, light version of what it is and let you expand further. Continuous integration is kind of like the continual flow of writing code, checking it in,
having it run through some iterative tests and proceeding up to the next chain into the testing world,
into the pre-production world, and eventually out into production.
But that also mixes perfectly together with the idea of continuous delivery,
where the whole idea is to always be pushing code out. Right. And you may have heard of continuous delivery in examples like a Google or Amazon or Facebook, where they're releasing code, what, like up to every 10 minutes or two minutes even, into production, some ridiculous kind of thing, which I think for most organizations is kind of not realistic nor completely necessary.
But the idea is just continually pushing out small bits of code through your chain so that you're never making one large release that's going to destroy anything.
But you also just get everything out there as quick as possible.
And you can quickly take it out if it doesn't work out.
Well, I think you hit it pretty well here.
I think continuous integration, the initial idea was if you have a lot of developers or a bunch of developers working on the same code base, then I check in some code.
And I want to, first of all, make sure I didn't break anything.
And so if I check in code, the first step that happens in CI is actually figuring out,
does everything compile?
Do all the unit tests run?
Hopefully I didn't break any of the key features of the code. And really making sure that the different code pieces that we're working on,
especially if you're talking about a large application,
then I may make a change in module A,
and how does that impact module B that I'm integrating with?
That's also why it's CI: continuously integrating your code changes and then also running automated unit tests, but also integration tests, to really verify, is the system still working based on what the system should do?
And that's why I believe testing is so critical to the whole thing.
And the whole mentality of test-driven development even becomes more interesting because test-driven
development basically says, well, before we write any code, we want to first write a test.
Why do we test first?
Because, well, we should know what the code we are writing is actually doing so I can
easily write a test for it because the test is testing my outcome.
So I'm creating a search feature.
Does the search actually produce results?
So I can write the test, then I do the coding. And every time we check in code and anybody's checking in code, we can always validate if the code change broke anything.
And if we do it in an automated way through a CI server that is kicking off all these tests,
whether it's a Jenkins, a Bamboo, any other server, it only takes minutes, hopefully,
for a developer from checking in code to knowing, did this code change break anything significantly?
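To make that test-first idea concrete, here is a minimal sketch of what such a check could look like in JUnit; the SearchService class and its search method are hypothetical names invented purely for the illustration, not anything from a specific product.

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertTrue;
    import org.junit.Test;
    import java.util.List;

    // Hypothetical search component, used only to illustrate writing the test before the code.
    public class SearchServiceTest {

        @Test
        public void searchReturnsResultsForKnownTerm() {
            SearchService search = new SearchService();
            List<String> results = search.search("performance");
            // The feature's contract: a known term must produce at least one result.
            assertTrue("expected at least one search result", results.size() > 0);
        }

        @Test
        public void searchHandlesEmptyQuery() {
            SearchService search = new SearchService();
            // An empty query should not blow up; here we assume it returns an empty result list.
            assertEquals(0, search.search("").size());
        }
    }

A CI server like Jenkins can then run these tests on every check-in and report green or red within minutes.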
Right. And I think the key thing you are mentioning there is the fact that it is the developer in this case who is writing a test.
And this is a big shift from the traditional means, right?
This is before it gets to the QA teams.
This is before it even gets to the performance teams. Obviously there is performance testing, as we spoke about in the last episode, that's still going to come into play in some fashion in these environments.
But this is where we start getting the development teams involved in this.
And this is where it also starts getting scary, because if the development teams are writing some kind of tests, what does, and we'll talk about that a little bit later, but what does that mean for the, you know, the traditional performance teams?
Exactly. What I want to add here, though, is I hope that developers are writing all the tests.
But what I see also in the Agile movement, Agile initially was, I think, heavily driven by developers, by engineers that write code.
But more so, I think we're seeing Agile going broader.
That means an Agile team should have multiple roles embedded in their team.
So there will be people that are just better in testing and others are better in development.
And maybe there are people that develop a cool testing framework that makes it easier
for the developers to easily write tests and take away the fear. So I think as a whole team, creating code, but actually doing test-driven
development will become much easier. And to be quite honest with you, I think if we do it right,
I mean, yes, I mean, I used to be a developer. I love writing code and I think I'm the best
developer in the world. And I hate going back and fixing problems,
but if I do it right and if I've once experienced how cool it is to get immediate feedback
if I check something in and then the test tells me,
ooh, you just did something stupid.
So that means I can fix my stupid mistake fast,
but I also learn to not make the same mistake again,
which in the end makes me a much more efficient and better developer.
So CI, and then as you said, CD is basically just taking continuous integration, where we're compiling, we're running the tests.
And then at the end, something drops out that we can potentially deploy into a test environment or into a staging environment. I think continuous delivery
has the concept of a pipeline that then has additional steps along the pipeline until
potentially we run through hopefully some performance tests, maybe some more integration
tests, and then deploy it into production and hopefully everything runs fine. And in case there's a problem, then we should know about this by adding monitoring to the
system as well.
So, like the immediate feedback loop in CI: I commit, all the tests run, and the tests tell me green or red.
If I deploy something and if I have metrics built in, and obviously we, from our APM background, know which metrics we want, whether it's response time, failure rate, errors, how the system behaves, and all that stuff.
If we deploy something into production, automated, and then we see these metrics going crazy, like response time goes up, failure rate goes up, or CPU and memory goes up, then we know it was probably a bad deployment.
So it was a bad change.
We may want to roll it back.
Because whatever we did,
we didn't catch it in testing, we didn't catch it in CI, it made it all the way out,
so we need to fix this. Right. And the interesting thing,
and I've heard you talk about this a lot in some of your speaking engagements, is there's two
approaches a company will take if something is failing in production. They're either going to
roll it back, which means that they have a really well thought out rollback plan, which can always be dangerous.
But some companies never go back, right? They roll in a fix. So I just bring that up because to me,
that concept, again, these are the much more advanced companies like your Googles and some
of these others where they just keep going forward. If it's broken, we roll in a fix.
We're not going to roll back.
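As a very rough sketch of what an automated "was this a good deployment?" check might look like, the snippet below polls a metrics endpoint for a while after a deploy and fails if the error rate crosses a threshold, signalling the pipeline to roll back or roll a fix forward. The endpoint URL, the plain-text response format, and the 2% threshold are all assumptions made up for the example; in practice these values would come from your monitoring tool.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Sketch: after a deployment, poll a (hypothetical) metrics endpoint and decide
    // whether the release looks healthy or should be rolled back / fixed forward.
    public class DeploymentHealthCheck {

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://monitoring.example.com/metrics/errorRatePercent")) // assumed endpoint
                    .GET()
                    .build();

            double worstErrorRate = 0;
            for (int i = 0; i < 10; i++) {   // sample roughly the first 10 minutes after the deploy
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                worstErrorRate = Math.max(worstErrorRate, Double.parseDouble(response.body().trim()));
                Thread.sleep(60_000);        // one sample per minute
            }

            if (worstErrorRate > 2.0) {      // assumed threshold: 2% failed requests
                System.out.println("Bad deployment: error rate peaked at " + worstErrorRate + "%");
                System.exit(1);              // non-zero exit tells the pipeline this change was bad
            }
            System.out.println("Deployment looks healthy.");
        }
    }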
So just conceptually wrapping, you know, for me, this whole concept, when I first heard
about it, it just kind of blew my mind.
But what that also means when you talk about that pipeline is from the performance test
point of view, yeah, it's sort of still like a little mini waterfall, right?
It's absolutely not waterfall deployments anymore.
But if you think about that pipeline, it's going to go through the CI section, then to
some of the other units and then pushed out.
It's kind of, you know, if you want to conceptualize it, it's kind of like rapid waterfall, if
you will.
Yeah, that's a very good, actually, it's a very good point.
Yeah.
Without, you know, without all the baggage that waterfall has.
It's definitely not waterfall, but –
Yeah, and basically what we have, and I think there's also a term that our listeners will have heard before.
It's between the individual phases of your pipeline, you have gateways.
You have quality gates that basically say, do we promote this build into the next phase of the pipeline?
Yes or no. But as you said, I like it actually. I like the rapid waterfall because it's really
rapid. Because instead of waiting two weeks or a month until we get something over to the testing
team and they take another two weeks until they either push it back or push it forward,
we have it instant. That's the whole idea. The reason why, and you mentioned it, some of these companies are doing rapid deployments several times per day.
And Amazon actually, I always bring this example in my presentation, Amazon claims that they're
deploying a change into production every 11.6 seconds.
And it's like crazy.
But the only way this works is obviously because they figured out how to break down the application into smaller pieces, small individual components that they can roll out, and because they have a fully automated pipeline with a lot of quality gates added to it. So you have a pipeline, you want to push a change, it goes through the pipeline. If it succeeds all the way to the end, it gets deployed.
And then if you have measures and monitoring in place in production and you see something is going bad, then you can react to it as fast as possible by saying, okay, we just changed that.
Oh, maybe we see in production monitoring that all of our Internet Explorer users have an issue.
So maybe we didn't test it right on IE.
Let's quickly fix this JavaScript problem that we may have just introduced and then
run it through the pipeline again.
So within a couple of minutes or maybe an hour, we can actually roll forward into the
fixed version.
Yeah.
And I think the key there, the key difference between waterfall and this methodology, are those gates, right? Because in traditional waterfall, you don't have gates. The developers write the code, push it out to QA with hardly ever any
kind of testing or a set of metrics that they're looking at consistently. And then the QA people,
either it's broken right away. I mean, again, if we go back to the couple episodes ago,
when we talked about common things we've seen, how many times did you used to get a release that just didn't even work out of the gate?
And then again, with performance and these other things, there are always gates and there's a consistent set of metrics, so for the developers, in these instances and other instances, it's not going to move forward unless all those metrics are met.
And those metrics mean that the teams all have to think about those metrics.
You know, we covered a lot of those concepts in our first episode.
But once you start having those things in place, it's going to start falling together naturally because you're going to really be thinking about performance in the lifecycle.
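As a rough illustration of what one of those gates could look like, the sketch below compares a few metrics from the current build against the previous one and refuses to promote the build if anything regressed beyond a tolerance. The properties files, the metric names, and the 20% tolerance are all assumptions invented for the example, not the output of any particular tool.

    import java.io.FileReader;
    import java.util.Properties;

    // Sketch of a promotion gate: compare this build's metrics to the last known-good build.
    public class QualityGate {

        public static void main(String[] args) throws Exception {
            Properties previous = load("metrics-previous.properties"); // assumed file from the last build
            Properties current  = load("metrics-current.properties");  // assumed file from this build

            // Fail the gate if a metric grew by more than 20% compared to the previous build.
            String[] metrics = {"dbQueries", "webServiceCalls", "responseTimeMs"};
            boolean promote = true;
            for (String metric : metrics) {
                double before = Double.parseDouble(previous.getProperty(metric, "0"));
                double now    = Double.parseDouble(current.getProperty(metric, "0"));
                if (before > 0 && now > before * 1.2) {
                    System.out.println("Gate failed: " + metric + " went from " + before + " to " + now);
                    promote = false;
                }
            }
            // A non-zero exit code stops the pipeline from promoting the build to the next phase.
            System.exit(promote ? 0 : 1);
        }

        private static Properties load(String file) throws Exception {
            Properties p = new Properties();
            try (FileReader reader = new FileReader(file)) {
                p.load(reader);
            }
            return p;
        }
    }

Any build server that can run a step and check its exit code can treat this as the promote-or-stop decision between pipeline phases.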
I was talking a little bit like Christopher Walken there, accenting like the later parts of life cycle.
So, but hopefully now the listeners know: CI, continuous integration; CD, continuous delivery, which basically just takes the concept of CI all the way until the very end, into production.
And then DevOps is something that I mentioned in the beginning.
It's just a term for me that in general,
the key thing of DevOps is actually cultural change
and how we actually go about developing and deploying
and dealing with software changes
and also how the culture, like who is responsible for what.
But I think the technical vehicle is going to be continuous delivery,
meaning the promise of DevOps is that we can deploy changes
into production faster than ever before with a better quality.
Continuous delivery gets us there because it's automating the whole way, not only the software.
Actually, we only talked about software right now,
but we also need to talk about the environments, right?
We're talking about infrastructure as code.
So we talk about automation all the way through.
And I think this is where some people in the beginning might be confused about, hey, what is this DevOps?
Is DevOps not continuous delivery?
I think the continuous delivery is part of the promise of DevOps because DevOps promises we can deliver software faster to our end users.
Or let's say that way.
We can deliver value faster to our consumers.
And value will be we need this new feature because this is just the best idea.
And instead of waiting for a year in the waterfall process, we start by leveling up.
We are doing everything agile.
We are basically trying to start with the minimum viable product.
We don't need to flesh out the whole thing, but we start small.
We automate the
delivery process and then we can step by step improve the feature until we're there where it
needs to be. And along the way, we always have metrics that act as gateways.
Yeah. And I think, you know, you mentioned some of the idea of the culture of DevOps, and one thing that I think a lot of people, including myself, when I first heard it, it was dev slash ops or dev dash ops, whatever.
And it always seemed like it's for developers and operations.
And all the testers in the middle got cut out.
Right. But, you know, as I learned, as I read more into it and, you know, learned a great deal more about it, it's not the case.
It's more about, well, I think it's actually a pretty poor term, in my own opinion, because you're really talking about integrating the development teams with the testing teams, with the performance teams, with the operations teams, with the marketing teams, with the business teams.
It's really about bringing this culture together to streamline this delivery mechanism.
And just by using that term DevOps, it sounds like it's cutting a lot out. Maybe someday we'll have a better term for it, maybe just like happy love for everybody. But just because you're a tester, you're not cut out of this DevOps role, and you become even more integral, right? Because your job is to know testing and to know all those tricks and those gotchas and those warnings that you always had those intuitions for,
and bringing those into, you know, as you were talking before, when you're in the CI
part of it, the continuous integration part of it, maybe it's not the developer who's going to
write that little test. Maybe you're going to be writing the test in conjunction with the developer
who's writing the code, or you can give the developer who's not used to thinking of testing
that kind of advice on what to do.
Yeah. And I keep referring to a presentation I saw last year, several times. So the guy is called Adam Auerbach, from Capital One. And in case, listeners, you have not seen him: Adam Auerbach, you can find him on Twitter. And the videos of the presentations he did at Velocity and StarWest are all online.
And he basically said, well, the way they do development is they basically got totally rid of their testing teams.
He was one of eight test leads three years ago, and then they went from waterfall to agile,
or the new terminology, what we just learned today from Brian, rapid waterfall.
No, seriously, what they do, they brought their testers into the development teams.
And now the testers sit side by side with the developers and they cross benefit from each other.
Why?
Because if I'm a tester and I know how to better write tests, then that's great.
But I can kind of show my developers a little bit on how they also can maybe use my frameworks that I'm building to build their own tests.
On the other side, if they tell me, hey, Andy, you're the tester.
You're better in testing.
I'm better in coding.
But I'm looking at this new framework.
Let's say I'm looking at Angular because I want to write my new web app in Angular.
Then if they talk with me, I can actually go back and say, okay, I'll do some research on how to best test Angular apps.
What are the frameworks out there?
So I can build a little frame set or framework and say, okay, cool.
You're building your apps on this framework.
Here is the best way how I think we can test it.
And by the way, I found these five blog posts that talk about the most common pitfalls.
So as a team now, we are much stronger because we help each other in the end becoming more efficient but producing better quality software.
And that's what it's ultimately all about.
So in talking about that, that kind of brings me to another thought along the lines of all this is how is testing performed in the different cycles of continuous delivery?
You're not going to
be running, at least I don't think you'd be running a full scale load test. You know, we talked
recently about the different types of tests and what they mean, but now you have your CI, your
continuous integration or developer kind of cycle. And then it eventually goes into that release
pipeline, which will go into the performance
realm. But what kinds of tests are run at the different levels? So let's even go: is there something before it gets checked in, on CI, that even gets run? Like, what is the developer going to be doing? Or, you know, does it really start in earnest in the CI once it gets integrated, or what are we looking at?
I think it starts earlier. I think it starts, I mean, the modern IDEs,
they already have all the code checks,
the static code analysis, pattern detections.
So there's a lot of stuff already happening, actually,
while the developer is writing code, while it's live.
If you turn on IDEs, whether it's Eclipse or Visual Studio,
and then you start typing, it automatically tells you
that that might be stupid.
There's static code analysis running all the time.
What we, from a Dynatrace perspective, what I try to promote, and I had this in one of
my recent online performance clinics, if I'm a developer and before I check in code, I
hopefully run my local unit tests just to make sure everything works fine on the code
that I changed.
So I run my unit tests and I inject the Dynatrace agent,
and then the agent can tell me what my code is actually doing underneath the hood.
Because guess what?
Most of the code that I'm responsible for is actually not my code.
It is my code plus all the frameworks it runs on,
whether it's Hibernate, whether it's Spring,
whether it's any of the Microsoft libraries.
I'm not sure what's out there on the Node.js front.
But basically, before checking in code,
I, as a developer, should do a quick sanity check
by leveraging the static code analysis tools
that are built into my IDE,
but also leveraging APM tools like Dynatrace
and run the unit tests and then look at the PurePaths and then say, oh, the code change that I just did is causing 500 additional objects to be allocated; that's 500 more than before. Well, it still works fine on my machine, but maybe later on that's not a good idea if we have that much of a memory footprint. Or the way I configured Hibernate caching is now going to the database in a much different way, and we have just introduced the N+1 query problem.
So these are all things I can actually do as a developer before checking in code.
So it starts actually when writing code before checking in.
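For anyone who has not run into the N+1 query pattern Andy keeps coming back to, it looks roughly like the first method in the sketch below, with a single-round-trip alternative next to it; the table and column names are invented purely for the illustration.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Illustration of the N+1 query pattern (table and column names are made up).
    public class OrderLoader {

        // N+1 variant: one query for the orders, then one extra query per order.
        // A tracing tool (or even a simple query counter) run under a unit test makes this
        // visible long before any load test does.
        void loadOrdersNaively(Connection connection) throws Exception {
            try (PreparedStatement orders = connection.prepareStatement("SELECT id FROM orders");
                 ResultSet rs = orders.executeQuery()) {
                while (rs.next()) {
                    try (PreparedStatement items = connection.prepareStatement(
                            "SELECT name FROM order_items WHERE order_id = ?")) {
                        items.setLong(1, rs.getLong("id"));
                        items.executeQuery(); // one additional round trip per order
                    }
                }
            }
        }

        // Same data in a single round trip, using a join instead of a query per row.
        void loadOrdersWithJoin(Connection connection) throws Exception {
            try (PreparedStatement joined = connection.prepareStatement(
                    "SELECT o.id, i.name FROM orders o JOIN order_items i ON i.order_id = o.id")) {
                joined.executeQuery();
            }
        }
    }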
Right, right.
And then when they go into, when they do check it in, we're obviously not going to be running full scale load.
But are we talking about running 10, 20 users? Or are we talking about, again,
kind of full runs from the front end, but just single hits?
How would you run that?
So in my perfect world,
when I check in code,
so let's assume we're 10 developers in a team
and I check in my code
because I did my local tests
and everything is fine.
I check in code.
Now what happens,
Jenkins will hopefully,
or whatever build server you use,
will compile the code, will run all the unit tests, run all the functional tests.
And actually on these tests, and this is something I've been promoting for a while, these tests are testing functionality right now.
But instead of just testing functionality, why don't you combine your test executions with tools that can give you these key metrics that tell you more
about what your code is dynamically doing. Meaning, instead of just the static analysis that the developer can do on the local machine, we automatically combine your unit tests and your integration tests with APM tools, you know, Dynatrace, for instance. And then we spit out,
hey, this unit test is executing five database queries.
It is throwing seven internal exceptions.
It is creating 100 log messages.
It's allocating that many objects of that business type.
And if we have these metrics, then what we can do in the CI before we run any load test, we can already identify, and I'm sure of that, most of the problems that will later show up as
a performance and scalability issue.
Because most problems that we see in apps are related to inefficient database access,
very aggressive memory usage, some mistakes in doing multi-threading, you know, if you're
spawning up threads, calling services in the background asynchronously.
Some of these things,
but looking at the metrics,
can already be identified
by executing single tests,
like unit tests and integration tests.
And it's not that you find everything,
but I think what it is,
you find regressions.
So if you keep,
if you look at these metrics
from build to build to build,
and now I know,
hey, in the last 10 builds, my test that is testing this feature has always told me we're executing two database queries.
And now it's 20.
Hmm.
Will this change later impact my load tests?
Probably so.
I will probably find the same thing.
Probably the load test won't even finish
because we're killing the database server first. So that actually, my point is, before we even go
into running small or medium or large scale load tests, we should actually look at these,
and I call them architectural metrics. I'm not calling them performance metrics.
These are architectural metrics. How many database queries, how many web service calls,
how many threads are involved, how many objects do we allocate,
how many objects stay on the heap after we run a GC.
So these are all the metrics that we need to check early on.
So that's the next step I think that should happen.
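A hedged sketch of what asserting such architectural metrics inside an ordinary functional test could look like: SearchService and MetricsRecorder are hypothetical stand-ins for your own code and for whatever your APM or tracing tool exposes about the last test execution, and the budget numbers are invented for the example.

    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    // Sketch: the functional test stays the same, but it also checks "architectural metrics".
    // MetricsRecorder is a hypothetical stand-in for what a tracing/APM integration would report;
    // it is not a real Dynatrace API.
    public class SearchArchitectureTest {

        @Test
        public void searchStaysWithinArchitecturalBudget() {
            MetricsRecorder recorder = MetricsRecorder.start();

            new SearchService().search("performance"); // the functional part of the test

            // The architectural part: budgets based on what the previous builds looked like.
            assertTrue("too many database queries", recorder.databaseQueries() <= 2);
            assertTrue("unexpected web service calls", recorder.webServiceCalls() <= 1);
            assertTrue("exception noise increased", recorder.internalExceptions() == 0);
        }
    }

If these budgets are stored per build, a jump from two database queries to twenty shows up as a red test on the very check-in that introduced it.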
Right, and just from the testing point of view,
those unit tests and functional tests on that level,
what kind of, you know, I've seen on our internal tools,
a lot of those are written with like maybe some Ant scripts or something.
But is it typically, you know, Ant,
or what are people writing a lot of those tests in?
Well, not the tests. I mean, Ant is going to be used to execute these tests. The tests themselves, they're written using, you know, JUnit.
That's what I'm saying, yeah.
Yeah, that's fine. That's good. That's why we have this discussion, and hopefully, you know, and I don't know everything there, but that's what I see.
These tests are typically written
on the unit test level. You use the XUnit
framework of your, you know,
runtime, whether it's
NUnit or MSTest on the Microsoft
side, whether it's JUnit or whatever tool.
What I also see a lot of people do, especially those that write services,
mostly REST and web services, they typically use tools like SoapUI, JMeter, Gatling, any of these tools that are able to execute tests against single REST interfaces.
So you do your REST API testing, right?
Just as a unit test, you're now testing the little unit, which is your service, but you're using the protocol that your end users will use to test it.
So you're running some JMeter scripts, for instance.
But these are single, these are not load tests.
These are, we call them web API tests.
So if you go, for instance, to the Dynatrace documentation, we have some walkthroughs on how to integrate your JMeter scripts with Dynatrace and just run your service tests. So test every service. If you have a REST service for search, for instance, then you probably want to run it with different search queries; you want to give it an invalid search query and verify that the search actually says this is a stupid query, or things like that.
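As a sketch of such a web API test written without any load-testing tool at all, here is a plain JUnit test driving a search endpoint over HTTP with a valid and an invalid query; the URL and the expected status codes are assumptions invented for the example.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Sketch of a "web API test": exercise a single REST endpoint over the same protocol
    // end users will use, but as a plain unit test rather than a load test.
    public class SearchApiTest {

        private final HttpClient client = HttpClient.newHttpClient();

        private int statusFor(String query) throws Exception {
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://localhost:8080/api/search?q=" + query)).GET().build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode();
        }

        @Test
        public void validQueryIsAccepted() throws Exception {
            assertEquals(200, statusFor("shoes"));
        }

        @Test
        public void invalidQueryIsRejectedWithAClearError() throws Exception {
            // The service should tell the caller the query is bad, not fall over.
            assertEquals(400, statusFor("%00"));
        }
    }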
Right. And I think, you know, not that I want to talk about this right now, but all that stuff that you talked about, the different kinds of tests that you want to run, that's exactly the kind of information and expertise that the testers can provide the developers: what kinds of tests to run and how to run them best. Because that's where your expertise is going to lie.
Exactly.
And the cool thing about these tests,
these integration tests, the functional tests,
are typically very fast executed
because your unit tests execute pretty fast.
You test a small portion of your code,
even if you have 100, 1,000, 10,000 tests.
But the good thing about this is if you find regressions
in the early stage of these tests,
that means you can already stop the build. You can stop your, Brian, here it comes, your rapid waterfall, you can stop it right away, and there's no need anymore to run all these load tests, because they'll just find the same problems. They'll find that the database is going to be overloaded; they'll find that you crash your JVMs because you have a memory leak.
Yeah, yeah, that's all important for having those metrics set up ahead of time and knowing what to look at.
Exactly.
So then, obviously, once you get out of CI, I mean, the continuous integration and that build, that's when you're going to enter the delivery pipeline and go into your performance test. Now the big challenge on the performance testing side at this point,
right, is you don't have the timeframe you used to have.
You always got your timeframe cut short no matter what, right?
Let's say release was four months, like an old release was four months.
And they said, oh, it'll get to you by the last month.
In reality, it'll always get to you in the last two weeks.
So there's still going to be a bit of a time crunch challenge,
but obviously now it's good, it's very different, you know. And one of the things I kind of struggle with, at least from my history of performance testing, is not knowing which scripts are still going to work and which scripts you're going to have to rewrite. And even, so does this call for another type of approach? How much of a full sweeping test, the entire-system test, do you run at this point?
Or if you're really getting into a mature CD system, do you start running a load test?
Say, you know, let's talk about the concept of microservices, right?
There's all these other little bits, and maybe they're putting a change out to one microservice.
Maybe it's login.
Maybe it's checkout.
Maybe it's something that's kind of key.
A lot of things might interact with it,
but the only change they're making,
they say, right, is to that one piece.
Do we at that point just load test that microservice, or are we doing a full entire-site test? And how do you manage, in those time cycles, running those really large tests anymore?
Well, I think it's a very tough question. First, if I listen to the story you just lined out, they only changed that service, and I would obviously say, well, of course we only need to test that service and figure out if everything runs correctly. I recently wrote a blog post on going from monolith to microservices in confidence with Dynatrace.
I brought an example of a company I worked with, and they basically did the same thing.
They had multiple services, and they said, well, we are only changing this service here.
But what they also changed is the way the service was accessed from other services.
And now instead of calling the service once per search, it was called X times.
So actually they had introduced the N+1 query problem. So, the search front end and the search back end: they optimized the back end, tested the back end, everything was fine. They did not realize that instead of calling it once, as I said before, they were calling it for every single search result. So if you have a search that produces 100 results, it's called 100 times. So that's why I think, obviously, you need to start where you think your change is.
But I think it would not be wise to just rely on those tests. I think you should do at least some end-to-end sanity checks, at least on your top features. So if you know that search is your number one feature, and if you know these are my five top features that people are using, then definitely run at least some end-to-end functional tests, but also some load on them. I think you just have to do it, maybe not the full scale all the time
if you don't have the time, but at least see: is there a change in behavior from the previous build that we released? And coming back to the metrics: if we see that we have 10 more calls to the back end, and we're transferring 50 kilobytes more per search query, and if we know in production we have a million search queries, that means we have a million times 50 kilobytes more, which is roughly 50 gigabytes of additional data transferred. You can do the math, right? I don't need to run...
Yeah, okay, yeah, that was an easy one.
Yeah. So the thing is, though, what I'm saying is you don't necessarily always need to run a full-scale load test, but I think you need to do the basic tests. And then, I think, for me, what testing is all about is finding regressions.
And whether it's a functional regression
or whether it's an architectural regression
or whether it's a performance regression
where I always combine,
I always link architecture and performance
because if you make an architectural change
that is bad,
it will impact your performance
and scalability.
And I'm sorry that I always have to come back to my N+1 query problem on the database, or the N+1 query problem between services,
whether you call them microservices or not microservices, I don't care.
It's the same thing.
If you are deploying new code and you're now making 20% more calls for the same features, then you can easily do the math by figuring out how many calls you have in production under peak load, adding 20% more, and you know that this is probably not going to scale, right? I still hope, though, I still hope that people will have some time to run some set of tests. But I think you hit on a very interesting topic where you said we don't even know if the scripts still work. If you don't run the tests on a continuous basis,
and if you only run them once per big release, and that might be just once per month,
then this is probably bad because you have a lot of change. And that's why if you're moving to a
more rapid deployment model and a rapid change model where the pipeline is really executing as
much as possible, then you constantly have to also keep your test scripts up to date. And if you actually use modern testing tools, where you write your tests no longer in a cryptic language but in the development languages that the developers use anyway, it will also be easier for them to maintain. And coming back to that, some of these tests might then be maintained not by the testers but by the developers who also make the code changes. So I think it has a lot of benefits if you do continuous integration
and continuous delivery right.
Yeah. And I think this is where things can be challenging now for the traditional performance teams. Right now, I'm coming into this thinking of when and where I started from. So I think maybe people who are just coming into performance now might have a little bit more of an advantage, because they're probably already hearing about a lot of these things and might have already gone into some development, right, but found the field of performance more fascinating. So they've gone into there. But for, you know, all of us, let's call us old-timers, right? It gets very, very challenging: what do you do? What do you have to now
learn? What do you, how do you have to modify even your outlook? You know, there was, you know,
back in my earlier days, myself and all my colleagues, we were kind of somewhat performance
purists, you know, where we would be, we would say to the teams, look, if we can't run a full
scale load test, we can't sign off on this because there might be something that
has been missed.
And sure enough, every once in a while, we could justify that clause because every once
in a while, there would be something that supposedly was never touched that would break
in the new release.
And everyone would be like, well, how did that break?
How did that break?
And then they find out, oh, we didn't realize this is going to have an effect on, you know,
code A is going to have an effect way down the line there.
So it forces you with these more rapid deployments, with these maybe not running everything all the time. It really forces you to kind of break out of that mindset of being a purist, which is somewhat painful, right?
Because for me, it was always a matter of pride of saying, hey, if I signed off on this, you're going to, it's going to work, right? Unless it was something
we could not have predicted, especially with the, you know, not-as-good tools back then, I guess. You know, for a modern-day performance tester, it comes down to, you know, changing your mindset a bit, right? Learning some coding. You're not going to have to be a developer, but, you know, if you're taking a look at, I heard you mention Gatling before. Was that using the Scala code? Is that what the Gatling one is?
Exactly.
Right. JMeter, JUnit, all these things.
You're there. They're using some of this coding language.
My advice on that side is learn some of that because you're going to understand coding a little bit better,
which is not just going to help you in your job, but it's going to also help you understand what's going on with these tests.
If something is slowing down,
you have a much better understanding now of what code does and how it affects things.
So, yeah, I don't know.
What am I trying to say here?
I'm trying to say if you're going to be moving into this world, right,
which seems inevitable,
yeah, there's going to have to be a new set of skills to be learned.
I think so, too. I think you're spot on, and I think there are also some new ideas on how to do the testing. I mean, I think nobody wants to say we don't need to run large-scale load tests anymore, because, I mean, we had the discussion with Mark on the last episode: they're very critical, especially if you think about, you need to prepare for e-commerce, for the holiday shopping. It's April 18th here while we're recording, and I think tax day was a couple of days ago, right, April 15th. And so a lot of people, obviously, like the TurboTaxes of the world and other organizations out there, they have to prepare for this peak load, and I think they want to be sure that they can sustain the load.
That's also why these companies typically have, you know, even though you said sometimes
we get two weeks or maybe one week to the release and then we start testing.
I think these companies figured out there's so much money at stake. They have code freezes. They have feature freezes weeks and weeks before, and then really do thorough testing.
But I think, and hopefully this came across well, I think we can shift quality checks and especially performance and architectural checks to the left by looking at these metrics.
And why I bring metrics up again is, I mean, Brian, both of us, we work for Dynatrace.
We look at PurePaths, our data set, all the time.
We know what to look at, right?
We open up the transaction flow.
We look at the number of database queries.
We look at which methods have CPU hotspots or synchronization issues.
We look at how many web service calls are made.
So these are metrics.
And why do we have to wait until somebody calls us and says, hey, I have a PurePath from production because we're hitting a problem?
Why do we have to wait all that long?
Why not if we know that these patterns are out there and have been out there for probably longer than we, the two of us, have been in business?
Why don't we automatically let our tests that we already have, the unit and the functional and integration tests, already check for these metrics? Then we can automatically detect the regression when the code change appears.
And we don't have to wait all the way until the end.
We may never find it in testing or we'll definitely hit it in production.
So let's just move it, shift it left.
That's my point.
Yeah, no, absolutely.
And I think what gets really fascinating with that then is when you start seeing the patterns of, you know, let's say early on, the developer didn't think much of an increase of 10 database queries.
They still pushed it out, right?
And then it gets to the performance team when you're running that test is where you find the problem, or maybe the load wasn't run properly or wasn't run with insight onto what was really going to happen to production
and it blows up in production. But what you start learning over time by capturing that data
early on is you see the cause and effect and you see, okay, if we go from two database queries to
10, I now know that, you know, that's going to be kind of our threshold, that has enough of a change in our run that I'm going to flag it. Because from two to 10, yeah, it's still, I mean, realistically, it sounds like a lot, right? But two to 500 would be really obvious. What am I trying to say here? I guess what I'm really trying to say is it lets you learn what these patterns do, you know, and it becomes an educational experience. But it also then becomes a time-saving and money-saving thing for both yourself and the organization. But I definitely agree with the idea of getting these metrics early. And that kind of brings that DevOps concept back into this, because that's where, as a, you know, performance team member, it's a great idea for you to get involved on that development level
and start sharing your ideas
and also finding out what their ideas are.
Because the development teams,
that can help you run a lot of ideas
through on your performance side.
I was just going to say,
take a typical idea of memory usage.
As a performance tester, you might not be thinking that, hey, if we use the same search term queries all the time, our queries are going to get cached.
And we're not going to be really exercising the system because, let's face it, you probably don't have a computer science background.
The developer might, though, mention that to you and you go, ah, great idea.
And now suddenly you look a million times smarter and you get much better results.
So yeah, which also reminds me of two things that I want to say here. I like that; this is a perfect example with the search query. So what I see, I mean, the reason why you use different search queries is not only because you want to test caching, or you want to make sure that you're not hitting the cache all the time, but you also want to find out data-driven problems.
So if you are using a search query that always returns five results and one that is always returning 50,
let's say if you have these two tests and you test these different combinations, different search queries that return different sets of results,
and you then look at the metrics, what is actually happening behind the scenes. Oh, for the search result that produces five results, or the search key that returns five results, we see five database statements.
For the one that is returning 50, we see 50 database statements.
And if you do the math, it's probably going up.
So that immediately tells you, even with a single integration test or functional test that uses different input parameters, you can automatically identify data-driven problems. Because guess what, in production somebody will figure out, hey, I can execute a search without a search key, that means I get 1 million results, and therefore kill the database. So these are the things you can detect, these data-driven problems.
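A minimal sketch of how a single test that varies its input might surface that kind of data-driven problem; SearchService and QueryCounter are hypothetical stand-ins, with the counter representing whatever your tracing tool or a wrapped DataSource reports as the number of SQL statements executed.

    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    // Sketch: one functional test, several inputs of very different result sizes.
    // If the statement count scales with the number of results, that is the N+1 pattern again.
    public class SearchDataDrivenTest {

        @Test
        public void queryCountDoesNotGrowWithResultSize() {
            // Terms assumed to return roughly 5, 50 and 500 results in the test data set.
            String[] terms = {"rareTerm", "commonTerm", "veryCommonTerm"};

            for (String term : terms) {
                QueryCounter counter = QueryCounter.start(); // hypothetical SQL statement counter
                new SearchService().search(term);
                assertTrue("query count grew with result size for '" + term + "'",
                        counter.sqlStatements() <= 3);
            }
        }
    }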
That's one thing. And the other thing I wanted to say: you know, DevOps obviously goes into operations, so we will never be able to find everything in testing, and obviously we need to expand our performance engineering, I think, more into production. And I think this is probably a topic on its own, so maybe it's something for one of our upcoming webinars,
doing performance testing or monitoring in production,
which is a very cool thing, obviously.
But what I'm saying is don't stop with these metrics
after the package drops out from your build server
and it then gets deployed into production.
So really keep monitoring them.
And I think the key metrics that I like,
the same as before, we said database queries and all that stuff, the key architectural metrics.
But one metric that is very critical is usage.
What does usage mean?
How many people are actually using that feature?
This is something I brought up many times in my latest presentations.
Hey, Brian, if the two of us, if we work in a development team and our product
manager, let's call him Mark, he came up with this great idea and he forced us to implement this.
And we did everything. We developed it. We tested it. And now we're deploying it. And guess what?
Nobody's using this feature because we see from a production perspective, nobody's using it.
We're using real user monitoring or UEM in Dynatrace. And then we see that nobody's using
that or maybe a very small percentage.
Then we can go back to Mark and say,
Mark, you may have thought you had the best idea in the world.
We tested it.
We spent a lot of time and now nobody's using it.
So what do you want us to do?
Either optimize so that people actually use it
or figure out why they're not using it
and then get them there.
So maybe to fix something from a UI perspective or maybe from a flow perspective.
Or maybe it is actually a feature that nobody uses and needs.
Maybe Mark, unfortunately, had the wrong idea.
And if we can actually make that decision based on these metrics, we can, guess what,
take out code, take out tests.
That means if we take out code and take out tests, we reduce our technical debt, we reduce our test cycle execution time, and we also reduce what's called business debt. So we actually remove features that nobody needs; otherwise we would just drag them along, we are dragging along business debt. That's a term that actually Nida from Verizon coined when she was on stage with me.
I've never heard that one before. I like that one.
Business debt really means, hey, we have applications
and features and applications
that we don't even know
if anybody needs it.
Probably nobody needs it,
but because we don't have the metrics,
we don't know,
so we better carry it along.
So on top of technical debt,
which is kind of
code after code after code,
which is kind of getting messy
and you don't really know
if it's used or not
and if it's useful,
you also kind of say, well, we have certain business features of our apps that probably nobody needs,
but we don't have the data to take it out.
If you think about Microsoft Word, which is a great app, but I'm sure 90% of the features nobody ever uses,
but Microsoft doesn't know whether they can take it out because they don't know if anybody is using it.
Is Clippy still in?
I don't know.
One last thing I wanted to mention too with this idea of getting information from production.
Let's go back even to your search idea.
And I know this is a little bit off topic because this is kind of talking about testing
approach or ideas, but you have this idea of testing your search, right? You can either use
one term over and over again and do it wrong, or you can have an infinite number of unique terms,
which in itself would be wrong, right? Because then you're never going to be leveraging the cache, which is not a real-world scenario. So another thing you can get, partially from that usage in production and partially just from some data mining, is, well, let's take a look at the top 1,000, 2,000, 3,000 search terms and feed that into our data file.
So you're using the patterns that people are doing, because obviously it would be great
to have one of those, I'm going to search for blank and return a million things.
But that's something you should find much lower in the cycle by doing some smart testing down there. Once you're doing that full scale load test,
get a realistic data set to use. Because I even had that mistake that I made once back in my old
job, where we were mostly a content site. So it was all just different pages. And unbeknownst to
me, they had introduced some caching. But for the last three years prior to that, the testing I was writing had a bunch of different large data sets of pages to hit for the different types of page.
And it would just randomly pick through them.
And then this new release came out, and I reported significant drop in performance.
We were still only talking about maybe 300, 400 milliseconds of a degradation, but that was a much larger change than we've ever seen before.
And I started, you know, sky's falling, we can't go, we can't go,
and then found out from the developers that they've introduced some caching.
So if I'm hitting them all random,
I'm never using any of that new caching that was in place,
and that then helped me change my test to be,
okay, what are the pages people are actually visiting,
and model it on that.
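Whether it's pages or search terms, one way to sketch that is to export the top couple of thousand entries with their frequencies from production analytics and sample from them in proportion to real usage; the CSV file name and format below are invented for the example.

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    // Sketch: build a realistic test-data feed from production usage instead of purely random terms.
    // Assumes a CSV exported from analytics/monitoring with lines like "shoes,12500" (term, search count).
    public class WeightedSearchTerms {

        private final List<String> terms = new ArrayList<>();
        private final List<Long> cumulativeCounts = new ArrayList<>();
        private long total = 0;
        private final Random random = new Random();

        public WeightedSearchTerms(String csvFile) throws Exception {
            for (String line : Files.readAllLines(Paths.get(csvFile))) {
                String[] parts = line.split(",");
                terms.add(parts[0]);
                total += Long.parseLong(parts[1].trim());
                cumulativeCounts.add(total);
            }
        }

        // Pick a term with probability proportional to how often real users searched for it,
        // so popular (likely cached) and rare (likely uncached) queries both show up realistically.
        public String nextTerm() {
            long pick = (long) (random.nextDouble() * total);
            for (int i = 0; i < cumulativeCounts.size(); i++) {
                if (pick < cumulativeCounts.get(i)) {
                    return terms.get(i);
                }
            }
            return terms.get(terms.size() - 1);
        }
    }

A load script can then call nextTerm() for every simulated search instead of picking uniformly at random.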
So that kind of data is something you can pull back from
operations. And that again, pulls in that whole DevOps side of you're in between the developers
and operations. You have two great sources of data. And then, either behind you or in front of you, you have marketing and business and project managers, who are other great sources, you know; how do you service them and make yourself valuable? And I know that's come up a few times already in our three-episode span so far, but I think
it's a very important concept.
It's a feedback loop back from ops to not dev, but to test.
It's awesome.
This is exactly what DevOps and continuous delivery are about, having not only these quality gates
that stop things, but also feeding back data upstream.
All right.
Well, I think that'll wrap it up for today's podcast.
I want to thank everyone for listening.
And once again, thank Andy for doing this with me.
It's a great pleasure to share this time with you.
Yes.
And we still don't know where this is hosted, but you should be able to find us on Spreaker, iTunes, and a bunch of other places very soon.
Any closing thoughts, Andy?
No, just – I mean, I think the topic is so broad.
I mean, as always, we could go on forever, I guess.
I mean, the closing thoughts are shift quality to the left, meaning left in the pipeline by looking at the right metrics.
I believe we can find a lot of performance problems early in the lifecycle, not only through performance tests.
However, I'm a strong believer we need to run performance tests, not only small ones and then say everything is fine because we don't have enough time.
So we definitely need to make sure we have enough time to run these tests. But you can do a lot of the checks, the performance checks, in the pipeline, in your continuous delivery, starting with the functional tests, integration tests, small-scale load tests.
And you will find a lot of problems.
And therefore, if you really then have the time at the end, even if it might be limited, for running large-scale load tests,
you will be sure the system you're testing is actually much better quality than before,
so you really only find the hard problems.
Or maybe you're actually good already, and everything will be fine, and then you can enjoy the next release.
And I'll just end on saying that if your organization is moving in this direction, which hopefully they are,
and you actually enjoy performance testing, don't be scared.
Embrace the new.
There's going to be a lot of amazing things to learn and a lot of growth that you can undertake in this process.
So if you're not too keen on performance testing already, maybe it's a good sign. But if this is a fascinating topic for you and you really enjoy it,
there's a lot of amazing things that are going to come out of it.
So just dive in headfirst, and part of Agile is allowing for failure.
So allow that for yourself as well, and keep moving forward.
All right, we'll see you next time.
Thank you.
Goodbye.
Bye-bye.