PurePerformance - 002 What is a load vs performance vs stress test?

Episode Date: May 5, 2016

Have you ever wondered what other people mean when they talk about a performance or load or stress test? What about a penetration test? There are many definitions floating around and things sometimes get confused. Listen to this podcast and let us clarify for you by giving you our opinion on the different types of tests that are necessary when testing an application. In the end you can make up your own mind about which term is best to use. Special Guest: Mark Tomlinson (@mtomlins) of PerfBytes

Transcript
Starting point is 00:00:00 It's time for Pure Performance! Get your stopwatches ready, it's episode two of Pure Performance. My name is Brian Wilson and with me we have... Wow, who do we have here? Well, I want to actually introduce our special guest before I introduce myself. Our special guest who is sitting very close actually next to me in my hotel room at STP Con outside of San Francisco. He's rubbing my back a little bit to make it even more cozy. Mark Tomlinson.
Starting point is 00:00:51 Hello. Hi, Andy. Always a pleasure to be with you. And Brian, hey, dude, we're podcasting, man. Yes, we are, finally. Right on. Great. So, Mark, for those people that don't know you, even though it's hard to imagine that there are people out in the world that don't know you. Who are you?
Starting point is 00:01:05 I am not a rap star. I don't rap. Okay. I think they refer to themselves as hip hop stars, not rap. Yeah, hip hop and hip hop stars. That's like a couple of decades ago. No, but that's just, I'm old enough now that a couple of decades ago would be a rap star. So I just don't, I believe in cultural preservation from a musicology perspective.
Starting point is 00:01:23 Okay. I'm not a rap star. I am on, if you go to LinkedIn, you'll find me. I'll be a performance Sherpa or a performacologist. Performacologist. Yeah, I saw that. I'm actually a performinator. Performinator. You've been in performance for quite a while.
Starting point is 00:01:37 Yeah, more than 20 years. And looking at the color of your hair, it seems you have your fair share of challenging performance problems. Incredibly challenging, frustrating, nail-biting. I went through a lot of therapy to really recover from those things. But it was worth it because now I spend a lot of time actually outside of doing consulting work and helping customers and helping individual practitioners coming out and teaching like we do with Dynatrace. Andy and I both are teaching this week at STPCon. It was pretty awesome.
Starting point is 00:02:05 So I spent a lot of time teaching and mentoring, both in testing, performance testing, and also public speaking, particularly women that are getting into tech, which is really, really good to help bring some diversity into our public speaking tech world. So that's another thing I love doing as well. Cool. And besides being on our podcast today, if people want to hear more of your voice. Why would they? I mean, I know, why would they? But you still have a couple of minutes to convince them to actually listen more to the stuff that you have to say. But there is, I think, a PerfBytes podcast that people maybe should be aware of.
Starting point is 00:02:42 Yes. If anyone was at Dynatrace Perform, the conference last fall in October in Orlando, James Pulley, my colleague, and I were there doing a live PerfBytes podcast. It's been nearly, it'll be four years of PerfBytes coming up very, very soon. So we saw a complete void in doing any type of podcasting on the topic of performance. All the old forms would be, you know, writing a journal article, writing a book. You know, if you're old, like me, those are the way old people got information. Now it's YouTube, a podcast, a videocast.
Starting point is 00:03:18 And we've spawned several other podcasts to get started doing stuff, such as Pure Performance. So I'm really glad to support you guys in getting your perspective in a different voice or more voices around performance showing up in the podcast channel. Cool. Yeah. So do you feel that we're copying you and we're trying to steal your show? Actually, to take it seriously, Brian, I don't think so. I think you guys and different people from different disciplines or perspectives,
Starting point is 00:03:45 they work with different customers, and they solve different problems, both at different points in a software development process, or they see problems differently, and that's what brings, you know, the tools for Dynatrace to solve problems differently or help people differently than other types of profiling, other types of, you know, digital performance management solutions, looking at other types of load testing tools. I mean, why are there 40 different kinds of load testing tools? Right. You know, and a lot of, oh, that's just a copycat.
Starting point is 00:04:16 And then you get in there and it's like, you know, using it is slightly different. They solve the problem differently. Same thing goes with personalities, right? There's only one me, Brian, and there's only one you, and we all know there should really only be one Andy. And Andy's not introduced himself, but that's Andy Grabner over there. So, yeah, my colleague sitting next to me, friend of 12 years, and also starting to get a little gray hair.
Starting point is 00:04:37 I know. Because you also are a longtime performance tester, performance innovator, I'll say. I hope so. In all the different companies that you work for. And your co-host with Brian, obviously, Andy Grabner. Yeah. Actually, I started as a tester on a testing tool. Isn't that cool?
Starting point is 00:04:53 Yes. Starting as a tester on Silk Performer, one of the, from my perspective, really awesome load testing tools out there. Yeah. Still is. I still am friends with a lot of these folks over there. And actually, now I want to segue into the topic of today. Oh, no pun intended, the word segue. Yeah, it's amazing, right?
Starting point is 00:05:09 What? So basically, so last episode, we talked about the 101 of performance engineering, of performance testing. We did some basics, covered the basic ground. Today, we actually want to talk about some term definitions. What is load testing? What is performance testing?
Starting point is 00:05:26 What is stress testing? What is soak testing? What is configuration testing? And there's a lot of these definitions out there. And obviously, we at Dynatrace, we have a lot of the people that use Dynatrace in the load testing environment, even though we see more and more, obviously, using it in production. But I think this is still critical, very critical, especially also to you, Mark. We both have been performance and load testers over the years.
Starting point is 00:05:51 And so what I really want to answer today, or get into a discussion about: what the hell is the difference, and sorry that I use this word, what the hell is the difference between performance testing, load testing, soak testing, stress testing, whatever testing you have out there. Atomic testing. There's a lot of different terms out there. Atomic testing, is that like the nuclear device testing? Yeah, it could be, yeah, but it's a term that is floating around.
Starting point is 00:06:16 Okay. So in this podcast, we want to just share our opinions on what these terms are. And for those, before we get started, I just want to highlight for the podcast listeners, if you go to Wikipedia, and if you are Googling or searching, not Googling, Wikipedia for load and performance testing. Any contemporary engineer knows the Google search and Wikipedia. And of course, if you're ever giving a conference talk,
Starting point is 00:06:39 and you don't really know what you're going to give a talk on, go to Wikipedia and just put the definition up on the thing and just shoot the breeze with the audience for five, ten minutes. You can burn five, ten minutes just arguing about the Wikipedia article. Easy, easy, yeah. So what I did, when I searched, I found software performance testing as the topic on Wikipedia, and they actually list six different topics or definitions, terms. Load testing, stress testing, soak testing, spike testing, configuration testing, and isolation testing.
Starting point is 00:07:08 Interesting, right? Do you think so? I've heard of some of those. Go ahead, Brian. No, that's all I had to say. That's all? That's it? I didn't even hear what the hell you had to say.
Starting point is 00:07:18 I said I've heard of some of those. Yeah. Well, let's back up a little bit as to what brings a performance tester or performance engineer here, because I do a lot of mentoring and teaching, and people maybe come from an automation background, or they come from whatever flavor of testing school they start with. It could be old waterfall kind of stuff and they're learning, or they come from an automation background, people from development moving over to do testing. And so when you say you're on the load and performance team, and no, you're on the stress and performance team, or you're on the performance stress and load team, or you're on the soak spike load stress test performance team. And then you hear somebody say you're on the non-functional team, which can be very depressing. You know, if you're a sensitive, insecure person: wow, I'm non-functional. Wow. I must need some medication of some sort.
Starting point is 00:08:08 But I think the first time I witnessed people start asking questions about, like, what's the difference between this type of testing and that type of testing? They're in their second year, first or second year. It's one of the first kinds of questions that you get challenged with as a new performance tester. And sometimes it happens under fire from a lesser skilled person who thinks they know the difference. And it comes as a challenge to you, Andy, you newbie. Andy, oh, well, you're not doing load testing. You're doing stress testing.
Starting point is 00:08:44 And it becomes this challenging confrontation, and it catches a lot of people off guard, especially newcomers to the skills. And if I don't have a good answer, wow. And for some reason, it always sounds serious: wow, if I don't know the difference between these, I must not be an expert. I must not know what I'm doing. And I think part of that is on guys like me who don't explain it over and over again and how simple it is. Now, take what you pulled from Wikipedia, and say I'm a newbie. Let's say I go back 22 years, to when I had less gray hair.
Starting point is 00:09:19 I would look at the hierarchy. I think it's etymology, right, the actual history of that word. They actually say performance testing is the superclass. And if you then break it into subclasses, it helps a lot of people understand, oh, okay. So if the overall thing is traffic, then you've got different kinds of traffic: foot traffic, you've got two-wheeled vehicles, four-wheeled vehicles, you've got trucks, you've got different weights and sizes. So the uber thing would be called traffic, the traffic control system. Well, then you have the different types of traffic. Same thing with performance testing. The uber superclass would
Starting point is 00:09:56 be performance testing. And I think the subclasses break out into those subtleties of load, stress, and so on. So you get different types. And that's where I see people get confused. They almost see that there's a competition between load or stress because they've been challenged. Because, Andy, you don't know what you're doing. You're not doing stress testing. And that's really kind of the first place. I don't know what you guys think of that.
Starting point is 00:10:18 I always try to organize it that way to help people. I wonder, should we not change the question? Shouldn't we say: who cares about the definition, what do we want to achieve with these tests? Yeah, taking an outcomes-based view on the theory. So for instance, if we want to know, does our application actually respond in an expected way under X amount of users that execute this type of load, right? Then I can figure out how I can model my load test or performance test or load stress test so that I can figure out,
Starting point is 00:10:51 does my application actually handle the load well and still respond in the performance that I'm expecting, right? Like it loads in one second, in two seconds, if I have a thousand users concurrently on the system hitting my page with this set of transactions. Yeah. But here's one of the problems. You know how you can never try to answer the question
Starting point is 00:11:11 about a word by using the word in the question? So you can't say, what is load testing? Well, it's testing with load. Load, yeah. Okay, what do you mean by load? And then you're like, we're into this, and this is everywhere in the IT industry. We abuse language like crazy. So there's page load. Oh, well, that's retrieving
Starting point is 00:11:30 information and loading it into something. And then there's creating workloads. Well, what's a workload? And then there's load test, and then there's sustained load and a number of users. Well, I don't have users, I have an API. So it's the transactional load. And so we abuse these general kinds of things. And I like to fall back, not just to reform the question, because I think that question of what's the objective would be different for each type. So load has a different objective and question about it. What do you want to learn? But physics gives us some really good guidance here, because a load test is really from physics. You would build a bridge with civil engineering. Scott Barber, you know, he's trained as a civil engineer. Really? Wow. Yeah, actually in the Army, I think he
Starting point is 00:12:16 was a civil engineer. But if you can understand the simple physics of load on a structure, you do a load test, and that really is gravity and weight and calculation of, you know, materials and tensile strength and all these different things, you know, how stuff could collapse on itself and how much load a load-bearing wall in your house can take. To be honest, even though we do things with electrons and they seem very software abstract, whatever, electrons are physical things. There's load, electrical load. So that's one place, excuse me, that's one place you can fall back to and say, let's go talk about physics.
Starting point is 00:13:09 Well, half the people that are doing software testing, like they look, they hear physics and their eyes gloss over and they're like, I studied English and Russian literature and dialects and things. But I'm a tester. And you get these weird people from weird walks of life. That's what makes testing fun. So sometimes saying, let's go talk about hard sciences and physics, leaves some people out. They can't relate to that. I like the analogy of the bridge. So basically, if I get this correctly, and correct me if I'm wrong. I can't correct you if you're wrong. Well, you can, actually. And especially you can, it's very easy for you, because you're sitting, what is that, like a foot away from me?
Starting point is 00:13:32 So if I take the bridge example, you basically start with sending one guy over, and he doesn't die because the bridge doesn't collapse. Maybe two, maybe five, maybe ten. So that would basically be the equivalent of a load test, where you have an application and you start with hitting the application with one request, then with two, then with three, then with four, then concurrently, so not only sending them individually over, but actually two at a time, three at a time, four at a time, and basically figure out when your application starts to show some signs of weakness. For instance, well, the bridge obviously doesn't collapse, but the application might show an increase in response time. It takes longer for these people to get over the bridge.
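The "one guy over the bridge, then two, then five" idea can be sketched as a step-wise ramp. This is only a minimal illustration, not a real tool: `send_request` is a stand-in you would replace with an actual HTTP call, and the concurrency levels are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

def send_request():
    """Stand-in for a real HTTP call; here we just simulate
    some fixed network/server latency and time it."""
    start = time.perf_counter()
    time.sleep(0.01)  # pretend the server takes ~10 ms
    return time.perf_counter() - start

def ramp_load_test(levels=(1, 2, 5, 10), requests_per_level=20):
    """Increase concurrency step by step and record the average
    response time at each level, watching for it to climb."""
    results = {}
    for users in levels:
        with ThreadPoolExecutor(max_workers=users) as pool:
            timings = list(pool.map(lambda _: send_request(),
                                    range(requests_per_level)))
        results[users] = mean(timings)
    return results

if __name__ == "__main__":
    for users, avg in ramp_load_test().items():
        print(f"{users:>3} concurrent users -> avg {avg * 1000:.1f} ms")
```

In a real run you would watch where the average (and the percentiles) starts to rise; that knee is the bridge beginning to sag.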
Starting point is 00:14:14 Yeah. That would be something. So that would be kind of the equivalent of a load test. We are just putting more load on the bridge. But I'll interject here and ask, when does performance testing start, right? So I'm using performance testing in the general term, as Mark was talking about before. And when you talk about sending one guy over the bridge and then two guys and then three guys, that's kind of all in the realm of the testing team as they're ramping up and adding users and seeing what the effects are. But I would almost come in and then say, well, when is that supposed to start?
Starting point is 00:14:50 Does that start when it comes to the performance team? Or does that start in the development side where they capture performance metrics under a one user load, exercising their code with a single user to understand the characteristics of it to say, okay, this one transaction that we're going to run through requires five database connections or queries and 20 milliseconds of CPU utilization. And then we pass that on to the performance team
Starting point is 00:15:22 so that we can then understand and model how that transforms when you start adding load. So when does performance testing start? Yeah. Which is an epistemological question, Brian. You've taken us into a new realm philosophically. Because if a system fails a stress test in the forest and nobody was there to know that it failed, was it a stress test? Was it a failure? And that's the one thing, I think. Bear with me if I sound a little bit like Dr. Wayne Dyer. You know Wayne
Starting point is 00:15:56 Dyer, the guy, he's a social psychologist, and you know, he'll help you reach your full potential. I think my mother-in-law goes to see him. Yeah, no, Dr. Wayne Dyer. He always does these weird... He's an older gentleman with gray hair as well. But he talks a lot about intention, oddly. The one thing that I think relates here, Brian, is you have to intend to do something. Can you imagine an automation framework
Starting point is 00:16:19 that has no checkpoints on expected results? Would you call it a test or would you just call it automation? Well, that then would just be automation. Yeah. So you have to intend to measure something and maybe even implement the physical instantiation. Like in our civil engineering bridge analogy, you would actually put sensors on the bridge. You might even measure the bowing of the structure as it gives a little bit.
Starting point is 00:16:41 The bow, bowing, not the planes, but you know what I'm saying. So you actually have to intend, and then maybe go one step further than the intention, to actually want to measure something. Now, measuring a pass-fail test result in functional testing could be as easy as: does the button exist, is a property enabled, did we receive a 400 error or not? Something like that. But in performance, you're measuring first and foremost response times and volume.
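The point about intention and checkpoints can be put in a few lines of pseudo-test. Everything here is hypothetical: the `checkout` function stands in for a real system under test. The idea is just that a test asserts both an expected result (the functional checkpoint) and a response-time budget (the performance measurement); without the assertions it would only be automation.

```python
import time

def checkout(cart):
    """Hypothetical system under test; a real test would call the application."""
    return {"status": 200, "items": len(cart)}

def test_checkout(max_seconds=0.5):
    """A test needs checkpoints: a functional expected result,
    plus, for performance, an intended measurement of time."""
    start = time.perf_counter()
    response = checkout(["book"])
    elapsed = time.perf_counter() - start
    # Functional checkpoint: without an expected result, this is
    # just automation, not a test.
    assert response["status"] == 200
    # Performance checkpoint: we intended to measure response time.
    assert elapsed < max_seconds, f"too slow: {elapsed:.3f}s"
    return elapsed

print(f"checkout took {test_checkout() * 1e3:.3f} ms")
```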
Starting point is 00:17:12 And a lot of the differences between load testing and stress testing, which are all types of performance testing, is the slight variation in time measurement and the volume that you're measuring or simulating. So in your case, let's take the classic one we started with, Brian, between a load test and a stress test: a one-user rate over a one-hour test, could you call that a load test? Theoretically? Yes, but not under my own personal definitions. Not commonly, right? It's not commonly thought of, well, it's not a very big load. Right, it's a one-user load. It's a one-user load.
Starting point is 00:17:54 Mathematically, yes, it qualifies as, you know, we built a huge bridge that supports six lanes of freeways and thousands of tons of weight. The first test is the guy walks across the bridge, like you said. So technically, yes. Would you get much value out of that? Well, no, it's not very exciting. We kind of wanted to put a big cement truck across this thing and watch it crash and see all this stuff. That would be much more exciting. So I think when people really look at the difference between load and stress,
Starting point is 00:18:20 there's a slight difference, Andy, from what you said, between load being a bridge we want to make sure survives the two-hour rush hour window, and therefore it's slow vehicles and a lot of the maximum amount of weight on the bridge, for about two hours. It's a very steady rate of cars coming on and cars moving off. There's some standard deviation in the weight of each individual car or truck. But we're looking at a very steady state load over an interval of time. And that interval of time would provide confidence, reduction of risk, or proof that you could release the software. You could reach a revenue goal, you could reach a technical goal. And that load proving we can sustain 400 cars per minute across the bridge for two hours is a good definition of a steady state load test that has intention for measurement and
Starting point is 00:19:18 a very specific outcome, a measurement, and the result is attached to something meaningful. Oh, that's great. This new bridge will support twice as many cars as it used to. And we really need this, you know, because bridges are breaking all over the place. And so, this steady state load. So basically you put constant load on the application, as you said, simulating not the same car all the time or the same transaction, but really a good mixture of what you actually expect the application load to be in your real-life environment
Starting point is 00:19:48 because of different types of users. And you said it's a steady-state load. I remember that term from Silk Performer times. We used that steady-state load term. Steady-state, yeah. Yeah. So is that, though, the same as a soak test? Well, imagine a soak test is just a load test on the worst day of traffic, in the rush hour of
Starting point is 00:20:08 that week. Let's say it's a Thanksgiving weekend, people are going home just to see their families, traveling in the car, usually, here in the United States. Probably not happening in Europe, because you don't really have Thanksgiving like we do. But if there's an excuse to go drive to Munich and have beer, it's around the same November, October time frame, right? So anyway, sorry, I always start thinking about beer and Oktoberfest, even though we drink wine here, we're having a lovely one. Back to the subject: imagine a soak test is really just a much longer, extended version of a load test. So soak, or a sustained load. I see endurance test as another phrase for the same thing. So the profile or the workload, you can imagine, think about it again with physics: material
Starting point is 00:20:55 exhaustion. So if you've got a little bit of bow in the bridge... You've ever taken a paperclip and you bend it once and it stays together? You bend it again, yeah, it's a little weaker. You bend it the third or fourth time, it snaps. Yeah, right. So if you've got movement in materials, from a physics standpoint, eventually something might give. Maybe not after the two-hour load test, but in a 24-hour soak test something starts to sag. So that could be, for instance, when we come back to applications, maybe a slightly sneaky memory leak. Memory leaks, leaky handles, right, that are attaching onto objects and not letting them go. Or reserving and not releasing physical resources can also happen as well.
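One way to put intention behind a soak test is to trend a resource measurement over the run. A small sketch, with made-up memory samples standing in for what a monitoring tool would actually collect: a clearly positive least-squares slope across a long run is the paperclip slowly bending.

```python
def leak_slope(samples):
    """Least-squares slope of resource usage over sample index.
    A persistently positive slope across a soak run is the classic
    signature of a leak or ramp condition."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Simulated memory samples (MB), taken every few minutes of a soak test.
steady = [512, 514, 511, 513, 512, 514, 513, 512]   # healthy steady state
leaking = [512, 520, 531, 540, 551, 562, 570, 581]  # slow, sneaky leak

print(f"steady slope:  {leak_slope(steady):+.2f} MB/sample")
print(f"leaking slope: {leak_slope(leaking):+.2f} MB/sample")
```

The two-hour load test might never show the trend; the 24- or 72-hour soak test gives the slope enough samples to stand out from the noise.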
Starting point is 00:21:40 You can also find ramp conditions in the logical realm. So when we first start the load test and we think that workload is unique, we find that there's some counter, or the number of records, that slowly grows with every car. We're getting some metadata, because now everyone wants metadata attached to the analytics. So suddenly we start putting more features and functionality on each of the cars, quote unquote, or each of the transactions, and the transaction gets a little bigger and a little bigger and a little bigger. Now, in performance engineering, Dr. Connie Smith has a whole bunch of these; she and her research partner, I can't remember his name, I apologize, coined a lot of these anti-patterns.
Starting point is 00:22:18 So performance anti-patterns, and a ramp condition is a perfect example of that. I have a steady state load, and it could be a long-term soak test, and I expect the performance to stay steady. But what I witness is, over the first hour, it gets a little slower, gets a little slower, gets a little slower, and that would be what they called the anti-pattern of a ramp condition. The more work I do, the slower it gets. Let's say a missing index.
Starting point is 00:22:44 When the table is small, I can scan it quite quickly. As I do more work and add more records to a log table or something, that scan gets slower. Or it could also be another example. I like the log example. Obviously, that's a great one. Another one could be we have a growing product catalog because we simulate an e-commerce store,
Starting point is 00:23:03 and not only do we simulate the buyers, but also the producers that can put more data into the database. So the database is growing and growing, and that also slows things down, obviously, over time. And you can also see artificially induced ramp conditions if you have really bad automation practices. A good automation practice would be: I initiate a transaction from a clean start state and I walk through the steps, like even a shopping cart checkout. But maybe I leave an item in the cart, leave the item in the cart, and just keep running that script in production for synthetic monitoring for six months. Suddenly my synthetic monitoring script has been getting slower and slower and slower. And I've been escalating to management. The VPs are getting involved, we need to hire more people, come in, hey, you guys need to fix this.
Starting point is 00:23:48 And suddenly you go back: why are there 100, why are there 150,000 items in this cart? Well, of course it's going to take a huge long time to do that. So you can find that even bad practices or poor practices can lead to artificial ramp conditions. But a load test, the most essential definition for that, is that it's a constant load. And one other good tip around this load testing is that you want to minimize the variability in the test harness. So you really want to control for every parameter in the load itself that you're creating, which forces by nature most of the variability onto the system under test. So that's where we want to see stuff go haywire. If my test tool is all over the place, you know, and that can be, we can get in arguments about load generators and, you know, variable workloads and stuff. And those are totally acceptable in some test situations.
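The missing-index ramp condition mentioned a moment ago can be made concrete with SQLite (just an illustration; the table and data are invented). `EXPLAIN QUERY PLAN` shows the full-table scan that gets slower as the log table grows, and the index that removes the dependence on table size:

```python
import sqlite3

# In-memory stand-in for the growing log table from the discussion.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER, user TEXT, msg TEXT)")
conn.executemany("INSERT INTO log VALUES (?, ?, ?)",
                 [(i, f"user{i % 50}", "msg") for i in range(1000)])

def query_plan(sql):
    """Ask SQLite how it intends to execute the statement."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

lookup = "SELECT * FROM log WHERE user = 'user7'"

# Without an index, every lookup walks the whole table, so each
# iteration gets slower as the table grows during the run.
plan_before = query_plan(lookup)

conn.execute("CREATE INDEX idx_log_user ON log(user)")

# With the index, lookup cost no longer grows with the table.
plan_after = query_plan(lookup)

print("before:", plan_before)  # a full-table scan
print("after: ", plan_after)   # a search using idx_log_user
```

In a soak test the symptom is the same curve Mark describes: response time creeping up hour after hour while the workload stays constant.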
Starting point is 00:24:45 But if you're getting into some finer measurements, a nice steady-state load test, I want to see all the weird variability happening on the bridge. That's the real thing I'm interested in measuring. So basically, just if I get this right, and also for the audience, because I hope I got it right. So you're basically saying you try to make the load test and the harness stable, talking about think times that you probably don't randomize too much and things like that. Or payload sizes.
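A driver with fixed think times and fixed payload sizes might look like this sketch. The transaction callback and the constants are placeholders; the point is only that nothing in the harness is randomized, so any variability in the timings comes from the system under test:

```python
import hashlib
import time

THINK_TIME_S = 0.05      # fixed, not randomized, so the harness adds no noise
PAYLOAD = b"x" * 1024    # fixed payload size for every iteration

def run_steady_state(duration_s, transaction):
    """Drive a constant-pacing, constant-payload workload and
    record per-iteration response times."""
    timings = []
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        start = time.perf_counter()
        transaction(PAYLOAD)              # stand-in for the real request
        timings.append(time.perf_counter() - start)
        time.sleep(THINK_TIME_S)          # constant think time between steps
    return timings

# Hypothetical transaction: hashing the payload stands in for server work.
timings = run_steady_state(0.5, lambda p: hashlib.sha256(p).hexdigest())
print(f"{len(timings)} iterations, avg {sum(timings) / len(timings) * 1e6:.0f} µs")
```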
Starting point is 00:25:13 Yeah, that makes sense. Yeah, cool. Hey, Brian, I think you wanted to say something. In your experiences as well, the concept of the soak test seems to be the one that gets the least attention and/or the least support from anybody to be able to do. Because as you say, if you're looking at, say, your commerce shop, right, and your peak times of the day are between 11 a.m. and 1 p.m. while everyone's shopping on their lunch break. And even if you take into account your holiday peaks and all, and you're going to run that for the approximate amount of time that your traffic surge occurs, right?
Starting point is 00:26:00 And that's going to be your standard load test of: can we support what we think our maximum kind of traffic is, for the time period that it occurs? That's your load test. Now, the argument then would come from a performance engineer to say, well, let's run a soak test on that traffic level to see what happens to the system. And obviously, there's always the concept of deadlines and everything else. And then we got to ship it. We got to ship it. What is a good argument to say, okay, let's stick with the bridge analogy. We know this bow is going to last for the duration of the projected time and even maybe a little longer. But why, how can I convince the rest of the company to say, I need to run this test for 24 hours now, which is a time we probably will never have that kind of sustained traffic, unless there's some kind of an attack of
Starting point is 00:26:50 the bots or something. How does a tester really come into pushing that idea forward so that they can find the fatalities in the system? I have a brilliant purpose for it. First of all, every test result you create should have somebody that loves that test result. I love learning whether we passed or failed, or why, or what our capacity is. And when you think about putting a new release or a new integrated release, a new change into production, there are different people that get fired for different reasons within the first hour. Like if somebody really screws up the deployment or, you know, you don't know what you're doing right now. If the product kind of falls on its face within the first couple of hours,
Starting point is 00:27:40 you're probably going back into development or data or architecture, or somewhere in the engineering team. If it starts to fail over the course of three days, the biggest stakeholders I always find are your sort of keep-it-running, keep-the-lights-on support engineers and ops engineers. And that's why I try to get them as the business sponsor to say, hey, you know, we have never had a relationship, Andy, you're my ops guy. You know, I'm sure there are certain kinds of things we could be testing that would make you sleep better at night, be a little more comfortable with this stuff. What bugs you the most? And they're like, well, to be honest, I like to go fishing and I like to leave on Friday night and not get a pager or a text message or an alert in the middle of the
Starting point is 00:28:26 weekend. So if you can do a soak test for 72 hours, that means I get my weekends back and I'm willing to give you budget. I'm willing to give you support. I'm willing to yell at the VP of development. So this has to happen and I'll give you what you need. So find that business stakeholder who would really benefit from a 72-hour soak test. And, Brian, you might not do it every cycle. You might say, well, we do it once a quarter. We pick a weekend that works, and we actually fire it up and let it run all weekend or something like that. Yeah, basically you talk about you're finding the pain point.
Starting point is 00:28:56 Obviously it's a pain for the ops guys to be on call. It's a real cost for them. Yeah, it's a cost, and it's also a personal pain, obviously, right? That's the big thing. So make it personal. And I think that kind of comes back a little bit to what we were talking about last week, Andy, when we were discussing that a performance engineer's job these days is to really establish those relationships with different people within the organization and to build those allies and just bring all the teams together under the flag of performance.
Starting point is 00:29:28 So in this case, finding those ops teams, because ops teams will be, in my experience at least, one of your biggest cheerleaders, right? In my experience as a performance engineer, it was always the developers didn't want you to find any bugs that they would have to fix, understandably. Project management and product management always wanted to ship it and not delay the release dates because that would look bad on them. But operations were always the one who said, if this goes out and it's bad, we're going to be the ones to suffer. So I think the operations team is definitely one for the performance teams to really reach out to and form those bonds and relationships with because they're going to give you that other side of the perspective to as you
Starting point is 00:30:11 say, justify those tests, and they'll go to bat for you. Brian, that's the thing. I mean, they'll get in there and be like, hey, uh, Brian, I invited him to the meeting, and one of the things we're requesting, I know it's going to cost you guys some money, but I think it's worth doing. You remember those six weekends last month and the month before where all of us were in here on Saturday because so-and-so screwed up? Carl screwed up that database query, and then Carl went on vacation. And it's always Carl. God. So these kinds of things, they can actually, if they're feeling that pain, you just say, I think there's something we can do to help. And just let them fall in love with you naturally.
Starting point is 00:30:48 I think as we talk about load testing and soak testing, they're very similar, consistent workloads. And they have very similar audiences post-release: in ops, support engineers, for that first 24 hours or 72 hours. What do you think is a different audience for a stress test? Who's the audience for a stress test? Well, I guess the audience for a stress test is probably maybe the business people. Maybe I'm wrong. Maybe I'm right. I would also think the architects maybe. Yeah, I was going to go with architects. I would actually, I mean, let me just finish my thought, okay? And before you actually, you know, give me- You're already outnumbered.
Starting point is 00:31:26 Of course, I know, but still I want to keep talking. So for the stress test, I believe, if business wants to promote a certain feature of the app, and of course we always relate to e-commerce. It's a great example. And they want to send out these cool new ideas and marketing campaigns. They have a good interest that the application doesn't crash and doesn't get too slow under an expected,
Starting point is 00:31:51 hopefully positive, return on investment on a campaign. So that means it should not only be able to handle a certain spike for a single email event, but maybe if they get really popular and Cyber Monday and Black Friday come up, then you definitely want to make sure that the application stays alive under a lot of stress. Do they need to know the specific CPU measurement at the point of breaking? They don't. The only thing they care about, obviously, is: is the app up and running,
Starting point is 00:32:19 and not only that, but is it performing so that we don't lose too many people. We have some numbers there. So, Brian, Andy, I think, called out three other types of performance tests that, in my opinion, have nothing to do with stress. Okay. Only because- Go ahead, Brian. Well, you were talking about stress, right? And I think he was talking really kind of going into the spike area there.
Starting point is 00:32:45 And in a way, you can almost say what is the— there's a little bit of a fine line between a spike test and a stress test. Plenty of differences, but I think for especially someone who's earlier in their career, it could be a little bit shadowy. What's the third one you were thinking of there, Mark? Or was I totally off on the spike? No, no, no. So one of the things that I see, first of all, if you go back to Andy's point,
Starting point is 00:33:16 there are obviously great supporters of doing performance testing in the business world, our POs and our business analysts and executives. But I find that when they talk about investing in a campaign, they talk in big round numbers. How many salespeople should we hire for this next season to ramp up? And when they ramp up, you'll get big round numbers at first, and it may come up here, we're doubling the size of this, the internal sales agents for the call center, or we're going to double or triple the throughput target audience, the target market. And they often speak in 1x, 2x, 3x types of numbers. We need to acquire our
Starting point is 00:34:00 next biggest competitor, and we're both about the same size. So if my systems can scale to 2x, I'm going to buy them, get rid of all their software and run their business on my software, because my software can scale to 2x. And a lot of guys in the corner office, you know, forget about math, and they trust the IT department to do all the hard work. And so when I show up and I hear business initiatives, I typically think of a scale test, which is 1x, 2x, 3x. And it's much easier for those audiences to understand because it's very simple math. Oh, Andy, you're telling me the system can handle twice the throughput. Awesome. Yeah. And that's great. That sounds like what I just told the chief marketing officer. We're going to double the campaign because I think we can, we've got to, you know, we're going to go for it. So you're speaking in the right language
Starting point is 00:34:47 with a way that they can hear you. Whereas a stress test, if I think about physics a little bit, is I'm just going to keep putting cars on the bridge until I measure its breaking point. That means I have to break it. And no business person wants to hear you tell them, well, let me tell you the breaking point. They're like, oh, my God, you remember the outage we had? Oh, I got in so much trouble. We tried to do that campaign last Black Friday, and there was the outage. The minute you even open the can of worms of some failure from IT, you're going to lose them. So with these 2x, 3x, 4x, 5x tests, in the business realm, we're talking about growth, positive energy, change, power.
Starting point is 00:35:31 These are dynamic parts of language we can use to find friends and support and go after those people. Those are the business requirements we can go after. If I'm in the technical department with the architects, architects are like, okay, we came up with a new platform and we built a new road. I'm really scared. I don't really know; we've never measured where the bridge is just absolutely going to break. And I'm, I know we've done some load and we've done some stuff, but what if the marketing department is right? And what if they don't call me about what's coming down next month? I want you to set up all the calibrations and all the measurements, and I want you to just keep putting cars on there until you measure the breaking point.
Starting point is 00:36:16 And it's, to me, a stress test is a much more technical audience than even ops or business. Like I say, it's architects that are trying to build a platform for the long term and make sure that we build the best road for us to develop on. And it'll hold up under this crazy thing and that crazy thing and this amount of load. And there's also another great thing about stress tests technically, and you can kind of get the ops guys on board with you: we can tell you which resource or which configuration is the most limited. So if we start seeing the numbers go into the red, about to break, the first thing they should say is, okay, we shouldn't throw web servers at the problem.
Starting point is 00:36:52 We need to throw database buffer memory at the problem because we've already been here before. And here are my hot tickets to do right away. We're going into the red. Boom, more app servers. Boom, more network bandwidth. So you're actually giving advisement to ops on what the weakest parts of the architecture are. So basically, that also brings me to, well, thanks to both of you for correcting me or giving me more ideas. Different idea, yeah.
Starting point is 00:37:19 Different ideas, yeah. That's what this is all about. And I see your glass of wine is actually empty, and it's too far for me to reach for the bottle. Andy, before you go on, I just wanted to try to clarify something on Mark's side. So, Mark, couldn't you say that the stress test and the scale test can be the same actual test execution, but just reported in different ways for business and for non-business? Well, yes, you can say that until you start to be attacked by people who don't like your results. Well, I mean, you would just say, hey, to business, we can handle up to 10x traffic.
Starting point is 00:37:50 And then to operations, you can say we break once we go over 10x traffic or something like that. And then the development architect comes up to you and says, that's not a valid test. And even though they're not a performance testing guru, and maybe you're a new performance tester, you're like, well, it seemed valid. It's, sort of, a stress test. So here's my advice on that good question: if I do a 1x test and it's more like a stair step, I'm getting like three levels of a load test.
Starting point is 00:38:19 It actually steadies out. I ramp up and I hold it there for 20, 30 minutes or maybe even an hour in some cases. And I can carve off the ramp up and the rest of the test and get a nice steady state, good, clean measurement of the 1x interval load. Maybe it's the best 15 minutes of that interval. But I let the system steady out there because if a technical architect doesn't like what I wrote, developer starts to complain. Also filling caches and all that stuff. Let it warm in, settle in, do its kind of thing at 1x load,
Starting point is 00:38:54 and then take your measurement and do your analytics and the analysis and all the aggregates on that. I find, Brian, that helps just bolster the credibility of the test. You could say here was a 1x phase and all of the things that people would argue with you about would say, no, it's like a 1x load test before they went to the 2x load test. But it's nice if it's one run. You can show the whole result set and say, look how great you guys are doing at 1x. And even ramping up to 2x was looking great. And we went for 20 minutes through the 2x phase.
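A rough sketch of the stair-step profile Mark describes, ramping to 1x, holding, then 2x, then 3x, and carving the warm-up out of the measurement window; the durations, rates, and function names below are illustrative, not prescriptive:

```python
# Sketch of a stair-step load profile: ramp to 1x, hold, ramp to 2x,
# hold, ramp to 3x. Durations and levels are made-up example values.

RAMP_MIN = 5    # minutes spent ramping between levels
HOLD_MIN = 30   # minutes held steady at each level

def stair_step_profile(base_tps, multipliers=(1, 2, 3)):
    """Return (minute, target_tps) pairs for a stair-step load test."""
    minute = 0
    prev = 0
    schedule = []
    for m in multipliers:
        target = base_tps * m
        for i in range(RAMP_MIN):          # linear ramp to the next level
            frac = (i + 1) / RAMP_MIN
            schedule.append((minute, prev + (target - prev) * frac))
            minute += 1
        for _ in range(HOLD_MIN):          # steady-state hold
            schedule.append((minute, target))
            minute += 1
        prev = target
    return schedule

def steady_window(schedule, target_tps, skip_min=10):
    """Keep only the minutes held at target_tps, discarding the first
    skip_min so caches are warm and the measurement is clean."""
    hold = [t for t, tps in schedule if tps == target_tps]
    return hold[skip_min:]

profile = stair_step_profile(base_tps=100)
window_1x = steady_window(profile, 100)
print(len(profile), len(window_1x))  # 105 total minutes, 21 measured at 1x
```

The point of `steady_window` is exactly what Mark says: let the system warm in and settle at each level, then take the measurement and the aggregates only from the clean steady-state interval.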
Starting point is 00:39:26 We started seeing something weird at the end of the 2x phase. And then right when we started to go up to 3x, boom, things were terrible. And here are the limitations. We ran out of heap memory here. GC went through the roof. So that's basically what I wanted to use as the segue, unless Brian wants to add more to that. But my point now is, so I like that, that staged approach. The question is, what metrics do you need to look at to actually make these conclusions
Starting point is 00:39:50 and tell them, hey, it is the database. It is, we need more app servers. We need more memory. Basically, you look at key architectural and resource consumption metrics, how healthy your components are. And I think, I mean, one of the things that I like a lot, and I think we talked about this from a Dynatrace perspective, while running a load test I like the layer breakdown, because the layer breakdown is a great chart that shows me which layer of my application is actually consuming, or contributing, how much time to the overall response time. And then you can look at it in the
Starting point is 00:40:21 staged approach, and then you see, oh, in 1x load my Hibernate layer, my cache layer, is perfect, in 2x as well, but in 3x, boom, it goes up like crazy. So this must be something in that area. Or it could be the database goes up like crazy because we have a problem, it starts blocking. Yeah, for instance, a connection pool is a problem because you're running out of connections in the pool and stuff like that. Yeah, and isn't that the big challenge as well that everyone has? What should I monitor? Well, then they monitor all 15,000 things and then they look for the anomalies and stuff.
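The layer-breakdown reasoning in this exchange can be sketched as plain arithmetic over per-layer timings at each load stage; the layers and numbers below are invented for illustration and are not Dynatrace output:

```python
# A rough sketch of the "layer breakdown" idea: attribute response time
# to architectural layers at each load stage and see which layer's
# contribution explodes. All figures are made up for illustration.

timings_ms = {         # avg time per request, by layer, at each stage
    "1x": {"web": 20, "services": 60, "hibernate": 30, "database": 40},
    "2x": {"web": 22, "services": 65, "hibernate": 35, "database": 48},
    "3x": {"web": 25, "services": 70, "hibernate": 240, "database": 55},
}

def layer_shares(stage):
    """Percentage of total response time contributed by each layer."""
    total = sum(stage.values())
    return {layer: round(100 * ms / total, 1) for layer, ms in stage.items()}

def suspect_layer(baseline, loaded, factor=3.0):
    """Layers whose absolute time grew more than `factor` vs baseline."""
    return [l for l in loaded if loaded[l] > factor * baseline[l]]

print(layer_shares(timings_ms["1x"]))
print(suspect_layer(timings_ms["1x"], timings_ms["3x"]))  # ['hibernate']
```

This is the "1x fine, 2x fine, 3x boom" pattern from the conversation: the Hibernate layer's time jumps far out of proportion at 3x, so that is the area to drill into first.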
Starting point is 00:40:54 Yeah, in any of the profiling tools, particularly Dynatrace Layer Breakdown, that's where I start with people who aren't familiar with profiling, aren't familiar with the architecture. It starts the conversation in my wheelhouse about where is time spent? And then the conversation seems to drill down to, let me drill down, show you the method, let me show the hotspot, let me show you the database query, and that kind of stuff, which is cool. But each test type, we could talk about the spike test, which is different than a stress test.
Starting point is 00:41:22 I like to do a stress test. So I know, let's say, it breaks at 1x, 2x, 3x, 4x, 5x. In a steady stress test, I don't have intervals. I don't have stair steps. It just keeps growing and growing and growing and growing. But I do it in a very uniform way, one extra transaction per second, every, you know, every 10 seconds I go up a little bit. So that is, again, the test tool harness, a very consistent ramp up, because I want the variability and the weirdness to come out of the system under test. And then that same kind of monitoring applies. So let's say I know I can get to 8x, which is, wow, that's great. This is a really scalable system.
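Mark's uniform stress ramp, a fixed increment on a fixed interval until something breaks, can be sketched against a toy system under test; everything here (the fake SUT, its capacity, the response-time SLO) is hypothetical:

```python
# Sketch of the steady stress ramp described above: no stair steps,
# just a uniform increment until the system under test breaks. The
# fake SUT below degrades sharply past an arbitrary capacity knee.

def fake_system(tps, capacity=800):
    """Toy SUT: avg response time in ms; melts down past capacity."""
    if tps <= capacity:
        return 100 + tps * 0.05          # gentle, linear degradation
    return 100 + (tps - capacity) ** 2   # past the knee: meltdown

def stress_ramp(base_tps=100, step_tps=1, slo_ms=500):
    """Ramp uniformly until the response-time SLO is violated.
    Returns the last multiple of base load that still met the SLO."""
    tps = base_tps
    while fake_system(tps) <= slo_ms:
        tps += step_tps
    return (tps - step_tps) / base_tps

print(stress_ramp())  # 8.2, i.e. it breaks just past 8x the base load
```

The uniform step is the point: the load harness stays perfectly consistent, so any variability or weirdness in the results comes from the system under test, not the test itself.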
Starting point is 00:42:01 It breaks at 8X. And somebody comes along and says, we have a marketing initiative and we're going to do a spike. And I do a spike test and my spike goes up to just maybe 7x, just shy of what I know the real technical breaking point is. I can report and say, here's the latest spike test. Here's 1x, every day keeping the lights on, making money, running the business load. And then within 30 seconds, we go to 7X. And if you go back to the marketing department that is responsible for most spikes, for good reason, right?
Starting point is 00:42:35 We want to pimp the Twitter feed. Boom, go buy this stuff. Selling tickets. Part of the benefit of going back to them with that very successful run, because we know it'll support 8x. So you do a 7x run. It's a successful spike test. And you go back to them and say, are you guys going to do a campaign with seven times the throughput or seven times the customer base or seven times the campaign? And they're like, oh, no, really, it's probably two or three.
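The spike shape being discussed, everyday 1x load with a jump to 7x within about 30 seconds, held and then released, might look like this as a schedule of per-second targets; all durations and rates are illustrative:

```python
# Sketch of a spike-test profile: steady everyday load, a ~30-second
# surge to 7x (just shy of a known 8x breaking point), a hold, then
# back to normal. All durations are example values in seconds.

def spike_profile(base_tps, spike_mult=7, baseline_s=300,
                  surge_s=30, hold_s=600):
    """Return a list of per-second target TPS values."""
    peak = base_tps * spike_mult
    profile = [base_tps] * baseline_s                 # everyday load
    for i in range(1, surge_s + 1):                   # 30-second surge
        profile.append(base_tps + (peak - base_tps) * i / surge_s)
    profile += [peak] * hold_s                        # hold the spike
    profile += [base_tps] * baseline_s                # back to normal
    return profile

p = spike_profile(base_tps=100)
print(max(p), len(p))  # peaks at 700 TPS over a 1230-second scenario
```

The abrupt surge is what separates this from the stress ramp above: a stress test grows uniformly until it breaks, while a spike test jumps almost instantly to a target just below the known breaking point and proves the system survives it.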
Starting point is 00:43:03 Oh, we're done. Test is done. Great. There're done. Test is done. Great. There you go. We're good. And then they go back and they're like, oh, my gosh, if we're good to 7X, I can get Andy. He's going to give me more budget. Maybe we make this.
Starting point is 00:43:16 Let's go big with this campaign. Systems can handle it. Let's go to 4X. Next thing you know, dude, you're opening up the pathway for us to really pimp the business, and now you're best friends. So basically what you, and I hear this multiple times now in this talk, the positive attitude, basically transporting the positive message: we can even do more. Dude, if you have a system sitting there and it can do amazing things, dude, show it off.
Starting point is 00:43:41 And it's also not just the positive attitude. There's the concept of how do you sell yourself and your results to the other departments to make yourself more valuable? So that when marketing is hearing you come in and say this, they're like, oh, wow, we got to talk to that guy more so that we know what we can do on the next time and further down the line. And again, it's it's ingratiating yourself within the rest of the business. I mean, back when I was testing, I was used to being always the bearer of bad news, and I somewhat enjoyed it. One of my favorite things when we had the product managers all hot and heavy,
Starting point is 00:44:19 we signed a contract, we're going to do this marketing thing, and we have to get this new feature in the product. And it's this humongous feature that's going to take development three months to write. But we signed the contract, and it's got to be out next month. I always had a big smile on my face when I said this thing is going to fail and just completely die if we release how it is. There you go. Because it's just their audacity. But, you know, looking back, it was, you know, sure, that brought me pleasure. But then it was like, OK, here comes Brian with the performance report.
Starting point is 00:44:53 What are we in for now? You know, and I was always like, oh, brother, we don't want to hear this news. As opposed to someone being like, OK, cool. Now let's hear what Brian's got to say. Because we want to, you know, have something that we can work with and be actionable with. Yeah, totally. And something that they, please do that again. We love that.
Starting point is 00:45:11 Give me that number. I love that number because it tells me I can go bigger. I can go do more things. I can acquire a competitor. I can go for it. On the other side, though, so I know I like the positive attitudes, but just a little bit of caution. So let's assume you are the bearer of the bad news. You can also say, hey, maybe we should think about not sending out the emails to a million users now.
Starting point is 00:45:33 Let's do it staged. Maybe staged to load in a production environment, but convincing marketing it might be better to do it staged. And then actually look at the results of the first 10% of people that came in and then maybe tweak it a little bit. Yeah. So that's also twisting it into a positive thing, even though we know we're not able to handle all the load that they want. So we negotiate with them and tell them it might be better anyway to stage it up. Yeah.
Starting point is 00:45:58 And honestly, as a point, you say, you know, Carl, you really want to go to 4X, and I just, you know, I'd say you guys would be good to go to 3X. I mean, you make plenty of money at 3X, and you don't want to overextend and fail, you know, with any campaign from a logical business perspective. You know, technically, we're going to be your root cause. So if you really think the business wants to go to 4X, you want me to spend some time working up what it'll take capacity-wise for us to get there. Yeah, good, Carl. So, Carl, do me a favor and go talk to Mary, the head of development. Kind of let her know you guys got to kind of get together and do a few tweaks to get us to 4X.
Starting point is 00:46:36 And I'm like, thanks, Carl. You're my buddy. So, hey, Brian, I know we've been, I think, talking about 40 minutes now. And I think the initial intent was to talk a little bit about the different definitions of load testing. We talked about load. We talked about spike. We talked about soak. Stress.
Starting point is 00:46:52 Scale. Scale. We know there's a Wikipedia definition as well for the whole thing. But hopefully we gave our listeners a few more ideas on what we think. Yeah, and why. And why. Yeah. And, I mean, Brian, anything else that we should discuss for this episode?
Starting point is 00:47:09 Any last words, maybe? Closing thoughts. Closing thoughts, yeah. Oh, you're putting me on the spot. Why don't you come back to me? Closing thoughts. You know, everyone is going to have a lot of different names for tests, especially when you're starting out.
Starting point is 00:47:25 You know, I can go back some 15 years ago or so, when I started out and would walk in, and as a very beginner performance tester, I was considered the expert in the place. But one or two people had always heard of different kinds of tests. And really, beyond what you name it, understand the concept. What is your goal? What is the end result you're trying to achieve with the test? And yes, there is some consideration to what you actually call that test, because I think, you know, missing in the performance world is some uniformity of terms.
Starting point is 00:48:12 But I think the more important thing is to understand the concepts and to understand when and why you need to run a specific test. I remember from my physics classes in high school, our professor would always put the formulas up on the board. Right. He said the formulas aren't going to do you any good if you don't understand the formula and when and where to use it. So the most important thing, especially when you're starting out, is to really understand those concepts, to talk to the other team members, whether they're developers, architects, marketing, product management, operations, and find out what their needs and concerns are and really figure out a way that you can service those needs. And at that point, if you understand the concepts of what the different tests are, you can call it a dog test for all I care. It's not going to be. A beer test. Yeah, the beer test.
Starting point is 00:49:04 And I guess the beer test would be more of a soak test. Yeah, it could be a soak test. I think that would be the most fun soak test, 72 hours. Yeah. Yeah, I like that. So I guess that's my final thought, really. Yeah, so Brian, actually, we put you under stress, and you performed pretty well under stress. You stressed pretty good.
Starting point is 00:49:21 It was actually really good. You really get to a breaking point. We could have kept going. Yeah, yeah, it's amazing. So I want to give the last word to Mark. It was actually really good. We didn't get to a breaking point. We could have kept going. Yeah, it's amazing. So I want to give the last word to Mark. That's why I want to say my part. I just agree with what you said.
Starting point is 00:49:34 I think what also, and I learned in the podcast, you know, it's always good. Did you? I learned a little bit. Isn't that nice? I think so. And that's the beauty of it. So don't believe you're the smart ass and you know everything, but listen to others. Hopefully listen to the podcast. And I learned a lot from both of you. Uh, mostly sitting, mostly, well, because I've
Starting point is 00:49:51 been spending more time with Mark, probably, over the last couple of years. So that's why I'm very happy, actually, Mark, that you joined us today for the podcast and that you shared your thoughts and experiences. And I've been doing load testing for half of my life. That's so sad. It is sad. Actually, it's a little less fun. Well, yeah. Well, I think it was not as bad as in your life because my hair is not yet as white as yours. It's coming. It's coming, of course. But thanks for all the lessons and thanks for inspiring me and others to learn more and what we can do, and really do the load test or the test that you have to do
Starting point is 00:50:27 and not just do it because you read it somewhere and you have to do it because the book tells you. Just do it because you do it for the good, meaning you're trying to figure out if your app can sustain the load,
Starting point is 00:50:38 it can perform under stress, it can sustain the spike load. Sit down with the people and figure out what they need to know in order for them to be more confident to release the software. That's what it is about, right? Yeah.
Starting point is 00:50:48 Help your brothers and sisters in different departments who are like, God, if I knew that information on a regular basis, I would do so much better. So for me, well, thank you, Andy, of course, for those kind, kind words. But I think, Brian, I go back to what you said, which is very important. I mean, lots of people have lots of different words and different language. And we have translation, right? We're a very diverse language and culture sector in the world. Technology generally is highly diverse compared to other sectors.
Starting point is 00:51:25 I think no matter what, like you say, you want to call it a dog test or whatever. The other thing that's really important to keep in mind is that whatever company you come into, if they've got it backwards in your mind, I would rather go with it being backwards. They call a load test a stress test, and they call a stress test a load test. OK, draw a picture and then just invert the picture. And as long as everyone sort of gets on the algorithm or the formula, as you said, Brian. Yeah, this is a great podcast topic for you guys to go into next, especially with the Dynatrace stuff: to figure out what kinds of metrics, what kinds of tools or views or components or measurements fit with each type of test. So even if you call it a stress test, I know I'm going to be measuring CPU and memory, and especially network throughput, and logical entities like thread pools and connection pools and those things that are abstract. If I'm doing a long-haul soak test, you bet I'm looking at memory because of those leaks and things we'll look at. So I think that's the next step in evolution after, you know, picking up and learning the savvy ways of different types of performance tests is how do I
Starting point is 00:52:46 match the tooling I have and the measurements I'm taking to actually do the right kind of measurements for that type of test. And I think that's episode three for you guys. It looks like it, yeah. Yeah, we should definitely. So, Brian, I think I took notes. Great to have some good ideas. Mark is giving us
Starting point is 00:53:05 a challenge yeah well he's giving us more work too that's right that's my job yeah of course inspiring you to do more work
Starting point is 00:53:11 yeah it's good no okay well I I want to say thanks to all of you and I think that concludes
Starting point is 00:53:20 Brian right I mean actually you should do the the conclusion here well you know on the PerfBytes podcast we give all sorts of shout outs and thank you, Brian, for all of your hard work and whatever. I'd like to thank my wife for supporting me and doing all the podcasting. Thanks.
Starting point is 00:53:33 Thanks for everyone who's listening, taking an entire time out of their day. We don't yet know if anybody's listening. Right. You bet they're going to be listening. I'd like to thank Mark for being to actually thank mark for uh being one of my uh early inspirations and mentors i don't know if you knew you were mentoring me back then in uh north carolina um while we played uh i believe halo while we were running a test yes yeah but back then it was uh i had one person who trained me on load runner and then i was on my own and
Starting point is 00:54:03 you were the second person who actually knew something about it and just gave me a couple of pointers there and a couple little bits of advice, and that really went a long way. So I would like to thank you for also being a guest on the show and for running PerfBytes and just being an awesome performance advocate. You betcha. And actually, thanks for jogging my memory. I don't know.
Starting point is 00:54:22 I was on a lot of prescription drugs back then, And it's kind of blurry, to be honest. A lot of Microsoft guys, that's how we got through it. Yeah. And thank you, Andy. Thank you. Right on. For being Andy. Thank you.
Starting point is 00:54:37 And then you say something exciting like, listen to it and we'll see you guys on the next episode. Of course. Hey, listeners. Here we go. Well, of course, the three listeners that we have or the 300 or 3,000, we don't know. By now, it could be 3,000. By now, it could be 3,000, yeah. We thank you.
Starting point is 00:54:52 We thank you all, the listeners. The listeners, to stay with us and actually to make this possible. Because without you, with your contribution. You're just wasting your time. We're just wasting our time. Blabbering. And so we know that we're going to talk about metrics. Some other topics that we had in mind is performance testing and continuous integration and continuous delivery.
Starting point is 00:55:09 That's interesting, right? Oh, that's not a can of worms at all. Yeah, not at all. And then obviously with Dynatrace, we have some technology stacks that we are very good in analyzing. It's Java. It's.NET. It's PHP. It's Node.js.
Starting point is 00:55:23 So we'll have some episode plans on what are the top key things for Java applications,.NET, Node.js, and everything. So all this stuff. What's the URL for the podcast and, like, events? Because one of the things you might want also is, you know, Andy, people want to come see Brian. Of course. So I think you will find most of the information on dynatrace.com. So we have not, you know, the interesting thing is we're still recording all of this. We have not put it up yet.
Starting point is 00:55:49 Okay, shit, don't tell anyone. So I'll just tag it on a voiceover with a different, like it'll be like, oh, you can find it on, you know, some other sort of, you don't like it. So go to dynatrace.com. Search for probably events. Go to dynatrace.com. We have an events page. And just, you know, by now, Google or use your favorite search engine. Yeah.
Starting point is 00:56:11 And then search. Bing your way through the world. Bing your way. Google through it. Yahoo through it. Whatever you want to do. Yahoo. Alta Vista.
Starting point is 00:56:18 Alta Vista. Baby. Really? Are you still around? Alta Vista, baby. I'll be back. Exactly. So Pure Performance is going to be. you should find us on Pure Performance.
Starting point is 00:56:28 And maybe we get a shout out once we are out there from Proofbites as well. Yeah. Some of your listeners over. What are you doing? Great. Go for it. Welcome to the testing podcast world. We send you our three.
Starting point is 00:56:41 You send us your 3,000. That's a fair trade, I would say. We're a lot more than that. I know, I know. Awesome. Thank you, guys. Thank you.
