PurePerformance - 003 Performance Testing in Continuous Delivery / DevOps
Episode Date: May 24, 2016
How can you performance test an application when you get a new build with every code check-in? Is performance testing as we know it still relevant in a DevOps world, or do we just monitor performance in production and fix things as we see problems come up? Continuous Delivery talks about breaking an application into smaller components that can be tested in isolation but also deployed independently. Performance testing is more relevant than ever in a world where we deploy more frequently; however, the approach to executing these tests has to change. Instead of executing hour-long performance tests on the whole application, we also need to break these tests down into smaller units. These tests need to be executed automatically with every build, providing fast feedback on whether a code change is potentially jeopardizing performance and scalability. Listen to this podcast to get some new insights and ideas on how to integrate your performance tests into your Continuous Delivery process. We discuss tips & tricks we have seen from engineering teams that made the transition to a more “agile/devopsy” way of executing tests.
Transcript
It's time for Pure Performance!
Get your stopwatches ready, it's an exciting episode of Pure Performance.
I am Brian Wilson and with me today as always is my co-host Andy Grabner.
Say hi Andy.
Hi, how do you know it's going to be exciting?
You announce it and you don't even know what's coming up.
Well I know it's going to be exciting because to me, we have a very exciting topic for today.
One that kind of makes my brain itch every time I think about it, and I think it's something
that will make everyone else's brains itch quite a lot.
Today we're going to be talking about performance testing in the continuous delivery world and
what that looks like, how we go about it,
and what does that mean
for the traditional performance tester?
That's good.
Shall we maybe kick it off a little bit
in the beginning to explain some of the terminology?
Like what is CI, what is CD,
and also the whole thing around DevOps,
which I'm sure is a little topic
that people are hearing and hopefully not only hearing, but also by now, know what it actually
means, or maybe not knowing what it means. Because I actually remember when I got on stage,
and I always keep referring to this episode of my life, two years ago, I got on stage at the
DevOps meetup in Boston. And before I could even open up my mouth, people were pointing fingers at me and saying, hey, because of people like you, and they were not only meaning me in person,
but because of the companies, whether it's a Dynatrace or our competitors or the IBMs and
the HPs of the world, nobody knows what this whole DevOps is about anymore. And what does
it have to do with CI and CD? And because back then, I think we made the mistake in trying to
get DevOps on every single piece of collateral that we created just to get attention. Right. And you keep on saying CI and
CD and let's take a step back and say, CI means continuous integration and CD means continuous
delivery. And I guess the next question is, what do all those even mean?
Right. So a lot of people have probably heard of these terms.
They're quite the buzzwords, kind of like Agile was several years ago.
Everyone in almost every corporation is saying, yes, we need to be there.
We need to be CI.
We need to be CD.
But it's usually slow going and people don't really always completely understand what it
entails and what kind of commitment they're making and what that means for the organization.
Specifically, as we're going to be talking a lot more about today, for the performance teams. But continuous integration, I can just give my very simple, light version of what it is and let you expand further. Continuous integration is kind of like the continual flow of writing code, checking it in,
having it run through some iterative tests and proceeding up to the next chain into the testing world,
into the pre-production world, and eventually out into production.
But that also mixes perfectly together with the idea of continuous delivery,
where the whole idea is to always be pushing code out. Right. And you may have heard of continuous delivery in examples like a Google or Amazon or Facebook, where they're releasing code, what, like up to every 10 minutes or two minutes even, into production, some ridiculous kind of thing, which I think for most organizations is kind of not realistic nor completely necessary.
But the idea is just continually pushing out small bits of code through your chain so that you're never making one large release that's going to destroy anything.
But you also just get everything out there as quick as possible.
And you can quickly take it out if it doesn't work out.
Well, I think you hit it pretty well here.
I think continuous integration, the initial idea was if you have a lot of developers or a bunch of developers working on the same code base, then I check in some code.
And I want to, first of all, make sure I didn't break anything.
And so if I check in code, the first step that happens in CI is actually figuring out,
does everything compile?
Do all the unit tests run?
Hopefully I didn't break any of the key features of the code. And really making sure that the different code pieces that we're working on,
especially if you're talking about a large application,
then I may make a change in module A,
and how does that impact module B that I'm integrating with?
That's also why it's CI: continuously integrating your code changes and then also running automated unit tests, but also integration tests, to really verify, is the system still working based on what the system should do?
And that's why I believe testing is so critical to the whole thing.
And the whole mentality of test-driven development even becomes more interesting because test-driven
development basically says, well, before we write any code, we want to first write a test.
Why do we test first?
Because, well, we should know what the code we are writing is actually doing so I can
easily write a test for it because the test is testing my outcome.
So I'm creating a search feature.
Does the search actually produce results?
So I can write the test, then I do the coding. And every time we check in code and anybody's checking in code, we can always validate if the code change broke anything.
And if we do it in an automated way through a CI server that is kicking off all these tests,
whether it's a Jenkins, a Bamboo, any other server, it only takes minutes, hopefully,
for a developer from checking in code to knowing, did this code change break anything significantly?
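To make that test-first idea concrete, here is a minimal sketch of what such a check could look like in JUnit; the SearchService class and its search method are hypothetical names invented purely for the illustration, not anything from a specific product.

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertTrue;
    import org.junit.Test;
    import java.util.List;

    // Hypothetical search component, used only to illustrate writing the test before the code.
    public class SearchServiceTest {

        @Test
        public void searchReturnsResultsForKnownTerm() {
            SearchService search = new SearchService();
            List<String> results = search.search("performance");
            // The feature's contract: a known term must produce at least one result.
            assertTrue("expected at least one search result", results.size() > 0);
        }

        @Test
        public void searchHandlesEmptyQuery() {
            SearchService search = new SearchService();
            // An empty query should not blow up; here we assume it returns an empty result list.
            assertEquals(0, search.search("").size());
        }
    }

A CI server like Jenkins can then run these tests on every check-in and report green or red within minutes.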
Right. And I think the key thing you are mentioning there is the fact that it is the developer in this case who is writing a test.
And this is a big shift from the traditional means, right?
This is before it gets to the QA teams.
This is before it even gets to the performance teams. Obviously there is performance testing, as we spoke about in the last episode, that's still going to come into play in some fashion in these environments.
But this is where we start getting the development teams involved in this.
And this is where it also starts getting scary, because if the development teams are writing some kind of tests, what does, and we'll talk about that a little bit later, but what does that mean for the, you know, the traditional performance teams?
Exactly. What I want to add here, though, is I hope that developers are writing all the tests.
But what I see also in the Agile movement, Agile initially was, I think, heavily driven by developers, by engineers that write code.
But more so, I think we're seeing Agile going broader.
That means an Agile team should have multiple roles embedded in their team.
So there will be people that are just better in testing and others are better in development.
And maybe there are people that develop a cool testing framework that makes it easier
for the developers to easily write tests and take away the fear. So I think as a whole team, creating code, but actually doing test-driven
development will become much easier. And to be quite honest with you, I think if we do it right,
I mean, yes, I mean, I used to be a developer. I love writing code and I think I'm the best
developer in the world. And I hate going back and fixing problems,
but if I do it right and if I've once experienced how cool it is to get immediate feedback
if I check something in and then the test tells me,
ooh, you just did something stupid.
So that means I can fix my stupid mistake fast,
but I also learn to not make the same mistake again,
which in the end makes me a much more efficient and better developer.
So CI, and then as you said, CD is basically just taking continuous integration, where we're compiling, we're running the tests.
And then at the end, something drops out that we can potentially deploy into a test environment or into a staging environment. I think continuous delivery
has the concept of a pipeline that then has additional steps along the pipeline until
potentially we run through hopefully some performance tests, maybe some more integration
tests, and then deploy it into production and hopefully everything runs fine. And in case there's a problem, then we should know about this by adding monitoring to the
system as well.
So, like the immediate feedback loop in CI: I commit, all the tests run, and the tests tell me green or red.
If I deploy something and if I have metrics built in, and obviously we, from our APM background, know which metrics we want, whether it's response time, failure rate, errors, how the system behaves, and all that stuff.
If we deploy something into production, automated, and then we see these metrics going crazy, like response time goes up, failure rate goes up, or CPU and memory goes up, then we know it was probably a bad deployment.
So it was a bad change.
We may want to roll it back.
Because whatever we did,
we didn't catch it in testing, we didn't catch it in CI, it made it all the way out,
so we need to fix this. Right. And the interesting thing,
and I've heard you talk about this a lot in some of your speaking engagements, is there's two
approaches a company will take if something is failing in production. They're either going to
roll it back, which means that they have a really well thought out rollback plan, which can always be dangerous.
But some companies never go back, right? They roll in a fix. So I just bring that up because to me,
that concept, again, these are the much more advanced companies like your Googles and some
of these others where they just keep going forward. If it's broken, we roll in a fix.
We're not going to roll back.
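As a very rough sketch of what an automated "was this a good deployment?" check might look like, the snippet below polls a metrics endpoint for a while after a deploy and fails if the error rate crosses a threshold, signalling the pipeline to roll back or roll a fix forward. The endpoint URL, the plain-text response format, and the 2% threshold are all assumptions made up for the example; in practice these values would come from your monitoring tool.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Sketch: after a deployment, poll a (hypothetical) metrics endpoint and decide
    // whether the release looks healthy or should be rolled back / fixed forward.
    public class DeploymentHealthCheck {

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://monitoring.example.com/metrics/errorRatePercent")) // assumed endpoint
                    .GET()
                    .build();

            double worstErrorRate = 0;
            for (int i = 0; i < 10; i++) {   // sample roughly the first 10 minutes after the deploy
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                worstErrorRate = Math.max(worstErrorRate, Double.parseDouble(response.body().trim()));
                Thread.sleep(60_000);        // one sample per minute
            }

            if (worstErrorRate > 2.0) {      // assumed threshold: 2% failed requests
                System.out.println("Bad deployment: error rate peaked at " + worstErrorRate + "%");
                System.exit(1);              // non-zero exit tells the pipeline this change was bad
            }
            System.out.println("Deployment looks healthy.");
        }
    }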
So just conceptually wrapping, you know, for me, this whole concept, when I first heard
about it, it just kind of blew my mind.
But what that also means when you talk about that pipeline is from the performance test
point of view, yeah, it's sort of still like a little mini waterfall, right?
It's absolutely not waterfall deployments anymore.
But if you think about that pipeline, it's going to go through the CI section, then to
some of the other units and then pushed out.
It's kind of, you know, if you want to conceptualize it, it's kind of like rapid waterfall, if
you will.
Yeah, that's a very good, actually, it's a very good point.
Yeah.
Without, you know, without all the baggage that waterfall has.
It's definitely not waterfall, but –
Yeah, and basically what we have, and I think there's also a term that our listeners will have heard before.
It's between the individual phases of your pipeline, you have gateways.
You have quality gates that basically say, do we promote this build into the next phase of the pipeline?
Yes or no. But as you said, I like it actually. I like the rapid waterfall because it's really
rapid. Because instead of waiting two weeks or a month until we get something over to the testing
team and they take another two weeks until they either push it back or push it forward,
we have it instant. That's the whole idea. The reason why, and you mentioned it, some of these companies are doing rapid deployments several times per day.
And Amazon actually, I always bring this example in my presentation, Amazon claims that they're
deploying a change into production every 11.6 seconds.
And it's like crazy.
But the only way this works is obviously because they figured out how to break down the application into smaller pieces, small individual components that they can roll out, and because they have a fully automated pipeline with a lot of quality gates added to it. So you have a pipeline, you want to push a change, it goes through the pipeline. If it succeeds all the way to the end, it gets deployed.
And then if you have measures and monitoring in place in production and you see something is going bad, then you can react to it as fast as possible by saying, okay, we just changed that.
Oh, maybe we see in production monitoring that all of our Internet Explorer users have an issue.
So maybe we didn't test it right on IE.
Let's quickly fix this JavaScript problem that we may have just introduced and then
run it through the pipeline again.
So within a couple of minutes or maybe an hour, we can actually roll forward into the
fixed version.
Yeah.
And I think the key there, the key difference between waterfall and this methodology, are those gates, right? Because in traditional waterfall, you don't have gates. The developers write the code, push it out to QA with hardly ever any
kind of testing or a set of metrics that they're looking at consistently. And then the QA people,
either it's broken right away. I mean, again, if we go back to the couple episodes ago,
when we talked about common things we've seen, how many times did you used to get a release that just didn't even work out of the gate?
And then again, with performance and these other things, there are always gates and there's a consistent set of metrics, so for the developers, in these instances and other instances, it's not going to move forward unless all those metrics are met.
And those metrics mean that the teams all have to think about those metrics.
You know, we covered a lot of those concepts in our first episode.
But once you start having those things in place, it's going to start falling together naturally because you're going to really be thinking about performance in the lifecycle.
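As a rough illustration of what one of those gates could look like, the sketch below compares a few metrics from the current build against the previous one and refuses to promote the build if anything regressed beyond a tolerance. The properties files, the metric names, and the 20% tolerance are all assumptions invented for the example, not the output of any particular tool.

    import java.io.FileReader;
    import java.util.Properties;

    // Sketch of a promotion gate: compare this build's metrics to the last known-good build.
    public class QualityGate {

        public static void main(String[] args) throws Exception {
            Properties previous = load("metrics-previous.properties"); // assumed file from the last build
            Properties current  = load("metrics-current.properties");  // assumed file from this build

            // Fail the gate if a metric grew by more than 20% compared to the previous build.
            String[] metrics = {"dbQueries", "webServiceCalls", "responseTimeMs"};
            boolean promote = true;
            for (String metric : metrics) {
                double before = Double.parseDouble(previous.getProperty(metric, "0"));
                double now    = Double.parseDouble(current.getProperty(metric, "0"));
                if (before > 0 && now > before * 1.2) {
                    System.out.println("Gate failed: " + metric + " went from " + before + " to " + now);
                    promote = false;
                }
            }
            // A non-zero exit code stops the pipeline from promoting the build to the next phase.
            System.exit(promote ? 0 : 1);
        }

        private static Properties load(String file) throws Exception {
            Properties p = new Properties();
            try (FileReader reader = new FileReader(file)) {
                p.load(reader);
            }
            return p;
        }
    }

Any build server that can run a step and check its exit code can treat this as the promote-or-stop decision between pipeline phases.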
I was talking a little bit like Christopher Walken there, accenting like the later parts of life cycle.
So, but hopefully now the listeners know: CI, continuous integration; CD, continuous delivery, which basically just takes the concept of CI all the way until the very end, into production.
And then DevOps is something that I mentioned in the beginning.
It's just a term for me that in general,
the key thing of DevOps is actually cultural change
and how we actually go about developing and deploying
and dealing with software changes
and also how the culture, like who is responsible for what.
But I think the technical vehicle is going to be continuous delivery,
meaning the promise of DevOps is that we can deploy changes
into production faster than ever before with a better quality.
Continuous delivery gets us there because it's automating the whole way, not only the software.
Actually, we only talked about software right now,
but we also need to talk about the environments, right?
We're talking about infrastructure as code.
So we talk about automation all the way through.
And I think this is where some people in the beginning might be confused about, hey, what is this DevOps?
Is DevOps not continuous delivery?
I think the continuous delivery is part of the promise of DevOps because DevOps promises we can deliver software faster to our end users.
Or let's say that way.
We can deliver value faster to our consumers.
And value will be we need this new feature because this is just the best idea.
And instead of waiting for a year in the waterfall process, we start by leveling up.
We are doing everything agile.
We are basically trying to start with the minimum viable product.
We don't need to flesh out the whole thing, but we start small.
We automate the
delivery process and then we can step by step improve the feature until we're there where it
needs to be. And along the way, we always have metrics that act as gateways.
Yeah. And I think, you know, you mentioned some of the idea of the culture of DevOps, and one thing that I think a lot of people, including myself, when I first heard it, it was dev slash ops or dev dash ops, whatever.
And it always seemed like it's for developers and operations.
And all the testers in the middle got cut out.
Right. But, you know, as I learned, as I read more into it and, you know, learned a great deal more about it, it's not the case.
It's more about, well, I think it's actually a pretty poor term, in my own opinion, because you're really talking about integrating the development teams with the testing teams, with the performance teams, with the operations teams, with the marketing teams, with the business teams.
It's really about bringing this culture together to streamline this delivery mechanism.
And just by using that term DevOps, it sounds like it's cutting a lot out. Maybe someday we'll have a better term for it, maybe just like happy love for everybody. But just because you're a tester, you're not cut out of this DevOps role, and you become even more integral, right? Because your job is to know testing and to know all those tricks and those gotchas and those warnings that you always had those intuitions for,
and bringing those into, you know, as you were talking before, when you're in the CI
part of it, the continuous integration part of it, maybe it's not the developer who's going to
write that little test. Maybe you're going to be writing the test in conjunction with the developer
who's writing the code, or you can give the developer who's not used to thinking of testing
that kind of advice on what to do.
Yeah. And I keep referring to a presentation I saw last year, several times. So the guy is called Adam Auerbach, from Capital One. And in case, listeners, you have not seen him: Adam Auerbach, you can find him on Twitter. And the videos of the presentations he did at Velocity and StarWest are all online.
And he basically said, well, the way they do development is they basically got totally rid of their testing teams.
He was one of eight test leads three years ago, and then they went from waterfall to agile,
or the new terminology, what we just learned today from Brian, rapid waterfall.
No, seriously, what they do, they brought their testers into the development teams.
And now the testers sit side by side with the developers and they cross benefit from each other.
Why?
Because if I'm a tester and I know how to better write tests, then that's great.
But I can kind of show my developers a little bit on how they also can maybe use my frameworks that I'm building to build their own tests.
On the other side, if they tell me, hey, Andy, you're the tester.
You're better in testing.
I'm better in coding.
But I'm looking at this new framework.
Let's say I'm looking at Angular because I want to write my new web app in Angular.
Then if they talk with me, I can actually go back and say, okay, I'll do some research on how to best test Angular apps.
What are the frameworks out there?
So I can build a little frame set or framework and say, okay, cool.
You're building your apps on this framework.
Here is the best way how I think we can test it.
And by the way, I found these five blog posts that talk about the most common pitfalls.
So as a team now, we are much stronger because we help each other in the end becoming more efficient but producing better quality software.
And that's what it's ultimately all about.
So in talking about that, that kind of brings me to another thought along the lines of all this is how is testing performed in the different cycles of continuous delivery?
You're not going to
be running, at least I don't think you'd be running a full scale load test. You know, we talked
recently about the different types of tests and what they mean, but now you have your CI, your
continuous integration or developer kind of cycle. And then it eventually goes into that release
pipeline, which will go into the performance
realm. But what kinds of tests are run at the different levels? So let's even go: is there something before it gets checked in, on CI, that even gets run? Like, what is the developer going to be doing? Or, you know, does it really start in earnest in the CI once it gets integrated, or what are we looking at?
I think it starts earlier. I think it starts, I mean, the modern IDEs,
they already have all the code checks,
the static code analysis, pattern detections.
So there's a lot of stuff already happening, actually,
while the developer is writing code, while it's live.
If you turn on IDEs, whether it's Eclipse or Visual Studio,
and then you start typing, it automatically tells you
that that might be stupid.
There's static code analysis running all the time.
What we, from a Dynatrace perspective, what I try to promote, and I had this in one of
my recent online performance clinics, if I'm a developer and before I check in code, I
hopefully run my local unit tests just to make sure everything works fine on the code
that I changed.
So I run my unit tests and I inject the Dynatrace agent,
and then the agent can tell me what my code is actually doing underneath the hood.
Because guess what?
Most of the code that I'm responsible for is actually not my code.
It is my code plus all the frameworks it runs on,
whether it's Hibernate, whether it's Spring,
whether it's any of the Microsoft libraries.
I'm not sure what's out there on the Node.js front.
But basically, before checking in code,
I, as a developer, should do a quick sanity check
by leveraging the static code analysis tools
that are built into my IDE,
but also leveraging APM tools like Dynatrace
and run the unit tests and then look at the PurePaths and then say, oh, the code change that I just did is causing 500 additional objects to be allocated; that's 500 more than before. Well, it still works fine on my machine, but maybe later on that's not a good idea if we have that much of a memory footprint. Or the way I configured Hibernate caching is now going to the database in a much different way, and we have just introduced the N+1 query problem.
So these are all things I can actually do as a developer before checking in code.
So it starts actually when writing code before checking in.
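For anyone who has not run into the N+1 query pattern Andy keeps coming back to, it looks roughly like the first method in the sketch below, with a single-round-trip alternative next to it; the table and column names are invented purely for the illustration.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Illustration of the N+1 query pattern (table and column names are made up).
    public class OrderLoader {

        // N+1 variant: one query for the orders, then one extra query per order.
        // A tracing tool (or even a simple query counter) run under a unit test makes this
        // visible long before any load test does.
        void loadOrdersNaively(Connection connection) throws Exception {
            try (PreparedStatement orders = connection.prepareStatement("SELECT id FROM orders");
                 ResultSet rs = orders.executeQuery()) {
                while (rs.next()) {
                    try (PreparedStatement items = connection.prepareStatement(
                            "SELECT name FROM order_items WHERE order_id = ?")) {
                        items.setLong(1, rs.getLong("id"));
                        items.executeQuery(); // one additional round trip per order
                    }
                }
            }
        }

        // Same data in a single round trip, using a join instead of a query per row.
        void loadOrdersWithJoin(Connection connection) throws Exception {
            try (PreparedStatement joined = connection.prepareStatement(
                    "SELECT o.id, i.name FROM orders o JOIN order_items i ON i.order_id = o.id")) {
                joined.executeQuery();
            }
        }
    }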
Right, right.
And then when they go into, when they do check it in, we're obviously not going to be running full scale load.
But are we talking about running 10, 20 users? Or are we talking about, again,
kind of full runs from the front end, but just single hits?
How would you run that?
So in my perfect world,
when I check in code,
so let's assume we're 10 developers in a team
and I check in my code
because I did my local tests
and everything is fine.
I check in code.
Now what happens,
Jenkins will hopefully,
or whatever build server you use,
will compile the code, will run all the unit tests, run all the functional tests.
And actually on these tests, and this is something I've been promoting for a while, these tests are testing functionality right now.
But instead of just testing functionality, why don't you combine your test executions with tools that can give you these key metrics that tell you more
about what your code is dynamically doing. Meaning, instead of just the static analysis that the developer can do on the local machine, we automatically combine your unit tests and your integration tests with APM tools, you know, Dynatrace, for instance. And then we spit out,
hey, this unit test is executing five database queries.
It is throwing seven internal exceptions.
It is creating 100 log messages.
It's allocating that many objects of that business type.
And if we have these metrics, then what we can do in the CI before we run any load test, we can already identify, and I'm sure of that, most of the problems that will later show up as
a performance and scalability issue.
Because most problems that we see in apps are related to inefficient database access,
very aggressive memory usage, some mistakes in doing multi-threading, you know, if you're
spawning up threads, calling services in the background asynchronously.
Some of these things,
but looking at the metrics,
can already be identified
by executing single tests,
like unit tests and integration tests.
And it's not that you find everything,
but I think what it is,
you find regressions.
So if you keep,
if you look at these metrics
from build to build to build,
and now I know,
hey, in the last 10 builds, my test that is testing this feature has always told me we're executing two database queries.
And now it's 20.
Hmm.
Will this change later impact my load tests?
Probably so.
I will probably find the same thing.
Probably the load test won't even finish
because we're killing the database server first. So that actually, my point is, before we even go
into running small or medium or large scale load tests, we should actually look at these,
and I call them architectural metrics. I'm not calling them performance metrics.
These are architectural metrics. How many database queries, how many web service calls,
how many threads are involved, how many objects do we allocate,
how many objects stay on the heap after we run a GC.
So these are all the metrics that we need to check early on.
So that's the next step I think that should happen.
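A hedged sketch of what asserting such architectural metrics inside an ordinary functional test could look like: SearchService and MetricsRecorder are hypothetical stand-ins for your own code and for whatever your APM or tracing tool exposes about the last test execution, and the budget numbers are invented for the example.

    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    // Sketch: the functional test stays the same, but it also checks "architectural metrics".
    // MetricsRecorder is a hypothetical stand-in for what a tracing/APM integration would report;
    // it is not a real Dynatrace API.
    public class SearchArchitectureTest {

        @Test
        public void searchStaysWithinArchitecturalBudget() {
            MetricsRecorder recorder = MetricsRecorder.start();

            new SearchService().search("performance"); // the functional part of the test

            // The architectural part: budgets based on what the previous builds looked like.
            assertTrue("too many database queries", recorder.databaseQueries() <= 2);
            assertTrue("unexpected web service calls", recorder.webServiceCalls() <= 1);
            assertTrue("exception noise increased", recorder.internalExceptions() == 0);
        }
    }

If these budgets are stored per build, a jump from two database queries to twenty shows up as a red test on the very check-in that introduced it.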
Right, and just from the testing point of view,
those unit tests and functional tests on that level,
what kind of, you know, I've seen on our internal tools,
a lot of those are written with like maybe some Ant scripts or something.
But is it typically, you know, Ant,
or what are people writing a lot of those tests in?
Well, not the tests. I mean, Ant is going to be used to execute these tests. The tests themselves, they're written using, you know, JUnit.
That's what I'm saying, yeah.
Yeah, that's fine. That's good. That's why we have this discussion, and hopefully, you know, and I don't know everything there, but that's what I see.
These tests are typically written
on the unit test level. You use the XUnit
framework of your, you know,
runtime, whether it's
NUnit or MSTest on the Microsoft
side, whether it's JUnit or whatever tool.
What I also see a lot of people do, especially those that write services,
mostly REST and web services, they typically use tools like SoapUI, JMeter, Gatling, any of these tools that are able to execute tests against single REST interfaces.
So you do your REST API testing, right?
Just as a unit test, you're now testing the little unit, which is your service, but you're using the protocol that your end users will use to test it.
So you're running some JMeter scripts, for instance.
But these are single, these are not load tests.
These are, we call them web API tests.
So if you go, for instance, to the Dynatrace documentation, we have some walkthroughs on how to integrate your JMeter scripts with Dynatrace and just run your service tests. So test every service. If you have a REST service for search, for instance, then you probably want to run it with different search queries; you want to give it an invalid search query and verify that the search actually says this is a stupid query, or things like that.
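As a sketch of such a web API test written without any load-testing tool at all, here is a plain JUnit test driving a search endpoint over HTTP with a valid and an invalid query; the URL and the expected status codes are assumptions invented for the example.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Sketch of a "web API test": exercise a single REST endpoint over the same protocol
    // end users will use, but as a plain unit test rather than a load test.
    public class SearchApiTest {

        private final HttpClient client = HttpClient.newHttpClient();

        private int statusFor(String query) throws Exception {
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://localhost:8080/api/search?q=" + query)).GET().build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode();
        }

        @Test
        public void validQueryIsAccepted() throws Exception {
            assertEquals(200, statusFor("shoes"));
        }

        @Test
        public void invalidQueryIsRejectedWithAClearError() throws Exception {
            // The service should tell the caller the query is bad, not fall over.
            assertEquals(400, statusFor("%00"));
        }
    }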
Right. And I think, you know, not that I want to talk about this right now, but all that stuff that you talked about, the different kinds of tests that you want to run, that's exactly the kind of information and expertise that the testers can provide the developers: what kinds of tests to run and how to run them best. Because that's where your expertise is going to lie.
Exactly.
And the cool thing about these tests,
these integration tests, the functional tests,
are typically very fast executed
because your unit tests execute pretty fast.
You test a small portion of your code,
even if you have 100, 1,000, 10,000 tests.
But the good thing about this is if you find regressions
in the early stage of these tests,
that means you can already stop the build. You can stop your, Brian, here it comes, your rapid waterfall, you can stop it right away, and there's no need anymore to run all these load tests, because they'll just find the same problems. They'll find that the database is going to be overloaded; they'll find that you crash your JVMs because you have a memory leak.
Yeah, yeah, that's all important for having those metrics set up ahead of time and knowing what to look at.
Exactly.
So then, obviously, once you get out of CI, I mean, the continuous integration and that build, that's when you're going to enter the delivery pipeline and go into your performance test. Now the big challenge on the performance testing side at this point,
right, is you don't have the timeframe you used to have.
You always got your timeframe cut short no matter what, right?
Let's say release was four months, like an old release was four months.
And they said, oh, it'll get to you by the last month.
In reality, it'll always get to you in the last two weeks.
So there's still going to be a bit of a time crunch challenge,
but obviously now it's good, it's very different, you know. And one of the things I kind of struggle with, at least from my history of performance testing, is not knowing which scripts are still going to work and which scripts you're going to have to rewrite. And even, so does this call for another type of approach? How much of a full sweeping test, the entire-system test, do you run at this point?
Or if you're really getting into a mature CD system, do you start running a load test?
Say, you know, let's talk about the concept of microservices, right?
There's all these other little bits, and maybe they're putting a change out to one microservice.
Maybe it's login.
Maybe it's checkout.
Maybe it's something that's kind of key.
A lot of things might interact with it,
but the only change they're making,
they say, right, is to that one piece.
Do we at that point just load test that microservice, or are we doing a full entire-site test? And how do you manage, in those time cycles, running those really large tests anymore?
Well, I think it's a very tough question. First, if I listen to the story you just lined out, they only changed that service, and I would obviously say, well, of course we only need to test that service and figure out if everything runs correctly. I recently wrote a blog post on going from monolith to microservices in confidence with Dynatrace.
I brought an example of a company I worked with, and they basically did the same thing.
They had multiple services, and they said, well, we are only changing this service here.
But what they also changed is the way the service was accessed from other services.
And now instead of calling the service once per search, it was called X times.
So actually they had introduced the N+1 query problem. So, the search front end and the search back end: they optimized the back end, tested the back end, everything was fine. They did not realize that instead of calling it once, as I said before, they were calling it for every single search result. So if you have a search that produces 100 results, it's called 100 times. So that's why I think, obviously, you need to start where you think your change is.
But I think it would not be wise to just rely on those tests. I think you should do at least some end-to-end sanity checks, at least on your top features. So if you know that search is your number one feature, and if you know these are my five top features that people are using, then definitely run at least some end-to-end functional tests, but also some load on them. I think you just have to do it, maybe not the full scale all the time
if you don't have the time, but at least see: is there a change in behavior from the previous build that we released? And coming back to the metrics: if we see that we have 10 more calls to the back end, and we're transferring 50 kilobytes more per search query, and if we know in production we have a million search queries, that means we have a million times 50 kilobytes more, which is roughly 50 gigabytes of additional data transferred. You can do the math, right? I don't need to run...
Yeah, okay, yeah, that was an easy one.
Yeah. So the thing is, though, what I'm saying is you don't necessarily always need to run a full-scale load test, but I think you need to do the basic tests. And then, I think, for me, what testing is all about is finding regressions.
And whether it's a functional regression
or whether it's an architectural regression
or whether it's a performance regression
where I always combine,
I always link architecture and performance
because if you make an architectural change
that is bad,
it will impact your performance
and scalability.
And I'm sorry that I always have to come back to my N+1 query problem on the database, or the N+1 query problem between services,
whether you call them microservices or not microservices, I don't care.
It's the same thing.
If you are deploying new code and you're now making 20% more calls for the same features, then you can easily do the math by figuring out how many calls you have in production under peak load, adding 20% more, and you know that this is probably not going to scale, right? I still hope, though, I still hope that people will have some time to run some set of tests. But I think you hit on a very interesting topic where you said we don't even know if the scripts still work. If you don't run the tests on a continuous basis,
and if you only run them once per big release, and that might be just once per month,
then this is probably bad because you have a lot of change. And that's why if you're moving to a
more rapid deployment model and a rapid change model where the pipeline is really executing as
much as possible, then you constantly have to also keep your test scripts up to date. And if you actually use modern testing tools, where you write your tests no longer in a cryptic language but in the development languages that the developers use anyway, it will also be easier for them to maintain. And coming back to that, some of these tests might then be maintained not by the testers but by the developers who also make the code changes. So I think it has a lot of benefits if you do continuous integration
and continuous delivery right.
Yeah. And I think this is where things can be challenging now for the traditional performance teams. Right now, I'm coming into this thinking of when and where I started from. So I think maybe people who are just coming into performance now might have a little bit more of an advantage, because they're probably already hearing about a lot of these things and might have already gone into some development, right, but found the field of performance more fascinating. So they've gone into there. But for, you know, all of us, let's call us old-timers, right? It gets very, very challenging: what do you do? What do you have to now
learn? What do you, how do you have to modify even your outlook? You know, there was, you know,
back in my earlier days, myself and all my colleagues, we were kind of somewhat performance
purists, you know, where we would be, we would say to the teams, look, if we can't run a full
scale load test, we can't sign off on this because there might be something that
has been missed.
And sure enough, every once in a while, we could justify that clause because every once
in a while, there would be something that supposedly was never touched that would break
in the new release.
And everyone would be like, well, how did that break?
How did that break?
And then they find out, oh, we didn't realize this is going to have an effect on, you know,
code A is going to have an effect way down the line there.
So it forces you with these more rapid deployments, with these maybe not running everything all the time. It really forces you to kind of break out of that mindset of being a purist, which is somewhat painful, right?
Because for me, it was always a matter of pride of saying, hey, if I signed off on this, you're going to, it's going to work, right? Unless it was something
we could not have predicted, especially with the, you know, not-as-good tools back then, I guess. You know, for a modern-day performance tester, it comes down to, you know, changing your mindset a bit, right? Learning some coding. You're not going to have to be a developer, but, you know, if you're taking a look at, I heard you mention Gatling before. Was that using the Scala code? Is that what the Gatling one is?
Exactly.
Right. JMeter, JUnit, all these things.
You're there. They're using some of this coding language.
My advice on that side is learn some of that because you're going to understand coding a little bit better,
which is not just going to help you in your job, but it's going to also help you understand what's going on with these tests.
If something is slowing down,
you have a much better understanding now of what code does and how it affects things.
So, yeah, I don't know.
What am I trying to say here?
I'm trying to say if you're going to be moving into this world, right,
which seems inevitable,
yeah, there's going to have to be a new set of skills to be learned.
I think so, too. I think you're spot on, and I think there are also some new ideas on how to do the testing. I mean, I think nobody wants to say we don't need to run large-scale load tests anymore, because, I mean, we had the discussion with Mark on the last episode: they're very critical, especially if you think about, you need to prepare for e-commerce, for the holiday shopping. It's April 18th here while we're recording, and I think tax day was a couple of days ago, right, April 15th. And so a lot of people, obviously, like the TurboTaxes of the world and other organizations out there, they have to prepare for this peak load, and I think they want to be sure that they can sustain the load.
That's also why these companies typically have, you know, even though you said sometimes
we get two weeks or maybe one week to the release and then we start testing.
I think these companies figured out there's so much money at stake. They have code freezes. They have feature freezes weeks and weeks before, and then really do thorough testing.
But I think, and hopefully this came across well, I think we can shift quality checks and especially performance and architectural checks to the left by looking at these metrics.
And why I bring metrics up again is, I mean, Brian, both of us, we work for Dynatrace.
We look at PurePaths, our data set, all the time.
We know what to look at, right?
We open up the transaction flow.
We look at the number of database queries.
We look at which methods have CPU hotspots or synchronization issues.
We look at how many web service calls are made.
So these are metrics.
And why do we have to wait until somebody calls us and says, hey, I have a PurePath from production because we're hitting a problem?
Why do we have to wait all that long?
Why not if we know that these patterns are out there and have been out there for probably longer than we, the two of us, have been in business?
Why don't we automatically let our tests that we already have, the unit and the functional and integration tests, already check for these metrics? Then we can automatically detect the regression when the code change appears.
And we don't have to wait all the way until the end.
We may never find it in testing or we'll definitely hit it in production.
So let's just move it, shift it left.
That's my point.
Yeah, no, absolutely.
And I think what gets really fascinating with that then is when you start seeing the patterns of, you know, let's say early on, the developer didn't think much of an increase of 10 database queries.
They still pushed it out, right?
And then it gets to the performance team when you're running that test is where you find the problem, or maybe the load wasn't run properly or wasn't run with insight onto what was really going to happen to production
and it blows up in production. But what you start learning over time by capturing that data
early on is you see the cause and effect and you see, okay, if we go from two database queries to
10, I now know that, you know, that's going to be kind of our threshold, that has enough of a change in our run that I'm going to flag it. Because from two to 10, yeah, it's still, I mean, realistically, it sounds like a lot, right? But two to 500 would be really obvious. What am I trying to say here? I guess what I'm really trying to say is it lets you learn what these patterns do, you know, and it becomes an educational experience. But it also then becomes a time-saving and money-saving thing for both yourself and the organization. But I definitely agree with the idea of getting these metrics early. And that kind of brings that DevOps concept back into this, because that's where, as a, you know, performance team member, it's a great idea for you to get involved on that development level
and start sharing your ideas
and also finding out what their ideas are.
Because the development teams,
that can help you run a lot of ideas
through on your performance side.
I was just going to say,
take a typical idea of memory usage.
As a performance tester, you might not be thinking that, hey, if we use the same search term queries all the time, our queries are going to get cached.
And we're not going to be really exercising the system because, let's face it, you probably don't have a computer science background.
The developer might, though, mention that to you and you go, ah, great idea.
And now suddenly you look a million times smarter and you get much better results.
So yeah, which also reminds me of two things that I want to say here. I like that; this is a perfect example with the search query. So what I see, I mean, the reason why you use different search queries is not only because you want to test caching, or you want to make sure that you're not hitting the cache all the time, but you also want to find out data-driven problems.
So if you are using a search query that always returns five results and one that is always returning 50,
let's say if you have these two tests and you test these different combinations, different search queries that return different sets of results,
and you then look at the metrics, what is actually happening behind the scenes. Oh, for the search result that produces five results, or the search key that returns five results, we see five database statements.
For the one that is returning 50, we see 50 database statements.
And if you do the math, it's probably going up.
So that immediately tells you, even with a single integration test or functional test that uses different input parameters, you can automatically identify data-driven problems. Because guess what, in production somebody will figure out, hey, I can execute a search without a search key, that means I get 1 million results, and therefore kill the database. So these are the things you can detect, these data-driven problems.
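A minimal sketch of how a single test that varies its input might surface that kind of data-driven problem; SearchService and QueryCounter are hypothetical stand-ins, with the counter representing whatever your tracing tool or a wrapped DataSource reports as the number of SQL statements executed.

    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    // Sketch: one functional test, several inputs of very different result sizes.
    // If the statement count scales with the number of results, that is the N+1 pattern again.
    public class SearchDataDrivenTest {

        @Test
        public void queryCountDoesNotGrowWithResultSize() {
            // Terms assumed to return roughly 5, 50 and 500 results in the test data set.
            String[] terms = {"rareTerm", "commonTerm", "veryCommonTerm"};

            for (String term : terms) {
                QueryCounter counter = QueryCounter.start(); // hypothetical SQL statement counter
                new SearchService().search(term);
                assertTrue("query count grew with result size for '" + term + "'",
                        counter.sqlStatements() <= 3);
            }
        }
    }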
That's one thing. And the other thing I wanted to say: you know, DevOps obviously goes into operations, so we will never be able to find everything in testing, and obviously we need to expand our performance engineering, I think, more into production. And I think this is probably a topic on its own, so maybe it's something for one of our upcoming webinars,
doing performance testing or monitoring in production,
which is a very cool thing, obviously.
But what I'm saying is don't stop with these metrics
after the package drops out from your build server
and it then gets deployed into production.
So really keep monitoring them.
And I think the key metrics that I like,
the same as before, we said database queries and all that stuff, the key architectural metrics.
But one metric that is very critical is usage.
What does usage mean?
How many people are actually using that feature?
This is something I brought up many times in my latest presentations.
Hey, Brian, if the two of us, if we work in a development team and our product
manager, let's call him Mark, he came up with this great idea and he forced us to implement this.
And we did everything. We developed it. We tested it. And now we're deploying it. And guess what?
Nobody's using this feature because we see from a production perspective, nobody's using it.
We're using real user monitoring or UEM in Dynatrace. And then we see that nobody's using
that or maybe a very small percentage.
Then we can go back to Mark and say,
Mark, you may have thought you had the best idea in the world.
We tested it.
We spent a lot of time and now nobody's using it.
So what do you want us to do?
Either optimize so that people actually use it
or figure out why they're not using it
and then get them there.
So maybe to fix something from a UI perspective or maybe from a flow perspective.
Or maybe it is actually a feature that nobody uses and needs.
Maybe Mark, unfortunately, had the wrong idea.
And if we can actually make that decision based on these metrics, we can, guess what,
take out code, take out tests.
That means if we take out code and take out tests, we reduce our technical debt, we reduce our test cycle execution time, and we also reduce what's called business debt. So we actually remove features that nobody needs; otherwise we would just drag them along, we are dragging along business debt. That's a term that actually Nida from Verizon coined when she was on stage with me.
I've never heard that one before. I like that one.
Business debt really means, hey, we have applications
and features and applications
that we don't even know
if anybody needs it.
Probably nobody needs it,
but because we don't have the metrics,
we don't know,
so we better carry it along.
So on top of technical debt,
which is kind of
code after code after code,
which is kind of getting messy
and you don't really know
if it's used or not
and if it's useful,
you also kind of say, well, we have certain business features of our apps that probably nobody needs,
but we don't have the data to take it out.
If you think about Microsoft Word, which is a great app, but I'm sure 90% of the features nobody ever uses,
but Microsoft doesn't know whether they can take it out because they don't know if anybody is using it.
Is Clippy still in?
I don't know.
One last thing I wanted to mention too with this idea of getting information from production.
Let's go back even to your search idea.
And I know this is a little bit off topic because this is kind of talking about testing
approach or ideas, but you have this idea of testing your search, right? You can either use
one term over and over again and do it wrong, or you can have an infinite number of unique terms,
which in itself would be wrong, right? Because then you're never going to be leveraging the cache, which is not a real-world scenario. So another thing you can get, partially from that usage in production and partially just from some data mining, is, well, let's take a look at the top 1,000, 2,000, 3,000 search terms and feed that into our data file.
So you're using the patterns that people are doing, because obviously it would be great
to have one of those, I'm going to search for blank and return a million things.
But that's something you should find much lower in the cycle by doing some smart testing down there. Once you're doing that full scale load test,
get a realistic data set to use. Because I even had that mistake that I made once back in my old
job, where we were mostly a content site. So it was all just different pages. And unbeknownst to
me, they had introduced some caching. But for the last three years prior to that, the testing I was writing had a bunch of different large data sets of pages to hit for the different types of page.
And it would just randomly pick through them.
And then this new release came out, and I reported significant drop in performance.
We were still only talking about maybe 300, 400 milliseconds of a degradation, but that was a much larger change than we've ever seen before.
And I started, you know, sky's falling, we can't go, we can't go,
and then found out from the developers that they've introduced some caching.
So if I'm hitting them all random,
I'm never using any of that new caching that was in place,
and that then helped me change my test to be,
okay, what are the pages people are actually visiting,
and model it on that.
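Whether it's pages or search terms, one way to sketch that is to export the top couple of thousand entries with their frequencies from production analytics and sample from them in proportion to real usage; the CSV file name and format below are invented for the example.

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    // Sketch: build a realistic test-data feed from production usage instead of purely random terms.
    // Assumes a CSV exported from analytics/monitoring with lines like "shoes,12500" (term, search count).
    public class WeightedSearchTerms {

        private final List<String> terms = new ArrayList<>();
        private final List<Long> cumulativeCounts = new ArrayList<>();
        private long total = 0;
        private final Random random = new Random();

        public WeightedSearchTerms(String csvFile) throws Exception {
            for (String line : Files.readAllLines(Paths.get(csvFile))) {
                String[] parts = line.split(",");
                terms.add(parts[0]);
                total += Long.parseLong(parts[1].trim());
                cumulativeCounts.add(total);
            }
        }

        // Pick a term with probability proportional to how often real users searched for it,
        // so popular (likely cached) and rare (likely uncached) queries both show up realistically.
        public String nextTerm() {
            long pick = (long) (random.nextDouble() * total);
            for (int i = 0; i < cumulativeCounts.size(); i++) {
                if (pick < cumulativeCounts.get(i)) {
                    return terms.get(i);
                }
            }
            return terms.get(terms.size() - 1);
        }
    }

A load script can then call nextTerm() for every simulated search instead of picking uniformly at random.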
So that kind of data is something you can pull back from
operations. And that again, pulls in that whole DevOps side of you're in between the developers
and operations. You have two great sources of data. And then, either behind you or in front of you, you have marketing and business and project managers, who are other great sources, you know; how do you service them and make yourself valuable? And I know that's come up a few times already in our three-episode span so far, but I think
it's a very important concept.
It's a feedback loop back from ops to not dev, but to test.
It's awesome.
This is exactly what DevOps and continuous delivery are about, having not only these quality gates
that stop things, but also feeding back data upstream.
All right.
Well, I think that'll wrap it up for today's podcast.
I want to thank everyone for listening.
And once again, thank Andy for doing this with me.
It's a great pleasure to share this time with you.
Yes.
And we still don't know where this is hosted, but you should be able to find us on Spreaker, iTunes, and a bunch of other places very soon.
Any closing thoughts, Andy?
No, just – I mean, I think the topic is so broad.
I mean, as always, we could go on forever, I guess.
I mean, the closing thoughts are shift quality to the left, meaning left in the pipeline by looking at the right metrics.
I believe we can find a lot of performance problems early in the lifecycle, not only through performance tests.
However, I'm a strong believer we need to run performance tests, not only small ones and then say everything is fine because we don't have enough time.
So we definitely need to make sure we have enough time to run these tests. But you can do a lot of the checks, the performance checks, in the pipeline, in your continuous delivery, starting with the functional tests, integration tests, small-scale load tests.
And you will find a lot of problems.
And therefore, if you really then have the time at the end, even if it might be limited, for running large-scale load tests,
you will be sure the system you're testing is actually much better quality than before,
so you really only find the hard problems.
Or maybe you're actually good already, and everything will be fine, and then you can enjoy the next release.
And I'll just end on saying that if your organization is moving in this direction, which hopefully they are,
and you actually enjoy performance testing, don't be scared.
Embrace the new.
There's going to be a lot of amazing things to learn and a lot of growth that you can undertake in this process.
So if you're not too keen on performance testing already, maybe it's a good sign. But if this is a fascinating topic for you and you really enjoy it,
there's a lot of amazing things that are going to come out of it.
So just dive in headfirst, and part of Agile is allowing for failure.
So allow that for yourself as well, and keep moving forward.
All right, we'll see you next time.
Thank you.
Goodbye.
Bye-bye.