PurePerformance - 010 RESToring the work/life balance with Matt Eisengruber

Episode Date: August 15, 2016

Are you still exporting load testing reports into Excel to compare different runs manually? Matt Eisengruber – Guardian at Dynatrace – walks us through the life-changing transformation story of one of his former clients who used to spend an entire business day analyzing LoadRunner results. Through automation, they managed to get her the results when she walks into the office in the morning – giving her more time to do “real” business analyst work instead of doing manual number crunching. Matt shares some insights into what exactly they did to automate load test comparisons in Dynatrace, how they created the reports and which metrics they ended up looking at.

Transcript
Starting point is 00:00:00 It's time for Pure Performance. Get your stopwatches ready. It's time for Pure Performance with Andy Grabner and Brian Wilson. Hello, everybody. My name is Brian Wilson. Welcome to Pure Performance. And today we have an especially exciting show. Andy, do you know why it's an especially exciting show? I'm putting you on the spot here. Yeah, I know. I know. So I know we have a guest today who currently lives in a country that managed to Brexit twice in the last couple of days. We have to tell the audience we're recording now on the 29th of June, and the country that
Starting point is 00:00:53 I'm talking about is obviously England. But that's not the exciting thing? No, that is exciting. But the exciting thing is this is episode number 10. Oh, I'm sorry. I think the Brexit and the shame of Iceland against England is so clear in my head still that I thought this is the most exciting thing. I'm sorry. Episode 10.
Starting point is 00:01:18 Yes. What's the 10th anniversary? I don't know. Someone can let us know. I know, like, when you're talking about marriage, there's like paper anniversary and all these kind of like cardboard, all these kind of weird things, you know. I think diamond is like 50 or something, but you don't get the diamond for a long time. But so we're at number 10. Um, yeah, no, I'm actually not upset about Iceland because, um, I, well, I'm not Icelandic, but I have a little bit of Norwegian in me.
Starting point is 00:01:47 So that puts me sort of in that geographical region for my heritage. I heard like 30% of the country of Iceland was in the UK for that game with the team and the fans. Not in the UK, in France. Oh, in France. You just disqualified yourself as a soccer lover or as a football lover, as we call it. Well, I was going to say calling it soccer disqualifies yourself, too. Yeah, that's true. All right.
Starting point is 00:02:11 Yes. Andy, you were – so you were – and again, as people know, we record these ahead of time. And so you just got back from Velocity, right? Tell us a couple of things about that. So I think the most interesting – I mean, it's very interesting to just talk with a lot of people, meet Steve Souders. I actually got Steve Souders on a little short interview that you can also listen to on our podcast. And I think with this interview actually launched a new, let's say, special feature, which we call the Pure Performance Cafe. So I had Steve on the line. It's been nine years of Velocity.
Starting point is 00:02:47 And I think the most controversial session was the session from Tammy Everts and Pat Meenan, who showed their results on a study that they did. They created machine learning. They did some machine learning and basically put real user monitoring data through the machine learning engine and tried to figure out which measures and which metrics actually best correlate to user experience and actually conversion rate and bounce rate. And the most surprising result was that render time, which we believed in the last couple of years to be most important,
Starting point is 00:03:29 was actually not correlating at all and was actually very near the bottom of the list. But they also claimed, you know, they say, well, this is, you know, our set of data that we used and you need to use it on your own set of data because every page is different. But that was very interesting. Then another thing that I really loved was the session from Adam Auerbach, who did a session on how Capital One transformed their traditional pipeline into real continuous deployment. They are deploying multiple times a day now, which is very interesting, especially for a financial company. And with the render one, yeah, and all those are available on Pure Performance Cafe. They were released, um, in June, uh, so go back and check them out. They're all under 10 minutes, so they're great quick hits. Um, but for the render time one, uh, besides the fact that we track render time, which is kind of funny now, uh, although this isn't the end of render time, I don't think, as they said,
Starting point is 00:04:22 did they have another metric that they, you know, what did they say from their study was like the number one, or did they have that? It was DOM load time, which obviously makes sense, but for me DOM load time is also very close really to rendering. But yeah, they had a list of the top metrics, and DOM load was I think the number one that they had. And then also they correlated the number of elements on the page,
Starting point is 00:04:44 how complex it is. But definitely DOM load time was the number one. Everybody was just shocked about rendering because we tried to optimize rendering in the last couple of years and now they're telling us it's not that relevant.
Starting point is 00:04:59 But as you said, and as they also said, not every page is equal, so you need to do your own testing. It was just interesting that they used machine learning to figure out which metrics are most relevant to optimize to get a better conversion rate and more revenue on an e-commerce site, for instance. That was very interesting. Yeah, and other than that, I also wrote a little blog just an fyi meeting minutes from velocity so if you go to blog.dynatrace.com and search for velocity 2016
Starting point is 00:05:30 you'll find it. That's apmblog.dynatrace.com now, though, actually. Yeah, I think it's something that's called... it's something, yeah, I hope so, I think so. It used to be that one, I know, a long time ago. So speaking of Brexit, would you like to introduce our expat guest? Yeah. Well, we will see what really happens if the Brexit really happens and what that means actually for our honored guest, Matt. I'll try to pronounce the name as I would do when I would be in Austria or Germany, Eisengruber, even though I know probably it's more like Eisengruber or Gruber. I don't know how the Americans correctly pronounce it. I call him Matt Eisengruber. And he's one of our fellow colleagues who made the jump over the pond from the U.S. to England and is now helping our clients,
Starting point is 00:06:26 the Dynatrace clients, to better leverage Dynatrace as a guardian. And Matt, we met a couple of weeks ago in London at one of our Perform Day events, and you approached me and you said, hey, I think I have some good stories for the podcast. And I think one of these stories was really, really interesting. We will explain that story in a second and you'll give us the intro, but I thought it was so great that we could not only help people to make better decisions on performance but also impact their personal lives in a positive way. So, Matt, without further ado, who are you? Give a quick intro and then introduce us.
Starting point is 00:07:10 Why do we care about you? Exactly. Why should we care? Why should we listen? Well, Andy, cheers for that. Appreciate it. And probably I'll start with the Perform when I met you in London there. Probably one of the most interesting topics I thought you had that was a little bit underplayed, but it was great, was around the technical debt. And I noticed in a previous podcast, you guys talked about business debt as well. I think those are two vastly undiscovered capabilities of the product. But myself, I came from a small town out of Sebewaing, Michigan, which was mostly agriculture-based. So my performance metrics as a kid were
Starting point is 00:07:46 based around the number of cattle I had to feed, how many acres of land you had, and the bushels you'd get per acre on whatever crop you were growing. So you sound like a bit different. You sound like you're running for office. Exactly. Well, yeah, so jumped over the pond and life's quite a bit different now as we get into some of this stuff. But yeah, what I wanted to talk about today was around the automation, because what I tend to see out in the field is we have great tools, great capabilities. But we have reporting from, I don't know, it seems like from 1980 still. A lot of Excel sheets still floating around that people use. And I think it's very difficult to show any kind of improvement week over week. Or if you're looking to get into the DevOps culture where you're releasing things every 10 minutes,
Starting point is 00:08:38 to have to do an Excel sheet to prove those results I don't think is very feasible anymore. So the story I had was from a client two years ago. They had just bought AppMon. She was a QA analyst, and she was looking to do a report. So every time development would run a load test, she would have to do the analysis on it. And she was doing it in Excel, and they were using LoadRunner at the time as well. So they hadn't integrated anything with AppMon yet. They had just installed the agents, but they hadn't really looked at its capabilities.
Starting point is 00:09:09 Um, they loved the transaction flow and that, that was about it. So what I noticed was, day after day, cause they'd run a test, I think it was twice a week, um, and the next day... so they'd run the test at night, she'd come in in the morning and she'd just, you know, be facing the screen from 8 a.m. and it would be six, seven at night before she'd leave. And I noticed that the stress just kept building and building with her. And so finally I asked her, I said, what are you doing? She goes, oh, I got to run this report every day. I said, well, how long does it take? And she goes, oh, about eight hours on average. So that threw up some red flags to me, just like it would if
Starting point is 00:09:42 it was inside a tool and you hit a threshold, right? I mean, eight hours to do a report, it was mind-numbing to me. So I sat down with her, and we took the simplification approach. So I looked at what she was trying to do with LoadRunner, first of all. And what we did is we added some Dynatrace tags in so that when the LoadRunner scripts would run, we had properly labeled things coming into Dynatrace, so we could get some good sessions, some good PurePath data. And then I looked at her actual report that she was trying to get out of it. And she had like
Starting point is 00:10:14 all these tabs in Excel. And it was quite a monster, to be honest. Very manual, very time consuming. So what we did is I sat down and I took each section of that Excel sheet and we converted it into a dashlet inside Dynatrace. And as the dashlets all filled out, we ended up fitting it on one dashboard that we made look really nice in a report. And then we took it one step further and looked into the REST interface. And because the whole idea was I wanted to automate this whole report process for her so that she could go, well, one, so she could start making it to her kids' soccer games and sorry, Andy, you're European, so football games that she was
Starting point is 00:10:54 missing that night. And then two, I wanted her to get back to being able to be an analyst, not a report generator. Because like I said, we got tools that can do that. So I started looking into the REST interface. And what I found was, well, something from 1970, because we actually make some REST interface calls using epoch time, which I thought was pretty neat. So I had to learn a little bit about epoch time first to do the calculation correctly. But how we ended up, our finished product was: the LoadRunner test would kick off, and it would make a call to the REST interface of the Dynatrace server to start a save session. And then at the end of the test, it would run a stop call so that it would save all the sessions into its separate test folder inside the Dynatrace server. And then it would also generate that report and automatically send
Starting point is 00:11:45 her an email. So essentially, we took what was taking her eight hours to now she walked into work the day after a test, and that morning she would open it up and the results were right there for her. And then if those results showed anything, if any thresholds were breached, she could then refer to the save session that we had set up and send that to development who also had Dynatrace, obviously, and they could talk the same language. So some of the knock-on effects to that was she had a great, you know, all of a sudden she got her life back and it was great. Wow. Yeah, that's a phenomenal, I mean, that's, I mean, it's been two years ago, as you say, but I still see a lot of people still creating these reports and not even knowing who is actually looking at the reports. So hopefully somebody is looking at the reports, but you basically optimized a full workday.
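The workflow Matt just described is small enough to sketch. A rough outline might look like the following; the REST endpoint paths, port, credentials and the run_load_test.sh call are assumptions made for illustration, so check them against the REST documentation of the AppMon/Dynatrace version you actually run.

```python
# Sketch: wrap a load test with session recording and pull a report afterwards.
# Endpoint paths, port and parameters are illustrative placeholders, not verified API.
import subprocess
import time
import requests

SERVER = "https://dynatrace-server:8021"   # assumed server address and REST port
PROFILE = "MySystemProfile"                # assumed system profile name
AUTH = ("rest_user", "rest_password")      # assumed REST credentials

def start_recording(test_name):
    # Ask the server to start recording a stored session for this test run.
    resp = requests.post(
        f"{SERVER}/rest/management/profiles/{PROFILE}/startrecording",
        data={"presessionname": test_name, "recordingoption": "all"},
        auth=AUTH, verify=False)           # test servers often use self-signed certs
    resp.raise_for_status()

def stop_recording():
    resp = requests.post(
        f"{SERVER}/rest/management/profiles/{PROFILE}/stoprecording",
        auth=AUTH, verify=False)
    resp.raise_for_status()

def fetch_report(start_ms, end_ms, dashboard="LoadTestOverview"):
    # Epoch milliseconds define the report window, the "something from 1970"
    # Matt mentions. The report URL and filter syntax are placeholders.
    resp = requests.get(
        f"{SERVER}/rest/management/reports/create/{dashboard}",
        params={"type": "PDF",
                "filter": f"tf:CustomTimeframe?{start_ms}:{end_ms}"},
        auth=AUTH, verify=False)
    resp.raise_for_status()
    return resp.content

if __name__ == "__main__":
    test_name = f"nightly_loadtest_{int(time.time())}"
    start_ms = int(time.time() * 1000)
    start_recording(test_name)
    try:
        # Kick off the load test; the script name is purely an example.
        subprocess.run(["./run_load_test.sh"], check=True)
    finally:
        stop_recording()
    end_ms = int(time.time() * 1000)
    with open(f"{test_name}.pdf", "wb") as f:
        f.write(fetch_report(start_ms, end_ms))
    # Emailing the PDF to the analyst is left out; any SMTP library will do.
```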
Starting point is 00:12:36 And then actually I think a lot of people – and this will be an interesting question back to you now. So was she fearing in the end? I mean, obviously she got the time back. That's great. But was she kind of fearing that her job was optimized away, or did she see this as an opportunity to actually do more with her time and really go into the data analytics part? Yeah, no, I'm glad you mentioned that. That fear was non-existent in her. Actually, she, uh, was crying and gave me a hug once we got it all done, because she made it to that soccer game that night. But, um, no, what she ended up doing then was, instead of
Starting point is 00:13:17 spending all that time reporting, because they were on a very tight deadline, because this was a project that was rolling out, uh, it needed to roll out. It had already hit... they had run into some issues with the third-party provider. So they were behind schedule. They weren't meeting the performance marks that they needed to. So the project was stalling and she was, at the time, a bottleneck for that, because until she could deliver that report back to development, they couldn't go the next round. So she was actually slowing down the release cycles. So in this case, by removing that bit from the bottleneck, it didn't cause her any harm. In fact, she loved it, because then she started doing it for other things as well and just became a better analyst, because she actually had the time and the, you know what I mean, instead of doing the mind-numbing work of
Starting point is 00:13:59 making sure she's double checking and verifying all their figures, she's now actually doing an analyst job of looking through the data and finding the harder things to get. That's perfect. So it's really leveled up basically. Exactly. I guess also giving the whole thing a little more thought on what is actually on the report because I would actually – my question would be the report that she generated,
Starting point is 00:14:24 I'm sure she has done this for a while and it's kind of a standard report. Did she ever question what is on the report and if the report still makes sense? Or did she then also try to level it up and besides just generating graphs, actually coming up with recommendations and solutions? Right. Well, and that was the interesting bit was that, just like you said, we found some, I don't know if this would be a good use case for technical debt in that report, but there was definitely some functionality that was removed. Like I said, we simplified it. And it was the fact that because we broke it down and compartmentalized it into each
Starting point is 00:15:01 section and each dashlet had to represent a section, she therefore had to pick and choose what was the most important thing she wanted to get across in that message. And we had to rethink thresholds and values and, you know, really take a deep look and spend the time making that template for that so that it worked out. So yeah, she definitely leveled up both her skills because she had to, you know, go back and review and what are we trying to accomplish? And I think it kind of took her out of her job and expanded it out. Like you guys talk about, you know, bridging the gap between the teams and how we're all in it together. And I think doing that exercise with her made her realize that, you know, the bigger picture was they needed to get a product shipped out the door. And she was, you know, she can help them better by doing it in this way.
Starting point is 00:15:49 You know, this reminds me of my old... you know, I used to be a performance guy, right? And again, I used LoadRunner and I will not hide at all my disdain for the reporting in that. I used to spend, I don't think I ever spent, you know, four or eight hours generating a report, because at a certain point I got it down to a way to, you know, I had my own spreadsheet that I had set up for data. I could then export the reports from the analysis tool, copy the data in, plug it in, and have it, you know, the macros on the spreadsheet would then start putting in the other calculations. But that was always such a real pain in the butt, you know. And I'd only been using Dynatrace for about a year and a half, close to two years before, you know, this. This, I think, was before Dynatrace had so much, so many of the ES enablement people on board. But there was also
Starting point is 00:16:53 just not the time to try to set up that interface like you did with her. But what I found was really interesting when, with reporting and you kind of touched on it there, is modifying and changing what you report on. As time went on, like I would have these very complex reports comparing all different types of data. And for maybe the development team and maybe for our own team of analyzing and understanding and trying to get a feel of whether or not this release was acceptable, that data was good. But whenever then I'd get called into a meeting with the CTO and some of the other VPs, you know, VP development and all that, they would look at this spreadsheet and be like, what the hell is this? You know,
Starting point is 00:17:36 because it was just way too much data, way all over the place. And we then had to figure out, all right, we have a detailed report. Now, how do I reduce the amount of metrics that I'm reporting on to something more C-level friendly? And I just bring that up because I think it's so much easier when you're using a tool like Dynatrace with the different charts, graphs that you can put in there. You know, besides having the individual transactions, you could very easily look at the overall transactions. And there's just a lot of different ways you can very easily and in a visually pleasing sort of way put that data out so that when you do take it to that C-level team, they're kind of looking for the red light, green light, you know. And if it's red light, you need to have some of those indicators
Starting point is 00:18:25 of why it's a red light. And I think it just makes it a lot easier when you have one of these reports, you know, you can, as you might say, just get all the data and pop that baby in there. And I have to admit, that was a... Rick Boyd asked me to say that at some point during this. Yeah. Oh, that's another story, I think, Brian. So, Matt, do you remember what type of metrics and data they had on that report? Maybe, because our listeners might be interested, what are others doing, what are other companies doing when they analyze load tests, and maybe what did they have up there? What was the initial
Starting point is 00:19:05 requirement, but what did you end up delivering after you figured out, hey, maybe not all the data is actually necessary, but because we now have APM tools we can actually provide much better, more actionable data? Do you remember some tips and tricks? Yeah, so some of the things that we did, uh, or some of the questions I asked... it was more, um, kind of an asking session from her is what I did, and I sat down with some of the developers. And anything that was a known issue, right? So anything like, um, they knew that they were struggling with some database queries, um, specific ones. And so they needed to know invocation count and then response time, right? So you can, and I know you guys talk about the N plus one problem and string concatenations and all that.
Starting point is 00:19:53 But that was the big thing for them was they were worried about the number of invocations they were making. It was quite a complex program. And then they were also trying to relate that back to, right, response time overall. So tying that together. And then there was a few other things. They were worried about geography because this application at the time was going to be deployed. It was going to be a U.S. deployment first, and then it was going to go international quite quickly after that. So they needed to make sure that they factored in latency. So there's a bit of modeling that they were looking to get out of that for their load. So that was it really.
Starting point is 00:20:32 But it was just working from past experience. And I think actually as we got through it, we made a version one of that report, if you will. And then I waited a couple of weeks, let the team give feedback and stuff, and then obviously more questions came out of that, because as you turn the lights on for these, for the people, and begin to show them the visibility that they can have, I think that it kind of dawns on them that, oh, I can start asking for more, right? So until you show them, they don't, they don't know what to ask for type thing. And as they got more intelligent in reading the reports, they began to ask more intelligent questions. And then we actually created a feedback loop where, okay, Matt, maybe we were showing top N queries, but can we show top N invocation counts as well? And can we show things that maybe use this specific select statement, because we think that there may be something going on there, but we're not sure.
Starting point is 00:21:27 And so we kept adjusting that report, and it wasn't the same report by the end. I mean, by the time we hit the final version and the project rolled on and it was a success, it had changed quite a bit. But it wasn't that it was moving goalposts. We still had some of the, you know, concrete data. But then it was all the little stuff around it that helped, you know, the tweaks to that report made it all the more valuable as we continued. Yeah, it really sounds like the kind of report that you can add in a lot of the metrics we've talked about in past shows. Obviously, don't try to boil the ocean at once. But once you start getting all that buy-in, you know, adding things in like total number of queries over the course of the session,
Starting point is 00:22:08 average number of queries per transaction, you know, number of threads, all, all the kind of, I want to say new performance metrics, but they're not really new, but I think kind of new in the fact of,
Starting point is 00:22:21 you know, as the performance team looking beyond CPU and response time. Well, especially new, if you, if new if these are, let's say, load testing people, and whether it's LoadRunner or JMeter or any other tools, but they, as you said, they typically look at response time. And what we can bring to the table, people need to understand, especially with the integration that we provide for load testing tools, so we can actually get the load testing step name or the transaction
Starting point is 00:22:45 name, and then tell them, hey, in this particular test, for the login transaction you have on average 500 database queries, you're consuming that amount of CPU, on average you have 700 exceptions that are typically thrown, and by the way, here are your top exceptions, your top SQL statements, here you have the N plus one query problem. And by the way, why do you execute every single database statement on a single connection? So you can put all of this on a nice dashboard. And it seems, Matt, that you've done this. And then the more people see what's possible, the more they're asking.
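As a rough, tool-agnostic illustration of the per-transaction checks Andy describes, something like the sketch below could sit behind such a dashboard. The metric names, thresholds and sample numbers are invented for the example; in practice the values would come from your APM tool's export or API.

```python
# Sketch: evaluate per-transaction load test metrics against simple thresholds.
# The input data and limits are invented for illustration.
from collections import Counter

THRESHOLDS = {
    "avg_db_queries": 50,   # average SQL statements per transaction
    "avg_exceptions": 10,   # exceptions thrown per transaction
    "avg_cpu_ms": 200,      # CPU time consumed per transaction
}

def n_plus_one_suspects(sql_statements, min_repeats=20):
    # Heuristic for the N+1 pattern: the same statement text executed
    # many times within one transaction usually means a query-per-row loop.
    counts = Counter(sql_statements)
    return {stmt: n for stmt, n in counts.items() if n >= min_repeats}

def check_transaction(name, metrics, sql_statements):
    findings = []
    for metric, limit in THRESHOLDS.items():
        value = metrics.get(metric, 0)
        if value > limit:
            findings.append(f"{metric}={value} exceeds limit {limit}")
    for stmt, n in n_plus_one_suspects(sql_statements).items():
        findings.append(f"possible N+1: '{stmt}' executed {n} times")
    return ("RED" if findings else "GREEN"), findings

if __name__ == "__main__":
    # Example numbers similar to the ones mentioned in the episode.
    status, findings = check_transaction(
        "login",
        {"avg_db_queries": 500, "avg_exceptions": 700, "avg_cpu_ms": 350},
        ["SELECT * FROM user_roles WHERE user_id = ?"] * 480
        + ["SELECT count(*) FROM sessions"] * 5,
    )
    print("login:", status)
    for f in findings:
        print("  -", f)
```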
Starting point is 00:23:19 I think I like that a lot. So kind of leveling up the whole team. So it started leveling up this lady who, instead of just creating, quote unquote, a stupid report that nobody really knew what it was, could then actually really deliver answers and actually get people to ask questions about what else they needed to know. That's awesome, because in the end, everybody was leveled up and they could get more answers in shorter time, because everything was automated thanks to things like the REST API that we provide or thanks to the dashboarding technology that we provide. I love that. That's pretty cool. And you mentioned Dev having Dynatrace. Did they actually have it in their environment or they just had the client?
Starting point is 00:24:02 They had the client. So what she would do then is, uh, the meeting changed, right? So instead of her emailing out the Excel sheet, they'd sit down for, you know, a few hours cause it'd take a while to go through all the different tabs and that it was, okay, we have a single, I don't want to say single pane of glass cause that's an overused term, but we have a single view of, of the truth that we're all going to agree on that it's the truth. And then she could go in there, and then when she would get asked the question, because I think that's the next thing, is when you switch your reporting style, a lot of people don't feel
Starting point is 00:24:33 comfortable, right, because they don't feel confident in representing the data. But she could just revert back to the save session then. And so then she could share that save session file with the developers, have them open it up on the screen in the big conference room, and they would go through it and begin to look at the peer pass and the methods and the API and the response time hotspots and, you know, all the good things that Dynatrace does. So it actually resulted in a greater adoption of the tool in general and an adoption of the culture of we're all going to, you know, realize that a change has been made and it's for the better and we're going to embrace this. And the results of those meetings were far more substantial than before. And I think you brought up an interesting point about the new metrics being accepted, right? I went through that myself as well, back in my old job, when we
Starting point is 00:25:25 started switching over to different ways of looking at the data. And oftentimes, once you have more information, like what you can get out of an APM tool, or even some of the traditional statistics, when you're taking a look, you know, if I run three test runs on a build, instead of reporting the individual instances, you know, every test is going to have some slight variability. Switching over to showing what's the average response change of the three runs... but there had been so many decisions in the past made from your old set of metrics, whether or not they were good or bad, that was the standard. And a lot of that buy-in then for the new set of metrics is exactly how you were describing that. The dev teams were presented with them and they agreed, yes, these are the more important ones. And they were able to then
Starting point is 00:26:23 move forward with the new set without much of a hiccup, because it was a technical buy-in from all the teams that this is an overall improvement and these are much more important and meaningful metrics. Right. And I think it's important to also adopt a common language, because I think traditionally it was infrastructure-based metrics, if you will. So your CPU, your memory, you know, it was all capacity-based metrics, I guess you could say. Whereas the new age metrics, I suppose we could call it, would be more interactive. It's the interaction between the layers that we care about now, not so much the layer itself, if that makes sense.
Starting point is 00:27:00 And I think it was people didn't have access to it. Like you said, before the APM tool sets came in, all we had was Windows Task Manager or Linux Top. You know what I mean? And then that slowly evolved into, okay, now we're looking at how things are interacting and measuring the complexity instead of looking at the silo. And that was a big step. Yeah. of looking at the silo. And that was a big step. Yeah, the other thing I was, you know, the reason I asked if the dev team
Starting point is 00:27:28 was just had the client or if they had the, if they actually had Dynatrace in the dev environment is because obviously then in a place where it's possible, the obvious next step would be to get those same metrics
Starting point is 00:27:42 being tracked and reported on from the dev team. Now, that wasn't the case in this case, as you mentioned. But for if anyone out there is in a situation where in their test team, they have Dynatrace, in their dev team, they have Dynatrace, this is where you start getting, you know, what did you say, the common language. But you start tracking all those metrics early on and track it in the dev phase so that there's a real good chance they'll start noticing those smaller changes without load that they can go back and say, hey, this isn't going to fly once we put it in performance. So we'll fix it now before we even send it on to the next level. Yeah, maybe this was also two years ago.
Starting point is 00:28:24 We didn't really have the free trial yet. And I think this is the free trial that we provide and then the personal license, which basically means Dynatrace for free for every developer or for testers and architects on their local machines. I think that is actually now enabling these teams to actually do these checks early on. I think my big thing that I always try to say now in my presentations is we want to enable developers to not even check in bad code, right? Don't even check it in if you can already verify
Starting point is 00:28:53 that a code change may impact the number of database queries that your code is going to execute later on. So finding more of these problems early on, so to give that lady that you were talking about even better test results because now she can actually run the tests and analyze tests and really find the tricky problems. So I think it's a level up
Starting point is 00:29:16 across the whole lifecycle in the end. So that's pretty cool. Yeah, it's cool too because once the dev and test team start working together on that for some time, you know, they're going to be able to understand, okay, if a transaction is run, you know, X amount of times per minute. And based on what we've seen in the past, when, you know, the dev team increased the queries, maybe by, you know, three additional queries, what was the impact on that under load, right? And you start to get those sort of models figure it out so that, you know, even, you know, I don't know if three queries in general, that's not a very large increase, obviously, depending on the query and the payload and all. But you get
Starting point is 00:29:56 you start getting an understanding from both the test team and the development team, three, three queries under X amount of transactions per second for the ones that call that, you know, what impact is that going to have under load? Uh, and they, the developers can have a much more finessed view of what those minor changes are going to have. Obviously, if they went from three queries to a hundred queries, that's very obvious, right? Everyone's going to know that's going to be bad. Um, But when you have those more finessed ones, you'll start developing a model over time to understand the impact that these smaller ones will have in conjunction with the amount of transaction, the transaction rate that it's running.
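That back-of-the-envelope model is easy to make concrete. Assuming made-up numbers for per-query cost and transaction rate, a few extra queries per transaction can be projected to load-test scale like this:

```python
# Sketch: project the cost of a few extra queries per transaction under load.
# All numbers are illustrative assumptions, not measurements.

def extra_load(extra_queries_per_txn, txn_per_second, avg_query_ms, duration_s=3600):
    extra_queries = extra_queries_per_txn * txn_per_second * duration_s
    extra_db_time_s = extra_queries * avg_query_ms / 1000.0
    return extra_queries, extra_db_time_s

if __name__ == "__main__":
    # "Three additional queries" at 50 transactions/second for a one-hour test,
    # assuming each query averages 5 ms on the database.
    queries, db_seconds = extra_load(3, 50, 5)
    print(f"{queries:,.0f} extra queries, ~{db_seconds / 60:.0f} extra DB-minutes per hour")
    # 540,000 extra queries and roughly 45 DB-minutes of work per hour,
    # which is small per transaction but very visible in aggregate.
```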
Starting point is 00:30:33 So over time, that team would become, you know, just stellar. Yeah, and they become more of behavioral analysts than, you know, system analysts, if you will. It kind of switches their skill set because they evolve. Right, right. So what I would like to kind of conclude that topic, and I would like to talk a little bit about the future, meaning how can we make it even easier.
Starting point is 00:31:03 But before we switch into that topic, Brian, I think you have another trivia question, meaning, you know, how can we make it even easier? But before we switch into that topic, Brian, I think you have, do we have another trivia question actually for our listeners? Is that time before we move on to what's coming up next? Well, yes, but wait, do you hear that noise? That is the, yes, the trivia time. So we missed last week's trivia. I thought really hard about getting one and, and, and phrasing it in a way that won't make it too easily Google-able, although you, or Bing-able, sorry, Andy. Um, but so the, the trivia question for today, um, based on this episode airing on, um, in, in early August, what, what date did I say that was going to be the, um, I think the 19th ish. Well, no, no. 16th, I think. Um,
Starting point is 00:31:53 back 36 years ago on August 17th, a really, really great eighties comedy called Real Genius premiered. It starred Val Kilmer and a bunch of other people. And this was before Val Kilmer became a Jim Morrison, uh, and became a real stuck-up and just egotistical bastard. Hilarious, hilarious movie. It's a great 80s movie. And it features a bunch of characters who, you know, just really loved experimenting and learning. These were like super, like back in the day.
Starting point is 00:32:19 These were like the stereotypical nerds who most of them, except for Val Kilmer's character, of course, were not very well adjusted to operating in real society because they were just too smart. You know, today they would probably be Silicon Valley darlings and millionaires. Anyway, the great movie. Go see it. Definitely recommend it. But the question about this movie, then Real Genius, if you have seen it is what does real genius have in common with Napoleon
Starting point is 00:32:49 so if you know the answer to that you can tweet it back at hashtag pure performance hashtag or hashtag pure performance at dynatrace or hashtag pure performance, hashtag no prize, K-N-O-W-P-R-I-Z-E. Just get back to us or you can email your answer to pure performance at Dynatrace.com and your name will be featured on our website as being a real genius yourself. So with that, let's talk about the future.
Starting point is 00:33:27 Yeah. So here's one thing that mind-boggles me all the time. And Matt and Brian, I'm sure you've seen this the same way. We run into the same problems all the time, right? Meaning the apps that we see, the apps that we test, the apps that we monitor in production, it's always the same problems.
Starting point is 00:33:49 by looking at the number of SQL statements that are executed per transaction or others. So now looking at the story, Matt, that you told with this lady, she's a lady that is analyzing load tests, and maybe she doesn't have all the details about what is Hibernate and what is Spring and what is this new cool cloud technology that my developers are now using. So she's probably not aware of all the potential problem patterns that are out there. Yet, she has to provide meaningful answers on the load tests. So now my problem that I have with this is with the ever-growing set of tools and frameworks and technologies and the ever-growing problem patterns,
Starting point is 00:34:34 how can we make more people performance experts and actually experts in these problem detections without having Matt all the time sitting there and telling them what to look for. So my question is how can we scale? And the trend that I'm seeing, and this is also the trend that we've been doing at Dynatrace, is we try to automatically detect all these problems so that people like performance analysts don't have to create a dashboard that puts three charts next to each other and then correlate time-wise or however they correlate to figure out what the problem is. I think a big future trend is that we really automatically detect the most common problem issues. And I wrote a blog about this on June 13th. So if you go to
Starting point is 00:35:23 apmblog.dynatrace.com, there's a blog about automatic problem detection with Dynatrace, actually highlighting all of the problem patterns that we've seen or I've seen in the last couple of years. I'm sure there's many more out there. So we look at the PurePaths and transactions that we record, for instance, during a load test and say, hey, 50% of the PurePaths that go into that particular URL that came in from that particular test struggle with the N plus one query problem. And you not only look at individual tiers or applications, but you also look at the communication between tiers, how microservices are talking with each other. So these are all problem patterns that we now automatically detect. And I think this is the future. So not only giving data, because we are, and I think Brian, you mentioned that in the beginning,
Starting point is 00:36:22 we are kind of drowning in a sea of data, but it's hard to actually make sense of it. So our contribution is to automatically detect problems and also anomalies. This is stuff that Ruxit has been really good in and was built on, that we analyze the data for you and instead of you having to do it manually, we just bubble it up. We bubble it up and we tell you if there is an impact, where is the impact? So I know this was a long statement, but are there any other problem patterns that we need to add to the list of what we currently detect, or any other ideas on how we can scale
Starting point is 00:37:11 and make more people performance experts and architectural experts? What else is out there that we can do? Well, one thing that comes to my mind that shouldn't have to come to my mind but still happens is, as far as a problem pattern, and I don't know if this is achievable, but one to look into is being able to detect a deployment to production where logging was turned on at debug level. You know, there's some of these what you might call silly problems that unfortunately still exist. Hopefully with, you know, the advent of containers and this, you know, the more automated one set, one type of setting for all environments kind of thing, will help get rid of that. But, um, I guess you can still have that issue if you build your container with logging on.
Starting point is 00:38:10 But that's the only one that jumps out of my mind being put on the spot right now. So, um, Matt, what else do you do, or how, I mean, coming back to the story that you had, where she was sending these reports and maybe somebody looked at them or not: how do you educate the receivers of these reports that times have changed? It's no longer about these reports that contain whatever data, but it's more about detecting, I think, regressions, right? That's what it is. We detect regressions from build to build, from deployment to deployment. But, I mean, did you see in your work, basically from the architects or the business people or the C-level people, that they also started
Starting point is 00:39:09 looking at this performance data from a different perspective than they used to? Yeah, most definitely. It was actually the project manager who I think began to realize that the conversations were changing. And, you know, at first, obviously, he had some skepticism around some of the technicality that we had to work through. But yeah, it was, and I think a lot of the times, people lose sight of what the report's there for in the beginning, like you said,
Starting point is 00:39:36 you know, regression. But if we take it up a level higher than that, and say, well, what are we trying to do? We're trying to deploy something that is scalable, something that reacts well, and overall just gives a good customer experience. And if people keep that in the back of their head when they're making these reports, or when the development asks, you know, what's the next thing to be doing?
Starting point is 00:39:58 What should we be improving? Well, you improve the things that is going to give that good customer satisfaction. And that I think is sometimes lost when you get into the churn of reporting and testing is, you know, you get too analytical and you forget about the soft science of it, that we're doing this for people on the other side of it. And just like I was able to save, you know, her loads of time on reporting, they're saving people loads of time on whatever that application function is, because that's why we use technology after all is to, you know, make our lives easier. So I think it's important to keep that in mind that that philosophicalness of it.
Starting point is 00:40:34 Yeah, I was gonna say the one other thing that I think people should, you know, really start looking into more as well. You know, you have your, you know, Matt had that great dashboard sending out the report for that current test result. But I think it's also behooves everybody who has a tool that can do what we do, which is the regression analysis, where you can take your previous build and run a regression against the current build. And it'll show you what's changed. You know, you can compare the database, API calls, all these different kind of metrics directly out of the box. You just say, here's my source, here's my comparison. Show me all the differences, what got better, what got worse, what got introduced, what got deprecated.
Starting point is 00:41:20 And I think maybe the next level on that side would be if there are a way, would be, so just a way would be. So just number one, that's a very important tool to use, right? Because you don't want to just, you know, you know, run a performance test and say, yes, it passes. It's got to be looked at in comparison with the history of the product as well. So if it's a brand new product, obviously you have nothing to compare it with. But if it's been around, you know, yes, everything looks good. But what did it look like before? And now that you have all these other metrics, what are those changes there? I think it would be really interesting to see a way to make a, you know, you were talking about these,
Starting point is 00:41:55 this automation of the reporting, automation of telling people what their, their, their problems are, you know, as you're mentioning, I think it's going to be in 6.5, Dynatrace 6.5, which I think will be in, probably should be in early access by the time this airs, correct? Exactly. It should be early access
Starting point is 00:42:16 and also part of the free trial. Yeah. Right. But I think the next step then would be with this regression comparison concept is a regression hotspots report. Because if you take a look at any kind of regression dashboard, once again, there's a sea of data, there's a whole bunch of things. And it comes down to looking at all right, you know,
Starting point is 00:42:35 but maybe that 500% increase was from, you know, 220 milliseconds, or maybe 20 milliseconds to... you know, I guess not 500%, but if it went from 20 milliseconds to 25 milliseconds, that's a significant statistical increase but maybe not necessarily a performance-concerning increase. Um, I think from that future-of-automation kind of point of view, it would be how do we bubble up and show the hotspots of regression? Yeah. And actually, if you think about the problem patterns that we detect, so for instance, we detect, and let's take the all-time favorite, the N plus one query problem. So if you can run a load test, then we can tell the user, hey, this load test has 20% more transactions that suffer from the N plus one query problem.
Starting point is 00:43:29 Even if it doesn't have a performance impact right now, it's still – this is a big hotspot. And especially now in scalable architectures, I think looking at regressions based on problem patterns actually becomes more interesting than ever before, because these problem patterns tell you whether you are building up a problem. It means that in operations later on, somebody needs to pay the bill for the additional transactions that you execute against your database, whether the database is hosted in-house and you need to pay licenses for the database system or whether you run it in the cloud
Starting point is 00:44:17 and you need to pay transactional fees for every single transaction you execute. So I think this is going to be very interesting to look at these patterns and how from build to build, whether you test it with load testing or even production, how the number of patterns that are included in your app change from build to build to build. I think I like that.
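A regression "hotspot" filter of the kind Brian and Andy are sketching does not need to be elaborate. The example below, with invented data and thresholds, only flags a metric when both the relative and the absolute change are meaningful, and also diffs problem-pattern counts between two builds:

```python
# Sketch: flag regression hotspots between two test runs.
# Requires both a relative and an absolute change, so that 20 ms -> 25 ms
# (a 25% jump nobody will feel) is not reported next to real problems.

def hotspots(baseline, current, min_rel=0.20, min_abs_ms=50):
    flagged = {}
    for name, old in baseline.items():
        new = current.get(name, old)
        rel = (new - old) / old if old else 0.0
        if rel >= min_rel and (new - old) >= min_abs_ms:
            flagged[name] = (old, new, rel)
    return flagged

def pattern_deltas(baseline_counts, current_counts):
    # Compare how many transactions showed each problem pattern per build.
    return {p: current_counts.get(p, 0) - baseline_counts.get(p, 0)
            for p in set(baseline_counts) | set(current_counts)}

if __name__ == "__main__":
    base = {"login_ms": 220, "search_ms": 20, "checkout_ms": 800}
    curr = {"login_ms": 480, "search_ms": 25, "checkout_ms": 830}
    for name, (old, new, rel) in hotspots(base, curr).items():
        print(f"HOTSPOT {name}: {old} ms -> {new} ms (+{rel:.0%})")
    deltas = pattern_deltas({"n_plus_one": 40, "debug_logging": 0},
                            {"n_plus_one": 48, "debug_logging": 3})
    for pattern, delta in deltas.items():
        if delta > 0:
            print(f"{pattern}: {delta} more affected transactions than last build")
```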
Starting point is 00:44:41 That would be an interesting approach. And I think it's also a good point. You were talking about, um, how it can increase now, but, you know, you don't know that effect. But on some of these cloud services, I know they charge for, like, the amount of data transferred from a box, for instance, or, you know what I mean, those types of things. So if you're looking at, you know, you're introducing a statement that causes, you know, 20% more throughput, well, that may have just driven your overall business cost of that application to go up. And to convert it, you know, for the business side, the performance number may not be as big of a deal to them, but if their cost goes up, that's direct: they're paying the bill every month and they're going, oh, there we go.
Starting point is 00:45:30 And how cool would it be if you could input all of your charges and all the costs for different resources within the systems and at the end of your test, it tells you how much more it's going to cost automatically. That would be something really awesome. Right. You know what I mean? Yeah.
Starting point is 00:45:48 And we could do this through our automation features, right? I mean, even though it might not be directly now built into Dynatrace, but you can just take the REST API, pull out the data after the test, and then do the math and just publish this somewhere in a report, an automatic report and say, hey, the cost just went up by $2,000 or something like that. Yeah. I like that. Yeah. All right. Well, we are at the, about the end of the show.
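That "pull the data and do the math" step is also small enough to sketch. The unit prices and measured values below are entirely made up; the point is only that once query counts and data volumes come out of a test automatically, the cost delta is one multiplication away:

```python
# Sketch: turn test-run deltas into a cost estimate.
# Unit prices and measured values are illustrative assumptions.

PRICES = {
    "db_transactions": 0.0000017,   # $ per database transaction (example rate)
    "egress_gb": 0.09,              # $ per GB transferred out (example rate)
}

def monthly_cost_delta(baseline, current, monthly_scale):
    # baseline/current: measures from one test run; monthly_scale: how many
    # times the tested load is expected to occur in a month.
    delta = 0.0
    for measure, price in PRICES.items():
        delta += (current[measure] - baseline[measure]) * monthly_scale * price
    return delta

if __name__ == "__main__":
    baseline = {"db_transactions": 1_200_000, "egress_gb": 4.0}
    current = {"db_transactions": 1_450_000, "egress_gb": 4.8}
    print(f"Estimated extra cost: ${monthly_cost_delta(baseline, current, 720):,.2f}/month")
```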
Starting point is 00:46:19 Any final thoughts from anybody here? Yeah. Just wanted to say, I'd like to send out a challenge to all our listeners. For those of you that have the APM tools, take a bottom-up approach once. Take a look around at the people that you work with. See if you've got anybody burying their head in the screen and see if you can make their day better. I guarantee it'll be really rewarding for you if you do. Yeah, and we'd love to hear what your favorite, uh, reports are,
Starting point is 00:46:46 uh, your favorite metrics to, to include on those reports, reports are. Uh, and if you have any, um, you know, great stories of a transformation like Matt, we'd love to have you on as a guest to discuss it. Um, you can contact us through Twitter, you know, just Twitter, um, hashtag pure performance at Dynatrace and we'll get that. Or you can, um, email pure performance at Dynatrace.com. Um, but we'd love to hear your stories or any kind of favorite metrics, dashboards, reports, uh, and anything you might've done in that case, uh, to Matt's point. Um, and I believe, um, what is the free trial is bit.ly, right, B-I-T dot L-Y slash, what is it, Andy? The DT Personal, I think that's the personal license, or that's the easiest way. Then you can fill out the form and you will get not only the free trial, obviously, but on that same page,
Starting point is 00:47:40 you also get the information about why we actually give it away for free as a personal license. So bit.ly slash dtpersonal, that will be the best way to sign up. Okay. And the Pure Performance webpage, you can either find us on Spreaker by looking for Pure Performance, or, and I'll do this website in the early 2000s style, you can find it on the World Wide Web at http://www.dynatrace.com
Starting point is 00:48:12 slash pureperformance. Andy, do you have any final thoughts? Well, the only final thought that I have is thanks for the first 10 episodes. Yes. And an apology that I didn't get the hint in the beginning of the session that I just thought about Brexit, but not about the 10th episode. And I'm just very happy, Matt, that people like you just share the stories because in that way we can scale.
Starting point is 00:48:38 So what I wanted to say earlier, how can we make more people aware of performance and make them more performance experts. And I think sharing stories is just the way. And thanks to the World Wide Web, we can record things like this and then make it available and accessible to a larger audience. So thanks for that. Thank you, everybody. And will there be any appearances by any of you sometime near after August 16th that you're aware of? Well, I will appear at the DevOps Days in Boston.
Starting point is 00:49:12 That's been confirmed. I've also heard that I will be speaking at Java 1. That's going to be nice. I'm still waiting for DevOps Chicago, which hopefully they will also accept me. That's about it. And I want to congratulate you on saying Java 1 and not Java 1. Although I think that's Mr. Cop's big speaking. Java, yeah.
Starting point is 00:49:37 All right. Unless there's anything else, I'm all set here. Thank you, everybody. Thank you for 10 episodes. And don't forget the Pure Performance Cafe. We should hopefully have some more of those out there by this time. But nice, quick, short hits for the shorter commutes. But thanks again.
Starting point is 00:49:54 It's a great opportunity that we're getting to do this, and we're having a lot of fun. So thank you, everybody. Bye-bye.
