PurePerformance - Cloud Migrations Gone Wild and other Patterns with Brian Chandler

Episode Date: March 21, 2022

Lift and Shift seems to be “the easiest” cloud migration scenario but can quickly go wrong, as we hear from Brian Chandler, Principal Sales Engineer at Dynatrace, in this episode. Tune in and learn how latency can be the big killer of performance as you partially move services to the cloud. Brian (@Channer531) also reminds us why you have to know about the N+1 query problem and its impact in cross-cloud scenarios. Last but not least, Brian gives us insights into why Uber might be one of those companies that can change the SRE & SLO culture within partnering organizations.

Show Links:
Brian Chandler on LinkedIn: https://www.linkedin.com/in/brian-chandler-8366663b/
Brian Chandler on Twitter: https://twitter.com/Channer531

Transcript
Starting point is 00:00:00 It's time for Pure Performance! Get your stopwatches ready, it's time for Pure Performance with Andy Grabner and Brian Wilson. Hello everybody and welcome to another episode of Pure Performance, where we remind everybody: don't bite your friends. My name is Brian Wilson and as always I have with me my co-host Andy Grabner. Hello Andy Grabner, how are you? Have you been a friend lately? Not lately, but I was actually wondering, how do you define a friend then? Like, do you just change the definition of a friend in case you bite somebody, because then they're no longer your friends, potentially? Well, I would say
Starting point is 00:00:50 you have friends and fiends, and the difference is an R. So fiend is okay, friend not okay. It doesn't matter the definition, as long as they're... How about: you can bite your fiends, and that's it. Okay, well, I never heard the word fiends before.
Starting point is 00:01:06 A fiend is like an evil villain. A fiend. A fiend. So the good news is I think we have a friend on the call today. I think we do too. Not a fiend. I will not bite our guest. Yeah. And he actually just came off of an observability clinic recording with me
Starting point is 00:01:25 where he enlightened me on the seven SLOs to start your day, which was really kind of cool. Like not only to start your day, but in general, seven SLOs you need to consider and build. And he actually showed us how to build it. This is, I think, a discussion for another day. But today we have Brian Chandler on the call. Hello, Brian.
Starting point is 00:01:49 Hey, Brian. Hello. How's it going? Good to see you guys. Or, you know, listen to your voices. Well, you can see us. It's been a while. You can see us.
Starting point is 00:01:57 Don't pretend like we don't see you. Okay, okay. So it's not a secret to the people then. We can actually... And we know you hear our voices every night because you go to sleep listening to our podcast. That's exactly. You just have that sultry voice i mean that's that is my habit like i i that's how i go to sleep i got the uh i got like four different copies of this little tiny earbuds that i that i use i cycle through them you know i got a hot click button
Starting point is 00:02:19 on amazon to rebuy them every three months. So that's what I do. Anyway. Best to be invited. So Brian Chandler, for those people that don't know you and that want to differentiate you not only by your last name with the other Brian, who are you?
Starting point is 00:02:34 What do you do? Why do you think you're here? Yeah, well, I don't know. You dragged me in here. No, I'm just kidding. No, it's good. Super awesome to be here. I'm a solutions architect for Dynatrace.
Starting point is 00:02:48 I cover the Southeast United States. So I work with all of our customers, really, and kind of their most challenging problems out there. Different verticals could be healthcare, financial. Before my time here, I actually was a systems engineer at Raymond James Financial down in Tampa, Florida there. So I've also worked at Volkswagen of America up in Detroit, Michigan, as well as kind of a performance analyst. And, you know, I'm just here to share all my knowledge or lack thereof or whatever or you know whatever you guys decide it is i don't know i guess i have to see if anything is entertaining or educational i think the uh when we had a chat right we just didn't drag you because you had to make
Starting point is 00:03:35 a compelling case about some stories yeah we talked about cloud migration and we talked about things that can go wrong in cloud migrations. And you said, well, I have a couple of stories. And then we discussed. And I think we at least want to talk about two of the stories that you have learned firsthand. And because I think it's relevant for everyone out there that is moving to the cloud, always providing services to other cloud vendors, to an API, because you want to make sure you're not messing up. So this is lessons learned from shifting to the cloud. Is that what this is? I know I read the notes a little bit, but just to summarize.
Starting point is 00:04:15 Yes, and then some. You're spot on, Andy. The reason why I wanted to come here today is just because I'm really seeing, you know, you see all the stuff in the news and the buzzwords and the new phrases that are coming out in the IT space. And they're always talking about all the pitfalls you run into. But, you know, frankly, I still see a lot of really common, you know, pitfalls and stuff that are easily addressable, especially for, say, cloud migrations, which, you know, folks are still doing, organizations are still doing. And then also I got a different story that's slightly similar I wanted to get into,
Starting point is 00:04:50 but was more around kind of SLA, SLO focus, because that's kind of where another kind of buzz phrase and big trend in the market is organizations are starting to build out SRE practices, site reliability engineering practices. And I just thought I had a really, really good story around actually one of the most popular apps out there, Uber, that I wanted to talk about. So yeah, we can dive right in. What do you guys think? Let me know. Yeah, let's do it. Let's do the cloud migration first. Yeah. Yeah. So in the US, and this might be a total foreign thing to you, Andy, but Americans have to pay for their health care. And there's this thing, as crazy as that is, there's this thing called open enrollment that
Starting point is 00:05:33 most health care organizations have to go through over here. And that's basically, it's a two-month or a one-month window where all the benefits providers will basically open up their doors and me and Brian Wilson will go in and pick which healthcare plan we want and all that. And then we're locked in and then that's it for the year. And then next year we'll do it again. So that whole process, open enrollment is basically like a black Friday for the healthcare industry. But the thing is here is it lasts like a long time, not just one day, obviously. So it's super important. A lot of organizations will put like, you know, a complete code freeze in, they'll have like a red line in the sand for,
Starting point is 00:06:16 you know, when, when changes need to be done. In this case, it was this healthcare software provider that basically builds the application that folks will go into during open enrollment to pick their benefit plans. Now they build this software and basically their customers are the major healthcare providers. And they'll basically just change up the UI and to the customer, the end user, me and Brian Wilson, it'll look like it was healthcare insurance company ABC. But in reality, the people that built the software is this healthcare software company that I was working with here. So this was super important. They were actually trying to move that legacy stack, which is basically sitting in a brick building somewhere in their private data center on bare
Starting point is 00:07:12 metal servers, who knows where. And they were trying to take that and shift it up to Google Cloud. So they kind of made an organizational decision to then basically say, okay, we're going to take this and lift and shift it. You know, that's kind of a thing that you guys have probably heard before. That's what, hey, let's just take this app and stick it in the cloud. Yes, exactly. It always succeeds, right? So this is what they were going to do. And, you know, on paper, it might seem fine. You know, you would go through the compute catalog of these different public cloud providers and you'll say, okay, well, you know, we're running a two core server out in the data center here and with, you know, four gigs or whatever their specs are, let's just
Starting point is 00:08:01 spin those up in the cloud and then copy paste the entire stack into the cloud there and it'll just run fine. Right. Well, what they ended up doing was they took half their application. So the front end, which is written in PHP legacy code on basically just Linux servers. And then they took that and stuck that in the cloud, and their phase one was going to be, okay, we're only going to put the front end up in the cloud, but we're going to keep our mainframe and DB2 and legacy IIB services down in our brick and mortar data center. What could go wrong? I guess just one question here. Maybe somebody told them on the web performance side, you need to bring your web performance, your web generating things closer to the end user.
Starting point is 00:08:51 And maybe that's a good idea where you want to start with this as well, because you will immediately get some benefits. Yes. And I just need to add too that it's funny you said legacy PHP, because how long has PHP really been around? I mean, I know what you're saying, but it's still funny. In a fast-paced world of technology and languages, we can say legacy PHP. Yeah.
Starting point is 00:09:14 Good point. That's a great point, actually. Yeah, so they take that web front end, put it up there. And you're right, Andy. Sure, maybe the first page gets loaded. But what you don't realize is that these systems that are built over the course of how many decades or whoever was basically, they only took half of those services and put them up there. And then as you're probably, I'm sure you guys are aware, a complex transaction, especially something that's trying to deal with a bunch of medical records and things like that, and benefit plans, it's probably bouncing off of many different services. And half those services are still sitting in a brick and mortar data center across the,
Starting point is 00:10:03 you know, on the other side of the country. So if you think about it, when it comes down to it, they basically took an application stack where it was sitting in a room and the two services were sitting five feet away from each other. And they had a direct line, you know, across the network to the next server, the main firm of the DB2. And they basically put them on Mars, the front end, essentially. So every time now the front end needs to go talk to their old school mainframe DB2 IIB services, it's going across the WAN, connecting through the public cloud network, out to some ISP, going across the country every time it goes over there. Now, maybe it's kind
Starting point is 00:10:46 of fast, right? You would think if, if for every transaction, meaning if a user loads a page, if every time a user loads a page, it only has to connect across the WAN, you know, one time per transaction, if it's a little bit slower, maybe it's not a huge deal, but therein lies the problem. And what they didn't realize was the amount of times these services interact with each other every time a page loads. So the first major pitfall that they ran into here was they basically, and I took some notes here because, you know, I'm a note taker guy. I learned how to do this in school. So basically what you're going to see here is their PHP app, which was sitting up in the cloud, was connecting on average 10 separate times, creating a new HTTP, like TCP handshake, across the WAN to their IIB services in the backend. And then, so what that did, and it called it sequentially, too.
Starting point is 00:11:41 So, if you think about it, you had the PHP code executing in the public cloud and it needed to go get a record or something, right? So it would go say, Hey, a data center, brick and mortar data center on the other side of the country. I need to talk to you for a second. Go get me this record or go get me this benefit plan. Let's do some transactional stuff. Let's go back. And then the, it would, the transaction, the like transaction with the transaction would return to the PHP in the cloud. And they'd say, okay, that's great. We got one record. Now we got to give it 10 more.
Starting point is 00:12:10 Let's go talk to that data center across the other side of the country again and see what's going on. So it's known as probably in the performance space. And you see this a lot with like services talking with databases, the N plus one problem, where you're basically doing a very simple operation, the same operation over and over and over again, where you probably should have architected it to where it just did one operation and got multiple records. So that's in essence,
Starting point is 00:12:35 what's happening here. Now, again, just the pitfall here is, you know, that could be dealt with the way that was designed when these servers were sitting five feet away from each other on bare metal services or servers. Right. But as soon as you split those two tiers apart and put one on the other side of the country, having to connect across the land, every time it wants to grab one of these records, well, that's when you start having a problem. basically what we noticed was that this php code you know it was taking like 15 20 seconds for a page to load whereas when it was in their you know legacy data center it was taking like five seconds right because now instead of having that
Starting point is 00:13:18 super quick connection it's just going to go across there so when we were talking about uh these things to the team there, they basically just said, okay, we need to take a step back, crack open this PHP code. If we're going to keep that backend in our brick and mortar and we need to make these asynchronous calls, meaning that it's not going to go back and forth every time it needs a record across the WAN, We're going to try to make these asynchronous. So they either does them all at once or we combine, we refactor the PHP code. So it combines these duplicate calls into one. So that was just a major thing that we found.
Starting point is 00:13:58 And, you know, it's just kind of like something you don't realize kind of when you're shifting to the cloud there, but yeah, I don't know. You see that, you see that a lot there. I got one other thing there, but what do you guys think? I mean, is that my crazy here? I got two thoughts on that, but I'll let Andy go first because he's usually smarter than me. I'm not smarter than you, but I think we, we think about the same thing because we've been talking about this for so
Starting point is 00:14:19 many years, the, these patterns that we see. And obviously they, these patterns have less of an impact a when you are as you explained in the same network or b if you don't have the right set of data which means if you have test data and it's very small and then you're surprised if the first time you're having production data and then it hits you. The sad thing about this, obviously, is that these things can be found even in development environments
Starting point is 00:14:50 because detecting something like the M plus one query problem is doable with a single user and with a very small environment where you have two data records in the database. And then maybe one other thought, you said changing the front end to an async chronos uh processing model i think what you are what i would suggest is this is what i see a lot is kind of batch providing batch apis or really analyzing how the front end how a how a consumer is consuming a certain service and then based on these patterns then design
Starting point is 00:15:26 correctly your api and in this case it sounds like a batching and whether then you call it asynchronous field synchronous they think is then a second decision to make i think the batching will make the path the biggest impact in this case yeah yeah well here are my thoughts on i had two and i actually wrote them down for the first time ever i I took notes, too, Brian Chandler, because I was taught that, too. The first one is, you know, Andy, you brought up the benefit of the doubt idea of maybe it was they wanted to put it to the web interface in the cloud to get it closer to the customers. So the first thing I would say is as you're doing that pre-release, you need to test what that latency is when it's on your data center. So you know what that latency is. What are your furthest customers?
Starting point is 00:16:08 What's that hit going to be? So that now when you move that to the cloud, you can measure the hit between the web and the backend and see if you're losing that tradeoff. Yeah, maybe you get it to the front end customers, but maybe that was only two seconds. Whereas if you separate it, you add five seconds. So total three, bad idea. You can just kill that then in there so having the data points of why you're going to do this and what you're hoping to gain from that and then measuring those data points and verifying that you're going to get that gain is key number one and number two the first thing i thought about when you talked about
Starting point is 00:16:40 all those calls going from the data from the cloud back to the data center is that's public network. That's higher costs and fees, right? So are you saving money by making that choice and making that move? Because I believe still, I mean, correct me if I'm wrong. I know back in the earlier days of clouds, if you go public network, that's where you're getting the network charges. Is that still true? There's some egress charges i believe still yeah that's another really good consideration there so we always talk about the cost of your performance right are you making it more expensive or are you making it save money and to any point if you batch those and make one call yeah you're still gonna have a data transmission but it's
Starting point is 00:17:18 gonna be a lot less because you're not doing it individual individual calls so things to look out for when you're making these moves and as andy says this can all be done in that shift left mindset in that let's get it up in the dev environment see what's going on and then make an evaluation make an evaluation to say this is a good idea or this is not going to work the way we're doing it and maybe one thing to prepare for things like this i think what would be public knowledge is what's the latency right now and what's the latency going now and what's the latency going to be between that cloud provider and your data center so let's say the latency used to be one millisecond i'm making a number up now it's 50 milliseconds you can use chaos
Starting point is 00:17:56 engineering you can use any type of software that is artificially slowing down or imposing latency in your local network. So you can already test this. How does the system behavior change if you all of a sudden have a 50x latency between two servers? Great point. Absolutely. And it's funny, Brian, you mentioned data points and how important they are. Yeah, one of the things I just want to, one of the things that really opened their eyes to this was, you know, I created a report for them showing, cause, cause they kept one of their environments,
Starting point is 00:18:34 a version of their version of their stack wholly in the brick and mortar data center. And then they did the hybrid one, which was the front end in public cloud and then brick and mortar. And it was just a simple chart showing every time you're making one of these calls, just, just from network time alone, it's a 10 X difference. So it was adding, it went from like, like something like 20 milliseconds time on the brick and mortar in network time. And that's not even the compute time. That's just two, two entities trying to start to talk to each other. Right. When you shifted it to the cloud, it was like 200 milliseconds. Every time a cloud resource was coming down to even begin to start do something with the on-prem. And when that thing, you know, you add that up 20, 30 times,
Starting point is 00:19:18 it's a huge difference in seconds that you end up just adding in network time. So, you know having having good data points uh to use in those situations is a good thing and i just want to point that out and i want to ask you one last question oh sorry q i was gonna ask you one last question on that setup was all that stuff that you did was that before they went to production or was this after production uh this was this was well that's that's the funny that's now you're gonna now we're getting into the dirty laundry part of it because yeah it's uh i was gonna say this was before... Well, that's funny. Now we're getting into the dirty laundry part of it. I was going to say this was before production, and congratulations to them.
Starting point is 00:19:50 Yeah, well, it was, let's say, before our engagement there. So their project was actually delayed for a little bit because of this, because they couldn't find it. And that's kind of when we started engaging and kind of showing them these data points, because it's one thing to kind of whiteboard in theory, what it's going to look like. And that's what they did. They didn't have that data, I guess, maybe more directly answer your question. They kind of read the specs of the public cloud doc and said, oh, okay, here's a server that's going to have this much, you know, disk performance and, and this much, you know, network performance across
Starting point is 00:20:31 the WAN. But then when actually rubber met the road, it didn't turn out very well. And that's when they started realizing these things. But, but yeah, they're actually what this current status is that they're not actually in production per se yet, but they're trying to get there before the next, the next open enrollment period, which is going to be, you know, this, this coming fall. In fact, with these findings, what they're finding is that, listen, they got to, you know, they can't take a mainframe and put it up in the cloud. So now they've realized they basically are going to rewrite their middleware services that talk to DB2, the mainframe stuff, and rewrite it in Spring Boot and have that run in the cloud next. Aren't there some mainframe cloud offerings now?
Starting point is 00:21:21 I thought someone had one. Not that I'm saying they should do that, but I thought someone was trying to do something. Anyhow, it's a different topic. Does it come with like balloons that'll help float it up there? Exactly. So Brian, you said there's another thing. Oh no, those are the two.
Starting point is 00:21:37 Those are the two, the calls of the public network calls and then the latency. There was one other thing. Yeah, that Brian. Yeah. Just keeping ourselves on our toes here. There's one other thing in that area, which I thought like, you know, so Redis.
Starting point is 00:21:55 So they had a caching mechanism as which they actually took up into the cloud as well. So you would think that wouldn't be problematic. Redis, PHP talking to Redis, both on-premise on bare metal service versus PHP to Redis when they're both in the cloud. Because when you think about it in the public cloud, there's even layers of abstraction there from a networking perspective, even when two servers are talking to each other within the cloud. So we even found that PHP talking to Redis every time PHP went out to Redis. And that was also sequential sequential that added about like 10 to 10 microseconds to one millisecond, a difference per Redis call. But the kicker here is PHP was calling Redis literally thousands of times per transaction. So you take that like, you know, multiple millisecond,
Starting point is 00:23:05 a hundred millisecond, sometimes addition per Redis call, you're adding seconds onto the transaction just by nature of the networking infrastructure in the public cloud versus, you know, like a kind of a bare metal situation in your own data center. Because if you think about they have virtual kind of network definitions in the cloud. So it has to go through kind of extra handshakes, even when servers are just trying to talk to each other within the cloud.
Starting point is 00:23:33 So I thought that was really interesting as well. And in there, what they decided was, well, we got to make sure that we optimize how this thing is talking to Redis, right? What kind of resources we're caching and how often it goes out there. It just goes to show that how much, you know, design things, I don't want to say almost can be covered up because I mean, it's, you know, not to say that maybe not to say they had a bad design. I mean, if it,
Starting point is 00:23:54 if it was performant in one environment, who's to say it was, it's a bad design. It's just that you have to, you have to take into consideration what your design will look like and the implications of changing the environment on a design that worked in another environment. So I just thought that was another interesting kind of point there. I would still say there's no excuse for that design, regardless of where it runs. In the end, it's not efficient, and you were just lucky that it was never found that it's not efficient yeah well the good thing is that just like when you move we've all moved right in the past that's when you go through and discover all this garbage you've had and you start cleaning house and re yeah oh there's piles of stuff you're like why was he even saving this what was i thinking with this stuff and this is a similar case they're moving
Starting point is 00:24:43 they're shifting and all this garbage is found. And when you go through those exercises, it's a great opportunity to really look at what's going on in your code and say, all right, we survived somehow with this. That pile's been sitting over there, but now that I'm moving, let's go through that and see what the heck it's doing and see what I really need. Absolutely.
Starting point is 00:25:04 Cool. Hey, Brian. Well, thank you for letting me vent. I feel need. Absolutely. Cool. Hey, Brian. Thank you for letting me vent. I feel like weight off my shoulders. That was a big thing I just need to share with you guys, these types of problems. It's like sometimes you don't run into kindred people that understand, and it's good to be on here to kind of talk about it.
Starting point is 00:25:21 It's good, and it also feels good for us because we, we are on the one side, I'm sad that the same problem patterns are still out there on the other side. I also know that Brian Wilson and I were still relevant when we talk about, but I'm talking about relevant. I think you have a second topic as well. I do.
Starting point is 00:25:40 I would like to hear about this. Yes. That little app called Uber. I don't know. Have you guys heard of that one? What, you know, it's like, I think Andy heard of Uber. It's really the fun thing
Starting point is 00:25:52 that they are castrating a German word because I think it comes from Uber, which means above. And they just removed the two dots, the umlaut dots. But I might be completely wrong here but i'm yeah so good i mean americans have a thing for you know just taking something and messing it up and uh you know just kind of making it our own so it's like part of our uh part of our thing you
Starting point is 00:26:15 know appropriation right yeah um so basically yeah this one i thought was really cool and kind of relevant in the space of sre and the importance of defining service level objectives in general. Right. So one of the big problems in the performance space I've always and you guys, you guys might notice, too, is that in a lot of organizations, it's so nebulous and arbitrary as what counts as a performance system. So I have lost count of how many projects I have helped with, with application rollouts where either the project manager or, you know, the GM and they'll have like monitoring or performance as a checkbox on an Excel spreadsheet. And that'll be it. And then they'll turn to me or the performance team and say, Hey, is monitoring good? Is performance good? Are we good on that? Do we have the dash? Like, and, and you said, yes, the performance is good and they just want the checkbox checked. And then, and then there's no
Starting point is 00:27:15 criteria of what is good. And then at the, and then what happens is you have a dashboard, maybe in a knock somewhere where there's a bunch of charts and graphs and lines going up and down and nobody knows what you're measuring and line go up bad is kind of what people end up relegating themselves to. And it's really arbitrary and you don't even know what the impact is. So this example I thought was a really cool use case around Uber, where Uber doesn't mess around. They have put a line in the sand as to what is a performance service for their partners. So there is a program, I think it's a regional rollout. You may notice it in your app. I think how they roll out some of these features is that they'll roll out some states or regions or certain organizations and things like that. But there's kind of a new-ish
Starting point is 00:28:03 feature out there where they'll actually integrate with the major rental car companies. In the US, you probably think of a bunch of them. And it'll basically query the local inventory for users of Uber for you as the end user to Uber, right? So you could actually pull it up and you would see Avis, National, Hertz, whatever it is, right? And the local inventory. And the Uber app will actually query those APIs and present that to the end user. So I was working with a customer down here and they want to be part of this program, obviously,
Starting point is 00:28:37 because yeah, we want to expose our inventory to the entire Uber user base and hopefully rent their cars and give us money because that's really important. And that's a great line. It could be a good income for us there. So they end up adding their inventory API to it. And Uber puts a line in the sand. Hey, if you don't get your inventory back to our end user within five seconds, we're going to forget about you. And we're going to go query the other APIs and we're going to fill it up with the other
Starting point is 00:29:09 rental car inventories. And you're going to miss out unless you get your local inventory to our end users within five seconds, because Uber doesn't want to look bad because they don't want the rental car companies to be the reason why Uber's users are having a bad experience, right? So this was a huge red flag for our rental car customer here. And they actually were noticing Uber was just saying, nope, sorry, we're cutting you off five seconds or you're not getting this user base to sell to, right? So we were actually created an SLO for that. So basically the local inventory API for their Uber partner program there. And so they can tightly, because basically even if their API returns a 200 response code, if that response code comes in over five seconds, that counts as a failure state
Starting point is 00:30:00 to them. So they're now counting as an SLI this as a failure if it's above five seconds. And it's just, I thought it was just really good use case and kind of like two worlds colliding. You have the Ubers in the Silicon Valley kind of type world that's really starting to from day one, adopting SRE practices and performance and reliability for their end users, clashing with legacy systems and legacy organizations that don't take these performance requirements seriously. It might just be sort of a checkbox. Or maybe in the past, maybe not blame them too much because in the past, users maybe were more patient, things like that.
Starting point is 00:30:44 So I just thought that was a really good story um i don't know what do you guys what do you guys think about that you guys seen that i'm still fascinated just like the first time you told me about it yeah and and for me the magic moment is here and i think you it clicked again when you said for these rental car companies, in this case, this is a huge business opportunity because they're exposed to a new buyer group. But they can basically easily measure how many opportunities they're missing out on because they're not hitting their SLO.
Starting point is 00:31:18 So the only thing that I want to challenge here, well, not challenge, but expand, the SLO should not only be defined on the service itself in the data center, because Uber is also measuring that performance from the mobile device. This is why I think synthetic is so important in this case. You want to set up additional synthetic checks from different points in that region, right? Whatever territory, geography that is, and then measure the performance from these locations
Starting point is 00:31:51 um and also take into into account don't just test it from high speed connections but also do some i don't know simulate some some you know cellular data bad connections as well and yeah no that's that's a great point actually yeah let me let me think chew on that for a second because yeah it's one it's one thing to measure it from the point the entry point of the data center which is that what they're doing today obviously that needs to come in under five seconds but really it honestly should be tighter than that because because these these users could be using a 3g network out in the desert somewhere 2g or whatever right whatever we get out there. Satellite, Starlink, I don't know. But it'll go,
Starting point is 00:32:29 you got to take account for that front-end network before it even gets to the back-end rental car API. That's why we call this, I would say, I would call this you break down your performance budget, right?
Starting point is 00:32:41 If you have five seconds on the top and you know that you're on on average losing 500 milliseconds on on the network before it hits you then you know you only have four and a half seconds left and not five yeah yeah good point it's interesting it sounds uh my thoughts on are more of the uh sort of political business impacts of it where and i'm torn on how I feel about this whole practice if you think about some big-box retailers where they say if you don't do what we want we're not going to put you in the store this kind of stuff and they've had a major impact but that's that's more of an issue of
Starting point is 00:33:16 like big shark versus minnows whereas in this case you're dealing like big shark versus tiger sharks right both a but both predators both big enough and all and you're also enforcing great performance which is different than enforcing pricing that could destroy the minnows um it's still kind of odd though because you have like the the the people at the top or the companies at the top enforcing a practice and then in my my head, it's like, well, where do you draw the line versus a good practice, which would maybe be performance versus something else. And how is that defined? It's not really the topic for the show, but that's really where my, my, my mind went on this, where it's like Uber saying, you're not going to get our business if you don't do this.
Starting point is 00:33:57 Now, overall, in reality, the fact that it's about performance, I'm, I'm, I'm prone to say, or I fall more to the side of well yeah that's good because better performance is good for everybody um but it's just an yeah it's an odd uh situation where you can have companies so powerful that they can call those shots yeah but it's the same that google did with you know how they rank your page in the search result right because they also factor in page load time and they're, but that's a search result, but that's not direct business impact. That's abstracted.
Starting point is 00:34:29 No, I disagree here because in the end, whether I'm searching for a ride or whether I'm searching for a piece of content, I want to find the right thing. And if I don't find it because it's not fast enough, Google ranks it differently and Uber doesn't show it to me. In the end, for me, it's the same thing.
Starting point is 00:34:46 I think Brian is just against service-level tyranny. That's a good way to say it. The tyranny aspect. I think, yeah. And I'm not against the whole idea. It's just like these are the thoughts that are growing through my head. Like, okay, how comfortable am I with that?
Starting point is 00:34:58 And there's different components to it. But yeah, it's interesting. And I think it's a fascinating use case for sure. And as Andy said, you've got to take in those other factors, like that latency on there and add that to the budget. It's amazing stuff that you're running into. And this is once again, why Andy and I love doing these podcasts. And I hope our listeners do too, because it's like, how often do you get to hear these stories? I just love them.
Starting point is 00:35:21 Yeah. Especially too. I mean, I know everyone's, you know, we've talked about COVID so much. Everyone's tired of it, I'm sure. But really, I mean, it's like, especially you don't get to collaborate and talk with folks and share these types of stories as easily anymore. So it's just fantastic to be able to come on something like this and kind of share and collaborate. And kind of we're all still out there experiencing all the same things. The world kept going after all that. Yeah. Hey, I don't want to be a party pooper for two reasons.
Starting point is 00:35:50 I don't want to poop this party because I'm really like talk to you. First, correctly. But then there's also another party going on here because fortunately, as Brian, as you mentioned in the beginning, or maybe it was, I think it was actually in the observability clinic we recorded. He said, fortunately, COVID is hopefully getting less uh impacting and got some colleagues here in the office and i fear that if i'm not fast enough my slo will turn red because there's no beer available anymore the beer availability is dropping too fast um and i want to make sure I don't miss out on it. But I had to bring this in.
Starting point is 00:36:26 Priorities. A BDLO, a B-level objective, something like this. A leverage level objective. No, but Brian Chandler, fascinating stories. I really love it. And I have one more comment because I recently heard this on the meetup.
Starting point is 00:36:40 You talk a lot about legacy code, but I heard a different term. Legacy code is just code with character. Ah, yes. It's still code. Just don't call it legacy. It's big boned.
Starting point is 00:36:55 Yes. It's got scars and opinions. Exactly. Well, Brian Chandler, it was amazing having you on. uh hope we can have you back and bring more stories and you have a microphone that's right it's fantastic yeah any any any closing thoughts brian um i think that's it again yeah thanks for um having me on i really appreciate it you know follow me on Twitter. It's channer531
Starting point is 00:37:25 channner531 and check me out. I'd love to talk about performance stuff. Thanks again, guys. I think we'll put the link in the proceedings because it's a really complicated Twitter handle. I'm just trying to keep everyone on their toes.
Starting point is 00:37:42 Only highly intelligent people that can listen to spelling have the right to follow me on Twitter. Hopefully they're all taking notes. I'm going to start calling you Channer, but your new name to me is Channer now. Exactly. All right. Well, thanks, everyone, for listening.
Starting point is 00:37:56 I hope you all found this helpful and informative. If you'd like to leave a comment, you can do so at pure underscore DT at Twitter. Any show ideas, you can send us an email also at pureperformance at dynatrace.com. Thank you, everybody. Thank you, everybody, for listening, I should say, instead of getting the marbles in my mouth. All right, take care, everyone. Bye-bye.
Starting point is 00:38:14 Bye-bye.
