Screaming in the Cloud - Summer Replay - An Enterprise Level View of Cloud Architecture with Levi McCormick

Starting point is 00:00:00 I mean, I think AWS is an incredible product. I think the ecosystem is great. The community is phenomenal. Everyone is super supportive. And it makes me really sad to be hesitant to recommend people dive into it on their own dime. Welcome to Screaming in the Cloud. I'm Corey Quinn.

Starting point is 00:00:21 I am known slash renowned slash reviled for my creative pronunciations of various technologies, company names, et cetera. Kubernetes, for example, and other things that get people angry on the internet. The nice thing about today's guest is that he works at a company where there is no possible way for me to make it more ridiculous than it sounds, because Levi McCormick is a cloud architect at Jamf. I know, Jamf sounds like I'm trying to pronounce letters that are designed to be silent, but no, no, it's four letters, J-A-M-F, Jamf. Levi, thanks for joining me. Thanks for having me. I'm super excited.

Starting point is 00:00:57 This episode's been sponsored by our friends at Panoptica, part of Cisco. This is one of those real rarities where it's a security product that you can get started with for free, but also scale to enterprise grade. Take a look. In fact, if you sign up for an enterprise account, they'll even throw you one of the limited, heavily discounted AWS Skill Builder licenses they got, because believe it or not, unlike so many companies out there, they do understand AWS. To learn more, please visit panoptica.app

Starting point is 00:01:31 slash last week in AWS. That's panoptica.app slash last week in AWS. Also, professional advice for anyone listening, making fun of company names is hilarious. Making fun of people's names

Starting point is 00:01:43 makes you a jerk. Try and remember that. People sometimes blur that distinction. So very high level, you're a cloud architect. Now, I remember the days of enterprise architects where their IDEs were basically whiteboards and it was a whole bunch of people sitting in a room. They call it an ivory tower, but I've been in those rooms. I assure you there is nothing elevated about this. It's usually a dank sub-basement somewhere. What do you do exactly? using our resources to the greatest efficacy possible, coordinating between many teams, many products, many architectures, trying to make sure that we're using best practices, bringing them from the teams that develop them and learn them, socializing them to other teams, and just trying to keep a handle on this wild ride that we're on.

Starting point is 00:02:41 So what I find fun is that Jamf has been around for a long time. I believe it is not your first name. I want to say Casper was originally? I believe so, yeah. We're Jamf customers. You're not sponsoring this episode or anything, to the best of my knowledge. So this is not something where I'm trying to shill for the company, but we're a customer. We use you to basically ensure that all of our company MacBooks and laptops, et cetera, et cetera, have disk encryption turned on, that people have a password, and that screensavers turn on, basically to mean that if someone gets their laptop stolen, it's a, oh, I have to spend more money with Apple and not time to sound the data breach alarm for reasons that should be

Starting point is 00:03:21 blindingly obvious. And it's great, not just at the box check, but also fixing the real problem of, I don't want to lose data that is sensitive for obvious reasons. I always thought of this as sort of a thing that worked on the laptops. Why do you have a cloud team? Many reasons. First of all, we started in the business of providing the software that customers would run in their own data centers, in their own locations. Sometime in about 2015, we decided that we are properly equipped to run

Starting point is 00:03:52 this better than other people, and we started to provide that as a service. People would move in, migrate their services into the cloud, or we would bring people into the cloud to start with. Device management isn't the only thing that we do. We provide some SSO type services. We recently acquired a company called Wandera, which does endpoint security and a VPN-like experience for traffic. So there's a lot of cloud powering all of those things. Are you able to disclose whether you're focusing mostly on AWS, on Azure, on Google Cloud, or are you pretending to cloud with something like IBM? All of the above, I believe. Excellent.

Starting point is 00:04:35 That tells you it's a real enterprise in seriousness. It's the, we talk about the idea of going all in on one provider as being a general best practice or good place to start. I believe that. And then there are exceptions. And as companies grow and accumulate technical debt that also is load-bearing and generates money, you wind up with this weird architectural series of anti-patterns. And when you draw it on a whiteboard of here's our architecture, the junior consultant comes in and says, what moron built this? Usually to said, quote unquote, moron. And then they've just pooched

Starting point is 00:05:09 the entire engagement. Yeah. Most people don't show up in the morning hoping to do a terrible job today unless they work at Facebook. So there are reasons things are the way they are. There are constraints that shape these things. Yeah. If people were going to be able to shut down the company for two years and rebuild everything from scratch from the ground up. It would look wildly different, but you can't do that most of the time. Yeah, those things are load-bearing, right? You can't just stop traffic one day and re-architect it with the golden image of what it should have been. We've gone through a series of acquisitions and those architectures are disparate across the different acquired products. So you have to be able to leverage lessons from all of them, bring them

Starting point is 00:05:51 together and try and just slowly, incrementally march towards a better, a future state. As we take a look at the challenges we see at the Duckbill Group over on my side of the world, where we talk to customers. I think it is surprising to folks to learn that cloud economics, as I see it, is, well, first, cost and architecture are the same thing, which inherently makes sense. But there's a lot more psychology that goes into it than math. People often assume I spend most of my time staring into spreadsheets. I assure you that would not go super well, but it has to do with the psychological elements of what it is that people are wrestling with, of their understanding of the environment

Starting point is 00:06:33 has not kept pace with reality, and APIs tend to, you know, tell truths. It's always interesting to me to see the lies that customers tell, not intentionally, but the reality of it, of, okay, what about those big instances you're running in Australia? Oh, we don't have any instances in Australia. Look, I understand that you are saying that in good faith. However, and now we're in a security incident mode,

Starting point is 00:06:57 and it becomes a whole different story, people's understanding always trails. What do you spend the bulk of your time doing? Is it building things? Is it talking to people? Is it trying to more or less herd cats in certain directions? What's the day-to-day? I would say it varies week to week.

Starting point is 00:07:15 Depends on if we have a new product rolling out. I spend a lot of my time looking at architectural diagrams, reference architectures from AWS. The majority of the work I do is in AWS. That's where my expertise lies. I haven't found it financially incentivized to really branch out into any of the other clouds in terms of expertise, but I spend a lot of my time developing solutions, socializing them, getting them in front of teams, and then educating. We have a wide range of skills internally in terms of what people know or what they've been exposed to. I'd say a lot of engineers want to learn the cloud and they want to get opportunities to work on it, and their day-to-day work may not bring them those opportunities as

Starting point is 00:08:06 often as they'd like. So a good portion of my time is spent educating, guiding, joining people's sprints, joining their stand-ups, and just kind of talking through like how they should approach a problem. Whenever you work at a big company, you invariably wind up with microservices becoming the right answer, not because of the technical reason, but because of the people reason: it's the way that you get a whole bunch of people moving in roughly the same direction. You are a large-scale company. Who owns services in your idealized view of the world? Is it, well, I wrote something and it's five o'clock, off to production with it, talk to you in two days, if we still have a company left, because I didn't double-check what I just wrote? Do you think that the people who are building

Starting point is 00:08:49 services necessarily should be the ones supporting it? In other words, Amazon's approach of having the software engineers being responsible for running it in production from an ops perspective. Is that the direction you trend, or do you tend to be from my side of the world, which is grumpy sysadmin, where developers hurl applications into your yard for you to worry about? I would say I'm an extremist in the view of supporting the Amazon perspective. I really like you build it, you run it, you own it, you architect it, all of it. I think the other teams and the organization should exist to support and enable those paths. So if you have platform teams are a really common thing you see hired right now.

Starting point is 00:09:33 I think those platforms should be built to enable the company's perspective on operating infrastructure or services. And then those service teams on top of that should be enabled to and empowered to make the decisions on how they want to build a service, how they want to provide it. Ultimately, the buck should stop with them. You can get into other operational teams. You could have a systems operation team, but I think there should be an explicit contract between a service team, what they build and what they hand off. You know, you could hand hand off a tier one level response. You can do playbooks. You could do minimal alert response routing, that kind of stuff with a team. But I think that even that team should have a really strong contract with like,

Starting point is 00:10:19 here's what our team provides. Here's how you engage with our team. Here's how you will transition services to our team. The challenge with doing that in some shops has been that if you decide to roll out a you build it, you own it approach that has not been there since the beginning, you wind up with a lot of pushback from engineers who, until now, really enjoyed their 5:30 p.m. quitting time, or whenever it was they wound up knocking off work, and they start pushing back. Like working out of hours, that's inhumane.

Starting point is 00:10:51 And the DevOps team would be sitting there going, we're right here. How dare you? Like, what do you think our job is? And it's like, yes, but you're not people. And then it leads to this whole back and forth, acrimonious, we'll charitably call it a debate. How do you drive that philosophy?

Starting point is 00:11:06 It's a challenge. I've seen many teams fracture, fall apart, disperse, if you will, under the transition of going to like an extreme service ownership. I think you balance it out with the carrot of you also get to determine your own future, right? You get to determine the programming language you use. You get to determine the underlying technologies that you use. Again, there's a contract. You have to meet this list of security concerns. You need to meet these operational concerns. And how you do that is up to you. When you take a look across various teams, let's bound this to the industry because I don't necessarily want you to wind up answering tough questions at work the day this episode airs. What do you see the biggest blockers to achieving, I guess, a functional cultural service ownership? It comes down to people's identity. They've established their own identity as I am X, right? I'm a operations engineer. I'm a developer. I'm an

Starting point is 00:12:13 engineer. And getting people to kind of branch out of that really fixed mindset is hard. And that to me is the major blocker to people assuming ownership. I've seen people make the transition from, I'm just an engineer. I just want to write code. I hate those lines. That frustrates me so much. I just want to write code. Transitioning into that ownership of, I had an idea.

Starting point is 00:12:41 I built the platform or the service. It's a huge hit or lots of people are using it. Like seeing people go through that transformation, become empowered, become fulfilled, I think is great. I didn't really expect to get called out quite like this, but you're absolutely right. I was against the idea back when I was a sysadmin type because I didn't know how to code.

Starting point is 00:13:01 And if you have developers supporting all of the stuff that they've built, then what does that mean for me? It feels like my job is evaporating. I don't know how to write code. Well, then I started learning how to write code incredibly badly. And then, wow, it turns out everyone does this. And here we are.

Starting point is 00:13:16 But it's, I don't build applications for obvious reasons. I'm bad at it. But I found another way to proceed in the wide world that we live in of high technology. But yeah, it was hard because this idea of my sense of identity being tied to the thing that I did, it really was an evolve or die dinosaur kind of moment because I started seeing this philosophy across the board. You take a look even now at modern SREs or modern DevOps folks or modern sysadmins, what they're doing looks a lot less like logging into Linux systems

Starting point is 00:13:45 and tinkering on the command line, a lot more like running and building distributed applications. Sure, this application that you're rolling out is the one that orchestrates everything there, but you're still running this in the same way the software engineers do, which is interestingly. And that doesn't mean a team has to be only software engineers. Your service team can be multiple disciplines.

Starting point is 00:14:06 It should be multiple disciplines. I've seen a traditional ops team broken apart and those individuals distributed into the services that they were chiefly skilled in supporting in the past as the ops team, as we transitioned those roles from one of the worst on-call rotations I've ever seen, you know, 13 to 14 alerts a night, transitioning those out to those service teams, training them up on the operations, building the playbooks. That was their role. Their role wasn't necessarily to write software day one. I quit a job after six weeks because of that style of, I guess, mismanagement. Their approach was that, oh, we're going to have our monitoring system live in AWS because one of our VPs really likes AWS. Let's be clear, this was 2008, 2009 era. Latency was a little challenging there. And that VP really liked Big Brother, which was not to, not before that became

Starting point is 00:15:04 a TV show and the rest, it was a monitoring system. But network latency was always a weird thing in AWS in those days. So instead, he insisted we set up three of them. And whenever, if we just got one page, it was fine. But if we got three, then we had to jump in. And two was always undefined. And they turned this off from, I think, 10 p.m. to 6 a.m. every night just so the person on call could sleep.

Starting point is 00:15:27 And I'm looking at this like this might be the worst thing I've ever seen in my life. This was before they released the managed NAT gateway. So possibly it was. And then the flood, right? Oh, God.

Starting point is 00:15:38 This was in the days, too, when you were, if you weren't careful, you set this up to page you on the phone with a text message and great. Now it takes time for my cell provider to wind up funneling out the sudden onslaught of 4,000 text messages. No thanks. If your monitoring system doesn't have the ability to say, you know, the alert flood, funnel them into one alert or just pause all alerts. Well, because we know there's an incident, you know, US East one is down, right?

Starting point is 00:16:02 We know this. We don't need to get 500 text messages to each engineer that's on call. Well, my philosophy at that point was, no, I'm going to instead take it a step beyond. If I'm not empowered to fix the thing that is waking me up, and sometimes that's the monitoring system and sometimes it's the underlying application, I'm not on call. Yes, exactly. And that's why I like the model of extreme, you know, the service ownership, because those alerts should go to the people.
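
The alert-flood handling described above (collapse a burst into a single page, or pause paging entirely during a known incident) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the behavior of any particular monitoring product: the `Alert` and `FloodSuppressor` names, the five-minute window, and the threshold of five alerts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    service: str
    message: str
    timestamp: float  # seconds since some epoch


@dataclass
class FloodSuppressor:
    """Hypothetical helper: pages normally, collapses a flood into one page."""
    window_seconds: float = 300.0  # assumed look-back window
    flood_threshold: int = 5       # assumed: more than this many alerts is a flood
    paused: bool = False           # set True during a known incident
    _recent: list = field(default_factory=list)
    _flood_announced: bool = False

    def handle(self, alert: Alert) -> list:
        """Return the pages (as strings) that should be sent for this alert."""
        if self.paused:
            return []
        # Keep only alerts that are still inside the look-back window.
        cutoff = alert.timestamp - self.window_seconds
        self._recent = [a for a in self._recent if a.timestamp > cutoff]
        self._recent.append(alert)
        if len(self._recent) <= self.flood_threshold:
            # Normal volume: page for each alert individually.
            self._flood_announced = False
            return [f"{alert.service}: {alert.message}"]
        if not self._flood_announced:
            # First alert past the threshold: send one summary page.
            self._flood_announced = True
            return [f"ALERT FLOOD: {len(self._recent)} alerts in the last "
                    f"{int(self.window_seconds)}s; further pages suppressed"]
        # Flood already announced: stay quiet.
        return []
```

With a threshold of three, ten alerts arriving one second apart would produce three individual pages and one flood summary rather than ten text messages, and setting `paused = True` silences paging outright.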

Starting point is 00:16:30 The pain should be felt by the people who are empowered to fix it. It should not land anywhere else. Otherwise that creates misaligned incentives and nothing gets better. Yeah. But in large distributed systems, very often the person who's on call more or less turns into a traffic router. Right. That's unfair to them. Yeah, that's unfair and it's not fun either and there's no great answer when you have all these different contributory factors. And how hard is it to keep that team

Starting point is 00:16:54 staffed up? Oh, yeah. It's like, hey, you want a really miserable job one week out of however many there are in the cycle? Yeah. People don't like that. Exactly. Few things are better for your career and your company than achieving more expertise in the cloud. Security improves, compensation goes up,

Starting point is 00:17:11 employee retention skyrockets. Panoptica, a cloud security platform from Cisco, has created an academy of free courses just for you. Head on over to academy.panoptica.app to get started. So I've been tracking what you're up to for a little while now. You're always a blast to talk with. What is this whole cloud builder thing that you were talking about for a bit and then I haven't seen much about it? Ah, so at the beginning of the pandemic, our mutual friend, Forrest Brazeal, released the Cloud Resume Challenge.

Starting point is 00:17:48 I looked at that and I thought, this is a fantastic idea. I've seen lots of people going through it. I recommend lots of the people I mentor go through it. Great way to pick up a couple of cloud skills here and there, tell an interesting story in an interview, right? It's a great prep. I intended the Cloud Builder Challenge to be a natural kind of progression from that resume challenge to the builder challenge where you get operational experience. Again, back to that kind of extreme service ownership mentality. Here's a project where you can build really

Starting point is 00:18:19 modeled on the Amazon game days from re:Invent. You build a service, we'll send you traffic, you process those payloads, do some matching, some sorting, some really light processing on these payloads, and then send them back to us, score some points, we'll build a public dashboard, people can high-five each other, they can razz each other, whatever kind of competition they want to do. Really low pressure, but just a fun way to get more operational experience in an area where there is really no downside. Playing like that at work, bad idea, right? Generally, yes. Rep production, we used to have one of those environments. Oops-a-doozy. Yeah. I don't see enough opportunities for people to gain that experience in a way that reflects a real workload. You can go out and you can find all kinds of hello worlds. You can find all kinds

Starting point is 00:19:13 of like for front-end development, there are tons of activities and things you can do to learn the skills. But for the middleware, the back-end engineers, there's just not enough playgrounds out there. Now, standing up a hello world app, you know, you've got your infrastructure-as-code template, you've got your pre-written code, you deploy it, congratulations. But now what, right? And I intended this challenge to be kind of a series of increasingly more difficult waves, if you will, or levels. You know, I can't really add a whole gamification aspect to it. So it would get harder. It would get bigger, more traffic. receive your post-gut slash audit today, or those kinds of things where people don't get an

Starting point is 00:20:06 opportunity to deal with large amounts of traffic or variable payloads, that kind of stuff. I love the idea. Where is it? It is sitting in a bunch of repos, and I am afraid to deploy it. What is it that scares you about it specifically? The thing that specifically scares me is encouraging early career developers to go out there, deploy this thing, start playing with it, and then incur a huge cloud bill. Because they failed to secure something or other reasons behind that? There are many ways that this could happen. Yeah, you could accidentally push your access key, secret key up into a public repo. Now you've got, you know, Bitcoin miners or Monero miners running in

Starting point is 00:20:50 your environment. You forget to shut things off, right? That's a really common thing. I went through a SageMaker demo from AWS a couple of years ago. Half the room of intelligent, skilled engineers forgot to shut off their SageMaker instances, and everybody ran out of the $25 of credit they had from the demo. In about 10 minutes, yeah. And we had to issue all kinds of requests for credits and back and forth. But granted, AWS was accommodating to all of those people, but it was still also slow. They're very slow on that, which is fair. Like if someone's production environment is down, I can see why you care more about that than you do about someone with I did something wrong and lost money.

Starting point is 00:21:35 The counterpoint to that is that for early career folks, that money's everything. We remember earlier this year, that tragic story from the Robinhood customer who committed suicide after getting a notification that he was $730,000 in debt. Turns out it wasn't even accurate. He didn't owe anything when all was said and done. I can see a scenario in which that happens in the AWS world because of their lack of firm price controls on a free tier account. I don't know what the answer on this is. I'm even okay with a, cool, you will, this is a special kind of account that we will turn you off at above certain levels.

Starting point is 00:22:14 Fine. Even if you hard cap at the 20 or 50 bucks, yeah, it's going to annoy some people, but no one is going to do something truly tragic over that. And I can't believe that Oracle Cloud, of all companies, is the best shining example of this because you have to affirmatively upgrade your account before they'll charge you a dime. It's the right answer. It is. And I don't know if you've ever looked at,

Starting point is 00:22:34 well, I'm sure you have. You've probably looked at the solutions provided by AWS for monitoring costs in your accounts, preventing additional spend, like the automation to shut things down, right? It's oftentimes more engineering work to make it so that your systems will shut down automatically when you reach a certain billing threshold than the actual applications that are in place there. And I don't for the life of me understand why things are the way that they are. But here we go. It just becomes this perpetual strange world. I wish that things were better than they are, but they're not.
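
The kind of shut-things-down-at-a-threshold automation mentioned above can be sketched in a few lines once the cloud-specific plumbing is set aside. This is a rough illustration under stated assumptions, not AWS's mechanism or any official solution: `Resource`, `projected_spend`, `enforce_spend_cap`, and the `stop` callback are hypothetical names, and a real version would call the provider's APIs inside `stop`. Because reported billing figures can lag behind actual usage, the sketch pads the reported number with an assumed lag.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    resource_id: str
    hourly_cost: float   # assumed known cost per running hour, in dollars
    running: bool = True


def projected_spend(reported_spend, resources, lag_hours):
    """Estimate current spend: reported figure plus what may have accrued
    while the billing data was stale (lag_hours is an assumption)."""
    running_rate = sum(r.hourly_cost for r in resources if r.running)
    return reported_spend + running_rate * lag_hours


def enforce_spend_cap(resources, reported_spend, cap, lag_hours, stop):
    """Stop every running resource once projected spend reaches the cap.

    `stop` is a hypothetical callback that performs the real shutdown.
    Returns the IDs of the resources that were stopped.
    """
    if projected_spend(reported_spend, resources, lag_hours) < cap:
        return []
    stopped = []
    for resource in resources:
        if resource.running:
            stop(resource)
            resource.running = False
            stopped.append(resource.resource_id)
    return stopped
```

For example, with $40 reported, a $50 cap, and two running resources costing $1.50 per hour combined, an assumed eight-hour lag projects $52 and triggers the shutdown, while a zero-hour lag does not. The hard part in practice is everything this sketch leaves out: discovering resources, getting timely cost data, and stopping each service type safely.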

Starting point is 00:23:12 It makes me terribly sad. I mean, I think AWS is an incredible product. I think the ecosystem is great. The community is phenomenal. Everyone is super supportive. And it makes me really sad to be hesitant to give people all these caveats. And then someone posts about a big bill problem on the internet and all the comments are, oh, you should have set up budgets on that. Yeah, that thing's still a day behind. So, okay, great. Instead of having an enormous bill at the end of the month, you just have a really big one two days later.

Starting point is 00:24:00 I don't think that's the right answer. I really don't. And I don't know how to fix this, but, you know, I'm not the one here who's a $1.7 trillion company either that can probably find a way to fix this. I assure you, the bulk of that money is not coming from a bunch of small accounts that forgot to turn something off or got exploited. I haven't done my 2021 taxes yet, but I'm pretty sure I'm not there either. The world in which we live. I would love this challenge. I would love to put it out there. If I could, on behalf of early career people who want to learn, if I could issue credits, if I could spin up sandboxes and say, here's an account, I know you're going to be safe. I have put in a $50 limit, right? Yeah. You can't spend more than $50. Like, if I had that control or that power, I would do this in a heartbeat. I'm passionate about getting people these opportunities to play, you know, especially if it's fun, right? If we can make this thing enjoyable, if we can gamify

Starting point is 00:24:59 it, we can play around, I think that'd be great. The experience though would be a significant amount of engineering on my side and then a huge amount of outreach. And that to me makes me really sad. I would love to be able to do something like that myself with a, look, if you get a bill, they will waive it or I will cover it. But then you wind up with the whole problem of people not operating in good faith as well. Like, all right, I'm going to mine a bunch of Bitcoin and claim someone else did it or whatnot. And it's just like, there are problems with doing this and the whole structure doesn't lend itself to that working super well. Exactly. I often say, you know, I face a lot of people

Starting point is 00:25:39 who want to talk about mining cryptocurrency in the cloud because I'm a cloud architect, right? That's a really common conversation I have with people. And I remind them, it's like, it's not economical unless you're not paying for it. Yeah, it's perfectly economical in someone else's account. Exactly. I don't know why people do things the way that they do, but here we are. So, re:Invent. What did you find that was interesting, promising there, promising but not there yet, etc.?

Starting point is 00:26:05 What was your takeaway from it, since you had the good sense not to be there in person? To me, the biggest letdown was Amplify Studio. I thought it was just me. Thank you. I just assumed it was something I wasn't getting from the explanation that they gave. Because what I heard was, you can drag and drop basically a front-end web app together and then tie it together with APIs on the back end, which is exactly what I want. Like Retool does. That's what I want, only I want it to be native. I don't think it's that. Right. I want the experience I already have

Starting point is 00:26:36 of operating the cloud, knowing the security posture, knowing the way that my users access it, knowing that it's backed by Amazon and all of their progressively improving services, right? You say it all the time. Your service running on Amazon is better today than it was two years ago. It was better than it was five years ago. I want that experience, but I don't think Amplify Studio delivered. I wish it had. And maybe it will in the fullness of time. Again, AWS services do not get worse as they age. They get better. Some get stale, though. Yeah. The worst case scenario is they sit there and don't ever improve. Right. I thought the releases from S3 in terms of the intelligent tiering were phenomenal. I would love to see everybody turn on intelligent

Starting point is 00:27:26 tiering with instant access. Those things to me were showing me that they're thinking about the problem the right way. I think we're missing a story of like, how do we go from where we're at today? You know, if I've got trillions of objects in storage, how do I transition into that new world where I get the tiering automatically? I'm sure we'll see blog posts about people telling us that's what the community is great for. Yeah, they explain these things in a way that the official docs for some reason fail to. I think it's because the people that are building these things are too close to the thing themselves. They don't know what it's like to look at it through fresh eyes. Exactly. They're often starting from a blank slate or from a Greenfield perspective. There's

Starting point is 00:28:11 not enough thought or maybe there's a lot of thought to it, but there's not enough communication coming out of Amazon. Like here's how you transition. We saw that with Control Tower. We saw that with some of the releases around API Gateway. There's no story for transitioning from existing services to these new offerings. And I would love to see, and maybe Amazon needs a re:Invent Echo where it's like, okay, here's all the new releases from re:Invent, and here's how you apply them to existing infrastructure, existing environments. So what's next for you? What are you looking at that's exciting and fun and something that you want to spend your time chasing? I spend a lot of my time following AWS releases,

Starting point is 00:28:54 looking at the new things coming out. I spend a lot of energy thinking about how do we bring new engineers into the space. I've worked with a lot of operations teams, those people who run playbooks, they hop on machines, they do the old sysadmin work, right? I want to bring those people into the modern world of cloud. I want them to have the skills, the empowerment to know what's available in terms of services and in terms of capabilities, and then start to ask, why are we not doing it that way? Or start looking at making plans for how do we get there? Levi, I really want to thank you for taking the time to speak with me. If people want to learn more, where can they find you?

Starting point is 00:29:40 I'm on Twitter. My Twitter handle is Levi underscore McCormick. Reach out. I'm always willing to help people. I mentor people. I guide people. So if you reach out, I will respond. That's a passion of mine and I truly love it. And we'll, of course, include a link to that in the show notes. Thank you so much for being so generous with your time. I appreciate it. Thanks, Corey. It's been awesome. Levi McCormick, cloud architect at Jamf. I'm cloud economist Corey Quinn, [...] along with a comment telling me that service ownership is overrated because you are the storage person, and by God, you will die as that storage person, potentially in poverty.
