Screaming in the Cloud - Building Taxpayer-Funded Cloud Services with Simon Elisha

Starting point is 00:00:00 Hello and welcome to Screaming in the Cloud with your host, cloud economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Simon Elisha,

Starting point is 00:00:31 Head of Technology and Transformation, Australia and New Zealand Public Sector for a little company called AWS. Simon, welcome to the show. Hey, Corey. Long time, first time. Exactly, which I imagine is one of your colloquial expressions, which means it's great to talk to me. And you're right. It is. So what is it that you do at AWS?

Starting point is 00:00:52 Because I understand that you run the AWS podcast, which is a subject near and dear to my heart. It feels a lot like this one, except you probably spend less time trying to figure out who should sponsor it. Probably true, because that was one of the considerations I didn't have to worry about. So I wear a few hats here at AWS. So firstly, I lead a team across Australia and New Zealand who work with our public sector customers on building, deploying, and getting the benefit from really cool technology for citizens. So it's kind of nice and very, very rewarding. But also, as you mentioned, I do host the AWS podcast as well, which was a crazy

Starting point is 00:01:25 idea I had back in 2012 when I actually went to search for a podcast about AWS going, I'd like to listen to one. And there wasn't one. So I did that crazy thing of saying, I'll make one. How hard can it be? And I've now learned that having a podcast is like having a puppy. It's for life, not just for Christmas. But I really enjoy making it. Oh, yes. Having spent a little bit of time myself around AWS folks, I know that if you play the popular drinking game of taking a drink every time someone in an AWS meeting uses the word customers, you will die. So you just mentioned that working in the public sector, you work with citizens. How often do you wind up accidentally using one term instead of the other and having to self-correct? Or is that something you've been able to train yourself out of doing?

Starting point is 00:02:08 It's more a case that it's a different mindset when you think about citizens versus customers. And our customers are the departments and the agencies that we work for, but their customers are the citizens. And the distinction I like to make is a citizen can't choose the service they use. So, you know, you have your tax department and that's the one that you'll be providing your tax details to. You have your immigration department, your defense department, your health and human services department. You don't get to choose. A customer gets to choose.

Starting point is 00:02:39 A citizen gets a service by their government. And so my goal is always to help our agencies provide the best possible citizen service. So it's an interesting nuance, but it's an important one. And the nice thing is a lot of governments are speaking about citizen-centric services, which ties very much into our own customer-centric thinking. I like that quite a bit. One thing that I find interesting is in my own mapping, mentally, of big companies, my impression has always been that the bigger a company gets, the more inherently narrowly defined various employee roles become. So it's interesting to me that you're not the, for example, head of the AWS podcast on the following four days out of the week.

Starting point is 00:03:18 Instead, you have another full-time job that isn't speaking into a microphone. How did the podcast come to be? And why you, I guess, is probably the rude version of that question. It's because I have an absolutely beautiful head for podcasting, is what I would describe. Yes, a face for radio is what I have on this end. But look, I think what's interesting is that one thing that we do when we hire folks at Amazon, and AWS in particular, is we want to hire builders and let them build. And so what that means is we hire looking for people who the leadership principles really resonate with.

Starting point is 00:03:51 And you know them just as well as anyone else, and I won't list them out here. But what those leadership principles do is keep up these broad running rails and decision-making filters to go do stuff that is really, really good on behalf of our customers. And so what that means for me is when I sort of saw a need, which was, hey, there's no podcast, maybe people would be interested in this, I could go build it. Now, yeah, I had to talk to some of the right folks to say, hey, this is something I'm thinking of doing. Can I do it? And I did chat with those people.

Starting point is 00:04:17 They said, yeah, it sounds like a great idea. Give it a go. See what it is. It's what we would call a two-way door. So it's a decision we can unmake later on if we want to. And history tells a tale that a lot of people found it useful. And probably one of my favorite things is firstly, people would come up to me at conferences saying, hey, I really love the podcast. That's really gratifying. But more importantly is when people say, hey, I've got a job and your podcast helped me prepare. That's like, awesome. Very happy with that.

Starting point is 00:04:42 Of all of the feedback that I get around the different conversations I have with folks from this podcast, the newsletter, my being obnoxious on Twitter, the most common is what's wrong with you and other various forms of insult. But the ones I like the most are where I've helped someone do something next in their career, where it's getting someone from where they are to someplace else. Maybe it's career-based, maybe it's solving a pernicious problem, but that's always meant a lot more to me than the slings and arrows I get, which let's face it, I definitely invite that criticism onto myself with basically everything I say and do. But there is something to be said for having been in a position to impact people's lives in a positive way. It is humbling and not in a sort of twee way. But if I think about it, like I've been in IT for 30 years now.

Starting point is 00:05:29 I'm an old person now, relatively speaking. But I think back to those folks who helped me when I was a young whippersnapper and gave me guidance, gave me an opportunity, took a punt on me and said, hey, this person can do something, let's let him do it type thing. And you've got to pay it back. There's so many amazing folks coming into the industry from many different aspects. One thing I'm really conscious of is not just saying, hey, he's an IT graduate, that'll be perfect. But what about someone who's had a completely different career trajectory, but is really interested in this

Starting point is 00:05:56 domain? How do we get them into that? Giving that assistance is huge. And the weird thing is it has a really outsized effect compared to the input you put in. So much like yourself, you put work into the podcast, you do it, but you kind of get it done. It's done. It's out there. And what you don't think about is there's people listening sometimes years later saying, oh, this is really useful to me. This is inspiring me, getting me to that next level. Can't ask more than that. A realization I was somewhat late to was the, I guess, dawning awareness that a lot of what AWS was releasing, things like SageMaker or the DeepLens or the DeepRacer or the DeepComposer, also known as

Starting point is 00:06:33 Dr. Matt Wood's piano recital at re:Invent last year, is, sure, on some level, this stuff is fun and goofy and an easy target to mock. But on the other, what you're fundamentally doing with these things is making new fields available in a fun and engaging way to people who might otherwise never go in that direction. And that's a very powerful thing. It is. And that's one of the things, particularly DeepRacer as an example, is something I've seen so many people grab that and use it to learn. And they've sort of done it because, yeah, racing remote control cars with computers, who wouldn't want to do that? But afterwards, they're like, I learned so much about how I can apply this. And importantly, where I can't apply it as well.

Starting point is 00:07:09 So just because you have a tool doesn't mean you should use a tool. So one of the things about DeepComposer and DeepRacer and DeepLens is to learn where's the best fit. In my toolkit of things, what should I use and how should I use them? And by making them more available and fun,

Starting point is 00:07:23 fun being a relative term, it means that people can get their hands on these technologies. And that's the big shift. If I reflect on sort of what's changed a lot over the last sort of 30 odd years, 30 years ago, if you wanted to use a new technology, you had to get on a project that used that new technology. And that was hard.

Starting point is 00:07:38 Whereas today, it's like, oh, I'll just spin up my account on AWS and I'll just get a DeepComposer, I'll do DeepRacer, I'll spin up some SageMaker, or whatever. The barrier to entry is much lower, and that's really exciting. That really is the differentiator here, and I'd argue that is the real transformational power of cloud,

Starting point is 00:07:55 where I know I'm a little late to the observation here by about 15 years, but the fact that I can have an idea, something I want to experiment with, and more or less run a single command, and these days within seconds, once upon a time, within many minutes, because hey, the provisioning plane needed to evolve from time to time, I could have an environment set up and ready to go. And when I was done, I could then, in theory, just turn it off and never be billed again. In practice, I would then spend 22 cents a month for the rest of my life for ancillary resources that wound up being spun up that I could never fully track down. Which brings us to, I think, a topic that is near and dear to both of our hearts,

Starting point is 00:08:32 specifically the idea of not necessarily reducing cost, because that topic gets done to death, but more about understanding and attributing it to different aspects of the environment. Tell me what you're seeing. Yeah, I think you raise a good point. Really, the high-value discussion you want to have with your CFO or COO or whoever pays the bills is: is what we're doing valuable to our organization as a function? Is it worth spending the money? And is it worth spending this amount of money? And the beautiful part of cloud and AWS is that you can track down to the cent what you're spending on a transaction, on a system, on an operation, on a development process, etc. And that puts you in the absolute box seat to make decisions about whether it's worth even doing.
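As a rough illustration of tracking spend per system and per transaction, here is a minimal Python sketch. It assumes billing line items have already been exported as plain records carrying a hypothetical `system` tag; the record shape, the tag name, and the figures are all made up for the example and do not come from any AWS API.

```python
from collections import defaultdict
from decimal import Decimal


def cost_by_tag(line_items, tag_key, untagged="(untagged)"):
    """Sum cost per value of one tag; items without the tag land in `untagged`."""
    totals = defaultdict(Decimal)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, untagged)
        totals[owner] += Decimal(item["cost"])
    return dict(totals)


def cost_per_transaction(total_cost, transactions):
    """Unit cost: what one transaction costs given the period's spend."""
    if transactions <= 0:
        raise ValueError("transactions must be positive")
    return total_cost / transactions


# Hypothetical line items for one month (illustrative numbers only).
line_items = [
    {"service": "lambda", "cost": "12.40", "tags": {"system": "payments"}},
    {"service": "s3", "cost": "3.10", "tags": {"system": "payments"}},
    {"service": "dynamodb", "cost": "7.25", "tags": {"system": "reporting"}},
    {"service": "nat-gateway", "cost": "0.22", "tags": {}},  # nobody owns this one
]

totals = cost_by_tag(line_items, "system")
payments_unit_cost = cost_per_transaction(totals["payments"], 1000)
```

Keeping an explicit untagged bucket is a deliberate choice in this sketch: the spend nobody can attribute is usually the first thing a finance conversation asks about.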

Starting point is 00:09:17 And then secondly, you can think about from an architectural standpoint, what I call economic architectures. And this is where you sit down as an architect or as a developer, and you make informed choices about design patterns you apply that factor in price as a consideration. Now, this is really different to how the model used to be. In my day, we'd specify all the hardware, software, storage, network, et cetera, that we thought we needed upfront before we even built the software and hoped that we got it right, which we, spoiler alert. Oh yeah, we pushed the purchase request uphill both ways. Yeah, absolutely. Never got it right. So we either had too much or too little. And really what's important here is to think about, am I using the best possible tool? Is it the most efficient? Does it deliver my functional and non-functional requirements? So am I getting availability? Am I getting security, et cetera? Am I avoiding some operational costs in the future?

Starting point is 00:10:08 Well, to be honest, in my projects at the moment, I don't think I've spun up an EC2 instance in months because I just do everything serverless now because I just don't like patching systems, and I don't have to. So that has an ongoing benefit. And so looking at it holistically really is the difference here. And one of the trends that's happened is DevOps and DevSecOps, or call that domain whatever you will, but the better understanding of developers in terms of how their systems operate in the long term, and the operations teams in terms of some of the decisions that get made early on in the development lifecycle that have long-term repercussions is enhancing and improving substantially.

Starting point is 00:10:45 And the organizations that start to get that right tend to be much happier places, and they deliver systems that have better value and can show back that they're running as they should run. That's an increasing challenge. And I find that there are two reasons that that's challenging, at least in my experience. One is the obvious of there's an awful lot of moving parts and not everything is easily attributed back to a single cost center. Shared services make that a challenge. But the other is that when you're having

Starting point is 00:11:14 this weird cultural dynamic where there isn't a culture of cost attribution, of understanding how to transform legacy processes into something that is more dynamic, it seems like a win when once upon a time, it took us six weeks on a good day to wind up getting a server provisioned. And now, with the cloud and the company's processes, it only takes four. But it feels like there are opportunities to optimize that. And that, in turn, leads to other weird anti-patterns, such as, if it takes that much work to get something spun up,

Starting point is 00:11:44 you'll never give it up because you might need it again. And who has that kind of time to kill? Yep, yep. Let me maybe tell a story that relates to that and answer that question. I think it's a good point. And this happened a few years back

Starting point is 00:11:55 with a customer that was all in on AWS. And they had a big developer team, like hundreds of developers. And they had set an EC2 limit of about, it was about 500 concurrent EC2 instances. And they were always at that limit. And when they dived deep on it, they discovered that the reason was exactly what you just said. The developers would not release the systems because they were so close to the limit that they were worried that they wouldn't get one back.

Starting point is 00:12:19 So counterintuitively, they upped the limit to 750 and their actual usage dropped to 400. So they saved money by increasing the limit. And it was a really interesting study of human factors of that supply and demand and concept of scarcity. If there's scarcity, human beings tend to hold onto things. It's just kind of how we're wired. If there's abundance, then we're not worried about it. And so to your original point of how do we take the time to understand cost, who cares, et cetera, at some point, someone is paying the bill. And what I find works really well is getting as good a handle as you can in your environment, and tagging is a big part of that. There's some great tagging strategies you can use to get a

Starting point is 00:12:58 view of cost at various levels. But the other thing is having the right conversation with the right people in the organization. And this is where IT and finance don't often talk in detail. They tend to talk at high-level numbers. The CIO goes and submits some budget, gets some budget, applies it, etc. What I've found is that when we bring the finance department into the conversation and show them how much granularity we can see around our spend compared to our utilization, we're suddenly speaking their language. And they're like, well, I'm invested in this. I want to know more. What can you show me? How can I understand more about what we're doing? And then once they understand the visibility they have, that then informs reducing or lightening

Starting point is 00:13:39 the governance frameworks around that. Because once I can see something and I can see it in near real time, I don't have to put these heavy gates in front of me. So like you said, I don't have to have a four-week delay to get an EC2 instance. If I know that if you spin it up and it doesn't adhere to policy, it'll just get turned off automatically within five minutes, I'm good with that. So it changes that mental model, and having those high-level conversations backed with data is really changing some of these organizational structures and governance processes. This is a difficult thing to achieve in most corporate environments. I can't imagine how much more difficult it is when you have a lot of these

Starting point is 00:14:15 practices enshrined in the law or process that is so documented and required by so many other aspects. It may as well be law. How do you begin to pivot the culture around, I guess, a transformative opportunity that was never even considered when a lot of these laws and processes were written? I think the thing to remember is that there are many stakeholders for whom change is their goal. Now, they don't come to work going, you know, I want to keep it exactly the same as it's always been and it should always ever be thus. I'm not saying everyone comes to work not thinking that, but there's a big component of people who are like, hey, how can I make this better? How can I improve this? You know, I don't want to sit in an architecture review board meeting

Starting point is 00:14:56 every single week for six hours to approve something I know is going to be okay. So tying into that cultural change and again, finding those right stakeholders, giving them the information, and having the right conversation is really important. And let me give you a for instance, or an example of this. I remember many years ago, early on in the days of cloud, I met with a significant bank. And I had a workshop with their security team. This was like 15 of their heavy-hitting security architects that basically got the opportunity to say, hey, Simon's going to come in. You've got an hour and a half with him. You can ask him any question, throw anything at him, have fun. And they put me through the wringer for an hour and a half. And at the end of that session, they said to me, everything's going to be on cloud in the

Starting point is 00:15:36 future. This is fantastic. What can we do to help? And I was like, okay, that's an interesting lesson learned, which is that there are people who are there to protect and apply governance and requirements, but also to evolve thinking based upon new data. And one of the biggest challenges I find is people aren't always aware of what is possible today. You work around AWS and cloud all the time, as do I. So we just assume, oh, spin up an instance, create a VPC or spin up a database. Or talk about a new service and then have AWS employees look at me and wonder if I'm making the service up because who can possibly keep track anymore? But that's a mental model that we're very comfortable with because we live it all the time.

Starting point is 00:16:14 But for a lot of people, they've never worked or lived or really been exposed to that kind of velocity or opportunity. And so they get delighted when they can do that. So there's a lot more latent change opportunities than you may think. And that's always a challenge because looking at, I guess, the trajectory of companies as they go through various digital transformations, which I despise the term, but don't have a better one. So I begrudgingly use it, but always with a cynical tone of voice. And you see that they go from this idea of data centers where everything is effectively very fixed in terms of cost.

Starting point is 00:16:46 And then as they move through the, I guess, spectrum of how cloudy or not something is, it becomes a lot less expensive in theory. And it winds up instead billing more upon usage. And at some point, you wind up hitting the pure serverless ideal of transaction-based billing where it's, oh, I can now trace, as Simon Wardley says, the flow of capital throughout my organization. I haven't really seen that yet because almost nobody is full-on serverless to that degree. And the few shops that are, for example, A Cloud Guru is famous for this. But during Serverlessconf, they get up and show their AWS bill, and it's something like 500 bucks a month. It doesn't actually mean enough money to matter. So what does transaction-based pricing start to look like? So one of the things

Starting point is 00:17:31 with that is firstly, it's often really, really low. And that's scary, as you said, because we're not used to dealing with those types of numbers. But what it looks like is an evolution in certain systems within organizations versus the whole thing. So again, this is a continuum. You're right. For a lot of organizations, this is hard to change. There's an inertia that builds up over time. But much like planting a fruit tree, the best time to plant it is 10 years ago. The next best time is today. People need to start. And so we have lots of customers who have saved huge amounts of money, like millions and billions of dollars just by changing their operational model. Now, they haven't necessarily had to say, we're going to go all in on cloud or we're

Starting point is 00:18:05 going to move everything across. They've picked their nastiest, most expensive, problematic, non-scalable system, what have you, the one that's most opaque, and chosen to attack that and deliver that for much less. Now, to your story, sometimes it's so much cheaper that people don't necessarily want to talk about it. I was working with a customer once a little while back who was replacing a major system. This is a system that probably cost them, I was thinking, it was about $10 million a year to run this system traditionally.

Starting point is 00:18:33 And we said, okay, we can lift and shift, make a few changes, cloudify it a bit, run it serverless, and it'll cost you a million dollars a year. And they said, we can't go back to our leadership and tell them that. I said, why not? They said, because it's too cheap and we're embarrassed that we spent $10 million a year for the last few years on the same system. And I said, I hear you, but that's not actually the problem here. This is a big win. It's not about the decisions you made in the past with the best knowledge you had at the time.

Starting point is 00:18:58 It's about the decisions you make now knowing what you know. And that mental model shift is really important. And doing it on a case-by-case basis is really important. In what you might be forgiven for mistaking for a blast from the past, today I want to talk about New Relic. They seem to be a relatively legacy monitoring company, and I would have agreed with that assessment up until relatively recently. But they did something a little out there. They reworked everything. They went open source, they made it so you can monitor your whole stack in one place, and most notably from my perspective, they simplified their pricing into

Starting point is 00:19:30 something that is much more affordable for almost everyone. There's even a free tier with one user and 100 gigs per month, totally free. Check it out at newrelic.com. And that's something that I found is one of the hardest parts. The technology is relatively straightforward. Getting people to a point where they can have that pivotal moment, that shift in how they view these things, is always the hard part. And even now, going from the idea of, for me, running on a bunch of virtual instances that are running everywhere great, now moving to containers, heaven forbid, or serverless is still a whole other sea change. But backing up, how do you even, in today's world, go from a time of understanding the world through a lens of data centers and computing to moving to cloud? Because, to be blunt, when I did this, there were a lot fewer services in cloud

Starting point is 00:20:25 that I had to worry about. I looked at the AWS console and, oh my God, I'm never going to be able to learn about all of these services. And there were 12. Now there's significantly more than that. And I don't know where to even begin in a modern era. Where do you stand? Yeah. So let me tackle that from a few places. So firstly, one thing that I think is really important in a long-term IT career is continual learning. And I think many of us start our careers off loving to learn new stuff and then we kind of stagnate after a while because life gets in the way. But if we're not relearning all the time, we're not going to be able to deliver the best for our stakeholders, for our customers,

Starting point is 00:21:04 for ourselves, what have you, take advantage of the latest and greatest. Now, I'm not saying always use the cutting edge, the bleeding edge, et cetera. I'm saying use the things that make sense, given the best of what you know. The reason why I mentioned that is the way I look at all the services we have for customers is it's kind of like a painting palette. You may not use every color in the palette. You just use the ones you need at the time. So the mental model I use is to say, think about what you're trying to achieve and then pick the services you're using.

Starting point is 00:21:29 So for example, you just mentioned, hey, I want to host my website. How would I do that? It's like, okay, hosting a website. That sounds like some content. That's S3. Sounds like I need to give it to some people. CloudFront probably fits the bill there.

Starting point is 00:21:40 Probably need to have some security, HTTPS. That's Certificate Manager. Strap that in. And then maybe I'll do some logging, some CloudFront logging. Maybe I'll report on it using Athena, and I might call it good. If I want more functionality, I can add more functionality, but I can just stop there. And I don't need to go super deep in each of those services. I just need to know they're kind of there and I can use them. Because, and I want to tie into your drinking game, Corey, so you can have a drink,

Starting point is 00:22:02 is you know because our roadmap is 90 to 95% driven by customer requirements and feedback. They're telling us what they would like us to build and take off their plate. That 5% gap, the things you release that no customer asked for is amazing. Well, the thing is that we don't always know what we want. So sometimes we need to see things that we didn't think of. But in terms of your question, I've got this wall of services, what do I do? It's about thinking about what you're trying to solve for today and assuming that, well, I'm sure other customers have asked for this too. Let me go check if Amazon has tried to solve this on our behalf. And usually the answer is yes. One of the problems I have is that invariably AWS has this incredible knack for releasing solved this problem globally, which is awesome, but it would have saved you a couple of weeks worth of effort if it had just come out a little bit sooner. Is that just me with my terrible timing or is that one of those universal moments?
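The "painting palette" approach from the website example above can be sketched as a simple lookup from what you are trying to achieve to the service that covers it. This is only an illustration in Python: the need labels and the `pick_services` helper are invented for this sketch, and the mapping reflects the services named in the conversation rather than any official guidance.

```python
# Needs (hypothetical labels) mapped to the services named in the conversation.
SERVICE_FOR_NEED = {
    "store content": "Amazon S3",
    "deliver content": "Amazon CloudFront",
    "https": "AWS Certificate Manager",
    "access logging": "CloudFront logging",
    "log reporting": "Amazon Athena",
}


def pick_services(needs):
    """Return (chosen services, needs with no known match), preserving order."""
    chosen, unknown = [], []
    for need in needs:
        service = SERVICE_FOR_NEED.get(need)
        if service is None:
            unknown.append(need)
        elif service not in chosen:
            chosen.append(service)
    return chosen, unknown


# Start with the minimum and stop there; add needs later if they come up.
minimal_site, gaps = pick_services(["store content", "deliver content", "https"])
```

The point of the sketch is the order of operations: start from the need, pick only the matching colors from the palette, and treat anything unmatched as a prompt to go check whether a service already exists.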

Starting point is 00:23:15 I think it's just a feeling that you get with all technologies in a way. But I think if you think about it, tying into saying, well, if I'm having this problem and I'm spackle filling, as you say, then there's probably literally thousands of people at the same time all around the world doing that. And then, as you say, it becomes a timing issue. So whereas you may have already built it and now you get to replace it, there will be many, many, many more people who will never have to build it in the first place because it's already there. Now, related to that, I might make an interesting counterpoint here, which is that experience is really important, and understanding how things were versus how things are gives you a relative concept.

Starting point is 00:23:49 And you talked about people not necessarily understanding how hard it is to get a server racked up, etc. Now, I have a large cohort of solution architects who work in my team, and many of them are much younger than me. And they've never seen a data center. They've never had to order a server through a PO process, etc. It's always been on demand. So we actually created a little learning series to say, you know, here's how we used to do it, just so you understand what was involved. And that was fascinating. When you sit there describing connecting to a storage area network using HBAs, you watch people's eyes roll in their heads and go, wow, that's not a career I would have chosen compared to now just going, attach EBS volume, get on with my day. So understanding

Starting point is 00:24:29 what came before helps you understand how much better things are now, but it's no excuse not to continue to make them better. And so we will continue to fill that spackle wherever we can. And again, I'm not sitting here saying that we should stop the march of progress or, well, it might upset someone who's already built something janky, so you should never release a service. But there is that moment of, at some point, I take a step back and I realize that, huh, I'm spending an inordinate amount of time solving what feels an awful lot like a global problem. Maybe that's not the best path forward. Last time I did one of these things in earnest was about trying to get replication working with encryption on RDS between AWS regions. Today, that's click a button, but at the time, it was not.

Starting point is 00:25:14 And that was painful and challenging. And I'm looking at that, and it felt like I am probably not the only company in the world that has this problem. Maybe there's a better way forward. I can't shake the feeling that by going down this path of either cloud agnosticism with going multi-cloud or building everything in your own data center yourself with an eye towards, I want to avoid lock-in at all costs, you're effectively having to build those solutions that can be done for you by an organization that focuses on solving those global problems for you, like AWS, case in point. I feel like it's so easy to wind up

Starting point is 00:25:51 getting wrapped around your own axle of, so what are you doing right now that adds business value? It's re-implementing a load balancer. Doesn't really seem like the right answer unless your business is load balancers. Yeah. Yeah. I think the concept of undifferentiated heavy lifting, like a drink, is really important because as IT people, we know we've built stuff that's really hard to do and you're kind of proud of yourself. Hey, I made this work, but my goodness, that was hard and no one really cares that I did it. And really what it's about is doing as little as possible to get the outcome you need. And I remember early on in my career, I started in software development and I got to work with some really way smarter than me developers

Starting point is 00:26:28 who really knew their stuff. And one of them took me aside one day and said, you need to learn to be a lazy developer. And I'm like, what do you mean? Lazy, that's bad. You need to learn to do as little as possible to get the outcome you're trying to get for the software. Write as little code, be as efficient as you can, use as few services as you can, reduce the complexity, reduce the time it takes to get it done. I was like, aha, that's a really interesting insight. And that was 30 years ago. When I spin forward to today, what it means to me is when I'm building a system, I'm going to work really closely with the cloud provider of my choice. I'm going to write as closely to their APIs as I can, because I know if I want to make a change, I'm just going to change a few APIs,

Starting point is 00:27:05 talk to a different service provider, use a different service, drop in, replace it, whatever. But that's going to get me to my outcome quicker, which means I get my solution in front of my customer quicker, which means they can tell me whether they like it or not.

Starting point is 00:27:17 And I can either make a change or double down on that. And so it's that mental model of building as little as possible, as quickly as possible, that really has helped in this current environment. So a question I have for you that is a common refrain on this show as far as themes go. If you were starting today and you're at the beginning of your career, because let's not kid ourselves, you and I have been doing this for decades at this point.

Starting point is 00:27:42 Where would you start? And at some point, are you done? I mean, is there a place where, and now I have learned the cloud, box checked. It's a good point. And to give you some context, I mean, when I started, I started on mainframes. So I learned CICS, COBOL, DB2.

Starting point is 00:28:00 I can talk about JES queues and ISPF till the cows come home. I still miss it. But then I had to learn client server, which was the brand new hotness at the time. And then the web came out, etc. Really, what the lesson is, is that we're always learning. So never get too fixed in your mental models around technologies. Now, that said, if you're starting today, there are some really good foundations to

Starting point is 00:28:22 build upon. So lucky. So one of the things I point customers to all the time now that's available is something called the Amazon Builders Library, which is really a set of blog posts and explanations about how we build and operate software at scale. And this is not saying this is the only way to do it and this is how you should do it. But this is saying, well, this is what we do at global scale and it seems to work pretty well.

Starting point is 00:28:44 You might want to learn from things about this. So one of the things I really like talking to customers about is how to deploy software and roll back safely. Super hard problem to solve, but we do it all the time, many, many times a day, as you can imagine. How do we do it? What does testing look like in that environment? What is validating that's going to work? How do you roll features forward, roll them back, maintain compatibility? All this stuff, it's already there. There's a lot that exists to learn upon. Even simple things like, hey, you mentioned, I want to get started in machine learning. What do I do? Well, even if you jump onto the AWS console, there are pre-built videos, labs,

Starting point is 00:29:19 instructions on how to get up and running so you can at least learn and stand on the shoulders of giants to get going. What is existing now, we'll look back on in 5, 10 years and say, well, how cute. You're doing this, you're doing that. Look how far we've come. But you can do that at any point in time in computing. No, it's six lines of YAML. Yeah, exactly. You can do that any time in your career. I mean, I joined AWS back in 2011. I think about some of the stuff I built back then that, like you said, I can build now with literally a few lines of code. But back then,

Starting point is 00:29:49 I was going, wow, look, I didn't have to spend six months doing this. I did it at my kitchen table. Just maintaining that mental flexibility is super important. And you're right, you're never done. And I think that's really hard for all of us as technologists because we tend to be quite mathematical in our mindset, which means we like complete proofs and finality and a solution and it's solved and it's done. And like you said, I can tick that box. I've gotten to the end of AWS. The final boss was super hard. Exactly. All done. And you just can't. And it's hard doing that. And I can tell you that that's something that Amazonians struggle with all the time because we'd love to know everything.

Starting point is 00:30:26 And I'll share with you a personal story. There was a brief shining moment where I could hand on heart say I knew AWS. And like I said, it was early on. We didn't have that many services. I knew them upside down, two ways from Sunday, inside out. I knew them all the way. But you can't now. There's like over 175 services.

Starting point is 00:30:46 Meanwhile, one of the early engineers who's there and still there and is now distinguished engineer or something is just sitting there mumbling, no, you didn't. Because there's always another level to go down to. It does depend what level you go down to. But as far as I was concerned, I had done that. I was on top of my game. I knew everything I needed to know.

Starting point is 00:31:00 And then we released another five services and another 10 services and it just doesn't happen. Programming, I've got it. I have both languages, JSON and YAML. Yeah. So one of the challenges we see too, is that at some point you can't go by experience anymore. When I was starting out in my career and I was trying to sound like I was someone a company would want to hire, there was a point where I wanted to add as many years as I could to experience. Now, on some levels, I kind of want to shave some off because your skillset is what it is and how long it took you to get there is in some ways an interesting metric. I guess the depressing end of the spectrum is I've met people who've been working in tech for

Starting point is 00:31:41 30 years, but they don't have 30 years of experience. They have one year of experience repeated 30 times. And that always really depressed me because at some point the tide rises. The thing that you do winds up getting washed away and there aren't very many opportunities to continue doing that one thing. It feels like tech is one of those areas where you have to reinvent yourself the entire time that you have a career. Yeah, I think what we say at Amazon is we say, we want to be stubborn on the vision, but flexible on the details.

Starting point is 00:32:14 So being stubborn on the vision is, hey, I want to be a technologist, I want to build systems, I want to solve problems. That's what I want to do. But whether it's COBOL or C or Python or PowerBuilder or Delphi or Visual Basic, I don't really care.

Starting point is 00:32:28 Like, I care at the time, but I'm not going to be bound to it and say, well, there is only one true language from here on out. Oh, I was so angry about things like that

Starting point is 00:32:37 back then. Oh, God. Picking fights about programming languages and systems and architecture, that was one of my favorite things.

Starting point is 00:32:44 Turns out I was a terrible person. Some of us evolved past that, hopefully. You've recovered. You've recovered. But it's true. But that's the thing is, you know, we tend to get into these kind of almost, you know, philosophical arguments about this chipset versus that chipset or this operating system versus that. It just doesn't matter. What matters is the outcome you're getting, how easy it is to run, easy to manage, easy to learn, etc. And when the time comes to replace it, because we're all building the legacy systems of the future, how easy is it to replace as well? That's what you want to think about.

Starting point is 00:33:19 And that sets you up for a long, satisfying, and invigorated career versus just fighting these battles that you're just going to lose eventually. You can't win that conversation. And that's part of the challenge is it's even hard to talk about because it's on some level, you definitely don't want to be coming across as saying evolve or die, dinosaur. But at some point that's kind of what you do. I mean, entire jobs that were things when I started my career, like firewall engineer was a six-figure salary. If you had that skillset now, more or less, I was going to say that, wow, any basic network engineer should have that skillset. But even beyond that, today, it's kind of pretty much, do you know how security groups work?

Starting point is 00:33:51 Spoiler, no one knows how security groups work, but roll with me here. And that gets you to where you need to be. The baseline level of experience is necessary. How do you find that the fundamentals, the things that I guess we had to learn at one point because there was no other option, manifest today? Are they still necessary? Well, I think the detail is not necessary, but I think the fundamentals are, and the fundamentals don't change. Now, I need to build a

Starting point is 00:34:17 system that's resilient. Well, what does resilient mean? Well, resilient means a lot different now than it did 30 years ago. I need to build something that's user-friendly. Well, when I was studying at university, I remember clearly my lecturer telling me, for a good user experience, when I hit enter on the console, it should take no more than four seconds for it to come back. That's the baseline of good user experience. At this point, it can go to the moon and back. I know. It changes a lot. But understanding that you need to care deeply about user experience will get you a long way. Understanding that people don't really care about the details of technology

Starting point is 00:34:50 depending on their perspective. So as IT professionals, we care deeply. We're all about it all the time. It's what we love. We're passionate. We're enthusiastic. CEO of most organizations doesn't care. They just want to know, does it work? Is it fair value for money? Does it put me ahead of the competition? That's it. That's it. Now, whether it's an all-singing, all-dancing this or an all-singing, all-dancing that, they don't really mind.

Starting point is 00:35:12 They just want to get it up and running and done. So learning how to speak to non-technical stakeholders in an accessible and meaningful way is a super valuable skill. And let's face it, Corey, you made a career of it. And I'm sure it wasn't a course that you did or an instructional video you followed. It was like realizing, hey, I need to talk to these people that don't look like me from a technology perspective that need to hear what I'm saying. One last area I want to talk about with you before we call it a show was announced last year at re:Invent, back when we would all gather in the same place and not worry about our lives,

Starting point is 00:35:46 was the Amazon Builders Library. And that is Builders Library, not Billers Library, which presumably is a compendium of all the different API calls that you can make that will cost you money in embarrassing ways, obvious only in hindsight. Talk to me about that, because I love it personally, but I want to get your take on it.

Starting point is 00:36:04 Yeah, it's one of those great things that we have a lot of really experienced engineers building software here at Amazon. And Andy Jassy often says, there's no compression algorithm for experience. And he's right. And the other way I like to look at it is I'd much rather learn from other people's mistakes than my own, because then I don't have to suffer from the pain of that. And this catalog, the Builders Library, really showcases what we've learned. How do you build a system that is distributed, that can recover from an outage without the thundering herds of new transactions coming in, destroying it again and creating a repeatable loop of failure?

Starting point is 00:36:40 How do you build a continuous integration, continuous deployment pipeline that genuinely works at scale and you can easily roll back from? How do you deploy technologies like shuffle sharding or leader election or all these other interesting things that are highly difficult problems but have been solved and solved effectively at scale? And what this library does is gives you all that information for free. Like if I was a graduate studying today, I would just sit down and read that for a day and understand it deeply and go, wow, I've just saved myself 20 years to learn all that. And so I'm really excited that that's publicly available and continues to grow and have new content placed on it because it is genuinely,

Starting point is 00:37:20 this is not about Amazon. It's not about AWS. It's just about building software at as high quality as you can, the lessons you learn, even simple stuff like introducing jitter in the way you handle workloads because that improves your ability

Starting point is 00:37:34 to handle it at scale. Like just stuff that you wouldn't necessarily think about in your day-to-day work that can inform your own design decisions and the way you choose to build software and may give you ideas to improve the way we build software in the future. So it's a big one to go to the Builders Library.
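
To make the jitter idea above concrete, here is a minimal sketch of retrying with exponential backoff plus "full jitter", one common way of keeping many clients from retrying in lockstep after an outage. It is an illustration only, not code from the episode or from the Builders Library; the function names `backoff_with_full_jitter` and `call_with_retries` and the `base`, `cap`, and `max_attempts` defaults are hypothetical choices for the example.

```python
import random
import time


def backoff_with_full_jitter(attempt, base=0.1, cap=10.0):
    """Return a randomized delay in seconds for a zero-based retry attempt.

    base and cap are illustrative defaults, not recommended values.
    """
    # Exponential ceiling, capped so delays do not grow without bound.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: choose uniformly between 0 and the ceiling so that
    # clients that failed together do not all retry at the same moment.
    return random.uniform(0.0, ceiling)


def call_with_retries(operation, max_attempts=5, sleep=time.sleep):
    """Call operation(), retrying on any exception with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_with_full_jitter(attempt))
```

The `sleep` parameter is injectable only so the sketch can be exercised without real waiting; a real caller would normally also restrict which exceptions count as retryable.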

Starting point is 00:37:49 Do you think that it has the potential to cause problems for folks in the sense of architecture as imagined by Hacker News, where someone is building out something to work internally at their company, and if they follow every tenet laid down in the Builders Library, assuming such a thing was even possible, they would be building this world-spanning system that more or less would run payroll once a week for 200 people. It feels like it could lead to scenarios of stupendous overkill and more or less writing code for the joy of it rather than to solve a business problem. Do you think that that's a realistic concern or am I not giving people enough credit? I think it comes down again to that mentality

Starting point is 00:38:28 of building only what you need and evolving the architecture over time. And that's where this information feeds into that. And there's actually a talk we do at re:Invent. I did it many years ago. My colleagues do it now. We evolve it every year, which is the scaling to 10 million users

Starting point is 00:38:42 or it could be 100 million users now. But really it talks about moving from, I've got one EC2 machine, now I've made it highly available, now I'm scaling out, et cetera, those thought processes to go through. You're not building the end state, you're building the start state and evolving. And I think a lot of the lessons in this can help inform when you might need to reach for that technique in your tool belt based upon the scale you're at. It's like, ah, now I've got this problem. I understand what they were talking about. I know how to solve it. Versus, I didn't know this would be a problem. Now it's a problem,

Starting point is 00:39:12 and I don't know what to do. So it positions you better to tackle the future. If you were to dive into the Builders Library today, what would you start with? Because it turns out that there's an awful lot, I wouldn't say an awful lot, not by Amazon service terms, but there are enough documents in there that it could be challenging to pick which one to start with. And on some level, they get very deep very quickly. Is there something that you would say

Starting point is 00:39:35 is the most accessible way to get started? If you had to read one, it would be the one that's called Ensuring Rollback Safety During Deployments. That's the one. Because that's all about risk management. It's about saying, hey, how can I deploy software frequently and safely? And it solves for a lot of problems.

Starting point is 00:39:52 What if I move from XML to JSON? What if I change my protocols? What if I change a database? How do I test that I can roll back? What does that really mean? And there's a comment in that particular article that I had a chuckle about because it says, at Amazon, one of our leadership principles is frugality, but we don't believe in frugality when it comes to testing. And I thought, yes, that is correct. That is the way to think

Starting point is 00:40:13 about it. So that article, I think, is an absolute go-to. If you read one article, that's one to read. Excellent. We will throw a link to that in the show notes. Simon, thank you so much for taking the time to speak with me today. If people care more about what you have to say for some ungodly reason, where can they find you? Well, they can indeed find me at the AWS podcast. It's available where all good podcasts are caught. So the podcast catcher of your choice. And if you search up AWS podcast, it'll be the first hit webpage wise as well. One day I will beat you on that listing. Challenge accepted. Thanks so much for taking the time to speak with me today and suffer my egregious slings and arrows. Always a pleasure, Corey. Simon Elisha, Head of Technology and Transformation,

Starting point is 00:41:01 Australia and New Zealand Public Sector. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on Apple Podcasts. And if you've hated this podcast, please leave a five-star review on Apple Podcasts, along with a detailed comment filling out exactly why you felt the need to dislike it in triplicate. This has been this week's episode of Screaming in the Cloud.

Starting point is 00:41:20 You can also find more Corey at screaminginthecloud.com or wherever fine snark is sold.
