Screaming in the Cloud - S3: 15 Years and 100 Trillion Objects Later with Kevin Miller

Episode Date: April 20, 2021

About Kevin
Kevin Miller is currently the global General Manager for Amazon Simple Storage Service (S3), an object storage service that offers industry-leading scalability, data availability, security, and performance. Prior to this role, Kevin held multiple leadership roles within AWS, including General Manager for Amazon S3 Glacier, Director of Engineering for AWS Virtual Private Cloud, and engineering leader for AWS Virtual Private Network and AWS Direct Connect. Kevin was also Technical Advisor to Charlie Bell, Senior Vice President for AWS Utility Computing. Kevin is a graduate of Carnegie Mellon University with a Bachelor of Science in Computer Science.

Links:
AWS S3: https://aws.amazon.com/S3
AWS Twitch: https://www.twitch.tv/aws
AWS YouTube: https://www.youtube.com/user/AmazonWebServices
AWS Pi Week: https://pages.awscloud.com/pi-week-2021.html

Transcript
Starting point is 00:00:00 Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Join me on April 22nd at 1 p.m. Eastern Time or 10 a.m. in the one true Pacific Coast time zone for a webcast on cloud and Kubernetes failures, like there's another kind, and successes.
Starting point is 00:00:44 Apparently there are other kinds, in a multi-everything world. Oh God, what are they making me do now? I'll be joined by Fairwinds president Kendall Miller. Oh, that explains it. And their solution architect, Ivan Fetch, will discuss the importance of gaining visibility into this multi-everything cloud native world, and I will make fun of them relentlessly. For more info and to register, visit www.fairwinds.com slash Corey. Oh god, it makes it look like I work there now. That's C-O-R-E-Y, the E is critical, and tell them exactly what you think of them, because I sure will. Talk to you on April 22nd at 10 a.m. in the
Starting point is 00:01:25 One True Pacific time zone. If your mean time to WTF for a security alert is more than a minute, it's time to look at Lacework. Lacework will help you get your security act together for everything from compliance service configurations to container app relationships, all without the need for PhDs in AWS to write the rules. If you're building a secure business on AWS with compliance requirements, you don't really have time to choose between antivirus or firewall companies to help you secure your stack. That's why Lacework is built from the ground up for the cloud. Low effort, high visibility, and detection. To learn more, visit lacework.com. Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Kevin Miller,
Starting point is 00:02:13 who's currently the general manager for Amazon S3, which presumably needs no introduction itself, but there's always someone. Kevin, welcome to the show. Thanks for joining us. And what is S3? Well, Corey, thanks for having me. Yes. Amazon S3 was actually the first generally available AWS service. We actually just celebrated our 15-year anniversary here on Pi Day 314. And S3 is an object storage service that makes it easy for customers to put in store any amount of data that they want. We operate in all AWS regions worldwide. And we have a number of features to help customers manage their storage at scale because scalability is really one of the core building blocks, tenants for S3, where we provide the ability for customers to scale up and scale down the amount of storage they use. They don't have to pre-provision storage.
Starting point is 00:03:05 And when they delete objects that they don't need, they stop paying for them immediately. So we just make it easy for customers to store whatever they need, access it from applications, whether those are applications running in AWS or somewhere else on the internet, and really just want to make it super easy for customers to build storage, use storage with their applications.
Starting point is 00:03:27 So a previous guest in, I'd say, the first quarter of the show's life, as of this time, was Mylon Thompson-Bukovic, who at the time was also the general manager of S3, and she has since ascended to perhaps S4 or complex storage service. And you have transitioned from a role where you were the general manager of Glacier. Correct. Or Amazon S3 Glacier, and that's the point of the question. Is Glacier part of S3? Is it something distinct? I know they are tightly related, but it always seems that it's almost like the particle wave experiment in physics, where is it part of S3 or is it a distinct service? It just depends entirely on the angle you're looking at it through.
Starting point is 00:04:09 Right. Well, Amazon S3 Glacier is a business that we run as a separate business with a general manager, Joe Fitzgerald, who looks after that business today. Certainly, most of our customers use Glacier through S3, so they can put data into S3, and they actually can put it directly into the Glacier storage class or the Glacier Deep Archive storage class. Or customers can configure lifecycle policies to move data into Glacier at a certain point.
Starting point is 00:04:38 So the primary interface customers use is through S3, but it is run as a standalone business because there's just a set of technology and human decisions that need to be made specific to that type of storage, that archive storage. So I work very closely with Joe. He and I are peers, but they are run as separate businesses. So you have, of course, transitioned. I guess we'll say that you've thought you are no longer the GM of Glacier. You're now the GM of S3, and you just had a somewhat big announcement to celebrate that 15-year anniversary of S3 Object Lambda. Yes, we're very excited about S3 Object Lambda, and
Starting point is 00:05:19 we've spoken to a number of customers who were looking for features with S3. And the way that they described it was that they liked the S3 API. They want to access their data through that standard API. There's lots of software that knows how to use that, including obviously the AWS SDK. And so they liked that get interface to get data out and to put data in. But they wanted a way to change the data a little bit as it was being retrieved. And there's a lot of use cases for why they wanted to do it. Everything from redacting certain data to maybe changing the size of an image for particular workloads, or maybe they have a large amount of XML data and they want that for certain applications, they want a JSON formatted
Starting point is 00:06:03 input. And so rather than have a lot of complicated business logic to do that, they said, well, why can't I just put something in path so that as the data is being retrieved through the get API, I can make that change, that data can be reformatted. It's similar to the Lambda at Edge approach, where instead of having to change or modify the source constantly and have every possible permutation, just operate on the request? Yeah, that's right. So I want one copy of my data. I don't want to have to create lots of derivative copies of it, but I want to be able to make changes to it as it's going through the APIs. So that's what we built. It is Lambda. It's
Starting point is 00:06:40 integrated with Lambda. It's full Lambda. So really, it's pretty powerful. Customers can do anything you can do in a Lambda function. You can do in these functions that are then run. So an application makes a get request that invokes the Lambda function. The function can process the data and then whatever's returned out is then sent and streamed back to the application. So customers can build some transformation logic
Starting point is 00:07:02 that runs in line with that request, but then transforms that data that goes to applications. So at the time that we're recording this, the announcement is hours old. This is not something that has had time yet to permeate the ecosystem. People are still working through the various implications of it. So it may very well be that this winds up aging before we can even turn the episode around. But what is the most horrifying use case of this that you've seen so far? Because I'm looking at this and I'm thinking, oh, you know what I can use this for? And people
Starting point is 00:07:34 are thinking, oh, a database? No, that's what Route 53 is. Now I can use S3 as a messaging queue. Well, possibly. I keep saying that I'm going to use it as a random number generator, but that was... I thought that was the bill. Not quite. We have a lot of use cases that we're hearing and seeing already in just the first few hours for it. I don't know what I would call any super horrifying, but we have everything from what I was saying in terms of redaction and image transformation to... one of the things that I think a lot of it will be great will be using it to prepare files for ML training. You know, I've actually done some work with training machine learning models. And oftentimes, there's just little things you have to tweak in
Starting point is 00:08:17 the data. Sometimes you get a row that has an extra piece of data in it that you didn't expect, or it's missing a field, and that causes the training job to fail. So just being able to kind of cleanse data and get it ready to feed into an ML training model, that seems like a really interesting use case as well. Increasingly, it's starting to seem like S3's biggest challenge over the past 15 years of evolution has been that it was poorly named because it's easy to look at this now and come away with the idea that it's not simple and if you take a look at what it does it's very clearly not i mean the idea of having storage it increases linearly as far as cost goes you're billed for what you use without having to pre-provision storage appliance at a
Starting point is 00:09:03 petabyte at a time and buy a number of shells. Ooh, if I add one more, the vendor discount kicks in, so I may as well over-provision there. Oh, we're running low. Now we have to panic order and get some more in. I've always said that S3 has infinite storage because it does. It turns out you folks can provision storage added to S3 faster than I can fill it, I suspect, because you just get the drives on Amazon. Well, it's a little bit more complicated than that. I mean, I think, Corey, that's a place that you rightly call out. When we say simple storage service, although there's so much functionality in S3 today, I think we go back to some of the core tenets of S3 around the simplicity and scalability and resiliency. And those are not easy. There's a lot of time spent around the simplicity and scalability and resiliency. Those are not easy.
Starting point is 00:09:45 There's a lot of time spent within the team just making sure that we have the capacity, managing the supply chain to a deep level. It's a little bit harder than just clicking buy now, but we have teams that focus on that and do a great job. Also just around looking around corners and identifying how we continue to raise the bar for resiliency and security and durability of the service.
Starting point is 00:10:10 So there's just, yeah, there's a lot of work that goes into that, but I do think it goes back to that simplicity of being able to scale up and scale down makes it just really nice to build applications. And now with the ability to build serverless applications where you have the ability to put a little code there in the request path so that you don't have to have complicated business logic in an application. We think that that is still a simple capability.
Starting point is 00:10:35 It goes back to how do we make it easy to build applications that are integrated with storage? Does S3 Object Lambda integrate with all different storage tiers? Is it something that only works on standard? Does it work with infrequent access? Does it work with, for example, the one that still exists but no one ever talks about, reduced redundancy storage? Does it work with Glacier? Just it sits there and that thing spins for an awfully long time. It will work with all storage classes, yes. With Glacier, you would have to restore an object first and then it would, so you'd issue the restore initially, although the Lambda function itself could also issue the restore. And you would most likely then come back for a second request later to retrieve the data from Glacier once it's been restored. But it does work with S3 standard, S3 intelligent tiering, SIA, and any other storage classes. I think my favorite part of all of this is that the interaction model for any code that's accessing stuff in S3 doesn't change.
Starting point is 00:11:28 It is strictly a talk to the endpoint, make a typical S3 get, and everything that happens on the back end of that is transparent to your application. Exactly. And that was, again, if you go back to the simplicity, how do we make this simple? We said customers love just that simple API. It's a get API. And how do we make this simple? We said, customers love just that simple API.
Starting point is 00:11:45 It's a GET API. And how do we make it so that that API continues to work and applications that know how to use a GET, they can continue to use a GET and retrieve the data,
Starting point is 00:11:54 but the data will be transformed for them, you know, before it comes back. Are there any boundaries around what else that Object Lambda is going to be able
Starting point is 00:12:02 to talk to? Is it only able to do internal massaging of the data that it sees? Is is going to be able to talk to? Is it only able to do internal massaging of the data that it sees? Is it going to be able to call out to other services? How extensible is this? The Lambda can do essentially whatever a Lambda function can do, including all the different languages. And then also, yeah, it can call out to DynamoDB, for example,
Starting point is 00:12:21 if you want to, for example, let's say you have a CSV file and you want to augment that CSV with an extra piece of data where you're looking it up in a DynamoDB table, you can do that. So you can merge multiple data streams together. You can dip out to an external database to add to that data. It's pretty flexible there. So on some level, what you're realistically saying here is that until now, S3 Object Lambda directly as a public website endpoint at this point. So that's something that we're definitely listening to feedback from customers on.
Starting point is 00:13:10 Can I put CloudFront in front of it and then that can invoke the GET endpoint? Today you can't, but that is also something that we've heard from a few use cases. But primarily the use cases that we're focused on right now are ones where it's applications running within the account or within a peer account. I was hoping to effectively re-implement WordPress on top of S3. Now, again, not all use cases are valid or good or something anyone should do, but
Starting point is 00:13:35 that's most of the ways I tend to approach architecture. I tend to live my life as a warning to others whenever I get the opportunity. Yeah. I don't know how to respond to that, Corey. That's fine. You don't need to. So one thing that was also discussed is that this is the 15-year anniversary, and the service has changed an awful lot during that time. In fact, I will call out, for really no other reason than to be a small, petty man, that the very first AWS service in beta was SQS. Someone's going to win my bar trivia night on that someday. That's right. But S3 was the first to general availability because obviously a message queue was needed before storage. And let's face it as well, that most
Starting point is 00:14:17 people, even if they're not in the space, can instinctively wrap their heads around what storage is. A message queue requires a little bit more explanation. But that's okay. We will do the revisionist history thing, and that's fine. But it's evolved beyond that. It had some features that, again, are still supported but not advertised. The reduced redundancy storage is still available but not talked about, and there's no economic incentive for doing it, so people should not be using it. I will make that declaration on my part, so you don't have to. But you can still talk to it using SOAP calls in the regions where that existed via XML, which is the one true data interchange format, because I want everyone mad at me.
Starting point is 00:14:57 You can still use the, we'll call it legacy because I don't believe it's supported in new regions, the BitTorrent interface for S3 data. A lot of these were really neat when it came out in far future, and they didn't pan out for one reason or another, but they're still there. There's been no change since launch that I'm aware of that suddenly breaks if you're using S3 and have just gone on walkabout for the last 15 years. Is that correct? You're right. There's functionality that we had from early on in S3 that's still supported. And
Starting point is 00:15:31 I think that speaks to the way we think about the service, which is that when a customer starts adopting it, even for features like BitTorrent, which certainly that's not a feature that is as widely adopted as most of them, But there are customers that use it. And so our philosophy is that we continue fully supporting it and helping those customers with that protocol. And if they are looking to do something different, we'll help them find a different alternative to it. But yeah, the only other thing that I would highlight is just that there have been some changes to the TLS protocols we've supported over time. And that's been something we've closely worked with customers to manage those transitions to make sure that we're
Starting point is 00:16:08 hitting the right security benchmarks in terms of the TLS protocol support. It's hard on some level also to talk about S3 without someone going, oh, what about that time in 2017 when S3 went down? Now, I'm going to caveat that before we begin in that, one, it went down in a single region, not globally. To my understanding, the ability to provision new buckets was impacted during the outage, but things hosted elsewhere would have been fine. Everything depends inherently on S3 on some level, and that sort of leads to a cascade effect where other things were super wonky for a while. But since then, AWS has been remarkably public about what changed and how things have changed. I think you mentioned during the keynote at reInvent or reInvent two years ago
Starting point is 00:16:56 that there's now something like 235 microservices at the time that power S3 under the hood, which, of course, every startup in the world looked at that and said, oh, a challenge, we can beat that. Like there's somehow Pokemon and you've got to implement at least that many to be a real service. I digress. A lot changed under the hood to my understanding, almost a complete rewrite, but the customer experience didn't. Yeah, I think that's right, Corey. And we are constantly evolving the services that underlie S3. And over the 15 years, that's been maybe the only constant has been the change in the services. And those services change and improve based on the lessons we've learned and new bars that we want to hit.
Starting point is 00:17:38 And I think one really good example of that is the launch of the S3 Strong Consistency in December last year. And Strong Consistency, for folks who have used S3 for a long time, that was a very significant change. It was a bimodal distribution as far as the response to that. The response was either, what does that even mean and why would I care? And the other type of response was people dropping their coffee cup in shock when they heard it. It's a very significant change. And obviously, we delivered that to all requests, to all buckets, with no change to performance and no additional costs. So it's just something that everyone who uses S3 in today or in the future got for free, essentially, no additional charge. What does strong consistency mean? And why is that important, other than as an impressive feat of technical engineering?
Starting point is 00:18:29 Right. So in the original implementation of S3, you could overwrite one object, but still receive the initial version of an object in response to a GET request. So that's what we call eventual consistency, where there can be generally a short period of time, but some period of time where a subsequent write would not be reflected in a get request. And so with strong consistency now, the guarantee we provide is that as soon as you receive a 200 response on a put request, then all subsequent get requests and all subsequent list requests will include that most recent object version, the most recent version of the data that you've provided for that object. And that's just an important change because there's plenty of applications that rely on object version, the most recent version of the data that you've provided for that object. And that's just an important change because there's plenty of applications that rely on that idea of, I've put the data and now I'm guaranteed to get the exact data that I put
Starting point is 00:19:15 in response versus getting a no older version of that data. There's a lot that goes into that. It's deceptively complicated because someone thinks about that in the context of a single computer writing to disk. Well, why is that hard? I edit a file, then I talk to that file, and my edits are in that file. Yeah, distributed systems don't quite work that way. And now imagine this at the scale of S3. It was announced in a blog post at the start of this week that 100 trillion objects are stored in S3. That's something like 16,000 per person alive today. And that is massive. And part of me does wonder how many of those
Starting point is 00:19:51 are people doing absolutely horrifying things. But it's a, customer use cases are weird. There's no way around that. That's right, north of 100 trillion objects. I think actually 99 trillion are cat pictures that you've uploaded, Corey. Oh, almost certainly. Then I use them as a database.
Starting point is 00:20:06 The mood of the cat is how we wind up doing this. It's not just for sentiment analysis. It's sentiment-driven. Yeah, that's right. That's right. But yes, S3 is a very large distributed system. And so maintaining consistent state across a large distributed system requires very careful protocols. There's actually one of the things we talked about this week that I think is pretty interesting about the way that
Starting point is 00:20:30 internal engineering in S3 has changed over the last few years is that we've actually been using formal logic and mathematical proofs to actually prove the correctness of our consistency algorithms. So the team spent a lot of time engineering the consistency services and all the services that had to change to make consistency work. You know, there's a lot of testing that went into it, kind of traditional engineering testing. But then on top of that, we brought in mathematicians, basically, to do formal proofs of the protocols. And they found edge cases. I mean, some of the most esoteric edge cases you can imagine, but... But it's not just startups that are using this stuff. It's hospitals. Those edge cases need to not exist if you're going to make guarantees around things like this.
Starting point is 00:21:12 That's right. You just have to make sure. And it's hard. It's painstaking work to test. But with our formal logic, we're able to just simulate billions of combinations of messages and updates that we're able to then validate that the correct things are happening relative to consistency. So it was a very significant engineering work. It was a multi-year effort really to get strong consistency to the point it was. But just to go back to your earlier point, that's just an example of how S3 really has changed under the hood, but the external API, it's still the external API. So that's our North Star on all of this work. Incidents happen fast, but they don't come out of nowhere. If they're watching,
Starting point is 00:21:54 your team can catch the sudden shifts in performance, but who has time or more importantly, the inclination to constantly check thousands of hosts, services, and containers. That's where New Relic Lookout comes in. Part of full-stack observability, it compares current performance to past performance, then displays it in an estate-wide view of your whole system. Best of all, I get to pronounce it as New Relic Lookout. And that's a blast. Sign up for free at
Starting point is 00:22:26 newrelic.com and start moving faster than ever. Tell them Corey sent you and be sure to yell LOOKOUT when you talk to them. So you've effectively rebuilt the entire car while hurtling down the freeway at 60, or if you're like me, 85, but it still works the same way. There are some things as a result that you're not able to change. So if you woke up alternate timeline, you knew then what you know now, how would you change the interface? Or what one-way doors did you go through when building S3 early on in its history that in hindsight, you would have treated differently? Well, I think that for those customers who used S3 in the very early days, you know, there was originally this idea that S3 buckets would be global, actually, global in scope.
Starting point is 00:23:12 And we realized pretty early on that what we really wanted was regional isolation. And so today, when you create a bucket, you create a bucket in a specific region. And that's the only place that that data is stored. It's stored in that region. Of course, it's stored across three physically diverse data centers within that region to provide durability and availability, but it's stored entirely within that region. And I think, you know, in hindsight, I think if we had known initially that we would have moved into that regional model, we may have thought a little bit differently about how buckets are named, for example. But where we are now, we definitely like the regional resiliency. I think that's a model that has proven itself time and time again, that having that regional resiliency is critical. And customers really appreciate that. Something I want to talk about speaks directly to the heart
Starting point is 00:24:00 of that resiliency and the, frankly frankly ridiculous level of durability and availability the service offers. We've had you get on stage talking about these things. We've had Milan several times on stage talking about these things. And Jeff Barr writes blog posts on all of these things. I'm going to go out on a limb and guess that there's more than just the three of you building this. Oh, yeah.
Starting point is 00:24:23 What's involved keeping this site up and running? Who are the people that we don't get to see? What are they doing? Well, there's large engineering teams responsible for S3Up, of course. And they, I would say, in many ways, are the unsung heroes of delivering the services that we do. Of course, you know, we get to be on stage and talking about these cool new features, but it's only with a ton of hard work about the engineering teams day in and day out. And a lot of it is having the right instrumentation and monitoring the health of the service to an incredibly deep level. It's down very deep into hardware, of course, very deep into software, and getting all those signals, and then making sure that every day we're doing the right set of things, both in terms of work
Starting point is 00:25:11 that has to be done today and project work that will help us deliver step functions improvements, whether it's adding another degree of availability or looking at just certain types of data and certain edge cases that we want to strengthen our posture around. There's constant work to look around corners and really just to continuously raise the bar for availability and resiliency and durability within the service. It almost feels on some level like the most interesting changes and the enhancements that come out almost always without comment come from the strangest moments. I mean, I remember having a meeting with a couple of folks a year or two ago when I was I kept smacking into a particular challenge. I didn't understand that there was an owner ACL at the time, and it turned out that there were two challenges there. One was that I didn't fully understand what I was looking at. So people took my bug report more seriously than it
Starting point is 00:26:08 probably deserved. And to be clear, no one was ever anything professional on this. And we had a conversation, my understanding dramatically improved. But the second part was a while later, oh yeah, now with S3, you can also set an ACL that determines that any object placed into the bucket now has an ownership ID of the bucket owner. And I care about that primarily because that directly impacts the cost and usage reports that are what my company spends most of our life staring into. But it made for such an easier time as far as what we have to deploy to customer accounts and how we wind up thinking about these things. And it was just a quiet release that was like many others with the same lack of fanfare that, oh, the service you don't use is now available in a region you've never heard of. Have fun. And there are, I think,
Starting point is 00:26:53 almost 3,000 of various releases last year. This was one of them that moved the needle. It's little things like that, but it's not so little because doing anything like this at the scale of something like S3 is massive. People who have worked in very small environments don't really appreciate it. People who have worked in much larger environments, like the larger the environment you get to work within, the more magical something like this seems. I mean, I think that's a great example of the kind of feature that took us actually quite a bit of work to figure out how we would deliver that in as simple a fashion as possible.
Starting point is 00:27:29 It was actually a feature that at one point, I think there was a two or three D matrix being developed of different ways that we might have to have flags on objects. And we just kept pushing and pushing to say, it has to be simpler. We have to make this easier to use. And I think we ended up in a really good spot. And certainly for customers that have lots of accounts, which I would say, you know, almost all of our large customers end up with many, many accounts. Well, we'd like to hope so anyway. There was a time where, oh, just one per customer is fine.
Starting point is 00:28:00 And then you got to redefine what large account looked like a few times. And it was, okay, let's see how this evolves. Again, the things you learn from customers as you go. Yeah, exactly. And there's lots of reasons for it, different teams, different projects, and so forth, where you have lots of accounts. But for any of those kind of large account scenarios or large organization scenarios, there's almost always cases where you're writing data across accounts in different buckets. So certainly that's a feature that for folks who use S3, they knew exactly how they were going to use it, turn it on right away. It's the constant quiet source of improvement that is just phenomenal.
Starting point is 00:28:36 The argument I always made that I think is one of the most magical parts of cloud that isn't really talked about is that if I go ahead and I build an environment and I put it in AWS, it's going to be more durable, arguably more secure, and better run and maintained five years later if I never touch it again. Whereas if I try that in a data center, the raccoons will carry the equipment off into the wilderness right around year three. And that's something that is generally not widely understood until people have worked extensively with it s3 is also one of those things that i find is a very early and very defining moment when companies look at going through either a cloud migration or a digital transformation if people will pardon me using the term that i love making fun of. It's a good metric for how cloudy, for lack of a better
Starting point is 00:29:26 term, is your application and your environment. If everything lives on disks attached to instances, well, not very. You've just more or less replicated your data center environment into a cloud, which is fine as a step one. It's not the most efficient. It makes the cloud look a lot more like your data center, and you're not leveraging a lot of the capability there. Object storage is one of the first things that seems to shift. And one of the big accelerators or drags on adoption always seems like it comes down to how the staff think about those things. What do you see around that? Yeah, I think that's right, Corey. I think that it's super exciting to me working with customers that are looking to transform their business because oftentimes
Starting point is 00:30:11 it goes right down to the data in terms of what data am I collecting? What can I do with that data to make better decisions and make more real-time decisions that actually have meaningful impact on my business? And we talk about modern applications. Some of it is about developing new modern applications and maybe even applications that open up new lines of business for a customer. But then we have other customers who also use data and analytics to reduce costs and to better manage their manufacturing or other facilities. We have one customer who runs paper mills,
Starting point is 00:30:46 and they were able to use data in S3 and analytics on top of it to optimize how fast the paper mills run to eliminate the machines or reduce the amount of time the machines are down because they get jammed. And so it's examples like that where customers are able to, first off, you know, using S3 and using AWS, able to just store a lot more than they've ever thought they could in a traditional on-premises installation. And then on top of that, really make better use of that data to drive their business. I mean, that's super exciting to me, but I think you're right as well about the people side of it.
Starting point is 00:31:21 I mean, that is, I think, an area that is really underappreciated in terms of the amount of change and the amount of growth that is possible and yet really untapped at this point. On some level, it almost shifts into, and again, this is understandable. I'm not criticizing anyone. I want to be clear here. Lord knows I've been there myself. Where people start to identify the technology that they work with as a part of their identity of who they are, professionally or in some cases personally. And it's an easy step to make. If there were suddenly a giant pile of reasons that everyone should migrate back to data centers, my first instinct would be to resist that, regardless of the merits of that argument. Because, well, I've spent the last four years getting super deep into the world of AWS. Well, isn't that my
Starting point is 00:32:07 identity now on some level? So I should absolutely advocate for everything to be in AWS at all times. And that's just not true. It's never true. But every time, it's a hard step to make, psychologically. Oh, I agree. I think it is a psychologically hard step to make. And I think people, you know, get used to working with the technology that they do. And change can always be scary. You know, I mean, certainly for myself as well, just in circumstances where you say, well, I don't know, it's uncertain. I don't know if I'm going to be successful at it. But I firmly believe that everyone at their core is interested in growth and developing and doing more tomorrow than they did yesterday. And sometimes it's not obvious. Sometimes it can be
Starting point is 00:32:50 frightening, as I said, but I do think that fundamentally people like to grow. And so I think with the transformation that's ongoing in terms of moving towards more cloud environments, and then, again, transforming the business on top of that, you know, to really think about IT differently, think about technology differently. I just think there's tremendous opportunity for folks to grow, you know, people who are maintaining current systems to grow and develop new skills to maintain cloud systems or to build cloud applications even. So I just think that's an incredibly untapped portion of the market in terms of providing the training and the skills and support to transform the culture and the people to have
Starting point is 00:33:31 the skills for tomorrow's environments. Thank you so much for taking the time to speak with me about this dizzying array of things that S3 has been doing, what you've been up to for the last 15 years, which is always a weird question. What have you been up to for the last 15 years anyway, but usually in a much more accusatory tone? If people want to learn more about what you're up to, how you're thinking about these things, where can they find you? Well, I mean, obviously they can find the S3 website at aws.amazon.com slash S3, but there's a number of videos on Twitch and YouTube, both of myself and many of the folks within the team. Really, we're excited to share a lot of new material this week with our Pi Week. We decided Pi Day was not enough.
Starting point is 00:34:15 We would extend it to be a four-day event. ton of information, including some deep dives with some of the principal engineers that really help build S3 and deliver on that higher bar for availability and durability and security. And so they've been sharing a little bit of behind the scenes as well as just a number of videos on S3 and the innards there. So really invite folks to check that out. And otherwise, my inbox is always open as well. And of course, I would be remiss if I didn't point out that I just did a quick check and you have what can only be described as a sarcastic number of job openings within the S3 organization of all kinds of different roles. Yeah, that's right. I mean, we're always hiring software engineers and then systems development engineers in particular, as well as product management.
Starting point is 00:35:05 And TPMs and, you know, I'm assuming naming analysts, like how do we keep it S3, but not call it simple anymore? Let me spoil that one for someone. Serverless. You call it serverless storage service and you're there. Everyone wins. You ride the hype train. Everyone's happy. I'm going to write that up right now, Corey. It's a good idea. Exactly. We'll have to find a way to turn that story into six pages, but that's a separate problem. That's right. Thank you so much for taking the time to speak with me. I really appreciate it. Likewise. It's been great to chat. Thanks, Corey.
Starting point is 00:35:32 Kevin Miller, General Manager of Amazon's Simple Storage Service, better known as S3. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that whenever someone tries to retrieve it, we'll have an object lambda rewrite it as something uplifting and positive. If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duck Bill Group.
Starting point is 00:36:10 We help companies fix their AWS bill by making it smaller and less horrifying. The Duck Bill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started. This has been a HumblePod production. Stay humble.
