Screaming in the Cloud - Episode 46: Don't Be Afraid of the Bold Ask

Episode Date: January 30, 2019

If you're looking for older services at AWS, there really aren't any. For example, Simple Storage Service (S3) has been with us since the beginning. It was the first publicly launched service, quickly followed by Simple Queue Service (SQS). Still today, when it comes to these services, simplicity is key! Today, we're talking to Mai-Lan Tomsen Bukovec, vice president of S3 at AWS. Many people use S3 the same way that they have for years, such as for backups in the cloud. However, others have taken S3 and run with it to find a myriad of different use cases.

Some of the highlights of the show include:

- Data: Where do I put it? What do I do with it?
- S3 Select and Cross-Region Replication (CRR) make it easier and cheaper to use and manage data
- Customer feedback drives AWS S3 price options and tiers
- Using Glacier and S3 together for archive data storage; decisions and constraints that affect people's use and storage of data
- Feature requests should meet customers where they are, rather than having to invest in time and training
- Different design patterns and best practices to use when building applications
- Batch operations make it easier for customers to manage objects stored in S3
- AWS considers compliance and retention when building features
- Mentorship: Don't be afraid of the bold ask

Links:

- re:Invent
- AWS S3
- Amazon SQS
- AWS Glacier
- Lambda
- CHAOSSEARCH

Transcript
Hello, and welcome to Screaming in the Cloud, with your host, cloud economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

This episode of Screaming in the Cloud has been sponsored by Chaos Search. Chaos Search is a cloud-native SaaS offering that extends the power of Elasticsearch's API on top of your data that already lives in Amazon's S3. Chaos Search essentially turns your data in S3 into a warm Elasticsearch cluster, which finally gives you the ability to search, query, and visualize months or years worth of log and event data without the onerous cost of running a hot ELK cluster for legacy data retention. Don't move your data out of S3. Just connect the Chaos Search platform to your S3 buckets, and in minutes, the data is indexed into a highly compressed data format and written back into your S3 buckets, so it keeps the data under your control. You can then use tools like Kibana on top of that to search and visualize your data, all on S3, querying across terabytes of data within seconds. Reduce the size of your hot ELK clusters and waterfall your data to Chaos Search to get access to an unlimited amount of log and event data. Access more data, run fewer servers, spend less money. Chaos Search. To learn more, visit chaossearch.io and sign up for a trial. Thanks to Chaos Search for their support of this episode.

Hello and welcome to Screaming in the Cloud. I'm Corey Quinn. Today I'm in Seattle meeting with Mai-Lan Tomsen Bukovec, the VP of S3 at AWS. Mai-Lan, thank you so much for taking the time to speak with me. Hi, Corey. Nice to see you. You are probably best known in recent history for appearing on the keynote stage at re:Invent. There was a video of you walking through the Spheres, having conversations, taping your hands to either go kickboxing or deliver one heck of a product review. But you've been at AWS a lot longer than since November. Yeah, that's right. I had a great time talking to the re:Invent audience during the keynote. S3 is a favorite topic of mine, but I have been with AWS for over eight
Starting point is 00:02:26 years now. Which feels about three times longer once you wind up getting inside of it. It feels like it ages you prematurely. So let's start at the beginning. As you wound up recently informing me, S3 was the first publicly launched service followed very closely by SQS. So if you're looking for older services at AWS, there really aren't any. This is one that's been with us since the beginning. It's interesting in that it's turned into something that almost sort of belies its name. It's become such a capable and foundational part of what people do on AWS that it almost makes you wonder, is Simple even the right descriptor for it anymore? Yeah, we think it is.
Starting point is 00:03:04 I mean, to be honest, Simple has been at the heart of how we build S3 from the start. In fact, it's part of our product tenants or service tenants that we as builders think about every time we add something to S3. And I think, you know, if you take a step back and you look at how people use S3, you know, a lot of people use it in the same way that they used it over a decade ago, which is those backups in the cloud, right? How do I take the storage, which I know is gonna grow really fast
Starting point is 00:03:34 and put it in a place where I know it's gonna be really safe and secure and I don't have to pay very much for that storage. And we have plenty of those customers, just like we did on day one. But for us, what's been really interesting is how people have taken that and just run with it and started to use S3 in different ways. Now, we still think of it as being simple because everything that we try to build is a very clean representation of what a customer is trying to do. But right now, customers are doing all kinds of things with S3.
Starting point is 00:04:05 One thing that's consistently surprised me as I've moved through my career between companies where I was an employee or as a consultant is the myriad use cases that I'll see S3 being used for. There's the stuff that everyone would expect it to be used for, such as backups or static asset serving. But there's also strange stuff, such as using it as a repository backend for Git repositories, and other things that seem a little bit on the strange side. If I'm going to look back at the last couple of years of S3-based announcements, one of the prevailing themes that I'm seeing from the outside world is that a decent number of these enhancements are focusing on what appears to be moving compute closer to the data, about taking data that lives in S3 and being able to do things with it where step one isn't pull all of
Starting point is 00:04:52 that data somewhere else. First, was that an accurate assessment from where you see things? And secondly, what's driving that? Yeah, I think that's very true. I think a lot of customers sometimes will start with these very basic, you know, where do I put it type of scenarios for storage. And then they, you know, kind of quickly go to the, what do I do with it? And now a lot of customers who are actually starting with their first use case as being a data lake, they start from what do I do with it? And all places kind of end up in S3. And if I think about a lot of the new capabilities that we have launched over the last couple of years, you know, whether it's a storage class or it's a new capability, a lot of it is focused on how do we make it easier to use the data.
Starting point is 00:05:31 And there's a lot of different examples about that. One of them is S3 Select, which we recently added Parquet format for. We'll let you go in and use these simple SQL statements directly on your storage. And the way I think about it is that it's a, it's, it's a byte aware feature. You can go in and just get the bytes you want out of the object that you have in storage, and you can do it without having to retrieve the entire object. And we did that because we have a lot of customers that filter data sets in S3. And to do that, they pull all the data out to do some type of analytics. And then they put
Starting point is 00:06:05 like, you know, a very small set of data back in. And now you have two different ways to do it. You have Athena, if you want to use that, or you have S3. And for me, the tip I always give to customers when they're trying to pick between them is basically a join. If you want to do complex SQL statements, Athena is your tool. But if you're just trying to find a needle in a haystack, or you want a simple, you know, a simple parsing or a simple SQL statement, then S3 Select is a really good way to do a targeted get of bytes in an object. That's one example. Another example is we have this feature called cross-region replication. And our idea with cross-region replication is that let's give you a policy-driven engine that lets you create another copy of data in another region if you
Starting point is 00:06:44 want it. And you can target that second copy of data in another region in the Glacier storage class or the same storage class you have in your primary region if you want to. But the way we think about it is that we know that not all data is created equal out of your storage. And so we let customers tag or put some metadata, custom metadata, on individual objects and then only replicate those individual objects. And so we think about usage of data, it's not just usage in, you know, how do I use it in my business workflow, which might be more of a S3 select kind of scenario. We also think about it as, you know, how do I manage my data in a way that is compliant with
Starting point is 00:07:20 whatever company policies I have? And how do I do it in a fine grain way where I can save as much money as possible? And that's how we keep on building on that kind of core fundamental get, put, list functionality that we started with over 10 years ago. We really take a look at what customers want to do with it. And then we try to make that cheaper and easier. One of the early arguments against AWS was, well, what happens if they wind up adding a zero to the end of every price? And we take a look back at almost 15 years of history now, and there has never been, that I've been able to detect anyway, a single case of a service that has cost more after it has been launched to general availability. Price changes only tend to go in one direction, and that's down. It's one of the only systems where I think you can look at this and say, oh, yes, I'm storing data here. And the cost to store that is getting less expensive over time. That's one of those transformative moments that you start to
Starting point is 00:08:14 see that, huh, maybe there's something to this idea of public cloud. Yeah, we take that really, really seriously. And in fact, you know, if you think about the evolution of our storage classes, that's just a reflection on that commitment to cost savings. You know, if you think about, for example, a few years back, we released infrequent access and free infrequent access gives you cheaper, almost 40% cheaper price on storage at rest. And then you have to pay a retrieval fee. But, you know, by and large, if you retrieve your object one time or even less a month, you're totally going to save money every month on SIA. And we didn't just stop there. We kept on going
Starting point is 00:08:50 with a one-zone storage class, which is very similar for that infrequent access pattern, which is 20% cheaper than SIA. And then at reInvent, we launched this new storage class, Intelligent Tiering, where we will automatically give you savings on individual objects that happen to be infrequently accessed without you doing anything at all, without picking a storage class. And so it's not just, you know, continuing to try to decrease the price of an individual storage class. It's also how do we find new ways where even if you don't know the access patterns of your
Starting point is 00:09:22 storage, you can still save money automatically. It's really important to us. It's part of our, I would call part of our DNA. It's what we come in every day to do. I was very excited at Intelligent Tiering's launch at reInvent for my own account because I know there are objects that once I wind up putting into S3, I will never touch them again. So being able to pay less to store them was incredibly exciting.
Starting point is 00:09:41 And then I remembered that my account is nothing to speak of, and my S3 charges something on the order of 32 cents a month. And it turns out that optimizing that is not really in line with what I want to be focusing on. One thing you didn't mention just now, as you went through the list of storage classes, is the, I don't want to call it sunset, but for a while there was a reduced redundancy storage class, which is still there. The API is probably going to be there longer than I will remain alive. But what's fascinating to me is that it isn't talked about. It's still there and in a majority of regions now, but I believe not all of them. It is no longer participating in price cuts so that standard storage with better durability
Starting point is 00:10:21 is now less expensive than the reduced redundancy storage, but you can still use that service. If you don't mind my asking, was deciding to change directions away from reduced redundancy storage into infrequent access a controversial one? What drove that decision? Yeah, I think a lot of it has to do with just straight up what customers are asking for. I mean, to be honest, not a lot of customers were using RRS mean, to be honest, not a lot of customers were using RRS. So we launched it and not a lot of customers started using it. We kind of looked at it and we said, huh, why? Really, what do customers want here? And we took a bunch of that feedback and that's why we built our next set of storage classes. We built SIA for the infrequent access,
Starting point is 00:11:02 and that one got a huge amount of uptake very quickly. And we built this new type of low-durability storage, which is this one zone infrequent access storage class that we have. And this ZIA storage class is actually growing really rapidly as well because we hit on what customers are really looking for, which is, you know, I have a lot of infrequently accessed storage, which is frankly copy storage. A lot of people have a lot of different copies and they don't really know who's using those copies. So I can't get rid of those copies, but they really don't really want to pay too much money for it. And so they need a place to just put that. And OneZone is just perfect for it. So I don't think there was a lot of controversy over it. I think it was more of a, let's take a hard look at the storage class and see what customers
Starting point is 00:11:45 are liking, what they're not, and how do we use that information to build what people want. One topic that I enjoy bringing up from time to time when I'm giving conference talks is at the start of the talk, when I'm waiting for people to finish filing into the room, I'll poll the audience with random questions that strike my fancy. One of them that I enjoy asking is for people to raise their hand if they've ever used Glacier to store something. And a bunch of hands go up, as you might expect. And my follow-up to that is usually, great, can you please keep your hand up if you've ever
Starting point is 00:12:14 restored data from Glacier? And almost every hand winds up dropping. Now, every time I go into a new account, I see different use cases that I generally hadn't considered before. But I'm curious as to what you see and how that interplays with the recent launch of Glacier Deep Archive. Yeah, Glacier is a great, great product. And to be honest, every time I talk to any customer, I ask him, you know, where are you putting your archives? Because everybody has archives.
Starting point is 00:12:41 I think one of the interesting things about Glacier is that, you know, we have customers that just see it as another storage class of S3. And we've built a lot of capabilities around that. We've built our lifecycle tiering around that where you can do this movement into Glacier. But a lot of customers also said, you know, how can I make it easier to use both S3 and Glacier together? And that is something that we're kind of deeply investing in right now. And I think that what you're going to see is all of our new storage classes, for example, the upcoming Deep Archive one, is going to be just another storage class of S3.
Starting point is 00:13:16 And by that, I mean you'll be able to do things like use our cross-region replication engine that I talked about before to set up a policy that says your second copy could be in deep archive or it could be in Glacier. And I think it's just going to open up people to think about archive in different ways. I mean, even that word archive, right? Archive just sounds so monolithic. It sounds like someone's about to try and sell me a tape drive, if we're being honest. That's right. And, you know, to be honest, like that's not how data works. Data is this organic living thing. And it doesn't just fit into this box of archive.
Starting point is 00:13:48 You've got very active archives. You've got, you know, semi-active archives. That is a really smart way to understand what are your retrieval patterns? How can you get the lowest price point, which is always what Glacier or Deep Archive, any of these storage classes will give you. But how do you not think about your data as an archive? How do you think about it as a library of content? Or you think about it storage that just has different access patterns? It's interesting when you look at the spectrum of storage and what aspects people are willing to compromise on. There's the idea of,
Starting point is 00:14:22 I don't necessarily care if you lose the data. I can always rebuild it. That's one angle. There's the idea of, I don't necessarily care if you lose the data. I can always rebuild it. That's one angle. There's the argument of, I don't care how much it costs me to pull it back because I very rarely need to do that. Or there's even the, I don't care if it takes you a week to wind up getting the data back to me. You can put it on a drive and mail it to me and that retrieval latency is just fine for my use cases. And seeing what decisions people make and the constraints that shape their choices and their interaction with storage has been fascinating to watch from a third-party perspective. Moving slightly away from the idea of storage tiering, I had a feature
Starting point is 00:14:56 request last spring that I made on Twitter, because apparently that is the best way to get attention, is to be first really obnoxious and loud and secondly request something that you would like to see and my request was for an sftp endpoint for an s3 bucket and immediately i wound up getting two very different responses one was oh my stars yes i would give my kingdom for that and the other was sft is so 1990s. Just teach whoever it is you're using to wind up interfacing with the S3 API. I used to work in finance. Teaching a bank how to use a new API opens Pandora's box of compliance. Being able to meet customers where they are and not have to ask your partners to completely refactor their entire security policy
Starting point is 00:15:45 is one of those things that is transformative and was a great thing to see. What was it that I guess drove that particular decision point as far as deciding that that was a feature that was worth investing time in? And please say it was my tweet. Well, I actually will say it was your tweet, but it was also a lot of other customers who said very similar things. I mean, you know, to be honest, Corey, A is really important to know we don't judge, right? We don't judge. I mean, if a customer comes in and says, I need this protocol that's been around for a long time, but, you know, we got a lot of customers who need it because the fact of the matter is there's a lot of clients out there and there's a lot of business
Starting point is 00:16:23 processes and a lot of B2B workflows built around it and we'll go figure out how to how to build it and you know whether it's that or if it's nfs with our file gateway that's another one there's all these protocols out there and what people are trying to do is they're trying to figure out how do i create a bridge right how do i take what i got today and figure out how i'm going to get to the cloud because i don't want to get there. But, you know, maybe I can't rewrite everything. And like I said, we just, we don't judge. If customers are in that spot, they're in that spot. And we got to go figure out how to help them. There's something to be said for meeting customers where they are. When the feature was announced, my mentions caught fire again with that previous dichotomy.
Starting point is 00:16:59 But there was also a bit of a flare up of people ranting about how extortionately priced it was. And looking at it, I seem to recall, and please don't quote me on this, but it's ballpark a couple hundred bucks a month. And people were saying, oh, I can spin up my EC2 instance and have it do its own thing. And yeah, if you can do that, this product probably isn't aimed at you. When you're doing this for something that has compliance eyes on it and has an entire chain of custody concern, you need to build an AMI, launch an instance from it, make sure that it winds up polling to make sure the transfer is completed, validate that the transfer completes successfully, you have data rotation, you have archival concerns.
Starting point is 00:17:42 Paying a couple hundred bucks a month to make this problem go away completely for an enterprise is an absolute no-brainer, with the caveat that it doesn't feel to me like this is a offering that is aimed at a tinkering hobbyist in their garage, which I very closely resemble. I don't know about that, Corey. Yeah, you know, I think FINRA uses it for exactly what you're talking about. So talking about the finance space and what they did is they were able to retire their entire SFTP infrastructure that they were having to maintain because they had a lot more cost to it than you know honestly maybe even just the money to run the compute it's the overhead for it it's the infrastructure and that's why we built that as a managed service with one or two big data startup exceptions i've never had a client that is spending more on their infrastructure bill than they are on their own, I guess I want to call it, their own engineering expense. Everything else tends to wind up being just a,
Starting point is 00:18:50 the infrastructure is always number two or further down the list. Yeah, I'd agree with that. So here's a fun one that I've been meaning to ask someone here for a while, and you're probably one of the best people I can think of to ask this. What do you wind up seeing as far as interesting use cases for S3 that have taken you by surprise? Yeah, I think one of the fun things about working on S3, and I think you said it earlier, Corey, it's such a foundational technology for a lot of customers. It's just storage. It's where your data goes. It cuts across every single market segment. It cuts across all the different geographies. And the use
Starting point is 00:19:27 cases are actually phenomenal. I've had so many customers come to me and say, hey, I use S3 as my data warehouse. And then they tell me what's working for them. Or they'll come to me and they'll say, hey, I use S3 for a video library. I mean, the satellite imagery and the video imagery that we have on S3 is actually fairly phenomenal. You know one peloton, those bikes? Oh, yes. I dream of affording one one day, and then I remember that my version of fitness involves fitting this entire burrito in my mouth,
Starting point is 00:19:52 and then I give up on those dreams. You could actually eat that burrito on the bike while looking at a video that's stored in S3, right? So there's so many different environments, companies, types of data. It's customer care records. It's recordings. It's imagery. I think one of data, it's customer care records, it's recordings, it's imagery. I think one of the ones that I like a lot, and, you know, it's the NGA, which is a government agency that has satellite imagery.
Starting point is 00:20:15 To me, you know, one of the really interesting things about satellite imagery is that it's not just pictures of our world, it's pictures of our world that's being used in so many different ways. And, you know, the Ebola crisis used all those digital images. It was used for helping Nepal after the earthquake. I think for me, you know, I'm a former Peace Corps volunteer. I went to West Africa, to Djenne in Mali, and I served there. And for me, it's, you know, one of the things I really like about working on S3 is being able to find all these examples about how storage is not just, you know, helping companies, it's helping people. And whether that's genomics research for companies like Grail, looking for early indicators of cancer to help, you know, prevent disease and to treat disease at a stage where it's more treatable than in later stages. You know, all of those different
Starting point is 00:21:06 ways that we're helping not just companies, but also people and the company's missions, that's really important to me. So from a perspective of someone starting out today with S3, what best practice advice would you have for them as they start interacting with effectively a type of storage semantic that many shops historically have either not existed until now or have been on-prem only haven't really had to work with. Object stores were never generally a thing that storage vendors wound up offering. What advice would you have for those companies? Yeah, a lot of people just don't think of object. The word object itself is not something that flows from people's vocabularies and, you know, the work they do.
Starting point is 00:21:47 And the fact of the matter is that S3 is just used for storage. It's unstructured storage. It's unstructured data. But it's just a lot of it in every single company. And so for folks who are just starting, I think the first thing, you know, I like to ask them is, you know, do you, when you're thinking about what you're building, there's a bunch of different design patterns that you can use. You have your lift and shift design pattern where, you know, for whatever reason, all you are trying to do is you're trying to take, you know, a application that you have on premises and just move it whole kit and caboodle over to the cloud. A lot of customers have that. Other customers have things where they're just trying to swap out one part of
Starting point is 00:22:23 their application. So they keep the compute somewhere and the storage somewhere else or vice versa. And then you have this whole, hey, I can rebuild things from scratch and really leverage, you know, the entire, you know, the cloud universe of AWS and the benefits. And what's really interesting is that, you know, a lot of companies can do all three at the same time with different applications. You don't really need to do them in any kind of order. And surprisingly, some companies will do them, you know, to take one of those design patterns in a way that will surprise you. So I mentioned FINRA before. That's a great example.
Starting point is 00:22:56 So over five years ago, I remember we were having this discussion with FINRA because FINRA was just getting started. And they were really looking at this lift and shift. Do I do that? Right. Do I swap out a part? And, you know, the mission of FINRA is to find instances of fraud within 24 hours of a transaction. Right. And so in order to fulfill their mission, they actually had to go and rebuild an application from the start. They had to, you know, do the whole kit and caboodle. The conversation I would have with people today is a little bit different, but some of the bones are still there. We would say, you know, how do you want to structure your
Starting point is 00:23:29 account and your buckets in the account? And the first thing I would say is put block public access on your account for every bucket today and in the future that you never want to be public. And what that setting will do, it's kind of a control that we offered and we launched at reInvent, is it'll make sure that any bucket created in the future associated with that account won't allow public access to anything in the bucket. It's incredibly important. And if you happen to have, you know, a need for having a public access bucket, like for example, you're, you know, you're using it, sort of a website content or whatever, then you can create a separate account. You can link it if you want to have all the reporting and the linked account benefits, but you can create a separate account that is
Starting point is 00:24:09 only restricted for a certain amount of use and you don't have to have block public access on it. But for 99.9% of any usage of storage that you have internally for data lake or for your business, you have that protection with block public access. We used to say, Corey, and you'll know this being a longtime S3 user, we used to say that you should, you know, randomize the first few characters of a prefix in order to help make sure that you get the best performance over time. You remember that? Oh, yes.
Starting point is 00:24:39 The good old days. And if you didn't do that, you were going to see interesting performance bottlenecks as you scaled that were challenging. What we did in the last year is that we raised our performance rates. And what's interesting about our performance rate is that we don't have any throughput restrictions on a bucket or an account. Our performance rates are requests per second on a prefix. I guess the secret behind this entire podcast recording, of course, is to come in here and get requests for weird edge case problems I have.
Starting point is 00:25:07 But something that I've seen that recurs fairly commonly at large scale accounts is historically there were hard limits on number of buckets per account. And as a result, people would wind up putting, in some cases, what are now billions of objects into a single S3 bucket. And now since everyone's been using that bucket for five years, there's no real understanding of what's in that bucket. And until reInvent, my answer to what to do there involved an awful lot of custom coding, an awful lot of work that no one really wanted to do, and it still was relatively unsatisfactory. The new answer to that, to my understanding, is the idea of S3 batch operations. Can you tell us a little bit about that? Yeah, for sure.
Starting point is 00:25:49 You know, it kind of goes back to this evolution of S3 that we started with, Corey, where, you know, S3 when it started was a very simple interface. And we've kept a lot of that simplicity. But then we kind of looked at what customers were doing, and they were telling us, you know, what work they were doing with S3. And batch operations is a direct result of that simplicity. But then we kind of looked at what customers were doing and they were telling us, you know, what work they were doing with S3 and batch operations as a direct result of that. So we now have a lot of customers that have a lot of objects in their buckets. And they were telling us that they were building applications to manage those objects. And we thought to ourselves, you know what, if we've got a lot of customers doing that, we've got to figure out a way to put that into our API to absorb the work there. And so it's not dissimilar with, you know, I talked a little bit about best practice for transfer manager. That's built into the SDK.
Starting point is 00:26:35 And so things like exponential backoff are built into the SDK. ourselves, how do we build in a bunch of the work, the infrastructure involved in taking one operation that you have to do across, you know, millions, if not billions, if not trillions of objects, how do you take that work and build it into the API? And that's what we did with batch operations. And what we've done is we've launched with, you know, a few capabilities that you can do, right? Initially, like restores is big one. And, you know, the goal here is that when you use this, you can pass in a manifest of objects that you want to act against. And you can use our inventory report. We run an inventory report at just like incredibly low cost. So anytime you want to know what's in your bucket, you just turn on inventory report. You won't even notice a charge for it. You get a list of what's
Starting point is 00:27:23 in your bucket. And so you can use the inventory report or you can create a custom manifest and you pass it in to batch operations and you tell batch operations what to do. And batch operations will handle retries. It'll audit what got done. It'll give you reports for what got done. Now we announced it at reInvent and we are going to be launching it in GA pretty shortly. And it is what you want to use if you want to do anything to your storage in bulk. And I think one of the really interesting things is that that applies to Lambda functions as well. And so if, for example, you have a higher function that you want to do or higher level operation that you want to do, like, for example, you want to create a thumbnail out of an image. Some people will use it for their garbage collection or their delete processing. You know, other people will use it for content transformations, right, for articles or things like that.
Starting point is 00:28:18 They'll add metadata, whatever they want to do. But, you know, kind of the book is open to if you are a serverless programmer and you're using Lambda, you can write a Lambda function and you can just use batch operations to run that Lambda function across the corpus of your storage. One thing that I also found that personally appealed during the most recent swath of storage announcements was the idea of compliance locks, either on a temporary basis or even account-wide to the point where an administrator couldn't release it. Suddenly, there's a much better story for regulated industries with worm compliance. That seems to me like one of those transformative features for companies in the regulated space. But if you're not one of those companies, it may have sailed completely past you. That feels like a manifestation of Amazon
Starting point is 00:29:06 releasing specific features for specific customer segments that for most of us are either going to be a transformative release or something that is going to go almost completely by the wayside. Is that a fair assessment of how Amazon views these things? Or is there a belief, for example, that something like this compliance lock option is something everyone should be using in the fullness of time? Yeah, you know, I think it is kind of interesting because it does convey a little bit of the mindset of how we build things. I will tell you, when we first looked at that feature and how we could build it, we could have just built it for, you know, the SEC regulation for worm compliance. We could have done that. We actually didn't because our goal
Starting point is 00:29:45 there was to make it more broadly applicable to anybody who wanted, you know, essentially a control, right? And so there's two modes there. You can use this new capability that we have in one of two modes. One of them is, as we said, for worm compliance. The second way to use it is really for retention, where, you know, you can actually set up a policy where somebody under a certain administration role, if you will, will be allowed to delete it over time, which is not the SEC compliant rule. But it is a different mode that helps create a class of storage that you have in your company that nobody can delete except for your compliance officer, as an example. We had a lot of other customers saying, look, you know, all data is not created equal. Usage is different. But hey, you know, access and permissions to delete needs to be different too. And I need a way to not manage that at a permissions level. I need to manage it at a policy level. And that's how we ended up building this feature. And now we have a litigation
Starting point is 00:30:42 hold or something else that winds up driving sudden shifts in data retention requirements. Yeah. And that, you know, going back to a topic that you brought up earlier, Corey, that's a lot of that is, again, really listening and learning about usage. It's not just about how are the bytes stored. That's like, you know, table stakes, securely, durably, et cetera. It's, you know, how are people using it and how do we make that safer,
Starting point is 00:31:05 easier, cheaper, right? Just better. Absolutely. We're almost out of time. And I want to thank you for taking the time out of your day to speak with me. But before we do, something that you have been a staunch advocate for in the past has been the idea of mentorship. That's been a recurring theme of a number of episodes of this podcast, where the ability to figure out, does someone starting new in the industry, how do you wind up taking your first steps? Where do you go? What does mentorship look like? The road that virtually all of us have walked to get to where we are is long since closed. The world changed. There are no data center jobs to speak of anymore the way that there once were. What advice would you have for people who find
Starting point is 00:31:45 themselves in that position at the beginning of their career who are wondering what the future looks like? Yeah, it's a great question. I think it's an important one. And, you know, I do do a lot of mentoring. And I think one of the common things that I tell just about everyone I mentor is just don't be afraid of the bold ask. Because a lot of times, particularly when you're early on in your career, you know, you can be a little intimidated by the ask. And, you know, when you're in school, the bold ask is, you know, going into your advisor's office or, right, or asking some professor to take you on for a project. And that's kind of hard, right? But you get in the workforce and your bold ask is a little different. It's like, hey, can I run that project? Hey, I have something to say in this meeting,
Starting point is 00:32:25 right? Like, how do you make sure that you feel like you can make the bold ask to go and be an owner, right? And a lot of times you don't have to ask for that permission. You don't even know it, but there's no ask here. But you might be blocking yourself off from doing it because you feel like, you know, you have to ask, you have to earn that way to the table. And so a lot of what I talk to, you know, some of my mentees about is, you know, don't let yourself be blocked by an ask that you feel like you have to make and it's a hard one for you. Don't be afraid of the bold ask. Just go do it. Occupy your space. Put your elbows out, right? Like, this is an important thing to do. And the thing that you'll realize when you do it is that people won't even know.
Starting point is 00:33:08 That big, bold ass that's built up in your head, you know, you make it and people are like, yeah, sure. Okay. Tell me how it goes, right? And you know what I'm talking about, right, Corey? I do. And if someone had told me that when I was starting out my career, it would have shaved years off of what it took me to get to where I am now. There have been struggles that I would have been able to just surmount almost with a snap of a finger if I contextualized it that way. It's just so many internal dialogues going on, right? You just
Starting point is 00:33:34 got to go and make it. The voice whispering in your ear that you're not good enough is lying to you. And I think that's something that people tend to lose sight of. Yeah, I agree with you. Thank you so much for taking the time to speak with me today. I appreciate it. For sure. Thank you, Corey. Mylan Thompson-Bukovic, VP of S3 at AWS. I'm Corey Quinn. This is Screaming in the Cloud. This has been this week's episode of Screaming in the Cloud. You can also find more Corey at screaminginthecloud.com or wherever fine snark is sold.
