Screaming in the Cloud - Episode 59: Rebuilding AWS S3 in a Weekend with Valentino Volonghi
Episode Date: May 8, 2019
About Valentino Volonghi
Valentino currently designs and implements AdRoll's globally distributed architecture. He is the President and Founder of the Italian Python Association that runs PyCon Italy. Since 2000, Valentino has specialized in distributed systems and actively worked with several Open Source projects. In his free time, he shows off his biking skills on his Cervelo S2 on 50+ mile rides around the Bay.
Links Referenced:
https://twitter.com/dialtone_
Adroll.com
Tech.adroll.com
Transcript
Hello and welcome to Screaming in the Cloud with your host, cloud economist Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud. This episode of Screaming in the Cloud is sponsored by O'Reilly's Velocity 2019 conference.
To get ahead today, your organization needs to be cloud native.
The 2019 Velocity program in San Jose from June 10th to 13th is going to cover a lot of topics we've already covered on previous episodes of this show,
ranging from Kubernetes and site reliability engineering over to observability and performance. The idea here is to help you stay
on top of the rapidly changing landscape of this zany world called cloud. It's a great place to
learn new skills, approaches, and of course, technologies. But what's also great about almost
any conference is going to be the hallway track. Catch up with people who are solving interesting
problems, trade stories, learn from them,
and ideally learn a little bit more
than you knew going into it.
There are going to be some great guests,
including at least a few people
who've been previously on this podcast,
including Liz Fong-Jones and several more.
Listeners to this podcast can get 20% off of most passes
with the code CLOUD20.
That's C-L-O-U-D-2-0 during registration.
To sign up, go to velocityconf.com slash cloud. That's velocityconf.com slash cloud. Thank you to Velocity for sponsoring this podcast.
Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Valentino Volonghi, CTO of AdRoll. Welcome to the show.
Hey, Corey.
Thanks for having me on the show.
No, thanks for being had.
One, let's start at the very beginning.
Who are you and what do you do?
Well, I'm CTO at AdRoll Group.
And what AdRoll Group does is effectively build marketing tools for businesses that want to grow.
And they're looking to try to make sense of everything that is happening in marketing,
especially when it comes to digital marketing,
that effectively is going to help their businesses drive more customers to their websites
and turn them into profitable customers effectively.
Awesome. You've also been a community hero for AWS
for the last five years or so.
Yeah, I was lucky enough to be included
in the first group of community heroes,
which I think was started in 2014.
It still isn't completely clear to me
what exactly community heroes do
besides obviously helping the company,
and what we did to deserve to be called community heroes. I think lots of people such
as yourself are doing a great amount of work to help the community understand the cloud and
spreading the reasoning behind everything that is happening in the market these days. So maybe you should be a hero as well.
Unfortunately, my harsh line on no capes winds up being a bit of a non-starter for that.
And I've been told the wardrobe is very explicit.
Oh, okay. I didn't know that.
Exactly. It all comes down to sartorial choices and whatnot.
So you've been involved with using AWS
from a customer perspective for,
I'm betting, longer than five years.
Yeah, probably longer than a decade, actually.
Longer than a decade.
And it's amazing watching how that service is just,
I guess how all AWS services have evolved
over that time span,
where it's gone from,
yeah, it runs some VMs and some storage.
And if you want to charitably call it a network, you can because latency was all over the map.
And it's just amazing watching how that's evolved over a period of time where not only it was
iterating rapidly and improving itself, but it seemed like the entire rest of the industry was
more or less ignoring it completely as some sort of flash in the pan. I've never understood why they got the head start that they did.
Oh man, such a long, long time ago. I remember I was still in Europe, before I came over to start up AdRoll, but 2006, I think,
was when S3 was first released. And I remember starting to take a look at it and thinking, wow, now you can put files on a system out there
that you don't know really where it lives,
but I don't need to have my own machines anymore.
And it was the time that you used to buy co-locations online
and it was a provisioning process for all of those.
You needed to choose your memory size
and you typically got a
co-located, co-hosted, shared-host type situation. And it was expensive. And then, yeah, in 2007,
EC2 came out, and it felt like magic. And at that point in time, AdRoll was running in a data center out here
on Spear Street in San Francisco.
And I remember we had two database machines,
both RAID 5,
and one machine was humming along fine,
but the other one had two drives,
two drives that were failing in the RAID 5, and we ordered the replacement drives on Amazon or
whatever, Newegg, and I think they were on backorder at that time, and we needed to
wait for a week or two before those could arrive. At that moment in
time, I made the call. That's it.
We're not doing this anymore.
We are going on AWS.
Just give me two weeks and I'll migrate everything, I told the CEO, and then we'll be free from
the data center.
And I tell you, the costs will be exactly the same.
And actually, that's exactly what happened.
It took two weeks.
We moved all the machines over. The costs were exactly the same,
but we had no more needs to run to the store and provision extra capacity or buy extra capacity
or any of that stuff. It also allowed us massive amounts of flexibility.
And then very early on, it was funny because I think I've lived through all of the
stages of disbelief when it comes to AWS or cloud in general, where the first complaints were,
well, it's not performant enough. If you want to run MapReduce, you cannot run it inside AWS.
There's simply not enough I/O performance on the boxes.
I even lived in a period of time, I was following closely when GitHub was on
AWS at first, and then they moved to Rackspace afterwards because AWS wasn't fast enough even
for them. And they were working through some issues here and there.
Some of those things were obviously real, like true immaturity situations.
EBS has gone through a lot of ups and downs, but it's mostly been stable since
then. We're now living in a day and age where the EBS drives that you get from AWS are
super stable, but it never used to be like that.
You needed to kind of get adapted, get used to the fact that an EBS drive could fail or
the entire region could go down because of EBS drives, which has happened in US East
a few times in the past. But yeah, from those very few simple services
with very rudimentary and simple APIs,
it does feel like they have started to add more and more,
not only breadth,
because obviously that's evident to anybody at this point in time.
I don't think anybody can keep up with the number of services that are being released. But what's really surprising is that for the services
where they see value and where customers are seeing a lot of adoption and interest, they
can go to extreme depth with the functionality that they implement, the care with which they implement it, and
ultimately with how much of it is available for many of them.
Now you get over 160, I think, different types of instances.
It used to be that you only had six or seven, and now 160.
Some of them are FPGA instances, which I think there's only maybe a handful of
people in the world that can code those machines properly. And they certainly don't work at my
company right now. Well, that's always the fun question too, is do you think that going through
those early days where you were building out an entire ecosystem, sorry, an entire infrastructure on relatively unreliable instances and
disks and whatnot was a lesson that to some extent gets lost today. I mean, it
taught you early on, at least for me, that any given thing can fail, so
architecting accordingly was important. Now you wind up with ultra-reliable
things that never seem to fail, until one day they
do and everything explodes. Do you think it's leading to less robust infrastructures in the
modern era? It's possible. I think if people get on AWS thinking that we're going to run in the
cloud, so it's never going to fail because Amazon manages it, I think they're definitely
making a real mistake, a very short-sighted statement right there.
Not just because of that, in case of failures, but a couple of years ago, I think, maybe
three years ago, there were all of those Xen vulnerabilities coming out that Amazon needed
to patch and entire regions needed to be rebooted.
What do you do at that point when your infrastructure is not fully automated and capable of being restored without downtime in user-facing software?
You're going to need to pause development for weeks just in order to patch a high-urgency
vulnerability in your core infrastructure.
That's just an event that is not even a fault
of anybody. It's not even necessarily under full control of Amazon, and you
need to be ready for some of that stuff. So there are, I would say,
lots of companies that, especially in their first
journey of moving stuff inside AWS, tend to just replicate exactly what they have in their own
data center and just move it inside AWS. I know this because, for example, AdRoll has done that
the first time that we migrated into AWS. We first migrated just our boxes. And then we quickly
learned that it wasn't always that reliable. And so we needed to figure some of that stuff out for ourselves, and effectively you started to realize, in our case back then, that you
needed to work around many of those things. But as you said, today it isn't
quite that way, and to an extent Amazon almost makes a promise about many of
these services not failing, or taking care of your infrastructure for you.
For example, if you look at Aurora,
it's a stupendous, fantastic piece of database software.
It's extremely fast.
It's always replicated in multiple availability zones,
so multiple data centers.
The failover time is less than a second, I think, at the moment. And when you're tasked with solving a problem, building a service, you're going to choose
to build it on top of Aurora, neglecting to think about what happens if Aurora doesn't
answer me because the network goes down.
Or what happens if my machines go down because I misconfigured them? Some of the biggest
higher profile issues in terms of infrastructure of the last year alone, for example, with S3,
have been erroneous configuration changes being pushed to production. What do you do at that
point? Your system needs to be built in such a way that it's going to be resistant, at least
partially, to some of these things. And Amazon is trying to build a lot of
the tools around that stuff, but I think it still takes a lot of
presence of mind from the developers and architects to actually do this in a
thoughtful way: use the services that you need to use in a thoughtful way,
understand the perimeter of
your infrastructure, and particularly the assumptions you're making as you're building the infrastructure.
And if you can design a graceful degradation service where a failure of an entire subsystem
is not going to lead to a complete failure to serve a website, where you progressively get to
just a less useful website,
but still maintain the core service that you might offer, then it improves your infrastructure
quite a lot.
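To make that concrete, here is a minimal sketch of the fallback pattern being described, in Python. The function names and the cached items are hypothetical; the point is only that a timeout or error in the primary store degrades the response rather than failing the whole page.

```python
# A minimal sketch of graceful degradation, assuming a callable that hits
# the primary data store. Names and fallback contents are hypothetical.

FALLBACK = {"recommendations": ["popular-item-1", "popular-item-2"]}

def recommendations_for(user_id, query_primary, timeout_s=0.2):
    """query_primary is any callable that queries the primary database."""
    try:
        items = query_primary(user_id, timeout=timeout_s)
        FALLBACK["recommendations"] = items  # refresh the degraded-mode copy
        return {"source": "primary", "items": items}
    except Exception:
        # Timeout, network partition, misconfiguration: degrade, don't die.
        return {"source": "fallback", "items": FALLBACK["recommendations"]}
```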
I think this is where Chaos Monkey, Chaos Gorilla, Chaos Kong, or whatever it's
called for the region failure, come into play to try to exercise those muscles.
It's obviously important to have them going in production, but I think
even a good start would be to have those running as you're prototyping your software and
just see where the failures bring you. And another trend we've seen recently is the use of TLA+ as a
formal verification language where you can
effectively spec your system using these formal languages and then test it using verification
software so that it highlights places where your assumptions were not checking out with
reality effectively. The challenge that I've always had when looking at,
I guess, shall we say, older environments
and older architectures is that in the early days,
what you just described was very common,
where you wind up taking an existing on-prem data center app
and more or less migrating that wholesale directly
as a one-to-one migration into the cloud.
That was great when you could view the cloud as just a giant pile of, I guess, similar style
resources. But now with 150-something services in AWS alone, the higher-level services start to unlock
and empower different things that weren't possible back then, at least not without a
tremendous amount of work. You talk, for example, about not having enough people around who can program FPGAs.
Do you think that if you were building AdRoll today, for example, you would focus on higher
level services architecturally?
Would you go serverless?
Would containers be interesting?
Or would you effectively stick to the tried and true architecture that got you to where
you are?
Probably, I would probably do a mix.
I think what's important to evaluate when building infrastructure is the skill set of the people that you have working on your team.
And you certainly need to play to their strengths.
Ultimately, they are the ones building and maintaining your infrastructure,
not Amazon, not an external vendor,
and most certainly not the open source
maintainer of whichever project you use as an alternative. And the other aspect is
try to understand sometimes with subtle indications from Amazon,
which services Amazon is investing most of their energy or a lot of their energy
in so that you know that they
continue to grow and they continue to receive support and they continue to fix bugs and
issues because you know that they'll be with you for the rest of your company's life, for
example.
But on the other hand, a lot of times you write software just automatically without really thinking about the better
way to write something, just because you're used to it. And so typically it's
not an easy thing to just jump out of the habit of getting an instance going
to do something, and it might be a good idea at first. But if you develop a good process to test new
architectures and new ideas, you might quickly end up realizing, well, actually, I don't
need to run a t2.micro or whatever for running this particular thing with S3, where every
time a file is uploaded to S3, I run some checks on the file that was uploaded.
You might realize, well, maybe the
best thing to do is to try to play around with a Lambda function instead,
and that effectively fixes your entire problem. One area that, for example, we've
tested around, and it's on AdRoll's technical blog, is that we built a globally distributed eventually consistent counter that uses DynamoDB
and Lambda and S3 together, and effectively is able to aggregate all of
the counts that are happening in each of the remote regions into a single counter
in a central region that can then be synced back to each remote region.
This way we can keep track of, for example, in our case, how much money has been spent in each particular region
and be sure that this money is spent efficiently.
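As a rough illustration of the shape of that idea, and not AdRoll's actual implementation (theirs combines DynamoDB, Lambda, and S3; the table name, attribute names, and regions below are hypothetical), a per-region counter plus a central aggregation step might look something like this:

```python
import boto3

# Hypothetical table "regional_spend" with partition key "campaign_id" and a
# numeric "spend" attribute, deployed independently in each region.
REGIONS = ["us-east-1", "eu-west-1", "ap-northeast-1"]

def regional_increment(region, campaign_id, amount):
    """Each region only ever writes to its own counter; no cross-region writes.
    amount should be an int or Decimal (DynamoDB numbers)."""
    table = boto3.resource("dynamodb", region_name=region).Table("regional_spend")
    table.update_item(
        Key={"campaign_id": campaign_id},
        UpdateExpression="ADD spend :a",
        ExpressionAttributeValues={":a": amount},
    )

def aggregate(campaign_id, central_region="us-west-2"):
    """Run periodically (e.g. from a scheduled Lambda): sum the regional
    counters into a central total that each region can read back. The result
    is eventually consistent, which is fine for budget tracking."""
    total = 0
    for region in REGIONS:
        table = boto3.resource("dynamodb", region_name=region).Table("regional_spend")
        item = table.get_item(Key={"campaign_id": campaign_id}).get("Item", {})
        total += item.get("spend", 0)
    central = boto3.resource("dynamodb", region_name=central_region).Table("regional_spend")
    central.update_item(
        Key={"campaign_id": campaign_id},
        UpdateExpression="SET total_spend = :t",
        ExpressionAttributeValues={":t": total},
    )
    return total
```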
And the only other alternative way to do it is to set up a fairly complex database of your own and make sure that latency of updates is fast
enough, and that all the machines are up and running all the time. And if anything goes down,
it's a high-urgency situation, because your controls on the budgets go away.
So sometimes it's really useful, especially when dealing with problems in which communication and the
flow of information isn't particularly easy to grasp for an engineer,
to be able to remove an entire layer of a problem and rely on
someone else to provide the SLA that they are promising you. And so
effectively that's
the case for Lambda. There's obviously a particular
range of uses in which Lambda makes complete sense, from the point of
view of price and from the point of view of the resources needed or the type of
computation that runs on it. And if you can manage to keep this in your head, in your mind, when
you're making decisions, or you can make some tests, you can actually discover that maybe you
can use Lambda and get away with not having to solve quite a challenging problem at the end
of the day. So sometimes it helps rewriting some infrastructure just as an exercise.
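Going back to the earlier example of running checks whenever a file lands in S3, a minimal sketch of that kind of Lambda handler might look like the following. The bucket wiring (an S3 event notification on the bucket) and the specific check are assumptions made purely for illustration:

```python
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    """Triggered by an S3 event notification on the bucket (assumed). Runs a
    simple check on every uploaded object instead of a polling instance."""
    records = event.get("Records", [])
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
        # Hypothetical check: flag empty or implausibly large uploads.
        if size == 0 or size > 5 * 1024**3:
            print(json.dumps({"bucket": bucket, "key": key, "suspicious_size": size}))
    return {"checked": len(records)}
```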
What I do at AdRoll, as CTO, is that I tend to not have a lot of direct reports. I consider each service at AdRoll to
be my direct report as a team, effectively, each of them being a team. And every six weeks,
they provide a short presentation in which they explain the budgets that they've
gone through, whether they have overspent or underspent and why. And among the many things,
they also talk about their infrastructure. We have diagrams of infrastructure. We talk about
new releases from Amazon and what would be a new way to build the same thing. And they evaluate
whether it would save money or not. And so you kind of need to have someone in the organization, especially if you're planning
to adopt some of the new technologies, whose role is effectively dedicated to being
up to date with what's going on in the world and knowing the infrastructure of your systems
and be able to make suggestions and then let the team make the decision at that point.
What's also sometimes hard to reconcile for some people
is that these services don't hold still.
And I think one of the better services
to draw this parallel to
is one I know you're passionate about.
Let's talk a little bit about S3.
Before we started recording the show,
you mentioned that you thought that it was pretty
misunderstood.
Yeah.
What do you mean by that?
Well, S3 has been, in my view, one of the closest things to magic that exists
inside AWS.
Until not long ago, the maximum amount of data that you could pull from S3 was one gigabit
per second on streams.
You were limited in the number of requests per second that you could run on the same
shard of S3.
There was no way of tagging objects.
The latency on the first byte, when S3 started, was in the 200-300 millisecond range.
It was expensive.
S3 probably has undergone some of the most cost-cutting that you could see out there.
And part of the decrease in cost
is that the standard storage class has become cheaper,
but also they have added several other storage classes
that you can move your data in and out of relatively
simply without having to change the service effectively. It's the very same API, but
different cost profile and storage mechanism. And when it all started, there was just one,
it was just US Standard, and it was pretty expensive to use, both in terms of
per-request cost and storage cost. But yeah, today there's a different limit on
single-stream bandwidth. The bandwidth on a single stream is not one gigabit per
second anymore, it's at least five gigabits per second. If you have
one of the instances that have hundred-gig networking
inside Amazon, you can get all of those hundred gigs out of S3 just by fetching multiple streams.
The latency that you get on the first byte is well below 100 milliseconds. Their range queries
are very well supported, so you could fetch blocks inside S3. S3 has turned into almost a database
now. With S3 Select, you can run filters directly on your files, by decompressing them on the fly
and recompressing them afterwards, or simply by reading richer formats like
Parquet, for example. It honestly is something where it's hard to imagine how you could build
everything that we have going on right now at AdRoll without S3.
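For context, here is a hedged sketch of what an S3 Select call looks like from Python with boto3; the bucket, key, column names, and filter are all hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Filter a gzipped CSV server-side so only matching rows cross the network.
resp = s3.select_object_content(
    Bucket="example-logs",           # hypothetical bucket
    Key="2019/05/01/events.csv.gz",  # hypothetical key
    ExpressionType="SQL",
    Expression="SELECT s.user_id, s.cost FROM S3Object s WHERE s.country = 'IT'",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}, "CompressionType": "GZIP"},
    OutputSerialization={"CSV": {}},
)

# The response is an event stream; Records events carry the filtered bytes.
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode("utf-8"), end="")
```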
It has gotten to the point where running an HDFS cluster for us is not really that useful.
If you look at EMR themselves, they have a version of HBase that runs backed by S3.
And I know of extremely big companies that have moved from running HBase backed by file system HDFS
to instead HBase backed by S3 that have had incredible improvements in performance
and the consistency of the performance of HBase.
HBase is very sensitive to the performance of the disks, because it's a consistency-first database, effectively. And if the region that is currently master,
sorry, if the server that is currently master for a region is slow, it ends up bringing down that entire region, effectively.
It's a service that has grown dramatically and we have experimented even with using it as a file system
by using user file descriptors in the kernel. More
recent versions of the Linux kernel allow user file descriptors. And if you have limited use for
writing, like we do, and you want to treat the file system like a write-once-read-many file system,
then S3 becomes actually surprisingly useful as well. Netflix published a blog article on their tech blog talking about, for example,
how they mount S3 as a local file system in order to
use FFmpeg to run movie decoding and transcoding.
Because effectively FFmpeg was not
created with the idea that S3 was around, and so it needs to have the
entire file available on the local system, or at least an entire block
available; it doesn't work well with streams. And so if you can abstract that
part away from the FFmpeg API and move it into the file system, you can
suddenly use S3 as some kind of almost a file
system.
And we have done a similar thing when it comes to processing columnar files or indexed files
from inside S3, where if you know exactly the range of data that you want to access,
you can just do it inside S3.
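A rough sketch of that kind of ranged, multi-stream access with boto3 follows; the bucket, key, part size, and worker count are hypothetical, but this is the general pattern for pulling blocks of an object in parallel rather than as one serial stream:

```python
import boto3
from concurrent.futures import ThreadPoolExecutor

s3 = boto3.client("s3")

def fetch_range(bucket, key, start, end):
    # The HTTP Range header is inclusive on both ends.
    resp = s3.get_object(Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")
    return resp["Body"].read()

def parallel_get(bucket, key, part_size=64 * 1024 * 1024, workers=16):
    """Fetch one large object as many concurrent ranged streams, which gets
    closer to the instance's full network bandwidth than a single stream."""
    size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
    ranges = [(off, min(off + part_size, size) - 1)
              for off in range(0, size, part_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: fetch_range(bucket, key, *r), ranges)
    return b"".join(parts)

# Hypothetical usage: read only the byte range you need, or a whole object
# in parallel parts.
# data = parallel_get("example-data", "columnar/part-0001.parquet")
```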
We use it as a communication layer between the map and the reduce stage of our homegrown
MapReduce frameworks.
And again, it allows us to cut away thousands of hours per day of waiting time for downloading
a file to the local disk before processing it on local disk.
We can just process it right away and cache it on the box after it's been downloaded.
It's quite remarkable.
The speed increase, the cost decrease, the S3 Select.
I think we're going to see in the near future
databases that start to use S3
as the actual backend for their storage more and more
without worrying about the limitations of the current disk.
And effectively, we'll be able to scale in a stateless way,
adding as many machines as you want
and respond to as much traffic as you can
without needing to worry about failures either.
It's an incredible amount of opportunity
and possibility that is coming down
in the future that I'm really excited to see become real. I think that requires people to
update a lot of their understandings about it. I mean, one of the things that I've always noticed
that's been incredibly frustrating is that people believe it when it says it's simple storage
service. Oh, simple. And you look on Hacker News and that's generally the consensus.
Well, S3 doesn't sound hard. I can build one of those in a weekend. And you see a bunch of
companies trying to spin up alternatives to this. Companies no one's ever heard of before. Oh,
we're going to do S3, but on the blockchain is another popular one that makes me just roll my
eyes back in my head so hard I pass out. You're right. This is the closest thing to magic that I think you'll see
in all of AWS. And people haven't seemed to update their opinion. I think you're right.
It's getting closer to a database than almost anything else. But I guess the discussions
around it tend to be, well, a little facile, for lack of a better term. Well, there was this outage
a couple of years ago, and it went down for four hours in a single region, and that's a complete
non-starter, so we can't ever trust it. Who's going to be able to run their own internal data
store with better uptime than that? Remarkably few people. Yeah, I mean, AdRoll has used 17 exabytes of bandwidth from S3 just for our business
intelligence workload from EC2 to S3 this past month. I don't even know how to even start.
If a router communicating between S3 and whatever instance we have going around
goes out and we're out, we're out for good.
S3 has multiple different paths to reach EC2 and they are all redundant.
Each machine's internals there are obviously redundant.
They replicate the data in multiple zones and whatnot.
This bandwidth is available across multiple zones because I'm storing data inside the
region so it's already available in multiple data centers.
The number of boxes that are needed to aggregate to 17 exabytes as well is quite impressive.
We have no people thinking about this. We run over, I think we run over 20 billion
requests per month on S3. I'm pretty sure that bucket, if it were made public, would be one of
the biggest properties in terms of volume on the web. And I just can't see it. Processing 20 billion events per month with files that are sometimes significantly big,
it's going to take a lot of people.
Exactly.
People like to undervalue their own expertise, what their time costs, the opportunity cost
of focusing on that more than other stuff.
And you still see it with strange implementations of trying to mount S3 in a FUSE file system. Trying to treat it
like that has never worked out for anything I've ever seen, but people keep
trying. Yeah, the FUSE file system is an interesting one. I think
things might change in the future, but it needs to be done with some concept of
what you're doing. It really isn't a file system, but it works for a certain subset of the use cases.
And we're not even talking about necessarily yet all of the compliance side of things.
So encryption, ability to rotate your keys, to set permissions on who can or cannot access,
tagging each object, building rules for accessing the objects
or the prefix based on the tags available on that object using IAM policies.
Lifecycle transitions, object locks so no one can delete it, litigation hold options.
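As a rough sketch of what a lifecycle transition looks like in practice with boto3 (the bucket name, prefix, and retention periods here are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical policy: move cold data to Glacier after 90 days, to Deep
# Archive after a year, and expire it after roughly seven years.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "cold-data",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                "Expiration": {"Days": 2555},
            }
        ]
    },
)
```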
And then take a look at Deep Archive, $12,000 a year to store a petabyte.
That's who cares money.
Yeah, that's who cares money.
Exactly, absolutely right. Plus, it doesn't matter: at a certain point in time, if you're not compliant
and you're storing that much data, you just can't, so you might have to delete it all. There's a lot
of different security regulations. GDPR is incoming. Is your database going to help out to remain compliant? Well, GDPR
isn't incoming, actually, it came out a year ago. But with GDPR here and the California privacy
law coming next, at the end of this year, is your storage
system going to help you to become compliant? Who's going to build all of the compliance tools
on top of your
storage system and make sure that you remain compliant till kingdom come? So basically,
I mean, it's awesome. And I think it's a healthy exercise for every engineer
to always question what is the value that you're getting out of a service, and try to sketch
out or understand the infrastructure, try
to whiteboard it out and maybe do a quick cost estimation. But it's never enough to have just the
engineer in there: security is a stakeholder in this kind of decision, the operations team is
a stakeholder in this kind of decision, the business is a stakeholder in this kind
of decision. The business might not be happy, as you said, to spend $500,000 a year for two engineers to
work on S3 when they can spend $12,000 to store a petabyte for a year inside S3. It's
just a lot easier. Twelve grand is really who cares money.
Exactly, especially when you're dealing with
what it takes to build and run something that averages that much data. It becomes almost a
side note. And the durability guarantees remain there as well. It feels like one of those things
we could go on with for hours and hours. Yeah. And the other aspect that is very important is how close S3 is to the computing power.
Because as I said, 17 exabytes of data just for BI purposes, I cannot do that across data centers.
There is no way. That would cost everything from the business in terms of bandwidth costs.
Many times other vendors approach AdRoll,
obviously asking us to use their storage solution, but either,
to deploy you, I need my own data center, and then you're not
close to where the capacity is, or you are in another system where I don't
need a data center, but you're not located near my compute capacity.
And so I lose that piece of the equation
that makes all of the stuff that I want to do worthwhile.
To an extent, S3 is the biggest lock-in reason
behind EC2.
It really is hard to replicate all of the different
bits and pieces of technology that are built on top of S3, and in particular
being so close to so many services that are easy to integrate with each other,
things such as Lambda or EC2, makes it very compelling.
Other cloud vendors are obviously always playing catch-up and getting there,
but I don't think they're quite to the level
of customization, security, compliance,
and ease of use that really Amazon S3 has.
It also has really hard aspects to it as well,
but I think by and large, it's a huge success story.
If people are interested in hearing more,
I guess, of your wise thoughts on the proper application of these various
services, where on the internet can they find you? Oh, the easiest way to find me
is to shoot me questions or comments, or follow me on my Twitter account,
dialtone underscore. AdRoll also has a tech blog at tech.adroll.com. We usually publish
a lot of interesting articles about the ongoings with our infrastructure, things such as the
globally eventually consistent counter that I mentioned earlier, but also our extreme use of the spot market effectively, or our strange use of S3 as a quasi file system
for processing our MapReduce jobs,
which also are described in our blog.
And generally speaking,
I'm more than happy to answer questions
and whatnot at local events.
I usually go to as many local events as I can here
whether AWS user events or other meetups,
or go to a random set of other conferences as well.
Thank you so much for taking the time to speak with me today.
I appreciate it.
Thank you.
Valentino Volonghi, CTO of AdRoll.
I'm Corey Quinn, and this is Screaming in the Cloud.
This has been this week's episode of Screaming in the Cloud.
You can also find more Corey at Screaminginthecloud.com
or wherever fine snark is sold.
This has been a HumblePod production. Stay humble.