Screaming in the Cloud - Multi-Cloud in Sanity with Simen Svale Skogsrud

Episode Date: December 6, 2022

About Simen

Ever since he started programming simple games on his 8-bit computer back in the day, Simen has been passionate about how software can deliver powerful experiences. Throughout his career he has been a sought-after creator and collaborator for companies seeking to push the envelope with their digital end-user experiences. He co-founded Sanity because the state-of-the-art content tools were consistently holding him, his team, and his customers back in delivering on their vision. He is now serving as the CTO of Sanity. Simen loves mountain biking and rock climbing with child-like passion and unwarranted enthusiasm. Over the years he has gotten remarkably good at going over the bars without taking serious damage.

Links Referenced:
Sanity: https://www.sanity.io/
Simen's Twitter: https://twitter.com/svale/
Slack community for Sanity: https://slack.sanity.io/

Transcript
Starting point is 00:00:00 Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. This episode is brought to us in part by our friends at Pinecone. They believe that all anyone really wants is to be understood,
Starting point is 00:00:38 and that includes your users. AI models combined with the Pinecone Vector Database let your applications understand and act on what your users want without making them spell it out. Make your search application find results by meaning instead of just keywords. Your personalization system make picks based on relevance instead of just tags. And your security applications match threats by resemblance instead of just regular expressions. Pinecone provides the cloud infrastructure that makes this easy, fast, and scalable.
Starting point is 00:01:09 Thanks to my friends at Pinecone for sponsoring this episode. Visit pinecone. Nobody cares about backups. Stop lying to yourselves. You care about restores, usually right after you didn't care enough about backups. If you're tired of the vulnerabilities, costs, and slow recoveries when using snapshots to restore your data, assuming that you even have them at all,
Starting point is 00:01:41 living in AWS land, there's an alternative for you. Check out Veeam. That's V-E-E-A-M for secure, zero-fuss AWS backup that won't leave you high and dry when it's time to restore. Stop taking chances with your data. Talk to Veeam. My thanks to them for sponsoring this ridiculous podcast. Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's guest is here to tell a story that I have been actively searching for for years, and I have picked countless fights in pursuit of it. And until I met today's guest, I was unconvinced that it
Starting point is 00:02:21 actually exists. Simon Sveil is the co-founder and CTO of a company called Sanity. Simon, thank you for joining me. And what is Sanity? What do you folks do over there? Thank you, Corey. Thank you. So we used to be this creative agency that came in as kind of, we would kind of blackhawk down into a company and help them innovate. And that would be our thing. And we, these were usually content projects like media companies corporate communication these kinds of companies we would be coming in and we would develop some ideas with them and they would love those ideas and then invariably we would never be able to do those ideas because we couldn't change the workflows in their cms we couldn't extend their content models we couldn't really
Starting point is 00:03:02 do anything meaningful so then we would end up setting up separate tools next to those content tools, and they would invariably get lost and never be used after a while. So we were like, we need to solve this problem. We need to solve it at the source. So we decided we wanted a new kind of content platform. It would be a content platform consisting of two parts. There would be the kind of workspace where you create content and do the workflows on it. That would be like an open source project
Starting point is 00:03:27 that you can really customize and build the exact workspace that you need for your company. And then on the other side, you would have this kind of content cloud. We call it the content lake. And the point with this is to, very often you need to bring in
Starting point is 00:03:39 several different sources. You have your content that you create specifically for a project, but very often you have content from an ERP system, availability of products, time schedules, the serial real estate agent, you have data about your properties that come from other systems.
Starting point is 00:03:54 So this is a system to bring all of that together. And then there is another thing that kind of really frustrated me was content systems had content APIs and content APIs are very particularly and specifically about a certain way of using content. Whereas we thought content is just data. It should be data and the API should be a database query language. So these are kind of the components of Sanity. It's a very customizable workspace for working with content and running content
Starting point is 00:04:21 workflows. And it's this content lake, which is this kind of cloud for your content. The idea of a content lake is fascinating on some level, where it goes beyond the data lake story, which I've always found to be a little of the weird side when cloud companies get up and talk about this. I remember this distinctly a few years ago at a re-event keynote that Andy Jassy, then the CEO of AWS, got up and talked about customers' data lakes, and here's tools for using that. And I mentioned it to one of my clients, and they looked at me like I was a very small, very simple child and said, yeah, that would be great, genius, if we
Starting point is 00:04:54 had a data lake, but we don't. It's like, you have many petabytes of data hanging out in S3. What do you think that is? Oh, that's just the logs and the assets and stuff. It's, yeah. So it turns out that people don't think about what they have in the same terms, and meeting customers with their terms is challenging. Do you find that people have an idea of what a content cloud or a content lake is before you talk to them about it?
Starting point is 00:05:18 I mean, that's why it took us some time to come up with the word content lake, but we realized our thinking was the content lake is where you bring all your content to make it curable and to make it deliverable. So that's like, you should think as long as I need to present this to end users, I need to bring it into the content lake. And it's kind of analogous to a data lake. Of course, if you can't curate your data in a data lake, it isn't a data lake. Even if you have all the data there, you have to be able to analyze it and deliver it in
Starting point is 00:05:46 the format you need it. So it's kind of an analogy for the same kind of thinking. And a crux of a content lake is it gives you one kind of single API that works for all of your content sources. It kind of brings them all in together in the one umbrella, which is kind of the key here that teams can then leverage that without learning new APIs and without ordering up new APIs from the other teams.
Starting point is 00:06:10 The story that really got me pointed in your direction is when a mutual friend of ours looked at me and said, oh, you haven't talked to them yet? Because it was in response to a story I've told repeatedly at length to anyone who will listen, and by
Starting point is 00:06:25 that I include happens to be unfortunate enough to share an elevator ride with me. I'll talk to strangers about this. It doesn't matter. And my argument has been for a long time that multi-cloud, in the sense of, oh yeah, we have this one workload and we can just seamlessly deploy it anywhere, is something that is like cow tipping, as Ben Kehoe once put it, in that it doesn't exist. And you know it doesn't exist because there are no videos of it happening on YouTube. There are no keynote stories where someone walks out on stage and says, oh yeah,
Starting point is 00:06:54 thanks to this company's great product, I had my thing that I built entirely on AWS, and I can seamlessly flip a switch, and now it's running on Google Cloud, and flip a switch and now it's running on Google Cloud and flip the switch again. And now it's running on Azure. And the idea is compelling. And there are very rarely individual workloads that are built from the beginning to be able to run like that, but it takes significant engineering work. And in practice, no one ever takes advantage of that optionality in most cases. It is vanishingly rare. And our mutual friend said, oh yeah, you should talk to Simon. He's done it. Okay. Shenanigans on that, but why not? I'm game. So let me be very direct. What the hell have you done? So we didn't know it was hard until I saw his face when I told him. That helps, right? Like
Starting point is 00:07:40 ignorance is bliss. What we wanted was, we were blessed with getting very, very big enterprise customers very early in our startup journey, which is fantastic, but also very demanding. And one thing we saw was, either for compliance reasons or for kind of strategic partnership reasons, there were reasons that big, big companies wanted to be on specific service providers. And in a sense, we don't care. Like, we don't want to care. We want to support whatever makes sense. And we are very, let's call it, principled architects. So actually, the lower
Starting point is 00:08:10 levels of sanity doesn't know they are part of sanity. They don't even know about customers. We had already done the kind of separation of concerns that makes the lower, the kind of workload-specific systems of sanity not know a lot of what they are doing. They are basically just kind of processing content, CDN requests, and just doing that, no idea about billing or anything like that. So when we saw the need for that, we thought, okay, that means we have what we call a charge, which is kind of the light bulbs, the ones we have hundreds and hundreds of them, and we can just switch them off and the service still works.
Starting point is 00:08:43 And then there's the control plane. That is kind of the admin interface that the users use to administrate their resources. We wanted customers to just be able to then say, I want this workload, this kind of content store to run on Azure, and I want this one on Google Cloud. I wanted that to feel
Starting point is 00:09:00 the same way regions do. Like, you just choose that, and we'll migrate it to wherever you want it. And of course, charge you for that privilege. Even that is hard to do because when companies say, oh yeah, we need to have a multi-cloud strategy here.
Starting point is 00:09:12 It's okay. If your multi-cloud strategy involves we have to have this thing on multiple clouds. Okay. First, as a step one, if you're on AWS, which is where this conversation
Starting point is 00:09:23 usually takes place when I'm having this conversation with people, given the nature of what I do for a living, it's great. First, deploy it to a second AWS region and go active-active between those two. You should, theoretically, have full service and API compatibility between them, which removes a whole bunch of problems. Just go ahead and do that and show us how easy it is. And then for step two, then talk about other cloud providers. And spoiler, there's never a step two because that stuff is way more difficult than people who have not done it give it credit for being. How did you build your application in such
Starting point is 00:09:57 a way that you aren't taking individual dependencies on things that only exist in one particular cloud, either in terms of the technology itself or the behaviors. For example, load balancers come up with different inrush times, RDS instances, provision databases at different speeds with different guarantees around certain areas across different cloud providers. At some point, it feels like you have to go back to the building blocks of just rolling everything yourself in containers and taking only internal dependencies. How did you square that circle? Yeah, it's a good point.
Starting point is 00:10:27 I guess we had a fear of... My biggest fear in terms of single cloud was just the leverage you provide your cloud provider if you use too many of those kind of super specific services, the ones that only they run. So our initial architecture was based on the fact that we would be able to migrate, like not necessarily multi-cloud, just if someone really ups the price or behaves terribly, we can say, oh yeah, then we'll leave for another cloud provider. So we only use super generic services like Q services, blob services. These are pretty generic across the providers. And then we use generic databases like Postgres or Elastic, and we run them pretty generically.
Starting point is 00:11:07 So anyone who can provide like a Postgres-style API, we can run on that. We don't use any exotic features. Let's say picking boring technologies was the most kind of important choice. And then that also goes into our business model, because we are a highly integrated database provider. Like in one sense, Sanity is a content database with this weird go-to-market. People think of it as a CMS, but it's actually the database we charge for. So also, we can't use these very highly integrated services because that's our margin.
Starting point is 00:11:34 We want that money, right? So we create that value, and then we build that on very simple, very basic building blocks, if that makes sense. So when we wanted to move to a different cloud, everything we needed existed. We could basically build a platform inside Azure that looks exactly like the one we built inside Google through the applications. There is something to be said for the approach of using boring technologies. Of course, there's also the story of, yeah, use boring technologies. Like what? Oh, like Kubernetes is one of the things that people love to say. It's like, oh yes, my opinion on Kubernetes historically has not been great. Basically, I look at it as if you want to cosplay working at Google
Starting point is 00:12:12 that can't pass their technical screen, then Kubernetes is the answer for you. And that's more than a little unfair. And starting early next year, I'm going to be running a production workload myself in Kubernetes just so I can make fun of it with greater accuracy, honestly. But I'm going to be running a production workload myself in Kubernetes just so I can make fun of it with greater accuracy, honestly, but I'm going to learn things as I go. It is sort of the exact opposite of boring. Even my early experiments with it so far have been, I guess we'll call it unsettling as far as some of the non-deterministic behaviors that have emerged and the rest. How did you go about deciding to build on top of Kubernetes in your situation?
Starting point is 00:12:47 Or was it one of those things that just sort of happened to you? Well, we had been building microservice-based products for a long time internal to our agency. So we kind of knew about all the pains of coordinating, orchestrating, scaling those. We want to go with microservices because we're tired of being able to find the problem. We want this to be much more of an exciting murder mystery when something goes down. I've heard that, but I think if you carve up the services the right way,
Starting point is 00:13:12 every service becomes simple. It's just so much easier to develop, to reason about. And I've been involved in so many monoliths before that. And then every refactor is like guts on the table, like month kind of ordeal, super high risk with the microservices. Everything becomes a simple, manageable affair. And you can basically rebuild your whole stack service by service.
Starting point is 00:13:35 And you can do like, it's a realistic thing, because all of them are pretty simple. But it's kind of complicated when they are all running inside instances. There's crosstalk with configuration, like you change the library and everything kind of breaks so docker was obvious like docker that kind of is like isolation being able to have different images but sharing the machine resources was amazing and then of course kubernetes being about orchestrating that made a lot of sense but it's that was also compatible with a few things that we have already discovered because workloads in kubernetes needs to be incredibly boring. We talk about boring stuff.
Starting point is 00:14:07 Like if you, for example, in the beginning, we had services that start up, they do some kind of a sanity check, they validate their environment, and then they go into action. That itself breaks the whole experience because what you want a Kubernetes-based service to do is basically just do one thing all the time
Starting point is 00:14:23 in the same way. Use the same amount of memory, the same amount of resources, and just do that one thing at that rate always. So we broke apart those things. Even the same service runs in different containers, depending on their state. Like this is the state for doing the sanity check. This is the state for serving queries. This is the state for doing mutations. Same service. So there's ways about that. I absolutely adore the whole thing it's saved like i haven't heard about those pains we used to have in the past ever again but also it was an easy choice for me because my single sre at the time said like it was either kubernetes or he'd
Starting point is 00:14:53 quit so it was very simple decision exactly the resume driven development is very much a thing i've not wanted to turn up my nose at that that's functionally what i've done my entire career how long had your product been running in an environment like that before, well, we're going multi-cloud was on the table? So that would be three and a half years, I think. Yeah. And then we started building it out on Azure. That's a sizable period of time in the context of trying to understand how something works. If I built something two months ago and now I have to pick it up and move it somewhere else,
Starting point is 00:15:27 that is generally a much easier task as far as migrations go than if the thing's been sitting there for 10 years. Because whenever you leave something in an environment like that, it tends to grow roots and takes a number of dependencies, both explicit and implicit,
Starting point is 00:15:43 on the environment in which it runs. Like in the early days of AWS, you sort of knew that local disks on the instances were ephemeral, because in the early days, that was the only option you had. So every application had to be written in such a way that it did not presume that there was going to be local disk persistence forever. Docker containers take that a significant step further, where when that container is gone, it's gone. There is no persistent disk there without some extra steps.
Starting point is 00:16:08 And in the early days of Docker, that wasn't really a thing either. Did you discover that you'd taken a bunch of implicit dependencies like that on the original cloud that you were building on? I'm an old school developer. I hail all the way back to C. And in C, you need to be incredibly, incredibly careful with your dependencies because you basically, your whole dependency mapping is happening inside your mind. The language doesn't help you at all. So I'm always thinking about my kind of project as kind of layers of abstraction. If someone talks to Postgres during a request, requests are supposed to be handled in the index.
Starting point is 00:16:42 Then I'm pretty angry. That breaks the whole point. Like the whole point is that this service doesn't need to know about Postgres. So we have been pretty hardcore on like not having any crosstalk, making sure every service just knows about, like we had a clear idea of which services
Starting point is 00:16:57 were allowed to talk to which services. And we were using JVT tokens internally to make sure that authentication and rights management was just handled on the ingress point and then just passed along with requests. So there was no, no one was able to talk to user restores or authentication services. That always all happens on the ingress. So in a sense, it was a very pure kind of layered platform already. And then, like I said, also then built on super boring technologies. So it wasn't really a dramatic thing. The drama was more that we didn't maybe like these other cloud
Starting point is 00:17:30 services that much. But as you grow older in this industry, you kind of realize that you just hate the technologies differently. And some of them you hate a little bit less than others. And that's just how it goes. That's fine. So that was the pain. We didn't have a lot of pain with our own platform because of these things. It's so nice watching people who have been around in the ecosystem for long enough to have made all of the classic mistakes and realized, oh, that's why common wisdom is what common wisdom is. Because generally speaking, that shit works. And you learn it yourself from first principles when you decide poorly in most cases to go and reimplement things like oh dns goes down a lot so we're just going to rsync around an etsy host file on all of our linux servers yeah we tried that collectively back in the 70s it didn't
Starting point is 00:18:15 work so well then either but every once in a while some startup founder feels the need to speed run learning those exact same lessons what i'm picking up from you is a distinct lack of the traditional startup founder vibe of, oh, well, the reason that most people don't do things this way is because most people are idiots. I'm smarter than they are. I know best. I'm getting the exact opposite of that from you, where you seem to wind up wanting to stick to things that are tried and true. And as you said earlier not exciting yeah at least for these kind of mission criticals like so we had a similar platform for our customers that we kind of used internally before we created sanity and when we decided to basically redo the
Starting point is 00:18:57 whole thing but for a kind of a self-serve thing and make it a product i went around the developer team and i just asked them like like, in your experience, what systems that we use are you not thinking about? Like, are you not having any problems with? And like, just to make a list of those. And there was a short list that are pretty well known, and some of them has turned out at the scale we're running now, pretty problematic
Starting point is 00:19:17 still. So it's not like it's all roses. We picked Elasticsearch for some things, and that can be pretty painful. I'm on the market for a better indexing service, for example. And then sometimes, let's talk about some mistakes. Sometimes you... I still am totally on the microservices
Starting point is 00:19:33 train, and if you make sure you design your workloads clearly and have a clear idea about the abstractions and who gets to talk to who, it works. But then, if you make a wrong split... So we had a split between a billing service and a kind of user and resource management service that now keeps talking back and forth all the time.
Starting point is 00:19:52 Like they have to know about each other. And it's as if two services need to know about each other reciprocally, like then you're in trouble. Then those should be the same service, in my opinion. Or you could split it some other way. So this is stuff that we've been struggling with. But you're right.
Starting point is 00:20:07 My last kind of rah-rah thing was Rails and Ruby. And then when I reined off of that, I was like, these technologies just work for me. For example, I use Golang a lot. It's a very ugly language. It's very, very useful. You can't argue against the productivity you have in Go, but also the syntax is kind of ugly. And then I realized, yeah,
Starting point is 00:20:23 I kind of hate everything now, but also I love the productivity of ugly. And then I realized like, yeah, I kind of hate everything now, but also I love the productivity of this. This episode is sponsored in part by our friends at Optics because they believe that many of you are looking to bolster your security posture with CNAP and XDR solutions. They offer both cloud and endpoint security
Starting point is 00:20:40 in a single UI and data model. Listeners can get Optics for up to 1,000 assets through the end of 2023, that is next year, for $1. But this offer is only available for a limited time on OpticsSecretMenu.com. That's U-P-T-Y-C-S SecretMenu.com. There's something to be said
Starting point is 00:21:02 for having been in the industry long enough to watch today's exciting new thing becomes tomorrow's legacy garbage that you've got to maintain and support. And I think after a few cycles of that, you wind up becoming almost cynical and burned out on a lot of things that arise that leaves everyone breathless. I'm generally one of the last adopters of something. I was very slow to get on virtualization. I was a doomsayer on cloud itself for many years. I turned my nose up at Docker. I mostly skipped the whole Kubernetes thing and decided to be early to serverless, which does not seem to be taking
Starting point is 00:21:35 off the way that I wanted it to. So great. It's one of those areas where just having been in the operation side, particularly having to run things and fix them at two in the morning when they invariably break, when some cron job in the middle of the night fires off, because no one will be around then to bother. Yeah, great plan. It really, at least in my case, makes me cynical and tired to the point where I got out of running things in anger. You seem to have gone in a different direction where, oh, you're still going to build and run things. You're just going to do it in ways that are a lot more well understood. I think there's a lot of value to that.
Starting point is 00:22:11 And I don't think that we give enough credit as an industry to people making those decisions. You know, I was big into drum and bass back in the 90s. I just loved that thing. And then it went away and then something came that was called dubstep. It's the same thing. It then it went away and then something came that was called dubstep. It's the same thing. It's just better. It's a better drum and bass. Oh yeah, the part where it goes doof, doof, doof, doof, doof, doof, doof has always been...
Starting point is 00:22:32 Yeah, you call it different things, but the doof, doof, doof, doof, doof music is always there. Yeah. Yeah, yeah, yeah. And I think the thing to recognize, you can either be cynical and say like, you just, you kids are just making the same music we did like 20 years ago, or you can recognize that actually... Kids love being told that. It's their favorite thing. Telling them, oh yeah, back when I was your age, that's how you, that's the signifier of a story that they're going to be riveted to and be really interested in hearing. Exactly. And I don't think like that, because I think you need to recognize that this thing came back and it came back better and stronger. And I think Mark Twain probably didn't say
Starting point is 00:23:05 that history doesn't repeat itself, it rhymes. And it's a similar thing. Right now I have the content with the fact that server-based rendering is coming back as a completely new thing, which was like the thing always. But also it comes back with new abstractions and new ways of thinking about that. And it comes back better with better tooling.
Starting point is 00:23:23 And I think the one thing, if you can take away from that kind of journey, that you can be stronger by not being excited by shiny new things and not being kind of a champion for one specific thing over every other thing. You can just kind of see the utility of that. And when things come back and they pretend to be new, You can see both the kind of tradition of it and maybe see it clearer than most other people. But also, like you said, don't bore the kids because also you should see how it is new, how it is solving new things and how these kids coming back
Starting point is 00:23:57 with the same old thing as a new thing, they saw it differently. They framed it slightly differently and we are better for it. There's so much in this industry that we take from others. We all stand on the shoulders of giants. And I think that that is something that is part of what makes this industry so fantastic in different ways. Some of the original computer scientists who built some of the things that everyone takes for granted these days are still
Starting point is 00:24:21 alive. It's not like the world of physics, for example, where some of the greats wound up discovering these things hundreds of years ago. It's, no, this all evolved within living memory. That means that we can talk to people and we can humanize them on some level. It's not some lofty great sitting around and who knows what they would have wanted or how they would have intended this. No, you have people who helped build the TCP stack stand up and say, oh yeah, that was a dumb. We did a dumb. We should not have done it that way. Oh, great. It's a constant humbling experience watching people evolve things. You mentioned that Go was a really neat language. Back when I wound up failing out of school, before I did that, I took a few classes in C and it was challenging and obnoxious, about like you would expect.
Starting point is 00:25:10 And at the beginning of this year, I did a deep dive into learning Go over the course of a couple days, enough to build a binary that winds up controlling my internet camera in my home office. And I learned an awful lot on how to do things and got a lot of things wrong, and it was a really fun language. It was harder to do a lot of the ill-considered things that get people into trouble with C. The idea that people are getting nice things and in a way that we didn't have them back when we were building things the first time around is great. If you're listening to this, it is imperative. Listen to me. It is imperative. Do not email me about Rust. I don't want to hear it. But I love the fact that our tools are now stuff that we can use in sensible ways.
Starting point is 00:25:56 These days, as you look at using sensible tools, which in this iteration, I will absolutely say that using a hyperscale public cloud provider is the right move. That's the way to go. Do you find that given that you started over hanging out on Google Cloud and now you're running workloads everywhere, do you have an affinity for one as your primary cloud or does everything you've built wind up seamlessly flowing back and forth? So of course we have a management interface that our end users kind of use to manage their, it has to be, at least has to have a home somewhere, even though the data can be replicated everywhere. So that's in Google Cloud, because that's where we started. And also it's, I think GCP is what our team likes the most.
Starting point is 00:26:32 They think it's the most solid platform. But its developer experience is far and away the best of all the major cloud providers, bar none. I've been saying that for a while. When I first started using it, I thought I was going to just be making fun of it. But this is actually really good, was my initial impression. And that impression has never faded. Yeah.
Starting point is 00:26:48 No, it's like it's terrible as well, but it's the least terrible platform of them all. But I think we would not make any decisions based on that. As long as it's solid, as long as it's stable, and as long as kind of prices is reasonable and business practices is kind of sound, we would work with any provider. And hopefully we would also work with less, let's call it less famous, more niche providers in the future to provide, let's say, specific organizations that need very, very specific policies or practices.
Starting point is 00:27:18 We would be happy to support that. I want to go there in the future. And that might require some exotic integrations and ways of building things. A multi-cloud story that I used to go there in the future. And that might require some exotic integrations and ways of building things. A multi-cloud story that I used to tell in the broader sense, use PagerDuty as an example, because that is the service
Starting point is 00:27:33 that does one thing really well, and that is wake you up when something sends the right kind of alert. And they have multiple cloud providers, historically, that they use. And the story that came out of it was, yeah, as I did some more digging into what they'd done and how they talked about this, it's clear that the thing that wakes you up in the middle of the night absolutely has to work
Starting point is 00:27:54 across a whole bunch of different providers. Because if it's on one, what happens when that's the one that goes down? We learned that when AWS took an outage in 2011 or 2012 and pager duty went down as a result of that. So the thing that wakes you up absolutely lives in a bunch of different places and a bunch of different providers. But their marketing site doesn't have to. Their user control panel doesn't have to. cloud that is sufficiently gruesome enough, okay, they can have a degraded mode where you're not able to update and set up new alerts and add new users into your account because everything's on fire in those moments anyway. That's an acceptable trade-off. But the thing that wakes you up absolutely must work all the time. So it's the idea of this workload has got to live in a bunch
Starting point is 00:28:42 of places, but not every workload looks like that. As you look across the various services and things you have built that comprise a company, do you find that you're biasing for running most things in a single provider? Or do you take that default everywhere approach? No, I think that to us it is. And we're not. That's something we haven't the work we haven't done yet. But architecturally, it will work fine. Because as long as we serve queries,
Starting point is 00:29:08 like we have two components, like people write stuff, they create new content, and that needs to be up as much as possible. But of course, when that goes down, if we still serve queries, their properties are still up, right? Their websites or whatever is still serving content. So if we were to make things kind of cross-cloud redundant, it would be the CDN, the indexes and the varnish caches, and have those be redundant.
Starting point is 00:29:31 But it is a challenge in terms of how you do routing. And let's say the routing provider is down. How do you deal with that? It's been a number of DNS outages, and I would love to figure out how to get around that. Right now, people would have to manually kind of change their, we have backup ingress points, but yeah, that's a challenge. One of the areas where people get into trouble with multi-cloud as well, that I've found,
Starting point is 00:29:54 has been that people do it with the idea of getting rid of single points of failure, which makes a lot of sense. But in practice, what so many of them have done is inadvertently added multiple points of failure, all of which are single-tracked. So, okay, now we're across two cloud providers, so we get exposure to everyone's outages, is how that winds up looking. I've seen companies that have been intentionally avoiding AWS because, great, when they go down and the internet breaks,
Starting point is 00:30:20 we still want our store to be up. Great. But they take a dependency on Stripe, who's primarily an AWS. So depending on the outage, people may very well not be able to check out of their store. So what did they gain by going to another provider? Because now when that provider goes down, their site is down then too. Yeah. It's interesting that anything works at all, actually. We're seeing how intertwined everything is. But I think that is, to me, the amazing part. Like you said, someone's marketing site doesn't have to be multi-cloud or maybe sometimes it does. And I find it interesting that in the serverless space, even if we provide a very, like we have super advanced engineers and we do complex orchestration, our cloud services, but we don't run anything else, right? Like all of our kind of web properties is run with highly integrated, basically on Vercel mostly, right?
Starting point is 00:31:08 Like we don't, we don't want to know about, like, we don't even know which cloud that's running on, right? And I think that's how it should be because most things, like you said, most things are best outsourced to another company and have them worry, like have them worry when things are going down. And that's how I feel about these things that, yes, you cannot be totally protected, but at least you can outsource some of them worry. Like, have them worry when things are going down. And that's how I feel about these things. That, yes, you cannot be totally protected, but at least you can outsource some of that worry
Starting point is 00:31:29 to someone who really knows. Like, if Stripe goes down, most people don't have that resources to worry at the level that Stripe would worry, right? So at least you have that. Exactly, yeah. Because ignore the underlying cloud provider stuff. They do a lot of things
Starting point is 00:31:43 I don't want to have to become an expert in. Effectively, you wind up getting your payment boundary through them. You don't have to worry about PCI yourself at all. You can hand it all off to them. That's value. Like the infrastructure stuff is just table stakes compared to a lot of the higher up the stack value that companies in that position enjoy. Yeah, I'm not sitting here saying don't use Stripe. I want to be very clear on that. No, no, no, No, I got you. I got you. I just remember, look, so we talked about me hailing all the way back to C. I also hail all the way back to having your own servers in a kind of place somewhere that you had to drive to to replace a SCSI card because one hard drive was down.
Starting point is 00:32:16 Or like, oh, you had to scale up and now you have to buy five servers. You have to set them up and drive them to the, put them into the slots. And like, yes, you could fix any problem yourself. Perfect. But also you had to fix every problem yourself. I'm so happy to be able to pay Google or AVS or Azure to have that worry for me, to have that kind of redundancy on hand. And
Starting point is 00:32:36 clearly we are down less time now that we have less control, if that makes sense. I really want to thank you for being so generous with your time. If people want to learn more, where's the best place really want to thank you for being so generous with your time. If people want to learn more, where's the best place for them to find you? So I'm at Svale on Twitter. So my DMs are open.
Starting point is 00:32:55 And also we have a Slack community for Sanity. So if you want to kind of engage with Sanity, you can join our Slack community and I'll be on there as well. And you'll find it in the footer on all of the Sanity.io webpages. And we will put links to that in the show notes. Perfect. Thank you so much for being so generous with your time. I really appreciate it. Thank you.
Starting point is 00:33:12 This was fun. Simon Sveil, CTO and co-founder at Sanity. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment. And make sure you put that insulting comment on all of the different podcast platforms that are out there, because you have to run everything on every cloud provider. If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duck Bill Group. We help companies fix their AWS bill by making it smaller and less horrifying.
Starting point is 00:33:59 The Duck Bill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started. This has been a humble pod production stay humble
