Screaming in the Cloud - The Art and Science of Database Innovation with Andi Gutmans

Starting point is 00:00:00 Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run.

Starting point is 00:00:39 They believe, as do I, that DevOps and security are inextricably linked. If you want to learn more about how they view this, check out their blog. It's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit sysdig.com and tell them that I sent you. That's S-Y-S-D-I-G dot com. And my thanks to them for their continued support of this ridiculous nonsense. Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted episode is

Starting point is 00:01:15 brought to us by our friends at Google Cloud. And in so doing, they have gotten a guest to appear on this show that I have been low-key trying to get here for a number of years. Andi Gutmans is VP and GM of Databases at Google Cloud. Andi, thank you for joining me. Corey, thanks so much for having me. I have to begin with the obvious. Given that one of my personal passion projects is misusing every cloud service I possibly can as a database, where do you start and where do you stop as far as saying, yes, that's a database, so it [...] third-party databases, such as MySQL, Postgres, SQL Server, and then also the cloud-first databases, such as Spanner, Bigtable, Firestore, and AlloyDB. So I suggest that's where you start because those are all awesome services. And then what doesn't fall underneath kind of that purview are things like BigQuery, which is an analytics data warehouse and other analytics engines.

Starting point is 00:02:29 And of course, there's always folks who bring in their favorite, maybe lesser of things. Where does it start? Where does it stop? It's not at all clear from the outside. I guess something of a legendary figure, which I know is always a weird thing for people to hear, but you were partially, at least, responsible for the Zend Framework in the PHP world, which I didn't realize what the heck that was, despite supporting it in production at a couple of jobs, until after I, for better or worse, was no longer trusted to support production environments anymore, which honestly, if you can get out, I'm a big proponent of doing that. You sleep so much better without a pager. How did you go from programming languages all the way on over to databases? It just seems like a very odd mix. Yeah, no, that's a great question. So I was one of the core developers of PHP, and I've been in the PHP community for quite some time. I also helped ideate the Zend Framework, which was the company that I co-founded.

Starting point is 00:03:50 Zend Technologies was kind of the company behind PHP. So like Red Hat supports Linux commercially, we supported PHP. And I was very much focused on developers, programming languages, frameworks, IDEs. And that was, you know, really exciting. I had also done, you know, quite a bit of work on interoperability with databases, right? Because behind every application, there's a database. And so a lot of what we focused on is like great connectivity to MySQL, to Postgres, to other databases. And I got to kind of learn the database world from the outside, from the application

Starting point is 00:04:25 builders. And we sold our company in, I think it was 2015. And so I had to kind of figure out what's next. And so one option would have been, hey, stay in programming languages. But what I learned over the many years that I worked with application developers is that there's a huge amount of value in data. And frankly, I'm a very curious person. I always like to learn. So there was this opportunity to join Amazon, to join the non-relational database side and take myself completely out of my comfort zone. And actually, I joined AWS to help build the graph database, Amazon Neptune, which was even more out of my comfort zone than even probably a

Starting point is 00:05:05 relational database. So I kind of like to do different things. And so I joined and I had to learn, you know, how to build a database pretty much from the ground up. I mean, of course, I didn't do the coding, but I had to learn enough to be dangerous. And so I worked on a bunch of non-relational databases there, such as, you know, Neptune, Redis, Elasticsearch, DynamoDB Accelerator. And then there was the opportunity for me to actually move over from non-relational databases to analytics, which was another way to get myself out of my comfort zone. And so I moved to run the analytics space, which included services like Redshift, like EMR, Athena, you name it. So that was just a great experience for me where I got to work with a lot of awesome

Starting point is 00:05:52 people and learn a lot. And then the opportunity arose to join Google and actually run the Google transactional databases, including all their relational databases. And by the way, my job actually has two jobs. One job is running Spanner and Bigtable for Google itself, meaning Search, Ads, and YouTube and everything runs on these databases. And then the second job is actually running the external-facing databases for external customers. How alike are those two? Is it effectively the exact same thing, just with different API endpoints?

Starting point is 00:06:25 Are they two completely separate universes? It's always unclear from the outside when looking at large companies that effectively eat versions of their own dog food where their internal usage of these things starts and stops. So great question. So Cloud Spanner and Cloud Bigtable do actually use the internal Spanner and Bigtable.

Starting point is 00:06:43 So at the core, it's exactly the same engine, the same runtime, same storage and everything. However, kind of internally, the way we built the database APIs was kind of good for scrappy Google engineers and folks who kind of are okay learning how to fit into the Google ecosystem. But when we needed to make this work

Starting point is 00:07:04 for enterprise customers, we needed cleaner APIs. We needed authentication that was external, right? And so on and so forth. So think about we had to add an additional set of APIs on top of it and management, right? To really make these engines accessible to the external world. So it's running the same engine under the hood, but it is a different set of APIs. And a big part of our focus is continuing to expose to enterprise customers all the goodness that we have on the internal system.

Starting point is 00:07:36 So it's really about taking these very, very unique, differentiated databases and democratizing access to them to anyone who wants to. I'm curious to get your position on the idea that seems to be playing its, I guess, a battle that's been playing itself out in a number of different customer conversations. And that is, I guess, the theoretical decision between do we go towards general purpose databases and more or less treat every problem as a nail in search of a hammer, or do you decide that every workload gets its own custom database that aligns the best with that particular workload? There are trade-offs in

Starting point is 00:08:19 either direction, but I'm curious where you land on that, given that you tend to see a lot more of it than I do. Yeah, no, that's a great question. And, you know, just for the viewers who maybe aren't aware, there's kind of two extreme points of view, right? There's one point of view that says purpose-built for everything, like every specific pattern, like build bespoke databases. It's kind of a best-of-breed approach. The problem with that approach is it becomes extremely complex for customers, right? Extremely complex to decide what to use.

Starting point is 00:08:51 They might need to use multiple for the same application. And so that can be a bit daunting as a customer. And frankly, there's kind of a law of diminishing returns at some point. Absolutely. I don't know what the DBA role of the future is, but I don't think anyone really wants it to be. Oh yeah, we're deciding which one of these three dozen managed database services is the exact right fit for each and every individual workload. I mean, at some point it feels like certain cloud providers believe that not only every workload should have its own database, but almost every workload should have its own database service. It's at some point you're allowed to say no and stop building these completely, what feel like to me,

Starting point is 00:09:28 Byzantine esoteric database engines that don't seem to have broad applicability to a whole lot of problems. Exactly, exactly. And by the way, the other extreme is what folks often talk about as multi-model where you say like, hey, I'm going to have a single storage engine

Starting point is 00:09:42 and then map onto that the relational model, the document model, the graph model, and so on. I think what we tend to see is if you go too generic, you also start having performance issues. You may not be getting the right level of abilities and trade-offs around consistency and replication and so on. So I would say Google, like we're taking a very pragmatic approach where we're saying, you know what, we're not going to solve all of customer problems with a single database, but we're also not going to have two dozen, right?

Starting point is 00:10:14 So we're basically saying, hey, let's understand the main characteristics of the workloads that our customers need to address, build the best services around those. You know, obviously over time, we continue to enhance what we have to fit additional models. And then frankly, we have a really awesome partner ecosystem on Google Cloud, where if

Starting point is 00:10:37 someone really wants a very specialized database, you know, we also have great partners that they can use on Google Cloud and get great support and get the rest of the benefits of the platform. I'm very curious to get your take on a pattern that I've seen alluded to by basically every vendor out there, except the couple of very obvious ones for whom it does not serve their particular vested interests, which is that there's a recurring narrative that customers are demanding open-source databases

Starting point is 00:11:11 for their workloads. And when you hear that, at least people who came up the way that I did, spending entirely too much time on Freenode, back when that was not a deeply problematic statement in and of itself, where, yes, we're open-source, I guess zealots is probably the best terminology. And yeah,

Starting point is 00:11:36 businesses are demanding to participate in the open source ecosystem. Here in reality, what I see is not ideological purity or anything like that. It is much more to do with, yeah, we don't like having a single commercial vendor for our databases that basically plays the insert quarter to continue dance whenever we're trying to wind up doing something new. We want the ability to not have licensing constraints around when, where, how, and how quickly we can run databases. That's what I hear when customers are actually talking about open source versus proprietary databases. Is that what you see, or do you think that plays out differently? Because let's be clear, you do have a number of database services that you offer that are not open source, but are also absolutely not tied to weird licensing restrictions either.

Starting point is 00:12:21 That's a great question. And I think for years now, customers have been in a difficult spot because the legacy proprietary database vendors knew how sticky the database is. And so as a result, the prices often went up and it was not easy for customers to manage costs and agility and so on. But I would say that's always been somewhat of a concern. I think what I'm seeing changing and happening differently now is as customers are moving into the cloud and they want to run hybrid cloud, they want to run multi-cloud, they need to prove to their regulator

Starting point is 00:12:57 that they can do a stressed exit, right? Open source is not just about reducing cost. It's really about flexibility and kind of being in control of when and where you can run the workload. So I think what we're really seeing now is a significant surge of customers who are trying to get off legacy proprietary database

Starting point is 00:13:14 and really kind of move to open APIs, right? Because they need that freedom and that freedom is far more important to them than even the cost element. And what's really interesting is, you know, a lot of these are the decision makers in these enterprises, not just the technical folks. Like to your point, it's not just open source advocates, right?

Starting point is 00:13:35 It's really the business people who understand they need that flexibility. And by the way, even the regulators are asking them to show that they can flexibly move their workloads as they need to. So we're seeing a huge interest there. And as you said, like some of our services, you know, are open-source-based services, and some of them are not. Like, take Spanner as an example: it is heavily tied to how we build our infrastructure and how we build our systems. Like, I would say it's almost impossible to open source Spanner. But what we've done is we've basically embraced open APIs and made sure if a customer uses these systems, we're giving them control of when and where they want to run their workloads. So, for example, Bigtable has an HBase API.

Starting point is 00:14:18 Spanner now has a Postgres interface. So our goal is really to give customers as much flexibility and also not lock them into Google Cloud. We want them to be able to move out of Google Cloud so they have control of their destiny. I'm curious to know what you see happening in the real world, because I can sit here and come up with a bunch of very well thought out logical reasons to go towards or away from certain patterns. But I spent years building things myself. I know how it works. You grab the closest thing handy and throw it in. And we all know that there is nothing so permanent as a temporary fix. Like that thing's load bearing and you'll retire with that thing still in place.

Starting point is 00:15:01 In the idealized world, I don't think that I would want to take a dependency on something like, easy example, Spanner or AlloyDB. Because despite the fact that they have post-grasqueal, yes, that's how I pronounce it, compatibility, the capabilities of what they're able to do under the hood far exceed and outstrip whatever you're going to be able to build yourself or get anywhere else. So there's a data flow architectural dependency lock-in, despite the fact that it is, at least on its face,

Starting point is 00:15:34 Postgres compatible. Counterpoint, does that actually matter to customers in what you are seeing? I think it's a great question. I'll give you a couple of data points. I mean, first of all, even if you take a complete open source product, right, running that in different clouds, different on-premises environments and so on, fundamentally, you will have some differences in performance characteristics, availability characteristics, and so on. So the truth is, even if you use open source, right, you're not going to get 100% of the same characteristics where you run that. But that said, you still have the freedom of movement, and with, I would say, not a huge amount of engineering investment, right,

Starting point is 00:16:15 you're going to make sure you can run that workload elsewhere. I kind of think of Spanner in a similar way, where, yes, I mean, you're getting all those benefits of Spanner that you can't get anywhere else, like unlimited scale, global consistency, right? No maintenance downtime, five nines availability, like you can't really get that anywhere else. That said, not every application necessarily needs it. And you still have that option, right? That if you need to, or want to, or we're not giving you a reasonable price or reasonable price performance, or we're starting to neglect you as a customer, which of course we wouldn't, but let's just say hypothetically that, you know, that could happen, that you still

Starting point is 00:16:56 have a way to basically go and run this elsewhere. Now, I'd also want to talk about some of the upside something like Spanner gives you. Because you talked about you want to be able to just grab a few things, build something quickly, and then you don't want to be stuck. The counterpoint to that is with Spanner, you can start really, really small. And then let's say you're a gaming studio. You're building 10 titles, hoping that one of them is going to take off. So you can build 10 of those with very minimal spend on Spanner. And if one takes off overnight, it's really the only database where you don't have to go and re-architect the application. It's going to scale as big as you need it to.

Starting point is 00:17:36 And so it does enable a lot of this innovation and a lot of cost management as you try to get to that overnight success. Yeah, overnight success. I always love that approach. It's one of those, yeah, it became an overnight success after only 10 short years. It becomes this idea, people believe it's in fits and starts, but then you see, I guess on some level, the other side of it, where it's a lot of showing up and doing the work. I have to confess, I didn't do a whole lot of admin work in my production years that touched databases because I have an aura and I'm unlucky. And it turns out that when you blow away some web servers, everyone can laugh and we'll reprovision the stateless things. Get too close to the data warehouse, for example, and you don't really have a company left anymore.

Starting point is 00:18:23 And of course, in the world of finance that I came out of, transactional integrity is also very much a thing. A question that I had centers really around one of the predictions you gave recently at Google Cloud Next, which is that your prediction for the future is that transactional and analytical workloads from a database perspective will converge. What's that based on? You know, I think we're really moving from a world where customers are trying to make real-time decisions, right?

Starting point is 00:18:58 If there's model drift from an AI and ML perspective, they want to be able to retrain their models as quickly as possible. So everything is moving fast, moving into streaming. And I think what you're starting to see is, you know, customers don't have that time to wait for analyzing their transactional data. Like in the past, you do a batch job, you know, once a day or once an hour, you know, move the data from your transactional system to analytical system. But that's just not how these always-on businesses run anymore. And they want to have those real-time insights. So I do think that

Starting point is 00:19:30 what you're going to see is transactional systems more and more building in analytical capabilities, analytical systems building in more transactional, and then ultimately cloud platform providers like us helping fill that gap and really making data movement seamless across transactional, analytical, and even AI and ML workloads. And so that's an area that I think is a big opportunity. I also think that Google is best positioned to solve that problem. Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it?

Starting point is 00:20:15 With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH. Basically you're SSH-ing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to re-authenticate for that extra bit of security. Sounds expensive?

Starting point is 00:20:45 Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud slash tailscale. Again, that's snark.cloud slash tailscale. On some level, I've found that, at least my own work, that once I wind up using a database for something, I'm inclined to try and stuff as many other things into that database as I possibly can. Just because getting a whole second data store, taking a dependency on it for any given workload, tends to be a little bit on the, I guess, challenging side. Easy example of this. I've talked about it previously in various places, but I was talking to one of your colleagues,

Starting point is 00:21:28 Sarah Ellis, who wound up at one point making a joke that I, of course, took way too far. Long story short, I built a Twitter bot on top of Google Cloud Functions that every time the Azure brand account tweets, it simply quote tweets that, translates their tweet into all caps, and then puts a boomer-style statement in front of it if there's room. This account is CloudBoomer.

Starting point is 00:21:51 Now, the hard part that I had while doing this is everything's stateless, works super well. Where do I wind up storing the ID of the last tweet that it saw on its previous run? And I was fourth and inches from just saying, well, I'm already using Twitter, so why don't we use Twitter as a database? Because everything's a database if you're either good enough or bad enough at programming. And instead, I decided, okay, we'll try this Firebase thing first. And I don't know if it's Firestore or Datastore or whatever it's called these days, but once I wrap my head around it, incredibly effective, very fast to get up and running, and I feel like I made at least a good decision

Starting point is 00:22:28 for once in my life involving something touching databases. But it's hard. I feel like I'm consistently drawn toward the thing I'm already using as a default database. I can't shake the feeling that that's the wrong direction. I don't think it's necessarily wrong. I mean, I think, you know, with Firebase and Firestore, that combination, it's just extremely easy and quick to build awesome mobile applications.
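The state problem Corey describes, remembering the ID of the last tweet seen between otherwise stateless runs, can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a local SQLite file as a stand-in for the managed data store (it is not the Firestore API), and the table name `bot_state`, the key `last_tweet_id`, the tweet dictionaries, and all helper function names are hypothetical, not taken from the episode.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a tiny key-value table that holds the bot's state."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bot_state ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def get_last_seen(conn):
    """Return the last tweet ID processed, or None on the very first run."""
    row = conn.execute(
        "SELECT value FROM bot_state WHERE key = 'last_tweet_id'"
    ).fetchone()
    return int(row[0]) if row else None


def set_last_seen(conn, tweet_id):
    """Persist the newest processed tweet ID so the next run can skip it."""
    conn.execute(
        "INSERT OR REPLACE INTO bot_state (key, value) VALUES (?, ?)",
        ("last_tweet_id", str(tweet_id)),
    )
    conn.commit()


def new_tweets(conn, tweets):
    """Filter to tweets newer than the stored ID, then advance the stored ID."""
    last = get_last_seen(conn)
    fresh = [t for t in tweets if last is None or t["id"] > last]
    if fresh:
        set_last_seen(conn, max(t["id"] for t in fresh))
    return fresh
```

Each invocation of the function would call `new_tweets` with whatever the timeline fetch returned; only the single stored ID has to survive between runs, which is why a small document or key-value store is enough for this kind of workload.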

Starting point is 00:22:51 And actually, you can build mobile applications without a middle tier, which is probably what attracted you to that. So we just see, you know, a huge amount of developers and applications. We have over 4 million databases in Firestore with just developers building these applications, especially mobile-first applications. So I think if you can get your job done and get it done effectively, absolutely stick to it. And by the way, one thing a lot of people don't know about Firestore is it's actually running on Spanner infrastructure. So Firestore has the same five nines availability, no maintenance downtime and so on that Spanner has, and the same kind of ability to scale. So it's not just that it's quick.

Starting point is 00:23:32 It will actually scale as much as you need it to and be as available as you need it to. So that's on that piece. I think, though, to the same point, you know, there's other databases that we're then trying to make sure kind of also extend their usage beyond what they've traditionally done. So, you know, for example, we announced AlloyDB, which I kind of call a Postgres on steroids. We added analytical capabilities to this transactional database so that as customers do have more data in their transactional database, as opposed to having to go somewhere else to analyze it, they can actually do real-time analytics within that same database. And it can actually do up to 100 times faster analytics than open source Postgres.

Starting point is 00:24:14 So I would say both Firestore and AlloyDB are kind of good examples of if it works for you, right, we'll also continue to make investments. So the amount of use cases you can use these databases for continues to expand over time. One of the weird things that I noticed just looking around this entire ecosystem of databases, and you've been in this space long enough to presumably have seen the same type of evolution. Back when I was transiting between different companies a fair bit, sometimes because I was consulting and other times because I'm one of the greatest in the world at getting

Starting point is 00:24:49 myself fired from jobs based upon my personality, I found that the default standard was always, oh, whatever the database is going to be, it started off as MySQL and then eventually pivots into something else when that starts falling down. These days, I can't shake the feeling that almost everywhere I look, Postgres is the answer instead. What changed? What did I miss in the ecosystem that's driving that renaissance, for lack of a better term? That's a great question. And, you know, I've been involved in, I'm going to date myself a bit, but in PHP since 1997, pretty much. And one of the things we kind of did is we built a really good connector to MySQL. And, you know, I don't know if you remember before MySQL, there was mSQL.

Starting point is 00:25:37 So the MySQL API actually came from mSQL. And we bundled the MySQL driver with PHP. And so kind of that LAMP stack really took off. And kind of to your point, you know, the default in the web, right, was like, you're going to start with MySQL because it was super easy to use, just fun to use. By the way, I actually wrote, co-authored the tab completion in the MySQL client. So like a lot of these kind of, you know, fun, simple ways of using MySQL were there. And frankly, it was super fast, right? And so kind of those fast reads and everything, it just was great for web and for content. And at the time, Postgres kind of came across more

Starting point is 00:26:17 like a science project. Like the folks who are using Postgres were kind of the outliers, right? You know, the less pragmatic folks. I think what's changed over the past, how many years has it been now? 25 years? I'm definitely dating myself, is a few things. One, MySQL is still awesome, but it didn't kind of go in the direction of really kind of trying to catch up with the legacy proprietary databases on features and functions. Part of that may just be that from a roadmap perspective, that's not where the owner wanted it to go. So MySQL today is still great, but it didn't go into that direction. In parallel, customers wanted to move more to open source.

Starting point is 00:27:00 And so what they found is the thing that actually looks and smells more like the legacy proprietary databases is actually Postgres. Plus, you saw an increase of investment in the Postgres ecosystem, also very liberal license. So you have lots of other databases, including commercial ones that have been built off the Postgres core. And so I think you are today in a place where for mainstream enterprise, Postgres is it, because that is the thing that has all the features that the enterprise customer is used to. MySQL is still very popular, especially in like content and web and mobile applications. But I would say that Postgres has really become kind of that de facto standard API that's replacing the legacy

Starting point is 00:27:46 proprietary databases. I've been on the record way too much as saying with some justification that the best database in the world that should be used for everything is Route 53, specifically TXT records. It's a key value store. And then anyone who's deep enough into DNS or databases generally gets a slightly greenish tinge and feels ill. That is my simultaneous best and worst database. I'm curious as to what your most controversial opinion is about the worst database in the world that you've ever seen. This is the worst database?

Starting point is 00:28:22 Yeah, what is the worst database that you've ever seen? I know on some level, since you manage all things database, I'm asking you to pick your least favorite child. But here we are. Oh, that's a really good question. I would say probably the worst database, double quotes, is just the file system, right? When folks are basically using the file system

Starting point is 00:28:45 as really a database. And that can work for really simple apps, but as apps get more complicated, that's not going to work. So I've definitely seen some of that. I would say the most awesome database that is also file system-based, kind of embedded,

Starting point is 00:29:01 I think was actually SQLite. And SQLite is actually still very, very popular. I think it's, I think it sits on every mobile device pretty much on the planet. So I actually think it's awesome, but it's, you know, it's not, it's not a database server. It's kind of an embedded database, but it's something that I, you know, I've always been pretty excited about. And, you know, there's definitely kind of new, interesting databases emerging that are also embedded, like DuckDB is quite interesting. You know, it's kind of the SQLite for analytics. We've been using it for a few things around bill analysis ourselves. It's impressive. I've also got to say, people think that we had something

Starting point is 00:29:40 to do with it because we're the Duckbill group and it's DuckDB. Have you done anything with this? And the answer is always, would you trust me with a database? I didn't think so. So no, just a weird coincidence. But I like that a lot. It's also counterintuitive from where I sit because I'm old enough to remember when Microsoft was teasing the idea of WinFS, where they teased a future file system that fundamentally was a database. I believe it's an index or journal for all of that.

Starting point is 00:30:07 And I don't believe anything ever came of it. But that felt like a really weird alternate world we could have lived in. Yeah, that's a good point. And by the way, if I actually take a step back, and I kind of half-jokingly said file system, and obviously all the popular databases persist on the file system. But if you look at what's different in cloud-first databases, right? Like if you look at legacy proprietary databases, the typical setup is write to the local disk

Starting point is 00:30:36 and then do asynchronous replication with some kind of bound replication like to somewhere else, to a different region or so on. If you actually start to look at what do cloud-first databases look like, they actually write the data in multiple data centers at the same time. And so kind of joke aside, as you start to think about, hey, how do I build the next generation of applications? And how do I really make sure I get the resiliency and the durability that the cloud can offer? It really does take a new

Starting point is 00:31:06 architecture. And so that's where things like Spanner and Bigtable and AlloyDB, databases that are truly architected for the cloud, come in. That's where they actually think very differently about durability and replication and what it really takes to provide the highest level of availability and durability. On some level, I think one of the key things for me to realize was that in my own experiments, whenever I wind up doing something that is either for fun or I just want to see how it works and what's possible, the scale of what I'm building is always inherently a toy problem. It's like the old line that, oh yeah, if it fits in RAM, you don't have a big data problem. And then I'm looking at things these days that are having most of a petabyte's worth of RAM sometimes.

Starting point is 00:31:55 It's okay. That definition continues to extend and get ridiculous. But I still find that most of what I do in a database context can be done with almost any database. There's no reason for me not to, for example, use a SQLite file or to use an object store, just because there's a little latency, but whatever, or even a text file on disk. The challenge I find is that as you start scaling and growing these things, you start to run into limitations left and right. And only then it's one of those, oh, I should have made different choices or I should have built in abstractions. But so many of these things come to nothing. It just feels like extra work. What guidance do you have for people who are trying to figure out how much effort to put in upfront when they're just more or less puttering

Starting point is 00:32:39 around to see what comes out of it? You know, we like to think about ourselves at Google Cloud as really having a unique value proposition that really helps you future-proof your development. You know, if I look at both Spanner and I look at BigQuery, you can actually start at a very, very low cost. And frankly, not every application has to scale. So you can start at a low cost, you can have a small application, but everyone wants two things. One is availability, because you don't want your application to be down. And number two is if you have to scale, you want to be able to without having to rewrite your application.

Starting point is 00:33:18 And so I think this is where we have a very unique value proposition, both in how we built Spanner and then also how we build BigQuery, is that you can actually start small. And for example, on Spanner, you can go from one-tenth of what we call an instance, like a small instance that is under $65 a month, you can go to a petabyte scale OLTP environment with thousands of instances in Spanner with zero downtime. And so I think that is really the unique value proposition. We're basically saying you can hold the stick at both ends. You can basically start small,

Starting point is 00:33:58 and then if that application does need to scale, does need to grow, you're not reengineering your application, and you're not taking any downtime for reprovisioning. So I think that's, if I had to give folks kind of advice, I say, look, what's done is done. You have workloads on MySQL, Postgres, and so on. That's great. They're awesome databases. Keep on using them. But if you're truly building a new app and you're hoping that that app is going to be successful at some point, whether it's, like you said, all overnight successes take at least 10 years. At least if you've built it on something like Spanner, you don't actually have to think about that anymore or worry about it, right? It will

Starting point is 00:34:32 scale when you need it to scale, and you're not going to have to take any downtime for it to scale. So that's why we see a lot of these industries that have these potential spikes like gaming, retail, also some use cases in financial services, they basically gravitate towards these databases. I really want to thank you for taking so much time out of your day to talk with me about databases and your perspective on them, especially given my profound level of ignorance around so many of them. If people want to learn more about how you view these things,

Starting point is 00:35:03 where's the best place to find you? Follow me on LinkedIn. I tend to post quite a bit on LinkedIn. I still post a bit on Twitter, but frankly, I've moved more of my activity to LinkedIn now. I find it's a... That is such a good decision. I envy you. It's a more curated audience and so on. And then also, you know, we just had Google Cloud Next. I recorded a session there that kind of talks about database and just some of the things that are new in database land at Google Cloud. So that's another thing that if folks are more interested to get more information, that may be something that could be appealing to you. We will, of course, put links to all of this in the show notes.

Starting point is 00:35:43 Thank you so much for your time. I really appreciate it. Great. Corey, thanks so much for having me. Andi Gutmans, VP and GM of Databases at Google Cloud. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice.

Starting point is 00:36:03 Whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment. Then I'm going to collect all of those angry, insulting comments and use them as a database. If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill

Starting point is 00:36:33 Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started. This has been a HumblePod production. Stay humble.
