Screaming in the Cloud - The Art and Science of Database Innovation with Andi Gutmans
Episode Date: November 23, 2022
About Andi
Andi Gutmans is the General Manager and Vice President for Databases at Google. Andi’s focus is on building, managing and scaling the most innovative database services to deliver ...the industry’s leading data platform for businesses. Before joining Google, Andi was VP Analytics at AWS running services such as Amazon Redshift. Before his tenure at AWS, Andi served as CEO and co-founder of Zend Technologies, the commercial backer of open-source PHP. Andi has over 20 years of experience as an open source contributor and leader. He co-authored open source PHP. He is an emeritus member of the Apache Software Foundation and served on the Eclipse Foundation’s board of directors. He holds a bachelor’s degree in Computer Science from the Technion, Israel Institute of Technology.
Links Referenced:
LinkedIn: https://www.linkedin.com/in/andigutmans/
Twitter: https://twitter.com/andigutmans
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the
Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
This episode is sponsored in part by our friends at Sysdig.
Sysdig secures your cloud from source to run.
They believe, as do I, that DevOps and security are inextricably linked.
If you want to learn more about how they view this, check out their blog.
It's definitely worth the read.
To learn more about how they are absolutely getting it right from where I sit,
visit sysdig.com and tell them that I sent you.
That's S-Y-S-D-I-G dot com.
And my thanks to them for their continued support of this
ridiculous nonsense.
Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted episode is
brought to us by our friends at Google Cloud. And in so doing, they have gotten a guest to appear
on this show that I have been low-key trying to get here for a number of
years. Andi Gutmans is VP and GM of Databases at Google Cloud. Andi, thank you for joining me.
Corey, thanks so much for having me.
I have to begin with the obvious. Given that one of my personal passion projects is misusing every cloud service I possibly can as a database, where do you start and where do you stop as far as saying, yes, that's a database, so it...
...party databases, such as MySQL, Postgres,
SQL Server, and then also the cloud-first databases, such as Spanner, Bigtable, Firestore,
and AlloyDB. So I suggest that's where you start because those are all awesome services.
And then what doesn't fall underneath kind of that purview are things like BigQuery, which is an analytics data warehouse, and other analytics engines.
And of course, there's always folks who bring in their favorite, maybe lesser...
...of things. Where does it start? Where does it stop? It's not at all clear from the outside. I guess something of a legendary figure, which I know is always a weird thing for people to hear, but you were partially, at least, responsible for the Zend Framework in the PHP world, which I didn't
realize what the heck that was, despite supporting it in production at a couple of jobs, until after
I, for better or worse, was no longer trusted to support production environments anymore, which
honestly, if you can get out, I'm a big proponent
of doing that. You sleep so much better without a pager. How did you go from programming languages
all the way on over to databases? It just seems like a very odd mix.
Yeah, no, that's a great question. So I was one of the core developers of PHP,
and I've been in the PHP community for quite some time. I also helped ideate the Zend Framework, which was the company that I co-founded.
Zend Technologies was kind of the company behind PHP.
So like Red Hat supports Linux commercially, we supported PHP.
And I was very much focused on developers, programming languages, frameworks, IDEs.
And that was, you know,
really exciting. I had also done, you know, quite a bit of work on interoperability with databases,
right? Because behind every application, there's a database. And so a lot of what we focused on
is like great connectivity to MySQL, to Postgres, to other databases. And I got to kind of learn the
database world from the outside, from the application
builders. And we sold our company in, I think it was 2015. And so I had to kind of figure out
what's next. And so one option would have been, hey, stay in programming languages. But what I
learned over the many years that I worked with application developers is that there's a huge
amount of value in data. And frankly, I'm a
very curious person. I always like to learn. So there was this opportunity to join Amazon,
to join the non-relational database side and take myself completely out of my comfort zone.
And actually, I joined AWS to help build the graph database, Amazon Neptune,
which was even more out of my comfort zone than even probably a
relational database. So I kind of like to do different things. And so I joined and I had to
learn, you know, how to build a database pretty much from the ground up. I mean, of course, I
didn't do the coding, but I had to learn enough to be dangerous. And so I worked on a bunch of
non-relational databases there, such as, you know, Neptune, Redis, Elasticsearch, DynamoDB Accelerator.
And then there was the opportunity for me to actually move over from non-relational databases to analytics, which was another way to get myself out of my comfort zone.
And so I moved to run the analytics space, which included services like Redshift, like EMR,
Athena, you name it.
So that was just a great experience for me where I got to work with a lot of awesome
people and learn a lot.
And then the opportunity arose to join Google and actually run the Google transactional
databases, including all their relational databases.
And by the way, my job actually has two jobs. One job is running
Spanner and Bigtable for Google itself, meaning Search, Ads, and YouTube and everything
runs on these databases. And then the second job is actually running the external facing databases
for external customers.
How alike are those two? Is it effectively the exact same thing,
just with different API endpoints?
Are they two completely separate universes?
It's always unclear from the outside
when looking at large companies
that effectively eat versions of their own dog food
where their internal usage of these things starts and stops.
So great question.
So Cloud Spanner and Cloud Bigtable
do actually use the internal Spanner and Bigtable.
So at the core, it's exactly the same engine,
the same runtime, same storage and everything.
However, kind of internally,
the way we built the database APIs
was kind of good for scrappy Google engineers
and folks who kind of are okay learning
how to fit into the Google ecosystem.
But when we needed to make this work for enterprise
customers, we needed cleaner APIs. We needed authentication that was external, right? And so
on and so forth. So think about we had to add an additional set of APIs on top of it and management,
right? To really make these engines accessible to the external world. So it's running the same engine under the hood,
but it is a different set of APIs.
And a big part of our focus is continuing to expose
to enterprise customers all the goodness
that we have on the internal system.
So it's really about taking these very, very unique,
differentiated databases and democratizing access to them
to anyone who wants to.
I'm curious to get your position on the idea that seems to be playing its, I guess, a battle
that's been playing itself out in a number of different customer conversations. And that is,
I guess, the theoretical decision between do we go towards general purpose databases and more or less treat
every problem as a nail in search of a hammer, or do you decide that every workload gets its own
custom database that aligns the best with that particular workload? There are trade-offs in
either direction, but I'm curious where you land on that, given that you tend to see a lot more
of it than I do.
Yeah, no, that's a great question.
And, you know, just for the viewers who maybe aren't aware, there's kind of two extreme points of view, right?
There's one point of view that says purpose-built for everything, like every specific pattern, like build bespoke databases.
It's kind of a best-of-breed approach.
The problem with that approach is it becomes extremely complex for customers, right?
Extremely complex to decide what to use.
They might need to use multiple for the same application.
And so that can be a bit daunting as a customer.
And frankly, there's kind of a law of diminishing returns at some point.
Absolutely.
I don't know what the DBA role of the future is, but I don't think anyone really wants it to be, "Oh yeah, we're deciding which one of these three dozen managed database services is the exact right fit for each and every individual workload." I mean, at some point it feels like certain cloud providers believe that
not only every workload should have its own database, but almost every workload should have
its own database service. It's at some point you're allowed to say no and stop building these completely,
what feel like to me,
Byzantine esoteric database engines
that don't seem to have broad applicability
to a whole lot of problems.
Exactly, exactly.
And by the way, the other extreme
is what folks often talk about as multi-model
where you say like,
hey, I'm going to have a single storage engine
and then map onto that the relational model,
the document model, the graph model, and so on. I think what we tend to see is if you go too generic,
you also start having performance issues. You may not be getting the right level of abilities and
trade-offs around consistency and replication and so on. So I would say Google, like we're taking a very pragmatic approach
where we're saying, you know what,
we're not going to solve all of customer problems
with a single database,
but we're also not going to have two dozen, right?
So we're basically saying,
hey, let's understand the main characteristics
of the workloads that our customers need to address,
build the best services
around those.
You know, obviously over time, we continue to enhance what we have to fit additional
models.
And then frankly, we have a really awesome partner ecosystem on Google Cloud, where if
someone really wants a very specialized database, you know, we also have great partners that
they can use on Google Cloud and get great support and get the rest
of the benefits of the platform.
I'm very curious to get your take on a pattern that I've seen alluded to by basically every
vendor out there, except the couple of very obvious ones for whom it does not serve their
particular vested interests,
which is that there's a recurring narrative
that customers are demanding open-source databases
for their workloads.
And when you hear that,
at least people who came up the way that I did,
spending entirely too much time on Freenode,
back when that was not a deeply problematic statement
in and of itself,
where, yes, we're open source, I guess zealots is probably the best terminology. And yeah,
businesses are demanding to participate in the open source ecosystem. Here in reality, what I see is not ideological purity or anything like that. It is much more to do with,
yeah, we don't like having a single commercial vendor for
our databases that basically plays the insert quarter to continue dance whenever we're trying
to wind up doing something new. We want the ability to not have licensing constraints around
when, where, how, and how quickly we can run databases. That's what I hear when customers are
actually talking about open source versus proprietary databases.
Is that what you see, or do you think that plays out differently?
Because let's be clear, you do have a number of database services that you offer that are not open source, but are also absolutely not tied to weird licensing restrictions either.
That's a great question. And I think for years now, customers have been
in a difficult spot because the legacy proprietary database vendors knew how sticky the database is.
And so as a result, the prices often went up and it was not easy for customers to manage costs and
agility and so on. But I would say that's always been somewhat of a concern. I think what I'm seeing changing and happening differently now
is as customers are moving into the cloud
and they want to run hybrid cloud,
they want to run multi-cloud,
they need to prove to their regulator
that they can do a stressed exit, right?
Open source is not just about reducing cost.
It's really about flexibility
and kind of being in control
of when and where you can run the workload.
So I think what we're really seeing now
is a significant surge of customers
who are trying to get off legacy proprietary databases
and really kind of move to open APIs, right?
Because they need that freedom
and that freedom is far more important to them
than even the cost element.
And what's really interesting is, you know,
a lot of these are the decision makers in these enterprises,
not just the technical folks.
Like to your point, it's not just open source advocates, right?
It's really the business people who understand they need that flexibility.
And by the way, even the regulators are asking them to show
that they can flexibly move their workloads as they need to. So
we're seeing a huge interest there. And as you said, like some of our services,
you know, are open source based services, some of them are not. Like, take Spanner as an example:
it is heavily tied to how we build our infrastructure and how we build our systems.
Like, I would say it's almost impossible to open source Spanner. But what we've done is we've basically embraced open APIs and made sure if a customer uses these systems, we're giving them control of when and where they want to run their workloads.
So, for example, Bigtable has an HBase API.
Spanner now has a Postgres interface.
So our goal is really to give customers as much flexibility and also not lock them into
Google Cloud. We want them to be able to move out of Google Cloud so they have control of their
destiny.
I'm curious to know what you see happening in the real world, because I can sit here and come
up with a bunch of very well thought out logical reasons to go towards or away from certain patterns.
But I spent years building things myself. I know how it works. You grab the closest thing handy
and throw it in. And we all know that there is nothing so permanent as a temporary fix.
Like that thing's load bearing and you'll retire with that thing still in place.
In the idealized world, I don't think that I would want to take a dependency
on something like, easy example, Spanner or AlloyDB.
Because despite the fact that they have Postgres-squeal,
yes, that's how I pronounce it, compatibility,
the capabilities of what they're able to do under the hood
far exceed and outstrip
whatever you're going to be able to build yourself or get anywhere else. So there's a data flow
architectural dependency lock-in, despite the fact that it is, at least on its face,
Postgres compatible. Counterpoint, does that actually matter to customers in what you are
seeing?
I think it's a great question. I'll give you a couple of data points.
I mean, first of all, even if you take a complete open source product, right, running that in
different clouds, different on-premises environments and so on, fundamentally, you will have some
differences in performance characteristics, availability characteristics, and so on.
So the truth is, even if you use open source, right, you're
not going to get 100% of the same characteristics where you run that. But that said, you still have
the freedom of movement, and with, I would say, not a huge amount of engineering investment, right,
you're going to make sure you can run that workload elsewhere. I kind of think of Spanner
in a similar way where, yes, I mean, you're getting all those benefits of Spanner that you can't get anywhere
else, like unlimited scale, global consistency, right? No maintenance downtime, five nines
availability, like you can't really get that anywhere else. That said, not every application
necessarily needs it. And you still have that option, right? That if you need to, or want to,
or we're not giving you a reasonable
price or reasonable price performance, or we're starting to neglect you as a customer, which of
course we wouldn't, but let's just say hypothetically that, you know, that could happen, that you still
had a way to basically go and run this elsewhere. Now, I'd also want to talk about some of the
upside something like Spanner gives you. Because you talked about you want to be able to just grab a few things, build something quickly, and then you don't want to be stuck.
The counterpoint to that is with Spanner, you can start really, really small.
And then let's say you're a gaming studio.
You're building 10 titles, hoping that one of them is going to take off.
So you can build 10 of those with very minimal spend on Spanner.
And if one takes off overnight, it's really the only database where you don't have to go and re-architect the application.
It's going to scale as big as you need it to.
And so it does enable a lot of this innovation and a lot of cost management as you try to get to that overnight success.
Yeah, overnight success. I always love that approach. It's one of those, yeah,
it became an overnight success after only 10 short years. It becomes this idea, people believe it's
in fits and starts, but then you see, I guess on some level, the other side of it, where it's a lot
of showing up and doing the work. I have to confess, I didn't do a whole lot of admin work in my production
years that touch databases because I have an aura and I'm unlucky. And it turns out that when you
blow away some web servers, everyone can laugh and we'll reprovision the stateless things.
Get too close to the data warehouse, for example, and you don't really have a company left anymore.
And of course, in the world of finance that I came out of, transactional integrity is
also very much a thing.
A question that I had centers really around one of the predictions you gave recently at
Google Cloud Next, which is that your prediction for the future is that transactional and analytical
workloads from a database perspective will converge.
What's that based on?
You know, I think we're really moving into a world
where customers are trying to make real-time decisions, right?
If there's model drift from an AI and ML perspective,
they want to be able to retrain their models
as quickly as possible.
So everything is moving fast, moving into streaming. And I think what you're starting
to see is, you know, customers don't have that time to wait for analyzing their transactional
data. Like in the past, you do a batch job, you know, once a day or once an hour, you know,
move the data from your transactional system to analytical system. But that's just not how
these always-on businesses run anymore. And they want to have those real-time insights. So I do think that
what you're going to see is transactional systems more and more building in analytical capabilities,
analytical systems building in more transactional, and then ultimately cloud platform providers like
us helping fill that gap and really making data
movement seamless across transactional, analytical, and even AI and ML workloads.
And so that's an area that I think is a big opportunity. I also think that Google is best
positioned to solve that problem.
Forget everything you know about SSH and try Tailscale.
Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves.
That'd be pretty sweet, wouldn't it?
With Tailscale SSH, you can do exactly that.
Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH.
Basically you're SSH-ing the same way you manage access to your app.
What's the benefit here?
Built-in key rotation, permissions as code, connectivity between any two
devices, reduced latency, and there's a lot more, but there's a time limit here.
You can also ask users to re-authenticate for that extra bit of security.
Sounds expensive?
Nope, I wish it were.
Tailscale is completely free for personal use on up to 20 devices.
To learn more, visit snark.cloud slash tailscale.
Again, that's snark.cloud slash tailscale.
On some level, I've found that, at least in my own work,
that once I wind up using a database for something, I'm inclined to try and stuff as many other things into that database as I possibly can.
Just because getting a whole second data store, taking a dependency on it for any given workload, tends to be a little bit on the, I guess, challenging side. Easy example of this. I've talked about it previously in various places,
but I was talking to one of your colleagues,
Sarah Ellis, who wound up at one point making a joke
that I, of course, took way too far.
Long story short, I built a Twitter bot
on top of Google Cloud Functions
that every time the Azure brand account tweets,
it simply quote tweets that,
translates their tweet into all caps,
and then puts a boomer-style statement in front of it if there's room. This account is CloudBoomer.
Now, the hard part that I had while doing this is everything's stateless, works super well.
Where do I wind up storing the ID of the last tweet that it saw on its previous run? And I was
fourth and inches from just saying, well, I'm already using
Twitter, so why don't we use Twitter as a database? Because everything's a database if you're either
good enough or bad enough at programming. And instead, I decided, okay, we'll try this Firebase
thing first. And I don't know if it's Firestore or Datastore or whatever it's called these days,
but once I wrap my head around it, incredibly effective, very fast to get up and running,
and I feel like I made at least a good decision
for once in my life involving something touching databases.
But it's hard.
I feel like I'm consistently drawn toward the thing
I'm already using as a default database.
I can't shake the feeling that that's the wrong direction.
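The state problem described here, remembering the last tweet ID between otherwise stateless runs, can be sketched in a few lines. This is a minimal sketch and not the bot's actual code: it uses Python's built-in sqlite3 module as a stand-in for Firestore so that it runs anywhere, and the names open_store, load_last_id, save_last_id, and run_once are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a tiny checkpoint store. Stand-in for a managed document store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint ("
        "name TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def load_last_id(conn, name):
    """Return the last ID recorded for this job, or None on the first run."""
    row = conn.execute(
        "SELECT last_id FROM checkpoint WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


def save_last_id(conn, name, last_id):
    """Record the newest ID seen so the next run can skip older items."""
    conn.execute(
        "INSERT OR REPLACE INTO checkpoint (name, last_id) VALUES (?, ?)",
        (name, last_id),
    )
    conn.commit()


def run_once(conn, name, fetched_ids):
    """One stateless invocation: keep only IDs newer than the checkpoint."""
    last = load_last_id(conn, name)
    new_ids = sorted(i for i in fetched_ids if last is None or i > last)
    if new_ids:
        save_last_id(conn, name, new_ids[-1])
    return new_ids
```

Each invocation reads one small record, does its work, and writes one small record back, which is why almost any data store can fill this role at toy scale.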
I don't think it's necessarily wrong.
I mean, I think, you know, with Firebase and Firestore,
that combination, it's just extremely easy and quick to build awesome mobile applications.
And actually, you can build mobile applications without a middle tier, which is probably what
attracted you to that. So we just see, you know, a huge amount of developers and applications.
We have over 4 million databases in Firestore with just developers building these
applications, especially mobile-first applications. So I think if you can get your job done and get it
done effectively, absolutely stick to it. And by the way, one thing a lot of people don't know
about Firestore is it's actually running on Spanner infrastructure. So Firestore has the
same five nines availability, no maintenance downtime and
so on that Spanner has, and the same kind of ability to scale. So it's not just that it's quick.
It will actually scale as much as you need it to and be as available as you need it to.
So that's on that piece. I think, though, to the same point, you know, there's other databases
that we're then trying to make sure kind of also extend their usage beyond what they've traditionally done. So, you know, for example,
we announced AlloyDB, which I kind of call a Postgres on steroids. We added analytical
capabilities to this transactional database so that as customers do have more data in their
transactional database, as opposed to having to go somewhere
else to analyze it, they can actually do real-time analytics within that same database.
And it can actually do up to 100 times faster analytics than open source Postgres.
So I would say both Firestore and AlloyDB are kind of good examples of if it works for you,
right, we'll also continue to make investments.
So the amount of use cases you can use these
databases for continues to expand over time.
One of the weird things that I noticed just
looking around this entire ecosystem of databases, and you've been in this space long enough to
presumably have seen the same type of evolution. Back when I was transiting between different
companies a fair bit, sometimes
because I was consulting and other times because I'm one of the greatest in the world at getting
myself fired from jobs based upon my personality, I found that the default standard was always, oh,
whatever the database is going to be, it started off as MySQL and then eventually pivots into
something else when that starts falling down. These days, I can't shake the feeling that almost everywhere I look, Postgres is the answer instead. What changed? What did I
miss in the ecosystem that's driving that renaissance, for lack of a better term?
That's a great question. And, you know, I've been involved in, I'm going to date myself a bit,
but in PHP since 1997, pretty much.
And one of the things we kind of did is we built a really good connector to MySQL.
And, you know, I don't know if you remember before MySQL, there was mSQL.
So the MySQL API actually came from mSQL.
And we bundled the MySQL driver with PHP.
And so kind of that LAMP stack really took off. And kind of to your point, you know,
the default in the web, right, was like, you're going to start with MySQL because it was super
easy to use, just fun to use. By the way, I actually wrote, co-authored the tab completion
in the MySQL client. So like a lot of these kind of, you know, fun, simple ways of using MySQL were there.
And frankly, it was super fast, right? And so kind of those fast reads and everything,
it just was great for web and for content. And at the time, Postgres kind of came across more
like a science project. Like the folks who are using Postgres were kind of the outliers, right?
You know, the less pragmatic folks. I think what's changed
over the past, how many years has it been now? 25 years? I'm definitely dating myself,
is a few things. One, MySQL is still awesome, but it didn't kind of go in the direction of
really kind of trying to catch up with the legacy proprietary databases on features and functions. Part of that may just be that from a roadmap perspective,
that's not where the owner wanted it to go.
So MySQL today is still great, but it didn't go into that direction.
In parallel, customers wanted to move more to open source.
And so what they found is the thing that actually looks and smells more like
the legacy proprietary databases is actually Postgres. Plus, you saw an increase of investment in the Postgres
ecosystem, also very liberal license. So you have lots of other databases, including commercial ones
that have been built off the Postgres core. And so I think you are today in a place where for mainstream enterprise, Postgres is it,
because that is the thing that has all the features that the enterprise customer is used
to.
MySQL is still very popular, especially in like content and web and mobile applications.
But I would say that Postgres has really become kind of that de facto standard API that's replacing the legacy
proprietary databases.
I've been on the record way too much as saying with some justification
that the best database in the world that should be used for everything is Route 53, specifically
TXT records. It's a key value store. And then anyone who's deep enough into DNS or databases
generally gets a slightly greenish tinge and feels ill.
That is my simultaneous best and worst database.
I'm curious as to what your most controversial opinion is
about the worst database in the world that you've ever seen.
This is the worst database?
Yeah, what is the worst database that you've ever seen?
I know on some level, since you manage all things database,
I'm asking you to pick your least favorite child.
But here we are.
Oh, that's a really good question.
I would say probably the worst database, double quotes,
is just the file system, right?
When folks are basically using the file system
as really a database.
And that can work for really simple apps,
but as apps get more complicated,
that's not going to work.
So I've definitely seen some of that.
I would say the most awesome database
that is also file system-based,
kind of embedded,
I think was actually SQLite.
And SQLite is actually still very,
very popular. I think it's, I think it sits on every mobile device pretty much on the planet.
So I actually think it's awesome, but it's, you know, it's not, it's not a database server. It's
kind of an embedded database, but it's something that I, you know, I've always been pretty excited
about. And, you know, there's definitely kind of new, interesting databases emerging that are also embedded, like DuckDB is quite interesting.
You know, it's kind of the SQLite for analytics.
We've been using it for a few things around
bill analysis ourselves. It's impressive. I've also got to say, people think that we had something
to do with it because we're the Duckbill group and it's DuckDB. Have you done anything with this? And the answer is always, would you trust me with a database?
I didn't think so.
So no, just a weird coincidence.
But I like that a lot.
It's also counterintuitive from where I sit because I'm old enough to remember when Microsoft
was teasing the idea of WinFS, where they teased a future file system that fundamentally
was a database.
I believe it's an index or journal for all of that.
And I don't believe anything ever came of it.
But that felt like a really weird alternate world we could have lived in.
Yeah, that's a good point.
And by the way, if I actually take a step back,
and I kind of half-jokingly said file system,
and obviously all the popular databases persist on the file system.
But if you look at what's different in cloud-first databases, right?
Like if you look at legacy proprietary databases, the typical setup is write to the local disk
and then do asynchronous replication with some kind of bounded replication lag to somewhere
else, to a different region or so on.
If you actually start to look at what do cloud-first databases look like, they actually
write the data in multiple data centers at the same time.
And so kind of joke aside, as you start to think about, hey, how do I build the next
generation of applications?
And how do I really make sure I get the resiliency and the durability that the cloud can offer?
It really does take a new
architecture. And so that's where things like Spanner and Bigtable and kind of AlloyDB databases
are truly architected for the cloud. That's where they actually think very differently about
durability and replication and what it really takes to provide the highest level of availability and durability.
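The contrast drawn here, between writing to a local disk with asynchronous replication and writing to multiple data centers at the same time, can be illustrated with a toy model. This is only a sketch under stated assumptions: the Replica and AsyncReplicated classes and the sync_write and surviving_copies functions are hypothetical names, and real cloud databases rely on consensus protocols and durable logs rather than in-memory dictionaries.

```python
class Replica:
    """A toy data center holding one copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class AsyncReplicated:
    """Write locally, acknowledge, and ship to followers later."""

    def __init__(self, primary, followers):
        self.primary = primary
        self.followers = followers
        self.backlog = []  # acknowledged writes not yet shipped to followers

    def write(self, key, value):
        self.primary.data[key] = value
        self.backlog.append((key, value))
        return True  # acknowledged before any follower has the write

    def ship_backlog(self):
        for key, value in self.backlog:
            for follower in self.followers:
                if follower.alive:
                    follower.data[key] = value
        self.backlog.clear()


def sync_write(replicas, key, value):
    """Acknowledge only if a majority of replicas stored the write.

    Toy simplification: a failed write is not rolled back on the
    replicas that did accept it.
    """
    stored = 0
    for replica in replicas:
        if replica.alive:
            replica.data[key] = value
            stored += 1
    return stored > len(replicas) // 2


def surviving_copies(replicas, key):
    """Names of live replicas that still hold the key."""
    return [r.name for r in replicas if r.alive and key in r.data]
```

In the asynchronous model, losing the primary before the backlog ships loses a write that was already acknowledged. In the synchronous model, an acknowledged write already exists in more than one location when the acknowledgment is returned.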
On some level, I think one of the key things for me to realize was that in my own experiments,
whenever I wind up doing something that is either for fun or I just want to see how it works and
what's possible, the scale of what I'm building is always inherently a toy problem.
It's like the old line that, oh yeah, if it fits in RAM, you don't have a big data problem. And
then I'm looking at things these days that are having most of a petabyte's worth of RAM sometimes.
It's okay. That definition continues to extend and get ridiculous. But I still find that most
of what I do in a database context can be done with almost any
database. There's no reason for me not to, for example, use a SQLite file or to use an object
store, just because there's a little latency, but whatever, or even a text file on disk.
The challenge I find is that as you start scaling and growing these things, you start to run into
limitations left and right. And only then it's one of those, oh, I should have made different choices or I should have built in abstractions. But so many of these
things come to nothing. It just feels like extra work. What guidance do you have for people who are
trying to figure out how much effort to put in upfront when they're just more or less puttering
around to see what comes out of it? You know, we like to think about ourselves at Google Cloud as really having a unique
value proposition that really helps you future-proof your development.
You know, if I look at both Spanner and I look at BigQuery, you can actually start at
a very, very low cost.
And frankly, not every application has to scale.
So you can start at a low cost, you can have a small application, but everyone wants
two things. One is availability, because you don't want your application to be down. And number two
is if you have to scale, you want to be able to without having to rewrite your application.
And so I think this is where we have a very unique value proposition, both in how we built Spanner
and then also how we build BigQuery, is that you
can actually start small. And for example, on Spanner, you can go from one-tenth of what we
call an instance, like a small instance that is under $65 a month, you can go to a petabyte-scale
OLTP environment with thousands of instances in Spanner with zero downtime.
And so I think that is really the unique value proposition.
We're basically saying you can hold the stick at both ends.
You can basically start small,
and then if that application does need to scale, does need to grow,
you're not reengineering your application,
and you're not taking any downtime for reprovisioning. So I think that's,
if I had to give folks kind of advice, I say, look, what's done is done. You have workloads
on MySQL, Postgres, and so on. That's great. They're awesome databases. Keep on using them.
But if you're truly building a new app and you're hoping that that app is going to be successful at
some point, whether it's, like you said, all overnight successes take at least 10 years. At least if you've built it on something like
Spanner, you don't actually have to think about that anymore or worry about it, right? It will
scale when you need it to scale, and you're not going to have to take any downtime for it to scale.
So that's why we see a lot of these industries that have these potential spikes like gaming,
retail, also some use cases in financial services,
they basically gravitate towards these databases.
I really want to thank you for taking so much time out of your day
to talk with me about databases and your perspective on them,
especially given my profound level of ignorance around so many of them.
If people want to learn more about how you view these things,
where's the best place to find you?
Follow me on LinkedIn. I tend to post quite a bit on LinkedIn. I still post a bit on Twitter, but frankly, I've moved more of my activity to LinkedIn now. I find it's a...
That is such a good decision. I envy you.
It's a more curated audience and so on. And then also, you know, we just had Google Cloud Next. I recorded a session
there that kind of talks about database and just some of the things that are new in database land
at Google Cloud. So that's another thing that if folks are more interested to get more information,
that may be something that could be appealing to you.
We will, of course, put links to all of this in the show notes.
Thank you so much for your time.
I really appreciate it.
Great.
Corey, thanks so much for having me.
Andi Gutmans, VP and GM of Databases at Google Cloud.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast,
please leave a five-star review on your podcast platform of choice.
Whereas if you've hated this podcast,
please leave a five-star review on your podcast platform of choice,
along with an angry, insulting comment.
Then I'm going to collect all of those angry, insulting comments
and use them as a database.
If your AWS bill keeps rising and your blood pressure is doing the same,
then you need the Duckbill Group.
We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill
Group works for you, not AWS. We tailor recommendations to your business and we get
to the point. Visit duckbillgroup.com to get started.
This has been a HumblePod production.
Stay humble.