Software Huddle - Scaling MySQL with Sam Lambert from PlanetScale

Episode Date: August 30, 2023

What would it look like if databases were built for developers rather than operators? Sam Lambert is the CEO of PlanetScale, a company that provides a managed MySQL database solution. PlanetScale is built on Vitess, a database clustering system that allows for horizontal scaling of MySQL. MySQL powers an incredible amount of the internet, and Vitess is behind enormous MySQL installs at YouTube, Slack, GitHub, and more. In this show, we talk about the architecture of Vitess, what it's like to manage upgrades and releases of high-scale databases, and how to maintain a high-performance culture.

Transcript
Starting point is 00:00:00 Truthfully, I don't think many databases have been built for developers. They've been built to operate and have been built for operators, but the kind of developer experience and the daily life of using most databases kind of sucks. It's serious to go and release database software. You can't mess around. You can't just yeet it out to production. And one of our company values is no passengers. We mean that.
Starting point is 00:00:24 Hey folks, this is Alex DeBrie. I'm super excited about today's show. My guest today is Sam Lambert. Now, Sam is the CEO at PlanetScale, which I think is one of the more interesting cloud-native database companies out there right now. We just cover a ton in this episode. We talk about what I consider the two parts of PlanetScale. We talk a lot about Vitess, which is the technology underlying PlanetScale. So we talk about the history of Vitess, the architecture, how it fits into PlanetScale, all that stuff. And we also cover a lot about how PlanetScale is such a high-performing
Starting point is 00:00:51 culture, right? And you see this in their cadence of feature releases, in the great educational content they put out, just their nice design aesthetic, all that great stuff. So we cover that in today's episode. This is the first episode in this series I'm doing on cloud native databases. So if you're interested in that, make sure you subscribe to the podcast. If you have guests that you want on, if you have notes on the show, anything like that, feel free to reach out on Twitter, email, whatever it is. Otherwise, let's get to the show. All right, Sam, welcome to the
Starting point is 00:01:18 show. Hey, thank you for having me. Yeah, absolutely. Glad to have you here. So you are the CEO of PlanetScale. You've got a lot of great experience at GitHub and other places. Can you give listeners just a bit of your background and what you do, your history, things like that? Yeah. So like you said, CEO of PlanetScale. Before this, I've worked at two kind of very significant companies that were very significant in tech.
Starting point is 00:01:47 I did a short stint at Facebook on the traffic engineering team, which was just an awesome experience seeing that's the scale of Facebook. Like you can imagine the scale of Facebook and then quadruple it and it's still not there. It was just absolutely incredible uh and then before that I was at GitHub which uh also another site at scale I think you know around the time I was there it was like the 32nd largest website on the internet I think and I was there for about eight years started as the first database engineer at the company when the company was still
Starting point is 00:02:26 small back in the kind of no managers culture very flat organization and just a very disruptive interesting and cool tech company to be at and that's kind of i would say where the formative years of my career were spent and now i'm at planet scale trying to solve some of the problems i've seen plague every company that i've ever been at yeah absolutely i'm sure you've seen some some pretty amazing stuff so i'm excited to talk about planet scale and and just for the users like i think of planet scale as as sort of like two different things right it's like originally i think of it as like manage Vitesse, which is like a horizontally scalable MySQL solution. So if you have these hyperscalers like GitHub, like Facebook, YouTube,
Starting point is 00:03:12 somebody that has terabytes and up of data, right? Now they need to horizontally scale their MySQL. And that's what Vitesse is doing, like an open source solution. But PlanetScale is providing a managed version of that. But then in addition to that, there's just like a bunch of like other delightful database add-ons that that seem to have nothing to do with the test but like are available to to all your users that would be like hey a better workflow for schema changes you know thinking about like non-blocking changes branches deployments reverts things like that um also planet scale boost which came out recently-ish, right,
Starting point is 00:03:46 which is like a materialized cache with automatic incremental updates all the time. PlanetScale Insights, which just gives you very good query visibility compared to other solutions out there. I guess my question is, was it always this way? Or did you start off with like, hey, we're going to manage the test and then realized we can provide a lot of stuff to other downstream customers as well? Or what's the story there? Yeah, the original sort of vision for the company was that Vitesse itself is just an extremely powerful database technology,
Starting point is 00:04:17 of which there's not many like this on the market or not locked inside kind of a single cloud platform. There's a lot of power to be able to take a a a technology like this and deploy it kind of anywhere um across clouds and so it was originally like managing that and that you know picked up some really great logos and customers that of the test users and you know if you've sent a slack message today that's gone into a Vitesse database slack have been very vocal about how they're huge Vitesse shop um github's primary database is Vitesse and then we have people like HubSpot um Roblox Etsy all of these companies
Starting point is 00:05:02 that have been very vocal about their use of the tests. And that's one thing that first of all is just really cool to see contributions from major companies running this technology at scale. Means that the kind of the rising, you get decent incremental features built on top of the platform that's useful to a broad audience built by very, very talented engineers and maintainers. But then we wanted to see way past that vision, which was build a... Like you know, in our first kind of tagline was database for developers. And everyone I think was like, well, what were databases for? And yes, it was like well what were databases for and yes it was like a slightly kind of tongue-in-cheek but truthfully i don't think many databases have been built for developers they've been built to operate and have built been built for operators
Starting point is 00:05:57 but the kind of developer experience and the daily life of using most databases kind of sucked. And so we really wanted to improve it. And I did a little post on this recently talking about developer experience. And developer experience is not really just like bloopsie fun little things and a little avatar. Dark mode. Yeah, dark mode or whatever. It's doing what you're supposed to do
Starting point is 00:06:24 for years and years and years on end. And so bringing a database product to market that has great developer experience requires being a great and proven tested database and doing what it's supposed to do. And having Vitesse as our foundation enables us to do that. And then we build all these incredibly cool features on top that make operating and using the database um really joyful even like we came out with database branching you know we own we even patented that it was crazy that we patented it and they were able to in like 2021 not that we'd ever probably use the pattern but it's just a kind of a marking of territory saying this was us you know we came out and it was crazy that was done so late and i kind of put my head up after being at github and realizing databases just had not moved they were just still doing the fundamental things and so i'm proud that we've brought a huge shift
Starting point is 00:07:13 in terms of the the minimum bar for features you can see that by pretty much every other competitive company is scrambling to add these features um while we build the next set. Yeah. Did you have, when you were at GitHub or teams at Facebook, I imagine, did they have branching-like workflows? Like I remember Ghost coming out from GitHub to help with like database migrations. Like did big teams have those sort of tools to make branching pretty easy, but it was just like not available for the masses? We never quite went as far as branching,
Starting point is 00:07:43 but we did get to good staging environments i think a lot of i think what the the impression that github left on planet scale is the unrelenting drive towards shipping software constantly and deploying software constantly so when i joined the company the company had grown really rapidly. And this was around the time that we raised that record beating kind of series day around the company was bootstrapped and was just winning hearts and minds of developers everywhere. And so things had grown really quickly in the database. It kind of,
Starting point is 00:08:16 there was like the usual kind of patterns and things and that come together inside the database. And that had to be fixed, it was we could never we never had the option to fix infrastructure problems while slowing development down and that to me was such an amazing tough but amazing lesson to learn because it taught you the appreciation for shipping velocity and what shipping velocity does for a company. So when you see things like branching, like deploy requests, all these things in the plan, it's all in service of you shipping really quickly and taking away fear of the database.
Starting point is 00:08:57 Like we really believe in just push it to prod, push it to prod. And if your database allows you to do that then you will gain huge speed ups and impact for you and your user base yeah what's the history of the test was it is it youtube that developed it is that right yeah so youtube is growing rapidly now the second most visited website on the internet second most traffic search engine and has what two and a half billion monthly active users and so they were using my sequel naturally i mean if you look at the top 100 of the internet it's pretty much exclusively powered by my sequel and uh they were scaling up and they needed a solution to go and sort of shard and it was originally like a proxying project then
Starting point is 00:09:43 became sharding and all these things got built into it and yeah they just did an outstanding job it was built on the earliest version of go you look up in the go history that they talk about the early vites creators being instrumental to the go project even because they were giving feedback on like go 0.1 um so it's one of the people don't realize it's one of the oldest go projects in existence it was also built on borg which is the predecessor to kubernetes so it was built in this highly hostile environment everyone was used to running databases with like raid controllers and it's like oh we're gonna recover that machine and bring you know not in the kubernetes or borg world so that meant that they built a extremely resilient cloud native air quotes um database technology which could
Starting point is 00:10:37 recover from pretty much any type of failure and then they put very smart google engineers spending nearly a decade to go and build it and now we build on top of that extremely strong foundation it trickles down in so many ways and and why it's important but yeah it was picked up by slack picked up by github picked up by so many other companies and kind of became the ubiquitous um mysql scaling solution out there at github before pulling in the test did you have kind of a home-built horizontal scaling solution or or i guess what did that look like we were doing the classic pattern that everyone does which hopefully is dying um not at github but i mean in generally as an industry well first of all we ran our data centers, which is still kind of a good advantage
Starting point is 00:11:26 because we would be able to buy very expensive, fast database servers. And then we had, you know, when I joined, we had three database servers. And now it's a significant amount more, but it kind of got partitioned. So we would end up taking like the largest tables that were always causing problems the notifications table was a pain the stars table
Starting point is 00:11:51 was a pain the statuses table was a pain statuses table you know at the bottom of a pull request you get all those checks it's like all of the different statuses like a pull request can be in um we just hammered the database constantly So we moved them out into their own clusters. So they kind of freed up the rest of the database infrastructure in terms of like buffer pool and contention and resources. And this is like a really common pattern. And a lot, we see this a lot.
Starting point is 00:12:17 A lot of people come to us having kind of split up their main database cluster. Problem is it's not always better for availability right like even if you have high availability in those separate clusters you're kind of you're pinning a lot on three or four masters across three or four so we were doing that and then eventually and we made some built some awesome tooling right like we continued the maintenance of orchestrator with shlomi we built ghost we built lots of really ski free lots of really cool open source projects on top of my sequel and yeah like you mentioned you see the dna of those
Starting point is 00:12:51 are now in vites and they're in planet scale the product itself but we've been fortunate enough to take it way further than we even imagined uh at github yeah and what is what does active work look like on the test right now is that like i imagine planet scale employees are major contributors like are you sort of the leading contributor is that is that still some youtube folks like how does that break down the the project leads are primarily at planet scale you know we we kind of craft the project we do that we manage the releases we maintain the test essentially even though it is a cnc project cncf project it was i think i think it still remains the only graduated database in
Starting point is 00:13:32 the cncf um and was a very early uh project to do so alongside kubernetes um so yeah we maintain it it's you know it's a lot of work we have to manage releases very carefully when we deploy our software we know it's going out to like billions of users as you know one of the biggest sites in china runs on the tests um you know it's serious to go and release database software you can't mess around you can't just yeet it out to production and like it goes through rigorous testing every push gets tested against every modern framework we have all of these acceptance and regression tests then it's rolled out you know but companies like slack is they have maintainers there they're still very active maintainers the the the slack is even for a hyperscale back-end mysql scaling solution we still have a slack with like 3 000 people in it
Starting point is 00:14:22 um we have the monthly maintainer calls the weekly check-ins with all the maintainers. This is a real big active open source project that has a huge impact at a lot of companies. And so we get a great set of folks show up every time to kind of manage the project. Then we have the features in it that get released, or setups for new features that are coming for PlanetScale. So it's a big operation to maintain Vitesse. Yeah, absolutely. Okay, let's talk Vitesse a little bit. Walk me through the high level architecture. What are the components in a Vitesse cluster? So there's a few different components of Vitesse. Firstly, VTGate, which is what kind of could be known as a proxy,
Starting point is 00:15:12 I guess is the easiest way of describing it. VTGate is like what terminates your connection and determines the query routing and where it has to go. So if a test sits on top of MySQL, we run real MySQL under the hood. That has like a number of benefits. First of all, MySQL is exceptionally proven and trusted.
Starting point is 00:15:36 It's like, what, 26, 27 years old now. Incredibly reliable. Even like a decade ago when I was playing around, I've like slammed my sql servers with right ripped the power cord out of the server and everything that was acknowledged was still there like it's incredibly robust and good and trustworthy so we build on top of my sql but we're we have a layer in top on top a proxy layer which means we can't support everything that my sql supports we're working on the last few things
Starting point is 00:16:07 that that don't make us 100 compatible foreign keys is one of the major ones yeah we get a lot of uh noses turned to us for not having but foreign keys but we've solved it we just have to keep ramping up towards being production grade which is a very high bar and so that vt gate is that just emulating like the the mysql parser or is it using the mysql parser at all under the hood and and hooking into that or is it today do you have to basically replicate all that mysql parsing well yeah fully emulated in go okay um so it's not an easy task luckily a lot of time is spent at google just running down all of those edge cases and in the majority of cases unless you're doing something weird like your app is pretty much gonna gonna work and it's certainly for the benefits you get
Starting point is 00:16:58 from sharding or it becomes worth it it's not like a giant there's some more on the postgres side there's like some really interesting postgres databases um that are kind of postgres wire compatible but not fully compatible um which causes people some like weird edge cases in the long run we try and avoid that like i said we test all of the frameworks that are kind of out there and popular to make sure they work well and and foreign keys is really one of the last things that we have to fix so vd gate kind of read looks at the query looks at what has to be done to serve that query and then breaks it up among shards if necessary to go and retrieve that data.
Starting point is 00:17:48 So you can have VT gates horizontally scaled. So that's one reason we can claim unlimited connection scalability. And what we mean by unlimited is if you keep adding vtk nodes you can keep adding connection resources basically we have some people running literally millions and millions of connections to their databases that itself you talk about developer experience that itself is a subtly extremely difficult problem right like most people have to run proxies in front of their databases to terminate and handle connections and then if you're using lambda or various kind of worker architectures you can really thrash the database with connections which can be very painful
Starting point is 00:18:39 and can just exhaust and waste a load of resources on your database, basically. Yeah, absolutely. So when I'm in the PlanetScale dashboard and I look at my database and it says, like, load balancers and it shows my database, is the load balancer, are those the VT gates, essentially? Yeah. Or is there even, okay. Yeah.
Starting point is 00:18:59 That's what I'm seeing. And then we have an edge network that kind of routes you to those right, correct places. And you'll see there's a bunch of benchmarks out there that show us to be the fastest or one of the fastest because we have a fully deployed global edge that then routes you to the right VT gates. Gotcha. And then what are the storage nodes themselves? So storage nodes, you have vt tablet okay it's a sidecar process to my sql that receives um all sorts of events and um serves queries um and can rewrite queries and can protect and
Starting point is 00:19:39 buffer the database all next to my sqSQL within a shard. Gotcha. And okay, and that's running MySQL into the running like the latest version of MySQL? Is it sort of pinned to some previous version? Like what is that? Oh, so that's, that's one thing we're quite, you know, proud of is because of how advanced the architecture is, and we can kind of fail over and manipulate the cluster really simply without outages. We keep people very up to date. We manage rollouts on very large customers to help just make sure that they don't have regressions.
Starting point is 00:20:17 But most of the time you're going to be running on the latest, if not very close to the latest version of MySQL. The only is so funny if you're if you're an rds user right now the only online way to migrate from 5.7 to 8 is to move to planet scale within amazon you can't online migrate to the latest version um without an outage without downtime which is that's quite amazing to me because like you know how many rds users they are i can't believe they couldn't uh you know build some of that migration tooling um themselves it's the power of being the incumbent have everyone trapped
Starting point is 00:20:54 yeah so if i'm like a typical like say i'm on like you know the scaler scaler program i'm not like an enterprise that has you know something more managed and custom but like are you just automatically upgrading the tests in MySQL for me or am I clicking buttons to say upgrade to the latest? We do it all for you. Okay. Any PlanScale user will not have realized that we continually upgrade their software
Starting point is 00:21:18 for performance improvements, for all sorts, without them ever noticing the impact. It's really hard to it's not easy it's really it takes a lot of work a lot of automation to do these kind of upgrades online we have customers that have come to us purely because they couldn't solve that themselves right that's great that's that's cool that's value we provide and we do tens of thousands of failovers every week not because the software is failing but because we use them as a mechanism to upgrade resize repack do all of these various things that we do for our infrastructure and it
Starting point is 00:21:51 just works very well okay okay that that vt tablet or sorry vt gate which is the proxy layer it's it's parsing that query and figuring out which which shards it needs to go to does it rewrite write that as a mysql query again or is it able to pass in like the parsed query to the mysql instance it depends it sometimes does rewriting because it might have to scatter across um a number of shards to go and aggregate the data um and pull it together uh sometimes like often a very common rewrite is it adds limits limit clauses so that you don't just go and thrash
Starting point is 00:22:27 the database with too many rows that you need. It does all these things that help guard against errant application behavior. ORMs, looking at you. We love ORMs, but they, you know, they can get a bit
Starting point is 00:22:42 unruly. So there's a whole number of things that it does. Like the team that build the Vitesse query engine and the parser are just geniuses. And I wish I understood the depth that they do, but the work they do is absolutely outstanding. Yeah. Writing a query planner is very, very hard. No kidding. No, I can't even imagine.
Starting point is 00:23:07 Okay, so that high-level architecture, so we basically have the stateless proxy layer that has some config about where these shards are assigned and then all those shard layers that run the storage nodes. Is that a pretty common setup for, if we're talking about any of these distributed databases, whether it's Cockroach or Amazon Aurora or Dynamo or Cassandra? Is that the same high-level architecture? Are there any major differences between Vitesse and some of those other ones or even groupings of patterns there? I think at a very high level, they're all going to have that same kind of push down to storage node, load balance somewhere. We've chosen for kind of a shared nothing sharded model. Some folks kind of do an automated sharding model where they kind of KV across around a load of nodes
Starting point is 00:23:56 that has certain trade-offs. We choose to have a single writer node per shard rather than multi-master. That's an interesting trade-off um it's a trade-off made for scale and safety primarily right if you have multiple masters then you have to do confret resolution if you're doing that across like a network with latency you can uh dramatically slow down the throughput of your database um and that what are the what are the ones that are doing multi-master would that be cassandra are there cassandra do it but then they have like various consistency tweaks cockroach do it
Starting point is 00:24:36 um spanner does but again like they all have these trade-offs under the hood about really understanding kind of where you're placing your data how you're doing it with us each shard follows a primary machine basically and and um that means you can have very very fast writes that get um sent over they get sent over semi-synchronously which means they have to be acknowledged by at least one other node in the cluster so you know the data is safe but you're not trying to gain quorum among three or six or whatever nodes yeah gotcha so what is that so tell me about that on on the flow of a write request so when it's going to come in it's going to hit the vt gate route it to you know ideally one vT tablet and and straight to my yeah straight to the most primary and another part of the architecture that's really
Starting point is 00:25:30 important is VT orchestrator which is always watching your cluster and making sure that it's always in the correct state to serve queries so it always has a master it always has at least one replica or as many as you've configured and and make sure if that master disappears replica gets promoted So it always has a master. It always has at least one replica or as many as you've configured and, and make sure if that master disappears, replica gets promoted, promoted, new one is created.
Starting point is 00:25:50 Like it's always making sure that your cluster stays in the correct state, um, to go and, uh, do the job you need. And, and, and this is where like a simpler,
Starting point is 00:26:02 it's much more easy to reason around. We wrote to one node within a few milliseconds, it appears on another node versus we have two copies of application talking to two separate databases and now they are conflicting. That inside a code base to reason around those types of problems is much, much harder and is really not needed by many people. If you have a low volume kind of metadata store that has to be strongly consistent absolutely everywhere, you can choose some of these architectures and they'll probably work for you. What is the typical topology look like for a big service? And especially within a shard, how many replicas are we talking behind that primary? And are they same region? Are they same data center? Are they cross region? What does that look like? So by default, every plant skill database is multi-AZ. We put replicas across each availability zone
Starting point is 00:27:05 should an availability zone go. We have a very multi-layered approach. So, like, we don't trust EBS fully, right? Like, there's some, like Aurora, for example, they rely on EBS's underlying replication to replicate the data. We do too, but per node. But then we have multiple nodes that have multiple EBS volumes.
Starting point is 00:27:27 So we're not sharing single volumes across the database. We are making sure there is an isolated copy on EBS or the Google Cloud version per AZ, meaning you can completely lose the availability zone or an EBS volume inside your cluster, and your data is completely safe and replicated elsewhere. Yeah. So you mentioned semi-synchronous replication there.
Starting point is 00:27:55 So when that write comes in, hits the master, has to be acknowledged by at least one other one. If all my shards have, say, 10 replicas or something like that, are all of them sort of in that candidate group for some semi-synchronous replication or is it like you know a group of three that are sort of you know the primary and the replicas that are responsible for that and the other ones are made only just like read replicas purely it's true it's tunable but really it only needs to get to one other server normally.
Starting point is 00:28:27 It's not waiting for all 10 to receive that. And you wouldn't really, this is the beauty of sharding. Most people's sharded setups have like one or two other replicas. Because you're breaking your data out across so many servers and it's horizontal across these shards, your failure domain gets to be much smaller. So if you've got hundreds of terabytes of data, but it's strung across 100 shards, for example, when if a failover happens,
Starting point is 00:28:58 it's a tiny blip for a very small subset of your users, and it comes back online rather than sharing tons of infrastructure which means you can have global outages that's bad yeah when you see someone that has lots of replicas per shard do you like you say hey you should actually be sharding more like smaller shards or is it is that just different patterns like sometimes more replicas is better sometimes more charts is better that's what we'd recommend but then everyone has their version of the justin bieber shard and the justin bieber shard is a shard set up that that was at instagram
Starting point is 00:29:36 uh he basically was on his own infrastructure because he was the top user until a bunch of issues were solved um and like everyone's got that shard that has that really important customer on it or whatever that they may beef up a little bit but normally we recommend keeping it kind of uniform and sharding smaller which is not a difficult operation with the test is it like can i do isolation of shards that that small like if i have a major customer like sort of get that. It's not like, you know, I have this key space that's going to be, like, equally divided. I can sort of make a really narrow one.
Starting point is 00:30:11 You can get very, very granular in terms of how you want to pin and place shards and place data. You can even shard by geographical regions and have data pinned to various regions. So people use that for like GDPR and nightmare situations like that. And then can I make queries that don't use my shard key? Yeah. And that will just scatter, gather, hit every node? Correct. Not the most ideal, but it's completely possible
Starting point is 00:30:46 yeah and for some people it's not even ever a problem um but you know we kind of help the larger customers tune their queries to hit charts more uniformly yeah okay okay um tell me about vstream how does that how does that come in What is it? And how do you all use it? Vstream is an incredible technology. Vql replication that allows us to stream data pretty much across shards oh it's not built on top of mysql it's itself part of the go application but if you imagine a reshard a massive reshard where you have to completely change your shard key it's kind of a tough operation because it means you have to stream data across disparate shards into a different cluster and reshard it some people do that like routinely
Starting point is 00:31:52 for their query patterns um so it was also the message giving infrastructure at youtube as well and it's just a very robust streaming um architecture that allows um that allows us to be incredibly flexible with the data. We haven't even got close to kind of surfacing the power of things like VStream into our product. We will get there in terms of making even more powerful primitives, there's some extremely, extremely cool things that, you know, the test users even know about the askT gate level? Is that at that like each VT tablet level, or maybe like the primary for a shard? Where like, if I want to process that, how am I connecting to that? So you can connect to it. So if you want to connect to the vstream right now, we actually send it out to you via connect, which is our, you know, which can give you it's an API that can give you constant streaming. So you can put that into Kafka, you can put it anywhere. It's how we do insights as well, which is how we
Starting point is 00:33:08 capture queries and we stream them to our data warehouse. So we just expose that for you. And you can listen to the aggregated stream of the whole cluster. That's one thing that's really, really important. It's not even per shard, it's the entire cluster's change stream, which is, again, really powerful when you have a sharded system. Wait, and so when you're talking about Insights, are you doing, I guess, like, read queries also end up in the stream, so I could look at those, is that right? No, reads aren't part of it. It's just inserts, inserts or updates. Yep, all kinds of updates. Okay, very nice. I want to talk a little bit about just like
Starting point is 00:33:45 Vitess's limitations, almost, or just the fact that, you know, now it's a distributed system that has multiple nodes. Like, what can and can't be supported, or just changes? So tell me about transactions. Are transactions supported in Vitess? Like, what pattern are they? Single shard only? Are they cross-shard? What can I do there? So single shard is easy, well supported. There is a cross-shard transaction implementation that is certainly slower and could do with a bunch of improvements. It's not something we're focusing on heavily right now, because most people, if they're at scale, they kind of don't need it, and they're much better off localizing the transactions within the shard anyway. But it is possible.
Starting point is 00:34:28 Although just not completely recommended. I think the limitation that is probably most apparent to the widest group of folks is you get a tiny bit more latency because of the extra hops that we have to put you through. Like, if you're just comparing a straight-up create-an-RDS-database on Amazon with PlanetScale, there's going to be a little extra latency on PlanetScale just because of those extra hops. Now, it's an unfair comparison, because with that RDS setup, you don't have anything that's highly available. You don't have a proxy in front of it. You don't have any of those things.
Starting point is 00:35:08 You don't have real connection scaling. So when you compare them, it's the same. But you do have this default extra hop that we can't really take away. But in return for that hop, we give you unlimited connection scaling, high availability, all of these things that make it very easy to not think about your database. But it's a thing. It's like if you're doing 1,000 really bad queries
Starting point is 00:35:35 or have like a horrible N+1 on your page, like, you might notice it. I think the idea is that you should really sort of architect away from that being a problem. Yeah. And what does that look like? Like, say I'm running a query that is just going to hit a single shard. It's not going to be, you know, a cross-shard transaction or a scatter-gather. If it's just hitting a single shard, are we talking, like, sub-millisecond addition for that hop there?
Starting point is 00:36:01 Yeah. So it's going to add about a millisecond, I think. And that's just the connection coming in to get to the database server. Everything you do on the database server then isn't taking an extra hop. Yeah, okay. Okay, what about, what sort of consistency does PlanetScale provide? Yeah, if I'm reading from a primary, and also if I'm reading from a replica? So you get repeatable read isolation level. We don't do serializable isolation, obviously, because it's very slow, and it wouldn't be necessary for the way that we're replicating the data. It pretty much mirrors what you get from MySQL.
Starting point is 00:36:47 Yeah. And also just like single item read consistency, that'd be probably strong consistency if I'm hitting the primary, but also if I'm using a replica, then some eventual consistency. Yeah. If it's been committed anywhere across the cluster,
Starting point is 00:37:02 it has been committed. You're not reading anything dirty or that is uncommitted. Yeah, gotcha. What type of, like, are there certain patterns that you recommend people avoid? And, you know, scatter-gather type queries, also like many-to-many, is that something you try and get people away from, or any other types of patterns, given that you could hit those cross-shard? Many-to-many is not terrible when you're, again, within the same shard. Scatter-gather is the main one we try and move people away from. Or sometimes people do a lot relying on single rows, like updating single counters, which is obviously very, very painful for any database server, and lock contention. But normally it's just, like, helping people get away from some nasty query patterns that can be very detrimental to the database. We make that really easy to find through PlanetScale Insights.
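The hot single-row counter problem has a classic workaround that's worth sketching: split one counter row into N slot rows and sum on read, spreading lock contention. This is a hedged illustration with invented names, using a dict to stand in for the table:

```python
import random

NUM_SLOTS = 8
# Stand-in for a table with NUM_SLOTS counter rows instead of one hot row.
counters = {slot: 0 for slot in range(NUM_SLOTS)}

def increment() -> None:
    # Each write lands on a random slot, so no single row is a lock hotspot.
    counters[random.randrange(NUM_SLOTS)] += 1

def read_total() -> int:
    # Reads pay a small cost: the equivalent of SELECT SUM(n) over the slots.
    return sum(counters.values())

for _ in range(1000):
    increment()
assert read_total() == 1000
```

The trade is write throughput for a slightly more expensive read, which is usually the right direction for a counter that's updated far more often than it's displayed.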
Starting point is 00:37:54 You can tag. Yeah, let's move on to Insights, because I think it's so interesting, because I know a lot of other databases are like, hey, you know, only turn on logging for super slow queries, because you might have some performance impact. But you'll have all the queries basically available for that. Correct. Yeah, what's going on there? And that's because of VStream? Yeah, we stream that data out to a data warehouse. We capture every single query.
Starting point is 00:38:19 And it means that we can do really solid analysis. We just did a really good blog post about how we store time series data, actually in sharded MySQL. And we basically keep a sketch, like a pattern of eight days trailing of the performance of each individual query. And we've seen trillions of these at this point. And it means we can get,
Starting point is 00:38:43 you know, like you said, if you attach Datadog or whatever, it's sampling your database, right? It's not getting every single query. And anyone who's spent enough time scaling anything has spent time hunting for a specific database query that they can't find, because when you have a server that's doing, I don't know, 50,000 queries a second, finding those one or two that are really detrimental is really, really tough. Well, we see all of them, and we can then surface the ones that really, really matter. And it's very unique in what
Starting point is 00:39:16 we do. And there's actually so much of the current roadmap that is about making Insights incredible. And part of our strategy as a product is we had to have a number of things in place in terms of primitives, one of the primitives being deploy requests. We had to have a mechanism by which we can suggest changes to your database, right? We needed something where we can get in the loop and there's a workflow attached. We didn't just go and say we're building a database back end; we're building a whole load of workflows that you could only build when you have the database back end. And so Insights is a key way of informing those workflows, and it is a favorite feature, even among giant customers. Like, one of our customers, very, very large, runs hundreds and
Starting point is 00:40:05 hundreds and hundreds of terabytes on PlanetScale, and they have one guy that wakes up and has his morning workflow: open the Insights page, find the slowest query, fix it. Now, if you're lucky enough to have a tool that just tells you, this is the worst query, you're already winning. And then just having people that fix them routinely, you constantly keep your app feeling extremely quick. And it's just great. I mean, it's one of the best ways to just continually optimize your database, and that will just get more and more optimized over time. Like, we're nearly done with a bunch of features that will just make database administration a thing of the past. Yeah, that's amazing. Is it hard to do sort of aggregation by query signature, to say, hey, this query is
Starting point is 00:40:52 actually the same as this query, even though it has some different parameters? Yeah, it's not easy. The post talks about that a little bit, actually. You have to take the fingerprint of those queries, and we're even learning similar queries by intent as well, in terms of what they're actually trying to gain. It's not easy. But it surfaces really amazing data when you have it all, like showing people when that query showed up. How many times have you debugged something and it's like, oh, now I'm finding out, like, looking through code to find out where and why a query showed up? Like, we
Starting point is 00:41:33 can show what connection ran that query, where it came from. And there's also cool things that we support, like you can use key-value format in queries. You know, I don't know if people realize you can add comments to SQL queries. So we then parse those key-values and allow you to tag inside, so you might find a slow query in Insights and see that it's coming from some worker, because it's tagged that. And people even tag down to, like, request IDs and things, or, like, method, even, just be like, hey, it's this method, it's this request ID, all that stuff. Yeah, we do exactly that. With the Marginalia gem, you can see the action and the view and the controller.
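The two ideas above, fingerprinting similar queries and tagging queries via SQL comments, can be sketched in a few lines. These helpers are purely illustrative; real implementations (Vitess's query normalizer, the Marginalia gem) handle far more cases, and the comment keys here are invented:

```python
import re

def fingerprint(sql: str) -> str:
    """Collapse literals so differently-parameterized queries share a signature."""
    sql = re.sub(r"'[^']*'", "?", sql)               # string literals -> ?
    sql = re.sub(r"\b\d+\b", "?", sql)               # numeric literals -> ?
    return re.sub(r"\s+", " ", sql).strip().lower()  # normalize spacing/case

def tag(sql: str, **kv: str) -> str:
    """Append a Marginalia-style /*key='value'*/ comment for attribution."""
    body = ",".join(f"{k}='{v}'" for k, v in sorted(kv.items()))
    return f"{sql} /*{body}*/"

assert fingerprint("SELECT * FROM users WHERE id = 42") == \
       fingerprint("select *  from users where id = 7")
assert tag("SELECT 1", controller="orders", request_id="abc123") == \
       "SELECT 1 /*controller='orders',request_id='abc123'*/"
```

Because the comment travels inside the query text itself, it survives all the way to the server's query log, which is what makes attribution possible after the fact.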
Starting point is 00:42:09 It's great. That's amazing. Okay, so that's a good time to switch to PlanetScale from pure Vitess. So just to be clear, if I come in, if I'm using Hobby or Scaler or Scaler Pro, no matter what, I'm going to be using Vitess under the hood, even if I don't have my cluster sharded. Okay. How hard is it if I do want to switch up to sharding later? Now I'm at terabytes of data or something like that
Starting point is 00:42:33 and I want to shard. Is that pretty easy to just make this a sharded system now? Is that a button click? Is that like a, hey, we got to call you? What's that look like as I'm moving to a sharded system? It's going to become a button click. But right now, because the majority of customers that need sharding are quite large, we're usually talking to them anyway. And if they're migrating from a legacy architecture, I mean, some people are still moving out of data centers onto things like PlanetScale.
Starting point is 00:43:00 We kind of have to help them with some of the edge cases. We've got to the stage where it's pretty much scripted in terms of discovering what the best shard key would be for you by looking at your queries, and we basically help you. Yeah, it's very cool. It's Insights again. Yeah. And so we'll get to the point where it will be a button click. But right now, our bar for user experience is very, very high. And because it's, at the moment, a fairly personal process to go through based on your application, we spend a bit of time with you figuring it out. A large amount of the proof of concepts we do with customers is just we deliver them a sharding scheme. This is how your data will be
Starting point is 00:43:48 bucketed. It's not terrible. Switching databases is way worse, and that's what usually happens. Right now, if you're listening to this and you've gone and picked a database that hasn't run at huge scale,
Starting point is 00:44:04 you might either be selling your business short, or you might have a complete database rewrite in your future. That's usually the path people go through, and it's hell. And when you're doing that, you're spending a year not shipping, you're having outages. This is why we've done PlanetScale. I mean, that database scaling operation is agony. Yep, yep. How hard, so you mentioned earlier, like, moving from RDS MySQL 5.7 to 8 or something like that, you can do with zero downtime. Can you do other migrations very easily? And I'm thinking, like, especially non-MySQL migrations. Can you do a Postgres to MySQL migration, or is that a more custom process that will take a little bit?
Starting point is 00:44:51 It's custom. One of our biggest customers actually came from Postgres. It's not crazily difficult if they're not using some of the kind of edge-case features of Postgres. The data import normally has to just be done by, well, not manually, but like nibbling out the data and putting it into the PlanetScale database. If it's from a MySQL database, we can do it, like I said, fully online. So the online migration process from MySQL is really cool. Basically, we connect to the MySQL node. You give us the credentials to the MySQL server. It's all encrypted.
Starting point is 00:45:31 We'd have to be crazy to allow your data to go over the open internet without encryption. But basically, it's a VStream again. So every other database provider you see, you go to their documentation, and, we didn't want this for our users, the migration process starts with: dump your database. Yeah. Not everyone, especially a lot of developers that haven't spent their time there, knows the trade-offs with just dumping your database. They do not realize that you have to capture a point in time that you can replicate from. It's a difficult process. It's just not easy to go and do. And people want their import docs to look as easy and clean as possible, and don't describe some of the dangerous trade-offs
Starting point is 00:46:20 that can be made by doing things this way. We didn't want that. We wanted it to be seamless and truly an easy migration. So the way you migrate to PlanetScale is you connect to, let's say, an RDS database. We nibble the data out. We never dump. There's no dumping of data involved in this process. We just, this is again with VStream, can incrementally build replicas and copies. So it just nibbles the data slowly out of your database until it's completely caught up to date and replicating. And then we tell you, ta-da, PlanetScale's up to date with your previous infrastructure. And this is when it gets really good. We create you some credentials on PlanetScale
Starting point is 00:47:05 and you connect your application to those. You redeploy your app using these new credentials? This new connection string, yeah. And what we're doing is we are proxying writes back to your RDS database. So you're hitting PlanetScale, you're reading from us, but we're proxying writes back to your old database. That's step two. And then step three, you just say it's time to fail over. And what we will do is we will swap the roles. So we will make that RDS database a replica of us. We will
Starting point is 00:47:37 then, using our proxy, transfer your write buffer and transfer your writes to the primary on PlanetScale. And within three steps, you have achieved a fully zero-downtime migration, without doing really anything except giving us some credentials and changing your application's credentials. That's pretty amazing. And so you talked about the dump process, and people don't understand that. Is that mostly a performance issue that they don't understand? Like, hey, if you go try and do a full dump on your production database, you're going to be really screwing it up? Or what other issues are there with that? Well, yeah, you very often will just lock your database. But you have to make sure you capture the right checkpoint for replication. Basically, you have to discover you're in the right place for binlogs, for example.
Starting point is 00:48:30 And if you're not, and you replay from there, you could lose data, you can have all of these catastrophic problems, or you can get to an issue where it's just arduous and difficult. Like, if the binlogs aren't fully turned on, or don't retain enough time to replicate from, then you're going to miss certain transactions. It's really, really painful. Yeah, absolutely. Earlier you talked about how you have some customers that are moving from on-prem. I also saw in an article you had last year with Future.com talking about this cloud-prem deployment model. Tell me a little bit about that. So there's multiple reasons cloud-prem is really cool.
Starting point is 00:49:15 We'll talk about on-prem first. There's a lot of databases in data centers, and we don't want to provide an on-premise version of our product, for a myriad of reasons. Number one being, we want to provide the absolute best database service we can possibly provide, and without our engineers and people being able to remediate the infrastructure, that's not possible. Like upgrades, for example. If you want the latest PlanetScale feature, you get it, because we rolled it out. If you have an on-prem PlanetScale, you have to beg your system administrator to go and upgrade your on-prem PlanetScale. We just don't want to get into
Starting point is 00:49:54 that world. We want the magic of a SaaS platform: you just log in, it's gotten better. I love to see the tweets where someone's like, oh, PlanetScale's got this new thing. It's like, great, without any user having to mess around. That's one reason we won't do an on-prem product. It also makes you ship very slowly. If you're doing on-prem and cloud at the same time, it's really difficult to match your... You're doing proper releases here, and then you're doing continuous releases.
Starting point is 00:50:21 It just becomes a mess and a bad experience for everybody. Then we see a huge trend of lots of people having databases in the data center and wanting to migrate to the cloud. There's very few one-hop ways to get into the cloud. So what a lot of people do, and you'll be familiar with this, is they take a data center architecture, they just scoop it up and drop it into the cloud. They're just like, oh, it's just other people's servers. It's just terrible. When you've got a database server, like,
Starting point is 00:50:50 next to you in a rack, or just two racks over, the latency difference and the availability profile is much, much different, right? Like, when you're running RAID on machines that you can go and recover and fix problems on, it's very different to, like, volumes that may never come back. Or across a network that you don't control. Because Vitess is built so natively
Starting point is 00:51:18 to the kind of hostile environment of the cloud, and because it's compatible with MySQL and we can just replicate from one to the other, you give this kind of olive branch from the old world, and you help them pull into the new world, this bridge from the old to the new. Whereas most people are migrating huge amounts of data out of the data center, they're having to re-architect at the same time to meet the needs of the cloud, or they don't, and things just get worse and more expensive, and they probably should
Starting point is 00:51:49 never have done it in the first place. That's why we have a lot of customers that move from the data center, because we give them a way of just replicating from one into a completely modern architecture by default, and it's obviously very, very attractive to them. But we also have customers in regulated industries, or finance that is regulated, but just generally they would rather data be secured in their own account. So what does that leave us? We could either spend, like, decades
Starting point is 00:52:19 of just like constant trust building to get to that stage or we can provide a cloud-prem solution. So what the cloud-prem solution means is you give us a sub-account in whichever cloud you use. Our back-end gets provisioned there. You still use app.planetscale.com to create deploy requests, to kind of delete a branch or whatever you need to do, look at your insights, but the data lives inside your account.
Starting point is 00:52:46 So the customers love it because their security teams, they're like, yeah, it's all inside our cloud account. We can monitor access. We can see what's happening. And then the operators and the developers, they love it too, because they get a fully managed database service. It just feels like another part of the AWS ecosystem,
Starting point is 00:53:05 but usable by everybody. Yeah. Do you think this cloud-prem idea, is it a particular moment in time, where you have these on-prem companies and they either have the regulatory requirements that you're talking about, or maybe just the cultural desire to be able to see it and touch it in their AWS account? Or do you think, like, in 30 years, cloud-prem is still going to be a thing that people want, where they want it in their account, their VPC? I think they're going to still want it. They also have massive commitments.
Starting point is 00:53:36 Some of our customers have a billion dollars of commitment to Amazon or whatever, right? Like, this helps, right? The more they can put into their account, the better they can get in terms of discounting. And also, when this is possible, there's not going to be a need to change. Like, if they're getting all the benefits of cloud,
Starting point is 00:53:59 it still runs inside their account, for, like, security. You know what I mean? So, yeah, I don't know, it'd be nice, but I can't imagine. Tech moves wonderfully quickly in some areas and incredibly slowly in others. It's gonna be a long time to convince government agencies or banks to store their data in just anyone's. And quite honestly, I would rather not have my data just shoved in some random database. Yeah, and there's a lot of them out there. So we go very heavily on trust. We have all of the certifications, and we make
Starting point is 00:54:39 trusting us really easy with all of these various layers and primitives. And a lot of the features you don't see us tweeting about are larger customers' experiences. We have a lot of features around security and securing people's data. Everything is encrypted at rest, in transit. You can't connect to a PlanetScale database without a key. Everything is incredibly tailored towards security being one of our number one jobs. When you're running it in that cloud-prem model, are you running your own Kubernetes cluster? Will you do it on their Kubernetes cluster if they have one?
Starting point is 00:55:15 We want the account to stay isolated. So we don't run on their Kubernetes. It will usually run on EKS, or whichever cloud-equivalent primitive is there, but it's isolated within that account and orchestrated, so they're not really supposed to touch it. Like, it's not theirs to try and administrate. The operator itself handles that. It still remains highly available, all of the great things about the self-healing nature of Vitess, but the residency of that data is with them.
Starting point is 00:55:52 Yeah. What about, I guess, multi-tenancy generally for PlanetScale? So everyone gets their own Vitess cluster. Are those running on shared Kubernetes clusters, or things within your own infrastructure? It depends. Yeah, it depends which. So they're all obviously very isolated. The chances of a breakout are next to zero, right? Like, there's no way people can arbitrarily run code or break out of the container isolation. If you just sign up and you run Scaler or Scaler Pro, you're going to be defaulted into multi-tenancy, but again, exceptionally isolated. But you can actually, when you talk to us through sales, you can have single tenancy, where we spin up an entire stack just for you. And then, obviously, our managed customers that are using the cloud
Starting point is 00:56:45 architecture, that is obviously also single tenant. Yeah, okay. Let's switch to pricing a bit. So I know you have the Scaler model that's based on, like, rows read and written, sort of like a DynamoDB or FaunaDB type there. And then now, more recently, you have the Scaler Pro that's more like, I'd say, the traditional, here's how much compute and RAM you have. Yeah. Was this something customers asked for, like people didn't want to make that leap yet? Or how did this come about? The serverless pricing is very exciting to a certain generation of engineer and horrifying to others. We started out there with serverless pricing for a number of reasons.
Starting point is 00:57:39 It's where modern generations are going, right? The serverless crew, they need a database that can scale to crazy heights, unlimited, all the things we do. However, we found when you went to our pricing page and you were coming from a more traditional architecture, say running on RDS, you'd look at that pricing page and be like, oh, there's nothing here for me. Which is not true; there's lots there. So we wanted to communicate that, like, if you look at your RDS database and it has this many resources, you make this many resources on PlanetScale and it will be fine. And this is roughly how much it would cost. And it doesn't get talked about in serverless much,
Starting point is 00:58:16 but truly, businesses don't buy that way. Pretty much no businesses buy that way. Like, they have a budget. You budget for your infrastructure at the beginning of the year, and if you're paying per request, budgeting that is so difficult. The pressure that puts back on the engineering team, it's like, well, we need you to ship this thing. Well, I can't plan how many queries this feature outputs ahead of time, right? People want to set a budget, buy that amount, and then that's it. And normally that's doable and much easier when you're doing resource-based pricing, which is what Scaler Pro is. And actually, we've
Starting point is 00:59:02 been really surprised. It's only been out for three weeks, and it's mind-blowing to me how many people have bought Scaler Pro databases. Yeah. Do you think we'll move more towards the usage model, or do you think this is always just gonna be the case of, hey, people want some predictability, especially about these large infrastructure costs like your database? I think, you know, we may get to usage fully everywhere, but not for a long time. It's like where the abstraction layers are, right? I mean, this new generation of developer, if you've been a developer for, like, two or three
Starting point is 00:59:39 years, and this is not pejorative, by the way, a lot of people say these things negatively, I don't think they're negative, I think it's progress that this is the case: people have never racked servers. They don't think in CPUs. Like, really, what does a vCPU mean when you're buying a managed service? It doesn't always mean something to everybody. So I think as that generation grows and matures and becomes leaders themselves in software, we'll see this proliferation of usage-based, but we'll see tooling come alongside that will help people. You'll get tools to help predict your usage. And there's all of these edge cases. Like, autoscaling is almost mythological in some ways. Like, it's not truly real, right, in a way that people believe. And you can't paper
Starting point is 01:00:27 over that with certain amounts of pricing. And that dream of, oh, my infrastructure can burst to five million QPS and back to zero, it's not real. Like, no software has achieved that. And the serverless pricing kind of hints at that being possible. But maybe as services just get more and more abstracted and talk less about the underlying infrastructure being servers, things will get more usage-based. But right now, it's just kind of a sensible abstraction that people remember and understand. Yeah. Is there an architectural difference behind the scenes when someone's signing up for Scaler versus Scaler Pro? Like, do you have to set up that Vitess cluster differently? No, no, it's just how you're paying. Yeah. Whichever, and we can help you figure it out. Some people mostly figure it out on their own,
Starting point is 01:01:23 which is cheaper, right? Like, some people have constantly, randomly bursty workloads, and it's good for them just to provision a bunch of hardware that's way above that burst. Other people have workloads that don't burst as much, that are lower traffic, or are just very consistent and predictable, and it might just be cheaper to pay the Scaler way. Can I switch my database from one to the other? Yep. Gotcha, cool. Yeah, and we have even seen people switching, right? Just to see how their bill, I guess they're checking how their bill shakes out. If they just pay for the hardware to be sat around, it will save some people money. It will be more expensive for others.
Starting point is 01:02:06 We've proactively worked with customers to make sure they're on the cheapest and fairest version. Yeah, sure thing. Okay, I want to switch gears a little bit into some, I don't know, softer stuff, a little less tech. But, like, you've written on the importance of maintaining a high-performance culture. And I think it's hard to understand
Starting point is 01:02:25 the culture of a company from the outside, but you can look for clues on whether they're high-performance or not. And I think some good clues from PlanetScale would just be, you know, the cadence of shipping major features like Boost and Insights, things like that. I would also say, just, like,
Starting point is 01:02:38 from what I can see on the outside, really, really good reliability. Like, you don't see the Hacker News stories about PlanetScale that you do about some other providers. I would also say just a really strong and delightful design aesthetic, like, around your site and in your social stuff, all that stuff. And then just really good educational content, whether that's
Starting point is 01:02:57 regular blog posts, or Aaron Francis's course, all that stuff, I would say those are really good signs. So how did you sort of get that or maintain that? Or what principles do you have there for other people looking to do something similar? If you want a really high-performing culture, it has to be an active choice from the top. Because otherwise, you can't push it,
Starting point is 01:03:18 and no one else can push it inside your organization. It has to be not a nice-to-have; it has to be a non-negotiable part of the culture. Like, we all have a line that people can't cross in terms of how they speak to each other, how they behave. Every company has that. But every company also has a talent bar and a performance bar, and you just have to set that, and it's completely arbitrary how you set it. Like, literally, it comes down to just what you're willing to tolerate or enforce. You just set that high, and you try not to deviate from it at any point. Every company I've ever seen
Starting point is 01:03:57 that's gone off the rails, it's because they lowered the talent bar. Like, average people with too much time can just run amok around an organization and ruin it. Like, everyone's worked on a team with that lazy person. It undermines you. You don't then follow your passion and work as hard as you wish you could, or want to, because it's really hard to know that someone's going to just share in the win. One of our company values is no passengers. We mean that. We're here to build a software company and a business, and be wildly successful. I want everyone who works at PlanetScale to be wildly successful, and that means you can't tolerate passengers and people sitting around. And so you have to be very careful, and it has to be part of the culture. And we get this
Starting point is 01:04:46 feedback routinely from employees they say yeah i mean it just feels good knowing that we really try and address any problems that come up and we don't tolerate messing around it's not not what we're here to do it doesn't serve our users talked about our reliability we take things like that extremely seriously and to do that you have to have exceptional people and we work really hard in keeping and retaining successful and exceptional people has maintaining that talent bar is that hard or just like the fact that you're working in like enormous scale hard technical problems like is that is that easier than than maybe other companies have it it's hard to start but if you
Starting point is 01:05:33 have problems like these types of problems interest smart people usually right they gravitate to difficult infrastructure problems and databases are very very difficult they're they're extra difficult when you try and you're very complimentary about our taste and design and our style most we see most database companies they do the back end piece i'm not diminishing how difficult that is but then they pat themselves on the back and don't go far enough to make it we try both we want both have to have that that rock solid foundation but it has to be delivered beautifully and so it makes the challenge even harder people love that challenge it's it's tough it's fun so that attracts good people and when good people work somewhere and they enjoy it they tell other
Starting point is 01:06:20 good people who come along as well every time we open a role we just get an incredible amount of of people come and apply to work here and because you get that kind of reputation but in the long run i think it's the laziest and best way to run a company like what what could be easier than just a small group of incredibly talented people i see companies that have nowhere near the size of customer we have that have got engineering teams four or five times the size of ours what they must just be a flabby wet wasteland of nonsense to have that many people what are they doing like how many people work for planet scale now we have 85 people the whole company under 100 that's like given some of the workloads
Starting point is 01:07:06 you're running um and just the shipping cadence that's that's pretty most people would be like they i think most people want to run giant big organizations and puff their chest up and i am more proud by every dollar we make uh per head right like i want the best ratio of dollars per head of any company in the world is my would be my my mission it's just it's the best way of doing it like you you produce better products um and you you you produce better experiences yeah the company will grow and it will continue to grow but very very deliberately um and at the end of the day people have to judge us by our service and the quality of the work that we do and i think the easiest way to get that done is have fewer people fewer heads in the room yeah you've published some some
Starting point is 01:08:00 notes on how you do management at planetScale. You've also mentioned that like managers ruin companies. How does it, how, how does management ruin companies? Is it, is it by pulling in average people or tolerating average, average results or what does that look like? A manager can wage a war of attrition against a, a company, no matter how large the size is. They're normally not busy enough they're normally not experts or craft experts of what they do they don't understand the work they try and measure
Starting point is 01:08:34 everything by like these like impact and i know people are gonna be like what it's not about it's definitely about the impact of the work but But another one of our company values, experts leading experts. You have to be an expert in what you do. All of our managers are outstanding. And the reason they are managers is because they love the work that their team does. And they know by being a really good manager, more of that work gets done. Just a proxy for doing more of the really cool thing. When you get non-practitioner managers and they're just professional man they're just the manager class they just do management your org just goes into
Starting point is 01:09:12 disarray because they can't produce the work so then they have to control people that do produce the work so they have to make up arbitrary evaluation criteria for those people which leaves those people feeling disenfranchised if you work for someone that doesn't truly understand the how you reasoned around something or made the trade-offs you make you just get a little pat on the head and then asked what impact you had i've seen some incredible engineers at companies just get like completely railroaded because didn't their their work couldn't be quantified by their boss and that's tragic
Starting point is 01:09:47 and you just see a lot of companies just completely fail I always upset people when I say when I share my opinions of management on Twitter and people get really grumpy but also a lot of people agree and yeah I have my opinion
Starting point is 01:10:05 yeah yeah when you talk about experts leading experts like as you've continued to climb up the executive ladder like vpng and then you know chief product chief uh chief executive like how do you still stay sharp and technical do you do some hands-on stuff even if it's in your spare time do you just keep up with design docs like or do you just know know your know your spots more and where you you aren't leading the experts so i would yeah i i would would hate to horrify our engineering team by submitting any of my work to i'm not right so here's the model ways so i know i try and acknowledge the things i don't know right like can, I do not spend time in the weeds technically as much as I could do at PlantScale
Starting point is 01:10:50 because I have other things I should do, right? There's, we're here to build a business. There's unique things about my role that I should do, but I gain a technical appreciation. I love technology. I still code in my free time for little personal projects or build silly things or websites or just automation for my home is one thing I mess around with. So I still write code at least once a week.
Starting point is 01:11:16 But you've got to be careful. There's another profile of like I was technical once style leaders that were like, I was an engineer once. And they try and like write code for their team. it just becomes like this embarrassing i try not to do that i also just hold technical conversations um like our vp of engineering is outstanding where he communicates um the changes we're making and you can just track along once you know this stuff once you kind of track along with the changes you understand the trade-offs um and the other thing is the role i have to play um with the organization is understanding that the very technical things we have to do for a database then we have to translate that into design and visual design
Starting point is 01:12:01 and marketing and brand and because i'm always in the kind of middle of that i very much appreciate brand i very much appreciate our marketing and and beautiful design and again that only comes from it being cared about at the top because otherwise very technical companies you can tell that their designers are getting steamrolled um and and so you have to hold all of those things in balance and appreciate all of those things which means you think about them a lot and you learn about them a lot yep is that something you've always had had natural like i i agree like planet scale has really good design for an infrastructure company and if you told me like hey former mysql dba is
Starting point is 01:12:43 like their ceo i i just like wouldn't expect great design or or like no offense to you you know but like you you really have a great design aesthetic and like how much is that something you've had a talent for that you've picked up over the years or you just like are able to find great designers that can that can help pull that out because i've worked at some technical companies that did not have uh that sort of that sort of feeling i've always appreciated design and style ever since i was young uh i did best at art at school i've loved art my house is absolutely teeming with art i've always loved and appreciate aesthetics and style and fashion um even though nowadays all i wear is black t-shirts and a database hat but i do um i've always had a strong that is great fashion
Starting point is 01:13:32 right yeah maybe maybe but i've always just always appreciated like i you know i love bags for example a ridiculous collection of bags and shoes and all these various things i just love aesthetic things things that you know there's things like the iPhone, right. Which is just like a technical masterpiece presented so simply. I've got this obsession with power and simplicity and how hard it is to present incredibly powerful things simply. And it's always just fascinated me. And then GitHub fully solidified this for me,
Starting point is 01:14:04 that it is possible to build a thriving and exciting company while caring about aesthetic the early engineers and designers of github had the most perfect taste i don't think they could quantify what that magic was because you don't have to when you have taste um and and seeing the lengths that we went to to not ship things that were ugly or shoddy was truly inspiring and we carried that over the planet scale but it's something we all really care about and it's also again why you have to be careful with the talent bar because you can't democratize taste you just that's why i argue with people on the internet it's pointless because they just don't see it the way you see it right you just have to hold it in front of their faces like is this
Starting point is 01:14:53 ugly or bad yes or no right and and and some people want to hide their product in i mean i'm just very proud of plans girl we we can take you to pet petabytes of data and you'll never have to write yaml i mean that's some like the things people put their users through is abysmal um and we just try really hard not to do that and over the long run that that really pays off and it's put my sequel in places that it historically never would have been you know i see well first of all i see our website like very often you see these like top 10 best websites or technical websites it's like linear us notion stripe github you're just like immensely proud and and and we have a technology that is most relevant was started
Starting point is 01:15:39 off as being most relevant to giant enterprises with giant scaling problems. And the fact that now we've got hundreds of thousands of developers using the product for the most ripping-hot, cool projects is because we delivered incredible developer experience, and taste, and style. And that's how things are done nowadays. Yeah, yeah. That's super inspiring. Cool. I love this. I want to close off with a fun segment. So, I like your Twitter persona. It's very much, like, a
Starting point is 01:16:12 blunt, say-it-to-the-point persona. It's been on this podcast as well. But what I want to do is, I'm going to name a competitor, and I want you to say one nice thing about each one as I say it. So let's start off with an easy one. Say one nice thing about Postgres. Incredible community. We talked about this a little bit before we got on, but why do we see so many more managed Postgres providers, or forks, or Postgres-compatible things compared to MySQL, even though MySQL runs so much more of the internet, ranks higher in, you know,
Starting point is 01:16:50 DB engines rankings and things like that. I think there's a lot of, well, you see, I think it's because people want to consolidate, they want to consolidate around like a protocol. I don't, I think you're seeing a lot of Postgres compatible databases. I think that's the difference. There's very few that are running pure hosted community Postgres like Superbase does, right? A lot of it is a lot of new generation databases thinking that, well, let's go where it's most applicable to have the kind of compatibility,
Starting point is 01:17:23 and they mostly choose postgres right so like they'll they'll then implement the parser or they'll implement the wire protocol um the thing is it's heavily fractured and it makes choice really really difficult and i think it's something that is probably something to be concerned about in the long run for the community because the postgres community is outstanding but if it's just getting torn in millions of directions because every company cynically just wants to use the protocol because it's popular i don't know what that makes postgres in in a decade's time right especially as we shift into the cloud so i think it's it's very interesting but they have a great community, the plugins community is fantastic
Starting point is 01:18:05 but they've just made trade-offs that I wouldn't make building a database yep okay, alright next one you have to say something nice about, Amazon RDS the payments always work what about Amazon Aurora good swing at a fundamentally bad architecture
Starting point is 01:18:27 that wasn't a night that wasn't a compliment sorry like it's a good swing it's like it's a good swing yeah at a certain level i think it's got a number of very nice features or when you get to the upper limits it's horrible though can you tell briefly like what what's that architecture that kind of shared everything across the file system is like great when you want an instant read replica, when you want that 16th in instant reader, I put a linear in agony because it doesn't scale past like 16 nodes or whatever. Or when you want to scale writers,
Starting point is 01:19:02 it's single writer or when you want the latest version of my sequel it takes forever for them to update it because they're not running real my sequel you know what i mean it's like these little things like they do a lot for you it's like a lot of really cool stuff but then the edge cases are brutal like like we want a customer that found out the wrong way that aurora cross region failover is a one way it's like the documentation they bury that you know what i mean it's like does aurora have multi-reg are one way it's like the documentation they bury that you know what i mean it's like does aurora have multi-region yes and everyone's like surely that would be two-way right and and then like find out it's not you know what i'm trying to say like it's just
Starting point is 01:19:35 all these like weird edge cases you get with the file systems like that it's not like we don't have edge cases we just try and communicate them a bit more loudly yeah Yep, yep, sure thing. Okay, another sort of shared everything infrastructure, at least shared storage, Neon, which is serverless Postgres. Again, I think it's an interesting swing at an approach I wouldn't take.
Starting point is 01:20:00 What about last one, the sort of like edge-based SQLitelite databases so torso light fs on fly i think these are really exciting again wouldn't use them for a big serious application but there's a lot of smaller applications where people should just be using sqlite so i think it's it's yeah it could be really interesting there's like a lot of database abuse that happens to people wedging models into poor traditional databases where we have to, we have all of our constraints of high variability and data consistency, all these things that probably don't need to be in SQL
Starting point is 01:20:37 light. Sam Bracegirdle Yep. Yep, absolutely true. Well, Sam, I appreciate you coming on. It's been a great conversation. If people want to find out more about PlanetScale, about you, where can they find you? At ISamLambert on Twitter or PlanetScale.com's blog is probably the best way to see what I have to say. Awesome. Yeah, I highly recommend a follow for Sam on Twitter if you want some, you know, just like hot takes and real talk on databases and also just like good news from seeing the cool stuff that comes out from PlanetScale.
Starting point is 01:21:11 Thank you. Sam, thanks for joining today. I really enjoyed the conversation. Thanks, Alex. Thanks for having me.
