The Changelog: Software Development, Open Source - Taking Postgres serverless (Interview)

Episode Date: October 14, 2022

This week we're talking about serverless Postgres! We're joined by Nikita Shamgunov, co-founder and CEO of Neon. With Neon, truly serverless PostgreSQL is finally here. Neon isn't Postgres compatible… it actually is Postgres! Neon is also open source under the Apache License 2.0. We talk about what a cloud-native serverless Postgres looks like, why developers want Postgres, and why, of the top 5 databases, only Postgres is growing (according to DB-Engines Ranking). We talk about how they separated storage and compute to offer autoscaling, branching, and bottomless storage, and we also talk about their focus on DX: where they're getting it right and where they need to improve. Neon is invite-only as of the recording and release of this episode, but near the end of the show Nikita shares a few ways to get an invite and early access.

Transcript
Starting point is 00:00:00 This week on the Changelog, we're talking about serverless Postgres, and we're joined by Nikita Shamgunov, co-founder and CEO of Neon. With Neon, truly serverless Postgres is finally here. Neon is not Postgres compatible. It actually is Postgres. Neon is also open source under the Apache License, version 2. On today's show, we talk about what a cloud-native serverless Postgres looks like, why developers want Postgres, and why, of the top five databases, only Postgres is growing. This is according to DB-Engines
Starting point is 00:00:36 Ranking. We also cover how Neon separates storage and compute to offer autoscaling, branching, and bottomless storage. And we also cover their focus on DevEx, where they're getting it right, and where they need to improve. Neon is invite-only as of the recording and release of this episode, but near the end, Nikita shares a few ways to get an invite and early access. A big thank you to our friends and partners at Fastly and Fly. Our pods are fast to download globally because Fastly is fast globally. Learn more at fastly.com.
Starting point is 00:01:05 And our friends at Fly let you run your app and your database closer to users all over the world. Check them out at fly.io. This episode is brought to you by our friends at Fly. Run your full stack apps and your databases, close your users all over the world. No ops required. And I'm here with Brad Gessler, who is helping to build the future Rails cloud at Fly.
Starting point is 00:01:37 Brad, what's got you excited about Rails on Fly? It's no secret that Rails is this really productive framework and application. We've also seen that happen. There's a bajillion different hosts that you can choose from out there that all make it really easy to deploy your Rails applications. We've had these for years. There's nothing really magical about that anymore. It's just, this is what we'd expect.
Starting point is 00:01:57 We want to type a deploy command, and this thing ends up on a server somewhere. The thing that I think that sets Fly apart from all that is it scales. It has so many scaling stories. It has, again, the table stakes stuff. Oh, wow, you can add more memory to a machine. All those things you would expect from a hosting provider. Again, Fly, you can scale out. You're going to have customers that live in Singapore,
Starting point is 00:02:19 that live in Frankfurt. You need to get servers there. Fly lets you do that. Again, with just a few commands, you can provision all these servers in these different parts of the world. And then the real magic with one command, you can type in fly deploy
Starting point is 00:02:32 and you have all these servers provisioned around the world. They just work. People hit yourcompany.com and they're hitting the Frankfurt server and the same person in Singapore is typing in your.com and it just works and they're hitting your servers in Singapore. So this thing scales out beautifully, which is really important,
Starting point is 00:02:48 especially if you're starting to run turbo applications or turbo native applications where you need that really low latency. Your application needs to respond to these users in under 100 milliseconds. Otherwise, to them, it's not going to be instant. They're going to be waiting. It's important to be fast, and Fly makes that possible. The reason I joined it is because of this kind of global magic that we're going to be shipping. And that's something that I want to bring to Rails developers all around the world. That's awesome. Thanks, Brad. So the future Rails cloud is at Fly. Global magic is on its way.
Starting point is 00:03:21 Try it free today at fly.io. Again, fly.io. All right, we have Nikita here to talk about serverless Postgres, a hot topic these days. Welcome to the show, Nikita. Thank you. Glad to be here. We're happy to have you. I think we last talked Postgres with Paul from Supabase. And in that conversation, we started talking about what would a cloud native Postgres look like? Or maybe what would a serverless Postgres look like?
Starting point is 00:04:17 And he said a lot of the same words that I'm reading on your guys' homepage. You are with Neon CEO, Neon.Tech. Very cool technology out there that's still getting started. Do you want to tell us from your perspective what serverless Postgres means? Well, absolutely. I think there are several parts to it. And the first one starts with user experience. When you go and provision Postgres anywhere else,
Starting point is 00:04:42 today you maybe sense AWS Aurora serverless, you go and choose the size of your instance. And then you are part of what is called a subscription pricing model, where you say, well, this is an instance of size, you know, small to large to extra large. And this costs you X amount of dollars per month, right? This is called subscription-based pricing. You're committing to a certain size and then what you're paying for. In this serverless world, you don't choose the size, right? You just say, I need Postgres. And then the system right-sizes the amount of resources that you consume. And all you get is a connection stream. And now you're just connecting your app to the database. And you don't need to think about sizing at all.
Starting point is 00:05:31 And you don't need to think about the fact that, you know, you're paying something that you're not using. And that's what it's called consumption-based model or consumption-based pricing. Right now, you know, I push the button, you've got the connection stream, and whatever you use, you're paying for. Whatever you're not using, you're not paying for. Where it's getting super, super convenient is in the various development, staging, side project environments.
Starting point is 00:06:03 Usually you have a production database and that powers your app. But then you have, I don't know, tens potentially databases out there for various stages of your environments. And so if your environments are different, then your test coverage is not the same, the properties are not the same. And if you make them all the same,
Starting point is 00:06:23 then you might be spending a lot of money by having, you know, full copies of your production environments for various parts of your development process. So that's, I think, what fundamentally serverless means, there are lots of shades of gray to it. And serverless typically becomes a part of a software infrastructure architecture to deliver on site on all such properties. So serverless isn't new conceptually or even, I mean, it's newer in the market, but serverless things have been out there for a while.
Starting point is 00:06:57 From your perspective, as somebody who's now building a tool and a business in the serverless world, has adoption been as expected or has it been slower? Are people moving to serverless things or is it mostly like small and indie people getting started? Or like, tell us what you see. Well, I think it depends on the stack. And databases are usually kind of the last one to the party. And the reason to that is it takes a good amount of hardcore engineering. The development cycle is longer
Starting point is 00:07:25 when it comes down to databases. But let's say in the front-end world, people are there, right? If you look at platforms like Vercel and Netlify or Cloudflare Workers, this becomes the dominant way of deploying front-end code. It's completely serverless. Your JavaScript project is packaged, passed into the
Starting point is 00:07:46 platform, and deployed around the world in the CDN-like manner in multiple data centers around the world. Traffic is routed to the local data center, and that drives latencies down. Then there's the backend code and the database. When we start thinking about the backend code, we're seeing somewhat similar dynamics. My favorite company here is Fly.io. You should have them on a pod if you haven't already. We know them well. You know, similar things, right? So you deploy your app into Fly and they are able to deploy that app around the world. They don't do serverless, but I think they will over time. They already have machines that can scale down to zero and stuff like that.
Starting point is 00:08:26 So now the question is, can we have that in a completely elastic way over time? This scaling down to zero is like a big deal for all the things that we've talked about before. Finally, there's the database, right? You have front-end, back-end database. That's the majority of the apps that need all three. Now, the tricky thing about databases is, well, you either build a completely new one from scratch, you know, DynamoDB or something, or you take advantage of something that is extremely popular like Postgres.
Starting point is 00:08:57 But then it's much trickier to make it serverless because Postgres is a package. It has storage, compute, metadata, all in one box. And then in order to make it serverless, you need to cut the system in the right way. And what we did, we separated storage and compute. The adoption has been phenomenal. And when we announced the system just in June, we now have close to 10,000 users coming into the platform and signing up for the
Starting point is 00:09:26 system. And we haven't even lifted the invite gate. So we are onboarding people in patches. And we're seeing like a lot of interest of people coming into the platform and using the system. Granted, all of that is free right now, which is attracting a lot of tire kickers and people who are just trying things out. But we are in communication with those folks. They're filling up surveys and we are engaging with them directly. And so we see a lot of excitement around serverless. That excitement can be probably split in three categories.
Starting point is 00:09:58 The first one is, I'm an indie developer. I just want something cheap or free or whatever. And some of that is a Heroku fallout as well. Another use case is, well, I'm doing a lot of software development. I need this developer environment. So that's where scaling to zero, branching is another thing that we bring to the table, allows you to very easily create developer environments. And don't sweat bullets that you can just like
Starting point is 00:10:25 overcreate those developer environments and forget to turn them off because they all scale to zero. And finally, we see professional, like bigger organizations that are saying, well, we are an RDS, but like it's getting extremely hard to deal with Amazon. We just want simpler.
Starting point is 00:10:42 We need more reliable. And we need something that plugs in to the next generation infrastructure, which is the Vercels of the world, which is AWS Lambda, and you know, which is something like Fly as well. So that's where we see kind of the categories of people coming in. There are other serverless offerings on the market. I think namely PlanetScale and Aurora. When I started the company, I had a little bit of an insight into AWS Aurora. And they always track, you know, they build something and they see how much of an impact this is to the overall business. And when they ship Aurora Serverless v1, which is their first implementation now, they're on the v2, which, by the way, doesn't scale all the way to zero.
Starting point is 00:11:27 But that thing took off like there was no tomorrow for Aurora. So that was a big deal and a signal for me then figuring out how to build a dominant OLTP cloud database. It might be obvious why it took off, but in your opinion, why is this space in particular growing so fast? Yeah, I think it's friction and cost. Like it's as simple as that. And it's friction, it's cost, and then it's what people want. People want posters.
Starting point is 00:11:55 So there's this famous website for database people called DB Engines Ranking. And then if you like go on Google, type DB engine ranking, and you see what's going on in the top five databases, you will see that those top five are MySQL, Oracle, SQL Server, Postgres, and MongoDB. These are the top five databases in the DB engine ranking. Out of the top five, only Postgres is growing. So in addition to convenience and not thinking about sizing and provisioning and stuff like that, and cost. And cost comes mostly from the fact that you architected the system such that you never overpay for resources.
Starting point is 00:12:38 There's also, we're on the right trend lines with regards to Postgres. There's just more and more Postgres out there and people want postgres. Kind of reminds me of the GitHub analogy I had back, way, way back in the day with Tom Preston Warner. And this is like literally months after GitHub launched. It was a whole different podcast, a whole different Adam, a different era of life. But one thing Tom said about GitHub early on
Starting point is 00:13:02 about their success was it was permission to mess up. So if you reduce the friction and reduce the cost, it's your not so much permission to mess up, but permission to explore and to be creative. Because you can creatively use something serverless if it spins down to zero or virtually zero in the case of Aurora or whatever. That's the thing I think if you give developers that experience, then they're going to play more often. They're going to create developer environments. This works great here. Let's use it in production, obviously. But if you give people the option to have a better experience and play, cool things happen. You are precisely right.
Starting point is 00:13:37 And this feature that we have, well, the two features that we have on the platform, one is branching. So you can branch, and now that creates a completely isolated environment from your standpoint, right? Now you can read into that environment, write into that environment. You can put a lot of traffic onto this environment, and you will not impact your production branch. And so that's kind of one. That's permission to mess up, number one.
Starting point is 00:14:03 The second one is time machine. Right. Even if you messed it up in your core database, you can go back and restore it to what in Git would be the commit. And in the database world, in the Postgres world, that's called restore down to the LSM, which is which stands for log sequence number. Right. If you go and drop a table, drop table users, no recommending to do this to anyone, but in the world that you did with one command in Neon, you can roll back to right before when you drop that table. So that's all cool.
Starting point is 00:14:38 First of all, how about merging? We got branching. Can we get merging? Can I roll it back in? Yeah, let's go ahead and merge this. Yeah, yeah, merging is tricky. I think, so we're watching that space, right? So first of all, in merging, even before you want to do a merge, you probably want to do, you want to understand what changed, right? And then
Starting point is 00:15:00 in Git, there's a diff, Git diff. In databases there are new tools like data diff coming from this company data fold. It's an open source tool. And the other thing to understand, which is important, that in databases, there's data changes and schema changes. And oftentimes, there's a notion of a migration that Prisma, for example, has or various ORM have, where really what you want to do is to roll forward a particular schema into the production environment. So the workflow seems to be, the right workflow is the following.
Starting point is 00:15:37 Here's my production database. I want to build a feature that potentially changes and messes up with the schema. I'm going to branch that production environment. I'm going to make all the changes, which creates a test environment or dev environment, for that matter. I'm going to make all the changes in the test environment. In the meantime, your production environment moves forward. There are more and more changes that are coming in because your application is live. Then you diverse both on schema and you diverse both on data. But really, what people want to do for the most part is just roll on the schema, not the data. I think that is the workflow that Prisma supports. I think we will eventually introduce it into the
Starting point is 00:16:20 core system at Neon where, for every commit, we will be recommending developers to create a branch will integrate with all the platforms, like including GitHub as well, with GitHub actions and whatnot. And then the analogous of a pull request emerging the pull request would be merging the schema, but not necessarily the data. Makes sense. Makes sense. So elastic compute makes sense and scaling down because you have like ephemeral on-demand resource usage right like all of a sudden i have to answer a bunch of hdb requests and so my server has to do stuff and then everybody leaves and my website doesn't get any hits and i could scale that down with databases. If I got one gigabyte database,
Starting point is 00:17:06 it's just like, it's always there, right? I mean, all that data is there and I could access any part of it at any time or need to, and we don't know which parts. So I have a hard time with like database scaling to zero unless you're, I don't know, just like stomaching the cost? Or tell us how that works with Neon. Are you just stomaching the cost of keeping that online or are you actually scaling it down? We're actually scaling that down. Let me explain how this works and it may get quite technical. The first thing is what should be the enabling technology of scaling that down?
Starting point is 00:17:40 If you just kind of thinking, you know, how would I build serverless Postgres? And if you ask a person that is not familiar with database internals, they would say something like, well, you know, I would put it in the VM maybe, or I would put it in the container. I would put that stuff into Kubernetes. Maybe I can change the size of the containers. The issue with all that, as you start moving those containers around, you will start breaking connections because databases like to have a persistent connection to them. And then you will be impacting your cache. Databases like to have a working set in memory. And if you don't have a working set in memory, you're paying
Starting point is 00:18:19 the performance hit by bringing that data from cold storage to memory. The third thing that you will find out is that if the database is large enough, it's really, really hard to move the database from host to host because that involves data transfer. Data transfers are just long and expensive, and now you need to do it live while the application is running and hitting the system. Naively, you would arrive with something that you kind of propose, right? Let's just stomach the cost. There is a better approach, though. And the better approach starts with an architectural change of separating of storage and compute. If you look at how databases' storage works at the high level, it's a what is called page-based storage.
Starting point is 00:19:06 Then all the data in the database is split into eight kilobyte pages. And the storage subsystem basically reads and writes those pages from disk and caches those pages in memory. And then kind of the upper level system in the database lays out data on pages. So now you can separate that storage subsystem and move that storage subsystem away from compute into a cloud service. And because that storage subsystem operates is relatively simple from the API standpoint, the API is, you know, read a page right into a page, then you can make that part multi-tenant. And so now you start amortizing costs across all your clients. So if you make that multi-tenant and you make it distributed and distribute key value stores,
Starting point is 00:19:56 we've been building them forever. So it's not rocket science anymore. Then you can make that key value store very, very efficient, including being cost efficient. And cost efficiency comes from taking some of that data that's stored there and offloading cold data into S3. Now, then it leaves out compute. And compute is the SQL query processor and caching. So that you can put in a VM. We actually started with containers, but we quickly realized that micro VMs such as Firecracker or Cloud Hypervisor is the right answer here. And those micro VMs have very, very nice properties to them. First of all,
Starting point is 00:20:37 we can scale them to zero and preserve the state. And they come back up really, really quickly. And so that allows us to even preserve caches if we shut that down. The second thing that allows us to do is to live change in the amount of CPU and RAM we're allocating into the VM. That's where it gets really tricky because we need to modify Postgres as well to be able to adjust to suddenly you have more memory or shrink down to, oh, all of a sudden I have less memory now. And so if you all of a sudden have less memory, you need to release some of the caches and
Starting point is 00:21:12 release this memory into the operating system. And then we change the amount of memory available to the VM. And there's a lot of cool technology there with life-changing the amount of CPU. And there's another one called Memory Ballooning that allows you to, at the amount of CPU. And there's another one called memory ballooning that allows you to, at the end of the day, adjust the amount of memory available to Postgres. And then you can live migrate VMs from host to host. Obviously, if you put multiple VMs on a host,
Starting point is 00:21:37 they all started growing. At some point, you don't have enough space on the host. Now you need to make a decision which ones you want to remove from the host. Maybe you have a brand new host available for them with the space, but there's an application running with a TCP connection hitting that system. Storage is separate, so you only need to move the compute. And so now you're not moving terabytes of data with moving Postgres. You're just moving the compute part, which is really the caches and caches only. But you need to perform on live migration here. So that's what we're
Starting point is 00:22:12 doing with this technology that's called Cloud Hypervisor that supports live migrations. And the coolest part is, as you perform in the live migration, you're not even terminating the TCP connection. So you can have the workload keep hitting the system as you change the size of the VM for the compute up and down, as well as you can change the host for that VM and the application just keeps running. So yeah, that's kind of super exciting technology. So do you have your own infrastructure that this is running on? Are you on top of a public cloud? Or how does that all work? So we are on top of AWS, we know that we need to be on every public cloud.
Starting point is 00:22:50 And that's where the users are. Now, this question kind of hits home a little bit, the cost can be at least 10 times cheaper. If we use something like I don't know, Hetzner or OVH. And in our architecture, it's like super important to have an object store as part of the architecture. So Amazon S3. And in the past, there was no alternative to S3, like no real alternative. But just a few weeks ago, Cloudflare released R2, and they made a GA. And all of a sudden, you can put cold data onto R2. We still don't know what the real reliability of R2 is,
Starting point is 00:23:30 but I trust that Cloudflare will get it up there eventually. And that opens up all sorts of possibilities. The other one that we're looking into closely is Fly. We even have a shared Slack channel with Fly.io.
Starting point is 00:23:43 I think it's a fantastic company. And I see a day where Nian will be running on Fly infrastructure as well. Now, all that said, as of right now, we're only on Amazon and we'll be adding the other cloud. In which order and what's going to come sooner, Fly or Google, for example, I can't really commit to because we continuously evaluate. Yeah. So when you say move data off to S3, how do you deem data as cold on your customer's behalf?
Starting point is 00:24:17 Because that has to be some, there's got to be some smarts in there. Yeah, there's a lot of known algorithms and they're mostly caching algorithms. So it's already happening today a little bit in Postgres, right? There's a buffer manager or buffer pool, maybe mixing SQL Server or Postgres terminology here because my background is SQL Server. But the architecture is similar where the buffer pool has a counter for every page and it refreshes the counter if the page is touched.
Starting point is 00:24:52 And then the algorithm kind of sweeps the cache and decides which pages haven't been touched for a while, and then evicts them from the cache. Here, we added another tier in the remote storage. We also track pages, and you see which pages have been touched recently and which have not been, and then you offload those pages onto S3. There is a caveat. However, S3 does not like small objects, and a page is 8 kilobytes. So we need to organize those pages into some sort of data structure that will bucket those pages together.
Starting point is 00:25:22 So when we throw those pages onto S3, we throw a bunch of them together in a chunk. That data structure is called an LSM tree. And that's the implementation of LSM tree that we built from scratch in Rust. And that's integrated with S3 and offloads colder data to S3. It's kind of like several use cases. One use case is like very large database. You know, if you have a very large database, chances are large portions of that database are never even touched. So over time, you know, some of that data, maybe it's the data from like, I don't know, five years ago, and you don't really need it.
Starting point is 00:26:00 But you're keeping this there because like, oh, it doesn't cost you much. And it's better to have them for occasional use than not have it all or put them in a different system. And the other use case is you have a big fleet of databases. A lot of them are scaled down to zero because, you know, you just have them for occasional usage. And now if you keep them hot, that will start to add up both on the compute side and on the storage side. Storing all that data into SSDs is a very different economics than storing all that data in S3 in a compressed form. So these are the second place where integration with S3 can drive much better economics. Hey, friends.
Starting point is 00:26:56 Influx Days is back. This is a two-day developer conference from our friends at Influx Data and is dedicated to building IoT, analytics, and cloud applications with InfluxDB. It is happening November 2nd and 3rd. We'll see you next time. and digital businesses. If you're new to Influx or you're building advanced time series applications, Influx Days sessions and trainings will give you the skills you need to support your individual builder journey. Here's the breakdown. Two free days of virtual user conference, watch parties in SF and London, free training on Telegraph open source server agent,
Starting point is 00:27:40 paid training on Flux in London. Again, this is all happening November 2nd and 3rd. Learn more and register at influxdays.com. Again, influxdays.com. so nikita the this is obviously groundbreaking right to get serverless postgres you mentioned the architecture of separating compute from storage and you got developer experience which is crucial right built for developers made for developers is kind of key that's what makes this a hot space. How in the world do you get the recipe right, though?
Starting point is 00:28:29 You've obviously cracked the nut, but how do you get the seemingly infinitely hard infrastructure aspects of it to build it and then build it and then actually make it work? Yeah, so while some of that comes from experience, so I spent a good amount of time in the database space. And single store is a database that built every part of the database stack from scratch in C++, including separating of storage and compute and including a hardcore analytical query processor, including distributed transactions and stuff like that. So in a way, there's a lot of lessons learned both from SQL Server, from Tingle Store, from
Starting point is 00:29:10 reading all the papers, and then actually the part of walking the walk and doing that. So there isn't much magic in this, actually. You need to have a strong team that deeply understands the underlying system. In this particular case, this is Postgres proper, plus the new storage that we're putting together. There is a continuous process of building the team and shipping software. And that is set the goals, build the thing, make sure it's robust and reliable, put effort into testing the system, put effort into software practices that are around that, and be confident in the architecture itself. The confidence, because that's the hardest thing to change. The hardest thing to change is the architecture is wrong and you need to change the architecture. Now large swaths of code need to be
Starting point is 00:30:03 rewritten. The other thing is too hard to get out of the pickle if you got the quality wrong. The quality is wrong. Then it takes, you know, you keep fixing the bugs, but they don't seem to stop. Yeah, it's really no magic.
Starting point is 00:30:19 It seems magical. It seems magical from the outside. You know, SQL Server, Postgres itself, you know, any large system project, I think, is going through that. There's a certain amount of kind of maturity that the project needs to get through to achieve dominance.
Starting point is 00:30:34 The faster you get through this, the better. The more people use it as you do this, the better. And that's why we rolled out the system for people to use for free because now the stakes are lower and then we are fixing things now the stakes are lower. And then we are fixing things on the back end very aggressively. Are you running a Forka Postgres or is it stock Postgres?
Starting point is 00:30:58 So it's stock-ish. I guess it's stock Postgres with a caveat. So what's the caveat? Well, we have to change Postgres in a very surgical matter and specifically where Postgres reads a page from disk. Instead, it needs to read a page from our remote storage by making an RPC call. And when a Postgres writes into disk and sends what is called a wall record, write ahead log record, instead of writing to disk, it needs to send it over the network into our service, into our multi-tenant service. Those changes are not huge, but they're there. We've split those changes into five separate patches that we are submitting upstream. They have not been accepted yet, but we are working with the community for it to all get upstream. And once those patches make it upstream, I'm really hoping for Postgres 16. If not, that will be Postgres 17. We're working with the
Starting point is 00:31:52 community on that. The community understands that we're not the only ones. There's also Aurora. There's also some projects in China that are exploring similar architectures, and those will benefit from this. I mean, it's not a secret to the Postgres committers either that separation of storage and compute is the right way to go into the cloud. So that gives me a good amount of confidence that the patches are going to be accepted, but I cannot claim or guarantee that they will be accepted because we need to get the buy-in of the community. There are multiple Postgres hackers on the Neon team, including one of our founders,
Starting point is 00:32:30 Katie Linacongas, who is a quite prolific Postgres committer himself. So he is spearheading that effort of packaging the changes that we made in Postgres and sending them into the community for the final acceptance. How much of your other work could potentially make it upstream or could potentially be duplicated effort as Postgres core team decides this is the direction that Postgres needs to go? Is there a lot of overlap there? It's actually relatively little, believe it or not. The storage part is a completely separate project.
Starting point is 00:33:06 It is open source. I wouldn't mind if it was a part of Postgres, but obviously that's a very long-term project and it needs to reach certain stability. If you look at the storage project on GitHub, which by the way is distributed under Apache 2.0 license, so anybody can do whatever they want with the code. It's a very actively developed project.
Starting point is 00:33:28 There are commits like, I don't know, 10 plus commits every day that are going into it. So I think building that storage by the Postgres team is off strategy for Postgres, or it seemed that way for Postgres proper. Integrating with a storage subsystem like this is absolutely on strategy for Postgres, or it seemed that way for Postgres proper, integrating with a storage subsystem like this is absolutely our strategy for Postgres. So if, what you're suggesting, the Postgres community realizes that, well, we want to have a distributed cloud-native storage system,
Starting point is 00:34:00 I think Neon would be the best candidate because by that time, it's a fairly mature system. It's truly open source. It's Apache 2.0 license. We can re-license into a Postgres license if Postgres wants that to happen. And that becomes a standard and a part of Postgres. Now, while that's possible, I think it's kind of unlikely. I think Postgres will continue, Postgres community will continue building Postgres and the Postgres engine and make sure that Postgres plugs in into Neon storage and they will look at it as kind of the ecosystem plate. In terms of how you patch Postgres, does it have to be a patch or could it be an extension?
Starting point is 00:34:42 Is it something they can live in? It's a mix. Yeah? Yeah, it's a mix yeah yeah it's a mix they there are five patches that and the reason there are five patches is they are just you know they're touching different parts so we're just splitting them it could have been one patch uh it would have been just bigger but that that makes it more palatable and then the majority of the of the changes, you know, you take stock Postgres, you apply those five patches, and you need an extension, the Neon extension. So that's how the overall system works. The extension from the lines of code changed has the most lines of code.
Starting point is 00:35:18 And those five patches are relatively small. So how much work would it take for somebody to stand up some of the open source stuff that you have? I mean, are the patches out there? I assume if they're trying to get upstream, they're somewhere to be seen. But is that possible? Like if I could stand up and run my own little neon cluster for people or something like that? Yeah, you can. Yeah, for sure.
Starting point is 00:35:38 So you will need Kubernetes. Okay, I'm out. Yeah. Okay, I'm out. You're out? No, I was just joking. End up. Keep going. I'm out. Yeah. Yeah. I was just talking. And I keep going. I'm supposed to be you can stand up on your laptop.
Starting point is 00:35:49 And so you will get branches if you if you did that. But if you want to stand up a service with like multiple computes and all that, yeah, you will need to burn IDs and it's all doable. But it will it will it will require some work. But consuming this, if you just want to consume it, that's trivial, right? You push a button and in three seconds you have Postgres.
Starting point is 00:36:07 Yeah. That's what I'm more likely to do, personally. But there are people out there who love to hack on these things. Yeah, yeah. And there are also larger companies. You know, for example,
Starting point is 00:36:16 we are talking to a Fortune 500 company which have, I think they're spending $100 million on Postgres just on the infrastructure underneath a year. Wow. And that multi-tenant storage approach and scaling down computes to zero can make a massive difference in their deployment.
Starting point is 00:36:37 And by the way, I don't think it's the only company that's doing that. There's a lot of companies that use a lot of Postgres out there. In that scenario, you said before there's parts of the database that don't get used. So it's like old data. So it's there in the database. How does that work to sort of take that away from the cache? As you said, evict it from the cache.
Starting point is 00:36:56 Would you just keep certain parts of the database alive, essentially? And some of it just goes off the cache? It's the data. Some of the data goes off the cache. So think about memory, disk, and disk, you know, fast disk, like SSD, and then S3, right? And those are the tiers. And the latency for accessing data in each of those tiers is different.
Starting point is 00:37:19 Now, you kind of have this experience by like scrolling Facebook messages with somebody you had talked to a long time ago and then going back into the history. Sometimes you get a spinning wheel and that's what happens. Loading spinners. Yeah, and they bring that data from, I don't know, an object store or something, from somewhere on disk. It's certainly not cached. It's not in a very fast storage medium right now. That would be the kind of experience your application will have. Certain queries will have added latencies to them when you start accessing the older data.
Starting point is 00:37:58 But I think that's the way to go. People think, well, can we partition the data and last month of data is fast and the rest of the data is kind of slower and then 10-year-old data is super slow. But the reality is the system can make those choices for you just simply based on the patterns of usage. Right. There's a counter, you said, on the page, right? Yeah, there's a counter on the page. So the algorithm built into Postgres is the best candidate to make that choice, essentially? On the Postgres side, yes. And then on our storage side, there's a separate one. And the difference, like I said, is we're combining those pages into what we call layer files.
Starting point is 00:38:38 In the pre-call, you mentioned AI. Is this the place where AI might make some better predictions in the future? Yeah, well, stuff's pretty simple, right? So basically where you want to use AI, or in some cases, just like machine learning, is when there are multiple competing things. For example, there are multiple caches that are competing for something. You can use control theory or you can use AI for deciding what goes out of the cache and what stays. In general, AI applications are split between, can I use it for my database engine? The most famous paper is like Jeff Dean's Learning V-trees.
Starting point is 00:39:16 And the paper made a lot of noise. But I think the practical usage of this paper is kind of zilch. Like it doesn't really make a huge difference and nobody implemented it in really big database projects. It's not in MySQL, PostgreSQL Server, Mongo. And the other place for using AI, and I think that's pretty
Starting point is 00:39:35 exciting, is in autotuning the database. So there's a startup by Andy Pavlo called Autotune. They are twisting database knobs and they can make your database, I don't know, probably up to 20% faster, maybe more for your particular workloads. Then AI can choose indexes for you and that's where branches could be a very cool thing where you branch, you unleash AI on the branch so now you're not worried that the AI is going to mess up
Starting point is 00:40:05 and take down your production database. In that branch, AI makes a bunch of changes like changes the knobs, changes the indexes. You can fork the workload and test the workload on the branch. And then you can be like, yeah, you know, like, makes sense. Do you want to send a pull request kind of thing? I think that's where AI is a lot more interesting than in like managing caches and deciding what stays in memory and what goes on disk. Because there's like caching existed for like decades, right? And there are classic algorithms that do this very well. Finally, the generative AI is fascinating. It feels like every day there's an AI breakthrough. That's where I think developer experience can be impacted, where you start generating SQL, or you start generating ORMs, or you start generating API endpoints, or you start generating some sort of backend
Starting point is 00:41:06 code, you're generating story procedures. Nobody really knows by heart the syntax of story procedures because you live in your primary programming language like Go, Java, JavaScript, and then you need to write a story procedure. You have to stretch your head. It's like, oh, what's the syntax exactly in PLPG SQL? So that's where AI can really help by generating some of those things. So replacing us app developers, not replacing you infrastructure developers.
Starting point is 00:41:35 I see how it goes. I see how it goes. Well, it's just the applications developers will go first. Eventually, we'll all go. We'll all go eventually. Yeah, fair enough. Fair enough. eventually we'll all go we'll all go eventually yeah fair enough fair enough so one aspect so you've got decoupled compute and storage and the other thing that i think about with regards to cloud native or serverless things is geographic distribution and so you've mentioned fly a few
Starting point is 00:42:02 times disclaimer fly.io is a partner of ours and so they sponsor some of our shows they may sponsor this episode i don't know it might i can't tell you right yeah but you know one of their slogans is run your app servers closer to your users right and that's also what you know netlify wants to do and what vercel wants you to do and what I'm sure Lambda wants you to do. This whole like put the CDN all around the world and then do your compute in the CDN edge nodes is like the new thing, right? But the database has always been still in some data center in Virginia, right? And so it's like kind of the mecca or the place where I've been waiting and talking to people is like, how can I get my database close to my users as well?
Starting point is 00:42:44 And even when I was talking with Paul from Supabbase, he was saying, well, that's a whole different thing from decoupling compute and storage is like geographic distribution, cap theorem, et cetera, et cetera. Curious your take on that. Like is Neon going to be Postgres serverless, but also running right there in your edge nodes? Or is it going to be, I mean, maybe for now, if you're on AWS, it's a possibility. I don't know.
Starting point is 00:43:05 Talk to that whole subject. Yeah, yeah. So of course it's a dream to be able to read and write in every region and the system magically figures everything out for you. Unfortunately, that's really hard.
Starting point is 00:43:19 And imagine at the same point in time in New York and Tokyo, you're modifying the same row, right? Because, you know, logically it's one row in the particular table. You're modifying and one sets it to two, the other one sets it to three. So which one? Merge conflict. Yeah. So there's a merge conflict.
Starting point is 00:43:40 But then it's a live application. Nobody sits there and not writing. So you can have a conflict and then so you have a process to figure that out. But who decides, right? Is that a human or it can be a human. Or you can say, oh, well, this row lives in Tokyo. So if you want to modify that row from New York, you pay the latency for modifying that row and New York and you Tokyo does not pay the latency because that row is closer to you. That's easy. Or it still requires to decide you to decide where this row lives. Now, there is actually a very practical solution to this. And as a database person, it pains me a little bit because how simple that is. But I think from the
Starting point is 00:44:24 practical standpoint, it will actually satisfy a lot of users. And I think that's what we're going to start with. And that's what Fly is doing as well, a little bit. So you split your queries into reads and writes. You say your primary write replica lives in a region. You let the user choose that region. You replicate it from that region to as many regions as you need. You actually unlikely need more than five, but you can go all the way to 26, which is the number of data centers in AWS, or it can go to 200 like Cloudflare. At some point, it will get tricky to replicate to 200. So you will need to separate replication from the engine as
Starting point is 00:45:05 well. But regardless, you can send reads to a local replica. But you need to understand that that replica will be behind of the master copy by X amount of milliseconds. I believe at some point, I know I haven't checked recently, but at some point Fly had a heuristic, you know, a few 400 milliseconds, less than 400 milliseconds behind, we'll send reads to your local replica. And this, in a way, dumb approach can surprisingly go very, very far. It will have side effects. Well, what's the side effect? Well, it's called read after write. So you write and you immediately read what you've just written. and then that thing might not have arrived to the local copy yet. So you feel like you wrote a number one and then you're reading that number and it's still
Starting point is 00:45:51 an old value like zero or something. That can be mitigated by messing with the proxy. The proxy can detect those read-after-write patterns and, in that particular situation, send the read into the right replica as well. And the more I'm scouting the market and I'm researching alternative solutions and, you know, Aurora, Ship, Multimaster. Multimaster works, Spanner, Ship, Multimaster. But the more I talk to people out there, I'm realizing that that simple and understandable paradigm is oftentimes more powerful because of its simplicity compared to all the paradigms of like, okay, we're
Starting point is 00:46:34 going to have, we're going to run a distributed consensus over three locations in the world, three to five locations in the world. And now either all of your latencies are very long, or you need to put some sort of machinery in place where you start fine-tuning. Okay, well, this data lives here and this data lives there. And if you let the system decide which data lives where, that introduces uncertainty. And it changes from having a simple solution like AK-47 into this sophisticated thing where people just stop understanding when the latencies are short and when the latencies are long. So what do we do internally at Neon?
Starting point is 00:47:15 I think we're going to ship what I just said, where we're just going to have multiple read replicas around the world. And our proxy will be routing traffic to the local replica soon. In parallel, we're working with a famous database professor, Daniel Abadi, out of University of Maryland. And he's a creator of what is called the Calvin Protocol,
Starting point is 00:47:36 which is the foundation of Fonigb. He is applying similar ideas into the Neon architecture. The difference is row-based versus page-based. That does require today, as we are halfway through the research project, that requires people to assign data to regions. And you can pose this as this thing called partitioning. So for every partition, you need to say, well, this partition lives here. The moment you do that, a bunch of things fall apart a little bit where creating an
Starting point is 00:48:08 index across all the regions becomes harder and stuff like that. So while I find this fascinating and I've spent like a decade thinking about it on and off, I think that simple, straightforward approach where you say, well, my primary is here and I'm going to have up to 20 replicas in the world can satisfy 99.9% of use cases. Yeah. It lacks a certain elegance, but it has a certain pragmatism that you're not the first person that said that to me. And I remember the first time I kind of rolled my eyes. I'm like, well, that's just like cheating a little bit, but it's defeating defeatist, right? Like it is kind of rolled my eyes i'm like well that's just like cheating a little bit but as defeating defeatist right like it is kind of defeatist but at the same time like i'm probably it's gonna it is going to be better for most use cases just not 100 right so yeah if you think
Starting point is 00:48:56 about something like uh you know e-commerce use case right you really really want your website to load fast right but that's a weak query most of them are yeah and when you display a counter like inventory yeah you might be 100 milliseconds behind but i mean that's okay you know there will be cases where the user tries to buy something and somebody else bought it across the world and that thing disappeared but that you can you know you can handle at the application level. And when you bind something, processing your transaction, right, you send in your write into the database.
Starting point is 00:49:31 Okay, sure. That takes 200 milliseconds. Fine, right? People will wait 200 milliseconds. 200 milliseconds is not that long, actually. If every page load is 200 milliseconds, that's a different story. If one thing is 200 milliseconds, and that's a write. If one thing is 200 milliseconds and that's a write or it's a purchase or it's a cart, that seems fine.
Starting point is 00:49:50 Now there's another interesting aspect here, and we're wrapping our brains around this one as well, which is if some of your database calls, meaning write calls, are incurring certain latency. Either it's cross-region or maybe even within region. If your web page or your backend has multiple round trips to the database, those things tend to add up. And at times, you want to run your compute right next to the database so those latencies are not adding up as much. So ideally, a web page could sell a Lambda function to the database. And in that Lambda function, there's a bunch of code that's running maybe in the JavaScript runtime. Or I'll tell you about some other ideas. And that you want to run right next to the database because that piece of code is like, well, query that table, get data from that table, query a few other tables, run some local compute,
Starting point is 00:50:57 do some more requests to the database. And if you go back and forth, those latencies will start adding up. So there is a reason to run some sort of language, language runtime, right next to the database, or there's a reason to give people access to potentially a VM compute right next to the database. I don't know which ones we're going to choose. Either we're going to have VMs, and we can say, well, push them, whatever code you want, into a VM that sits right next to the database, or we will run VM runtime, or we will run Cloudflare worker runtime, which these guys open sourced not too long ago. And it's not like our aspirations to not be the database, we're a database company. We just see this use case and we want developers to build the best possible apps. So having some
Starting point is 00:51:44 sort of execution runtime for arbitrary code right next to the database seems like to make a lot of sense. And so that's another thing that we're actively exploring. This episode is brought to you by Retool, and they have a private beta ready for you to check out. This is the fastest way to now build native mobile apps for your mobile workforce. There is no complex frameworks anymore or tedious deployments. You can build mobile apps with what you already know, like JS and SQL. This is all in the browser, no code or what they call low code.
Starting point is 00:52:32 Join the wait list. Head to retool.com slash products slash mobile. The link will be in the show notes. Again, retool.com slash products slash mobile. A few months back, back in July, you doubled your funding, which obviously gives you more runway and more money to dream with. One of the kind of key parts that you're focusing on seems to be developer experience. You've got three kind of different things laid out in your post when you announce your funding which was, we talked about this already, serverless, branching
Starting point is 00:53:19 and time machine. When it comes to attracting developers to adopt this, those three things seem to be the main thing. But what else developer experience-wise really makes Neon shine? Let's first talk about what is developer experience. Developer experience is, first of all, it's something you experience
Starting point is 00:53:41 and when you see the company's got it right, you kind of feel it. And so what are those companies? Well, I would love to highlight Vercel, Netlify, Prisma, Replit, Fly.io, and of course, GitHub. They get the developer experience right. There's a bunch of others that I haven't mentioned. But still, if you look at those six, for example, they get it right. But if you were to deconstruct what makes a good developer experience or DevOps,
Starting point is 00:54:12 the first one is CLI API and docs. The documentation needs to be very good, very easy to consume. Everything that you do should be available over the API and the CLI. It's super addicting, actually, when you go and spin things up over the CLI. You control the system over the CLI. You look at the UI and that is all reflected there. You have this positive reinforcement as you do that. I think that's very important. Everything is instant.
Starting point is 00:54:49 So developers don't like waiting for provisioning. For example, there are cases where you spin up an RDS instance or an Aurora instance and you need to wait up to 40 minutes. That's nuts. When you just need a database, you want to click a button and you want to get it. So in those, every second counts. Think about your developer, your flow, you have ultimate hacker keyboard, you're optimized everything, but certain things force you to wait minutes or some time of an hour.
Starting point is 00:55:21 The third thing is cold starts. And we're not all the way there. I'm chatting with Guillermo over cold starts. And he's saying like, this is the hill I'm dying on. Like the, you know, cold starts are bad. We haven't solved it all the way. We will be solving them through caching. For us, when we scale to zero, it takes two seconds to spin back up, which impacts the application experience, developer experience. It's still gigantically better than everything out there, but we need to get it down to like 100 milliseconds or so. There's other things that go into developer experiences. One thing is to run the app and the other thing is to build the app.
Starting point is 00:56:01 And the thing that contributes to developer experience is instantly shareable environments, multiplayer-type experiences where you're building an app and you just want to send a link, a short URL to somebody else and say, hey, check that one out. And when they click on it, they have a preview. And from that preview of the app, they can also be dropped potentially into the developer environment for that preview. There's an application preview, there's a developer environment preview. And the easier you make that, the more team collaboration benefits you will start ripping. And that also is addicting, right?
Starting point is 00:56:39 Because people work in teams. People don't work solo usually. The other one that I want to mention is CI CD and push to deploy. I think Heroku famously, you know, Git push Heroku master. That's what's really cool. If you think about it, so you just do Git push and this thing is live in production. The reality that of today though, is that Heroku or not, people are using CI-CD pipelines. And when it comes down to CI-CD pipelines,
Starting point is 00:57:14 the notion of a branch and the notion of a pipeline is there. So all this shareability that I just talked about, in terms of, okay, well, here's a preview environment, why don't you take a look at it, actually applies to the automated test pipelines that your code is coming through. The place that does not fit well into CI/CD pipelines is usually the database. That's the one that you cannot just fork into 20 copies and run 20 tests against, each with its own database copy, in parallel. People just don't do this kind of stuff today.
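(Editor's note: the database-per-pipeline idea described here maps naturally onto copy-on-write branches: each CI job forks the main database, runs its tests, and throws the branch away. A sketch of that shape; the endpoint, fields, and project name are hypothetical, not Neon's actual branching API.)

```python
import uuid
import requests

# Hypothetical control-plane API -- illustrative only.
API = "https://console.example.com/api/v1"
HEADERS = {"Authorization": "Bearer <token>"}

def database_for_test_run():
    """Fork the main database into a throwaway branch; return its DSN."""
    name = f"ci-{uuid.uuid4().hex[:8]}"  # unique branch per test run
    resp = requests.post(
        f"{API}/projects/my-project/branches",
        headers=HEADERS,
        json={"name": name, "parent": "main"},
    )
    resp.raise_for_status()
    return resp.json()["connection_uri"]

# Twenty parallel CI jobs could each call this and get their own
# copy of production-shaped data, instead of sharing one test database.
```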
Starting point is 00:57:44 And these are the things that will be possible with Neon. I'm looking forward to that. I'm glad you defined developer experience, because to a lot of people it seems to be a mishmash. I think in your case, the CLI, the docs, these seem like easy table stakes.
Starting point is 00:57:59 If you don't have docs I can read and a CLI I can dig into... you're right, it is addictive. Once you get into something and there's good documentation, it's easy to kind of get deeper into that. But, you know, defining it seems to be the challenge. Getting it right might even be, definitely is, I guess, harder, right? You can know what it is, but getting it right is the hard part. Well, yeah, but you need to know where you're going, right?
Starting point is 00:58:22 You need to know where you're going. Right. Are we there with our developer experience? No, we don't have a CLI. But I'm looking at our roadmap, and our CLI is going to drop in November. So I know we need to have it. So that's what we did.
Starting point is 00:58:36 We started with defining what a great developer experience is. We put it on the roadmap, filed the tasks, and the team is cranking. Yeah. Where does your roadmap live? Is it easily accessible to the world? I haven't found it yet. So that's a great question. I actually want to make it public. Because my question after that is: if you have these ambitions, and you care about DevEx, how do you communicate that? Because if you have ambitions and, you know,
Starting point is 00:59:05 I know I want that CLI and you don't have it yet, how do you tell me that you care and you're working on it? And what if there's no feedback loop between me, the end user, the dreamer, and you? There's a little bit of a feedback loop, but you're absolutely right, Adam. I will actually bring this up in the next staff meeting. We have one every Monday. I do want the roadmap to be public. It's not public today. It does live on GitHub right now
Starting point is 00:59:30 in, what's it called? GitHub Issues. Yeah, that's the word. And I'm staring at it right now. Let me stare at it too. Yeah. I mean, it's private right now, but there's no reason for it to be private. Yeah, mark my words, we're just going to flip it public. Well, I didn't say that to call you out; it's more acknowledging that there's a feedback loop, and it's clear communication and expectation setting. So this future that you're building, all this magic we're talking about, if you're making it come to fruition and I want to go there with you, if I can see a glimpse into your future, your horizon,
Starting point is 01:00:11 well, then I can buckle down a little further. I can deal with that two-second delay on my cold starts because I know you're desiring to get it to 100 milliseconds. Yeah. Yeah, you're absolutely right. Point taken and we'll be making it public. How long have you been working on this and how far do you think you are from your first paying customer? So we've been working on it 18 months roughly.
Starting point is 01:00:36 So we started payroll March 1st, 2021. And when we started payroll, we had close to zero lines of code written. So we had, you know, three founders and a slide deck. Right now, the team is 36 people. The majority of them are engineers. It's a remote-first company. The majority of the people are systems engineers working on the storage. But there's obviously an SRE team and a cloud team.
Starting point is 01:01:03 The service is up and running. It has more than 2,000 users. There are 7,000 signups. And I think we'll make our first dollar in Q1 2023. We already have people who are using it, not for toy projects, but for real production projects. And so this will be, you know, the first dollar coming into the company.
Starting point is 01:01:27 So call it two years, roughly. Is that about what you expected? Has it been easier than you thought? Harder than you thought? What's the journey been like? It's about on track. You know, you need to build the freaking storage. Yeah.
Starting point is 01:01:39 Put that on a t-shirt. Yeah, yeah, yeah. It's a complex systems project that requires a certain amount of maturity. And then you need to build everything else as well, which is the cloud service. So I think we're right on track. We could make more money sooner by just talking to larger enterprises and selling this to someone before we're fully ready. That happens all the time, by the way, in startup building, where you sell it to the user and they, in return,
Starting point is 01:02:10 they get a better deal and they have the right to drive your roadmap. We chose not to do it this time around. That's what we did with SingleStore: we lined up a bunch of banks, and I think we got Goldman Sachs. We didn't get them for two years, by the way, but we got them. We started SingleStore in 2011, and we got Goldman, I think, two years after, sometime in 2013, for a very small workload. So that's one way of building things. The other way of building things is to create an offering in the cloud, attract people, and then cherry
Starting point is 01:02:47 pick those who have more of a hair-on-fire problem. Basically, whoever you choose and whoever you listen to has a big say on your roadmap. And when you choose very large companies, the say will be around enterprise features such as security, encryption, integration with, you know, Azure Active Directory, and things of that nature. SMB and mid-market, they will care about productivity. They will care about small teams. They will care about cost. And that will set the foundation of your system in a much more robust way. When it does come to generating revenue, that first dollar, the first many dollars, you're taking a bet, I guess. Well, I guess it's kind of been proven by other serverless business models out there. But you're not going the traditional route, which is, as you said before, subscription, which is kind of easy to define.
Starting point is 01:03:53 Well, this customer signed up for X, X per month, X years if we retain them, etc. It's easier to sort of predict some future. How do you expect the volatility of usage-based consumption to impact revenue? What are your thoughts on that front? I think the important thing in consumption-based pricing and the consumption-based approach is that it aligns the value of the product with the value to the customer and the customer's consumption. And then eventually it aligns your sales team as well. If we're getting paid for something that is not used, eventually it will be discovered and turned off. But if we are providing value with more usage,
Starting point is 01:04:38 we're providing more value, then the usage will grow. So in a way, subscription-based pricing does not keep you as honest as consumption-based pricing. And a bunch of Amazon's revenue is people forgetting to turn off EC2 instances. And I'm guilty myself. I've done this before. And then in the early SingleStore days, a $3,000 bill arrives.
Starting point is 01:05:04 And I was like, oh my God, I just forgot to turn off a database instance. That was literally it. By the way, Amazon forgave it for us. But I was ready to put my personal money toward it, because, like, I forgot it. And in a consumption-based system, you will never do that. I think the consumption-based model is proven by now by companies like Snowflake and Twilio, which are purely consumption. And I think that's where the world is going. And, you know, give it a few more years and this will be the expectation in the market. I'd like to see that on, like, Disney Plus or Netflix.
Starting point is 01:05:42 Because there's times I've had Netflix, a subscription, and I've watched nothing for a month at least. Maybe one show, maybe a couple, because I've gotten busy or I've, you know, prioritized summer and family or whatever. And still yet the bill comes along. But that's worth mentioning with Amazon. That's a big thing with them: there's a whole cottage industry of, like, explain my Amazon bill to me. And I say Amazon; it's actually AWS. But yeah, the point is... Yeah, I mean, the video services are a great example of that, right? You know, I have a family, and my kids watch Disney. I watch, you know, House of the Dragon on HBO. I do have Netflix. I haven't watched Netflix in a while. And then I just kind of forget to turn it off. Yeah. It's almost like, you know, the max I'll pay is X. The least I can pay is zero.
Starting point is 01:06:32 Cause if I watch nothing, then charge me nothing. But if I watch it enough, charge me half, right? You know, something, but there's a max, you know, the full subscription amount. And there are 20 subscription services out there now, right? And you only have so much time in the day. So you kind of want to pay by consumption and not think about it. But that's unfortunately not the life we're living in. Yeah, that's not well aligned with their incentives. I'm happy to hear that you're aligning with the value of the customer
Starting point is 01:07:03 because that's a great answer. It's one thing to say, well, we're playing a long game; that's one answer. But the other answer is, we want to align with the customer's value, because that is so true. You can get a bad reputation, or just a reduction in perceived value or trust, if you charge for things that aren't actually being used, which is how subscriptions work. But if you're aligned with their actual value, this is what you consume, this is what you use. 100%. And when this thing works, truly works, it's beautiful.
Starting point is 01:07:35 Because now you have that simplicity across the board. Now your salespeople are just trying to land a customer at any consumption level, because you know that your product is very good and it will grow. And then you're compensating them for educating their customers, the accounts they're working with, on how they can drive more value by using it here and there, which in return drives consumption. Now their sales commissions are attached to consumption as well. And you're becoming
Starting point is 01:08:07 a truly consumption-based company. From a sales perspective, it really makes a lot of sense, because there's almost zero risk to the customer, right? And it's an easier opportunity for the salesperson to communicate the value,
Starting point is 01:08:22 because you're not saying, well, it's X per month and you're going to overspend or underutilize and all that stuff. It's more like, no, you only pay for what you use. And so, as long as the tech aligns, of course, and the value is there technically, the sale kind of does itself. It's almost just done for you. And it's just a matter of aligning the value, educating, as you said, and having a good team behind you that can not just sell,
Starting point is 01:08:49 but also educate and go deep with a customer, versus just simply one service here and that's it. Correct. That's precisely right. You don't need to sell me on Postgres, because I've been a user, both professionally and personally, for like 15, 16 years. So to me, I saw serverless Postgres and I was like, let's talk to these people. You are VC funded.
Starting point is 01:09:13 And I know, from what I've read, I'm no VC, but they're going for grand slams a lot of the time. That's what they want: that vision of a potential unicorn or deca-unicorn, who knows what they are now. They want to sell Figma for $20 billion. And being latched to Postgres, you've hitched yourself to a really nice racehorse, a great one, I think the best one for a lot of cases. But it is de facto a segment of the market, right? You've basically segmented yourself, and you can't get that MySQL Fortune 500 company or the guys running Oracle unless they're ready to switch to Postgres as well. And so I'm curious if there was pushback, you know, during your pitches,
Starting point is 01:09:56 these conversations, like, was Postgres a thing that you had to sell to potential investors, or was it something that they were excited about? It's the same as you said. You don't need to sell people Postgres. Even VCs? VCs are not dumb. Oh, I'm not saying they're dumb. They're removed in some cases.
Starting point is 01:10:17 I'm saying that they're going for larger markets or things, maybe. But think about it the following way. From the VC standpoint, there is a market. It's called the database market. That market has players, and those players have share. And that share is in dollars and in usage. And then you can also measure share in terms of mindshare. So if you look at share of usage, Postgres is going up and up and up. So that's data point number one. And earlier I said, out of the top five databases,
Starting point is 01:10:48 Postgres is the only one that's growing share. The second thing is within the database market, there is an on-prem market and there is a cloud market. And the cloud market is the much faster growing market. That market is dominated by the cloud hyperscalers, by Amazon, Google, and Microsoft. There is only one public database company that's relatively modern and relatively recent. That's MongoDB.
Starting point is 01:11:14 There isn't a public relational database company, and developers are increasingly choosing Postgres over MongoDB. Another data point: AWS Aurora is a $3 billion run-rate business, going into potentially four and a half, growing 50% year-on-year into next year. That's MySQL, Postgres, and MongoDB, and the MongoDB one is kind of fake; it's built on top of Postgres, because MongoDB's license prevents them from running MongoDB compute. So all of those data points highlight that there are a lot of dollars in the cloud database market. Postgres is a matter of fact. In a way, it's kind of like Linux, right?
Starting point is 01:11:56 And so it's not like anyone can own Postgres; that ship has sailed. But can you have the best-in-the-world Postgres cloud service? I think that's an open question. And that's what we're going after. That's really the pitch. It was that simple. I was going to say, that sounds like what you would have said in your meetings.
Starting point is 01:12:16 I liked it. What's left? What did we not cover so far? What's something you wish we had asked you that we didn't, or something you wish we could cover that we haven't covered yet? Well, wish us luck. I think that's one. Good luck. We need that.
Starting point is 01:12:31 Obviously, engineering our future. We have a fantastic team. So I think that's for the most part what we need. We think that where we go is pretty clear. And we're refining that North Star every week as we get more information. We want the world to fully buy in on Postgres. I think we're getting there, and then we need the world to buy in on serverless. And once those things happen, we need the technology to work, which we're making better every day.
Starting point is 01:13:02 You've got some job openings, I see, at least in your announcement post from back in July, I'm sure there's some of those job openings still available. Some in engineering, some in product, obviously. The two jobs that we're looking right now is UX designer, and we're potentially
Starting point is 01:13:20 looking to bring a developer relations lead and more of a senior person. We have one fantastic individual, who's named Rauf, who is running our dev rel right now. But what we're hearing from the board, it might make sense to bring a very senior person
Starting point is 01:13:37 to drive the developer relations effort. So these are the two positions that we're hiring for right now. We're always hiring for engineers. We've been blessed. There's a line of people who want to work at Neon on our storage and our cloud service. So we feel we're very fortunate because the system is open source. It's written in Rust.
Starting point is 01:13:58 So that's like candy for a systems engineer. Hopefully, this will stay. But as of right now, we have more applicants than we can process. And we just added nine last month. There you go. You mentioned the roadmap is coming, or potentially coming. You mentioned the desire for a CLI.
Starting point is 01:14:22 What else might be out there on the horizon? What's something that maybe people know less of, or not at all, that you could share on the show? Yeah, so there are a couple of things. We touched on autoscaling, and all of that will be packaged into the final experience, where you will have some visibility into how much compute you're burning. And then underneath, that's going to be live VM migration and adjusting the size of the compute with regards to memory and CPU.
Starting point is 01:14:52 So that's coming. A number of integrations are coming. First of all, watch out for an announcement next week. I can't say what it's going to be, but there will be a big announcement of an integration with a major developer platform. And more such integrations will come out over time.
Starting point is 01:15:09 We'll be announcing regions. That's kind of table stakes for a database service, but that's going to happen. And then we're also experimenting with some of the generative AI stuff. We're only going to launch it if we internally feel that it provides a ton of value. But that is about automatic index suggestions: automatically branching, applying the index, and then sending a pull request for changing the schema. Those are some of the things that are brewing in our labs, which is kind of cool. But again, we're only going to do that if we're confident that this is not a toy, that it's really useful for the developer.
Starting point is 01:15:52 So currently it's a technical preview, right? You have to request early access or have an invite code, and then you can log in with GitHub or, I believe, Google. So you can use either SSO to get in. What's the wait? If people get done with the show, or maybe midway through the show if this question is too late, how long will they wait? What can they expect?
Starting point is 01:16:12 Barring the things that are unexpected, which is like, you know, we're about to remove the invite gates and then at the last second somebody's like, no, you can't do this because X and Y will break. Well, barring that, in November we'll drop the invite gate. Okay. So soon.
Starting point is 01:16:30 Yeah, it's very soon. And in Q1, we're going to turn on pricing and billing. So the team is working very, very hard. We already know the pricing structure, the pricing model, where we're going to charge separately for storage, compute, and data transfer. So in a way, it's really aligned with what it costs us to run the service. And then it's elastic and scales to zero.
Starting point is 01:16:54 If you're not using it, you're only paying for storage. If your storage is zero, the bill runs down to zero. So these are the pieces that we need. We need regions. We need larger computes. We need pricing and billing. And once that's there, we're ready to roll.
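(Editor's note: a back-of-the-envelope sketch of the three-meter model described here: storage, compute, and data transfer, with the bill falling to storage-only when the database scales to zero. All rates below are invented placeholders, not Neon's actual prices.)

```python
# Made-up unit prices, for illustration only.
STORAGE_RATE = 0.10   # $ per GB-month
COMPUTE_RATE = 0.04   # $ per compute-hour
TRANSFER_RATE = 0.09  # $ per GB of egress

def monthly_bill(storage_gb, compute_hours, egress_gb):
    return (storage_gb * STORAGE_RATE
            + compute_hours * COMPUTE_RATE
            + egress_gb * TRANSFER_RATE)

# A side project that scales to zero: 5 GB stored, 20 active compute-hours, 1 GB egress.
print(monthly_bill(5, 20, 1))  # 1.39 -- idle hours cost nothing
# The same database left fully idle all month: only storage is billed.
print(monthly_bill(5, 0, 0))   # 0.50
```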
Starting point is 01:17:10 We'll drop the invite gate even before we have pricing and billing. And it'll remain free, obviously, until Q1, and then what happens? Will there be like a grace period, like, hey, a free tier, a generous free tier? What can people expect? The generous free tier will stay. We'll give you a certain amount of consumption per month for free. You know, back in the day, git push heroku master, those four words were sort of blazing in all of our brains. And, you know,
Starting point is 01:17:45 that's kind of gone now, but if you want to be a long-term player, it might make sense to always have a generous free tier and keep it that way, so that you can invite those who want to play and tinker to do so. Well, it comes down to the model, it comes down to what it costs you, and it comes down to a certain level of abuse. Yeah, for sure. When you give people arbitrary compute, you will be abused, right? Because, you know, you can turn free compute into value. You can mine Bitcoin, you can do DDoS attacks; there's all sorts of malicious behavior that you can expect on a popular platform. When your platform is not popular, you're dying for that traffic.
Starting point is 01:18:30 And so that is the push and pull, right? So on the first subject, it's harder. Databases are arbitrary compute, but it's not as obvious as just having access to a VM, right? It's less arbitrary than VM access. So I think the level of abuse will be there, but naturally it will be less. Well, that'd be a good spot for AI, to detect that stuff, right?
Starting point is 01:18:52 To machine-learn what abuse looks like, and you can sort of evict them from the cache. Get out of here. But then you're writing all the code. Yeah, you're spending all your money on fraud and abuse. That's not where you're at. Yeah, you're spending money on fraud and whatever. And so, for example, fly.io gives you a generous free tier, but they ask for a credit
Starting point is 01:19:11 card ahead of time. So that's adding a little bit of friction. Now, fly.io gives you actual VMs, right? And what I'm saying is, in databases, I expect less abuse than in a general-purpose platform, but there will be some. Right now, obviously, we want that free traffic and free usage. Then it's a model. If I put my business hat on, there are a certain number of people coming onto the platform.
Starting point is 01:19:37 It costs you this amount of money. You fine-tune what those free tier boundaries are to maximize your long-term goals, your long-term trajectory as a company. And that's an important thing: we always want to optimize for the long term. And that's, in a way, what venture capital allows you to do, right? So you can really make sure that you build a very capable platform. You can reach a certain amount of scale of users coming in. And by the way, the more users, the more stable the platform is, because you start seeing all
Starting point is 01:20:11 sorts of failures and fixing them. And that's another reason to stay free for a while, right? Once you start charging people money, the expectations on uptime and quality are higher. And maturity takes time. Since we're not doing this for the first time, I think we can get there faster than, let's say, SingleStore, but it will still take time. Yeah, so that's kind of how we think about the free tier.
Starting point is 01:20:36 We're taking a very, very practical approach to it. We want people to come in. We want people to see value. We want people to eventually convert and become paying customers. Well, speaking of conversion and the potential of many, there is a code. I'm curious, can you share a code just for our listeners? Is there an invite we can just give to everyone who listens to the show? Is that feasible?
Starting point is 01:20:59 Is that too much? You tell me. Yeah, there's a partnership with Hasura that is currently slated to launch on the 11th. And if you come to Neon through Hasura, by pushing a Neon button on the Hasura dashboard, and we'll be replacing Heroku on Hasura, then you will bypass the invite gate. You will be dropped into the Neon console, and you don't need an invite to start using Neon. Yeah, good stuff.
Starting point is 01:21:29 Well, if that's out there, Hasura, awesome. If you're just listening to this and you don't know what Hasura is, or you don't have access to that, then, well, I guess just wait till November sometime, right? Because that's when it actually opens up. The gates are down. So it's just a temporary wait for anybody who might
Starting point is 01:21:46 be listening. Finally, you can just tweet and say Neon Database and I get an invite and we'll DM it for you. Gotcha. Cool. Well, there's some ways then. Anything else, Jared? What else is left? We put it out there, didn't we? I think we've done a good job covering it. I'm excited for you all. I'm rooting
Starting point is 01:22:02 for you. Like I said, a big Postgres stan over here. So I want to see it move into the future alongside all these other players and opportunities. The resource-based billing, I mean, it's going to be awesome.
Starting point is 01:22:16 The regions, it's going to be cool. The branching is already cool. So very excited for what you guys are building and wish you the best of luck on it. Absolutely. Absolutely. Thank you so much.
Starting point is 01:22:25 Thanks, Nikita. Okay, serverless Postgres is finally here. Neon is bringing it to the masses. We want to hear from you. Is Postgres your database? Are you excited about serverless Postgres? Is this model something that gets you excited? Sound off in the comments.
Starting point is 01:22:43 The link is in the show notes. And for our Plus Plus subscribers, make sure you stick around. We have a bonus for you. Again, a big thank you to Fastly and Fly, and to Breakmaster Cylinder for those awesome beats. And, of course, to you, thank you for tuning in. We appreciate you. That's it for this week. We'll see you on Monday.
