The Changelog: Software Development, Open Source - Taking Postgres serverless (Interview)
Episode Date: October 14, 2022. This week we're talking about serverless Postgres! We're joined by Nikita Shamgunov, co-founder and CEO of Neon. With Neon, truly serverless Postgres is finally here. Neon isn't Postgres compatible… it actually is Postgres! Neon is also open source under the Apache License 2.0. We talk about what a cloud-native serverless Postgres looks like, why developers want Postgres and why, of the top 5 databases, only Postgres is growing (according to the DB-Engines Ranking). We talk about how they separated storage and compute to offer autoscaling, branching, and bottomless storage, and about their focus on DX: where they're getting it right and where they need to improve. Neon is invite-only as of the recording and release of this episode, but near the end of the show Nikita shares a few ways to get an invite and early access.
Transcript
This week on the Changelog, we're talking about serverless Postgres, and we're joined
by Nikita Shamgunov, co-founder and CEO of Neon.
With Neon, truly serverless Postgres is finally here.
Neon is not Postgres compatible.
It actually is Postgres.
Neon is also open source under the Apache License Version 2. On today's show,
we talk about what a cloud-native serverless Postgres looks like, why developers want Postgres,
and why of the top five databases, only Postgres is growing. This is according to DB Engines
Ranking. We also cover how Neon separates storage and compute to offer autoscaling,
branching, and bottomless storage. And we also cover their focus on DevEx,
where they're getting it right, and where they need to improve.
Neon is invite-only as of the recording and release of this episode,
but near the end, Nikita shares a few ways to get an invite and early access.
A big thank you to our friends and partners at Fastly and Fly.
Our pods are fast to download globally because Fastly is fast globally.
Learn more at fastly.com.
And our friends at Fly let you run your app and your database closer to users all over the world.
Check them out at fly.io.
This episode is brought to you by our friends at Fly.
Run your full stack apps and your databases
close to your users all over the world.
No ops required.
And I'm here with Brad Gessler,
who is helping to build the future Rails cloud at Fly.
Brad, what's got you excited about Rails on Fly?
It's no secret that Rails is this really productive framework for building applications.
We've also seen that happen.
There's a bajillion different hosts that you can choose from out there
that all make it really easy to deploy your Rails applications.
We've had these for years.
There's nothing really magical about that anymore.
It's just, this is what we'd expect.
We want to type a deploy command, and this thing ends up on a server somewhere.
The thing that I think that sets Fly apart from all that is it scales.
It has so many scaling stories.
It has, again, the table stakes stuff.
Oh, wow, you can add more memory to a machine.
All those things you would expect from a hosting provider.
Again, Fly, you can scale out.
You're going to have customers that live in Singapore,
that live in Frankfurt.
You need to get servers there.
Fly lets you do that.
Again, with just a few commands,
you can provision all these servers
in these different parts of the world.
And then the real magic with one command,
you can type in fly deploy
and you have all these servers provisioned around the world.
They just work.
People hit yourcompany.com
and they're hitting the Frankfurt server,
and a person in Singapore is typing in yourcompany.com
and it just works;
they're hitting your servers in Singapore.
So this thing scales out beautifully, which is really important,
especially if you're starting to run turbo applications
or turbo native applications where you need that really low latency.
Your application needs to respond to these users in under 100 milliseconds.
Otherwise, to them, it's not going to be instant.
They're going to be waiting.
It's important to be fast, and Fly makes that possible.
The reason I joined it is because of this kind of global magic that we're going to be shipping. And that's something that I want to bring to Rails developers all around the
world. That's awesome. Thanks, Brad. So the future Rails cloud is at Fly. Global magic is on its way.
Try it free today at fly.io. Again, fly.io. All right, we have Nikita here to talk about serverless Postgres, a hot topic these days.
Welcome to the show, Nikita.
Thank you.
Glad to be here.
We're happy to have you.
I think we last talked Postgres with Paul from Supabase.
And in that conversation, we started talking about what would a cloud native Postgres look like?
Or maybe what would a serverless Postgres look like?
And he said a lot of the same words that I'm reading on your guys' homepage.
You're the CEO of Neon, at Neon.tech.
Very cool technology out there that's still getting started.
Do you want to tell us from your perspective what serverless Postgres means?
Well, absolutely.
I think there are several parts to it.
And the first one starts with user experience.
When you go and provision Postgres anywhere else
today, maybe with the exception of AWS Aurora Serverless, you go and choose the size of your
instance. And then you are part of what is called a subscription pricing model, where you say, well,
this is an instance of size, you know, small to large to extra large. And this costs you
X amount of dollars per month, right? This is called subscription-based pricing. You're committing to a certain size, and that's what you're paying for.
In this serverless world, you don't choose the size, right? You just say, I need Postgres.
And then the system right-sizes the amount of resources that you consume.
And all you get is a connection string. And now you're just connecting your app to the database.
And you don't need to think about sizing at all.
And you don't need to think about the fact that, you know, you're paying something that
you're not using.
And that's what's called a consumption-based model, or consumption-based pricing.
Right now, you know, I push the button, you get the connection string, and whatever you use, you're paying
for.
Whatever you're not using, you're not paying for.
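The difference between the two models is easiest to see as arithmetic. Here's a toy sketch with made-up numbers; nothing here is Neon's or AWS's actual pricing.

```python
# Illustrative arithmetic only: rates and usage numbers are invented
# for the example, not anyone's actual pricing.

HOURS_PER_MONTH = 730

def subscription_cost(rate_per_hour: float) -> float:
    """Subscription pricing: you pay for the provisioned instance
    around the clock, whether or not it's doing any work."""
    return rate_per_hour * HOURS_PER_MONTH

def consumption_cost(active_hours: float, rate_per_hour: float) -> float:
    """Consumption pricing: you pay only for the hours the database
    is actually awake; idle time scales to zero and costs nothing."""
    return active_hours * rate_per_hour

# A side-project database that's only active ~40 hours a month:
print(subscription_cost(0.10))     # always-on instance: about $73/month
print(consumption_cost(40, 0.10))  # scale-to-zero: about $4/month
```

The gap widens with every extra dev or staging database you keep around, which is exactly the use case he describes next.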
Where it's getting super, super convenient is in the various development, staging, side
project environments.
Usually you have a production database and that powers your app.
But then you have, I don't know,
potentially tens of databases out there
for various stages of your environments.
And so if your environments are different,
then your test coverage is not the same,
the properties are not the same.
And if you make them all the same,
then you might be spending a
lot of money by having, you know, full copies of your production environments for various parts of
your development process. So that's, I think, what fundamentally serverless means, there are lots of
shades of gray to it. And serverless typically becomes a part of a software infrastructure
architecture to deliver on site on all such properties.
So serverless isn't new conceptually or even,
I mean, it's newer in the market,
but serverless things have been out there for a while.
From your perspective, as somebody who's now building a tool and a business in the serverless world,
has adoption been as expected or has it been slower?
Are people moving to serverless things or is it mostly like small and indie people getting started?
Or like, tell us what you see.
Well, I think it depends on the stack.
And databases are usually kind of the last ones to the party.
And the reason for that is it takes a good amount of hardcore engineering.
The development cycle is longer
when it comes down to databases.
But let's say in the front-end world,
people are there, right?
If you look at platforms like Vercel and Netlify
or Cloudflare Workers,
this becomes the dominant way of deploying front-end code.
It's completely serverless.
Your JavaScript project is packaged, passed into the
platform, and deployed around the world in the CDN-like manner in multiple data centers around
the world. Traffic is routed to the local data center, and that drives latencies down.
Then there's the backend code and the database. When we start thinking about the backend code,
we're seeing somewhat similar dynamics. My favorite company here is Fly.io. You should have them on a pod if you haven't already.
We know them well.
You know, similar things, right? So you deploy your app into Fly and they are able to deploy
that app around the world. They don't do serverless, but I think they will over time.
They already have machines that can scale down to zero and stuff like that.
So now the question is, can we have that in a completely elastic way over time?
This scaling down to zero is like a big deal for all the things that we've talked about before.
Finally, there's the database, right?
You have front-end, back-end database.
That's the majority of the apps that need all three.
Now, the tricky thing about databases is, well, you either build a completely new one from scratch,
you know, DynamoDB or something,
or you take advantage of something that is extremely popular like Postgres.
But then it's much trickier to make it serverless because Postgres is a package.
It has storage, compute, metadata, all in one box.
And then in order to make it serverless,
you need to cut the system in the right way.
And what we did, we separated storage and compute.
The adoption has been phenomenal.
And when we announced the system just in June,
we now have close to 10,000 users coming into the platform and signing up for the
system. And we haven't even lifted the invite gate. So we are onboarding people in batches.
And we're seeing like a lot of interest of people coming into the platform and using the system.
Granted, all of that is free right now, which is attracting a lot of tire kickers and people who
are just trying things out.
But we are in communication with those folks.
They're filling up surveys and we are engaging with them directly.
And so we see a lot of excitement around serverless.
That excitement can probably be split into three categories.
The first one is, I'm an indie developer.
I just want something cheap or free or whatever.
And some of that is a Heroku fallout as well.
Another use case is, well, I'm doing a lot of software development.
I need this developer environment.
So that's where scaling to zero, branching is another thing that we bring to the table,
allows you to very easily create developer environments.
And you don't sweat bullets if you just, like,
overcreate those developer environments
and forget to turn them off,
because they all scale to zero.
And finally, we see professional,
like bigger organizations that are saying,
well, we are an RDS,
but like it's getting extremely hard to deal with Amazon.
We just want simpler.
We need more reliable.
And we need something that plugs in
to the next generation infrastructure, which is the Vercels of the world, which is AWS Lambda,
and you know, which is something like Fly as well. So that's where we see kind of the categories of
people coming in. There are other serverless offerings on the market. I think namely PlanetScale and Aurora.
When I started the company, I had a little bit of an insight into AWS Aurora.
And they always track, you know, they build something and they see how much of an impact this is to the overall business.
And when they shipped Aurora Serverless v1, which was their first implementation, now they're on v2, which, by the way, doesn't scale all the way to zero.
But that thing took off like there was no tomorrow for Aurora.
So that was a big deal and a signal for me when figuring out how to build a dominant OLTP cloud database.
It might be obvious why it took off, but in your opinion, why is this space in particular growing so fast?
Yeah, I think it's friction and cost.
Like it's as simple as that.
And it's friction, it's cost,
and then it's what people want.
People want Postgres.
So there's this famous website for database people
called DB Engines Ranking.
And then if you, like, go on Google,
type DB-Engines Ranking,
and you see what's going on in the top five databases, you will see that those top five are
MySQL, Oracle, SQL Server, Postgres, and MongoDB. These are the top five databases
in the DB-Engines Ranking. Out of the top five, only Postgres is growing. So in addition to the convenience of not thinking about sizing and provisioning and stuff like that, there's cost.
And cost comes mostly from the fact that you architected the system such that you never overpay for resources.
There's also, we're on the right trend lines with regards to Postgres.
There's just more and more Postgres out there and people want postgres.
Kind of reminds me of the GitHub analogy I had
way, way back in the day with Tom Preston-Werner.
And this is like literally months after GitHub launched.
It was a whole different podcast,
a whole different Adam, a different era of life.
But one thing Tom said about GitHub early on
about their success was it was permission to mess up.
So if you reduce the friction and reduce the cost, it's not so much permission to mess up, but permission to explore and to be creative.
Because you can creatively use something serverless if it spins down to zero or virtually zero in the case of Aurora or whatever.
That's the thing I think if you give developers that experience, then they're going to play more often.
They're going to create developer environments.
This works great here. Let's use it in production, obviously.
But if you give people the option to have a better experience and play, cool things happen.
You are precisely right.
And this feature that we have, well, the two features that we have on the platform, one is branching.
So you can branch, and now that creates a completely isolated environment
from your standpoint, right?
Now you can read into that environment, write into that environment.
You can put a lot of traffic onto this environment,
and you will not impact your production branch.
And so that's kind of one.
That's permission to mess up, number one.
The second one is time machine.
Right. Even if you messed up your core database, you can go back and restore it to what in Git would be a commit.
And in the database world, in the Postgres world, that's called restoring to an LSN, which stands for log sequence number.
Right. If you go and drop a table, drop table users,
not recommending doing this to anyone,
but in the world where you did, with one command in Neon
you can roll back to right before you dropped that table.
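As an aside on what that looks like in stock Postgres terms: the closest built-in analog is point-in-time recovery, where you restore a base backup and replay the write-ahead log up to a target LSN. A rough sketch of the relevant settings is below; the archive path and LSN value are made up for illustration, and Neon's one-command rollback is its own mechanism, not this config.

```ini
# postgresql.conf (Postgres 12+), after restoring a base backup and
# creating an empty recovery.signal file in the data directory.
# Replay WAL only up to the LSN captured just before the DROP TABLE.
restore_command = 'cp /mnt/wal_archive/%f %p'   # archive path is illustrative
recovery_target_lsn = '0/3000148'               # illustrative LSN
recovery_target_action = 'promote'
```

The manual version takes a restore from backup plus a WAL replay; the point of Neon's branching storage is that the same rollback is a metadata operation.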
So that's all cool.
First of all, how about merging?
We got branching.
Can we get merging?
Can I roll it back in?
Yeah, let's go ahead and merge this.
Yeah, yeah, merging is tricky.
I think... so we're watching that space, right? First of all,
with merging, even before you want to do a merge, you probably want to understand what changed, right? And then
in Git, there's a diff, git diff. In databases
there are new tools like data-diff, coming from this company Datafold.
It's an open source tool.
And the other thing to understand, which is important, that in databases, there's data
changes and schema changes.
And oftentimes there's a notion of a migration, which Prisma, for example, has, or various ORMs have, where really what you want to do is to roll forward a particular schema
into the production environment.
So the workflow seems to be, the right workflow is the following.
Here's my production database.
I want to build a feature that potentially changes and messes up with the schema.
I'm going to branch that production environment. I'm going to make all the changes, which creates a test environment or dev environment,
for that matter. I'm going to make all the changes in the test environment.
In the meantime, your production environment moves forward. There are more and more changes
that are coming in because your application is live. Then you diverge both on schema and on data.
But really, what people want to do for the most part is just roll forward the schema, not the data.
I think that is the workflow that Prisma supports. I think we will eventually introduce it into the
core system at Neon, where, for every commit, we will be recommending developers
to create a branch. We'll integrate with all the platforms, including GitHub,
with GitHub Actions and whatnot. And then the analog of a pull request, of merging the pull
request, would be merging the schema, but not necessarily the data.
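That "merge the schema, not the data" idea can be sketched as a toy diff. This is my own illustration of the concept, not how Prisma or Neon actually compute migrations; real tools also handle type changes, drops, renames, and data backfills.

```python
# Toy sketch: diff two schemas, each represented as
# {table: {column: type}}, and emit the DDL-ish statements needed
# to roll the production schema forward to match the branch.

def schema_diff(prod: dict, branch: dict) -> list[str]:
    stmts = []
    for table, cols in branch.items():
        if table not in prod:
            # Whole table is new on the branch.
            stmts.append(f"CREATE TABLE {table} (...)")
            continue
        for col, typ in cols.items():
            if col not in prod[table]:
                # Column added on the branch.
                stmts.append(f"ALTER TABLE {table} ADD COLUMN {col} {typ}")
    return stmts

prod = {"users": {"id": "bigint", "email": "text"}}
branch = {"users": {"id": "bigint", "email": "text", "plan": "text"},
          "invoices": {"id": "bigint"}}

print(schema_diff(prod, branch))
# ['ALTER TABLE users ADD COLUMN plan text', 'CREATE TABLE invoices (...)']
```

Merging the branch would mean applying those statements to production while leaving production's newer rows untouched, which is exactly the schema-forward, data-stays workflow described above.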
Makes sense. Makes sense. So elastic compute makes sense, and scaling down, because
you have, like, ephemeral on-demand resource usage, right? Like, all of a sudden I have to answer a
bunch of HTTP requests, and so my server has to do stuff, and then everybody leaves and my website
doesn't get any hits, and I can scale that down. With databases, if I've got a one gigabyte database,
it's just, like, always there, right? I mean, all that data is there, and I could access any part
of it at any time or need to, and we don't know which parts. So I have a hard time with, like,
database scaling to zero, unless you're, I don't know, just, like, stomaching the cost?
Or tell us how that works with Neon.
Are you just stomaching the cost of keeping that online, or are you actually scaling it down?
We're actually scaling that down.
Let me explain how this works and it may get quite technical.
The first thing is what should be the enabling technology of scaling that down?
If you're just kind of thinking, you know, how would I build serverless Postgres?
And if you ask a person that is not familiar with database internals, they would say something like,
well, you know, I would put it in the VM maybe, or I would put it in the container.
I would put that stuff into Kubernetes. Maybe I can change the size of the containers.
The issue with all that, as you start moving those
containers around, you will start breaking connections because databases like to have
a persistent connection to them. And then you will be impacting your cache. Databases like to
have a working set in memory. And if you don't have a working set in memory, you're paying
the performance hit by bringing that data from cold storage to memory.
The third thing that you will find out is that if the database is large enough, it's really, really hard to move the database from host to host, because that involves data transfer.
Data transfers are just long and expensive, and now you need to do it live while the application
is running and hitting the system. So naively, you would arrive at something like what you proposed, right?
Let's just stomach the cost.
There is a better approach, though.
And the better approach starts with an architectural change of separating of storage and compute.
If you look at how database storage works at a high level, it's what is called page-based storage.
All the data in the database is split into eight-kilobyte pages. And the storage subsystem
basically reads and writes those pages from disk and caches those pages in memory. And then
kind of the upper-level system in the database lays out data on pages. So now you can separate that storage
subsystem and move that storage subsystem away from compute into a cloud service. And because
that storage subsystem is relatively simple from the API standpoint (the API is,
you know, read a page, write a page), you can make that part multi-tenant.
And so now you start amortizing costs across all your clients.
So you make that multi-tenant and you make it distributed, and distributed key-value stores
we've been building forever. So it's not rocket science anymore.
Then you can make that key value store very, very efficient, including being cost efficient.
And cost efficiency comes from taking some of that data that's stored there and offloading cold data into S3.
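To make the read-a-page/write-a-page idea concrete, here's a toy versioned page store in Python. The names and shapes are mine, not Neon's actual API: each write records a new page version at an LSN, and reads ask for a page "as of" an LSN, which is what makes both multi-tenant sharing and point-in-time reads natural in this design.

```python
PAGE_SIZE = 8 * 1024  # Postgres pages are 8 KB

class PageStore:
    """Toy versioned page store, assuming writes arrive per page
    with a monotonically meaningful LSN attached."""

    def __init__(self):
        # page_id -> list of (lsn, data) versions
        self.versions = {}

    def write_page(self, page_id: int, lsn: int, data: bytes) -> None:
        assert len(data) == PAGE_SIZE
        self.versions.setdefault(page_id, []).append((lsn, data))

    def read_page(self, page_id: int, lsn: int) -> bytes:
        # Return the latest version at or before the requested LSN.
        candidates = [(l, d) for l, d in self.versions[page_id] if l <= lsn]
        if not candidates:
            raise KeyError(f"page {page_id} did not exist at LSN {lsn}")
        return max(candidates, key=lambda v: v[0])[1]

store = PageStore()
store.write_page(7, lsn=100, data=b"a" * PAGE_SIZE)
store.write_page(7, lsn=200, data=b"b" * PAGE_SIZE)

assert store.read_page(7, lsn=150) == b"a" * PAGE_SIZE  # page as of LSN 150
assert store.read_page(7, lsn=250) == b"b" * PAGE_SIZE  # latest version
```

Keeping old versions around is also what makes the branching and time-machine features he described earlier cheap: a branch is just a new timeline of versions starting from some LSN.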
Now, that leaves compute.
And compute is the SQL query processor and caching.
So that you can put in a VM. We actually started with containers,
but we quickly realized that micro-VMs such as Firecracker or Cloud Hypervisor are the right
answer here. And those micro-VMs have very, very nice properties to them. First of all,
we can scale them to zero and preserve the state. And they come back up really, really quickly.
And so that allows us to even preserve
caches if we shut that down. The second thing it allows us to do is to live-change the
amount of CPU and RAM we're allocating to the VM. That's where it gets really tricky, because
we need to modify Postgres as well to be able to adjust when suddenly you have more memory, or
to shrink down when, oh, all of a sudden, you have
less memory now.
And so if you all of a sudden have less memory, you need to release some of the caches and
release this memory into the operating system.
And then we change the amount of memory available to the VM.
And there's a lot of cool technology there with live-changing the amount of CPU.
And there's another one called memory ballooning
that allows you to, at the end of the day,
adjust the amount of memory available to Postgres.
And then you can live migrate VMs from host to host.
Obviously, if you put multiple VMs on a host,
they all started growing.
At some point, you don't have enough space on the host.
Now you need to make a decision
which ones you want to remove from the host.
Maybe you have a brand new host available for them with the space, but there's an application
running with a TCP connection hitting that system. Storage is separate, so you only need to move the
compute. And so now you're not moving terabytes of data when moving Postgres. You're just moving
the compute part, which is really the caches and caches only. But you need to perform a live migration here. So that's what we're
doing with this technology that's called Cloud Hypervisor that supports live migrations.
And the coolest part is, as you perform the live migration, you're not even terminating the
TCP connection. So you can have
the workload keep hitting the system as you change the size of the VM for the compute up and down,
as well as you can change the host for that VM and the application just keeps running. So yeah,
that's kind of super exciting technology. So do you have your own infrastructure that this is
running on? Are you on top of a public cloud? Or how does that all work?
So we are on top of AWS. We know that we need to be on every public cloud.
And that's where the users are.
Now, this question kind of hits home a little bit: the cost could be at least 10 times cheaper
if we used something like, I don't know, Hetzner or OVH. And in our architecture, it's, like, super important to
have an object store as part of the architecture. So Amazon S3. And in the past, there was no
alternative to S3, like no real alternative. But just a few weeks ago, Cloudflare released R2,
and they made it GA. And all of a sudden, you can put cold data onto R2.
We still don't know
what the real reliability of R2 is,
but I trust that Cloudflare
will get it up there eventually.
And that opens up
all sorts of possibilities.
The other one that we're
looking into closely is Fly.
We even have a shared Slack channel
with Fly.io.
I think it's a fantastic company. And
I see a day when Neon will be running on Fly infrastructure as well. Now, all that said,
as of right now, we're only on Amazon and we'll be adding the other cloud. In which order and
what's going to come sooner, Fly or Google, for example, I can't really commit to
because we continuously evaluate.
Yeah.
So when you say move data off to S3,
how do you deem data as cold on your customers' behalf?
Because there's got to be some smarts in there.
Yeah, there's a lot of known algorithms
and they're mostly caching algorithms.
So it's already happening today a little bit in Postgres, right? There's a buffer manager
or buffer pool, maybe mixing SQL Server or Postgres terminology here because my background
is SQL Server. But the architecture is similar where the buffer pool has a counter for every page
and it refreshes the counter if the page is touched.
And then the algorithm kind of sweeps the cache and decides which pages haven't been touched for a while,
and then evicts them from the cache.
Here, we added another tier in the remote storage.
We also track pages and see which pages have been touched recently and which have not,
and then offload those pages onto S3.
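That counter-and-sweep idea can be sketched as a simplified "second chance" eviction pass. This is a toy illustration in the spirit of buffer-pool eviction, not Neon's or Postgres's actual buffer manager; all the names are mine.

```python
# Simplified "second chance" sweep: every access sets a page's
# reference bit; the sweep clears bits, and pages still cold on the
# next pass get offloaded to the next tier (a stand-in for S3 here).

class ColdTier:
    def __init__(self):
        self.pages = {}

    def offload(self, page_id, data):
        self.pages[page_id] = data

class HotCache:
    def __init__(self, cold_tier):
        self.pages = {}       # page_id -> data
        self.referenced = {}  # page_id -> touched since the last sweep?
        self.cold = cold_tier

    def touch(self, page_id, data=None):
        if data is not None:
            self.pages[page_id] = data
        self.referenced[page_id] = True

    def sweep(self):
        for page_id in list(self.pages):
            if self.referenced.get(page_id):
                self.referenced[page_id] = False  # second chance
            else:
                # Cold for a full sweep interval: offload it.
                self.cold.offload(page_id, self.pages.pop(page_id))

cold = ColdTier()
cache = HotCache(cold)
cache.touch(1, b"hot page")
cache.touch(2, b"cold page")
cache.sweep()   # clears both reference bits
cache.touch(1)  # page 1 stays warm
cache.sweep()   # page 2 was not touched again: offloaded
print(sorted(cold.pages))  # [2]
```

A real system would batch the offloaded pages before writing, which is where the LSM-tree chunking he describes next comes in.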
There is a caveat, however: S3 does not like small objects, and a page is 8 kilobytes. So we need to organize those pages into some sort of data structure
that will bucket those pages together.
So when we throw those pages onto S3, we throw a bunch of them
together in a chunk. That data structure is called an LSM tree, and it's an implementation of an LSM
tree that we built from scratch in Rust that's integrated with S3 and offloads
colder data to S3. There are kind of several use cases. One use case is, like, a very large database.
You know, if you have a very large database, chances are large portions of that database
are never even touched.
So over time, you know, some of that data, maybe it's the data from like, I don't know,
five years ago, and you don't really need it.
But you're keeping it there because, like, oh, it doesn't cost you much,
and it's better to have it for occasional use than to not have it at all, or put it in a different system.
And the other use case is you have a big fleet of databases.
A lot of them are scaled down to zero because, you know, you just have them for occasional usage.
And now if you keep them hot, that will start to add up both on the compute side and on the storage side.
Storing all that data into SSDs is a very different economics than storing all that data
in S3 in a compressed form. So that's the second place where integration with S3 can drive
much better economics. Hey, friends.
Influx Days is back.
This is a two-day developer conference from our friends at InfluxData, dedicated to building IoT, analytics, and cloud applications with InfluxDB. It is happening November 2nd and 3rd. If you're new to Influx or you're building advanced time series applications,
Influx Days sessions and trainings will give you the skills you need
to support your individual builder journey.
Here's the breakdown.
Two free days of virtual user conference,
watch parties in SF and London,
free training on the Telegraf open source server agent,
paid training on Flux in London.
Again, this is all happening November 2nd and 3rd.
Learn more and register at influxdays.com.
Again, influxdays.com. So, Nikita, this is obviously groundbreaking, right? To get serverless Postgres,
you mentioned the architecture of separating compute from storage, and you've got developer
experience, which is crucial, right? Built for developers, made for developers is kind of key.
That's what makes this a hot space.
How in the world do you get the recipe right, though?
You've obviously cracked the nut,
but how do you get the seemingly infinitely hard
infrastructure aspects of it right, to build it
and then actually make it work?
Yeah, so some of that comes from experience;
I spent a good amount of time in the database space. And SingleStore is a database that built every part of the database
stack from scratch in C++, including separation of storage and compute, and including a hardcore
analytical query processor, including distributed transactions and stuff like that. So in a way, there's a lot of lessons learned, both from SQL Server, from SingleStore, from
reading all the papers, and then actually the part of walking the walk and doing that.
So there isn't much magic in this, actually.
You need to have a strong team that deeply understands the underlying system. In this particular case, this is Postgres proper, plus the new storage that we're putting together.
There is a continuous process of building the team and shipping software.
And that is set the goals, build the thing, make sure it's robust and reliable, put effort into testing the system, put effort into
software practices that are around that, and be confident in the architecture itself.
The confidence matters, because the architecture is the hardest thing to change. If the
architecture is wrong and you need to change the architecture, now large swaths of code need to be
rewritten. The other thing that's hard
to get out of is the pickle
where you got the quality wrong.
If the quality is wrong,
then, you know,
you keep fixing the bugs,
but they don't seem to stop.
Yeah, it's really no magic.
It seems magical.
It seems magical from the outside.
You know, SQL Server, Postgres itself,
you know, any large system project,
I think, is going through that.
There's a certain amount of kind of maturity
that the project needs to get through
to achieve dominance.
The faster you get through this, the better.
The more people use it as you do this, the better.
And that's why we rolled out the system
for people to use for free
because now the stakes are lower,
and we are fixing things on the back end very aggressively.
Are you running a fork of Postgres, or is it stock Postgres?
So it's stock-ish. I guess it's stock Postgres with a caveat.
So what's the caveat?
Well, we have to change Postgres in a very surgical manner, specifically where Postgres reads a page from disk.
Instead, it needs to read a page from our remote storage by making an RPC call.
And when Postgres writes to disk and sends what is called a WAL record, a write-ahead log record, instead of writing to disk, it needs to send it over the network into our service, into our multi-tenant service. Those changes are not huge, but they're there. We've split those
changes into five separate patches that we are submitting upstream. They have not been accepted
yet, but we are working with the community for it to all get upstream. And once those patches make it upstream,
I'm really hoping for Postgres 16. If not, that will be Postgres 17. We're working with the
community on that. The community understands that we're not the only ones. There's also Aurora.
There's also some projects in China that are exploring similar architectures,
and those will benefit from this.
I mean, it's not a secret to the Postgres committers either that separation of storage and compute is the right way to go into the cloud.
So that gives me a good amount of confidence that the patches are going to be accepted,
but I cannot claim or guarantee that they will be accepted
because we need to get the buy-in of the community.
There are multiple Postgres hackers on the Neon team, including one of our founders,
Heikki Linnakangas, who is a quite prolific Postgres committer himself.
So he is spearheading that effort of packaging the changes that we made in Postgres and sending
them into the community for the final acceptance.
How much of your other work could potentially make it upstream or could potentially be duplicated
effort as Postgres core team decides this is the direction that Postgres needs to go?
Is there a lot of overlap there?
It's actually relatively little, believe it or not.
The storage part is a completely separate project.
It is open source.
I wouldn't mind if it was a part of Postgres,
but obviously that's a very long-term project
and it needs to reach certain stability.
If you look at the storage project on GitHub,
which by the way is distributed under Apache 2.0 license,
so anybody can do whatever they want with the code.
It's a very actively developed project.
There are commits like, I don't know,
10 plus commits every day that are going into it.
So I think building that storage by the Postgres team is off strategy for Postgres, or it seemed that way for Postgres proper. Integrating with a storage subsystem like this is absolutely on strategy for Postgres. So if, as you're suggesting, the Postgres community
realizes that, well, we want to have a distributed cloud-native storage system,
I think Neon would be the best candidate because by that time, it's a fairly mature system.
It's truly open source.
It's Apache 2.0 license.
We can re-license into a Postgres license if Postgres wants that to happen.
And that becomes a standard and a part of Postgres.
Now, while that's possible, I think it's kind of unlikely. I think the Postgres community will continue building Postgres and the Postgres engine, and make sure that Postgres plugs into Neon storage, and they will look at it as kind of an ecosystem play.
In terms of how you patch Postgres, does it have to be a patch or could it be an extension?
Is it something that can live in an extension?
It's a mix.
Yeah, it's a mix. There are five patches, and the reason there are five is that they're touching different parts, so we're just splitting them up. It could have been one patch, it would have just been bigger, but splitting makes it more palatable. And then for the majority of the changes, you know, you take stock Postgres, you apply those five patches, and you need an extension, the Neon extension. So that's how the overall system works. Of all the lines of code changed, the extension has the most, and those five patches are relatively small.
So how much work would it take for somebody to stand up some of the open source stuff that you have?
I mean, are the patches out there?
I assume if they're trying to get upstream, they're somewhere to be seen.
But is that possible?
Like could I stand up and run my own little Neon cluster for people, or something like that?
Yeah, you can.
Yeah, for sure.
So you will need Kubernetes.
Okay, I'm out.
Yeah.
You're out?
No, I was just joking. Keep going.
I suppose you can stand it up on your laptop. And so you will get branches if you did that. But if you want to stand up a service with like multiple computes and all that, yeah, it's all doable, but it will require some work.
But consuming this, if you just want to consume it, that's trivial, right?
You push a button
and in three seconds
you have Postgres.
Yeah.
That's what I'm more likely to do,
personally.
But there are people out there
who love to hack on these things.
Yeah, yeah.
And there are also larger companies.
You know, for example,
we are talking to
a Fortune 500 company which, I think, is spending $100 million a year on Postgres, just on the infrastructure underneath.
Wow.
And that multi-tenant storage approach and scaling down computes to zero can make a massive difference in their deployment.
And by the way, I don't think it's the only company that's doing that.
There's a lot of companies that use a lot of Postgres out there.
In that scenario, you said before
there's parts of the database that don't get used.
So it's like old data.
So it's there in the database.
How does that work to sort of take that away from the cache?
As you said, evict it from the cache.
Would you just keep certain parts
of the database alive, essentially?
And some of it just goes off the cache?
It's the data.
Some of the data goes off the cache.
So think about memory, then disk, you know, fast disk, like SSD, and then S3, right?
And those are the tiers.
And the latency for accessing data in each of those tiers is different.
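The memory / SSD / S3 tiering just described can be illustrated with a toy lookup path; the tier names follow the conversation, but the latency numbers are invented purely for illustration:

```python
# Tiers ordered fastest-first; latencies are made-up illustrative numbers.
TIERS = [
    ("memory", 0.0001),
    ("ssd",    0.001),
    ("s3",     0.1),
]

class TieredStore:
    def __init__(self):
        self.data = {name: {} for name, _ in TIERS}

    def put_cold(self, key, value):
        self.data["s3"][key] = value          # old data sits in the cold tier

    def get(self, key):
        """Return (value, simulated_latency); promote hot pages upward."""
        for i, (name, latency) in enumerate(TIERS):
            if key in self.data[name]:
                value = self.data[name][key]
                if i > 0:                     # promote to the fastest tier
                    self.data["memory"][key] = value
                return value, latency
        raise KeyError(key)

store = TieredStore()
store.put_cold("old_message", "hi from 2015")
_, first = store.get("old_message")    # cold read pays the S3-tier latency
_, second = store.get("old_message")   # now cached in memory
print(first, second)                   # 0.1 0.0001
```

A cold read pays the slow-tier latency once; the page is then promoted and subsequent reads are fast, which is the spinning-wheel-then-snappy effect described next.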
Now, you kind of have this experience by like scrolling Facebook messages with somebody you had talked to a long time ago and then going back into the history.
Sometimes you get a spinning wheel and that's what happens.
Loading spinners.
Yeah, and they bring that data from, I don't know, an object store or something, from somewhere on disk.
It's certainly not cached.
It's not in a very fast storage medium right now.
That would be the kind of experience your application will have. Certain queries
will have added latencies to them when you start accessing the older data.
But I think that's the way to go. People think, well, can we partition the data and last month of data is fast and the rest of the data is kind of slower and then 10-year-old data is super slow.
But the reality is the system can make those choices for you just simply based on the patterns of usage.
Right. There's a counter, you said, on the page, right?
Yeah, there's a counter on the page.
So the algorithm built into Postgres is the best candidate to make that choice, essentially?
On the Postgres side, yes.
And then on our storage side, there's a separate one.
And the difference, like I said, is we're combining those pages into what we call layer files.
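As a simplification of "the system makes those choices for you based on usage patterns," here is a tiny LRU-style eviction sketch. Postgres's real buffer replacement is a clock-sweep algorithm and Neon's layer files are their own mechanism; this only illustrates the general idea:

```python
from collections import OrderedDict

class Cache:
    """Tiny LRU sketch: the system decides what stays hot from access
    patterns, rather than making you partition data by age manually."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page_id):
        """Touch a page; return the evicted page id, or None."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # recently used -> stays hot
            return None
        self.pages[page_id] = True
        if len(self.pages) > self.capacity:
            evicted, _ = self.pages.popitem(last=False)
            return evicted                    # goes off to a slower tier
        return None

cache = Cache(capacity=2)
cache.access("jan_2024")
cache.access("feb_2024")
cache.access("jan_2024")              # touch jan again; feb is now coldest
evicted = cache.access("ten_year_old_row")
print(evicted)                        # feb_2024
```

Note that the least-recently-touched page is evicted regardless of how old its data is; usage, not age, drives the decision.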
In the pre-call, you mentioned AI.
Is this the place where AI might make some better predictions in the future?
Yeah, well, stuff's pretty simple, right? So basically where you want to use AI, or in some
cases, just like machine learning, is when there are multiple competing things. For example,
there are multiple caches that are competing for something. You can use control theory or you can
use AI for deciding what goes out of the cache and what stays.
In general, AI applications are split between, can I use it for my database engine?
The most famous paper is like Jeff Dean's learned index structures one.
And the paper made a lot of noise.
But I think the practical usage of this paper is kind of zilch.
Like it doesn't really make a huge difference
and nobody implemented it in really
big database projects. It's not
in MySQL, Postgres, SQL Server,
Mongo. And the other
place for using AI, and I think that's pretty
exciting, is in
autotuning the database. So there's a
startup by Andy Pavlo called OtterTune.
They are
twisting database knobs and they can make your database,
I don't know, probably up to 20% faster, maybe more for your particular workloads.
Then AI can choose indexes for you and that's where branches could be a very cool thing where
you branch, you unleash AI on the branch so now you're not worried that the AI is going to mess up
and take down your production database. In that branch, AI makes a bunch of changes like changes
the knobs, changes the indexes. You can fork the workload and test the workload on the branch.
And then you can be like, yeah, you know, like, makes sense. Do you want to send a pull request kind of thing? I think that's where AI is a lot more interesting than in like managing caches and deciding what
stays in memory and what goes on disk. Because there's like caching existed for like decades,
right? And there are classic algorithms that do this very well. Finally, the generative AI is fascinating. It feels like every day there's
an AI breakthrough. That's where I think developer experience can be impacted,
where you start generating SQL, or you start generating ORMs, or you start generating API
endpoints, or you start generating some sort of backend
code, you're generating stored procedures.
Nobody really knows by heart the syntax of stored procedures because you live in your primary programming language like Go, Java, JavaScript, and then you need to write a stored procedure. You have to scratch your head. It's like, oh, what's the syntax exactly in PL/pgSQL?
So that's where AI can really help by generating some of those things.
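The branch-based tuning workflow described a moment ago, try knob changes on a throwaway branch, keep the winner, never touch production, can be sketched as a loop. The benchmark function here is a made-up stand-in for replaying a forked workload on a branch:

```python
def run_benchmark(knobs):
    """Stand-in for replaying a forked workload on a branch and measuring
    throughput; a real setup would run real queries against the branch."""
    return 1000 * knobs["shared_buffers_gb"] / (1 + knobs["shared_buffers_gb"]) \
        + 50 * knobs["use_index"]

def tune_on_branch(candidates):
    """Try each candidate config on a (simulated) throwaway branch;
    production is never touched, so a bad config costs nothing."""
    best_knobs, best_score = None, float("-inf")
    for knobs in candidates:
        score = run_benchmark(knobs)      # "unleash the AI on the branch"
        if score > best_score:
            best_knobs, best_score = knobs, score
    return best_knobs, best_score

candidates = [
    {"shared_buffers_gb": 1, "use_index": 0},
    {"shared_buffers_gb": 4, "use_index": 1},
    {"shared_buffers_gb": 8, "use_index": 1},
]
winner, score = tune_on_branch(candidates)
print(winner)   # the config you might then send back as a pull request
```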
So replacing us app developers, not replacing you infrastructure developers.
I see how it goes.
I see how it goes.
Well, it's just that application developers will go first.
Eventually, we'll all go.
We'll all go eventually.
Yeah, fair enough.
Fair enough. So one aspect, so you've got decoupled compute and storage, and the other thing that I think about with regards to
cloud-native or serverless things is geographic distribution. And so you've mentioned Fly a few times. Disclaimer: Fly.io is a partner of ours, and so they sponsor some of our shows. They may sponsor this episode, I don't know. It might, I can't tell you, right? Yeah.
But you know, one of their slogans is run your app servers closer to your users, right? And that's also what, you know, Netlify wants to do and what Vercel wants you to do, and what I'm sure Lambda wants you to do. This whole like put the CDN all around the world and then do your compute in the CDN
edge nodes is like the new thing, right?
But the database has always been still in some data center in Virginia, right?
And so it's like kind of the mecca or the place where I've been waiting and talking
to people is like, how can I get my database close to my users as well?
And even when I was talking with Paul from Supabase, he was saying, well, that's a whole
different thing from decoupling compute and storage, like geographic distribution, CAP theorem, et cetera, et cetera.
Curious your take on that.
Like is Neon going to be Postgres serverless, but also running right there in your edge
nodes?
Or is it going to be, I mean, maybe for now, if you're on AWS, it's a possibility.
I don't know.
Talk to that whole subject.
Yeah, yeah.
So of course it's a dream
to be able to read and write
in every region
and the system magically
figures everything out for you.
Unfortunately, that's really hard.
And imagine at the same point in time
in New York and Tokyo, you're modifying the same row, right?
Because, you know, logically it's one row in the particular table.
You're modifying and one sets it to two, the other one sets it to three.
So which one?
Merge conflict.
Yeah.
So there's a merge conflict.
But then it's a live application.
Nobody just sits there not writing. So you can have a conflict, and then you need a process to figure that out. But who decides, right? It can be a human. Or you can say, oh, well, this row lives in Tokyo. So if you want to modify that row from New York, you pay the latency for modifying that row from New York, and Tokyo does not pay the latency because that row is closer to you. That's easy. But it still requires you to decide where this row lives. Now, there is actually a very practical solution to this. And
as a database person, it pains me a little bit because how simple that is. But I think from the
practical standpoint, it will actually satisfy a lot of users. And I think that's what we're going to
start with. And that's what Fly is doing as well, a little bit. So you split your queries into reads
and writes. You say your primary write replica lives in a region. You let the user choose that
region. You replicate it from that region to as many regions as you need.
You're actually unlikely to need more than five, but you can go all the way to 26, which is
the number of data centers in AWS, or it can go to 200 like Cloudflare.
At some point, it will get tricky to replicate to 200.
So you will need to separate replication from the engine as
well. But regardless, you can send reads to a local replica. But you need to understand that
that replica will be behind the master copy by X amount of milliseconds. I believe at some point, I haven't checked recently, but at some point Fly had a heuristic: if the replica is less than 400 milliseconds behind, we'll send reads to your local replica. And this, in a way, dumb approach can surprisingly go very, very far.
It will have side effects. Well, what's the side effect? Well, it's called read after write.
So you write and you immediately read what you've just written, and that thing might not have arrived at the local copy yet. So you feel like you wrote a number one, and then you're reading that number, and it's still an old value like zero or something. That can be mitigated by messing with the proxy. The proxy can detect those read-after-write patterns and, in that particular situation, send the read to the write replica as well.
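Putting the routing rules from this exchange together, writes to the primary, reads to a local replica when it is less than roughly 400 ms behind, and read-after-write detection in the proxy, a sketch might look like this (all names invented; this is not Neon's or Fly's actual proxy):

```python
STALENESS_BUDGET_MS = 400   # heuristic similar to the one described

class Proxy:
    """Route writes to the primary; route reads to a nearby replica
    unless it is too stale, or the session just wrote the same key
    (the read-after-write case), in which case read from the primary."""
    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.replica_lag_ms = 0
        self.recently_written = set()   # keys this session just wrote

    def write(self, key, value, lag_ms):
        self.primary[key] = value
        self.replica_lag_ms = lag_ms    # replica hasn't applied it yet
        self.recently_written.add(key)

    def replicate(self):
        """Replica catches up with the primary."""
        self.replica.update(self.primary)
        self.replica_lag_ms = 0
        self.recently_written.clear()

    def read(self, key):
        fresh_enough = self.replica_lag_ms < STALENESS_BUDGET_MS
        if key in self.recently_written or not fresh_enough:
            return self.primary.get(key), "primary"
        return self.replica.get(key), "replica"

p = Proxy()
p.write("counter", 1, lag_ms=100)
value, served_by = p.read("counter")    # read-after-write -> primary
p.replicate()
value2, served_by2 = p.read("counter")  # caught up -> local replica
print(served_by, served_by2)            # primary replica
```

Without the `recently_written` check, the first read could return a stale value from the replica, which is exactly the anomaly described above.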
And the more I'm scouting the market and researching alternative solutions, you know, Aurora ships multi-master, Spanner ships multi-master.
But the more I talk to people out there, I'm realizing that that simple and understandable paradigm is oftentimes
more powerful because of its simplicity compared to all the paradigms of like, okay, we're
going to run a distributed consensus over three to five locations in the world.
And now either all of your latencies are very long, or you need to put some sort of machinery
in place where you start fine-tuning.
Okay, well, this data lives here and this data lives there.
And if you let the system decide which data lives where, that introduces uncertainty.
And it changes from having a simple solution like AK-47 into this sophisticated thing where people just stop understanding when the latencies are short and when the latencies are long.
So what do we do internally at Neon?
I think we're going to ship what I just said, where we're just going to have multiple read replicas around the world.
And our proxy will be routing traffic to the local replica soon.
In parallel, we're working with
a famous database professor,
Daniel Abadi,
out of University of Maryland.
And he's a creator of what is called
the Calvin Protocol,
which is the foundation of FaunaDB.
He is applying similar ideas
into the Neon architecture.
The difference is row-based versus page-based.
Today, as we are halfway through the research project, that requires people to assign data to regions. And you can pose this as this thing called
partitioning. So for every partition, you need to say, well, this partition lives here.
The moment you do that, a bunch of things fall apart a little bit where creating an
index across all the regions becomes harder and stuff like that.
So while I find this fascinating and I've spent like a decade thinking about it on and
off, I think that simple, straightforward approach where you say, well, my primary is here and I'm going to have
up to 20 replicas in the world can satisfy 99.9% of use cases.
Yeah. It lacks a certain elegance, but it has a certain pragmatism. You're not the first person that has said that to me, and I remember the first time, I kind of rolled my eyes. I'm like, well, that's just cheating a little bit. It's defeatist, right? Like, it is kind of defeatist, but at the same time, it is going to be better for most use cases, just not 100%, right?
So yeah, if you think about something like, you know, an e-commerce use case, right? You really, really want your website to load fast, right? But that's a read query. Most of them are.
Yeah. And when you display a counter like inventory, yeah, you might be 100 milliseconds behind, but I mean, that's okay. You know, there will be cases where the user tries to buy something and somebody else bought it across the world and that thing disappeared, but that you can, you know, handle at the application level.
And when you buy something, processing your transaction, right, you send in your write
into the database.
Okay, sure.
That takes 200 milliseconds.
Fine, right?
People will wait 200 milliseconds.
200 milliseconds is not that long, actually.
If every page load is 200 milliseconds, that's a different story.
If one thing is 200 milliseconds and that's a write, or it's a purchase, or it's a cart,
that seems fine.
Now there's another interesting aspect here, and we're wrapping our brains around this
one as well, which is if some of your database calls, meaning write calls, are incurring certain latency.
Either it's cross-region or maybe even within region.
If your web page or your backend has multiple round trips to the database, those things tend to add up.
And at times, you want to run your compute right next to the database so those latencies are not adding up as much. So ideally, a web page could call a Lambda function next to the database.
And in that Lambda function, there's a bunch of code that's running maybe in the JavaScript runtime.
Or I'll tell you about some other ideas.
And that you want to run right next to the database because that piece of code is like, well, query that table, get data from that table, query a few other tables, run some local compute,
do some more requests to the database. And if you go back and forth, those latencies will start adding up. So there is a reason to run some sort of language runtime right next to the database, or there's a reason
to give people access to potentially a VM compute right next to the database.
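The round-trips-add-up arithmetic is worth making concrete; the numbers below are illustrative, not measurements:

```python
def page_latency_ms(round_trips, per_trip_ms):
    """Total DB-induced latency for one page load: sequential round
    trips multiply the per-trip network cost."""
    return round_trips * per_trip_ms

# App server in another region: say 50 ms per round trip, 6 queries per page.
far = page_latency_ms(6, 50)    # 300 ms spent just in network hops
# Code running right next to the database: roughly 1 ms per round trip.
near = page_latency_ms(6, 1)    # 6 ms
print(far, near)                # 300 6
```

Same queries, same database; only where the code runs changes, which is the argument for an execution runtime next to the database.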
I don't know which ones we're going to choose.
Either we're going to have VMs, and we can say, well, push them, whatever code you want,
into a VM that sits right next to the database, or we will run a V8 runtime, or we will run the Cloudflare Workers runtime, which these guys open sourced not too long ago.
And it's not like our aspirations to not be the database, we're a database company.
We just see this use case and we want developers to build the best possible apps. So having some
sort of execution runtime for arbitrary code right next to the database
seems to make a lot of sense.
And so that's another thing that we're actively exploring. This episode is brought to you by Retool,
and they have a private beta ready for you to check out.
This is the fastest way to now build native mobile apps for your mobile workforce.
There are no complex frameworks anymore or tedious deployments.
You can build mobile apps with what you already know, like JS and SQL.
This is all in the browser, no code or what they call low code.
Join the wait list.
Head to retool.com slash products slash mobile.
The link will be in the show notes.
Again, retool.com slash products slash mobile.
A few months back, back in July, you doubled your funding, which obviously gives you more runway and more money to dream with. One of the kind of key parts that you're
focusing on seems to be developer experience. You've got
three kind of different things laid out in your post when you announce your funding
which was, we talked about this already, serverless, branching
and time machine. When it comes to attracting
developers to adopt this,
those three things seem to be the main thing.
But what else developer experience-wise
really makes Neon shine?
Let's first talk about what is developer experience.
Developer experience is, first of all,
it's something you experience
and when you see the company's got it right,
you kind of feel it.
And so what are those companies?
Well, I would love to highlight Vercel, Netlify, Prisma, Replit, Fly.io, and of course, GitHub.
They get the developer experience right.
There's a bunch of others that I haven't mentioned.
But still, if you look at those six, for example,
they get it right. But if you were to deconstruct what makes a good developer experience, or DevEx, the first one is CLI, API, and docs. The documentation needs to be very good, very easy to consume. Everything that you do should be available over the API and the CLI.
It's super addicting, actually, when you go and spin things up over the CLI.
You control the system over the CLI.
You look at the UI and that is all reflected there.
You have this positive reinforcement as you do that.
I think that's very important.
Everything is instant.
So developers don't like waiting for provisioning.
For example, there are cases where you spin up an RDS instance or an Aurora instance and you need to wait up to 40 minutes.
That's nuts.
When you just need a database, you want to click a button
and you want to get it.
So in those, every second counts.
Think about your developer flow: you have the ultimate hacker keyboard, you've optimized everything, but certain things force you to wait minutes or sometimes an hour.
The third thing is cold starts.
And we're not all the way there. I'm chatting with Guillermo
over cold starts. And he's saying like, this is the hill I'm dying on. Like the, you know,
cold starts are bad. We haven't solved it all the way. We will be solving them through caching.
For us, when we scale to zero, it takes two seconds to spin back up, which impacts the
application experience,
developer experience. It's still gigantically better than everything out there, but we need to get it down to like 100 milliseconds or so. There's other things that go into developer
experiences. One thing is to run the app and the other thing is to build the app.
And the thing that contributes to developer experience is instantly shareable environments, multiplayer-type experiences where you're building an app
and you just want to send a link, a short URL to somebody else and say,
hey, check that one out. And when they click on it, they have a preview. And from that preview
of the app, they can also be dropped potentially into the developer
environment for that preview.
There's an application preview, there's a developer environment preview.
And the easier you make that, the more team collaboration benefits you will start reaping.
And that also is addicting, right?
Because people work in teams.
People don't work solo usually.
The other one that I want to mention is CI/CD and push to deploy. I think Heroku famously, you know, git push heroku master.
That's what's really cool.
If you think about it, so you just do Git push and this thing is live in production.
The reality of today, though, is that Heroku or not, people are using CI/CD pipelines. And when it comes down to CI/CD pipelines, the notion of a branch and the notion of a pipeline is there.
So all this shareability that I just talked about in terms of like, okay, well, here's a preview environment, why don't you take a look at it, actually applies to automatic test pipelines that your code is coming through. The place that does not fit well into CI/CD pipelines is usually the database. That's the one that you cannot just fork into 20 copies and run 20 tests against, each one with its own database copy, in parallel.
and these are the things that will be
possible with Neon.
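The fork-the-database-per-pipeline idea can be sketched with a copy-on-write structure standing in for a storage branch (a hypothetical illustration, not Neon's branching implementation):

```python
class Branchable:
    """Copy-on-write sketch of database branching: each test pipeline
    gets its own writable branch, and writes never touch the parent."""
    def __init__(self, data=None, parent=None):
        self.local = dict(data or {})
        self.parent = parent

    def branch(self):
        # Cheap: no data is copied, only a pointer to the parent.
        return Branchable(parent=self)

    def get(self, key):
        if key in self.local:
            return self.local[key]
        return self.parent.get(key) if self.parent else None

    def put(self, key, value):
        self.local[key] = value          # write lands on this branch only

main = Branchable({"users": 100})
# Fork 20 branches and run each "test" against its own copy in parallel.
branches = [main.branch() for _ in range(20)]
for i, b in enumerate(branches):
    b.put("users", i)                    # destructive test writes
print(main.get("users"))                 # 100 -- main is untouched
```

Because a branch starts as just a pointer, spinning up 20 database copies for 20 tests costs almost nothing up front, which is what makes the CI/CD story plausible.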
I'm looking forward to that.
I'm glad you defined developer experience, because a lot of people seem to mix and match definitions.
I think in your case, the CLI, the docs,
these seem like easy table stakes.
If you don't have docs I can read and I
can't dig into your CLI, you're right. It is
addictive. Once you get into something and there's good documentation,
it's easy to kind of get deeper into that.
But, you know, defining it seems to be the challenge.
Getting it right might even be, definitely is, I guess, harder, right?
You can know what it is, but getting it right is the hard part.
Well, yeah, but you need to know where you're going, right?
You need to know where you're going.
Right.
Are we there with our developer experience?
No, we don't have a CLI.
But I'm looking at our roadmap,
and our CLI is going to drop in November.
So I know we need to have it.
So that's what we did.
We started with defining what a great developer experience is.
We put it on the roadmap, filed the tasks,
and the team is cranking.
Yeah. Where does your roadmap live? Is it easily accessible to the world? I haven't found it yet.
So that's a great question. I actually want to make it public.
Because my question after that is like, if you've got it out there,
or you have these ambitions, and you care about DevEx, how do you communicate that? Because
if you have ambitions, and, you know, I know I want that CLI and you don't have it yet,
how do you tell me that you care and you're working on it?
And if there's no feedback loop between me, the end user,
the dreamer, the user, and then you.
There's a little bit of a feedback, but you're absolutely right, Adam.
I will actually bring this up in the next staff meeting.
We have them every Monday.
I do want the roadmap to be public. It's not public today. It does live on GitHub right now
in what's it called? GitHub issues. Yeah, that's the word. And I'm staring at it right now.
Let me stare at it too.
Yeah. I mean, it's private right now, but there's no reason for it to be private.
Yeah, mark my word, we're just going to flip it public.
Well, it wasn't to call you out to say that, but more like acknowledging that there's a feedback loop, and it's clear communication and expectation setting. So if this future that you're building, all this magic we're talking to you about, is coming to fruition,
and I want to go there with you,
if I can see a glimpse into your future, your horizon,
well, then I can buckle down a little further.
I can deal with that two-second delay on my cold starts
because I know you're desiring to get it to 100 milliseconds.
Yeah.
Yeah, you're absolutely right.
Point taken and we'll be making it public.
How long have you been working on this and how far do you think you are from your first paying customer?
So we've been working on it 18 months roughly.
So we started payroll March 1st, 2021.
And when we started payroll, we had close to zero lines of code written.
So we had, you know, three founders in a slide deck.
Right now, the team is 36 people.
The majority of them are engineers.
It's a remote-first company.
The majority of the people are system engineers working on the storage.
But there's obviously an SRE team and a cloud team.
The service is up and running.
It has more than 2,000 users.
There are 7,000 signups.
And I think we'll make our first dollar in Q1, 2023.
We are already having people who are using it,
not for toy projects, but for real production projects.
And so this will be the first, you know,
the first dollar we're going to make into the company.
So call it two years, roughly.
Is that about what you expected?
Has it been easier than you thought?
Harder than you thought?
What's the journey been like?
It's about on track.
You know, you need to build the freaking storage.
Yeah.
Put that on a t-shirt.
Yeah, yeah, yeah.
It's a complex systems project
that has a certain amount of maturity.
And then you need to build everything else as well, which is the cloud service.
So I think we're right on track.
We could make more money sooner by just talking to larger enterprises and selling this to someone before we're fully ready. That happens all the time, by the way, in startup building,
where you sell it to the user, and in return, they get a better deal and the right to drive your roadmap.
We chose not to do it this time around.
That's what we did with SingleStore.
We lined up a bunch of banks.
And I think we got Goldman Sachs. We didn't get them before two years, by the way. We started SingleStore in 2011, and we got Goldman, I think, two years after, sometime in 2013, for a very small
workload. So that's one way of building things. The other way of building things is to create an offering in the cloud, attract people, and then cherry
pick those who have more of a hair-on-fire problem and the ones that will have a more,
I wouldn't say relevant.
Basically, whoever you choose and whoever you listen to have a big say on your roadmap.
And when you choose very large companies, the say will be around enterprise features such as security, encryption, integration with, you know, Active Directory, and things of that nature. Versus SMB and mid-market: they will care about productivity. They will care about small teams. They will care
about cost. And that will set the foundation of your system in a much more robust way.
When it does come to generating revenue, that first dollar, the first many dollars,
you're taking a bet, I guess, well, I guess it's kind of been proven by other serverless business models out there.
But you're not going a traditional route, which is, as you said before, subscription, which is kind of easy to define.
Well, this customer signed up for X, X per month, X years if we retain them, etc.
It's easier to sort of predict some future.
How do you expect the volatility of usage-based consumption to
impact revenue? What's your thoughts on that front? I think the important thing in the
consumption-based pricing and the consumption-based approach is that it aligns the value of the
product with the value to the customer and the customer consumption. And then eventually it will align this to your sales team as well.
If we're getting paid for something that is not used, eventually it will be turned off.
It will be discovered and turned off. But if we are providing value with more usage,
we're providing more value, then the usage will grow. So in a way, subscription-based pricing
does not keep you as honest as a consumption-based pricing.
And a bunch of Amazon's revenue is people forgetting to turn off EC2 instances.
And I'm guilty myself.
I've done this before.
And then in early SingleStore days, a $3,000 bill arrives. And I was like, oh my God, I just forgot to turn off a database instance. Like, that was literally it. By the way, Amazon forgave it for us. But I was ready to put my personal money to this because, like, I forgot it. And in the consumption-based system, you will never do that. I think the consumption-based model is proven by now by companies like Snowflake and Twilio, which are purely consumption.
And I think that's where the world is going.
And, you know, give it a few more years and this will be the expectation in the market.
I'd like to see that on like Disney Plus or Netflix.
Because there's times I've had Netflix, a subscription,
and I've watched nothing for a month at least. Maybe one show, maybe a couple because I've gotten busy or I've, you know, prioritized summer and family or whatever. And still yet the bill
comes along. But that is mentioning Amazon. That's a big thing with them: there's a whole cottage industry of like, explain my Amazon bill to me. And I say Amazon, it's actually AWS, but yeah.
The point is, yeah, I mean, the video services are a great example of that, right? You know, I have a family and my kids watch Disney. I watch, you know, House of the Dragon on HBO. I do have Netflix. I haven't watched Netflix in a while, and then I just kind of forget to turn it off.
Yeah. It's almost like, you know, the max I'll pay is X, the least I can pay is zero. Because if I watch nothing, then charge me nothing. But if I watch it enough, charge me half, right? You know, something, but there's a max, you know, the full subscription amount.
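That hybrid, pay nothing at zero usage, pay per use, but never more than the flat subscription price, fits in a few lines; the numbers are made up:

```python
def hybrid_bill(units_used, per_unit, cap):
    """Usage-based billing with a subscription-price ceiling: zero usage
    costs zero, and heavy usage is capped at the flat rate."""
    return min(units_used * per_unit, cap)

print(hybrid_bill(0, 2.50, 15.00))    # watched nothing -> 0.0
print(hybrid_bill(3, 2.50, 15.00))    # a few shows -> 7.5
print(hybrid_bill(40, 2.50, 15.00))   # heavy month -> capped at 15.0
```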
And there are 20 subscription services out there now, right?
And you only have this much time in the day.
So you kind of want to pay by consumption and not think about it.
But that's unfortunately not the life we're living in.
Yeah, that's not well aligned with their incentives.
I'm happy to hear that you're aligning with the value of the customer
because that's a great answer.
It's one thing to say, well, we have a long take on this game, or we're playing a long game, that's one answer. But the other answer is, we want to align with the customer's value, because that is so true. Like, you can get a bad reputation, or just a reduction in value or trust, if you charge for things that are not actually being used, which is how subscriptions work.
But if you're aligned with their actual value, this is what you consume, this is what you
use.
100%.
And when this thing works, truly works, it's beautiful.
Because now you have that simplicity across the board.
Now your salespeople are just trying to land a customer at any consumption, because you know
that your product is very good and it will grow.
And then you're compensating them for educating their customers, the accounts that they're
working with, on how they can drive more value by using it here and there,
which in return drives consumption.
Now their sales commissions are attached to consumption as well.
And you're becoming
a truly consumption-based company.
From a sales perspective,
it really makes a lot of sense
because there's almost zero risk
to the customer, right?
And it's an easier opportunity
for the salesperson
to communicate the value
because you're not saying,
well, it's X per month
and you're going to overspend or underutilize and all that stuff. Instead it's more like,
no, you only pay for what you use. And so, as long as the tech aligns,
of course, and the value is there technically, then the sale kind of does itself.
It's almost just done for you. And it's just a matter of aligning the value, educating,
as you said,
and having a good team behind you that can not just sell,
but also educate and go deep into a customer account,
versus just simply selling one service here and that's it.
Correct. That's precisely right.
You don't need to sell me on Postgres
because I've been a user both professionally
and privately or personally for like 15, 16 years.
So to me, I saw serverless Postgres and I was like, let's talk to these people.
You are VC funded.
And I'm no VC, but from what I've read, they're going for grand slams a lot of the time.
What they want is that vision of a potential unicorn or deca-unicorn, who knows what they are now.
They want to sell Figma for $20 billion.
And by latching onto Postgres, you've hitched yourself to a really nice racehorse, a great one, I think the best one for a lot of cases.
But it is de facto a segment of the market, right?
You've basically segmented yourself, and you can't get that MySQL
Fortune 500 company or the folks running Oracle unless they're ready to switch to Postgres as well.
And so I'm curious if there was pushback, you know, during your pitches,
these conversations. Like, was Postgres a thing that you had to sell
to potential investors, or was it something that they were excited about?
It's the same as you said.
You don't need to sell people Postgres.
Even VCs?
VCs are not dumb.
Oh, I'm not saying they're dumb.
They're removed in some cases.
I'm saying that they're going for larger markets or things maybe.
But think about it the following way. From the VC standpoint, there is a market.
It's called the database market.
That market has players, and those players have share.
And that share is measured in dollars and in usage.
And then you can also measure share in terms of mind share.
So if you look at share of usage, Postgres is going up and up and up.
So that's data point number one. And earlier I said out of the top five databases,
Postgres is the only one that's growing share.
The second thing is within the database market,
there is an on-prem market and there is a cloud market.
And the cloud market is the much faster growing market.
That market is dominated by the cloud hyperscalers,
by Amazon, Google, and Microsoft.
There is only one public database company that's relatively modern and relatively recent.
That's MongoDB.
There isn't a public relational database company, and developers are increasingly choosing Postgres over MongoDB.
Another data point is AWS Aurora is a $3 billion run rate business
going into potentially four and a half, growing 50% year-on-year into the next year.
That's MySQL, Postgres, and MongoDB. The MongoDB one is kind of fake: it's built on top of Postgres,
because MongoDB's license prevents them from running MongoDB compute.
So all of those data points highlight that there are a lot of dollars in the cloud database market.
Postgres is a matter of fact.
In a way, it's kind of like Linux, right?
And so it's not like you can own Postgres.
That ship has sailed.
But can you have the best-in-the-world Postgres cloud service?
I think that's an open question.
And that's what we're going after.
That was really, that's really the pitch.
It was that simple.
I was going to say, that sounds like what you would have said in your meetings.
I liked it.
What's left?
What do we not cover so far?
What's something you wish we had asked you that we didn't, or something you wish we could cover that we haven't covered yet?
Well, wish us luck.
I think that's one.
Good luck.
We need that.
Obviously, engineering our future.
We have a fantastic team.
So I think that's for the most part what we need.
We think that where we go is pretty clear.
And we're refining that North Star every week as we get more information.
We want the world to fully buy in on Postgres.
I think we're getting there, and then we need the world to buy in on serverless.
And once those things happen, we need the technology to work, which we're making better every day.
You've got some job openings, I see, at least in your announcement post from back in July,
I'm sure there's some of those job openings
still available. Some in engineering,
some in product, obviously.
The two jobs that we're looking for
right now are a UX designer, and
we're potentially looking to bring in
a developer relations lead,
more of a senior person.
We have one fantastic individual,
who's named Rauf,
who is running our dev rel right now.
But from what we're hearing from the board,
it might make sense to bring a very senior person
to drive the developer relations effort.
So these are the two positions
that we're hiring for right now.
We're always hiring for engineers.
We've been blessed.
There's a line of people who want to work at Neon on our storage and our cloud service.
So we feel we're very fortunate because the system is open source.
It's written in Rust.
So that's like candy for a systems engineer.
Hopefully, this will stay.
But as of right now,
we have more applicants than we can process.
And we just added nine last month.
There you go.
You mentioned the roadmap is coming, or potentially coming.
You mentioned the desire for a CLI.
What else might be out there on the horizon?
What's something that maybe people know less of, or not at all, that you could share on the show?
Yeah, so there are a couple of things.
We touched on autoscaling, and all of that will be packaged into the final experience,
where you will have some visibility of how much compute you're burning.
And then underneath, that's going to be that live VM migrations
and adjusting the size of the compute
with regards to memory and CPU.
So that's coming.
A number of integrations are coming.
First of all, watch out
for an announcement next week.
I can't say what it's going to be,
but there will be a big announcement
of integrating with a major developer platform.
And more such integrations will come out over time.
We'll be announcing regions.
That's kind of table stakes for a database service, but that's going to happen.
And then we're also experimenting with some of the generative AI stuff.
We're only going to launch it if we internally feel that it provides a ton of
value. But that is about automatic index suggestions: automatically branching, applying
the index, and then sending a pull request to change the schema. Those are some of the things
that are brewing in our labs, which is kind of cool. But again, we're only going to do that if we're confident that this is not
a toy, that it's really useful for the developer.
So it's currently a technical preview, right? You have to request early
access or have an invite code, and you can log in with GitHub or,
I believe, with Google. So you can SSO to get
in either way. What's the wait?
If people get done with the show or maybe midway through the show
and this question is too late,
how long will they wait?
What can they expect?
Barring the things that are unexpected,
which is like, you know,
we're about to remove the invite gates
and then in the last second,
somebody's like, no, you can't do this
because X and Y will break.
Well, barring that, in November we'll drop the invite gate.
Okay. So soon.
Yeah, it's very soon.
And in Q1, we're going to turn on all pricing and billing.
So the team is working very, very hard.
We already know the pricing structure, the pricing model, where we're going to charge separately for storage, compute,
and data transfer.
So in a way, kind of really aligned with what it costs us to run the service.
And then it's elastic and scales to zero.
So if you're not using it,
you're only paying for storage.
And if your storage is zero, your bill goes down to zero.
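As a rough illustration of the pricing structure Nikita describes, with separate meters for storage, compute, and data transfer, here is a minimal sketch. The unit rates are hypothetical placeholders, not Neon's actual prices:

```python
def monthly_bill(storage_gb_months: float,
                 compute_seconds: float,
                 transfer_gb: float,
                 storage_rate: float = 0.10,    # $/GB-month (hypothetical)
                 compute_rate: float = 0.0001,  # $/compute-second (hypothetical)
                 transfer_rate: float = 0.09    # $/GB transferred (hypothetical)
                 ) -> float:
    """Each resource is metered and billed independently, so an idle
    database (zero compute, zero transfer) pays only for storage, and
    an empty, idle one pays nothing at all -- i.e., it scales to zero."""
    return (storage_gb_months * storage_rate
            + compute_seconds * compute_rate
            + transfer_gb * transfer_rate)

print(monthly_bill(0, 0, 0))      # empty, idle project: nothing to pay
print(monthly_bill(10, 0, 0))     # idle database: storage meter only
print(monthly_bill(10, 3600, 5))  # an active month: all three meters add up
```

The point of the structure is the alignment he mentions: each line item tracks something the provider itself pays for, so the bill approximates the cost of running the service.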
So these are the pieces that we need.
We need regions.
We need larger computes.
We need pricing and billing.
And once that's there, we're
ready to roll. We'll drop the invite
gate even before we have pricing and billing.
And it'll remain free, obviously, until
Q1, and then what happens? Will there be like a grace
period, like, hey, a free tier,
a generous free tier? What can people expect?
The generous free tier will stay.
We'll give you a certain amount of consumption per
month for free. "Git push heroku master", those four words were sort of blazing in all of our brains. And, you know,
that's kind of gone now, but if you want to be a long-term player, it might make sense to always
have a generous free tier and keep it that way so that you can invite those who want to play and
tinker to do so. Well, it comes down to the model and it comes down to the level of what it costs
you. And it comes down to a certain level of abuse. Yeah, for sure. When you give people arbitrary compute, you will be abused, right? Because, you know, you can turn free compute into value. You can mine Bitcoin, you can do
DoS attacks; there's all sorts of malicious behavior that you can expect on a popular
platform.
When your platform is not popular, you're dying for that traffic.
And so that is the push and pull, right?
So on that first point, it's harder.
Databases are arbitrary compute, but it's not as obvious as just having access
to a VM, right?
It's less arbitrary than VM access.
So I think the level of abuse will be there,
but naturally it will be less.
Well, that'd be a good spot for AI: to detect that stuff, right?
To machine learn what abuse looks like
and you can sort of evict them from the cache.
Get out of here.
But then you're writing all the code.
Yeah, you're spending all your money on fraud and abuse.
That's not where you're at.
Yeah, you're spending money on fraud and whatever.
And so, for example, Fly.io gives you a generous free tier, but they ask for a credit
card ahead of time.
So that's like adding a little bit of friction. Now, Fly.io gives you actual VMs, right?
And what I'm saying is, with databases, I expect less abuse than on a general-purpose
platform, but there will be some.
Right now, obviously, we want that free traffic and free usage.
Then it's a model.
If I put my business hat on, there are a certain number of people coming onto
the platform.
It costs you this amount of money.
You fine-tune what those free tier boundaries are to maximize your long-term goals, your long-term
trajectory as a company.
And that's an important thing that we always want to optimize for long-term.
And that's, in a way, what venture capital allows you to do, right?
So you can really make sure that you build a very capable platform.
You can reach a certain amount of scale of users coming in.
And by the way, the more users, the more stable the platform is, because you start seeing all
sorts of failures and fixing them.
And that's another reason to stay free for a while, right?
Once you start charging people money, the expectations on uptime and quality are higher.
And maturity takes time.
I think since we're not doing this for the first time,
I think we can get there faster than, let's say, SingleStore,
but it will still take time.
Yeah, so that's kind of how we think about free tier.
We're taking a very, very practical approach to it.
We want people to come in.
We want people to see value.
We want people to eventually convert and become paying customers.
Well, speaking of conversion and the potential of many, there is a code.
I'm curious, can you share a code just for our listeners?
Is there an invite we can just give to everyone who listens to the show?
Is that feasible?
Is that too much?
You tell me.
Yeah, there's a partnership with Hasura that is currently slated to launch on the 11th.
And if you come to Neon through Hasura, by pushing a Neon button on the Hasura dashboard (and we'll
be replacing Heroku on Hasura), then you will bypass the invite gate. You will be
dropped onto the Neon console, and you don't need to have an invite code
to start using Neon.
Yeah, good stuff.
Well, if that's out there, Hasura, awesome.
If you're just listening to this
and you don't know what Hasura is
or you don't have access to that,
then, well, I guess just wait till November sometime, right?
Because that's when it actually opens up.
The gates are down.
So just a temporary wait for anybody who might
be listening. Finally, you can just tweet
and say "Neon database,
I want an invite," and we'll DM it
to you. Gotcha. Cool. Well, there's some ways
then. Anything else, Jared? What else
is left? We put it out there, didn't we?
I think we've done a good job covering
it. I'm excited for you all. I'm rooting
for you. Like I said, a big Postgres
stan over here.
So I want to see it move into the future alongside all these other players
and opportunities.
Resource-based billing,
I mean,
it's going to be awesome.
The regions,
it's going to be cool.
The branching is already cool.
So very excited for what you guys are building, and wish you the best of luck
on it.
Absolutely.
Absolutely.
Thank you so much.
Thanks, Nikita.
Okay, serverless Postgres is finally here.
Neon is bringing it to the masses.
We want to hear from you.
Is Postgres your database?
Are you excited about serverless Postgres?
Is this model something that gets you excited?
Sound off in the comments.
The link is in the show notes.
And for our Plus Plus subscribers, make sure you stick around.
We have a bonus for you.
Again, a big thank you to Fastly and Fly, and to Breakmaster Cylinder for those awesome beats.
And, of course, to you, thank you for tuning in.
We appreciate you.
That's it for this week.
We'll see you on Monday.