Postgres FM - Postgres year in review 2022
Episode Date: December 30, 2022

Here are links to a few things we mentioned:

1. Startups building momentum
Aiven raised $210m
Timescale raised $110m
Hasura raised $100m
Supabase raised $80m
Neon raised $30m
Hydra ... OrioleDB

2. Educational resources
Postgres FM started 👋
Postgres TV became more active (including topic playlists)
Tobias Petry tips on Twitter and SQL for Devs
Hussein Nasser YouTube channel (backend engineering)
Postgres Weekly (newsletter)

3. Sharding progress
Citus goes fully open source
SPQR
pgcat
Sharding Postgres at Notion (blog post)
PlanetScale (MySQL)

4. Database branching is coming
Database Lab Engine
Neon branching
OrioleDB branching
Crunchy Bridge
Database branching episode

5. Postgres is everywhere
All cloud providers
AlloyDB announced
Kubernetes operators (comparison blog post)

------------------------

What did you like or not like? What should we discuss next time? Let us know by tweeting us on @samokhvalov / @michristofides / @PostgresFM, or by commenting on our Google doc.

If you would like to share this episode, here's a good link (and thank you!)

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I'm Michael, founder of pgMustard, and this is my co-host Nikolay, founder of Postgres.ai.
Hey Nikolay, what are we talking about this week?
Hi Michael, this is episode number 26, and since we didn't miss any week,
and since each year has roughly 52 weeks, that means it's exactly half a year of doing podcasts every week.
How good is that?
Yeah, it's pretty good, right? When we started, of course,
my goal was not to miss any weeks, because when you start missing one week, then you start missing two weeks, and so on. But we have
good steady progress in these terms, right? So we started in June, I guess, or in July.
I think it was the beginning of July, I'm not 100% sure. Well, congratulations to you.
Thank you, and thanks to everybody who has been giving us encouragement and feedback to keep going.
Yeah, this is our fuel, definitely.
And we receive it every week, which is very encouraging.
So this was just a side note, but since it's the end of the year,
let's maybe wrap up with a summary of the year in terms of the Postgres ecosystem and broader community.
It will probably be a very opinionated list, not ideal.
We don't have the goal of having an ideal list.
It's just something which, for example, I remember
and consider important, interesting, entertaining, and so on.
So how will we name it?
Postgres 2022: most interesting facts and events. Right. Okay,
I have a list of five items. You also have a few items to add. Let's start from my list,
because it's my turn to define the topic. And I will start from the observation that the Postgres ecosystem has very strong startups.
And despite issues with the global market, I mean fundraising and so on, this year showed very, very good results in terms of money raised by Postgres-related startups, and also new startups appeared.
And I would start with Aiven here.
They are not strictly a Postgres-only company, of course, but they
started, as far as I know, by delivering services for, first of all, Postgres and Kafka, and then extended
and so on. So they are quite strong in Postgres. And previously, in late 2021, they achieved
a $2 billion valuation, and the total raised this year reached $420 million. It's quite impressive.
And in February, two more companies reported that they had new rounds
and achieved a $1 billion valuation. These two companies were Hasura and Timescale, if I'm not mistaken, right?
Yeah.
And we also have a bunch of other companies who raised a lot.
For example, Supabase.
And we have new startups.
A very recent one is Hydra, right?
Hydra.
How to pronounce it?
I say the latter.
I say Hydra, because I think the...
Yeah. And it's interesting that Neon, of course, also had a release this year. They didn't appear
this year, but they had a release this year, and we will talk about Neon a little bit in a different item of mine as well. And I like how these
three startups, Supabase, Neon, and Hydra, choose
their style of value proposition: we are this, but open source,
right? So Supabase is an open source Firebase alternative.
Neon is an open source Aurora alternative.
And Hydra is an open source Snowflake alternative.
And all of them are built on Postgres. And this is super great.
And I think, in my opinion, this item in my list is the strongest, biggest, and
most interesting, because these
achievements bring a lot of energy: not only money, but a lot of minds and a lot of new small projects, open source projects, and so on.
This is great.
This shows how the Postgres community grows and the Postgres ecosystem grows.
And these are new heights for the Postgres ecosystem. But I think,
in my personal opinion, this also means that the Postgres community is much bigger than the Postgres
project itself, because there are different companies, different organizations, different
people. And Postgres is just in the center, right? But there are many things around it, and they are growing in terms of business,
in terms of user base, and so on.
What do you think?
Yeah, it's really cool.
It's an exciting time to be in the Postgres space for sure.
Something I don't think I realized coming in is how much of this money would be around hosting.
A lot of them are hosting providers, which is super interesting.
That seems to be where a lot of the... or at least that's how they're monetizing.
It's the cloud offering that they're charging money for.
Hence, they can be open source in a lot of cases.
But it's really cool for Postgres core as well.
I think a lot of them are hiring maintainers or trying to hire people to
work on Postgres Core. I think it's really healthy for that group to be gradually populated by people
from different backgrounds. I think in the past, it's been a mixture of people in the community,
but a lot of consultancies and a lot of industry, but not so much actual hosting providers
contributing back to Postgres.
And I think this year with the companies you mentioned,
but also even with the likes of Amazon
hiring Postgres core team members,
I think it's an exciting time
and hopefully that group will be able to achieve more
with more resources behind them, with more money involved.
Yeah, it's good that you mentioned it.
I think this is just the model that works best today:
to provide a cloud offering, like managed Postgres,
but with some additions or a heavily modified Postgres version.
But it's just the model that works best today.
And indeed, RDS is, I think, still obviously the leader here,
but others try to be somehow different, to add something and so on,
and they are interesting in various senses.
Not all of them are just hosting. If you talk about Supabase or Timescale
or Hasura, they're not just Postgres. They provide Postgres plus something, like a time series
extension. It's an extension, but it feels like a different database, maybe.
I mean Timescale. Or Hasura and Supabase: they provide an API or GraphQL out of the box, so you immediately have
development sped up.
That's interesting. But indeed, yes, the model that works best now
is a cloud offering.
So we have a lot, a lot, a lot of options to choose from
if you want just a Postgres database, plus something maybe.
It's a headache right now already:
dozens of options to choose from.
You need to compare, and it will take time.
But it's good. I think it also emphasizes my idea that Postgres is not just a single something.
It's very big and consists of very different parts. Postgres, I mean the community and ecosystem,
consists of many different things.
And I see it as a huge bazaar.
It's indeed open source.
If you recall "The Cathedral and the Bazaar" by Eric Raymond, right?
So Postgres is indeed a huge bazaar
and has a lot of players.
It's great.
And also I wanted to mention,
since we mentioned RDS, for example,
that this year I changed my mind about Google Cloud SQL,
because previously they had only, like, eight knobs, and my feeling about
their Postgres offering was that it's quite weak; I cannot run a serious project on it. This
year I completely changed my mind. Maybe because Hannu Krosing visited our Postgres TV episode.
It was a great episode about vacuum.
I can't recommend it enough.
But also, obviously, it's improving in terms of product, and it's also interesting.
Okay, this was item number one in my list.
The second item is actually that our podcast started this year, right? So this is
super important news for the Postgres community and ecosystem, I guess. And also Postgres TV
started to become more active as well. So first of all, for those who maybe don't know,
we also publish our episodes on YouTube.
Sometimes it's more convenient, sometimes less, it depends. But
besides our wonderful podcast episodes, Postgres TV is a YouTube channel. It has existed for a few years, but this year it started
to be much more active. We invite various guests. Sometimes
it's an interview, sometimes it's to redo a talk, to record it
and distribute it. It's called Postgres Open Talks, and the goal is to provide good
talks to a wider audience.
Also, we have a good collection of playlists, for example
Postgres Backups or Postgres Replication,
and so on. And those are
materials from other channels.
And I think it's
already approaching 500 videos about Postgres. So go check out postgres.tv. It will redirect you
to YouTube, and you will see a lot of interesting stuff. There is always something to learn about.
Actually, when I invite guests, I learn a lot. I do this also as a selfish goal.
It's not only for community reasons to share,
but also like, okay, I have like one hour
and I can dive into some topic as deep as possible
because the guest who is working in this area
may be one of the best experts in this area in the world, right?
So it's really great.
So TV, FM, and maybe others also
publish materials online these days. Maybe COVID helped us realize that online
distribution and online events also matter, because not everyone can come to a conference
offline. And also you can stop listening, pause, and then return the next day if you need to
interrupt, right? So it's also much more convenient to consume information in such a way. Of course,
offline events give benefits from live communication, but online events are also good,
and recordings are also good. So this is item number two. What do you
think about it?
Yeah, I like it a lot. I think it's not just us; there have
been a lot of people providing a lot of good Postgres content this year. And
I think the Postgres TV thing is worth a special shout-out, because you do quite a few of the
sessions live. They're quite advanced topics, but if people want to be able to ask
questions of those experts as well, they can join live and ask them in the chat, and you'll
pass those on. So that's quite a unique opportunity, I think, that you often don't get
unless you can go in person to a conference. So that's really great.
A couple of others I wanted to give a shout-out to were Tobias Petry, who's doing more beginner-friendly tips on
Twitter and on his website, SQL for Devs. That's really gained a lot of momentum this year; he's
doing great. And Hussein Nasser, who has a really popular YouTube channel for back-end developers.
I'll link that up as well. I think he does fantastic work
and explains concepts really simply. And then I also wanted to give a shout-out to the Postgres
Weekly newsletter. I know it's been going quite a few years now,
but still, every week they provide a really good newsletter on all kinds of topics around Postgres.
Yep, agree. Okay, item number three: sharding.
So at the beginning of the year,
I had a strong impression that we definitely need sharding.
MySQL has Vitess.
A lot of big companies who use MySQL use Vitess.
Postgres lacks it.
And it was like, we have Citus.
I mean, the Postgres ecosystem has Citus,
but it was only partially available,
because the open source community edition
didn't provide, for example, online resharding.
That was before.
I mean, I'm discussing how it was at the beginning of the year.
But this year changed it.
First of all, Citus published everything as open source,
or rather Microsoft published it, because it's part of Microsoft now.
And this is good news, so it's fully available as open source. Great. And a couple of young projects
in this area started. First is SPQR, from guys who I know, Russian-speaking guys with very good experience.
They are trying to build a very simple yet powerful sharding system,
which exists on GitHub.
And then also some guy from San Francisco,
I think, I don't remember the name, sorry,
has this system called PgCat.
I'm subscribed to both projects,
and they are very active in terms of development, so
it's interesting to check them. SPQR is in Go and PgCat is in Rust, and they are adding something almost every day.
This is super great to see. And PgCat, actually, as I understand,
was originally created as some
proxy, like a replacement for PgBouncer
probably, a connection pooler.
Sharding was there originally
only as some very simple explicit sharding:
you say, I need to route
to that node, and that's it.
But as I understand, later
it became more complex and comprehensive,
and so on. So now it's already something interesting if you need sharding. But of course,
only a few projects need sharding, because many companies have shown that you can grow to a billion-dollar valuation, having SaaS or e-commerce, and still have one or a few big monolithic
Postgres databases, or be split into services, big services. But at some point any company, if it
needs to grow a lot, will need sharding; some service will need to be sharded. So sharding is
needed. It's needed only by a few users,
but it's still needed.
And we have some progress this year.
It's not perfect, of course.
There's no obvious default solution
answering all questions, no.
But there is very promising movement this year:
Citus plus these two projects.
Agreed?
Yeah, I think PgCat sounds particularly exciting, and it's really cool that
the idea is to put it in a pooler with, I think, load balancing and failover
support as well. That seems really smart to me: you could put it in place and benefit from
those features before you need sharding, and then make use of sharding later, if and when you need it. That sounds very
sensible to me.
Right. And this middleware should be very light, in my opinion, if you aim to
work in an OLTP context, because if you use, for example, Postgres there, it's also possible,
but it's quite heavy and it adds latency overhead. So, of course, it's worth testing each particular case,
but having some...
And MySQL Vitess also has a proxy layer.
It's called VTGate, and it's good if it's very light.
The problem is how to support all Postgres syntax in this case,
because when you start from zero, Postgres syntax is quite rich.
Yeah.
A couple more things I want to mention on this front, actually.
There was a really good Notion blog post.
I'm not quite sure if it was the beginning of this year
or a little bit earlier,
but they implemented, I believe people refer to it
as application side sharding.
Not a great acronym.
Yeah, yeah, yeah.
Well, I promote this approach, this acronym.
Application-level sharding is less offensive, right?
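To make the application-level sharding idea mentioned here a bit more concrete, the following is a minimal sketch, not how Notion, PgCat, or SPQR actually implement it. It assumes the application picks a sharding key (a tenant ID in this example), hashes it to one of many logical shards, and maps ranges of logical shards to a few physical Postgres nodes. The shard counts, host names, and function names are all hypothetical.

```python
import hashlib

# Hypothetical layout: many logical shards mapped onto a few physical Postgres nodes.
# Host names and counts are placeholders for illustration only.
LOGICAL_SHARDS = 64
PHYSICAL_NODES = [
    "postgresql://app@pg-node-0.example.internal/app",
    "postgresql://app@pg-node-1.example.internal/app",
    "postgresql://app@pg-node-2.example.internal/app",
    "postgresql://app@pg-node-3.example.internal/app",
]


def logical_shard(sharding_key: str) -> int:
    """Map a sharding key (for example a tenant ID) to a logical shard number.

    A stable hash is used so that every application instance routes the same
    key to the same shard (Python's built-in hash() is randomized per process).
    """
    digest = hashlib.sha256(sharding_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % LOGICAL_SHARDS


def node_for(sharding_key: str) -> str:
    """Return the connection string of the node that holds the key's logical shard."""
    shard = logical_shard(sharding_key)
    # Contiguous ranges of logical shards live on the same physical node.
    shards_per_node = LOGICAL_SHARDS // len(PHYSICAL_NODES)
    return PHYSICAL_NODES[shard // shards_per_node]


if __name__ == "__main__":
    for tenant in ("tenant-1", "tenant-2", "tenant-3"):
        print(tenant, "->", logical_shard(tenant), node_for(tenant))
```

Keeping more logical shards than physical nodes is one common design choice: moving a range of logical shards to a new node later does not change how keys are hashed. A middleware such as a sharding-aware pooler would make the same kind of routing decision outside the application.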
But yeah, the Vitess stuff is super interesting.
And I think this, for me, is actually a real threat to Postgres,
not necessarily on a technical level, but on a marketing level.
I'm getting kind of flashbacks to the MongoDB days where people would be offered kind of web scale. You know, you're only a startup, but you'd never have to worry about scaling in the future.
And suddenly we've actually got…
And it doesn't matter.
It doesn't matter that one node behaves 10 times worse
than one node in Postgres.
It doesn't matter, right?
Yeah.
But people don't take that into account
when they're starting up.
And actually, they don't notice
because the performance when they have very few users is fine.
And it is fine.
The solution is fine.
But I'm seeing startups that last year I would have expected to be on Postgres picking Vitess and MySQL, because
PlanetScale have really nailed their marketing, and I think they have also got some really good
developer-friendly features. But mostly I think it's that same marketing angle
as Mongo had. And with Mongo, I think Postgres had a really good answer too, with JSONB support
a little while later, and that, I think, was an excellent move. I'm really looking forward to seeing
what we can do in the Postgres ecosystem this time, because I think Postgres is built
on better fundamentals than MySQL and has a lot of features that I love. So I'm really
interested to see how we address that.
Yeah. And when you talk about new startups
which choose Vitess and MySQL, is it very, very new startups, new this year or so?
Yes.
Well, interesting. Interesting. Well, okay. I don't see such startups. I see constant
discussions like, oh, it would be good to have Vitess. And Vitess, I think late last year,
like a year ago, announced that they don't have plans to work on Postgres support themselves
but are open to community contributions.
But in my head, I don't see it.
Maybe I have some filters.
I don't see companies who these days would choose MySQL
without strong pressure from I don't know who.
It's strange.
But I can imagine it.
If, for example, some CTO wants to be protected in terms of growth risks and how to handle growth,
and for this CTO it doesn't matter that you can handle tens of thousands of transactions per second
using a single simple cluster on modern hardware, actually already hundreds of thousands of transactions.
And when I say transactions,
I don't mean read-only transactions. Of course, I mean social-media-like traffic, where 90%
are selects, right, or 80%, but the others are writes. And you can handle it perfectly. You can scale up;
you can handle 100,000 TPS or a few hundred thousand TPS, especially if you
take, for example, modern EPYC processors from AMD, EPYC Milan, EPYC Rome
as well. Rome is the previous generation, Milan is the newer one: a lot of CPU power, a lot of memory, and you run, like, 10 nodes in a cluster, a lot of replicas. It can be
very, very, very powerful. So you can postpone the point when you need to be sharded. And also
microservices. People sometimes choose microservices and postpone the need to shard their databases, right?
Because databases become smaller.
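The arithmetic behind "one primary plus a lot of replicas goes a long way" can be sketched as follows. This is a rough back-of-the-envelope model under stated assumptions, not a benchmark: all writes go to the primary, reads are spread evenly over the primary and its replicas, and the cost of replicas replaying the primary's writes is ignored. The function name and the example numbers are illustrative only.

```python
def per_node_load(total_tps: float, read_fraction: float, replicas: int) -> dict:
    """Estimate transactions per second on the primary and on each replica.

    Assumptions (simplifications for illustration):
    - every write transaction is executed on the primary;
    - read transactions are balanced evenly across primary + replicas;
    - replication overhead on replicas (replaying writes) is not modeled.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    if replicas < 0:
        raise ValueError("replicas must be non-negative")

    write_tps = total_tps * (1.0 - read_fraction)
    read_tps = total_tps * read_fraction
    reads_per_node = read_tps / (replicas + 1)  # primary also serves reads here

    return {
        "primary_tps": write_tps + reads_per_node,
        "replica_tps": reads_per_node,
    }


if __name__ == "__main__":
    # 100,000 TPS overall, 90% reads, one primary plus nine replicas (10 nodes).
    print(per_node_load(100_000, 0.9, 9))
```

With these example inputs the primary ends up with roughly 19,000 TPS and each replica with roughly 9,000 TPS. The write share is the part that replicas cannot absorb, which is why write-heavy growth is what eventually forces sharding or splitting into services.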
So, yeah, I can imagine the reasons behind the choice of Vitess.
But is it Vitess or PlanetScale, by the way, speaking of managed services?
The startups I'm seeing are choosing PlanetScale.
I don't even think they would know.
They wouldn't necessarily know what Vitess was, I think.
It's different because, as I see,
PlanetScale now has two value propositions combined.
One is web scale, planet scale, sharding,
and another is: we will handle your changes without problems.
They call it database branching, but it is not. But they give you the ability to change the schema and then have something
similar to the pull request or merge request concept: a deploy request. So your colleagues review the change,
and then they run the change online without downtime, and you don't need to spend effort on it. This
may also be a big motivator, because in Postgres it's a big headache to avoid
locking issues and so on.
This is what I meant by their developer-friendly features, and I think
that's a similar value proposition to what Mongo were offering back in the day as well.
Yeah, interesting.
I think many Postgres-related companies are also
already working on this, because, of course, yeah.
So in this case, let's proceed to next item, number four,
and this is database branching is coming, right?
So database branching is coming. Originally, I think last year, it was PlanetScale who started to use this term, and in my opinion in the wrong direction, because they talked about branching schema only. So their branching, database branching ... called data branching, which is something strange to me, but it's another story.
But several other companies in the Postgres ecosystem, because we are mostly interested in the Postgres ecosystem, also started to talk about database branching.
And for me, of course, this is very close to home, right?
Because my company develops the Database Lab Engine, which provides thin cloning.
And actually, yesterday we released the first alpha that supports database branching.
And branching is not cloning.
There are differences.
That's another story, but I'm glad that many companies already think in this direction.
The original need is to be able to work, in non-production, with bigger-size databases.
Same as with Git.
So you have an independent database.
You can have multiple independent databases, and you can experiment, develop, test, and so on.
And cloning, which we have had for a couple of years already, supported that.
But branching also allows you to make some progress and commit it, that is, to take a snapshot and then share it with your colleagues.
They can branch from there.
So it's like nested cloning already.
Nested cloning with commits is branching.
This is what happens.
If you look at Git, it's very close to it.
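The "nested cloning with commits" description can be illustrated with a toy in-memory model. This is only a conceptual sketch and not how Database Lab Engine, Neon, or any other product implements branching: real systems share unchanged data blocks on disk (copy-on-write) instead of copying dictionaries. The class names `Snapshot`, `Clone`, and `Repo` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Snapshot:
    """An immutable committed state; `parent` links form the history, as in Git."""
    id: int
    data: Dict[str, str]
    parent: Optional["Snapshot"] = None


class Clone:
    """A writable working copy started from a snapshot (a thin clone)."""

    def __init__(self, base: Snapshot) -> None:
        self.base = base
        self.changes: Dict[str, str] = {}  # only the delta against the base is stored

    def write(self, key: str, value: str) -> None:
        self.changes[key] = value

    def read(self, key: str) -> Optional[str]:
        # Local changes win; everything else is read from the shared base snapshot.
        return self.changes.get(key, self.base.data.get(key))


class Repo:
    """Holds the initial snapshot and turns clones into new snapshots (commits)."""

    def __init__(self, initial: Dict[str, str]) -> None:
        self.root = Snapshot(0, dict(initial))
        self._next_id = 1

    def clone(self, snapshot: Snapshot) -> Clone:
        return Clone(snapshot)

    def commit(self, clone: Clone) -> Snapshot:
        # Toy version: merge base data and the delta into a new immutable snapshot.
        merged = {**clone.base.data, **clone.changes}
        snapshot = Snapshot(self._next_id, merged, parent=clone.base)
        self._next_id += 1
        return snapshot


if __name__ == "__main__":
    repo = Repo({"schema_version": "1"})
    work = repo.clone(repo.root)            # independent working copy
    work.write("schema_version", "2")       # experiment without touching the root
    shared = repo.commit(work)              # snapshot that colleagues can branch from
    colleague = repo.clone(shared)          # nested clone: a branch of a branch
    print(colleague.read("schema_version"), repo.clone(repo.root).read("schema_version"))
```

The point of the model is the shape of the workflow: clones are cheap and isolated, a commit freezes progress into a snapshot, and new clones can start from any snapshot rather than only from the original database.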
And I would like to mention Neon, which released branching a few weeks ago.
And it's already publicly available.
It works very well.
I think they don't have commits yet, but I'm sure they are already thinking about it or working on it.
It will be natural. But overall, this year, I think
a big movement in this direction started. So I predict that in the next few years we will see a lot of
progress in this area. So a lot of roadblocks in development and testing, when we don't have a big database to play with when we need to develop or test... it will be
available to most teams, I think, in the future, thanks to several companies who work in this area,
including mine, of course.
So I've seen Supabase also mention branching in, I think, maybe one of their funding announcements.
In their roadmap it exists, yes.
And other Postgres-related things in this area:
I would have expected that if Heroku had kept investing in this, or if they'd kept innovating along the lines they were going,
they would have done this.
They have a very good concept, so-called preview apps.
It's basically environments which are deployed by request,
an explicit request or inside a CI/CD pipeline.
So imagine that for each branch, when you do a git push,
the CI/CD pipeline is running,
and Heroku can deploy a preview app on a specific URL,
and it has the adjusted code from that branch.
But there is always the problem of what to do about databases,
and the naive approach is: let's have one database for all.
But there will be conflicts, of course.
And if you want to delete something
or, especially, to change schema,
there will be a lot of conflicts.
And database branching and thin cloning
is exactly what can bring this idea to a complete state.
So you can have preview apps, or environments as a service, not only in terms of
code, but also in terms of data. And that's exciting. It's super cool.
Yeah. And then it seems like the natural successor to Heroku Postgres is going to be Crunchy Bridge.
They've got a lot of the same team involved. So I'd be interested to see if they're planning to do anything on this next year.
Yeah, it's interesting. I haven't seen anything
in this direction from them yet, but maybe there's something. I would like
to learn about it. But anyway, I see
that several companies are obviously looking in this direction.
I hope this problem will finally be solved in the next five years,
because it's something that should be solved.
And it will unblock cases where, for example,
some product manager wants to try something with data, right?
Or you develop multiple things simultaneously,
or you have multiple developers.
Even if you want to optimize SQL, to check performance,
it's also the way to go, because without
data, performance is different, so you cannot
check your performance
if you don't have enough data.
And this unlocks
many, many things like that, and eventually
it shifts a lot of stuff to the left
in terms of the DevOps infinity loop.
So it's definitely shift-left testing,
shift-left activities.
I'm excited about this.
Yeah, if anybody's new to our podcast,
we do have an episode specifically on database branching.
So we can link to that as well.
We have two episodes, I think, right?
So from time to time we touch this topic.
This is my favorite topic.
So I think we will continue doing this, I hope.
Because there are interesting things
about performance testing, for example.
Okay.
And the last item from my top five list is actually this: Postgres is everywhere.
We mentioned that in the beginning.
All cloud vendors provide Postgres.
And even Oracle started to provide a managed Postgres service.
I knew about SAP, for example;
that was a few years ago.
So everyone, literally everyone.
And, for example, also interesting news:
Google released a new cloud database called AlloyDB, which has something from Postgres. Of course,
it can be considered part of the Postgres family of database systems. And it's also super
interesting. It has very interesting concepts for handling HTAP, hybrid transactional/analytical processing.
So it's analytical and operational processing combined.
They have a combination of row store and column store in memory.
So it's interesting.
I haven't tried it yet.
It's still in my to-do list, but it sounds super interesting.
And overall, as I've said already, we have dozens of options to choose from.
Okay, you need Postgres, but which one?
So many, right?
So choosing the right option is not simple.
It's not a trivial task.
It's not like before, when you download binaries, or you download the source code,
compile it, make, make install, and you're done, right?
No.
These days we have so many options.
It's not only like managed versus self-managed.
If you say managed, so many options.
And self-managed as well.
There are many Kubernetes operators already,
maybe five quite active projects,
all of which are also interesting.
More than five, actually.
More than five.
There are others.
So if you want to self-manage, the question will be: old-school self-managed or Kubernetes, right?
Because there are many operators, and they bring things like backups, replication, and
monitoring out of the box. Eventually they will compete with managed services.
And managed services should be afraid of these operators, of course.
So this is it with my list.
Do you have something to add?
No, I think that was great.
A couple of smaller things that probably don't warrant a big discussion,
but I thought it was worth an extra shout-out to the team
behind the Postgres 15 features, all of the people that contributed to that.
I know it was more of a kind of lots and lots of smaller features this year,
but I really liked some of them.
Who mentions this?
We all know every year we have a new Postgres version.
It's not news already, right?
I'm joking, I'm joking.
There's MERGE there, MERGE and other things, right?
Yeah, but I don't want to take it for granted either, right?
Like it is every year that we do have a major version,
but that's not necessarily guaranteed to happen forever.
It's really cool that we do.
Also, I think there's probably not so much from this past year,
but next year I think we'll see some really interesting serverless
(I don't really like that word,
but that seems to be the standard word)
use cases.
If people can use Neon to send Postgres queries
in an almost serverless nature,
I don't understand how it works,
but that could be really cool, I think.
And then the other interesting project
I'm keeping an eye on is OrioleDB, which looks like it...
OrioleDB also has branching in the roadmap.
On GitHub there is a Markdown file describing the
concepts, and they already have copy-on-write checkpoints. So if they implement
database branching, it will be
Postgres-native database branching.
So you don't need to think about
separation of storage from compute,
you don't need ZFS like we do;
you have it inside Postgres.
I'm super excited about this as well.
Yeah, very exciting.
But yeah, I think
you've done a great job of wrapping it up. I think
it's been a really good year for Postgres, and I'm looking forward to another one next year.
I agree, I agree. Thank you. Well, everyone, have a good New Year, much better than the last one, this one,
and see you soon.
Thank you so much. Absolutely. Happy New Year. Bye now.