Postgres FM - What’s Missing in Postgres?
Episode Date: April 3, 2026

Nik and Michael are joined by Bruce Momjian to discuss his new talk "What's Missing in Postgres?"

Here are some links to things they mentioned:

- Bruce Momjian, EDB https://www.enterprisedb....com
- What's Missing in Postgres? (Bruce's slides) https://momjian.us/main/writings/pgsql/missing.pdf
- The Wonderful World of WAL (Bruce's slides) https://momjian.us/main/writings/pgsql/wal.pdf
- Getting started with benchmarking (our episode with Melanie Plageman) https://postgres.fm/episodes/getting-started-with-benchmarking
- Myths and Truths about Synchronous Replication in Postgres (talk by Alexander Kukushkin) https://www.youtube.com/watch?v=PFn9qRGzTMc
- The Future of Postgres Sharding (Bruce's slides) https://momjian.us/main/writings/pgsql/sharding.pdf
- MultiXact member space exhaustion https://postgres.fm/episodes/multixact-member-space-exhaustion

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:

- Michael Christofides, founder of pgMustard
- Nikolay Samokhvalov, founder of Postgres.ai

With credit to Jessie Draws for the elephant artwork.
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I am Michael, founder of pgMustard.
I'm joined as always by Nik, founder of Postgres.ai.
Hey, Nick.
Hi, Michael. How are you?
Good.
And we're delighted to be joined by Bruce Momjian, who is VP of Postgres Evangelism at EDB
and a long-serving Postgres core team member, who recently gave a new talk about the missing features in Postgres.
So welcome, Bruce.
It's an honor to have you here.
Good to be with you.
Wonderful.
Perhaps we could get started with why this topic, why is it important or why is it important to you?
So I'd love to say that I thought of this topic and I'm a genius, but no. Actually, Melanie Plageman came to me in Riga, at PGConf.EU, in the fall.
And she said, we'd really love you to do a new talk at the PGConf.dev conference in Vancouver.
I said, okay.
and she said, how about things that aren't in Postgres?
I think, wow, that's a great idea.
So then I guess I was showing these slides to Robert Haas once I finished them.
And he's like, wow, Melanie got a mention.
I'm like, yeah, I said half the job is figuring out what topic to do for a talk.
So he said, I'll have to think about that now.
I can get a mention just by thinking of a topic.
I don't even have to do the talk.
So, yeah, I thought it was a cool idea.
I never would have thought of it.
There are a couple of my talks where you'll see in the introduction where the title concept came from.
Because again, I have about 65 talks, I think, now. Actually, it's up to 67.
It's hard to think of topics.
I've been really lucky because I did two new ones in the fall of last year.
One is this talk.
Another is The Wonderful World of WAL, which is about the write-ahead log.
And then I have a third one, which is more of an AI talk.
I already have two AI talks.
This is another one called building an MCP server using Postgres.
So I got three new talks queued up for 2026.
So it's unusual for me.
I write my talks and basically give them around for maybe a year, and then come up with a new one,
because I've been doing it for 30 years.
So on average, I'm making two talks a year.
I have 67. But this year I got three. So I was like, wow, this is great. And Melanie, I have to give her credit for the topic on this one.
Oh, that's awesome to hear. Friend of the show, Melanie; she's been a guest before. It's good to hear.
Yeah, I have a question. If my memory doesn't play games with me, 20 years ago, I remember you maintained the to-do list.
That's right.
Isn't this a to-do list?
Is this a to-do list? I would say no. The to-do list is basically driven by what people have asked for. It could be a small thing, it could be a big thing. This is more of, I would say, the big missing features, and a lot of them are ones that we have no intention of doing. So it's sort of like: let's step back, let's look at what big things we're missing, let's look at what big things we may be working on, and then look at some of the big things that we'll probably never do. So, strategic; much more strategic. You're right, if I drilled down, the to-do list would be here forever. But again, I think it is valuable to step back and look at what is missing and what isn't. And I even learned something just by writing the talk, because stepping back far enough and looking at the missing things... So I think I have, what is it, 12 items, and what's interesting is 11 of the 12 items relate to performance.
that surprised me.
If you had to guess beforehand, what would you have thought more of them would be?
I thought it would be basically missing operational features or missing security features. Well, TDE is one of those; that's one of the 12, right? Missing integration, infrastructure, some of these big missing things.
But when I actually wrote them down, I was kind of like, wow. So 11 of the 12 are basically related to scaling, either on a single host, seven of them, or multiple hosts, which is another area we need to work on. So you've really got 11 of the 12 on performance: seven of them are for single host and four are for multi-host. And again, I just never suspected that it would be a performance need and not an operational need. If I had done this 10 years ago, I know there would have been a whole bunch of things that...
I can bring some ideas; maybe I should do it at the end. Ideas which are not performance related and are definitely big missing pieces, and they're not included in your list. So maybe...
I have... What's interesting, I've given this twice, both in January, one in Prague, one in Brussels.
And nobody really came up, in those groups, nobody really came up with anything that was wrong.
The only feeling I got, I think, was that some of the people were sort of like, well, Oracle RAC has some value in these small use cases.
And I'm like, yeah, so there was a little bit of, eh, you don't have this Oracle thing or multimaster.
Like, well, maybe there was some value.
So I get it.
Yeah, it's there.
But I didn't get anything that sort of shocked me.
So I'll be interested to see if you have anything.
I'm sort of like, oh.
Yeah, I want to hear this.
Why don't we do it straight away?
Sure.
Right now, let's discuss this.
Okay, I'm looking at your list and I don't see, for example,
synchronous replication improvements because right now,
synchronous replication is not all right.
Maybe you watched Kukushkin's talk last summer talking about caveats.
And this is a big topic, for example,
when there are new people coming to the Postgres ecosystem from MySQL, for example, and one of the big groups is Multigres, from the creators of Vitess.
Yeah.
Right.
So some words from them, some words from Kukushkin, and I see synchronous replication is not all right because of how it's built on the WAL. If the primary has already written data locally and then it suddenly restarts, the primary thinks the data is written, but the standby never received it. This is one thing.
Another thing is that if you have logical replication involved, it is a nightmare to handle: basically, during failover, you lose the logical replicas.
And looking at other systems, including MySQL, there is a big potential to improve synchronous replication.
Basically, some people won't say names, but just watch Kukushkin's talk.
It's broken right now.
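For context, the synchronous replication being discussed is configured on the primary along these lines; this is a minimal sketch, and the standby names are placeholders:

```sql
-- postgresql.conf on the primary; standby names are placeholders
synchronous_standby_names = 'ANY 1 (standby1, standby2)'  -- wait for a quorum of one
synchronous_commit = on  -- or remote_apply, so commits are visible on standbys first
```

One of the caveats Kukushkin's talk covers: a commit that is cancelled while waiting for standby acknowledgment is still durable in the primary's local WAL, so the primary can expose data the standbys never confirmed.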
So I'm working on the Postgres 19 release notes, and I do see improvements in both of those areas in the release notes that I'm working on. They're not done yet; they won't be done for about two weeks. But it would be interesting if you could come back and take a look and see what was done. So what happens in that particular example, both examples, is you're now out of the 30,000-foot view and you're getting down into details. And there's totally stuff there.
Sorry, I'm not getting down. When people with enterprise and big-cluster experience come, they say: where is synchronous replication? We cannot build proper clusters. And what I'm telling them is that most people still use asynchronous replication, and they cannot believe it. They cannot believe it. Really? Oh. Yes. How can that be? Other systems have proper synchronous replication with a proper consensus algorithm. This is another topic: there have been many attempts to bring a consensus algorithm or something like it inside Postgres, right? It's similar to the pooler question; there are always many ways. And so this topic is not about performance. It's about HA.
No, what I'm saying is: I'm trying to look at this at a very high level, to look at big holes. What you're actually looking at is: we have the feature, but it doesn't work right, or it needs whatever. I get that.
But again, if I go down that level, I'll never get done, right?
I'm trying to, from the top, look for these big holes, and actually, to me, not only look at the big holes but look at why they are holes, right? Because a lot of people, who aren't me or you, aren't sitting watching everything that's happening 20 hours a day; they're like, what's the problem, what's the holdup? And then the other thing for me is, by having that 10,000-foot view, I see a pattern clearer than I could see when I was down in the...
I'm not disputing that we have... that these things are not feature complete, I would say. And it's not atypical for us, particularly with partitioning and logical replication and failover, to take years to get to a feature-complete point. And I think we aren't at a feature-complete point in a whole bunch of areas. But I tried to stay away from: okay, we're not feature complete here, we're not feature complete there. I'm like, what things are we not even moving in? And there are a whole bunch of them where I would say we're just not engaged at all.
Yeah, I get that. That's a good description. And when you're at this highest level, you see a lot of performance-related stuff, right?
That's right. At that level, I see a lot of stuff where we're not even in the ballpark, where we're just scratching the surface. I'm not saying all the things we've done so far have been implemented perfectly or are feature complete; clearly not. But at least I feel we're moving there. What I see up top is a whole class of things where we're not even engaging, and we may not need to
engage. It's more of a thought experiment for people to come in and say, okay, what things are they
not even in the ballpark on? And it's helpful to know because if you look at the list, it's not a
terrible list. And if you think this is the biggest stuff we don't have, we must have a lot of stuff,
right? Because I can't think of any other big things except these 12. There might be one or two that
somebody will think of at some point. But I think for me, it's a great exercise in seeing where we are
at that big level. Where do we think we need to go five, ten years out? That's usually the area. That's
the distance I'm looking in, not necessarily what comes in 19 or 20. And I think some of the stuff you're
looking for is coming in those releases. But what stuff do we really need to, at least
to evaluate and say, is this a direction we need to go in or not?
Looking through the list, it struck me that there's a bunch of things that you can currently
do, but you need extensions or a separate tool.
You can use Postgres and get those features, but you need an extension or another service, like PgBouncer or something.
But there's also a bunch that you literally can't do with Postgres, so you would stick
with maybe a proprietary database, or it's just a different solution completely.
maybe a fork or something like that.
So it felt like there are kind of two categories there: there are things you can do with Postgres but you need something outside of core, and there are things that you can't do at all. So is there one of those that's more interesting than the other in terms of bringing into core?
That's funny you mention that. I really struggled with that question.
And the audience struggled with that question.
I remember people saying,
hey, what about this and what about that?
And these slides are regularly updated; every time I give a talk, I fix something. Actually, I'm just looking at one of the capitalizations I did wrong. Anyway, I'll fix that after this talk. But my point is that you're right, there are like concentric circles. If you're willing to go to a proprietary fork of Postgres, it's a big circle, right? And then if you're willing to use extensions or external tools; let's say external tools, then proprietary extensions, then extensions, and then core. In fact, maybe I should have a slide for this to illustrate what that span is. And in a lot of cases, we will say we have it. It's just great you asked that question; nobody asked it quite the same way you did. So some of the complaints we have are not that it's not available as either an extension or an external tool or a proprietary fork of Postgres. The complaint we have is that it's not in core Postgres.
Yeah.
Okay.
Or that there's an extension, but the extension, because it's an extension, doesn't work exactly the same way it could possibly work if it was in core.
Or if it's a third-party tool, the third-party tool is not as effective or easy to use as it would be if it was integrated, which is the connection pooling example.
Or there's a choice of them.
Like, why choice of them?
Right, there's a choice of them.
Although the choice of them I kind of like; particularly in the PgBouncer realm, there is actually a reason for that.
But my point is that some of the missing stuff, even something like columnar storage, right? You could say we have extensions to do that. Yeah, I guess, but I'm not really happy with that, and I'm not necessarily wedded to Citus, and it has limitations, and because it's an extension, it might not be available.
So you get all these sort of complaints.
And it brings you to a bigger philosophical question of does everything need to be in core, right?
It's probably not.
I think pgvector is a great example of something external that can be developed on its own release cycle, and it really doesn't hamper its ability to be used effectively with core.
But when you start to talk columnar storage, when you start to talk connection pooling, you have problems with authentication going through the connection pooler. We've tried to improve that, but it's still kind of yucky.
Logical replication of DDL, you know, is another one that's available as an external proprietary product, but not in core. TDE is a classic that's available in a bunch of external stuff but is not available in core.
So sometimes people complain something is missing. What I tried to focus on in this talk is basically: is it available either in Postgres or as an extension that has no downsides?
That's my litmus test.
And anything that has limitations and is an extension, limitations as a third-party tool
or obviously as a proprietary fork, those are going to be effectively not available in the
community.
And we have to decide: is that a good place to be or a bad place to be? Maybe that's fine.
I really like the concentric circles thought, and just on the last two: is there also a difference in your head between contrib extensions and core? Like, for example, pg_stat_statements; a lot of the pg_stat views are on by default when people install Postgres, but pg_stat_statements isn't. I don't know if that's too semantic a discussion, but that feels interesting to me: is that missing or not really missing?
You're right. That's another concentric circle. Yeah. So, see, like pgvector; maybe that would be nice to be in contrib instead of being external. But then if we pulled it into contrib, its release cycle becomes tied to the major releases. Maybe that's not a good idea. I don't know.
And when you start to talk about the cloud vendors, they will support some contrib extensions and not others. Yeah. And you get people pulling me aside: hey, why don't you move this into core, get it out of contrib? And I'm like, what's the matter with contrib? And they're like, well, my cloud vendor doesn't support it. I'm like, okay, so you're complaining to me that your cloud vendor doesn't support something we're already shipping, and you think moving it is going to be fruitful? And they're like, well, why isn't it, you know... But there's a variety of reasons that stuff is in contrib. It may be an edge use case. It may be like pg_stat_statements, which creates its own tables and has its own overhead that we're not sure everyone would want. So that kind of makes sense out there.
Some of them, like an oddball one like cube, or an edge case like pg_trgm. I don't know. It's hard to say.
It's hard to say.
Is it better as an extension? Is it better in core?
You can make a case either way, I would say.
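For concreteness, this contrib status is why pg_stat_statements takes extra steps to enable, unlike the built-in pg_stat views; a typical setup looks like the following (the example query and limit are illustrative):

```sql
-- postgresql.conf: load the contrib module (requires a restart)
shared_preload_libraries = 'pg_stat_statements'

-- then, in each database where you want the view:
CREATE EXTENSION pg_stat_statements;

-- for example, the top queries by total execution time:
SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```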
Nice.
From the list, are there any that are particularly your favorites,
or ones that you would love to see progress on?
Or do you not like to pick favorites?
Yeah, the two that I've really felt called to champion are cluster file encryption, which is called TDE in the industry, and sharding. Those are the two that have the biggest impact in terms of adoption and expanding Postgres workloads, for me. The other ones are interesting, but I don't see them as expanding the Postgres adoption universe; that's, I guess, the phrase I would use. Yeah, the other ones are good, but they're more operational: oh, it'd be nice to have this; oh, I would be more optimized if I had this other thing. But for me, the sharding and the TDE really take Postgres to a new level. And I would say in both of those I've pretty much either failed or executed poorly, because I've been championing those for at least five years now, and I don't feel I've made a lot of progress. I guess I've been waiting for kind of a groundswell of support, because both of them are very hard to implement as an individual. I got close with TDE, but got stuck on something. And sharding, I think, really requires a team, and I just don't think I've seen the kind of desire behind that.
I think part of the reason, for sharding, is that machines keep getting so big that effectively you're better off just buying a bigger machine or renting a bigger machine than going down the sharding route, because the sharding route really is very workload-specific, and there's an increasing dislike of workload-specific things in an enterprise.
There's much more of a push toward just generic compute, generic solutions, generic everything, and away from the sort of: oh, I got this storage appliance with special chips on the drive which filter stuff; oh, I got this special hardware that does something. The enterprise focus is a lot less on hardware now, a lot less on infrastructure, a lot more on solutions.
Maybe that might be good; I'm not saying good or bad. But I think that as people have needed sharding, systems have increased in size pretty much in lockstep with what they needed. So the number of people who actually need it is limited, and the number of people who are willing to put in the effort to architect it is also limited.
Yeah, I agree with you that the hardware available right now is so huge. And recently I saw a self-managed cluster which had default vacuum settings. Usually RDS and the others have them tuned, but this was the default vacuum settings, three workers only, nothing tuned, and it was 20 terabytes and somehow surviving. It was insane, absolutely. So how was it possible?
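For reference, the untuned defaults mentioned here look like this; the tuned values shown are illustrative starting points for large clusters, not recommendations for any specific workload:

```sql
-- postgresql.conf: autovacuum defaults vs. common tuning for large clusters
autovacuum_max_workers = 10            -- default is 3
autovacuum_vacuum_cost_limit = 2000    -- defaults to vacuum_cost_limit (200)
autovacuum_vacuum_scale_factor = 0.05  -- default 0.2 is coarse for big tables
```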
But at the same time, RDS and Cloud SQL and the others have hard limits. I know they work on it. And Aurora has more than 64 terabytes. And these days, collecting that amount of data is not rare anymore; even one person surrounded by AIs can collect a lot of data.
This is one thing. Another thing is that there are hard problems, for example, the lightweight lock manager problem, which was solved in Postgres 18. There are other problems as well. When you grow to some heights, you encounter them, and some people would not like to spend time firefighting, right? So sharding is really needed. But I guess, what's your take? For me, it's really hard to find consensus on how to achieve it.
And also the sharding topic usually triggers adjacent topics, like an internal connection pooler. If you talk about routing and so on, you also think about auto-failover, you think about connection pooling; these topics usually come together in my head. But there are other ways to do sharding, right? There are many ways. What's your take: will it ever be implemented in core or not?
I've always been of the opinion, and I do have a sharding talk on my website, that sharding has to be developed organically within Postgres, and it really has to be built on, effectively, partitioning, foreign data wrappers, and parallelism. I don't think we're going to have the appetite to create a whole new architecture for sharding. And every year that goes by, it becomes clearer that, yeah, we just can't.
So I think we can get closer than we are now, with little impact, for read-only sharding. If we want to do read-write sharding, then we have to have a global lock manager, a global snapshot manager; it becomes much more complicated.
Yeah, like I keep going there, and people who think they need it effectively don't end up needing it, or end up getting bigger hardware, or re-architect what they're doing and then they don't need it anymore. So I think that's one of the reasons it hasn't moved forward.
But I do think, for read-only, sort of a data-warehouse kind of sharding, within two years we could have a pretty good solution in the industry. But I just haven't seen a lot of work on it; there was one guy at Fujitsu working on it, but that was about it. And frankly, it's been pretty dormant for the past two years, and I haven't had time to work on it much either.
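The "partitioning plus foreign data wrappers" approach described here can be sketched with postgres_fdw; all server, host, and table names below are invented for illustration:

```sql
CREATE EXTENSION postgres_fdw;

-- a second node holding one shard (host and credentials are placeholders)
CREATE SERVER shard2 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard2.example.com', dbname 'app');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard2
    OPTIONS (user 'app', password 'secret');

CREATE TABLE measurements (id bigint, region text, value numeric)
    PARTITION BY LIST (region);

-- local partition
CREATE TABLE measurements_eu PARTITION OF measurements FOR VALUES IN ('eu');

-- remote partition: queries touching 'us' rows are pushed to the other node
CREATE FOREIGN TABLE measurements_us PARTITION OF measurements
    FOR VALUES IN ('us')
    SERVER shard2 OPTIONS (table_name 'measurements_us');
```

The read-write pieces, a global lock manager and a global snapshot manager, are exactly what this setup does not provide.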
Yeah.
If you stay on a single-primary cluster, there is a big limitation, well known to those who have achieved some heights, when you have like 200, 300 megabytes per second of WAL. We talked about this a lot with folks who develop sharding systems. You hit the limits of a single-threaded process: not the WAL receiver, basically, but the replay process, right? It shows up in top as the startup process, which replays the changes from the primary, and it's a single-threaded process. And I know somebody from Japan has been working on this since 2013, I think, on and off; I saw some conference talks and so on. And this problem is not included in your list. It's not a small problem if you achieve heights. What do you think about that problem?
Yeah, as I remember, that was the one where they're trying to create a dependency graph from the WAL and thereby identify which parts of the WAL are parallelizable, and send those off to workers. CPUs do that now with CPU instructions; they figure out which parts can be run on different cores or in parallel.
Yeah, you're right. We don't have threading, so, yeah.
I don't know; threading's on the list that I have.
In the list, yeah.
I'm not sure threading is a requirement for that, because we do have parallelism without threading now. It seems to work fine, because you load things into a shared memory queue and the processes just pull out of there. I don't know.
I think you're right. I think getting a parallelizable replay of logical replication would be... Physical doesn't seem to matter too much, because it's so fast that...
Hold on one second. For logical, I think there is already work, maybe in Postgres 18 or something; I remember something for logical, to parallelize it.
Parallelize it, okay.
I think so.
Technically, you can parallelize it with multiple publication-subscription pairs, multiple slots, but it will have problems with foreign keys: it will be only eventually consistent in terms of referential integrity.
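That workaround, one apply worker per publication-subscription pair, looks roughly like this; the table names and connection strings are invented for illustration:

```sql
-- on the publisher: split the tables across publications
CREATE PUBLICATION pub_orders FOR TABLE orders, order_items;
CREATE PUBLICATION pub_events FOR TABLE events;

-- on the subscriber: each subscription gets its own slot and apply worker
CREATE SUBSCRIPTION sub_orders
    CONNECTION 'host=primary.example.com dbname=app' PUBLICATION pub_orders;
CREATE SUBSCRIPTION sub_events
    CONNECTION 'host=primary.example.com dbname=app' PUBLICATION pub_events;

-- caveat: the two streams apply independently, so foreign keys that cross
-- the groups are only eventually consistent on the subscriber
```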
I'm talking purely about physical replication, which is...
Oh, physical replication, okay.
Exactly.
And this is a limit some companies hit, and it's super painful, because there is no escape from it. You need to do either sharding or a vertical split.
Yeah.
You need a lot of writes, basically, to hit it.
I think the reason we don't hear about it a lot at the community level, and this might be different from the level that you're working at, is that we're definitely a general-purpose database. Not that we can't go to the heights, and we keep pushing the ceiling up higher and higher, but there is a limit to how much complexity, and potentially unreliability, we're willing to add to Postgres to get up to the super heights. And it's possible that's why; I know you specialize in that area. It's possible, but we don't hear a lot about it. The reason it didn't get on my list is that it isn't something I've heard a lot about; I don't see many email threads really addressing it.
But I know, for example, Instacart hit it years ago, during COVID actually, and there are others who hit it. Sometimes people hit it and don't notice, because, okay, some replicas are lagging a little bit and so on. Yeah, but I understand it's not a super common problem.
We had, EDB had, a customer who hit it on specialized storage hardware, and we identified that it was the storage hardware that was causing it. So that was kind of a "get new storage hardware and your problem will go away." That was, I think, the answer to that one. That's the only case I've heard of replay complaints and lagging.
Sounds good. But what's your take on the threading topic? Because there was a big impulse a couple of years ago, from Heikki originally, I think.
Yeah, I have a blog entry about it from 2018, which references, I think it might even reference Heikki's, let me see if it references Heikki's email.
No, it doesn't.
I think we're still on the fence on that one. We keep chewing away at some of the small things; getting rid of global variables, I think, was one. And I think the other thing that's going to bail us out here is that it appears as though we now have compiler support for this. So instead of having to re-architect all our code, it looks like, if we can get rid of the global variables, which is pretty simple, the compiler will make a lot of stuff thread-local for us, without us having to really re-architect a lot of the code. So I'm hopeful in that area. But my understanding is that the threading prototypes don't seem to gain a whole lot, at least from what we found; the main gain is reduced task-switching time.
But again, for example, if you're doing a GUI, you really need threading, because you're updating a whole bunch of stuff on the screen at once, and you want all these background jobs running in your address space and so forth, updating a whole bunch of very light processes and light statuses. But when you start to talk about a database, there really isn't a whole lot of light work out there.
Yeah, I guess we could do index lookups with threads.
That would be cool.
Visibility bits, maybe?
Visibility, yeah. But the problem is that a lot of the stuff is really heavy. For example, when you're trying to parallelize a query, those tasks are not light, right?
So I think that's what has slowed it down.
Another problem, obviously, is resiliency. Right now, if a session crashes or runs out of memory or whatever, we keep running just fine, unless it's in a critical section, whereas with threading, we would be less resilient to sessions that misbehave. And we have to balance those two off: the value of threading against the lower resiliency of it.
I don't really know the answer.
I've been surprised we haven't done more in this area,
but it appears to not be a huge problem if I'm not hearing about it every couple months.
Let's talk about huge problems related to four-byte transaction IDs, which is included in your list: eight-byte transaction IDs. Some forks have had them for years already, and I also saw some preparation work and so on. I'm very curious what your forecasts and take on this are. But also, what is not included in your list is the undo log and redesigning MVCC, topics which were quite popular a few years ago. There are some efforts, but now, like, silence. Maybe I don't see it.
No, I think you're right on the MVCC. I mean, we had the zheap effort by Robert Haas; that was probably 10 years ago, and he really poured himself into it. But I think the job just got too large. And I think also, once you start to look at what undo requires, things get very complicated. So I'm not saying it's not doable, but you end up with a lot of complicated challenges, and Oracle suffered from those challenges for decades until they finally figured out how to deal with them.
So what would we gain by having undo? Certainly it would make updates easier. I'm not sure it would help with something like deletes or aborted inserts.
Who needs it? Yeah, just update and be done with it, exactly.
It's all the bloat and vacuum issues, I think, that people are most excited about getting rid of.
Yeah, I mean, at the same time, they keep improving vacuum every release.
That's the thing: you have today's vacuum, you have the vacuum from two years ago, you have the vacuum from five years ago. Which one are we complaining about? Because, in fact, it has gotten better.
I'd love to just get rid of the whole thing.
Even with undo, I'm not sure we're ever going to be able to get rid of vacuum, because it has different jobs as well. Right, because of the problem with aborted inserts and deletes and so forth. So what is that going to look like?
I don't know. I'm surprised at how long we've been able to stay with what we got, what, 30, 40 years ago; a much simpler architectural setup. It's funny, I was in Geneva for the CERN PGDay, and somebody came to me and they were talking about page compression. Do we do page compression? I have a blog entry about it, and I showed it to him; I hoped he would read it from my phone, but he didn't want to. I said, why do you want page compression? It requires less disk space. I said, this is not the 1990s. What's the actual reason? I understand why it was needed like 30 years ago. But once we compress a page, then obviously, if we do an update and the new data can't be compressed as well, we've got a bigger page, and we've got to put that somewhere. Where do we put it, and then how do we deal with the index entries? I'm like, what are we really gaining? Once we're all done, all that computation, all that moving data around, what have we really gained with the compression?
Now, we can use storage compression, right?
You can, there is storage compression; I think Postgres will run on that. But again, that pushes the problem down to the storage layer, and I don't know what performance would be like, because frankly, databases don't like compression at that level. But I guess my point is that he was asking for a very specific thing, and when I asked him why he wanted it, he really couldn't articulate it, except, okay, it uses less disk space. I wasn't sure what the goal was there.
For me, compression, in the sense of let's compress data, is more related to column
storage and how you can compress there.
I told him that.
I said, if you're talking column storage, that's a completely different setup.
And we do have solutions for that.
But again, you're assuming a lot of duplicate data in the same column.
And that's a different case than compressing an 8K page on disk.
And he's like, no, no, no, I don't want that.
I want the one about the page.
I don't see anybody working on that.
It's like, okay.
We don't serve this in this restaurant.
Yeah, we don't serve that.
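Bruce's distinction here can be sketched with a quick experiment: a duplicate-heavy column stored contiguously compresses far better than the same values embedded in full rows, roughly as they would sit on a row-oriented heap page. This is a toy illustration with made-up data, not real Postgres pages:

```python
import zlib

# A duplicate-heavy column stored contiguously, as in column storage
status_col = ("active\n" * 1000).encode()

# The same 1000 status values embedded in full rows, roughly how they
# would sit interleaved with other columns on a row-oriented page
rows = "".join(f"{i},user{i}@example.com,active\n" for i in range(1000)).encode()

# Compressed size as a fraction of the original
col_ratio = len(zlib.compress(status_col)) / len(status_col)
row_ratio = len(zlib.compress(rows)) / len(rows)

# The contiguous column shrinks to a far smaller fraction of its
# original size than the interleaved row layout does
print(f"column: {col_ratio:.3f}  rows: {row_ratio:.3f}")
assert col_ratio < row_ratio
```

The repeated column values sit next to each other, so the compressor collapses them almost entirely; interleaved with other columns, the duplication is diluted.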
In terms of 64-bit transaction IDs, I think it's a little hard to understand what's going on when you look at the email threads because I think the impetus is that Postgres Pro has a version of Postgres that does 64-bit transaction IDs.
And therefore, you don't have to freeze and the whole thing there.
But as we talked about earlier, Nikolay, I think that the Postgres Pro customers are really
at the very high end
of throughput requirements,
and they're willing to pay for that,
they're willing to have a more complex system
that does that.
But when we're now in
the more generic workload
case, the Russians have been
very willing to give us the patches
to do it. But there's
a resistance in terms of exactly
how to do it in a way
that's going to benefit the high-end
users, no question.
And one of the things
I've learned in the past year is that one of the reasons that Oracle looks that way
and a lot of the proprietary forks look the way they do is because they're really selling to that top
5% of market volume. Whereas when Postgres is working on its code, we're trying to hit that
50% mark, right? And the reason Oracle is so complicated, the reason a lot of these databases
become complicated or add these complicated features is because they're really selling, they don't really
care. I don't mean they don't care, but they're only really focused on their top 50 customers and
everyone else is just along for the ride. And therefore, you get these very complicated systems with
a lot of weird options, which were added only because one of the 50 wanted it. You have this
very high-end group who's very demanding, who's calling the shots, and they're dragging the database
into this high-end volume. But every time they're dragging it up there, they're potentially making
the generic workload either slower or harder or less efficient.
And they effectively don't care because the money is in that high end.
Let me comment, please.
I cannot keep silent here, because we help startups.
Our primary customer is a startup which grew to some terabytes, and they saw something,
and they hit problems.
And in the last couple of years we have seen a lot of AI startups.
And this is crazy, absolutely.
They reach heights much quicker,
with much less effort and resources,
and they start hitting problems very quickly,
for example when freezing is a big problem
and we need to do manual freezing
that skips indexes, for example.
This is very common right now for us.
And the second thing is that
we are already losing some of them,
because when we start saying
you need partitioning, it means you need to allocate
some resources for this, like experiments and so on.
They switch to a different database system, or quickly consider it,
where there is less headache.
And this concerns me a lot.
Isn't that kind of Bruce's point that if they were an Oracle customer
and they grew to being a significant portion of revenue worth saving,
that's different.
Whereas the Postgres project doesn't have to listen to the huge customers because of
their payments.
Like maybe Oracle are incentivized to listen to those customers more than the average
customer or more than all the little guys.
So Postgres is a different landscape where we can listen to people more like across the
board, not just the guys hitting extreme scale.
My point is that smaller teams have bigger databases, with increasing speed, and they hit
problems which were hit only by big users before, more and more often.
And these guys are more like ready to switch faster because they are surrounded by AI themselves
because they are AI startups.
What do they switch to?
Sometimes proprietary databases,
AWS or GCP, what they offer.
Sometimes MongoDB is also on the table,
leading to different problems.
But I'm just concerned that I see this different thing
that I didn't see in the previous 20 years.
So if we look at what's going on with the 64-bit IDs
as a microcosm, so they're willing to give us all the patches
to make this happen, right?
But what's happened is that we've only incrementally implemented those.
And we have some of them in PG-19.
For example, the multixact sizes are now 64-bit, the members or the groups.
I can't remember.
I think it's the groups.
And sorry, we had a couple of good examples last year, and some of them came to our podcast,
where this space was exhausted.
The multixact.
Yeah, yeah.
And also, yeah, AI startups, which have a lot of data very fast.
So there are two ways we can go at it.
We could just go whole hog, 64-bit everything,
increase the tuple header size by 33% by making them 64-bit, and be done with it.
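The 33% figure is easy to sanity-check: current heap tuple headers are 23 bytes of fields, padded to 24, and widening xmin and xmax from 4 to 8 bytes each pushes that to 32. A back-of-envelope sketch, assuming the current HeapTupleHeaderData field layout:

```python
# Back-of-envelope check of the "33% bigger tuple header" claim.
# Field sizes follow PostgreSQL's HeapTupleHeaderData layout
# (assumption: 23 bytes of fields, padded to 8-byte alignment).
FIELDS = {
    "t_xmin": 4, "t_xmax": 4, "t_cid": 4,  # transaction bookkeeping
    "t_ctid": 6,                           # tuple's physical location
    "t_infomask2": 2, "t_infomask": 2, "t_hoff": 1,
}

def padded(nbytes: int, align: int = 8) -> int:
    """Round up to the next alignment boundary."""
    return -(-nbytes // align) * align

current = padded(sum(FIELDS.values()))                                # 23 -> 24 bytes
widened = padded(sum({**FIELDS, "t_xmin": 8, "t_xmax": 8}.values()))  # 31 -> 32 bytes
growth = 100 * (widened - current) / current
print(current, widened, f"{growth:.0f}%")  # 24 32 33%
```

Eight extra bytes on a 24-byte header is a one-third increase on every single row, which is exactly the cost the community is reluctant to impose on average users.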
But we're worried about the impact on an average user.
So what we've been doing is slowly 64-bit enabling the server,
multixacts is one.
And there's some other areas.
I think CLOG we're looking at doing.
I don't know if we got to that.
So we're trying to do it.
We're trying to 64-bit the areas that are costless to us, basically, where we can
increase it without really any problems.
And then I think once we get to that, now we have 64-bit pretty much everywhere that we
can easily do it.
Then we have to look at, okay, now at the places they're going to cost us, what do we do?
I think one of the great ideas that I saw was to put, basically,
an epoch on the page header, so that you could basically say that for mass loads of data,
if you're doing a mass load of data, which a lot of these companies do,
effectively all the tuples are in the same transaction.
They're clearly all in the same epoch, right?
So you just put one epoch, and then all your heap tuples look just the same as they did before, right?
I think that's where we're going, is the concept of having an epoch on the page,
and then basically allowing those not to necessarily have to be frozen at all.
I think if we start to bring the epoch down to individual rows
and we increase the header size,
which people have already complained is too big,
then we'll potentially get complaints.
So I think if I did get a guess,
I think that's where we're going,
but we're going at it very incrementally,
again, hitting all the places that store LSNs outside of
the heap and index pages.
We know those are going to be complicated and get the infrastructure done.
And I know that we have some patches in 19 for that.
I've seen them coming through that got committed.
And then we have to decide, okay, now that all this stuff is in place,
let's put up a proof of concept.
Let's see what a per-page epoch looks like.
Let's do some loading of data.
Let's see, you know, how far this gets us down the road.
We don't have a whole lot of empty space on the pages.
So it'll be a little tricky to figure out where to put it.
but I think we can find a place to put it.
But I think that's where we basically are,
is trying to figure out once we get all the bottom stuff done,
what do we do for the main stuff, the heap and the index pages.
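The per-page epoch idea Bruce describes can be sketched in a few lines: the full 64-bit transaction ID is an epoch in the high 32 bits plus the familiar 32-bit XID in the low bits, so a bulk-loaded page can carry one epoch for all of its tuples instead of widening every tuple header. This is a toy model; none of these names are Postgres internals:

```python
EPOCH_BITS = 32
XID_MASK = (1 << EPOCH_BITS) - 1

def full_xid(epoch: int, xid: int) -> int:
    """Reconstruct a 64-bit transaction ID from a 32-bit epoch and XID."""
    return (epoch << EPOCH_BITS) | (xid & XID_MASK)

# A bulk-loaded page: every tuple was written in the same epoch, so one
# per-page value covers all rows with no per-tuple header growth
page_epoch = 7
tuple_xids = [1_000_001, 1_000_002, 1_000_003]  # stay 32-bit on the page
full = [full_xid(page_epoch, x) for x in tuple_xids]

# Visibility checks can use the 64-bit values, which never wrap in
# practice, so tuples on such a page would not need freezing
assert all(f >> EPOCH_BITS == page_epoch for f in full)
assert [f & XID_MASK for f in full] == tuple_xids
```

The appeal is that the common bulk-load case pays one extra page-level field rather than 8 bytes per row, which is why finding room in the page header matters.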
Nice.
I think that makes a ton of sense,
and I really like the incremental approach that Postgres takes.
I've only been in and around the project for the past 10 years or so.
It's a lot.
It's a lot compared to some,
but it pales in comparison to 30 years.
And it's been really nice seeing the incremental improvements, but also that they stack up.
It really has come a long way in 10 years.
And things that looked incremental kind of nine years ago, eight years ago, seven years ago, they've really added up.
And you mentioned vacuum, but it's not just vacuum, right?
All of the changes that have helped with index bloat.
They're kind of attacking it from the other direction.
So it's- That's very true.
Yeah.
Logical replication, yeah.
Yeah.
Partitioning. Yeah.
Yeah.
I like that approach, and I feel like it might sound like it's going to be slow, but time flies.
In a few years, I can imagine us having made significant progress on some of these things.
So yeah, that's really cool.
Bruce, I'm conscious of time.
Is there anything we didn't talk about that you wanted to make sure we did mention or any last like shoutouts or pointers you wanted to give people?
No, it was just nice that we got a chance to at least talk about it. For me, the categories of what's missing were a big takeaway, to
understand like why are we here, why don't we have some of the stuff we don't, that we're missing,
and some of the stuff we may never have, optimizer hints.
Although that seems like a portion of that is coming in Postgres 19.
It's not called optimizer hints.
It's called advice, but it could be used in a similar way.
So, again, I think that we don't have a roadmap.
That's the problem.
Because we're too dynamic to have a roadmap.
So it's almost a surprise to me to see what gets in every release.
And I think everybody would be pleasantly surprised by what's in 19.
And obviously what we're going to be starting for Postgres 20 in July.
Yeah, absolutely.
And I think I saw the work from Robert.
I think it was in contrib modules, right?
At least the first version.
pg_plan_advice, yeah.
Yes.
Yeah.
Looks very cool.
All right.
Well, thanks so much for joining us.
I'm sorry Nick had to drop off.
It is an absolute pleasure having you.
Great.
Thanks, nice talking to you.
Likewise.
Take care.
Thank you.
