Postgres FM - Logical replication common issues
Episode Date: April 12, 2024

Nikolay and Michael are joined by Sai Srirampur, CEO and Co-founder of PeerDB, to discuss how to overcome several logical replication issues. They discuss the protocol versions, minimising slot growth, minimising lag, and some tips and tricks for scaling things well.

Here are some links to things they mentioned:

PeerDB https://www.peerdb.io/
Our episode on logical replication https://postgres.fm/episodes/logical-replication
Exploring versions of the Postgres logical replication protocol (blog post by Sai) https://blog.peerdb.io/exploring-versions-of-the-postgres-logical-replication-protocol
Logical Streaming Replication Protocol https://www.postgresql.org/docs/current/protocol-logical-replication.html
REPLICA IDENTITY FULL https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY
On the performance impact of REPLICA IDENTITY FULL in Postgres (blog post by Xata) https://xata.io/blog/replica-identity-full-performance
max_slot_wal_keep_size https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-SLOT-WAL-KEEP-SIZE
Active Active in Postgres 16 (blog post by Crunchy Data) https://www.crunchydata.com/blog/active-active-postgres-16
pgEdge https://www.pgedge.com/
DistSQL (term used by Mark Callaghan) https://smalldatum.blogspot.com/2023/07/keeping-up-with-sql-dbms-market.html
Five tips on Postgres logical decoding (blog post by Sai) https://blog.peerdb.io/five-tips-on-postgres-logical-decoding
PG Slot Notify: Monitor Postgres Slot Growth in Slack (blog post by PeerDB) https://blog.peerdb.io/pg-slot-notify-monitor-postgres-slot-growth-in-slack

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello, hello, this is Postgres FM, your favorite podcast about Postgres. And my name is Nikolai.
My co-host is Michael.
Hello, Nikolai.
Hi, Michael. And we have a very good guest today, Sai.
Hey, Nik and Michael.
Founder of PeerDB. We are very glad to have you here. And we just discussed that we are not inviting guests just for the sake of having guests; we choose the topic first, so it's a topic-centric discussion. But when we decided we need to talk about logical replication, and I also saw your blog post published recently, my first thought was that you're a great person to have for this discussion. So that's why we invited you. Glad you found time to join.
Thanks, Nick and Michael, for inviting me. And yeah, logical replication and logical decoding have been my life for the past seven, eight months. And we are trying to get into as much depth as possible to understand how it works and probably down the line contribute upstream as well. But yeah, I'm very excited to be here.
Cool. I know Michael has a lot of questions, so I probably will be less active this episode,
but I will have questions as well and maybe comments.
Yeah, I definitely have questions.
But yeah, I completely echo everything Nikolai said.
And yeah, we did a kind of basics intro episode to logical replication, where Nikolai also talked us through how to handle scale in terms of major upgrades at least, so there's a common use case for logical. But one thing we didn't even touch on, which was the topic of one of your recent blog posts: I listened back, and I think protocol versions got a single mention, kind of an off-the-cuff mention at one point in a, you know, 30, 40 minute episode. So that was where I was thinking of starting. It'd be awesome to hear a little bit about the protocol versions and why you wrote about them recently.
Yeah, absolutely.
Thanks, Michael. So the thing is, logical replication has this concept of protocol versions that you can specify as part of the subscription, or, if you are an external client who is reading the replication slot, as part of the START_REPLICATION API, which lets you read the replication slot. And there are four versions that Postgres supports now with Postgres 16, right? The first version is the default version, which decodes only transactions that are committed. So it doesn't start the decoding process before the transaction commits. The second is the more advanced option, which does logical decoding of transactions that are not yet committed; it basically does decoding for in-flight transactions. That is number two. The third lets you decode two-phase commit transactions. And the fourth lets you decode in-flight transactions and apply them in parallel on the target, which is the subscriber, which is Postgres basically, right?
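For reference, this is where the protocol behaviour is chosen: on the subscriber side via the subscription's streaming option, and for an external client via the options it passes when it starts reading the slot. A minimal sketch, assuming a publication named app_pub and a slot named app_slot already exist (all names and connection details here are placeholders):

    -- Postgres-to-Postgres: stream in-flight transactions; 'parallel' (Postgres 16) also applies them in parallel
    CREATE SUBSCRIPTION app_sub
      CONNECTION 'host=primary dbname=app user=replicator'
      PUBLICATION app_pub
      WITH (streaming = parallel);

    -- An external client reading the slot over the replication protocol would request, for example:
    START_REPLICATION SLOT app_slot LOGICAL 0/0
      (proto_version '2', publication_names 'app_pub', streaming 'on');

The point is only that the newer behaviour is opted into per subscription or per client, not globally.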
Now, in that blog which you're talking about, we compared version one and version two, because they are the more common ones. Two-phase commits are not very common, right? That's the reason we didn't go into that piece. The biggest difference, the impact of two, is that it improves logical decoding performance, because you're actually decoding while the transaction is happening. You are not letting the transaction finish and only then start decoding; rather, you're decoding as the transactions go on. So the idea is you're giving more time to the decoder, the WAL sender process, to perform the decoding. And this has a lot of benefits, right? Number one, it reduces the slot growth. In that blog, we talk about a situation where we have two long-running transactions that are being committed. With version one, the decoding process doesn't start and the slot keeps growing, growing, growing. Only after the transactions are committed does the decoding start. And then for a few minutes, until the decoding is finished, the slot size remains the same. That is with one. With two, what happens is that as the transactions happen, decoding starts, so you do see the decoder doing its job. And as and when the transactions finish, the slot immediately falls. So the summary was: with version one, the slot growth, I think we showed five, six gigabytes, remained for five, six minutes until the entire decoding finished. But with version two, it remained there only for an instant, because the decoding was already happening as the transaction was going on. So this can have a tremendous impact in use cases which have long-running or sizable transactions. And that is very common, right?
We were recently working with a fintech customer where they had sizable transactions, hundreds of operations happening in a transaction, and these transactions were interleaved. Version two would have helped them a lot, because as the operations are happening, the decoder just decodes them. With one, the problem was that there was one large, long-running transaction which took about an hour, and then every time the decoder needed to decode committed transactions, it was decoding this long-running transaction also. So the order of time is more quadratic with version one, basically, because WAL is sequential, right? You keep writing, writing, writing, and as and when there is a commit, the decoder starts working. Now the problem with version one is that while the long-running transaction is not yet decoded, the decoder process still has to go through it for the other transactions. But with two, this problem wouldn't have happened, because once this long-running transaction is committed, it's already consumed, it's already decoded, if that makes sense. So the high-level summary is that version two is very helpful when there are long-running transactions that are interleaved.
So the lesson is: upgrade, guys, right? Just use the latest Postgres version. Version 2 comes with 14. Version 4 is in 16.
Sorry, a small thing.
I've noticed you mentioned the subscriber, which is Postgres. This phrase attracted my attention. It's not always Postgres. But parallelization to apply large transactions is done only for Postgres subscribers, not for others, right?
Correct. Exactly. So the fourth version, which does parallel apply of these in-flight transactions, is more relevant for Postgres-to-Postgres replication. The apply piece is Postgres, and that's standard logical replication. And logical decoding is more like a subset of logical replication, where external clients like PeerDB can read the slot and replicate to homogeneous or heterogeneous targets.
Super. Yeah. In your experience, it doesn't mean that there's no sense in having this feature for other situations, like replicating to Snowflake or others?
The parallel apply feature, you mean, right?
Right.
Yeah.
Good question.
So that is the onus of the ETL tool, right? In PeerDB, we have an option to do parallel apply or not, a single flag that we provide. And the difference with parallel apply is that we cannot guarantee the order of transactions across tables, right?
So foreign key reference consistency suffers, similar to using multiple slots in regular Postgres-to-Postgres logical replication.
Correct. Very, very similar, right? And customers do it; the advantage of parallelism is that it improves latency: replication latency reduces, lag reduces, et cetera. But it doesn't guarantee consistency across tables.
Right. We might think: okay, with our capacity, no worries if TPS grows, we're fine using a single slot. And if TPS grows and we see the bottleneck is on the subscriber, we will just use multiple slots. But with foreign keys, you basically need to accept that they are broken temporarily on the subscriber, and it can cause a lot of trouble if you point the application to such nodes, right? That's interesting.
Yeah, absolutely. Makes sense. And then I did want to call out, just to complete that: there is a setting for disabling foreign keys, a setting called session_replication_role. You can set it to replica; that is pretty standard, that's the reason Postgres gave us this setting. So you set it to replica and it disables foreign keys and triggers on the target.
But by default, if you use multiple slots, it's disabled.
By default, is it disabled?
Yes, I know it since recent work on using logical for zero-downtime upgrades, so it's indeed disabled. And it's interesting: you need to think, maybe you shouldn't use multiple slots at all. I mean, maybe go back to a single slot and just choose a different time for your work. Work on weekends, guys, right? Lower traffic, and you can afford working with a single slot. This matters only when we need logical for a temporary time, like for upgrades.
Exactly. It really depends on the use case.
Yeah, yeah.
But if it's for a long time, forget about foreign keys, maybe.
Yeah.
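For reference, the setting mentioned above is session_replication_role, which is also what the built-in apply workers effectively run with on the subscriber. A minimal sketch of setting it manually on the target, for example while loading or fixing data by hand:

    -- In a session on the target where you are applying changes yourself:
    SET session_replication_role = replica;  -- user triggers and FK triggers do not fire in this session
    -- ... run your data fixes / bulk load ...
    SET session_replication_role = DEFAULT;  -- back to normal enforcement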
And you said latency, but it's also about throughput.
We need to process more bytes per time.
Absolutely.
Lags can be nasty.
I don't like dealing with logical in Postgres 11, 12. It's not pleasant at all.
So these improvements are exciting and everyone should upgrade. That's what I see here. You
just uncovered a lot of details. That's great.
Yeah, absolutely. And the beauty of Postgres is that this is already built into Postgres core, right? Logical replication does this; it's just a setting where you say, hey, I want to use version two, and it will be significantly more performant and reduce slot growth. But if it's an ETL tool that's using logical decoding, the transaction logic needs to be managed by the ETL tool, because as you're reading in-flight transactions, you need to keep track of whether each transaction is committed or not, and only then push it to the target. So it becomes tricky for ETL tools, which is what we are working on in PeerDB now as we speak. But the beauty of Postgres is that it just gives you the setting out of the box, and you just need to upgrade your Postgres version to 14, 15 or 16.
Yeah. And with this, in 16 we have an example. I mean, the developers of these tools, like your company, now have an example: native Postgres-to-Postgres logical replication shows how it can be implemented. So a reference implementation exists. That's good.
Yeah, that's great.
Sorry, Michael, I interrupted you.
No, not at all.
This is a really good diversion,
but we've talked about three things already, right?
We've talked about throughput, we've talked about lag,
and we've talked about slot size growth.
And I feel like when you said initially the performance of the version 2 protocol can be better,
I'm guessing we're talking mostly in terms of that lag metric.
Like we can start processing things faster, therefore the lag is lower, makes total sense.
And when you said the slot size growth is reduced, I think I understand where you're coming from.
I think the specific benchmark in the blog post shows that the growth is the same,
but then it kind of reduces quicker.
But when you mentioned the interleaved long-running transactions,
I guess it's in a less synthetic workload where you've got lots of potentially overlapping long-running transactions.
The peak will be lower in version two, if you're using it?
Great, exactly right. Because in the scenario we did, it's just two transactions that are long-running, and they ended at the same time. But in real-world workloads it can be very arbitrary, right? With version two, as slot consumption is faster and the slot size falls quicker, the cumulative effect on slot growth can be very significant, if that makes sense. And that is the next benchmark we want to do: in a real-world benchmark, I want to see how the peak slot size compares between version one and version two. We are actually building that feature, and we have a few customers we are design-partnering with to implement it. We know what peak slot size we are seeing with version one, and with this design partnership we will get to know what the peak slot size is with version two. My understanding is it will fall very quickly, because you're giving more time to the decoder, right? It's more efficient. And with these long-running transactions, it cannot be quadratic, the decoding process, right? But that's a very good question you pointed out, Michael. That is the next benchmark we want to do.
Nice.
So then the question becomes: are there any workloads... and I'm thinking now that it must only be if you had, let's say, a lot of long-running transactions that end up getting aborted or rolled back instead of committed. I'm thinking that might be the only case where you might prefer to carry on using the version one protocol. Or are there other cases where you might prefer the v1 protocol over the v2 one, as long as you're on 14 and above?
Even when the transactions are rolled back, the WAL is not removed, so it still needs to be decoded, basically, for future transactions. And that's exactly what happened in a customer scenario where there was a long-running transaction which the customer killed, basically. It was running for a few hours, right? But still, that impacted other transactions because
Postgres is not yet smart enough to remove that WAL.
This is my favorite use case. You can overload the WAL sender if you just do a very simple trick. You create a table with, like, a million or 10 million rows, then delete rows in a transaction and roll it back. Massive delete, rollback; massive delete, rollback. I like this workload in many cases, this one included. And this makes the WAL sender consume 100% CPU very quickly, because this transaction spams WAL, writing xmax constantly for a lot of tuples, and then saying this xmax, this transaction ID, was rolled back, so it doesn't count, the tuples are still alive. The next transaction does the same, and the WAL sender is going crazy. And I think there should be some way to optimize decoding to understand this, maybe. Because when it starts processing, it's not known that this transaction is already rolled back. But maybe there is some opportunity to optimize, because if it's already known that it's rolled back, why can't we skip it better or something? I don't know. I don't know the details here.
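A rough sketch of the pathological workload being described, with made-up table and row counts, in case anyone wants to reproduce the WAL sender CPU burn on a test instance:

    -- Test-only workload: generates a lot of WAL that every logical slot still has to decode
    CREATE TABLE junk AS
      SELECT g AS id, repeat('x', 100) AS payload
      FROM generate_series(1, 10000000) AS g;

    BEGIN;
    DELETE FROM junk;   -- sets xmax on every tuple, writing a huge amount of WAL
    ROLLBACK;           -- the tuples stay alive, but the WAL is already written
    -- Repeat the BEGIN / DELETE / ROLLBACK loop and watch walsender CPU for each slot.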
That's a very good point, Nik. And I was recently chatting with one of the committers, and we don't even skip rows, is my understanding; every time, we try to decode them. So the immediate optimization is: if there is an entry in the WAL from a rolled-back transaction, then we don't decode that entry. But you bring up a great point here. So there is a lot of scope for improvement there.
Yeah.
And this led us to the conclusion that we wouldn't be able to deal with logical at even a few modifying transactions per second. It was the wrong conclusion, because this was a pathological workload. And in production, I think it's a very rare case when the WAL sender is hitting 100% of a single core. Or maybe it's not, I'm not sure. Because the problem is you cannot scale it. You add a second slot, but you see the WAL sender hitting 100% of a single core's CPU, constantly or from time to time. And you think, okay, I will try to distribute the workload among two slots, or four slots, eight slots. But this doesn't help. All of them are hitting 100% of different cores, and that's it. So this can be a bottleneck, but it looks like it's very rare in production right now.
Yeah, absolutely. And that is another optimization the community could do: currently, if there are multiple slots, the decoding process runs for every slot, right? If there were a way in which the decoder runs for just one slot and the other slots reuse those decoded changes... if that makes sense.
Yeah, exactly. The same work is done multiple times.
Exactly.
Everyone processes everything. There is definitely opportunity for improvement here, but I think people just don't see this as a bottleneck in production often; that's why it's not optimized yet.
In production, what we see is these one-off batch operations, where customers delete a bunch of data or do a COPY with millions of rows, and there is a WAL spike. And the recommendation that we give is: guys, have more disk. I mean, it's hard; a faster disk and a larger disk, because logical replication can only scale up to 25,000 to 30,000 messages per second, basically. So have larger disks, so that once the WAL spike falls,
it'll catch up.
Okay, sure, at that time there is more latency, right? But it's good CPU usage if it's like 25,000 to 30,000 messages per second, because for PgBouncer on modern Intels and AMDs I didn't see more than 20,000 messages per second. We can compare these, also messages per second. PgBouncer processes transactions just by passing them to backends, Postgres backends, and returning the result. And logical replication is also similar: some messages, just some messages, right? A different type of work, because decoding maybe is more consuming, but 30,000 sounds very good.
Yeah, a hundred percent. But the thing is, it has to be done well, right? I mean, it has to be done properly, because one of the tips we follow, which I've seen lacking in a few ETL tools, is always consuming the slot. We cannot give up on consuming the slot. We need to constantly consume the slot and flush the slot, right?
And not to lag, right?
Not to lag. And the thing is, if you give up and start reading it again, it doesn't read from the confirmed_flush_lsn, it reads from the restart_lsn. And sometimes that restart_lsn can be very old, and I don't want to get into when Postgres updates the restart_lsn, because it's a more complicated thing. But the idea is, when I give up the connection and open the connection again, it starts from the restart_lsn, which can increase my decoding times a lot. This was very common when we initially built PeerDB: the slot size was like 150-200 gigs, we were giving up the connection periodically, and whenever the slot was big, for 10-15 minutes it was just stuck reading through the WAL. It was not even getting one change.
So that is one tip I recommend: always consume the replication slot.
Right.
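For monitoring, the LSNs mentioned here (confirmed_flush_lsn and restart_lsn) and the amount of WAL a slot is holding back can be watched with a query along these lines:

    SELECT slot_name,
           active,
           confirmed_flush_lsn,
           restart_lsn,
           pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
    FROM pg_replication_slots
    WHERE slot_type = 'logical';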
I was trying to recall another problem I saw when I inspected what's happening if you start using multiple slots.
And it's kind of an interesting problem.
When you publish changes to a single slot, you use it for all tables, for example. For an upgrade, it makes sense, and this is great, it works very well. But if you start distributing tables among multiple slots, you need to alter things, I think. Anyway, you need a lock on each table individually in the group. So you divide tables into groups, and then you need to start publishing changes for each group. And this is a similar lock to DDL, basically. No, not similar: it's not an exclusive lock, it's shareable. So it's not as bad as ALTER TABLE adding a column. I remember it got stuck; it couldn't complete, but at least it didn't block others.
And was this while creating the multiple slots, or was this just, you know, creating publications?
Publications. Yeah. You specify the tables for which you need to create a publication, and this needs to acquire locks on all of them. It can be successful for some of them and not for others, and it's a single operation, so you're waiting. This is not good, but not as bad as, again, DDL, which can block selects that come after; we've discussed many times how dangerous that can be under heavy load.
Yeah, that's interesting.
Anyway, multiple slots are tricky, that's the lesson here, so you need to test and learn before using them.
Yeah, absolutely.
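A sketch of the setup being described, with hypothetical table groups; creating a publication for specific tables has to lock each listed table (a share-type lock that does not block normal reads and writes, but can wait behind other DDL), while FOR ALL TABLES avoids the per-table list:

    -- Hypothetical split of tables into two groups, one publication (and later one slot/subscription) each:
    CREATE PUBLICATION pub_group_a FOR TABLE orders, order_items;
    CREATE PUBLICATION pub_group_b FOR TABLE users, accounts;

    -- Single-slot case, e.g. for an upgrade: one publication for everything
    CREATE PUBLICATION pub_all FOR ALL TABLES;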
Good. What's next?
Well, before we move on from that, I'm interested in: are there any other hard limits? You mentioned that rule of thumb, 25,000 to 30,000 messages per second. Are there any other rules of thumb, or hard limits, or anything that people could be thinking, oh, we've probably not tuned it enough because we're not getting that much throughput or that low lag, or something like that?
I guess this number depends on the core, on the type of CPU you have. If it's old, it can be 10,000, for example.
Yeah, a hundred percent. And, you know, the number that I shared is more on the higher end, in a non-local environment where the source and the target are not local. They are in the same region, but in different boxes. So network latency is a very big factor. One thing we do want to see is what the logical replication performance is when both are on the same local server, which doesn't happen in the real world. So whatever I'm saying is network latency bound, because it's single-threaded, right? The network latency kicks in, and it can only scale up to, say, 20-30 MB/s, or 30-40 MB/s if it's done well.
Right. So those are also some things to keep in mind. Now, coming to gotchas: a few things that I made a note of about logical replication. First, it doesn't support replication of DDL commands, which includes adding columns, dropping columns, adding new tables, creating indexes, truncating tables. These are not supported out of the box. And this is one of the common concerns that we hear from customers, because in the real world you add tables, you have these Django migrations which add a bunch of indexes, et cetera, right?
Partitions, we talked about that last time.
Yeah, they don't handle that either.
Yeah, yeah.
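As an aside, the usual workaround when a schema change breaks replication is to apply the same DDL on the subscriber yourself; a hypothetical example, assuming a subscription named app_sub:

    -- Run the same migration on the subscriber (ideally before it runs on the publisher):
    ALTER TABLE orders ADD COLUMN discount numeric;

    -- If the subscription was disabled after repeated apply errors (disable_on_error = on, Postgres 15+), re-enable it:
    ALTER SUBSCRIPTION app_sub ENABLE;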
When you say customers raise it as a concern, is that before starting, or realizing it later on? Where does that pop up for you?
It happens both ways, right? Because, I mean, obviously Postgres logical replication is the easiest way to replicate databases. They start off, and then there's a column that is added, and then logical replication breaks. Now, the good thing is you can add the column manually on the target and it would continue, but this becomes difficult in production, which is when they reach out saying, hey, you guys are also using logical decoding and you support Postgres as a target, and that's how they come to us. We do support schema changes, but only adding columns and dropping columns; we don't yet support creating tables, indexes, truncate and all of that, basically. And sometimes it happens before they even start: they're like, okay, we keep adding tables, we keep adding columns very often, and this doesn't work for us, so we want another solution.
Nice. A couple of the other things that I see people confused or struggling with are things like sequence synchronization; I guess that's just a one-off task at the beginning. Do you see anything else catching people out?
That's a good question. I think DDL commands is number one. Replication slot growth issues is number two, which we talked a lot about:
you know, always keep reading the slot, avoid long-running or open transactions, monitor slot growth. Another is: use the protocol versions, right? Upgrade to the latest Postgres versions and start using version two, three, four, which is faster. The third thing that we see is that Postgres logical replication doesn't replicate TOAST columns out of the box; you need to set up REPLICA IDENTITY FULL to make that happen. And REPLICA IDENTITY FULL could be expensive if you are not on Postgres 16, because for updates and deletes it could lead to a sequential scan.
And with this, what happened? CTID was used, or is used? Or why is it better in 16?
It's using indexes, basically.
Okay.
And the thing is, that works if you have the right indexes. Another thing is some customers who have primary keys but also have TOAST columns: we still need to do REPLICA IDENTITY FULL, and that helps. So REPLICA IDENTITY FULL with primary keys is more efficient. And there is a good blog that the Xata guys wrote, which I really liked, and that is something I refer customers to: hey guys, you can add this. I think it increased CPU and IO by 30%; that was their analysis. But that is about TOAST columns, where you need to have REPLICA IDENTITY FULL, and the impact of that can change based on the scenario.
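For completeness, this is the setting being discussed; a sketch, using a hypothetical table name:

    -- Log the whole old row for UPDATE/DELETE (how some CDC tools get TOASTed column values):
    ALTER TABLE events REPLICA IDENTITY FULL;

    -- Check what a table currently uses: 'd' = default (primary key), 'f' = full, 'i' = index, 'n' = nothing
    SELECT relname, relreplident FROM pg_class WHERE relname = 'events';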
Yeah, awesome.
I didn't know about that one.
I would love to read that blog post. If you send it to me afterwards, I'll include it in the show notes as
well. Absolutely. And the fourth thing that we have seen is that logical replication, or at least logical decoding, does not support virtual and generated columns. If there are generated columns, they don't show up while decoding. The way we usually solve that is on the target: the customer sets up a virtual column, or they use dbt or some transformation tool to backfill it. So that is number four. And then, last but not least, the fifth one that we see, which I'm still trying to fully understand, is slot invalidation issues, where the restart_lsn becomes null.
And I have seen that happen in two scenarios. The first scenario is there is a setting called max_slot_wal_keep_size, which actually limits the amount of WAL a slot can retain. This setting was added fairly recently, in 13 or 14, and it lets you safeguard against out-of-storage issues. So as and when the slot goes over, say, two terabytes, or whatever the setting you have, it basically invalidates the slot. In that scenario the restart_lsn becomes null, the slot gets invalidated, and you need to restart logical replication. The second scenario we have seen is that sometimes it happens randomly, and we are still figuring out why that can happen. We reached out to the community, and we heard that it could be because of cloud providers' Postgres, because cloud providers have a bunch of backup operations and some folks slightly fork Postgres. But the community doesn't say that for sure; their point is: can we try to reproduce this on vanilla Postgres, not cloud? And that was not easy for us. But we have seen that, over the seven to eight months, one or two times the slot randomly gets invalidated, and we don't have an RCA on when that can happen. Immediately we check, hey, is max_slot_wal_keep_size set? But it's still minus one, it's disabled, and still we run into these issues. So I'm on that quest of figuring out when this happens.
Right. But that is another scenario. That's terrifying.
And yeah, how do you even begin to go about reproducing that? Good luck.
Exactly. So we were able to reproduce this on one of the cloud providers, at high throughput, like over 50k transactions per second. Every 15 to 16 hours, it was getting invalidated.
50,000 per second, and every 15 to 16 hours. That's a lot. That's a lot of messages. That's not real.
I mean, I don't think that happens often, but yeah.
Wow.
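For reference, the safeguard and the invalidation state discussed here are visible in the configuration and in pg_replication_slots (column names as of Postgres 13 and later); a sketch:

    -- Cap how much WAL any slot may retain (example value; the default of -1 disables the safeguard):
    ALTER SYSTEM SET max_slot_wal_keep_size = '200GB';
    SELECT pg_reload_conf();

    -- wal_status = 'lost' means the slot has been invalidated and replication must be re-initialized
    SELECT slot_name, active, wal_status, safe_wal_size
    FROM pg_replication_slots;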
Cool.
I don't feel like we talked about use cases much. Is that because these aren't use case dependent? I guess, other than people using two-phase commit, do you see any differences between people that are using logical replication for analytics databases versus... did you see any difference between use cases, basically?
Good question. I mean, logical replication is very powerful. It supports migrations, online migrations, number one. Then it supports online upgrades, which Nik mentioned, number two. Number three, it also supports HA and backup scenarios. I have seen some customers who use logical decoding or logical replication to have HA across regions, or in hybrid environments: okay, I'm running stuff in the cloud and I still want to have a backup or a replica in my on-premise, self-hosted environment. And the only way that's possible is via logical replication, because cloud providers don't give access to the WAL. That is number three, where HA and backups is another use case. Four is replication to non-Postgres targets, or even Postgres targets, for workload isolation, use case isolation: okay, I have my OLTP database running on Postgres, now I want to put this into another database. It could be Snowflake or Postgres for analytics, or, for optimizing search, I want to put this on ClickHouse or Elasticsearch. Logical replication is very powerful, and I believe it's going to be the future for Postgres, and down the line it can open up active-active use cases also, where you have cross-region setups. I recently read a blog where folks were using logical replication for active-active. So it opens up a plethora of use cases and makes Postgres more powerful and extensible.
Yeah, I read that blog post too. I think it was a version 16
change that made it possible to not replicate changes that were made by logical replication, so you could kind of set up logical replication in both directions. But it seems super dangerous to me. Would you recommend it at the moment, or do you think we need more features before that?
I think it's more intense than that, right? Because conflict resolution and all of this is tricky. If it's very workload-specific, where, okay, I don't touch the same rows, maybe there it could work. But out of the box, implementing that is tricky and it requires more effort. And maybe we go there in the future, because we are seeing a few cases with customers where they want active-active, and there is no out-of-the-box solution. So.
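For context, the Postgres 16 change referred to here is the subscription's origin option; a minimal sketch of one half of a two-node setup, with hypothetical names:

    -- On node B: only apply changes that originated on node A itself,
    -- skipping rows A merely received via replication (this prevents the loop):
    CREATE SUBSCRIPTION sub_from_a
      CONNECTION 'host=node-a dbname=app user=replicator'
      PUBLICATION pub_a
      WITH (origin = none, copy_data = off);
    -- The mirror-image subscription is created on node A; conflict handling is still on you.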
Why do people need active-active?
Good question. I have a lot of thoughts here. The thing is, I think it really helps with HA. For example, I remember this scenario at Microsoft where customers had a bunch of SQL Server active-active setups across regions. This was a bank, and one of the regions went down, and every second is thousands of dollars. They immediately pivoted to the other region, and it kind of worked seamlessly.
But doesn't it mean that all clusters need to perform the same writes? And I didn't buy the idea that replaying changes through logical replication is less expensive than applying them initially, which was advertised in the BDR documentation. I didn't buy this. I hope to find time to test it properly and write a blog post about it; so far I haven't found time for this. It's an interesting topic. Because with BDR it means, for example, you have multiple regions, four or eight, I know such cases as well, and they suffer because every node needs to perform the same writes. And if there is a spike in one region, it's not a resilient system at all. I don't understand this.
Yeah, 100% agree. I think that's the reason there is a big opportunity there, and I recently saw one company, pgEdge, who are trying to do something like this.
Yeah, a new wave of this. I know every CTO is dreaming about this: we are, like, working in multiple regions. Maybe we should first check that all our physical standbys are in different availability zones than the primaries, right?
A hundred percent. There are more basic things.
Yeah. Multiple regions is a great thing to have, but still, I'm very skeptical. But I see in DBAs' minds there is a mind shift as well: 10 years ago all DBAs said you don't need this, but things are changing. So in Postgres 16 there is this feature to avoid infinite loops, it's interesting. But I still don't understand the use cases, and how we can scale writes if we don't split them, like in sharding. In sharding we split them, and that's great. Here, every node needs everything. Well, I'm very skeptical still. I don't understand this part of the landscape.
Yeah, a hundred percent. And I think it becomes very critical in tier zero use cases, guys. For tier one, tier two, tier three I think it's kind of lesser, but it's more these tier zero cases, where it's a Chase Bank or something like that, when it becomes hyper-critical.
But do you think it's possible to build a good system?
No, I don't think... I mean, with Postgres, I am also very skeptical, but I think there is an opportunity there, and the community will be very critical here. I don't think it can happen just with logical replication.
Maybe if we build a system categorizing data in tiers as well, and replicating writes only for the most critical data between regions, right?
We're probably tiptoeing into a different topic here.
Yeah, it has been interesting.
It's not different, it's not different. Many people think logical definitely leads to multi-master, the so-called old term.
I think the tier zero use case feels like it's a long way from being supported by native Postgres. We have a lot of providers out there for that kind of thing, and I think the general term for it is distributed SQL; DistSQL is what I'm referring to. But I would warn people against using logical for this in the short term, like anytime soon.
Okay, I agree.
Then what other questions do you have?
I didn't have anything else; I wanted to thank Sai. Did you have anything else, Nikolai?
Well, no. Thank you so much, it was interesting. Thank you for coming.
Absolutely, guys. I really enjoyed chatting with you, and thanks for inviting me.
And yeah, I hope you won't stop posting interesting technical posts.
No, no, no. Content is our currency, guys. And the thing is, for me, I'm building the company because I'm very curious. Right now the thing that is haunting me is that slot invalidation; I'm not understanding why it can happen. So because of curiosity, we will be publishing a lot of content.
But benchmarks, graphs, data and so on, reproducible also.
Yeah, that's great. Thank you so much. And also, if any of our listeners have seen that as well, have any theories, or have a reproduction case for it, let us know.
Absolutely.
Good. Wonderful. Thank you.
Thank you both. Take care.
Thanks, guys. Bye-bye.
Bye.