Postgres FM - PgQue
Episode Date: May 8, 2026

Nik and Michael discuss Nik's new project PgQue, a descendant of Skype's PgQ, for running queue-like workloads in Postgres.

Here are some links to things they mentioned:

Our first episode on Queues in Postgres https://postgres.fm/episodes/queues-in-postgres
PgQue https://github.com/NikolayS/pgque
HN discussion https://news.ycombinator.com/item?id=47817349
PgQ https://github.com/pgq/pgq
pgmq https://github.com/pgmq/pgmq
River https://riverqueue.com
Keeping a Postgres queue healthy (blog post by Simeon Griggs / PlanetScale) https://planetscale.com/blog/keeping-a-postgres-queue-healthy
Postgres Job Queues & Failure By MVCC (blog post by Brandur) https://brandur.org/postgres-queues
My queries to monitor autovacuum (blog post by Laurenz Albe) https://www.cybertec-postgresql.com/en/monitor-autovacuum-my-queries/
SELECT FOR UPDATE considered harmful (blog post by Laurenz Albe) https://www.cybertec-postgresql.com/en/select-for-update-considered-harmful-postgresql/
Christophe Pettus blog post https://thebuild.com/blog/2026/05/03/pgque-two-snapshots-and-a-diff
Our episode on pg_ash https://postgres.fm/episodes/pg_ash
Rediscovering PgQ (Alexander Kukushkin slides) https://speakerdeck.com/cyberdemn/rediscovering-pgq
Tick frequency tuning https://github.com/NikolayS/PgQue/blob/main/docs/tick-frequency.md

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork
Transcript
Hello and welcome to Postgres FM.
This is a weekly show about all things PostgreSQL.
I am Michael, founder of pgMustard, and I'm joined as usual by Nik,
founder of Postgres AI.
Hey Nik, how's it going?
Hi, Michael.
Everything is all right.
How is your business and life?
Yeah, good.
We're in spring in the UK and it's just getting a bit warmer now.
And yeah, business is good, ticking along.
I've got some upcoming news soon, actually, that I'll publish in the newsletter.
That's great. Yeah, looking forward to it.
How about you?
Yeah, obviously I'm a guest today, right?
Yeah, what are we talking about?
We're talking about queues again — queues in Postgres.
My favorite, remember, I told you so many times,
I like to observe how many of them are created
and how many of them have issues.
Actually, all of them have issues.
Now I've dug into the topic deeper, and actually I had surprises: my understanding in the past was not fully correct.
And I'm going to confess today
where I was not, like,
where there were gaps.
Yeah. And just to clarify,
when you say that they all have issues,
do you mean queue implementations inside Postgres, or inside a relational database, an OLTP database?
Yeah. So when we have a new client, for example,
at Postgres AI,
our most popular type of client is a startup
without a DBA, who are on RDS or Cloud SQL or Supabase, whatever.
And they have issues because they have growth.
This is my favorite type of client.
And they bump into some problems.
We check, we have various tooling for health checking
and almost always we recognize one of a few patterns
like a log-like, append-only, unpartitioned, huge table,
or a table, usually also unpartitioned, which receives some events to process.
And they got updates, deletes.
And if this startup is more mature, we see some, like, PGMQ, for example.
Or if it's smaller, usually nothing.
And everything is wrong there, starting from heavyweight lock contention, right?
But also bloat and a lot of complaints.
Are we talking about, like, a self-rolled queue inside the database?
Yeah, so like a naive implementation of a queue in Postgres.
You can see it naturally just looking at pg_stat_all_tables, noticing patterns of workload: a lot of updates and deletes, but also a lot of dead tuples and autovacuum.
If it's untuned, especially, it cannot keep up, but also if they have long-running transactions
or other reasons to block the xmin horizon.
We need to discuss it slightly deeper today.
They will have a lot of bloat accumulated, and all latencies suffer. And this piece of workload and this table — usually just one table —
it becomes a hot spot.
This is a huge reason for them
to complain about how Postgres is bad.
And I don't fully disagree.
Who didn't complain about Postgres and MVCC
and vacuum,
like, being a headache all the time?
Everyone did, right?
So, yeah.
And I usually said,
all you need is two things.
And actually not only said,
I came to some implementations
of cues less naive
and even helped them.
For example, long ago,
there was a project called Delayed Job in Ruby.
And I added a couple of things
like index,
which was easy, like just missing index.
But also I said,
let's use SKIP LOCKED — FOR UPDATE SKIP LOCKED.
So you just
don't need this heavy weight lock contention
when multiple sessions compete to update
or delete the same row.
It's quite straightforward.
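For context, a minimal sketch of that pattern — the table and column names here are hypothetical, not from any particular tool:

```sql
-- Hypothetical 'jobs' table with a 'status' column (illustrative only).
-- Each worker claims a batch of pending rows; rows already locked by
-- other workers are skipped instead of waited on.
BEGIN;

SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY id
LIMIT 10
FOR UPDATE SKIP LOCKED;

-- ...process the claimed rows in the application, then:
UPDATE jobs
SET status = 'done'
WHERE id = ANY(:claimed_ids);  -- ids returned by the SELECT above

COMMIT;

```

Without SKIP LOCKED, concurrent workers would queue up behind each other's row locks on the same "hot" rows.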
And to some others, I always said,
like: you just need two things, SKIP LOCKED and partitioning.
And as I understand, this is where everyone went.
So SKIP LOCKED was created roughly 10, 11 years ago —
actually 11, in 2015.
I think it was Postgres 9.5, right, because it's 2015.
Yeah.
And it's a great feature to get rid of heavy weight lock contention.
But it's not enough.
It doesn't solve the bloat problem.
Actually, somehow I noticed in my recent work over the last few weeks on hacker news discussions
and some other places, I noticed people think that SKIP LOCKED will solve their bloat issues,
vacuum issues. It's not so.
So partitioning plus SKIP LOCKED is good enough, and this is where everyone went.
I think in pgmq — actually, all modern queue systems in Postgres — they love SKIP LOCKED.
They are, like, all about SKIP LOCKED. To some extent — when my AI work started, we did a lot of research, a lot of experimenting, benchmarks — at some point I noticed I started to name all these guys "skip-locked architecture" somehow.
Sometimes I don't like it.
I like "update-delete architecture" more.
So: update-delete queue systems, not skip-locked systems.
Because "skip-locked" shifts too much attention to itself.
But SKIP LOCKED is a simple thing:
just get rid of heavyweight lock contention.
That's it.
Other problems which are bigger actually, and harder to solve are not eliminated.
They can be only mitigated with partitioning and rotation.
and I saw, for example, PGMQ, it's quite popular.
I think this is a good legacy from Tembo.
Yeah.
It's supported — I think it's supported on Supabase, quite popular there.
And they actually also went to get rid of the need of create extension.
So they re-implemented it fully in PL/pgSQL.
Oh, wow.
And then in the form of, like, a trusted language extension.
You know, pg_tle — Trusted Language Extensions.
Since it's purely PL/pgSQL,
you don't need to ask the provider to support it.
If the provider supports pg_tle, you can have it.
And not only load it as just a SQL file —
you can have it as a tracked extension without provider support,
because it's just PL/pgSQL.
So they also focus on SKIP LOCKED,
and they have support for partitioning,
but it requires pg_partman and additional effort.
Right?
And just a question.
their partitioning — is it like time-based partitions and you detach and drop them over time? Or is it a rotational thing, like...
I don't remember, but it doesn't matter, actually. So, what matters here: you cannot rotate partitions every minute. It's not practical.
Sure, sure, sure.
And it will lead, actually, to some other issues. So partition rotation should happen less often. And by the way,
for those who use partitioning to mitigate bloat: it's a great idea, but instead of detaching and
attaching partitions, which I see in some queue systems, it's better to have several static partitions. I mean,
stop creating them — it will lead to catalog bloat eventually, right? And also detaching and
attaching has its own issues under heavy loads. It's better just to truncate
and have rotation, like round-robin of partitions,
and you just truncate them and that's it.
So it's much better in many senses.
And this is what PGQ does.
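As a rough sketch of that static-partition rotation idea — the names are invented for illustration, and real PgQ's internals differ in detail:

```sql
-- Three static LIST partitions; writes target the "current" slot and
-- rotation is just TRUNCATE of the oldest, fully-processed partition.
CREATE TABLE events (
  part_no  int NOT NULL,  -- which rotation slot this event lives in
  id       bigserial,
  payload  jsonb
) PARTITION BY LIST (part_no);

CREATE TABLE events_0 PARTITION OF events FOR VALUES IN (0);
CREATE TABLE events_1 PARTITION OF events FOR VALUES IN (1);
CREATE TABLE events_2 PARTITION OF events FOR VALUES IN (2);

-- Once every event in slot 0 has been consumed:
TRUNCATE events_0;  -- heap and indexes reset instantly, no catalog churn

```

Note that TRUNCATE takes a heavy lock on the partition, which is acceptable here because the partition is rotated only after it has been fully drained.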
So I always said like these two things are enough,
but I realized that they are not.
That's the like, yeah.
Can I just — when you say PGQ,
are we talking about your new tool, PgQue, or PgQ, the Skype-based original?
Because the letters are just P-G-Q.
Let me, yeah.
let me explain how it all started.
So on our podcast, I was saying, that's it.
Just skip locked and partitions, rotation or something.
SKIP LOCKED felt natural because this is how we solve heavyweight lock contention
when you have multiple backends competing to update or delete the same row.
Yes, yes.
Inserts cannot.
Inserts don't need it.
They are independent, right?
But updates and deletes, they compete.
And when I said partitioning, I always,
said, look at PGQ, Skype created 20 years ago.
Yes.
And that's it.
And I thought it's enough.
I thought Skype didn't have it.
I mean, Skype didn't have skip locked that time, right?
So I was thinking: they did it differently because SKIP LOCKED didn't exist in 2006 and 2007.
PGQ was created exactly 20 years ago, 2006, and it was open sourced in 2007.
And now I'm talking about SkyTools PgQ.
Right.
Yes.
But okay — it didn't do SKIP LOCKED because...
well, it was also not doing updates or deletes, right?
Like it was...
Exactly.
We will come to that.
Okay.
But I was thinking — I had a false impression — that we should use SKIP LOCKED because
it's the modern path, but that we also should mitigate bloat issues, vacuum problems and so on,
just using partitioning and rotation.
And coincidentally, this is how all...
all current guys are doing it.
River, pgmq, Que (Q-U-E), others.
There are many now.
And also, some of them are quite agnostic to languages.
Some — like River is Go-oriented, so they're focused on Go frameworks.
And that's it.
And then what happened, actually.
So this is my understanding three weeks ago.
Then what happened?
I was at a campground in Zion Canyon,
and they have good connection, actually.
I was in my tent
and I saw that
it was Friday evening I think
and I saw the PlanetScale blog post.
yeah
right and I started to read
not blog post itself because I quickly realized
what it's about
I started reading discussions of it on Twitter
on X
right
So that post was dancing around Brandur's — Brandur Leach, right? —
who is actually one of the creators of River queue.
So Brandur had a post in 2015
about how
like basically how
challenging it is to have
a queue in Postgres,
because of MVCC.
And if you have a long-running
transaction with an XID assigned,
or repeatable read — I think
Brandur used a repeatable read transaction
lasting one hour or half an hour.
I don't remember. And I think in his post it was, like, below 1,000 events per second inserted,
maybe 800. And quickly, something like 60,000 events were accumulated, unprocessed by consumers,
because everything started to lag and so on. It was 2015 — I think this is actually the year when
SKIP LOCKED was added to Postgres. Interesting,
right?
Yeah,
good timing.
Yeah.
So PlanetScale discussed how bad it is.
Not that queues in Postgres are bad, but that it's bad to have long-running transactions,
or something which is blocking the xmin horizon, right?
And they promoted their new feature — how to mitigate it. But mitigate how?
Just cancel it.
So they have some smart approach, like which traffic is more important and which is less
important. My opinion: what we created with Andrey —
transaction_timeout — is good enough for everyone as a default
solution against long-running transactions, although
there might be other problems, like an unused or lagging logical
replication slot, right? Yeah.
Or maybe someone is using 2PC and prepared transactions
also can be a problem.
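For reference, the kind of guardrails being discussed can be set like this — the values are illustrative, and transaction_timeout only landed in Postgres 17:

```sql
-- Kill sessions idling inside a transaction, and cap total transaction time.
ALTER SYSTEM SET idle_in_transaction_session_timeout = '1min';
ALTER SYSTEM SET transaction_timeout = '5min';  -- Postgres 17+
SELECT pg_reload_conf();

```

These act as the "shotgun" default: any transaction holding the horizon longer than the cap gets canceled, whatever it is.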
So anything that holds the xmin horizon and prevents the cleanup of dead
tuples — like, old row versions.
By the way, we just released our monitoring. When we say that, we mean monitoring with Grafana and VictoriaMetrics, with Postgres inside everything.
We just released it with our new dashboard for xmin horizon analysis.
And there are five possible reasons.
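A minimal sketch of the kind of catalog queries involved — these are not the actual dashboard queries — covering three of those reasons: backends, replication slots, and prepared transactions:

```sql
-- Backends holding the xmin horizon back the longest:
SELECT pid, state, xact_start, age(backend_xmin) AS xmin_age
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
LIMIT 5;

-- Replication slots pinning old row versions:
SELECT slot_name, active, xmin, catalog_xmin
FROM pg_replication_slots;

-- Forgotten prepared (2PC) transactions:
SELECT gid, prepared, owner
FROM pg_prepared_xacts;

```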
And also Laurenz Albe, very timely, posted a blog post about how he monitors autovacuum.
I stole a couple of thoughts there,
just implemented them in the dashboard, and it's already released — free to use, Apache license.
But it works much better if you go and become our customer, because we have great new health
metrics. I will blog about it separately. Anyway, this is connected, because the xmin horizon —
like, we also talked about it. Everyone monitors long-running transactions, but it's
the wrong thing to monitor in this context. You need to understand the xmin horizon being blocked — and by whom —
to unblock it promptly, because this is how you can put your River or pgmq or something down. Actually,
not down, but basically lagging, with very poor performance,
accumulating a lot of bloat that is never recovered if it's not using, like —
I think that's the other thing that people don't realize: there's no recovering from that, because once that's
bloated, unless you're using, like, the partition rotation —
There are several things here, several.
First, a lot of dead tuples are accumulated.
Yeah.
Because the xmin horizon is blocked, dead tuples are created every time you produce
a successful delete, a successful update, or an unsuccessful insert.
So it means, by the way, that we also can produce dead tuples if some inserts are failing.
But this is very subtle.
It's a nuance — like, we can omit it, right?
So, regular approach: we always produce dead tuples.
A tuple is a row version. We agreed on our first episode that I say "tuple," you say "tuple," or vice versa — I don't
remember. Yeah. Anyway, a tuple is a row version. So the old version becomes dead, but it's still
hanging out everywhere, actually — including shared buffers. It's polluting everything.
So garbage collection — called vacuum — is needed to delete it. It's a two-phase process: first it's only
marked, and then it's deleted.
And the first problem: a lot of dead tuples are accumulated.
They cannot be deleted by the garbage collection called autovacuum.
Right.
And the first bad effect is that consumption latencies degrade —
very fast, actually.
Because to find the next event, you need to scan through all the dead tuples with your index scan.
And it becomes less and less performant.
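You can watch this buildup on a queue table with the standard statistics view — 'jobs' here is a placeholder table name:

```sql
-- Dead tuples vs. live tuples, and whether autovacuum is keeping up.
SELECT relname, n_live_tup, n_dead_tup,
       last_autovacuum, autovacuum_count
FROM pg_stat_user_tables
WHERE relname = 'jobs';

```

A queue table with far more dead tuples than live ones, and a stale last_autovacuum, is the degradation pattern being described.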
Yeah. The next bad thing is that we accumulate a big set of unprocessed events. Sometimes — not always.
Okay, we have degradation, and when I say degradation, it means: it was,
like, one millisecond to fetch the next event, for example, or a bunch of events, and then it
degrades to a second or a few seconds over one hour. I saw, I think, five seconds for some
queues just to fetch one event. Five seconds — can you imagine? And at some point,
if you keep inserting a lot of events and consuming them, it might start timing out on statement_timeout.
If you have a strict statement_timeout, as you should for all OLTP systems — we always recommend having strict timeouts.
Right.
So this is the first effect: a lot of dead tuples — degradation of consumer performance.
Consumer query performance.
The second effect is that a lot of them accumulate just because consumer capacity — throughput — is not enough.
You have, for example, 10 consumers working
in parallel; SKIP LOCKED helps them not to fight with each other.
And you have capacity, for example, to consume 2,000 events per second.
But now suddenly latency went from one millisecond to one second —
a thousand times worse. It can happen within 10 minutes or so.
And 10 minutes — for example, you have a multi-terabyte database. If you decide to
dump it, right, or create a logical replica in the traditional way, this is it. This is how you
can reach second-level consumer query latency.
And this leads to accumulation of unprocessed events.
And for some queues I noticed: even when we stop
the long-running transaction and unblock the xmin horizon —
first of all, autovacuum comes, cleans up dead tuples,
and immediately the latency of the consumer query drops —
but for some, it didn't recover to the previous level.
It didn't recover to the one-millisecond level.
It stayed at, like, 50 milliseconds or so.
I think it was Que — which is spelled Q-U-E, very similar naming to mine.
I call it "ke," like in Spanish.
I'm not speaking Spanish, but for those who are:
how is my tool pronounced?
It's crazy, right?
Something like "pe-ke" or something.
I don't know.
I have an issue, and actually I already have pull requests to rename it,
but I didn't like any of the brainstormed names so far.
PgBelt — this was the best I had, and I didn't like it.
And we will come to why "belt," right?
And why "que" is actually misleading, as I learned.
I learned two big things during this journey
last three weeks or two weeks.
So...
Just quickly, before we move on from side effects,
I want...
So eventually, even in a non-partitioned system,
the heap bloat would get reused.
Once the tuples are marked dead and vacuum comes along,
it marks the space as reusable.
New jobs or new events
can go into that space in the table.
But people don't, and I know we've talked about this, like thousands of times,
but it's the indexes that can end up bloated in a way that isn't recoverable without a re-index.
So I wonder if that could be the source of the not ever recovering.
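When that happens, the index has to be rebuilt explicitly; since Postgres 12 this can be done without blocking writes — the index name below is a placeholder:

```sql
-- Rebuild a bloated queue index concurrently with normal traffic.
REINDEX INDEX CONCURRENTLY jobs_status_idx;

```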
Maybe, yes.
Yes, you're right.
And bloat might still be there.
And this is what can be solved post-factum, actually.
It can be solved with partitions and truncate, in rotation,
because truncate means you will have a fresh, empty index, and it will start growing from scratch.
It's great.
But again, I was thinking first, actually, it's an interesting nuance as well.
I was thinking: oh, partitioning won't help in the middle of a long-running transaction —
in the middle of, like, when the xmin horizon is blocked.
Actually, it will.
It will help.
It will help.
You will switch to new partition.
The old partition will keep the old stuff — dead tuples, degraded indexes — and that's it.
And this actually will lead us to an interesting optimization
in my tool. But what won't be possible — I think it's possible, but I
wouldn't recommend it — is aggressive, like every-minute, partition rotation and
truncation. First of all, you need a lot of partitions. Maybe not, actually —
maybe you can find a way to just jump between them — but it's not practical, again,
because of catalog bloat, and the stress when you switch to a new partition.
And you need to clean up
everything; everything should be processed.
And if degraded latency leads to
unprocessed event accumulation,
you won't be able to recycle the old partition, because it still has useful data.
You can move it to the new one — like, it's becoming a nightmare, actually.
So anyway, it feels architecturally wrong to me
to have very frequent partition switching.
Right. So this is — and back to your question, time-based or size-based: well, time-based is fine. I actually
don't remember what I have in the settings, I need to check. But here's your other question: are we talking about
the new PgQue or the old SkyTools PgQ? So what I did: I took it as is, right? The first thing I did,
I took it as is, and I just started to build around it. So I'm not touching the core engine.
The original SkyTools PgQ core engine remains.
I mean, I knew that, but not actually by reading your...
I saw, did you see a really...
I thought it was a really good blog post by Christophe Pettus covering the 0.1 announcement.
Yeah, he's done it.
He's been on a bit of a blogging spree lately, but he's done a blog post about PgQue, your tool, already.
I missed it.
It's cool.
Nice.
Well, yeah.
What I wanted to say: guys, there is an alternative — forgotten kung fu, I say, because I know how trustworthy it is. 20 years.
I used it in two of my three social network startups myself, before we — I think it was a mistake — switched to RabbitMQ.
I regret it now.
We use it heavily.
We used it even originally, as Skype did.
Skype built it not only, like, for event processing, but also to have logical replication:
instead of Slony, they built
Londiste, and it was working on top of
PgQ.
Native logical replication didn't exist
at the time. So
it was serving many purposes.
And when
they built it, Kafka didn't exist.
Kafka was created in 2011, I looked it up.
So this is called
a queue, but by nature —
this is the second big thing I learned —
it's not a queue system.
It's an immutable log,
similar to Kafka — not
distributed like Kafka — or Redpanda, also a modern thing, right, as I learned. It just
guarantees that something is inserted, in this order, and a consumer is just a pointer shifting, right?
But it can be used for queue-like workloads — maybe not all of them, and we can discuss it in a bit.
But what I wanted: I saw this blog post from PlanetScale, and again these discussions, like, "just
use SKIP LOCKED." So the first thing I posted — and I got big feedback, people were
like, yeah, that's it — I said: first of all, SKIP LOCKED doesn't solve vacuum problems. That's just very
wrong. SKIP LOCKED solves heavyweight lock contention, right? And also, about SKIP LOCKED, there is
another post from Laurenz Albe — "SELECT FOR UPDATE considered harmful," right? So there are issues
with SKIP LOCKED. It's a part of SELECT FOR UPDATE — you cannot use it without SELECT FOR UPDATE — and there are
issues with that approach as well; additional danger is there. But I wanted to show, like, there is
PgQ from SkyTools. And we all know another tool called PgBouncer, and I'm wondering —
and I know PGQ is still used in very large companies as an important building block,
very reliable, very performant.
Skype originally built architecture for, as I learned from their talks, for one billion users.
They achieved hundreds of millions.
I don't know after acquisition by Microsoft.
By the way, Skype was closed last year.
Also a case of this. So that's an interesting legacy here, and I'm thinking: why did PgBouncer become quite popular,
but PgQ didn't? And my hypothesis is that it's because it's an extension which
requires an additional daemon. Maybe I'm wrong, but maybe providers just didn't want
to bring any additional daemon which is not a regular background worker, but something that needs to be
managed separately.
Whereas PgBouncer is completely independent and...
Well, yeah, yeah, yeah, yeah.
But yeah, that's a good point.
It needs to be managed.
There are some providers who provided it as a managed service.
But PgBouncer — it's like an outside tool solving that problem.
Here, it's inside Postgres, and we need an additional daemon.
I don't know.
I didn't participate in any of those decisions,
but I just don't see any of the providers supporting PgQ.
Meanwhile — maybe on Cloud SQL, actually? Is it...
They support... no, this is — maybe I told you this
and I was wrong: I mixed it up with another tool
from SkyTools called PL/Proxy.
Cloud SQL supports PL/Proxy.
yeah
and yeah
Hello, Hannu Krosing,
who actually liked every single
post of mine on LinkedIn about
PgQ, I think — because
he was from Skype as well.
so it's great
it's great and yeah
so
then
interesting thing: I'm thinking, okay — you know my pg_ash work, the anti-extension concept, right?
Yes, yeah, we talked about it.
Yeah. Then there are others — I have a couple more coming. But then I
think: okay, I just need to repackage it, right? And being in my tent, I have a new tool to create thorough
specifications — a spec creation tool, which I use for many things now. It's a
CLI, like, just... You start from an idea, then explore questions, do research,
then build a comprehensive spec, and then iterate with multiple LLMs, the most powerful you can
reach. And then you have, already, like, version seven, for example, which is ready for implementation.
So I wrote a spec: how to repackage PgQ from Skype in this anti-extension manner, so no CREATE
EXTENSION is needed. And since — like we discussed this — we have pg_cron,
like with pg_ash.
Same thing.
Almost everywhere.
Yeah.
You need a ticker.
So that's why that daemon was needed.
You need a ticker.
So PgQ, it's a log.
There is insert — it can be a single insert or a batch.
Never updates, never deletes.
And there is basically its own horizon.
It's based on snapshots of data, right?
So every consumer knows its position.
And to shift the position, you need a tick.
You need to announce: okay, we shifted,
because something new
arrived. And by default, the PgQ from SkyTools ticks every second. So it shifts, and
consumers see new data and fetch a whole batch of events. Batch processing
is the default there. The thing I didn't understand is that you have a
second table for keeping track of those. There are meta tables, yeah — for ticking and for
subscriptions. But the queue itself is three partitions — old-school inheritance partitioning, because it was
created before native partitioning existed. So, three partitions: one is in work, one is, like, in the
past, and one is in the future. And there is rotation — rotation using truncate. There is also a delayed table,
separate, for those events which cannot be processed now and need a retry; we put them there.
Yeah.
So that you don't end up truncating jobs that haven't been done.
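For a sense of the flow being described, here is roughly what the classic SkyTools PgQ API looks like — the function names are from the PgQ extension, but treat this as a sketch and check the docs, since details can differ by version:

```sql
-- Set up a queue and a consumer (batch-oriented, append-only log).
SELECT pgq.create_queue('my_queue');
SELECT pgq.register_consumer('my_queue', 'my_consumer');

-- Producer side: only inserts, never updates or deletes.
SELECT pgq.insert_event('my_queue', 'user_signup', '{"id": 42}');

-- Consumer side: once the ticker has advanced, fetch a whole batch.
SELECT pgq.next_batch('my_queue', 'my_consumer');  -- NULL if no new tick yet
SELECT * FROM pgq.get_batch_events(1);             -- 1 = batch id from above
SELECT pgq.finish_batch(1);                        -- shift the consumer's pointer

```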
Yeah.
And it was a single table.
I think I also implemented the same pattern there because sometimes we might have
a lot of jobs to be retried, events to be retried for processing.
So I created also three partitions.
You also created dead letter Q for the concept of maximum retries and then we put it to
that letter not to retire forever.
So I just, I just, I suggested this is like how many.
modern systems work. I said, oh, good idea. Let's adopt it here on top of what already exists.
So I created the spec. And I was thinking: guys, I understand PlanetScale's position.
Let's just have a shotgun and fire at all those long-running transactions. Great. transaction_timeout —
that's also a shotgun. Maybe less smart, but still working and reliable, right? But let's just
compare performance. I said: okay, let's compare performance.
And I said it in the same session of Claude Code where we had just created the spec.
And then we started benchmarking — provisioned, I think, seven VMs in the cloud for the alternatives and PgQ.
And I noticed it provisioned two machines for PgQ.
One was original PgQ, and one it called, like, "PL mode."
PL mode.
I was thinking, okay, what is PL mode?
We were supposed to create it according to the spec, but we hadn't implemented it yet.
It said: you created it in 2019.
I said: okay — first of all, it was not me. It was Marko Kreen, the author of PgQ.
But second: how come it's already created?
Apparently — also an interesting part of the story — Alexander Kukushkin, the main maintainer of Patroni, in January, I think at PGDay Prague — I might be mistaken — presented a talk about PgQ. Because PgQ is an excellent
thing: let's revive it, right? Great, goals align here. I also want to revive it. And I was like —
you know, I told him: but unfortunately, you cannot install it on RDS, right? I was being
slightly provocative. He said: no, my slides have a recipe. I go: how come, how did you learn this? He said:
everything is written in commit messages. The problem with PgQ has always been a lack of
documentation. Everyone can tell you that — I've heard it for 15 years and can say it myself. So yes, indeed,
there is a commit from 2019 or so to make it work on RDS and others: let's have this supported
without CREATE EXTENSION, just a single file — or maybe multiple files, doesn't matter.
A PL/pgSQL-only mode; it's called PL mode internally in PgQ. So apparently it was already ready,
but it was not... I said, okay. Meanwhile, I talked to my Claude Code. I say:
okay, but I cannot find that file.
Like, where did you get it?
I don't see it in the repository.
It says: okay, you just need to run make.
What?
You need to run make to get the SQL file you can load into your RDS.
We live in different worlds here, a little bit.
Yeah.
I told Kukushkin, I think,
Gen Z won't understand us with this make.
I don't know.
It's interesting.
For me, it's like mind-blowing.
Everything exists, but it's buried under these walls.
For some people, it's not a wall:
git clone, cd, make,
psql, import the file, and everything works.
But imagine for some people it's a big barrier, right?
It'd be weird not seeing it in the repo.
Yeah. Well, you need to read commit messages. And it's great
that Alexander has a talk promoting that this is possible. But it inspired me even more: let's do it,
and add more and more to the documentation. So my tool basically has it as a module,
and then it compiles and presents it as SQL. But of course, I started adding more things around it.
So this is it.
This is the idea of PgQue — with "ue" for universal edition, so it can be used anywhere.
And this week I'm releasing the second version, with a lot of stuff, actually — a lot of stuff.
Some requests and so on.
First of all, I realized I need libraries.
So we now have TypeScript, Go, and Python libraries.
And one person promised to bring Ruby library as well.
Oh, nice.
Yeah, and I also already have two external contributors.
So, like, there is some life — I reached a thousand stars in four days or so.
It was good.
I mean, it felt great.
But I also learned it's not a queue.
It's like a log, because it's more like Kafka than RabbitMQ or ActiveMQ.
And I agree, after a fuller understanding.
That's why in version two I'm bringing in another concept, originally from SkyTools.
It's called cooperative consumers, or
sub-consumers. So logically it's a single consumer, but there is a group of consumers which
distribute the load between them. This is needed, for example, when you have — imagine you have a queue of
jobs like "process a video." Some videos are super small, some videos are super big. In the case of PgQ,
you just shift your position and read them one by one. By the way, some people looking at PgQ
think: okay, if I'm adding more consumers, I'm increasing capacity. No — throughput won't increase,
because every consumer in PgQ reads everything. Everyone reads everything. You need different
queues to distribute load. It's like topics in Kafka, because it's actually not a queue — it's a log.
Yeah, but do you support multiple queues? I guess it just involves...
Yeah, yeah, you can create as many queues as you want. Each time,
they will be partitioned and all the mechanics will work.
Yeah.
But the concept of sub-consumers — which is not my idea, it's the original idea —
I couldn't import it, because it's a separate repo, PgQ Coop, and it doesn't have a license.
There are only two issues there, asking what license it is, because people want to package it as a Debian
package or something.
So I couldn't take it, but I stole the idea, of course, right, and re-implemented it with my own code,
but the idea is the same.
The feature is right now experimental.
I need to play with it.
We started already benchmarking it and so on.
It looks good.
So this I think should be natively supported.
So there is a lot of stuff,
but the key idea is that now it's like a single file.
You can load it.
I also made it a pg_tle extension,
for those who want to properly track it as an extension.
Sure.
Right.
And you just inject it.
You configure pg_cron.
I actually thought about it — and Hannu Krosing, who is now at GCP and was at Skype,
we discussed it on LinkedIn, and I implemented it.
So pg_cron cannot tick more often than once per second.
I was going to ask you about this.
Yeah.
But wait — the default is 100 milliseconds?
This is new for version 2, yes.
I just made it yesterday.
Okay.
Yeah.
So I was thinking, first of all, people think about latencies.
I recognize three latencies.
First is producer query latency:
how long it takes to insert one event, or a batch of events, like 100 or 1,000.
Second is consumer query latency:
how fast it is to take the next batch and fetch it, right?
And we measure them, and we see how badly the second one degrades for all the alternative modern tools.
I cannot call my own tool modern.
I'm very old.
The engine is 20 years old.
So they all degrade.
Here, we almost don't degrade.
We slightly degrade from 100 microseconds
to slightly above one millisecond,
while they go from one millisecond to one second,
sometimes five, I saw.
And degradation we can discuss separately.
Our degradation is also solvable, but not solved in version 2 yet.
When I say version 2, it's 0.2, because it's early,
but it's a super solid engine, we know.
And there is third latency,
end-to-end event delivery latency.
If you tick, if you shift your visibility horizon only every second, it can be up to a second, at least.
Also, a consumer itself might not wake up immediately; you would need LISTEN/NOTIFY or something.
It's partially supported right now.
But you need polling or something.
You might lose some milliseconds there as well.
And I was thinking: this decision to have once per second was made 10 to 20
years ago. We have better hardware, so let's have 10 ticks per second by default. And how? Okay, pg_cron,
which we rely on. By the way, pg_cron is optional. You can put it into cron or something. You just
need SELECT pgq.ticker(). Okay. Tick-tick. Yeah, every second by default
originally came from SkyTools. It was in the first version. In the second version, I made it 10 times per second,
And it's simple.
In pg_cron, there is a stored procedure which has a loop with COMMIT,
because we actually need separate transactions to shift the snapshot.
So it's not every... this is the same misconception as with \watch in psql.
It's not every 100 milliseconds.
It's 100 milliseconds of wait time.
The operation itself has non-zero duration, right?
So roughly it should be fine,
and it ticks about 10 times per second, but not exactly; it's slightly shifting.
But updating one row, it's super fast.
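A hypothetical sketch of that scheme (not PgQue's actual code): pg_cron fires once per second, the minimum it supports, and the procedure it calls loops ten times, committing after each tick so every tick runs in its own transaction and shifts the snapshot. The pgq.ticker() name is borrowed from classic PgQ.

```sql
-- Hypothetical sub-second ticker; function/procedure names are illustrative.
CREATE OR REPLACE PROCEDURE tick_loop()
LANGUAGE plpgsql AS $$
BEGIN
  FOR i IN 1..10 LOOP
    PERFORM pgq.ticker();   -- assumed ticker call, as in classic PgQ
    COMMIT;                 -- separate transaction per tick shifts the snapshot
    PERFORM pg_sleep(0.1);  -- 100 ms of *wait* time, not a fixed 100 ms period
  END LOOP;
END $$;

-- pg_cron cannot go below one-second granularity, so it just drives the loop:
-- SELECT cron.schedule('pgque-ticker', '1 second', 'CALL tick_loop()');
```

Note the COMMIT inside the procedure: that only works when the procedure is called in a non-atomic context (a top-level CALL), which is exactly why this is a procedure and not a function.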
Yeah.
And it will be even better when I implement bloat mitigation for system tables, because
this is why we go from 100 microseconds to 1 millisecond or so under a blocked xmin horizon:
we accumulate dead tuples in these metadata tables, ticker and subscription.
Are there any other downsides to increasing the ticker speed, to increasing the tick frequency?
So I did preliminary benchmarks yesterday, and there's an important thing to understand.
When it's ticking, if there's nothing to do, there's nothing to read,
so it doesn't write.
And it means it's great if load is low: it won't produce new writes.
But imagine every 100 milliseconds you have new events.
In this case, every tick is updating this row, which has metadata.
And we estimated that for ticking every second, it's 24 megabytes per day of WAL.
It's very rough, because it doesn't take into account full-page writes.
So it's very rough, just the overhead from ticking:
24 megabytes per day from ticking a single time per second.
Oh, per day, that makes more sense.
And 240 megabytes per day, sorry,
if you have the default in version 2 of 10 times per second,
which is acceptable.
I mean, this means you have load already, right?
So the database is already loaded,
if every 100 milliseconds there is ticking happening.
If nothing happens, again, no writes.
So it sounds like it scales fairly linearly then, like 10 times more.
This is the overhead from updating this metadata table, the row tracking where we are.
That's it.
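The back-of-the-envelope arithmetic behind those numbers, assuming each tick writes one single-row update of roughly 280 bytes of WAL (my assumption; full-page writes ignored, as Nikolay notes):

```sql
-- 86,400 ticks/day at 1 tick/s; 864,000 ticks/day at 10 ticks/s.
SELECT pg_size_pretty((86400  * 280)::bigint) AS wal_per_day_1_tick_per_sec,
       pg_size_pretty((864000 * 280)::bigint) AS wal_per_day_10_ticks_per_sec;
-- on the order of the ~24 MB/day and ~240 MB/day figures mentioned in the episode
```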
Yeah.
Right.
So, of course, if you ingest a lot of data, there are more mechanics at play.
Interesting.
Bloat might happen.
And again, if you have long-running transactions and the xmin horizon is blocked,
in this case dead tuples will accumulate, unfortunately,
in the metadata tables, which I'm going to solve also with partitioning and TRUNCATE.
So something I don't quite understand is: when doesn't this make sense?
You mentioned it's not really a queue;
it can be used for queue-like workloads, but maybe sometimes it doesn't make sense.
Should we go into that a little bit?
This is a great question.
I'm deep in the database world,
so I would like to hear from backend engineers and people who build systems:
What's lacking here?
One thing I can understand is lack of, for example, priorities for events.
Yeah.
Because this is very linear; with cooperative consumers, I think the problem of a big task blocking small tasks will basically be resolved.
But priority, I don't know.
This is definitely not a pattern here.
Also, if you need almost immediate delivery, pgmq or River might be better, because they deliver faster, right?
They take the event right away.
End to end, I mean, end-to-end latency for job processing:
what we have here is worse.
The end-to-end latency is
controlled by this ticking frequency.
I actually wrote a document
about frequency tuning with some
considerations.
In the docs folder,
there is a special document right now.
So the problem will be:
if you want
almost immediate delivery, like, for example,
I don't know, chat or something,
maybe you should choose
pgmq or River, but you need to fight
xmin horizon blockers very actively, right, and install our monitoring and connect it to our
platform and check the health and so on, and fight those blockers actively.
What I can say is that these DELETE / SKIP LOCKED systems
have better end-to-end delivery, but they degrade badly when the xmin horizon is blocked.
We have worse latency initially, but it's predictable, reliable, right?
And in the case of background processing, for example: we discussed how to convert an integer4 primary key to an integer8 primary key.
You have, for example, a billion rows.
You need to change them.
And you chose, for example, the approach I call the new-column approach.
You create a new column with integer8.
And then you need to install a trigger for future rows already.
And then you need to process your big backlog, a billion rows.
You do it in batches.
How do you schedule this processing?
This is exactly it: any background processing
where 50 milliseconds
of end-to-end latency is fine, this is it.
It's good.
Working in batches.
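The new-column approach can be sketched like this, a generic sketch with made-up table and column names; the batched backfill in step 3 is exactly the kind of background job you would schedule through a queue:

```sql
-- 1. Add the new int8 column (fast, metadata-only).
ALTER TABLE orders ADD COLUMN id_new int8;

-- 2. A trigger keeps the new column in sync for future writes.
CREATE FUNCTION orders_sync_id() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  NEW.id_new := NEW.id;
  RETURN NEW;
END $$;
CREATE TRIGGER orders_sync_id
  BEFORE INSERT OR UPDATE ON orders
  FOR EACH ROW EXECUTE FUNCTION orders_sync_id();

-- 3. Backfill the billion-row backlog in batches; each batch is one job.
UPDATE orders SET id_new = id
WHERE id IN (SELECT id FROM orders
             WHERE id_new IS NULL
             ORDER BY id
             LIMIT 10000);
-- Repeat until no rows remain, then swap the columns under a brief lock.
```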
If you need one millisecond,
okay, choose newer tools,
but fight xmin horizon blockers.
Yeah.
I mean, my understanding of when you use queues
is for asynchronous stuff.
So I'm struggling to imagine
something that can't cope with 50 milliseconds of overhead on something asynchronous.
Even like a password reset, if it comes through a second later, it's fine.
I agree.
And in this case, maybe you should consider doing it outside of Postgres with different systems, like Redpanda or something.
I can imagine some systems where you need very responsive behavior,
but you need to learn how MVCC works and what dangers await you
if you build it like that.
You will have good latency in the beginning,
but suddenly then something blocks you
and then it degrades quickly.
I wanted to mention that
a queue in database is great
because it's ACID.
Nothing will be lost, right?
It's like it's replicated.
It goes to backups.
Nothing is lost.
And there's isolation.
All four properties are
properly followed.
Right, if you go and use a different system, you need to think about consistency, right? So you need to
think about: if you have something in the database, you already wrote it here but didn't delete it there, it can be
inconsistent. These days GitHub works very poorly, and since I posted this project on GitHub, I've worked with it. I'm usually
on GitLab, and I know GitLab's issues as well, because our clients have been there for many years.
But they're great.
On GitHub lately, I'm like, wow, it's interesting.
You already merged a pull request, but it takes some seconds for the counter to propagate,
also for this pull request to disappear.
So they have big lags, asynchronous processing, right?
But lags are fine, eventual consistency, right?
But data loss is not fine.
So, I would say: if you need predictable performance,
a reliable approach, good throughput, without suffering from degradation when the xmin
horizon is blocked, PgQue is great. When you need much faster delivery and you want to
stay inside the database, ACID and so on, choose those different systems for Postgres, but fight xmin
horizon blockers. And if you want better throughput, go with Redpanda, Kafka or anything,
if you can afford supporting it or paying for a managed version.
But in this case, do look at the transactional outbox pattern.
Yeah.
Because this is from microservices theory, so to speak.
There is a pattern to organize data delivery from the database to a queue properly, with all statuses.
This is how you should do it, because otherwise data loss is eventually inevitable.
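A minimal sketch of the transactional outbox pattern (illustrative names): the business write and the outbox row commit atomically in one transaction, and a separate relay process later ships unsent outbox rows to Kafka/Redpanda and marks them delivered, so an event can be delayed, but never lost.

```sql
CREATE TABLE outbox (
  id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  topic      text        NOT NULL,
  payload    jsonb       NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),
  sent_at    timestamptz  -- NULL until the relay confirms delivery
);

BEGIN;
  INSERT INTO orders (customer_id, total) VALUES (42, 99.90);
  INSERT INTO outbox (topic, payload)
  VALUES ('orders', '{"event": "order_created", "customer_id": 42}');
COMMIT;  -- both rows commit or neither does: no lost events
```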
Yeah, this is how to navigate solutions. That's my advice.
Anything else you wanted to make sure we cover before we wrap up?
Well, I'm just excited that, among others, as you said,
Christophe Pettus and also, as I said, Alexander Kukushkin,
I guess we teamed up a little bit, somehow,
in a distributed fashion, to shed new light on PgQ,
because it's a great piece of software. It solved problems
before people encountered them. But somehow, the
knowledge got lost, and I hope more people at least keep in mind what's possible
and consider it when building systems, and tell their AI to consider it, because maybe they'll just say,
yeah, I will look at it, do some benchmarks and research, and make a decision, right? That's it.
Maybe, yeah. All right, nice one. Well, thanks so much, Nikolay, catch you soon. Thank you for listening.
See you soon. Bye.
