Postgres FM - autovacuum
Episode Date: May 29, 2026Nik and Michael discuss autovacuum, including what it does, and the basics of why and how to tune it. Here are some links to things they mentioned: autovacuum https://www.postgresql.org/d...ocs/current/routine-vacuuming.html#AUTOVACUUMautovacuum configuration parameters https://www.postgresql.org/docs/current/runtime-config-vacuum.html#RUNTIME-CONFIG-AUTOVACUUMWhat’s Missing in Postgres? (our episode with Bruce Momjian) https://postgres.fm/episodes/what-s-missing-in-postgrespg_squeeze (our episode with Antonín Houska) https://postgres.fm/episodes/pg_squeezeMy queries to monitor autovacuum (post by Laurenz Albe) https://www.cybertec-postgresql.com/en/monitor-autovacuum-my-queries/Autovacuum Tuning Basics (post by Tomas Vondra, originally for 2nd Quadrant blog) https://www.enterprisedb.com/blog/autovacuum-tuning-basicsZero autovacuum_vacuum_cost_delay, Write Storms, and You (post by Jeremy Schneider) https://ardentperf.com/2026/04/12/zero-autovacuum_cost_delay-write-storms-and-you/Our episode on long-running transactions / xmin horizon https://postgres.fm/episodes/long-running-transactions~~~What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!~~~Postgres FM is produced by:Michael Christofides, founder of pgMustardNikolay Samokhvalov, founder of Postgres.aiWith credit to:Jessie Draws for the elephant artwork
Transcript
Discussion (0)
Everyone who is running Posgars has auto vacuum, and most of people heard about auto vacuum and vacuuming and blow out and this garbage collection.
But a lot of databases grow with problems being unnoticed.
And they start tuning auto vacuum too late.
So this topic today is brought by you, Michael, right?
Hi, Michael.
Hi, Nick.
and we definitely had this topic discussed in the past in terms of like vacuum and
blood but we never discussed auto vacuum itself and how to tune it properly and
what's our best practices and yeah I'm glad that you brought it let's talk about this
sounds good yeah I'm looking forward to hearing your views on this kind of from a
seeing lots and lots of different clients perspective because I get the impression
dealing with people less frequently, that often it's left really late before people start tuning
water vacuum. And when I say late, a startup will grow to relatively significant size before
touching those parameters. And I often think they could have avoided quite a few headaches
and got into issues slower, if that makes sense, by just changing a few settings much earlier.
So yeah, I was keen to get your thoughts on if you see the same and also like when would you advise people like start up life's busy.
When would you advise people looking at this kind of thing?
Yeah.
First of all, every single client which comes to us, we talk about auto vacuum tuning on any platform.
Doesn't matter.
Self-managed, any managed service, we always touch this topic and always like in 100% of cases there is something that should be done there.
which means your point is absolutely right.
It should be done earlier
because when people come to us for help,
it's already,
they experience already some problems,
some troubles, right?
And this is a reactive approach.
And just to clarify,
when you say 100%,
do you mean some percentage
have done loads of tuning
and still have issues
and some percentage have done nothing at all?
What's the kind of distribution like?
It's hard to say.
I actually checked how many clusters
we observed lately, it was like 140 or something where we used thorough, like comprehensive
health analysis and tuning and recommendations and worked closely and so on, right? So I'd say
zero percent have ideal picture in terms of how auto vacuum is tuned. Most customers running on
managed service, especially RDS, have, let me say it, half S tuned auto vacuum.
Okay. This is a new term.
can use here because this is like I just amazes me why it's so we can dive into details soon and only
few like literally that's a several I saw they have very well tuned auto vacuum and we just shift focus
to why it's still not enough and what we should do with specific pieces of workload sure makes
sense local tuning already or what to do about it and in most cases there we also involve we focus on
specific workload, but also we go, of course, beyond just auto vacuum tuning, because usually
this is like some pathological workload or some, there are some problems which are, like,
not conflicting, how to say, like, they are blocking auto vacuum work, right?
Yes.
Before we dive into details, maybe let's just explain to a wider audience what auto vacuum is,
because I think many still misunderstand it.
recently I talk to, you know, like our focus is fast-growing startups, and I talk to technical
founders quite often. These guys are super smart. Technically, they understand, they definitely
understand throughput latency, all the numbers and so on. But I see, they just didn't have time
to dive into auto-vacuum what it is. The third blowout issue, MVCC, Postgres is very widely
criticized, right? But what auto-vacium is and what to do that?
do about it, what they need to have. It's like it's always a good topic to discuss and dive into.
So my simple explanation, Auto Vacuum is just garbage collection, first of all, right?
It has a few more jobs, but the main job, original job is garbage collection. When you
have updates, deletes, or failed inserts, by the way, the third part is usually forgotten,
but failed inserts
all of them produce
successful updates,
successful updates
and failed inserts
rolled back inserts
they all produce
that raw versions
called tuples
or tuples
right
and
probably also rolled back
updates
oh yes
you are right
yeah
successful
deletes
rolled back
inserts
or any updates
or any updates
yeah
besides hot updates, nuances, right?
Even hot updates, yeah.
If we want to be thorough, like we are destroying my intention to provide simple explanation.
Let's keep it short.
So it's first of all garbage collections. So if, especially for guys who run postgios on
machines with a lot of cores, like typical like Intel machine, 96 cores or 48 or like
like AMD hundreds of course, right?
We have very, we have very different definition of what typical is for a service.
Well, typical big, large database.
Sure, sure.
So it's, it amazes me like I saw, I told you, I saw post-gust clusters achieving 10, 20
terabytes with default auto-vacom settings, somehow surviving, already having five replicas.
It's insane.
This is what I meant in terms of like, should people doing it be doing it early?
or is it okay?
There's a certain argument.
No.
I think I agree,
but as you say,
they've somehow got by
even into that large extent.
But I've also seen
smaller systems really struggling
because auto vacuum hasn't been tuned.
So far fewer,
like maybe not even a terabyte,
but struggling because of a certain workload.
So yeah.
Yeah, it's not just size, right?
It can be a table two gigabytes in size,
but it's mostly bloated like 99.9%.
I saw it many times, especially with Q-like workloads.
And then it can suffer small database, but it suffers from the fact that auto vacuum is not tuned
and maybe something is blocking it. Garbage collection job is blocking it.
So we need to combine both things here. But my point is,
auto vacuum is just garbage collection. If you tune post-gous, or maybe your provider
tuned, for example, to have bigger shared buffers, bigger cache, right?
And workman for sort and join operations and so on.
if it's already tuned and max connections has increased maybe puller is in front of it like you
tuned it to handle workload why the heck garbage collection remains untuned it should be tuned together
with everything else right so it should be but we see it's often lagging in tuning right and
it can be a local workload on small database already suffering or it can be big cluster
and garbage collection mostly default auto-wark
is mostly default. But that being said, we should mention that there are a few more jobs
AutoVacom has. Yeah. Maintain statistics? Yeah. Do you want to stick to the vacuum ones first?
Because I think the name auto vacuum is quite a good name in that it's all of the jobs that vacuum
does but automated. But with one addition, which is the, it's also an auto-analyze feature,
which is what you're just about to talk about in terms of statistics. But I feel like the other
vacuum jobs are worth. Do you think of it as garbage collection, statistics, and then the other
vacuum jobs? It's just easier to memorize. Statistics also should be maintained up to date. If it's lagging,
it may affect performance, might affect performance. But you're right. There are also a couple of
more jobs. Transaction ID wrap around. If you have a lot of transactions per day, we know there are
sometimes like billion per day. It's a lot. In this case, Churn is.
high and integer four is not enough half of integer four is not enough transaction idea up around
prevention should be very frequent in each tapple in each row version right and also there is another
very important job especially for performance it's maintaining visibility maps it's a bitmap
with two bits for each page pages eight kibbytes right and two bits is it whole page fully visible all
means all tuples are visible to all transactions. And second is all frozen. Yeah.
Yeah, just previous topic. Transaction idea wrap around prevention. So not to visit this page.
Again, if we know that it's all the tuples are in the past, they're already marked frozen.
So they're definitely in the past, even if they, they're X mean X max look like they are in
the future. This is freezing. It marks like all frozen bit means like it's a whole page in
the past. The next vacuum will just skip it. That's optimization as well, right? So for more
frequent vacuuming, not to do the same job once again, right? I want to come back to that,
but before we do, the all visible is really important for index only scan performance, right?
So if a, if an index only scan can know that every single tuple on that page is visible to all
transactions, then it can serve the result it finds in the index because it knows it can't
have changed.
So that's a really neat optimization without having to do what it calls a heap fetch.
So it like fetching the data from the table instead of from the index.
So if you ever, if you see an index only scan that ends up with a lot of heap fetches,
that's because.
Degris couldn't.
Yeah, it might be that information was in the index.
But because it post-cras didn't know for sure.
that it hadn't changed, it still went and got it from the heap.
So it might be exactly the same information that it already had from the index,
but there's this inefficiency of vacuum hasn't run recently enough.
And it only takes, because it's that bit of zero or one,
it only takes one tuple on the page to have changed for any of the tuples on that page
to lose out on index only scans.
I think if information is not in the index,
you mean if we select columns which are not part of index definition,
it will be indexed scan immediately, not index only scan.
Of course.
And the index on the scan already means that information is present.
Information means like values.
But what is never is present in indexes, it's visibility information, the 6 Minix Max.
So for that we need heap.
But if visibility map, bit says for this page it's all visible, this can be skipped.
So if the purchase number goes lower, ideally it should be zero, then it's perfect situation.
It's true index only scan.
If it's high, it approaches the performance of index scan, which is like always consulting HIP.
Naming should be better here, but my position on naming.
Index scan could be renamed to something which would be explicitly mentioning that HIPP is always consulted.
Anyway, this is a different topic.
I think we covered quite well what AutoVacom does.
And again, the main job is just garbage collection.
Also, I wanted to mention autovacom is improving a lot from release to release.
And it's enabled by default, of course.
It's enabled.
It should be disabled only by experts.
For example, PGQ, Skype engineers decided to disable AutoVacium for Q partitions.
And it's also disabled in PGQ tool which is built on top of PGQ.
Name is awful, right?
So anyway, there auto-vacum is disabled by it. Normally, it shouldn't be, it should be never
disabled and it's enabled by default and it's done, drop and so on. So it's just a garbage collection
plus additional features and it's improving from release to release. And funny fact, I recently
purchased some Chinese modern auto-vacom for home with washing feature. It's very similar.
like advancement like I had a few years ago I had a couple of them and they're now like so many features so it became much smarter and they have multiple functions so I enjoy how it works really but it requires some tuning as well
yeah I was going to say Bruce Bruce brought this up in our episode with him didn't he said people complain about vacuum and bloat with Postgres and NBCC but he's saying which version are you complaining about because it is continually getting better but I would say you
that some of the defaults are still very conservative or maybe make sense at smaller table sizes.
And they do have scale factors built in, but they make less and less sense, I think,
as tables get bigger some of them.
So my position on defaults, they are all Bias defaults.
They are like for, I call it post-podgrass, podgrass, kitchen kettle,
Postgres, which should have, I don't know, like, one gigabyte memory and some Raspberry Pi or something,
SmartPot or something, this is it. For them, it's okay. But for any, even like small cluster with 8 cores,
16 gig of memory, it's not enough at all. And settings can be split to three categories for simplicity.
Number of workers, which requires restart. The only thing which requires restarted number is number of workers.
which is three by default.
If you have eight cores, three maybe is okay.
It's okay.
If you have 32, 64, 48, 96 course, three is not enough.
But we see it all the time.
People come to us, 96 course, three workers.
This is one thing.
You say it's not enough.
I think you're right in the sense it's not a sensible default.
Like why not give it more in case it needs it?
Right.
But as you said, sometimes people come to you and they've only had three workers the
whole time and somehow it's not fallen over.
It's grinding through challenges.
Like super bloated tables, super bloated indexes and it's a lot of dirt to clean.
Which is great for us.
We have opportunity to help them.
Instead of framing how bad it is, I actually have it.
It's actually our product strategy.
We frame it, oh, great, we can help you really quickly, really quickly.
Right away, we have long-hanging fruits.
Let's go.
With our tooling and methodologies, it's super simple.
This low-hanging fruit, let's go, and your database will feel and breathe quite soon, much better.
You can even consider downgrading or postponing upgrade for a couple of years after this.
Yeah.
And this is great.
Yeah, this is one thing.
And this is super important thing.
We just discussed.
This is how garbage collection works and updates the leads and sometimes inserts.
They do just part of the job.
Right?
And these poor auto vacuum workers need to do the rest of the job.
The tidying up after it.
Yeah.
You have 1,000 max connections, meaning you have up to 1,000 backend workers which produce all those datapels.
and only three that should catch up with them.
It's nonsense, right?
It feels as nonsense.
So it should be tuned.
And simple rule, like we say, like, actually, let's consider like 25% of VCPU account.
But we should do it very properly thinking about memory management and so on.
I know there is a position in this view, but this is, it means my position.
It's a simple case.
Just let's give more workers, especially if you have partitioning.
because you think funny fun fact a worker processing a table if it's a huge table it will be processing it like sequentially or table and its indexes and that's it if it's a partition table it can be distributed each partition can be you can parallelize this right so it makes sense to have more workers if you have more partitions and it's great to have more partitions and more workers together this is the best situation and two more categories one different
defines frequency of revisiting frequency, how frequent workers come to do stuff.
And another is it defines how fast they can, like throughput, basically, or throttling.
I think throttling is the best way of thinking about it.
Throttling, yeah, limiting.
It's cost limit and cost delay.
And how often they visit defines by scale factors and threshold, but I'm okay to think about
only scale factors, actually.
at the first glance it's enough.
So three, we have three scale factors today, right?
Or maybe there is also in Postgis 18, there is auto vacuum vacuum max threshold.
This is a new stuff in postgis 18, which changes the picture.
So usually we talked about only scale factor and threshold.
Scale factor by default, three scale factors.
But the basic one is auto vacuum,
vacuum scale factor, right? And it is either 10 or 20%. So I think 20%, right?
Yeah. So I suppose because the threshold by default is 0.2, right? So 20%. Let's see you've got
a billion rows in the table. It doesn't wait to get to 200 million. It starts at 100 million.
Yeah. So we don't allow more than 100 million, which is huge now. Maybe huge, maybe no.
100 million debt doubles in the table. It's quite a lot, though. Yeah.
Yeah. So the basic is auto vacuum vacuum scale factor.
Yeah.
This is the first thing.
And it's 20% by default, which means you need to accumulate 20% of dead apples in the table before it starts garbage collection.
When I just say this, I had recent experiences.
When I said this to some startup CTO, which is growing and so on, he's like, oh, we don't have garbage collection before 20% of dirt accumulated.
this is a very big of
of course when
things are small it's not a big deal
but when things get big
this is just a huge number
and it's not just
let's say you have
like almost terabyte
table in terms of size
billion rows and partitioned
which is common
I see this
20% means
200 million rows
that doubles
right
which is by the
way, above that threshold introduced in post-guist 18.
Yeah, but it's not just storage, right?
It's also polluting caches and indexed.
It's everything.
Oh, yes.
Oh, yes.
This is a big point.
And actually, let me add one important notice here.
Usually, it's not a big deal for smaller databases, for smaller tables.
But in some cases, we see local.
So if, for example, one row updated many times, thousands of times, for one logical row, you have many versions.
And globally, like, table is big.
But for this, like, not lucky row, you have so many versions and they can be, might be,
scattered among many pages.
So they are sitting there as datables.
And for this specific query, select where ID equals something.
This means that all of them need to be fetched to be checked in terms of visibility for this transaction.
And it's, it degrades for particular ID, like local degradation.
Most of queries are fine, but this, like, unlucky because it had a lot of updates or something, right?
It became really degraded.
And it's like, what's happening here?
This is where P99s or something are good to have, because.
average will hide it, average latency and so on.
And this means you should be vacuuming more frequently, even on smaller tables, right?
So this local degradation due to debt double for particular logical rows,
debt double count is high.
So it's happening.
And it's happening silently.
That's a problem.
And tooling, which is like lacking P99s and so on, is not helping us to understand.
That's why still logging of slow queries helps and auto-explan helps to catch these like edge cases and start analyzing why it's happening.
The next day you select the same row.
Maybe it's already vacuumed and degradation has gone.
But in that moment, if you like go back in time and select it, you see so many buffers this particular query.
it's very like its primary kill woke up, but too many buffers somehow.
That's why.
So it's important to feel it because it's really like subtle problem.
It's hard to catch it if you don't know where to look at.
Yeah.
And you're right.
So scale factors.
And since PostGar 13, there is also auto-vacuum, vacuum insert scale factor.
And there is also scale factor for analyzing.
For insert, it's really great.
I remember Darafay, the guy who from Belarus, I think, he introduced it.
It was so obvious.
Why didn't we have it?
We insert a lot.
We double table in size, but vacuum doesn't come.
And we need vacuum, not to clean up that apples after successful insert,
but to rebuild visibility maps, first of all.
And it's great because if you have select count or something,
you want index only scan to be with low hip fetches, as you mentioned.
So anyway, there are these scale factors,
and it's enough at first glance to look only at them.
Yes, like you can go deeper and think about fresh,
and how like additional tuning.
But scale factors is already enough.
You want to, you want them to be really low in all TP workloads.
Just one percent.
Sometimes two percent, sometimes half percent.
I see people choose.
But one percent is simple rule.
Set it to one percent for all scale factors and benefit.
And to give people an idea of how drastic a change that is,
these scale factors are 20 percent, the vacuum threshold,
10 percent for the analyzed threshold.
And I think 20% also for the insert scale factor.
Yes.
So dropping from 20 to 1 might feel like a huge change.
It is quite a huge change.
But you're not repeating the same work, right?
Like it's not, there is a tiny bit of repetition.
Yeah.
It's this is a slight difference to analyze, isn't it?
In that if you run vacuum twice in quick succession on the same table,
it's not doing all of the same work again.
It's done a lot of the cleanup.
It's like tidings.
Some work can be done again.
For example, it's very nuanced here.
It's super interesting.
Like you mentioned, it's not only about data storage.
It's polluting memory, wall.
Wall means replication and backups.
It's polluting everything.
And memory is expensive these days, right?
So we should optimize, not about data storage,
we should think about how many pages we need to keep in this working set.
But the ideal situation, when you raise workers, make a vacuum to visit us frequently,
with lowering scale factors, all of them to say 1% is a very rough rule here, but also
partition quite well.
Because in this case, some partitions, especially if it's historic data, they will be processed
with marked all visible, all frozen, and very rarely those pages will be touched.
So vacuuming will be like a.
like skipping, right? Unlike if it's a huge, messy heap with terabyte in size or so,
and any page can be touched at any given time. It means it will be need to process again.
Of course, it means that if you have infrequent vacuuming of a huge table, of course,
you're doing work less because maybe this page was visited multiple times before it was vacuumed.
And if you start vacuum frequently and between two processing, this page is revisited by some rates,
then it needs to be vacuumed again.
So it's multiple processing of the same page versus a single processing.
So we cannot say it will be, but it's highly optimized.
Vacuum is highly optimized.
But if you have a huge table, not partition, maybe it will be like high frequency vacuuming will cost
you some extra, but it's worth it usually. So there are nuances here and trade-offs, of course.
But usually in all TP, we very strongly recommend to lower down scale factors so vacuum happens
more frequently. And the fact is that if you own managed service, I see it's quite
common, this is a big problem, RDS included, for example. They don't care about this somehow
and they keep defaults. Yeah. I would like to hear their reasoning.
why they don't proactively tune it for the customers.
Because these scale factors makes,
because they're a scale factor, right,
they can make sense at any scale, right?
Like, it's fine when you're small.
These are inexpensive things to run on small tables.
And it works when it's big.
Like, there's no reason not to tune it from the start for all customers, right?
You don't have to be clever about which instant size.
And I don't complain because, as I said,
I enjoy when I see the problem.
I'm excited to help and have long-hanging fruits and see how health next day is much better.
But of course, we need also automated re-indexing.
We need PG-Repack to be used and so on.
By the way, we mentioned that PostGGG19 will have.
Yeah.
Repack and repack concurrently.
That's awesome.
And we should somehow revisit this topic, maybe inviting the guy who made it, right?
Yeah, I think it was
Yeah, Antonin was, we had him on, didn't we, to talk about PG squeeze
Which inspired the work
Remember, in the very end of episode,
I learned about these plans to make vacuum full concurrently,
basically, which is like became repack concurrently.
I couldn't believe, I had like doubts, is it like,
but here we go, like this is happening a year later,
it's planned to be released.
I hope it won't be reverted.
It's a super important feature to have.
It's interesting.
I have to admit when I, yeah, I saw that repack, they introduced effectively a rename of vacuum
fall to repack and that was going to make 19, but it didn't have a concurrently option.
And I was like, oh, that's such a shame, but it makes sense.
Another year of waiting.
Yeah.
And then about a week later, repack concurrently gets committed as well.
So, yeah, that was great.
Yeah, it was like, it was a mood swing for me as well.
It was like, oh, no.
Oh.
Yeah.
So anyway, back to scale factor.
This is probably one of the most important to lower scale factors for all TPP workloads
and benefit from more frequent vacuuming and analyzing as well, right?
Just a question, is there any stage that people are at where you wouldn't recommend this?
I was thinking when should people be thinking about these things,
but I think those two are such simple, or those three, let's say, are such simple settings to change.
You could change them from the start and never think about it again.
like clearly i was thinking this is complex and and this is complex thing in terms of we cannot
talk about auto vacuum tuning completely without analyzing what's blocking work if anything
yeah yes and that's why i'm like we recently developed a dashboard for xmin horizon analysis because
we yeah we probably need to revisit it once again in terms of long transactions versus pure xmine horizon
blocker analysis and auto vacuum throughput analysis. And again, like this, I cannot recommend
enough Lawrence Albers recent blockpost about how he monitors auto vacuum. Combining with
what how we monitor auto vacuum, I think it's much like it's we leveled up monitoring
of auto vacuum blockers, basically, right? So X-Men Horizon, there are five reasons why it can be
blocked. And if you keep those problems unresolved for our,
sometimes dozens of hours, tuning auto vacuum to run more frequently,
especially if you reduce also nap time, how long it sleeps between runs and raised workers.
And also, like some problems might happen.
For example, it might start consuming too much resources, just trying to do something.
It cannot do something.
And also logging can be, you can have observer effect.
If there is a, like, it tries to do something, it cannot, but it reports.
and many workers report.
And if you, at extreme, you have log to vacuum in duration set to zero,
meaning all attempts are logged.
In this case, if you have 25 workers, for example,
and they cannot do, and they nap time one second, it's a storm of logs.
This is one problem I can definitely imagine.
You mean one millisecond?
No, by default nap time is 30 seconds.
What am I thinking of that's two milliseconds by default?
Oh, actually, I was wrong.
I'm looking at it right.
now it's 60, one minute. Nap time is by default one minute, which is quite long in my opinion.
So we usually say, let's choose a few seconds. But I admit aggressive tuning without resolution
of X-Men Horizon blockers might lead to storm of logging and resource utilization as well.
So this is what I can think of from top of my head. Yeah, so sorry, I was thinking of cost,
there's a cost delay. Right. This is a,
the last thing we haven't touched. So as I said, there are like throughput in terms of number of workers.
There is frequency, how often vacuum and analyzing happens. And also the remaining piece, big piece, is
throttling. You said we should say throttling, right? So two parameters. And in post-guess,
who is responsible for naming? So there is an attempt to save on some defaults or something,
there is some parameters, they have relationships.
You know, inheritance, right?
So there are two parameters, cost delay and cost limit.
There are vacuum cost delay, vacuum cost limit, this pair.
And by default, I think cost delay is zero means it's unthrottled.
If you run vacuum manually, it's unthrottled.
It tries to go full speed, usually bounded by I.O.
Disc.
Right.
And there is also auto vacuum, vacuum, vacuum cost delay, auto vacuum, vacuum, cost limit,
two parameters, one of which I think auto vacuum, vacuum, vacuum, vacuum,
cost limit is set to minus 1, which means it inherits from vacuum cost limit, which is set to
two milliseconds, right? It was 200 milliseconds and then it was or 20 milliseconds in the past before
post-guess. Yeah, the delay is, I've just looked it up there, auto-vacuum vacuum cost delay is
two milliseconds. Yeah, I can confuse everything. I see. So like auto-vacom, vacuum cost delay is two
with the seconds. And it was before Postgrease 12, it was 30. So became two in Postgreaves 12, which is great.
And auto vacuum, vacuum cost limit is minus one. Means that it inherits from vacuum cost limit,
which is by default, 200. Which is like an arbitrary unit of like work, right? And there's
some cost. Yeah. Yeah. Are your CPU cycles? There is some math.
what I don't like about acquisition of EDB
of second quadrant by EDB,
they just killed the block.
Block was brilliant.
And they had a great block post about some math.
If you think only about IO part,
all defaults before PostGGOS 12
yielded to 8 maybe bytes per second of reeds.
So with 200 and 20 milliseconds,
it was roughly 8 maybe bytes of reeds.
So bumping 20 milliseconds to 2 milliseconds
to 2 milliseconds.
I remember it was first considered to raise cost limit, but then decided to reduce
frequency of analyzing of this frotting, right, because it makes things smoother, right?
So going down from 20 milliseconds to 2 milliseconds and maintaining cost limit 200 inherited from vacuum
cost limit.
I don't like this.
Just put straightforward defaults.
It doesn't hurt.
Minus 1 makes life of humans much harder, much harder, right?
Like it's, okay, anyway, maybe I should just propose it.
Maybe it wasn't proposed.
Maybe it was proposed.
I don't know.
My battle over dozens of, over decades already.
This is happening decades, this suffering from memorizing what minus 1 means.
We forgot actually memory management, fourth piece, right?
There is minus 1 there as well.
Autovacian work mem is minus 1 by default.
which means you need to go and consult
and there is a big, interesting trick happened in PostGuS17,
in the separate story.
That's maintenance work member, right?
Yeah, maintenance work memory.
If you have time, we can talk about that as well.
Dive to it a little bit.
So minus one for auto-vacom vacuum cost limit,
inherited 200 this is like arbitrary cost,
not arbitrary actually, but some costs like two milliseconds.
Since PostGovies 12 meaning in all current versions,
it means by default that can just take
reads of 80
maybe bytes per second
roughly
which is not enough in modern systems
and it's shared among all workers
unless
you start tuning at table level
which makes things much more complex
because then quotas with throttling
becomes local for this table
for this worker which is processing
specific table
I prefer not to go there
we can have a separate episode
for table level analysis
tuning
table level tuning. So big picture is like by default it has only 80 maybe bytes per second of
reads and if you have powerful even EBS volume which I don't know it's a nitro old not
nitro architecture so it definitely in VME like it has good throughput 80 is tiny among all workers
again this is garbage collection for everything so it should have several hundred megabes
per second at least but good thing is for example ardeus fixes this especially if on aurora we see
enormous quota throttling there is quite like good in terms of like it has a lot of power
but in many other cases we see it needs attention and we need to give more like
capacity because it otherwise it's too slow and that's not good
So it never catches up, if that makes sense.
It never, yeah.
Yeah, it's a kind of saturation.
You see all the workers you allocated.
They're always busy.
This is number one, how you can understand that something is not okay.
And second, we have a special report.
We have a queries.
I have a blog post about that.
And we also have it in our monitoring, like queue or line of tables,
which sit there and should be already processed, but not processed yet.
and this like basically weight line for tables
and I think Lawrence also has in his blog post
a slightly different analysis
but the idea is the same like how bad is it
the picture like it should be done already
but not yet done this is how you also feel
this situation happening
from the other side of things
I saw a good post recently by Jeremy Schneider
about what the risk of
completely
a moving, throttling.
So he wrote a good post recently about an issue they encountered once.
I think they reduced the...
Yeah.
I haven't seen the...
Yeah, sometimes people put to Kron.
We haven't touched this specific cases related to multi-exact member situation
when people have this approaching and they need to vacuum separately.
Switching off indexes because index vacuuming,
because for that problem you don't need...
And this is actually another job to clean up those things.
you don't need
indexes to be involved
and this slows down
the whole thing
if you have many indexes
so anyway
I wonder
I need to look at that
but
and when you do
manual vacuuming
it's unthrotled
as I said
because vacuum
cost delay is zero
so it's not checking at all
let's go
but I think that's
isn't that because
every now and again
you want to run a manual vacuum
as like an emergency procedure
like
yeah makes sense
I agree
but if it's throttled
and it's unthrotled actually this is super interesting for me in the past I had problems
with unthrotled vacuum but it was when we had some enterprise disks and self-hosted
postgust and our capacity was like 400 megabytes per second I also I'm very curious right now
in postgis 18 with aIO and so on like how a single worker might be have remember I told you I was
super impressed to see the PG base backup, which is single-threaded, and I expected 100
megabets per second of reads from it. Suddenly, it amazed me approaching like gigabyte per second of reads.
In post-gast 18. So maybe vacuum also, I'm not sure. Like this I haven't visited.
Yeah, I think it was me that suggested it might be AIO-related. Yeah, but I was, let me correct
myself. So there is fourth piece, memory is super important. And if we don't think about it, it might
bite badly, especially if we raise workers. So all our clients, we say, okay, we raised workers,
but jumping from, say, post-gust 15 to 17, you must limit memory for auto-vacumen advanced,
because it's by default minus one means it takes the value from maintenance workmen. And maintenance
workmen, people usually raise quite quickly thinking they will have much faster index creation.
So they put two gigabytes or so. Now I would think like 25 workers, two gigableness.
it's already a lot, right?
And the problem is, like, before PostGar 17,
you might not notice it because auto-vacom workers suddenly limit themselves with one
gigabyte, and then suddenly jump to no this limit anymore.
No one-gigabyte limit anymore for auto-vacom workers.
And if you have four gigabytes, I saw eight gigabytes like these things, right?
And many workers, it's huge memory consumption.
Or risk, like, not necessarily, but there's the risk that they could take.
take up that much, right? It's not pre-allocated or anything. It's just that's the budget they have.
It can influence, and it can increase risks off out of memory. Yeah, yeah, yeah. A killer.
So, it's not, but then you can set, yeah, great. But then you can set the auto vacuum at work mem.
I've forgotten the exact name of it. Yeah, like half gigabyte. There's some math should be done there.
And we usually like, we have it like some formulas. So usually half a gigabody.
by it even maybe lower and it will be okay, right?
It's interesting a topic.
But yeah, so usually for maintenance work where, again, I wish like this minus one
is killing me, like decayed already.
Like why we don't put explicit value right away?
Why this like I need to go in some maze and remember it took me many years to settle
in terms of this like relationship between vacuum.
and auto-vacom settings.
And for non-expert,
who is not working with post-gess settings
every day, it's impossible
to memorize.
It's super tricky. So you would have hard-coded
values for each, even if they start with the same
value? Doesn't hurt. Yeah, so
if we have maintenance workmen by default,
I don't know, what is, like, 128
megabytes or so, like maybe more.
Just copy-paste. This is not the place
where we should save on copy-pasting.
We shouldn't save, like,
few bytes. Minus one, it's two bytes, right?
120. It's like, okay, a few bytes more.
This is not a place where we should save it, right?
This is, this is complexity, extra complexity, which is like, okay, I need to redirect this
frustration to proper replace.
Yeah. By the way, it's 64 megabytes default maintenance work, man.
Podgrass, yeah, podgrass.
Yeah. All right. Awesome. And actually, the last thing I wanted to say, so as a little
summary, if people have less than eight
VCPUs, maybe
leave, you can probably leave the workers in place, but still
change the thresholds. And it's just those three thresholds
will get you like 90% of the way there, probably.
I think most of
reduce, you said down to 1%
I think that makes a lot of sense. I've generally advised
people to 5% or 2% as like a
starting point, but 1% makes
a lot of sense. It's a bit
more, but I think it's...
It won't save you from, in large tables,
it won't save you from this problem of local degradation for particular rows.
It's just like maybe you need to think about it additionally.
But overall, it's like good, very rough tuning for all the people.
Yeah, absolutely.
I think that will get a lot of people a lot further.
Most providers let you set that globally.
I did come across one recently that only lets you change these things at a table level,
at which point my advice would be focus on you and,
important tables and just change a few of those.
But that was only one provider.
Or just migrate.
Or migrate, yeah.
That small task of migrating
database. But yeah,
so that sounds wise. And then
yeah, memory, tuning,
make sure it's not using
maintenance work. Memory if you've up to that.
Yeah, there are other parameters,
but it's like already deeper things.
Yeah, so.
Sounds good.
My simple rough tuning list
raise a number of workers.
This requires restart, unfortunately.
Then make visits more frequent, decreasing scale factors to 1%.
For example, roughly.
And give more power unless it's already done by provider,
raising, for example, raising cost limit.
Oh, yeah, cost limit, right?
And default settings, as I said, PostGar's default settings
because not provider's default settings.
It's roughly 80, maybe bytes per second,
for discreet's think how much you want like maybe just raise it maybe sometimes four or five times
depending on this capacity it's very important to understand that and finally work ma'am set it explicitly
minus one sucks in all senses it's just very hard to deal with
i like that i like saying it explicitly even if you plan to keep it the same as your maintenance
workmen yes like free people before upgrading to 17 for example or before touching number of
auto vacuum workers.
Yeah.
Just make it explicit.
So to simplify math analysis,
including with AI,
so it won't jump additional,
like mind hope.
And just avoiding that future change
where somebody wants to increase
maintenance work,
then doesn't think at that moment
about auto vacuum.
Because these are two separate things.
Operational, they are very separate.
Indexing and vacuuming,
auto vacuuming.
Yeah.
Yeah.
And that's it.
These four areas for rough tuning,
it's enough. It should
become a blood situation
much better, but
there are also blockers. If you can
tune a lot, tune everything
perfectly, but if you have
some X-Men Horizon
blocker, and we have
an episode about X-Men Horizon,
then it won't help.
Blood will accumulate anyway.
Great.
All right. Good.
Yeah.
Nice one, Nick. Thanks so much.
Yeah, I hope it was
helpful see you next time see you next time bye
