Postgres FM - Out of disk
Episode Date: July 19, 2024

Nikolay and Michael discuss Postgres running out of disk space — including what happens, what can cause it, how to recover, and most importantly, how to prevent it from happening in the first place.

Here are some links to things they mentioned:

Disk Full (docs) https://www.postgresql.org/docs/current/disk-full.html
pgcompacttable https://github.com/dataegret/pgcompacttable
Our episode on massive deletes https://postgres.fm/episodes/massive-deletes
Getting Rid of Data (slides from VLDB 2019 keynote by Tova Milo)
pg_tier https://github.com/tembo-io/pg_tier
Data tiering in Timescale Cloud https://docs.timescale.com/use-timescale/latest/data-tiering/
Postgres is Out of Disk and How to Recover (blog post by Elizabeth Christensen) https://www.crunchydata.com/blog/postgres-is-out-of-disk-and-how-to-recover-the-dos-and-donts
max_slot_wal_keep_size https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-SLOT-WAL-KEEP-SIZE
Our episode on checkpoint tuning https://postgres.fm/episodes/wal-and-checkpoint-tuning
Aiven docs on full disk issues https://aiven.io/docs/products/postgresql/howto/prevent-full-disk

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With special thanks to:
Jessie Draws for the elephant artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I am Michael, founder of pgMustard.
This is my co-host Nikolay, founder of Postgres.ai.
Hello, Nikolay.
What are we talking about this week?
Hi, Michael.
We are out of disk space.
What to do?
Panic.
This is what Postgres does, right?
Yeah, in some cases, right?
Yeah.
Well, I saw it a lot.
Recently saw it again, actually. And yeah, it's not fun. Why panic?
Don't panic, right? Keep calm and carry on.
Yeah, yeah. Just use Postgres, you know.
Yeah. So, as we discussed before this call, before recording, let's just talk about possible reasons, mitigation, and avoidance, right?
Yeah.
Prevention. Prevention is better.
Avoidance.
Avoidance.
Denial. It's like the stages of grief.
It's not my problem, actually.
Yeah, I'm a DBA. It's an SRE problem, right?
They should just give us more space.
To start with problems: I mentioned it before the call, but let's repeat.
The only reason, the only true reason for this, is insufficient disk space allocation.
It's always so.
And it's a joke, but we will be
returning to this joke, I'm quite sure.
Because sometimes you do
need more space.
Of course, I would
deprioritize this path
and consider everything else.
But sometimes we end up
with this reason and
mitigation action like just
add disk space.
But let's consider this as a last resort, right?
Well, yeah, I like your structure: causes, then recovery, then mitigation or prevention. Mitigation is recovery, yeah. Okay, great, with prevention being the last one.
So, causes: what are the most common reasons you see this happening to people in the wild?
Good question. There are many reasons. I don't have statistics.
I can tell you the one I see most blog posts about, because I don't see this that often, but the one that comes up time and time again in blog posts is an open replication
slot.
Oh, it's like it's too narrow.
Let's just say: a lot of WAL written.
Yes.
Yeah.
And a lot of WAL files remain in the pg_wal directory, or pg_xlog if we talk about ancient versions of Postgres.
Yes. But that's the root cause of why it's the most common one I've seen. I think it's a super popular reason.
But I would structure it differently. If we need to provide a comprehensive list of reasons, I would put it in second place. And it's not only because of slots; slots are just one subsection here. The whole section is the pg_wal directory taking a lot of disk space. That's a big class of reasons, right?
But as the very first class of reasons I would put anything related to the data directory itself, like a lot of disk space consumed by tables and indexes, right?
It's also like a whole class,
so we have specific reasons inside it.
Yeah, so we actually have more data than the size of the disk.
Than we expected.
We try to delete, but the space is not free, and so on.
Lots of bloat, for example.
Data-related, WAL-related,
and the third class probably or category is probably anything else.
Anything else is tricky and interesting.
We will talk about it soon, right?
So how about this classification?
Yeah, just to give people an idea of the anything else,
like you including logs in that,
like things that can grow quickly that aren't data,
like main data directory stuff or WAL.
Base subdirectory is where data is stored.
So if you think about PgData directory structure,
we can think about base, so tables and indexes, right?
Materialized views as well. And then the pg_wal subdirectory, and it is huge, for example.
And everything else. Everything else is interesting as well. Maybe also outside
of PGDATA, but let's not reveal secrets before it's time.
So what about tables and indexes? What do you think are possible specific reasons? Like: I deleted 90% of my table but it didn't help, disk consumption is still at 99%, what to do, time to panic?
Right. So, obviously, DELETE doesn't delete data, right? It just marks it as deleted; it's a two-stage process.
Well, and this is a tricky one, because a lot of the times we've discussed how to deal with issues like this, we actually require quite a lot of disk space in order to repack, to actually mitigate that problem. The easiest way to do that in an online fashion requires double the size of the relation, double the table size, normally.
So it is a tricky one in this case.
Exactly.
Just a few days ago,
I was discussing this exactly problem
with one of our customers.
And they mentioned that they have a huge table
which consists of like 90% of their database.
And to fight with bloat, they know there is bloat, especially in the heap, obviously you need to use pg_repack.
And to use it, you need the same amount of space. Temporarily, you need to have two large tables of the same size
because it rebuilds the whole table in background
and then switches to it.
So yeah, this is a problem.
And if you don't have enough disk space, oops. But I would say this is an edge case.
Normally, there is no single large table which makes up 90% of the whole database, right?
So normally it's not a big problem. And if we keep free disk space at 30-40%, which we should, at least 25% maybe,
we have this space.
And we know it's just temporary.
We need it just while we're repacking the heap, the table, right?
Not a problem usually.
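For example, a rough way to size this up before running pg_repack; the table, database, and option values here are hypothetical, and the invocation is only a sketch:

```sql
-- How big is the bloated table, including its indexes and TOAST?
SELECT pg_size_pretty(pg_total_relation_size('public.my_big_table'));  -- hypothetical name

-- pg_repack rebuilds the table in the background, so plan for roughly this much
-- free space again for the duration, before running something like:
--   pg_repack --table=public.my_big_table --jobs=2 -d mydb
```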
But if you're in this situation, well, not good.
Maybe you should do something else. There are alternatives to pg_repack. I think pg_squeeze, the alternative from Cybertec, also has this issue: it requires additional disk space because it rebuilds the table. But for this particular edge case, I would return to older solutions, which have existed for quite some time. There are several implementations of not "let's rebuild the table" but another idea. It's called pgcompacttable. Again, there are probably three attempts from different people to implement this idea.
And all of them, I think, are in Perl.
But the idea is let's issue some updates which don't change data.
When you issue an UPDATE, even saying UPDATE ... SET id = id WHERE id = some number, you know a new tuple is always created, right?
So this tool issues these updates in a smart way. It checks which tuples are at the end of the table, and it knows there is bloat, so there is empty space in the first pages, for example. So if we update tuples which sit at the end of the table, they will move to the beginning of the table.
And once there are no
tuples left in the
final page,
final block,
it will be truncated. By default,
this is what vacuum does
unless it's turned off.
And then we do the same with the next page from the end, and so on and so on.
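A minimal sketch of this idea (not the pgcompacttable tool itself); the table name, column, and ctids are hypothetical:

```sql
-- Find tuples sitting in the last pages of the table; ctid encodes (page, offset).
SELECT ctid, id
FROM my_table          -- hypothetical table
ORDER BY ctid DESC
LIMIT 100;

-- "Update" them without changing data: the new tuple versions land in free space
-- earlier in the table, so the tail pages gradually empty out.
UPDATE my_table
SET id = id
WHERE ctid = ANY (ARRAY['(123456,1)'::tid, '(123456,2)'::tid]);  -- hypothetical ctids

-- Vacuum can then truncate the trailing empty pages and return space to the OS.
VACUUM my_table;
```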
Like basically, this is similar to like,
I remember in Windows, like defragmentation, right?
So it's moving data to different places.
Do you remember that defrag screen on Windows?
Yeah, yeah, yeah.
People love that.
You could visualize it.
Yeah, yeah, yeah.
Maybe like red and blue
Right. So yeah, and this is a much slower process than using pg_repack for a table. But also, in our case, if we notice we've got, you know, 99% disk full, maybe this kind of thing helps. But if we're actually out of disk space, not even this is an option.
Yeah, yeah, yeah.
I'm talking about mitigation here,
maybe too early.
But of course, if we already achieved 100%,
we cannot move.
It's not what I would do.
But you just raised a very good point
about extra disk space needed.
And I just remembered there are alternative solutions, and I think they don't require any extensions, so they could be used anywhere.
Sure.
Like RDS, Cloud SQL, anything. It's just simple updates, super slow, but not requiring extra resources, which is great.
Also, probably you will need to
make sure the tool is working
well with current version of Postgres because
they are quite old. And again, there are
three of them. I think depesz also tried to implement one of them.
That makes sense, that it's in Perl then.
This is a small zoo of ancient tools. And I like the idea, actually. But it's not an emergency tool. It's more of a right-before-emergency tool, for trying to avoid the emergency.
So yeah, that's this side of it. But as for indexes: probably yes, when we rebuild an index concurrently, Postgres builds another index, but I think this should not be super noticeable, because no single index... yes, it requires extra disk space, but we should have that much available if we do it, right?
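For indexes, assuming Postgres 12 or newer, the size check and the concurrent rebuild look roughly like this (the index name is hypothetical):

```sql
-- How much space does the index take?
SELECT pg_size_pretty(pg_relation_size('my_index'));  -- hypothetical index name

-- Builds a replacement index side by side, so it temporarily needs roughly
-- the size of the index again in free disk space.
REINDEX INDEX CONCURRENTLY my_index;
```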
So, anyway, if we delete, we should think about dead tuples and bloat, keep it in mind.
Try to avoid massive deletes
or design them properly with
MVCC and vacuum
behavior in mind.
We did a whole episode on that, I think.
Right, exactly.
So, yeah.
I saw crazy cases when
some tables and indexes were bloated
like 99%.
Yeah.
And you saw these cause out-of-disk issues.
Well, they contributed a lot.
And if we keep it as is, we will soon be out of disk space.
Right?
So it's definitely worth keeping an eye on the situation in this area, bloat and vacuum behavior. And if you don't do it, or do it very infrequently, like once in five years you take care of it, then you realize that out of the 10 terabytes of disk space you had, now it's below one terabyte in use. And what do you do with the other nine terabytes? You keep paying the cloud provider, right? Because it's easy to add disk space, but it's not easy to reduce it; they don't have such a function. It's possible only if you create a new standby cluster with a smaller disk and then just switch over to it, right?
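As a first approximation for keeping an eye on this, the standard statistics views can list the biggest relations and their dead tuples; real bloat estimates need pgstattuple or a dedicated query, so treat this only as a rough signal:

```sql
-- Largest tables with dead/live tuple counts as a crude bloat indicator.
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup,
       n_dead_tup
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;
```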
Yep.
But yeah, it's quite funny seeing that message.
You know, this is a one-way
ticket.
One-way ticket, yeah.
So I think those make sense, right? Normal
data exceeding the size of the disk,
whether that's actual data or
whether that's bloat, makes perfect
sense. Do you want to talk any more about
Of course. Actually, I still believe, ever since VLDB in Los Angeles, I think in 2018. I liked the keynote there, about how data deletion is a huge problem we engineers need to solve and find better solutions for.
So sometimes it's not about bloat.
It's about a lot of garbage.
Duplicated data also.
And I saw many times when we start paying attention to bloat and pushing customers to fight with it,
suddenly they say, you know what?
I had it like maybe a month ago.
You know what?
We don't need this table.
Let's just not spend time and just drop it.
Drop this table or truncate it.
We don't need the data.
And it's like, wow, some terabytes are free.
Yeah.
I see this with analytics data, for example, people tracking everything. They never look at their analytics, they never look at who's using which feature, but they track everything because we might need it someday, you know? There are all these data use cases. And even the solution to this, in terms of... I've heard the phrase retention
policy. I know it's not quite the same topic, but talking about how long we should retain data for,
even that we don't even mention deletion. It's not deletion policy, it's retention policy,
which is quite funny. Also, maybe one of the last things in this section I would mention: it's becoming more popular, instead of deleting all data, to have a better approach, tiering, tiers of data storage. And I know there is an extension from Tembo called pg_tier.
First of all, Timescale has this, yeah.
Right, but in their cloud only, I think; it's not open source.
If I'm not wrong.
So the idea is, let's consider some data archived, and it still looks like it's present in Postgres,
but it's not stored on an expensive disk, which has also
capacity limits. We offload it to object storage,
like S3 on AWS.
And this is a great idea, actually.
But I think some latency issues will appear and also maybe some errors
sometimes, inconsistencies or so.
It's interesting.
But it's a great idea.
And again, like I said, Tembo has the pg_tier extension, which I hope to test someday when I or my team has the opportunity and time for it. It's super interesting to check how it works. If you have some huge tables, maybe already partitioned, you can go either with sharding
or if some historical data is not needed all the time,
you can just offload it to object storage,
which is like virtually infinite.
And then, okay, users experience some worse latency
when reading this old data.
Imagine e-commerce, for example.
Just quickly, I feel like an old-fashioned solution to this was tablespaces.
Yeah, but tablespaces became not popular at all in the cloud reality. The last time I used tablespaces in a serious project was more than 10 years ago, I think.
Yeah, but that's what I mean: we had this before, it just had a different name, maybe.
Yeah, yeah, tablespaces. You can have a cheaper disk attached to your machine. But I still don't like this particular approach, because I like offloading to object storage more. Imagine you have a primary and multiple standby nodes: if you use tablespaces and cheaper disks, you are forced to use the same structure of disks on each node, and it still becomes expensive, and slower, for example, right? While object storage is worse in terms of uptime compared to EBS volumes on AWS, for example, but it's much better in terms of reliability, or vice versa.
Durability.
Yeah, availability and durability. If you check SLAs, S3 is... I forget. So if you check characteristics,
you think, okay, this is what I actually
would like to have for
archived data. Maybe latency will
be slower, worse,
right? But if you
think about e-commerce and order history,
for example, if you
launch an e-commerce website,
if you have
the ability to offload old activity, users touch it very rarely, but it's still good to have the ability to see the old order history.
But it's not needed often, so we can keep it outside of Postgres and evict from caches and so on.
And don't pay for our main cluster.
We just pay for S3.
And there we also can have tiering, right?
Usually.
Tiering.
And even automated.
Usually providers have these features.
If it's older than one month, it goes to some colder and cheaper space. So I think this direction will receive some more popularity
and will be better developed and so on.
Tiering for storage.
Once again, we've been sucked into kind of like prevention.
I think, yes, it's hard to talk about reasons.
And then, yeah, like this is a big flaw of the structure I proposed.
Well, no, I like it still, but it feels like we haven't really touched on.
The most popular one, let's talk about it.
Yeah.
WAL, right?
So, pg_wal is huge.
We identified it, which is actually already some good skill.
Not everyone can do that, but if you manage to identify that your pg_wal directory is huge, there are a few certain reasons.
And there is a good article from Cybertec about this: why Postgres doesn't delete WAL files. It's basically a list of them.
This is exactly when
we need to apply this article
as the list
of possible reasons, and just exclude them one by one until we find
our case, our real
reason. So one of them,
you mentioned
some replication
slot, logical or physical, not progressing and accumulating a lot of changes. Well, it's not really accumulating: Postgres doesn't write to slots, it writes a lot of WAL to pg_wal, right? But a slot has a position, and if it's inactive, or the position is frozen because consumers don't consume and don't advance it, that means Postgres must keep those WAL files in the pg_wal directory until they are consumed.
And relatively recently, a few years ago, I think in Postgres 13 or 14, a setting appeared to limit this, to have a threshold where we give up and say: better to kill the slot, to destroy the slot, and stop this situation so the WAL files can finally be deleted.
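To see how much WAL a slot is holding back, and to cap it, something along these lines should work; the 100GB threshold is only an example value:

```sql
-- How much WAL each slot is retaining (inactive slots are the usual suspects).
SELECT slot_name, slot_type, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;

-- Postgres 13+: give up on a stuck slot instead of filling the disk.
ALTER SYSTEM SET max_slot_wal_keep_size = '100GB';  -- example threshold
SELECT pg_reload_conf();
```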
And I remember this big fear from a good engineer I worked with.
He was not a Postgres person.
He was more SRE, but with a lot of Postgres experience.
And it was long ago when initially replication slots for physical replication were introduced.
We had basically a small fight.
Like I said, let's use it.
It's a great feature.
He said, no, no, no, I'm not going to use it. He said, this is super dangerous. And he understood that the danger is: if some
replica is somehow stuck, the slot is not progressing, the primary is out of disk space,
and this is the last thing he wanted. After lost backups, of course. Right?
So, but over time, we started using slots.
But then this setting, I think it's a great setting.
You can understand your disk layout and how much free disk space you can afford.
Like, if you approach, like, 90%, it's time to kill slots.
So you can do some math and understand
that maximum number of gigabytes you can allow for a lag is this.
Of course, over time the database grows, and this can still be a problem, right?
Because if data grows, tables and indexes grow, and probably your setting will not save you from emergency, right?
Maybe.
But this is definitely one of the reasons.
What else? What reasons?
Because of a failing archive_command, right?
Because if archive_command is configured, it's important for Postgres to archive. And if it's failing, for whatever reason, Postgres cannot archive, and if it cannot archive, it cannot remove these WAL files.
And that's a problem.
So you should monitor archiving command lag separately.
And if it's growing, it's very bad for backups already.
But it's also bad because it can hit you in terms
of disk space.
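A quick way to check whether archiving is keeping up is the standard statistics view; what counts as acceptable lag is up to you:

```sql
-- Archiver health: how many files were archived, and when it last failed.
SELECT archived_count,
       last_archived_wal,
       last_archived_time,
       failed_count,
       last_failed_wal,
       last_failed_time
FROM pg_stat_archiver;
```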
So, yeah,
these processes, replication and archiving of WAL, are two processes that, if something is wrong with them, prevent Postgres from removing WAL files.
Another reason, actually...
By the way, this can grow quickly. I think people don't always realize how quickly this can grow. Even with some database, yeah, it can be small but very active, you can have terabytes of WAL generated per day. It's actually good to know how much you generate per day, or per hour, or per second.
So I think even with low activity, as long as something's happening every time there's a checkpoint, I've seen people with toy databases, like free-tier RDS, be surprised that it's generating a couple of gigabytes a day, like 64 megabytes per five minutes or so.
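One rough way to measure WAL generation, as a psql sketch: capture the current WAL position, wait, and diff (the 60-second window is arbitrary):

```sql
-- In psql: WAL written during a 60-second window.
SELECT pg_current_wal_lsn() AS start_lsn \gset
SELECT pg_sleep(60);
SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), :'start_lsn')) AS wal_per_minute;

-- Postgres 14+ also keeps a cumulative counter since the last stats reset.
SELECT pg_size_pretty(wal_bytes) AS wal_since_stats_reset FROM pg_stat_wal;
```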
It adds up quite quickly even with small workloads.
A funny reason I can also mention: you have a not-huge but quite active database, and you think it's time for checkpoint tuning. We had an episode about it. Checkpoint tuning: let's raise max_wal_size and checkpoint_timeout. You raise them, but you didn't do proper math in terms of how much disk space you have, and you end up with an abnormally big pg_wal, because the distance between checkpoints increases and Postgres needs to keep more WAL.
And this can be just a mistake of configuration.
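If you do raise these settings, it's worth doing the disk math at the same time; a sketch with example values only, not recommendations:

```sql
-- Example values only: pg_wal can legitimately grow to roughly max_wal_size
-- (and sometimes beyond, since it's a soft limit), so make sure the disk has
-- room for that on top of data, logs, and temporary files.
ALTER SYSTEM SET max_wal_size = '16GB';
ALTER SYSTEM SET checkpoint_timeout = '30min';
SELECT pg_reload_conf();
```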
Yeah.
So, yeah.
Like after this, I think we can move on and let's talk about other reasons.
Yeah, what else do you see?
I would put the log directory in first place in this category.
Yeah.
Especially if it's inside PgData, inside this main Postgres directory.
Not especially, like no.
Especially if the log directory, where Postgres writes logs, is on the same drive as the data directory.
For example, inside PGDATA.
Or just next to it, but on the same drive.
If logs suddenly grow, like we have some performance degradation and log_min_duration_statement is configured, or auto_explain.
Or, for example, we decided, oh, we need more audit,
and PGAudit is a great tool.
All right, let's bring it.
It starts writing a lot to logs.
And if you didn't separate your data from logs, you should separate.
I'm not talking about pg_wal.
I'm not a big fan of having a separate disk for pg_wal, actually.
Oh, interesting.
Yeah, it was a thing in the past, but recently I don't see benefits
from doing this often.
Benchmarks don't prove it and so on.
So it depends, of course.
It depends. If it's your own data center, maybe it's worth doing this.
But if you don't offload logs, regular logs, and keep them on the same disk, then you bring in pgaudit and it suddenly starts writing gigabytes per, I don't know, hour. Or maybe there's some problem with rotation of logs.
Yeah.
Like rotation or retention. So they are not deleted properly, for example.
And then this can hit you badly, and you get a PANIC, actually, for your database, and selects are not working.
But if logs are offloaded to a different disk, at least with the logging collector enabled, I know that the database will actually keep working fully.
Then that's great.
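A sketch of keeping regular logs off the data drive with the logging collector; the path and rotation values are just examples and assume a separate mount:

```sql
-- Example settings only; adjust the path and rotation to your environment.
ALTER SYSTEM SET logging_collector = on;                   -- needs a restart to take effect
ALTER SYSTEM SET log_directory = '/var/log/postgresql';    -- absolute path, i.e. outside PGDATA
ALTER SYSTEM SET log_rotation_age = '1d';
ALTER SYSTEM SET log_rotation_size = '100MB';
SELECT pg_reload_conf();
```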
I actually recently, maybe a few months ago, I had this experience.
I got
used to it. Postgres is fully down
if we are out of disk space, and I
expected the same behavior
when our log drive is
full. If we are
flooded, if it's full, I expect big problems.
Postgres almost didn't notice.
So, okay,
we cannot log, but we keep working.
That's great, actually.
So this is a super big benefit of having different drive
for regular Postgres logs.
And even if it's slow, it can affect performance, right?
But if it's 100% full, not that bad.
Yeah, we've talked several times about excessive logging having an impact on performance, the observer effect. And I've definitely done benchmarks. You know, this is the age-old log_min_duration_statement = 0 thing, why we recommend, or I recommend, never setting it to zero, because you can generate gigabytes in a 30-minute benchmark with pgbench or something. It surprised me how much it can grow.
Yeah. It's actually a good idea, maybe, to ask our bot to benchmark with regular pgbench, and then to have sampling for logging of queries in one way or another, and just see how the observer effect grows and kills your performance. Only people with small databases can recommend log_min_duration_statement = 0. Or quiet databases.
Yeah, quiet, not active, small databases and small workloads, that's it.
Or, I guess, turn it on, but only very, very briefly, like a minute or two or something.
Yeah, yeah. So for serious workloads you cannot do that. I already shared my experience putting big critical production databases down.
And yeah, don't repeat my mistakes.
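On newer Postgres versions (13 and later) there is a middle ground: sample query logging instead of logging everything; the thresholds here are arbitrary examples:

```sql
-- Log every statement slower than 1s, plus a 1% sample of statements slower
-- than 100ms, to keep log volume and the observer effect in check.
ALTER SYSTEM SET log_min_duration_statement = '1s';
ALTER SYSTEM SET log_min_duration_sample = '100ms';
ALTER SYSTEM SET log_statement_sample_rate = 0.01;
SELECT pg_reload_conf();
```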
As well as logging, what else did you have in this category of things?
Oh, that's a good question.
Well, let's not maybe spend too much time discussing some unknowns or external things like we have, I don't know, something else installed on the same machine and using the same disks and suddenly we are out of disk space.
Also, it can happen sometimes.
Or maybe, for example, you do some troubleshooting: you have self-managed Postgres, you can SSH to the box, and you started sampling some logs every second, but writing them, for example, to the home directory,
and then it's full or something like this.
So you always need to be careful with that.
But interesting case when, for example, you try to upgrade,
you have Patroni, or I don't remember exact details,
but in some cases, Patroni decides to retry provisioning
of a standby node, for example,
and it fetches PGDATA in the way you configured, with pg_basebackup or from archives, but it renames the old directory to keep it as a backup, a local backup, right? It just renames it, and you suddenly end up with two full-fledged data directories.
And you're like, wow,
it can quickly be a problem
in terms of disk space.
Does that mean like if you had,
let's say your data directory
is about 40% of your disk size,
so you thought you had loads of headroom,
suddenly you're at 80%
plus whatever else is on there.
Yeah.
Yeah.
Okay.
Maybe large temporary files,
but it's quite exotic.
I mean, they can be huge, right? But it's exotic.
I saw some Stack Overflow posts mentioning that,
you know,
people that were out of disk got the error,
but when they checked,
they did have free disk space.
And one of the suggested answers mentioned checking for temp files.
Like your queries could be doing a lot of temp file stuff.
But yeah, I've never seen that myself.
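If you suspect temporary files, the per-database counters and a logging threshold can confirm it; the 100MB value is arbitrary:

```sql
-- Cumulative temp file usage per database since the last stats reset.
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
ORDER BY temp_bytes DESC;

-- Log queries that spill large temporary files to disk.
ALTER SYSTEM SET log_temp_files = '100MB';
SELECT pg_reload_conf();
```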
What about, well, backups is an interesting one.
Like backups, if you, because some people recommend keeping some,
obviously keeping all your backups on the same disk as your database
is suboptimal for recovery purposes.
But keeping some on it makes sense, especially for huge databases, right?
I don't think so.
I would not recommend doing this.
But what about for recovery, like no network?
I'm just thinking about like...
Regular backups should be separate, of course.
But if you do some operations manually, for example,
you decided to back up some tables and, you know, dump them to some compressed form and keep them on the same drive. Well, it might happen, but there are many mistakes you can make manually.
Fine, fine.
Or you just run CREATE TABLE ... AS SELECT, for example. You basically just clone the table and then forget about it. Well, many such mistakes can be made.
I think there are many exotic situations we don't discuss here.
And I'm curious if our listeners had some interesting situation we didn't discuss yet.
It will be interesting.
I think many exotic things can happen.
Let's talk about what to do in emergency, right?
Yeah.
First thing, understand the reasons, right?
Or maybe not.
No, I don't think so, yeah.
I mean, bear in mind, if we're actually in an emergency,
our database is no longer,
not only not accepting updates and writes and things,
but it's also, in a lot of cases, not accepting selects, which...
Which is ridiculous, right?
You shared with me before our call, you shared an Aiven talk,
which says if managed Postgres service has automation,
if it understands that we are approaching a full disk space,
it converts our database to a read-only state, just by setting a parameter, which actually any client can override. So it's weak protection: default_transaction_read_only turned on, this parameter, default_transaction_read_only. I like the idea, actually, because it's weak protection, but it's quite reliable if you don't have this set to off all the time in your code. Usually there are low chances you have it in your code, so it means it will protect you, and then you have alerts; you need to have proper monitoring and so on. But this is more of a prevention measure, maybe.
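That protection is just one parameter; presumably the automation flips something along these lines when the disk gets close to full (a sketch, not any provider's actual implementation):

```sql
-- Soft "read-only mode": new transactions refuse writes unless a client
-- explicitly sets default_transaction_read_only = off for its session.
ALTER SYSTEM SET default_transaction_read_only = on;
SELECT pg_reload_conf();
```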
Yeah, again. But if you're in an emergency... I think we should write some good how-to, a general how-to for this case. I did check your how-tos and I was surprised not to find one for this, actually. It's a miss, I think.
Also, one thing worth mentioning: nobody's going to be listening to a podcast half an hour in, or whatever, when this happens, because they're panicking.
Yeah.
Yeah.
But one thing to remember that does seem to catch people out is,
I mean,
hopefully everybody listening here,
but this won't happen to,
but they know that wall stands for right ahead logging,
assume that it's logs and therefore can delete it.
And then because it's a
because it's like a large amount of what they think are logs they delete those to try and recover
and i think that the main advice from me in terms of recovery like from every guide you'll read on
this is don't do that yeah i will say the main the like classic approach is understand reasons and fix them, mitigate, right?
But you might have pressure of time.
In this case, it's good if you can quickly identify something you can delete safely without big consequences.
For example, regular Postgres logs.
If you have a lot of logs, gigabytes, it's good to delete just one file or so,
start database if it's down.
And then you do need to do full analysis and mitigate the real reason and so on.
Or maybe you can delete something outside Postgres data directory, something.
You can check quickly if there are different directories around
which have some gigabytes of data, or at least megabytes,
so you can quickly delete and let Postgres behave normally
for quite some time.
It buys you some room for analysis and proper mitigation.
But mitigation is inevitable.
You need to understand what happened.
Is it WAL, or data, or something else? And then you need to mitigate.
Yeah, I was reading some guides on this.
There's a good one on the Crunchy Data blog, and they made a really good point that we don't all think of when we're in that kind of panic mode, when getting things back online is the primary objective. They recommend taking a filesystem-level copy of everything in its current state, a snapshot of the current directory as it is, before you start deleting things.
Yeah, it's additional work, because it takes time and we're down. In the cloud, actually, if you don't see what you can delete quickly,
like if it's managed in the cloud, for example, you cannot just go and run du and so on.
But in this case, probably it's easier just to add a few percent
of disk space quickly to buy some room again
and then to investigate.
Obvious approach, right?
I also think because of some reasons we're going to discuss in a bit,
this is just so much less likely to happen in cloud environments.
It just feels like this is one of those features.
There aren't that many.
This is their job, right?
Yeah, this is one of those two or three features
that they really do manage and can manage for you automatically.
They notify properly and so on, right? Or they just auto-scale, so you just don't hit this.
I don't like auto-scaling in this area, because it's auto-scaling of budgets.
Well, but it's also not solving the root cause of the problem. For several of those problems we talked about, it's not your actual data growing, it's not your true tables and indexes; it's write-ahead logging going out of control because of an open replication slot, or some other reason that shouldn't produce that much data. So auto-scaling is just not solving the problem.
Exactly. So yeah, it's like hiding dirt under the rug.
As long as you've been told when it has scaled up, then at least you can go and investigate.
Yeah. So yeah, I think we covered mitigation quite well in general. And what do you think about...
Yeah, um, I think increasing makes sense.
I saw some warnings not to increase in place necessarily.
Like maybe you should look at setting up a replica and failing over,
but I thought that would be slower.
Like I wasn't sure myself.
I don't know.
Like I think if you can add space or delete something safe, it's a good first step.
Then you have a big step of analysis, proper analysis and mitigation of the true reason, the main reason why you got there without noticing.
So a big part of mitigation, if you perform proper root cause analysis, I'm quite sure, will be: we didn't notice this, and that is a problem in itself.
So our observability is bad.
Either monitoring or probably alerting.
In fact, we have to go to prevention.
Yeah, prevention, I think,
philosophically is quite simple.
The better way to prevent it is to understand, to keep a good level of understanding of how you spend your bytes of disk, your gigabytes. You understand the layout, you understand WAL data and real database data, whether it is actual data or a lot of bloat, and then you understand logs and everything else, and you can say: okay, we are good, and we understand how we spend our disk bytes.
And if this understanding is good,
everything will be fine
and you will notice problems early.
But it's hard to do
if you have hundreds or thousands of clusters, right?
Yeah.
In this case, you do need some rules to be alerted on: if bloat is high, if pg_wal grows without good control, if logs are not rotated properly or consume too much. You just need to have good observability to have mitigation, and then, with an understanding of the possible reasons, you navigate those reasons and find the mitigation. But the key to prevention is understanding, and understanding is observability, right?
Yeah, two other things. Whilst I'm kind of against auto-scaling in principle, I think if you've got the option, going up in budget and having to migrate back down to a smaller size later is so much better than being down, so I personally would have that on myself if I had that option. And I think also oversizing: disk is so cheap compared to everything else we're doing.
Unless you really are in the like extreme, huh?
Disk is not cheap if you have a lot of terabytes
and many standby nodes and oh, it can be not cheap.
But most of us aren't in that case.
Like most of us aren't in.
So yeah, I agree that for people with serious... some people pay millions for disk space, and it can be a headache. That's why I mentioned pg_tier as a potentially very good thing, and tiering is good, and some design: understanding the structure of your disk space consumption, so you design some rules, offload data properly, and keep everything under control. That's good, that's good.
But yeah, I think imagine advising some startup, and they had, like,
some 10 gigabyte disk or something and their database was only a few hundred megabytes.
The cost of being on a 100 gigabyte disk instead is just so tiny.
It's just such a small...
But yeah, I take your point.
But again, I think observability
can be always improved, I think.
So especially if we know: this is pg_wal, this is data,
this is logs, everything else,
and we have some proper analysis and alerts for it.
I think that's the bit that trips people up.
I think a lot of people actually have quite decent observability.
They can see what's going on,
but alerting, people not getting alerted early enough, or not at all, is the one I see quite often. No alerts at all for, you know, disk at 50% sends an email, 70% raises that, 90% sets off a page, these kinds of serious alarms.
Yeah. You know, I know we're already out of time, but let's mention some small practical advice at the end. If you cannot SSH to the box, for example it's RDS or another managed Postgres service, and Postgres is still alive but you wonder why the disk usage goes up: at the SQL level, you can check what's happening in directories using pg_ls_dir, pg_ls_waldir.
There is a set of administrative functions which allow you to inspect the content of directories.
And this is very helpful when you troubleshoot and understand what's happening.
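A few of those administrative functions in action, roughly; superuser or pg_monitor-level privileges are typically required:

```sql
-- Total size and file count of pg_wal.
SELECT count(*) AS wal_files, pg_size_pretty(sum(size)) AS wal_size
FROM pg_ls_waldir();

-- Biggest files in the server log directory.
SELECT name, pg_size_pretty(size) AS size, modification
FROM pg_ls_logdir()
ORDER BY size DESC
LIMIT 10;

-- Size of each database.
SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;
```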
And maybe it can also be helpful in mitigation activities and in observability. For example, if you have a report triggered on "almost out of disk", like usage exceeded 90%, then you can include some automated analysis to take a snapshot of what's happening in key directories, right? And understand what the consumers of disk space are.
Just some advice I thought we could miss.
Let's not miss it.
That's it, I think, right?
I think so.
I hope none of you ever actually have to face this and you can all mitigate or, sorry, prevent it from happening.
And yeah, thanks so much, Nikolay.
Catch you next week.
Thank you. Bye-bye.