Postgres FM - Compression
Episode Date: July 26, 2024

Nikolay and Michael discuss compression in Postgres — what's available natively, newer algorithms in recent versions, and several extensions with compression features.

Here are some links to things they mentioned:

wal_compression: https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-COMPRESSION
Our episode on WAL and checkpoint tuning: https://postgres.fm/episodes/wal-and-checkpoint-tuning
Synthetic wal_compression and streaming replication WAL size test: https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/issues/11
default_toast_compression: https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-DEFAULT-TOAST-COMPRESSION
ZFS: https://en.wikipedia.org/wiki/ZFS
Our episode on TOAST: https://postgres.fm/episodes/toast
On compression of everything in Postgres (talk by Andrey Borodin): https://learn.microsoft.com/en-us/shows/cituscon-an-event-for-postgres-2023/on-compression-of-everything-in-postgres-citus-con-2023
cstore_fdw: https://citusdata.github.io/cstore_fdw/
Using Hydra columnar: https://columnar.docs.hydra.so/concepts/using-hydra-columnar
About compression in Timescale: https://docs.timescale.com/use-timescale/latest/compression/about-compression/
pg_tier: https://github.com/tembo-io/pg_tier
pgBackRest: https://pgbackrest.org/
WAL-G: https://github.com/wal-g/wal-g
pg_dump: https://www.postgresql.org/docs/current/app-pgdump.html
pg_dump compression specifications in PostgreSQL 16 (article by Pavlo Golub from CYBERTEC): https://www.cybertec-postgresql.com/en/pg_dump-compression-specifications-postgresql-16/
Our episode on pgvector (with Jonathan Katz): https://postgres.fm/episodes/pgvector

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With special thanks to:
Jessie Draws for the elephant artwork
Transcript
Hello, hello, this is PostgresFM, episode number 107.
My name is Nikolay, founder of Postgres.AI, and as usual, my co-host is Michael from pgMustard.
Hi, Michael.
Hello, Nikolai.
Hello.
So, you chose the second most boring topic, in my opinion, after security.
Tell us what it is.
Yeah, I can blame our listeners for this one. We had a...
In my opinion, blame someone else, right?
Yeah, exactly. Always.
That's the first... This is my favorite methodology in troubleshooting of incidents. No, blameless, blameless culture, right?
So we had a great listener suggestion to talk about compression, and I guess it's kind of surprising we haven't covered it yet. We've covered various topics around this, but they are specifically in the context of a couple of extensions that offer compression options. I thought it was a good topic to cover generally — we can talk about all the different things where compression is available in Postgres. Kind of a broad but shallow topic, maybe, this time.
Yeah, we will talk about physical, lower-level compression — not like the modern way to compress data, when you tell some LLM: I have a lot of data in this table, in this column, it's a huge chunk of text, let's just summarize it and drop the detailed texts, and so on, right? It's actually my favorite way to compress.
Well, that's a good point, actually. I guess I assumed automatically we were talking about what would be called lossless compression.
Yeah, so, like with images — this is definitely lossy. Yeah, this will lose some details for sure.
And it will add some details that don't exist.
Not only is it lossy, it's also glossy, I guess.
Does it stand for lossy lying machines?
Is that what they stand for?
Yeah, yeah.
So we talk about lossless, lower-level, transparent compression, when we just enable something and data suddenly takes less disk space, or less space when we send it over the network — sending something in transit, so to speak.
And we are able to uncompress it without any losses.
But it's all fully automatic
and users just use it,
enabling some features probably.
This is our focus today, right?
Yeah, exactly.
And I think that's a good transition
into kind of the two main benefits. And I think they're obviously related, but I think they are somewhat different and probably in some ways trade off against each other a little bit. If we have very repetitive data, or data that compresses really well, we could maybe spend a lot less money and fewer resources by compressing it. So storage is obviously a big one, but there's also the performance side of it: if it can take up less space, it might be — depending on where our bottlenecks are — it might be that overall the cost, or the speed degradation, of compressing and decompressing on the other side is still a win if we've only had to transport a lot less data, or do calculations on a lot less data, that kind of thing.
Let's maybe start with things we have had for many years in Postgres, right?
And then discuss some new stuff and then discuss, like,
compare what's beneficial, what's less beneficial.
We have, first of all, compression at the WAL level for full-page writes, full-page images, right? FPIs. And full-page writes are needed to fix the problem of the difference between the size of buffers — the pages in memory, 8 kilobytes in most cases — and the block size of the file system, very often 4 kilobytes, ext4 for example, right?
And to avoid partial writes in the case of failures, after each checkpoint, if a buffer is changed for the first time before the next checkpoint, Postgres doesn't write only the change itself — it writes the whole buffer. And by default wal_compression is not enabled, which means the whole 8 KB buffer is written as is to the WAL and consumes 8 KB, right?
But if we enable wal_compression, this page is compressed. And in my opinion, in most cases, if we talk about significant load, we should consider enabling wal_compression. It can be beneficial, especially if you have a short distance between checkpoints, because in that case you have more frequent full-page writes happening. Of course, if you increase the distance, then maybe the same page becomes dirty — meaning writes happen inside it — multiple times, and only for the first change will it be a full-page write; subsequent changes will be written only as deltas, right? Only the tuple which was changed. But if you have quite frequent checkpoints, by enabling wal_compression you can significantly reduce the size of WAL written, and this has very good positive consequences, such as less data to write to backups while archiving — archive_command will archive fewer bytes, fewer gigabytes per hour, for example. And second, replication as well: the WAL is smaller, so replication has less to transmit over the network. Of course, compression needs CPU cycles
and decompression needs CPU cycles.
And I saw on Twitter some people mentioned they had problems when they enabled wal_compression. But in the cases I observed, we always decided to switch it on, and the CPU overhead was worth it. So there is a trade-off here, CPU versus I/O, and we always chose in favor of less I/O.
And then with synthetic tests — I just showed you before we started this recording — we had a simple pgbench experiment with max_wal_size set to 4 gigabytes and checkpoint_timeout to 15 minutes. We saw a WAL reduction — I think on regular pgbench workloads, which include writes, of course: inserts, updates, deletes — we saw three times less WAL created, generated, or written, it's the same thing, when we enabled wal_compression. Which is a huge benefit, three times less WAL. But if you have done checkpoint tuning, as we discussed in some of our previous episodes, and the distance between checkpoints is quite large — especially if you have Patroni and modern Postgres, which in case of failover or switchover doesn't require a restart of all nodes, so you can afford a bigger distance between checkpoints and have max_wal_size set to, I don't know, like 100 gigabytes — in this case, it's unlikely the benefit will be so huge, not 3x.
Of course, it also depends a lot on the nature of the data. If it's very repetitive, it's easier to compress.
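For reference, switching this on is a reload-away change rather than a restart, and the effect can be watched in the cumulative WAL statistics; a minimal sketch (pg_stat_wal exists since Postgres 14, and the reset step is optional):

```sql
-- Turn on compression of full-page images in WAL; applies to new writes after reload.
ALTER SYSTEM SET wal_compression = on;
SELECT pg_reload_conf();

-- Optionally reset WAL statistics, then compare full-page images and bytes
-- generated for the same workload with the setting on and off.
SELECT pg_stat_reset_shared('wal');
SELECT wal_records, wal_fpi, wal_bytes, stats_reset FROM pg_stat_wal;
```

Comparing wal_fpi and wal_bytes over an identical workload is a rough way to reproduce the "three times less WAL" observation above.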
That's pretty true across the board for compression, isn't it?
On the WAL front, one thing I think we probably didn't talk about when we talked about WAL and checkpoint tuning is the compression side. As of Postgres 15 — which I guess is pretty new, so you might not have this kind of in-the-wild experience yet — we do have more options now. In the past we could only turn wal_compression on or off, and now, instead of just on, we have three different options. We have the previous option, which is the pglz algorithm, but we also have LZ4 and Zstandard options now. And for the people that complained about the CPU side of things, one option might be LZ4, which I believe would be less CPU intensive — faster and more lightweight in terms of CPU consumption, but the compression ratio is worse.
Well, it's not that different to pglz, actually.
Yeah. Compared to some other modern compression algorithms it tends to lose, but compared to pglz you don't lose much. I think it's slightly worse on average — obviously it depends a lot on the data — but I think it's a pretty good option if you have that issue. And if you're testing turning this on on a modern version of Postgres, it's worth at least trying those different new options, depending on your constraints.
Yeah, that's a great point.
And two ideas here I have in mind.
First of all, I remember these options, Zstandard and LZ4, require compile flags to be turned on. I guess, as you rightfully said, I don't have a lot of... it's a fresh thing, so I don't have rich production experience. But these flags — I guess they are turned on in the official apt packages, right? If you just install it on fresh Ubuntu. I hope so, but it's worth checking. These options should be turned on, so Postgres should be compiled with support for these two algorithms.
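One way to check, sketched below: the values accepted by these enum settings reflect what the server was built with, so lz4 and zstd should only show up in enumvals if the corresponding configure flags were used:

```sql
-- Inspect which compression methods this particular build accepts.
SELECT name, setting, enumvals
FROM pg_settings
WHERE name IN ('wal_compression', 'default_toast_compression');
-- wal_compression should list pglz, lz4 and zstd (plus on/off) when built with
-- --with-lz4 and --with-zstd; default_toast_compression only knows pglz and lz4.
```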
And second thing, I think it's worth new research.
Maybe with our bot, it should be quite easy.
And we should just research maybe with different workloads
and maybe with different max_wal_size values.
As I said, it's very important in this case, the distance between checkpoints.
And just with regular PgBench, we could check all available algorithms on fresh Postgres versions.
And maybe draw some charts, right?
Plot some charts.
So, yeah, we just lack hands, but the bot is definitely ready for that.
So I think it's a great direction.
I wish I had more hands.
By the way, I wanted to say we are looking for maybe part-time database engineers,
people who want some fun with this kind of work, like research.
We usually do it in public, so it's kind of interesting for the community as well.
So if some people listening to this podcast want to participate and work part-time with us, with Postgres.AI, we'd definitely be interested in discussing.
So maybe we will have more hands and do this benchmark, for example.
What's the best way for them to get in touch with you?
Email, or Twitter, or LinkedIn — but email, nik at postgres.ai, is always a good way to contact me.
On the packaging front, I don't know. We have a database and recently upgraded to Postgres 16, finally, and I was pleasantly surprised that I was able to turn on WAL compression with LZ4. And TOAST — I was able to switch our default TOAST compression algorithm to LZ4 as well, which is really cool.
16, you said?
I'm on 16, but I think those are available
as of version 15 for WAL and version 14 for TOAST.
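What Michael describes, in settings form, looks roughly like this — the table and column names here are made up purely for illustration:

```sql
-- Cluster-wide default for values compressed by TOAST (Postgres 14+).
ALTER SYSTEM SET default_toast_compression = 'lz4';
SELECT pg_reload_conf();

-- Or override it for a specific column (hypothetical table/column names).
ALTER TABLE plans ALTER COLUMN plan_json SET COMPRESSION lz4;
```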
That's great.
So what benefits?
Do you remember some numbers?
So I didn't do extensive benchmarking, but the thing we use it for most is we have saved plans.
So explain plans compress really well.
There's a lot of repetition.
Large values.
Yeah, manually.
Yeah, yeah, yeah.
It makes sense.
It's in JSON, right?
Well, both.
Like text plans compress well too,
but JSON plans compress extremely well.
But obviously JSON plans are bigger in the first place, so there's not... Yeah, so it compresses really well. But the main thinking was: we don't mind spending a bit more on storage for those if the speed of retrieving them will be quicker. People save a plan, they might share it with somebody on their team, and that person needs to load it. Obviously we're talking small amounts of time, but if the storage wasn't that different and the speed was faster, I was happy to make the switch. And yeah, it turned out the storage was slightly worse on average for the plans I tested with LZ4, but the retrieval speed was faster. So I bit the bullet and did it.
The cool thing is, the thing I didn't realize,
I thought it would be a complex migration,
but like, what do you do?
Like, I thought you might have to change existing plans
or existing data, but you don't.
If you change the setting, it applies to new data.
Right.
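To see that behaviour for yourself, pg_column_compression() (Postgres 14+) reports the method each stored value was actually written with — again with the hypothetical names from above:

```sql
-- Existing values keep the method they were written with (NULL if the value
-- isn't compressed); only newly written values pick up the new setting.
SELECT pg_column_compression(plan_json) AS method, count(*)
FROM plans
GROUP BY 1;
```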
Yeah, actually, this is a good point we didn't mention.
So we're shifting to the discussion of compression of storage in Postgres, right? But it's a good point we forgot to mention about wal_compression.
It doesn't require restart.
So you can switch it on, switch it off,
and all new writes will happen according to the new setting.
Of course, you need to send the SIGHUP signal
or run SELECT pg_reload_conf().
So configuration change is applied without any restart,
which is great.
And it also means it's like easier to try.
If you have monitoring, if you're prepared to roll back,
it's easier to try.
And if things go wrong, you can return.
Yeah. So, TOAST compression is interesting. And you — again, sorry, I'm not paying attention to details today somehow — you chose Zstandard, right?
I actually chose LZ4.
Oh, okay, okay.
I was more interested in the speed of compression and the speed of retrieval than I was in total size on disk.
Yeah.
Meanwhile, I asked our bot to double check how apt packages, official PGDG packages are created.
Of course, these two options are there.
So if you install Postgres on Ubuntu,
you can try various compression algorithms.
Well, just to double check.
Okay, so toast.
What else?
We don't have a good ability to compress table data, right?
By default in Postgres, we don't have it.
And it's actually not a simple topic in a row store. In a row store, data is stored tuple by tuple, and tuples can go to different pages. So if we want to compress it, we can do it transparently by switching to ZFS, for example.
And we saw like benefits in terms of disk space, like 10% to 30%, depending on data nature.
But ZFS brings new challenges for administration, for sure.
It's still good to shave off 30%.
And this is what you get by default if you install DBLab, because it's on ZFS. So if you have, for example, a terabyte-size database, in DBLab it will look like 700-800 gigabytes only. It's much better. But in Postgres itself, there are no good compression options for the heap, except for single large values — like if you have one column that's large, I think TOAST is very good, but not for smaller values.
Oh, actually, yeah, big values are compressed in the heap, right? Now I remember: before going to TOAST, Postgres tries to squeeze them, to fit them into a size of about two kilobytes, to have roughly four tuples per page — that's roughly what I remember.
If, once it's compressed, it's under one kilobyte or something like that, they put it inline, on the page.
Inline, yeah. It just means that if you have values that are multiple kilobytes and compress well, even the default pglz will give you quite a lot of compression out of the box, transparently, in Postgres.
It's just that if you have lots of repeating small data, like time series data, that could compress very well as a whole — but not with the row store model.
But this compression is applied to a single tuple — a single row version, basically, right? So, a single row version. It means that we only have one timestamp, for example, one temperature and so on — several different columns, maybe a couple of timestamps, but timestamps of a different nature, for example created_at and, I don't know, registered_at, different timestamps. And Postgres tries to compress the tuple.
And TOAST — we didn't cover it here, and there's no goal to cover it deeply, we had another episode on it, right? So Postgres tries to compress; if the value doesn't fit, it splits it into chunks, and the chunks are stored in a separate, so-called TOAST table, which is actually also a regular table, kind of invisible to the user, but you can inspect it if you want as well. And then the decompression and reconstruction of the tuple occurs when it's needed.
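A small sketch of how one might peek at this mechanism — pg_column_size() shows the stored (possibly compressed) size versus the raw length, and the TOAST table behind a relation can be found via pg_class; the table name is hypothetical:

```sql
-- Compare raw value size with what is actually stored (compressed and/or TOASTed).
SELECT octet_length(plan_json::text) AS raw_bytes,
       pg_column_size(plan_json)     AS stored_bytes
FROM plans
ORDER BY raw_bytes DESC
LIMIT 10;

-- Locate and size the "invisible" TOAST table behind it.
SELECT reltoastrelid::regclass AS toast_table,
       pg_size_pretty(pg_relation_size(reltoastrelid)) AS toast_size
FROM pg_class
WHERE relname = 'plans';
```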
But I'm trying to say that for the heap itself, there are no rich capabilities to control compression. And even if we had them, the expected benefits would not be high compared to analytical column store databases where, for example, we have all temperatures or all timestamps stored in a separate file — only this one column, for all rows, is stored there. And they are all temperatures, and maybe these temperatures are coming from a single source, so they don't jump. I don't remember the word in English, sorry, but the changes are not abrupt, right?
I mean, if they change, for example, if temperature is increasing,
probably it will be increasing for some time.
It means that compression could be done using various algorithms, for example, applying deltas and storing only deltas, right?
And we will probably discuss TimescaleDB soon,
because this is pointing in the direction of TimescaleDB
and what it provides.
So for column store, you can compress like 10x,
which is not possible for row store.
Yeah, or more even.
But I think you're missing...
The multiple is interesting, right? But you've missed the fact that, let's say our tuples are like a megabyte, but 99.9% of that is a single JSON column and we have 10 other tiny little integer columns — we get way more than a 10x compression ratio just with TOAST, you know, if that column compresses well.
Yeah.
So, as you probably know, when I was very young I participated a little bit in XML functions and data type development in Postgres. And I remember at that time XML compression was a big thing. I even remember companies were selling hardware accelerators to compress XML on the fly, transparently.
It's crazy.
It's because a lot of the structural pieces of the values can be compressed well in XML for sure — no less than JSON — still, all those brackets, braces, quotes and so on. Like, yeah.
Or maybe it's also some,
if it's JSON, not JSONB,
a lot of white spaces.
Right?
Sure, yeah.
Yeah.
So if you have JSON — yeah, in some cases compression in TOAST, tuning it, is interesting, and maybe we also need benchmarks, maybe for JSON as well, right?
Yeah, it would be interesting. But what I mean is more that the compression ratio varies a lot — obviously it's extremely dependent on the data.
But I think, yeah, I'm very curious what you have. For example, if you know that some old values are not compressed and some new values are compressed, you check how much storage is occupied by, I don't know, like 1,000 rows or so.
It's worth checking.
Well, you can't — it's basically impossible to store large values in Postgres without it. As you know, you can turn compression off for TOAST, but the default is on, so I have never tried it without and I don't have that to compare. But just to give you an idea: we didn't used to compress plans — we only stored plans in browser local storage by default, and that's an extremely limited resource; I think in Firefox it's like five or ten megabytes, depending on the version. So that's not very much when you're talking about large query plans.
Obviously, it's a lot for most people's query plans,
but some people will come up with query plans that are like dozens of megabytes.
Varlena data types accept up to one gigabyte, right?
But as I remember, it's better not to go beyond a couple of hundred megabytes
because the performance will become terrible.
Well, just to give you an idea, though, quickly,
these plans were compressing from megabytes down to kilobytes.
It was more than 95%, I think, in some cases.
That's cool, but still, it's a compression of only one tuple.
Actually, one value, not a tuple — a value here, yeah. But it feels like it would make sense to have some compression at the page level as well, and so on. This is what ZFS provides transparently if you put Postgres on top of it, right? So compressing, I think, even if it's still a row store,
maybe having these abilities.
I remember Peter Zaitsev
talked a lot about new features
in MySQL many years ago.
I think MySQL has more settings
in this area for compression of storage.
Have you heard or remember
something from it? We have mostly questions today, right?
I didn't know that, okay. They seem to be a bit further ahead on the storage engine side of things, don't they? Like, they've had multiple storage engines for quite a lot longer than... well, we don't have that. So it wouldn't surprise me if they were ahead on that front.
But yeah, you mentioned timescale.
It feels like we've had a history of a few extensions
that have offered the option for columnar storage.
Before going there, let's just touch a little bit
on the options that Postgres could have,
and there are some discussions.
What else could be compressed inside? For example, temporary files.
Temporary files, yes. Well, we should also probably talk about things that we actually haven't mentioned, like backup files, pg_dump files.
Yeah, I wanted to do it, but look — before going to backups or dumps, backups and dumps, it's
already going outside of Postgres.
But imagine we have Postgres, we run some queries. We discussed WAL, which is the first
thing we should discuss because data is written there first before it's written in data files
or pages and so on.
So we discussed WAL.
We discussed the storage itself — TOAST.
This is what we have, and that's it,
right? Okay. Not at page
level, but it can be
achieved by file system, but that's it.
Indexes,
I have no idea.
Maybe
deduplication, which was done in Postgres 13 and 14 by Peter Geoghegan and Anastasia Lubennikova — maybe this can be considered as compression, actually, right?
That's a good point, yeah. Kind of native compression, you know — pieces of compression, because it requires less space, right? Why not. It's an optimization, though.
And last time we discussed — what was the topic last week?
Out of disk.
Out of disk, right?
And remember we mentioned, I named it exotic,
and some listener commented on YouTube.
By the way, thank you all those who write something on YouTube
or on Twitter or LinkedIn.
It's very good to receive feedback.
So they mentioned that temporary files are not that exotic.
And if you run huge, heavy queries, so they had it,
you can be out of disk space because temporary files were huge
and consumed a lot of disk.
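For context, the amount of temporary-file I/O is easy to see from the cumulative statistics, and log_temp_files points at the queries responsible — a minimal sketch:

```sql
-- Cumulative temporary-file usage per database since the last stats reset.
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
WHERE datname = current_database();

-- Log every temporary file larger than 100 MB so the offending queries show up in the log.
ALTER SYSTEM SET log_temp_files = '100MB';
SELECT pg_reload_conf();
```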
And compression of them makes total sense to me, right? What do you think?
Well, we don't have it, right? I was watching a really good video that you sent me a link to, by Andrey Borodin.
Right, right.
And that was going to... yes. I watched it last night, and he mentioned that, in order to... I think there was a problem with durability.
Forget about problems. This is like... I'm just... that doesn't make sense.
He started doing the work, and it's more complex than it might sound.
But yeah, it makes sense, but it's complicated. Full disclaimer: I just chatted with Andrey this morning, and this is what he told me, and I'm just shamelessly using his ideas. So yeah, it's worth compressing pages, as I mentioned, and it's worth compressing temporary files. Andrey says it's just a lack of time and hands, but it's worth implementing and proposing a patch. So, yeah, it's a great idea.
I hope he will find time to work on this,
and other people probably will find time
to move Postgres forward in this area.
And in this case, imagine if temporary files are compressed.
In this case, I would be more right in saying it's exotic to be out of disk space because of temporary files occupying a lot of disk space, right?
I think it makes extra sense, because the type of things that generate temporary files, like sorts and hashes — it's probably a lot of similar values. Well, I'm guessing there would be a fair amount of repetition, which would naturally suit compression as well. So I'm wondering if, not only would it take up less space — and obviously there are benefits there — but I'm wondering if there might be performance benefits too.
Yeah, that's a good point. Actually, when we talk about saving disk space, sometimes it's also about performance. If disk is slow, it's better to consume CPU cycles for compressing and decompressing, and things become much faster.
I agree, this is a good point. But it's not always so — it depends.
Exactly, it depends. In general, I think I saw somewhere... the general point is that databases are I/O hungry.
And as you very well know, we talk buffers, buffers all the time.
It's just confirming it's all about I.O. usually.
But I have also seen CPU at 100%, where we lack CPU especially.
For example, we are on Graviton or ARM,
and we only have up to 64 vCPUs, and that's it.
These days, it's not a lot, 64, right?
It's like a moderate number.
It's a lot for some people, yeah.
Right, right.
But it's not 360.
It's not 800 like you showed me on AWS for Intel Xeon Scalable.
So if it's 64, a lot of clients, in this case, CPU becomes a very valuable resource, right?
Of course.
So in this case, probably we might be preferring to spend more disk IO cycles.
Okay.
What else can be compressed in Postgres?
We talked about, I mean, ideas which are not currently implemented.
Temporary files, page-level compression,
maybe protocol level, which has pros and cons.
Maybe let's not go there, right? What else?
Anything else in memory that could be?
Oh, interesting.
Maybe. I don't know.
But I think if we,
for example, shifting maybe to some
additional projects,
I think we will talk about
Hydra and TimescaleDB.
If, for example, we have a storage engine
and we have
some tables defined as column store,
this is what Hydra provides.
In this case, it makes total sense to compress those tables.
Or, for example — we mentioned it last time, talking about disk space — we have a row store, but we have partitioning, and some old partitions we consider as having archived data. And this data we probably want compressed, and maybe we want to store it not on regular disk but on object storage. And we discussed pg_tier, which Tembo developed — so, kind of bottomless Postgres, where we want all this rarely used data to be compressed, right?
Yeah, I saw three different approaches from different extensions.
Like the Hydra one you mentioned, I think, came from Citus data originally.
C-Store FDW, I think, was the original.
It's AGPL, from Citus, inherited by Hydra, right?
Yeah. So I think that is very much: from the start of a table's life,
you choose whether it should be row store oriented or column store oriented.
That's one approach that makes a lot of sense.
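Roughly, per the Hydra columnar docs linked in the show notes, that choice is made via the table access method at creation time; the table definition here is just an illustration:

```sql
-- With the columnar extension installed, pick the column-oriented access method
-- when the table is created (illustrative schema).
CREATE TABLE sensor_readings (
    ts          timestamptz NOT NULL,
    device_id   int         NOT NULL,
    temperature double precision
) USING columnar;
```

Regular heap tables in the same database stay row-oriented; the access method is chosen per table.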
The Timescale approach seems to be…
And column store, of course, we'll probably compress quite a lot, because the ratio is usually good.
...storage benefits, but the main aim seemed to be so that we can make analytical queries faster. So it's that performance angle that seemed to be the driving force: why would we want it to be column store in the first place? For the performance of analytical queries that tend to be aggregates over a single column. And if we've got column-oriented data that's compressed...
Yeah, it's massive read. A lot of I.O.
Of course, we want to reduce this I.O.
and then already deal with it in CPU.
But also nature of data is like a lot of similar-looking values
and compression.
And I think once it's organized by columns,
you can also much more easily store metadata, like min/max values, and take some shortcuts on that data as well. So I think there are some cool tricks there too. But then I think there are two other approaches that I've seen. One is the Timescale approach, which seems to be: on older partitions, on older data,
everything's row store at the beginning.
And then after a certain point, you set a policy that it gets converted
to column store later once it's unlikely to change.
I remember differently.
I remember it's always row store, but compression works in two dimensions.
Like, for example, I'm not sure if they converted to column store.
Maybe I'm wrong, but what I remember is still row store,
but with understanding of vertical dimension, so to speak.
And, for example, storing deltas instead of row values
and applying compression.
Well, I mean, it's still in Postgres, so it's still row store under the hood, but it's organized by column. If you look — the Postgres docs are really good on this, I'll share a link... sorry, not the Postgres docs, the Timescale docs. I lack understanding here, apparently. But there's a third approach as well, I think. So, the Timescale approach, I think, is more optimized for space. I think the compression is more on the "let's make older partitions take up less space, because you have so much data" side. Like 20x, yeah — some of the numbers they mention are like 95% compression.
We have observed impressive numbers too, yeah.
It's really good.
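For a sense of what that looks like in practice, here is a sketch based on the TimescaleDB docs linked above — names are illustrative, and the table is assumed to already be a hypertable:

```sql
-- Enable compression on a hypertable, segmenting by device and ordering by time.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby   = 'ts DESC'
);

-- Automatically compress chunks once they are older than seven days.
SELECT add_compression_policy('metrics', INTERVAL '7 days');
```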
But yeah, that's the idea: with time series data you could end up with hundreds of terabytes — and they have, themselves. So I think it's the kind of case where you could actually save a lot.
Now,
obviously they also benefit from the performance on analytical queries and
they have some cool features there,
but it feels like the way they've implemented it was primarily for those storage benefits. And then the third one, which I think has popped up relatively recently in the grand scheme of things, is this idea of, as you mentioned, tiering — moving to a file format like Parquet. So, exporting data out of Postgres into a compressed format on object storage that's normally column-oriented, so that you can get these fast analytical queries and it takes up a lot less storage space.
There's some big limitations for data types, I suspect, right?
Only a limited set of data types supported
for those kinds of things.
And how do updates and deletes work? I don't actually know all of the details.
Yeah, so there are definitely limitations and differences between these, and we're not going to be able to describe them all here, obviously, time-wise. But I found it really fascinating that there are these three quite different approaches.
What do you think?
Right.
Yeah, well, it's super interesting to observe progress here.
Probably, again, for me this episode is raising more questions than answers.
I think after this episode, I will be planning more experiments to study benefits.
And I guess we always need to take into account several metrics,
compression ratio, speed of compression, speed of decompression, right?
Maybe CPU overhead itself, if it matters for us, how much CPU we consumed.
So, yeah, TimescaleDB and Hydra are interesting in this area.
And I still, I remember my big impression from reading the TimescaleDB details in their
blog post, how they implement compression. I think we forgot, because of me,
we forgot to talk about compression of dumps.
pg_dump has compression options. And also compression of WAL files as a whole, which I think Postgres doesn't provide, but both pgBackRest and WAL-G — I think the most popular backup tools — they both do it, right?
Because if you have a 16-megabyte WAL file and you compress it, you get like three times less — as I remember, like five megabytes maybe, or even less in some cases. Again, it depends on the data.
In this case, it's much better in terms of storage costs
and transfer speed and so on.
But this will consume some CPU cycles,
and usually the archive_command is working on the primary. This is the key here, usually: to avoid the risks of having longer delays, this lag of archiving — because we need WALs, it's part of our DR strategy, the disaster recovery strategy — we want to archive WAL as fast as possible, and this means we do it on the primary. And if we archive whole WAL files, we need some CPU. And if a lot of WAL is generated, probably we will need multiple workers.
And then you see already more than 100% of single CPU.
So meaning multiple cores are busy.
Like 200% out of our 360 vCPUs — we allow two cores to be used to compress and archive WALs. These are just some random numbers in my head, but definitely, if we talk about compression of whole WALs by WAL-G or pgBackRest, we need to keep this in mind — the most valuable CPU resource, which is on the primary.
We need to think about capacity here.
And decompressing usually is not a big deal,
especially if we can fetch WALs using multiple workers from object storage like S3 and decompress them.
Usually it's not a big deal, but still worth remembering.
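As a very rough sketch of that setup — the compression itself is done by the archiving tool, and this assumes WAL-G is installed and configured separately, with its own storage and compression settings:

```sql
-- Hand WAL segments to WAL-G, which compresses them before uploading to object storage.
ALTER SYSTEM SET archive_mode = on;                        -- requires a restart
ALTER SYSTEM SET archive_command = 'wal-g wal-push %p';    -- compression method/level live in WAL-G's config
```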
So usually we have compression there.
Unfortunately, Postgres right now doesn't do it officially.
So here I talk about, again, third-party tools,
which are very common and popular: WAL-G, pgBackRest.
Others as well, I think they also compress.
And pg_dump.
pg_dump is official, very official.
pg_dump, pg_restore.
They support compression for custom or directory or both formats, right?
I always mix them up.
Yeah, and this is one of the oldest.
Like, this has been in forever, right?
Right, compression has been there forever, but what about options in terms of algorithms?
Oh, yeah.
ZStandard and LZ4.
So I think it's relatively new.
And it's worth, again, experimenting, benchmarking, and studying.
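As a rough illustration of those Postgres 16 compression specifications — paths, database names, level and worker counts here are placeholders:

```bash
# Directory format allows parallel workers plus a compression spec (algorithm[:level]).
pg_dump --format=directory --jobs=4 --compress=zstd:5 \
        --file=/backups/mydb.dump mydb

# Restore in parallel from the same directory.
pg_restore --jobs=4 --dbname=mydb_restored /backups/mydb.dump
```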
I think there are articles from CYBERTEC, right?
Let's check.
Yeah, there are articles comparing different types of compression
and decompression.
So, the ratio — I think you can even control the compression level there: if you want to spend more time and CPU capacity to achieve a little bit better compression ratio, it's possible.
Yeah,
I remember some surprising results from those articles. I don't remember details,
so let's attach them. But definitely we want to study this
as well. Again, mostly questions today,
not answers,
not exact recipes.
But it's good. The only thing I don't like in pg_dump and pg_restore is that you cannot use parallelization and compression on the fly. What if I don't want... This is a one-time operation: I want to just logically migrate from one database to another database.
And I don't want to send a lot of bytes over network.
I want to compress and decompress
and use multiple workers
because, for example,
these servers are not used right now.
So I just need to migrate out of RDS
to self-managed Postgres
because I feel enough level of confidence
or I work with great guys
who know how to do it
and RDS is not needed for me anymore
and so on.
So in this case,
you need to, unfortunately,
you need to first save it to disk
and then restore from there.
Yeah. So this is, I think, a big missing feature of Postgres.
And Dimitri Fontaine created pgcopydb, I think.
But I quickly checked before our recording.
I didn't see anything related to compression.
It talks only about parallelization: let's have 16 workers, or I don't know, four workers, to speed up the process, which in my head is a relatively simple idea. We just create a repeatable read transaction, keep it, export its snapshot, create other repeatable read transactions, and use the same snapshot so all of them are synchronized — and you can even read one huge partitioned table in chunks. It will be a fully consistent read using multiple workers. But no compression options — I couldn't find any. Maybe it's not an easy idea to implement.
Maybe it's a good idea.
So what I would like to have is just pg_dump and pg_restore supporting both parallelization and compression, without the need to store an intermediate file or directory.
Would be great.
Makes sense.
Yeah, I needed it yesterday.
Copying huge, not huge, not huge at all,
like 10 or something gigabytes, tiny table with big vectors.
I mean, a lot of vectors.
By the way, let's make a full circle and in the end mention that, although we joked about lossy compression using LLMs — the underlying tech — when you have a lot of dimensions and huge vector values, it means TOAST is definitely involved. It's interesting how it's compressed there. In this case, there's a simple idea: sometimes reducing the number of dimensions is kind of lossy, but it's also kind of compression, right? And OpenAI talks about it. It's also an interesting area — how to do it without losing a lot of quality.
On vectors, though, I think
Jonathan Katz made a really good point
on the episode we did
for Postgres.fm
that because the vector data — at least the embeddings that come back from models like OpenAI's — is mostly random-looking numbers, it doesn't compress well. So there's actually, I think — I'm not sure if he's done it yet or he's planning to do it — work to look into whether it would actually be beneficial to turn compression off for TOAST of vector data, because you're not getting much compression, and there's no point paying the overhead of compressing and decompressing each point. So I thought that was super interesting as a use case for turning it off.
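The knob for that, if you want to experiment, is the per-column storage mode — EXTERNAL keeps large values out of line but skips compression; table and column names here are hypothetical:

```sql
-- Keep big embedding values out-of-line in TOAST but skip the compression step,
-- since near-random vector data rarely compresses well (hypothetical names).
ALTER TABLE items ALTER COLUMN embedding SET STORAGE EXTERNAL;
```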
Yeah, interesting.
I would like to explore it myself.
It's a very interesting area.
So I think we discussed maybe five different directions for benchmarking.
I would like to conduct these benchmarks.
Cool, let us know how you get on with all five by next week, right?
Yeah. Well, yeah. So thank you.
It was interesting.
Many questions.
There you go.
Not boring after all.
Thanks so much,
Nikolai.
Catch you next week.
Thank you.
Bye.