Postgres FM - Tens of TB per hour
Episode Date: November 14, 2025

Nik talks Michael through a recent benchmark he worked with Maxim Boguk on, to see how quickly they could provision a replica. Here are some links to things they mentioned:

- Ultra-fast replica creation with pgBackRest (blog post by Maxim Boguk and Nik) https://postgres.ai/blog/20251105-postgres-marathon-2-012-ultra-fast-replica-creation-pgbackrest
- Copying a database episode https://postgres.fm/episodes/copying-a-database
- Add snapshot backup support for PostgreSQL in wal-g (draft PR by Andrey Borodin) https://github.com/wal-g/wal-g/pull/2101
- Multi-threaded pg_basebackup discussion 1: https://www.postgresql.org/message-id/flat/CAEHH7R4%3D_GN%2BLSsj0YZOXZ13yc%3DGk9umJOLNopjS%3DimK0c1mWA%40mail.gmail.com
- Multi-threaded pg_basebackup discussion 2: https://www.postgresql.org/message-id/flat/
- io_method https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-IO-METHOD
- pgBackRest https://github.com/pgbackrest/pgbackrest
- Add sequence synchronization for logical replication (commit) https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5509055d6956745532e65ab218e15b99d87d66ce
- Allow process priority to be set (pgBackRest feature added by David Steele) https://github.com/pgbackrest/pgbackrest/pull/2693
- Hard limit on process-max (pgBackRest issue from 2019) https://github.com/pgbackrest/pgbackrest/issues/696

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
- Michael Christofides, founder of pgMustard
- Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
- Jessie Draws for the elephant artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I am Michael, founder of pgMustard, and this is Nik, founder of Postgres.ai.
Hey Nik, how's it going?
Good, good. How are you?
Yeah, keeping well, thank you.
Yeah, so a long time, obviously. We skipped last week because of me doing some stuff at work and with family,
so we couldn't make it. Thank you for understanding, and I'm glad we're back.
Yeah, and no complaints from listeners, so thank you, everybody, for your patience.
Nobody noticed. They are still mostly catching up, we know.
Yeah. Yeah, somebody shared on LinkedIn that, like, oh, I discovered Postgres FM, and while I was walking the dog, listened to all the episodes. It took me almost two weeks. Like, okay, 164 episodes during 10 days, it's challenging, right? It's like 16 episodes per day.
Yeah, that's binge watching.
Yeah, you need maybe 2x or 3x speed, and yeah, it's insane.
So this topic we have today, it's not originally mine, it's from Maxim Boguk.
I probably mentioned him a couple of times.
He always has very interesting ideas and approaches to solving hard problems.
And I remember he told me, maybe half a year ago, maybe more, he said: oh, I, like, I squeezed more than 10 terabytes per hour when copying data directly from one machine to another. And for me, the standard five years ago was one terabyte per hour. I think I mentioned it a few times.
Yeah. So, yeah, we had an episode about how to copy Postgres from one machine to another. And one terabyte per hour, it's, like, just okay on modern hardware. We know there is a throughput limit for disks, a throughput limit for network; disks on the source and on the destination both matter here, plus parallelization, and SSDs are good with parallelization. So basically the golden standard for me was one terabyte per hour. And if you need to create a wal-g or pgBackRest backup, it was one terabyte per hour at least, right?
And then we saw, like, two, three terabytes per hour, if you raise parallelization, with modern hardware. But it was still not about local NVMe, but about EBS volumes, or PD SSD disks, persistent disks, on Google Cloud. So, like, traditional disks.
By the way, those also improved. EBS volumes are pretty fast these days. And Google Cloud also has Hyperdisk, which is impressive, but not as impressive as local NVMe disks, right?
And Maxim told me that, of course, he was using local NVMes. It was some self-managed Postgres setup with special machines, so local disks, and no snapshots like in the cloud, and so on. And pgBackRest, and backups to where, to a different machine or to S3, it doesn't matter here.
The idea was how fast we can provision a replica,
because sometimes we need it, for example,
one replica is down, or we reconfigure something,
we need to build a replica.
And normally, if we have cloud disk snapshots,
these days we use cloud disk snapshots.
By the way, there is great news. Andrey Borodin, who is a maintainer of wal-g, implemented in wal-g native support for cloud disk snapshots, as an alternative to backup-push/backup-fetch. So instead of full backups or delta backups, you can rely on the snapshots that AWS, GCP, or others provide.
Yeah, I saw this. Is it a draft PR at the moment, or what's the status of it?
It was just created a few days ago. Early on, it was mostly vibe coded. Andrey is a great hacker, but I think our hacking Postgres sessions contributed to this idea.
Let's do more code writing with LLMs.
But of course, review will be thorough.
I think it's already happening.
I saw many comments.
And the result will be improved and so on.
So no hallucinations will be allowed.
All covered with tests and so on.
Anyway, there is this idea that instead of full backups we can have snapshots. But at the same time, like, I see two big trends. One trend is: okay, if we have snapshots for disks, let's use them more and more and more, and rely on cloud capabilities, depending on the cloud.
Of course, such backups need testing and so on.
In parallel, there is an old idea: we should not be afraid of using local NVMes. We know they lose data if the machine is restarted, ephemeral disks, right? But it's okay. These days, if a machine is having issues, we usually just reprovision it anyway. And switchover or failover doesn't require a restart, so it's very seamless. It means that we can live with local NVMes, right?
And additional momentum to this idea was recently created by PlanetScale.
Yeah.
Which came to the Postgres ecosystem and started showing very good-looking pictures in terms of benchmarks and real latencies from production systems. Of course, you cannot beat local NVMe disks, with their one million to three million IOPS and amazing throughput, like more than 10 gigabytes per second writes and reads. It's insane, because for regular disks we have one, two, three gigabytes per second only, and that's it, and IOPS, a hundred to a hundred thousand IOPS maximum.
And I think, for me, the critical part is: if we have an HA setup with failover in place, then we don't need the durability of those disks; we don't need there to be, like, synced cloud backups, you know? Because we're going to fail over anyway. So, yeah, it's that insight, and I think they did a good job of making that very clear:
if you've got an HA setup, we can make use of the NVMe drives.
Yeah. By the way, I recognize I'm mixing throughput and latency. Anyway, local NVMes are good in both aspects of performance, like almost, or sometimes a full, order of magnitude better than network-attached disks and so on. But you don't have snapshots. So two big downsides: you don't have snapshots, and there is a hard stop in terms of disk size. If you plan to grow to 200 terabytes, you definitely need something like sharding, or anyway some way to split, only if you have it, if you master it. In this case, local disks are great for, like, long-term planning. But for long-term planning,
everyone should also remember that EBS volumes and PD SSD, or Hyperdisk, are limited to 64 terabytes, and both Cloud SQL and RDS have this as a hard limit as well: 64 terabytes. This is a cliff, or a wall, a very hard one. And Aurora has 128 terabytes, double the capacity. But anyway, if you self-manage, you can use LVM, of course, to combine multiple disks, and I think some companies already do it. But in the case of local disks, it's a really hard stop; nothing more to combine, unfortunately. Anyway, you usually combine multiple disks when you have terabytes of data. In my benchmarks, in our benchmarks, we combined multiple disks on i4i instances. So there is a hard stop in terms of capacity. I think on i4i it's 40 terabytes or something like this. It's impressive, right? So if you think you won't grow to that size very soon, it's a great alternative to consider, if you go self-managed, or some self-managed Kubernetes setup, compared to RDS. Because it gives you order-of-magnitude better performance,
right?
Yeah. By the way, when you say our benchmarks, do you mean in the blog post you did with Maxim?
Yeah, yeah. So the original idea was, like, Maxim's. He did the majority of the initial work, created the recipe. I will tell the recipe and why it existed. But the idea was: we need to copy from one machine to another as fast as possible. We don't have snapshots, because it's local disks, either fully, like, maybe your own data center, or it's these instances with local disks, like i4i, i7i.
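As a side note on combining local disks with LVM, mentioned a moment ago, the striping setup can be sketched roughly like this. Device names, volume names, and stripe size are hypothetical, and the commands are only printed here, not executed, since this is a sketch:

```shell
# Dry-run sketch of striping four local NVMe devices into one volume with LVM.
# run() only prints each command; swap in run() { "$@"; } to actually execute.
run() { echo "+ $*"; }

run pvcreate /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
run vgcreate pgdata /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
run lvcreate -i 4 -I 256k -l 100%FREE -n data pgdata   # -i 4: stripe across all four PVs
run mkfs.ext4 /dev/pgdata/data
run mount /dev/pgdata/data /var/lib/postgresql
```

With cloud block storage this is how people get past per-volume limits; with local NVMe, as Nik says, once the instance's disks are used up there is nothing more to combine.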
I remember I explored this idea very long ago, 10 years or so, maybe, yeah, maybe 10 years ago, with i3 instances at that time. I really liked the cost of the final solution, because the disks are included; instance costs are slightly higher, and performance is great. So we did a lot of benchmarks using i3 instances, even spot instances, I remember. It was great. So you get a very cheap, very powerful machine and do many things with Postgres.
So anyway, the goal was how to clone, to provision a replica, for example, or to provision a clone for experiments.
Yeah, I wanted to ask about this, because in the benchmark, for example, you turned off checksums, and it was interesting to me. Because if you're provisioning a replica, presumably you wouldn't do that, but if you were doing it for some other reason, maybe... So what are the use cases here where this would make sense?
Well, if it's a clone for experiments, we don't need to be super strict, and this option, I think, exists mostly for backups, to make them super reliable and so on, of course. And also, you know, this is an extra feature pgBackRest has, which is actually a luxury to have, right? So yeah, of course, it would be good to experiment with checksums enabled. Of course, results should go down, but for some cases it's appropriate to disable that check.
That's a good point, actually. So if we don't have it, is there a risk of introducing corruption, or is it more that we persist, like, if our primary is corrupted?
Let me ask you this: if you copy data using pg_basebackup, who will check this? Who will do that?
Yeah, good point.
Or you use rsync. The traditional ways are rsync and pg_basebackup; this is the most traditional way. Both are single-threaded, that's the point. And there's no checksum verification, like, this is an extra feature. It's great that the pgBackRest maintainers implemented it, and it would be good to compare how it affects this amazing performance, yeah,
in this post. But anyway, the original goal was to be able to copy data the traditional way, not involving backups this time. Because usually, basically, normally, if not snapshots, we just fetch from backups using wal-g or pgBackRest, and this is also a very fast way, and there you can also control parallelization; you can provision quite fast. And S3, and GCS on Google Cloud, they are good with parallelization, like 16 threads, 32 threads; you can speed up significantly, you can increase the throughput of backing up or restoring from a backup. But in this case, okay, we don't have those backups. We just want to clone from one machine to another, that's it.
And the problem is, pg_basebackup is still single-threaded. There are discussions, and even a patch proposed, to make it multi-threaded, but apparently it's not trivial. And I don't see current work in progress, so I think it will stay single-threaded.
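For reference, a typical single pg_basebackup invocation for building a standby looks something like the command below. Host, user, and data directory are hypothetical, and the command is stored and printed rather than run, since this is only a sketch:

```shell
# A typical pg_basebackup command for provisioning a standby (sketch only).
# -X stream: stream WAL alongside the copy; -R: write standby connection config;
# --checkpoint=fast: start from an immediate checkpoint on the source.
cmd='pg_basebackup -h source.internal -U replicator -D /var/lib/postgresql/18/main -X stream -R --checkpoint=fast --progress'
echo "$cmd"
```

However many options you pass, it remains one copy stream, which is exactly the limitation discussed here.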
So Maxim came up with the idea that we could use pgBackRest, not to create backups in S3 and then restore, because that's basically copying twice, right? If you have them already, good, just restore. But if you don't have them somehow, or cannot involve them somehow, you just have two machines and need to clone. In this case, he came up with the idea: let's use pgBackRest to copy from one machine to another. That's it. It's not its primary job, right? But why not?
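For flavor, pgBackRest is driven by pgbackrest.conf; the fragment below is a hypothetical sketch (stanza name, host, and paths invented, and it is not the exact recipe from the linked blog post), showing the two levers discussed in this episode, process-max for parallelization and compress-type for compression:

```ini
# /etc/pgbackrest/pgbackrest.conf -- hypothetical sketch, not the blog post's recipe
[global]
repo1-path=/var/lib/pgbackrest
process-max=32          # parallel worker processes; the benchmark scales this up
compress-type=lz4       # cheap compression; helps once the network saturates

[main]
pg1-host=source.internal
pg1-path=/var/lib/postgresql/18/main
```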
And he told me, like I said, a month ago, that he achieved more than 10 terabytes per hour,
which is great, which is like absolutely great and impressive.
And multiple times what you thought was, you know, considered good.
Yeah.
So from this fact and others, I already recognize that, like, one terabyte per hour is not a golden standard anymore. It's outdated. We need to raise the bar, definitely. So maybe two at least, I don't know, maybe more, three, four. How much, like, should be okay for modern hardware, for large databases, these days? Maybe five terabytes per hour should be considered as, like, good. Because, you see, you build some infrastructure and you see some numbers. And we have clients, not one client, many clients, who come to us and complain: okay, our backups take, like, this amount of time, like 10 hours.
What's the size of your database?
200 gigabytes.
10 hours, 200 gigabytes. Something is absolutely wrong, you know?
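To see why that number is absolutely wrong, the implied sustained rate is easy to compute (a small sketch):

```shell
# 200 GB in 10 hours: what sustained rate is that?
awk 'BEGIN {
  gb = 200; hours = 10
  mb_per_sec = gb * 1024 / (hours * 3600)   # GB -> MB, hours -> seconds
  printf "%.1f MB/s\n", mb_per_sec
}'
```

That works out to under 6 MB/s, a tiny fraction of what even one commodity SSD can sustain.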
Maybe, like, let's see where the situation is. Where is the bottleneck? And it can be disks in many cases, but maybe network, maybe something else; mostly disks. But anyway, this is not normal. Sometimes it's software and a lack of parallelization and so on. But it's not okay to live with these bad numbers these days, right? We have very powerful hardware, usually.
Just to ask the stupid question, what's the negative impact of it taking 10 hours?
Well, it affects many things, for example RPO, RTO, right? You create backups, you recover a long time, it affects RTO, recovery time objective: how much time do you spend to recover from disaster? You lost all the nodes; how much time to get at least something working, at least one node working? If it's 10 hours for 200 gigabytes, it's a very bad number. You need to change things to have better infrastructure and software and so on.
And this is one thing.
Another thing is when you upgrade, for example; sometimes you need to clone. Our recipe involves cloning, a recipe for zero-downtime upgrades. And if it takes so many hours, it's hard to think what will happen when you have 10 terabytes. How many days will you need, right?
So it's not okay.
This requires optimization earlier.
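As a rough illustration of how the bar moves with copy speed, using the rates from this episode (a sketch):

```shell
# Hours needed to copy a 10 TB cluster at different sustained rates.
for rate in 1 5 36; do   # TB/hour: old standard, proposed new bar, benchmark result
  awk -v r="$rate" 'BEGIN { printf "%2d TB/hour -> %5.1f hours for 10 TB\n", r, 10 / r }'
done
```

At the old one-terabyte-per-hour standard a 10 TB clone is a ten-hour job; at 36 TB/hour it is under twenty minutes.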
And also for provisioning replicas,
a certain amount of time taken is always going to be acceptable, I guess,
because it's not the primary, right?
Like, we're...
It depends. If it's only the primary, you are out of replicas, it's dangerous. Even with two nodes, I would be... Running with two nodes, I would spend as little time as possible.
So when you say two nodes, do you mean with two replicas?
Oh, no, okay. Primary and one standby, it's already a degraded state. We need a third node, right? With three nodes, you have how many nines, 12 nines or so? Like, if you check the availability of, let's say, two instances, and just think about three nodes, and what's the chance to be out of everything. This will be, I think, 12 nines or something, as I remember.
That is a lot of nines.
Yeah, yeah, and that's great. So it
means, like, almost zero chances that you will be down, unless some stupid bug propagates to all the nodes and puts Postgres on its knees on all nodes simultaneously. Anyway, if it's degraded, and especially if it's just one primary, it's a very dangerous state. Like, we have cases, some clients, who run only one node, and, like, we discuss it with them, it's not okay. But somehow these days, they just survive, because the cloud became better, you know? Like, it doesn't die so often. Nodes don't die so often as before.
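The "twelve nines" figure mentioned above can be sanity-checked with a quick back-of-envelope: if each of three nodes is independently down 0.01% of the time (four nines each), the chance all three are down at once is that probability cubed. This is an idealized sketch that ignores correlated failures, which is exactly what the "stupid bug on all nodes" caveat is about:

```shell
# Probability that all three independent nodes are down simultaneously,
# assuming each node alone has four nines (99.99%) availability.
awk 'BEGIN {
  p_down = 0.0001              # per-node unavailability
  printf "%g\n", p_down ^ 3    # combined unavailability of the trio
}'
```

That product is 1e-12, i.e. roughly twelve nines of having at least one node up.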
I also think there's this phenomenon where if you're in US East 1 and you go down,
there's so many other services that are down at the same time that you kind of get,
not a free pass, but like they get away with it a little bit more.
You're in the club, right?
No, no, no.
You are down, others are down.
We are in the same club.
Yeah, I'm not in that club, but yes, I think a little bit.
No, I think, yeah, we also have customers who got into that trouble as well, and they seriously think it's not okay and they need a multi-region setup. And I think this right now drives improvements in many companies, in infrastructure, and so on.
I'm glad to hear that.
Yeah, anyway, one node is dangerous. Like, I would provision more nodes sooner. And in some cases, you cannot live with one node: if you already relied on distributing read-only traffic among multiple nodes, one node won't be capable of serving this traffic.
Yeah.
So we've established speed matters.
Of course.
Sorry, time matters.
Therefore, speed matters.
Yeah.
How do we go faster?
Yeah.
So with pg_basebackup I expected, like, okay, it's single-threaded, so I expect something like two, three, 400 megabytes per second maximum. And my expectations fully matched what I saw Maxim share with me. But when I started testing, I took two nodes, i4i. It's not the latest; it's third generation Intel Xeon Scalable, which is quite outdated, they already have the fifth generation, i7i. So two nodes, 32xlarge, 128 vCPUs, more than a terabyte of memory each, I think, and 75 gigabit per second network. And the disks are, like, I think it's 3 million IOPS, I remember. I don't remember how many gigabytes per second exactly; it's definitely somewhere like 10 or more gigabytes per second disk throughput maximum. I think eight disks, already in RAID. I took Ubuntu, I installed Postgres 18, and this was my mistake, because this gave me very good speed for pg_basebackup, unexpectedly. I saw a gigabyte per second, and I was like, oh, what am I doing wrong here, because it's too fast. And then I realized it's the Postgres 18 improvements, right?
I saw you said that in the blog post, but did you run it deliberately with io_uring on?
It's on by default.
Well, no, io_uring is not the default. But it is using something similar.
It's using something similar. It's, like, some prefetching or something, but it's definitely something.
Yeah, well, it does have two or three worker processes by default, and it is an asynchronous I/O thing.
No, I didn't change that setting. So this is probably an inaccuracy in my blog post. But maybe not bad, because it might still be AIO, might still be asynchronous I/O, just not io_uring.
So, right, there is the setting io_method, right? And the default is worker, right?
Yeah.
And it's asynchronous I/O using worker processes.
Yeah, it's not io_uring. I need to make a correction.
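The settings being discussed live in postgresql.conf; a minimal fragment showing the Postgres 18 defaults (worth double-checking against the documentation linked in the show notes):

```ini
# postgresql.conf (Postgres 18)
io_method = worker     # default; alternatives include io_uring (Linux) and sync
io_workers = 3         # default number of I/O worker processes
```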
This is actually interesting, because I think the default is only two or three workers, which means if you increase that number, you might... So, triple, like, three workers...
You probably got triple, yeah.
So you've got roughly triple the...
Exactly. My expectations were of 300, yeah.
But if you increase that number, if you increase the number of workers, you might be able to get even more.
Does it mean I need to spend a few hundred dollars again for these machines?
I guess so.
Or an exercise for the reader.
How it worked: like, I started them at 6pm or something, and I worked with them for a while. Cursor did a lot of work. Like, I just explained and controlled: I connected, tmux, atop, iotop, everything. I see how many threads, everything, like htop, many things. So I see that it's doing the work as I would, many iterations to polish the approach. And then I realized, okay, it's already 9pm, 10pm, but I cannot drop it, because it's already provisioned, and it took time to create a one-terabyte database. I had comments from Maxim as well that one terabyte is not enough. Like, okay, I agree, I should do 10 terabytes. So I guess I need to redo this. I have homework.
You don't need to redo it. It's just interesting. It's interesting to me as well. Because I also want to see Postgres 17, you know?
Yes.
And different settings here. I think it's interesting how pg_basebackup can behave with various settings: more workers, io_uring as well, sync as well, right? Synchronously, so not asynchronously like here; with sync, the throughput should go back down.
Yeah, so now, obviously, you just made me do another round of this experiment, right?
Well, sorry, but I also think, like, someone else could do this, right? Someone else could do this.
Yeah, I will have it in my to-do list, but if someone who is listening to us is ready to repeat it, maybe they already have some good machines. Unfortunately, I don't have credits anymore, like I did; my company had a lot of credits, but not now. Maybe someone has credits and can provision big machines, just to avoid extra spending, because I think a few more rounds will already be a thousand dollars, just to check this setting. And am I interested in checking on smaller machines and then, like, extrapolating? No, no, no, it's boring and not serious, so it should be big machines. I would also check i7i, because they have 100 gigabit per second network versus 75, so also extra, and I think we can exceed 10 gigabytes per second.
So, what I've got with pgBackRest... with pg_basebackup: one gigabyte per second. I think you are right in terms of three workers; this is why, right? Interesting to check more workers. With pgBackRest, I increased parallelization, and my blog post has the graph showing how throughput was growing with more and more workers, and then saturation happened, on the network. So I achieved 36 terabytes per hour. And it was also exceeding my expectations; I was hoping to be close to 20, maybe, right? But these machines...
Yeah, it sounds fake to me. It just sounds, like, not believable. But it obviously is impressive.
Yeah. So i7i, with 100 gigabit per second, should give maybe more than 40, right? Maybe approaching 50 terabytes per hour. This raises the bars and expectations of what we should have in our systems these days. So it should be normal: 5 terabytes per hour should be normal now, I think, answering my own question from some minutes ago, right? Ten terabytes, it should not be surprising already, if you have local disks. With EBS volumes it's different.
Yeah, for me,
I guess the surprising thing is that we can saturate the network. Like, it then becomes about your network, right, I think, and the number of cores.
Yeah. Yeah.
And if you saturate the network, the idea is, of course, let's try compression. So I did. And compression improved things, but... so it shifts the saturation point, but it, like, quickly saturated again, and it was not helping to achieve more. Maybe it was the disks there, actually. Let me check.
So if you look at the picture, I'll pull up your blog post as well. Yeah. Because you hit a peak around 32 parallel processes.
Ah, also: SSH overhead. I got good comments on Twitter that it would be good to check TLS. It would require effort to configure, but it should also reduce some overhead, and throughput should be improved.
Wow.
So I guess 36 terabytes per hour is not the current limit for modern hardware in AWS. We can squeeze more, right? We can probably squeeze up to 50 terabytes per hour; there are several good ideas. And also squeeze more from pg_basebackup, right? So maybe this is a competition, right? My intention was to show, like, pg_basebackup is bad, single-threaded, forget about it, here's the speed. Now, with Postgres 18, it's not that bad. It can be tuned additionally, as you say, probably, right? And we can have very good speed with it as well. So it's interesting. And one gigabyte per second is more than three terabytes per hour. Three thousand six hundred gigabytes per hour, right? It's more than three terabytes. So it's already, like, good enough. But these are default settings.
So the answer: should you use the pgBackRest recipe? Well, it depends. With Postgres 18... With older Postgres, I definitely think you will be limited by 300-400 megabytes per second, that's it, because of the single-threaded nature of pg_basebackup. And I like this a lot
compared to the tricks we did in the past with rsync. rsync is also single-threaded, and, like, I really don't like that. Why is it so? It should be implemented properly; people use GNU parallel and so on, but it feels so cumbersome, right? It's not a lightweight approach, to write some scripts and then control the threads. And so with rsync it's definitely single-threaded; that's where pg_basebackup in Postgres 18 basically shines compared to rsync. Yeah.
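The mental conversion Nik does a moment earlier, from a sustained rate in GB/s to TB/hour, sketched out:

```shell
# Convert a sustained copy rate in GB/s to TB/hour (decimal units).
awk 'BEGIN {
  gb_per_sec = 1
  printf "%.1f TB/hour\n", gb_per_sec * 3600 / 1000   # 3600 seconds, 1000 GB per TB
}'
```

One gigabyte per second is 3,600 GB per hour, i.e. 3.6 TB/hour, which is where "more than three terabytes per hour" comes from.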
And also it orchestrates all the stuff additionally, like connecting to the primary, telling it, I'm copying you, right? pg_start_backup, pg_stop_backup, basically, right? So you don't need to remember all those things. And it's also the standard, official way. It's core.
Although I would say, because of researching for this, I was looking at the pgBackRest docs and GitHub repo just earlier today, and it's incredibly well maintained. Like, I was surprised to see...
Yeah, mostly David Steele.
Yeah, it's great. Yeah, kudos to David. But also, just in terms of viewing a project for the first time in a while, I hadn't looked into it in detail for a while, and I think I was just a bit surprised when I looked at the open issues. I was like, there's only 54 open issues. Like, for the size of the project, for the age of the project, that for me seems like...
And, by the way, it doesn't... I'm not taking that number on its own. Looking at what they were, like, some of them are, like, feature ideas from a few years ago that are just left open because they're still good ideas; some of them are new things.
That's good, because there's an approach where people just close them, like: okay, no comments for a month, let's close. And I absolutely hate it,
because if you reported a bug which is not fixed... CloudNativePG, hello! Like, they just closed it, right? Converted it to a discussion, but the bug is still there, and everyone around agrees it's still there. Why do you close my issue? It's a bug report. So next time I won't write anything, right? And I told it everywhere, and to CloudNativePG I'm not going to contribute anyhow else. So this is, like, let's just keep the issue count low, right? It's not okay. If it's a serious problem or idea, it should be there. So the fact that there are old issues there is great.
Yeah. Yes. So all I meant is, you kind of get these... sometimes you get red flags when you look at a new project. Or, like, if you see a project and it's got 20,000 open issues, for me that is a red flag. Not that there's lots of issues, but that no one's closing the ones that are duplicates or things. So it was more just, like, I wasn't getting red flags, and then there were loads of green flags, looking at, like, recent PRs, and it just, it seemed like a... So I know it's not in core, but it does still feel like a very tried and tested product that's really well maintained. And I know a lot of people are using it. So it feels pretty much as close to core as, like, a...
What if you see hundreds or thousands of issues open, but many of them are recent and there is activity? Not well maintained, or a mess?
Spam issues? Like, how can you create hundreds of them, thousands of issues? I just haven't seen a repo like that. So maybe I have to, like, reassess, but yeah. Yeah. Have you seen a repo like that?
I have such projects.
Oh, okay. So it's just, yeah, there are some old issues that should be closed, you know.
Yeah. Yes. Well, so I agree, this is a great example of a well-maintained open source project. Absolutely great. I agree. I think if you check some internal repositories in companies, sometimes it's a mess as well, right? And sometimes it's beautiful. It depends; I think it depends a lot on the maintainer. What can you tell about Postgres itself?
Yeah, well, we don't have an issue tracker.
Yeah, no issues, no issues, right?
Well, like, we should get that. That would be a good t-shirt: no issues, no issues.
All right, should we get back to the topic?
Back to the topic. So, David Steele reached out to me commenting, and it's great. So, some inaccuracies in my blog post to fix. And also, obviously, it inspired him to... as I understand, this idea was already around for quite some time: to have the feature to basically use renice to reprioritize some processes, right? So now there is a pull request to be able to change the priority of pgBackRest processes, which is good.
Yeah, so this is super cool.
So this is the idea that because you're running this on production,
you might not want to, you know, let's say you're quite high CPU on your primary.
Compression, yeah.
Yeah.
You use compression, so you need CPU for pgBackRest, for example, yeah.
Yeah, but you don't want to affect your current traffic, right?
It reminds me of old days, like 2003 or 2004. It's so strange: we couldn't think about terabytes those days, 20-plus years ago. It was too much. A terabyte, it's a mind-blowing number. But we used, I remember, renice, ionice, something. I barely remember from those days.
Just nice?
Yeah. Yeah, nice and so on.
It's strange that this word is used to change priority, right? I don't know the details. Do you know the details here?
Yeah... No, I don't know.
Yeah, yeah. So I remember from those days we used it, but I also remember I was struggling to see good effects. So now, if this pops up again, I would check it, test it properly again, how exactly it helps. But this is in the to-do. And really cool that these benchmarks helped prioritize that work.
So cool.
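What that renicing looks like from the shell, for reference. The pgbackrest invocation is illustrative only, so the command is stored and printed here rather than executed:

```shell
# Wrap a (hypothetical) pgBackRest backup in nice/ionice so it yields CPU and
# disk to production traffic. nice -n 19: lowest CPU priority;
# ionice -c 2 -n 7: best-effort I/O class, lowest priority within that class.
cmd='nice -n 19 ionice -c 2 -n 7 pgbackrest --stanza=main backup'
echo "$cmd"
```

The pgBackRest pull request linked in the show notes aims to make this kind of prioritization a built-in option instead of an external wrapper.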
Inspires to do more benchmarks as well.
Always.
I get stuck in that loop sometimes, not just with benchmarks. You know, you think you implement a feature, then you think of a better way to implement it, then a better way, and a better way, and you get stuck in that loop. Or blog posts.
This post is definitely not final.
So somebody should achieve 50 terabytes per hour.
This is a challenge.
Yeah, I like it.
Yeah, who can do it?
Yeah, who will be the first in the Postgres community reporting 50 terabytes per hour?
Yeah, let us know.
Although, I did see there was an issue that David linked us to, where somebody maxed out a hundred-core machine, and it convinced him to increase the... he had process-max capped at 99 before then. So I thought that was quite funny, that, you know, the limiting factor was this hard-coded limit.
The maximum number of vCPUs on AWS, I think it has already exceeded 1,000. Or no, I remember 700 or 800 vCPUs, also fifth generation Intel Xeon Scalable. So it's already, like, we have a lot.
Well, bear in mind your test maxed out at 32 processes, right?
Right, but that one bumped into the network.
Yes, okay. And then when I increased compression, it shifted, right? It shifted to 64 cores.
So I'm wondering, what do you think: the person in the open issue back in 2019 who maxed out their hundred cores, what do you reckon their throughput was?
I don't know. Lower. It's interesting.
But maybe, you know, we need compression not only to battle the network throughput limit. If it's an actual backup, we sometimes want it to take as little space as possible in S3, right? Just to pay less, maybe, right? In this case, we can have aggressive compression and high CPU consumption, right? In this case, it can be 128 cores, for example. The third generation of Intel Xeon Scalable has 128 cores maximum, I think; at least in clouds it's n2 on GCP, and i4i, as I used, on AWS. But I wonder a lot: how come AWS created 700 or 800 vCPU machines? It's like, wow. So, like, 10-plus terabytes of memory. Wow.
Yeah. By the way, when you keep
saying the Google disks earlier, I think, every time you said them, all I could hear was PTSD.
PD SSD: persistent disk, SSD.
Yeah, yeah. I heard PTSD.
And then when you're saying i4i, all I can think of is the saying, an eye for an eye leaves the whole world blind. So now...
Yeah.
Well, by the way, one interesting thing: EBS volumes these days are also built on top of NVMes. I don't know how it's called, Nitro architecture, or there's a new name, I don't follow all the terms. But I remember two gigabytes per second and even more, right? So I think we can squeeze five to six, seven terabytes per hour on modern EBS volumes as well.
Even on EBS, wow.
Yes, yes, yes. So forget about one terabyte per hour. It's old. You can do better.
Yeah, definitely better. If you have a large database, do better. Nice. Yeah. All right, good. Thanks so much, Nikolay. Enjoyed that. Take care.
Bye-bye.
