Postgres FM - Slow queries and slow transactions
Episode Date: July 5, 2022
What did you like or not like? What should we discuss next time? Let us know by tweeting us on @samokhvalov and @michristofides.
If you would like to share this episode, here's a good link (and... thank you!). We also have an uncut video version on YouTube.
Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard
With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello and welcome to Postgres FM, episode number one.
Pilot episode.
Pilot episode, yeah.
I'm Michael, I'm from pgMustard, and I'm joined today by...
Nikolay Samokhvalov. Don't try to pronounce my last name unless you have Russian, Ukrainian, or Belarusian roots.
So yes, I'm Nikolai from Postgres AI.
Wonderful.
Looking forward to these.
Let's discuss why we do it.
Because I've felt for a long time that we should have some podcast,
a pure podcast without any screencasting or slides and so on,
because it's also a good format for having a discussion.
And I feel like meetups died because of COVID.
They started to die before COVID,
but COVID finished them completely.
So it's very hard to meet at some place
and have some pizza and beer.
I mean, people still do it,
but only like 10 people come, and so on.
It's not working well.
So we need more formats.
I know many people hate purely online formats,
but I actually love them.
And I think we should experiment
with more
and more styles or formats, and the Postgres community needs some podcast. That's why we discussed it with Michael, and decided we should start Postgres FM. Maybe it will also die, but who knows, right? I hope not. Why do you do it?
Yeah, well, I love podcasts in general. I love them for learning.
I find them really helpful. I can listen to them at times when I can't watch a video or can't attend an event. I find them really helpful in terms of learning, but also for getting to know people in the community, getting to see different guests, getting to hear interesting conversations between people that I probably wouldn't be a part of generally. So I love them as kind of an insight into something that you're not necessarily familiar with already, or a way of keeping up with people that you don't talk to all the time. That's quite a nice format as well. But yeah, basically I thought it would be interesting. I thought I would like to listen to this, right? I would like to listen to a couple of people talking about Postgres each week, discussing something that is, you know, super interesting or controversial or not yet decided, or a common issue that people see all the time, something that I've just hit or might be able to avoid hitting in future, that kind of thing.
Right, the same. I'm just sometimes trying to fill gaps, when I walk my dog or I'm on an airplane. I have some time, and I want not just some entertainment but something useful, and a podcast is great for filling that kind of gap. So yeah, okay, let's start with some topics we have. I think, in general, we will try to talk more about performance, but maybe not only that. But definitely about Postgres.
Yeah, I think always about Postgres.
I think, based on what we focus on,
we'll probably end up bringing up performance topics
more than most would.
But equally, I think we're open to suggestions
from other people as well
as to what you'd be interested in hearing about.
So yeah, we're very open on the topics front.
My opinion about performance: it was a surprise to me, but I think not everyone is interested in performance. Not every engineer who writes SQL is interested in performance. This is a discovery I've made. To me it feels like the most interesting part, scalability, performance, these kinds of topics. But I found many people are just not interested; they just need some query to work and return proper results, and that's it. So I wish everyone were interested in performance, but anyway, let's start with the first topic.
Well, I'm interested in that one, actually. Do you
find there's a pattern as to who cares and who doesn't care so much? Or is it that they only care if it's below a certain threshold?
Or is it that they care about one thing way more?
I think we're moving towards one of the topics
we wanted to discuss.
And the word threshold is good here.
So what I observe in organizations, from small startups that grow very fast, where all the numbers increase several times per year, to very large organizations with thousands of people and hundreds of engineers, is a quite obvious pattern: business dictates that feature delivery be very fast. There are a lot of competitors around, so the requirements are very strict. So we need to move fast. So developers are mostly interested in having features delivered daily, for example, right? Very, very fast, move very fast.
And under this pressure, they don't have time to achieve the best performance ever, right? They just don't have time, so they start to move like, okay, it works, let's go. But the problem is: who defines this threshold? Where is the minimum, who checks it, and how do we check it? It's usually a problem. Sometimes you observe very good performance when you develop the feature, but when you deploy it, it's not good. Or you deploy it and it's also good, but one year later nobody has touched this code, it was left unoptimized, you have more data, and the query degrades. So, my threshold. I have my threshold.
I wrote an article about it.
So it's like every web page should be faster
than 200 or 300 milliseconds.
One second is the absolute maximum.
And we're talking about a page or an API request, of course. Every page or API request may consist of multiple SQL queries; it may have zero, but sometimes we have dozens of queries, which is not good, and sometimes we even have a loop with queries, which is of course not good. But in general, it means that if we have a requirement for a web page to take no longer than one second, the general requirement for each SQL query is to take no longer than dozens of milliseconds. For example, a one-second page budget spread over a dozen queries leaves less than 100 milliseconds per query.
My article also describes where this number comes from.
Where does it come from?
Human perception. It's like the 200-millisecond reaction time of any human. Plus or minus.
Yeah.
50, maybe. So if you press some button, you expect a reaction below 200 milliseconds, better below 100 milliseconds.
Yeah, I think I've heard that below 100 milliseconds it feels instantaneous.
Anything above that, we perceive some delay, even if it's not instant.
Yeah.
So I know exactly where you're coming from on that.
Makes a lot of sense.
So I guess the question is, do you see a lot of companies that generally have way worse
than that performance and still don't care?
Of course.
Well, yes, yes, yes.
A lot.
So if users don't complain, oftentimes we just don't care. We have some functionality working, but we have so many things to create to compete with others, right? To expand, to grow. So if users don't complain, sometimes you see some pages or API requests are above one second in duration and nobody cares. You can actually see it from the logs. You can check it if you have, for example, log_min_duration_statement set to one second.
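For reference, a minimal sketch of the setting Nikolay mentions, assuming you have the privileges to change server settings; the one-second threshold is just an example:

    ALTER SYSTEM SET log_min_duration_statement = '1s';  -- log every statement slower than 1 second
    SELECT pg_reload_conf();                              -- apply without a restart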
And here we can discuss percentiles, right? So it's not that every single query should be below one second or below 100 milliseconds; we say 99% of them should be below the threshold, right? Any big project should talk in these terms. But sometimes we have a lot of slow queries and nobody cares until the big feedback loop kicks in. The big feedback loop is when a lot of users start to complain and management understands it's a big problem.
Churn, even, right? Like, you hear some users
complain, but you realize that it's the ones that are churning that are the ones complaining. They're the ones, right? Well, you know, it's real dollars that are being affected. Or, you know, in e-commerce, maybe your conversion rate on the pages.
It's well known how to measure every second of downtime, or at least every minute. Degradation, yes, is trickier to measure in dollars, but maybe it's also possible; I think it's possible. Again, if you look at the slow query log in a large company or large project, there are usually a lot of bad things. I mean, if you have the luxury of getting a few weeks for optimization, there is always plenty there to fill them with, right? But yeah, it's a decision from management: let's optimize and be prepared for further growth.
Yeah. So I've seen some exceptions, I guess. I've read your post and I think it's great, and I think you're probably accounting for them because they probably fall a little bit into the analytical workloads. But some applications almost do analytics as OLTP. They might be an analytics product, and if they let you set a few filters and it takes a few seconds to load, sometimes that's acceptable to a user. But that seems like the exception, where the user understands that it's doing something complicated or data-heavy behind the scenes. And if they're waiting for that on a one-off basis, and it's only a few seconds, they seem okay with it. But yeah, I think those are probably the only exceptions.
For some complex search, it may be okay. But usually
the question is what will happen in a few years, when we have a lot more data than today. If today we have 10 terabytes, in a few years we might have 50 terabytes. What will happen? So if it takes three seconds today, it may already be above 10 seconds in the future. And of course, people are usually okay to wait a second or two when performing a search, in my opinion, but it's still not good. We should optimize.
Yeah.
And in the examples I've seen, where you mentioned people scaling and having problems, sometimes these startups bring on a customer who has three or four times more data than all of their other customers combined. You know, when they're in an early phase, that's not that unusual, and if they haven't tested, sometimes performance drops off a cliff or something goes badly wrong. We've seen that a few times. So yeah, it's super interesting to think about planning ahead, but most companies don't seem to because, as you say, the focus has to be on feature delivery, and investors need updates, and their customers want certain things delivered in certain timeframes, that kind of thing.
So what can we do?
What can we do to help people?
Well, the first thing is to define some thresholds, as you said. If some SREs are involved, they have methodologies to control uptime and so on, and a similar approach can be applied here. We can say we have good quality if, for example, 99% of all queries are below 100 milliseconds. We define it in the form of an SLO, a service level objective, and we start monitoring it and having alerts if it degrades. The second thing: from time to time, at least once per quarter, we perform an analysis of the current situation and also try to predict the future, so some capacity planning. Are we okay with the growth in the numbers we observe, or do we predict our thresholds will already be broken next quarter? This is the usual approach for a growing project. And then, of course, we should go down and perform query analysis and optimization, analyzing the whole workload. But that's a whole different topic, probably. We should discuss it separately.
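As a rough starting point for that kind of analysis (not a true percentile), here is a sketch of a pg_stat_statements query showing which normalized queries exceed a hypothetical 100 ms threshold on average. It assumes the pg_stat_statements extension is installed, and uses the Postgres 13+ column names (older versions use total_time and mean_time):

    SELECT queryid,
           calls,
           round(mean_exec_time::numeric, 1)  AS mean_ms,   -- average per-call time
           round(total_exec_time::numeric, 0) AS total_ms,  -- total time across all calls
           left(query, 80)                    AS query
    FROM   pg_stat_statements
    WHERE  mean_exec_time > 100                             -- the hypothetical 100 ms threshold
    ORDER  BY total_exec_time DESC
    LIMIT  20;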
Yeah, well, and the slow query log looks like a great way of getting started, right? The log_min_duration_statement setting. If people don't have that turned on, it feels really sensible for pretty much everybody.
I've seen startups that don't do any monitoring yet.
They don't do any performance monitoring.
So the idea of even getting the 99th percentile
might be a stretch.
But if you just start logging the absolute worst queries...
Yeah, I saw this. What I described is for larger organizations, so it already sounds like some bureaucracy is involved, right? Some processes, but it works. In a larger organization, we need a slightly more complex process to be established. But if it's a small startup with only a few engineers, I saw this: a CTO with a couple of engineers, a very small startup, which by the way was very successful and was recently sold to a very large company. In the beginning, like three or four years ago, I saw them, the CTO, having every query that went to the Postgres log, because of log_min_duration_statement set to 500 milliseconds or so, sent to Slack. An immediate reaction.
So they tried to achieve zero events for this.
In the beginning, it's fine.
But this process doesn't scale well.
At some point, you will be overwhelmed, of course.
But in the beginning, it's good. You react to every occurrence of a slow query.
That's great. Anything else you wanted to cover on this one?
Well, not
at this point, I think. It's enough
about slow queries. Maybe we can talk about slow
transactions because it's quite a different topic. Related but different, right?
Yeah, go for it.
So, you mentioned analytical workloads, but it looks like we mostly discuss OLTP workloads here, meaning web and mobile apps. If we talk about transactions, we should understand that there are queries, transactions, and sessions, three levels. To execute a query, you need to have a transaction. Even if you don't define one, it will be defined automatically for a single query. It's an implicit transaction anyway.
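A small illustration of that point, with a hypothetical table: a standalone statement runs in its own implicit transaction, so these two forms behave the same here:

    UPDATE accounts SET balance = balance - 10 WHERE id = 1;   -- implicit transaction

    BEGIN;                                                      -- explicit, but equivalent in effect
    UPDATE accounts SET balance = balance - 10 WHERE id = 1;
    COMMIT;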
I hate it when Ruby developers say, I will run this database migration without a transaction. It's not possible. And this disable_ddl_transaction, it's the wrong term. You cannot disable transactions in Postgres. But still, this weird term is used. So you need to open a transaction, but you cannot open a transaction if you don't open a session. So it's one inside another.
And there are two problems with long transactions. First, if you acquire some exclusive lock and keep it, it's always kept until the very end of the transaction, either commit or rollback. You need to finish the transaction; otherwise, you're still keeping the lock. So it means you can block others. And the second problem is autovacuum. If you keep a transaction open, even if it's a read-only transaction, and sometimes even if it's on a replica, on a standby server, if hot_standby_feedback is on, it means that autovacuum cannot delete freshly dead tuples, right? So we block some autovacuum work, at least partially. But the first problem is the one that can be seen immediately. If you keep a long transaction, you can have a storm of locking issues. So this means that ideally transactions should also be below one second, right?
Oh, interesting. Yeah.
Right?
Because otherwise, imagine you acquired a lock at the very beginning of the transaction, and someone else is also trying to acquire this lock. So you updated a row and someone else also tries to update the same row, and this happened at the beginning of your transaction. If you keep it open longer than one second, you may block that session for longer than one second, so our previous topic will be broken as well, right? So it's a similar thing. That's why I always say, when you split work into batches, try to find some batch size that allows you not to exceed roughly one second. Exactly because of this. Right?
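A hedged sketch of that batching approach, with hypothetical table and column names; each statement runs as its own short, auto-committed transaction, and the LIMIT is tuned so a batch stays well under roughly one second:

    UPDATE orders
    SET    status = 'archived'
    WHERE  id IN (
             SELECT id
             FROM   orders
             WHERE  status = 'stale'
             ORDER  BY id
             LIMIT  1000          -- batch size: adjust until one batch takes well under ~1 second
           );
    -- the application repeats this statement until it reports 0 rows affected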
Awesome.
So going back to the folks that don't have much of this set up at all, what would you
recommend them logging or monitoring?
Yeah.
Well, every monitoring system should have monitoring for long transactions. And I see most monitoring fails to have it. Just fails; they don't have it. We open the monitoring and we cannot answer a simple question: what is the longest transaction right now? What's its duration? Is it five minutes? Is it one hour? We are interested in this. We're not discussing autovacuum and the xmin horizon here, just this simple question.
And I think this is the first step. You should have it in any monitoring; if it's missing, you should add it. And then you should have alerts, again, soft or hard alerts. If some transaction is running longer than 10 minutes, it's already not good, already a bad situation. Of course, sometimes you should exclude some transactions; for example, a vacuum can run longer. But regular transactions should be under control. In OLTP, we cannot allow somebody to just open a transaction and keep it forever.
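A simple sketch of answering that question from pg_stat_activity; filtering out autovacuum and other background workers is left to taste:

    SELECT pid,
           backend_type,
           now() - xact_start AS transaction_age,
           state,
           left(query, 60)    AS current_query
    FROM   pg_stat_activity
    WHERE  xact_start IS NOT NULL
    ORDER  BY xact_start             -- oldest open transactions first
    LIMIT  5;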
Yeah. So I know you're focusing on the first part of this, but the second part does feel worth mentioning, in terms of avoiding transaction ID wraparound. I know some small companies shouldn't get anywhere close to it, at least for a long, long time. But we've seen big companies over the last few years get tripped up by that, and it feels like maybe they weren't monitoring for this kind of thing, even at these large companies with massive scale.
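A common starting point for that kind of wraparound monitoring, as a sketch: how old each database's oldest unfrozen transaction ID is (the closer it gets to roughly two billion, the closer to trouble):

    SELECT datname,
           age(datfrozenxid) AS xid_age
    FROM   pg_database
    ORDER  BY xid_age DESC;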
Right, right, right. One of the consequences of keeping transactions open very long is that you block autovacuum as well, and it cannot delete freshly dead tuples. Sometimes people open transactions and keep them open for several days. If modifying queries or transactions are coming at a very high rate, it can be a problem as well. But I observe that usually every monitoring system already has this; I'm less concerned here because, for example, Datadog has it. Others also implemented it, exactly because of the very painful experiences at MailChimp and Sentry several years ago. It's good that those guys, by the way, blogged about it.
That's visibility.
Yeah, blogging is very important.
So I guess even in my mind, I don't have a very good, ideal model for monitoring, because we should distinguish various types of transactions: modifying ones, which acquire locks and also have a real transaction ID, or read-only transactions on a standby, and different things like that, right? So maybe we should have a couple of monitoring views aimed at different purposes: one for locking issues, another for autovacuum issues. By the way, we could monitor a different thing, not transaction duration but the xmin horizon, and only then care where it comes from: from a transaction on our primary, from a replication slot, or from somewhere else. So maybe we should have different monitoring charts and metrics for these different purposes, right?
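A rough sketch of that kind of split, looking at what might be holding the xmin horizon back from two angles: backends (on the primary, or walsenders serving standbys with hot_standby_feedback), and replication slots:

    SELECT pid,
           backend_type,
           backend_xmin,
           now() - xact_start AS xact_age
    FROM   pg_stat_activity
    WHERE  backend_xmin IS NOT NULL
    ORDER  BY age(backend_xmin) DESC   -- oldest xmin first
    LIMIT  5;

    SELECT slot_name, xmin, catalog_xmin
    FROM   pg_replication_slots;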
Yeah. The other thing that interplays with this, that I think I've heard you speak about before, is timeouts, to protect against all of these.
Yeah, I recently had one company, already quite big, and they experienced a lot of issues because of a lack of timeouts.
And once again, we saw the problem that Postgres doesn't have a very important handle, actually.
Somebody should develop it.
Maybe I should develop it, but I'm not a hacker anymore.
First, what do we have?
If we talk about the session, transaction, and statement levels, we can limit a statement using statement_timeout, but we cannot limit a transaction. There is no way to do that in Postgres at all. That's it.
Can we do session timeouts?
Sessions, I think, are usually handled if the connection comes through PgBouncer, or we can limit idle sessions, so we can say drop the connection. I don't remember the concrete setting names off the top of my head, but I think this is possible. But the session is less of a problem; it's not a big problem. For transactions, I would prefer to have a way to limit transaction duration, but there is no such thing.
But if we kill the session, that kills the transaction, right? I guess only if it's idle.
Yeah.
Interesting.
But if transactions are small, the session, for example, can be very long. I mean, you established the connection yesterday and continue working politely, with very small, very brief transactions. Why should we kill you?
We can limit a statement, and we can limit breaks between statements inside a transaction. That's called idle_in_transaction_session_timeout, a very long name. So we can limit statements, and we can limit breaks, pauses, between statements. And everyone should do it, actually. I think any OLTP project should start with this. We had a discussion on Twitter some time ago, and I'm a big fan of very limiting global defaults for OLTP projects.
Those who need to extend them can do so per session or per user. But in OLTP, I prefer to see statement_timeout at 30 seconds, and idle_in_transaction_session_timeout also at around 30 seconds, sometimes even 15.
Imagine a transaction which consists of a chain of very small, very brief statements with small pauses between them. You don't break any of these timeout settings in this case, and yet your transaction can last hours. And I've seen it. Yeah. So it's bad.
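A hedged sketch of the aggressive global defaults described here, assuming the privileges to change server settings; individual sessions or roles that genuinely need more can override them:

    ALTER SYSTEM SET statement_timeout = '30s';
    ALTER SYSTEM SET idle_in_transaction_session_timeout = '30s';
    SELECT pg_reload_conf();

    -- hypothetical role that legitimately needs longer statements:
    ALTER ROLE reporting_user SET statement_timeout = '5min';

    -- note: even with both settings, a chain of brief statements with brief pauses
    -- can keep a transaction open for hours; there is no transaction-level timeout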
It's a really important point. Yeah. And it seems actually potentially very tricky.
When do you see people using multi-query transactions?
So like, let's say the Rails app that we were talking about.
Well, with long transactions, we have two risks. Again, locking issues, so you can block others, and it can be very painful. And autovacuum.
So what I saw is that people don't understand this simple idea that locks are released only at the very end. They split work into batches, but these batches are inside one transaction block, and it's awful. We perform small updates, everything is fine, but the transaction lasts one hour, and the locks associated with the updates that were at the very beginning are still held. So you have a lot of blocked concurrent queries, and those victim queries can, of course, reach statement_timeout and fail. So we see degradation. In the best case, statement_timeout will save us from a chain reaction, but it's still not good.
Yeah.
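To illustrate the difference with hypothetical names: in the first form, every row lock taken by the first batch is held until the final COMMIT; in the second, locks are released after every batch:

    -- anti-pattern: all batches inside one transaction block
    BEGIN;
    UPDATE orders SET status = 'archived' WHERE id BETWEEN 1    AND 1000;
    UPDATE orders SET status = 'archived' WHERE id BETWEEN 1001 AND 2000;
    -- ... many more batches, possibly much later ...
    COMMIT;

    -- better: one short transaction per batch
    BEGIN;
    UPDATE orders SET status = 'archived' WHERE id BETWEEN 1 AND 1000;
    COMMIT;
    BEGIN;
    UPDATE orders SET status = 'archived' WHERE id BETWEEN 1001 AND 2000;
    COMMIT;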
That's why I think that by default in OLTP, transaction duration should also be limited. And I saw people implementing this on the application side, which is, of course, a weak implementation, because you may have multiple application code parts, sometimes in different languages, and somebody can still connect using some other tool and so on. So I would rather see it on the Postgres side, but it doesn't exist yet.
Yeah.
And just to go back to something you said earlier,
so if you set quite an aggressive one,
let's say even if it's 30 seconds as a timeout,
if we're doing a big data migration
or if we need to do a big schema change
or add a large index,
we might need to override that.
Yeah, okay.
I'm guessing if we create an index concurrently...
You should set statement_timeout to zero, right?
Yeah.
Right.
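A small sketch of that, with a hypothetical index name, for that one maintenance session only:

    SET statement_timeout = 0;                                   -- allow the long-running build here
    CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);
    RESET statement_timeout;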
But there is another topic, which we should probably cover next time, about the lock_timeout setting. Let's keep it outside of today's discussion.
So yeah, exactly, this is a good point. Wonderful. Definitely, I think we've done all right there.
What I write about it, actually: sometimes people say, we don't want to have a timeout because it's painful, you reach it, you fail. Well, if you have 15 seconds and then you fail, you see it immediately. But if you have 30 minutes, for example, and you fail after 30 minutes, the damage is already there, and the feedback loop is huge. That is painful. But small statement timeouts are not that painful: people see them, people adjust settings, and go.
That's a really good point, I'd not considered that: the smaller it is, the less painful it is. Of course, the user doesn't see what they're expecting to see and there's a problem, but in the grand scheme of things it's a much smaller problem than you'd have had if it had been minutes or longer.
Yeah, great point. Right, small timeouts are good in terms of user experience.
Because it's like fail fast and adjust.
That's it.
Right.
Awesome.
Is there anything else you wanted to cover on that?
Maybe that's it for the first episode, the pilot, right?
Yeah, let's see what people tell us on Twitter, where we can get feedback.
Yeah, Twitter would be great. I'll put some links in the show notes so you can find us. Yeah, please let us know what you think, what you want to be discussed; any questions you have are really welcome. Thank you, Nikolay. See you next week.
See you. Bye.