Postgres FM - Mean vs p99
Episode Date: June 13, 2025

Nikolay and Michael discuss looking at queries by mean time — when it makes sense, why ordering by a percentile (like p99) might be better, and the merits of approximating percentiles in pg_stat_statements using the standard deviation column.

Here are some links to things they mentioned:

Approximate the p99 of a query with pg_stat_statements (blog post by Michael) https://www.pgmustard.com/blog/approximate-the-p99-of-a-query-with-pgstatstatements
pg_stat_statements https://www.postgresql.org/docs/current/pgstatstatements.html
Our episode about track_planning https://postgres.fm/episodes/pg-stat-statements-track-planning
pg_stat_monitor https://github.com/percona/pg_stat_monitor
statement_timeout https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-STATEMENT-TIMEOUT

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork
Transcript
Hello, hello, this is Postgres FM. I'm Nikolay, Postgres.ai, and as usual with me is Michael, pgMustard.
Hello, Michael.
Hello, Nikolay. How's it going?
Everything is all right — a lot of work. There is progress on many fronts, so yeah, I'm excited to discuss quite an important topic. Let's see what we can learn from this discussion — mutually learn, or something.
Yeah.
I'm glad you think it's interesting.
It's interesting because it's always with us.
We, when we deal with performance, we need to work with many numbers and
this topic pops up all the time.
Yeah.
So, specifically, the topic is: when we're looking at latencies — you know, monitoring or troubleshooting, maybe prioritization — we often need to look at queries by some measure of their duration. Often — and I see this in blog posts, in tools, in various places — they're ordered by the mean or average of their latencies. That's very easy to get from pg_stat_statements, for example; it has a dedicated column. But in a lot of other places, different types of monitoring tools — and I'd say more on the application developer side — I'm increasingly seeing people move to monitoring the P95, P99 type latencies. So percentile-based, the tail latencies, not the average. So I've been wondering about that. And yeah, I recently saw another feature request for the ability to see these in pg_stat_statements, and I realized it's something we can approximate, but not something we can get from it directly. So I wrote a post about it recently, and I'm glad you find it interesting too.
Yeah. So, that's an important topic, I think,
because we all work with averages or mean values all the time.
And it's not only about Postgres, right?
The mean value — the average — is often misunderstood,
and it's often misleading.
I remember the idea that in any organization, if you check the mean or average salary, most people might find out: oh wow, this average salary is much above my own salary.
What's happening here?
Well, what's happening?
A couple of dudes have huge salaries, that's why.
So it means that, back to Postgres, back to queries: you see the average or mean time is like one millisecond. Oh, not bad, not bad, right? But how many queries are much above that? It might be a lot. And I must note that the mean time columns — mean_exec_time, mean_plan_time — are redundant; they could be avoided, because we can always calculate them by dividing the total by the number of calls, right? Or by plans, in the case of planning time.
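For illustration, a minimal sketch of that point (column names as in pg_stat_statements on Postgres 13 and newer, where the planning counters exist even if track_planning is off):

```sql
-- mean_exec_time and mean_plan_time can always be derived from totals
SELECT
  queryid,
  total_exec_time / calls            AS derived_mean_exec_time,
  total_plan_time / NULLIF(plans, 0) AS derived_mean_plan_time
FROM pg_stat_statements;
```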
By the way, reading your article, I noticed you optimistically involve planning metrics. Unfortunately, in 99% of cases we don't see them, because they are off by default. I remember last year, or the previous year, it took some effort for us to prove that you can basically turn them on. But unfortunately — we had an episode about this — the performance cliff will be two times closer to you if you turn it on. Still, my default recommendation is to turn it on, because it's very valuable information.
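For reference, a minimal sketch of enabling that setting (it is off by default and can be changed without a restart; weigh the overhead discussed in the track_planning episode):

```sql
-- Enable planning-time metrics in pg_stat_statements
ALTER SYSTEM SET pg_stat_statements.track_planning = on;
SELECT pg_reload_conf();
```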
So again, back to averages: we could always calculate them dynamically. It's not a big deal. But I guess it was just a matter of convenience to add them and keep them pre-calculated in pg_stat_statements and other similar extensions. But it would be really great to see the percentiles, right? Or better, to see a whole histogram of what's happening. Unfortunately, I guess it's a very complex topic. I mean, I know pg_stat_monitor implements this, but we need to check the overhead. That's the critical part.
And also, it's more columns — we've already got a lot of columns in pg_stat_statements. So there's additional overhead, both in tracking but also in terms of user experience. And I mean, I think at this point we have so many columns that, why not have a few more — but it is a problem.
Like, I think just-in-time compilation got a bunch of additional columns for every single part of it. And there has been discussion about whether those should be collapsed down, because do we really need granular reporting there?
But yeah, I would be in favor of some histograms.
I also haven't yet seen any analysis of — so, pg_stat_monitor is an alternative to pg_stat_statements, relatively new in the grand scheme of things, probably a few years old at this point — but I don't see it in many installations, and I haven't yet seen anybody publish benchmarks on its performance relative to pg_stat_statements.
It was created at Percona, and I remember I discussed it with Peter Zaitsev, the founder of Percona, when this extension was created. They said that instead of contributing to pg_stat_statements, they wanted a separate, independent release cycle. As we know, pg_stat_statements is part of the set of so-called contrib modules, which means it follows Postgres's own release cycle — once per year. And if you want to release something a few times per year, that's hard. Then, I know the original creator left Percona, and I don't know what's happening with the development of this extension. But when it was created, our team did a first look — I remember we did some analysis — and early versions looked promising but challenging in terms of performance overhead. And of course, if you talk about RDS and the others, they don't have it. I saw it on maybe one or two platforms only. So yeah, performance overhead is a big question. That being said, I don't know the current performance overhead.
It would be great to have histograms and so on, right? So, percentiles. And there are two more aspects here in my approach. First, in many cases I don't care about the average at all, because — as we've also discussed several times — depending on our goals it might be relevant or not. For example, if we target optimizing resource utilization on this server, we care about total time and total values, because that's what actually shows the picture. A query might have a tiny average and a tiny standard deviation — all query executions are similar, and planning occurrences are similar — but there are so many of them, like 10,000 QPS, queries per second. This is going to load our server a lot, and we need to optimize this query in spite of its good average values. The total will be high. So averages, as I feel it, you mostly need when you target delivering a good experience to your users. Otherwise, don't use them.
Yeah, you're completely right, obviously. If we're trying to reduce utilization, there's no point looking at things by how slow they are on average, or even at the extremes at the P99 level, because if they don't run that often, they're not the things taking the most time. So ordering by totals — whether that's total time, which I still like to use, or some combination of buffers if, for example, the resource you're constrained by is IO, which could be the case, or if you're paying by IO, like on Aurora or something — then ordering by something else could make sense. But yeah, totals always make sense for those cases.
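A minimal sketch of that kind of ordering (the buffer expression is just one possible IO-oriented alternative, not the only reasonable choice):

```sql
-- Top consumers by total execution time; swap the ORDER BY for an
-- IO-oriented metric if IO is the constrained (or billed) resource
SELECT
  queryid,
  calls,
  total_exec_time,
  shared_blks_hit + shared_blks_read AS shared_blks_accessed
FROM pg_stat_statements
ORDER BY total_exec_time DESC  -- or: shared_blks_accessed DESC
LIMIT 10;
```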
But yeah, this is for those other cases, where you care about user experience — perhaps how long a dashboard takes to load for a large customer, or the boss has complained, that kind of thing. Or — we've talked a few times about the issues with long-running queries — you might be investigating: what queries do I even have that run long sometimes? So I think that's the other use case I thought of for why you might want to order by some version of either an average or a percentile.
Yeah, let me argue with myself — my favorite thing. So, when we optimize for lower resource consumption, we look at totals. But at the second step, we should look at the average, because we need to separate average and frequency, which multiplied bring us to the total. So we need to go down and see: is it about a too-frequent query, or about a not very frequent but very slow query, and decide which optimization methods to apply, right? And in this case, I'm wondering: does it really matter to look at standard deviation, basically — to think about percentiles?
I think you're conflating two things here,
looking at something versus ordering by it.
And yes, when you're looking at resource consumption,
you want to look at average time sometimes,
but you don't order by it.
Crucially, you need to order by total time.
And the same is true for standard deviation, I think.
I'm talking about adding maybe one, maybe two,
depending on exactly what percentile you want,
adding a couple of standard deviations to the mean
to order by it. Not necessarily to look at it, but just to get the initial
prioritization order, which thing should I start looking at? What are my top 10? Because
the order will be different. Assuming you have some variance in the distribution of your queries, and some of them have a longer tail than others, the order will be different if you order by mean plus a couple of standard deviations than if you just order by mean. So it's about the prioritization order; you then care about looking at a lot of the other columns, but you don't order by them. So I think that's a subtle difference.
Yeah, yeah, I agree. Order by standard deviation — it's interesting. I'm joking, I'm joking. I understand: you are thinking about ordering by, for example, average plus some factor times standard deviation. This is a smart idea, I think. And before we move to that idea, which is explained in your blog post — so again, we order by total time, and then we look deeper to decide which approach to use for optimization.
And still, maybe it also makes sense to have percentiles, for example, because we would understand: oh, this query has an average which is not that bad. But we don't have percentiles in pg_stat_statements, no histogram. Yet we see: oh, the standard deviation is so huge — it means some occurrences are okay and some are not at all. And this brings us to the idea: oh, we have instability here, depending maybe on data volumes, or maybe we are being blocked by someone. By the way, I would prefer to have this information for metrics other than time. For hits and reads — this is where I would really love to see that piece: standard deviation, or even better, percentiles.
Oh my goodness.
I'm just imagining the number of columns at that point.
Yeah. Because it always concerns me how much of that slowness was associated with the query being blocked by another query — just wait time. We don't know, and it's hidden here.
Yeah, I mean, it's a good point that these aggregates are always going to hide information; that is the nature of aggregation — we don't get all the information. It's the trade-off, right? We get some useful information in aggregate, but not all of it. I quite often see people include min, mean, and max in the queries they use to query pg_stat_statements, so you get kind of the full distribution. I could imagine adding a couple more, either side of the mean: what's the mean minus the standard deviation, what's the mean plus the standard deviation.
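A sketch of such a query (min, mean, and max are real pg_stat_statements columns; the ± standard deviation bounds are derived):

```sql
-- More points on the latency distribution for each statement
SELECT
  queryid,
  min_exec_time,
  mean_exec_time - stddev_exec_time AS mean_minus_stddev,
  mean_exec_time,
  mean_exec_time + stddev_exec_time AS mean_plus_stddev,
  max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC;
```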
Without standard deviation, min and max can deceive: you could have a million calls, and just one of them was so far out that it affected your whole picture — while standard deviation brings some good sense, right?
Yeah.
I mean, I guess min is possibly more interesting than max, but yeah, of course, sometimes there'll be one parameter set that causes the critical case to behave very differently.
Zero rows returned — we have a min close to zero; it doesn't make any sense, right?
Good point, yeah.
Yeah, but by including more, you get a better picture of the distribution — more points on the graph to get an idea of quite how this is distributed. Is that max an outlier, or are there quite a few queries running nearly that long? Without some measure close to P95, P99, that kind of thing, I think it's really hard to know. And I think possibly what's happening at the moment is that quite a few of these dashboards, monitoring tools, diagnosis tools give you an order by mean when you look at them — and that's fine; it's useful, but it could be more useful. You could maybe start with the more problematic queries if you ordered them by a higher percentile and started from the top there.
Yeah. Because it's like we say: oh,
on average we deliver everything in like 10 milliseconds, everything is good. Or one millisecond — again, for queries we ideally want to be below 10 milliseconds, definitely not 100 milliseconds, that's too much. So we say it's good. But what if almost 50% of executions were one millisecond and almost 50% were... how many?
Wait — what did you say the average was? My maths is going to be really bad.
I was trying to say you might have two segments, or clusters, of cases.
One is very close to zero; another is very bad.
And they call that a bimodal distribution.
Yeah.
Perhaps because of a different plan for different amounts of data.
Maybe like...
Different parameters, maybe, or maybe these locking issues.
And then you observe something in the middle.
Well, maybe 50-50 is a bad example, because we are going to be in the middle. What if, like, 70% are very close to zero and 30% are very far from zero, and those users really struggle with performance, right? And we see the average is not bad — it's going to be shifted towards zero, because... oh no, it's not going to... okay, it depends, right? But the average will tell us it's not that bad, and we can leave it.
It's very easy to imagine distributions that on average are below some threshold we might not care about, like 50 milliseconds or something. Maybe we set our threshold there; on average, they're 40, 45. But for a non-zero — you know, a decent — fraction, maybe it's P95 or P99, so we're talking about either 5% or 1% of executions, it's running for hundreds of milliseconds.
And we've talked in the past about how...
Or seconds, go on.
Or even seconds, yeah, exactly.
So those are durations noticeable to humans,
even though on average it's running in 45 milliseconds.
So it's very easy, I think, to hide things that on average maybe aren't perceptible to humans, but if the tail's long enough, then for quite a lot of people — let's say it's 5% of people — that's one in 20 times somebody's loading that dashboard or using that feature. One in 20 sounds quite often.
Yeah. If it's one in a thousand, well, things happen. We can say: okay, we have a very large user base; we need to take care of everyone, but we cannot — it happens, right? But this approach works only if your P99 is good. Okay, one percent is not that good — but if it happens to 20% of your users, they suffer, and you missed that fact just by looking at the average. That's not good.
Yeah, exactly.
Well, imagine, going back to our example, you've got two queries that mostly look the same in terms of average. Let's say one is 45 milliseconds on average and the other is 50 milliseconds on average. But the one that's 45 has a P99 that's 10 times that — 450 milliseconds — and the one that's 50 milliseconds has a P99 of only 60 milliseconds, only a little bit higher. I would much, much rather start with — I'd rank higher — the one that had the higher P99 in that case, even though it has a lower average duration. So that's kind of the whole point.
Yeah, let's explain the idea.
Yeah, the idea is simple. We don't have buckets, we don't have percentiles, we don't have a histogram — we cannot draw it — but we have standard deviation. In all the analysis we've done over the years with pg_stat_statements and other things — pg_stat_kcache, maybe pg_wait_sampling, and so on — we always look at throughput and average latency, that's it. We know — you mentioned it, the article mentioned it — that on the application side, some monitoring systems bring you P99, P95. That's great: two values, even better than just one. And sometimes colorful, good-looking graphs. But in Postgres we somehow don't do it, even though we have standard deviation. And in which Postgres monitoring is standard deviation really appreciated?
I don't know.
I don't know of any either, yeah.
We don't have P99 or P95, P90, nothing like that. But what if we assume the distribution is — well, the assumption is that it's normal; it's not exactly normal, maybe half-normal. But the idea is: let's take the average and the standard deviation and some multiplier. So we have a formula — average plus some multiplier times standard deviation — and we get a kind of P-something, P90-something, kind of, right?
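A sketch of that formula as a query, in the spirit of Michael's blog post (the 2.33 multiplier is the normal-distribution z-score for p99, about 2.326; under a different distribution it approximates some other percentile):

```sql
-- Order statements by an approximate p99 rather than by mean
SELECT
  queryid,
  calls,
  mean_exec_time,
  stddev_exec_time,
  mean_exec_time + 2.33 * stddev_exec_time AS approx_p99_exec_time
FROM pg_stat_statements
ORDER BY approx_p99_exec_time DESC
LIMIT 10;
```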
Yeah, I called it approximate only because it really does assume a normal distribution — not that standard deviation only makes sense there, but you can only approximate percentiles this way if you assume a normal distribution. And I made the case that I think it will be quite rare for you to have queries — or at least problematic queries — that are normally distributed. I suspect a lot of the problematic ones are going to have long tails. So it definitely isn't a safe assumption that they're all going to be normally distributed. But once you factor that in, I think this is still useful. I think queries that have a low average and a higher standard deviation — where this ordering change makes a difference — are probably going to have, well, are definitely going to have high P99s as well. It's still directionally correct — I think I used that phrase — so it still points you, probably, in the right direction. This order is probably not going to be that different from the order by true P99. So it's a step in the right direction. It's not necessarily accurate, but it's directionally correct.
Yeah, I agree.
And since I suck at statistics, as listeners should already understand by this point, I talked to my daughter, who is learning data science at UCSD right now.
Nice.
And yeah, she told me: oh, this is quite well known — for a normal distribution, you just take the standard deviation times two to both sides. This is how you can get P90, she said, or P95, I forgot. So this is reasonable, she said. But then, of course, it's not a normal distribution, and that's a big assumption.
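For reference, the rule of thumb restated (assuming a normal distribution, with one-sided z-scores):

$$x_p \approx \mu + z_p \, \sigma, \qquad z_{0.95} \approx 1.645, \quad z_{0.99} \approx 2.326$$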
And what I thought — idea number one I had, and maybe I will encourage some students or some people who are willing to do something here — what if we had some raw data from somewhere? I don't know who can afford logging all queries with durations; last week we talked about how dangerous it can be to log all queries. But imagine we collected some raw data and compared it with what's present in pg_stat_statements. What I'm trying to say is, we could understand the nature of the distribution, maybe by observing many, many cases from OLTP — like web apps, mobile apps. And then we could probably find some heuristics-based multiplier, right, which would be closer to reality — to avoid this assumption that it's normal. But it's big work. You need to collect a lot of data, process it, and so on. What do you think?
Well, it would be different for different queries... I think it's a tricky problem, unless you actually approximate it while you're monitoring.
Look, I understand what you said. But if we take the assumption that it's kind of normal, that sounds to me like a fragile assumption. If we learn how some OLTP system — social media or something, with all these queries: select from users, select from something, join some tables, and so on — how these queries behave, at least a few dozen of them, then this multiplier, learned from practice, I'm pretty sure will be very close in another system.
Yeah, actually, that's a good point. You could get more accurate with the number — like the exact multiplier — to get something more likely to be a P99 measure. It might turn out that the number I've suggested based on the normal distribution is actually more likely to be a P96 or a P97 instead of a P99. Does it make that much difference?
If we just order by it — no, I agree with you. And what I'm proposing breaks the Pareto principle, right?
Yeah, it's a lot of work for arguably not that much additional value. I think if someone's going to put a lot of effort into this, it'd be much better to actually track this and put it into pg_stat_statements.
Well, that will be, again, an overhead question. It would be great to have, but okay, I understand. What I'm proposing is going to be outside of the P80 of the effort we're going to have... I like it. P20, P20, I don't know. P80 of the value, P20 of... okay.
So, another thing, another thought I have. I've noticed over time that when I run some benchmarks — very often with pgbench — I'm absolutely unsatisfied with the output. So I use — we have it in automation as well, and all my team members use it — the lowercase -r option, to report results at the end for each statement used in the transactions. Sometimes we use multiple files with different transactions, with some weights assigned to them. That can be achieved with the -f option and the @ sign. It's a great feature, by the way, underappreciated.
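A sketch of such an invocation (the script names and weights here are hypothetical; -r, -P, and the -f file@weight syntax are real pgbench options):

```
pgbench -c 16 -j 8 -T 600 -r -P 30 \
  -f read_txn.sql@8 -f write_txn.sql@2 mydb
```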
But anyway, we see these reports and we discuss them: oh, it looks great; oh, it looks not that great. But I realized it's not enough. I want histograms, right?
Yeah.
And to mitigate that, at least somewhat, we started using the capital -P option. For example, -P 30 means it will report every 30 seconds.
Oh, interesting. Different P notation.
Yeah, yeah. It's reported as progress. So it reports latencies every 30 seconds, for example — or every 10 seconds if the benchmark is very fast, or every minute if it's longer; it depends. And from there we can already understand: oh, the distribution is like this. So we can get a feel for the distribution. It reports standard deviation as well, by the way, but it's still not enough. And there's another aspect here: it's not only about the distribution — P90, P95 — it's also about the distribution over time, because there might be a spike of latency, and then it gets better.
But still, what I was trying to say: there is an idea to implement histograms or percentiles. Maybe it's better to start in pgbench, not in pg_stat_statements first.
Yeah, interesting idea. I hadn't thought of that, but I get where you're coming from.
I think it'd be easier, right? You don't worry so much about... well, you do worry about overhead, but this is a tool used to make decisions and so on.
Well, it's also used...
But anyway, maybe it's easier because it's less risky. The idea is, it's not production.
So are you thinking, educationally, people could see that this is a useful thing? And once people start seeing it — oh, this is really helpful, why can't I have it elsewhere? — so it kind of builds.
Yeah, I think most people understand that this is helpful, but it's just risky, because pg_stat_statements has overhead — we know it — and it's probably increasing, because more and more things are being added. And again, pg_stat_statements, it's not about production usually. Depends, of course. And I think—
You mean pgbench is not about—
It's not used in production. It's a learning tool, for experiments.
Yep.
Yeah, and pg_stat_statements is for production. So it's harder to bring something heavy there, right?
Yeah, I agree.
pgbench — again, optionally — and see how it works. It's easier, in my opinion. And also, first of all, it's needed there. It's needed.
Yeah. It would be very useful to have it there.
Do you know if sysbench brings histograms or not?
Oh, no, I don't know. I don't remember.
Yeah, it brings percentiles, as I remember.
Nice.
How many buckets, or which percentiles?
I don't know — it just reports a couple. It brings P95, and I don't see any more. Anyway, it's already better than just standard deviation, I guess.
But I agree with your point: what if we order by just, maybe, two times the standard deviation?
Yeah — in the blog post, I did consider rounding to just two instead of using the multiplier that's actually an approximation of P99.
Yeah, exactly. Because at that point I'm saying: look, this is approximate already — why are you multiplying by such a specific number?
But then what are we approximating? It's like a different p-value. And I thought it would just be confusing to people — people reading the results, rather than people looking at the query. I figured that if you put these into dashboards and things, you're going to have more people reading the dashboards than looking at the queries behind the dashboards. So I'd rather have something confusing in the query than something confusing in the column header.
Makes sense. Yeah.
Yeah — quite strange to use such a precise figure to calculate an approximation.
You know what? I'll think about it next time we do some reports or dashboards. Maybe we will implement this approach. We'll call it P-question-mark-question-mark.
I suggested calling it approx p99, just so that you're really clearly leading with the word approximate.
So it's a mysterious percentile.
Yes. And even if you don't include it — it's probably confusing to look at a dashboard that doesn't include it in the results but still includes all the other columns — just ordering by it is then, I think, a useful...
Ordering, in monitoring, means another chart. Like...
Exactly.
Top-N. Top-N by what? That's the ordering. So instead of top-N by average, or in addition to top-N by average — this would be helpful to bring a sense of how users feel about your service.
I really respect monitoring tools and reports that try to limit the number of pages or dashboards they show, but most I come across don't seem to try to limit it too much, so I think: what's another chart? But if you are worried about quite how many charts you have, my argument is I'd rather have this than the one ordered by mean, which pretty much all of them include. So if you include that one, I would rather have this one.
Yeah. And do you think it's worth keeping a chart for the average, the pure average, or not?
Again, it depends on the constraints, right? If you've already got a hundred charts you're showing, then it's 99 versus 100 — why not include the average still? But if you're talking about only having two, then two versus three is quite a big difference. So I think it depends on the context.
And this chart is going to be a regular one, non-stacked, because you cannot sum this up — averages are not a cumulative thing, right?
Yeah.
And it should probably include only, like, the top 10 or something. It's not really a chart; it's just a top-N.
A table.
Yeah, like a table. Yeah, I think so.
A chart also matters, though. Again, a table is good, but it loses historical information.
Yeah, okay. So maybe your chart is your most important queries, looking at their P99 over time — but then you can include the mean on the same chart, can't you? I've seen those in monitoring tools, where you're looking at the same query.
For the chart, we can get the average for a specific period, just dealing with total and calls: we have a delta for total and a delta for calls, so we can have a pure average for a specific period of time. But what about standard deviation? We don't have it.
You can calculate it using the deltas — the same way pg_stat_statements calculates it every time a new query execution comes in.
Oh yeah.
Yeah, I was looking at the source code. I was like, oh, that's clever.
Yeah, yeah, exactly.
But you also have the option of snapshotting pg_stat_statements. Some people do that, right? Reset and snapshot — some monitoring tools or some providers reset it regularly.
It's not a good idea to do that often — you will have trouble under load, because adding new entries doesn't come for free.
Well, then we have deltas.
Deltas are a good idea. Yeah.
Well, standard deviation can also be computed for a specific period of time.
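A sketch of recovering the interval statistics from two snapshots (snap_a and snap_b are hypothetical tables copied from pg_stat_statements, earlier and later; this assumes its stddev is the population standard deviation, so calls * (stddev^2 + mean^2) reconstructs the sum of squares):

```sql
-- Delta (interval) mean and stddev between two pg_stat_statements snapshots;
-- GREATEST(..., 0) guards against tiny negative values from rounding
SELECT
  b.queryid,
  b.calls - a.calls AS d_calls,
  (b.calls * b.mean_exec_time - a.calls * a.mean_exec_time)
    / (b.calls - a.calls) AS d_mean,
  sqrt(greatest(
    (b.calls * (b.stddev_exec_time ^ 2 + b.mean_exec_time ^ 2)
     - a.calls * (a.stddev_exec_time ^ 2 + a.mean_exec_time ^ 2))
      / (b.calls - a.calls)
    - ((b.calls * b.mean_exec_time - a.calls * a.mean_exec_time)
        / (b.calls - a.calls)) ^ 2
  , 0)) AS d_stddev
FROM snap_b b
JOIN snap_a a USING (queryid)
WHERE b.calls > a.calls;
```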
And in this case, you can have this metric — the magic, mysterious percentile — and you can have a chart of, for example, the top 10 queries.
Yeah, and maybe also the average of everything.
Plus the standard deviation of everything.
Why? The standard deviation of everything?
Why not?
So, I don't know.
Well, actually, that's a good point — there are a couple of... I don't even know if I mentioned this in the blog post, but there's at least one major caveat here. A lot of people have a timeout, often like a 30-second statement_timeout or...
Not just statement_timeout.
A cancelled query won't be registered, so it quite seriously affects those metrics — the max, the standard deviation.
And it affects them in which way?
It's not registered, so you miss it — you miss an outlier. So it's bringing the average down, it's bringing the P99 down. It looks more positive than it is.
Yes.
But you know, if a query was cancelled, the job was not done, so we don't count it. As I remember, this is another point pg_stat_monitor aims to solve.
Yes. It's a common problem with pg_stat_statements and all its friends, like pg_stat_kcache, that cancelled queries are not counted.
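A small illustration of the caveat (the 30-second figure is just the example from the discussion):

```sql
-- Executions cancelled by a timeout never reach pg_stat_statements:
SET statement_timeout = '30s';
-- a statement cancelled at 30 s fails with
--   ERROR:  canceling statement due to statement timeout
-- and is not counted, so mean, max, and stddev all look better than reality
```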
And I think another big missing piece here, honestly, is that in pg_stat_database we have TPS — two metrics, committed and rolled back, right? But we don't have QPS, we don't have latencies or standard deviation. It should be there, actually. That's another idea: to register these metrics — calls, min, max, average, and standard deviation — at the database level, for everything, regardless of pg_stat_statements.
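A sketch of what is available at that level today (real pg_stat_database columns; the counters are cumulative, so rates come from deltas between readings):

```sql
-- Database-level commit/rollback counters (the "TPS" Nikolay mentions);
-- there are no calls/latency/stddev equivalents here today
SELECT datname, xact_commit, xact_rollback
FROM pg_stat_database
WHERE datname = current_database();
```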
And maybe involving errored-out queries. Maybe — I'm not sure, because if we involve them, the values observed by pg_stat_statements and at this higher level won't match at all.
Yeah, I was unsure what ideal looks like here. I did wonder: in an ideal world, would you add the timed-out query? Let's say one got cancelled at 30 seconds — would you add an instance of a query that ran for 30 seconds, and add that to the average, add that to the total duration? Would you want that, or would you not?
It's easy. Let's just have another pg_stat_statements — we'll call it pg_stat_error_statements — with a whole set of metrics. I'm joking, of course, but...
Well, it's not a bad idea though, right?
How many of each error type do we have?
And a query can be cancelled before statement_timeout fires. It can be cancelled due to deadlock_timeout, for example, or something else happens, I don't know — a kill, a pg_cancel_backend, for example, came and cancelled us. So yeah, it would be great to count all those things too, like buffer hits and reads.
Yeah.
So yeah, that's the long and short of it. Was there anything else you wanted to talk about?
I feel we went into some very dark corner of Postgres observability, you know?
Yeah. It definitely seems less explored than I was expecting. As your daughter said, this is not novel — it's not an out-there revelation. I was just surprised when I saw the feature request for it, and I've heard it a few times. I was thinking: why don't more people have this in their queries? Why don't I see it in blog posts or in monitoring tools?
And also for buffer operations and WAL operations.
Well, I know why we don't have that — standard deviation is not even available there.
Well, it should be available. Let's add it.
So yeah, my summary is: I'd love to see these in pg_stat_statements, but in the meantime, like you mentioned, if you're doing a consulting report, or if you've got a dashboard internally, consider adding this — or replacing the ordering by mean. Just don't call it P95; call it fake P95, mysterious P, fake-news P99.
Yeah, I liked your mysterious... what was it? MMP — magic P? Yeah, mysterious magic P. Okay?
Yep, absolutely.
Nice one, Nikolay. See you next week. Bye.
Bye.