Postgres FM - Over-indexing

Episode Date: October 20, 2023

Nikolay and Michael discuss over-indexing — what we mean by it, the regular issues people discuss about it, as well as a novel one Nikolay has come across and benchmarked recently.

Here are some links to things they mentioned:

- Nikolay's tweet on over-indexing: https://twitter.com/samokhvalov/status/1713101666629927112
- Heap-Only Tuples (HOT) optimization: https://www.postgresql.org/docs/current/storage-hot.html
- Our episode on index maintenance: https://postgres.fm/episodes/index-maintenance
- PgBouncer now supports prepared statements: https://github.com/pgbouncer/pgbouncer/releases/tag/pgbouncer_1_21_0
- Our episode on connection poolers: https://postgres.fm/episodes/connection-poolers
- Configurable FP_LOCK_SLOTS_PER_BACKEND (Hackers mailing list discussion): https://www.postgresql.org/message-id/flat/CAM527d-uDn5osa6QPKxHAC6srOfBH3M8iXUM%3DewqHV6n%3Dw1u8Q%40mail.gmail.com
- LWLock:lock_manager (Amazon RDS docs): https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/wait-event.lw-lock-manager.html

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is brought to you by:

- Nikolay Samokhvalov, founder of Postgres.ai
- Michael Christofides, founder of pgMustard

With special thanks to Jessie Draws for the amazing artwork.

Transcript
Starting point is 00:00:00 Hello, this is Postgres FM, and it's Michael. Hi, Michael. Hello, Nikolay. Oh, and my name is Nikolay. Right, so do you remember the number? Episode 68, I believe. Now 68? Oh yeah, great. And the topic today, I chose it, and it will be over-indexing. Next time we will talk about under-indexing, I promise. Right. Well, yeah, I mean, this is the age-old thing. You did a tweet series about over-indexing, and this is definitely kind of one big category of issue where both of us have come across people that have added a lot of indexes to certain tables, and that's caused a certain category of issue. But I'd actually say that in the majority of cases that I see where people have performance issues,
Starting point is 00:00:51 it tends to be under indexing that's the problem for for their cases but yeah maybe another a whole another topic for a whole another day right well right right right if you you start without indexes just with primary key maybe, maybe even without it, then you find out that indexes is a good thing. They didn't teach you when you learn SQL standard about indexes because they don't talk about it at all. But then you discover them. So, right, so we have phases here. Phase one, you discover indexes
Starting point is 00:01:26 and understand that lack of indexes hurts your performance, your database performance. But once you discover indexing, I bet you tend to do what we will discuss today. You create indexes too much. The silliest approach is to index every column.
Starting point is 00:01:55 Well, and worse, right? Because I often give that example and I forget that it's not even only as bad as that because we often, if you want to, let's say we're trying to optimize a specific query, often the optimal index is a multi-column index to help us get exactly the right conditions and get an index-only scan. And once you introduce multi-column indexes, it's not even just an index per column, it's an index per combination of columns in a specific order. So the number of possible indexes, even if we don't start to consider duplicates and overlapping ones, it's not even just the number of columns,es, even if we don't start to consider duplicates and overlapping ones, it's not even just the number of columns, right? It's way higher. In general case, the optimal index is multi-column involving some expressions instead of columns,
Starting point is 00:02:36 and also partial. Yeah, of course. Yeah. Well, okay. So nobody says we always need just one index per table. That's not so. Of course, we need multiple indexes. But what I'm trying to state is that having too many indexes hurts not only write statements, but also reading statements, selects. By the way, my tweet, it's just one tweet, it's not a serious. Serious is Postgres Marathon, which is about everything, not only performance, all things Postgres as well, and just with some practical examples. But in this case, so many people, when they pass this phase one lack of indexes and they are in phase two overindexed, they realize, and it's discussed a lot, that indexes hurt too many indexes. Extra index will slightly slow down write statements. And by the way, I made a mistake in my tweet saying that deletes are affected.
Starting point is 00:03:48 This is my usual mistake. I think I made it during our podcast some time as well. Deletes are not affected. Deletes just put xmax colon value to tuple header and they don't change indexes. So deletes are fine. Affected write statements are inserts and updates. For updates, there is optimization technique called HAT, HIP Only Tuples Update, which sometimes works, sometimes not, depending on a couple of important conditions.
Starting point is 00:04:20 But my statement is, selects are also slowed down and we can discuss things step by step. Let's talk about writes first. So again, inserts and updates. Why are they slowed down? Because when Postgres creates new tuple, each insert, each delete will create new tuple, even if you roll back the operation. If you roll back operation, they will be dead. New tuple will be dead, right?
Starting point is 00:04:49 And when new tuple is created, index entry has to be created in all indexes. It's called index write amplification problem in Postgres. One of the reasons Uber moved back to SQL. Very infamous story, right? And for inserts, there's nothing you can do because we do need to put additional pointer to each index. But for updates, there's hot updates, heap-only tuples. Two conditions needed to be met.
Starting point is 00:05:23 First is there should be additional extra space in the same page as alt tuple and second condition you should not change values in columns which are participating in index definition and it can be even where clause if it's a partial index. So the basic example here is if you have updated add column. And by the way, Jeremy Schneider from RDS team mentioned it in the thread on Twitter as well. So if you have updated add column, many people tend to have it. And then you have an index on it. Of course, when you change data, you change updated at value,
Starting point is 00:06:05 but since it's an index, you never will have hot update for such updates. It's impossible. So the idea is maybe you should not index it. Try to avoid it. But we might want it because probably we want like last 25 updated
Starting point is 00:06:21 rows or something, right? So what, like, let's discuss this. What's the best approach here? You have updated that column, you update maybe by trigger, for example. Each update changes this value. But what to do? I want fast
Starting point is 00:06:38 updates. I want hot updates. So tuple is the same page. This is one of the things I love about performance, that you just get hit with these trade-offs, and it's going to depend on your case, right? What's more important to you for that table? And what are you most limited by? What's your most precious resource in this case?
Starting point is 00:06:56 So I don't think there's a generic piece. I think people try and give generic advice. I've definitely tried to do it in the past as well. And sometimes it feels like you can get like an 80-20 rule. This one feels like it's, I know it's a bit of a cliche now, but it really does depend, right? Right, right. Well, but how, like, let's slightly discuss how it depends.
Starting point is 00:07:16 First of all, I forgot to mention, hot updates, it's when you, like, I explained conditions, but what is hot update? Hot update is when during update, Postgres doesn't need to touch all indexes. Because if you have 20 indexes, it's hard to touch them. It's expensive and slows down update. And this is because the tuple in the heap doesn't move page, right? It's on the same page, yeah.
Starting point is 00:07:43 And we get like, I can't remember the wording for it but it's like a little bit of extra information gets added to that so it's like a link additional jump inside page right so like index keeps uh the pointer to the same page same buffer basically like and when we read something we need to read whole 88 kbytes block, buffer page, and inside it, old tuple now says there is new version. It's super fast to just jump to new version inside.
Starting point is 00:08:14 And the space on the page thing is important as well, because if you want to optimize for heap-only tuple updates, tables by default have a fill factor of 100%, don't they? But we can reduce that. Artificial load.
Starting point is 00:08:29 Right. Yeah, but if we reduce that to something a little bit, giving a little bit of space by default on each page, yeah, sure, first our table takes up more space, but in future we might be able to get that net benefit of our heap-only tuple updates. Right. Or just here we just can benefit from slight bloat we have. Because if some space is left after some tuple became dead and then cleaned up by auto-vacuum. This is how, like, I mean, I'm sure most times in the world when hot
Starting point is 00:09:08 update is executed, it was in tables with fill factor being default 100. Yeah, probably. It's just working because autowacuum already cleaned up some old dead tuples. Right. So okay. And in this case, we, of course, we have an updated at column, we need to decide. If we have an index on that column and we rely on it a lot in our queries, for example, give me 25 lastly updated rows. In this case, probably we want to keep this index, but we understand that updating updated at will lead to degradation of this write. But, of course, we also need to see the full picture here. When we update updated at column, change it.
Starting point is 00:09:52 When we change other columns, maybe we already lost hotness feature of our update, right? Because we anyway changed something which is indexed. But in some cases I saw like and I recommended let's just get rid of this index on updatedAd. Our updates become very fast because we have too many indexes on this table. We need to update all of them. In this case, we have, for example, we have some payload which is, I don't know, like some JSON and it's not indexed. We change something there, and automatically we change updated add. In this case, if we don't have index on these columns,
Starting point is 00:10:32 neither of them have index. In this case, update is very fast. But how to find recently updated rows? I remember I wrote a couple of times relying on the index or either on ID if it's a sequential number or on created at column which is like default now
Starting point is 00:10:51 current timestamp. Trying to find like lossy search. You find more and then if it's not enough you get next. You search by created at but you try to find updated ad. It was some for maintenance, some approach.
Starting point is 00:11:09 I just wanted to, I knew I don't want index on updated ad. So I tried to find workaround. It's an interesting engineering task, but of course it depends on the case you have. But it's worth keeping in mind, such updates are quite slow. Right? Yeah. So and speaking of inserts, probably you just want to insert in batches if you can, because
Starting point is 00:11:33 in this case, many tuples will go to the same page. And it will be faster compared to single row inserts in separate transactions, right? So in this case, it might be faster. So we've got the right amplification problem. We've got hot updates, not only hot updates that can help alleviate that, but also it's another problem of over-indexing, right? If we've indexed loads, columns that we don't ultimately need, we prevent those. So there's both sides of that argument.
Starting point is 00:12:06 And then is there anything else you want to talk about on those before we move on to things like that? Of course. We had an episode about index maintenance. So the easiest way to improve the performance of right operations, excluding deletes, I will never forget anymore, you need just to perform index maintenance regularly. Like once per few months you check all unused indexes and redundant indices you have so by far and just remove them
Starting point is 00:12:33 yeah so by unused indexes we're using um is it i've written it down here is it pgstat user indexes and not not just on the primary ideally if you're, if you're using read replicas, also on read replicas. If you have multiple setups, check multiple setups because there are difficult situations. For example, some software you install to users, like to customers. Oh, yeah, like a multi-tenant thing. Not multi-tenant, the opposite. Multi-tenant is we have one database and many... Oh, yeah, like a tech multi tenant thing. Not multi tenant for the opposite. Not standard is we have one database and many, many Oh, sure. Or if not stand also interesting, if
Starting point is 00:13:11 you have multiple databases, probably you need or multiple tables, the same structure happens sometimes, you need to check them all to understand the pattern, maybe not of them all, like maybe just like 10% understand the structure. But I'm talking about when you install software, like for example GitLab, you install it, customers install it, and who knows,
Starting point is 00:13:32 maybe you checked several cases, index is not used, but who knows, maybe 5% of your customers need this index. And also time matters. You need to check long duration of usage we mentioned example when some indexes is needed a lot but on the first day of each month if you if you happen to check usage statistics in the middle of month and checked in the end of month, you might not see the pattern well enough.
Starting point is 00:14:08 So checking usage requires some understanding as well and should be done carefully. Redundant indexes also, like, some people have fears that if some redundant index looks like redundant. And we know, like, index on column A is redundant if there is an index on column A and B. So our A is on the first place. We consider this index as redundant. But what if it's used? It can be used because it's smaller
Starting point is 00:14:39 and the planner thinks cost is lower. Oh, yeah. It will likely be used because it is smaller right but it does yeah you're right um good point so redundant can even be well so redundant can be these overlapping ones like you described but it can also be literal duplicates like postgres yeah so postgres doesn't stop you i think that surprises people it doesn't stop you creating the exact same index definition as many times as you want yeah why not so yeah and and uh so just try to achieve the minimal set of indexes you really need this is the general advice here but let's move move on to
Starting point is 00:15:25 more interesting topics which are not usually discussed yeah i had like a i had like one at least one more kind of easy one that probably don't need to discuss much but i think people often forget that they're competing for space in the cache as well like if you're in a position where you don't have well your working set doesn't easily fit in as well. Like if you're in a position where you don't have, or your working set doesn't easily fit in memory, the more indexes you're creating, the more competition there is for that limited resource. And then the final point was,
Starting point is 00:15:55 you mentioned index maintenance, but it's also a point, if you need to rebuild everything, if you, like, for example, there was the issue in 14 that was fixed, I think, in 14.4. We needed to rebuild all our indexes. The more indexes you have, the longer that takes. There's a few things where the more you have, the more maintenance there is to do.
Starting point is 00:16:17 Yeah, exactly. Indexes have degrades over time, and you're talking about the case when we rebuild index with concurrently option, either create index concurrently or reindex concurrently, and during this XminHorizon has been held and after vacuum can then delete
Starting point is 00:16:35 freshly the tuples and optimization was attempted in 14.0, but all versions between 14.0 and 14.3, they have bug in this feature, and so fixing 14.4, yes. And yeah, so you're right.
Starting point is 00:16:54 We didn't mention basic overhead from extra indexes. It's disk space and also buffer pool usage. Yeah. You can put it in disk space, yeah. And actually space in the page cache as well, so in memory. So the more indexes you have, the more memory you need to maintain. Postgres needs to maintain them during operations.
Starting point is 00:17:19 Even if you don't use them, they need to be updated, so they need to be loaded to the buffer pool to be updated. And it's not good. But let's talk about even more interesting. I mean, what this tweet discovers and talks about, probably it's not a super big problem, but it's a surprise to many. We have a small table or bigger table. We use PgBench accounts.
Starting point is 00:17:47 I mean, regular PgBench with scale like 100. It means like, I don't remember, 10 million rows or so, like a small table. And then we just have, by default, we have only one index, which is primary key on this table. And we just check the performance of primary key lookup. It's simple. Select only workload PgBench provides by default
Starting point is 00:18:12 option dash capital hyphen capital S, selects only. And then we just check performance, check planning time, check execution time, and then we add one more index. And we see that performance of planning time degrades. And it degrades noticeably. First degradation is more than 10%, 16% maybe or so, if you check even planning time only. Execution time doesn't degrade, but planning time degrades. And then one more index also like nine percent or so and then more and more and then this like the speed of degradation goes down a little bit like
Starting point is 00:18:51 to five percent but index each index adds a penalty to planning time five percent or so it's significant i mean like it doubles very quickly by, I don't know, by index number. I think nine. Nine, ten, something. Right. So having 10 indexes compared to one index is like twice longer planning time. By the end, we reached 20 indexes and the degradation was more than 4x. So more than 300% was added to planning time.
Starting point is 00:19:28 But there was additional effect at some point. We will talk about it slightly later. It's even more interesting. But this is like super basic. Index means more planning time to basic queries like primary key lookup. And why is it so? It's because the planner needs to analyze more options for more plan options, right? Obviously. And is this just to check, are we adding the equivalent duplicate indexes on the primary key column? So it basically has a choice of all 16 or all 20 of the indexes. So it has to choose which one is likely to be the lowest cost. Right, right, right. So yeah, this exactly shows also how bad it is. It's bad to have
Starting point is 00:20:15 too many indexes, especially duplicate indexes or unused indexes, even for selects. But it affects only the planning time and of course if you can get rid of planning time for most of your queries if you use prepared statements and it's easy to check with the same pgbench saying Mprepared and you will see there is no effect like you can add as many as you have and planning time is very small because there is no planning time for most queries it's planned only once and then it's reused the same plan right so it's worth saying that this monday this week pg bench finally like i waited like honestly waited pg bouncer pg bouncer sorry
Starting point is 00:20:59 not pg bench is for benchmarks yeah pg b, which is still the most popular, I think. Puller. Although there is competition growing in the last years. Finally, PgBouncer got support of prepared statements for transaction pull mode.
Starting point is 00:21:21 Transaction pull mode is the most interesting because it has most benefits. We discussed pullers as well transaction pool mode is the most interesting because it like has most benefits we discussed poolers as well in our another episode and people who are using pgbounce a lot of people
Starting point is 00:21:35 could not use prepared statements and now they can since this Monday and this is a big feature and I'm glad that it happened finally. So I mean, I wanted to say maybe not 15 years, but close to it, I waited for it.
Starting point is 00:21:52 Like Skype released BG Bouncer maybe in 2007, eight or something like almost 15 years, definitely. So I didn't want it like too much, but I obviously observed a lot of projects who would benefit from having this feature. So it's great that it was released, definitely. How are you going to celebrate? Benchmarking.
Starting point is 00:22:17 Seems appropriate. Already benchmarking. Benchmark party, you know. So, yeah, I wanted to thank all people who were involved into this. I don't remember all names, but I want to mention a couple of them. Gerald F. Nemanil, sorry if I
Starting point is 00:22:33 pronounce it wrong, from Microsoft team and also Konstantin Knizhnik, who created the latest version of the patch this spring, I think, from the Neon team, and a couple of more folks who prototyped in this direction. And I forgot names, sorry.
Starting point is 00:22:52 But obviously, it's multiple teams, multiple people and teams actually were involved. And that's great to see that it's released. Knowing the history of Pidgey Bouncer development, I actually expected that this patch won't be merged quite soon. So that's why I said it's super fast. It's not fast, as Geraldine mentioned, it was like 1.5, like one and a half years of work. But from perspective of 15 years of waiting. So anyway, each index adds penalty to planning time. It's probably very low, sub millisecond for small cases, but it might be already like millisecond, depends. For primary key lookups, which are very frequent, for example, thousands per second,
Starting point is 00:23:37 it's a lot of CPU time. And now if you use PgBouncer before this Monday, you could not use it. Now you just need to switch to prepared statements and enjoy that CPU goes down at least a little bit. And primary key lookups will be much faster now. Give me your skeptical opinion about optimization on sub-millisecond timing. Well, firstly, I commend anybody that actually looks into this stuff and benchmarks it and shares the data. I think that's a step that people often miss
Starting point is 00:24:14 when they're talking about generic advice around this. And I mean, I saw the numbers shared around. I think it's dangerous, isn't it? Like you share data and then people quote it without looking into it themselves. So the pg bouncer stuff they they ran some synthetic workloads and showed it was between 15 percent one five to 250 percent on depending very much on this synthetic workload used and i've seen some people then quote that saying it's minimum 15% improvement to all workloads.
Starting point is 00:24:47 It's like, that's not what it means. Right. So I think it's like, it is a bit dangerous to say that without checking for yourself, like check, do testing and do, do your own research.
Starting point is 00:24:58 But so that's kind of where I want to go with this degradation of query planning time. The graph that you've done is great, but the Y axis is only like, even at the 400% degraded query planning time, we're talking about a total of 0.1 milliseconds on average, planning, right? First of all, it's not a mistake in that tweet, I'm fixing it in the final repository of these how to's uh these problems will be solved i i hate when people in this case it was it was not people it was charlie pt who wrote this graph with this advanced data analysis it was a bunch of python if you say don't
Starting point is 00:25:36 do it it won't do it anymore so you you shouldn't start y-axis not from zero right it's all it should be always from zero right i i agree generally but i wasn't even talking about that i was talking about the the other end of the y-axis like what's the what's the maximum average planning time even at 20 indexes here let me explain from this point of view for most uh quite queries, the planning time is like the smallest fraction of everything, right? So, and optimization, even if you go to zero planning time, probably won't be very noticeable. But that's why I keep talking about primary key lookups or other index scans or index only scans, when we have execution time is very small
Starting point is 00:26:26 and planning time compared to execution time, it becomes noticeable, sometimes exceeding. And in our, like, if you have a table and create 20 indexes, in our case, we showed that planning time is already four times more than execution time. It's 80%. And if you remove it from the picture, it's 5x improvement. Of course, we still talk about sub-millisecond queries, but if you have 10,000 of them per
Starting point is 00:26:56 second, it's significant, right? Yeah. Oh, I completely agree. And I'm a big fan of this. Even in your example, you show that even with only one index, planning time is about double execution time. So we're talking about significant planning time here, regardless of how many indexes there are, if you're looking at it in relative terms. I'm just saying for people that hear these numbers, check for yourself on your own data as well, because we're talking about 0.1 like you say sub millisecond but it's a it's not even that it's sub a tenth of a millisecond it's like another order of magnitude yeah yeah of course it's because in our case this case was very small
Starting point is 00:27:36 table again 10 million rows it's not nothing 10 million rows it's like these days it's it's small so on to more like specific things i i would be interested i don't see many people like i do see some duplicate indexes but what's the what's the same effect on of having 20 indexes when they're not all on the primary key like my understanding is that this was this effective planning time because the planner had to consider we should check but that so again like most people don't have that kind of, I've got 20 indexes on the table, they don't have them all on the programming.
Starting point is 00:28:07 Good question. We will double check this path as well. Like indexing different column, which for Planner, it should be obvious that checking those indexes doesn't matter. Yeah. Doesn't make sense. I mean, doesn't make sense.
Starting point is 00:28:20 But who knows if Planner has this optimization. Let's check. i wanted to say there's another test vitaly kuharik created yesterday actually checking this new feature in pg bouncer and cpu load was like just again moving from one index to 20 index and we've prepared statements through pg bouncer everything c CPU load went from 80% to 40% so two times less just switching to prepared through pgbouncer so this is like
Starting point is 00:28:52 but again it's a very specific case primary key lookups, indexes extra indexes on the same column the real case can be very very different from it and then one more thing that you don't... In fact, we should discuss here
Starting point is 00:29:07 where the graph changes shape a little bit. But before we do, another thing I've realized we haven't talked about is a case that does hit people pretty hard, which is too many indexes plus partitioning with dodgy... Yeah, let's switch to final topic, which should be like a cherry on the top of this cake.
Starting point is 00:29:31 Yeah. When you reach, like for this particular case, simple query, one table, primary key lookup, you just add the extra indexes. No other tables are involved. The thing is that when you reach index 15 and add index number 16
Starting point is 00:29:49 degradation is much worse but only if you have multiple session experiment if like it's just one session you won't notice it probably but in concurrent environment when many sessions around this primary key lookup they at when you add index number 16 and number of tables and
Starting point is 00:30:13 indexes all together exceeds 16 this threshold hard coded in source code in this case you start observing additional degradation from a lock manager behavior. And it can be observed as a weight event lightweight lock. Weight event type is light LW lock. And the weight event itself is called lock manager. So lock manager starts doing some additional work because the thing is that when you just run some select, this select is going to acquire a lock on the table and all its indexes, even if they don't participate in the plan at all. You don't need them, but still Postgres needs to acquire access share log to block possible DDL, basically, right?
Starting point is 00:31:09 So like, and it will be released only at the end of transaction. So to acquire a log, log manager needs to deal with memory. So like there are slots, basically. And when it's doing it, it needs this lightweight log. So your transactions, select transactions, start competing for this log acquisition i mean they're not blocking each other but acquisition like takes some time additional before why it's 16 like before that there's there's a thing called fast path and there is another algorithm which is not like not full-fledged checking.
Starting point is 00:31:46 It's faster. But once you need more than 16 access share logs, this is the lowest heavy log. And is this like a hard-coded number? Where is this 16 coming from? It's a hard-coded number. It's called FP log slots per backend. FastPath log slots per backend. FastPath log slots
Starting point is 00:32:06 per backend. It's 16. Quite old thing. And that's it. So even if you have like 100 or more vCPUs, terabyte of memory,
Starting point is 00:32:19 this hard-coded number can hit you badly. And why I switched to this topic? Because partitioning can be involved here. Because partitioning is a multiplier of a problem. If you, for example, have a simple SELECT to some table, and you have 15 indexes, relation, I mean table,
Starting point is 00:32:40 relation means, in post-guest terminology, relation is both tables and indexes. It's called relation. Okay, so table tables and indexes. It's called relation. Okay, so table plus 15 indexes, it's 16 relations. All locks will be fast path. By the way, a comment. It might look surprising that we need to lock all indexes even if they don't need it. It's so.
Starting point is 00:33:03 But it's so, yeah. But I'm curious if it would be implemented differently, like virtual locking for indexes. Indexes do not live without tables. If a table is locked, we could consider indexes also locked. It's additional checking some pieces of code, but maybe checking this is less problem than dealing with this fast path versus not fast path. Good question. but maybe checking this is less problem
Starting point is 00:33:28 than dealing with this FastPath versus not FastPath. Good question. Oh, by the way, FastPath can be seen by hands. You just select from pgLogs, and there is a column called FastPath. But when you're checking this, you need to do it in another session, knowing PID, process ID of your main session.
Starting point is 00:33:45 You just don't close transaction and if you just say begin, select something and keep transaction open and then from different session you look at PID logs knowing the process ID of the first session. And you will see all the logs you will see, relational level
Starting point is 00:34:01 logs, heavy logs and you will see which of them fast path, which are not, and fast path 16 maximum. So this is hard-coded, and there is a discussion on Hacker News. I think there are several discussions. The latest one I started, and originally it was maybe no, but now Thomas Wondra proposed some patches. We are going to explore them.
Starting point is 00:34:27 So, to increase this threshold to 64, for example — or, yeah, there are complexities. Did you say Hacker News, or do you mean the hackers mailing list? Sorry. Yes. Hackers mailing list.
Starting point is 00:34:42 I was thinking... No, no, no, no, no. Such things are not discussed on Hacker News. Yeah, sorry. Sorry. Mixing terms here. That makes sense. Let's link them up.
Starting point is 00:34:51 I'll do that. PgBench versus PgBouncer and Hacker News versus PgSkilledHackers. Of course, yeah. Sounds similar. Sorry for mess. So there are proposed patches to raise this threshold
Starting point is 00:35:02 and it turned out it's not as simple as I thought originally — not just one line you change to 64. No, no, no, there is a big discussion of the complexities of this move, and I think it would be great to move this forward, and I plan to see how it feels. I have seen several cases when it was a very bad situation, when this lightweight lock wait event came and many sessions started clashing on it, right? And we quickly reach a number of active sessions approaching the number of CPUs —
Starting point is 00:35:47 the vCPUs, the cores we have — and then exceeding it, and it puts the server on its knees. I mean, we have a brief outage, lasting from several seconds to maybe a minute. And sometimes it's a bad problem, actually. And it's also a hard problem to understand. But the main way it looks: you see this lightweight lock manager in wait events. So unfortunately, there is nothing you can do here except reducing the number of relations — for example, removing some indexes — or reducing frequency. It happens when frequency is high. If frequency is low, you won't notice. You can have 100
Starting point is 00:36:29 indexes and there is no contention observed. But if you have many indexes and, for example, you partitioned and some queries don't have partition pruning involved in plans,
Starting point is 00:36:51 meaning that they may need to deal with multiple partitions — in the worst case, all of them — and each partition has, for example, 15 indexes, obviously you have a lot of non-fast-path locking involved. And it slows down the lock manager, and you see this spike of lightweight lock manager contention. Yeah, and even fewer indexes can be a problem. As soon as you've got, let's say, 10 partitions, and you've forgotten a query that lacks the partition key — even with, like, two indexes — yeah, it becomes an issue really quickly. And I think this is one of those confounding factors
Starting point is 00:37:31 where it's not even an issue of too many indexes. The main issue is that you've forgotten a couple of queries that don't have the partition key in there — a couple of frequent queries that don't have the partition key. Ah, they should be frequent. If they're not frequent, it's okay. Exactly. And then that hits this issue,
Starting point is 00:37:49 and then that kind of cascades into other issues. So, yeah, a really good point about this. I think this is really important. So, yeah, so partitioning. But in general, when partition pruning in plans is involved — at least in Postgres 14 and higher — if the plan works only with a single partition, you won't see locks for other partitions.
Starting point is 00:38:16 I double-checked manually for safety. It's so. I checked it a couple of days ago, additionally. So it means that you can have 10 indexes and it works. By the way, if you join with other tables — again, more indexes. Joins also multiply this problem, right? So I'm talking about high-frequency queries, thousands per second,
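As a rough illustration of the pruning point (the table and column names here are hypothetical):

```sql
-- Hypothetical partitioned table with partition key created_at.
-- With the partition key in WHERE, the planner can prune:
EXPLAIN SELECT * FROM measurements
WHERE created_at = DATE '2023-10-01';
-- On Postgres 14+, the plan touches — and therefore locks — only one
-- partition plus its indexes. Without the partition key in the query,
-- every partition and every one of its indexes gets locked, multiplying
-- the relation count per query and pushing it past the 16 fast-path slots.
```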
Starting point is 00:38:43 and a lack of CPU power. Because if we talk about 500 per second, and you have 128 vCPUs, or 244 with modern AMD EPYC — generation 3, for example — in this case you probably won't notice this problem. But if you start noticing — and it can be noticed at an early stage: you just see wait events popping up, you don't have an outage yet, but already with these wait events you should be very careful — consider index optimization, reduce indexes, check the plans, try to reduce frequency. And what else?
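The early-warning signal described here — wait events popping up before any outage — can be checked with a rough query like this (the wait event name assumed here is the Postgres 13+ spelling, LockManager; it was lock_manager in earlier versions):

```sql
-- Count active sessions by wait event; a spike of
-- LWLock / LockManager is the contention being discussed.
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY wait_event_type, wait_event
ORDER BY count(*) DESC;
```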
Starting point is 00:39:21 If it's primary key lookups, back to our situation with planning time. If you see planning time is significant here, it's time to consider — if you have PgBouncer — prepared statements, and to reduce the duration these locks are being held, because they are released only at the end of the transaction. Yep. Nice. Right. So yeah, this is probably, by the way, one of the biggest performance problems Postgres has right now,
Starting point is 00:39:54 which is hard-coded. Probably it will be improved. Of course, only projects with high-frequency queries experience it. But it's an interesting problem; I consider it similar to the subtransaction issues I studied a few years ago. So yeah. Oh, by the way, also worth noting that the RDS documentation has very good wait event pages. For the most popular wait events, they have a whole page. It's much better than the official documentation because it also talks about use cases, what to do, how to diagnose, and so on.
Starting point is 00:40:35 And for this particular lightweight log manager, they have a very good page as well. So worth checking. I'll put it in the show notes. Good. So that's it, I guess, right? So don't allow over-indexing. Under-indexing, it's next time.
Starting point is 00:40:55 Yeah, probably most of you listening are still in the under-indexed stage, I would guess, but for the ones that are in the over-indexed stage, I'm sure this will be useful. We should have a vote somehow: do you feel over-indexed or under-indexed? I can do it on my Twitter and LinkedIn. Yeah, great. Okay, good.
Starting point is 00:41:14 Thank you, all the listeners. Yeah, thanks so much, Nikolai. See you next week. As usual, if you reached this point, we ask you to share it with your colleagues and friends who use Postgres, and we also ask you to provide us some feedback and ideas for future topics, right? Maybe there are not many topics left. What's your feeling here? We have covered everything already, right? No? Yeah,
Starting point is 00:41:45 i'm finding the more topics we do the more research i have to do for each one we get into the ones where neither of us either one of us knows it like well actually this is a bad example because you've done part of this research as part of your work anyway and yeah i think we're getting having to dive deeper each for each new topic now or we're having to cover ones that we're less familiar with. So, yeah. But I don't think we're anywhere near running out. Best Postgres library, best Go library for working with Postgres.
Starting point is 00:42:17 There are interesting nuances there. Honestly, and we've only, like, if you look at the index types, we've talked a lot about future indexes, mostly today. I think we did one on Brin indexes, but we've never touched. I don't think we've done one on GIMP. Don't think so. Or I think we mentioned it, but not, yeah. Hash, we haven't talked about. Okay, we have some fuel. Don't write us. Okay. Okay.
Starting point is 00:42:41 Take care, everyone. Bye-bye. Bye.