Postgres FM - Planning time

Episode Date: September 20, 2024

Nikolay and Michael discuss planning time in Postgres — what it is, how to spot issues, its relationship to things like partitioning, and some tips for avoiding issues.

Here are some links to things they mentioned:

Query Planning (docs): https://www.postgresql.org/docs/current/runtime-config-query.html
Are there limits to partition counts? (Blog post by depesz): https://www.depesz.com/2021/01/17/are-there-limits-to-partition-counts
Nikolay's recent experiment: https://postgres.ai/chats/01920004-a982-7896-b8cb-dfd2406359b0
PgBouncer now supports prepared statements: https://github.com/pgbouncer/pgbouncer/releases/tag/pgbouncer_1_21_0
"The year of the lock manager's revenge" (from blog post by Jeremy Schneider): https://ardentperf.com/2024/03/03/postgres-indexes-partitioning-and-lwlocklockmanager-scalability
pg_stat_statements.track_planning: https://www.postgresql.org/docs/current/pgstatstatements.html#id-1.11.7.42.9.2.4.1.3
pg_hint_plan: https://github.com/ossc-db/pg_hint_plan

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With special thanks to:
Jessie Draws for the elephant artwork

Transcript
Starting point is 00:00:00 Hello and welcome to PostgresFM, a weekly show about all things PostgreSQL. I am Michael, founder of pgMustard, and this is my co-host Nikolay, founder of Postgres AI. Hey Nikolay, what are we going to be talking about today? Hi Michael, let's talk about planning time and what can increase it, potentially. Nice, and this was your suggestion. Why is it something you're thinking about at the moment? Well, we are making circles around this topic all the time because basically almost all of our customers, which mostly are fast-growing startups,
Starting point is 00:00:36 they reach the point where partitioning is needed. They reach the point where lock manager contention happens, the number of indexes becomes a concern, and also many of them still don't use prepared statements, mostly because PgBouncer didn't support them until very recently, right? This means that planning time, being basically unnoticed for quite long, suddenly becomes an issue. And we often show them,
Starting point is 00:01:09 oh, it looks like planning time is an issue in these cases. Actually, especially partitioning things, there is always a question, especially for time series, it's always a question how big partitions should be, which can be translated to how many partitions you will have, right? And yeah, and if you partition not well, you may end up having many thousands of partitions. They will be very small. But if you don't have prepared statements,
Starting point is 00:01:49 you can end up having a huge problem with planning time. And planning time can exceed execution time for simple primary key lookups, right? This is not fun. Yeah, it's not the only case of that, right? But it definitely makes sense that it's a common one. Yeah. Basically, last week we discussed with a couple of companies how to properly implement partitioning.
Starting point is 00:02:09 And this question was at the center: what size of partitions to choose. The rule of thumb is, don't let them be larger than 100 gigabytes, right? Because this is like the golden rule of partitioning. It's time to apply partitioning when there's a risk a table will exceed 100 gigabytes. But it can also be applied to partitions.
Starting point is 00:02:35 Also, they should be balanced in terms of size. This is usually a good idea to do. But on the other hand, if you make them tiny, you end up having a lot of them and it's not good. Why is it not good? Because of planning time, mostly. This is not the only reason why you don't want a lot of partitions, but it's one of the biggest reasons. If you have, for example, 10,000 partitions, you might have dozens of milliseconds or maybe a second of planning time for each query. If prepared statements are not used, every time a SQL query is executed, planning happens in Postgres, right?
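A minimal sketch of the difference being described here, using a hypothetical table events with a primary key (the table name and the plan_cache_mode usage are illustrative, not something from the episode):

    -- Without a prepared statement, every execution is planned from scratch:
    EXPLAIN (SUMMARY) SELECT * FROM events WHERE id = 42;   -- prints "Planning Time"

    -- With a prepared statement, Postgres can cache and reuse a plan:
    PREPARE get_event (bigint) AS
      SELECT * FROM events WHERE id = $1;

    EXECUTE get_event(42);   -- the first few executions still build custom plans
    EXECUTE get_event(42);

    -- Optionally force the cached generic plan right away (Postgres 12+):
    SET plan_cache_mode = 'force_generic_plan';
    EXECUTE get_event(42);   -- per-execution planning work mostly disappears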
Starting point is 00:03:17 Yeah, and you're talking about the best case. Planning time can expand depending on the complexity of the query, and you're talking about the simplest possible query. Very simple. B-tree index scan. Yeah. Is it worth us explaining what we are talking about? What is planning time?
Starting point is 00:03:36 Yeah, let's do it. So when a query is executed, there are stages, right? Do you remember the stages? Planning means the planner, based on statistics, considers options; the goal of planning is to choose the best plan
Starting point is 00:03:53 as fast as possible. It might be wrong. Very often it's wrong, actually. Not very, but it happens. For me, it's every day that I deal with such cases, because I usually see problems,
Starting point is 00:04:08 not ideal situations, right? That's when my team and I are involved, when something is wrong. So it needs to choose, as fast as possible, the kind of optimal plan,
Starting point is 00:04:24 and then, when the plan is chosen, execution happens. The executor executes according to the plan. And there are many, many, many possible plans. Many possible plans. Sometimes I think an infinite number of possible plans. Right? Well, it depends, right? If you're doing a single-table select and there are no indexes, maybe there's only one possible plan. But once you include... what about parallelization? Already a couple of possible plans, right? Yeah, but all I mean is, as the complexity grows it gets important. For example, when you start joining multiple tables, you start to realize that you have multiple scan choices, multiple join orders, multiple join
Starting point is 00:05:13 types that you could use, multiple indexes are present, and planning needs to choose them properly. And these things are combinatorial; they multiply to create the number of total possible plans. And quickly, as the number of joins, for example, increases, that can explode exponentially, the time. So there are some things in Postgres, too. There's also a genetic optimizer, right? Yes. So because it can... well, I guess the simple version is that the planner isn't always looking for the best plan in those cases. It's sometimes looking for a good enough plan in a reasonable timeframe. Because if we're taking way longer to plan the query, if we take multiple seconds to plan a query that's probably going to execute in a few milliseconds, it's probably a complete waste of time. So there's a balancing act here for complex queries, but for simple queries, it doesn't make sense to be paying dozens of milliseconds for a 0.2 millisecond primary key lookup, right? So the
Starting point is 00:06:18 process of finding the ideal plan is called optimization, plan optimization. And the planner deals with basically several things. First of all, decisions of course heavily depend on the version of Postgres you use, because improvements are being made all the time. Second of all, it uses the statistics currently present in Postgres, so that's why it's important to keep autovacuum tuned so it maintains the statistics. And this is not only pg_statistic. It's also the estimated number of tuples, reltuples, and relpages, the number of pages the table or index has.
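For reference, a quick way to look at those per-table inputs (a sketch; events is a hypothetical table name):

    -- reltuples / relpages are the planner's size estimates, maintained by
    -- VACUUM / ANALYZE / autovacuum; pg_stats exposes the column statistics.
    SELECT relname, reltuples, relpages
    FROM pg_class
    WHERE relname = 'events';

    SELECT attname, n_distinct, null_frac
    FROM pg_stats
    WHERE tablename = 'events';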
Starting point is 00:07:01 This is super important. And finally, it also takes into account the settings Postgres currently has. Interesting point, as I also mentioned many, many times, those who have listened to us for a while know it, right? Interesting that the planner has no idea about the real physical capabilities of your machine. It doesn't know what type of processor you have, how many cores, how fast the disks are, or how many gigabytes of RAM you provided. It has no idea.
Starting point is 00:07:37 It relies on the settings. First of all, the costs, right? So random_page_cost, seq_page_cost, and so on, like cpu_tuple_cost and so on. And also settings like work_mem and some estimate of available memory called effective_cache_size. And you can put any number there, even if you don't have those gigabytes of RAM. This is bad, because if you forgot to tune your server, Postgres might not know that you have a lot of RAM, like a terabyte of RAM,
Starting point is 00:08:11 but Postgres still thinks you have only one gigabyte, for example. It's easy to be in this situation, and it's bad. But at the same time, it's good for us who deal with a lot of experiments because it means that we can fool Postgres and say, oh, you know, we have the same number of gigabytes as production has. And this helps us to reproduce the problem in lower environments when we have a much weaker machine. This is great for experimentation.
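A sketch of the kind of session-level overrides being described, with made-up values, for example on a small clone that should plan like production:

    -- The planner only sees these numbers, not the real hardware, so a weak
    -- experimentation machine can mimic production's planner behavior.
    SET effective_cache_size = '360GB';  -- planner estimate only; no RAM is actually allocated
    SET random_page_cost = 1.1;          -- match production's SSD-oriented setting
    SET work_mem = '100MB';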
Starting point is 00:08:37 So good and bad. So that's it. This is everything that matters when optimization happens, when the planner chooses the plan the executor will be using. Yeah. Anybody that's familiar with explain, I've had success describing planning time as the time it takes Postgres to come up with that explain plan.
Starting point is 00:09:04 That's what it's doing. And you can even see this. By default, EXPLAIN on its own doesn't show planning time. But if you do EXPLAIN SUMMARY, so EXPLAIN with the parameter SUMMARY, it will show you planning time as part of that. So it's quite a cool way of thinking. At the bottom. Yes, exactly. Or if you ask the executor to be involved and you run EXPLAIN ANALYZE, then SUMMARY is turned on automatically, right? Yeah, and you'll see the planning time and the execution time then.
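For example (query and output are illustrative):

    EXPLAIN (SUMMARY) SELECT * FROM events WHERE id = 42;
    --  Index Scan using events_pkey on events ...
    --  Planning Time: 0.120 ms

    EXPLAIN ANALYZE SELECT * FROM events WHERE id = 42;
    --  ...
    --  Planning Time: 0.118 ms
    --  Execution Time: 0.045 ms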
Starting point is 00:09:39 And the planning time is roughly the time it took to run EXPLAIN on its own. And the execution time, in addition to the planning time, is roughly the time taken to run EXPLAIN ANALYZE. So that's quite a cool way of thinking about the difference. Yeah. And worth mentioning, again, those who have listened to us for long know it very well, the word buffers. Since some version, like several years ago, it was implemented,
Starting point is 00:10:03 I think it was Postgres 13 maybe. You can use BUFFERS, EXPLAIN BUFFERS, even without ANALYZE, only the planning phase, and you can see how many buffers were involved in planning. And this is super cool, because usually bad timing is because of a lot of I/O, which is buffers. I/O is the number one reason why Postgres can be slow in anything. Yeah.
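On Postgres 13 and newer, something like this shows the I/O done during planning alone (illustrative query and output):

    EXPLAIN (BUFFERS, SUMMARY) SELECT * FROM events WHERE id = 42;
    --  ...
    --  Planning:
    --    Buffers: shared hit=150
    --  Planning Time: 1.3 ms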
Starting point is 00:10:31 And you might think, what's Postgres reading during planning time? Yeah. Actually, you know what? I explained some picture, which is maybe true in 99%, but I now remember a case where something else was happening. It was some interesting query, I don't remember how many tables, but definitely it was a join, maybe a couple of joins, and I remember planning time was completely out of normal. It was like half a minute or so.
Starting point is 00:11:10 I think I remember this case. Was it a merge join? Exactly, yeah. It was not merge... Merge join was not considered the best plan. It was not chosen, but it was the reason why it was so slow. Considering the merge join path, the planner used not only what I just described but additional information: it tried to find minimum or maximum values for some subsets of data, edges, real values from the table. So, during planning time, it needed real values,
Starting point is 00:11:49 and this for some reason was super slow, so planning became many, many seconds, exceeding statement_timeout and failing. It was extremely hard to diagnose, actually. And actually, do you remember why? Yeah, why was it hard to diagnose? Because EXPLAIN doesn't show other options, it shows only the winner. Well, and also, extra reasons it was difficult, I believe, I remember talking to you at the time, auto_explain doesn't show planning time, and therefore when you're trying to... It's not a problem for us, because, well, at that time, for this project, we didn't use auto_explain at all, but we had DBLab Engine installed, and I knew when these problems started happening, so I quickly rewound, created a snapshot and a clone.
Starting point is 00:12:48 So yeah, I worked on the clone and reproduced the problem easily. Interesting that, like, one hour later it was different. The statistics changed somehow and the planner stopped considering the merge join path. Yeah. So it auto-healed, but it was super scary what had happened one hour ago, right? But thanks to DBLab Engine, we easily reproduced it. So I agree with you, auto_explain not showing planning time at that time was an issue, but not in this project particularly. What was an issue is you reproduced it, you see super slow planning time,
Starting point is 00:13:21 but you have no idea what's happening, and you try to understand... If we had a way to see the second-best, third-best plans... Yeah, in this case, probably, yeah, the second best would probably be the merge join. It lost eventually, but during consideration of it, the Postgres planner spent a lot of time, because it needed real values. I think what helped in that case, actually, is a friend's advice, because Kukushkin first
Starting point is 00:13:53 said, have you tried setting enable_mergejoin to off? This is weird. But once I did it, planning time went to almost zero. Like, boom. It was an interesting case. And, of course, when you try to dig deeper,
Starting point is 00:14:11 you can use, like, perf or eBPF and understand what's happening under the hood and in which functions Postgres spends time. And you quickly find it's the merge join path. And then read the code, because it's open source. This is cool. Read the code and understand that Postgres, analyzing a merge join, sometimes needs real values from the table.
Starting point is 00:14:37 Wow. It was exotic, honestly. Yeah, that is exotic, but it's also very helpful in understanding how the optimizer is working. Because it isn't exhaustively calculating every plan that is possible. It can give up on plans once it gets to a sufficiently high cost. So because of the way enable_mergejoin = off works, it's penalizing merge. It's not saying you can't ever have a merge join. It penalizes it cost-wise, at which point the planner gives up on that.
Starting point is 00:15:13 Yeah, it discards very quickly because the cost is already super high. But I don't think that's how everybody thinks the planner works. They think it's going all the way to the end of each option, but it kind of bails out early, and this is one of those trade-offs it's making. It will sometimes make a mistake, because it seems like it's going to be an expensive plan but then it isn't. So it isn't always looking for, in my opinion, the fastest plan; it's looking for a fast enough plan. Well, maybe I'm wrong in that, like, it is looking for the fastest, but it uses these shortcuts to avoid spending excessive time to do so.
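The kind of session-level experiment described above, as a sketch with hypothetical table names:

    SET enable_mergejoin = off;  -- adds a massive cost penalty to merge-join paths

    EXPLAIN (SUMMARY)
    SELECT *
    FROM orders o
    JOIN order_items i USING (order_id)
    WHERE o.created_at > now() - interval '1 day';
    -- compare "Planning Time" with and without the setting

    RESET enable_mergejoin;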
Starting point is 00:15:51 Yeah, yeah, exactly. As usual, I like exotic and hard problems, but let's maybe switch to simple things. Yeah, good idea. You sent me a very good post by depesz, Hubert Lubaczewski, depesz, right? And it was three years ago. He was curious about, is it really so that Postgres doesn't allow more than 65,000
Starting point is 00:16:21 partitions? And it turned out it's not so. He reached 100,000 partitions, right? But at the same time, Hubert checked the planning time, how exactly the planning time depends on the number of partitions. And this is back to the initial reason you were looking into planning time.
Starting point is 00:16:48 because planning time can be already high, dozens of milliseconds, maybe hundreds of milliseconds, and so on. And basically, the growth is linear. And if you double the number of partitions, you double planning time, likely. So, of course, it depends on the query you use, I guess. But the simple rule is just if you have a lot of partitions, you have bad planning time. It means we need to be...
Starting point is 00:17:17 I also think Postgres versions matter a lot here. Many optimizations were made, right? Well, I'm aware of one. So, depesz's test: he says in the comments that he tested on version 14, which is important because there's a commit in version 15, a couple of years ago, from David Rowley; in the release notes it said, improve planning time for queries referencing partitioned tables. This change helps when only a few of many partitions are relevant, which is the case here, where we're doing a primary key lookup. If we're just searching for a single row in a single partition, of all the hundred thousand partitions only one of them is relevant to this query, and we can prune that
Starting point is 00:18:01 at planning time, right? In those cases, planning time should be much reduced as of version 15 onwards. Well, you know, I used a different table structure and queries, but it also was a primary key lookup. But basically, I reproduced the same experiment using our AI system. And I'm just checking right now, depesz had, for 10,000 partitions, he had like 40
Starting point is 00:18:28 milliseconds or so, right? 30, 40 milliseconds. It's pretty much the same as what we observed with our AI system. But I also used so-called shared-environment experiments. So I was not alone on the machine. So that's why I involved buffers in my analysis. Because this time growth, linear time growth, according to my benchmarks, is explained by linear growth of buffer hits associated with planning. Yeah, so if you have 10,000 partitions, it might be several dozens of milliseconds. Yeah. So what do you think? What are you looking for?
Starting point is 00:19:17 I was looking at Depeche's post just to see if he gave enough details around how, like, did he pick the best of, you know, sometimes when you're doing this kind of experiment, you would run the query a few times and pick the fastest or the average. Several, yeah.
Starting point is 00:19:34 I didn't. He used several, not many, I think three or maybe five runs only. I used only one run for now. I think it's a good idea to check, to convert this experiment to more like full-fledged experiment, maybe on separate machine, and maybe even with PgBench and have many, many competing,
Starting point is 00:19:56 like not competing, concurrent selects. The idea is, it's again about buffers, many, many buffer hits for planning. And this explains the growth of planning time. Yeah, I like how in your experiment, I like how clearly linear the chart with buffers is, whereas the one with timing, it looks like it might be linear, but there's enough noise that it's tricky to see. Although there are a couple of dips in the buffers chart, so there's something interesting going on there.
Starting point is 00:20:25 Yeah, I'm also interested. And we also discussed before this call that it really matters whether it's the first query when you've just connected, even not involving prepared statements. If you run the very first query, you have an extra penalty, because in this connection, information from system catalogs is not cached at all, and this adds extra buffer hits during planning time. Yeah, that's what we thought it was. I don't know for sure, and it seems really interesting, like, what it could be that's going on there. The reason I thought it might be that is I've seen in the past when running experiments,
Starting point is 00:21:06 if I run ANALYZE, like, nothing to do with EXPLAIN in this case, just gathering new statistics for the table, I've seen planning buffers spike on the next query execution and then come back down on the second execution. So that makes me think it might be statistics related. But yeah, I've seen VACUUM drastically reduce buffers
Starting point is 00:21:25 as well, not VACUUM ANALYZE, just VACUUM on its own, and definitely not VACUUM FULL. So whilst buffers are much more reliable and stable than timings for this kind of experiment, there are a few things that can still affect buffers. Yeah.
Starting point is 00:21:56 from PGS statistic, for example, of new data just collected. Especially if you have a lot of buckets of default statistics target is increased and so on. It's a lot of data to hit additionally. I don't know. It's interesting. But anyway, back to partitioning,
Starting point is 00:22:13 the current recommendation would be, let's not allow the number of partitions to grow to many, many thousands, unless we are okay with increased planning time. We can be okay with increased planning time if we use... first of all, the pooler matters. Yeah, true. The situation when it's the very first query in a session just established,
Starting point is 00:22:36 it's very rare, right? It's good if it's really rare. And the pooler helps. Even an application-side pooler helps. Yeah. And second, of course, prepared statements just help us shave off a lot of planning time situations. You remember Derk from Adyen discussed problems: even if you use prepared statements, still there might be interesting problems. And we had a very interesting episode, number 100, right?
Starting point is 00:23:06 It was an interesting discussion. But it's already an advanced topic. And at bare minimum, it's good if you know planning time is going up. It's good to consider prepared statements. And PgBouncer supports them now, so it's great. And I must say, like, dozens of milliseconds is already super slow overhead for planning. If we aim to have all queries fast,
Starting point is 00:23:32 fast for me means in OLTP, as usual. This is episode number one: what is a slow query? A slow query is one that exceeds 100 milliseconds, 200 milliseconds, because this is human perception: at 200 milliseconds, most people already see it's quite slow. And if one HTTP request consists of multiple SQL statements, this means we should be below like 10 milliseconds already.
Starting point is 00:24:00 But if planning time adds 50 milliseconds, for example, if we have like 20,000 partitions, for example, wow, it's not a good situation to be in, right? So prepared statements should be a good tool to fight this problem. Or just keep the partition number low. Well, a few times you've said, and I completely agree with you, that our primary server's CPU is our most limited resource. And planning time is CPU.
Starting point is 00:24:31 It's a pure use of that. And if we can shave it, why not? It's CPU, but I insist it's again buffers. It's just buffer hits. It's related to memory. But yes, it's CPU, yes. But still, it's I/O. I consider I/O in a wide sense
Starting point is 00:24:51 when we involve communication with memory, also considered I/O. Not disk I/O, of course, in this case. Most of the time. It can be buffer reads as well, but in rare cases. Sometimes we see buffer reads involved in planning as well. You mean like shared read rather than shared hit? Yeah, I call them shared buffer reads, because, yeah, we discussed it, well, I consider this bad naming, because it's I/O operations, not an amount of data. Right?
Starting point is 00:25:25 Right. Yeah. So, and what else? Indexes, right? If we have a lot of indexes, planning time is increasing, increasing, increasing also. And I suspect it's not even linear, but it's worth double-checking. So, if we have, for example, 10 indexes, that's some penalty, and planning time in a primary key lookup can already be like one millisecond or so, half a millisecond. But if we go... like, if our table has many columns, we are tempted to have dozens of indexes.
Starting point is 00:25:59 And that's really bad. And also, not only does planning time itself increase due to buffer hits, again, buffer hits also grow here, but also locking. When planning happens, Postgres applies an access share lock to each object involved. All indexes of the table are locked by access share lock. And this is quite expensive, I would say. If you have thousands of queries per second for this primary key lookup, it might be a problem, especially if you exceed 15 indexes,
Starting point is 00:26:34 because the total number of objects locked already reaches 16, which is the threshold above which Postgres cannot use the fast path for locking in the lock manager anymore. And we discussed this as well. Is that number being increased? I'm not sure. There are discussions to make these things configurable. Actually, I think I raised this question.
Starting point is 00:27:04 It turned out not to be so simple. I'm not sure. I think in 2017, nothing happened with this. So what helps here? Add more CPU, more cores, or decrease the number of indexes, or, again, prepared statements, right? Prepared statements.
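A sketch of how to observe this from a session, using a hypothetical table events: planning takes AccessShareLock on the table and all of its indexes, and the locks stay visible in pg_locks until the transaction ends.

    BEGIN;
    EXPLAIN SELECT * FROM events WHERE id = 42;

    -- fastpath = false rows appear once a backend holds more than 16
    -- fast-path-eligible relation locks
    SELECT relation::regclass, mode, fastpath
    FROM pg_locks
    WHERE pid = pg_backend_pid() AND locktype = 'relation';

    COMMIT;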
Starting point is 00:27:26 What else helps in this situation when you need? Yeah, I don't know. Maybe even denormalizing. If you've got a table with 15 indexes, why do you have 15 indexes? Or this also helps. Decrease QPS of this kind. Caching on application side.
Starting point is 00:27:43 If you have 5,000 queries per second, I saw it all, I see it all the time. So every click an application reads from users table checking who is this. Nothing changed. So cache it. Why are you reading
Starting point is 00:28:00 this all the time? Or Rails, recently we discussed with one customer, Rails code is constantly bombing Postgres system catalogs, checking pg_class, pg_attribute, and so on. Checking the schema all the time, all the time, like thousands per
Starting point is 00:28:16 second. Why? On replicas, nothing changed. It's changing, but not so fast. So, lack of caching everywhere, and this quickly can become a problem, especially if you don't use prepared statements, again. But interesting that we can combine these problems and they can amplify. If you have, say, 10 indexes, it's enough to have like three partitions. Even two partitions is enough if you haven't checked partition pruning and some queries involve both partitions.
Starting point is 00:28:53 You already have a situation where like 22 objects need to be locked by access share lock and lock manager contention might strike you. Jeremy Schneider had a great overview of various works in this area from various companies. It's both about execution time, but mostly about planning time, because the planning time of these problems is worse, because this is when Postgres locks all the objects. During execution, it locks only what is really needed. So, yeah, it's the worst case.
Starting point is 00:29:27 You have partitioning, not many partitions, but also indexes and then a lot of QPS, and it can put your server down completely because of lock manager, lightweight lock contention. Yeah. I think you've talked about people who could be doing things for good reasons still getting caught up, but I think there are also a couple of cases, like around partitioning, of people running a lot of queries
Starting point is 00:29:58 without the partition key being supplied in them like that kind of thing can... How to catch those situations? What would you do? How to verify that? For example, imagine we are prepared. We prepared already good partitioning approach. We have zero downtime conversion, everything. And we need to double check that all queries will have partition pruning involved. How to check it?
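For one specific query, the check itself is straightforward; a sketch, assuming a hypothetical range-partitioned table measurements (the holistic, whole-workload question is what the discussion turns to next):

    EXPLAIN SELECT * FROM measurements
    WHERE created_at = '2024-09-01' AND id = 42;
    -- Good: only one partition (e.g. measurements_2024_09) appears in the plan.
    -- Bad: without the partition key in the WHERE clause, every partition is
    -- scanned, and every partition plus its indexes is locked during planning.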
Starting point is 00:30:25 Holistically. Good question. Well, actually, that brings me back to an idea. We've got pg_stat_statements, and we can now look at some planning-related data. As of version 13, it was a big release for planning. By default, it's not on. You need to enable it.
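A sketch of enabling and querying it (assuming pg_stat_statements is already loaded via shared_preload_libraries):

    ALTER SYSTEM SET pg_stat_statements.track_planning = on;
    SELECT pg_reload_conf();

    -- Postgres 13+: plans / total_plan_time / mean_plan_time appear alongside
    -- the execution counters once track_planning is enabled
    SELECT queryid, calls, plans,
           round(mean_plan_time::numeric, 2) AS mean_plan_ms,
           round(mean_exec_time::numeric, 2) AS mean_exec_ms
    FROM pg_stat_statements
    ORDER BY total_plan_time DESC
    LIMIT 10;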
Starting point is 00:30:43 Yeah, why is it off? Good question. Since it's off, many people, including actually myself, still have some fears that the observer effect might strike you in some cases. So, yeah, honestly, yesterday we discussed with one of our clients, we discussed that we should definitely consider enabling it. It's a good thing to have. It would be good to hear what the observer effect is in that case. I think we had experiments checking this with synthetic benchmarks
Starting point is 00:31:15 and it was very low, this overhead. But it doesn't eliminate... Synthetic benchmarks don't eliminate all the fears, right? Of course, of course. So, yeah. I'm curious if people have planning time. Well, I know projects heavily loaded, huge clusters, planning time enabled.
Starting point is 00:31:38 So you mean pg_stat_statements.track_planning? Not planning time, sorry. Yes, yes. It's not only time, it's also I/O, right? I mean buffers for the planning phase. They are fine, but I'm curious about the opposite, when people
Starting point is 00:31:53 are not fine and decided to keep it off. Otherwise, I will call it another default bullshit. Right? We have a collection of them, like random_page_cost, or this one. I think this one's even worse, because pg_stat_statements isn't on by default, so people already have to turn that on, and they already agree to pay the overhead price. Yeah, but also it means that in a default ship, like,
Starting point is 00:32:22 if Postgres ships by default, even if this parameter was on by default, it would still be off, because pg_stat_statements isn't on. So I think this one might be one of the easier ones to change. Yeah, yeah. Well, we need to check the discussion. I might ask our AI to check what the discussion was and provide a summary. But you asked how to monitor for that. I would expect to see issues there quite quickly.
Starting point is 00:32:49 I'm interested in such issues. So anyway, I agree this is important: observability for planning overhead. Also just P95, P99, like, query latencies I would expect to suffer. But I think also I have in mind an approach we discussed with a couple of clients actually over the last few weeks. If you enable partitioning, it's worth checking with auto_explain in lower environments, enabling it for everything and just having it enabled while playing with the functionality.
Starting point is 00:33:29 It's about testing coverage, right? If you cannot reproduce whole workload, of course there are chances that you miss something. But it's good. Then you can just analyze logs and see if there are plans which involve multiple partitions.
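A sketch of that kind of lower-environment setup (thresholds are illustrative; auto_explain has to be loaded via shared_preload_libraries or session_preload_libraries):

    ALTER SYSTEM SET auto_explain.log_min_duration = 0;  -- log every plan, test env only
    ALTER SYSTEM SET auto_explain.log_analyze = off;     -- plan structure is enough here
    SELECT pg_reload_conf();
    -- Then run the test workload and grep the logs for plans that touch more
    -- than one partition of a table that should have been pruned to one.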
Starting point is 00:33:43 But you need to partition properly as well. You need to have multiple partitions for all tables which are partitioned. Yeah, true. And if you were doing this and you had thousands of partitions, that would be a lot of information in the logs. And still no planning time information. Yeah, yeah. But we are interested in plan structure in this case.
Starting point is 00:34:05 True. And in lower environments, we usually... like, if it's a smaller database, for example, it's not 1,000 partitions. For example, we say, okay, we want to have at least 10 partitions for each partitioned table. And then we just check plans. And with some analysis, we can analyze them, and it's a separate question how to analyze them, but anyway, we can holistically find bad plans. But I like your idea about pg_stat_statements as well. I wish it had, you know, more information about plans, not only timing and buffers, but, for example, how many objects were locked, for example,
Starting point is 00:34:52 or involved into plan, or, I don't know, like some additional things here. For example, just was partition pruning involved? How many times it was involved in plans? You know, there's a column called plans. How many plans, right? Yeah, partition pruning. Question. Because partitioning is hard.
Starting point is 00:35:17 It has so many surprising aspects, I would say. So many surprising aspects. I have some sympathy with them. I think pg_stat_statements already has a lot of columns. Yes, I agree. Still not enough. It is difficult, yeah. Still not enough. We have interesting work happening in the area of monitoring, and we discussed pg_stat_statements, not only pg_stat_statements but query analysis, and a holistic
Starting point is 00:35:47 table of query analysis is so wide, because we take every metric and have three derivatives for all metrics: time-based differentiation, calls-based, and also percentage, except the column calls, for this column it doesn't make sense,
Starting point is 00:36:04 calls per calls, it's always one. So it's like four times more, minus one. It's so many columns to present. Yeah, we don't need them all the time, but I can find cases when some of them, like almost every one, is useful in some cases. So query analysis becomes harder and harder, wider and wider in terms of how many metrics we have.
Starting point is 00:36:31 And it's still not enough. That's it. So yeah, good. I think we covered this topic quite well. I have one last question for you. Yeah. Do you ever change people's from_collapse_limit or join_collapse_limit?
Starting point is 00:36:46 Oh, interesting. I honestly like last, for example, 12 months, I don't remember cases of it at all. Somehow. In the past, I had cases when we considered it. I think I even applied it. But it was so long ago. Yeah.
Starting point is 00:37:04 Somehow. I think the default's quite good. The default for both of them is eight. Yeah. I think it works pretty well. It works pretty well. And also, I don't know, like when I see 12 tables involved in a plan, joins, I'm already like, oh, the genetic optimizer is already here, right? It has a threshold of 12, no?
Starting point is 00:37:26 Yeah, I only learned this today. Well, but I think because of from_collapse_limit and join_collapse_limit, GEQO never kicks in. Wow, okay, I have an old memory. Okay, me too. Well, or maybe I misunderstood originally, but unless you raise those above the geqo_threshold, this is my new understanding based on reading the docs, unless you raise them above the geqo_threshold, it will never kick in. Well, yeah. Somehow recently I mostly deal with different problems
Starting point is 00:37:57 when only a few tables are involved in plans. Makes sense. Yeah, but my experience is not everything, right? So I'm sure there are cases where the collapse limits matter and are worth considering, but it just didn't happen to me recently. No, I was just interested, like... for high planning time, it's one of the things we recommend looking at in case someone has tweaked them; for example, if somebody's increased them too much, that can massively balloon planning time. Interesting. But I wanted to know, like, in the real world, for you, is it something you see? But it sounds like no.
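For reference, the settings being discussed and their defaults (a sketch):

    SHOW from_collapse_limit;  -- default 8
    SHOW join_collapse_limit;  -- default 8
    SHOW geqo_threshold;       -- default 12: GEQO is only considered at 12+ FROM items
    SHOW geqo;                 -- on by default, but rarely reached with the defaults above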
Starting point is 00:38:35 Yeah, cool. Well, there might be more cases we haven't even seen yet. Always. Always, yeah. So, because the Postgres planner is complex. It's not as complex as SQL Server or Oracle planners. Not yet, no. Or hopefully never. Already complex enough, I would say. So, yeah. Good.
Starting point is 00:39:02 Oh, by the way, do you know, if we use pg_hint_plan... Oh, yeah. pg_hint_plan. Right. Because we tell the planner what to do directly. Does it help fight high planning time? I don't know.
Starting point is 00:39:21 Good question. It would make sense that it would, but I don't know where it's hooking in. Yeah. I don't have this used in... I don't observe it used in production at all, only non-production. So I also don't know. Yeah, same. Okay, good.
Starting point is 00:39:35 Good question for our listeners. Yeah, it's homework. Let us know if you're... And also the other thing I wanted to ask is if anybody is using some of the more complex extensions that hook into planning. Oh, some extensions affecting... Timescale? Yeah, I've seen more planning time issues. It just feels like a coincidence at the moment. I haven't looked into whether it's definitely causal rather than just, you know, correlation versus causation.
Starting point is 00:40:08 But yes, that's the one. So I don't know if it's just coincidence, but I tend to see more planning-related issues with the more complex ones. Maybe some extensions help improve planning time, not make it worse. Well, you mentioned pg_hint_plan would be the one that would make most sense on that. Well, yeah, it's just a random guess. I'm not sure at all.
Starting point is 00:40:31 Me too. Well, let us know. There's definitely people out there that know more about it. Let's do a summary. Try to avoid a high partition number. Try to avoid a lot of indexes. Especially try to avoid this combination of these two problems and use prepared statements.
Starting point is 00:40:49 Right? Yeah. You don't think avoiding lots of joins as well? Maybe yeah. Again, like somehow I deal with very simple plans lately. I don't know. Yeah. Good point as well.
Starting point is 00:41:01 Good point as well. Good point. I try to avoid a lot of tables, many joins, and then we try to expect good performance because I always hunt for single index scan and better index only scan, always. And if you have even one join, even two tables, you cannot build index on two tables. It's always like indexes belong to their tables always, right? And materialized views, as we maybe talked, out of scope for my recommendations always, regular materialized views. So that's why I tried to think how can we simplify things and, okay, nest loops and so on,
Starting point is 00:41:42 and make queries sub-millisecond. But, yeah, if many, many tables are involved, planning time might be huge anyway. Three... I wanted to say three lessons: a low index number, a low partition number, but not super low, so partitions should stay relatively small, and prepared statements. And I think these three might take years to implement in large projects. Yeah, true.
Starting point is 00:42:08 Yeah, unfortunately. Cool. Nice one, Nikolay, thanks so much. Bye-bye. Catch you next week. Bye.
