Postgres FM - Postgres 18

Episode Date: September 26, 2025

Nik and Michael discuss the newly released Postgres 18 — the bigger things it includes, some of their personal highlights, and some thoughts towards the future.

Here are some links to things they mentioned:

Postgres 18 announcement https://www.postgresql.org/about/news/postgresql-18-released-3142
Postgres 18 release notes https://www.postgresql.org/docs/18/release-18.html
Skip scan episode with Peter Geoghegan https://postgres.fm/episodes/skip-scan
Easier Postgres fine-tuning with online_advisor https://neon.com/blog/easier-postgres-fine-tuning-with-online_advisor
pganalyze Index Advisor https://pganalyze.com/index-advisor
BUFFERS by default https://postgres.fm/episodes/buffers-by-default
Buffers II (the sequel) https://postgres.fm/episodes/buffers-ii-the-sequel
Return of the BUFFERS https://postgres.fm/episodes/return-of-the-buffers
UUID https://postgres.fm/episodes/uuid
Partitioning by ULID https://postgres.fm/episodes/partitioning-by-ulid
uuidv7 and uuid_extract_timestamp functions https://www.postgresql.org/docs/current/functions-uuid.html
Add --no-policies option to pg_dump, pg_dumpall, and pg_restore https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=cd3c45125
Add ONLY support for VACUUM and ANALYZE https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=62ddf7ee9
Make "vacuumdb --analyze-only" process partitioned tables (committed recently for Postgres 19) https://commitfest.postgresql.org/patch/5871/
NOT VALID constraints https://postgres.fm/episodes/not-valid-constraints
The year of the Lock Manager's Revenge (post by Jeremy Schneider) https://ardentperf.com/2024/03/03/postgres-indexes-partitioning-and-lwlocklockmanager-scalability
Increase the number of fast-path lock slots https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c4d5cb71d
"enable" parameters will work differently in Postgres 18 https://www.pgmustard.com/blog/enable-parameters-work-differently-in-postgres-18
logerrors https://github.com/munakoiso/logerrors

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork

Transcript
Hello, hello. Postgres FM. My name is Nik, Postgres.AI, and my co-host, as usual, is Michael, pgMustard. Hi, Michael. Hi, Nik. How's it going? I'm very good. How are you? Yeah, very good also. It's again this time of the year, right? When it's obviously the right moment to discuss something. We make a mystery of it, but people already saw the title, and I'm pretty sure you won't hide it in the title, right? This elephant in the room. I was going to call the episode Christmas for Postgres users. Big box of gifts.

Exactly. Cool. Okay, let's talk about it. Well, for anybody brand new to Postgres, or relatively new to the show, we've done these episodes each year when the major releases have come out. Postgres has a really good track record, for many, many years now, of doing a single major release each year, around this time of year, and that's the only release in the year that will include new functionality and breaking changes — very semantic-versioning style, a major version.

No, it's not SemVer at all. And for many years we've had a dispute inside my team: SemVer or not SemVer? I noticed, by the way, that TigerData — Timescale — appreciate the Postgres approach: only two numbers, not three, because original SemVer is three numbers. True, true, true. Right, and three numbers — I understand the logic behind three-number versioning, but I like the Postgres approach much more. It's much more simplified.

Cool. Well, breaking changes will only ever go into a major version. But you're right: we would still cut a new major version even if it only contained new features, even if it didn't have breaking changes. So that's a good point.

Yeah, yeah. It's only two numbers: major version, minor version, simplicity. And the major version changes every year. Major version every year, minor versions at least every quarter, sometimes more often if they need to do quick security fixes.

So yeah, this is the time of year where we get new features, and the only time of the year that we get new features in core Postgres. As such, there tend to be a lot of them, so we've got a lot of things we could be choosing — no chance we're going to talk about them all today. The idea is to talk about some of our favorites, some of the categories of things that are coming, things that we've been involved with or emotionally attached to or wanting for many years, all sorts. Where would you like to start?

Well, first of all, maybe we should just mention the features we already discussed in depth — we had whole episodes for some of the features which are coming out. Right. So, for example, skip scans in B-tree, right? Yeah, that's a huge one. And we had a great chat with Peter Geoghegan, didn't we, all about them? Right, right. And the TL;DR version of it: sometimes an index on two columns will now be able to support searches on only the second column. So the rule we got used to over time — that we must put that column in the first place, or create an additional index on just the second column — is now sometimes not true, and things become more complicated when we make decisions on indexing, right? Yeah. That's my perception: it's even more complicated now.

Yeah, I would add a little bit that it's about the efficiency of scanning that index as well. We could previously scan a full index for queries that didn't have filters on the leading columns — it just had no chance of being efficient, or was very unlikely to be efficient, depending on the exact case. Now Postgres can efficiently scan, skipping across equal values of the first column, if the first column doesn't have high cardinality. I think you gave the example a while ago of a boolean: if we had only true and false, we could jump to the true values, check them for the thing we want to filter on, then skip the rest of the true values, jump to the false ones and pick up again there. So it's a much, much more efficient way of scanning indexes that aren't perfect for our query. For performance, it's excellent news. Excellent.
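To make the skip scan idea concrete, here is a minimal sketch with a hypothetical table and index (names invented for illustration); on Postgres 18, the last query can use the two-column index via a skip scan even though it doesn't filter on the leading column:

```sql
-- Hypothetical schema: a two-column index whose leading column has low cardinality.
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    is_paid     boolean NOT NULL,
    customer_id bigint  NOT NULL
);

CREATE INDEX orders_is_paid_customer_id_idx ON orders (is_paid, customer_id);

-- Filters on both columns always worked well with this index:
SELECT * FROM orders WHERE is_paid AND customer_id = 42;

-- New in 18: a filter on the second column alone can be served by skipping
-- over each distinct value of is_paid (true, false) within the same index.
EXPLAIN (ANALYZE)
SELECT * FROM orders WHERE customer_id = 42;
```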
I'm just thinking about the decision-making process when we have a lot of tables and a complex workload and we need to choose a proper index set. And more and more lately, as you know, I've shifted my mind totally to full automation, because of the self-driving Postgres idea, and I see several good attempts to create index advisor mechanisms. One more is from Konstantin Knizhnik — online_advisor, there's a blog post Neon published this week — and there is also the pganalyze Index Advisor. I was just thinking how this new feature will affect those tools. Interesting to see.

The good news for users, though, is that this is one of those features where, if you upgrade, you might just see certain queries get faster without having to add new indexes. If you haven't thoroughly gone through all of your queries optimizing them with perfect indexes already, there's a good chance that some of your existing indexes will better serve queries that you haven't optimized.

So if in the past we could consider something like this a mistake, now Postgres is more forgiving in this area. Much more. Yeah. Good, good. So yeah, this is one of those cool features that loads of people are going to benefit from without even necessarily knowing it. Although, as usual, my advice remains to verify everything with real testing, with proper data sets and settings of the Postgres planner — and work_mem, which is not a planner setting — and just verify everything. Yeah, testing. Cool. And maybe we will be able to drop some indexes because of this feature, who knows — it's yet to be discovered, because we don't like having a lot of indexes, for many reasons, and one of those reasons will be discussed later, I'm pretty sure.

What else have we already discussed? The one we've discussed — I think we've had at least two, maybe three episodes on — was whether BUFFERS should be on. Oh, okay, now I get it. Yes, this is our favorite, and this is big news for this podcast, actually. This is why in January I said — I think it was committed back in January, or it was already obvious it was going to be committed. So we spent two years, actually — this is my point of view — we spent two years advertising for using BUFFERS inside EXPLAIN when you do EXPLAIN ANALYZE. And finally, I feel slightly sad, because we don't need this anymore. I actually had a t-shirt with this: EXPLAIN ANALYZE BUFFERS, with BUFFERS in bold. So yeah, after Postgres 18 hits your Postgres servers, it's not needed anymore — you just say EXPLAIN ANALYZE and you get buffers.

Yeah, I had a slight shift too. I know the pgMustard website has a recommendation for how to properly collect execution plans, and I've shifted my mind already — the focus shifted to SETTINGS, WAL, and VERBOSE. Great. Yeah.
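As a quick sketch of what that looks like in practice (works with any query of your own; the orders table is just the made-up example from above):

```sql
-- Postgres 17 and older: BUFFERS has to be requested explicitly.
EXPLAIN (ANALYZE, BUFFERS, WAL, SETTINGS, VERBOSE)
SELECT count(*) FROM orders;

-- Postgres 18: EXPLAIN ANALYZE includes buffer statistics by default,
-- so one word fewer (use BUFFERS OFF if you ever want to suppress them).
EXPLAIN (ANALYZE, WAL, SETTINGS, VERBOSE)
SELECT count(*) FROM orders;
```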
So, for older Postgres versions we write EXPLAIN (ANALYZE, BUFFERS, WAL, SETTINGS, VERBOSE); from now on it will be one word fewer. But it still feels like there is room for improvement here, and we had this discussion, so let's just refer to it and move on.

I think what I got confused with is that we also had a couple of episodes about UUID. Maybe one of them was without you, actually. It was. Yeah, this work on UUID version 7 started a couple of years ago, when Kirk Wolak joined the Postgres hacking sessions where Andrey Borodin and I try to cook something up online on the Postgres TV YouTube channel. He joined with the idea of implementing ULID, originally; then it shifted to UUID version 7. But then the Postgres project needed to wait for quite some time, because the RFC was not finalized, and Peter Eisentraut's position was that we should release it only when it's polished, because who knows, maybe something is going to change. Very conservative — everyone already had it: various npm modules, Python libraries, Google projects, they all already had UUID version 7 support even though the RFC was not finalized. But Postgres took a very conservative position. Maybe that's good, because a database should be conservative, right?

And we need to support it for five years, right? Whatever we implement, we're saying we're supporting it in that version for five years — even if you manage to deprecate it in the next version, which is also nearly impossible: normally you need to give warnings that it's being deprecated, then actually remove it in a later version. So I really understand that. And Postgres does, in a sense, support UUID version 7 in older versions — you've said this multiple times — you just have to generate the values either outside of the database or with your own function. You've done it even within Postgres. Yeah, this is what we've been doing for a couple of years already; we have a how-to for doing it in PL/pgSQL or just with pure SQL. It was part of my how-tos — already aged. And you can combine it with partitioning, TimescaleDB, or whatever.

But this time it's basically syntactic sugar, honestly — it's a nice function. And since it's official, we can expect it will be more widely used. Yes. It was discussed a couple of times on Hacker News, and people raised concerns about security, about the ability to guess values. I don't get those concerns, because it's really hard to guess anything from those values. You can extract the timestamp — and, unlike the recommendation from the original RFC authors, Postgres decided to provide a function to extract the timestamp from a UUID version 7 value, which I find useful, for example, for implementing partitioning. So anyway, this is a great feature, we discussed it in depth, let's not spend more time on it. But note it's not a new data type — we just get a couple of functions, to generate values and to extract the timestamp. The data type remains plain uuid; it's version agnostic.

Yeah, we already have the perfect data type — we've had it for many years. Now we have nice functions around it too, in core, and everything in core is great. Just use it. It's better than version 4 for performance reasons, and also convenient. I still find it helpful to keep a timestamp like created_at separately, although you lose some extra bytes — it's a matter of comparing what's better for you: rely on the ID value, if it's UUID version 7, or keep created_at as a separate column. I think it's worth a separate blog post to compare the pros and cons — an interesting topic for developers.
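For reference, a small sketch of the new functions (the events table is a made-up example):

```sql
-- Generate a time-ordered UUID and pull the embedded timestamp back out.
SELECT uuidv7();
SELECT uuid_extract_timestamp(uuidv7());  -- returns timestamptz
SELECT uuid_extract_version(uuidv7());    -- returns 7

-- Hypothetical table using it as the default primary key:
CREATE TABLE events (
    id      uuid PRIMARY KEY DEFAULT uuidv7(),
    payload jsonb
);
```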
What else? We also discussed one a few times which I'm listed as the author of, right, but I vibe coded it originally: the --no-policies support for pg_dump and pg_restore. I know there is an issue some other folks were fixing, because it didn't eliminate comments — policies can have comments, you know. So if you specify --no-policies, there was additional work following up on this patch. It's a pity that I missed it, but in my defense, other --no-something options in pg_dump and pg_restore also forgot about comments. And this work was done with Claude 3.7 in January, I think; it was committed later by Tom Lane with some improvements.

When I say vibe coded — it was originally vibe coded, but before sending a patch I always review everything line by line and polish it myself, and we use multiple LLMs. So strictly speaking it's not vibe coding; we call it vibe hacking. It's just good for prototyping to use some help from an LLM if you don't write C code every day. And I think it unlocks the power for people who are not C coders — they can implement things and present patches. But obviously, if you do it without looking into the code and send a purely vibe-coded patch, that's a terrible idea. Inside my company, everyone is using LLMs to code, but it's prohibited right now to start coding on your own without thinking about LLMs — and it's also prohibited not to take responsibility for the final result. If it's some BS code, it's on your shoulders; you cannot blame the AI, right? And if you send some proposal — open a pull request, a merge request, or send patches — obviously you are responsible for what you are sending, right?

Yeah, well, I think it's quite rude as well, isn't it? Because when you send a proposal, a PR, a merge request, anything, you're asking other people to review it, and I think the minimum you owe them is that you've reviewed it yourself before you ask other people to review it. That didn't used to be an issue, because by writing something you are, in a way, reviewing it. But now that people have options that involve not even writing it, I think it's a step worth being insistent on.

If you're not writing C code all the time but you want to implement something, go take AI to help you, but then take another AI to review it and explain every line to you, so you fully understand the final result, you're happy with it, and you have tests and documentation. By the way, a small tip: start with writing the documentation first. Usually people find it boring to think about documentation and tests; with AI it's slightly more productive. And the Postgres test system is Perl, so it's really not easy to deal with if you never wrote Perl, or already forgot it after 20 years, for example. Anyway, I think AI is great here — it's unlocking creativity — but the final result is on your shoulders, fully.

And also, before this episode, I checked a couple of articles about Postgres 18, and I found terrible articles. Yeah, terrible articles.
I'm building a list of companies who produce BS blog posts — obviously LLM-generated. And this is not okay, in my opinion. Do you mind if I name the companies? Not at all. I find it quite frustrating. And I think, again, it's rude, right? You're taking a shortcut and publishing something that you hope will be read by many, many people, who together spend many more hours reading it than you spent writing it.

Yeah — in the past I saw a lot of BS articles about Postgres from a website called minerva.xyz, or minervadb.xyz, something like this. This time, a stormatics.tech blog post has a lot of wrong information. Well, yeah, I guess this goes back to your age-old thing of always verify, right? But sadly now we need to make sure we're also verifying things from what we considered, maybe in the past, to be reputable companies.

Actually, let me correct myself: that stormatics.tech article was not about Postgres 18 — I'm in the context of building some police bot for blog posts about Postgres. That blog post was about the optimization of moving pg_wal to a different disk, and it had a lot of mistakes, for example mentioning undo in WAL. But about Postgres 18, there's a blog post published on dev.to — some dev tips account — and it's clearly LLM-generated, because it says logical replication gets full DDL support. Well, we know it doesn't — this is one of the pains we still have. Obviously LLM-generated. So before we consider problems with purely vibe-coded patches — not vibe-hacked, as I said — we have a bigger problem of misinformation being spread in blog posts.

And what do you think — I think it would help to have some tool to quickly verify, to do some fact-checking. So AI checking AI, and so on. The challenge is false positives and false negatives, right? If an AI is capable of coming up with this stuff, who's to say an AI wouldn't also rubber-stamp it? So there's not catching the ones that are AI-written, and then there's the issue of flagging blog posts that were human-written but it thinks are AI, and accusing those of being AI-written. I think this is an unsolved problem, isn't it? Doesn't education have this exact same issue with coursework and things?

Right. Humans also hallucinate — I've hallucinated on this very podcast sometimes. I think we shouldn't frame it as AI checking AI; we should frame it as: I have this information, it can be a blog post or some report, and I want it fact-checked. We actually have such tools internally already, because we do a lot of consulting, we write reports, and we need to do fact-checking — because sometimes you think this is how Postgres works, but it worked like that 10 years ago and it has already changed. So we need to dig deeper and do fact-checking all the time. For example, I was wrong recently when I said that changing your primary key from integer (int4) to bigint (int8) is going to produce a lot of bloat. I was completely wrong — deeply wrong in that case. It turned out there is a table-rewrite mechanism, similar to VACUUM FULL, and the table that comes out of that ALTER TABLE ... ALTER COLUMN is actually fresh, and all indexes are freshly rebuilt. Sometimes it's acceptable to have that downtime, a maintenance window — of course, it's a completely offline operation — but no bloat. So I sometimes don't trust even myself, and we could use this fact-checking tool. We use it internally; I'm thinking about presenting it as an external tool, so we could check blog posts we suspect are not accurate.

I think that — well, accuracy is slightly different from being LLM-generated, and it depends what you want: do you want accurate posts or do you want human-written posts? Those are subtly different. I don't care who wrote it — I want accurate posts. Yes.
So I wasn't even talking about inaccuracies in the articles — I was talking about inaccuracies in the checking. Even if you assume an article is completely accurate and human-written, an LLM checker could still say: we think this has some hallucinations in it. You know, hallucinating the hallucination. So I don't have much hope in this, other than trust: if you as a company start publishing posts that you haven't checked — whether with humans, or posts that have inaccuracies in them — I'll forgive a few, but if you're doing it consistently and there's loads of made-up stuff, I'm going to stop reading your stuff and stop sharing it.

So an LLM can be considered an amplifier of your state: if you are not accurate, you produce a lot of bad advice and wrong things, and it just amplifies that, right? To connect the bits a little and finalize this AI discussion: I find it also interesting that sometimes hallucinations are really helpful. I had several hallucinations which led to the idea that, actually, this missing feature would be great to have. One great hallucination: let's have DDL in Postgres 18. Already too late — there is work in progress, unfortunately. And by the way, I recently checked: support of DDL in logical replication — we discussed it, right? — we have low expectations that it will be implemented very soon, unfortunately. And it's a lot of work, isn't it? There's probably lower-hanging fruit, like sequences and things. Yeah, that is also work in progress, and I think — is it already going into Postgres 19 next year? I'm not sure; I don't want to hallucinate here.

But sometimes, just writing down some ideas, I get great ones. For example, we had a case with corruption and we needed to fix it with old backups, so we needed to apply pg_resetwal. The idea was that we needed to recover from some backups, and they were broken, so the LLM suggested using pg_resetwal with a system-identifier option — which doesn't exist. So I ended up implementing that option and proposing a patch. Unfortunately, it was not finished before Postgres 18, but maybe it will be finished before Postgres 19 and we will have this option — there is a discussion about it. That's funny. Yeah, and a few things like this. So it's an interesting thing to have LLMs, but you must fact-check; and sometimes you think, oh, this is a good idea, and you go implement it. So why not?

Yeah, all right. While we're on the topic of corruption, another thing we've discussed a few times is checksums and encouraging people to enable them — and they're going to be on by default from 18 onwards. I think it's long overdue. Yeah, it was. But great. A great thing — minus one recommendation we will have for our clients in the future. RDS had it for ages, right? Checksums are enabled there by default. Google Cloud also, yeah. Just a great thing.

I think quite a few of the hosting providers do, but each of them had to know to do that — it's another one of those things where any new hosting provider that comes along has to know to do it, or their customers don't get it by default. Now they don't even have to know, and they'll get it on by default, which I think is great.
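A quick way to see where your own cluster stands (a sketch; clusters initialized under 18's defaults should report this as on, and initdb has an option to opt out if you really want to):

```sql
-- Read-only parameter showing whether data checksums were enabled at initdb time.
SHOW data_checksums;
```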
Yeah. Another one we've discussed a few times is pg_upgrade preserving optimizer statistics across a major version upgrade. This is a huge pain, especially because, unlike the previous topic, this is not solved by managed Postgres providers. I didn't see any of them — I mean major providers like RDS, Cloud SQL, and others — do it; they always put it on the users' shoulders to run ANALYZE after a major upgrade, and we had outages because of that.

So now — well, in the future, not now: you cannot use it until your old cluster is already on 18, so the first time we will be able to benefit from it is in one or two years, when you upgrade from 18 to a newer version. But anyway, this is great news, and I feel really happy that this is finally solved. And it's solved in a very elegant way, because now you can dump statistics, so it's not only for upgrades — you can use it in other areas. Because, you know, pg_restore also doesn't care about it: you restore and you need to run ANALYZE manually. Also VACUUM, actually. So the recommendation used to be: after pg_restore — after restoring from a logical backup, a.k.a. a dump ("logical backup" is a fine name for a dump, but not just "backup") — run VACUUM ANALYZE. Now, just VACUUM, to have visibility maps created sooner for your index-only scans. Well, if you forget VACUUM it's not a big deal — autovacuum will take care of it eventually — but lacking statistics is a big danger. Now you can dump statistics and restore statistics; this is great. I think we can benefit from this sooner than from having it in upgrades, because once you're already running Postgres 18, you can start using it.

By the way, I think we might owe at least one cloud provider an apology. Do you know who implemented this feature? Three guys, all from AWS. So I think at least one cloud provider does deserve credit for implementing this feature — and they've actually implemented it in a way that everybody benefits, which is cool. Right, but why do we need to apologize? Well, because you said all of the cloud providers have just shoved this onto users and didn't do anything about it. I think, yes, we can apologize easily, not a problem. And this is great — I knew this feature came from the RDS team. This is great.

But still, if you run an upgrade on RDS, it has 16 steps. Yeah. The official documentation for upgrading with pg_upgrade has 19 steps; RDS has 16 steps. And one of them — one of the last ones — is that you need to run ANALYZE. And people tend to forget, or just say, okay, maybe it's not super important — and it happens; it's manual. So I stand my ground: they didn't automate it, they put it on users' shoulders. But of course, now they need to wait until 18, and then 19, and then it will be fully automated, which is great. I heard reasons why they don't automate it: it can take time, prolonging the downtime window, and there is an official approach — analyze in stages — which I think is a mistake, honestly,
because in OLTP cases we'd better get final statistics sooner and do it inside the maintenance window, right? Yeah, I just think if you care about the downtime, you should probably be working out how to do a zero-downtime upgrade in some way, shape, or form, and then you don't have to worry about this problem anyway. So I think if you can take the downtime, why not just do the full ANALYZE?

Exactly. And also in this area — pg_upgrade — did you see that it got parallelization, the --jobs option? Yes, I did, yeah. This is also great news for those who care about upgrades. Everyone should care about upgrades, right? Yeah, especially when we're talking about a new major version. But anyway, am I right that statistics can be dumped and restored even outside the context of upgrades? If so, then people can already benefit from this. Yeah — pg_dump got an option for statistics. This gets discussed in the context of upgrades only, but again, it's not only about upgrades: once your server is running Postgres 18, you can already start benefiting — adjust your dump scripts and always dump statistics, and then you can just restore them, and that's it.

I also noticed that vacuumdb in Postgres 18 receives a --missing-stats-only option, to compute only missing optimizer statistics. This is interesting, because vacuumdb also has --jobs, to move faster if you have many cores. I think we need the missing-statistics option because currently extended statistics aren't dumped and aren't preserved. So if you created those — you know, when we talked about correlations between two columns — they are not preserved. So I suspect that option got added so that we still don't have to run a full ANALYZE if we've used CREATE STATISTICS, the extended statistics.

Yeah, that's interesting. But also, be careful with vacuumdb --jobs — I love it, because if you have an increased default_statistics_target, or a lot of tables and so on, it's great to have parallelized processing here. But unfortunately, until version 19 — so next year only — partitioned tables will still be in trouble, because vacuumdb doesn't take proper care of them: it only cares about the partitions themselves, and the root table is not processed.
And if you then process the root with ANALYZE, it will process all the partitions once again anyway. So there's a problem here. Do we have any feature for that in 18? No — that should be in 19. We have ANALYZE ONLY for the parent. No, no, no — teaching vacuumdb to handle the parent is separate work; it may be committed already, but it's not in Postgres 18. Laurenz Albe did it, I noticed, because I also wanted to do it, but it was only committed recently. What 18 has is "allow VACUUM and ANALYZE to process partitioned tables without processing their children", which is great, but that's not the vacuumdb piece.

Yeah, okay, but we can run ANALYZE ONLY with the parent table name — you're right, that's a separate thing. Plain ANALYZE we've had for ages — vacuumdb --analyze-only is what we do after an upgrade. I know, because that was our recipe until we hit this very problem, where partitioned tables — the root tables — lacked statistics and some queries suffered from it. No, but if you ran ANALYZE on the parent, it would also gather statistics on all of the child partitions as well.

There are many things here. There is the single-threaded SQL command — ANALYZE, or VACUUM ANALYZE, doesn't matter, they're single-threaded. And there is also vacuumdb, which has the -j / --jobs option, which helps you parallelize. The problem is, when you run vacuumdb -j 60 --analyze-only, the root tables of partitioned tables will lack statistics afterwards, and this was only solved after Postgres 18 beta 1 was out, so we need to wait until 19. Sure, so that will be in 19 — until then we cannot both move fast and take care of partitioned tables properly; you need to do a separate task.

So in Postgres 18: first of all, statistics can be dumped, which is great — pg_upgrade is not in trouble anymore. Okay, but sometimes we still need to rebuild statistics. In Postgres 18, vacuumdb also received --missing-stats-only, and VACUUM — the single-threaded SQL command — received the ability to take care of partitioned tables properly. Not just VACUUM, but also ANALYZE. Yeah — VACUUM, VACUUM ANALYZE, and ANALYZE. Though VACUUM on its own is a bit pointless there: I don't think there's any point vacuuming a partitioned parent, because it's not going to have any data in it. But it needs statistics — VACUUM ANALYZE. Exactly, it does need statistics separately from its children. And in the past — in version 17, 16, 15 — to get statistics on the parent, I'd have to run ANALYZE on the parent table, which would also gather statistics for all of the child tables, which is so much more work.

So here is what happens: we say vacuumdb -j is great because we have, like, 96 cores — -j 96, let's go. But then, oh, actually, the root tables in the partitioned case lack statistics. We need to run ANALYZE on them — but when you run ANALYZE on a root table, it recalculates statistics for all partitions once again. Now in 18 we have the ability to skip that and say: okay, vacuumdb -j, move really fast, and then additionally run ANALYZE ONLY on the partitioned tables — only the roots, skipping the partitions themselves, because vacuumdb already did them. So now we have a full recipe for Postgres 18 for how to recalculate statistics for the whole database and move really fast inside a maintenance window: vacuumdb -j, and then separately take care of the root tables of partitioned tables, skipping the partitions themselves. This will be the fastest. In Postgres 19, vacuumdb will take care of this itself and we'll be able to drop the second step. Yeah — I didn't check the commit, but it's possible; I said --missing-stats-only might be for extended statistics (CREATE STATISTICS), but it might also cover parent partitions. That could be another place you're missing stats. I haven't checked.
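A sketch of the second step of that recipe, with a hypothetical partitioned table (names invented):

```sql
-- measurements is a partitioned table with partitions measurements_2025_01, ...
-- Postgres 18: ONLY restricts ANALYZE to the partitioned (root) table itself,
-- so its statistics are rebuilt without re-analyzing every partition.
ANALYZE ONLY measurements;

-- Without ONLY, the parent and all of its partitions are analyzed again:
ANALYZE measurements;
```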
Anyway, 19 will be excellent, and we already have the pieces of the puzzle, so we can move fast. Honestly, I think VACUUM and ANALYZE — I mean the SQL commands — could have some option to say: I want additional workers. vacuumdb is good, but in the managed Postgres case, like RDS, you cannot easily run vacuumdb, because you need an instance running in the same region or something — it's a terrible idea to run it from a laptop, right, because of connection issues you will lose it — so you need an instance next to it, some maintenance host, and run it from there. It's not convenient. It would be great to just tell VACUUM: I want 16 or 96 jobs, let's go at full speed, we are inside a maintenance window. Makes sense, right? I think so.

I'm thinking you might always want to have more — if you've got regularly scheduled jobs, you might always want to be able to give it more. You might even want a configuration parameter like: always give my vacuums up to three workers, or something. Yeah. Anyway, getting into autovacuum territory here. Let's maybe jump between some other pieces. There are actually so many great things — I like the release overall, a lot of stuff. For example, I like that regular not-null constraints — not check constraints — can now be defined as NOT VALID. As we discussed a few times, NOT VALID is terrible naming, because such constraints are immediately enforced for new writes; they are just not checked for existing rows.

I think we had all the pieces for this before, because since Postgres 12 you could rely on check constraints — not not-null constraints; there is NOT NULL and there is the CHECK constraint. For primary keys we need regular not-null constraints, but implicitly we could already use a regular CHECK (... IS NOT NULL) constraint and then do some dance around that. Now it's becoming easier to redefine primary keys without thinking about all these nuances. I like this — things are being polished in many areas now, like we just discussed with statistics, and now this. Maturity of many pieces, many functionalities, yeah.
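A sketch of how the new syntax reads, as I understand it (table and constraint names are hypothetical; on older versions the usual workaround was a CHECK (col IS NOT NULL) NOT VALID constraint):

```sql
-- Postgres 18: add a not-null constraint without scanning existing rows yet;
-- it is enforced for new writes immediately.
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_id_not_null
    NOT NULL customer_id NOT VALID;

-- Later, validate existing rows without a long heavy lock on the table.
ALTER TABLE orders
    VALIDATE CONSTRAINT orders_customer_id_not_null;
```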
It feels like a release chock-full of things that have come up for real — real users have hit these issues and people have come up with fixes for them. I know that's almost the definition of a maturing product, but it doesn't feel like there's that big divide between hackers working on things hackers want to work on versus users hitting problems and actually wanting them solved. It feels very tightly done, which is nice.

Yeah. Another thing I wanted to mention is the infamous number of fast-path slots for the lock manager. We had only 16. We had issues in multiple customer databases — several companies suffered — and I remember Jeremy Schneider's post about it, the year of the Lock Manager's Revenge. Yes, that was his post, maybe a couple of years ago, or maybe it was 2023, I think. And I raised this last year on the hackers list — that we probably need to make this adjustable, expose it as a setting — and Postgres 18 finally has it solved, but in a different way. Tomas Vondra worked on it mainly, I think. Instead of adding one more GUC setting — we already have almost 300, too many — it now takes the max_locks_per_transaction setting into account, I think, and automatically adjusts based on that. If you have a lot of cores and you tune your Postgres, you've likely raised that setting already, so the number of slots for fast-path checks in the lock manager mechanism will automatically rise. We checked one case where the 16-slot limit was hitting replicas — sometimes it hits your primary workload, sometimes replicas. In that case we had 16 by default before 18, but now, because that setting is already adjusted, we will have 64. And honestly, in half an hour there is a scheduled benchmark I need to conduct. Great.

But unfortunately, I remember early implementations of the idea of raising this threshold were not super successful in terms of helping with this performance cliff. So this should be tested very carefully in each particular case, because there are several variants of this cliff. If I trust anybody around performance cliffs, I do trust Tomas to have looked into it, so I'm optimistic. As we learned from our podcast episode, we have different definitions of performance cliff. True.
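A minimal sketch of the knob involved (the value is illustrative, and this parameter requires a restart to change):

```sql
-- In 18, the fast-path lock slots available to each backend scale with this
-- setting instead of being hard-coded at 16.
SHOW max_locks_per_transaction;   -- 64 by default

-- Illustrative only: raising it (followed by a restart) also raises the
-- number of fast-path slots each backend can use.
ALTER SYSTEM SET max_locks_per_transaction = 256;
```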
There's one huge feature in Postgres 18 we haven't talked about yet, which is asynchronous I/O. Yeah, well, this is the elephant in the room, honestly. And I must admit I lack experience with it, so I cannot tell you a lot about this. I know it's big work led by Andres Freund, right? Yeah. I remember following this work — it's a lot. I hope we will see good results in benchmarks on the actual systems we help manage, but I don't have that experience yet. Looking forward to it, honestly.

So yeah, it'll be good to benchmark it, because I think a lot of the normal benchmark-type workloads — pgbench, this bench, that kind of thing — won't see benefits from this work; those are lots and lots of small queries, and this is much better for larger scans. If you're benefiting from gathering a lot at once, you're going to see bigger differences when you're getting more blocks than one block, for example. So I think it'll be interesting to see in real-world workloads how big a difference it makes. It's another one of those things where you could see significant improvements in how fast vacuums run on some of your largest tables. I actually don't know exactly which operations have made it in, but I think bitmap heap scans definitely have, and vacuum definitely has, and there are a few others — and as more and more of them adopt it in future versions, I think we're going to see the benefits of this without having to change anything. Again, it's another one of those features that has managed to be committed on by default — and it hasn't been shipped with really conservative defaults, I don't think, is how I'd phrase it.
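For the curious, a sketch of where to look once you're on 18 — assuming the new AIO settings are exposed as io_method and io_workers (names worth verifying against the 18 documentation):

```sql
-- How asynchronous I/O is performed: 'worker' is the default in 18,
-- with 'io_uring' available on Linux and 'sync' as the conservative fallback.
SHOW io_method;

-- Number of background I/O worker processes used when io_method = 'worker'.
SHOW io_workers;
```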
Yeah, I would also pay attention to the additional CLI tools recent versions are bringing. In Postgres 17 I somehow overlooked it — we didn't touch it in our discussion — but later somebody pointed out that a big new tool had appeared: pg_createsubscriber, converting a physical replica to a logical one, which is great, absolutely. This is automation that lets us throw out a lot of our own automation, which I always love to do. It's official and it's working. I'm not sure about managed Postgres setups yet, because I think they must expose it via their APIs and CLIs — it needs to be run on the server, and we don't have access there — but in self-managed setups it's already been available since last year, I mean since your production started running Postgres 17.

But also, this release, Postgres 18, gets the extension pg_logicalinspect. Well, that's not a CLI tool but an extension — I'm not sure what's inside in detail, but I would like to explore it, because logical replication is quite complex, and inspecting logical snapshots sounds like a really good idea. And there is also the extension pg_overexplain — I know, I'm jumping between CLI tools and extensions. But this is a good thing to understand, I think. I remember how it was born; Robert Haas did it, right? Yeah. And maybe you can tell more about this, because it's closer to your field, dealing with EXPLAIN plans.

Yeah — first, kudos for an awesome name. How good a name is pg_overexplain? Overtune. Yeah. I think it does two things that are quite important. One is that it's the first customer of the ability to extend EXPLAIN from an extension. So it's a good example for people who want to add more features to EXPLAIN in the future — over the last four or five versions we've had quite a few new parameters added to EXPLAIN, and I suspect there's some belief that maybe some of those could have been extensions for a while first, to see how popular they were, before adding them for everybody. So first and foremost, Robert made EXPLAIN extensible, and then pg_overexplain is an example of that. But it also exposes more information than a user would normally want from EXPLAIN — information that's generally useful for hackers, people working on the Postgres source code or the optimizer, trying to understand why it's making certain decisions while they're working on a new feature.
So, for example, the behaviour around disabled nodes. You know when you're testing — why isn't Postgres using my index? — and you set enable_seqscan to off so that sequential scans are completely disabled. Well, really you're discouraging sequential scans, so that if it could possibly use your index, it will. In the past that used to work by adding a massive cost constant to sequential scans, so they'd be discouraged but not impossible. That's changed — I know Robert was happy to be the person working on that one as well — and it's now done by counting the number of disabled nodes as the first tiebreak, and then cost as the second tiebreak, which is a really neat implementation. I've actually blogged about that. But pg_overexplain, instead of telling you which nodes are disabled within the plan, will give you a counter of the number of disabled nodes so far. That's useful for hackers, but really confusing as a user, because you can see the disabled-nodes counter being one all the way up and you have to look back to find where the first one was. Luckily — I think it was David Rowley — that behaviour got backed out of regular EXPLAIN and made more user-friendly. But the point is, EXPLAIN is designed for users: the primary users of EXPLAIN and EXPLAIN ANALYZE are people trying to speed up slow queries. Users — you mean human users? I mean human users, not Postgres hackers, not people working on the optimizer. So pg_overexplain gives hackers a bit more information about exactly what the optimizer is doing at each stage, and I just don't think it's going to be useful for regular users.

Yeah. Let's also mention, moving to the macro level, that pg_stat_statements has a couple of improvements. We always pointed out — especially for Java applications, or applications that tend to set application_name to some session ID, blah blah blah — that SET commands could pollute pg_stat_statements, so we always said: set pg_stat_statements.track_utility to off so you don't track SET commands. Now in Postgres 18 it's not a problem anymore: the parameter is normalized to just $1, I think, right? It should be. Yeah. So everything becomes just one entry — SET application_name TO something. Normalization is implemented for SET commands.

And also something is improved in the area of tracking parallel activity, right? Yes, there are some extra columns. Never enough columns in pg_stat_statements. I have actually seen a conversation since then discouraging more columns — interesting that that one got in; I wouldn't have thought that would be the one that snuck in last. Talking of new columns, there are also some new ones on pg_stat_all_tables and similar views for vacuum tracking — vacuum and analyze time spent — which looks really useful. Yeah. I like this, and I think it's worth working on moving information from logs to SQL.
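A sketch of how those new counters might be queried — assuming they land in pg_stat_all_tables under the names below (worth verifying against the 18 release notes):

```sql
-- Cumulative time spent in (auto)vacuum and (auto)analyze per table.
SELECT relname,
       total_vacuum_time,
       total_autovacuum_time,
       total_analyze_time,
       total_autoanalyze_time
FROM pg_stat_all_tables
ORDER BY total_autovacuum_time DESC
LIMIT 10;
```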
Yeah, it's easier to monitor than dealing with logs. I can't wait until errors are tracked properly in Postgres. Yeah, there's work in progress in this area — I hope it will end up being added to 19. Not yet; maybe it won't work out, but I think errors should be tracked properly as well — counters at least, something. Yeah, because right now we need to go to the logs all the time. Although — I keep learning all the time; for the last 20 years I've been learning Postgres, and sometimes I feel I'm lagging, development is faster than my learning curve — I just learned that there is a pg_stat_database_conflicts view which exposes some types of errors. For example, it can tell you that a query was canceled because a tablespace was moved; it can also tell you it was canceled because of certain timeouts — but not statement timeout, not transaction timeout, not idle-in-transaction timeout. Yeah — lock timeouts, for example. Lock timeouts, yes, but not statement timeout, not transaction timeout, and a few more, like replication conflicts. So we track some counters, but the choices are really strange. And query cancellations due to very common reasons, like statement timeout — unfortunately you need to parse logs right now, or use an extension called logerrors. Not super popular, but very convenient — and unfortunately available almost nowhere, so only in the self-managed case, I guess. But I hope this idea from logerrors will finally be implemented in Postgres itself in the future. Okay, good — we touched on the future a little bit as well.

I had one more favourite I wanted to give a shout-out to — obviously it's EXPLAIN-related, being me. I think there's a really big improvement that's gone a bit under the radar for many, which is that actual row counts are now reported as decimals. This is most important when you have loops — lots of loops. If you've got 100,000 loops, it will report, say, 1.5 rows per loop, which is way more useful than 1 or 2, rounded. But it's even more important when you're around 0.5 rows per loop — anything between 0 and 1 is particularly interesting. Below 0.5 is very, very important, because it used to round to zero, and no matter what you multiply zero by, you're going to think that node is returning zero rows — and that's really not true when you've got 100,000 loops. Yeah, the mistake accumulates and multiplies. Exactly. Once you get past the looping operation you see the actual number of rows, but you don't know what happened in between. Now you don't have to worry about that: it's reported as a decimal, and you can multiply it.
Let me add one more, because I feel a connection to so many features: this is one where Andrey Borodin — and many others, actually — worked for quite some years. Finally, amcheck is getting support for GIN indexes. Yeah. This is big, because we do need it — we need to use amcheck often. I actually missed that in the release notes as well. It always worked for B-tree, to check indexes for corruption when you, for example, perform an OS upgrade or other such changes — you need to check for corruption from time to time. But GIN indexes — we couldn't do it. Well, usually, in self-managed cases — only in self-managed cases, I think — we used a patched version, so we tested that patch. It took several years for the patch to mature and get rid of false positives and false negatives — especially false negatives, where corruption is missed, but there were a lot of false positives as well. But now it's great, it's there, so it's time to use it.
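A sketch of what that can look like, assuming the new GIN check is exposed by amcheck as gin_index_check() (index names here are hypothetical):

```sql
CREATE EXTENSION IF NOT EXISTS amcheck;

-- B-tree verification has been available for a long time:
SELECT bt_index_check('orders_is_paid_customer_id_idx');

-- New in 18: verify a GIN index as well.
SELECT gin_index_check('documents_fts_gin_idx');
```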
Extend your amcheck checks to GIN — and if you don't run amcheck at all, you must... I mean, you should, not must, right? Because amcheck is a great tool. I think maybe it should be renamed, because we use it mostly to check indexes for corruption. But maybe it's a good name actually, because it's unique. I think for large Postgres databases everyone should check for corruption from time to time — for various types of corruption — and amcheck is a great tool for it. What is it — access method? It's "access method check". Yeah, makes sense. All right, good one. Thanks so much, Nikolay. Catch you next week. Enjoy it. Yeah, thank you so much.
