Postgres FM - Corruption
Episode Date: May 19, 2023

Nikolay and Michael discuss database corruption — various types, how they can come about, and what to do (and not do) if you come across it.

Here are links to a few things we mentioned:

- The dangers of streaming across versions of glibc (TripAdvisor mailing list thread)
- The floor is Java meme
- Data Corruption talk by Sebastian Webber (on Postgres TV)
- Data corruption monitoring & troubleshooting talk by Andrey Borodin (on Postgres TV)
- Postgres data corruption (document from Nikolay)
- Data Corruption and Bugs Runbook (document from Nikolay)
- Corruption (Postgres wiki)
- Checksums
- pg_checksums
- Original pg_checksums (by Credativ)
- amcheck
- Our episode on index maintenance
- 14.4 release notes about the create index / reindex concurrently issue and fix
- amcheck to check unique constraints in btree indexes (Commitfest entry)
- amcheck verification of GiST and GIN (Commitfest entry)
- How to corrupt your Postgres database (blog post from Cybertec)
- Christophe Pettus talk
- Christophe Pettus slides
- pg_hexedit
- pageinspect
- pg_catcheck

~~~

What did you like or not like? What should we discuss next time? Let us know on YouTube, on social media, or by commenting on our Google doc.

If you would like to share this episode, here's a good link (and thank you!)

~~~

Postgres FM is brought to you by:

- Nikolay Samokhvalov, founder of Postgres.ai
- Michael Christofides, founder of pgMustard

With special thanks to:

- Jessie Draws for the amazing artwork
Transcript
Hello, hello, this is Postgres.fm. I'm Nikolai and my co-host is Michael. Hi, Michael.
Hello, Nikolai.
And today we are talking about corruption, right?
Yeah, absolutely. Database corruption.
Technical corruption.
Yeah.
No money involved unless you need to pay some consultants a lot of money to fix it,
or at least to understand what's happening and how to escape from it.
Because sometimes you cannot fix it, you just need to escape, lose some data, and save what you can, and so on, right?
Yeah, these can be some pretty horrible stories, can't they? Every now and again, I see somebody mailing one of the mailing lists, or popping up on one of the Discords or Slacks or something like that. And they say something along the lines of, I'm getting this error message, help. And you can just tell, your heart sinks and you think, oh, they're in trouble here. This could be bad.
Right. Let's start maybe from what corruption is. What definition do you have in your mind? Because I've heard various kinds of definitions.
Yeah, good point. I guess it's a few categories of issue.
There's some things on the subtler side,
which are like incorrect information coming back,
let's say for a query,
all the way through to the database not starting.
You've probably got one written in front of you,
but I guess it's data not being stored on disk in the way it was supposed to be.
Something along those lines?
Yeah, well, we can distinguish a particular type of corruption called data corruption.
And in this case, that is quite broad, but I think it's quite a good definition.
You wrote something, you asked the database to write something, you tried to read it, and you cannot read it, or you read something else, and so on. For example, you wrote zero, but when you read it, it's one. This is corruption.
But if you cannot read it and you have some error instead, or just no response at all, well, sometimes it can be called corruption. But sometimes, if the server is down, it's not corruption, it's just down, right? Or it's not working well at all and has some bug or something.
Some people can call this corruption as well, but it probably doesn't involve any wrong data writes or data reads. Maybe just the server itself is not working properly.
And data is fine on disk, for example.
There's no real corruption there.
But if we apply this quite broad definition,
you wrote something, but you cannot read it
or read something else.
A lot of cases can be called corruption,
but which are not really corruption.
So I'm struggling to find a very good definition
for corruption.
Well, I think probably if we got given 100 cases and we had to categorize them into corruption and not corruption, we'd probably agree on 98 or 99 of them.
It feels like there's quite generally accepted cases.
Should we go through some of the more common ones?
How do you want to do this?
Yeah, definitely.
First of all, data corruption can be in the heap, in tables, for example. And it's not uncommon, because a lot of databases still have data checksums disabled, right? Because you need to specifically enable them, and this has some overhead. Some managed Postgres providers enable them by default.
Some don't.
But if you installed Postgres many years ago, it's hard to switch data checksums on, because basically you need initdb and dump/restore.
Or you need to play some tricks,
and we can discuss it with some extension.
But it's not easy if you have a big database under load
and the maintenance window is not welcome.
It's hard.
I see many cases when people keep postponing this, living without data checksums enabled.
So they probably have data corruption in table files easily.
It can also be in TOAST files, which are also part of the data files. Then it can be in various other places, for example in indexes. And let's discuss this in detail a little bit later. But now let's just walk around and see what else can be corrupted.
Visibility map can be corrupted.
Yeah, a bit more niche, that one, I guess. But the other major category I've seen a lot... like, I've heard of a lot of people that have had table data corrupted, indexes corrupted. It seems to be the most common one
to me, especially recently with the drama around version 14.
But the other one that I see quite often is
backup file corruption.
Oh, yes.
And that catches people out.
Yes.
And let's talk about this in detail as well.
Now let's just collect ideas for our episode.
And also, what else can be corrupted? WAL files potentially can be corrupted, but fortunately, for ages we have had WAL record checksums enabled. Everyone has it enabled, it's there already. That's why we don't hear a lot about WAL corruption.
If you think also about levels,
for example, data,
you write something to the table.
Corruption might happen at the software level. It might also happen at lower levels, the file system, for example. It might happen in hardware. So we also have layers. We have verticals and we also have a horizontal classification. It's like a matrix. When we say data files are corrupted, well, but what if something happened at the protocol level? You are writing zero, but the protocol substituted one.
You have data checksums enabled.
It's crazy.
It's probably not going to happen, but I'm just trying to explain about levels.
So protocol replaced your value, right?
And then Postgres wrote with data checksums enabled,
everything is fine, checksum is fine,
file system is fine, disks are fine.
No corruption, right?
But it already happened at upper level.
Yeah.
And you haven't even mentioned operating systems, I think.
And that's like one of the major.
Yeah.
So there are so many places.
The application... if the application took some value from user input and then replaced it, Postgres cannot notice it, because it's already at a much higher level.
But we won't be talking about application level.
Yeah.
Let's, as far as I'm concerned, I don't think if Postgres has been told to store a certain value and then it returns that same value, we're not talking about corruption.
So if the application is to blame, I think that's out of the scope for me of the definition here.
Cool.
Right.
Also, replication, both logical and physical, might have issues.
And it might spread some corruption, for example. Well, of course, if some data is corrupted on the
primary, it will be easily spread if you use physical. But interesting point here, if you use
logical, probably you'll be fine. It depends, of course.
Well, I was going to say you could have different corruption on each. They could be exactly the same, it could be different, all combinations seem to be possible. But yeah, really good point that logical can help avoid some of the types of corruption spreading. Makes sense. But equally, I've heard of problems. I was reading up on a TripAdvisor mailing list thread a while back where I think it was actually replication that was part of the problem.
It was replicating between two systems
as part of an upgrade,
but the replica was on a different operating system
and that ended up causing corruption.
So it almost certainly was glibc related.
I can't remember the details though.
Yeah, well, maybe to finish with the upper level, when application-level corruption happens. I remember I had a very weird case when my scripts based on psql returned one value, while the application had a different value from the database. And the application was Java.
You know this meme, the floor is Java, and all the junior developers looking like goats, they are all up in a tree. So it was Java. It was quite a good customer. And they asked me, like... I trust psql much more, of course. What happens in Java with all those things, Spring, Hibernate, a lot of stuff there, what's happening here? But then I noticed that in DBeaver, the value is the same as in Java. And I think, maybe we have an issue with psql, maybe some bug. But then I realized DBeaver is also Java.
Good point.
So if you use JDBC... It has extra_float_digits or something, so numbers look different sometimes, and you just need to change this and you'll be fine. So this is also a kind of corruption to me, because I expect one value, but it's really at an upper level. The stored value was fine, psql showed it well.
It's called extra_float_digits. The default in Postgres is not the same as the default in JDBC. And this is how you can easily get corruption. If you don't notice it... okay, it's good that I noticed it, but if you write some application and your application doesn't notice it, switching from JDBC to other things like ODBC, for example, switching back and forth can cause corruption. If you read something and write it to a different place, to a different table, you will have different results. It's corruption. Just caused by lack of knowledge about this setting.
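For reference, a minimal sketch of how this shows up, assuming an older Postgres (11 or earlier), where float output was rounded by default and the JDBC driver sets extra_float_digits = 3 on connect:

```sql
-- Server default on Postgres 11 and older rounds float8 text output:
SHOW extra_float_digits;   -- 0 on those versions
SELECT 0.1::float8;        -- displayed as 0.1

-- The JDBC driver sets extra_float_digits = 3 at connection time,
-- so the very same stored value renders with full precision:
SET extra_float_digits = 3;
SELECT 0.1::float8;        -- displayed as 0.10000000000000001
```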
Just realized we forgot to say that this was a listener request.
Oh, yeah, yeah, yeah.
So thank you for that.
And they mentioned they were interested in what kind of types of corruption there are,
how to detect it, how to fix it, how to safeguard from it.
And even, are there any performance penalties to some of those safeguards?
So that's an interesting question.
I feel like it's hard to carry on without talking about... well, they asked about the types of it. I think we've covered that a little bit, but it would be easier to keep going by talking about how we can detect it, or how we generally find out about this kind of thing.
Yeah, let's dive into specific types of corruption. But to finalize this overview of types of corruption, I must admit that it's one of those topics where, you know this saying, the more I learn, the more I understand that I don't know anything.
It's the topic where you can easily realize that,
oh my, this also can happen, this can happen.
And then you already have a feeling that there should be many more types and kinds of corruption, which probably keep going unnoticed by everyone, right? But at some point, we start talking about them. So we had a few
sessions on Postgres TV also, including with Andrey Borodin, where we discussed in detail
various things. I will try to mention a few approaches.
And also, I have a couple of documents where we try to summarize a lot of links, tools,
and also mitigation issues, monitoring approaches, and so on.
So where do you want to start?
Heap, Index, anything else?
Well, or even, as you said, some of those links could be useful. I was looking at the Postgres wiki, I thought it had a really good page on corruption.
It's good, but it's very, very limited.
I agree, but they make roughly the same point as you, which is that, almost by definition, a lot of cases of corruption are quite unique. If they happened all the time, hopefully there'd be things in place to mitigate them. So the people designing software systems really don't want this to happen. This is like... we spend quite a lot of episodes talking about things like performance, but performance doesn't matter at all if... you know, this is quite a low-level need. This is something people come to databases really wanting. If you say, oh, our database is a thousand times faster and has loads of other benefits, but there's a five percent higher chance of corruption, or... I don't even know how those numbers would work, but this is a really basic need. This is something that Postgres, I think, has quite a good reputation on generally, but we still have some quite famous cases of issues. So yeah, I thought it was a good starting point, that wiki article.
I'll link it up as well.
But you're right, it's not huge.
Definitely.
And when we talk about corruption, I always start also from that article.
It has some very core knowledge.
And in the first lines, you will see the first rule: if you try to fix something, make a backup of your data directory, or your backups, or whatever you have. Because while trying to investigate and fix it, you are going to use some interesting, risky tools and approaches. Some of them probably won't work well, and you may cause additional harm or just lose everything. So you need to start with an additional backup of everything.
Yeah, if you can copy everything
and make notes of what you're changing, if you can't.
There's two things though, right?
Firstly, you can make things worse.
You can end up in an unrecoverable state
when you were in a recoverable state previously.
And the second thing is, even if you do recover,
if you don't have a copy of what it was like
in a corrupted state,
you might struggle to understand what the root cause was, or there might be some useful information in there for working out what caused this. And it also, I think, says quite early on that you might want to contact professionals for this. So most of us in our day-to-day work who aren't consultants, I count myself in this, don't see this often, if ever, in our careers. Hopefully you'll never see this in your career. But consultants get called about this, I don't know, you could tell me the exact number, but probably at least monthly, seeing different cases of this kind of thing. So there are people that have seen this dozens of times, if not hundreds of times, in their career. So you might want to lean on their services, and it's probably worth it financially to pay somebody to help you with this.
It's one of the topics which consultants like my team and I like to do, because clients have fears here. If you explain properly what kinds of fears they should have, they easily have them, and of course, they are usually ready to invest into improving things.
For example, improving monitoring and mitigation and so on.
Let's start from backups, probably.
I find that as probably one of the easiest places.
Backups without proper testing are like Schrodinger backups, right?
Because nobody knows if they are working.
And of course, all backups should be tested.
Of course, it requires some resources. You need to provision virtual machines, for example,
and test them and so on. You need to test recovery from a backup. And usually it means... usually physical backups, they have a full backup, plus maybe delta backups, and also a WAL stream, a stream of 16 or 64 megabyte WAL files. So when we test, we need to test full backups, delta backups, and also we need to test all WALs and point-in-time recovery until the very last point before the next backup started. So it's actually a lot to test. Because otherwise, probably full backups are working, but later in the day, for example, you have daily backups and you test that the full backup is working, but then some WAL is missing, for example, or corrupted or something. Well, WAL corruption might be something which won't happen, as we discussed, but the backup tool compressing WALs with some innovative compression might cause corruption. You cannot restore it, for example. A full backup test passed, but for some WAL later, if you don't test the whole stream until the next full backup, you probably also won't be able to restore to some point.
Until the latest point, usually.
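As a rough sketch of what such a restore-and-PITR test involves on a throwaway machine, assuming WAL-G as the backup tool and a hypothetical target time (the exact commands depend entirely on your setup):

```bash
# Fetch the latest full backup into an empty data directory
wal-g backup-fetch "$PGDATA" LATEST

# Configure WAL replay up to a point in time, then promote
cat >> "$PGDATA/postgresql.auto.conf" <<'EOF'
restore_command = 'wal-g wal-fetch %f %p'
recovery_target_time = '2023-05-19 12:00:00+00'
recovery_target_action = 'promote'
EOF
touch "$PGDATA/recovery.signal"

# Start Postgres and let it replay WAL; failures here surface exactly
# the missing or corrupted WAL problems described above
pg_ctl -D "$PGDATA" start
```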
In my experience,
larger companies
tend to have this down. They tend to
have a process for doing this
on a semi-regular basis, if not
very regular basis. It's somebody's
job to do a full run of this
every couple of weeks or
every couple of months, depending on how
serious...
Every day. It should be the same frequency as backup creation.
Oh, sure, if you automate it. So you're talking about an automatic...
Everyone must automate it, otherwise the backup system is super weak. That's what people mean when they say untested backups are not backups. This is it. You need to test all backups. Of course, you need to pay for it, unfortunately.
And all WALs, including all WALs.
But a lot of people, I think,
there's a lot of, especially smaller companies,
startups that either have never tested a backup
or did it once as like a one-off.
Or they just trust RDS and that's it.
Yeah, exactly.
That's another good point about only having backups in one place, right?
But how can we be sure that RDS tests all backups properly?
Or that they're not corrupt, right?
How do you know they're not being stored in a way that's accidentally...
You need to test restore.
Oh, by the way, when you restore, there are also some levels.
First, you restore and can reach a recovery point and Postgres is working,
like accepting connections queries, right?
But then usually we think,
okay, what about data in this restored Postgres?
Maybe something wrong, right?
Maybe corruption from production propagated to backup.
And here we usually use a very simple trick: you dump it fully to /dev/null, causing reads of all data files, but not indexes, by the way, right? And dumping... it sounds silly probably, but this is a very good approach just to cause reading of all files. It might take a long time. And while this runs, you need to pay the cloud for these resources.
Resource consumption is significant here.
But this is the only option to test backup and see it definitely works, right?
So backup recovery works.
Point-in-time recovery to any point in the day works.
And we can read data.
But reading data, that's already verification of the data files, not the recovery itself, not the restoration process, right? It's already beyond backups. But this is a good place: if you have backup verification automation, at least some, maybe not every time, but sometimes, this is a good point where you can start testing various things, including data corruption risks and index corruption risks as well.
Yeah.
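A minimal sketch of that trick against the restored cluster (mydb is a placeholder):

```bash
# Dump everything to /dev/null: this forces a full read of all table and
# TOAST data, so broken pages surface as errors. Indexes are not read.
pg_dumpall > /dev/null

# Or for a single database:
pg_dump -d mydb -f /dev/null
```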
And also worth mentioning, some larger companies, you mentioned, they have some additional replica, for example, it might be delayed replica, used sometimes to quickly recover accidentally, mistakenly deleted files, deleted
records in database.
So they have, for example, a replica which lags some hours, like six hours or four hours,
always behind.
And this replica, usually, of course, is based on WAL shipping. And this replica can be considered as a test tool for the WAL stream. You still need to test recovery of full backups, but WALs are tested by this replica. Or you can install Database Lab Engine, and it will be reading WALs as well, and if it's down, there is a problem with WALs. So it's a complex system to test backups.
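For reference, the delayed standby described here is driven by a single setting; the value is just an example:

```
# postgresql.conf on the delayed standby: replay WAL with a 6-hour lag,
# leaving a window to stop replay before a bad change is applied
recovery_min_apply_delay = '6h'
```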
Yeah, that sounds good. And it's a really good point that these can be corrupted. And the best way to check is to see if you can restore it. S3 can have problems as well.
To store backups in object storage is very good because S3 or GCS, they have very reliable storage
so they don't lose data. But uptime might be worse than EBS volumes, for example.
So when you test it, you need to understand that sometimes it's slightly down, but they won't lose
your data. This is a very good thing. So some retry logic needs to be implemented. It's like
it's a whole project to test backups properly. But okay, let's shift to data checksums probably,
and then we'll talk about indexes. Because I think these topics are narrower.
So data checksums should be enabled.
Of course, there is some penalty overhead you need to pay for it.
But in my opinion, they should be enabled everywhere.
There is also pg_checksums, which, remember, all new versions have included.
And it helps you enable checksums, and there is a recipe,
so you can do it on replica first and perform switchover
and then cover all existing replicas.
So with very small downtime, close to zero, it's possible.
But for older versions, this tool, pg_checksums, didn't exist. But there is a third-party one. That tool is available for older Postgres versions, and it's also possible to use it with replication and switchover, achieving downtime close to zero.
Do you know what that one's called?
Also pgchecksums.
Okay, cool. Makes sense. I'll find it and link it up. I wasn't aware of that. Yeah, so they're both called similarly.
Maybe some difference.
I don't remember in detail.
But this is it.
You need to enable data checksums, and that's it.
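A minimal sketch of the offline path with the in-core tool (Postgres 12+; the cluster must be shut down cleanly first):

```bash
pg_ctl -D "$PGDATA" stop
pg_checksums --enable --progress -D "$PGDATA"
pg_ctl -D "$PGDATA" start

# Verify:
psql -Atc "SHOW data_checksums;"   # should return "on"
```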
Still, that doesn't cover everything. For example, some constraints can be violated silently.
Data checksums won't save you from this.
It's like upper level corruption.
Data checksums will save you from having wrong writes.
So you wrote something, disk, for example, corrupted.
File system has bugs.
And then you read something else or cannot read it at all.
This is where data checksums are needed.
Also, you need to choose: there's a setting that says whether you stop if corruption is detected, or you can continue but just have some errors in the logs.
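In Postgres terms, the knobs in this area look roughly like this (a sketch; both settings exist in core and default to off, which is the safe choice):

```sql
-- By default a checksum failure makes the read fail with an ERROR.
-- This superuser-only setting downgrades it to a WARNING so reads keep
-- going around the damage: useful for salvage, dangerous to leave on.
SET ignore_checksum_failure = on;

-- A harsher escape hatch: treat unreadable pages as empty (data loss!).
SET zero_damaged_pages = on;
```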
It's interesting.
I had very interesting discussions with very experienced people who have good experience with Oracle and SQL Server.
And we discussed what is better for current business. For example, you have e-commerce, which generates $1,000 per minute, for example, or per second.
And you have corruption and you have very good, strong tool and you enabled it and it puts your
server down if corruption is detected saying, oh, corruption, fire, right? Of course, you'd prefer to continue.
And if corruption is limited,
you don't want to be down at all.
So it's better to know about it,
mitigate somehow,
but you want to continue operating.
This is interesting, right?
And they also mentioned that SQL Server, for example, has an interesting approach: when corruption is found on one node, it can auto-heal, getting non-corrupted pages from replicas, for example from standby nodes, and bringing them in.
Interesting. Let's auto-heal.
Postgres doesn't have it, but it's an interesting idea.
But rule of thumb here, enable data checksums.
If you don't have them enabled, use this extension to enable.
So indexes.
Let's talk about indexes.
Yeah.
Amcheck is a good tool.
Very good tool.
Is it worth quickly covering, like, B-tree indexes specifically, why do we tend to see more index corruption than heap corruption?
Because of collation and glibc version changes, for example, if you upgrade from one Linux, for example one Ubuntu, to another. There was an article which said, like, glibc version 2.28 has dangerous changes. So you definitely need to be careful. And it said other versions are fine. But in my practice, I have zero trust. If we plan an OS upgrade, if we plan any upgrade of underlying software, glibc, you can check the version with ldd --version.
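For reference, a quick sketch of that check, run on the old host, the new host, and inside any container image you plan to use:

```bash
# glibc version of the host or container
ldd --version | head -1

# Collation settings the databases were created with
psql -Atc "SELECT datname, datcollate, datctype FROM pg_database;"
```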
If we expect that this version will change, or if we also play with containers, you might have some binaries in the container. And if you have a container with Postgres, and PGDATA is outside of the container, probably PGDATA was created using a different glibc version than the one in the container. If it is mismatched here, what can happen? The B-tree is built using one order of characters, one collation, right? And now we have a different order. And you can, for example, just not be able to find something: you select something, and previously you had some rows returned,
but now you don't have them at all.
And we can distinguish two types of problems here.
One is read problems.
And another one is propagation of read problems to writes,
which hurts more because it's permanent.
So customers cannot find... they search something and cannot find it.
Or they get back incorrect data, right? Like they get back data they're not expecting. So that's often how these present themselves. Or, as you said, the write problem: we can end up with duplicates, like we can end up looking for an identifier, it doesn't exist, so we insert another one. And we don't think it exists because we're looking it up via the B-tree lookup.
Duplicates is a slightly different problem, not related to the ordering of the glibc version.
But it also can happen.
Duplicates is a different problem.
I think it could be related to...
Isn't it using the primary key? Because it's... isn't it using the B-tree for the unique constraint checks?
Right, but if order changed,
the uniqueness is not violated.
Yeah, we just changed the order of columns
and order by.
This is a key problem.
Cool.
Well, my understanding is
you can also end up with data
being in the wrong partition, for example.
So ordering affects quite a few things.
Oh, yes, partition corruption.
So I was thinking the index is used for those things, but no.
So a uniqueness violation might happen, and user actions might cause it sometimes.
But I don't see how the order, like how the order of characters, can lead to it. Maybe I'm wrong, maybe I just don't see it. But what can happen, in my practice I saw it, what can happen is wrong order. So the user expects one order, gets a different order, and misses results completely, because of some ordering and LIMIT the results are out of scope, and this is unexpected, and so on. And the bigger problem is indeed writes. And if it's some billing subsystem, you can propagate wrong writes, some people cannot get money, for example, right?
Yeah, I was just thinking, when you gave your e-commerce example, I was wondering what the trade-off would be for, let's say, a bank. So if you're worried about people's balances being wrong, for example,
is that a different matter?
But I still think they'd make the trade-off of keeping it up
and dealing with the consequences in a lot of cases.
So it's an interesting question, like which applications would choose which.
Yeah, yeah.
But anyway, OS upgrades are dangerous, right?
And containers are slightly dangerous.
If you bring PGDATA... we had it also. You bring PGDATA from a different operating system and you run your Postgres in a container.
Well, even without containers,
if you just copy PgData between two nodes
and one node is running Ubuntu 16.04
and another is 22.04,
you are going to get in trouble
because you copy at physical level in this case.
You need to copy at logical, dump restore.
This is where dump restore plays good
because it eliminates the problem completely.
And I'm not sure we've said it explicitly
since we switched to talking about indexes,
but the tool for checking B-tree index corruption is still amcheck.
Amcheck, right.
And it has several modes,
and unfortunately the proper way to check it is heavy,
and you shouldn't do it on production because it requires exclusive lock.
This is where you probably need to put it to backup verification system.
And after you restore, you run it on all indexes. It will take a lot of time, but you need to run it using a special snippet, the parent-check one, blah blah, I don't remember exactly, but it also checks the relationships between leaf pages, and this is where the order can be easily broken. So you need this, and it will take time. And the funny thing from practice is that if you do it at the SQL level, it's single-threaded.
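The snippet referred to here is, roughly, the heavier amcheck function; a sketch, with the index name as a placeholder (heapallindexed additionally verifies that every heap tuple is present in the index):

```sql
CREATE EXTENSION IF NOT EXISTS amcheck;

-- Heavier B-tree check: verifies parent/child and sibling relationships
-- between pages; it blocks writes to the table, so run it on a clone.
SELECT bt_index_parent_check('my_index'::regclass, heapallindexed => true);
```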
But it's not fun.
If you have a lot of indexes and you have a lot of cores, you usually…
There is a trade-off.
It's backup verification, same trade-off.
If it's in cloud, you usually think, okay, I will hire a much more powerful machine, but for a shorter time.
And I will do job much faster
and destroy it.
So it's, I mean, budgeting here,
it's interesting for such kind
of verification projects.
But if you do amcheck, don't do it in a single thread.
Do it utilizing all cores and disks,
like saturating them
because it's a verification machine.
We should go full speed.
Nobody's there, right?
Right. So, of course. But then, twice, I wrote parallelization scripts for it, and only then realized that since Postgres 14 there is a CLI tool, pg_amcheck, which has a -j (--jobs) option for parallelization. And you can use it for older Postgres as well.
Yes, that's a really good point. I think you brought this up in our episode on index maintenance, so we've actually covered amcheck and this stuff in quite a lot of detail, for people who want to go back to that one. I'll link that one up in the show notes.
But it's covered even more in our sessions with Andrey Borodin and in the documents I mentioned.
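A minimal sketch of that parallel CLI invocation (database name and job count are placeholders):

```bash
# Check heap and B-tree indexes of one database with 8 parallel jobs,
# using the heavier parent/heapallindexed checks
pg_amcheck --jobs=8 --heapallindexed --parent-check --progress -d mydb
```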
We will definitely link to this episode and these documents. And we have a table of monitoring and prevention and mitigation measures.
And for amcheck, we see patches to check uniqueness violations. There is such a patch, but it's not committed yet. It's ready for committer, and unfortunately it won't be in Postgres 16, because it's already almost beta. Next week we have beta, right? Beta one.
So we've gone past the feature freeze for Postgres 16, right?
Right, right.
But I hope it will make it into Postgres 17. And I think it's one of those good cases where you can start using it before the official release and also help test it, because on these verification machines, temporary machines where you verify your backups, you can probably run not-yet-polished software and so on.
It's a good point. And what's the worst that... I guess if it's got bugs, that's the normal reason not to use it, right? Is it that there are false positives and false negatives?
False negatives, like you miss some problem... well, without using it, you miss it for sure if it happens. So still, unpolished software is still beneficial.
False positives are annoying. It says, oh, uniqueness violation.
Okay, it's easy to check and verify.
Yeah, exactly.
Right.
So it's quite a nice failure case.
For corruption, slightly more complex, of course.
For example, it says, oh, there's index corruption.
Well, in this case, you can just rebuild it on
this machine, this very machine. You rebuild
it and check again. If it still
says corruption
in the same place, probably
it's false positive.
Because you fixed it already. Rebuilding
fixes it. Yes, so that's
I guess where we should
be going is how do you fix some of these
issues? And with indexing, it's relatively... with B-tree indexes, because it's an ordering issue normally, right, re-indexing should fix the ordering, because you're re-indexing with the new collation.
So that's the fix.
And as you say, if it shows again, it should be a false positive.
Although, I guess that would then mean it's a bug in amcheck, right?
Yeah, of course
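So the fix is, roughly, the following sketch (index name is a placeholder; mind the CONCURRENTLY caveat that comes up next):

```sql
-- Rebuild the suspect index so it is sorted with the current collation;
-- on a busy production system you would normally add CONCURRENTLY
REINDEX INDEX my_index;
```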
Yeah, and worth mentioning again, Postgres versions 14.0 through 14.3 are super bad, because if you use CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY, you might get corruption. And in this case, if you use the same version on this verification machine and amcheck says corruption, you rebuild it with CONCURRENTLY again.
Probably it says corruption again.
Not good.
Concurrently option in create index and reindex is super cool.
But it has a long trail of bugs, unfortunately.
And if we run amcheck on this clone, non-production, I would verify all findings with a regular REINDEX.
That's a good shout.
Which is more reliable.
Because you don't have people reading, right? You're not worried about the locks.
Single user machine, right?
It's faster too, right?
In this case,
even if I was running 14.0
and corruption came from production because of the use of concurrently
option. Here amcheck would find it. I run REINDEX, and amcheck shouldn't find it anymore. It means it's not a false positive, right? But amcheck for B-tree is quite reliable. It shouldn't produce false positives. It's quite broadly used and well tested, production tested. Amcheck with this patch for uniqueness violations is, of course, a different story. And also the amcheck patch which, likewise, is ready for committer but won't happen in Postgres 16, unfortunately: amcheck for GiST and GIN indexes. A long-awaited thing. It's very much needed.
Andrey Borodin also participated in this. And this patch I tested myself, applied it to amcheck and started to test.
I saw false positives.
And I was told that there were many false positives
in the beginning,
but a lot of them were fixed already.
So it has many iterations already.
So I would consider this quite good.
And I encourage people to try it in non-production.
Again, you still need non-production because of this exclusive lock.
Andrey mentioned some improvements for future Postgres that will probably make it possible to run amcheck in production in a less aggressive way. But currently, I prefer running amcheck on the clone, where I can do more things and so on, utilize all cores, saturate disks, and so on.
It's better.
So that's probably it with amcheck. Again, I'm looking forward to having GiST and GIN support and also uniqueness violation support in future versions of amcheck,
but we can already start using them,
taking just patches, right?
What else? The visibility map: there is also a tool to check visibility map corruption, pg_visibility, because it's not good to have that corrupted either, right? If it happens.
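For reference, a sketch of those checks with the pg_visibility extension (table name is a placeholder; an empty result means nothing suspicious was found):

```sql
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- TIDs wrongly marked all-visible or all-frozen in the visibility map
SELECT * FROM pg_check_visible('my_table'::regclass);
SELECT * FROM pg_check_frozen('my_table'::regclass);
```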
I think this episode could easily be like three hours long. I suspect we could go on for a long time.
Yeah. But there are a few good posts that I wanted to mention.
I mentioned my document has lots of information. If you find something is missing, let me know, I will put it in. I'm trying to collect everything about corruption, and it's a very broad topic.
It has many directions.
Nice, I look forward to reading that.
There's one by Cybertec, quite a fun post by Laurenz, I believe, going through quite a few examples of ways to corrupt the database, which is quite a nice way of learning the kinds of things you can do that end up there.
How to corrupt your database?
It's an interesting area of DBA practice.
Let's corrupt to learn.
Yeah.
And I mentioned, I remember something like corruption related to xid wraparound.
Is it this topic?
Is this post or no?
I can't remember that one, sorry.
It's okay. How to corrupt? Yeah, I remember it. I have it in my list.
There are two posts on Cybertec about corruption. One about xid wraparound, and another just about corruption.
There's also an interesting tool, Noisia, from Alexey Lesovsky, it's like how to damage your Postgres.
A tool just to learn and practice how to escape from situations.
We didn't cover the cases when you already have corruption, what to do about it. Of course, pg_resetwal is a tool.
It will take time.
And you definitely need to back up before.
I use it several times in my practice.
One case was interesting
when Postgres was running on top of
NFS in AWS
and got corrupted.
But fortunately,
only one table was corrupted.
It was pg_statistic.
So it was super easy to fix.
Just ANALYZE the whole database and it's rebuilt. But to find what was corrupted, it took time, like half an hour at least,
and it was nerve-wracking because it was under pressure from clients. And I remember some cases take hours or a day to fix and to escape with some data losses.
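As an aside, a sketch of that particular fix: pg_statistic holds planner statistics, which ANALYZE regenerates:

```sql
-- Re-analyze every table in the current database; this rewrites the rows
-- in pg_statistic, effectively rebuilding its contents
ANALYZE;
```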
So it's a huge topic.
There's one more final thing I wanted to mention.
There's a good presentation.
I think it was quite an old one from Christophe Pettus, and the slides are available online.
I'll check and see if I can find the talk as well.
But I thought the slides alone were very good.
And that was detailing a previous case.
So you talked about the 14.0 index and re-index concurrently corruption issue.
There was also apparently one that I wasn't aware of back in 9.3.1.
So even in a minor release, there was a case.
Let me mention a few names as well.
I mentioned Andrey Borodin, a lot of work related to amcheck, indexes and so on. Of course, Peter Geoghegan and his tools, work on B-tree indexes and so on. And also a tool called pg_hexedit. I used it a few times. You need Linux for it, but it's still a very good thing, built on top of pageinspect, to visualize B-trees and other types of indexes as well. Also Tomas Vondra, pg_check, pg_catcheck from EnterpriseDB, by the way, an interesting thing.
What was it called?
pg_catcheck, catalog check. I think it's about... yeah, I'm not sure it can be applied to regular Postgres, but it's like a catalog checker, to find corruption in system catalogs. It's also an interesting topic, worth exploring as well. pg_toolkit by Bertrand Drouvot, sorry, my pronunciation is wrong. And also there are works by Jeremy Schneider, good posts and talks as well, and so on. Interesting materials by Jeremy Schneider from the AWS team.
A lot of stuff.
Wonderful.
And if people can only take one thing away from this, what should they... maybe they should restore, like, check their backups. What would be your number one piece of advice?
Check backups, with the WAL stream. Dump to /dev/null, causing reads of the data. Add amcheck for B-tree.
And of course, enable checksums, right?
Data checksums.
If you want to go further, use visibility map checks right on production.
Also use patched amcheck to verify GiST and GIN, if you have any, in non-production after backup
restoration.
And also uniqueness violation,
and many more tools, like I mentioned,
this pg_catcheck and so on and so on.
It's a rabbit hole.
One more thing.
Read release notes, even for minor version releases.
Because if there are any Postgres bugs causing issues like this,
they will be mentioned in there with mitigation steps.
Sorry, I completely forgot that one.
I didn't mention many good names as well.
Sorry, it's a big list.
pg_checksums, it's from Credativ. And it went into Postgres core in Postgres 12.
Very cool.
Yeah.
Well, if Nikolai forgot to mention anybody, it's a definite case of corruption.
He's just mentioned all his friends.
That's it.
No, if I forgot to mention something, I mentioned it, it just was not recorded.
Oh, yeah.
Good one.
Blame the internet connection.
Recording corruption.
Right, right.
Yeah.
So if you have more ideas, please tell us on Twitter, LinkedIn, and comments, YouTube, anywhere and help me
collect. This is community effort. Like 99% of what is there, it's not from me. I'm like a hub.
And these documents are quite good. Like, if you take any of these documents, you can find work
for any DBA for a year at least in serious projects. Definitely for a year. Like a lot of stuff.
That's bad joke from me for today.
Would you consider your work a collation?
Okay.
Okay.
It's time.
Thanks, everybody.
Good ending point.
Right.
Catch you next week.
Have a good... Yeah, bye.