Postgres FM - Should we use foreign keys?

Episode Date: June 21, 2024

Nikolay and Michael discuss foreign keys in Postgres — what they are, their benefits, their overhead, some edge cases to be aware of, some improvements coming, and whether or not they generally recommend using them.

Here are some links to things they mentioned:

Foreign keys (docs) https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-FK
Our episode about constraints https://postgres.fm/episodes/constraints
GitLab migration helper add_concurrent_foreign_key https://github.com/gitlabhq/gitlabhq/blob/master/rubocop/cop/migration/add_concurrent_foreign_key.rb
Adding a foreign key without downtime (tweet by Nikolay) https://x.com/samokhvalov/status/1732056107483636188
Bloat, pg_repack, and deferred constraints (blog post by Miro) https://medium.com/miro-engineering/postgresql-bloat-pg-repack-and-deferred-constraints-d0ecf33337ec
Postgres 17 draft release notes, server configuration section https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-SERVER-CONFIG
Our 100th episode https://postgres.fm/episodes/to-100tb-and-beyond
Stop! Trigger Time (blog post by Michael) https://www.pgmustard.com/blog/trigger-time
Should I Create an Index on Foreign Keys? (blog post by Percona) https://www.percona.com/blog/should-i-create-an-index-on-foreign-keys-in-postgresql
Avoid Postgres performance cliffs with MultiXact IDs and foreign keys (5 min video by Lukas Fittl) https://pganalyze.com/blog/5mins-postgres-multiXact-ids-foreign-keys-performance
Experiment to see basic overhead of foreign keys https://v2.postgres.ai/chats/01902ee6-8ed1-70ec-9345-5606305012f4
Experiment showing an extreme contention case https://v2.postgres.ai/chats/018fb28d-865f-788d-adb7-efa7ed3a48c4
Subtransactions Considered Harmful (blog post by Nikolay) https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful
Notes on some PostgreSQL implementation details (blog post by Nelson Elhage that mentions "subtransactions are cursed") https://buttondown.email/nelhage/archive/notes-on-some-postgresql-implementation-details

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:
Jessie Draws for the amazing artwork

Transcript
Starting point is 00:00:00 Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL. I am Michael, founder of pgMustard, and this is my co-host Nikolay, founder of Postgres.AI. Hey Nikolay, what are we talking about this week? Hi Michael, how was your vacation? Oh yeah, really good. I had a really nice week off, pretty much completely, which is nice. And rare as a founder, I think you know that. But I did enjoy listening to the podcast as a listener for the first time in a while, so yeah, really enjoyed your episode last week. Thanks for
Starting point is 00:00:30 doing that. Good to hear. But let's talk about the question: do we need foreign keys, or do we only need American keys? Yeah, the word foreign is interesting, isn't it? Right. Foreign tables, foreign keys. Yeah. Yeah, should we use them? Or like, when should we use them, perhaps? If not always, what are the pros? What are the cons? In what cases can they be a problem?
Starting point is 00:00:57 Or what cases can they really help? That kind of thing. Right. Well, I think it's obvious that it's better to use them if you don't expect heavy loads. But if you are preparing for heavy loads, until what point you can use them, right? already designed your system to for example to delete to propagate deletes using cascade option if you drop foreign keys you lose some functionality right it's not oh i see what you mean so like maybe we'll get to this at the end but maybe if you start with them and then if they become a problem or if you start to get success and lots and lots of scale, lots and lots of load, migrating to maintaining that integrity of the system,
Starting point is 00:01:48 quality of data without them, and processes around data deletion, things like that. Right, so they can give you not only quality of data, but also some functionality. Cascade deletes. And if you rely on them, dropping to solve some performance problems in the future will be problematic, right?
Starting point is 00:02:06 I think I've thought about it many times, but, well, we have this also, right? We cannot drop them sometimes because we will lose functionality and this is not good. But in general, there are people, I know there are people who think you should avoid them.
Starting point is 00:02:21 And, you know, for example, I can tell you, you should avoid sub-transactions. This is my position and I can prove why you should avoid sub-transactions unless it's absolutely necessary. Should we apply similar logic to foreign keys after you saw some examples of bad behavior or something? Or it's different? This is a question I have in my head. And honestly, in this morning, thinking about this episode, I was thinking, oh, there are new options in Postgres 17 we should explore.
Starting point is 00:02:54 Why we didn't explore? And then I realized, okay, okay, it's only beta 1 of Postgres 17. It's still too early. But I can't wait when we will see new results because we will talk about it later in this podcast, in this episode. But Postgres 17 will bring some new
Starting point is 00:03:11 settings which are interesting in the context of some performance issues you can have with foreign keys being used. But let's start with simple things first. Good idea. We had an episode about constraints, and
Starting point is 00:03:28 we mentioned there are six constraints, right? Six constraints, and foreign keys is one of the types Postgres offers. And usually I say foreign keys are good because I trust database system much better than
Starting point is 00:03:43 anything else when we talk about data quality and constraints and integrity and so on, and referential integrity, what foreign keys offer us. You cannot maintain it in complex systems without foreign keys, because even if you implemented it on your application code, for example, Python or Ruby, later your system can be becoming more and more complex. And if someone brings other interfaces to database, new application code in different language, different frameworks, for example,
Starting point is 00:04:21 or someone is working directly with database somehow. It's not uncommon to see multiple types of code working with the database, right? In this case, you have already multiple implementations of foreign key or not foreign key, referential integrity checks and enforcement logic. And that means that, for example, imagine you have Ruby code and you have Python code somehow working with the same database. They both need to have this logic
Starting point is 00:04:53 and it's hard to maintain this logic. Eventually, you will see some cases where this integrity is not enforced. So while in database, foreign keys are implemented internally using triggers. Yeah, and triggers is a good thing in terms of like, it's inevitable. Like, of course, you can say alter table,
Starting point is 00:05:17 disable triggers, all. In this case, it will disable all triggers, including these implicit system triggers which are supporting foreign keys. This is when you can see them in backslash D table name. If you alter table disable trigger all and then backslash D you will suddenly see that oh this table has some triggers because usually they are hidden. Backslash D doesn't show them. But when they are disabled, suddenly they show up.
Starting point is 00:05:53 So it's a funny trick. That's cool, yeah. And these triggers, unless you disable them, and you should not disable them, of course, unless you know what you do, right? I mean, during some massive operations and so on, but in this case, it's on your shoulders to ensure that they are followed. Because when you enable them back, Postgres won't check them for existing data. So if they are enabled, they are inevitable to work. So any write you perform is checked using those triggers. And if write blocks these rules, it won't happen. It will
Starting point is 00:06:36 be rolled back, which is good. So it's like more trustworthy approach checks on database side right make sense yeah so yeah yeah so i mean to be super explicit if you try and insert a row in the table that has a referencing column and you include an id that isn't in the reference table then you'll get an error saying so or that kind of thing right right so this is good thing obviously better quality of data but of course there are complexities under heavy loads and large systems one of complexities when you define foreign key for existing large tables or you define foreign key for a new table, but it points to a large table. Yeah.
Starting point is 00:07:29 Of course, in this case... Not just large, but busy, right? Busy, okay. Or active in any way, yeah. Well, there are two types of problems here. One is related to busyness, another is related to large volumes of data. Anyway, we need to... Postgres needs to acquire logs on both sides, which is challenging.
Starting point is 00:07:50 Both tables. Right. Different types of logs, but anyway. And second, it needs to ensure that existing data already complies with our constraint. Constraint, yeah. Right. And if you do it straight like in a regular way
Starting point is 00:08:09 in general you will have issues high risk of issues and risk is becoming higher and higher the more you have data the more the higher our tps the, the higher risks are. So at some point, and it's better to do it earlier, you need to install foreign keys in proper way. So you need, fortunately, Postgres supports, we discussed this, but let's repeat. Postgres supports two-step approach for constraint creation here, similar to check constraint. You can define foreign keys as not valid state, flag, right?
Starting point is 00:08:49 It means that they will have flag not valid, but important to remember, it doesn't mean they are not working. They are immediately working for all new rights. They are just not checked for existing data. And I remember I had super big mistake when I designed some complex procedure related to integer four to integer eight conversion. And I forgot that so-called invalid foreign keys are working for new rights. So I kept all foreign keys at some point just for the sake of being to be able to roll back i mean to revert all operations i decided oh i need them because if uh decision
Starting point is 00:09:34 will be made to revert changes i already have them i will just validate but new rights were not followed like new rights were blocked, basically. And this was a big mistake, I learned. So it happened on production, and it was terrible. It was my mistake. So not valid doesn't mean it doesn't work. Similar to check constraint. And actually, indexes, if you say is valid, false,
Starting point is 00:10:03 which is not actually recommended due to different reasons, also means index actually is maintained. So foreign key is maintained for all new writes. And then second step, you say alter table validate. This is not blocking. It's blocking briefly, right? But if you have huge dataset to check, okay, your DML queries are not blocked. Again, they are blocked very quickly, like briefly for a short period of time.
Starting point is 00:10:38 Anyway, this is not a trivial operation, but it's already well described in various sources. For example, as usual, I recommend GitLab migration style guide. It's called migration style guide, but it's about database migrations. It's about DDL basically, how to deploy DDL under heavy load reliably. And they have Ruby code, which is called Migration Helpers RB, which demonstrates how to do it reliably under any heavy load. So, yeah, it's one thing to remember. You cannot just create them easily. But this is true for any DDL.
Starting point is 00:11:19 DDL is dangerous. We discussed this multiple times. There's also a headache associated with installing them to partition tables. Right? I always forget details. Every time I deal with it, I need to double check
Starting point is 00:11:35 details. It's always an issue. And it changes, right? The restrictions around partition tables change every version. It improves each version, so things that I thought were not possible It changes, right? Like the restrictions around partition tables change every version. Exactly. It improves each version. So things that I thought were not possible become possible.
Starting point is 00:11:51 Yeah, so it's good that we check documentation. Documentation is great. But yeah, it's difficult to remember the details version to version. Right, it changes in a good direction. Exactly. Which is good. So yeah, if you're an old version it's one thing another version is like newer version different thing but in general on newer version a lot of things are possible so
Starting point is 00:12:12 so yeah one last thing that you normally mention around creating these on large tables with or like busy or large tables is the idea of having timeouts and retries so i think that i don't know i don't think you mentioned that this time but it's an important extra point and i'll link to your you did a good twitter when you were doing a twitter marathon you did a great post on on this i'll link that up as well. Yeah, I already forgot about this. But yeah, right. In general, if you need logs, you can do it explicitly with log table, but you need to do it with... So basically, if two-step approach,
Starting point is 00:12:55 like invalid and then validate, is implemented in Postgres, generally you don't need it, right? For custom checks and for foreign keys. But for some things like adding some column actually or dropping column it's needed you need to deal with locks and do it with low time lock time out and retries as usual but before why not do it with this one though like i would i still think it's sensible i think for example we gave a good example of if auto vacuum is running in the heavy mode.
Starting point is 00:13:28 You are right. You are right. You are right. Yeah. So if it doesn't yield its lock or something else isn't yielding a lock that you don't know about, it's so helpful then to have that timeout and a retry. You know what? In general, DDL is hard. Under heavy load in large systems it's really hard i wish it was much simpler in postgres i think there is a huge
Starting point is 00:13:54 room for improvement step by step and i hope it will be improved so but now we need to, it's like art. You need to improve your tools and libraries and so on. You are right. Even when we install a foreign key with not valid flag, we still need locks on both tables, right? And these locks, if they cannot be obtained they will block us and if we block us we start blocking everything else including selects because it's ddl and selects is waiting so yeah it's it's kind of i think for selects it's in this case it's not that maybe no or insets for example yeah yeah we. Yeah, we need to carefully check types of logs
Starting point is 00:14:48 that need to be installed on both cases. I only remember they are different on both sides. But in general, this should be carefully designed. It's better to follow existing experience. For example, GitLab's maybe some frameworks already implemented. I don't know, Django, I Labs, maybe some frameworks already implemented. I don't know. Django, I think, Ruby on Rails in general, they don't offer good automation here. It's not enough. So if you're preparing for heavy loads, it's definitely worth keeping an eye on
Starting point is 00:15:20 this topic and maybe to check it earlier than it starts biting you because sooner or later it will if it doesn't so it first it bites like nobody notices right okay some spike of latencies few seconds during deployment but then at some point, especially if longer transactions are allowed, at some point it can bite you so heavily, so you will notice downtime, a few minutes, for example, not good. So it's better to double-check carefully, step by step, and rely on other people's experience. But you are right, we need to deal with low log timeout and retries here as well.
Starting point is 00:16:03 And I think especially for partition table case you need it. Why especially there? Because I think I don't remember details but I remember when
Starting point is 00:16:13 you create foreign key on partition table you need more actions to be done. First, yeah. Would you have
Starting point is 00:16:23 to do it on each partition and then parent? At least until some to be done first. Would you have to do it on each partition? I think at least until some version, you cannot create a not valid foreign key on partition table. Oh, interesting. Right, right. Yeah.
Starting point is 00:16:39 But again, I don't... Yes, it differs for each version and so on. It should be carefully studied, each version and so on it should be carefully studied tested and so on and but with proper understanding that logs are needed it's solvable right yeah it's solvable
Starting point is 00:16:54 so maintenance overhead basically there is there's some maintenance oh let's mention one more thing it's maybe people like don't not everyone realizes it. We also didn't realize until our clients at some point it was a company
Starting point is 00:17:08 called Miro which is very popular now. We helped them with some things to fight bloat and so on and we offered to use PGRPAC
Starting point is 00:17:17 to fight bloat and they had at that time deferred constraints. So, foreign key constraint checks were done at commit time, not immediately. So foreign key constraint checks were done at commit time, not immediately.
Starting point is 00:17:28 And I remember it caused some issues with pgRepack. Again, I don't remember all the details. There is a good blog post about this. I will send it to you. But additional maintenance overhead which is caused by the presence of foreign keys
Starting point is 00:17:43 in this special state deferred constraints the presence of foreign keys in this special state, deferred constraints, deferred foreign keys. So definitely, if you use foreign keys, you need to pay attention to different things when you do some operations with your database. This is true. But I would say it's worth it, right? Because you have good reference integrity and so on. So I would still choose foreign worth it, right? Because you have good referential integrity and so on.
Starting point is 00:18:11 So I would still choose foreign keys in spite of these problems, right? Yeah, I tend to agree. I think the vast majority of us are working on systems that don't have the characteristics where foreign keys cause problems. I know we had an episode only a couple ago where we had some except possible exceptions to that well maybe we'll get to that in a moment you talk about 100 terabyte episode which was also episode number 100 we have received good feedback about this considered as like the best episode we had so yeah oh actually someone told this was the best podcast episode uh what so, we had great guests. Again, thank you, because you organized all the invitations and so on. that they're heavily loaded systems and they're constantly busy and they have to worry about a lot of these scaling limits and I guess keep hitting different cliffs or different...
Starting point is 00:19:12 Cliffs. Maybe that's a bad word, but like limits of what you can do or how many of a certain thing you can exhaust before falling back to a different method but but 99.9 of us are not working with those systems and don't have to worry about those and don't don't have to worry about them initially as well so i think i think it was sammy from figma who said as well and i think i agree with this at the beginning maybe you don't have to design for this kind of thing and you can so if you have to solve this it's a good problem you're probably successful in other ways if you've got to
Starting point is 00:19:49 a heavily loaded postgres and you're not making enough money to pay people to fix this kind of problem you've probably done something wrong like you you need a business model that works with that load that load and all of them did that successfully and i i can't think of an example where a company got too successful with load but not with finances so i think they can normally solve these problems well i not fully agree i think yeah let's talk about cleaves but first let's talk about performance overhead, foreign keys, and PgBench supports foreign keys option, which is not used by default. And before this podcast, we were very curious. We didn't like the simple question. Is it better?
Starting point is 00:20:40 What's the difference between two PgBench runs with and without this option? Of course, this option should be used during initialization time. But we checked and we saw difference on small scale, like 1 million rows in PgBench accounts, scale 10 means, right? We saw difference only 11%. On my MacBook, it was very quick and dirty experiment, all in memory and so on. As expected.
Starting point is 00:21:08 I expected also like 5% to 10% or so. You said 10%, right? I guessed. It was a complete guess. And also, I don't know why I guessed before even knowing. I'd forgotten the breakdown of what proportion was selects versus inserts, updates, and deletes. I definitely shouldn't have guessed before knowing that.
Starting point is 00:21:26 Why do you care about selects, deletes, updates, and so on? Well, I thought, I guessed that, well, maybe this is wrong again. I guessed that there would be a bigger impact on inserts. Yeah, exactly. Than the select operations. That was my guess. Well, by the way, actually, maintenance, let's put it to the maintenance overhead basket.
Starting point is 00:21:54 When you need to delete, like, reference said row, a row in a referenced table, sometimes we call it parent or something. If an index, well, index always exists on one side of foreign key because otherwise foreign key creation will complain lack of index.
Starting point is 00:22:16 You need like unique index on one or primary key, for example. Primary index. Index for primary key, which is also unique index, right? You need a unique constraint. Right, right. If index is not created on the other side, which by default is so,
Starting point is 00:22:32 you can miss it. Yeah. And you delete this reference row. Postgres need to find all dependent rows to either say it's okay, or maybe to propagate delete if you, again again use cascade deletes. And if this is not there,
Starting point is 00:22:48 it will be sequential scan of whole table. So delete will be super slow. And this is also kind of maintenance overhead to me because if you expect, sometimes people say, oh, we don't have such deletes. We use only soft deletes, right? So, or something like we don't care. deletes. We use only soft deletes, right? Or something. We don't care.
Starting point is 00:23:08 We don't need those indexes to be present. But if you have some risks, sometimes, not often, sometimes these deletes are happening very rarely in some systems, you need to maintain these indexes and need to remember about them and need to create them
Starting point is 00:23:24 and so on. So it's also a kind of overhead. I see this quite often. And I actually wrote a blog post about this quite a while ago now. But yeah, it's the time I see triggers causing performance issues the most often. You see it in plans, right? Exactly. That's how you spot them in the explain plans. Yeah. So I'll include that. But my view is normally it makes sense to index your referencing columns.
Starting point is 00:23:53 Right. Also for select performance. Like often they're a good candidate for selects anyway. So even if you don't add. Yeah. Like there's loads of workloads and patterns where it makes sense to have them indexed. Of course, if you're already adding like a multi-column index with them in the leading column,
Starting point is 00:24:10 you don't need an additional index. But what I mean is like an index on that so that they can be looked up. It is normally... Yeah, our Postgres checkup tool also has such kind of report, non-indexed. Like foreign keys without supporting indexes. But we usually say it's very special.
Starting point is 00:24:33 And some people, I see this as well. I see that people say, well, we don't need that. Because our workload is different. I remember a blog post, I think it was from Pocona, arguing the opposite case, saying, please don't automatically index. I'll include that as well. I would say as well, I would say don't automatically index because every index has a different kind of overhead. It has penalty, right? Well, we've discussed that many times, haven't we? Not just insert overhead, but also killing hot updates in some... Yeah, planning.
Starting point is 00:25:07 Yeah, lots of... So many dangers around everywhere when you need to grow and have good performance. At scale. Right, right. So back to these experiments with PgBench
Starting point is 00:25:21 and without foreign keys. You say you expect problems only in write operations. Interesting. Or at least more overhead. Let's remember that because I have a case. You know I have a case where it's very different. Okay. In this case, PgBench, we observe roughly 10%.
Starting point is 00:25:43 I'm quite sure it will be the same at larger scale, different machines and so on with PgBench, I think. PgBench type of workload, roughly 10%. Would you pay this price knowing that everything is slowed down? Right operations are slowed down by 10%. Most systems I work with that have been easy. Yes. Like most systems aren't maxed out on CPU aren't maxed.
Starting point is 00:26:10 Like it's, it's an easy decision. And for the, for the, anybody that's ever dealt with bad data, like bad data can be so painful as in bad quality data. I mean, you have to look through a system and you're like,
Starting point is 00:26:23 this doesn't make sense. Like these are, these these these are referencing the something or maybe like a product that just doesn't exist what like what plan are these customers on or you know there's or you know if you're doing analytics and you're trying to categorize things and there's bad data in there it just becomes such a nightmare at those times or somebody just uh disabled foreign case did some rights and then enabled them back or or some bug sometimes or human cases without foreign kids actually but i i i'm talking about yeah i'm talking about cases without but i mean anybody that's ever dealt with that and it's probably kind of slightly more
Starting point is 00:27:04 seasoned folks that have had to deal with it once or twice in their career that's ever dealt with that, and it's probably kind of slightly more seasoned folks that have had to deal with it once or twice in their career, it's just so painful that you then think, do you know what, I don't want to, I'll happily pay 10% going forwards to not have to deal with that again. I remember cases when the financial integrity was broken, even with foreign keys, but it was a bug,
Starting point is 00:27:20 and Postgres said it was very long ago. So since then, they became very reliable, and I would agree, I would pay this price 10% to have these checks constantly performed by Postgres to ensure the data quality is higher in this area. And let's emphasize that this overhead is not a cliff. It doesn't matter how many transactions per second you have, how many rows you have. This extra penalty is added to all write operations all the time
Starting point is 00:27:56 because extra work needs to be done. For example, you insert. Postgres needs to make sure that you insert a value to child type. Let's call parent-child for simplicity. It's not necessarily meaning of the relationship. It can be different, but let's call it. You insert a record to a child table. Postgres needs to ensure that parent record exists
Starting point is 00:28:17 because you're referring to it, right? So it performs basically implicit select to check that record exists. And this is extra work that always needs to be done. And even if you have one transaction per second, very tiny workload, right? Still, penalty will be there. We can check this actually with
Starting point is 00:28:36 expand analyze buffers as well and see that more buffers need to be involved when you insert a row with foreign key versus without foreign key, right? Maybe one more? Well, depends. Depends how big the table is and how many hops index scan should be.
Starting point is 00:28:56 True. True. Actually, definitely not one. At least two. Again, it depends. Maybe, I don't't know maybe it's index I think there's the page there's like the
Starting point is 00:29:07 top I don't think you'll ever get data in that root node but I could be wrong well yeah okay anyway we agree that this is not a cliff it's constant
Starting point is 00:29:16 yes true true additional workload which is like additional weight you need to carry performing this work all the time you like you have a contract to do work all the time. You have a contract
Starting point is 00:29:25 to do this all the time, right? It's like a tax, actually. Foreign key taxes. Right? That actually is a good metaphor, I think. Yeah. Like 10% tax from foreign keys. So, we
Starting point is 00:29:42 have consensus here. We are ready to pay those taxes to live in the world with better data quality. But there are performance cliffs, as you already mentioned, and they are terrible sometimes. So one of them was raised a year ago by Lucas Vittel at Paginalize series of small small videos which I like. Five minutes of Postgres. They are always based on some
Starting point is 00:30:10 other folks' materials and Lucas reflects very well in video. It was based on Christoph Petter's blog post where simple workload was described imagined like Twitch stream and new stream is starting
Starting point is 00:30:29 and we have streams table, like kind of parent table and we have some viewers or something table and immediately like 1 million or so viewers join the stream and you need to insert them to this, like let's say child table, parent and child. And if you do it in separate inserts, separate transactions, when Postgres performs insert, not only it needs to check that we've selected that row exists,
Starting point is 00:30:59 during the duration of insert this writing transaction, we need to lock that row with share access share lock. So basically with for key share lock, right? If it was explicit lock, it was, it would be for key share lock. So to ensure that this row won't disappear during our transaction, right? Yeah. And imagine now like a million inserts are happening around the same time. Of course, it's not possible. It depends on how much resources you have, but many of them will collide and try to happen at the same time, right?
Starting point is 00:31:38 And in this case, since multiple transactions need to perform row-level log on this parent table, Postgres starts using multi-exact IDs, right? Multi-exact IDs. So multi-exact IDs are created, and multi-exact IDs are used when multiple transactions log the same row. And this mechanism is very easy to achieve some performance cliffs when you grow, when multi-exacts are actively used. Well, by the way, Lucas also mentions another problem with multi-exacts.
Starting point is 00:32:20 Like many, many monitoring systems, and P-Analyze is a good exclusion here, as well as our version of PG-Watch, PG-Watch Post-GCI Edition. They both care about this. But many systems care about only regular transaction ID wraparound. They forget that multi-exact ID wraparound also might happen.
Starting point is 00:32:43 It's just very rare. I never heard about this case so far, but it's still possible. So you need to monitor and ensure that auto vacuum freezes rows properly and takes care of rows which were involved in this multi-exact mechanism. So one problem. But another problem, this cliff. So in SIRS, we have contention on multi-exact mechanism. Multi-exact has SLRU, SLRU, and buffers are quite small. The number of buffers, like there are default. So it's not default, it's hard-coded until Postgres version 17.
Starting point is 00:33:34 And yeah, so it's like, I saw this problem with SRUs in different case, in sub-transactions case many years ago, not many, several years ago. And I remember I touched it a little bit that time already. And sub-transactions is also performance cliff. Everything is right, your scale is right, and then suddenly the system is down basically because overhead becomes enormous. It usually happens when you achieve the limits of these buffers, SLRU buffers. We needed it for subtransactions, subtrans SLRU buffers. And we needed it for sub-transactions, sub-trans SLRU, but this is similar to multi-exact SLRU. Around the time, I learned that Andrei Borodin already proposed improvements in algorithm, not only improvements, but also make this tunable. And good news, this work is committed to Postgres 17. So if we observe this performance cliff at some point,
Starting point is 00:34:28 at some rates of inserts you observe performance becomes terrible, like system unresponsive downtime, right? You might try to tune to postpone this cliff from your current situation. At what cost? Like a little bit of extra memory? Is there any? Yeah. Great.
Starting point is 00:34:51 Algorithm was improved instead of sequential capital O of N, I think, or maybe even quadratic algorithm. It was improved, better algorithm, but also you can adjust configuration and have more buffers, for example.
Starting point is 00:35:11 I remember we tested some earlier versions of these patches. There are multiple patches in this work. By the way, big congratulations to Andrei. More and more works. He was working on many years. They are committed. It's great. But I remember there are some complexities in our particular case related to sub-transactions.
Starting point is 00:35:37 But I think we should test these things and study this. Exactly this very performance cliff Christophe describes in his blog post, right? Yeah, well, so I think that's worth mentioning because this is one solution, but Christophe also mentions a couple of alternative approaches to that schema design. He mentions reliable approaches.
Starting point is 00:36:01 One of them is just drop foreign key. In this case, no more performance cliff because multi-exact mechanism is not involved. Absolutely true. But also alternative approach like batching the updates or there was another idea. Not updates, inserts. It's a very important.
Starting point is 00:36:19 We always say you need to perform deletes and updates and batches. But we never said this about inserts. Inserts is better to do in a single transaction if you can. Because inserts are usually not blocking. Not in this case. Here we need to deal with this multi-exact row level lock. But if you can do it in single transaction, it's great. Even if it's a million rows, it's okay.
Starting point is 00:36:50 Or multiple bigger batches, like a thousand inserts. But straightforward implementation of this approach, like, oh, viewers just join our channel. It will be separate inserts because they happen in different sessions, right? You need to know about this problem to design system differently and start
Starting point is 00:37:13 batching these inserts. So it's a dangerous performance cliff. But this performance cliff demonstrates the problem for inserts. I have another case which is not yet published. I hope we will publish some blog post about this. It was not observed in production,
Starting point is 00:37:31 but it was inspired by some problems we had with some customers related to multi-exact IDs contention. So imagine we have parent and a couple of records, just one and two, ID 1, ID 2, and child table with some inserts happening. Not at a huge rate, not at a rate where we already have this performance cliff. And then also we want to have a lot of selects, selecting, for example, the latest child inserted to child table. Just with join, right? So we join, we select parent ID 1, and we select what was the latest child inserted recently.
Starting point is 00:38:19 We have timestamp, for example, to do this, or we just order by primary key if it's just integer primary key for the child table. And now we think, well, that's it. So we insert it, like, say we insert 10 per second, for example, and we select as much as we can. Like, imagine we have a lot of viewers trying to read the latest inserted record to child. And now we start updating the parent ID equals one. This is the cliff.
Starting point is 00:38:53 We performance terrible for everyone, including selects. This is the key for this demonstration. It can be like 100 times worse. 100 times bigger latency 100 times lower tps this really surprised me when you told me yes when you showed me this well uh we need to research all the details but i suspect this is related to the fact that in this case when you have inserts and updates of parent inserts to child, they also, well,
Starting point is 00:39:27 first of all, all those inserts, if they happen around the same time, it's also multi-exact involved. But updates also need row-level log, right? And they also like, actually, multi-exact ID is being put to xmax hidden
Starting point is 00:39:43 column, system column, right? So Xmax column defines the transaction ID when the row becomes dead. And if transaction succeeds, okay, this tuple is dead. If transaction is canceled, nobody knows this, like it would be XmaxX being null, right? Live tuple. But in this situation, Postgres puts not regular transaction IDs, but multi-exact IDs,
Starting point is 00:40:15 and produces some volume of dead tuples all the time with multi-exact IDs inside XMAX. And I suspect that when sellex are trying to find live tuple for parent, it's really expensive because multi-exact mechanism, we need to check which transactions were canceled. It's very expensive. So SELECTS need to scan a lot of data tables, performing expensive verification, and they degrade a lot. So, this workload, I think, anyone can experience.
Starting point is 00:40:57 And I remember I designed a few systems where, for the sake to have fast counts when inserting to child table, for example, comments, right, or something, or like views or something, I don't know, like something. I decided to update some counter in the parent table to have dynamilized data to avoid slow count. So I just insert and update in the same transaction. In different transactions, I also insert and update plus one, plus one. Of course, this already is not good
Starting point is 00:41:30 because these updates have some contention. You're updating the same row. You cannot update two times. It will be sequential. Postgres will be contention on heavy logs. Yeah, contention on heavy locks. Yeah, contention on heavy locks. You can improve this, but anyway, even if you perform
Starting point is 00:41:51 batched updates, for example, in separate transactions, you have high risks of multi-exact IDs being involved, and then you already can have a lightweight lock contention on multi-exact mechanisms. Yeah, a lock multi-exact mechanism. A favorite.
Starting point is 00:42:06 Yeah, a lock multi-exact offset. And yeah, so this is not good. And I imagine this can happen with very good chances. And it also can be seen as performance cliff. So suddenly it was fine, but then it quickly becomes terrible. chances and it also can can be seen as performance cliff so suddenly it was fine it was fine but when it like quickly becomes terrible yeah so what to do i don't know like i think we need to study research all the those cliffs document them and understand how to predict them maybe right at what rates everything is fine
Starting point is 00:42:43 because i don't see them under very heavy loads. I don't see them. But it does mean they are not there. Or they're not coming. Yeah, or they're not coming. At some point, they might bite you back. That's interesting. For those who are interested, let's put links.
Starting point is 00:43:03 I have a series of benchmarks performed with our Postgres CI bot with some visualization. Of course, it requires explanation, maybe, but if you know how PgBench is organized, everything is there, including all, it's easy to reproduce. If you're talking
Starting point is 00:43:19 about the graph that I remember seeing, it doesn't take that much explanation because you have like four fairly large bars on the bar chart all showing like lots of transactions per second and then one that is like so tiny you have to almost like zoom in to see it and it's quite easy to see oh that one didn't do so well yeah let's let's explain a little bit. The tiny one is this cliff demonstrated. Three others. One was let's just remove updates, right?
Starting point is 00:43:51 Another one, let's keep updates, but update ID 2, not 1. Yeah. Another one, let's remove inserts. And the last one, let's remove foreign keys. Which is fair, right? In this case, just to show that foreign keys is the key of the problem here. And all bars are
Starting point is 00:44:14 TPS for selects. Maybe latency. Actually, it's more correct to show latency not TPS. But okay, TPS. Somehow people tend to like TPS numbers more but latency what matters here for sure because yeah so
Starting point is 00:44:29 but then we'd only see then we'd see one really large one and three small ones not as dramatic a graph yeah maybe yeah smaller is worse smaller is not better yeah because latency is inverted logic.
Starting point is 00:44:49 Smaller is better. Anyway, it's still great. So yeah, SELECT can be affected. It can be affected badly. Has this changed your opinion on should you use phone keys or not really? Yeah, this is the first question I asked to myself and my team. I said, are we observing a similar case to sub-transactions where I basically started telling everyone,
Starting point is 00:45:20 if you want to grow, try to avoid them. At least know all the problems they have. By the way, for sub-transactions, these new settings are also interesting to double-check in Postgres 17. But the answer is still not 100%. I'm not 100% sure about the answer here because foreign keys are a good thing in general. So I think
Starting point is 00:45:48 in sub-transactions case we saw very how to achieve the problems very fast, very easily when you grow. With foreign keys, again, I'm not sure. Actually, the customer I mentioned, they removed
Starting point is 00:46:03 some transactions and the problem has gone. So I think sometimes these problems come together. For example, we know that, as we discussed, foreign keys, they put a select for share log, and this causes multi-exact mechanism being involved. But sometimes if you work with select for update, you work with select for update. You don't expect multi-exact mechanism to be involved.
Starting point is 00:46:26 But if sub-transactions are there, select for update might lead to multi-exact mechanism because every nesting level of your nested transaction or sub-transaction is considered a separate transaction when Postgres puts xmax value. And if you lock row, select for update, and then define save point, and then perform update, this is similar to select for share. Although it's select for update. Because basically multiple transactions are trying to lock the same row so multi-exact is needed here and there is blog post I don't remember very good blog post about performance
Starting point is 00:47:12 cliff and this is very blog post which said the sub transactions are cursed just remove them and I became like follower of this approach so in that case with that customer we we had two hypotheses. Either sub-transactions removal will help or it won't help
Starting point is 00:47:32 just because they had the case we demonstrated in this experiment. Fortunately, it was related to sub-transactions. But who knows? Maybe in some case, we will see that no sub-transactions, but still performance cliff like that. It doesn't make sense. Yeah, for sure. And I think I was just looking up the blog post because it was bugging me.
Starting point is 00:47:55 Are you talking about the one from Nelson L. Hage? I think so. Yeah, I included it to my four cases with sub-transactions. We did an episode on sub-transactions and I found that in the show notes. I'll double-check it and I'll link it up. It's very good, very good blog post, very precise. It doesn't explore too far this situation
Starting point is 00:48:16 with Select for Update and foreign keys. I think it even doesn't mention foreign keys. Yeah, because they had different problem. But for sure, it was a good starting point for me some time ago to start understanding that multi-exact IDs also might be dangerous. It's also a cliff due to various reasons. So I think we need to postpone exact answer to this. I had one more thing i wanted to add
Starting point is 00:48:47 i think there's like there's definitely people i know who who have been using postgres for five or ten years and they're riding this wave of they adopted postgres when it didn't have all the features they thought they were going to need but But as they've needed them, Postgres has been adding them. As they've scaled, their business has been able to stay on Postgres because it got replication features or it got parallel query features or some things that they started to be able to really benefit from. Postgres improved as they needed improvements. So it feels like this could be
Starting point is 00:49:25 another one of those things that postgres is going to get better in this area there are enough there's enough noise around these things you mentioned some even if these don't even if these get reverted in 17 and don't make in 17 something will probably make it into 18 or 19 or 20 and there's enough people at scale pushing postgres to these limits that a lot of us that aren't there yet can probably get away with adding foreign keys using them even in cases where maybe it's not recommended at scale and maybe postgres will be improved or we'll have configurable settings by the time we do need them yeah yeah maybe makes. Maybe. Makes sense. Makes sense. Maybe.
Starting point is 00:50:06 Yeah. We need to explore those settings and test various edge cases. I would say these performance cliff should be
Starting point is 00:50:13 called edge case. Yeah. Or corner case. Let's say this what I just described sub transactions plus select for update.
Starting point is 00:50:23 This is two edges where they meet. Exactly. This is a perfect example of Cormorant case. But others are edge cases. We just need to explore the territory and find all the edges and think if it's possible to
Starting point is 00:50:37 move them to have more room when needed. Or just at some point, probably, foreign keys should be just dropped in some cases sometimes maybe yeah good well we spoke to some people who don't use them so that's really yeah but in general i keep using them i keep using them everywhere and i i keep suggesting use them great pay tax and but yeah And society will be better. Right.
Starting point is 00:51:08 But don't get, get into jail with these cliffs. Oh, interesting. Yeah. Thank you so much, Nikolai, as always,
Starting point is 00:51:17 and catch you next week. See you. Bye.
