Postgres FM - Locks

Episode Date: May 23, 2025

Nikolay and Michael discuss heavyweight locks in Postgres — how to think about them, why you can't avoid them, and some tips for minimising issues. Here are some links to things they mentioned:

- Locking (docs): https://www.postgresql.org/docs/current/explicit-locking.html
- Postgres rocks, except when it blocks (blog post by Marco Slot): https://www.citusdata.com/blog/2018/02/15/when-postgresql-blocks/
- Lock Conflicts (tool by Hussein Nasser): https://pglocks.org/
- log_lock_waits (docs): https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-LOCK-WAITS
- How to analyze heavyweight lock trees (guide by Nikolay): https://gitlab.com/postgres-ai/postgresql-consulting/postgres-howtos/-/blob/main/0042_how_to_analyze_heavyweight_locks_part_2.md
- Lock management (docs): https://www.postgresql.org/docs/current/runtime-config-locks.html
- Our episode on zero-downtime migrations: https://postgres.fm/episodes/zero-downtime-migrations

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
- Michael Christofides, founder of pgMustard
- Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
- Jessie Draws for the elephant artwork

Transcript
Starting point is 00:00:00 Hello, hello, this is Postgres.fm. I'm Nikolay, Postgres.ai. And Michael, pgMustard. Hi, Michael. Hello, Nikolay. So, today we talk about locks, heavy locks. Because in Postgres there is also a concept of lightweight locks, in some other systems called latches, right? And if we say lock without heavy, it means heavy by default, right? I think so, yeah. I think this would be a very tough discussion or a very shallow discussion
Starting point is 00:00:37 if we were trying to cover everything that uses the word lock in Postgres. But yeah, LWLocks are the kind of latches, the lightweight locks, right? So database objects and row-level locks, this is what we are going to focus on today. Perfect. Yeah, more like closer to application development, actually, rather than to internals. And this is one of the very popular topics we discuss when we talk with clients and solve problems, because
Starting point is 00:01:05 many incidents are caused by lack of understanding: people don't realize how locking works, don't go into the details there, and they have storms of so-called lock contention, spikes of active sessions waiting on lock acquisition, and sometimes the database can be down. This is the problem. Yeah, it came up in our recent episode, didn't it, on 10 dangerous issues? Oh, yeah, exactly. So it's one of the 10 dangers we talked about. And where to start? Where to start? I will start with a simple, foundational, fundamental idea, which should be written everywhere in
Starting point is 00:01:55 bold, you know. A lock cannot be released until the very end of the transaction. A heavy lock can be released only at commit or rollback time, not earlier. Yeah, I like that. It's a good place to start. Any heavy lock (we don't touch lightweight locks here) can be released only when the transaction ends. Once you understand that, the other issues start to become less complicated. Yeah, and that's why we additionally don't want long-running transactions.
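(A minimal illustration of this point, using a hypothetical accounts table: the row lock taken by an UPDATE survives the statement itself and is released only when the transaction ends.)

```sql
-- Session 1: the UPDATE statement finishes quickly, but its row lock
-- is held until COMMIT or ROLLBACK ends the transaction.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- ... application does other work here; the row stays locked ...

-- Session 2: blocks here, waiting for session 1's transaction to end.
UPDATE accounts SET balance = balance + 100 WHERE id = 1;

-- Session 1: only now is the row lock released and session 2 unblocked.
COMMIT;
```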
Starting point is 00:02:35 And this is regardless of transaction isolation level, right? So it doesn't matter how you acquired the lock, it's held until commit or rollback. So yeah, even a very gentle lock, like an ACCESS SHARE lock, can be harmful. So don't hold locks you don't need for too long. It's simple advice, but it's key understanding. It sounds super obvious, but I wonder how come these super obvious things became obvious only in the middle of my particular career. The first time I heard it from somebody else was maybe 10 years ago or so, not 20.
Starting point is 00:03:30 It should have been 20 years ago, when I started working with Postgres and databases more deeply. Super obvious, but somehow you spend 10 years of your career thinking you're an expert, a database expert, not realizing this obvious thing. Maybe feeling it partially, but not realizing it. Maybe it's not delivered properly in documentation and educational courses or something. It should be. Maybe. I think the documentation does lay it out quite nicely, but it doesn't give you practical tips. As usual with documentation. Practical tips, yes. But this is a super important practical tip.
Starting point is 00:04:08 Like, don't expect you will release it earlier. It's super basic, super obvious to, I think, most of our listeners right now. I'm pretty sure 90% know this, but I wanted to spend some time here because I know new people need it. Yeah, well, I mean, it's also maybe another case of, you know, the midwit meme, where the beginner has a certain opinion. As you get more experienced in something,
Starting point is 00:04:39 you start to think you need to do all these different things. And then as you get more expert, maybe in a lot of cases, you go back to that early opinion. And you start appreciating basics more and more, because you also think about other people and you want them to be on the same page with you, if you work in teams, for example, right? Yeah, you value different things, perhaps, like reliability and not having weird edge-case bugs and things.
Starting point is 00:05:04 So yeah, actually, in terms of where to start, we had a listener request for this, and I think they phrased it in an interesting way, because it's not quite how I'd like to attack the problem. They said: could you explain the various kinds of locks Postgres has and what to do to avoid them? You cannot avoid locks. Yeah. But then what is the question? I think it's more around understanding when they're necessary and what to do to reduce
Starting point is 00:05:37 the severity: how heavy a lock is, or what else is allowed concurrently at the same time. Yeah, and tips and tricks around that, basically. Right. So we cannot avoid locks. Locks are needed to work in a multi-user environment. They are a mechanism to handle concurrency, so that many users can
Starting point is 00:06:02 work with the same thing at the same time and not interfere with each other, to avoid inconsistency or errors. And you cannot avoid locks: even if you just read something, you are already locking something. So what do we want to avoid? I will tell you what you wanted to ask. The right question (I hate this phrase, honestly, but this is exactly when it's needed) is how to avoid waiting on lock acquisition. This is what we want to reduce. You cannot get rid of it completely, but you want to reduce it as much
Starting point is 00:06:46 as possible, because this is when the database cannot do work and you just have wasted time. Eventually, like I said, if you have a lot of active sessions waiting, it can even cause downtime, cause an outage. We want to avoid too much waiting. And remembering our very first episode: what is too much, in an OLTP context? 100 milliseconds is already a quite slow SQL query. If we wait 100-200 milliseconds, people will already notice, because human perception is roughly 200 milliseconds, plus or minus, depending on the person. We are not cats.
Starting point is 00:07:32 Cats have much better reaction times. Imagine if everyone were a cat; we're not making apps for cats. Yeah, we would need to optimize much harder. They would recognize slowness much earlier, so higher standards. Anyway, this defines how long we can allow ourselves to wait, and we also need to include the actual work time, right? So ideally we should wait close to zero. Or not wait at all. If we do things properly,
Starting point is 00:08:11 we can almost avoid waiting time. So this is it. This is the right question: how to avoid waiting on lock acquisition. Agreed? Yes, I like that a lot. And it's not just about how to avoid waiting, it's also techniques to bail out. So maybe it's the same thing, but lock_timeout, for example, is not just about shortening the wait time, right? It's about bailing out
Starting point is 00:08:48 and saying, actually, we don't need to do this right now, let's wait for a better time. Great point. So you're talking about the same thing, but from a different end. When we are using some queries and transactions, sending them to the database, we want to avoid waiting. For example, if we use SELECT FOR UPDATE with SKIP LOCKED, we won't be waiting. Or NOWAIT, to get an error sooner. But if we do something, it's good to think about others and not let them wait on us too much. That's why lock_timeout.
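(A sketch of both sides of this, using the same hypothetical accounts table: lock_timeout bounds how long we queue behind others, and NOWAIT fails immediately instead of waiting at all.)

```sql
-- Our side: give up instead of queueing indefinitely behind someone else.
SET lock_timeout = '100ms';
UPDATE accounts SET balance = 0 WHERE id = 1;
-- ERROR: canceling statement due to lock timeout   (if the row is held)

-- Or fail at once, with no waiting:
SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
-- ERROR: could not obtain lock on row in relation "accounts"
```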
Starting point is 00:09:32 So it's a two-sided problem: we don't wait, and others don't wait on us too much. I like this. This completes the picture. Nice. So in terms of the first part of the question, though, various kinds of locks, should we start on the table-level ones, do you reckon? Right. But before we go there: honestly, I don't keep all of this in my mind all the time. Yeah, good. Yeah. Instead of keeping all this
Starting point is 00:09:58 baggage in your head, here is a simple way to think about it. There are exclusive locks and share locks. There are row-level locks and table-level (or database-object-level) locks. This is enough for understanding. Then you can use the documentation as a reference. There was a great Citus blog post, several blog posts actually, but I especially like the old one by Marco Slot, Postgres rocks, except when it blocks: understanding locks. I like it because it takes the table of conflicts the documentation has, and translates
Starting point is 00:10:37 it to a more practical form: what kind of operation can be blocked by what kind? It shifts the language from locks to the operations we have, like SELECT, ALTER TABLE, and so on. So I almost never keep this reference-style info in my head. I know some things experience gives me, but understanding this, that there can be share locks and exclusive locks, row-level and table-level, is already good enough, you know? Yeah, because it makes you think, before I do this, what kind of lock? And then you can look that up. You can look up the specific
Starting point is 00:11:15 command or the specific thing we want to be able to do: what does the documentation say? Or, and this is maybe a good time to bring it up, there's a newish tool, I think it came out a year or two ago, from Hussein Nasser, a friend of the show: pglocks.org. Yes. It's a very good attempt to translate what the documentation has into a form that's easier to consume. I think the documentation could actually just have it as is. It would be great.
Starting point is 00:11:46 Maybe it should. That would be awesome. But in addition, it's very browsable, and I think it's quite a good educational tool. I didn't realize this could be done at the same time as that, or that this would take this level of lock. So it was quite a nice way of browsing as well, I thought, for me personally,
Starting point is 00:12:10 for some commands I hadn't considered. For example, vacuum I had thought about, but analyze I hadn't really thought about from a locking perspective. So it's really interesting seeing that in some of the lists and thinking, oh, that means you couldn't, you know,
Starting point is 00:12:29 run two at the same time. Or there are some other interesting things I had never considered before. Yeah. So, analyze... everything is locking something. Even SELECTs are locking. And not only the table: during planning, which queries go through by default, it locks all the indexes as well, with a share lock, right?
Starting point is 00:12:56 ACCESS SHARE lock. We talked about it. This can be a bottleneck if you need to run a lot of these queries per second, like thousands of them per second: you need to acquire a lot of locks. This topic is a bridge to the lightweight lock discussion, but it starts with heavy locks. The need to take many heavy locks can lead to lightweight lock contention. This means understanding what's locking what, what's blocking others, is great. But again, at a very basic level: if you change something, you need an exclusive lock.
Starting point is 00:13:41 For example, if you update some rows, you need an exclusive lock on those rows. Right. And this means it will conflict with others who want to change them. Yeah. So exclusive means you can only have one at a time, right? Share means you can have multiple at the same time. If you read rows... well, for rows we don't have share locks, but at table level: if you change the table schema, again, it's a table-level exclusive lock. If you just deal with the table, including SELECT, you need a share lock on that table, right? At table level. Yeah.
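(One way to see this for yourself: a small sketch that inspects the heavyweight locks your own session currently holds, via pg_locks, again with a hypothetical table.)

```sql
BEGIN;
SELECT count(*) FROM accounts;  -- takes ACCESS SHARE on the table (and its indexes during planning)

-- Still inside the transaction: which heavyweight locks do we hold?
SELECT locktype, relation::regclass AS object, mode, granted
FROM pg_locks
WHERE pid = pg_backend_pid();

ROLLBACK;  -- the locks are released only here, at transaction end
```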
Starting point is 00:14:19 So, of course, database-object-level locks and row-level locks are kind of different beasts, right? Different beasts, because their implementation is different, their behavior is different. But still, the rule can be applied directly: multiple reading operations don't conflict, and by default they are not blocked by others. But they can be blocked indirectly.
Starting point is 00:14:46 This is my favorite example: you have a long-lasting transaction which just read something from a table, maybe zero rows, no rows read, just SELECT FROM the table LIMIT 0, but it already acquired a share lock on this table. Then some ALTER TABLE comes and waits on us, and then we have a lot of SELECTs coming to this table after it, and they cannot read, because they are waiting on this man in the middle, the transaction in the middle. So yeah, we have a chain, or queue, of locking issues, and this is a nasty situation, which is inevitable if you don't have lock_timeout and retries. And this feels like we have a SELECT
Starting point is 00:15:31 which is blocking other SELECTs. How come? Because there is an ALTER TABLE in the middle. That's why. Yeah. Okay. But yeah, I like that description, because it hammers home a few things we've talked about in the past. Like, this is a reason not to have excessively long-running transactions, because one of those can be the blocker, or it can get in the way of a misguided migration or something. But it's not enough on its own, because we also need the lock timeout and retries for the migrations.
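(The scenario just described, as a three-session sketch with a hypothetical table: the ALTER TABLE in the middle makes every later reader queue behind a harmless idle reader.)

```sql
-- Session 1: a long-lived transaction that merely touched the table.
BEGIN;
SELECT FROM accounts LIMIT 0;  -- reads nothing, but holds ACCESS SHARE until commit/rollback

-- Session 2: DDL needs ACCESS EXCLUSIVE, so it queues behind session 1.
ALTER TABLE accounts ADD COLUMN note text;

-- Session 3 (and every reader after it): queues behind the waiting ALTER TABLE.
-- Plain SELECTs are now effectively blocked because of a SELECT.
SELECT count(*) FROM accounts;
```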
Starting point is 00:16:09 And maybe one on its own would actually be fine; if you only implemented one or the other for a while, you might be okay. But then vacuum comes, right? Vacuum is always blocking, but it has a detection mechanism: if it's not running in transaction-ID wraparound prevention mode, it will kill itself. But if it is running to prevent transaction ID wraparound, it will not kill itself. And this is terrible, because you don't have long-running transactions, but you still have this problem. I think having a low lock_timeout and retries for DDL is inevitable. It's really needed for everyone.
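(A sketch of one migration step under that policy: a low lock_timeout so the DDL gives up quickly, with the retry loop living in the migration tooling.)

```sql
-- One attempt of a migration step: grab the lock fast or bail out.
SET lock_timeout = '2s';
ALTER TABLE accounts ADD COLUMN note text;
-- On "ERROR: canceling statement due to lock timeout", sleep briefly
-- and retry the step; the loop belongs in your migration tooling.
```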
Starting point is 00:16:50 It's a good point, though. Is vacuum the only thing that will take a lock and kill itself if something important comes along? I don't know. I don't remember others. Yeah, maybe. I mean, it makes sense, because it's kind of background work that can be done at a later point. There isn't much else that fits that bill. But yeah, cool. Good. So, what else I wanted to say: it feels like monitoring systems don't care about real problems.
Starting point is 00:17:32 They show you: this is the number of share locks, this is the number of exclusive locks, bye-bye. Datadog does it, for example. It's very popular, right? And people come to us saying, oh, we have issues with locks, and they show us a big spike of locks, and we see: okay, ACCESS SHARE locks. Okay, you had some intensive reading happening. Why should we care about this at all? Right? So such a thing as a good lock dashboard doesn't exist yet. I haven't seen one. There are good attempts, but the problem also lies in the fact that it's really hard to
Starting point is 00:18:15 understand who is blocking whom, who is waiting on what. For a single pair it's really easy to understand, but you need to reconstruct the whole chain, or tree, of blocking processes and backends: one backend can be waiting on another, that one can be waiting on yet another, and so on. It can be a tree. And actually it can be a forest of trees, because there might be several roots, and then you have multiple trees. In this case, what helps is a function called... well, let's step back. I recommend everyone enable log_lock_waits, because by default it's not enabled, it's off. Everyone should have it on.
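(Enabling it is a one-liner; as discussed next, the logging threshold is deadlock_timeout.)

```sql
ALTER SYSTEM SET log_lock_waits = on;  -- log any lock wait longer than deadlock_timeout
SELECT pg_reload_conf();

SHOW deadlock_timeout;  -- 1s by default; also the threshold for these log lines
```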
Starting point is 00:19:06 In this case, every time some session waits for one second and cannot acquire a lock, this situation will be logged. Actually, not one second: more precisely, deadlock_timeout, right? Because after deadlock_timeout, some checks happen, and that is exactly when this logging can happen in Postgres. Yet another parameter that's being used for two things. Yeah, well, it's indirect, it's not straightforward, and it is what it is, right? But also, some people change it: sometimes it's two seconds or five seconds. Some people try to postpone deadlock detection, thinking maybe it will be resolved. Resolve itself, right?
Starting point is 00:19:52 Yeah. It cannot resolve itself. Well, but the problem is you might want different settings for the two purposes. I would go in the other direction: detect earlier, maybe. It's an interesting topic, right? So maybe we should talk about deadlocks in a few minutes. Speaking of observability, how to deal with
Starting point is 00:20:12 locks, it's really not easy. You enable this, and it's already something, it's great: you see the process ID of the victim, let's use this word, why not? And the waiters. Yeah, offenders. Sometimes maybe multiple, right? And then, since the analysis is performed for the session that is the victim, we see the text of the query for the session that is waiting, but we don't see details for the blocking transaction. It might actually be idle in transaction: there might be no current query, although in pg_stat_activity we would see the last query executed in that transaction, which would probably help. But sometimes it's in an active state, state equals active in pg_stat_activity, and
Starting point is 00:21:06 if we were able to see that query, it would help. Unfortunately, in the logs you cannot. This is a little bit annoying, because if you need to do postmortem analysis, root cause analysis, for something in the recent past, you see only one side's query, but you don't see the other side. It takes time to understand; sometimes it's actually impossible to understand what caused it. I remember we even implemented our own mechanism to log additional information in such cases, using just RAISE NOTICE in PL/pgSQL, but it's not fun, honestly, to do this. In some cases it's reasonable, if you, for example, have very strong tooling for log analysis, enterprise level. In that case, I would do more logging using PL/pgSQL, for example, and let the tool visualize the number of errors
Starting point is 00:22:08 we have, and provide details, and maybe notifications, and so on. So the question is: how to implement good monitoring for this? For analysis, we have a query, right? Among our how-tos, the how-tos I wrote. There is a huge query which went through a big evolution: I took it from some people, they took it from other people. So some evolution happened there. And it's great, I think: 100-plus lines of query, which provides all the details.
Starting point is 00:22:41 And you see the whole forest of trees, including all states, wait events (sometimes it matters), and also queries. One problem with it is that it requires a pg_blocking_pids() call. This function gives you the whole list of process IDs blocking a given process ID. And the problem with it is that it's sometimes quite expensive. It's not free. So limit yourself with some statement_timeout, so as not to cause an observer effect: when there is a huge spike, a storm of lock-acquisition wait events, this function can sometimes cause additional trouble.
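(A minimal version of the idea; the full 100-plus-line tree query is in the how-to linked above. This sketch just lists waiting backends and their blockers, with a statement_timeout cap to limit the observer effect.)

```sql
SET statement_timeout = '500ms';  -- keep the observer itself cheap

SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       state,
       left(query, 60)       AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```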
Starting point is 00:23:33 So does that mean that at the times it would be most useful, it's the most likely to time out? Well, yeah. In general, I warn about this, but it doesn't mean it always happens; it happens rarely. But it's quite wise to limit yourself to, I don't know, half a second, maybe up to one second. Okay. And not to call it too frequently, not every 100 milliseconds, like I do sometimes with
Starting point is 00:24:01 some queries, not with this one, when observing something manually, where I see an almost animated state of the results. But in this case, it's better to do it less frequently, with a statement_timeout. But once you do this and you bring it to monitoring, it's beautiful. It's great. Yeah. With monitoring, it feels like for some types of issue we want to catch every example. But sometimes it's OK if we're just sampling.
Starting point is 00:24:34 And if we catch that it's sometimes an issue, that's a sign that there's a problem in our code, or a problem in our team, or that we don't have the education around these types of problems. And I understand this might be one of those cases: you do want to catch as many of them as you can, but I think it's more important that we realize we even have migrations that don't have this, or that we are doing updates or transactions in an order that could end up with deadlocks or complex locking trees. I don't know. I would prefer to have exact information here in an ideal world, but I
Starting point is 00:25:11 live in a world where max_connections doesn't equal 10,000. If you're on RDS, you can have multiple thousands of max connections. In that case, you can have a storm of active sessions, many thousands of active sessions. Of course, that's a different situation. But for me, it's not a healthy situation. I would prefer the number of cores multiplied by 10 at most; this should be your absolute maximum for max_connections. In that case, you cannot have too many records to analyze, right? And then it's good to have exact information, because if you start sampling, you might miss something, because one line
Starting point is 00:25:56 can matter a lot in this analysis. Yeah. So if you have, I don't know, 96 cores, your max_connections should be 400. In the worst case, we will have 400 backends to analyze. It's a lot, but it will only be during an incident. Hopefully, that's not every day. If you have an incident every day, you know what to do: ask people who can help. Anyway, with 5,000, I agree. 5,000 of the same thing, actually. They are all waiting on the same thing. Why should we log them all?
Starting point is 00:26:38 Why should we bring them all to monitoring? Of course, it doesn't make sense. But if it's just a few hundred, I don't know, it should be fine. Again, under normal circumstances with 96 cores, you should have, I'd say, 50 active sessions, and most of them are not waiting. So this analysis won't bring much. And if your system is healthy, in most cases this query will produce empty results. Empty. That's great.
Starting point is 00:27:13 So the normal state is zero rows in the results. Well, I guess the normal state could still be up to a few rows, but the main point is that they shouldn't have been waiting for too long. Is that fair? Yeah. Well, you can wait a little bit, and sometimes you have some small spikes.
Starting point is 00:27:35 But in healthy systems, we just see zero, zero, zero, even if the system is very heavily loaded, and then some small spike, and then zero, zero, zero again, like this. Yeah. Also, I wanted to mention, again, an indirect connection to lightweight locks: when some backend is waiting on another to acquire a lock, a heavy lock, you will see wait_event equals lock, or rather wait_event_type Lock. So this means we are waiting. And again, Lock means heavy lock. What else?
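(A quick sampling query along these lines; on a healthy system it returns no rows almost every time.)

```sql
SELECT pid, wait_event, state, left(query, 60) AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';  -- heavyweight-lock waits only
```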
Starting point is 00:28:17 Well, is it a good time to go to... I mean, you mentioned that these lock trees can get complex, but they can also be loops, right? In the case of deadlocks. Oh, that's interesting. Yeah. So the only solution to deadlocks: somebody should die. That's easy.
Starting point is 00:28:38 But no, I'm talking more about prevention, right? Oh, yeah. I know it can get complex, but I feel like most cases of deadlocks I've heard about have been what I would class as poor design on the application side. Exactly. Or at least, yeah. Exactly. So very avoidable, with the right consideration around locking and around ordering: what you do in which order. Yeah. Actually, let me correct myself from a few phrases ago. I said it should be wait_event_type equals Lock.
Starting point is 00:29:14 And there are several additional wait events in this category, under this type, which can help you understand at which level we have the lock we are trying to acquire: a relation, a row-level tuple, or something else, maybe advisory locks also. Yeah, numbers, this user-defined mechanism. So yeah. And speaking of deadlocks: here exactly we can apply the anti-methodology from Brendan Gregg, blame somebody else. It's always a problem with applications, with workload, maybe users.
Starting point is 00:30:00 It's not a database problem. The sequence of events is not good, and it should be changed, redesigned, so people don't conflict in this way. And in many cases, I remember, dealing with deadlocks was really challenging, because if it's not you who wrote this code, you need to explain, and they need to change it. In many cases, it's easier to just accept the loss. If it's just a couple of deadlocks per day, it's okay.
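(The classic avoidable case, sketched with the same hypothetical table: two transactions locking the same rows in opposite order.)

```sql
-- Session 1:
BEGIN;
UPDATE accounts SET balance = balance - 1 WHERE id = 1;

-- Session 2:
BEGIN;
UPDATE accounts SET balance = balance - 1 WHERE id = 2;

-- Session 1: now waits for session 2's row lock.
UPDATE accounts SET balance = balance - 1 WHERE id = 2;

-- Session 2: closes the cycle; after deadlock_timeout one transaction is aborted.
UPDATE accounts SET balance = balance - 1 WHERE id = 1;
-- ERROR: deadlock detected

-- Fix: touch rows in a consistent order (for example, always ascending id),
-- so waits form a queue instead of a cycle, and keep retry logic anyway.
```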
Starting point is 00:30:32 If you have a heavily loaded system, it's not that bad. And then often the solution is to make sure that transactions can at least be retried. Definitely so. 100%. Yeah. Of course, there are cases where having a deadlock is a serious problem: if money is involved, then you can have real issues. And definitely the application...
Starting point is 00:31:01 It's the right point. Maybe instead of trying to completely avoid and prevent deadlocks, it's better to design around them: okay, we have a billion transactions per day with two deadlocks, but we have retry logic, so nothing is lost and no users are complaining. That's it. It's smart. I agree with this. Nice. Do you want to touch on advisory locks quickly? Or not really? Oh, yes. We had a case recently. It was an interesting case. Obviously very smart engineers, who also read documentation a lot, and quite advanced code was written,
Starting point is 00:31:55 but they had issues with heavy lock contention. It turned out there were storms of advisory lock acquisition wait events. We just talked through it, and my advice was quite simple: let's get rid of advisory locks, because we have the data in a table, this data is organized as rows, and we can lock rows. We talked about the original reasons for choosing advisory locks, but when we asked whether we could switch to regular row-level locks, the answer was yes: there was no big reason not to.
Starting point is 00:32:39 And once we switched, it became quite clear how to resolve the contention completely. You just use SELECT FOR UPDATE. Before you update or delete a row, you SELECT FOR UPDATE, and then SKIP LOCKED or NOWAIT, plus retries. If you want to fail immediately and then retry, it's NOWAIT. Or, depending on the application, the logic, the data, and so on, sometimes you can just take the next batch of rows to process: in this case, SKIP LOCKED. If you must work on the same batch: okay, come back later. In this case, NOWAIT and just retry. And that's it. It's an easy solution. Well, advisory locks can
Starting point is 00:33:27 still be helpful in some cases, but for working with data in general, I think it's a mechanism that feels kind of extra and not needed. If you just need to process some data in many parallel sessions, just work with row-level locks. Yeah. I feel like it's a sledgehammer, and sometimes you need a sledgehammer, right? Yeah. If you think about it: in the UK, we keep our garden tools in a shed, normally. Not many people have a sledgehammer in their shed. You don't need it that often, right?
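(A sketch of the row-level-lock pattern described a moment ago, assuming a hypothetical jobs table with a status column: each worker claims the next batch and skips rows other workers hold, so nobody waits.)

```sql
WITH next_jobs AS (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
    ORDER BY id
    LIMIT 10
    FOR UPDATE SKIP LOCKED  -- rows locked by other workers are skipped, not waited on
)
UPDATE jobs
SET status = 'processing'
FROM next_jobs
WHERE jobs.id = next_jobs.id
RETURNING jobs.id;
```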
Starting point is 00:33:55 I agree, that's a good analogy. Yeah. Well, anyway, the point is, every now and again, if you're doing a remodel and you want to get rid of a wall, maybe you need to bring in the big guns and actually do something heavier. I applied this word, maybe you remember, to materialized views in the past.
Starting point is 00:34:25 They definitely feel similar: you solve your performance problem, but it's so massive, it always needs to be refreshed fully. So, same feeling: it's a tool, it's reasonable in some cases, but if you overuse it, you cannot do precise work. You lose some precision, right? So maybe... Well, going back to the original question: what can we do to avoid these locks? And we were saying, well, actually, what can we do to avoid long waits? And what can we do to make locks lighter in general, or take less time, or retry when needed?
Starting point is 00:35:06 This is a perfect example. Is there a way of taking less extreme locks, or releasing them quicker? It feels like another example of how to minimize locking, or at least shorten its impact. Yeah. So yeah: short transactions, don't hold locks too long. If you wait, give up.
Starting point is 00:35:31 Give up sooner, don't wait too much, because others can be waiting behind you. What else? Use SKIP LOCKED or NOWAIT: SELECT FOR UPDATE ... SKIP LOCKED, or NOWAIT. This is a mature mechanism; even MySQL has it now, right? So it's great to have it and use it. And I guess the more obvious... With only one comment for SELECT FOR UPDATE and SELECT FOR SHARE:
Starting point is 00:36:02 SELECT FOR SHARE can lead to multixact SLRU issues. I barely remember already; I haven't touched it for a while. But there is an interesting effect with SELECT FOR UPDATE: if you have foreign keys and you use subtransactions, you can have issues with multixact IDs, somehow indirectly and unexpectedly. SELECT FOR UPDATE can behave differently if you have subtransactions. So again, I'm a big 'fan' of subtransactions, so to speak.
Starting point is 00:36:42 I would not use SELECT FOR UPDATE and subtransactions at the same time. This can be a problem, as I remember. But in general, SELECT FOR UPDATE is great. In general. And, I mean, you're talking mostly about very heavily loaded projects. Oh yes. In my opinion, locking is a problem way before that. It can be a problem in much, much smaller projects. I agree. Even in a small project, it can be very annoying to have contention on heavy locks. Well, and it can still cause downtime. So yeah, I think that's great.
Starting point is 00:37:25 Anything else you wanted to make sure we covered? No, that's it: no long transactions, no long waits, and you will be fine. Yeah. Actually, I had one more, at the risk of being too obvious. We had a whole episode on zero-downtime migrations, but there are schema changes you can make in different ways to avoid the heaviest lock types, as simple as, you know, the CREATE INDEX CONCURRENTLY type of thing, or we had a whole episode on pg_squeeze and things like that. I feel like that's another topic: avoiding a certain type of lock in favor of different types of lock.
Starting point is 00:38:09 You can avoid lock contention issues during DDL, but you need to sacrifice transactional capabilities. This is the key, unfortunately. You cannot have transactional, atomic steps and zero risk of lock contention at the same time. And CREATE INDEX CONCURRENTLY is a great example here. Or attaching and detaching partitions concurrently, and so on. Yeah. Good. OK. Nice one.
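(The trade-off in miniature, with a hypothetical index: the concurrent build does not block writes, but it gives up transactional behavior.)

```sql
-- Blocks writes to the table for the whole build:
CREATE INDEX idx_accounts_owner ON accounts (owner_id);

-- Allows writes during the build, but is not transactional:
-- it cannot run inside BEGIN ... COMMIT, and on failure it leaves
-- an INVALID index that must be dropped and re-created.
CREATE INDEX CONCURRENTLY idx_accounts_owner ON accounts (owner_id);
```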
Starting point is 00:38:49 Thanks so much, Nikolay. Catch you next week. Bye-bye.
