Postgres FM - RLS vs performance

Episode Date: December 20, 2024

Nikolay and Michael discuss Row Level Security in Postgres, focussing on the performance side effects and some tips to avoid (or minimize) them. Here are some links to things they mentioned:

- Row Security Policies (docs) https://www.postgresql.org/docs/current/ddl-rowsecurity.html
- 7+ million Postgres tables (recent talk by Kailash Nadh) https://www.youtube.com/watch?v=xhi5Q_wL9i0
- Row Level Security guide (Supabase docs) https://supabase.com/docs/guides/database/postgres/row-level-security
- current_setting function https://www.postgresql.org/docs/current/functions-admin.html#id-1.5.8.34.3.6.2.2.1.1.1.1
- Our slow count episode https://postgres.fm/episodes/slow-count
- RLS Performance and Best Practices (gist from Gary Austin) https://github.com/orgs/supabase/discussions/14576
- Everything you need to know about Postgres Row Level Security (talk by Paul Copplestone) https://www.youtube.com/watch?v=vZT1Qx2xUCo
- BUFFERS enabled for EXPLAIN ANALYZE by default (commit for Postgres 18) https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c2a4078ebad71999dd451ae7d4358be3c9290b07
- Add UUID version 7 generation function (commit for Postgres 18) https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=78c5e141e9c139fc2ff36a220334e4aa25e1b0eb
- Postgres hacking session with Andrey and Kirk (for UUIDv7) https://www.youtube.com/watch?v=YPq_hiOE-N8

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is produced by:
- Michael Christofides, founder of pgMustard
- Nikolay Samokhvalov, founder of Postgres.ai

With special thanks to:
- Jessie Draws for the elephant artwork

Transcript
Starting point is 00:00:00 Hello, hello, this is Postgres FM, a podcast about PostgreSQL. Do I pronounce it right, Michael? I think so. Yeah. And hi, Michael. Michael from pgMustard. Hello, Nikolay. Yeah, and I'm Nikolay, Postgres.ai. And we don't have guests today, unfortunately. We have good guests coming soon, but not this time.
Starting point is 00:00:23 And my idea was... you know how much I don't like to deal with security; I just need to do it. And let's talk about policies, CREATE POLICY. Why is it CREATE POLICY, by the way? It's not CREATE RLS POLICY, it's just CREATE POLICY. Policy is a very general word, right? It can be not only row-level security, but maybe some other policies. Do you think the decision was made to... I think this is not standard. I'm not sure, because I just checked with some AI
Starting point is 00:00:58 and it says SQL standard doesn't have RLS. I looked up where did the name come from because it wasn't the title of the page in the Postgres documentation. And the time it's used is in the auto table statement. So this is a table level setting that you first enable. And only once you've enabled it on the table, then you set policies. So it's auto table and then enable row level security. And then once that's enabled, you need to create at least one policy so that people or non super users
Starting point is 00:01:31 can read anything right but still like it confused me from very beginning I think it was implemented originally very long ago right before post this then or after I don't remember I didn't actually check this time yeah but I think it's quite old feature relatively right it definitely wasn't in Postgres 9.0 but but possibly it already present in I found it in 9.5 9.5 yeah yeah it means 2015 or something, right? Mm-hmm. So let's talk about the problem it's solving or tries to solve. The problem is in the era of SaaS, software as a service, we have a lot of cases when one system running in cloud operates with many, many different users,
Starting point is 00:02:22 which ideally should not put at risk each other or something. There should be some segregation. For example, if we have, say, Twitter style, right? We have users, we have... Twitter is not good. Let's think about something more business-oriented. It would be more... Or e-commerce, being able to see somebody else's orders
Starting point is 00:02:48 or being able to see their address or something like that. Let's find some B2B example because this is where it feels much more sound, so to speak. So, for example, let's take Slack, and we try to implement Slack, right? Yeah. So we have workspaces, many of them, and channels inside and people communicate, but each workspace belongs to some customer, some organization which uses our service.
Starting point is 00:03:16 And it's SaaS, right? It's in cloud. And is it a good idea to keep all messages, for example, in a single, maybe partition, but we forget about performance for a while, in a single table called black messages for all customers. Maybe it's a good idea, maybe not, right? But from the security point of view, it's a terrible idea because if selecting our messages, we have some bug or some kind of hacker attack, we can select other organization messages, right? And that's bad. So, or maybe just developer forgot to put some condition again, a bug or something like unintentional and suddenly one organization
Starting point is 00:04:06 deals with messages from other organizations sounds super scary for platform development for the SaaS service developers it's bad so my point is some SaaS systems
Starting point is 00:04:21 still have this situation old ones definitely have this situation unresolved. I think it might be most, yeah. Yeah, ultimate. So it's called multi-tenant system, right? Ultimate way is like you create a database or maybe it's like whole cluster for each at least big customer and that's it right and
Starting point is 00:04:47 full segregation full separation of data and that's great but it's super expensive even if you put it into in the same database in the same scheme but in different tables it's also quite expensive although we know Postgres can live with millions of tables. I can send you a new presentation we just recently discussed. It was probably presented in Brazil. I don't remember. Again, like 100 million tables. Is it possible? Well, it's possible, but it's difficult. So idea of RLS comes from this problem, right? We want to have guaranteed protection that one customer sees only their data, not other customers' data, right? And it can be achieved by defining special rule that additional condition will be always present in selects, right? Or updates, deletes inserts yeah and i think i think would you say it's
Starting point is 00:05:46 fair that most when i when i said most don't have this i mean most don't have it at the database level i think most people put a lot of effort into making sure this couldn't happen at the application level but most don't also add an additional level like layer of defense at the database level and this is a database level. So any application sitting on top of it or any client sitting on top of it gets that by default. Yeah, that's why I always say we cannot trust the application layer because what if next year your organization decides to try a new framework or something and you have two applications and they behave differently
Starting point is 00:06:21 or you need to deal with implementing the same rules in two systems and basically copy-pasting. Sometimes in different languages, it's super hard to maintain. And what if you have some specific interfaces connected to database and people just work with this data directly, right? And they bypass your application logic completely.
Starting point is 00:06:43 So yeah, this is good to have inside database closer to data and apply it in the same manner to all. I wanted to say this problem can be solved using views, right? So you can, for example, have a view which will deal only with data dynamically for particular customers, customer only, not seeing anything else. You just put something to rule definition, which will involve some, I don't know, current setting again, right? And in current setting, you have your customer ID, for example, and you say, okay, this view is selecting rows from messages table, where, and we have the check that customer, like project ID, organization ID,
Starting point is 00:07:31 doesn't matter, it belongs to this customer ID, that's it, always. It can be joined maybe. And in this case, this view can be even writable, so inserts, deletes, updates could work. And this check is automatically verified, although maybe it's not verified at insert time I don't remember exactly I think there is some caveat here but if we stay only if you think about only selects at least it's definitely working you you cannot if you make your application work only with this view not with original underlying
Starting point is 00:08:04 it's called underlying table. In this case, you can be 100% sure that they will see only data they can see, right? Not anything else. And this additional filter will be always, planner will be always using it, right? But this is not RLS. It's an alternative approach. And I think in source will be a problem. I don't remember 100%, but I remember I got bitten by this problem a few times in my experience.
Starting point is 00:08:32 And we needed to implement triggers to double-check that inserts are protected and the same rule is applied the third time. So maybe this is like downside of this approach. But create policy statement as like more like natural way to protect data for these purposes, right? So we define policy and we alter table saying that role level security is enabled for this table. By default, it's not, right? Well, yeah, in Postgres, yeah.
Starting point is 00:09:04 Yeah, and then this policy is supposed to protect us in terms of multi-tenant situation, so customers deal only with the data they can deal with. They don't see each other. Do you think it's a good protection for multi-tenant situation when we keep everything in one table? I have mixed feelings about role of security. I think this is a security question in general.
Starting point is 00:09:28 It's how much are the downsides worth it? How much is the extra complexity worth it for the additional security? And yeah, I think in environments where security is of paramount importance, it makes sense to add security at every level, including this one. This one, you mean database level? Yeah, I don't see Postgres row-level security in use that much.
Starting point is 00:09:54 So most people seem to be making the trade-off of not using it in favour... Well, not in favour of... I suspect most people using row-level security also have other measures in place but i've seen a growing number of performance related questions around rail level security in various communities and i think it's due to the popularity of postgres and the automatic creating also my theory is because people are creating these restful apis directly interfacing
Starting point is 00:10:28 with the database it becomes super important to have security at the database level because people can change the parameters of the url and they otherwise they could just view other people's data like immediately so i think it makes sense that it's growing in popularity, but I don't think it's been growing in popularity as an additional layer. I think it's been growing in popularity as the only layer, but I could be wrong. Yeah, that's interesting because exactly this alternative approach, which I explained using views, it's coming. Actually, my bachelor thesis was about updatable views and master thesis was about updatable XML views. So it was like 20 years plus ago.
Starting point is 00:11:09 But Postgres particularly provokes you to use views over tables and views. You just create a view in, say, by default, like schema V1. And this is your V1 version 1 api and by creating views and controlling who can see what you can already limit access right you can hide some columns just not taking them to views or and so on and that's great but maybe you are right because I did see some popularity of low-level security growing from that side. Because people need more and more and more logic on the database side, which is related to data, because there is no middleware. It's only this Haskell application called Postgres. And all the data-related logic is supposed to be in database, including everything related to permissions.
Starting point is 00:12:07 And all UI logic is on fat client, right? It's like React or something. And yeah, I see your point exactly. If you have middleware written Ruby, Python, Java, anything, then people tend to implement all the security checks right there and maintain them in a unified way and so on. Yeah, I agree. There's no application, middleware application here, so it goes to database. Where else? Because it cannot go to UI. But interesting, again, views are there as well, but it's not enough. It doesn't feel
Starting point is 00:12:41 enough. And sometimes we need to deal with data from multiple clients. Sometimes. It depends, right? I mean, admin mode or something, right? Yeah. In this case, role security provides this bypass RLS flag for when you create a role or user, you can specify this flag. It's kind of super user or not super user, but it bypasses these security checks, right? Yeah, I agree. But my point stands as well. Old SaaS systems, they tend to not use RLS. But there are many other fresh, like the last few years, a lot of SaaS systems are created, have been created, and still created.
Starting point is 00:13:24 And also with AI, it's also a very multistandard situation sometimes. Think about chart GPT as well. It's also like kind of many messages from these charts. It can be also single table and it requires some security as well, right? And my point is that new SaaS systems, they tend to have it even without PostgreSQL or Hasura or something like that. Even if they use regular like Django or, I don't know, Java, anything, they sometimes, I just observed,
Starting point is 00:13:56 they sometimes implement RLS because they think, okay, we need better protection for our SaaS system. Right, multi- our SaaS system. Multi-tenant system. Cool, I didn't realize that, that's great. Yeah, I just see it and then they grow and start hitting performance issues. Yeah. And this is natural and it looks like the first,
Starting point is 00:14:16 let's talk about performance if you don't mind, right? No, I think the most interesting part of this is performance. And the first thing they bump into sometimes, not everyone, is that having some condition present in the RLS definition, this filtering, doesn't mean that planner will see it.
Starting point is 00:14:36 And you can have index on a column. For example, it's a simple condition. Some column equals some constant. Forget about current setting or some stable functions for a while some constant and we have an index on this column but we don't have this condition used in the where clause this will lead to sequential scan right because planner doesn't see what role level security is going to, which filter is going to apply, right? And this is a surprise. Yeah, sequential scan, but like with a filter.
Starting point is 00:15:11 So each role will be checked against the policy. Like you have filter, just use it. You might even, even if you have an index on that. So yeah, that's super unintuitive, I think, for people. And it tripped me up when i was reading about this i i thought i wondered if it could use an index on a on like a simple policy on a column like a like a user id or something like that and i even saw guides in in fact there's a really good guide on rail security by the team at Superbase who heavily encourage the use of rail level security as part of their offering because they're using Postgres.
Starting point is 00:15:52 But yeah, they say you can think of policies as adding a where clause to every query. But that tripped me up when it came to performance because it's not like a where clause from a performance perspective it's like a where clause from like a user experience perspective like from a from what you see at the end of the the data coming back might might look the same as if you'd filtered it using a where clause but performance like the planner doesn't have access to that to to choose to do a different type of scan at the beginning so yeah super important distinction and one of their tips on the performance side is therefore to always include all the filters you want anyway uh in the queries yeah yeah that's that's interesting but it's easy to solve i think it's just edit there into their work class and that's it let's talk about like a step back maybe a little bit about not
Starting point is 00:16:45 performance because i will forget about this but i wanted to share some feature stuff first of all i saw the case that if we have foreign keys and so on referential integrity checks are not covered right yeah this is like like the integrity is more important than role level security, right? Did you see any interesting cases related to this? Well, I only read about, I thought it was really fascinating concept that by enforcing referential integrity, you could be leaking security information, like if you or insert something, which is not
Starting point is 00:17:24 Yeah, exactly. If you're one user trying to insert something, and then you find out that it fails to be inserted, then you know that thing exists but from some other user. Insert is a bad example because foreign key doesn't help with insert. It helps with, for example,
Starting point is 00:17:40 deletes, right? Like SK deletes. So you, for example, can delete what you cannot see or something like that. Yeah, yeah, sorry, good point. Maybe it's not relevant then, but I did read in the documentation that you need to be careful with leaking information
Starting point is 00:17:56 by checking... Ah, I see how. For example, if you try to delete something without Cascade and it says it cannot be deleted because there is a row in a different table which we reference to, it reveals the presence of this row, but maybe you are not supposed to see it. Yeah, it's not your row, it's somebody else's row. Something like this is interesting. And then from error, you can conclude that there is some data there. Maybe you can understand the values.
Starting point is 00:18:33 It's interesting. But I never saw such situations in production. Another thing I had experience with is that PgDump... Yeah yeah you mentioned also before we started recording that it's bad for backups no it's bad you just have to test your backups but as I said the dumps are not backups I'm still on this idea
Starting point is 00:18:58 that by default backup is physical backup if we say logical backups okay it's bad for, I mean not bad. Yeah, exactly. It's can be tricky and surprising for logical backups. For example, you dump your table, but you see only rows you can see. Surprise. I was just imagining using a service like Superbase and wanting to keep a semi-regular dump of the data somewhere else just for my own peace of mind. Nowhere else, just lack of a lot of data. It was more that if you're not super careful about which user you're using, like making sure that's from a super user or some role that can see all the data, it'd be very easy to think, I've
Starting point is 00:19:50 got, like, I've backed this up. And then if you actually ever need it, you're in real trouble. You've lost a lot of data. Yeah, you can have it unnoticed, right? Yeah. A credit to the Postgres documentation. They called that out as something to be aware of. In the same area of dumps, pgDump doesn't have skip policies, or pgRestore doesn't have skip policies flag, as I remember.
Starting point is 00:20:16 And this was a problem because some users of our Database Lab engine, DbLab engine, this is how actually I noticed that more and more people start using raw-level security. They start asking questions, we have some raw-level security here, but we restore it in this non-production environment
Starting point is 00:20:37 for dev test activities. And we don't need raw-level security at all here. How to skip it? We just want to check various stuff and pgrestore has a lot of flags
Starting point is 00:20:52 to skip this to skip that many things but as for policies skip policies it's lacking and we needed to deal with
Starting point is 00:21:01 featured supports hyphen L small L lowercase L and uppercase L to have a list of objects present in the dump and then just have filter added like it's dictionary of everything what was dumped. And we just removed policies and then restored without them. And we automated this in Database Lab Engine. So basically, the idea is like
Starting point is 00:21:25 if i had a little bit more time i would probably code it and pg store could have it it feels like some good addition for future postgres versions i understand the attraction but i'm also kind of against it like partly because we're about to discuss performance issues right i'd want to catch those earlier i'd want to know. I'd probably want to still have row-level security on in my dev environment. Good point, actually. Yeah. Removing policies.
Starting point is 00:21:55 This is what we didn't think about. We just received idea, let's remove it. Yeah. And we implemented it. Now I think, oh, indeed, you're right. Like, you can do you don't see the whole not plan, but all the problems that policies can bring. And we will discuss in a minute,
Starting point is 00:22:12 maybe one of the biggest right? About stable function and so on. Yeah, so it's a good point. But it means also that if you want to have policies, you want to reconstruct your users, well, this is a good idea as well. So I have two to-do ideas. Anyway, skip policies is a valid flag for PgStorm. In some cases, it's up to the user, right? It should be possible to decide. Because it supports the removal of skipping of many other things. True.
Starting point is 00:22:44 Like permissions, for example, ACL, right? It's booming, it's skipped. But ownership, but RLS now, but I have to do for DBLab Engine as well. Like we should find a path to simplify restoration of users and to preserve all policies so we see performance. I mean, this is one of the purposes of the develop engine to experiment with plans, right?
Starting point is 00:23:09 And see how indexes help or not help and so on. Yeah, good. Let's talk about performance, back to performance. So we discussed that you must put this filter from policies. You must duplicate filtering in the where clause to achieve good performance. Another thing is that it's natural to use function current setting, for example,
Starting point is 00:23:35 or something like this, some function which current setting is just getting the value of a variable. It can be standard GUC, Grand Unified Configuration, or GUC variables, right? Let's call them variables. Or it can be user-defined variable. It has to have namespace,
Starting point is 00:23:57 so blah, blah, dot, blah, blah, right? And you can, using set or setConfig function, you can set it to something, and then you can use it. you can access it either using show, but you cannot use show inside SQL query. To access it inside SQL query, you can use function card setting. And naturally for Postgres and I think for Hasura as well, many things are coming inside these functions already preset by this
Starting point is 00:24:30 thin layer which provides API, right? For example some headers are there and in headers and also like processing of JWT tokens basically authentication, authorization is done through this mechanism. And it's natural
Starting point is 00:24:46 for you to include it into queries and say, okay, current setting and blah, blah, blah. And we see which user is working right now. And again, it's natural to put the current setting into definition of your level security policies. You say, okay, I don't like customer ID equals what you have in some headers or something coming from JWT token anywhere. You just use current setting to identify your user and apply the security rule to filter out the rows this user can see only, right? So you have current setting used in row-level security. And here we come to two problems. First problem is two separate problems, actually, but they are connected.
Starting point is 00:25:38 First big problem is that if you have select count dealing with many, many rows, like million, 10 million, or it may be another aggregate. We know Postgres has slow count, right? But if you have row-level security, slow count is order of magnitude slower, even worse, because row-level security check is applied to each row. This is terrible. We already have very slow aggregates, right? Which makes you think even more about something like materialized views or continuous aggregates in timescale DB or some other stuff, like other kind of pre-calculation, like denormalization with pre-calculation, I don't know, like something like this. So not providing actual values in real time
Starting point is 00:26:28 because it's a huge penalty, additional check. Even if it's just simple check with constants and that's it. This is number one problem. Yeah, it feels terrible and I don't have solution. We don't have good solutions actually, honestly, to slow count in Postgres because it's a row store. It's really expensive to calculate the sum of 100 million values.
Starting point is 00:26:53 But now we all additionally need to check every row. Can we deal with it? Can we deal with it? Or it's not our row. Wow, it's super slow. But additionally, another layer of penalty comes if you use a function which is not mutable, like which is stable, for example,
Starting point is 00:27:11 or volatile, which is worse, right? But current setting is stable function. It means that Postgres also, even if it's current setting and just some variable which is just set and that's it it will call it for each row boom boom boom right and this is insanely slow already insanely slow so people can think postgres have slow count people with rls can think oh it's very slow count people with RLS and current setting inside RLS think okay it's terribly slow
Starting point is 00:27:49 absolutely terribly slow right yeah but for this last problem with current setting there is a few ways to avoid this like behavior and I like the way just to put it into inside parenthesis and the right select word so postgisplanner moves current setting call into it's called init plan right and now another node in the plan and it will be called once and then just used that's great so this is easier solution just around by parentheses and just surrounded by parentheses won't solve it you need the word
Starting point is 00:28:31 select so it becomes subselect it goes to unit plan this is also interesting behavior also quite surprising to some folks but anyway this's some solution here. However, problem like we need to check every row in real time, it cannot go away. This is actually the number one, maybe the only one reason I think ILS, row level security is painful. Like how to solve it? I was reading there's a really good gist put together by i think one of the team at super base and or at least they reference it in
Starting point is 00:29:13 their docs and they do mention a kind of a trick where if as long as the function isn't leaking any information, you could bypass RLS. So you can, I think it's called security definer. You can set it up in a way that the function can then bypass row-level security. But yeah. But it feels like we're turning off. Why?
Starting point is 00:29:44 I can say, okay, let's run all massive aggregate calls, like counts, sums, anything. Let's just run them using a user which bypasses ALS. That's it. If you want to present some big count somewhere, just do this. But if it's global, it makes sense. I would say, okay, here we need to count everything by pass-through list. We would
Starting point is 00:30:10 do it anyway. But if it's inside single organization, it feels like breaking a hole in our wall we just built. I don't understand it enough to understand why that might be okay. And I think we would have to be careful
Starting point is 00:30:25 about not leaking information but because you're because of that first tip where we're still providing all the where clauses we need to only be accessing that organization at the query level we're still we've still got one level of like security there so i i can see why it's attractive but that's that's the only solution I saw to avoid doing it on every... to avoid that check. It's a logical problem. If we must definitely check and we do it after already, after planer already did everything, and basically after like the last stage of execution, right?
Starting point is 00:31:02 It's going to be super expensive if you have a lot of rows before producing the final result. But I feel kind of like Postgres could say, okay, we have it in the work laws, we'll already apply this rule, let's skip it. This could be some optimization, but it needs to be done inside Postgres somehow.
Starting point is 00:31:24 I'm not sure it if it was just like all planner things it would work even if they even if it was implemented for simple conditions it wouldn't work for some more complex like it would there would be like a never-ending list quicker because these policies are so flexible that's one of their strengths right you can set pretty much anything as a policy you You can do it at the operation level. You can do it at a role level. You do so many different levels. And then you can pretty much, I think you can use any SQL query to set up exactly the policy you want.
Starting point is 00:31:55 Yeah. Then you've got a whole new planning issue. Like the plan is already pretty complicated. So I get why they haven't. I agree. It like, for example, you know, like in the check constraints, you cannot refer to other tables. Right? It's, they are simple. But here in these policies, like rules, we can define, do whatever we want. And this means it's super, like,
Starting point is 00:32:23 what I just proposed, probably it's impossible to implement, right? If it was just this table, no reference to other table, maybe current setting can be used. And that's it. Maybe there would be a way to connect to the planner and see, okay, it's already applied this rule,
Starting point is 00:32:40 we can skip it. Skip it in a safe manner, 100% safe, reliable manner. but if we can refer to any table write anything there it's impossible right or or much more so many like predicates like yeah so yeah i agree with you here there's for people that want like i will share i'll share a link to that gist there's also a really good talk by Paul Copperstone, CEO of Superbase, from the last Pazette conference that I can share as well. Does a beginner's guide to RLS, but also covers some quite in-depth, manages to in about 20 minutes, cover some quite in-depth performance tips. I think most of which we've covered now, but it's visual and you can see him going through them which is nice yeah well i'm quite sure
Starting point is 00:33:33 it's like like super bass and other systems which work with past grass or surah or something these uh guys like are at frontier of rls usage right and for sure a lot of experiences in the discussions there. Yeah, which is great. But I just see like still like I feel like we already had bad aggregates. You made them worse starting using RLS. Yeah. By the way, I had a question for you. Do you know how many times we've had this topic requested for the podcast i have no idea four times four different people have
Starting point is 00:34:13 requested something around that's by far the most we've ever had for a single topic okay in this case we should have a lot of feedback. I guess. Well, I hope yeah. There are also a couple of people asking about, like, when asked for more information, like what exactly they want us to talk about, they asked about considerations around perform, like how does how to then diagnose performance issues. But I don't think there's anything unique about it. Right? Like, mostly, it's still well, tell me it's tricky. Yeah, my like, general rule is just to compare with and
Starting point is 00:34:50 without RLS. For example, you can use a superuser, since policies are bypassed. Yeah, yeah. Because it can be tricky. Of course, you'd see the filter and so on. But I feel like in the plans, or maybe I'm missing something, but I feel like in the plans we can only start guessing how much time we spend on the RLS check. Well, I think personally, if you've already gone with RLS, if you've already decided that and you're well into your project, I think the RLS checks don't matter that much. It's more indexing. I still see more people with indexing issues than RLS performance issues.
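That with-and-without comparison can be sketched like this. Superusers and roles with the BYPASSRLS attribute skip policy checks, so the same query can be timed both ways; the role and table names here are hypothetical:

```sql
-- Hypothetical benchmarking role; assumes a table named docs with RLS.
create role rls_bypass_check login bypassrls;
grant select on docs to rls_bypass_check;

-- As the normal application role (policies applied):
--   explain (analyze, buffers) select count(*) from docs;
-- As rls_bypass_check (policies skipped):
--   explain (analyze, buffers) select count(*) from docs;
-- The difference between the two timings approximates the RLS overhead.
```

Note that table owners also bypass their own policies unless FORCE ROW LEVEL SECURITY is set on the table, so the comparison must be run as the application role, not the owner.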
Starting point is 00:35:31 Right. And they look like RLS issues, because people might think that, yeah. I agree, I agree. But I'm talking about this massive count, for example: you have a count on them, and how much of that time, how big is the overhead from RLS? It's quite tricky without comparing the two cases. I wish EXPLAIN ANALYZE showed this. By the way, we just had BUFFERS by default committed, right? Which we forgot to celebrate a little bit.
Starting point is 00:36:09 Maybe it should be a separate episode about this. I don't know. I think I want to wait until Postgres 18 to celebrate that, or at least until further into the future and the release candidates. I cannot believe this. It's super great, right? Yeah, a couple of people to thank on that front. I think David Rowley did a really good job, and also Guillaume Lelarge did a lot of the heavy lifting as well, so those
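For context, this is the behavior change being celebrated; the query itself is just a placeholder:

```sql
-- Postgres 17 and earlier: buffer statistics must be requested explicitly.
explain (analyze, buffers) select count(*) from docs;

-- Postgres 18: plain EXPLAIN ANALYZE includes the
-- "Buffers: shared hit=... read=..." lines by default.
explain analyze select count(*) from docs;
```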
Starting point is 00:36:32 two in particular. There was a really good conversation on the mailing list, which I think you kicked off, actually. I didn't expect that this time it would actually go through. I was just discussing things near this topic. But I'm happy to see it, and I guess I will need to find another topic to be upset about, because for more than two years I was talking about how I'm
Starting point is 00:36:57 upset that buffers are not default, and it's going to be over. I feel happy and sad at the same time. Another thing was UUID version 7, finally committed. And you know what,
Starting point is 00:37:13 my perception was that we were waiting until full finalization of the RFC. I thought so too. Because, is it not finalized? It was developed before Postgres 17 was released, and then the decision was made not to take it into 17 because the RFC was not fully finalized. At the same time, everyone already started releasing UUIDv7 support, like Node.js,
Starting point is 00:37:40 Go. Like, everyone already decided to release it, but Postgres decided to stay conservative. The thing is that, unless I'm fully mistaken, I see the RFC is not fully finalized yet. Amazing. So maybe it will be before the 18 release. Andrey told me not to shout about this. Probably sensible. Andrey and I have similar philosophies then: wait until it's actually released before celebrating. I bet we can talk about these things here, because hackers don't listen to us, right? I think a couple do. They won't go and say let's revert it, I hope. Anyway, congrats to Andrey, because he is the one who coded it, during our Postgres TV live coding
Starting point is 00:38:29 hacking sessions with Kirk and Andrey. So, yeah, this week, to be honest, surprised me. Both things I'm very passionate about: buffers by default in EXPLAIN ANALYZE, and UUID version 7. I think, again, people who listen to us: don't wait until the release of those functions. UUIDv7 you can start using right now, because basically it's syntactic sugar,
Starting point is 00:38:57 not fully, but you can start using the UUID data type and insert UUID version 7 values generated on the application side, or using some SQL function, a PL/pgSQL function. I have examples in my how-tos. And when this feature becomes native, switch to native. But it's hard to overestimate the importance of this support, because I can convince a couple of guys to start using it right now, and some guys can figure it out themselves, but 90% will follow the default path, right? Same with buffers. Exactly. People can start using it now, there's no reason they can't use it now; it's just that more will once we finally analyze plans with buffers always. Let's wrap it up.
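As one example of such a function (a widely shared community sketch, not Nikolay's how-to and not the native uuidv7() committed for Postgres 18): it overwrites the first 48 bits of a random v4 UUID with a millisecond Unix timestamp and flips the version nibble to 7:

```sql
-- Community sketch of an interim UUIDv7 generator. It takes a random
-- v4 UUID, replaces its first 6 bytes with the current Unix timestamp
-- in milliseconds, and changes the version nibble from 0100 (v4) to 0111 (v7).
create function uuid_generate_v7() returns uuid
language sql volatile
as $$
  select encode(
    set_bit(
      set_bit(
        overlay(uuid_send(gen_random_uuid())
                placing substring(
                  int8send((extract(epoch from clock_timestamp()) * 1000)::bigint)
                  from 3)
                from 1 for 6),
        52, 1),  -- bits 52 and 53 turn the v4 version nibble
      53, 1),    -- 0100 into 0111, i.e. version 7
    'hex')::uuid;
$$;
```

Values generated this way sort roughly by creation time, which is the point; once on Postgres 18, the application can simply switch to the native function.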
Starting point is 00:39:46 When you see overhead from RLS, it's not about buffers, usually, unless it's a lack of an index. But if it's like current_setting, we talked about current_setting being called on each row. Buffers are tiny, but timing is awful. And yeah, this is one of the rare cases where buffers... well, buffers don't lie, right? They don't lie: they show there is no I/O here, and that's already useful information. But you need both, right? And yeah, you know, sometimes I say most of the time we optimize targeting buffers. In this case, we don't, because it's just not about I/O at all.
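What that looks like in practice, with the same hypothetical names as earlier (the two policies below are alternative forms of one policy, not meant to coexist): when current_setting appears directly in the policy, it is executed for every candidate row; wrapped in a scalar subquery, the planner evaluates it once as an InitPlan and compares the result as a constant.

```sql
-- Form 1: current_setting() is called once per row; buffers stay tiny,
-- but CPU time grows with the number of rows scanned.
create policy tenant_per_row on docs
  using (tenant_id = current_setting('app.tenant_id')::int);

-- Form 2: the scalar subquery is evaluated once per query (InitPlan),
-- so each row only does a cheap integer comparison.
create policy tenant_initplan on docs
  using (tenant_id = (select current_setting('app.tenant_id')::int));
```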
Starting point is 00:40:32 But seeing that I/O is low already gives us an idea: maybe it's RLS, right? Nice one, Nikolay. Well, thanks again, and catch you soon. See you later. Bye-bye. Thank you. Bye.
