Postgres FM - Gapless sequences

Episode Date: October 31, 2025

Nik and Michael discuss the concept of gapless sequences — when you might want one, why sequences in Postgres can have gaps, and an idea or two if you do want them.

And one quick clarification: changing the CACHE option in CREATE SEQUENCE can lead to even more gaps — the docs mention it explicitly.

Here are some links to things they mentioned:

CREATE SEQUENCE: https://www.postgresql.org/docs/current/sql-createsequence.html
Sequence Manipulation Functions: https://www.postgresql.org/docs/current/functions-sequence.html
One, Two, Skip a Few (post by Pete Hamilton from incident.io): https://incident.io/blog/one-two-skip-a-few
Postgres sequences can commit out-of-order (blog post by Anthony Accomazzo / Sequin): https://blog.sequinstream.com/postgres-sequences-can-commit-out-of-order
Logical Replication of sequences (hackers thread): https://www.postgresql.org/message-id/flat/CAA4eK1LC%2BKJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ%40mail.gmail.com
Synchronization of sequences to subscriber (patch entry in commitfest): https://commitfest.postgresql.org/patch/5111/
Get or Create (episode with Haki Benita): https://postgres.fm/episodes/get-or-create
German tank problem: https://en.wikipedia.org/wiki/German_tank_problem

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork

Transcript
Starting point is 00:00:00 Hello, hello, this is Postgres FM. I'm Nik, Postgres.AI. And as usual, with me here is Michael, pgMustard. Hi, Michael. Hello, Nik. How's it going? Very good. How are you?
Starting point is 00:00:13 I am good, thank you. We didn't record last week because I was on a trip in an Oregon forest, having some fun, disconnected from the internet mostly. Yeah, so now I've returned, and you said we should discuss sequences somehow, right? Yeah, so I was looking back through our listener suggestions. So we've got a Google Doc where we encourage people to comment and add ideas for us to discuss topics. And whenever I'm short of ideas, I love checking back through that. And one of them from quite a long time ago actually caught my eye.
Starting point is 00:00:51 And it was the concept of gapless sequences. And I guess this might be a couple of different things, but I found it interesting both from, like, a theoretical point of view but also in terms of practical solutions, as well as being one of those things that — so, a sequence with gaps is one of those things that catches most engineers' eyes. Like, if you start to run production Postgres,
Starting point is 00:01:18 you will see occasionally, like, an incrementing ID and then a gap in it, and you think: what happened there? So it's one of those things that I think most of us have come across at some point and been intrigued by. So yeah, there are a few interesting causes of that. I thought it would be good to discuss. The name sequence should mean it's sequential,
Starting point is 00:01:38 like, why the gap, right? Unexpected. And anyway, this episode — is it number 163, or, because I missed last week, will it be 164? Do you know what? It would be quite
Starting point is 00:01:54 funny to — should we increment the episode as normal? Because I was... yeah. Either we should do 164, missing one, and then do 163 next week, as like a joke, because it's like coming in too late — or we just carry on, the world is broken, this is another one. Normally you can observe this sometimes, because at commit time you can see, like, if we have many users, they can use next numbers from the sequence all the time, and the order is
Starting point is 00:02:29 different, of course, right? Yeah, I forget the name for it, but whatever that phenomenon is — whatever it is, we should discuss that next week and have the ID go back down. It's not serializable, right?
Starting point is 00:02:39 So it's, yeah, it's not serialized. So if you have two transactions — and, for example, you have a support system, a ticket tracking system, and you generate ticket numbers. You think they're sequential.
Starting point is 00:02:54 One user came, opened a ticket, but hasn't committed yet. And now another user came, opened a ticket, and that ticket has a bigger number, the next one, right? And then it committed already, and then you committed this. You see one ticket created before that one, right? But at the same time, if you generate a timestamp automatically — like a created_at column with default clock_timestamp or something — and that insert happened at the same time as the sequence nextval call,
Starting point is 00:03:27 in that case created_at values will have the same order as sequence values, like ID column values, right? So there will not be a problem when you order by those tickets. But an anomaly can be observed: oh, there is a ticket number, like, 10, and then number nine becomes visible later, because we don't see uncommitted writes, right? So it should be committed first before it's visible to other transactions, other sessions. Yeah, yeah. But this is a different anomaly — you're, like, gapless, gaps... This anomaly is very well known, because, for the sake of performance, sequences — this mechanism has existed in Postgres for ages — just give you the next number all the time, and of course, if you, for example, decided to roll back your transaction, you lost that
Starting point is 00:04:14 value. So yes, this is the number one thing. Yeah, exactly — so it's to allow for concurrent writes, isn't it? So if you've got — imagine it, like, within a microsecond, two of us trying to insert into the same table. If I am just before you and I get assigned the next value in a sequence,
Starting point is 00:04:38 and then my transaction fails and is rolled back, you've already been assigned the next value after me. So yeah, I think that's super interesting. So I think that's probably the most common. In fact, possibly not, but that's the one I always see as the example given
Starting point is 00:04:56 as to why. Yeah, so not to think about going back to previous values. Like, this is your value, and it's, like, fire and forget — this value is wasted, and the sequence has shifted to a new value. Although you can reset it using... There is nextval, and there is... setval, I think. There is currval, and currval requires a nextval call first in the session before you can use it — and then setval.
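To make the functions just mentioned concrete, here is a minimal sketch — the sequence name is made up for illustration, and the behaviour shown (a rollback not undoing nextval) is exactly the gap mechanism being discussed:

```sql
-- throwaway sequence, name is illustrative
CREATE SEQUENCE ticket_number_seq;

SELECT nextval('ticket_number_seq');   -- 1
SELECT currval('ticket_number_seq');   -- 1 (only valid after nextval in this session)

BEGIN;
SELECT nextval('ticket_number_seq');   -- 2
ROLLBACK;                              -- the sequence advance is NOT rolled back

SELECT nextval('ticket_number_seq');   -- 3: value 2 has become a gap

-- setval() moves the sequence for everyone, forward or back
SELECT setval('ticket_number_seq', 100);
SELECT nextval('ticket_number_seq');   -- 101
```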
Starting point is 00:05:23 Yeah, so you can shift a sequence back if you want, but it's global for everyone, right? And also interesting: a sequence is considered a relation as well, right? Yeah, we discussed this recently, didn't we? Yeah, yeah, in pg_class you see relkind equals — you said capital S, right? Capital S. And by the way, I was wrong — it's not the only one with a capital, there is one other.
Starting point is 00:05:48 Do you know the other one? No. Partitioned indexes. Okay. Capital I. Okay. So we've got one cause already you've mentioned, which is transactions rolling back. I want to go through a bunch of other causes.
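As a quick reference for the relkind letters mentioned here — the query and the listed values follow the pg_class documentation, nothing here is specific to the episode:

```sql
-- how many relations of each kind exist in the current database
SELECT relkind, count(*)
FROM pg_class
GROUP BY relkind
ORDER BY relkind;

-- documented relkind values include:
--   r = table, i = index, S = sequence, v = view, m = materialized view,
--   c = composite type, t = TOAST table, f = foreign table,
--   p = partitioned table, I = partitioned index
```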
Starting point is 00:06:03 But before that, should we talk about, like, why would you even want a gap-less sequence? Like, we've got sequences and sequences with the odd gap in are fine for almost all use cases. Should we talk a little bit about why even bother? Like, why even discuss this? Why is it a problem? Well, my expectations, I guess, right? You might have expectations. So I think I've only got a couple here.
Starting point is 00:06:30 I'm interested whether other people have seen others. But one I've got is user-visible IDs that you want to mean something. There was a really good blog post on this topic by some folks at incident.io. It's actually old friends of mine from GoCardless days. And they wanted incident IDs to increment by one for their customers. So they could refer to an incident ID. And if they've had three that year, it's up to three. And then the fourth one gets assigned incident four. And it's not ideal, if they want them to mean something, for them to miss the odd one — and much worse to miss, like,
Starting point is 00:07:15 10 or 20 in a row. So they obviously have many customers. So it's a multi-tenant system, right? Yeah. And do they create a sequence for each customer? Well, they did initially. Okay. Yeah, I'm asking because I saw this in other systems, and I remember the approach: we have a sequence just to support primary keys — unless we use UUID version 7, yeah, well, with some drawbacks, but overall it's winning in my opinion these days. But for each customer — like, in the namespace of one client ID, or organization ID, or project ID — we might want to have an internal ID, an internal ID which is local, right? And then we shouldn't use sequences, it's like overuse of them, because even if each customer has, like, thousands
Starting point is 00:08:08 or millions of rows, we can handle it, and the collisions would happen only locally for this organization, project, or customer, right? Which is great. Yeah. Right, so yeah — and for sequences, the only thing we care about is uniqueness, in my opinion.
Starting point is 00:08:30 Yeah, you're right, uniqueness — but that's the job of the primary key, right? It's also the fact they only go up, I think. Yeah, yeah, well —
Starting point is 00:08:38 unless somebody intervenes, right, with setval. Setval, exactly. So — and the capacity,
Starting point is 00:08:49 like, just forget about it, because it's integer eight, always, for any sequence. So I noticed some blog posts you shared with me — not this one, different ones — they used integer four primary keys. I very much welcome this move, because these are our future clients. Yeah, so, very good move, everyone: please use the integer four primary key, and later, if you're successful and have more money, you will pay us to fix that. Yeah, I like you flipping the advice. So wait, but you said something interesting then — so you said sequences are
Starting point is 00:09:24 always integer eight. So even if I have an integer four primary key, the sequence behind it is... Interesting. A sequence is an independent object — well, independent relatively, because there is a dependency, which is also a weird thing, like owned... OWNED BY, right? It belongs to — it might belong, it might be dependent — but it also might belong to a column of a table, with ALTER TABLE or ALTER SEQUENCE ... OWNED BY some column, right? But overall, it's just a special mechanism,
Starting point is 00:09:55 integer eight always, and it just gives you those next numbers — next number, next number — that's it, simple. Yeah. So yeah, by the way, I wasn't talking about — so, incident.io did use sequences initially, and it turned out to be a bad idea, but all I meant was that that's a use case
Starting point is 00:10:11 for not just monotonically increasing IDs, but IDs that increase by exactly one each time. So that's one use case for, like, the concept of gapless sequences. And another one came up in the blog post by Sequin that I shared beforehand, and I'll link it up in the show notes again. And that was the concept of cursor-based pagination. So the idea that — well, I think it's very similar to keyset pagination, but based on an integer only. So I guess for those it's most important that it only increases monotonically, but also that concept of committing out of order becomes important. So if we read rows that are being inserted right now, there might be one that commits having started earlier than a second one that hasn't yet committed.
Starting point is 00:11:13 So we could see — the example they give is we could see IDs one, two, and four, and later three commits, but we only saw one, two, and four at the time of our read. So if we were paginating and got the first set, and it went up to four, and then we only looked for ones above four, we've missed three. So that's an interesting definition of a sequence where you don't want there to be gaps, maybe at any point. You know what? I'm looking at the documentation right now, and I think it would be great if this thing were called not sequence, but something like generator — a number generator or something — because sequence, it feels like it should be sequential and gapless. Like, it's just some feeling, you know.
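Coming back to the cursor-based pagination scenario just described, here is a rough sketch of how the missed row happens — the table and column names are hypothetical, and this only illustrates the ordering problem discussed above, not any code from the Sequin post:

```sql
-- a reader pages through events by id (keyset-style pagination)
SELECT id, payload
FROM events
WHERE id > 0            -- cursor so far
ORDER BY id
LIMIT 100;
-- suppose this returns ids 1, 2 and 4: the transaction that was assigned
-- id 3 started earlier but has not committed yet, so it is invisible

-- the cursor advances past 4...
SELECT id, payload
FROM events
WHERE id > 4
ORDER BY id
LIMIT 100;
-- ...so when the row with id 3 finally commits, this reader never sees it
```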
Starting point is 00:11:56 This just gives false expectations to, I think, some people — not to everyone. Of course, the documentation says: CREATE SEQUENCE — define a new sequence generator, right? So generator is a better word for this, right? And I think the documentation could be more explicit in terms of the gaps to expect. Yeah, yeah. So yeah, because in my opinion, in my practice, it happened more than once that people expected them to be gapless somehow. I don't know — like, a lot of new people are coming into Postgres, and all of us were new ones once, right? As well, like, I definitely experienced this. I remember — I think, for us, moving on to kind of a second cause of this, I think the reason we were getting them was
Starting point is 00:12:46 using INSERT ON CONFLICT. So it was something around having new users that had been added by somebody else in the team, for example. So the user had already been created behind the scenes because somebody invited them, and then, when they signed up, we were doing an insert on conflict update, or something like that. And so, as part of that, nextval was called just in case we needed to insert a new row, but we ended up not needing to, because it was an update instead. So I think you can also get these gaps through INSERT ON CONFLICT.
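A minimal sketch of the case Michael describes — the column default (which calls nextval) is evaluated for the proposed row even when the conflict path turns the statement into an update, so a value gets burned. The table and column names are hypothetical:

```sql
CREATE TABLE users (
    id    bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    email text UNIQUE NOT NULL,
    name  text
);

-- the invited user is created behind the scenes
INSERT INTO users (email, name) VALUES ('a@example.com', 'Invited');      -- id 1

-- the same person signs up later: the identity default is computed first,
-- then the conflict is detected and the row is updated instead
INSERT INTO users (email, name)
VALUES ('a@example.com', 'Signed up')
ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;                   -- id 2 is consumed

INSERT INTO users (email, name) VALUES ('b@example.com', 'Someone new');  -- id 3: gap at 2
```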
Starting point is 00:13:20 Yeah, and actually the documentation mentions it. Oh, cool. It mentions it — like, I think it still could be mentioned more explicitly, maybe in the beginning, and so on. And the thing is, like, someone might consider sequences as not ACID, right? Because if a rollback happens, they don't roll back, for the sake of performance,
Starting point is 00:13:45 obviously. So it violates atomicity — does it, yes or no? Yeah, so if other things, other writes, are reverted, this change — that we
Starting point is 00:14:00 advanced the sequence by one, right, we shifted its position — is not rolled back. So yeah, our operation is only partially reverted,
Starting point is 00:14:12 if we strictly look at it. For the sake of performance, it's pretty clear. But yeah, so, like, kind of not full ACID — and that's okay, you just need to understand it, and that's it. Yeah.
Starting point is 00:14:27 For me it's natural, but I can understand the feelings of people who come to Postgres now, and, just from the name, they expected it — but then, boom. It's a simple thing to learn, yeah. Another case where naming things is hard.
Starting point is 00:14:43 And yeah, so yeah, for me it's a generator of numbers: huge capacity, 8 bytes, and it gives me a tool to guarantee some uniqueness when we generate numbers. That's it. Very performant — I never think about performance, because rollback is not supported, that's it, let's go. And yeah, but let's talk about, again, like — if we really need it... I would think: do we really need it, or can we be okay without it? If we really need it, I think we should go with, like, specific allocation
Starting point is 00:15:21 of numbers — maybe additional ones, not primary keys, right? Right, yeah. Well, personally, I think this is a rare enough need that it's not needed by every project, I don't think, right? I've run plenty of projects
Starting point is 00:15:36 that have not needed this feature so I personally think there's not a necessity to build it in to Postgres core as a feature as like a, you know, a sequence type or something. But I do think it's interesting enough, like it seems to come up from time to time. And I think there are neat enough solutions, at least at lower scales. I'm sure there is a solution at high scale as well, but there are simple enough solutions at lower volumes that I think there's no necessity, I don't think, for a pre-built solution that everyone can use. High performance solution, it's impossible because if there is transaction which wants to write number 10, for example, but it hasn't committed yet, and we want to write next number or also number 10, it depends on the status of that first transaction.
Starting point is 00:16:33 We need to wait for it, right? It creates a natural bottleneck. Yeah. And, like, I cannot see how it can be done differently — we need to wait until that transaction... we need to serialize these writes. And again, like, for me the only trick in terms of performance here is to use the fact that, if we have a multi-tenant system, we can make these collisions very local to each project or organization or tenant, right? So they stay only within this organization,
Starting point is 00:17:05 and the other organizations are, like, separate in terms of these collisions. And ultimately, then, it's about parallelizing writes, which I think is then sharding. Yeah. So if you've got the multi-tenant system across multiple shards, you can then scale your write throughput. So it feels to me like another case of that probably being the ultimate solution. Well, if you involve sharding and distributed systems, oh, it's like... I don't mean across shards — I mean locally, locally, yeah, yeah, yeah.
Starting point is 00:17:37 Exactly. If you've got a tenant that's local and you can... Because if you want a pure sequential, gapless number generator for distributed systems, it's a whole new problem to solve. You basically need to build a service for it, and so on. But again, if you make... So you should think about it: okay, we will have thousands of new rows inserted per second, for example, soon. What will happen? And if the collision will happen only within the boundaries of one tenant or project or organization — doesn't matter — it's not that bad, right? They can afford inserting those rows sequentially, one by one.
Starting point is 00:18:19 And maybe within one transaction, or some transactions will wait — but maybe just one. So maybe this will affect our parallelization logic — so, saying, let's not deal with multiple tenants and multiple backends and transactions, let's do it in one transaction always. But if we, like, write thousands of rows per second and they belong to different organizations, collisions won't happen, right, because they don't compete. So this dictates how we could build this high-performance, gapless sequence solution. We just should avoid collisions between tenants, for example. Yeah.
Starting point is 00:18:58 Yeah. But we've jumped straight to the hardest part. Should we talk about a couple more of the kind of times that you might? Surprises, yeah. So rollback is one thing which can waste your precious numbers, right?
Starting point is 00:19:10 Another thing is — I learned about it and I forgot, and re-learned when you sent me these blog posts — there is a hard-coded constant, 32,
Starting point is 00:19:22 pre-allocate... Actually, I think there is a constant, and I think there is some setting — maybe I'm wrong, but there should be some setting.
Starting point is 00:19:29 Yeah, so you can say: I want to pre-allocate more. Oh, I didn't come across that. So we've got SEQ_LOG_VALS — that's the hard-coded one, right? Yeah, maybe I'm wrong, actually.
Starting point is 00:19:44 So there are pre-allocated values. And can we control it? No, we cannot control it, right? 32. Ah, there is CACHE. Right? What is CACHE? When you create a sequence,
Starting point is 00:19:58 you can specify the CACHE parameter as well. Okay, so what does that control? Yeah, so this controls exactly this. If you don't do it, it will be 32. Oh, okay. So it's definable on a per-sequence basis. Per sequence. You can say I want 1,000.
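For reference, a small sketch of the CACHE option being discussed — per the CREATE SEQUENCE docs its default is 1, and it is a per-session cache of pre-allocated values (separate from the hard-coded WAL pre-log the hosts mention below), so larger values can mean more gaps, not fewer:

```sql
-- default behaviour: every nextval() touches the sequence itself
CREATE SEQUENCE stats_id_seq;             -- CACHE 1

-- each session grabs a block of 1000 values up front
CREATE SEQUENCE bulk_id_seq CACHE 1000;

-- session A: nextval -> 1     (2..1000 cached only in session A)
-- session B: nextval -> 1001  (1002..2000 cached only in session B)
-- if session A disconnects without using its cache, 2..1000 become gaps
SELECT nextval('bulk_id_seq');
```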
Starting point is 00:20:16 Pre-allocate. What if we set it to 1? Well, I think only 1 will be pre-allocated, right? One is the minimum, actually. One is the minimum. Yeah, yeah. Actually, it's also interesting — maybe I'm wrong, because there is also... yeah, so I'm confused: there is, uh — so the documentation about this parameter
Starting point is 00:20:37 says one is the default, but we know there is also the 32 hard-coded constant. Yeah. Anyway, like, I know this hard-coded constant can be associated with a gap of 32 — so when, for example, a failure happens, or you just fail over, switch over to a new primary, which should be, like, a normal thing, right? You change something on your replica, switch over to it — this is when you might have a gap, which is described in one of those articles: 32. So I'm not sure about this CACHE parameter, right? So maybe, if you change it, it's only a cache of pre-allocated values and that's it — maybe specifying it won't lead to bigger or smaller gaps, I'm not sure about that. So maybe there are two layers of
Starting point is 00:21:25 implementation here. But based on the articles we know: okay, there are gaps of 32, and this is, like, just common, right? And, interesting, this is connected to recent discussions we had with one of our big customers, who have a lot of databases, and we discussed major upgrades. And, you know, we have a zero-downtime, zero-data-loss, reversible upgrades solution, which multiple companies use, and one of the most fragile parts is when we switch over — doing the switchover onto the logical replica. We do it basically without downtime, thanks to pause/resume in PgBouncer — PgDog also supports it. So we pause and resume, and between pause and resume, when a small latency spike in transaction processing happens, we redirect PgBouncer to the new server. And that server, by default, has sequence values corresponding to initialization,
Starting point is 00:22:35 because logical replication in Postgres doesn't support it — still, there is work in progress. It doesn't replicate values of sequences. The question is how to deal with it. There are options.
Starting point is 00:22:49 First, you can synchronize sequence values during this switchover but it will increase this spike. We don't want it because we achieved a few second spike. That's it. It's really, it feels really pure zero downtime. And if we start synchronizing sequences, it
Starting point is 00:23:05 will increase it. Especially some customers had like 200,000 tables. It's insane. But okay, if it's only 1,000 tables, I thought well, I don't want it. Actually, customers said, one of engineers on customer side said, you know what?
Starting point is 00:23:22 Like, this setval is not too long. If we quickly read it, quickly adjust it — maybe, okay, another second. And Postgres — yeah, exactly — changing the position of sequences is super fast, actually. Yes, if you have hundreds of thousands of tables and sequences, it will be quite slow. But if it's only a few, you can do it quite quickly.
Starting point is 00:23:44 Also, you can parallelize it, maybe, but it will make things more complicated. But another solution, which I like much more: we just advance beforehand, before switchover, with some significant gap — like, I say, check how many you spend during, like, a day or two: millions? Ten million at once. We have enough capacity for our life — eight bytes, it's definitely enough. So, yeah, just bump it by, like, 10 million. And then it works: with, you know, your system, like 1,000, 2,000 tables, it's just one system. And, you know, these big gaps are fine.
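One possible shape of that "jump forward" step, as a hedged sketch — run on the logical replica before switchover, assuming its schema still carries roughly the sequence positions captured at initialization; the sequence name and the 10-million margin are illustrative, not a prescribed implementation:

```sql
-- push the sequence far past anything the old primary could have handed out;
-- the wasted range is harmless with bigint, but it is a visible gap
SELECT setval('orders_id_seq',
              (SELECT last_value FROM orders_id_seq) + 10000000);

-- the exact alternative (bigger pause, no gap) would instead read last_value
-- for every sequence on the old primary and setval() each one here while
-- traffic is paused
```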
Starting point is 00:24:29 But when you think about very, very different projects, thousands of clusters, you think, oh, maybe some of them won't be happy with big gaps. You know? And this is a hard problem to solve, yeah. And if you go back in the other direction, let's say you want to be able to fail back quickly, that's another gap. So each time you bounce back forward. Yeah, yeah.
Starting point is 00:24:51 Yeah. Yeah, since our process is fully reversible, it's really blue-green deployments. Every time you switch, you need to jump. And we recommend jumping in big, like we have big gaps. And we say, you should be fine with it. But I can imagine. Why not smaller gaps? Why not, like, let's say it's a two-second pause.
Starting point is 00:25:15 Yeah, if you know there won't be spikes of writes right before you switch over. Well, we can do that, but it's just — there are, like, increased risks: if you got it wrong, after switchover some inserts won't work, because
Starting point is 00:25:34 this sequence value is already used, right? But — duplicate key, or... Yeah, yeah. So what would the actual errors be? Duplicate key violations. Yeah, so your sequence... but yeah, it will heal itself,
Starting point is 00:25:48 right, thanks to the nature of sequences, which, like, waste numbers: insert, get a duplicate key violation, retry — oh, it works. Yeah, it's funny. Yeah, anyway — like, I always thought, I always prefer to be on the safe side and jump, like, do big jumps. But when you think about many, many clusters and the needs of many people — oh, it's a different kind of problem to have. And so I'm just highlighting: gaps are fine, but what about big gaps? Yeah, you know, sometimes they can look not good. Right, yeah, in this case, yeah. We're still thinking maybe we should just implement two paths, you know, and, uh, by default
Starting point is 00:26:36 we do the big jump, but if somebody is not okay with that, maybe they would prefer a bigger spike, or a bigger, like, maintenance window — like, okay, well, up to 30 seconds or so — yeah, yeah — while we are synchronizing their sequences and don't allow any gaps. For me, naturally, knowing how sequences work for years, like, gaps should be normal, right? Yeah. It's so interesting, isn't it, like, the trade-offs that different people want to make. No solution to this... yeah, pardon me — you know the good solution to this: finally start supporting sequences in logical replication. That's... yeah, that would be... Well, yeah, and that might not be too far away. So yeah, I think so, I think so. This work in progress has lasted quite some years — it's called logical replication of sequences, or synchronization of sequences to subscriber — and it has already been through multiple iterations, since 2004 I think, and it has chances to be in Postgres 19, but it requires reviews. It's a great, great point for you to take your Claude Code or Cursor and ask it to compile and test and so on, and then think about edge cases, corner cases.
Starting point is 00:27:54 And even if you don't know C, this is a great point to provide some review. You just should be an engineer, like, writing some code. You will understand the discussion, the comments — it's not that difficult. So I encourage our listeners to participate in reviews. Maybe within a year — but there will still be value if you consider yourself an engineer. You will, like, feel out over time which value you can bring. The biggest value in testing is to think about various edge cases and corner cases as
Starting point is 00:28:25 a user, as a Postgres user, right? And try to test them, and, yeah, AI will help you. Yeah. Well, I also think we do have several experienced Postgres, like, C developers listening. And I think it's always a bit of a challenge to know exactly which changes are going to be the most user-beneficial, because you don't always get a representative sample on the mailing lists. I think sometimes, like, a lot of the people asking questions are very, like, at the beginning of their journey — they haven't yet worked out how to look at the source code to solve problems. So you don't get some of the kind of slightly more advanced problems always reported, because
Starting point is 00:29:12 people can work around them. And I think this is one of those ones that people have just been working around for many years. A lot of consultancies deal with this in different ways. But it is affecting every major version. It is friction. So if any hackers, any experienced hackers, are also wondering, like, which changes could I review that would have the biggest user impact — this feels like one. This feels like something so many people want. Logical replication is used more and more, like for blue-green deployments and so on. And, like, for me, in the past, if I looked at this — let's include, by the way, the commitfest entry, so people could look at it
Starting point is 00:29:46 and think if they can review and help with testing. So, in the past, I would think: okay, to test it, what do I need, first of all? This is about logical replication and its behavior. I need to set up two clusters with logical replication. Oh yeah, okay, I have better things to do, actually, right? Now you can just launch Claude Code or Cursor and say: I have Docker installed locally on my laptop, or something — please launch two Postgres containers, different versions maybe,
Starting point is 00:30:16 create logical replication, and let's start testing. And then — not even containers: if containers work, now you can say, okay, now I want some of them built locally from source code, and then it's the same thing. And you don't need to set up logical replication yourself, that's it. So yeah, these roadblocks can be eliminated. And then you focus only on use cases where this thing can be broken, and this is where you can start contributing — you just need to be a good Postgres user, that's it. Yeah, nice. Good. Just be able to distinguish a logical replica from a physical replica — that's, like, the only thing you need to know to start. Yeah, good. Okay, so are there any other cases where we can experience gaps?
Starting point is 00:31:11 Well, I actually thought — I only wanted to talk about two more things, for sure. One is: why 32? Why do we pre-allocate these? I think that's interesting. And two: what can you actually do about it, like, if you... I thought the incident.io one, like, especially at lower volumes, there are some neat solutions. They were the only last two things I had on my list. Well, for performance, we pre-allocate, right?
Starting point is 00:31:35 Because technically it's a page — it's, like, also a relation which stores a value and so on, right? Well, I got the impression from a comment in the source code that it was... I think — well, so let me read it exactly: we don't want to log each fetching of a value from a sequence, so we pre-log a few fetches in advance; in the event of a crash, we can lose, in brackets, skip over, as many values as we pre-logged. So I got the impression it was to avoid spamming the WAL.
Starting point is 00:32:01 Yeah. It's an optimization technique, that's it. So I could imagine a case where you'd want to pay that trade-off the other way around. And it's good to know, as you mentioned, that you can reduce it on a per-sequence basis. I think it's different. I think what you can reduce is the cache, but it's not the thing that goes to WAL. I'm not 100% sure here — I just think you still lose up to 32. But these are two different things:
Starting point is 00:32:29 one is a hard-coded constant value, another is dynamic, controlled by the user. But maybe I'm wrong again here — it's a good question to check, but it's a nuance. Yeah, for me a sequence is always... having gaps, that's it. Yeah, yeah, it's okay. Um, the last thing was solutions. And yeah, I thought the incident.io one was really neat, but also quite — oh, it's very, very simple. I like simple solutions that work for now, and we can solve later problems later. And it was just to do, like, a subquery and read the current max value and increment it by one. So not using sequences, of course — yeah, no sequences, it's just reading. It reminds us of the episode we had with Haki Benita, right?
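A hedged sketch of that subquery approach, scoped per tenant as Nik suggests a bit further down — the schema is hypothetical; under concurrency two sessions can compute the same number, and then the primary key rejects one of them and the application retries, so contention stays local to one organisation:

```sql
CREATE TABLE incidents (
    organisation_id bigint NOT NULL,
    external_id     bigint NOT NULL,   -- the user-visible, gapless number
    details         text,
    PRIMARY KEY (organisation_id, external_id)
);

-- read the current max for this organisation and add one, in a single statement;
-- on a duplicate key error the application simply retries
INSERT INTO incidents (organisation_id, external_id, details)
SELECT 42, COALESCE(MAX(external_id), 0) + 1, 'something is on fire'
FROM incidents
WHERE organisation_id = 42
RETURNING external_id;
```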
Starting point is 00:33:23 And the problems... yeah, yeah — Get or Create, or something like this, right? Yes. So basically we need to read the maximum value and do plus one, but maybe others do the same thing in parallel. Yeah. How to deal with performance in this highly concurrent environment? Again, like, the clue for me is to narrow down the scope of collisions, that's it, so contention would be local to... Yeah, so there are multiple options, right? Like, you could just implement — I say "just" as if it's simple — retry. That's one option: if you expect collisions to be super, super uncommon, retries would be a solution. But I think there's — well, the Sequin blog post actually goes into a bit of depth
Starting point is 00:34:16 into how you could scale this if you are doing tons, like, a lot per second. So that's an interesting solution — there's way too much code to go into now, but I'll link that up in the show notes. But yeah, I did think there's, like,
Starting point is 00:34:32 a range of solutions. Like, from: we have a multi-tenant system, like incident.io, for example — you're not going to be creating, hopefully most organisations are not going to be creating, thousands of incidents per — never mind second — per day. So the chance of collisions, or, like, issues there,
Starting point is 00:34:50 are so low that it's almost a non-issue. Whereas for a different use case... I actually can't think of a use case for, like, needing a gapless sequence that can insert thousands, or, like, thousands per second. So I just don't see that being a... Well, I'd love to hear from people that have seen that, or have had to deal with that, and what they did. Thousands per second?
Starting point is 00:35:16 Yeah — for a gapless sequence, like, yeah, where it's important not to have gaps. Yeah, yeah. Because if you have a lot of inserts, you have big numbers, and — yeah, so the desire to have gapless matters when we have only small numbers. I think it's more important then, right? Yeah, maybe. Also, even 32 would disappear quickly. Imagine — a gap of 32 would disappear quite quickly if you're... Because if it's a big number, we already stop paying attention to...
Starting point is 00:35:54 Yeah, maybe, maybe you're... And also, I don't think computers care about gaps. I think it's humans that care. Yeah. Personally, I don't know. Yeah, well, with sequences, I remember it was 2005-6 when we wanted to hide actual numbers of users and things
Starting point is 00:36:12 created in our social network. So we used two prime numbers and set the default to nextval from the sequence, multiplied by one big number and then modulo a different number — so it was, like, fake random,
Starting point is 00:36:31 you know, to hide it. I figured out, like, you can still — if you create some of the things yourself, you see the numbers, you can quickly understand the logic, and, like, you can still hack it and understand the actual growth rates. But it's hard to understand the absolute value from this — you don't know how many things there are. Compared to, like, people who don't care: they use just one global sequence for all users, and you go, okay, the number of posts, this is, like, one million something. Okay, this platform has one million posts.
Starting point is 00:37:04 It gives some signals to your competitors. Right. So — I learned today what that is generally called: it's called the German tank problem. Have you heard of this? No. It's, like — maybe not the first, but, like, the first famous case of this was, I think, in World War Two: the Allies were using the numbers, like an incrementing number found on German tanks, to find out how many they were going through, what their production capacity was. And it was a useful thing in the war. So yeah, this is older than computers. Yeah, it reminds me how the guys from my former country went to your country to poison some guy, and their passports were sequential — that's how they were tracked. Yes, so stupid, right?
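For illustration, a rough sketch of the obfuscation trick Nik described a moment ago — multiplying the sequence value by one prime and taking it modulo another, so consecutive inserts don't produce obviously consecutive IDs. The constants here are just example primes (not the ones Nik used), the mapping only stays a permutation while nextval is below the modulus, and it hides the count only from casual observers, not from anyone determined:

```sql
CREATE SEQUENCE posts_raw_seq;

CREATE TABLE posts (
    -- n -> (n * 982451653) % 1000000007 is a bijection for n in 1..1000000006,
    -- so ids stay unique while looking pseudo-random
    id    bigint PRIMARY KEY
          DEFAULT (nextval('posts_raw_seq') * 982451653 % 1000000007),
    title text
);
```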
Starting point is 00:38:02 I mean, sometimes gaps are good if you want to hide some things. Yeah. So if you build some system, maybe you want gaps, actually. Yeah, that's the next episode, another different episode. How to build gaps?
Starting point is 00:38:16 Gapful sequences, yeah. Yeah, some random gaps, so everyone, like, doesn't understand how many. Yeah, just UUIDv4, right? Yeah. Random jumps. Yeah, so that's it. I also wanted to mention —
Starting point is 00:38:31 sequences have — a sequence has a few more parameters you can specify, like MINVALUE, MAXVALUE, and you can say it should go in a loop. I don't know, like, I never used it — CYCLE, it's called, CYCLE. So you can specify from 1 to 1,000 and CYCLE. So it, for example — it couldn't be on a primary key, that one. Yeah, I would use, like, just the percent operator, modulo — just divide by something and have the same effect. But yeah, I guess it's similar to transaction IDs, if you think about how transaction IDs wrap around. Yeah, if you want to wrap around, go for it. Yeah, I'm very curious about use cases
Starting point is 00:39:18 for this — never used it. Yeah. But with INCREMENT you can also specify a jump, like — only odd numbers, for example, right? Yeah, or any positive number might be more common. We want to increment by random — this will be our random gaps, to fool everyone. Yeah, all right. Okay, good. Enough about sequences.
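For completeness, a small sketch of the extra CREATE SEQUENCE options touched on just above — MINVALUE, MAXVALUE, CYCLE and INCREMENT BY; the sequence names are made up:

```sql
-- wraps back to 1 after reaching 1000 instead of erroring
CREATE SEQUENCE looping_seq
    MINVALUE 1
    MAXVALUE 1000
    CYCLE;

-- only odd numbers: start at 1 and jump by 2 every time
CREATE SEQUENCE odd_numbers_seq
    START WITH 1
    INCREMENT BY 2;
```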
Starting point is 00:39:42 Thank you for the topic. Likewise, good to see you and catch you soon. Bye-bye.
