Postgres FM - GIN Indexes
Episode Date: March 14, 2025

Nikolay and Michael discuss GIN indexes in Postgres — what they are, what they're used for, and some limitations to be aware of. Here are some links to things they mentioned:

GIN Indexes https://www.postgresql.org/docs/current/gin.html
Generalized Search Trees for Database Systems (Hellerstein, Naughton, Pfeffer) https://dsf.berkeley.edu/papers/vldb95-gist.pdf
RUM extension https://pgxn.org/dist/rum/1.1.0/
Understanding Postgres GIN Indexes: The Good and the Bad (Lukas Fittl) https://pganalyze.com/blog/gin-index

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:
Michael Christofides, founder of pgMustard
Nikolay Samokhvalov, founder of Postgres.ai

With credit to:
Jessie Draws for the elephant artwork
Transcript
Hello, hello, this is PostgresFM. As usual, I don't remember the episode number.
137?
137. My name is Nikolay, Postgres.ai, and as usual, my co-host is Michael, pgMustard. Hi, Michael, how are you doing today?
Hello, Nikolai. I am good, thank you. How are you doing?
Fantastic, honestly.
Good? A lot of things are happening but all good. Things are changing and a lot of things are changing in a constructive way, which I like.
Let's discuss GIN.
It's one of the things a lot of folks use, one of the index types many people still use, and I think it's not going to disappear, despite the rise of vector-related index types and approximate k-nearest-neighbor indexes, right? GIN is a very interesting type of index, developed by the same folks who brought WAL support to GiST.
GiST was the only thing for complex data types before GIN.
But then, with the rise of JSON and JSONB, GIN became very popular, finding a lot of applications. Not only full-text search, though I think originally it was created for full-text search.
Yeah.
Yeah.
So I looked at the history. I was shocked to see that it was added back in 8.2. So 2006,
nearly 20 years ago.
Yeah, this is exactly when I started to be involved with Postgres, 8.2, when I worked on XML as well. And I thought how very sad it is, what's happening in the world, and how disconnected all of us became.
Yeah. Because these guys are all Russian folks. But also, I was looking at the domains. There's an article mentioned in the Postgres docs on GIN indexes that references the initial release and mentions who sponsored the work, which I think was a German company. But the domain it's hosted on is .su, which I didn't even know what that stood for. It turns out it's USSR, or Soviet Union, which makes sense.
Yeah. I wanted to say that originally the primary purpose was to support better performance for full-text search. And originally, full-text search was powered by GiST and its variation called RD-tree, Russian Doll tree, where a B-tree-like structure was created for arrays, for sets. And instead of operations like less-than and greater-than, it supported operations like "is contained" (one array is contained in another) or "contains", right?
Also overlaps. And GiST can work with this; it can support full-text search, but it's slow if you have large volumes of data and a lot of different words, when the vocabulary is huge. The actually used vocabulary, right?
So GIN is an inverted index (Generalized INverted index), and the naming style was inherited from GiST. GiST is Generalized Search Tree.
I like this so much.
Folks who are interested in databases and want to learn something cool: read the article by Joseph Hellerstein from Berkeley, about jumping from the single-dimensional B-tree to two-dimensional space, the R-tree, and then generalizing this idea to GiST, the Generalized Search Tree. It's so cool.
And then additionally the RD-tree, which is for sets. And of course you can have multiple dimensions, like many dimensions. Not as many as for vectors, unfortunately, not 2000, but for a few dimensions GiST works well. There was even a history with the R-tree: Postgres had a native R-tree implementation for multiple dimensions, but GiST behaved much better, so it was replaced by GiST. Not B-tree, but there is an extension, btree_gist, for various multi-column indexes and so on, when you have columns of a different nature.
And trees are great, but for full-text search they are not so great. So they jumped from the idea of working with trees to the inverted index, which is a more common idea for search engines. And compared to a tree-like structure, the inverted index has a very simple explanation. We have a list of words, and for each word we have a list of references to the places where this word was noticed, right?
Like rows.
Links to tuples.
Yeah.
Not rows; row versions. So, a list of words, normalized of course, and then for each word a list of places where it can be found. It's very, very similar to the index at the end of a book.
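For illustration, here's roughly what that "list of words" looks like in Postgres's full-text search machinery (the table and column names here are invented for the example):

```sql
-- to_tsvector normalizes words to lexemes and records their positions;
-- stop words like "a" and "over" are dropped
SELECT to_tsvector('english', 'A quick brown fox jumps over the quick dog');
-- 'brown':3 'dog':9 'fox':4 'jump':5 'quick':2,8

-- A GIN index over such tsvectors is the inverted index: one entry per
-- lexeme, each pointing to all the heap tuples that contain it
CREATE INDEX ON articles USING gin (to_tsvector('english', body));
```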
Yeah, I read this and for so many years I've been using and reading about,
you know, indexes at the back of books,
or, you know, in recipe books is like,
there's a really famous example,
really easy one to understand how indexing speeds things up.
But it never occurred to me
that they're not a B-tree structure,
they are a list of words.
Yeah, exactly, it's an inverted index.
So yeah.
And I think the thing that clicked for me as to why the word inverted is used was: in a standard, non-inverted index, we have an entry per row or row version. And in an inverted one, each row could be referenced multiple times. In the full-text search case, for example, each document is likely to have lots of words in it. So that's where the word inverted comes from.
Yeah. In tree-like structures, in B-tree, R-tree, RD-tree, and GiST in general, we perform a descent. We say we need to find this value, for example, and we jump from one node to another, down the index towards the leaf nodes, right?
Leaves.
Performing some operations to understand where to go, which child node to choose. And not just one of two directions, because each internal node might have multiple, many, many children, unlike a binary tree. And my favorite example, my favorite question when hiring an engineer: not a database engineer (for a database engineer this question is mandatory), but a software engineer, a backend engineer. You can ask: what is a B-tree? And if you hear "binary", this is game over for you.
And unfortunately, I've had situations with very, very good software engineers where I had this feeling: do I want to ask that question? Because it's kind of flipping a coin. And then I hear "binary", and my internal principles say this is a no-go. So, a B-tree is not a binary tree. Unlike a binary tree, where there are at most two children, in a B-tree we have many children for each internal node, maintaining some rules.
In case anybody's wondering: balanced.
Well, there are several theories about what the B stands for, as I remember, but I don't remember the theories. But okay, yeah, balanced. Although actually it's almost balanced, meaning that conceptually the path from the root to each leaf can be n or n plus 1; this helps with balancing. So, almost balanced. Okay. Again, when we work with a tree, we have several operations until we reach a leaf, and the leaf has a pointer to the tuple we are looking for.
With an inverted index: okay, we find our word, and then we find many places for this word. And this also means there are several challenges here. I remember originally, as I understand it (my perception might be wrong, it was long ago, how many, 20 years, right? Almost), GIN was lacking some internal B-tree structures. It needs B-trees twice. First, to find the word, because if the number of different words is huge, we need an additional mechanism to perform the search faster, and it can be a B-tree. And second, inside posting lists, we also have B-trees. So inside GIN, there are two types of B-trees, as I remember.
And I saw Egor Rogov's articles and book about Postgres internals. This book has, of course, a whole section about GIN indexes. And I saw a representation, a visualization, for GIN. Actually, I think the documentation should have it, right? One of the illustrations, maybe the first one, which was added to the Postgres documentation was for GIN, if I'm not mistaken. My memory might trick me a little bit, but I remember it was rotated 90 degrees: instead of, like in a book, where we have words and then see page numbers to the right of each word, it was rotated 90 degrees so everything points down. So it seems like we're going down all the time, as in the case of GiST or B-tree. Starting from the search for our word, then using the internal B-tree, then we go into the posting list, where we also use a B-tree and also descend, right?
And I honestly feel confusion here. I like the book-like visualization because it distinguishes GIN from tree structures. Having the internal trees in the picture only adds confusion, in my opinion. But this is their choice. The folks who created the visualization and wrote about it, Egor Rogov and others, are great in terms of education. I definitely admire them, their work in education. But I personally like the original representation. We could draw the B-tree rotated as well, highlighting that GIN is kind of a different beast in terms of indexes, right?
Yeah.
Good?
Yeah, this is...
Sorry, sorry.
This was my introduction: just history and so on, and some simple explanation.
Yeah, I really liked it. And I really like diagrams in the docs; I think a lot of effort goes into making diagrams that are easy to understand. Unfortunately, we don't have that many in the Postgres documentation. But I also find that I personally don't learn best visually, I don't think. And I found the description in the Postgres docs of how GIN works particularly useful. It's really well written, like most of the documentation, but I found that really helpful. So I'll link to that in the show notes as well. I think it's an incredibly good description, with a few examples. I mean, obviously full-text search is the main use case, but it's not the only one.
I think full-text search was the original main use case; right now the main use case is JSONB.
Yeah, interesting.
Well, honestly, I don't see that. I'd be interested in the results of your poll; I think you've put a poll on Twitter. But I do think GIN is probably the second most popular index type, after B-tree. Order-of-magnitude-wise, I'd say fewer than one in 10 indexes, maybe, but probably more than one in 100, so somewhere in between. It's still not super common for me to see them.
Yeah, almost half of folks voted that they use two index types. And I guess it's either B-tree and GIN, or B-tree and vector types like HNSW.
Interesting.
Who knows?
I do see the occasional BRIN and the occasional hash, but often those are kind of... people have left them lying around after an experiment, after trying them out, but not necessarily getting that much usage.
Postgres should have telemetry enabled by default.
I'm joking, I'm joking.
To get usage stats.
Well, actually, I know this is a real tangent now, but I do think that usage stats can be misleading for people as well, because the last person I came across who had a hash index had it on the same column as a regular B-tree index. And I said, you know, that's redundant; the B-tree could...
You already pay the price for maintaining the B-tree, so...
Exactly. But the hash index was still getting used, because it's a bit smaller. So it was still getting usage in the telemetry.
We had an episode about hash indexes, remember?
Yeah. But all I mean is that the telemetry isn't enough to know that you can drop that hash index, because it's still getting usage.
I'm joking about telemetry. I cannot imagine Postgres would add some telemetry stuff enabled by default; it's completely opposite to what's usually done in Postgres. But I'm very curious, of course. I think big platforms might have some stats, but who knows. Anyway, it definitely looks like B-tree is number one, and GIN may be number two or number three.
Oh, sorry, now I understand what you mean by telemetry. I thought you meant for the users to see what they were using.
No, imagine Postgres had telemetry sending to some server.
The cloud providers could do it. Anonymized.
Yeah, if they can, because it's also a question of how deep they can go into schema analysis. Usually it should be possible. But it's better for cloud providers to focus on different tasks, and we can discuss this another time. This is not the number one priority, I'm pretty sure.
How about you? How often do you see GIN being used in the wild?
Well, it's used for full-text search, but often it loses to Elastic in larger companies. In smaller projects it's great, but in larger companies... we should discuss performance gains and overhead next, right? But yeah, what I see is that it's great until it's not, in terms of full-text search. For JSONB it's kind of similar, but there we can tune it, right? We can use a smaller index, preserving only information about keys and values, with jsonb_path_ops and things like that. There are additional ways you can tweak it, right?
That may be jsonb_path_ops, exactly.
Yeah, jsonb_path_ops. So you can say: I don't need everything to be indexed, all the knowledge about the structure; I just need keys and values, and support for only part of the search operations I have.
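As a sketch of that tuning, with invented table and column names:

```sql
-- Default opclass (jsonb_ops): indexes every key and value separately;
-- supports key-existence operators (?, ?|, ?&) as well as containment (@>)
CREATE INDEX idx_events_meta ON events USING gin (metadata);

-- jsonb_path_ops: indexes hashes of whole paths (keys down to values);
-- typically smaller and faster, but only supports containment-style searches
CREATE INDEX idx_events_meta_path ON events USING gin (metadata jsonb_path_ops);

-- Served by either index:
SELECT * FROM events WHERE metadata @> '{"type": "click"}';
-- Needs the default opclass (key existence):
SELECT * FROM events WHERE metadata ? 'type';
```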
And for JSONB, for sure I see it. But at the same time, in many cases JSON or JSONB values are left unindexed, because it's just storage for some metadata, for example, and we don't need to search in it. Or we decide to search only using specific paths inside the JSON value, and then, in this case, for both JSON and JSONB, and even for the XML data type, you can have a B-tree index, because it becomes a scalar value when you extract the value from some path. That's it.
It's just an expression index.
Yeah, exactly.
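That expression-index pattern, sketched with hypothetical names:

```sql
-- Extracting one path with ->> yields a scalar (text), so a plain B-tree works
CREATE INDEX idx_orders_customer ON orders ((metadata ->> 'customer_id'));

-- The query must use the same expression for this index to be considered
SELECT * FROM orders WHERE metadata ->> 'customer_id' = 'c-123';
```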
I also see the trigram extension, pg_trgm, being used quite a lot, sometimes with some issues, but this I see as well.
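A minimal pg_trgm sketch (hypothetical table); a trigram GIN index accelerates LIKE/ILIKE and similarity searches that a B-tree can't help with:

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX idx_users_name_trgm ON users USING gin (name gin_trgm_ops);

-- Can use the index even with a leading wildcard, unlike a B-tree
SELECT * FROM users WHERE name ILIKE '%ikola%';

-- Fuzzy matching with the similarity operator
SELECT * FROM users WHERE name % 'Nikolai';
```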
Yeah. So again, that's normally related to full-text search, right?
I say "trigram", by the way, though maybe it's pronounced differently, I don't know. I think JSONB is number one; in my head, at least. I don't have analytics, so it's a subjective perception.
It makes sense, right? It makes sense if you think: I might want to do some search on this. It's perfect. I mean, GIN came 10 years before JSONB, but it's the perfect index type for it, if you don't know in advance what you're going to want to be searching for.
There is also full-text search for JSONB, additionally; quite a specific case. I remember some support was added to perform full-text search inside JSONB values, also powered by GIN, I guess.
So anyway, this index type is great. For arrays as well; I've used it many times for arrays. If I know I need to store arrays, text arrays, integer arrays, doesn't matter. I like how Postgres has supported them for decades. Array support in Postgres is great, except one thing. You know it, right?
NULLs?
No, no, similar, but no.
Then I don't know it. No.
It's indexing. It starts with one.
Oh, okay. Super confusing. You confused me saying indexing; I was thinking...
No, no, yeah, it's different. Index structures. I mean, to access the first element: everywhere else outside Postgres, it's zero, right? But in Postgres, it's one. And you need to switch your mind all the time.
Not everywhere else, but yeah.
And for arrays, it's great. For example, I remember my very first talk in the US, in 2008, was about how cool Postgres is for Web 2.0. And I talked about the EAV structure, how it's not good performance-wise for larger data volumes. And I talked about tagging, tags, right? For social media, we need tags. And putting tags into separate tables, following the EAV, entity-attribute-value, structure is going to hit performance. So instead, we can put them as arrays in the same table, and then, to search, we can just use GIN, right? And the Russian Doll tree idea will be effectively used, supported by GIN implicitly. So in this case you can quickly find, like: give me the rows which have this tag. There will be a problem of ordering, though.
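The tags-as-array approach might look like this (schema invented for the example):

```sql
CREATE TABLE posts (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    body       text,
    tags       text[] NOT NULL DEFAULT '{}',
    created_at timestamptz NOT NULL DEFAULT now()
);

-- One GIN entry per tag, instead of a separate EAV join table
CREATE INDEX idx_posts_tags ON posts USING gin (tags);

-- "Contains": rows tagged with both values
SELECT id FROM posts WHERE tags @> ARRAY['postgres', 'indexes'];
-- "Overlaps": rows tagged with at least one of the values
SELECT id FROM posts WHERE tags && ARRAY['gin', 'gist'];
```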
Before we move on to discussing this problem... I think this is the biggest problem with full-text search and GIN in general; people suffer from it. But we will discuss it; before that, let's discuss performance gains and losses. The gain is obvious: search speed is much better with an inverted index than with a tree-like structure. So for larger volumes of data, search is good, selects are good, right?
One comment here: only if we talk about using GIN alone, right? And only searching by one dimension.
Yeah. Well, one word, or a few words too; it has an additional internal query language, you could say, with AND, OR, phrase search. A lot of features there. But also, when we say search speed is better, we need to think about edge cases as well. For example, if we have a very, very popular word which was not excluded by putting it into the stop-word list of the dictionary. If we didn't put it there, we're going to have poor selectivity and high cardinality of results, and performance... I don't know, LIMIT works, right?
But then there are edge cases where, when it's quite a popular word, we use GIN, and there is a big problem: the need to combine GIN-based search with something else, for example an additional filter on some scalar, like a timestamp or anything, or ordering. And this is a huge problem. I want to find everything which includes these words or these values, doesn't matter, but I also need to get the last 25 items.
And this is somehow weighted... like, you might want to weight based on recency: the more recent, the more you want to weight it. But if it's got, you know, the word many, many times over... I get it. I think we're probably venturing a bit too far into full-text search stuff rather than GIN stuff. But you're right, and it's about GIN.
It's not about full-text search. For trigrams, don't you want the most popular ones, or the latest? Or for JSONB search: I want to find everything which includes these things, but I want very fresh results, I want to order somehow.
And in most cases... Back when it was created, the idea was that we are going to order by ranking. It was full-text search: ranking, most relevant. But in social media, we usually don't want that; we usually want the freshest information. So creation time, or ID if it's numeric, or UUID version 7, would work. So we need to order by some scalar. And this does not work well with GIN.
I think you're right, though. I think maybe at scale it's always going to be a problem.
There is a feature that was added from the beginning with GIN, at least for the stop-word issue. And that's gin_fuzzy_search_limit, which you can set; the docs recommend values in the thousands. If your search would have returned more than roughly that number, it will limit the results, but it will limit them randomly, so that plays into which ones should be returned. But the idea behind the feature is: if you're returning thousands of options anyway, how good is that search in the first place? So yeah.
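For reference, it's a regular setting (the query here is illustrative):

```sql
-- Soft, randomized cap on the number of rows a GIN index scan returns;
-- 0, the default, means no limit
SET gin_fuzzy_search_limit = 5000;

SELECT id FROM articles
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'popular & word');
```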
Yeah, you're right. Also, there was statistics-related work; over time it was improved. But I remember still having issues with lossiness for the planner, right? I don't actually remember the details right now; I've been using GIN blindly lately in terms of performance. But for me, the key problem is the inability of GIN to combine search with additional scalar-based filters or ordering. Ordering is the number one problem for me. It led the same folks to create RUM indexes.
Yeah, which confused me, because when they launched GIN, they said you should think of it as a genie, not as an alcoholic drink. But then they came out with RUM, and that's definitely about alcoholic drinks.
It was a catch. I don't believe them.
An ambush, right? So yeah. Well, RUM unfortunately didn't work well in my case, because it was huge and so on. Maybe it was because of what I tried to put into it. The idea of RUM is: let's put scalars into the structure, right? Like a timestamp.
So that you can order by creation time. You can do the thing you wanted.
Right, exactly. And a creation timestamp is, how many, 8 bytes, or 16? I keep forgetting. It was huge. I guess I could reduce it, for example putting in 4-byte integers just to support ordering. There are ideas for how to improve it, but still, this is an extension that is not available on many platforms, unfortunately. But there is another thing.
There is a standard contrib module, an extension available everywhere, called btree_gin. At least this can help you combine different filters. For example, if you have a filter to search in JSONB, or full-text search, or a trigram index, and you want to combine it with additional filtering, for example on price, or age, depending on the data: this is possible and worth considering, to have a multi-column GIN index. And maybe for ordering it can also be helpful. But if you don't do anything, what usually happens is that the planner needs to decide: should it use, for example, the primary key, or an index on created_at, to get the order, and then apply the filter to the whole data set, instead of using GIN? The alternative is to perform GIN filtering, find everything, and then sort in memory. And it's hit and miss. Sometimes we see that the planner is very optimistic, thinking it will quickly find rows just by following the B-tree on created_at in reverse order, very fresh items first, dynamically checking: does this row match our filter, the one which was supposed to go through GIN? Well, it's not matching, not matching, not matching... And if it's a miss, it might skip many, many, many rows until it finds our 25 target rows, just following the B-tree, not involving GIN at all. And this is a terrible situation; I saw many incidents because of that. Or vice versa: GIN, again with some popular word, finds millions of entries, then tries to sort them in memory to find the top 25. Maybe these edge cases hit less often in terms of performance, but it's also not fun at all to have this situation, right?
Yeah.
And this is number one.
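A btree_gin sketch (names invented); note these opclasses shine mostly for equality filters combined with a GIN-friendly column:

```sql
CREATE EXTENSION IF NOT EXISTS btree_gin;

-- One multi-column GIN index over a jsonb column plus a scalar;
-- btree_gin supplies GIN operator classes for scalar types like int
CREATE INDEX idx_products_attrs_price
    ON products USING gin (attributes jsonb_path_ops, price);

-- Both conditions can now be evaluated in a single index scan
SELECT * FROM products
WHERE attributes @> '{"color": "red"}' AND price = 100;
```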
That could be where the fuzzy limit could actually maybe help, depending on how important it is to not have false negatives, like things that should show up not showing up, I guess, depending on the use case. If you just want to show something, let's say you're showing a social media feed, trying to show engaging things to people, it doesn't matter if there are a few really good ones that don't show. But if you've got a different use case, then it matters a lot. Like for full-text search, if you don't show the most relevant thing, that's an issue.
Or, for example, a user just added something, goes to search, and expects it to be there, but it's not there. It's not good. It feels like a bug. So in many cases it's unacceptable, unfortunately. And also, there is a problem with the pending list, right? This is about inserts rather than selects.
Not posting lists; there are posting lists for each word, where basically we have references to tuples. And then there is the pending list, which was the trick to... Actually, we forgot to mention the main problem with GIN. The key problem is slow writes. Because search is fast, with all the things we discussed, but writes are slower than writes to tree-like structures, to trees: to B-tree, to GiST. Even with tree rebalancing, I guess it's slower.
And what was created... fastupdate, the option, isn't it? It's an unfair comparison, though. It's almost by design that they have to be slower, because documents contain lots of words. One row being updated or added creates multiple writes to the index. So it's not a fair fight, it's not a fair comparison.
I agree. So, to mitigate this, the fast update technique was created, and there's the fastupdate option when you create an index. And I guess it's on by default, right?
Yeah, yeah.
So fastupdate means that changes will go to a temporary buffer, a kind of staging location inside the index structure: the pending list.
The pending list, yes. And then at some point, when it reaches, by default, four megabytes...
Yeah, there's a few things that can trigger it; that's one.
...it batch-processes them. And this can happen synchronously, so some write, some update or insert, might suddenly be very slow. So slow that it might hit your statement_timeout, which is, like, 30 seconds for example, or 60, or 10, depending. And this is unpredictable performance for writes.
Yeah. Or vacuum can do it.
Asynchronously.
Vacuum I like. What I saw from practice: vacuum is good, it mitigates this, not allowing this job to be done in a synchronous manner. But it works only if the duration of vacuum is reasonable, which points in the direction that we do need partitioning. Because if we have a huge table, vacuum is super slow, it cannot catch up with our writes, the pending-list problem hits our writes, and users notice it. At some point, we might decide, okay, we need to switch off fastupdate for some indexes, for large tables, because we want predictable, maybe not good, but predictable performance of writes. It depends. There is a trade-off, and I see that both cases are not ideal, you know, if you have really large volumes of data and a heavy workload. So partitioning can help here, to keep fastupdate on, in my opinion.
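The knobs under discussion, sketched (index and table names invented):

```sql
-- fastupdate is on by default: writes are buffered in the pending list
CREATE INDEX idx_docs_fts ON docs USING gin (to_tsvector('english', body));

-- Trade slightly slower individual writes for predictable latency
ALTER INDEX idx_docs_fts SET (fastupdate = off);

-- Or tune the flush threshold per index (the global default comes from
-- the gin_pending_list_limit setting, 4MB); the value is in kilobytes
ALTER INDEX idx_docs_fts SET (gin_pending_list_limit = 1024);
```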
Well, yeah, have you tried that? I'm just thinking, actually: then you've got to have the partition key in the query, and suddenly you've got another dimension there. Sounds tricky.
And so what? We need a different index in this case, right?
Well, it just might not use the GIN index then. I've had trouble convincing the planner to use GIN indexes in the past when it's got another choice. I'm probably doing... maybe I'm doing something wrong, but I wanted to try to get a bitmap scan, with a GIN index as one bitmap index scan and, let's say, a B-tree index as another.
Sounds like a case for btree_gin to me.
Well, a multi-column GIN index with a B-tree-style column in it could have helped, but it wouldn't have been structured quite as optimally, in my opinion. So I was hoping to get the AND list really short and then a really quick scan, and I didn't have luck with a multi-column btree_gin index being as performant as I thought we could get with a bitmap scan. But the partitioning thing... I would love to be wrong, but the partitioning thing sounds like you've now got a GIN index per partition, right? So you've got to get to the right partition.
If it's partitioned by time, it should work; partition pruning should be in good shape, of course, but this is the general rule for partitioning.
Yeah. But so, let's say you're partitioning by time. It means we've got a time filter on, right? Is that the case? Like, for full-text search, for example, are we going to be filtering by time?
Well, if we're ordering by time... Like, for social media, again, we need fresh information in most cases. This is the rule, compared to the previous era, Web 1.0.
Well, unless there isn't fresh information; then you need to go back and back and back, right?
Maybe, yes. Well, it depends. So my question is: how do we limit, how do we order? If we don't limit and don't order, there is a fundamental question for this query: why do we request unlimited results? This is a question I raise very often during consulting practice. Very often I see no LIMIT, no ORDER BY, and this is the key question. In most cases, people say: oh, actually, yeah, it makes sense to have pagination here, for example.
Or pagination with GIN.
Yeah.
It's interesting.
It's difficult, right?
Yeah.
Well, again, if we order by ID or timestamp, by a scalar value, keyset pagination works, and that's it. And GIN is just additional filtering in this case.
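That pattern, as a keyset-pagination sketch (hypothetical schema; whether the planner actually uses the GIN index here is exactly the hit-and-miss trade-off discussed above):

```sql
-- Page 1: the 25 freshest matching posts
SELECT id, created_at FROM posts
WHERE tags @> ARRAY['postgres']
ORDER BY created_at DESC, id DESC
LIMIT 25;

-- Next page: pass the last (created_at, id) seen as the cursor
SELECT id, created_at FROM posts
WHERE tags @> ARRAY['postgres']
  AND (created_at, id) < ('2025-03-01 12:00:00+00', 41234)
ORDER BY created_at DESC, id DESC
LIMIT 25;
```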
And we're kind of back to where we started, with your original comment that it works until it doesn't. So until a certain size and scale, it's fine not to have these. It's fine not to have the LIMIT and the ORDER BY, because your data set is small enough that your query response times are pretty good.
But we design for the future, not only for now, right?
So it's always worth asking yourself how much data you expect in the future, how much data
you need to return with this query to your application or front end.
I don't know.
Do we need everything? I saw an interesting case when people intentionally decided to return everything and then do a lot of work on the front end. I've seen it many times, actually, with folks who decided that the front end should be very thick, with a lot of logic there, and that it's better to cache everything: let's load it all and then work with it. In my opinion, it's a dead end, because if your project grows, you will experience poor performance. Another case was folks who decided to perform analytics with R, so they also needed to load everything, and then building B-trees and doing GROUP BY and so on was performed only in memory, using R. Which, to me, looks like: why do you do this? Do it inside your database.
So yeah. And let me not forget: when the pending list is growing, can it exceed the limit, right? Or no? Is it a soft limit or not? I'm not sure.
This is a good question. But anyway, maybe it was a bug when it exceeded the limit; I don't remember. Some parts of my memory tell me that it happened in the past; I'm not sure whether it was a bug or not. But obviously, SELECT performance will also be affected if the pending list is huge. For example, imagine that for specific filters we are unlucky and we need to process all of it; we need to deal with it, and selects become slower.
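One way to watch the pending list, via the pgstattuple contrib extension (the index name is invented):

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Reports version, pending_pages, and pending_tuples for a GIN index
SELECT * FROM pgstatginindex('idx_docs_fts');
```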
Yeah. So it's four megabytes, right, the default? I suspect it's not too much of a penalty if you leave it at four megabytes. But a tempting thing to do might be to increase that, so that you hopefully have autovacuum kick in and do the work in the background, instead of having a random insert paying the penalty.
Right.
But then, once you've increased the list, each select has to look through more pending entries, especially towards the end, and then you're paying the price on selects as well.
So yeah, I haven't seen... in fact, we should probably mention there's a really good write-up about these kinds of issues by Lukas Fittl of pganalyze, who also linked to the GitLab work.
I think Lukas does a fantastic job analyzing various Postgres cases. I think that's why his company is called pganalyze. But this is your joke, not mine.
You're stealing my jokes now.
Yes. Well, yeah, good job with the analysis. And in GitLab's case, I was only slightly involved, and I remember that case was a few years ago, and the result was a decision, for specific indexes, to switch off fastupdate, because the table was huge and so on, right?
Yeah. Anyway, switching off fastupdate means we're not using a pending list anymore, and every insert and update pays the price at the time of insertion to update the index. And I think that's a really interesting trade-off: making each insert a bit slower, but not having occasional very slow ones.
I like that a lot in terms of the trade-off. Anyway, kudos to GitLab for their openness, as usual, because all the details are visible to the public, to the community, and it's super helpful for the general development of Postgres. And kudos to Lukas for the great analysis of this Postgres case.
Good. I think we touched on some slightly deeper waters, but not super deep. If someone wants to go deeper, Egor Rogov's articles and book about Postgres internals are great in this area.
Yeah.
And yeah, I think that's it, right? Or anything else?
One last tiny tip: you mentioned jsonb_path_ops briefly. I think for most other data types, the default is the only operator class, but for JSONB, the default is actually jsonb_ops, not jsonb_path_ops. And you can get a performance boost with jsonb_path_ops if you don't need everything the default offers, like if you're just using the usual containment operators.
Final words, gin is very important index type.
It's not going to disappear.
It's going to be used heavily. It's one of the strengths of Postgres,
like a rich set of index types.
Yeah, for sure.
Okay, so thank you. See you next time.
Thanks so much, Nikolai. Catch you next week. Bye.
Bye.