Postgres FM - Hash indexes
Episode Date: December 15, 2023

Nikolay and Michael discuss hash indexes in Postgres — what they are, some brief history, their pros and cons vs btrees, and whether or when they recommend using them.

Update: the idea Nikolay mentioned at the end of this episode turns out to be a little fraught (and as such, inadvisable).

Here are some links to things they mentioned:

Index types (docs) https://www.postgresql.org/docs/current/indexes-types.html
Re-introducing hash indexes in PostgreSQL (blog post by Haki Benita and Michael) https://hakibenita.com/postgresql-hash-index
Hash indexes intro (docs) https://www.postgresql.org/docs/current/hash-intro.html
Hash indexes implementation (docs) https://www.postgresql.org/docs/current/hash-implementation.html

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello, hello, this is PostgresFM episode 76.
My name is Nikolai and as usual, together with me is Michael.
Hi, Michael.
Hello, Nikolai.
So, today's topic is super interesting and very popular, right?
Not exactly. I chose this one.
Okay.
Therefore, it's boring.
But we will learn something new, which is good.
I hope so.
Even if it's useless, it's still new.
We're here to learn, right?
So Nikolai's teasing me because I've picked the topic hash indexes,
which until relatively recently were highly discouraged by the Postgres docs
and not very sensible to use.
Until Postgres 10.
Yes.
But have been in Postgres for a long, long time.
I looked into the history of it.
And therefore, some people must think they're useful.
And I definitely think there's one or two use cases for them.
But more to the point, I think they're very interesting for people.
I think this is the kind of thing that will make you
appreciate the B-tree index as well
and also understand
a little bit more about
trade-offs we make when we're choosing
things within Postgres.
Indexes, right?
Why doesn't
Postgres choose indexes,
index type for us? We just want
to make it fast and that's it, right?
It should choose.
Okay, different question.
Are hash indexes more useful than the money data type?
Oh, I think so, yes.
Okay, this is already better.
I hope we will never have an episode about the money data type.
Yeah, me too.
Okay.
So, yeah, should we start with what a hash index is?
Yeah, hash function, hash index, right?
Yeah, that's pretty much it.
Hash table.
Yeah.
So in fact, that's a good point.
A lot of people have come across hashing from looking at query plans for hash joins, or, you know, backend engineers are very familiar with hash tables. It can be a very efficient way of taking a value, hashing it, storing that, and then in future, if we want to look up the value by equality, and equality only, we can hash the value we see and look it up in the index, in this case.
So people that like theory talk a lot about big O notation, don't they? So this is one of those things that I struggle to love, but it has a low...
Like, O(1) is what people often say. I'm not sure it's strictly true, depending on how deep down that rabbit hole you dive, but yeah, it can be very, very efficient for looking up things by equality, especially in certain conditions.
So that's like, actually, a couple of things on details.
I didn't realize you could, and it makes sense,
but you can use a hash index on any value, any data type,
which is quite cool.
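As a rough illustration of what that looks like (a minimal sketch with a hypothetical table; only equality lookups can use a hash index):

    CREATE TABLE users (id bigint PRIMARY KEY, email text);
    CREATE INDEX users_email_hash ON users USING hash (email);

    -- Can use the hash index (equality only):
    SELECT * FROM users WHERE email = 'someone@example.com';

    -- Cannot use it (hash indexes have no ordering):
    SELECT * FROM users WHERE email > 'a';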
I guess the same is true.
Postgres has a lot of internal hashing functions for all data types, and also for the record type. So you can hash a record for some table, and it will give you a hash as an integer 4, right? A regular integer, 32 bits.
And I actually recently rediscovered it and used it. It was interesting. I didn't use the hash index, but I used one of the hash functions, probably hashtext. Yeah, it was the hashtext function. I needed to, you know, when you need to reindex a lot of indexes. And it's a huge database. For example, after an upgrade to Postgres 13 or 14, 14 especially, you need to reindex a lot of B-tree indexes to start benefiting from the optimizations Peter Geoghegan and others implemented.
And obviously you need to re-index.
Okay, you will do re-index concurrently and so on.
It's good, but the question will be how to move faster.
There are parallel workers, as we discussed, and so on.
But probably you will decide
to use multiple processes
issuing the REINDEX CONCURRENTLY command.
And in this case,
if two of them
try to reindex
two different indexes
which belong to the same table,
you will have a deadlock.
Right?
So my idea was we just need to make each table associated with a particular worker. And to do it, I just took the table name, calculated hashtext from it, and then took it modulo the number of workers. If I have, for example, five workers... So we have the table name and the hashtext function. The hashtext function produces a number, like an integer 4 number. And then we just take it modulo five. The remainder of the division will be 0, 1, 2, 3, or 4. And each index associated with a particular table name will always go to a particular worker. So there will never be a conflict, no more deadlocks. So this is how you can redistribute the work.
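Roughly, the idea Nikolay describes looks like this (a sketch; hashtext is a built-in Postgres function, the five-worker count is just an example, and abs() is needed because hashtext can return negative values):

    -- Assign each table to one of five reindex workers deterministically,
    -- so two workers never touch indexes belonging to the same table.
    SELECT tablename,
           abs(hashtext(tablename)) % 5 AS worker_id
    FROM pg_tables
    WHERE schemaname = 'public';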
Maybe also amcheck, if you need to redistribute that work too: you need to check a lot of indexes for corruption. You can also use many workers, like 16, for example, and you redistribute the work between them to avoid deadlocks. So hashtext is useful. I was already thinking about CRC32 or something. By the way, I asked my bot to advise me, and it said, oh, there is hashtext. I never used it before, or maybe I just forgot. In psql, if you
say backslash df then hash star, you will see more than 50 functions. We checked it before the call.
More than 50 functions, many of them return integer 4, some of them return integer 8. So it's good to have a lot of already implemented functions.
You don't need to care.
Just use it.
Right.
Good?
Yeah, absolutely.
Small comment, slightly off topic.
Yeah.
I mean, related to hashing, but I guess completely unrelated to hash indexing.
Yeah, exactly.
Yeah, so you mentioned already there's a big change in the history of hash indexes that I think is worth mentioning. And that's that version 10 made hash indexes crash safe, or WAL-logged. Before that, it made them basically unusable, or unsafe to use, especially in a high availability setup. If you had replication and then failed over, your replica wouldn't have that index. And if you're-
Ephemeral indexes.
Yeah, right.
But no more. So it's solved. The same was solved with GiST very long ago.
Very long. Originally, GiST was also not WAL-supported, no WAL at all, but it was like
20-something years ago it was
implemented. Well, speaking
of 20-something years ago, I was
trying to find out when hash
indexes were added, and
the Git history doesn't go far
enough back, I don't think. And the
online docs,
7.2 is the oldest
version on the online docs, and they have
hash indexes. So that's as far back
as 2002 already. So
they're more than 20 years old as well.
Well, I'm quite sure they're older
because it's maybe one of the simplest
type of index you can implement.
But it's interesting that WAL support was not implemented for so long, right?
Right.
Until 2017, right?
I think you'd struggle to get – I could be wrong,
but I think if we didn't have hash indexes in Postgres today,
they wouldn't get added,
and they would be encouraged to be added via an extension.
That's my guess as to how they would be implemented if they were done today. But I mean, I might be wrong. Or maybe via the contrib modules.
Well, yeah, but we have them. They're first-class supported, right? They're WAL-logged now, they're replicated and everything. Crash safe, even while they're doing complex operations,
we might get to a bit later, it will recover. So we can use them if we want to. I guess the
question then is, why would we? And you mentioned something before the call I found very interesting
and it's probably very telling. But should I ask you, have you ever used a hash index?
Only in experimental environments, never in production.
I think I saw them in the wild, you know, saw them.
Yeah, I can believe that.
Yeah, but I don't think I ever seriously considered them.
But you need to take into account that version 10 and later, it's already mostly
consulting practice. Before that, I created a lot of systems, social networks, and so on. There,
it was before Postgres 10, so I never considered them. So my memory says stay away from them.
Yeah, that's a really good point. So it's only since 2017 that you should even be considering using these. So I'd like to first cover, I know there's going to be quite a few limitations that we need to talk about, but I'd like to cover some of the potential benefits of hash indexes. Like I think there's a couple that are worth mentioning.
Size? So they can be. Imagine if you are hashing a fairly large string, or even a number, or just a piece of data that is more than four bytes. When you index it with a hash index, it's only having to store four bytes or, you know, a bit more. It's not having to store that value in the index, and that means for larger data types or larger values it can be significantly smaller than a B-tree index, with some caveats, naturally. There are some optimizations for B-trees that we've got in recent versions. But yeah, it really stands out for larger values, and for columns that are unique or mostly unique.
Haki Benita and I wrote a blog post about this, and that's a lot of his work. He deserves most of the credit, but he put me down as a co-author,
which was kind of him.
And we looked at quite a few things.
And for the use case that he had for hash indexes,
which was a URL shortening service,
you get some quite large,
URLs can be quite large strings
in the grand scheme of things,
especially full URLs.
So they actually came out quite a lot smaller when you put a hash index on it versus a B-tree, which is pretty cool.
Even post-deduplication, because these are largely going to be unique.
We didn't make them completely unique, but largely your URL is going to be unique strings.
So, yeah, they can be significantly smaller.
They can be significantly bigger, depending on your data, but I think size is underrated in terms of index types. I think not only performance, which we're going to look at in a bit, but space and the cache, you know, we've talked about this kind of thing before, but I think it is worth mentioning.
Yeah, by the way, if we
try to index
in a regular way with a B-tree some large text values, there's some limitation.
I always forget.
Oh, good point.
Yeah, so sometimes we just need to apply a hash and use an expression B-tree on top of it. But of course it's not a hash index; we use a combination, right?
In this case, we cannot order these values,
but for exact matching, this is what can work well.
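A sketch of that combination (hypothetical table and column names; md5 is used here, but any of the built-in hash functions would do):

    -- Expression B-tree over a hash of a large text column:
    CREATE INDEX docs_body_md5 ON docs (md5(body));

    -- Exact-match lookups must repeat the expression so the planner can use it
    -- (optionally also compare the column itself to guard against collisions):
    SELECT * FROM docs WHERE md5(body) = md5('...the full text value...');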
Why do you hash it and then put a B-tree index on
instead of just putting a hash index on?
Because, again, I'm old guy.
You know, like, again, my mom always says,
stay away from hash. By the way, with my bot and GitHub, we just checked when the hashbuild function was added, and it's 27 years ago, original Postgres95.
Yeah, original source code of Postgres95. Wow.
Committed by Marc Fournier. It already had this hashbuild, which builds a new hash index.
So this is super old stuff.
I think maybe it even came from original Postgres.
Obviously, it came from original Postgres, maybe even from Ingress.
Who knows?
So maybe it's even more than 30 years ago.
Yeah, very cool. But yeah, that's another advantage, right? There's no limit to the size.
So you think... yeah, yeah, that's great. So you think I can just create a hash index if I have very large text values?
You're losing the same things, right? You can't order by them. And yeah, you're kind of doing a self-rolled one, which maybe even proves that we don't need them. But I think less footprint, right? Because you're having to... yeah, I'm not sure. I think there might be some weird trade-offs around maintenance, but I think you'd largely recreate them.
Yeah, so if I say CREATE TABLE something AS SELECT, and then I say repeat as column c1, and I repeat some letter a million times, I obviously have a text value of a million letters.
Then I say create index on this value, on this table, on this column.
In this case, B-tree will say index row requires 11,464 bytes, so 11K, right?
Ah, in my case.
But maximum size is 8k.
Okay, like block size.
But if I say using hash, right, using hash, same column, create index.
Yeah, it works.
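The experiment looks roughly like this (a sketch; the exact byte count in the error depends on how well the value compresses):

    -- One row with a million-character text value:
    CREATE TABLE t AS SELECT repeat('a', 1000000) AS c1;

    -- B-tree fails, the index row exceeds the page-size limit:
    CREATE INDEX ON t (c1);
    -- ERROR:  index row requires ... bytes, maximum size is 8191

    -- A hash index works, since only the 32-bit hash is stored:
    CREATE INDEX ON t USING hash (c1);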
So why do I... Okay, I'm an old guy. I have an old habit. I need to get rid of it. Actually, this is a perfect case: large text values. I don't need to hash first and then B-tree it; I can just use a hash index in this case. I will need to think about it. Thank you.
There are some reasons you might not want to do this.
Why?
Okay, well, let's go through the positives first. I promised positives. Size: it can be worse, but it can be a lot better. Speed can be better, not just lookups, but also the average latency of inserts. I definitely had some discussions with Haki when we were reporting these numbers, that we could have done a better job, I think. I hold my hands up, this was a couple of years ago and I've learned a lot since then. But insert performance on a batch, or when we were inserting one row at a time, was about 10% slower, I think, for the B-tree versus hash, which I was a bit surprised by,
because I knew there was some overhead around splits
for hash indexes or higher overheads
some of the time. But look
up performance...
B-tree also has splits sometimes, right?
Yeah, of course.
So we need to compare. In this case, we can't say a hash index has this disadvantage and that's it, right? B-tree also has this disadvantage.
Yeah. They're kind of bigger splits, less often.
Yeah, that's a good point. But look
up speed, I was expecting a difference
and admittedly they're both really
fast, right? We're talking about basically
key lookups.
So B-trees, we've always
raved about how good they are at
key lookups, right?
Yeah, just a few pages to find proper reference.
It's just a few blocks to fetch from disk
or to read from page cache.
I think we only did about 100,000 lookups,
and both of them were under a second to do 100,000 lookups.
So we haven't checked buffers at that time.
Only timing, which is quite vulnerable.
Yeah.
Again, I would do some things differently now.
This is a couple of years ago, but the, yeah, the, the lookups whilst they were both under
a second, I think it was something like 30 or 40% faster for the hash lookups, which
was, I thought, actually quite significant.
So I think there are some cases where in some extreme,
if you've got long strings and you only care about uniqueness lookups,
then I think there's a case for using them.
But that is obviously a very narrow, very specific use case.
I wonder in these tests if you considered the planning time,
because I guess if table is not absolutely huge,
like billions of rows,
in this case planning time can be maybe even a bigger contributor
than index scan itself, right?
So we need to exclude it using maybe prepared statements
and consider only execution.
That would be a good test. But also, why would the planning time differ between index types?
I'm not saying it differs. I'm saying if you're comparing these like latencies.
Yeah, sure.
What percentage of planning time is there and here? So we need to exclude it to compare clean numbers, right?
Yeah, that'd be great.
I don't think we used prepared statements.
I'm pretty sure we didn't.
And honestly, I wouldn't look at timing
here at the first metric.
I would still prefer looking at buffers.
But okay.
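For what it's worth, a sketch of how planning time could be excluded from such a comparison (hypothetical table from the URL-shortener example):

    -- A prepared statement: repeated EXECUTEs mostly measure execution, not planning.
    PREPARE lookup(text) AS SELECT * FROM urls WHERE url = $1;
    EXECUTE lookup('https://example.com/some/long/path');

    -- And buffers rather than timing:
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM urls WHERE url = 'https://example.com/some/long/path';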
But the point is they can be more efficient, and that's partly because, if you imagine a B-tree structure, you might have to go through a few layers, you know, a few hops, especially once your table gets really large. Now, we've talked about partitioning so many times, right? So you could argue that you could keep your B-trees relatively small in the grand scheme of things. But if we do get large enough,
the hash structure is just so much more efficient
and that will show up in buffers as well, right?
So that is the argument, I think.
So you've got potentially smaller indexes
for highly unique, slightly larger data types
and potential benefits on the insert and lookup speeds.
Have you considered collisions?
Because if you have only integer four, obviously you might have collisions,
and you need to resolve this with additional hops, additional comparisons, and so on.
Have you considered trying to measure some edge cases?
Yeah, I looked into this.
I looked into the internals a bit.
Well, actually, there's a couple of really good internals pages
on the Postgres docs on hash indexes.
And there's only a throwaway word quite early on in one of them
that mentions that they are a lossy index type.
So it's not the only lossy index type,
but it means that just because we get
a match for the hash, the 32-bit integer, doesn't mean we do actually have a hit. We could have a
collision. So there is a recheck, and you can see those as "Rows Removed by Index Recheck" type things in your EXPLAIN, but I haven't seen it. Like, we're talking... you'd have to have,
well, I'm not so sure you'd have to have billions,
but I'm guessing you'd start, you know,
in the tests I've done, I haven't seen any.
But I guess you'd pay that overhead for every lookup, right?
You're having to do that recheck all the time
because they're lossy.
It's not just when there are collisions
that you pay that recheck.
So those numbers I was talking about
include the overhead of rechecks.
Okay.
So what are the downsides?
Why shouldn't I use...
So many.
So many, okay.
Index-only scan, as I remember, right?
Yep.
Lack of index-only scan.
Because we don't have the value in the index,
it's literally impossible to have an index-only scan.
Right. That's bad. Yeah.
Yeah. So...
But if I, in my case, if I have large text value
and I consider hash index versus a B-tree over a hash expression...
You can't have an index-only scan there either.
Yeah, because I cannot fetch the original value.
So it's not a reason that would stop me from using hash index.
Okay.
No.
Right.
In fact, have you considered collisions for that?
Yeah.
Well, yeah, yeah, yeah.
You have to implement it yourself.
That's true, yeah.
So I think the big one is ordering.
Obviously not in the case you're talking about where you're hashing,
but the big advantage B-trees have is they order your data,
and that supports so many different things.
And greater than or less than comparison, obviously.
Yeah, range scans.
Yeah, it feels like a limitless number of things that then enables.
Right, and you get to deal with collation issues.
It's so cool.
They can be corrupted because of collation changes.
I mean, that's a good point.
Maybe hash indexes are less likely to get corrupted.
Right.
Okay.
Actually, what if the hash function created a different value instead...
I don't know, it's a good question. I don't know how it's implemented internally. It should not be super difficult, but obviously, how will it hash non-Latin characters and so on? And it doesn't depend on a glibc version change, for example.
It's an interesting question.
I never had this question because, again,
I'm not seeing them
in the wild often at all.
Another huge difference, and I don't think this one is necessarily inherent, I don't think it's like the index-only scans where it's literally impossible,
but in Postgres,
we can't use hash indexes
to enforce uniqueness.
So for unique constraints, which is a big deal.
Like, for example, we can't use them for primary keys.
Well, like, there's no point because we've already got a B-tree index on them.
Okay.
Good to know.
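To illustrate (a sketch with a hypothetical table; the error text may vary slightly between versions):

    CREATE TABLE items (sku text);
    CREATE UNIQUE INDEX ON items USING hash (sku);
    -- ERROR:  access method "hash" does not support unique indexes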
What else?
Multicolumn, right?
I remember now.
Yeah.
Multicolumn.
Exactly.
So hash indexes don't support multicolumn.
You can combine values, or you can, I don't know.
But more to the point, one of the nice things about multicolumn B-tree indexes
is you've also got the sorting, sorted by the first column,
then sorted by the second.
So we just don't have advantages like that. And it ties into the index-only scans as well, right? Like, one of the best things about multicolumn indexes is we then get the potential for index-only scans as well. So a lot of these things interplay.
One difference that isn't necessarily an issue, but that I found interesting and thought you might as well, was that the fill factor for hash indexes is 75 by default, whereas B-tree's is 90.
Really interesting.
Some old decision, I guess. Maybe not, but maybe. I don't know.
We also looked at what it looks like if you change the fill factor to 25, 50, or 100. And at 100 versus 75, you just see it not getting larger as quickly, but when it does, it bounces up at a faster rate. So as the index grows, it grows less at first and then more all at once. So my guess is it's some trade-off between general size and performance and the impact of splits, or the cost of splits.
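If you want to play with that yourself, fillfactor is just a storage parameter on the index (a sketch with hypothetical names):

    -- Default fillfactor is 75 for hash indexes, 90 for B-trees:
    CREATE INDEX urls_url_hash ON urls USING hash (url) WITH (fillfactor = 90);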
And in fact, this is probably the biggest one
you'd want to consider on the difference
between doing a B-tree index on a hash function
versus using a hash index.
And that's how insert performance would look, in terms of probably not average latency, but maybe the standard deviation of latency. So with hash indexes, the way they're structured, we have buckets that values can go into, but beyond a certain size, Postgres works all this out for you. It decides, oh, we need more space, it'd be best to have another bucket, let's split one of the existing buckets. And then we pay that overhead during the insert, right? That can slow down your insert. So my guess is the standard deviation for inserts would be perhaps significantly higher than for B-trees, but I haven't checked.
So another thing I would like to do.
Maybe.
So on high throughput,
there is a note in the
internals document about
high throughput. Let's say you've
got a really high throughput table.
You might actually be better
off with your B-tree index that you mentioned
than a hash one.
Okay.
And multicolumn index, I think we can... I've just checked, by the way, hash_record applied to a whole table row. It works. It produces an integer 4 number.
I think we might probably decide to use it to... I remember I was always using an MD5 hash, converting the row first to text and then MD5. But maybe hash_record is a better choice when we need to, for example, compare the content of two snapshots row by row, you know, like to find which rows were changed, comparing them, joining by ID, by primary key for example, and then we compare hashes of the whole row, right? In this case hash_record works.
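A sketch of that snapshot comparison (hypothetical table names; hash_record is available since PostgreSQL 14, and since the hash is only 32 bits, collisions are possible, so a changed row could in theory be missed):

    -- Find rows whose content differs between two snapshots, joining by primary key:
    SELECT a.id
    FROM snapshot_a a
    JOIN snapshot_b b USING (id)
    WHERE hash_record(a.*) <> hash_record(b.*);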
I wonder, if a multicolumn index is not supported, what if we just combine multiple columns and produce a hash out of them? It's almost the same, right?
It's strange that it's not supported.
I mean, we can still only use uniqueness
lookups, right? So
I guess it's the case where you want
to see if two columns when
combined are equal to
the record you've stored. Yeah.
But I'm not seeing the same benefits.
Okay. I will check a couple of things more.
Yeah.
Any more limitations or that's it?
Don't think so.
I might have missed one,
but like those for me,
they're substantial limitations.
And I think if you're,
imagine when you're designing a system
or you're choosing an index type
for a new feature you're building,
and it turns out that right now the hash might just about come out on top.
Maybe you get a slightly smaller index.
Maybe your lookups are slightly faster.
I'm not sure I'd still advise using a hash index.
I think I might say, look, if your requirements change in the future,
if you ever do need to look up a range of these values, if there's some analytics you might need to do on it or something, you might have been better off with a B-tree index.
You've got that extra flexibility.
You've also got, I think, probably more effort being put into optimizations on them going forwards.
If it's a tight call between them, I might still encourage people not to use them in general.
I never did it before. I just did it now. I used the hash_record function applied to the whole row in a table which has three columns. And then I created an index on top of that hash_record. So for the first time, I created an index not specifying columns, but involving all columns in a table. This is super strange. It could be a B-tree actually as well, because hash_record produces an integer 4, right? So it applies to all columns. So I said CREATE INDEX ON t1 USING hash of hash_record of t1. On the whole record.
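A sketch of what Nikolay describes (hypothetical table; hash_record assumes PostgreSQL 14 or later, and the double parentheses are needed for an expression index):

    CREATE TABLE t1 (a int, b text, c date);

    -- Hash index over a hash of the whole row, the "hash of hash" case:
    CREATE INDEX ON t1 USING hash ((hash_record(t1.*)));

    -- A B-tree over the same expression also works, since the result is just an int4:
    CREATE INDEX ON t1 ((hash_record(t1.*)));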
That's super strange.
I need to think about it.
What use case are you imagining for this?
I don't know.
I'm just exploring, you know, like curious.
I don't know.
Maybe like, again, like to track which records changed content or something.
I don't know.
I don't know. I don't know.
It's just interesting that it works.
I thought we are required to specify columns, right?
When we, like, this column, that column.
But here you just specify whole record.
And it works.
This shows flexibility of Postgres as usual, right?
Yeah.
But I created a multicolumn hash index,
but it's hash of hash. So double hashing here. Okay. So I don't know. I'm still thinking about use cases. Realistic one is only these really large text values. That's it so far for me.
Yeah. And when you only need unique lookups of them.
Yeah, so, interesting. I will keep it in mind.
So yeah, I love that during this episode we've gone from "I never have use cases for hash indexes"...
Yeah, exactly. There we go.
Cool. But I also think, you know, you said you've seen them in the wild.
I think that might be a result of people overthinking it.
And I was there a couple of years ago.
I was thinking,
you know,
there might be some cases where that difference is
meaningful and worthwhile, but for the flexibility,
I think anyway,
I've said that a few times now.
Cool.
Well,
thank you,
Nikolai.
Thanks for indulging me. Hope a few people found that interesting
and useful. And yeah, catch you next time. Thank you, bye.