Postgres FM - superuser
Episode Date: March 15, 2024
Nikolay and Michael discuss the superuser role in PostgreSQL: what it is, how and when it shouldn’t be used, and whether most cloud providers are right to not give us it (no prizes for guessing).
Here are some links to things they mentioned:
superuser (docs) https://www.postgresql.org/docs/current/role-attributes.html#id-1.6.9.6.2.1.2.1.1
Crunchy Data PostgreSQL Security Technical Implementation Guide (STIG) https://www.crunchydata.com/blog/crunchy-data-postgresql-security-technical-implementation-guide-now-available
Supabase docs (unsupported operations) https://supabase.com/docs/guides/database/postgres/roles-superuser
Crunchy Data docs https://docs.crunchybridge.com/concepts/users
RDS docs https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html
Cloud SQL docs https://cloud.google.com/sql/docs/postgres/users
Azure docs https://learn.microsoft.com/en-us/azure/postgresql/single-server/concepts-servers
Roles, Privileges, and Security (talk by Ryan Booz) https://www.youtube.com/watch?v=mtPM3iZFE04
~~~
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!
~~~
Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard
With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I am Michael, founder of pgMustard.
This is Nikolay, founder of Postgres.ai.
Hey Nikolay, how are you doing?
Hi Michael, I'm doing great. How are you?
I'm good also.
So this week, well I was in charge of choosing, but I've actually picked something that you suggested.
I was looking through all the listener suggestions, all of the ideas we've had in the past.
And this was one of yours, right?
I have no idea. I already forgot.
It was my idea? Okay.
So you suggested a while back that we talk about superuser,
and especially superuser in the new normal,
the new context of cloud providers or managed services,
and the fact that we normally don't have
superuser access anymore in those cloud environments, whether we should, that kind of thing.
So a little bit of a refresher on what a superuser is, what it can do, and maybe why we don't have it or why we should have it.
Right, right.
Yeah, good questions.
What is a superuser, first of all?
It's just bypassing all privilege checks, right? This is the idea.
Yeah, I looked it up in the documentation. A superuser bypasses all
permission checks except the right to log in. It's a dangerous privilege and should not be used
carelessly. It's best to do most of your work as a role that is not a superuser. And then it talks
about how to create one. So you can create a superuser with the NOLOGIN flag, right? Or, I guess so.
Yeah, interesting. Actually, I have a white spot in my knowledge here, honestly. But yeah, superuser is what
people sometimes use without thinking about permissions at all, even launching their
services and web applications using a superuser database user. It's a very, very bad practice.
Did you ever do that?
Yeah. It's the default, right?
When you don't know better.
Actually, yes.
And it means defaults here again are not perfect,
because they don't encourage you to create a non-superuser database role,
or user, that you will be using.
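As a rough illustration of the point about defaults, here is a minimal sketch of what creating a dedicated non-superuser role for an application could look like. It is a Python helper that only builds SQL strings and does not connect to anything. The function names, the role, database, and schema names, and the exact set of grants are all assumptions for the example, not something prescribed in the episode; password handling is left out on purpose.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def app_role_statements(role: str, database: str, schema: str = "public") -> List[str]:
    """Build SQL for a login role that is explicitly NOT a superuser.

    Hypothetical helper: the privileges granted here are one possible
    least-privilege starting point, to be adjusted per application.
    """
    r, d, s = quote_ident(role), quote_ident(database), quote_ident(schema)
    return [
        # Spell out the attributes so nobody has to guess what the role can do.
        f"CREATE ROLE {r} LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE NOREPLICATION;",
        f"GRANT CONNECT ON DATABASE {d} TO {r};",
        f"GRANT USAGE ON SCHEMA {s} TO {r};",
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA {s} TO {r};",
    ]


if __name__ == "__main__":
    # "app_rw" and "shop" are made-up names for the example.
    for statement in app_role_statements("app_rw", "shop"):
        print(statement)
```

The statements would still need to be run by a sufficiently privileged role, and objects created later are not covered by the last grant.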
Also, I can remember some companies,
small and big ones, doesn't matter,
which give out a single superuser or multiple ones.
So these are different kinds of sin, different sins.
One sin is: let's use one single superuser
for all people who have DBA access or SRE access, like admin access, and share it.
Or: let's give everyone a superuser, but a named one.
So these are different things.
One is using superuser by default when you work with the database, checking something.
And a different sin is: let's share an account.
Let's share a database role.
Both are not good things, right?
But at least if you split it, if you separate roles and create multiple superusers,
it's already slightly better.
Not slightly, actually.
It's significantly better because you can see who is doing what at least, right?
Distinguish people.
But in general, yeah,
Postgres, how it's organized,
it provokes you to use superuser for everything by default
and you need to make efforts to go out of it.
Most people at least realize this
and at least stop using Superuser for application work.
This is the number one thing to do. Okay, you are going to use superuser for yourself because you
have all the rights. Maybe you're the owner of everything, right? Or you own this database,
okay, you have superuser. It's a separate question: should you always log in as superuser?
Maybe you should log in as a normal user, a regular one with
limited permissions, and only use superuser access if needed. But at least all application code
must not use superuser, right? This is obvious. And why? Like, it's the danger, right?
Well, security, my favorite topic. Security, right?
I'm joking. Well, security is a really good reason, but I think there's also the danger of being
able to drop things, being able to destroy data. It's not just a security
issue, right? Somebody could steal everything, but they could also just destroy everything and not steal anything.
Well, it's also a kind of type of security.
Well, it might be reliability or something.
But still, it's insecure to give everyone... It's not only about somebody outside
of the company stealing data. Even inside the company, somebody
can make a mistake. It means that the work is not secure, right?
But I wish auditors would dig into this hole deeper.
Many companies, I mean Postgres companies, have already reported they have SOC 2,
and some companies go to IPO,
and they have a lot of auditing activities from external auditors, right?
And I know companies who are very well-known,
they have a bunch of questions,
and some of these questions sometimes related to Postgres.
But when you look
at them as a Postgres expert, they usually look funny. So yeah, I think it would be good to
create some standard or something. Actually, there are some standards, right? Crunchy Data,
they shared a big PDF a few years ago, which is aimed at making Postgres setups more secure.
And it was, I think, developed for the Army or something like that.
Yeah, US military, I think, collaborated with them on it.
Right. This is a good thing.
And even better that they shared it and this became public
so you can use it to grab some things.
And obviously, a lot of things are related to permissions
and what kind of database user you use, right?
But if your company right now already got SOC 2,
or is doing this, or an IPO or something,
I would not rely on external auditors to say,
okay, we are good here.
They suck there.
Their questions almost don't cover this topic.
Or at least that's what I saw over the last five years.
I didn't see good questions, honestly.
Some of them were kind of good, but on this kind of topic,
like, do you use superuser?
Do you distinguish users, I mean, humans, and so on?
What is your model for these privileges and so on?
Quite weak.
I'm not an expert, as I usually say.
I'm not an expert in security at all. This is maybe one of my least favorite topics in databases.
But it's super important topic, right?
So, yeah.
Okay, so when should we use superuser?
That was where I was thinking we were going to go.
So you mentioned a while back, I think maybe on Twitter,
maybe just to me, I can't remember,
that there are a bunch of times, maybe when you're with a client
that's using a managed service provider, when it's frustrating to you
that you don't have superuser access.
So I was interested, like, when are those times?
Like, what are you trying to do that you can't do without it
or that's difficult to do without it?
Yeah, there are things that only a superuser can do, obviously.
For example, COPY FROM PROGRAM, a dangerous thing to do,
because you can basically execute any shell code
under the postgres OS user, the Linux user, right? I don't remember exactly, but there
are certain types of things where you definitely need superuser.
So I found a list. I was looking
at all the different cloud providers and whether they do or don't provide it.
And there's a really good list in the Supabase docs
of what is unsupported in their highest-privileged role.
So they don't...
Let me just read it quickly.
"Supabase provides the default postgres role
to all instances deployed.
Superuser access is not given
as it allows destructive operations
to be performed on the database."
And those unsupported operations are...
Destructive. Okay.
Yeah, it was an interesting choice of words.
CREATE ROLE with REPLICATION, CREATE SUBSCRIPTION,
CREATE EVENT TRIGGER, COPY FROM PROGRAM, as you mentioned,
and, of course, ALTER USER with SUPERUSER.
So you can't make other users superusers.
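To make the list just read out a bit more concrete, here is a small sketch that flags SQL statements falling into those five categories. It is a naive, regex-based classifier written only for illustration: the names `RESTRICTED_PATTERNS` and `restricted_operation` are made up, it is not how Supabase or any provider enforces anything, and it can be fooled by comments or string literals.

```python
import re
from typing import Optional

# Labels follow the operations read out above; the patterns are assumptions.
_FLAGS = re.IGNORECASE | re.DOTALL
RESTRICTED_PATTERNS = {
    "CREATE ROLE ... WITH REPLICATION": re.compile(r"^\s*CREATE\s+ROLE\b.*\bREPLICATION\b", _FLAGS),
    "CREATE SUBSCRIPTION": re.compile(r"^\s*CREATE\s+SUBSCRIPTION\b", _FLAGS),
    "CREATE EVENT TRIGGER": re.compile(r"^\s*CREATE\s+EVENT\s+TRIGGER\b", _FLAGS),
    "COPY ... FROM PROGRAM": re.compile(r"^\s*COPY\b.*\bFROM\s+PROGRAM\b", _FLAGS),
    "ALTER USER ... WITH SUPERUSER": re.compile(r"^\s*ALTER\s+(USER|ROLE)\b.*\bSUPERUSER\b", _FLAGS),
}


def restricted_operation(sql: str) -> Optional[str]:
    """Return the label of the first restricted category the statement matches, else None."""
    for label, pattern in RESTRICTED_PATTERNS.items():
        if pattern.search(sql):
            return label
    return None


if __name__ == "__main__":
    examples = [
        "COPY t FROM PROGRAM 'ls'",          # runs a shell command on the server
        "COPY t FROM '/tmp/data.csv'",       # plain server-side file, not in the list
        "ALTER USER bob WITH NOSUPERUSER",   # removing the attribute is not flagged
    ]
    for sql in examples:
        print(sql, "->", restricted_operation(sql))
```

The word-boundary checks are what keep `NOSUPERUSER` and `NOREPLICATION` from being flagged by accident.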
Well, these are destructive actions.
If they are destructive, let's remove them from Postgres.
This is judgment.
Let me judge, right?
I'm the owner of this database.
Or who is the owner of the database?
So let me drop my position.
It's very simple.
Two things.
First thing, for managed providers, managed Postgres providers:
if you don't run in a container, the question is why not?
Run in a container.
Or actually, in a separate VM. You probably run it in a separate VM,
or in a container at least, right?
You mean, like, is it isolated? So if somebody breaks out,
or if somebody else is being destructive, it doesn't affect you if you're a different
customer.
Yeah, a Firecracker microVM or something. A lot of options. But at least a container.
So it's already isolated.
If it's isolated, second point: give me superuser,
because I'm the owner of this database.
That's it.
And they usually say, most of them say,
it's for your safety, right?
But it's bullshit and a lie.
It's a lie.
Because I know for sure that inside big provider, very big provider teams,
this topic pops up from time to time.
And technical people usually say, let's do it.
There are no big reasons to say, like, we protect these users.
Like, when AWS gives you an EC2 instance,
it's a virtual machine with Linux,
and they give you root, right?
Oh, I didn't know.
They give you root, of course.
Well, I have root.
Then people say, okay,
but for Postgres we have a lot of automation
and you can break it
Well, if I break it, I break it, right?
So if I execute COPY FROM PROGRAM, as I did with Crunchy Bridge,
and I move the pg_wal directory, well, this is a destructive action, right?
I run COPY FROM PROGRAM into a table, and I just mv pg_wal to pg_wal2,
and I've got a PANIC.
I mean, the database got a PANIC. I didn't panic.
I got big joy observing that I can destroy my database myself, because
this is how I think: okay, I own it. At least, okay, not directly, but I own it. You know, there is
a very good philosophical statement.
You can truly own only what you can destroy.
Right?
I've not heard it, but it makes sense.
It makes sense in anything.
For example, if you cannot destroy your own company,
if you're a startup guy, founder, you do not own it at all. Maybe investors own it, right?
And you should realize it.
Who can destroy it?
This is like ownership without ability to destroy it.
So here we come to destructive actions, but let me judge it.
Let me judge it and let me feel it, right?
So I don't accept any reasons like that.
We protect you.
You can destroy it, and our support will be fed up with
questions like, "I destroyed something." Well, if you destroy something, you can recover from backups.
That's it. But they also say, okay, we have a lot of automation, and if we allow you COPY
FROM PROGRAM, superuser for example, then you will be able, as a user, to see our automation pieces, right?
Reverse engineer it.
This is already a real reason.
This is the real reason.
They don't tell you the first reason, but this is the number one reason, honestly.
They don't want you to see the automation from
inside.
Crunchy Bridge is great here. They don't care.
I'm not sure if their product is open source. There are doubts
about it because they stopped publishing
images, as far as I know, and so on. But here
they give you a superuser. You can
go run COPY FROM PROGRAM and explore
the directory layout and so on, everything.
Find which programs are there.
Probably try to execute them.
Like, it's your world.
You really own it.
They give you this.
They charge you extra.
Like, you charge me extra and you protect me from myself?
I don't know.
There will be time.
There will be time when people start realizing it,
and I hope auditors will also realize it.
Who owns this database?
Yeah.
Of all the ones I checked,
Crunchy Bridge were the only ones that supplied superuser.
A lot of the other ones create kind of a pseudo-role just below that,
like Supabase. I couldn't find a list of things that were removed from the others.
But Crunchy Bridge definitely deserves credit for that, and especially if you consider they were
the ones that published the security guide, I feel like that's a really good argument that it's
possible. It's just maybe people don't want to do it. I would say there may be some other
reasons. I am not as convinced as you that the
support reason is not a good one, because even investigating whether it was the user that messed
up or us that messed up is actually quite difficult sometimes in support. I don't know if
you've ever had that.
Do you use RDS or Cloud SQL or something?
Do you know how the calls with support work?
Do you see how the calls are usually organized?
I haven't ever used their support.
I mean, any managed service's support.
If you destroy something,
they won't walk you step by step through what happened.
They will offer you to recover from backup, blah, blah, blah, like that.
Of course, there are logs, right?
And possibly, ironically, I have actually seen people have great experience with Crunchy Data support.
This is not sponsored by Crunchy Data, I promise.
But there are providers out there giving really good support.
But I could imagine an argument, especially if you're a provider that provides a free tier, for example.
For the ones that charge more than it would cost you to host your own, I can see the argument that maybe they should provide superuser access.
But if it's a free service... Oh, go on.
So yesterday I had one million rows in the table.
But today it's 10 rows fewer.
Where are those 10 rows, right?
Should I go to support?
Because maybe it's a bug.
Maybe it's a bug of their automation.
Maybe it's a bug of Postgres itself.
Rows disappeared.
If you've ever run a product,
you're going to get some customers that come with support questions that it turns out it's nothing to do with your product.
It happens to all of us, right?
Or maybe I know I executed delete.
Or maybe I know my application couldn't execute delete, right?
So this is the same kind of problem.
It's the same question, the same type of question.
That's why, when they say they are protecting us, I say it's bullshit.
On that note, maybe to move on slightly,
I did actually notice in the Crunchy docs that for the superuser role,
they have pgAudit on by default, to log what it's doing.
Yeah. This is what I suspected.
That's interesting, right?
Yeah, well, yes. If you enable
it for all users, your logs
will be flooded
with a lot of data
and it will become
a performance bottleneck very quickly.
So it makes sense
to enable it for superusers
and capture everything that's happening.
And of course I suspect you could change that if you've got superuser access.
You can change that.
But I think the idea is to help people help themselves, and also to help you support them, maybe with an
alarm if it looks like they are using it as their application user.
People are not stupid in general.
There are stupid people, but there is only a minority of them.
And if you say we protect you, it means you don't trust your own customers.
It's bullshit.
That's why I say it's bullshit.
So this is a good thing to have good defaults.
For example, okay, for superusers pgAudit is enabled.
If someone disables it, the record that
it got disabled goes to the log, so we have footprints of this action, right? But in general,
people can understand. At least the customers I deal with, I see the opposite. I sometimes think, oh, I need to
explain this to you, but they're quite smart. They're like, okay, we already got this.
So they can understand what's happening.
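As a sketch of the "audit only the superusers" default discussed here, the helper below builds per-role `ALTER ROLE ... SET pgaudit.log` statements instead of turning auditing on for everyone. It assumes the pgAudit extension is installed and loaded via `shared_preload_libraries`; the function names are hypothetical, the role names are made up, and this is not a description of how Crunchy Bridge configures it.

```python
from typing import Iterable, List

# Catalog query an admin could run first to find the roles to audit.
LIST_SUPERUSERS_SQL = "SELECT rolname FROM pg_roles WHERE rolsuper;"


def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote an SQL string literal by doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def audit_statements(superusers: Iterable[str], classes: str = "all") -> List[str]:
    """Build one per-role pgaudit.log setting for each given superuser role.

    `classes` is passed through as the pgaudit.log value (for example
    'all' or 'ddl, role'); which classes to log is a judgment call.
    """
    return [
        f"ALTER ROLE {quote_ident(role)} SET pgaudit.log = {quote_literal(classes)};"
        for role in superusers
    ]


if __name__ == "__main__":
    print(LIST_SUPERUSERS_SQL)
    # "postgres" and "alice_admin" are example role names.
    for statement in audit_statements(["postgres", "alice_admin"]):
        print(statement)
```

A per-role setting applies to new sessions of that role, and a superuser can still reset it, which is why the point about the change itself being logged matters.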
My point is that they don't give this to you to protect themselves, not you.
They want to be protected.
They don't want to share what they've got.
And this is a business decision.
It's not a technical decision. So they don't want to share the automation and how exactly they adjust Postgres and so on.
But I hope people will start realizing this and cases like Crunchy Bridge will be more common.
And people who are true open source believers and lovers will shift to a more open and more trusting approach,
like trust your customers.
They can decide if it's destructive or no, right?
And just keep everything open, share your automation, and give ownership and access
to your customers. And in this case,
the premium usually quite significant over infrastructure costs
will be reasonable to pay, right?
Now there is an imbalance in this world, as I see it.
And I hope auditors will also realize it
and start asking questions.
Who is the actual owner of this database?
Yeah, interesting. Good point. But what about free? I completely
take the point on providers that are charging a premium over what it would
cost you to run the service, but what about the ones that are offering you a
free tier, like Neon or Supabase, or, yeah, there are some newer ones as
well.
How is it different?
Well, I actually don't know. Maybe it's a premium feature.
I think it will end up in more support, personally. Even
if you don't agree that it should end in more support, I think it would.
Overall, I don't think it will be a big part of the whole picture.
I mean this kind of question, like "I destroyed my database." Of course, if
it got destroyed and you didn't do anything, this is a good question. But if you have
pgAudit set up, you can check the logs, and support
can easily point to a small
how-to on how to understand what
happened, and that's it. And
support usually doesn't
look inside your database, right?
This is your area.
Like RDS support, for example: you
need to have
a high level of support and convince them. Like,
they usually check only
the VM, underlying things, and
infrastructure things, right?
What's happening inside your database? Who deleted rows?
Or who moved, for example, pg_wal?
I still say this is the same
level of things. It's your
area.
Yeah, true. I mean, I think even...
I know many people won't agree with me, actually.
Well, I actually don't know if they would anymore.
I think there's an increasing education around data processing,
especially with all the privacy laws in the EU and in California,
around whose data is it and who's processing the data,
who owns the data.
Right.
Yeah, in the true spirit of some of those laws,
if this is my data, I should own it
and be able to destroy everything,
not only at the logical level, like DELETE.
Maybe I want to destroy pg_wal right now myself.
I don't know.
This is true ownership.
And see how it works inside. I need to feel it, inspect it. This is true ownership, in my mind.
I can see the argument a little bit for things like replication. Let's
say some of the cloud providers have, for example, really easy single-checkbox high availability or something,
and then you go and destroy replication,
or you mess something up that means that's not working anymore.
Usually such providers don't use Postgres replication.
This checkbox usually is not based on Postgres replication. It's usually underlying block storage device replication, usually synchronous.
So this is what I know about RDS, Cloud SQL, and so on.
So usually this is lower level.
Cool. Makes sense.
Maybe you can't even mess that up.
Great.
So anyway, we live in a world where, honestly, I think most developers
look at what RDS did or Cloud SQL did
or the Microsoft guys did,
and they just copy this approach,
not thinking deeper.
Crunchy, as you mentioned,
are security experts, obviously,
and they are brave.
So, kudos, actually.
I already said this a couple of times on Twitter.
This is great.
But others, like, say, smaller
or new providers,
just copy decisions from bigger
providers and copy their arguments, like
this is to protect users, this is to protect customers.
This is for your own safety.
Let me decide what is destructive.
I know quite a few people who work at
managed service providers listen. It would be great to hear from you
if there's something we've missed
or if there's
a way of explaining this that
would be better. Let us know.
Actually, I can't criticize this easily because I don't develop a managed service,
a managed Postgres service, right?
Yeah.
Because in this case, I would need to be more careful,
because I would have my own situation.
But honestly, several times I saw this being discussed: how to implement it,
how to protect, what we should protect.
If you provide superuser, what kinds of dangers exist. And this list is obviously COPY FROM PROGRAM, foreign data
wrappers probably, and so on, the dangerous parts. So if you think about it, you can implement a good model. And I think, of course,
maybe Postgres could provide some additional tools
to restrict certain areas.
But if it's my database,
I should decide what to enable, what to disable,
and at which point for whom, right?
Yeah, I don't know if Postgres can do anything about this
because ultimately these are features that we need.
Like, well, someone needs to be able to do them.
Well, maybe if no one needs to be able to do them,
they shouldn't be in Postgres at all.
But if someone wants to be able to do them,
it seems silly to me to disable it
just because we want to be able to host it in clouds.
Yeah, well, I want to disable it sometimes.
For example, in Database Lab,
we have Joe bot, right?
And we want as much freedom as possible
for end users to execute
any SQL. But if this
"any SQL" is COPY FROM PROGRAM,
it can be dangerous, you know,
or a foreign data wrapper, because
maybe the end users
are still inside the same team,
but maybe the admin doesn't want them
to execute it. And we just want
to, like, okay, we want to protect here.
Some users still
have full access, the admin decides,
right? But some users,
who work only at
this SQL
experimentation level,
are restricted. And at some point
we removed all possibilities
of COPY FROM PROGRAM, foreign data
wrappers, and so on. So they cannot
do harm even if they are
inside the same team. But the admin
decides which permissions to provide,
right? So at the end
of the day, you own this database
and you decide what to do.
When I say you, I mean admins, right?
Because inside, bigger customer,
there might be some additional users, right?
And a good tool can provide the ability
to control those permissions.
Yeah, and that's what we're saying, right?
We're not saying RDS should provide superuser to every single person who has access to the RDS dashboard, but to, like,
an admin. That's what we're asking for, right? One person or a small group of users.
Right, right. Well, of course, there is a chain reaction. If, for example, RDS says, okay, we provide superuser,
then what to do with access to WAL files, backups, physical backups,
physical replication connections, and so on?
There is a chain reaction here, right?
And they restrict you here partially because it's a kind of vendor lock-in, obviously.
Right?
So they also restrict a number of things,
like, for example, recovery target LSN.
So you cannot perform zero-downtime upgrades
with our approach, which involves recovery target LSN.
You can only do it with slot advancement.
We call it the Instacart approach, right?
From the Instacart article, which in my opinion is more risky.
Interesting.
Right.
So recovery target LSN, you need to provide it.
So, I mean, if you provide superuser,
this is a Pandora's box in terms of decisions
about what to provide to your users.
Well, it's Pandora's box, but it's
making all of those
decisions at once and saying we let you do
anything.
Yeah, that's
cool. You are the owner. You
decide what to do. We provide you automation.
We charge for it.
That's it. Now we
live in a different world. We
provide you some automation.
We hide a lot of capabilities
Postgres has from you.
We restrict you
and charge for it.
It's not fair.
But this is the most popular approach right now.
This is about,
we've had discussions about managed services, right?
Second episode, actually. Maybe it's a little
bit of a loop.
Right, right, right. Yeah, I feel
the loop indeed, but
a different angle completely.
So we had the previous episode also
echoing the first
one. Interesting.
Next topic should echo the third
one, right? Let's do it.
Anyway, okay. that's it.
Podcast wraparound.
Yeah, right.
So maybe this discussion was not
too technical, right?
You can go read the documentation
if you want technical.
Yeah, and there were good talks on roles and
security and just
covering the basics. There's loads of documentation on this kind of thing.
We can include some links.
Right.
Yeah, we had some philosophical discussion.
Do you own your database?
RDS users, do you own your databases?
Who owns it?
Maybe people don't care about it until some auditors decide it's not true ownership.
But probably, the way the world is organized, they will never raise this
question. Everyone is happy. So I don't know.
And at least there's choice out there now.
Like at least there is at least one provider that does.
And you can self-host.
You can even, you know, use the cloud and manage it yourself.
So I also say I want to own my bloat, right?
Because RDS doesn't allow you to take your physical...
Logical replication.
Yeah, only logical.
You lose the bloat your data files have.
And you cannot stop thinking about bloat because it's Postgres.
Bloat is central to the architecture.
So you need to understand how this works if you want good performance.
So, yeah, interesting.
So you work only at some abstraction level,
and you cannot copy files.
Okay, maybe enough.
Thanks so much, Nikolay.
Thanks, everyone.
And see you next week.
Bye-bye.