Postgres FM - Multi-tenant options
Episode Date: June 20, 2025

Nikolay and Michael are joined by Gwen Shapira to discuss multi-tenant architectures — the high level options, the pros and cons of each, and how they're trying to help with Nile.

Here are some links to things they mentioned:

- Gwen Shapira https://postgres.fm/people/gwen-shapira
- Nile https://www.thenile.dev
- SaaS Tenant Isolation Strategies (AWS whitepaper) https://docs.aws.amazon.com/whitepapers/latest/saas-tenant-isolation-strategies/saas-tenant-isolation-strategies.html
- Row Level Security https://www.postgresql.org/docs/current/ddl-rowsecurity.html
- Citus https://github.com/citusdata/citus
- Postgres.AI Bot https://postgres.ai/blog/20240127-postgres-ai-bot
- RLS Performance and Best Practices https://supabase.com/docs/guides/troubleshooting/rls-performance-and-best-practices-Z5Jjwv
- Case Gwen mentioned about the planner thinking an optimisation was unsafe
- Re-engineering Postgres for Millions of Tenants (Gwen's recent talk at PGConf.dev) https://www.youtube.com/watch?v=EfAStGb4s88
- Multi-tenant database: the good, the bad, the ugly (talk by Pierre Ducroquet at PgDay Paris) https://www.youtube.com/watch?v=4uxuPfSvTGU

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:

- Michael Christofides, founder of pgMustard
- Nikolay Samokhvalov, founder of Postgres.ai

With special thanks to:

- Jessie Draws for the elephant artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I am Michael, founder of pgMustard, and I'm joined as usual by my co-host, Nikolay,
founder of Postgres AI. Hey, Nikolay.
Hi, Michael. Hi.
And today we are joined by special guest, Gwen Shapira, co-founder and chief product
officer at Nile to talk all things multi-tenancy.
Hello, Gwen. Thank you for joining us.
Thank you for having me. It's very exciting for me to be on your show.
Oh, we're excited to have you. Thank you. So to start, perhaps you could give us a little bit of background
and what got you interested in the topic of multi-tenancy in general.

As many things, it started with an incident, but this time not actually one of mine. So when my co-founder and I started Nile, we actually started with a very different idea. After about nine months, we were like, this idea is not working very well.
We developed some things.
We're not finding the market we hoped to find. We're sitting in a hacker space over in Mountain View.
And we're like, what did we learn?
We talked to 200 companies, all doing SaaS.
What have we found out as a result?
And the thing that called to us is that very early on,
basically when the first lines of code are getting written,
you have to choose a multi-tenancy model.
And then about two, three or four years later,
depending on how fast you're growing,
you have to change it.
And we heard a lot of stories
on what caused people to change it
and whether they regret earlier choices or they're like
we didn't know better and how things went for them and then we started
looking at different blogs, and we found so many by very famous companies, with
either incidents where something that was done to a single tenant caused a whole chain
of events that took down their entire system, sometimes for days.
And also a lot of slightly better stories about how we sharded our highly multi-tenant database.
And we found story after story after story on how people had to re-architect their entire database,
which is, as you guys know, extremely painful to do after you're a successful company three years in.
And we're like, this is a good problem. So many people have it. It is so common.
My past was in databases, not as much Postgres, more Oracle and MySQL, but I've seen this problem again and again in all kinds of companies. My co-founder has seen this problem again and again in all kinds of companies. This is such a good problem.
Everyone has it and nobody's working on it. Why is nobody working on it? And that's how we got into it.
Yeah, nice. Should we go back then in terms of how you like to describe the different models, or the different options that people have in the early stages?
Yeah. So everyone basically starts with one out of two. And I'm using AWS terminology, even though there are other terminologies that people apply to it. AWS calls it the pooled model versus the isolated model. And in a pooled model, you basically create your tables as normal, and then add a tenant ID column to each and every
one of your tables, pretty much. Some, maybe not everyone, some have like shared data,
but most of your tables are going to end up with a tenant ID column that tells you which
tenant this row belongs to. Very easy when you start out, and all you have to do is sprinkle some WHERE clauses every now and then, and you're pretty much good to go. How hard can it possibly be?
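The pooled model being described here can be sketched in a few lines of SQL (table and column names are purely illustrative, not anything from Nile):

```sql
-- Pooled model: one set of tables, every row tagged with its tenant.
CREATE TABLE tenants (
    id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE invoices (
    id        bigint GENERATED ALWAYS AS IDENTITY,
    tenant_id bigint NOT NULL REFERENCES tenants (id),
    total     numeric NOT NULL,
    PRIMARY KEY (tenant_id, id)
);

-- Every application query has to remember the extra WHERE clause:
SELECT sum(total) FROM invoices WHERE tenant_id = 42;
```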
The places where it gets you is that you have no isolation. Everyone is in one big pool. And if you have a problem tenant, if someone grows really, really large, suddenly queries start getting slow for them.
If you need to do an upgrade and one customer absolutely refuses to accept
changes or needs their own time window.
There are a lot of different ways you may discover that. By putting all your customers
in one big pool, you save a lot of effort,
you save a lot of money.
This is by far the cheapest option you're going to have.
It's shared resources in one database,
but you are not allowing yourself to do anything specific
for any one customer, should they need it.
The other approach is basically the reverse.
It gives every customer its own database, or sometimes its own schema. This still counts
as isolated, even though it's not all that isolated. You share quite a lot in that scenario,
but the schema is separate, and it's quite a bit easier to move them out if needed, if the schema is separated.
And in this scenario, first of all,
there is a nice benefit that you get help
from a lot of popular frameworks.
Ruby has, I think it's called the Apartment gem,
for having this kind of multi-tenancy model.
Django has something. So a lot of very popular frameworks
have something that helps you in that model.
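A schema-per-tenant flavour of the isolated model might look roughly like this (names are illustrative; this is the pattern that frameworks like Apartment automate):

```sql
-- Isolated model, schema-per-tenant: same table shape, one schema per tenant.
CREATE SCHEMA tenant_acme;

CREATE TABLE tenant_acme.invoices (
    id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    total numeric NOT NULL
);

-- Switching tenants is just a matter of changing the search path:
SET search_path TO tenant_acme;
SELECT sum(total) FROM invoices;  -- no tenant_id column needed
```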
But if you accidentally grow to a large number of databases,
it starts being very painful.
Obviously, a database with a large number of objects
is no longer as easy to work with.
Suddenly you start learning how much space in memory
the catalog can really take when you have connections.
Suddenly you start learning that pg_dump
can take a very long time.
If you actually have a database for each tenant,
then doing any kind of maintenance on 100 databases
is already not fun.
If you end up going into a thousand databases,
it's really not fun.
And if you think about it,
a lot of SaaS companies have customers in the hundreds of thousands,
not just thousands.
So it becomes very painful exactly when you grow.
Yeah, for example, upgrade includes dump and restore of the schema, and we had cases with 250,000 tables and a million indexes. Well, indexes are not dumped there, but tables... exactly, this is worse than not being dumped, right? If only we could dump indexes. Yeah, and then you need to update statistics after upgrade
for all of those tables.
It's a nightmare, honestly.
Yeah, on the other hand, at least you get to do it to a customer at a time.
Imagine that everyone is in one really big database and now you have to upgrade all of
them together.
Yeah.
Yeah, it's painful, all this.
Yeah, and then there is the mixed model, which I think pretty much everyone ends up with, where you start with the pool model and, as you grow, you shard it. But you're actually pretty smart, and you realize that not all customers are of the same size, and you can have some dedicated shards for your biggest or most sensitive or most demanding customers. And this is, I think, if you look five years into the life of the company, I would say that this is the dominant model. Some variation of: we have shared databases with a pool, and then some dedicated databases for specific customers.
What is the problem we're trying to solve? Is it security or performance, or both? Because if we go back... some customers share this pool, and that affects the security goal, right?
So we don't achieve it.
Absolutely.
So this is, first of all, one of the drivers
that make people actually start with the isolated model.
They know that they're going into a sensitive area.
They're focusing on SaaS for healthcare, SaaS for finance.
Those companies definitely start with isolated model
and try to figure out how to manage
large number of databases.
A lot of times those companies don't become huge.
There are only so many hospitals in the United States,
but they still have to build all the tooling to manage a large number of isolated databases.
For other companies, it's more complicated. I would say maybe 70% of the time, the reason for eventually moving customers out and sharding and isolating would be performance.
It's amazing how many performance problems can be solved by just having less data in
each database.
On the other side, there are the stories where, two years in, suddenly a very sensitive customer shows up, or you want to sell into... like, you thought you were building a normal CRM or some kind of a RAG database, but then a healthcare company shows up, a bank shows up, or even worse, a government shows up, and they show up with a list of demands.
And since they usually have good amounts of money to back those demands, there
is a lot of incentive to figure out a solution for them.
Nice.
We have one kind of ugly duckling in the Postgres world that I'm not sure
quite fits either of these models.
I wonder if it's worth discussing row-level security briefly, because if I was to
bucket it based on those definitions, it's kind of the pooled model in a way,
because all of the data is together, but there is some isolation between tenants.
Absolutely.
Yeah.
Yeah.
I have a love-hate relationship with RLS.
I think a lot of people do, because you're right.
On one hand, it's absolutely a lifesaver in the pooled model.
Developers make mistakes as joins and conditions get more complicated.
It's very easy to misplace a WHERE clause and actually leak data that you don't want to.
So RLS will prevent you from doing it if you do it right.
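As a minimal sketch of what that looks like (illustrative names, assuming a pooled `invoices` table with a `tenant_id` column):

```sql
-- Enable RLS so the tenant filter is enforced by the database itself:
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;

-- Only rows belonging to the current tenant are visible:
CREATE POLICY tenant_isolation ON invoices
    USING (tenant_id = current_setting('app.tenant_id')::bigint);

-- The application sets the tenant once per connection or transaction:
SET app.tenant_id = '42';
SELECT sum(total) FROM invoices;  -- other tenants' rows are invisible
```

Note that table owners bypass policies unless you also run `ALTER TABLE invoices FORCE ROW LEVEL SECURITY`, and superusers and roles with `BYPASSRLS` bypass them regardless.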
It turns out that a lot of times the rules get complicated and then it leads to bugs.
It also turns out that a lot of times the rules get complicated and it leads to terrible
performance.
And one thing that developers really don't realize, I'd say almost no developer realizes it until they run into it:
the WHERE conditions that RLS introduces are not optimized like the
WHERE conditions that you introduce.
Because Postgres, thankfully, is very good about security,
it treats the WHERE conditions in RLS differently.
I think they call them security quals or something like that.
And they are very, very conservative
on how they optimize it and how they plan for it.
This has benefits.
You get very strong security guarantees,
very few bugs as a result.
This is fantastic.
On the other hand, the plan will sometimes be
significantly worse than what you would come up with
if you were to look at it really hard and do it yourself.
And with RLS, there is basically no way to force the plan you want.
Like, you cannot use the set enable_... options to force different plans, because again, the
main overriding rule is that the planner is very conservative in how it optimizes those
RLS conditions.
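One commonly suggested mitigation, which the RLS performance guide linked in the show notes also covers, is wrapping function calls in a scalar sub-select so the planner can evaluate them once up front rather than per row. The two policies below are alternatives shown side by side for comparison, not meant to coexist, and the names are illustrative:

```sql
-- Naive policy: the function call may be evaluated for every row.
CREATE POLICY tenant_isolation_per_row ON invoices
    USING (tenant_id = current_setting('app.tenant_id')::bigint);

-- Wrapping it in (SELECT ...) lets the planner compute it once
-- as an init plan, which often restores index usage:
CREATE POLICY tenant_isolation_cached ON invoices
    USING (tenant_id = (SELECT current_setting('app.tenant_id')::bigint));
```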
So some people call RLS a performance killer.
I wouldn't necessarily go this far, but you can definitely run into gotchas, and you need to be aware that it's not a normal WHERE condition that you're looking at.

Yeah, nice. So what does Nile offer today, and what is the ideal solution to all this, in the Postgres context of course?
Yeah, so basically we wanted to do maybe three things.
First of all, give isolation while not degrading the developer experience.
So for example, we partition data by tenant out of the box for you, completely
transparently, because we know that a bit later on you're going to want it, and
it's going to be a pain in the ass to add it.
We shard it transparently. Basically, your database
may be spread across multiple different shards. We will route the queries for you and make sure
that they work as issued. So in a way, you get the model you will have anyway in four to five years, but you're getting it from the get-go.
And without doing a lot of the work because we are doing a lot of the management for you.
The other thing is that we have done some work to basically bypass RLS while still giving you isolation. So in the queries, you kind of do the same as with RLS: you set the tenant ID equal to something.
We use that to actually direct queries.
We rewrite the queries immediately to the partitions that we know
have the data for that tenant.
So we have a small extension that kind of replaces table names with
partition names in the query itself.
And this vastly improves performance in the majority of cases.
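Nile's extension isn't public, so this is only a conceptual illustration of the rewrite being described: a tenant-scoped query against the parent table becomes a query against one leaf partition, sidestepping both RLS checks and run-time partition pruning (names are invented):

```sql
-- Parent table, list-partitioned by tenant:
CREATE TABLE invoices (
    tenant_id bigint  NOT NULL,
    id        bigint  NOT NULL,
    total     numeric NOT NULL
) PARTITION BY LIST (tenant_id);

CREATE TABLE invoices_t42 PARTITION OF invoices FOR VALUES IN (42);

-- What the application sends:
SELECT sum(total) FROM invoices WHERE tenant_id = 42;

-- What a rewriting extension could execute instead:
SELECT sum(total) FROM invoices_t42;
```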
I mean, we've seen it in a bunch of cases, especially if you have slightly weird indexes
that RLS may conservatively not use.
The improvement is quite stark. Depending obviously on table sizes, you can get a benchmark that proves anything, so I don't want to throw
numbers out there. But obviously, if you break down a table with a million rows into 1,000 tenants with 1,000 rows each... you can see where I'm going with that. The other thing we did, and that was
probably the most work and this is still work in progress, is allow moving tenants around,
because one of the biggest problems is that the tenant gets large or noisy and you want to give it
its own machine. Moving it is usually a long downtime. If you catch it after the tenant
is already large by doing the compute storage separation, we can basically make it transparent.
It's a latency spike while we're holding off some queries, while we're moving things like sequence IDs, and pointing a different compute at the same part of the storage.
But it's essentially a no downtime operation.
So we think it's a huge deal.
Because again, it's just a problem that we keep seeing again and again.
Yeah, I'm curious, is it all open source what you build or only parts of it?
Right now it's mostly hidden. We have started releasing parts of it under an Apache v2 license. So yeah, the goal is to open
source it, and we already publicly declared that it's going to be open source. We have
not a date, but a point of completion where we plan to open it.
Yeah. So I've heard several interesting things here. One is an extension for this... I guess in your documentation it's called virtualization, like RLS virtualization, or how is it called?

We call it tenant virtualization. The extension itself, I think we called it Karnak. We name everything after stuff in Egypt, and Karnak is a famous temple.
So data is stored in separate tables in the same database, but the extension rewrites queries to basically route each query to the proper table, right? Is this based on...

It's in separate partitions, and we rewrite queries to go to the correct partition. We basically bypass RLS; we bypass the planner trying to make those calls. We found out that, with a large number of partitions, this is significantly more efficient.

I see. So it's a Postgres partition.
I see. And another thing you mentioned here is sharding, right?
Yeah. So we use foreign data wrappers to allow that. We have two things in the architecture. First of all, we have a proxy, a routing proxy. It keeps track, for every connection, of which tenant is the current tenant, and it routes it to the shard that has the correct tenant in it.
And then we also have some cases where some developers want to write queries
that touch multiple tenants.
Those are not going to be as fast,
but we do allow them by use of foreign data wrappers,
mixing partitions with our own partitioning rules
with foreign data wrappers, and still keeping things efficient.
We didn't want the planner on any machine to be aware of all
the partitions in all the other shards, because it just explodes the planning
time in ways that we saw as unacceptable.
So what we did is represent each shard with a table and then put hierarchical table inheritance on top of it.
And the end result is basically a union all between the table with the table inheritance
that points to all those other shards and the table with the partitions. Now this gives us basically predicate pushdown,
because the planner will push the predicates down to all those different shards.
Only the other shards know that they have partitions,
which the source planner doesn't know about.
They will plan correctly with all the partitions,
but each one only knows about its own subset of the partitions.
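A rough sketch of that layout, assuming `postgres_fdw` and invented host and table names; the real implementation is Nile's own:

```sql
-- Remote shards are reachable via postgres_fdw:
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER shard2 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard2.internal', dbname 'app');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard2
    OPTIONS (user 'app', password 'secret');

-- Plain parent table; inheritance (not declarative partitioning)
-- is what allows foreign tables as children.
CREATE TABLE invoices_all (
    tenant_id bigint  NOT NULL,
    total     numeric NOT NULL
);

-- One foreign table per remote shard; the local planner sees a single
-- remote table, never the remote shard's partition list:
CREATE FOREIGN TABLE invoices_shard2 ()
    INHERITS (invoices_all)
    SERVER shard2 OPTIONS (table_name 'invoices');

-- A cross-tenant query becomes an append over local and foreign
-- children, with quals pushed down to each shard:
SELECT tenant_id, sum(total)
FROM invoices_all
GROUP BY tenant_id;
```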
So we see it's a bit hacky and it's a bit
weird to explain. And we think we can do better with
some modifications to Postgres, which we have not
done yet. But this does give us predicate push down,
fairly fast planning and the ability to do queries that
cross tenants in situations where this is
required.

Yeah, if you just rely on foreign data wrappers without 2PC, no two-phase commit, I'm curious what kind of anomalies can happen there.

Yes, and we prevent a lot of things that could cause anomalies.
So we do have a transaction coordinator,
but in order to not overload
and also not overcomplicate our architecture,
we limit some things.
So DDL has to be done on a single tenant, and you cannot mix cross-tenant queries... sorry, DML, inserts and updates, have to be done on a single tenant, and you cannot mix cross-tenant queries in a transaction. So the moment you start a transaction, you have to know what tenant you're working on. And then we route it to the correct shard, which has the correct table, and everything has the absolutely correct guarantees. If you need to do something cross-tenant, you do not involve it with any kind of update. You could still be exposed to some anomalies, I agree, because
there could be ongoing transactions from other people in other places. So you get the basic
read committed guarantees that Postgres gives you, I believe, but not anything more than
that. But yeah, again, we believe that cross-tenant queries are rare and mostly done in analytical
cases where you do reporting where it's slightly less critical to have those.
So you forbid writing to two shards in one transaction.

If you want to write to a shard, it's fantastic. You tell us what tenant you are writing data into, and we will direct you to the correct shard.

I mean, if there is a transaction which needs to write to two different shards, this is a big problem without 2PC.

Yes, exactly. We don't let you do that, essentially, in order to avoid anomalies.
I see. Another question here: have you considered the approach used in Vitess? As I understand it, maybe I'm wrong, for most of the analytical queries, maybe to avoid distributed transactions, data is brought asynchronously from one shard to another, and you have it locally, basically a kind of materialized view on top of logical replication, for example, or something. And it has an eventual consistency approach, of course, but you can just join it in one Postgres, in one shard, right? Have you considered this approach?

We have considered it. I think maybe CitusDB has something similar.
If I remember correctly, I'm not 100% sure.
But yeah, it's something that we are like, yeah, this is a good idea that we may examine
in the future.
It's definitely we are trying to build something useful gradually.
And we understand that early on, it's almost safer to have a bunch of limitations that over time we resolve rather than allow people to do something unsafe.
And also, yeah, you know, build a kitchen sink.
Sounds to me like the Postgres versus MySQL approaches. Because the MySQL approach... you remember MyISAM? Maybe you don't remember, but it was quite bad. You needed to run REPAIR TABLE all the time, because... yeah. It stayed that way for so long... yeah, it allowed too much.

Exactly. Yeah, and MyISAM had a lot of issues. I mean, there is a reason why InnoDB became extremely popular.
Right, right.
But also with a multi-tenancy use case, I think you're quite right that, well, we,
I mean, you'll find out soon enough, right, if lots of people want these cross-tenant
or cross-shard queries, which are by definition cross-tenant queries. And if they don't, if
you don't need to worry about it, you save a bunch of effort having to even implement that.
So yeah, I like that a lot.
Yeah, last comment here: I'm excited to see
that finally the Postgres ecosystem receives some attention
in the area of sharding. I guess the time has just come,
and more and more databases have become too large to be handled.
I'm almost surprised... I'm honestly surprised it took that long. I mean, again, if you look at MySQL...

It's just unfortunate. The first time I touched this topic, it was 2006, immediately when we started working with Postgres, honestly. And there was PL/Proxy from Skype at that time already, but it required you to write everything in functions.
Partitioning didn't exist back then.
It existed. It was based on inheritance. It required much more manual work...
It was fun, actually. You understood it better, you know.
But yeah, but it was not convenient, not super convenient.
It's just I see that just unfortunate how it turned out in Postgres ecosystem.
And now definitely there is a huge pressure.
Many companies need partitioning.
How would history have worked?
Imagine that YouTube picked Postgres and not MySQL as their first database.
Yeah, Google or Facebook, they both chose MySQL somehow, yeah.
Vitess would have been for Postgres first, right?
Exactly.
It could have turned out so differently.
Yeah.
It's funny, isn't it?
Changing topics slightly: LLMs are quite top of mind at the moment.
Are you seeing people's initial choices change as a result of asking for advice earlier from our robot friends?
Oh my God. We're seeing so many weird things.
Like it's just unbelievable how much things are changing. First of all, we're having people show
up on our Discord and say things like, I'm using Nile because my LLM thought it's a good idea.
And I don't really know Postgres, so I need some help, but my LLM assured me that this is still a good idea. A lot of people are very much beginners. Like, maybe I can say: when I started developing, the first time I had to use a database, my company sent me to a three-week database class.
I think it was Oracle about 20 years back.
And I came back a lot more confident that I know how to use Oracle
and not to leave transactions open too long because people will yell at me.
These days, people don't do the three week class before they try using a database.
So you see a lot more people start using a database earlier on, and they do expect more hand-holding from the vendors. The LLMs give them advice up to a certain point, but eventually, if things are slow, they will come to you and say, hey, why is my query slow? I'm sure you guys have seen your share of that. I'm also seeing people use Postgres for their LLMs in different ways. And this is really exciting to me. People use Postgres via MCPs, people use Postgres with vectors,
people building AI applications on top of Postgres. We're seeing a lot of that. And I mean, personally, I'm really excited that people can program with LLMs, not knowing
a lot about Postgres, not knowing even a lot about software engineering at all, and
still get reasonable security guarantees.
You don't need to know to ask your LLM about RLS,
or about: are you sure this query
actually properly isolates tenants?
And it's also interesting how much the results differ
when people use different LLMs.
Like, I would say, thinking models do fairly well,
iterating in order to get good code
and checking their own results,
again, given access to a database via MCP.
I would say that if you use ChatGPT with GPT-4o,
you will get a lot of random hallucinatory stuff still in your code.
Yeah, it's funny. I already told Michael we had these cases doing consulting. Maybe almost a year ago already, I started noticing that people send us something like: we are building this part of the database, can you review it?
We review it; we use different LLMs supporting this review.
And then we have a call, and I'm curious: the code looks great, I mean the schema looks great, but
something is off.
And then we have a call, and I see they have open tabs with ChatGPT and Claude there as well.
So I realized they used an LLM to create the schema
and then sent it to us for review.
And we use LLMs to review it.
And then it's like a four-party process.
It leads to a good place, but someone
needs to jump in with proper expertise
and say, this is not a good approach.
This is hilarious.
Do you think at some point you and your customer can just step out and let the LLMs figure
it out between themselves?
Well, there's a problem here, because it's great, but it does 80% of the job with 1%, not
even 20%, of the effort, very quickly.
But there is the 20% of problems which, again, I don't know, maybe in the next few years
it will change.
I think it will change.
But right now I feel my internal LLM is trained much better than ChatGPT.
You trained your own LLM, right?
Yeah.
No, I mean my own.
Oh yeah.
No, but I think you actually trained your own.
Yeah, we experiment. We do some stuff, we experiment, we have some things: we started with fine-tuning, moving toward our own LLM, but we're not there yet. So yeah, there is a lot of stuff that can be done there properly. It's hard to compete with Claude, and they have a very high pace. With the Claude 4 release, I see, wow, it's really great.
But it's still missing many things you learn from practice, which were not discussed anywhere. That's
why they don't bring it. Many problems were not discussed yet. And you explore them if you have a lot of data and heavy workloads.
So you're saying that even training on the Postgres mailing list doesn't have all the information in it essentially?
Well, yes. For example, a random problem we recently touched:
there is the buffer pool in Postgres, and it basically has 128 lock partitions.
So you can have 128 locks, and if you have a huge buffer pool, this becomes a bottleneck.
And some people say, let's maybe make it tunable, configurable.
But when you start researching this topic,
you end up finding a recent, well,
recent as of August and September last year,
conversation on the hackers mailing list, which is open-ended.
It's not complete, because somebody
needs to run benchmarks and prove that this is
worth having a new setting.
And that's it.
There's a patch proposed, but that's it.
There are no experiments yet.
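For context, the 128 being discussed here is `NUM_BUFFER_PARTITIONS`, a compile-time constant that splits the buffer-mapping lock into partitions. One way to look for evidence of this bottleneck (a sketch; it only shows instantaneous waits, so you'd sample it repeatedly under load) is to check for sessions waiting on that LWLock:

```sql
-- Sessions currently waiting on a buffer-mapping lock partition:
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE wait_event_type = 'LWLock'
  AND wait_event = 'BufferMapping';
```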
This is, by the way, one of the reasons I was so excited about your LLM approach, because
if you think about what is the bottleneck for doing a lot of the database improvement
things, and I am feeling it very personally.
Running benchmarks is hard.
Properly planning a benchmark is hard.
LLMs can help with that.
But again, they sometimes just go off the rails.
And even if they help plan that,
I don't know LLMs that actually run benchmarks
to the point where they provision the machines
in AWS and know that you have to provision a separate machine as a database and a separate
machine, maybe a few of them, to drive the workload and they both need to have appropriate
resources.
All those and then don't get me started on analyzing the results, which is kind of 99.9% of the work.
And so the fact that you actually kind of started your LLM for... I have an LLM that can actually do benchmarks... I think this will be the biggest breakthrough, in both people tuning their own Postgres and also Postgres as a community being able to advance the state of the art.

Let me share sad news. Like, I'm not giving up, but it's a roller coaster. We spent, like a team of maybe five engineers, more than one year trying to achieve that. We achieved many things.
But first of all, we chose Gemini because they gave us credits.
And I think it was a huge mistake.
Gemini has a lot of problems. They have a lot of... like, suddenly you have a 500 error. So many problems. It's just not a mature product, Gemini, and it has hallucinations all the time. It's good, for example, for working with JSON, because when you need to run an experiment... we decided to choose JSON as a config format, and it writes it much better than GPT-4o and so on. But for many things, it just hallucinates. It invents things all the time. It just makes up results all the time.
And we have a system to control it,
but it bypasses all the time.
It's really hard.
So then, yeah, we additionally experimented with DeepSeek, Llama,
and we fine-tuned a lot.
All versions of GPT, all modern fresh versions;
we also bring them in all the time.
And Claude is much better.
We just added it to this system we have.
But after one year, I decided, you know what?
Benchmarks are an extremely hard topic, and we cannot trust it anymore. I mean, we cannot trust an LLM to create precise configuration and process results fully. So we decided that the LLM is just, you know, more of a connecting thing. Like, when you engineer a benchmark, an expert needs to engineer it. I don't trust any LLM for now, because for any experiment... we planned to publish maybe 15 to 20 experiments in our blog last year, and if you open our blog, you see just one experiment. And even there we screwed up, and someone on Twitter said this is not right, and we quickly corrected it, which is good. And we had achievements: interesting things, bottlenecks popped up here and there. It's really fun to iterate with an LLM, but once you allow it to think, to design the experiment, and to treat the results, 99% of the time we have wrong results, wrong conclusions, and so on. So for now we are thinking, okay, this is just an accelerator for performing experiments, but design and understanding of results should be in a human brain for now.

I love the fact that benchmarks are also hard for robots, to be honest.
It's so hard for humans, right? Yeah and we collect so many, but somehow it's super hard still.
So you always think where is the bottleneck?
And simple question, right?
But for now, it's extremely hard to let LLM find bottleneck and draw proper conclusions.
And honestly, if you just had an LLM that always said
it's the network, it would be correct about 80% of the time.
What if it's local?
There is no network.
Everything through Unix sockets, and we have these cases
as well.
So I agree with you in production.
Like in production, yes.
But in experiments when we learn Postgres behavior
on single machine running PgBench locally, we don't care sometimes. So there's no network
there sometimes. So it's hard. So I had many moments of frustration, but it's so good.
I still believe that iterations are great. So if you say this is great benchmark, just check it on new version, just changing one thing.
This is good.
We have automation, we have interface,
and it repeats the process of analysis again.
This is where LLM helps a lot.
Because without an LLM, you could just have some form, and, oh, we don't have this parameter in the program, it's not exposed in the interface, that's bad.
With LLM you have freedom to change things and iterate based on existing good benchmarks.
So yeah, I cannot say we are there yet. With this project, right now we are thinking about the next level of it, where I think we will give the LLM less freedom, you know, that's the key, and control more by a human brain.

Human in the loop kind of thing?

Yeah, exactly. So design and first analysis.
Only human should be there.
But once you have confidence that you're moving
in the right direction, and you just need to iterate
and expand, for example, to different versions,
platforms, everything, this is where you can relax.
You already verified results.
You can say just repeat, but on different something.
This is where LLM already can bring you.
It can just speed
up everything. You can throw it at this benchmarking process and have, like, 10 experiments running on 10
different versions or something. And I think this is also almost the general direction that
agents are taking. I mean, people started saying 2025 was supposed to be the year of the agent.
I think it's almost becoming the year of the human
in the loop with the agent.
Like all the successful products I see are,
you tell the agent to do some stuff,
you ask it, please plan something, you give it feedback.
You then say, okay, now that we have a good plan,
go and execute on it.
You come back an hour later, you had your coffee.
Okay, let's see what you've got.
Here's some feedback.
Go fix some stuff.
Like, I think it's always this way.
Every successful product is a bit like this.
Right, but sometimes humans start using a different LLM when reviewing things, right?
Being lazy, right?
Yes. That's interesting. So I'm not sure how successful it will be, but my gut tells me that we need to move, move, move in this direction anyway and have some, I don't know, more
experiments and so on. I hope we will have more to publish soon and start iterating.
But yeah, it was early days last year, so now we are rebuilding stuff. We will see how
it works. And for example, one of the experiments we must do, I think, is to conduct various
benchmarks for RLS because it's obviously...
That would be a fantastic example.
Yes.
I mean, this is something that humans with experience are pretty good at: finding cases
where you're like, is RLS actually going to be an issue, and having the stories that
the LLM can then go implement and test.
Yeah, we had several cases, and also Supabase has public materials, a blog post about this.
Obviously, there is already a quite well-known case when you have the current_setting function
inside an RLS expression.
And yeah, if you select a count over one million rows, it's terrible.
And it's actually quite easy to fix.
But yeah, so the idea is to collect these kinds of experiments and see. Actually, my goal
with this experiment, I think, will be to prove that they are not a problem if you do it
right.
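For context, the well-known pattern looks roughly like this. The table, column, and setting names here are illustrative assumptions, not from the episode; the point is that a bare current_setting() call in an RLS expression can be re-evaluated for every row, while wrapping it in a scalar subquery lets the planner evaluate it once per query as an InitPlan:

```sql
-- Illustrative sketch; "documents" and 'app.tenant_id' are assumed names.
-- In practice you would define only one of these two policies.
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

-- Slow variant: current_setting() may run once per row scanned.
CREATE POLICY tenant_isolation_slow ON documents
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- Known fix: the (SELECT ...) wrapper becomes an InitPlan,
-- evaluated once per query instead of once per row.
CREATE POLICY tenant_isolation_fast ON documents
  USING (tenant_id = (SELECT current_setting('app.tenant_id')::uuid));
```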
Interesting. I would contribute:
I think there was a recent post on the bug tracker
where basically the optimizer,
the planner, refused to use, I think it was a GiST index or a GIN index,
due to the belief that the RLS optimization
is unsafe. I can look it up and send it to you. But yeah,
that can also be interesting. Like, if you have a fix for Postgres,
it will obviously be nice to showcase. Right. And also, so you don't
use RLS; like you said, you bypass it, right?
We bypass it, yeah.
Yeah, that's interesting.
I'm curious, in this mixed scheme, when we have partitions or shards and RLS, does it
make sense at all to additionally involve RLS locally if some of the shards have a mixed pool of customers?
One of the questions that I have in mind, and this is something that we are trying to help
figure out for our users: often, on top of the tenant, you still have permissions for
specific users in the tenant.
Like you have an admin that can do anything and then you may have someone who is not allowed
to see some rows at all because they're too sensitive,
all this kind of stuff.
And our users ask us whether they can do it with RLS
inside those partitions,
or whether they can do it in their application.
There are a lot of application-level tools,
or middleware kind of tools, that will do it for
them. Whether it's better to do it in the app layer or in Postgres with RLS is a good question
that I don't have an immediate answer for.
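As a hedged sketch of what the in-Postgres option could look like, one approach layers the two rules using policy kinds: permissive policies on a table are ORed together, while RESTRICTIVE policies are ANDed on top of them. All table, column, and setting names below are illustrative assumptions:

```sql
-- Illustrative sketch only; "documents", "sensitivity", and the
-- app.* settings are assumed names, not anyone's actual schema.
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

-- Tenant boundary: a permissive policy scoping every query to one tenant.
CREATE POLICY tenant_boundary ON documents
  USING (tenant_id = (SELECT current_setting('app.tenant_id')::uuid));

-- In-tenant rule: RESTRICTIVE policies are ANDed with permissive ones,
-- so this further hides sensitive rows from non-admin users.
CREATE POLICY hide_sensitive ON documents AS RESTRICTIVE
  USING (
    (SELECT current_setting('app.user_role', true)) = 'admin'
    OR sensitivity <> 'restricted'
  );
```

The alternative discussed, enforcing the per-user rule in the application or middleware layer, keeps this logic out of the database entirely; the trade-off is that every access path then has to go through that layer.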
Right. So there are several layers of multi-tenancy, basically. Well, not multi-tenancy as such, but different layers. Your customer, being a tenant for you, might have tenants of their own, but also
inside them there might be additional clusters or segments of users inside each tenant.
So it's kind of apartments and rooms inside apartments.
Nice.
I like that.
This is one of the things that make multi-tenancy confusing, right?
Because it's almost like a Matryoshka kind of scenario.
Great.
Yes.
This term, by the way, has been used in the Postgres ecosystem multiple times. Starting with, you mentioned GiST, the
original paper by Hellerstein mentions the RD-tree, the Russian doll tree, so basically a Matryoshka
tree.
That's true.
And also in embeddings, they have the Matryoshka type embeddings where you can make them of
any size.
Yeah, I read about this as well.
Yeah, it's funny. So yeah, great. And what are
your plans? I saw MCP servers, and there is already some integration.
We basically have three directions. One is the MCP server: making it public,
giving it authentication. Right now it's open source. You can run it
on your own, but we are not hosting it. So we should start hosting it at some point.
So, things that just make it easier for people who use LLMs to use it. The other thing
we've done that we think is very useful for LLMs, and this we already have, is making it zero
time to create new databases, because for an LLM it's just zero time and zero cost.
Because they love creating a lot of them every time something goes wrong.
Okay.
Let's try from scratch with a new database kind of situation.
And so we're making it fast and cheap.
The other things that we're still working on is really to make our
documentation more LLM friendly.
llms.txt is absolutely not enough. It actually creates a very large file. The LLMs tend to get
lost in it. We need to figure out how to make it better. And then also, you know, everyone is kind
of thinking, can we have our own agents? Can we do something around that?
We're kind of thinking about that.
Something we already have that is useful is just that with the multi-tenant model, the
embedding, the vector indexes are much smaller.
And this is a huge deal for people building those agents and LLMs.
Yeah. And actually you mentioned, so this zero startup goal, you mentioned separation
of compute and data. How is that achieved? Using what approach?
Oh, yeah. Sorry. This is more than a two-minute answer, but the short one is that you
patch Postgres, and then you find a better way to do your storage and basically wrap every
Postgres storage function with an equivalent for your storage. And then you also need to
apply the WAL continuously to the storage layer.
Great job describing this, like, in actually 20 seconds.
I understood very well. Yeah.
So, do you have plans to make this open source as well?
Yes, absolutely.
I mean, first of all, as you know, the WAL reader has to be registered,
and ours already is, with an open source license.
And then, yeah, we are planning to open source the storage layer, our extension. I think at this
point it's about seven different patches we've made on Postgres. I don't think we'll want
to open source it as a Postgres fork, because the number of patches is quite small
and we are maintaining it
for, I think, everything from Postgres 12 to 18 at this point,
or maybe 13 to 18, something along those lines.
All supported versions, I guess.
Yeah.
That's great.
Yeah, well, looking forward to checking that out
and actually one more question for me, maybe last one,
are you open to some benchmarks? We plan to do
some benchmarks with various platforms.
It all started after the acquisition of Neon; there was a discussion on LinkedIn,
so I thought about running some benchmarks for different platforms.
What do you think? What are your thoughts about it?
Oh my god, this is very scary. We're benchmarking ourselves all the time. So
I'm keenly aware of exactly what benchmarks make us look good and what
benchmarks make us look bad. And I think that's also how we react to benchmarks
on social media. Unless it's a benchmark that shows something in
Postgres itself, and is clearly attempting to educate and help people, you can design a benchmark to
make anyone in the world look bad.
Benchmarketing, it's called.
To make everyone look good.
Exactly.
So, at any given time I can publish my benchmarks that make
Nile look fantastic, and you can run benchmarks
that you decided how to build, which may be very realistic and may even expose problems
that we have that I didn't know about. But yeah, in general, I love benchmarks.
I just have opinions on how benchmarks are used by marketing people.
Very good answer. For anybody interested in a bit more about Nile's architecture,
Gwen gave a really good talk at PGConf.dev recently and the video just
went up on YouTube so I will put that in the show notes for anybody that wants a
deeper dive there. There was also a good talk I saw at a PgDay Paris
event by Pierre Ducroquet, I'm not sure
if I'm pronouncing that anywhere near correctly, all about multi-tenant database design, especially
focusing on something we didn't focus much on today, which was the downsides of schema-per-tenant
design, including things like observability and monitoring, which I thought was really
fascinating.
So anybody considering going down that route, definitely check out that
video and I'll put that in the show notes as well. So yeah, thank you so much, Gwen.
I think we're out of time. It's been a real pleasure.
It's been a pleasure. Thank you for having me on.
Thank you.