Postgres FM - PostgREST
Episode Date: December 2, 2022Here are links to a few things we mentioned: PostgREST documentationSupabaseNo offset (blog post by Markus Winand)JSON functions and operators (PostgreSQL docs)Stored procedures (a previous ...Postgres FM episode)PostGraphile Hasura Parse, Inc. FirebasePostgresMLPgQ ------------------------What did you like or not like? What should we discuss next time? Let us know by tweeting us on @samokhvalov / @michristofides / @PostgresFM, or by commenting on our Google doc.If you would like to share this episode, here's a good link (and thank you!)Postgres FM is brought to you by:Nikolay Samokhvalov, founder of Postgres.aiMichael Christofides, founder of pgMustardWith special thanks to:Jessie Draws for the amazing artworkÂ
Transcript
Discussion (0)
Hello, and welcome to PostgresRM, a weekly show about all things PostgresQL.
I'm Michael, founder of PD Mustard, and this is my co-host Nikolai, founder of PostgresAI.
Hey Nikolai, what are we talking about today?
Hi Michael. I call this topic No Backend, but how can we call it else?
It's about the idea of stop writing application code in Java, Python,y php doesn't matter and using only front-end code
in your web or mobile app and also only database side code and that's it nothing in between
yeah so we had a nice this is another listener suggestion so we had a nice question that was
all about good practices to build APIs directly in the database.
And they specifically mentioned Postgres, understandably.
Right, right.
So I described this topic.
First of all, I want to apologize that I'm interrupting you constantly.
I cannot promise I will stop doing it.
But we had a comment on YouTube, and I agree.
I do it quite often.
I know this problem.
Some people cannot work with me because of it actually well I don't mind so please don't worry but I will try to improve
I will try so I described this topic as we 15 or like 20 years ago it was normal to say that code
on clients should be very lightweight. It should be some small HTML
and so on. In 2004 or so, plus minus a couple of years, things changed with Web 2.0 when Gmail,
Google Maps, and other web applications started to write much more code on the client side.
Right? And then iPhone and Android added even more. So so i call it we live in the era of very thick
clients because a lot of code and business logic is coded on the client side in running in browser
or in mobile app and this already ate some it it not ate like it took some work from back-end
developers and changed everything but then the idea why we
still have backend code if also sql is quite powerful today right we have we have we can do
a lot we can extract only the data we need from tables we have we can work with very efficiently
and actually we we can again store procedures or functions with a lot of logic on database.
So database can be kind of limited application server itself.
And also, additional thing is JSON appeared, right?
JSON is a common format of communication for everyone.
This gives us idea why not eliminate
the middle of our
three-tier architecture, so keeping
only client and database.
Maybe some additional middleware
but very lightweight and not
needing constant modification
in between.
Returning to request
of our listeners, it was
like how to write REST API on top of database easily, or it can be GraphQL server.
So we want something that will support the development of our applications, client code, and it can be JavaScript, it can be Swift, mobile app, written in Swift or something else.
Right.
It's quite natural to think in this direction.
Yeah, I think from a simplicity point of view, it makes a lot of sense. written in Swift or something else, right? It's quite natural to think in this direction.
Yeah, I think from a simplicity point of view, it makes a lot of sense.
And also the common one I see is a speed of development perspective. If we can write the client and have the database and not have to write something else,
we can get to a version one of our application quite a lot quicker.
Also think about the data flow here.
For example, we have relational database.
Of course, relational, what else?
Even if it's not relational, data can be stored in JSON.
And relational database, I mean, we have very good, strong JSON support in Postgres.
For example, we store many parts of our data as JSON.
And then data flow.
If it's some Python or it's some Java application, we translate this data, originally in JSON, to some object format, right?
Some objects or arrays or some structures, data structures.
And then we translate them and give to our client code again in json it
feels very strange that this like jumping back and forth it's called actually data marshalling
when we translate too much so we could take json and give json maybe transforming it somehow why
we why we should we deal with all those objects and arrays? And people like myself
invested a lot of time into writing a lot of code following some object-oriented design
pandas. They now regret it because it's good, like it's beautiful in terms of some concept,
but practical-wise, it's not that good. And working with JSON and things like React which define how to work
with both structure
and what to do with this structure
in same way
these things change the world already
and the idea of having
very light middleware
and allow us
to focus only on
client code, only on database code code it makes sense in my opinion
that's why yeah yeah that's why we have postgres and others yes feels like a good time to dive
into postgres do you want to give us a little bit of a top level on what exactly it is and
why it's useful yeah sure yeah first time i I saw it maybe like eight years ago, quite long ago.
And I implemented, I used it in several projects, including a case when a successfully small
team of like four Ruby developers were substituted by like 10 hours of my work per week with
same result. So Postgres is an application written in Haskell, which sits
next to your Postgres database. It can sit on other hosts and, of course, it's stateless,
so you can scale it and you can provision more compute nodes. It's like proxy, right?
But it allows you to define a structure in database of special format that will define automatically what REST
API your client code will be able to use. So for example, if you have a table, you can quickly
define a view in special schema. For example, V1, it's our API version one, you define an updatable view and automatically client code sees endpoint
and can read, insert, select from it.
And HTTP methods like get, put, post, or automatically translate a patch are automatically translated
to four DML statements, select, update, insert, delete we have.
So there is a lot of logic for automation,
like play with headers,
authentication, and so on and so on.
It's already very advanced software,
actively developed for many years.
Version 10, major version 10 is current.
So a couple of last years,
I'm not following very closely its development,
but it's advancing constantly.
And this is the project behind, for example, Superbase.
Superbase is based on Postgres plus Postgres plus several more components.
And so I love this idea.
And actually, I already explained why, right? So you can focus.
You can say we have Postgres. And for example,
we have one Postgres guy or half of it, like maybe 20% of a Postgres guy per week. We just
need to define our tables and these views and so on. And that's it. The whole team can focus
on product development, on client code and go with startup activities, not spending power on API development
in the old-fashioned way, right?
Yeah, and it's free to use as well, right?
So even if you're not using something like Superbase,
when I checked it out, it's just plain MIT license,
which is awesome.
So anybody can very quickly,
with a Postgres DB as a backend,
fire this up and have a fully working REST API
pretty much immediately, right?
Right, right.
It's also very performant.
I remember when I first looked at it,
it lacked some things like, for example,
you know, no offset rule.
Offset is a very terrible thing.
Most of us know it, but very quick reminder,
using offset with like limit offset is
usually a bad idea because if you check the plan execution plan you will see that all previous
entries were fetched and discarded so instead of it you should use like how it's called pagination
key set pagination i think yeah yeah and by the way I've learned from Postgres a lot. For example, this term I learned for dealing with Postgres.
I also learned about Skitch.
I mentioned many times Skitch tool, alternative to Flyway or Liquibase.
So the tool to control schema changes and to have schema version control.
I also learned from there just reading documentation and chatting with people in Postgres community, which is also not huge, but very active and good.
So you can have very performant API.
So it's written in Haskell, very polished.
When I joined, they had this offset.
And I don't remember details, but I contributed a little bit to improve how pagination is working
and to get rid of suboptimal queries
when you have like billions of rows, for example.
So right now, Postgres is battle-proven
on large databases.
Under heavy load, you can use it.
However, there are, of course, cons.
We haven't touched them.
In this approach, we have cons.
Let's first think if we mentioned all pros.
I think another big pro that people bring up is security.
So you can define things at the database level, and it will inherit those.
So it's quite nice to have that in one place and to be able to trust the data
or to be able to trust Postgres in this specific example to handle that.
So both at the schema level,
so which objects are exposed to it, to the Postgres user,
and I guess at the row level, row level security policies,
which is something I'd never seen used much
until I looked into how Superbase were doing things.
Right, right.
So this approach is an alternative to manual implementation of REST API. I think many, maybe most backend developers did it at some point and had feeling that there is some repetitive action, like some copy pasting involved and so on. And it's performant and secure. And also it's an alternate.
Maybe it also makes sense to think about it as an alternative to ORM.
Yeah, for sure.
It's also mapping.
But there is no objects here.
We're mapping usually either relational data to JSON
or JSON in tables to JSON or a mix of them to JSON.
And JSON, of course, is also working with Postgres.
I learned about JSON manipulation in Postgres.
A lot.
I was constantly sitting in JSON part of Postgres documentation
because I needed it.
And it requires some effort to master JSON skills for Postgres documentation because I need it. And it requires some effort to master JSON skills
for Postgres because there are a lot of features and functions. But once you achieve it, you like
it actually, because it's very powerful and a lot of freedom. You can combine relational data in
JSON and give it to a client code so easily. We already discussed this
stored procedures and database site coding. We also mentioned that sometimes a few lines in SQL
can replace hundreds of lines in Python on Java. It can happen easily when we talk about managing
data. So we have it here as well, right?
In Postgres, this is our approach.
We write database side code and we produce JSON and we can consume JSON easily.
And that's why it gives speed of development actually, right?
Because so you'll get rid of all like fat, I would say, encoding.
You don't need to think about these object design patterns.
You just take raw data, do what you need on database side
and give JSON very, very efficiently in development.
Yeah, I guess it's not directly related to this subject,
but while we're talking about them,
I think I'll include links to a lot of the JSON functions.
There's a good list in the Postgres docs.
It's fun to read, even if you're not planning to use it anytime soon, just to see what is possible.
And the other one was, you mentioned the no offset.
And I think Marcus Winnand deserves a shout out for all of the work he's doing publicizing this movement.
But he also, I think, gives out stickers.
And he's got a great page on his modern, yeah, great. He's got a great page on his modern SQL website that I'll link to as well.
Yeah, modern SQL and use the index look.com as well.
Very good.
Brilliant websites for those who are interested in relational performance optimization. Also, a few more words about efficiency of development. If you think about all REST APIs, most of them are repeating some logic,
and it's natural to use some declarative form to declare new endpoints and so on, right?
So SQL is a declarative language,
and Postgres can be considered as a declarative way to build REST API, right?
Yeah.
That's why it's so good, also.
JSON, declarative way, and a lot of things are automated that most of projects need.
Yeah.
Enough pros, maybe switch to cons.
What do you think?
Yeah, for sure.
And I guess, is it worth saying that a lot of these same benefits could be said of similar
projects?
Like in the graphql
space we have things like postgraph or hazura if it feels for me i mean i might be wrong but
logically it's very similar right right so both open source projects you mentioned and the
companies we mentioned superbase and you mentioned hazura The difference is Postgres is like REST API.
There are several attempts to build,
like it's not my thing, GraphQL,
but I know a lot of front-end developers love the idea of GraphQL,
especially those who work with React and so on.
Nice idea, but it's just,
I like experience in this area.
There were attempts to build ability
to create GraphQL APIs
on top of REST,
but also there is Hasura now.
Somehow it's also Haskell.
They use Haskell as well.
It was a coincidence, right?
And also very successful.
So I like to see these successful projects.
It's interesting.
I'm not an expert in history,
but I know that there are a couple of projects, parse.com and one more, I forgot the name.
One was acquired, parse was acquired by Facebook, another was acquired by PayPal.
And they, it was more than 10 years ago, or roughly 10 years ago.
And the ideas of those companies, which were both acquired and then shut down somehow. The idea was to provide
service and cloud to mobile application developers to avoid the need to fulfill the need in backend
development. So this class of projects or services was called MBAS, MDBAS, mobile database as a service.
So not database as a service like RDS, but mobile database as a service,
meaning that database for mobile app developers, I guess.
So you have not only database, but also some kind of application server
equipped with database.
And somehow they were closed.
I don't know why.
Interestingly, right?
It looks like Supabase
and Hasura are a new
wave of this idea
to simplify
backend development for those
who want to focus only on frontend
development. And they are
much more successful and a lot of
users and we see it.
Yeah.
And the other big successful one in that space feels like Firebase.
I know it's closed source,
but it feels worth mentioning that that has been a success. So it seems likely that these other ones could be too.
Right.
So Firebase is no SQL, like key value or document store on Google Cloud.
And it's very successful because, for example,
imagine you're a mobile app development.
You want database, but also if something changes in database,
you want your application to receive the change somehow,
like automatically, probably using like,
I would implement it using like web sockets, for example, right?
So I pushed the change
to my application and this is what firebase offered so it's fully managed database with this
synchronization capability like push the change to client-side code and it's it's i think it's
still successful and super base is positioned as open source alternative to firebase it's based on
postgres postgres plus additional components by the way i'm not sure how it's implemented in detail
but i guess it's based on maybe on logical decoding and web sockets maybe one day we should
invite them to to describe it in detail yeah right but it everything like everything is super
successful especially if you're a small team of some startup and you need to quickly change your interfaces and check various ideas very fast.
And you don't want to keep a whole team of Ruby or Java developers constantly.
Yeah. But let's...
Sounds good. How about...
Yeah.
Sorry, I interrupted you. Sorry.
No, that's fine.
How about some of the downsides,
I guess, is what you're going to go on to next.
Yeah, exactly.
So if you're a small team, it's good.
But what if you're a bigger team?
I observe cases when even in bigger companies,
things like PassGrest or SuperBase Hasura also work well.
I remember some talk about,
I don't remember the company exactly,
but it was a big
company and they decided to stop developing their own APIs. It was REST API originally and a lot of
development in-house. And they switched, it was a one-step switch to GraphQL and Hasura and they
liked it so much. So they talked about it and how to shift it in a big company so it definitely makes
sense for big companies also consider this approach definitely but what are downsides
first of all like maybe not the main definitely not the main downside is that these somehow these
middleware components are written in husky if you want to adjust it, well, Haskell is an interesting language, I would say.
I tried to contribute there.
Maybe some things I did, but definitely you need some mind shift to work with Haskell.
And I have huge respect to people who like it and continue using it.
So this is probably...
All I was going to say is it's a functional programming language
and for full disclosure, pgmusted is written in Clojure. So I can't talk on this.
Full disclosure written in Clojure, right? Yeah. Full disclosure written in Clojure. Clojure is
also, it's like also functional language. It's good. Like these languages are great,
but their adoption is very limited. They are not super popular compared to, I don't know, like Go language, for example, or Python.
And another problem, like possible downside, is that following this route, you will have a tendency to use more database-side code.
And if communication with external, this problem, we discussed this problem
in our episode about stored procedures.
If you need to communicate with external world,
call some REST external API, for example,
or to download some binary, for example, a picture
and save it.
It's not good to consume CPU cycles on your...
On your primary database.
Especially on primary database, exactly.
If it's a function which writes to database,
it can be only primary.
So you consume CPU,
which is a very limited resource,
the most limited resource in any Postgres cluster.
This can be dangerous.
And I remember one day I used even PLSH,
which I wouldn't recommend for normal use at all,
to save a picture from Postgres.
I had a call, like, they called, as I remember, maybe still, like, RPC.
So you have post slash RPC slash your function name.
And this is a function in Postgres.
And it was PLSH function, which is like stored procedure, stored function using bash.
And then I had some cool calls there and stored.
So my primary Postgres primary node downloaded image and then called some additional small microservice to it can be s3 for example you can
now save it on s3 it's the approach you never should you never would like to follow unless it's
some very rough quick and dirty prototype of something it should not be on normal service
because why because cpu on the primary is very limited and you shouldn't do it at all
and if you're if you're a new listener welcome but also we have a previous episode all about this on
stored procedures right right so this is second cons second item in our cons list what what else
good question i well i've got some cons from developers actually on Reddit. I think the first one, they said that when this, in one of the places where this was shared, I think probably when it first came out, but I didn't check. The top comment on that is thanks, I'm unemployed now. So that's, maybe that's a con. L to balance it, I saw, not once, I saw people who say it's not right to use SQL for it. SQL is okay language, but I would rather use my Go or Python or Java for that.
Oh, yeah. By the way, I was definitely joking. I think there's plenty of developer jobs to go around and definitely still more demand than supply you're touching a very sensitive topic so yeah but so maybe we
could say like a new item in the cons list is a lack of libraries probably right because if you
use python or even go or which is newer language you have so many libraries already written.
So you have so huge power to extend your API with additional things.
For example, if you want machine learning and you use Django
and you decided to, this is by the way,
when I replaced Ruby developers in that project,
then later I already, it was fine, but they pivoted this company
and they then hired Python developers.
One of them is one of my best friends in university.
He told me, we are not going to use Postgres anymore.
We know Django.
We are going to use a lot of machine learning stuff.
So we are going to switch from your stuff to Django API development
and we will be using a lot of machine learning.
Of course, I could say there is machine learning in Postgres.
There is Medlib.
There is Postgres ML, something new, right?
So it's possible.
And there is PL Python.
You can use a lot of Python code inside Postgres.
But I failed there.
I mean, not failed.
I lost.
I lost that battle.
Why?
I'm not sure, by the way, why.
Like they just said, this is what we use.
This is easier for us, more convenient.
I think generally people do have a bit more familiarity
with their backend services that they've got used to
than they do with Postgres.
So one of the downsides is that we are pushing
a bit more responsibility and work into the database.
So like, for example, around those security things,
I wonder if that's part of it, familiarity with how to do that
and how to debug things if things go wrong.
No, debugger. Debugger is a good thing.
Yes, IDE and so on.
So the same issues as for stored procedures
when we covered them in that episode, actually.
Right, right.
But if it's a quite simple API, for example,
if you have a table, you define a view, as I said.
If you just need simple inserts, selects, and so on, and some kind of logic related to which columns should be available for users, everything can be done at view level.
You don't need to write a function and use PGSQL at all for this.
Right? So in this case, probably it's good. It's enough. You don't need to write a function and use pgSQL at all for this.
So in this case, probably it's good, it's enough.
And it's faster in terms of development and people should know SQL in general.
But if you need some advanced logic, maybe you'll have issues.
Again, it's possible, but yeah.
So it depends.
And the constant pros are not very strict to me like
for example if i take some project i would really evaluate am i going to use postgres one more time
or i will say okay a team of python developers you you do it so i have personal preference to
use postgres of course and at postgres ai we use it for our platform as well, by the way.
Oh, cool. I didn't realize that. Makes sense.
So if we extend something, this is written in React and Postgres.
That's it.
Are there any, this might be a stupid question,
but are there any downsides?
Or is it easy enough to do complex,
like maybe a quite complex analytical query? is it easy enough to do complex, like maybe quite a complex analytical query?
Is that easy enough to do?
Well, any query can be there.
So it can be viewed, it can be stored procedure.
But it has to be on the Postgres side rather than...
Right, but it should be always on database side.
Queries should be there.
It's not good to load them like our developers do and do everything in memory.
So I don't like this approach.
It should be in database.
Makes sense.
By the way, I think one more item.
It won't be very strictly defined, but one more item to the cons list.
It's more about progress, not about Postgres.
Lack of tools for asynchronous nature of work with data.
We discussed it also several times.
For example, you want something, but you don't want to do this heavy work right now.
You want to schedule it and then return to processing.
And this encourages you to implement some kind of queue inside database or to use external queueing mechanism.
And there are ways to do it,
but from SQL context, it's slightly more difficult,
especially because, unfortunately,
the world of developers lost a very good, in my opinion,
tool called PGQ from Skype, developed many, many years ago.
Unfortunately, humanity lost this quite good dinosaur, I would say. I liked it because of
its performance, and it was easy to work with. It had a good performance, some kind of maintenance
needed, but much less than for Kafka, for example. And since then, we observe a lot of attempts to reinvent that wheel.
Most of them are failing to do it in a way that I would be able to use them in large
projects.
So most of attempt to implement queue in database cannot scale well, unfortunately.
But this is maybe another topic to discuss.
So and if you use Postgres, you build API, for example, to save pictures on some call,
I would think about how to have a queue mechanism close to it.
What would be your queue mechanism?
Synchronous processing.
In this case, you will be fine.
Because for most languages, of course, we have libraries to work with,
like RabbitMQ, Kafka, anything like that.
Yeah, good point.
Wonderful.
Well, I think that's probably a wrap then.
Thank you so much, Nikolai.
Thanks, everyone, for listening.
And see you next week.
Yeah, I hope it was helpful.
Don't forget to share links and provide new ideas to discuss.
Thank you so much.
Till next time.
Bye-bye.