Postgres FM - Data types
Episode Date: August 4, 2023

Nikolay and Michael discuss data types in PostgreSQL — including system types, choosing between types, types provided by extensions, and more.

Here are some links to some things they mentioned:

Data Types (docs) https://www.postgresql.org/docs/current/datatype.html
10 tips for beginners https://postgres.ai/blog/20230722-10-postgres-tips-for-beginners
Tid Scan (explain glossary) https://www.pgmustard.com/docs/explain/tid-scan
Don't Do This (wiki) https://wiki.postgresql.org/wiki/Don't_Do_This
Boundless `text` and back again https://brandur.org/text
UUID episode https://postgres.fm/episodes/uuid
I use ENUM (30min talk by Boriss Mejías) https://archive.fosdem.org/2021/schedule/event/postgresql_i_use_enum_vindicating_the_underdog_of_data_types/
Peter Geoghegan tweet https://twitter.com/petervgeoghegan/status/1680275871905775616
JSON episode https://postgres.fm/episodes/json
pg_repack reorder columns discussion https://github.com/reorg/pg_repack/issues/101
Use bigint https://blog.rustprooflabs.com/2021/06/postgres-bigint-by-default

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I am Michael, founder of pgMustard. This is my co-host Nikolay, founder of Postgres.AI.
Hey Nikolay, how's it going?
Hi Michael, all good. How are you? So what's the topic today?
The topic is data types.
What kind of data types?
We're going to try and keep this to half an hour, so we're going to very much have to do
a 101 intro to data types, but I think you've got some interesting things you want to go into first, right?
Yeah. Well, data types is a huge topic, and we should probably just try to provide an overview and some entry points, as usual, because for a podcast it's impossible to cover all the topics if you want to squeeze it into 30 minutes. Postgres supports a huge number of data types, and extensions bring even more data types, obviously.
Yeah, I've heard it described as one of Postgres' main strengths, one of the biggest reasons to
choose Postgres, which I completely agree with. Did you want to talk about system types to begin with?
Yeah, yeah. So before we go really wide, we should discuss (this was my idea) the data types every Postgres installation always has, because they are internal. And I encourage you to touch some of these data types even if you are just starting to work with Postgres.
A recent episode with tips for beginners, and a related article I posted, mentioned the ctid data type, which is quite useful for understanding how MVCC, multi-version concurrency control, works in Postgres, for starting to understand row versions, and for getting a feel for how vacuum works, basically. This is just a hidden column in any table, and it provides you two numbers, so it's like a composite type: two numbers which tell you the location of a tuple.
So, page number starting from, I guess, one.
Or I always forget.
One or zero.
So, page number and offset inside the page. And the interesting trick, as I usually say: you can extract one of the parts if you just convert it to point first and then address the first or second part of this value. It's just a trick which you can use to do some, I don't know, statistics, to learn how your data is currently organized. And of course, if you perform some updates, inserts, deletes, or autovacuum or manual vacuum processes the table, the physical layout changes and some rows can migrate to different pages.
This is one of the system types which is worth knowing and using from time to time.
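For example, a minimal sketch of this trick on a hypothetical table t (block numbers start at 0, offsets within a page start at 1):

    SELECT ctid,
           (ctid::text::point)[0]::bigint AS block_number,  -- page number, starting at 0
           (ctid::text::point)[1]::int    AS item_offset    -- position within the page, starting at 1
    FROM t
    LIMIT 5;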
Yeah, I really liked your tips on using this
to get an understanding for how things work at the disk level.
So understanding that tuples are row versions, that they can move, and that they get a new ctid when they've moved, like if they've been updated, even if it's a no-op type update, they can move. That was really good. The first time I actually came across this was when I did a whole glossary for
explain and all the different operations and fields you could see in explain.
And I came across a TID scan.
And I was like, what is this?
And it was so interesting to read up on it.
It's generally only used, well, it's like an internal thing.
You can use it, but it's generally a bad idea because tuples can move.
So, yeah, that was my first experience with it.
Right. Also, there are a few more data types you probably want to at least know exist, and I'm sure at some point they will be useful for anyone, even backend engineers, not just DBAs. Of course, all of these things are implementation details, it's kind of internals, but it's very useful to know about them.
For example, LSN: there is a data type called pg_lsn, and it's kind of our moment in time. So it always grows, grows, grows, grows. And an interesting thing to remember: if you have two LSNs, you can subtract one from another and you will have the delta measured in bytes. So it's really easy, just use the minus operator and that's it. Of course there is also a function for it, pg_wal_lsn_diff, but anyway: LSN is the log sequence number, it's a position in the WAL, and it's constantly progressing. Of course there is another value, xid, the transaction ID, you can also play with it, it's like a regular integer, so you can work with it as well. But LSN is very interesting, so you can understand how much data you produce in terms of WAL data. All writes go to WAL, obviously. Also, vacuum can write to WAL and shift the LSN as well.
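A quick sketch of measuring WAL produced between two points in time (the starting LSN value here is just a placeholder):

    SELECT pg_current_wal_lsn();                                    -- note the value, say '0/3000060'
    -- ... run some workload, then:
    SELECT pg_current_wal_lsn() - '0/3000060'::pg_lsn AS wal_bytes; -- pg_lsn minus pg_lsn gives bytes
    -- the same delta via the helper function:
    SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), '0/3000060');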
And maybe the final thing from me is a pair of useful data types,
also internals: OID and regclass.
So each database object has OID,
it's like object ID inside database.
And there are some OIDs which are hard coded
in Postgres code.
For example, when a new system table, system view, or new data type is added in core, it has an assigned OID, which is just written in the source code.
And there is even a script, a helper script,
which helps Postgres developers, Postgres hackers,
to find unused OIDs that can be used.
There are some gaps sometimes.
So, OID: for example, if you create a table, it immediately has an OID assigned, and you can select it from the system catalog called pg_class. So you select from pg_class where relname equals your table name, and you can find the oid. There is also a type called regclass, which corresponds to records in pg_class, and all tables and indexes are there, also views and materialized views, they are all there. And the interesting trick is that you can take an oid and just convert it to a regular table name or index name, just saying colon colon regclass. This is interesting, this is the shortest way to get a table name from an OID. And vice versa, of course, you can convert it back to OID. So it's useful sometimes when dealing with the catalogs, to find, for example, if you have very wide rows for some table, you know, there is TOAST, we had an episode about TOAST. When you have TOAST, you sometimes need to convert a table name to... the TOAST table has the OID of the main table inside its table name. So you need to play with it, and understand that it's so easy to convert to OID and back to relation name. Relation is kind of an overused term, but almost anything is a relation. I mean, a table is a relation, even an index is a relation, and they all go to pg_class. So you can convert back and forth just by remembering the type names. And the most difficult part is to remember regclass. I tend to forget it.
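A minimal sketch of these conversions (the table name and the OID value are hypothetical):

    SELECT oid FROM pg_class WHERE relname = 'my_table';  -- the long way
    SELECT 'my_table'::regclass::oid;                      -- name to OID via regclass
    SELECT 16384::regclass;                                -- OID back to a relation name
    SELECT reltoastrelid::regclass                         -- find a table's TOAST table
    FROM pg_class WHERE relname = 'my_table';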
But I use it quite a lot as well. I came across it recently
when I was hosting the PGSQL Phriday event and somebody, I think it was Dian Fay, used it in their query for pg_stat_statements as a really neat
way of casting. So actually casting
is the important word there, isn't it?
The colon colon you mentioned is used
to cast from one data type to another.
So it's really cool, really powerful.
Yeah, and it's a Postgres-ism, because the standard way is to say CAST: cast something as something. I never use it because it's too bloated. It's better to just write colon colon, because I definitely know I'm not going to migrate from Postgres to anywhere.
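For reference, the two equivalent spellings:

    SELECT CAST('2023-08-04' AS date);  -- SQL-standard syntax
    SELECT '2023-08-04'::date;          -- Postgres shorthand, same result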
Nice. In the interest of time, should we move on?
Yeah. Let's talk about basic atomic data types, like numbers and texts and dates and times and so on. Where to start, numbers or text?
I think text, because when I first started learning about Postgres, the advice I read felt really sensible and I got on board with it, and then the more I learn, the more I realize it's a little bit more complex. So I feel like this might be one of the more interesting ones.
So the reason that I think it's interesting is, unlike a lot of relational databases,
Postgres has a text type and a lot of the other databases rely on character.
Yeah, exactly.
Exactly. And character limits are really common when people are using Oracle, SQL Server, and others. And I think there are a few guides out there, and even the wiki, in fact, I was going to mention this later, but the Don't Do This page on the wiki is brilliant for data types, so I'll link that up. One piece of their advice is generally to use text, and they go into good reasons as to when it may or may not make sense. But yeah, how do you feel about that?
Is it the same wiki page which says don't use money?
Yep.
And don't use timestamp, use timestamptz.
Easy.
Don't use money sounds funny, actually. Yeah, well, inside Postgres I tend to follow this advice, but outside I prefer
not to. So it sounds like communism, you know. But as for text, first of all, Postgres supports
varchar and char as well, like standard, so standard type support. But with some addition: varchar can be used without a limit as well, you can just say VARCHAR.
And it will behave similarly to text. And on Stack Overflow, there is a question where, I think, Peter Eisentraut commented that there is a very, very small difference in terms of performance, because there is some additional code.
Well, I have a mixed opinion here. Because, first of all, of course it's good to care about performance, but it's also good to care about the quality of data, and when we talk about quality of data we need constraints, and varchar offers such a constraint. And actually, if you define varchar(100) and then the next day you decide, oh, I need 200 characters, it's just an ALTER, and it will be quick.
Oh, no lock, even if you've got a lot of data in there?
From 100 to 200, it's obvious we don't need to scan the table, right?
Well, in theory, but not everything that is possible in theory is true in practice.
I recently checked.
I was surprised it's not scanning the table. Well, maybe I'm wrong, actually. But it's easy to test. You generate a huge table, you start observing logs, measure time, and so on, and you see if it scans the table or not. I suspect that in one direction, if you increase it, it won't. We could spend the rest of this
episode on this specific topic. I remember a really good blog post on this. Maybe it's outdated.
Well, Brandur did a really good in-depth blog post on this.
Pardon me?
How many years ago?
Only recently.
Depends on version, probably.
I don't know.
But recently, I was explaining to someone that it is so, as you say. And as usual, when I say something, I first of all need to double-check myself, so I experiment, always experiment, just check it and see. I also need it to demonstrate with real experiments. So I made an experiment, and I was surprised that it didn't perform a scan, it was very fast in this direction, like from 100 to 200, increasing the limit. So again, it's better not to memorize these things, but to know how to check them. It's always good to check, and it's also good to have some testing system which will not let you deploy a bad change that would block the table for long.
True.
Right, so we discussed that with database experiments, testing, and so on also.
So it's that area.
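A sketch of the kind of experiment being described here, on a hypothetical table (increasing the limit should be a quick, metadata-only change):

    CREATE TABLE t (v varchar(100));
    INSERT INTO t SELECT repeat('x', 100) FROM generate_series(1, 1000000);
    ALTER TABLE t ALTER COLUMN v TYPE varchar(200);  -- increasing the limit: fast, no table rewrite
    -- decreasing the limit is not free: existing rows must be checked (here it would even fail)
    -- ALTER TABLE t ALTER COLUMN v TYPE varchar(50);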
Well, okay, you can use text, but my question is: how are you going to control the length?
Many people tend to control it on the application side, and many years ago I was criticizing them a lot, because I was saying, you know, look, only the database can enforce constraints reliably. Because with the application, soon you can have several applications, and you need to copy-paste this logic in different languages, probably, right?
It's not good.
Database can control this and so on.
But they had reasons,
for example, painful schema changes. I get it. And for example, if you tell me you are going to use enum, even with the recent support, I think in Postgres 13 or something it was added, so you can alter it saying I want one more value in the enum, I will say okay, but what if tomorrow you decide to remove one value, and you know your table doesn't use it anymore? It's terrible. So instead of enum, it's probably better to just use check constraints and know how to cook them. You do it in two phases: you define the check constraint with the NOT VALID flag. Some people think it doesn't work if you say NOT VALID, but it works for new values. So all inserts and updates will already be checked against this check constraint, even if it's marked as NOT VALID. NOT VALID just means we skip checking existing data. And then, you know, in a different transaction, you say ALTER TABLE ... VALIDATE CONSTRAINT, and in this case you don't block anyone, and it's so good. And with this approach you can create another constraint and drop the old constraint. And this is how you migrate to a different set of values to be checked. So I prefer this recipe instead of using enum.
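A minimal sketch of this two-phase recipe, with hypothetical table and column names:

    ALTER TABLE orders
        ADD CONSTRAINT status_chk CHECK (status IN ('new', 'paid', 'shipped')) NOT VALID;
    -- NOT VALID skips scanning existing rows, but new inserts and updates are checked immediately
    ALTER TABLE orders VALIDATE CONSTRAINT status_chk;  -- separate step, doesn't block writes
    -- to change the allowed set later: add a new NOT VALID constraint, validate it, drop the old one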
And similar here, well, I honestly also use text in many cases,
but it's not good when you want to store email and then you forget in your application,
you forget to check it, and then someone inserts one gigabyte of data because Postgres supports up to one gigabyte inside one value.
It's huge, right?
So my question is: can you use a check constraint for the length, then?
Why is it different?
Exactly. Yeah, you can.
And altering them is probably easier because it can be fully automated.
You just do it in two phases.
You define a new constraint and drop the old one.
What's good about this approach is that you can work with any data type,
even with JSON.
You can take JSON and say, I have check constraint.
This JSON always must have this key.
Well, you can even build JSON Schema around it. And I think the Supabase guys did something like that, they probably have an extension for a kind of JSON Schema to put some constraints on it. You know, sometimes, again for data quality, if you work with JSON, you still want to have it semi-structured.
You want to have some keys existing or some values inside keys or some arrays, objects,
I mean, and so on.
And in this case, you can define some check constraints that will prohibit inserting data
you don't want to be inserted.
And I mean, this recipe can work with any data type.
This is the good side of this recipe.
Yeah.
And we know how to deploy without blocking.
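The same recipe sketched for a length limit and for a JSON key check (names are hypothetical; payload is assumed to be jsonb):

    ALTER TABLE users  ADD CONSTRAINT email_length_chk CHECK (length(email) <= 320) NOT VALID;
    ALTER TABLE events ADD CONSTRAINT payload_has_type CHECK (payload ? 'type') NOT VALID;
    -- validate later, in separate transactions, without blocking writes
    ALTER TABLE users  VALIDATE CONSTRAINT email_length_chk;
    ALTER TABLE events VALIDATE CONSTRAINT payload_has_type;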
I think I agree with you on the enums front.
I did once see a really good, really entertaining talk by Boriss Mejías that I liked, trying to defend the enums, trying to say all the positive things about them.
But I think overall, I agree with you.
Yeah, exactly.
It was a valiant effort, I think.
Do you want to move on to, like, numeric types?
Numeric types.
Okay, yeah, sure.
Again, one of the recent tips, we had it in the beginners episode as well:
Don't use integer for primary keys.
Yeah.
Just don't.
And what else?
Like, numeric types: if we talk about integers, there are smallint, normal int (4 bytes), and bigint.
I think float is an interesting discussion,
and I think there's some subtleties here.
I just wanted to mention that some people think bigint and smallint are not standard; I saw this recently.
Well, it's standard.
Small int, big int, all the relational databases
which want to follow SQL standard implement them.
So big int is the choice, but not always.
I mean, if you know about storage,
we will probably touch the storage topic separately.
But sometimes, of course, you want fewer bytes to be spent.
So I think for keys, it makes a lot of sense.
And a lot of the advice that's out there,
a lot of the good talks that have been given
are not necessarily quite old,
but they're pre-version 10 of Postgres.
So a lot of them mention serial and big serial,
but now we have identity, so that problem goes away.
I still cannot develop a habit to use it.
I still use a bigserial primary key.
The only downside of the identity type is it's so long. To type it out is such a verbose syntax. But yeah, I'm trying to get in the habit of using it, because it has a few benefits.
Because it's standard, right?
It's standard, but it also has a few...
Serials have some weird complexities, especially if you end up having to migrate, as we discussed in the downtime episode. The column has an owner, the table has, like, ownership of the sequence.
Yeah.
Ownership, this is kind of, yeah.
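For comparison, a sketch of the two syntaxes:

    CREATE TABLE orders_old (id bigserial PRIMARY KEY);                            -- sequence owned by the column
    CREATE TABLE orders_new (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY);  -- SQL-standard identity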
So, about float, what do you want to hear? I recently learned from Peter Geoghegan on Twitter.
Yeah.
Who mentioned that OrioleDB published a very good benchmark showing how good it is to live without vacuum. And they chose a benchmark which relied on float8 columns, like I think four columns, and created B-tree indexes. And Peter Geoghegan mentioned that, unfortunately, the B-tree optimizations he worked on in Postgres 13 and 14, deduplication, right? These optimizations don't work with float8.
Which is a big deal, right?
Because they were looking at the impact on float especially.
Well, maybe some people... well, I mean, the general advice in the wiki is don't use it, use numeric with precision. It's like don't use money, don't use float, similar, right?
Yes. And the thing I really like about that wiki is they do go into when you should, like what the exceptions are. I think it's a very smart way of doing it. But in general, if you've got any doubt, just don't, unless you know exactly what you're doing.
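A quick illustration of the exactness point (the float result typically displays with a rounding artifact):

    SELECT 0.1::float8  + 0.2::float8  AS float_sum,    -- typically 0.30000000000000004
           0.1::numeric + 0.2::numeric AS numeric_sum;  -- 0.3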
The optimization by Peter reminded me also that recently, I think it was Postgres 15, maybe 14, added a bunch of optimizations around sorting specific data types.
And I think the more common or the more standard data type,
the more popular data type you choose, the more likely it is to benefit from some of these optimizations,
whether that's in indexing or sorting or something.
Or vice versa, because, for example, if you store your JSON,
or, okay, if you store some vectors inside a text column and expect good performance for some algorithms, say approximate nearest neighbor search, ANN, right? In this case you will probably find out that there are extensions which provide better functionality and performance.
Yeah, I think we should talk about advanced types in a moment.
Just before we do, do you have anything you wanted to add on?
You did a whole episode on UUIDs, so I think we can probably skip those.
Yeah, let's skip it.
Anything on booleans or dates and times?
Well, boolean is not one bit. It's one byte.
Yeah.
Unexpectedly bigger than just one bit.
Dates and times: the general advice is timestamptz.
Timestamptz, and pay attention to ranges.
Learn ranges.
Oh, yeah.
Ranges are great.
Good shout.
Ranges are great.
They have indexes also.
So if you have beginning and ending points in time
for your data, consider ranges.
They will also take care of data quality.
You will never have a beginning which is in the future compared to the ending point.
Yeah, really awesome feature.
And of course, timestamptz, timestamp with time zone, is better, so just prefer using it.
That's it.
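A small sketch of a range column (the table name is hypothetical); an inverted range is rejected outright, and an exclusion constraint can optionally forbid overlaps:

    CREATE TABLE booking (
        during tstzrange NOT NULL,
        EXCLUDE USING gist (during WITH &&)   -- optional: no overlapping bookings
    );
    INSERT INTO booking VALUES (tstzrange('2023-08-04', '2023-08-05'));  -- fine
    INSERT INTO booking VALUES (tstzrange('2023-08-05', '2023-08-04'));  -- error: lower bound must not exceed upper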
So, I think I was recently productive in terms of tips. And one of the tips was for blog authors and consultants who explain Postgres stuff to other people. And I asked, and it resonated, I saw a lot of good feedback, and I mean people like this idea: let's stop using integer4 for primary keys, let's always use integer8 and help people develop a habit to use integer8. And the same for timestamps: let's stop using timestamp, let's use timestamptz. These are good things for those who write blog posts or develop some, I don't know, educational courses and so on.
I agree.
And I would add using identity over serial and big serial.
This is probably yes.
Well, and in this case, you need to advise,
use cast instead of colon colon and so on and so on.
It depends.
I don't care.
I prefer the Postgres-isms, because they are usually more compact and just more convenient for me. So I'm going to continue using sequences and bigserial. But if you, for example, develop a course for students you don't know, many of them will probably go and use Oracle, for example, but you use Postgres when you educate, in this case you probably need to stick with SQL-standard syntax, right?
In this case, yes. Well, I was just teasing, because I know you said you still use serial.
Yeah, well, today I see your face.
I knew you were teasing, but still worth explaining.
Let's move on, probably.
Yeah, all right.
Advanced types, arrays, JSON. Yeah, first of all, we're going to break the first normal form.
The first normal form.
1NF or NF1.
I think the canonical naming is 1NF, right?
I actually don't know, probably one of them is French or something. That's how these acronyms normally go.
So the first normal form says you usually should not put something which can be split into things inside a value in your tables. But, well, text can be split into things. And we know in many languages text is considered an array as well, for example in C. You can work with it as an array: there is a first letter, a second letter, with index zero, index one, and so on. So it's an interesting thing. But anyway,
Postgres is very advanced.
In queries, we can even collapse a whole table, a whole relation, into one value and then unpack it, manipulating it.
But it's a different story.
As for complex data types,
first of all, arrays.
They can be of many... underlying data types, for example integer8 arrays or integer4 arrays or text arrays and so on. So it's great, but of course you cannot mix, so you cannot have, say, both integer and text values inside the same array.
And things to remember, the indexing starts with one,
not zero. This is unexpected
to many and leads to
interesting mistakes sometimes.
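For example:

    SELECT (ARRAY['a', 'b', 'c'])[1];  -- 'a': Postgres arrays are 1-based
    SELECT (ARRAY['a', 'b', 'c'])[0];  -- NULL rather than an error, which can hide mistakes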
Well, arrays are awesome. I can just say, I use them a lot. The array_agg function helps me when I group something: I can group many things inside one value, and then I also use parentheses and I can say I need only the first five elements to see. For example, if you select from pg_stat_activity and you want some queries, I use the left function to cut them, left(query, 100) for example. And then I use array_agg to aggregate them, and then additional parentheses, and then access indexes from 1 to 5, and you see only the first 5 elements. Quite convenient. In one single output row you see five. You can also ORDER BY inside array_agg.
Oh, wow.
That's unexpected to many: ORDER BY inside the aggregation functions. Maybe count doesn't count here, sorry. I feel like you told me that before, but still, yeah, it's super interesting.
Yeah, yeah. So when you aggregate, for example with json_agg, string_agg for text, or array_agg, you can say: I want a specific order when I aggregate. And this order can be different compared to the ORDER BY clause in your statement. So it's kind of an interesting trick.
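A sketch of the pg_stat_activity trick just described (grouping by state is just one way to use it):

    SELECT state,
           (array_agg(left(query, 100) ORDER BY query_start DESC))[1:5] AS recent_queries
    FROM pg_stat_activity
    GROUP BY state;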
But what else?
Like, we have arrays, and it's only the beginning of the topic, right? It's a huge topic, I mean complex data types.
Yeah, I feel like we should skip JSON because we did a whole episode on it, and we can leave it at that.
Yeah. And just don't follow the advice of those who say always prefer JSONB over JSON. It's probably not very good advice. In some cases, JSON is much better in terms of performance. So just check, use both, I would say.
Yeah. Or read about both and see which suits you. I've seen way more people using JSONB than JSON, I think for good reasons.
Also, we've discussed it at length, haven't we?
JSON is a way to escape from entity attribute value approach,
which is terrible when you have a lot of data.
These joins will be terrible.
Actually, you can have kind of, it will not be a foreign key, but you can put many IDs inside one row in an array value and then try to join using the unnest function. It's kind of interesting, and probably you will have better performance,
but again, okay, JSON and JSONB. Let's consider hstore and XML as kind of semi-duplicated types; in our case, I would try to avoid them. Despite the fact that I participated in the development
of XML type and functions many years ago,
these days I would probably, unless needed,
I would try to avoid them.
And hstore is very old.
It has good features, but most of the time
you will probably be good with JSON.
And it's what other parts of your architecture
probably understand.
And what else?
We also have many others, for example full-text search: tsvector and tsquery. They also consist of many things. tsvector, based on the name, you understand it's many words combined into one vector value, and you can create a GIN index and have good performance.
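A minimal full-text search sketch, with hypothetical table and column names:

    CREATE INDEX docs_fts_idx ON docs USING gin (to_tsvector('english', body));
    SELECT id FROM docs
    WHERE to_tsvector('english', body) @@ to_tsquery('english', 'data & types');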
A new thing is pgvector, for very high-dimensional vectors, up to 2000 dimensions I think they say, and for approximate nearest neighbor search rather than strict neighbor search, for this LLM, ChatGPT area of engineering,
which is currently very hot.
In this case, pgvector. By the way, pgvector is not the best in terms of... it doesn't implement the best algorithms.
So there is now competition, and I know Neon developed something,
pg_embedding, right?
But pg_embedding is quite new, and it builds the index in memory. I know they are working on making this index persistent. So I mean, if you restart Postgres, you need to build the index again.
What?
So it's very new. I wouldn't say it's production ready yet, but it's very interesting because it implements a different algorithm,
which probably is considered one of the best right now.
I think we're going to need to do a whole episode on pgvector.
Yes, let's do it.
About vectors.
Another one we haven't done an episode on
that it feels rude not to mention in this space
is PostGIS in terms of data types and functions.
Yeah, exactly.
And a lot of stuff there, a lot, of course.
So many data types, many functions,
and it's a whole world.
Yeah, incredibly impressive and hugely popular.
We've mentioned performance a few times.
Is it like, is there anything you wanted to make sure
we talked about?
I feel like that's where some of the choices
between these get interesting. Is there anything that people should definitely be aware of?
Well, I don't think we have a lot of time right now, but let's probably just mention a few tips. First of all, documentation. Yeah, the wiki is also good, and many articles and blog posts are good. We will probably attach a few articles which are helpful in terms of learning about various data types.
But also a data storage tip related to alignment padding.
Due to alignment padding, Postgres always needs to fill gaps. So if you put one byte and then you put eight bytes, Postgres will need to add seven zero bytes between them. So there is an approach: you basically need to arrange columns in either ascending or descending order of size. So starting from 1, 1, 1, then 2, 2, 2, then 4, 4, 4 bytes and so on, so they pack. Or vice versa: you have 16-byte values first, then 8, 8, 8 (timestamps, for example, are 8 bytes), and there are articles about it.
But there are also varlena data types like text, varchar, tsvector, and vector from pgvector. I'm sure it's varlena because it allows very huge values, up to one gigabyte probably as well, involving TOAST. These values you should move to the end of the table; in this case you will have as few of these zero gaps as possible. And it's called column tetris, basically; someone on Stack Overflow named it, or maybe earlier, I don't know. There are a few articles about it.
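A sketch of the idea, with hypothetical tables (both layouts store the same columns; only the order differs):

    -- wasteful: a 1-byte boolean before 8-byte columns forces 7 bytes of padding per row
    CREATE TABLE events_padded (is_active boolean, created_at timestamptz, user_id bigint, note text);
    -- tighter: fixed-size 8-byte columns first, small columns next, varlena types (text, jsonb) last
    CREATE TABLE events_packed (created_at timestamptz, user_id bigint, is_active boolean, note text);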
And sometimes you can save a lot of space,
but don't overestimate it.
I saw people immediately love this topic
and try to optimize.
And in many cases,
it looks like premature optimization
because, okay, you save like 3% of storage,
but you spent a lot of engineering time.
I read a post by Braintree, they had huge tables that were very badly optimized, and I think they saved about 10 to 20 percent, from memory.
That's great, that's a great example, but it's rare. And Braintree, you know, they're huge.
Yeah. My old ad hoc tool called postgres_dba has a report, which is a very interesting recursive CTE I wrote one day.
I'm still not 100% sure it will work in all cases, but so far it works quite well.
And it can tell you about potential optimization in each table.
So you can use it and try to understand if it's worth spending some time to rearrange columns.
For some people, it's unexpected that Postgres doesn't do it itself.
It's interesting, but it's so.
Postgres will store columns as you say,
as you define in create table definition.
And something I only learned today when I was looking it up,
I wondered if pgRepack supported changing the order,
but it doesn't; based on implementation details, it can't really. So it's very interesting, but you're going to have to roll this yourself, basically, if you want to do this.
So there is potential to have it, but it's not supported yet, unfortunately.
Yeah, you need to rebuild the table and deal with it. It's a huge task if you want to do it without downtime. But as you remember,
we mentioned don't use int4 primary key because I saw it very often.
an int4 primary key, which is doomed to hit a wall when you reach 2 billion rows, 2.1 billion rows roughly, right? Half of the integer4 space.
And then the second column is timestamptz, which is 8 bytes and needs 8-byte alignment. So we definitely know we lost four bytes, because they're just alignment padding and we have zeros there. If we had used int8 from the very beginning, we wouldn't have this problem with capacity, and we would use the same amount of storage, which is very unexpected, but it's so.
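A quick way to see this, sketched with pg_column_size on row values (both should report the same size, since the int4 is followed by 4 bytes of padding before the 8-byte timestamptz):

    SELECT pg_column_size(ROW(1::int4, now())) AS int4_plus_timestamptz,
           pg_column_size(ROW(1::int8, now())) AS int8_plus_timestamptz;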
I think the argument stands to use int8 even if you were having to pay double the storage. I think there's so much benefit, because if you don't hit two billion rows, you haven't got that much data to worry about, you haven't got that much footprint anyway. But if you're worried about the storage footprint, you're probably thinking you're going to have more than two billion rows. So that would be my argument.
Yeah. Okay, maybe one more piece of advice on your choice,
and that's it.
Oh, what do you think we should...?
Well, I see a common argument, and it's not just for data types: people choose to do things in a standard way, or a way that will let them convert between databases, over choosing the power of the current database they're using, thinking, what if we need to migrate one day? And I think it's an easy trap to fall into. It sounds smart, but, like, any of us who have been around for a while know you don't move databases that often. Existing projects have a lot of momentum, and it costs a lot to migrate. There are arguments for doing it, we see some from some very expensive vendors coming to Postgres, but it's rare, and it's basically so much work. I think you're really hurting yourself by missing out on the power, especially of Postgres, especially when we're talking about so many of these things that have so many features, so many functions, so much power, that it feels like such a waste to be missing out on that.
It's like choosing C++ but staying with C only syntax, like the very basic types and so on, and not using power of C++ or something.
Like you chose Java, but you think, oh, maybe one day we will migrate to, I don't know, like to something else.
And you restrict yourself using only 5% of functionality.
It's nonsense.
Yeah, I think so.
But I see people, I see smart people falling into that trap.
So my advice would be to try to avoid that, or try to...
Don't consider the database as something like gloves. It's a heart, not gloves. It's the heart of your system.
Heart, oh yeah.
Yeah, you're not going to replace a heart easily. So the database is at the center of everything.
Yeah. Heart transplants are possible, but they're rare.
Well, yes.
They can go badly wrong. It's not gloves, which you can just change every day.
Yeah, yeah, yeah.
Okay, good. Thank you so much.
Oh, we should remind those who listened until this point: please help us grow, with likes, with comments, with feedback. It's both helpful for us to understand we are being heard, and it's also helpful, for example on YouTube, if you like, of course, due to the algorithms; it helps our channel grow, makes the audience bigger, and makes this project more interesting, basically.
True. And requests as well, feel free to drop requests in the comments.
Oh, you need more requests?
Yeah, more requests, please. Thank you.
Thanks, everyone.