Postgres FM - ChatGPT x PostgreSQL
Episode Date: May 12, 2023

Nikolay and Michael discuss using ChatGPT for Postgres tasks — should you, if so what for, and some things to be mindful of!

Here are links to a few things we mentioned:
ChatGPT
Nikolay's polls on Twitter and on LinkedIn
The Art of PostgreSQL (book by Dimitri Fontaine)
SQL Performance Explained (book by Markus Winand)
Nikolay's YouTube correction about deletes and index amplification
Don't use ChatGPT to solve problems (blog post by Christophe Pettus)
Query optimization session with ChatGPT, Michael, and Nikolay (on YouTube)
DBeaver SmartAssistance feature
depesz anonymization feature

~~~

What did you like or not like? What should we discuss next time? Let us know on YouTube, on social media, or by commenting on our Google doc.

If you would like to share this episode, here's a good link (and thank you!)

~~~

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL. I am Michael, founder of pgMustard. This is my co-host Nikolay, founder of Postgres.ai. Hey Nikolay, what do you want to talk about this week?

Hi Michael, it's very boring, next time I should introduce ourselves. Next time it's my turn. Let's have less boring episodes. Let's talk
about ChatGPT. Yeah, it's time. I guess during the last three months, since GPT-4 was released,
everyone already tried it at least a few times.
Yeah, exactly.
Hopefully we're not going to cover any of the basics
in terms of what it is and everything.
I think pretty much everybody knows that already.
And if not, I'm sure other people have done it better than we possibly could.
But you want to talk about how people can be using it with Postgres,
whether they should, what the risks are, what the opportunities are, that kind of thing.
Right. Let's talk not about pgvector, and how to put embeddings, and how to extend the behavior of ChatGPT and so on, but GPT-4 and so on, like GPT-3, 4, 5. Let's talk about user experience, SQL developer experience.
And you are a database engineer,
or you are a backend engineer who needs to work with Postgres.
Maybe you are a DBA who needs to work with Postgres.
What can it bring you, pros and cons?
Let's talk about these aspects.
I mean, PgVector is a great thing.
And we see Supabase, Neon,
and even RDS recently started
to provide it
it's super cool, a lot of
new ideas are being implemented but let's
put this topic aside and focus
only on regular user experience
I'm a user, I mean I'm a developer
I'm developing my app
or I'm a DBA, or I'm writing SQL code, or I'm analyzing my data using SQL, doing some data analysis or something. How ChatGPT can help, and so on.
Sounds good.
I like this topic, because I think it's going to be one of the ones we disagree most on.
So I think I'm sorry in advance.
I hope finally.
Sorry for interrupting.
I hope we will disagree 100% in all aspects on this topic.
Yeah, maybe not all aspects, but I think I'm a lot more negative about some of these things, or a lot more conservative, maybe, cautious, than you are based on previous discussions.
Yeah, maybe, I don't know. Let's take it as a valid assumption that everyone already tried it at least once.
The only thing is: is the GPT-4 version still for paid customers only, or is it free as well?
I believe it's still paid only.
Right.
Default is 3.5.
A lot of people used 3.5.
And my first point, don't use it.
Yeah, version 4 is a lot better
in some very noticeable ways.
Just don't use 3.5. Don't use integration with 3.5. Don't use it.
Forget about it.
So quick thing.
If you see people sharing screenshots,
I think an easy way of telling the difference is that on version 4, it's a black logo when ChatGPT is responding.
And in 3.5,
it's a green logo.
So that's an easy way of seeing what
somebody's used if they share screenshots.
In all my experiments, 3.5 was quite silly, stupid, and very basic, with a lot of hallucination. 4 sometimes still hallucinates a lot, but it's more deep and wide. So just use version 4.
It's much, much, much better. That's why when it was released,
a new wave of attention started. But as I said, you're not alone in being pessimistic.
Assuming everyone tried it, right? I'm checking results of the polls I started a couple of days ago on LinkedIn and on Twitter. I asked: do you use ChatGPT for your Postgres-related work?
Three options, not at all, a few times per week, or already a daily user. And the Twitter,
80% not at all; 15% a few times per week; five...
How many voted?
How many voted? 99.
Okay, wow.
Not many, but I started it two days ago.
So sometimes I have more votes during two days.
And already daily user, just 5%.
Just 5%.
I'm close to being a daily user already.
And LinkedIn, similar.
Actually better in terms of positive usage.
Not at all, 60%. A few times, 30%. Daily user, 10%.
And sample again?
Sample, 81, also not many.
Yeah, but not 10.
Not 10, yeah, around 100, but not many.
And with thousands of impressions,
I guess many people just don't vote.
Did that surprise you?
What I feel, when GPT-4 was released,
huge wave started.
It was boom, right?
Now we're on the declining side.
But it's going to change, I bet.
Because, of course, GPT-5,
when it will be released?
Later this year, maybe next year?
Have they announced that?
There are some rumors, I guess, only.
Maybe later this year, maybe next year.
But, of course, this will be boom.
But also, I think people are developing new integrations, they're adjusting methodologies
and so on.
It's going to stay, in my opinion.
I'm very positive that it's a super helpful thing.
And I'm using it almost daily.
But many people disagree.
I see it.
We see posts like don't use it at all, which is wrong.
I say use it a lot.
Just use version 4.
Of course, it has this stupid limitation, 25 messages per three hours.
I often bump into it.
And then it says, okay, switch to 3.5. And 3.5 is just terrible in my experience, compared to what version 4 provides.

So how come you're using it so much? What kinds of things are you using it for? Good, let's explore this. Where to start: developer tasks, or administrative, like infrastructure tasks? Maybe focus more on the most common things you use it for, in terms of frequency. How come it's nearly daily?

I don't need help with SQL usually.
Well, let me be honest.
I don't use procedures often. And yesterday I needed to code some procedure, not a function.
Well, I could do function as usual,
but I wanted transaction control.
So it's something which talks to external service
right from Postgres primary, which is not good.
But in this case, we do it understanding problems.
So I saw some example somewhere in the documentation and saw BEGIN ATOMIC. And I actually didn't know what BEGIN ATOMIC is.
Well, I can guess, right?
So I asked ChatGPT. It says, my knowledge is from September 2021, outdated; Postgres doesn't support BEGIN ATOMIC, it says. Obviously it does, just in newer versions.
But then it says, you know, in the SQL standard, this means that the whole body will be a single transaction.
Well, okay, obviously.
And I'm sure Postgres documentation would explain it,
just not on the same page.
I just searched for it, didn't find it.
ChatGPT was a faster way to understand it. Well, I understand. Okay, now it's a single transaction. I don't need it. I need the regular BEGIN, and then COMMITs to split into multiple transactions.
It was useful.
I mean, it was a faster way rather than Google or trying to find it in documentation.
Documentation has this on a different page, obviously, right?
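As a rough sketch of the distinction being described here (the procedure and table names are invented for illustration): a BEGIN ATOMIC body, available since Postgres 14, runs as part of a single transaction and cannot contain COMMIT, while a regular string body in PL/pgSQL can.

```sql
-- Hypothetical example: a BEGIN ATOMIC body (Postgres 14+) is a single
-- transactional unit; COMMIT/ROLLBACK are not allowed inside it.
CREATE PROCEDURE log_two_steps_atomic()
LANGUAGE sql
BEGIN ATOMIC
  INSERT INTO audit_log (note) VALUES ('step 1');
  INSERT INTO audit_log (note) VALUES ('step 2');
END;

-- For transaction control, use a regular body instead: each COMMIT
-- ends the current transaction and starts a new one.
CREATE PROCEDURE log_two_steps_batched()
LANGUAGE plpgsql
AS $$
BEGIN
  INSERT INTO audit_log (note) VALUES ('step 1');
  COMMIT;  -- allowed in a procedure body, but not inside BEGIN ATOMIC
  INSERT INTO audit_log (note) VALUES ('step 2');
END;
$$;
```

A side benefit of BEGIN ATOMIC bodies is that they are parsed at creation time, so dependencies on the referenced objects are tracked; string bodies are only parsed when called.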
But can I just add at this point, it's super interesting that you use it for that.
I wasn't expecting factual checking to be, like, the first use case, because my main fear with using it for things like that... If I check the source code, even if it takes me a few seconds longer, or if I check the Postgres docs, I trust that what it says is true. If I check ChatGPT, how do I know it's true?
Of course, with an example like that, you believe it's true
because that's also your instinct as to what Begin Atomic was going to mean.
But I get incorrect or false results often enough
that I now no longer trust what it says.
I don't see any problems here at all.
Let's have some spectrum of level of trust.
Documentation probably is the top trustworthy source of truth, but it has bugs sometimes also.
True, yeah.
Just it's not often at all.
Very low percentage.
It's not ideal still.
It might have false statements sometimes or misleading, but probability is tiny.
Especially tiny for things that matter, right?
Like if it was something that mattered, it would have been more likely to have been corrected.
Right. It's more likely to have bugs in places that don't matter so much. That would be my instinct.

Right. Now let's take books, for example. Give me any book, I will find a problem with logic and false statements. I mean, a Postgres book. Give me any book, I will find it. Because they are much lower in terms of quality.
I didn't see any super quality books at all. The Art of PostgreSQL. Dimitri Fontaine.
I must admit, I didn't read it fully. But okay, this is my homework. I will find problems in it.
Sorry, Dimitri.
Challenge accepted. Yeah, it's a very good book.

Well, maybe it will be not with logic, but something like a controversial statement, or something easy. I will find it.

What about Markus Winand, SQL Performance Explained?

Very good, very good example. Yeah, this is a very good example. Second challenge accepted. I will find some issues.

You've read that one, surely?

Well, I read some parts of it. Okay, also a very good book. This is a very high quality book. You chose good examples.

Yeah, because, and this is my point about Google search as well, right? So documentation: very high. Postgres documentation: very good. Books: because I know the authors, I have a higher level of trust. But it's just a couple of people, plus maybe another couple of people who correct it, a tiny group of people. They for sure make some mistakes, for sure, I'm not questioning that. I'm just saying it's maybe slightly lower than documentation, because I don't believe as many eyes have gone over it, and I don't believe it's as easy to correct, although Postgres documentation could be easier to correct, for sure. Then Google-like search results, for example: again, you get some metadata, you get some information about who wrote this, when did they write this, you can check some sources, and you build up a bunch, like, over time. Even on Stack Overflow, for example, over time you start to notice some people that give amazing answers.
And some names I just trust more than others when it comes to like a Stack Overflow response.
But on ChatGPT, I get no information about the source.
Where did this information come from?
What's it derived from?
You get no kind of, I don't get any signals as to whether to trust it or not.
You see, Markus Winand, perfect author.
I have huge respect to this author.
But if you don't know this name, where is trust coming from?
Yeah, I understand trust is built over time, for sure.
The first time I read it, I check it's true.
Okay, yes, that was good.
And then the second time, and the third time, and the tenth time.
And sometimes they teach me something that I didn't expect, or they question an assumption I had that I didn't realize I had. And then, when I verify, it's true.

Again, verify is the key thing I wanted to deliver in this episode, you know. I don't trust anyone. Sometimes I don't trust the documentation. And in my whole practice, Postgres consulting practice, my daily development and so on, I often don't trust myself.
Like maybe I'm wrong.
And this is normal.
And the key point, you need to verify.
You need to experiment, test, and so on.
You also need to know how to properly do it.
For example, if you do something with a large data set, use a large data set and so on. And same here. Of course, some authors produce super high
quality statements and share super high quality thoughts. For example, Tom Lane in pgsql-hackers: in 99% of cases, unbeatable logic, right? But sometimes some statements, even from Tom Lane, I could argue with, and some people
could argue as well. Because life is not simple. Maybe, for example, Tom Lane said something,
but for you, for your particular case, it cannot be applied, for example. Or you have different
experience, and Tom Lane said, don't use this approach, but you're still using it and you find it super helpful in your case.
It might happen.
So if you consider ChatGPT as some consultant, and version 4 should be used, again: take everything with a grain of salt, verify.
The magic is that you can tell,
you know, you say something,
but there is something wrong here, right?
And it can fix problems, it adjusts. So this chain of communication... Sometimes it's annoying, like every answer starts with "I apologize for the confusion", right? I have it so many times. But it's super helpful. It's like I hired someone, and the guy is super fast in answering and has super wide knowledge, sometimes quite deep as well, but might hallucinate sometimes. It's okay. And if I have a very good way to verify ideas and iterate, I cannot understand how to live without it at all.

Yes. So you're saying that having access to GPT-4 is like hiring somebody with a credible breadth of knowledge, sometimes very deep knowledge.
And they're always online.
They always reply straight away.
They're never in the middle of something else, except when you run out of queries per hour.
Yeah, I completely agree with that.
The one thing I would say is a bit different.
No, no, no.
No, no, no. You hire someone and you don't trust this employee 100%. You need to verify ideas and sometimes tell them, you know, you're probably wrong. You can work with ChatGPT as with a person. Like me 10 years ago: I was a no-name here, right? Initially. And when I went to companies like
Chewy, GitLab and others, they doubt everything you say, everything. So if you say something,
you need to prove you are right. And the best way to prove, obviously, is to try and show how it works and explain in very detail how to repeat it.
So you claim something and you show how to check it, to verify it.
And if you consider that ChatGPT is not a 100% reliable thing, it's not documentation, right? It's some consultant who is wrong sometimes, quite often wrong, but we allow it to be wrong sometimes, right? Without a way to verify, you will end up concluding that this is a bad tool. With a verification approach, doubt everything and verify it. And if it shows mistakes, show these mistakes and ask it to fix them.
It works very well.
One big difference
versus a consultant or human in general,
or at least the types of humans
I would employ,
is that when it is wrong,
it's very confidently wrong.
It doesn't say, "I'm not sure", ever. It's very, very, very confident in its answers, which is very different from humans. So that's a big difference. But it's also a lot cheaper than humans, so I think it's not a fair comparison at all.

Always put some doubt element into all statements, like "I would say that..." or "most likely blah, blah, blah", and so on. There are many, many intro phrases consultants use which assume some probability of mistake. And my own style is, I always say, I'm not sure. I'm never sure. In our own
episodes, I was wrong many times. For example, I just commented, I responded to some
YouTube comment where I corrected myself. I claimed that delete has an index amplification problem.
It doesn't, because deletes just set xmax on the tuple; indexes are not touched. And I corrected
myself in YouTube comment and someone asked to explain, to elaborate, which I did. I can elaborate some other time.
But when I
elaborated in that comment,
I thought, maybe I'm wrong, you know.
I checked docs, I checked, I didn't check source code,
I checked Egor Rogov's book, "Inside Postgres".

Internals, yep.

It's not Internals; in Russian, it's "inside Postgres". Perfect book,
by the way.
Fully translated to English and so on, but published on… Perfect book without mistakes.
Is that what you're saying?
I'm teasing you, don't worry.
Yeah, yeah, yeah.
Well, I'm sure there are mistakes there.
I'm sure if you ask Igor, he will admit some probability of mistakes there.
Of course.
Good authors always admit it and good consultants
always. So what I did eventually, before I put the comment on YouTube: I went to psql and verified everything I was going to say. And then I added it to the comment. And I found in my consulting practice that we always need to do this: if we claim something, show it. And that's something ChatGPT can't do.

It can... I mean, yes and no. ChatGPT cannot, but you can assume it can be wrong, assume it can be hallucinating. Ask ChatGPT to provide some ideas, for example. It's a brainstorm tool, it's not verification.

There you go. Yeah, then you need a verification tool.
And I'm going to shamelessly advertise what we do.
Database Lab.
This is perfect verification tool.
Database branching, thin cloning. If you have ChatGPT, and if you have a cheap and fast way to check anything...
Well, of course, you cannot check hard things
like hardware change and so on.
But SQL query behavior on large datasets,
thin cloning and database branching is priceless here.
People work on bottomless, on serverless,
but we work on priceless thing,
which is database branching,
true database branching on any Postgres, any location.
This is my company, my product.
So you take ChatGPT and you go verify on the clone. The clone is provisioned in a second, for free. I mean, you need to prepare it, but then you have zero extra costs. And then you can iterate. Okay, I say, you know, on my one-terabyte database, this is what I see. And I consider it like a human: it might be mistaken. So you proposed something which is not working on my one-terabyte database, and it will adjust. And in a couple of iterations, you will have much better results.
And I think this is something new here.
With code, it works pretty well.
I already used it for Ansible coding, shell scripting, like some advanced shell scripting, React coding. I'm not a React expert at all, but I did many things. Wow, with ChatGPT I can do many things. Because with code, you can easily check it very quickly, because you can run it. If it's C, probably you need to wait until it's compiled, right?
that's a really good point, though.
It's easy to verify quite quickly.
So we've actually skipped a bit.
Originally, we were talking about verifying things that you could check in the docs, for example,
or asking it for factual information. Now we're talking about trying out ideas, like idea generation, brainstorming.
And now I get for simple, for isolated things, it's very easy to test those things.
What about system-wide issues?
If you're asking about, I don't know, some modeling question, would you use it for that kind of thing?
Where it's like you're maybe asking architectural decisions.
Is this a good idea or is that a good idea?
Everything is a good idea.
It absorbs a lot of stuff.
It absorbs stack overflow.
Of course, sometimes it's, again, hallucinating easily. But if you prepend to all responses something like: take it with a grain of salt, this is what I think.
As would any good consultant say.
Without verification I can say this, but this needs thorough verification; off the top of my mind, I think this. If you prepend a consultant-style phrase like that to any ChatGPT response, if you use GPT-4, it will become a very good, useful tool. It unlocks your ideas, and then you go to verify them. Then you come back and say, this is not working. Data modeling, perfect example. Seeding is a perfect example.
I have a new table for a site I haven't launched yet. I used it recently. I have some rows already there, which probably represent my future data.
Can you generate more?
Okay.
I have 10 rows.
I want 10 gigabytes of rows.
Can you help me?
It says, okay, use this snippet with generate_series.
This is my real life example.
ChatGPT version 4 provided a pretty interesting generate_series example.
With very useful comments, you can adjust these parameters in this snippet.
It was super helpful.
I would spend some time generating this.
I used it and said, you know, I wanted 10 gigabytes, but this produced only three. It says, no worries, adjusted. We generated seven more gigabytes in a second iteration. Now I have a data set, I can start testing my application using this data set. This is a very valid case.
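For readers who haven't done this, here is a minimal sketch of the kind of generate_series seeding snippet being described; the table and columns are invented for illustration, and the row count is something you would tune until the table reaches the target size.

```sql
-- Hypothetical table for seeding; adjust columns to match your schema.
CREATE TABLE users (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email text NOT NULL,
  created_at timestamptz NOT NULL
);

-- Generate synthetic rows; raise the upper bound and re-run until the
-- table reaches the size you want.
INSERT INTO users (email, created_at)
SELECT 'user' || i || '@example.com',
       now() - random() * interval '365 days'
FROM generate_series(1, 10000000) AS i;

-- Check the resulting size:
SELECT pg_size_pretty(pg_total_relation_size('users'));
```

The check-and-iterate loop the hosts describe is exactly this: run, measure with pg_size_pretty, adjust the generate_series bound, repeat.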
Yeah, I can see that. I've had a similar experience with it helping me generate data. I needed an example for a blog post. And I think sometimes I get blocked on those, you know, just trying to think a little bit creatively about it.
So you also use it?
Yeah, I've used it, but my experience is different.
And I pay for four as well at the moment.
And then.
You're going to stop.
Maybe going to stop.
I haven't used it for a couple of weeks,
but sometimes like if I'm exploring creatively,
I found it helpful.
I like your premise that it can be used as a brainstorming tool.
I very much agree that it can be really helpful for some creative ideas.
In brainstorm approach,
like enterprise style,
we always say like during brainstorm,
don't criticize ideas, just collect ideas, don't demotivate people. So don't demotivate ChatGPT. It tries its best, you know.

Well, but it also lies. And that's my... It also doesn't say, as humans do, "it depends". But it lies way more explicitly. And one thing I'd say is, the reason lots of people disagree on this is, I think, a bit of a glass-half-full, glass-half-empty way of viewing it. When I see people praising ChatGPT, they could easily get an answer that has 10 bullet points in it, and seven of them are quite impressive and three of them are garbage. The glass-half-full person is like, oh my goodness, how did it get those seven so good? And they know enough that the three are not dangerous, or are not a problem, if it's talking complete nonsense or if it's suggesting really bad ideas. Whereas I think it's a very fair criticism to say that there are three quite dangerous pieces of advice in there, please don't listen to what this thing is saying. And I think that's where some people are coming from. Like, I think we saw that post by Christophe Pettus just a couple of days ago saying, don't use it for advice, because sometimes it's really bad.

Naive. That's a naive statement. Everyone is going to use it.

Do you see the different angles? You could be viewing the exact same 10 bullet points, and one person says this is amazing and one person says this is dangerous.
If you hire a consultant,
it's going at some point to be mistaken.
Again,
prepend my prefix to all responses. OpenAI should probably do it themselves.

I agree.

Instead of "I apologize for the confusion", they should prepend to the first answer something like: take this with a grain of salt, this is what I think, without verification. Go and verify it.
Yeah, it would be much less impressive.
So in 2016-17,
I started to give talks about database experiments.
Like my approach is verify everything.
And of course, it might take ages to verify everything.
That's why we need to simplify it, to speed it up,
to make it cheap experiments.
And we found a way.
Thin cloning and database branching.
It's a recipe for most of needs back-end engineers have.
Not DBAs and infrastructure teams.
They have different needs.
But programmers, back-end engineers, and front-end engineers sometimes,
we can satisfy them very well with thin cloning and database branching.
So with this approach, like zero-trust approach,
if security have a zero-trust approach, why couldn't approach. If security have zero trust approach,
why couldn't we have this in development?
Okay, someone has some idea,
but without verification, we have zero trust into it.
We say thank you, but we are going to check it.
With this approach, I cannot see how you can consider ChatGPT a bad tool at all.
Right. I think we've covered accuracy quite well. I wanted to ask a couple more things. First one,
privacy. How are you feeling about privacy policy in general? How do you think about that?
Okay, that's a tough question. Before we go there, let me remind our listeners that we had a beautiful session with ChatGPT and SQL optimization.
It's on PostgresTV YouTube channel.
I would like to have another one.
It would be great.
There were some funny moments.
I think we spent a long time in the end.
Yes, but the summary was it exceeded my expectations.
Yeah, it was quite good.
And I think we gave it quite a good example.
I think we only checked one example.
It exceeded them, though not at the first iteration. So the secret sauce was verification.
Because I said, you know, I tried
this, it doesn't help. But I also said
I know there is a good answer.
Well, you can say it always,
right?
But you don't think to, right?
I didn't reveal anything.
It found proper answer after iterations.
And I insisted, right?
Extra prompts, like second and third prompts.
I said, let's try again, let's try again, let's try again.
And eventually it provided a very good example and we verified it.
And that index was great. And this is what I also expect from myself: also not at the first iteration. There are no perfect humans. So if you consider ChatGPT as a human, more like a human, it helps.

Well, on the topic of SQL optimization,
I obviously, as a product maker, was thinking, should I be using this in my product?
Or like, how could I? Is it a threat? You know, it's a direct, in a way, if it can be used for
SQL optimization, it's a direct competitor to the product I make. So it's naturally a very
serious thought. And maybe, maybe I'm deliberately negative, you know, consultants being negative on
it and telling you not to use it could also be self preservation, right? Same with a toolmaker,
who it could be competitive to
But what I wanted to say was: the reason I've not used it yet in the product, the reason I'm not planning to anytime soon, is that we want to give advice that is trustworthy. We optimize to not give those false positives, that bad advice that's dangerous, that relies on something I'm not willing to...
Exactly.
So I think there's an interesting trade-off there.
My approach is very different.
I've learned in my life that it's good
in terms of finding interesting results.
It's good to generate a lot.
Like this brainstorm approach,
you generate crazy ideas, a lot of them,
and then just go and verify them.
If verification takes a limited time and limited budget, it's great.
So I would rather choose 10 ideas, three of them are crazy and wrong, than just five ideas which are trustworthy.
I would grab 10 and go verify them.
My approach is like, give me all ideas.
I'm going to verify.
Like I have zero trust to any statements.
So I'm going to verify myself anyway.
And without this verification component, we have only half story here.
Let's skip to privacy.
And I want to ask about something else as well.
Why do you say privacy?
Okay, sorry.
It's the same thing. So, privacy.
It's not a resolved problem.
And I remember
in DBeaver, they have a bold statement: you can disable it here, fully disable it here, the integration with ChatGPT.
Which, of course, demonstrates that they know
that many people are
concerned. And we know some countries have banned ChatGPT; some already unbanned it. It's still a shaky area, right? It's not
fully clear how to approach this. Because my question to you: OpenAI, is it about openness of everything? If you talk to ChatGPT, revealing some data and some secrets, is it going to be open as well, because it's OpenAI? So now my SSN, bank account, routing number, everything is going to be open? Or what, they say no, right?

I know you're joking, but they say no. But it's an American company, right? A for-profit company now. I know it's supposed to be more of a non-profit initiative, but now I believe it's a fully for-profit company. Well, you could pay a non-profit as well, but yeah. So I think there's a real concern. Now, I've definitely seen some companies also not allow their employees to use it, for some of these concerns. But the main reason I see people not using it, or being told they can't use it, is actually around IP. So, in terms of using what it tells you... Oh, I guess it's around the ethics of it: how did they, you know, do the gathering of that information? Do you have any concerns on that side, of how they got the information?
Let's split it into parts.
We have SQL queries.
We have explained plans.
And we have some
additional parts of IP like
ideas and
approaches and so on. I'm strongly
against patents on ideas. Ideas should flow, like a scientific approach: you go to a conference, share ideas, everyone takes them, improves them, and so on. And somebody, maybe not you, will win. Of course, many people will not agree with me in terms of patents on ideas.
But this is my opinion.
I think it's bad for innovation
and for progress.
But if we talk about queries and
plans,
what do you think,
what's the best name for a project that will solve it?
Pgmask or PgGloves or something else?
I don't understand.
Okay.
Imagine you talk to ChatGPT and you show some query, and maybe some plans, execution plans, explain plans, which pgMustard works with as well. And before you send it to ChatGPT, you just transform all table names, object names, and parameters into something else.

Oh, like a...

Yeah, okay. So, like depesz does that, right? Like an anonymization tool. Yeah.

Depesz has it for plans, explain plans. Imagine this project will transform all object names into something.
So you're saying
you can anonymize it,
send it through,
get it back anonymized.
And code and then decode, right?
Yeah, I understand.
Or mask and unmask, or anonymize and de-anonymize. You keep the mapping.
I think it's good.
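The mask-then-unmask flow being sketched here could look something like this in Python. This is a toy sketch: the function names and the obj_N placeholder scheme are invented, and a real tool would need a proper SQL parser rather than word-boundary regexes to handle quoted identifiers, string literals, and name collisions.

```python
import re

def mask_identifiers(sql: str, names: list[str]):
    """Replace known identifiers with placeholders before sending a
    query to an external service; return the masked text plus the
    mapping needed to restore the original names later."""
    mapping = {}
    masked = sql
    for i, name in enumerate(names, start=1):
        placeholder = f"obj_{i}"
        mapping[placeholder] = name
        # \b so "email" does not also match inside "emails"
        masked = re.sub(rf"\b{re.escape(name)}\b", placeholder, masked)
    return masked, mapping

def unmask(text: str, mapping: dict[str, str]) -> str:
    """Restore original identifiers in a response using the saved mapping."""
    for placeholder, name in mapping.items():
        text = re.sub(rf"\b{re.escape(placeholder)}\b", name, text)
    return text

masked, mapping = mask_identifiers(
    "SELECT email FROM persons WHERE email = $1",
    ["persons", "email"],
)
# masked is now "SELECT obj_2 FROM obj_1 WHERE obj_2 = $1";
# unmask(response, mapping) would restore the real names in the reply.
```

The mapping stays on your side, so only the placeholder-rewritten query ever leaves your machine, which is the protection being discussed.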
Sure, but casual users aren't going to...
Like, if we're talking about using ChatGPT directly at the moment,
that's not super practical.
And if it's a time-saving tool, what's the point anymore?
But yeah, so what about the...
Like, you don't have any concerns around it?
I have concerns.
Like, my query can have emails.
I don't want them to be leaked, or SSN numbers, or some medical info.
I don't want this data to be leaked.
So, of course, I need some gloves or mask.
What's the best name for this project?
PG mask or PG gloves or something else?
PgAnon or something.
Maybe that's not got good connotations.
Okay, I'll think about it more.
But this is, I bootstrapped it slightly.
But I think it will have some negative effect
because when you talk to ChatGPT, if you have some experience, you say: this is table persons, it has column email. And from the names, ChatGPT also can suggest something useful. For example, it can suggest you have an index on lower(email), to make it case-insensitive, or use the data type citext, case-insensitive text, right? If you convert it to "column1", it won't; it will work less efficiently, right? Well, you lose something here, some sense that can be derived from names.
But at least you're protected.
If you transform to some weird names using some tools or maybe your own new ones,
and then in responses you transform substrings back to it,
it will protect you almost fully.
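The two suggestions mentioned above (an expression index on lower(email), or the citext type) would look roughly like this, on a hypothetical persons table:

```sql
-- Option 1: expression index for case-insensitive lookups; queries must
-- use the same lower(...) expression to benefit from the index.
CREATE INDEX ON persons (lower(email));
SELECT * FROM persons WHERE lower(email) = lower('Alice@Example.com');

-- Option 2: the citext extension makes comparisons case-insensitive
-- at the type level, so plain equality works.
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE persons ALTER COLUMN email TYPE citext;
SELECT * FROM persons WHERE email = 'Alice@Example.com';
```

This is the kind of name-derived suggestion that blind masking would prevent: with "column1" instead of "email", the model has no hint that case-insensitive matching might be wanted.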
Some people told me that some SQL queries, their structure, can also be considered IP. Well, I can share very advanced SQL examples. Very advanced, I learned them from other people. This is about patents: this query text, explain plan text...

It's not just patents though, right?
It's also even coming up with those.
It is somewhat,
might be somewhat novel.
And the idea of secrecy here... I think everything should be open. Postgres is open. All recipes should be open. Talks should be open. Recordings should be open. Editing should be open.

Nice. We've run long, I think. It's a good one still.

It was supposed to be a very short episode.

Yeah. How do you want to conclude?

Verify, check, test, experiment. This is the key.

I like the thinking of using it as a brainstorming tool, as a creativity tool: it gives you ideas, and very much don't trust anything it says, but you might get some good ideas.

Same as humans. Don't trust our episodes.

Yeah, nice one. Well, thank you, Nikolay. Thanks, everybody, and see you next week.

Good, see you.