Postgres FM - ChatGPT x PostgreSQL

Episode Date: May 12, 2023

Nikolay and Michael discuss using ChatGPT for Postgres tasks — should you, if so what for, and some things to be mindful of!

Here are links to a few things we mentioned:

ChatGPT
Nikolay's polls on Twitter and on LinkedIn
The Art of PostgreSQL (book by Dimitri Fontaine)
SQL Performance Explained (book by Markus Winand)
Nikolay's YouTube correction about deletes and index amplification
Don't use ChatGPT to solve problems (blog post by Christophe Pettus)
Query optimization session with ChatGPT, Michael, and Nikolay (on YouTube)
DBeaver SmartAssistance feature
Depesz anonymization feature

~~~

What did you like or not like? What should we discuss next time? Let us know on YouTube, on social media, or by commenting on our Google doc.

If you would like to share this episode, here's a good link (and thank you!)

~~~

Postgres FM is brought to you by:

Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:

Jessie Draws for the amazing artwork

Transcript
Starting point is 00:00:00 Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL. I am Michael, founder of pgMustard. This is my co-host Nikolay, founder of Postgres.ai. Hey Nikolay, what do you want to talk about this week? Hi Michael, this is very boring. Next time I should introduce us, next time it's my turn. Let's have less boring episodes. Let's talk about ChatGPT. Yeah, it's time. I guess during the last three months since GPT-4 was released, everyone has already tried it at least a few times. Yeah, exactly. Hopefully we're not going to cover any of the basics
Starting point is 00:00:33 in terms of what it is and everything. I think pretty much everybody knows that already. And if not, I'm sure other people have done it better than we possibly could. But you want to talk about how people can be using it with Postgres, whether they should, what the risks are, what the opportunities are, that kind of thing, right? Right, let's talk not about pgvector and how to put embeddings in and how to extend the behavior of ChatGPT and so on, but about GPT-4 and so on, like GPT-3, 4, 5. Let's talk about user experience, SQL support, developer experience. Say you are a database engineer,
Starting point is 00:01:08 or you are a backend engineer who needs to work with Postgres. Maybe you are a DBA who needs to work with Postgres. What can it bring you, pros and cons? Let's talk about these aspects. I mean, pgvector is a great thing, and we see Supabase, Neon, and even RDS recently started to provide it.
Starting point is 00:01:28 It's super cool, a lot of new ideas are being implemented, but let's put this topic aside and focus only on the regular user experience. I'm a user, I mean I'm a developer developing my app, or I'm a DBA, or I'm writing SQL code, or I'm analyzing my data
Starting point is 00:01:47 using SQL, data analysis or something. How can ChatGPT help, and so on. Sounds good. I like this topic, because I think it's going to be one of the ones we disagree most on. So I'm sorry in advance. Finally, I hope. Sorry for interrupting. I hope we will disagree 100%, in all aspects, on this topic.
Starting point is 00:02:06 Yeah, maybe not all aspects, but I think I'm a lot more negative about some of these things, or a lot more conservative, maybe, cautious, than you are, based on previous discussions. Yeah, maybe, I don't know. Let's take it as a valid assumption that everyone has already tried it at least once. The only thing is, is the GPT-4 version still only for paid customers, or is it free as well? I believe it's still paid only. Right. The default is 3.5, and a lot of people used 3.5.
Starting point is 00:02:38 And my first point: don't use it. Yeah, version 4 is a lot better in some very noticeable ways. Just don't use 3.5. Don't use 3.5, don't use integrations with 3.5, don't use it. Forget about it.
Starting point is 00:02:48 So, quick thing: if you see people sharing screenshots, I think an easy way of telling the difference is that on version 4 you'll see it's a black logo when ChatGPT is responding,
Starting point is 00:03:01 and in 3.5 it's a green logo. So that's an easy way of seeing what somebody used if they share screenshots. In all my experiments, 3.5 was quite silly, stupid and very basic, with a lot of hallucination. Version 4
Starting point is 00:03:15 still hallucinates a lot sometimes, but it's more deep and wide. So just use version 4, it's much, much, much better. That's why, when it was released, a new wave of attention started. But as I said, you're not alone in being pessimistic. Assuming everyone has tried it, right? I'm checking the results of polls I started a couple of days ago on LinkedIn and on Twitter. I asked: do you use ChatGPT for your Postgres-related work?
Starting point is 00:03:47 Three options: not at all, a few times per week, or already a daily user. On Twitter, 80% not at all, 15% a few times per week, 5% daily. How many voted? 99. Okay, wow. Not many, but I started it two days ago, so sometimes I get more votes in two days. And already a daily user, just 5%.
Starting point is 00:04:12 Just 5%. I'm close to being a daily user already. And LinkedIn is similar, actually better in terms of positive usage: not at all 60%, a few times per week 30%, daily user 10%. And the sample again? The sample is 81, also not many.
Starting point is 00:04:32 Yeah, but not 10. Not 10, yeah, around 100, but not many. And with thousands of impressions, I guess many people just don't vote. Did that surprise you? What I feel is, when GPT-4 was released, a huge wave started. It was a boom, right?
Starting point is 00:04:49 Now we're on the declining side. But it's going to change, I bet. Because, of course, GPT-5, when will it be released? Later this year, maybe next year? Have they announced that? There are only some rumors, I guess. Maybe later this year, maybe next year.
Starting point is 00:05:05 But, of course, this will be a boom. But also, I think people are developing new integrations, they're adjusting methodologies and so on. It's going to stay, in my opinion. I'm very positive that it's a super helpful thing, and I'm using it almost daily. But many people disagree, I see it.
Starting point is 00:05:24 We see posts saying don't use it at all, which is wrong. I say use it a lot, just use version 4. Of course, it has this stupid limitation, 25 messages per three hours. I often bump into it, and then it says, okay, switch to 3.5. 3.5 is just terrible in my experience compared to what version 4 provides. So how come you're using it so much? What kinds of things are you using it for? Good, let's explore this. Where
Starting point is 00:05:52 to start, developer tasks or administrative, like infrastructure, tasks? Maybe focus more on the most common things you use it for, in terms of frequency. How come it's nearly daily? I don't need help with SQL usually. Well, let me be honest. I don't use procedures often, and yesterday I needed to code some procedure, not a function. Well, I could do a function as usual,
Starting point is 00:06:16 but I wanted transaction control. It's something which talks to an external service right from the Postgres primary, which is not good, but in this case we do it understanding the problems. So I saw some example somewhere in the documentation and saw BEGIN ATOMIC, and I actually didn't know what BEGIN ATOMIC is. Well, I can guess, right? So I asked ChatGPT.
Starting point is 00:06:39 It says, I'm on the September 2021 version, and that outdated Postgres doesn't support BEGIN ATOMIC. Obviously it does, just in newer versions. But then it says, you know, in the SQL standard this means that the whole body will be a single transaction. Well, okay, obviously. And I'm sure the Postgres documentation would explain it, just not on the same page.
Starting point is 00:06:59 I just searched for it, didn't find it. ChatGPT was a faster way to understand it. Well, I understand. Okay, now it's a single transaction. I don't need that; I need the regular BEGIN and then COMMITs to split into multiple transactions. It was useful. I mean, it was a faster way than Google or trying to find it in the documentation.
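To make the distinction concrete, here is a minimal sketch, with a made-up events table and procedure names: a SQL-standard BEGIN ATOMIC body (Postgres 14+) runs as part of the calling transaction and cannot contain COMMIT, while a PL/pgSQL procedure invoked with a plain CALL can use COMMIT to split its work into multiple transactions.

```sql
-- Assumes: CREATE TABLE events (message text);

-- SQL-standard body (Postgres 14+): parsed at creation time and executed
-- within the caller's transaction; COMMIT is not allowed inside.
CREATE PROCEDURE log_event(msg text)
LANGUAGE sql
BEGIN ATOMIC
  INSERT INTO events (message) VALUES (msg);
END;

-- PL/pgSQL procedure with explicit transaction control: each COMMIT ends the
-- current transaction, so the two inserts land in separate transactions
-- when called with a plain CALL (not inside an outer transaction block).
CREATE PROCEDURE two_steps()
LANGUAGE plpgsql
AS $$
BEGIN
  INSERT INTO events (message) VALUES ('step 1');
  COMMIT;
  INSERT INTO events (message) VALUES ('step 2');
  COMMIT;
END;
$$;
```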
Starting point is 00:07:19 The documentation has this on a different page, obviously, right? But can I just add at this point, it's super interesting that you use it for that. I wasn't expecting factual checking to be the first use case, because my main fear with using it for things like that: if I check the source code, even if it takes me a few seconds longer, or if I check the Postgres docs,
Starting point is 00:07:39 I trust that what it says is true. If I check ChatGPT, how do I know it's true? Of course, with an example like that, you believe it's true because that's also your instinct as to what BEGIN ATOMIC was going to mean. But I get incorrect or false results often enough that I now no longer trust what it says. I don't see any problems here at all. Let's have some spectrum of levels of trust.
Starting point is 00:08:04 Documentation is probably the top trustworthy source of truth, but it has bugs sometimes too. True, yeah. It's just not often at all, a very low percentage. It's still not ideal; it might have false or misleading statements sometimes, but the probability is tiny. Especially tiny for things that matter, right? If it was something that mattered, it would have been more likely to have been corrected.
Starting point is 00:08:30 Right. It's more likely to have bugs in places that don't matter so much; that would be my instinct. Right. Now let's take books, for example. Give me any book, and I will find a problem with logic or false statements. I mean a Postgres book; give me any book, I will find it. Because they are much lower in terms of quality. I haven't seen any super high quality books at all. The Art of PostgreSQL, Dimitri Fontaine. I must admit I didn't read it fully, but okay, this is my homework, I will find problems in it. Sorry, Dimitri. Challenge accepted. Yeah, it's a very good book. Yeah, well, maybe it will not be with logic, but something like a controversial statement or something easy,
Starting point is 00:09:13 I will find it. What about Markus Winand and SQL Performance Explained? Very good, very good example. Yeah, this is a very good example. Second challenge accepted, I will find some issues. You've read that one, surely? Well, I read some parts of it. Okay, also a very good book, a very high quality book. You chose good examples. Yeah, because, and this is my point about Google search as well, right: so documentation is very high, Postgres documentation; very good books, because I know the authors, I have a higher level of trust. But it's just a couple of people, plus maybe a couple more people who check it; it's a tiny group of people, and they make some mistakes for sure. I'm not questioning that. I'm just saying it's maybe slightly lower than documentation, because I
Starting point is 00:09:59 don't believe as many eyes have gone over it, and I don't believe it's as easy to correct, although the Postgres documentation could be easier to correct, for sure. Then Google, like search results, for example: again, you get some metadata, you get some information about who wrote this and when they wrote it, you can check some sources, and you build that up over time. Even on Stack Overflow, for example, over time you start to notice some people that give amazing answers, and some names I just trust more than others when it comes to a Stack Overflow response. But with ChatGPT, I get no information about the source. Where did this information come from?
Starting point is 00:10:36 What is it derived from? I don't get any signals as to whether to trust it or not. You see, Markus Winand, perfect author, I have huge respect for this author. But if you don't know this name, where is the trust coming from? Yeah, I understand, trust is built over time, for sure. The first time I read it, I check whether it's true. Okay, yes, that was good.
Starting point is 00:11:00 And then the second time, and the third time, and the tenth time. And sometimes they teach me something that I didn't expect, or they question an assumption I had that I didn't realize I had, and then I verify it's true. Again, verify is the key thing I wanted to deliver in this episode. Yeah, I don't trust anyone. And in my whole Postgres consulting practice, my daily development and so on, I often don't trust myself. Maybe I'm wrong, and this is normal. And the key point: you need to verify. You need to experiment, test, and so on.
Starting point is 00:11:39 You also need to know how to properly do it. For example, if you do something with a large data set, use a large data set, and so on. And the same here. Of course, some authors produce super high quality statements and share super high quality thoughts. For example, Tom Lane on pgsql-hackers, 99% unbeatable logic, right? But some statements, even from Tom Lane, I could argue with, and some people could argue as well. Because life is not simple. Maybe, for example, Tom Lane said something, but for you, for your particular case, it cannot be applied. Or you have different experience, and Tom Lane said don't use this approach, but you're still using it and you find it super helpful in your case. It might happen.
Starting point is 00:12:27 So if you consider ChatGPT as some consultant, and again, version 4 should be used, take everything with a grain of salt and verify. The magic is that you can tell it: you know, you say something, but there is something wrong here, right? And it can fix problems, it adjusts. So this chain of communication, sometimes it's annoying, like every answer starts with I apologize
Starting point is 00:12:53 for the confusion, right? I get that so many times. But it's super helpful. It's the same as if I hired someone and the guy is super fast in answering and has super wide knowledge, sometimes quite deep as well, but might hallucinate sometimes. It's okay, and if I have a very good way to verify ideas and iterate, I cannot understand how to live without it at all. Yes, so you're saying that having access to GPT-4 is like hiring somebody with an incredible breadth of knowledge, sometimes very deep knowledge. And they're always online, they always reply straight away, they're never in the middle of something else, except when you run out of queries per hour.
Starting point is 00:13:36 Yeah, I completely agree with that. The one thing I would say is a bit different... No, no, no. You hire someone and you don't trust this employee 100%. You need to verify ideas and sometimes say, you know, you're probably wrong. You can work with ChatGPT as with a person. Ten years ago I was a no-name here, right? Initially. And when I went to companies like Chewy, GitLab and others, they doubted everything you say, everything. So if you say something, you need to prove you are right. And the best way to prove it, obviously, is to try it and show how it works and explain in great detail how to repeat it. So you claim something and you show how to check it, how to verify it. And if you consider that ChatGPT is not a 100% reliable thing, it's not documentation, right?
Starting point is 00:14:40 It's some consultant who is wrong sometimes, quite often wrong, but we allow it to be wrong sometimes, right? Without a way to verify it, you will end up concluding that this is a bad tool. With the verification approach, doubt everything and verify it. And if it makes mistakes, show these mistakes and ask it to fix them. It works very well. One big difference versus a consultant, or a human in general, or at least the types of humans I would employ,
Starting point is 00:15:16 is that when it is wrong, it's very confidently wrong. It doesn't say, I'm not sure; it's very, very confident in its answers, which is very different from humans. So that's a big difference. But it's also a lot cheaper than humans, so I think it's not a fair comparison at all. Always put
Starting point is 00:15:39 some doubt element into all statements like i would would say that, or most likely blah, blah, blah. And so on. Like there are many, many intro phrases consultants use, which assume some probability of mistake. And my own style is I always say, I'm not sure. I never, I'm never sure. In our own episodes, I was wrong many times. For example, I just commented, I responded to some YouTube comment where I corrected myself. I claimed that delete has an index amplification problem. It doesn't. Because deletes just put Xmax to tuple. Indexes are not touched. And I corrected myself in YouTube comment and someone asked to explain, to elaborate, which I did. I can elaborate some other time. But when I
Starting point is 00:16:28 elaborated in that comment, I thought, maybe I'm wrong, you know. I checked the docs, I didn't check the source code, I checked Egor Rogov's PostgreSQL inside. Internals, yep. It's not Internals, it's inside; in Russian, it's PostgreSQL inside. A perfect book,
Starting point is 00:16:44 by the way. Fully translated to English and so on, but published on… Perfect book without mistakes. Is that what you're saying? I'm teasing you, don't worry. Yeah, yeah, yeah. Well, I'm sure there are mistakes there. I'm sure if you ask Igor, he will admit some probability of mistakes there. Of course.
Starting point is 00:17:02 Good authors always admit it, and good consultants always do. So what I did eventually, before I put the comment on YouTube: I went to psql and verified everything I was going to say, and then I added it to the comment. And I found in my consulting practice that this is what we always need to do: if we claim something, show it. And that's something ChatGPT can't do. It can... I mean, yes and no. ChatGPT cannot, but you can assume it can be wrong, assume it can be hallucinating, and ask ChatGPT to provide some ideas, for example. It's a brainstorming tool, it's not verification. There you go. Yeah, then you need a verification tool.
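As an aside, a minimal sketch of the kind of psql check described above for the delete claim, assuming the pageinspect extension is available; the table name and layout are made up for illustration.

```sql
-- A DELETE marks the heap tuple by setting xmax; it does not write to indexes
-- (dead index entries are removed later, e.g. by VACUUM or on-the-fly cleanup).
CREATE EXTENSION IF NOT EXISTS pageinspect;

CREATE TABLE demo (id int PRIMARY KEY, payload text);
INSERT INTO demo SELECT g, 'row ' || g FROM generate_series(1, 100) AS g;

DELETE FROM demo WHERE id <= 3;

-- t_xmax is now non-zero for the deleted tuples in the heap page,
-- while the DELETE itself added nothing to the index.
SELECT lp, t_xmin, t_xmax
FROM heap_page_items(get_raw_page('demo', 0))
LIMIT 5;
```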
Starting point is 00:17:51 And I'm going to shamelessly advertise what we do: Postgres.ai Database Lab. This is a perfect verification tool: database branching, thin cloning. If you have ChatGPT and you have a cheap and fast way to check anything... Well, of course, you cannot check hard things like a hardware change and so on, but for SQL query behavior on large datasets, thin cloning and database branching are priceless here. People work on bottomless, on serverless,
Starting point is 00:18:16 but we work on a priceless thing, which is database branching, true database branching on any Postgres, in any location. This is my company, my product. So you take ChatGPT and you go verify on a clone. A clone is provisioned in a second, for free. I mean, you need to prepare it, but then you have zero extra costs, and then you can iterate. Okay, I say, you know, on my one-terabyte database this is what I see. And I treat it like a human, it might be mistaken.
Starting point is 00:18:49 So you proposed something which is not working on my one-terabyte database, and it will adjust. And in a couple of iterations you will have much better results. And I think this is something new here. With code, it works pretty well. I've already used it for Ansible coding, shell scripting, like some advanced shell scripting, React coding. I'm not a React expert at all, but I did many things. Wow, with ChatGPT I can do many things, because with code you can easily check it very quickly, because you can run it. If it's C, probably you need to wait until it's compiled, right?
Starting point is 00:19:23 That's a really good point, though. It's easy to verify quite quickly. So we've actually skipped a bit. Originally, we were talking about verifying things that you could check in the docs, for example, or asking it for factual information. Now we're talking about trying out ideas, like idea generation, brainstorming. And now I get that for simple, isolated things, it's very easy to test those things. What about system-wide issues? If you're asking about, I don't know, some modeling question, would you use it for that kind of thing?
Starting point is 00:19:51 Where you're maybe asking about architectural decisions: is this a good idea or is that a good idea? Everything is a good idea. It has absorbed a lot of stuff; it has absorbed Stack Overflow. Of course, sometimes it's, again, hallucinating easily. But if you prepend to all responses something like: take it with a grain of salt, this is what I think, as any good consultant would say: without verification I can say this, but this needs thorough verification.
Starting point is 00:20:21 But off the top of my mind, I think this. If you prepend a consultant-style phrase like that to any ChatGPT response, if you use GPT-4, it will become a very good, useful tool. You unlock your ideas and then you go to verify them. Then you come back and say, this is not working. Data modeling, perfect example. Seeding is a perfect example. I have a new table in a site I haven't launched yet. I used it recently. I have some rows already there, which probably represent my data in the future. Can you generate more?
Starting point is 00:20:55 Okay, I have 10 rows; I want 10 gigabytes of rows. Can you help me? It says, okay, use this snippet with generate_series. This is my real-life example. ChatGPT version 4 provided a pretty interesting generate_series example, with very useful comments: you can adjust these parameters in this snippet.
Starting point is 00:21:16 It was super helpful; I would have spent some time generating this myself. I used it and said, you know, I wanted 10 gigabytes but this produced only three. It said, no worries, adjusted it, and we generated seven more gigabytes in the second iteration. Now I have a data set, and I can start testing my application using this data set. This is a very valid case. Yeah, I can see that. I've had a similar experience with it helping me generate things. I needed an example for a blog post, and I think sometimes I get blocked on those, you know, just trying to think a little bit creatively.
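For reference, a minimal sketch of the kind of seeding snippet being described, with made-up table and column names; the generate_series bound is the parameter you adjust until the table reaches the size you want.

```sql
-- Synthetic seed data: raise the generate_series upper bound until the table is big enough.
CREATE TABLE accounts (
  id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email      text,
  created_at timestamptz,
  balance    numeric(12, 2)
);

INSERT INTO accounts (email, created_at, balance)
SELECT
  'user' || g || '@example.com',
  now() - random() * interval '365 days',
  round((random() * 10000)::numeric, 2)
FROM generate_series(1, 10000000) AS g;  -- ~10 million rows; increase for more

-- Check how close you are to the target size.
SELECT pg_size_pretty(pg_total_relation_size('accounts'));
```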
Starting point is 00:21:52 So you also use it? Yeah, I've used it, but my experience is different. And I pay for 4 as well at the moment. And then... You're going to stop? Maybe going to stop. I haven't used it for a couple of weeks, but sometimes, like if I'm exploring creatively,
Starting point is 00:22:08 I've found it helpful. I like your premise that it can be used as a brainstorming tool. I very much agree that it can be really helpful for some creative ideas. In the brainstorming approach, enterprise style, we always say: during a brainstorm, don't criticize ideas, just collect ideas, don't
Starting point is 00:22:27 demotivate people. So don't demotivate ChatGPT, it tries its best, you know. Well, but it also lies, and that's my... It also doesn't say, as humans do, it depends. Yeah, but it lies way more explicitly. And one thing I'd say is, the reason lots of people disagree on this is, I think, that it's a bit of a glass-half-full, glass-half-empty way of viewing it. When I see people praising ChatGPT, they could easily get an answer that has 10 bullet points in it, and seven of them are quite impressive and three of them are garbage. The glass-half-full person is like, oh my goodness, how did it get those seven so good, and they know enough that the three are not dangerous or are not a problem if it's talking
Starting point is 00:23:09 complete nonsense or if it's suggesting really bad ideas. Whereas I think it's a very fair criticism to say that there are three quite dangerous pieces of advice in there, please don't listen to what this thing is saying. And I think that's where some people are coming from; I think we saw that post by Christophe Pettus just a couple of days ago saying don't use it for advice, because sometimes it's really bad. Naive, that's a naive statement; everyone is going to use it. Do you see the different angles, though? You could be viewing the exact same 10 bullet points, and one person says this is amazing and one person says this is dangerous. If you hire a consultant, it's going to be mistaken at some point.
Starting point is 00:23:51 Again, prepend my prefix to all responses. If OpenAI would be more... I don't know, they should probably do it. I agree. Instead of I apologize for the confusion, they should prepend to the first answer
Starting point is 00:24:07 something like: take this with a grain of salt, this is what I think, without verification, go and verify it. Yeah, it would be much less impressive. So in 2016-17, I started to give talks about database experiments.
Starting point is 00:24:25 My approach is: verify everything. And of course, it might take ages to verify everything. That's why we need to simplify it, to speed it up, to make experiments cheap. And we found a way: thin cloning and database branching. It's a recipe for most of the needs back-end engineers have. Not DBAs and infrastructure teams.
Starting point is 00:24:48 They have different needs. But programmers, back-end engineers, and sometimes front-end engineers, we can satisfy them very well with thin cloning and database branching. So with this approach, like a zero-trust approach: if security has a zero-trust approach, why couldn't we have this in development? Okay, someone has some idea, but without verification, we have zero trust in it.
Starting point is 00:25:16 We say thank you, but we are going to check it. With this approach, I cannot see how you can consider ChatGPT a bad tool at all. Right. I think we've covered accuracy quite well. I wanted to ask a couple more things. First one, privacy. How are you feeling about the privacy policy in general? How do you think about that? Okay, that's a tough question. Before we go there, let me remind our listeners that we had a beautiful session with ChatGPT on SQL optimization. It's on the Postgres TV YouTube channel. I would like to have another one.
Starting point is 00:25:52 It would be great. There were some funny moments. I think we spent a long time on it in the end. Yes, but the summary was it exceeded my expectations. Yeah, it was quite good. And I think we gave it quite a good example; I think we only checked one example. It exceeded them, but not at the first iteration.
Starting point is 00:26:09 So the secret sauce was verification. Because I said, you know, I tried this, it doesn't help. But I also said I know there is a good answer. Well, you can always say that, right? But you don't think to, right?
Starting point is 00:26:25 I didn't reveal anything. It found the proper answer after iterations. And I insisted, right? Extra prompts, like second and third prompts. I said, let's try again, let's try again, let's try again. And eventually it provided a very good example and we verified it, and that index was great. And this is what I also experience myself, also not at the first iteration. No, there are no perfect humans. So if you
Starting point is 00:26:54 consider ChatGPT a human, more like a human, it helps. Well, on the topic of SQL optimization, obviously as a product maker I was thinking, should I be using this in my product? Or how could I? Is it a threat? You know, in a way, if it can be used for SQL optimization, it's a direct competitor to the product I make. So it's naturally a very serious thought. And maybe I'm deliberately negative, you know; consultants being negative on it and telling you not to use it could also be self-preservation, right? Same with a toolmaker who it could be competitive with. But what I wanted to say was, the reason I've not used it yet in the product, the reason I'm not
Starting point is 00:27:30 planning to any time soon, is that we want to give advice that is trustworthy. It does come with a lot of, you know... We do optimize for not giving those false positives, that bad advice that's dangerous, and that would rely on something I'm not willing to... Exactly. So I think there's an interesting trade-off there. My approach is very different. I've learned in my life what's good in terms of finding interesting results.
Starting point is 00:27:58 It's good to generate a lot. Like this brainstorming approach: you generate crazy ideas, a lot of them, and then just go and verify them. If verification takes limited time and a limited budget, it's great. So I would rather choose 10 ideas, three of them crazy and wrong, than just five ideas which are trustworthy. I would grab the 10 and go verify them. My approach is: give me all the ideas.
Starting point is 00:28:26 I'm going to verify. I have zero trust in any statements, so I'm going to verify them myself anyway. And without this verification component, we have only half the story here. Let's skip to privacy. And I want to ask about something else as well. Why do you say privacy? Okay, sorry.
Starting point is 00:28:45 It's the same thing. So, privacy. It's not a resolved problem. And I remember in DBeaver they have a bold statement: you can disable it here, fully disable it here, the integration with ChatGPT. Which, of course, demonstrates that they know
Starting point is 00:29:01 that many people are concerned. And we know some countries have banned ChatGPT, and some have already unbanned it. It's still a shaky area, right? It's not fully clear how to approach this. Because my question to you: OpenAI, is it about openness of everything? If you talk to ChatGPT, revealing some data and some secrets, is it going to be open as well, because it's OpenAI? So now my SSN, bank account, routing number, everything is going to be open? Or what? They say no, right? I know you're joking, but they say no. But it's an American company, right? It's a for-profit company now, I believe. I know it started as
Starting point is 00:29:45 i know it's supposed to be more of a non-profit initiative and but now i believe it's a fully for-profit company i'd rather yeah yeah well yeah you could pay a non-profit as well but yeah so i think there's a real concern now i've definitely seen some companies also not allow their employees to use it for these some of of these concerns for them but the main reason i see people not using it or being told they can't use it is actually around ip so like in terms of using what it tells you like about around oh i guess it's around the ethics of it how did they you know the gathering of that information do you have any concerns on that side of of how they got the information?
Starting point is 00:30:25 Let's split it into parts. We have SQL queries, we have EXPLAIN plans, and we have some additional parts of IP, like ideas and approaches and so on. I'm strongly against patents on ideas. Ideas
Starting point is 00:30:41 should flow, like in a scientific approach. You go to a conference, share ideas, everyone takes them, improves them, and so on. And somebody, maybe not you, will win. Of course, many people will not agree with me in terms of patents on ideas, but this is my opinion.
Starting point is 00:30:58 I think it's bad for innovation and for progress. But if we talk about queries and plans, what do you think, what's the best name for a project that would solve it: PgMask or PgGloves or something else? I don't understand.
Starting point is 00:31:16 Okay. Imagine you talk to ChatGPT and you show some query and maybe some plans, execution plans, EXPLAIN plans, which pgMustard works with as well. And before you send it to ChatGPT, you
Starting point is 00:31:29 just transform all table names, object names, and parameters to something. Oh, like a... Yeah, okay. So like depesz does that, right? Like an anonymization tool. Yeah. Depesz has it for plans,
Starting point is 00:31:46 EXPLAIN plans. Imagine this project would transform all object names to something. So you're saying you can anonymize it, send it through, get it back anonymized.
Starting point is 00:31:56 Encode and then decode, right? Yeah, I understand. Or mask and unmask, or anonymize and de-anonymize. You keep the mapping. I think it's good.
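A minimal sketch of that masking idea in SQL, with made-up names, just to show the substitution direction; a real tool would keep the mapping and reverse it when the response comes back.

```sql
-- Mask real object names before the query text leaves your machine.
-- \m and \M are Postgres regex word-boundary markers.
SELECT regexp_replace(
         regexp_replace(
           'SELECT * FROM persons WHERE lower(email) = lower($1)',
           '\mpersons\M', 't1', 'g'),
         '\memail\M', 'c1', 'g') AS masked_query;
-- Result: SELECT * FROM t1 WHERE lower(c1) = lower($1)
```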
Starting point is 00:32:09 that's not super practical. And if it's a time-saving tool, what's the point anymore? But yeah, so what about the... Like, you don't have any concerns around it? I have concerns. Like, my query can have emails. I don't want them to be leaked, or SSN numbers, or some medical info. I don't want them to be leaked or SSN numbers or some medical info. I don't want this data to be leaked.
Starting point is 00:32:29 So, of course, I need some gloves or a mask. What's the best name for this project, PgMask or PgGloves or something else? PGN9 or something. Maybe that hasn't got good connotations. Okay, I'll think about it more. But this is... I bootstrapped it slightly. But I think it will have some negative effect,
Starting point is 00:32:47 because when you talk to ChatGPT, if you have some experience, you say: this is a table persons, it has a column email, and from the names ChatGPT can also suggest something useful. For example, it can suggest you have an index on lower(email), to make it case-insensitive, or use the data type citext, case-insensitive text, right? If you convert it to column1, it won't; it will work less efficiently, right? So you lose something here, some sense that can be derived from the names. But at least you're protected. If you transform to some weird names using some tool, or maybe your own new one, and then in the responses you transform the substrings back, it will protect you almost fully.
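For the record, a minimal sketch of the two suggestions mentioned here, assuming a persons table with an email column.

```sql
-- Option 1: expression index; query it with WHERE lower(email) = lower($1).
CREATE INDEX ON persons (lower(email));

-- Option 2: case-insensitive column type from the citext extension.
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE persons ALTER COLUMN email TYPE citext;  -- comparisons now ignore case
```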
Starting point is 00:33:42 Some people told me that for some SQL queries, their structure can also be considered IP. Well, I can share very advanced SQL examples. Very advanced. I learned them from other people.
Starting point is 00:34:00 This is, again, about patents. This query text, EXPLAIN plan text... It's not just patents though, right? It's also even coming up with those.
Starting point is 00:34:11 It is somewhat, it might be somewhat novel. And the idea of secrecy here... I think everything should be open. All recipes; Postgres is open. Recipes should be open, talks should be open, recordings should be open,
Starting point is 00:34:26 editing should be open. Nice. We've run long; I think it's a good one still. It was supposed to be a very short episode. Yeah, how do you want to conclude? Verify, check, test, experiment, this is the key. I like the thinking of using it as a brainstorming tool, as a creativity tool, to give you ideas, and very much don't trust anything it says, but you might get some good ideas. Same as humans; don't trust our episodes. Yeah, nice one. Well, thank you, Nikolay. Thanks, everybody, and see you next week. Good, see you.
