Postgres FM - Health check

Episode Date: March 22, 2024

Nikolay and Michael discuss Postgres health checks — what they are, things to include, how often makes sense, and whether improvements to Postgres would increase or decrease the need for them.

Here are some links to things they mentioned:

MOT (car test in the UK): https://en.wikipedia.org/wiki/MOT_test
Let's make PostgreSQL multi-threaded (discussion started by Heikki): https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
postgres-checkup: https://gitlab.com/postgres-ai/postgres-checkup
Why upgrade: https://why-upgrade.depesz.com/

Related episodes:

Default configuration: https://postgres.fm/episodes/default-configuration
Index maintenance: https://postgres.fm/episodes/index-maintenance
Bloat: https://postgres.fm/episodes/bloat
Monitoring checklist: https://postgres.fm/episodes/monitoring-checklist
pg_stat_statements: https://postgres.fm/episodes/pg_stat_statements
Backups: https://postgres.fm/episodes/backups

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to Jessie Draws for the amazing artwork.

Transcript
00:00:00 Hello everyone, this is Postgres FM, a podcast about — guess what — Postgres. I'm Nikolay, and hi Michael. Hello Nikolay, how are you doing today? I'm good, thank you. How are you? Very good. So, we discussed before the call that we have a few options for topics, and it was my choice, but somehow you chose right.
00:00:29 You chose the health check, or checkup — Postgres health checks: how to check the health of Postgres clusters periodically, why to do so, what to look at, and so on, right? Well, you said you had two ideas. One of them was this one, and I was like, I like that idea. And then you explained the other one, and I was like, I like that idea too. So I think we've got lots of good ones left to go. Yeah, we need to discuss our internal policy — how transparent should we be, right? Yeah. Well, a nice reminder to people that you can submit ideas for future episodes on our Google doc, or by mentioning us on Twitter, or anything like that. Right, right. That's good.
Starting point is 00:01:08 I know people go there sometimes, and I think it's good to remind this. This is an opportunity, and this feeds our podcast with ideas. So, yeah, good point. So health check, where to start? Goals, maybe, right? Well, I think definitions actually, because I was doing a little bit of research around this. It is a slightly loaded term. When you say health check, it could be, for example, software just checking that something's even alive, but all the way through to kind of consultancy services around you know that select one is
Starting point is 00:01:46 working that's that's good so yeah so what do you mean when you say health check what do you mean yeah well i i my perception on this term is very broad of course because in my opinion like what we mean it's not like looking at monitoring and see everything is green or something no alerts or nothing no no no it's about something that should be done not every day but also not once per five years slightly more frequently right like for example, once per quarter or twice per year. And it's very similar to checking health of humans or complex technology like cars, right? Sometimes you need to check. And it doesn't eliminate the need to have continuous monitoring.
Starting point is 00:02:41 Like, for example, in your car, you have various checks constantly being working for example tire pressure by the way I think for humans it's big disadvantage that we don't have monitoring it's improving for example modern smartwatch can provide you heartbeat rate and so on but still we we monitoring when i complain about how poor some postgres monitoring systems are i remind myself how poor human body monitoring is right now still and like we we can monitor technological systems much better than non-technological. So, yeah, we have some things constantly being checked.
Starting point is 00:03:31 But still, we need sometimes to perform deeper and wider analysis for different goals. First goal is to ensure that everything currently is fine and nothing is overlooked. And the second goal is to predict and prevent some problems in the nearest future. And the third goal is things like how we will live in the next few years. It involves capacity planning and proper planning of growth we predict and so so these are three key goals very roughly right what do you think yeah i really like it and i think the car analogy is is particularly good in that we i don't know about i don't think this is true in the u.s but in the uk at least once a car is a certain age, like three years old or something, it's required by law to have an annual health check.
Starting point is 00:04:31 We call it an MOT. But it's not just what's wrong with the car now that you need to fix, but it's also looking at things that could go wrong and trying to get them replaced before they fail. So there are certain parts of the car that if they haven't been changed in a certain number of years they have to be changed so i think that's quite a nice kind of that's at least two of the goals looking at what's wrong now but also if we don't do anything about it what could go wrong very soon and we should get ahead of it, which is not necessarily something monitoring always is good at. Yeah, well, in the US it's very different.
Starting point is 00:05:11 No pressure at all. You just change oil and that's it. Everyone is on their own, as I understand. At least in California. And actually, I'm going to argue with myself and with you a little bit. Better technology means less need for health checks. For example, I own a couple of Teslas, and I was very concerned that nobody is replacing anything,
Starting point is 00:05:35 like filters and so on. And then I needed to replace windshield after some trip to Nevada. And I asked, like, guys, like check everything, like brake pads and so on. Because of course I'm worried, like it's already like almost two years and nothing. And they said, everything is fine. We are going to replace just cabin filter and that's it.
Starting point is 00:06:01 No needs to replace anything because it's better technology. So I think, of course, the amount of work that needs to be done. Let's go back to Postgres because it's Postgres podcast, as I said in the beginning. So if we have very old Postgres, it probably perhaps needs more attention and work to check and maintain good health. I think there's two axes. I think you're right that technology plays into it, but I think also simplicity versus complexity plays a part as well. And as Postgres gets more complex or as your Postgres setup gets more complex i think actually maybe there's more you need to do whereas the car on the car analogy i think one of the arguments for things like the electric cars is they're simpler they have fewer parts they are like they they are easier to
Starting point is 00:06:58 maintain so i think there's also an argument for keeping your postgres setup like this is one of the arguments for like simplicity in well we'll get to some of the specifics but like on the technology side for example if we could get a like bigger than um 32-bit uh transaction ids right we wouldn't have to monitor as often for wrap like wraparound would be less of an issue if we defaulted to in a or big in primary keys instead of in four we don't have as many of those issues so like by certain choices we can simplify and prevent issues but also the technology and defaults matter a lot defaults matter if we if for, all our application code or frameworks and, for example, Rubin Rails made this choice a few years ago, as you said, they default to
Starting point is 00:07:53 integer 8 for surrogate primary keys, then minus one report needed, right? At least, yeah, so we don't need to check uh primary keys integer for primary keys capacity and that's good but of course complexity of the system can grow and sometimes we check in postgres help we go beyond postgres for example we check connection pooling and how orchestration and automation around Postgres is organized with the goals to understand DHA, high availability, disaster recovery, these kinds of topics. So and of course capacity planning, will we grow few years without performance issues or not? But anyway, we can add many things like that. For example, if threads topic succeeds,
Starting point is 00:08:47 started by Heike a year ago, if this succeeds, we won't be worried about single CPU core capacity for system processes. Or, for example, there is ongoing discussion for PgBrowser that threading is also a good thing so we won't be bound process won't be bound by single core
Starting point is 00:09:11 this is a hard wall to hit and postgres processors also can hit, some postgres processors can hit single core utilization 100% and even if you have 100 more cores it doesn't help you right, so if I cannot say utilization 100%. And even if you have 100 more cores, it doesn't help you. Right. So if like, I cannot say this is simplicity, because technology is similar to EV. It's also like a
Starting point is 00:09:33 lot of complex technology behind it. But it's the simplicity of the use, right? Of usage, maybe. Well, yeah, I would actually argue that, let's say we did get threading in postgres in a in a major version in a few years time actually at first you'd want to be doing health checks more often like you'd you'd be more nervous of things going on because you've added complexity and things that used the things that maybe for over a decade have been working great suddenly may may not work as well or just their patterns may change well maybe maybe but uh let's let's find some mutual ground here maybe like we both agree that health checks are needed, right? Or right now, at least. Maybe in Postgres 25, it won't be needed so much
Starting point is 00:10:29 because it will be autonomous database. I personally... Off topic. I think it's a really good way of keeping a system reliable and avoiding major issues, including downtime and outages and huge things but that's yeah i i don't need it as an interesting word like a lot of the people i see hitting issues are just living in a reactionary world a lot of them are startups and they yeah
Starting point is 00:10:59 but once like once things are working well and you've got like maybe you're hitting like an inflection point or you need to capacity plan or you're in a large established business where outages can be really expensive, it seems wild to me to not be doing semi-regular or, if not very regular, health checks. Right. Yeah, I think you're very right. There are two categories of people who, for example, go to us for health checks. Right. Yeah, I think you're very right. There are two categories of people who, for example, go to us for health checks. One is probably the bigger category of people, people who knew that it's good to have health check, but they only go to us when something happens, something bad happens, something already hitting.
Starting point is 00:11:46 And in this case, well, it's actually like a psychological thing, right? For example, for CTO, Postgres database is just one of the things in infrastructure. And when it hurts, only then we realize, okay, let's check what can hurt soon as well. What other things can be a problem in this area, post-guys, right? And this is a very good moment to perform health check. Another category is, I would say, wiser guys. They do have checks before launch or like in the initial stage but at the same time it's harder to work work with them because usually project is smaller like and less it's harder to
Starting point is 00:12:36 predict workload patterns growth rates and so on but this is this is better of course to start checking health initially and then return to it after some time yeah it can go hand in hand with like stress testing as well can't it like before i see a lot of people doing this kind of thing before big events like in the u.s um you've got black friday cyber monday play times where you know there's going to be increased traffic or increased load. Before that, there's a lot of work that people are doing often to just make sure they can. Is that a good time to do a health check? Right. Well, yes, but I would distinguish these topics.
Starting point is 00:13:18 Load testing and preparation for some events like Black Friday is something else. Health checks, as I see them doing many years, it's quite established field, right? They include review of settings, review of index health, blood first. Let's talk blood
Starting point is 00:13:40 first. Blood, both table and index B3, then index health, and we had episode about it, we had episode about bloat, about index health and maintenance, then query analysis, like do we have outliers who like attacking us too much in various metrics, I mean, queries in business admins and other extensions. Then static analysis, for example, you mentioned integer 4 primary keys, I don't like lack of foreign keys and so on. Because if we see it first time, a lot of findings, right?
Starting point is 00:14:19 If we see it second time, maybe since last time, developers added a lot of new stuff right and other areas as well and capacity planning can be strategic like many years or it can be before some some specific event like Black Friday for e-commerce in this case it may involve additional actions like let's do load testing but I would say health check normal health check is a simpler task than proper load testing that's why I'm trying to say it's a separate topic because if you involve it health check suddenly becomes bigger right it's like you can think about it
Starting point is 00:15:02 if you go like in the US if you go, like, in the US, if you go to a doctor, there is annual health checkup, covered by insurance always, right? So it's like your primary care physician doing some stuff, discussing with you, checking, many things, actually, but it's not difficult. And then if there is some question, you go to a specific doctor. And low testing is a specific doctor, I would say. Yeah, I was just thinking about timing,
Starting point is 00:15:28 if it's more likely that people start the process then. And just like maybe just before they go on a holiday, maybe they go for their annual checkup. I don't know if there's times where people are more likely to go for those than others. But yeah, should we get into some more specifics? Yeah, well, speaking of time to finish this topic, sometimes people just have periodical, you know, like life cycle episodes in their work, right?
Starting point is 00:15:52 For example, this is financial year, right? And you plan finances, you spend them, and so on, you manage them. Similar proper capacity planning and database health management should be also like periodical. You would just have some point of time, maybe after some events, for example, after Black Friday, you think, okay, next year, how we expect our growth will look like, how we expect our incident management will look like. And this is a good time to review the health, understand ongoing problems, and try to predict future problems. And also perform capacity planning to understand do we need to plan big changes, like architectural changes,
Starting point is 00:16:36 to sustain the big growth if we predict it in a couple of years, right? In this case, probably you need to think about microservices or sharding or something like that. And health check can include this, like overview. And this is a big distinction from monitoring because monitoring very often has very small retention. You see only a couple of weeks or a couple of months that's it for existing evolved system
Starting point is 00:17:07 you need a few years of observation data points better to have them right to understand what will happen in next years and this is when like health check can help and capacity planning maybe load testing as well but it's additionally i would say yeah specific topics so it all starts usually for me specifically we actually have this tool postgres checkup and it has big plan the tool itself implements only i would say less than 50 of the plan the plan was my vision of health check several years ago. I did
Starting point is 00:17:45 conference talks about it. At that time I remember I performed really heavy health checks for a couple of really big companies. It took a few weeks of work. It's like a lot, like deep and wide, like 100 pages report, PDF of 100 pages. It's like interesting. Executive summary alone was like five or six pages, like quite big. Wow. Yeah, yeah, yeah. Well, it's a serious thing.
Starting point is 00:18:19 Yeah, yeah. But it was for a company which costed more than $10 billion. And the Postgres, like the databases observed few clusters were in the center of this serious business. So it was worth it. And then this tool, this was vision, and we implemented in this tool, we implemented like almost half of this vision. And for me, everything starts with version. Simple thing.
Starting point is 00:18:46 Let's check version. Major and minor. And maybe history of it. Because both things matter a lot. If your current version is 11, you're already out of normal life, right? Because community won't help you. And bugs won't be fixed unless you pay a lot.
Starting point is 00:19:06 And out of security patches as well. Right, yeah. That's including security also, bugs. If you're on 12th, you have time until November, right? So major version is important and also you are missing a lot of good stuff. Yeah. For example, from one of recent experiences,
Starting point is 00:19:23 if you're on 12 12, it not only means that you soon will be having non-supported Postgres, but also you lack good things like wall-related metrics in PGSA statements and explain plans, which we read in the next version, 13. You lack a lot of improvements related to logical decoding and replication. Probably you need to move data to Snowflake or between Postgres to Postgres, and you miss a lot of good stuff. So major version also matters a lot, but minor version may be number one thing.
Starting point is 00:20:03 If you're also lagging in minor version, obviously one thing if you also lagging minor version obviously you probably miss some important fixes and security patches so here of our tool is using why upgrade depeche.com links to explain what like yeah it great tool, but now we're already, like, moving towards combining with, like, it's huge list sometimes, right? If you're lagging, like, 10 versions, 10 minor versions behind, it's more than, like, it's
Starting point is 00:20:35 a couple of years of how many? Of fixes, and, like, oh, what's important here, right? Security, but maybe not only security, right? security right so yeah we are moving towards uh trying to summarize this and uh in understand particular context of course guess what helps llm helps here like it's yeah i think in in like in in the nearest future we will have interesting things here like combining observations why upgrade this diff,
Starting point is 00:21:07 and this is a summary what matters for you. Almost part of executive summary can be read by CTO and with links. I hope you're right long-term, but I think at the moment the release notes are important enough. The risk of missing something very important is still very high. I think sometimes the LLMs at the moment still, they can be good for idea generation, but if you want an exhaustive list of what's very important,
Starting point is 00:21:38 they can miss things or make things up. So I would still not rely on them for things like reading, which like even minor release notes can include things that are very important to do, like maybe a re-indexing is needed and things like that. So yeah. I agree with you. I agree with you,
Starting point is 00:21:57 but I think you are criticizing different idea. Okay, so sorry. Than I'm describing. I'm not saying like, first of all, health checks we are doing, and I think in the next few years we'll be doing them still with a lot of manual steps for serious projects.
Starting point is 00:22:14 So they, of course, involve DBA who is looking at everything. But LLM can help here. Like, you have usually pressure of timing. You need to do this work and a lot of stuff.
Starting point is 00:22:29 And change log with 200 items, for example. Of course you check them, but I bet no DBA spends significant time to check properly.
Starting point is 00:22:44 They said they checked, but I don't trust humans as well here. I think including them in a report is different to reading them before upgrading. That's a really good point. Right. So what I'm
Starting point is 00:23:00 trying to say, we try to convince clients that it's important to upgrade both minor and major. And while explaining, of course, we can provide. Not we can. We provide the list for full changes. Why upgrade is great, too. But also, we want to bring the most sound arguments, right?
Starting point is 00:23:22 Yeah. And OLM helps to highlight, oh, you know, and it was my long-term vision, actually, a few years ago. I thought, why great, you take it, and it would be great to understand, oh, there is an item about gist indexes
Starting point is 00:23:37 or gene indexes, for example. We have schema here. We can check if we have gene indexes, right? Does it affect us? And if it does, this should be highlighted. This is the idea. And for LLM, it's good here. It's like it just performs some legwork here. I mean, not only LLM, some automation with LLM. And it just helps DBAs not to miss important things, you know? But of course, again, I'm sure it should be combined.
Starting point is 00:24:10 Ideally, it should be the work from automation and humans. Okay, too much for versions. Actually, it's the smallest item usually, right? We don't have time for everything else. What to do? Okay, let's speed smallest item usually, right? We don't have time for everything else. What to do? Okay, let's speed up maybe a little bit. I think we've mentioned a bunch of them already, haven't we?
Starting point is 00:24:31 Like you've done a host. Yeah, but right. We had episodes separately, but they are super important. Settings go usually second. Settings. It's a huge topic. But from my experience experience if I see auto vacuum settings
Starting point is 00:24:47 being default or check pointer settings first of all max wall size in my opinion the most important one being default 1GB or login settings being default what else these kinds of things
Starting point is 00:25:02 we've done one on the default configuration, but I'm guessing... Our favorite random page cost being four. Recently we discussed it. These are strong signs to me that we need to pay attention here. Even if I didn't see database yet. I already know that this is a problem.
Starting point is 00:25:20 So this is really useful for that first health check you ever do. Maybe you're a new DBA or a new developer on a team. When you're doing a second health check or subsequent ones, are you diffing this list of settings and seeing what's changed in the last year? Good question. We have reports that can help us to understand that, for example, checkpoint needs reviewing, reconsideration, checkpointer settings, or auto-vacuum needs reconsideration. So, yeah, you're right.
Starting point is 00:25:50 I'm just coming here like I first met this database and reviewing it. But, yeah, second is easier usually, but if it grows fast, it can be challenging as well. So, sometimes we need to revisit. Yeah, but
Starting point is 00:26:04 settings, it's a huge topic. I just wanted to highlight the easy, low-hanging fruit we usually have. Default is a low-hanging fruit. For example, if auto-vacuum scale factor, insert, or regular, it's default, like 10%. For any LTP database, I would say it's too high way too high yeah so yeah many things like logging settings if if i see login settings are default i i guess people don't look at database properly yet and we need to adjust them and so on but second time is easier usually i agree the second time is you just visit things that change a lot so bloat estimates tables indexes
Starting point is 00:26:55 quite easy right we usually don't want very roughly we don't want anything to be higher than 40-50%. Depends, of course, but this is normal. And converting to number of bytes always helps. People say, oh, my database is 1TB and I have like 200GB of table and index bloat. Wow. Of course, we have
Starting point is 00:27:20 this asterisk remark that it's an estimate that can be very very very very wrong we discussed we had an episode about bloat and auto vacuum as well the related very related topic here and then from like table bloat index board from index board we already go to index health which is of course index index bloat included but also unused indexes redundant indexes well rarely used indexes
Starting point is 00:27:50 slightly like it's rarely used report also invalid indexes unindexed foreign keys what else maybe that's it overlapping redundant redundant
Starting point is 00:28:03 so yeah and we just talk about index maintenance and i i usually it's if it's first time i explain why it's so important to implement index maintenance and why we should understand that we need we do need to rebuild indexes from time to time how are you doing any corruption checks? Or I'm guessing maybe not in this case. Oh, that's a great question, actually. I think we can include this, but proper check is quite heavy. Yeah.
Starting point is 00:28:36 And one of the nice things about this tool is it's like a lot of these, it's been designed to be lightweight in this case, right? Yeah, it's very lightweight. It was battle-tested in very heavily loaded clusters. And it limits itself with statement timeout like 30 seconds or even 15. I don't remember from the top of my head. But if it takes too long,
Starting point is 00:29:01 sometimes we just miss some reports because statement timeout. We don't want to be a problem. Observer effect. Yeah. If it takes too long, sometimes we just miss some reports because statement timeout. We don't want to be a problem. Observer effect. Yeah. So, yeah. And if we talk about indexes, okay, index maintenance, quite straightforward topic already, quite well studied.
Starting point is 00:29:20 And, of course, we don't forget replicas. And we also don't forget about age of statistics because if it was reset yesterday we cannot draw proper conclusions right because maybe these indexes were not used in last 24 hours but okay anyway we we had index maintenance yes i have a couple of questions i have a couple of questions on like the cluster health so replication are you looking at delay at all or like what what are you looking at when it comes to replication health that's a good question i think i doubt we have replication legs being checked by this tool at all it's usually present in monitoring so we as a as a part of broader health check, not only we use health check, post-checkup report, but we also talk with the customer
Starting point is 00:30:10 and check what they have in monitoring. And sometimes we install secondary monitoring, which we have, and gather at least one day of observations and include everything. And also discussion matters. This is probably what automation won't have won't be having anytime soon is live discussion trying to understand pain points like what what bothers you what what is the issue recently right what what issue do you expect to happen soon this live discussion at high level may help prioritize further research so proper health check includes both automated reports but and this like kinds of so there we can check replication but
Starting point is 00:30:55 usually if people don't mention replication likes being an issue they have it in monitoring we just skip this usually sometimes they mention for example a lot of wall is generated and we say okay let's go to statements and check wall metrics try to understand why we generate so much wall, optimize some queries enable wall compression if it's not enabled and so on and so on
Starting point is 00:31:18 also disk let's just check what kind of disks you have it's a big topic yeah last question i had was another thing that i imagine is very important to check the first time you want to know kind of the overall system health or at least like setup health but might be tricky to automate or quickly run a query for is what's the health of the backup process or the risk You're raising questions
Starting point is 00:31:48 Yeah, if you check the plan which is in readme of the Postgres checkup, this is included there and checkbox is not checked, of course Oh, okay, nice We thought about automating this, but we have so many different situations, different
Starting point is 00:32:03 backup tools used, different managed providers, and so on. Every time it's quite different. We have a wide range of situations. Yeah. I wasn't actually thinking about Postgres Checkup at all. I was thinking more about health check.
Starting point is 00:32:19 As part of a health check, you'd probably want to check Ucommerce Store. Did we have a episode about backups and disaster recovery at all? If no, let's just plan it. Because it's not a topic for one minute. It's a huge topic and probably... Of course. Yeah. Maybe number one topic for DBA because it's the worst nightmare if backups are lost and data loss happened, right? Obviously, we include this to consideration unless this is managed service.
Starting point is 00:32:49 If it's managed service, we have guys who are responsible for it, right? Well, it brings us to this topic, managed versus self-managed. Because, for example, HA, since not long ago, both RDS and GCP Cloud SQL, they allow you to initiate failover to test it, right? So, like, you rely on them in terms of HA, in terms of if one node is down, failover will happen.
Starting point is 00:33:22 But you can also test it, and this is super important. Backups, it's like black box. We cannot access files as we discussed last time but you can test restoration, right? And it's fully automated and maybe you should to do it if you have a serious system running on RDS
Starting point is 00:33:40 because as I understand RDS itself doesn't do it for you. They rely on some global data as I understand. RDS itself doesn't do it for you. They rely on some global data, as I understand. They don't test every backup. All I mean is if I was a new DBA going into a new company and one of my first tasks was to do a health check, that would be one of the things that I would think of as including as in that health check.
Starting point is 00:34:02 Maybe a new consultant would as well dr and ha two fundamental two-letter terms right you definitely if you're starting working with some database seriously you definitely need to allocate good amount of time for both these topics like good it means like at least a couple of days for both to deeply understand what... For example, if we talk about backups, you need to understand RPO, RTO. Real and desired to check if we have
Starting point is 00:34:33 document, where it's documented and procedures defined to maintain these objectives. These are super serious topics and for any infrastructure engineer, I think it's quite clear. For any
Starting point is 00:34:50 SRE, it should be quite clear that these are two SLOs that you need to keep in mind and always revisit and maintain and have monitoring that covers them and so on. But this is, again, maybe in health check, we just need to check and understand if there is a problem here, right?
Starting point is 00:35:08 To study these topics, it's similar to a lot of testing topic. It's like special doctor. So, yeah. Nice. What else? What else? To wrap up, maybe, like, a couple of final things. I like to check query analysis in table form. In table form it's quite exotic
Starting point is 00:35:27 maybe because people got used to some graphs of course, historical data, it's great. But when I look at table data for PG-STAT statements with all these metrics and the derived metrics, it gets me a very good feeling of workload. So to understand that, it's great. And also static analysis. Bordered by total time, I think, because we're looking at system level health. Yeah, yeah, yeah. Total time is default for us because we talk about resources but if we if you aim to analyze the situation for end users you need to order by mean time and sometimes you need to order by different else different metrics for example as as we discussed above yeah earlier wall metric for example right sometimes
Starting point is 00:36:22 temporary files for example if we generate a lot of temporary files we want to order by temporary files in this case sometimes we just generate additional reports manually yeah makes sense yeah and actually that's it not difficult right
Starting point is 00:36:39 no and lots of do you see a lot of people doing this in house I know a lot of consultancies including yourselves offer this as a service like do you see what the pros and cons of those well usually in-house if you do it forever you don't see problems or you don't prioritize them well because i saw we have blowtorch but some kind of live still we are not down
Starting point is 00:37:09 or I saw we have for example 5 terabyte unpartitioned table but it's kind of fine maybe and so on but when external people come at least temporarily or for example you hire a new DBRE yes this is good point to revisit and reprioritize things and
Starting point is 00:37:30 and avoid problems in the future right so that's why like fresh blood fresh look is good here right but yeah but usually people get a lot of value during first checkup, health check. Then smart people implement periodical health checks. And then just, I would say, in one or two years, if things change significantly, this is when you need to talk again. Because if health check was full-fledged in the first one, one or two years will be fine and your team will be performing all the problems, solving all the problems, knowing what to do.
Starting point is 00:38:25 Yeah, nice one. all the problems solving all the problems knowing what to do right yeah but but honestly give me any database I will find problems easily
Starting point is 00:38:31 I mean actual live production database not some synthetic yeah anyway like
Starting point is 00:38:39 summary is don't postpone this perform health check plan it and do yeah check your health i had one last thing to mention i met a few people at pg day paris so hi to everyone who's listening from there and if you're watching on youtube and just listening to the audio we do also have an audio only version in the show notes so you don't have to waste your
Starting point is 00:39:06 precious bandwidth and data on on watching us and if you listen to the audio and didn't know we had a youtube then now you know that too right right we also have a transcript if you like reading reading better and i think it's quite good. It's quite good. Actually, guys who are listening, please check transcript and let us know if you think it's good to read or it's hard to read. I would like to know what you think about quality of transcript because I was thinking about, you know, like improving this and having like some, not a book, right, but kind of set of our discussions we had over almost last two years as just a set of texts. some people just like reading right so if people find the quality good i think it's worth investing to organize it better and maybe provide some more links and pictures maybe and so on so this is this
Starting point is 00:40:16 is the idea we discussed some time ago i think i maybe i will pursue it soon but i i need to know what what people about the quality of recent transcripts we have on Postgres. What's the best way to contact you for that? YouTube, comments, Twitter, or Twitter, right?
Starting point is 00:40:38 Yeah, yeah. Nice. Good. Thanks so much, Nikolai. Thank you, Michael. See you next week. Bye. Bye-bye.
