Postgres FM - Benchmarking

Episode Date: February 10, 2023

Nikolay and Michael discuss benchmarking — reasons to do it, and some approaches, tools, and resources that can help.

Here are links to a few things we mentioned:

- Towards Millions TPS (blog post by Alexander Korotkov)
- Episode on testing
- Episode on buffers
- pgbench
- sysbench
- Improving Postgres Connection Scalability (blog post by Andres Freund)
- pgreplay
- pgreplay-go
- JMeter
- pg_qualstats
- pg_query
- Database experimenting/benchmarking (talk by Nikolay, 2018)
- Database testing (talk by Nikolay at PGCon, 2022)
- Systems Performance (Brendan Gregg's book, chapter 12)
- fio
- Netdata
- Subtransactions Considered Harmful (blog post by Nikolay, including Netdata exports)
- WAL compression benchmarks (by Vitaly from Postgres.ai)
- Dumping/restoring a 1 TiB database benchmarks (by Vitaly from Postgres.ai)
- PostgreSQL on EXT3/4, XFS, BTRFS and ZFS (talk slides from Tomas Vondra)
- Insert benchmark on ARM and x86 cloud servers (blog post by Mark Callaghan)

What did you like or not like? What should we discuss next time? Let us know by tweeting us on @samokhvalov / @michristofides / @PostgresFM, or by commenting on our Google doc. If you would like to share this episode, here's a good link (and thank you!)

Postgres FM is brought to you by:
- Nikolay Samokhvalov, founder of Postgres.ai
- Michael Christofides, founder of pgMustard

With special thanks to:
- Jessie Draws for the amazing artwork

Transcript
Starting point is 00:00:00 Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL. I'm Michael, founder of pgMustard. This is my co-host Nikolay, founder of Postgres.ai. Hey Nikolay, what are we talking about today? Hi Michael, benchmarking. Simple as that. Yeah. Very easy word. Okay, database benchmarking.
Starting point is 00:00:17 But actually not only database benchmarking. I think you cannot conduct database experiments and benchmarks without also involving smaller micro-benchmarks, so let's talk about them as well, and of course in the context of databases. Yeah, it's an extremely complicated topic, but I'm sure we can start simply, at least, and I'm looking forward to seeing how far we get. It's not something I have a ton of experience with personally, but I have tried a little bit, I've dipped my toe in, so I'm looking forward to hearing what the more complex side of it can look like. Why difficult? You just run pgbench and see how many TPS you have, that's
Starting point is 00:00:55 it. No, no, it's easy, right? The more I learn about it, the trickier it gets. But yeah, actually, I absolutely agree with you. I consider myself a 1% database benchmarking expert, not knowing the other 99%, trying to constantly learn and study more areas. So I agree, this is a very deep topic where the more you know, the more you understand you don't know, right? Yeah. Awesome. So where should we start?
Starting point is 00:01:27 We should start with the goals, I think, for benchmarking. There are several cases I can think of. Number one goal, usually: let's check the limits. And why is it so? Because by default, pgbench does exactly that. It doesn't limit your TPS, it doesn't limit anything. So you're basically performing stress testing, not just regular load testing, but stress testing: you're exploring the limits of your system. And the limits, of course, depend on various factors, on the database itself and the workload itself. But if you just take pgbench and just create, I don't know, like 100 million records in pgbench_accounts (I think it's dash i, 10,000 or so, I don't remember; by default, with scale 1, the dash i initialization step means 100,000 rows in the table), then basically it's just a single table that is big, and the other
Starting point is 00:02:25 three tables are small. And then, if you run it as is, it will run as many TPS as possible. If you remember that you have not only one core but many cores, you will probably try to use dash c and dash j, and then it's already more interesting. But basically, you're performing stress testing. And exploring limits is definitely a good goal, but I think it should not be the number one goal for database engineers. The number one goal should be: let's make a data-driven decision when comparing several situations. For example, we want to perform a major upgrade, and we want to compare the system behavior before and after. So that's the second goal, and I think it's more important than exploring limits.
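To make the pgbench mechanics just mentioned concrete, here is a minimal sketch of both modes; the scale, client counts, and database name are arbitrary examples rather than values from the episode, and the rate-limited form anticipates the point about limiting TPS that comes next:

    # initialize: scale 1000 = roughly 100 million rows in pgbench_accounts
    pgbench -i -s 1000 bench

    # stress test: no rate limit, many clients/threads, 5 minutes
    pgbench -c 50 -j 12 -T 300 bench

    # load test: cap the rate (here 500 TPS) and focus on latencies, reported every 10s
    pgbench -c 50 -j 12 -T 300 -R 500 -P 10 bench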
Starting point is 00:03:16 Knowing limits is definitely a good thing, but we don't want our system in production to live near its limits, in saturation. We want it to live with CPU below 50%, and other resources (network, memory, disk I/O) below some thresholds. So that's why the second goal, supporting decisions, I think is more important, and I wish pgbench's default behavior was "let's limit TPS", to have a more realistic situation and focus on latencies. So the second goal is making decisions. And what about other goals, what do you think? There are a couple more, maybe. Well, on the decisions front, I think that's super interesting, and I think there are maybe even a couple of
Starting point is 00:04:03 categories within that. We might be checking something we hope will be an improvement, so it's kind of like an experiment: you have a hypothesis. I might be checking something that I hope has improved matters, or I might be checking something like the upgrade and trying to check that there aren't any regressions. So either way, I have kind of a null hypothesis, and I might be hoping for an improvement, or I might just be checking that there isn't a big change in the other direction. The other time I see this is new feature work, people planning ahead. So if people want to kind of plan ahead and see how much headroom they have. But I guess that's the seeing-where-the-limits-are case, I guess it's the micro-benchmark side of things that you
Starting point is 00:04:43 were talking about. But even in just little performance optimizations, if people are doing performance work, they might want to check on a smaller scale. I have an idea. Let's keep goal number one as a stress test to understand limits, and let's split goal number two into two goals, number two and number three. Number two will be exactly regression testing: you want to ensure that it won't become worse, at least, but maybe it will be improved in various aspects. And goal number three is, for example: imagine a situation when you need to deploy a new system and you need to make a choice between various types of instances.
Starting point is 00:05:21 So it's not regression, it's making a decision about which type of platform (Intel versus AMD or ARM) to use and so on, which type of disks to choose. So you're comparing various things and making a choice. As for micro-benchmarks, I think this is an underlying goal which needs to be involved in each of these goals. But I have a number four goal, actually: benchmarking to advertise some products. Yeah, the famous benchmarking. But that's often because a lot of the other ones are internal benchmarks. It's the one that a lot of us see other companies publishing:
Starting point is 00:06:02 the majority of benchmarks we'll see out in the world are these kinds of marketing posts, often vendors comparing themselves to others based on some standard. Sometimes I like this. For example, I remember the post from Alexander Korotkov about how to make it possible to have 1 million TPS with Postgres. It was 9.4 or 9.5, and some improvements were made in the buffer pool behavior and so on to reduce contention. It was a good post. And it was also done in pairing mode with MySQL folks, so MySQL also achieved 1 million TPS.
Starting point is 00:06:40 Select-only, but still, both systems achieved that at around the same time. So four goals, right? Big goals. And I think, of number two and three, maybe two is more important, because you do upgrades at least once per year or two, usually, to keep up with the pace of Postgres development; at least it's recommended. And also sometimes you switch to a new type of hardware or cloud instances, sometimes you upgrade the operating system and the glibc version changes, and
Starting point is 00:07:12 so on and so on. Sometimes you change provider. But what I would mention here is that I think it's not a good idea to use full-fledged benchmarking to check every product update. If you just released a feature (first of all, that's a very frequent situation; some people have a couple of deployments or more per day), benchmarking is an expensive task to do properly. So in this case, as usual, I can mention our episode about testing: I recommend doing it in a shared environment. I would not call it benchmarking, but it's a kind of performance testing as well. In a shared environment, using a single connection
Starting point is 00:07:51 and focusing on I/O, the number of rows, and buffers, as in another episode. So this is for developers. But for infrastructure folks who need to upgrade Postgres annually or biannually and upgrade operating systems once every several years, benchmarking is very important, right? Yeah, and just to check, you mean biannually,
Starting point is 00:08:14 like every two years? You don't mean people try and upgrade twice a year? Yeah. And exploring limits is also important, but I would say this is an extremely hard topic. Because, first of all, let's discuss how we do it, how we do benchmarking. The most difficult part, straight to the point, is the workload, right? As in making it realistic, or what do you mean by that exactly? So, there are several approaches to getting a proper workload. The first is very simple: you take pgbench, and it has been criticized heavily for not having various kinds of distribution. When it sends selects, like single-row selects based on a primary key lookup, it chooses IDs
Starting point is 00:09:14 randomly, and it's an evenly distributed choice. And that's not what happens in reality. In reality, we usually have some hot area, right, of users or of posts or some items, and a kind of cold area. I remember the sysbench author criticized it, saying sysbench supports Postgres; he's a MySQL guy, so it was quite heavy criticism. I remember quite emotional articles, in Russian, actually. And the idea was right: sysbench supports Postgres, and it supports Zipfian distribution.
Starting point is 00:09:54 Zipfian distribution is closer to reality, to how social media works, with this hot area. But since then, pgbench has got this support, and when we run pgbench, we can choose Zipfian distribution to be closer to reality. But of course, it will still be very far from reality, right? Because it's some kind of synthetic workload. And how can we do better? Is it worth noting that for some things, pgbench is fine?
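For reference, pgbench exposes this through its scripting functions (random_zipfian has been available since PostgreSQL 11); a minimal sketch of a Zipfian-skewed select-only script, where the file name and the 1.07 exponent are just illustrative choices:

    # create a pgbench script with a hot-spot (Zipfian) key distribution
    cat > zipf_select.sql <<'EOF'
    \set aid random_zipfian(1, 100000 * :scale, 1.07)
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    EOF

    # -n skips vacuuming the default tables; :scale is picked up from the initialized database
    pgbench -n -f zipf_select.sql -c 16 -j 4 -T 300 bench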
Starting point is 00:10:24 Like, depending on exactly what we're testing. I saw Andres Freund, for example, when he published about the improvements to connection management on the Microsoft blog, a very, very good blog post. But just to show how the patch worked well with memory management at high numbers of connections, pgbench is great.
Starting point is 00:10:43 It did, you know... so, depending on exactly what you're testing, the nature of the workload may or may not matter so much, right? Right, but this is for Postgres development itself, and it's about limits again. I'm talking about the general situation where a database engineer, like a DBA or DBRE, or some backend developer who wants to be more active in the database area, usually needs to compare things and needs to talk about their own system. So in this case, pgbench can still be used, but it should be used very differently. So we need to move from purely synthetic workloads, like something in space, right,
Starting point is 00:11:27 something living in space, far from reality. We need to move closer to reality. And on the other side of the possible workload types is a real production workload, and the main approach here is to have mirroring of the production workload, which should be very low overhead, which is tricky. It's like, imagine some proxy which receives all queries from the application and sends these queries to the main node,
Starting point is 00:11:56 which is the production node, and also sends them to a shadow production node, ignoring the responses from the second one. It would be great to have, right? It doesn't exist; I don't know of any such tool developed yet, but hopefully it will be created soon. There are some ideas.
Starting point is 00:12:14 A previous employer of mine, GoCardless, released a tool called pgreplay-go. And that... I know this is different, but the aim is the same, right? It's to replay. I mean, yeah. Well, if we talk about mirroring, of course the aim is the same: to have a workload similar to production, ideally identical, right? But with mirroring, you can achieve identical. With replaying, it's much more difficult. But I would say that most people's workload, like yesterday's workload, is very likely to be similar
Starting point is 00:12:59 to today's and to tomorrow's, right? Like, if you're talking about getting close to... Right. But yeah, I do appreciate that you can't... I think also even mirroring production isn't quite what we need, right? Like, when we benchmark, we kind of want to anticipate: we want to know how it will perform not with today's workload but with tomorrow's, you know, or like in a month's time. If we talk about regression, I would like to replay; I would choose this, because in this case I can directly compare,
Starting point is 00:13:31 for example, the behavior of the current Postgres version and the next major Postgres version, or the behavior of Postgres on different Ubuntu versions, right? Or, for example, compare different instance types. And of course, you can take it and put it into your fleet, like a standby node, in this case, if the glibc version didn't change and a major Postgres upgrade didn't happen. For some kinds of testing, you can do that, and it would be a kind of A/B testing or blue-green deployment, because some part of your customers will really use it. But mirroring gives you fewer risks,
Starting point is 00:14:14 because if something goes wrong, customers will not notice, but you can still compare all characteristics, like CPU level and so on. Replay was the next item on my list. I like it. Okay, great. I spent a couple of years exploring this path.
Starting point is 00:14:31 Right now, I don't use it directly, like replaying from logs. I believe in this idea less; my belief dropped a little bit after I spent some time with pgreplay and pgreplay-go. Why? Because the hardest part of replaying a workload is how to collect the logs.
Starting point is 00:14:49 Log collection under heavy load is a big issue, because it quickly becomes a bottleneck, and the observer effect hits you very quickly. Fortunately, these days we can log only some fraction of the whole stream of queries, because there are a couple of settings for sampling, right? For example, we can say we want to log all queries, but only 1% of them, and we can do it at the transaction level, which is good. And then, when replaying, we can say,
Starting point is 00:15:17 okay, let's go 100x in terms of what we have. But in this case, the problem will be that you work with a smaller set of IDs, for example, right? Yeah. You need to find a way around that, and we come back to the same problem of distribution: how to find the proper distribution.
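The sampling settings being referred to look roughly like this in recent PostgreSQL versions; a small sketch, where the 1% figure simply echoes the example in the discussion (log_transaction_sample_rate exists in PostgreSQL 12+, and statement-level sampling arrived in 13):

    # log every statement of roughly 1% of transactions, instead of everything
    psql -c "ALTER SYSTEM SET log_transaction_sample_rate = 0.01"
    # PostgreSQL 13+ alternative: sample statements above a duration threshold
    # via log_min_duration_sample plus log_statement_sample_rate
    psql -c "SELECT pg_reload_conf()"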
Starting point is 00:15:48 So in reality, what we usually do is combine a synthetic approach, like a simulation: we take information from pg_stat_statements, right? And then we try to find a way to get some query parameters. And then we can use various tools, for example JMeter, or even pgbench. With pgbench, you can feed multiple files using dash f, and you can balance them; you can put weights on how often each file should be used, with an at sign and some number. So you can say, I want this file to be used two times more often than that other file, for example. And you can build some kind of workload; you can try to approximate the situation you have in production, at least covering the top 100 queries by calls and the top 100 queries by total time; usually we combine both. And you can replay it quite reliably, in terms of reproduction.
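A minimal sketch of that weighted, multi-script pgbench invocation; the file names, weights, and run length are made up, and the @N suffix is what sets each script's relative weight:

    # replay two crafted scripts, the first chosen nine times as often as the second
    pgbench -n -T 600 -c 16 -j 4 \
      -f top_by_calls.sql@9 \
      -f top_by_total_time.sql@1 \
      bench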
Starting point is 00:16:34 Also, a big point: replay has the advantage that you can replay multiple times, right? So you can use a workload you already created and replay it again and again, and spend a whole week playing with it. With mirroring, you don't have that; usually it's only "right now". If you want to compare 10 options, you need to run 10 instances right now, and if tomorrow you have some other idea, you no longer have exactly the same workload. So there are pros and cons for these options as well.
Starting point is 00:17:04 But I still think mirroring is very powerful for regression testing in some cases. Replay, well, replay is hard, but possible if you understand the downsides of it, right? Yep, makes sense. On the pros and cons side of things, where do you end up? Is there a size... I'm thinking replay might be good up to a certain volume, like for smaller shops that still want to check performance (it's still important), but haven't got a huge number of transactions per second. I'm guessing replay is a nicer way to go there. Yeah. I just want to warn folks about this
Starting point is 00:17:43 observer effect when you enable a lot of logging. There, you definitely need to understand the limits. So I would test it properly and understand how many log lines per second we can afford in production, in terms of disk I/O first of all. And of course, the logging collector should be enabled, in CSV format; it's easier to work with. And I have myself put a couple of very important production systems down just because of this type of mistake.
Starting point is 00:18:12 So I would like to share this: don't do it. I knew it would happen, and still I was not careful enough and had a few minutes of downtime in a couple of cases. But in one case, we implemented one kind of crazy idea. Also, any change involving
Starting point is 00:18:35 enabling the logging collector, and changes like that to the logging setup in Postgres, requires a restart. This is a downside as well. So you cannot, for example, say: I want to temporarily log 100% of queries to RAM, using a RAM disk, tmpfs, and then switch back. But in one case, it was my own social media startup, so I decided I could afford that risk, and I decided to go that route. We implemented quite an interesting idea: we allocated some part of memory to that, and we set up quite intensive log rotation, so archived logs were shipped away as each new log was created, so as not to reach the limit in terms of memory.
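A rough sketch of that log-to-RAM setup, assuming a Debian/Ubuntu-style layout; the mount point, sizes, and rotation values are invented for illustration:

    sudo mkdir -p /mnt/pg_log_ram
    sudo mount -t tmpfs -o size=2G tmpfs /mnt/pg_log_ram        # RAM-backed log directory
    sudo chown postgres:postgres /mnt/pg_log_ram

    psql -c "ALTER SYSTEM SET logging_collector = on"           # this one needs a restart
    psql -c "ALTER SYSTEM SET log_destination = 'csvlog'"
    psql -c "ALTER SYSTEM SET log_directory = '/mnt/pg_log_ram'"
    psql -c "ALTER SYSTEM SET log_rotation_size = '256MB'"      # rotate aggressively...
    psql -c "ALTER SYSTEM SET log_min_duration_statement = 0"   # ...because this logs 100% of queries
    sudo systemctl restart postgresql
    # ship rotated files off the tmpfs continuously so it never fills up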
Starting point is 00:19:19 And in this case, we enabled 100% of queries to be logged, and we had good results: for replay, we collected a lot of logs, and we did it right. So it's possible, if you can afford restarts. Actually, in that system we enabled it permanently, so it was just one restart and it kept working; but of course there is some risk of reaching the limit and having downtime. It's quite a dangerous approach. So, let's rewind to the comparison: replay versus mirroring versus simulation; I actually call the last one a crafted workload. Crafted workload? Oh, like the hybrid approach? Yeah. You use pg_stat_statements to understand the proportions of each query group, but pg_stat_statements doesn't have parameters, so you extract parameters from somewhere else. For example, you can use pg_stat_activity, or you can actually use eBPF to extract parameters; it's possible, a very new approach, but I think it's very promising. Have you ever used... there's a Postgres extension,
Starting point is 00:20:25 I think by the PoWA team, in a monitoring tool? Yes. Yeah, that's a good idea; I haven't explored that path, but I think it's also a valid idea. The problem, though, will be correlation: if a query has many parameters, all the ideas of using pg_statistic, for example, to invent some parameters (some kind of almost-random values, but with some understanding of the distribution) are not so good, because correlation is a big thing, and usually you don't have CREATE STATISTICS; not many folks use CREATE STATISTICS these days, so it's hard. But extracting from pg_stat_activity is, I think, my default choice. The default there for the query column is only 1,024 characters, though; I think it should usually be increased to 10k, because if you have large queries, you definitely want to have them in full.
Starting point is 00:21:22 This is one of the defaults that should also be adjusted, recalling our last episode. Usually people do it, but unfortunately, again, you need to restart. But once you increase it, you can collect samples from there, and the majority of queries will be captured. The only problem will be how to join these sets of data. Modern Postgres versions have the query ID both in pg_stat_activity and in pg_stat_statements, right? And maybe in logs, I don't remember exactly. But if you
Starting point is 00:21:52 have an older Postgres, for example Postgres 13, 12, 11, you need to use a library called pg_query to join these sets, so that for each pg_stat_statements entry you have multiple examples from pg_stat_activity. So this is how we can build some kind of crafted workload, and then we use some tool (even pgbench is possible) to generate, to simulate the workload.
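A minimal sketch of that join on newer versions; it assumes PostgreSQL 14+, where pg_stat_activity exposes query_id, that pg_stat_statements is installed, and that query IDs are being computed (compute_query_id); the 10kB value just echoes the suggestion above:

    psql -c "ALTER SYSTEM SET track_activity_query_size = '10kB'"   # needs a restart
    # sample currently running statements (with real parameters) and tie them back to
    # their normalized pg_stat_statements entry; run repeatedly to collect samples
    psql <<'SQL'
    SELECT s.queryid, s.calls, s.total_exec_time,
           a.query AS example_with_parameters
    FROM pg_stat_activity a
    JOIN pg_stat_statements s ON s.queryid = a.query_id
    WHERE a.state = 'active';
    SQL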
Starting point is 00:22:19 We understand it's far from reality, but it's already much better than a purely synthetic workload. So, let's talk about the general picture very briefly. I have a couple of talks about benchmarking; I call them database experiments. Regression is the number one case for me, usually. But if you imagine something as input and something as output, the input is quite simple.
Starting point is 00:22:44 It's just the environment (instance type, Postgres version), the database you have (actually, the database you have is already the object), then the workload, and maybe some delta: if you want to compare, say, before and after, you need to describe what the difference is. You can have multiple options there, of course, if you compare multiple options, right? I mean, it's just like a science experiment, right? You have all the things that you're keeping the same, and then hopefully just the one thing that you're changing,
Starting point is 00:23:12 or, you know, if it's multiple things, the list of things that you're changing, and then you've got your null hypothesis, and you're checking that. In fact, actually, that's a good question: do you take a statistical approach to this? Like, are you calculating how long you need to run these for it to be significant? You will come to the output and analysis. We discussed workload so much because it's very hard to get the workload right in each case, but the most interesting part is the output and analysis, and
Starting point is 00:23:42 for that, you need to understand performance in general. Without that understanding, you will have something measured and, like, okay, we have this number of TPS. But, as a very basic example: we have this number of TPS, or we had controlled TPS, and we have these latencies for each query, and average latency or percentiles involved. But the question is: where was the bottleneck? Referring to Brendan Gregg's book, chapter number 12, question number one is: can we double throughput and reduce latencies? Why not, right? To answer that, you need to understand the bottleneck.
Starting point is 00:24:23 And to understand the bottleneck, we need to be experts in performance: system performance in general, and databases in particular. So we need to be able to understand: are we I/O-bound or CPU-bound? If CPU, what exactly happened? If I/O, what is causing it? To give you an example: a couple of years ago, with one client, we considered switching from Intel to AMD on Google Cloud, and we somehow jumped straight to database benchmarking. It showed that on AMD all latencies were higher, and throughput, if we went with stress testing, was worse. Even with throughput controlled to be the same, latencies were higher. Like, what's happening? Without analysis, we would just have concluded that it's worse; but with analysis, we quite quickly
Starting point is 00:25:14 understood, or at least suspected, that disk I/O had issues there on the AMD instances (EPYC; I think it was the second generation of EPYCs). So then we finally went down to micro-benchmarks and checked fio, and saw that indeed, if we take Postgres out of the picture and just compare basic fio tests (everyone should know how to do them, it's quite easy: random reads, random writes, sequential reads, sequential writes), these disks behaved differently. We could not reach the advertised limits, like two gigabytes per second of throughput and 100,000 IOPS. We checked the documentation and saw we should have them; on Intel we had them, on AMD we didn't. We went to Google engineers, Google support, and at some point they admitted that there was a problem. By the way, a disclaimer: right now it looks like they don't have
Starting point is 00:26:10 it anymore; recently we checked once again, and it looks like it's gone, probably fixed. And it was definitely not a problem with AMD itself; it was something in Google Cloud in particular. But this raises the question: why didn't we use fio checks at the very beginning, right? And the answer is, I think we should always start with micro-benchmarks, because micro-benchmarks are actually much easier to conduct. Less work, right? And faster, right? You just take, for instance, your virtual machine or real machine with disks,
Starting point is 00:26:49 and you run fio for a disk I/O check, and you run, for example, sysbench to check CPU and memory, and compare. The same tests, lasting some minutes; it's very fast. And in this case, you don't need to think very hard about the workload or about filling all the caches. Well, okay, in the case of memory you probably do, but with fio disk checks it's easy; I mean, you don't need to fill the buffer pool and so on.
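For reference, the kind of quick micro-benchmark pass being described might look something like this; block size, file sizes, and thread counts are arbitrary examples:

    # disk: 8k random reads and writes with direct I/O, 60 seconds each
    fio --name=randread  --rw=randread  --bs=8k --direct=1 --ioengine=libaio \
        --iodepth=32 --numjobs=4 --size=10G --runtime=60 --time_based --group_reporting
    fio --name=randwrite --rw=randwrite --bs=8k --direct=1 --ioengine=libaio \
        --iodepth=32 --numjobs=4 --size=10G --runtime=60 --time_based --group_reporting

    # CPU and memory: compare events/sec and MiB/sec across machines
    sysbench cpu --threads=16 run
    sysbench memory --threads=16 run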
Starting point is 00:27:27 So in this case, it will already give you insights about the difference between machines. And another important point (we actually hit it in a couple of cases where we automated benchmarks): I think it should always be done. Even if you don't change the instance type or the disk type, if you run benchmarks on different actual machines, even of the same type, you need to compare them, because sometimes you get a slightly different CPU even with the same instance type, or you can have some faulty RAM, for example, or faulty disks. So micro-benchmarks help you ensure that you are making an apples-to-apples comparison in terms of environment. Yeah, and I guess you made some really good points. It's not just about not trusting the cloud providers.
Starting point is 00:28:12 It could also be something else that's gone wrong. Yeah, especially since it's becoming more popular to consider moving out of clouds; I just read another article this morning about this. So, cost optimization, yeah. So, yeah, and then analysis. Analysis is a huge topic, really huge. But my approach is: let's collect as many artifacts as possible and store them.
Starting point is 00:28:37 Let's automate the collection of artifacts. So, for example, let's have snapshots before and after each run for all pg_stat* system views, including the pg_stat_statements extension, of course, so we are able to compare them. And of course, all logs should be collected. For example, one of the classic mistakes is: okay, we have better TPS, but if you check the logs, you see that some transactions failed; that's why it was so fast on average. Error checking is very important. And if you don't collect logs automatically, you will probably be very sad afterwards, because if you have already destroyed the machine or reinitialized it, you don't have those logs. Artifact collection should be automated,
Starting point is 00:29:23 and as much as possible, you need to collect everything. Yeah, I was going to say, because you could be spending hours on these things, in terms of even just letting them run, never mind all the setup and everything; it's not trivial to run it again to get these things. Or if something looks off, if you're presenting it to your team or somebody doesn't trust the results, they can look into it as well. Yeah. Actually, one of the concepts Brendan Gregg explains in his book and talks is that benchmarks
Starting point is 00:29:53 should not be run without humans involved. Humans should be there because, sure, you can have good thoughts just checking artifacts afterwards, but during a benchmark you can have an idea to check something else, and it's better to be there, at least in the beginning, for the first several runs. But still, benchmarks can be automated and run without humans; it just requires a better level of implementation. And I agree with you: if you collect artifacts, your colleagues may raise new questions.
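A minimal sketch of automating that before/after artifact capture around a single run; file names, paths, and the pgbench invocation are placeholders:

    mkdir -p artifacts/server_logs
    snap() {  # dump pg_stat_statements plus another stats view, tagged before/after
      psql -c "COPY (SELECT * FROM pg_stat_statements) TO STDOUT WITH CSV HEADER" > "artifacts/pgss_$1.csv"
      psql -c "COPY (SELECT * FROM pg_stat_bgwriter)   TO STDOUT WITH CSV HEADER" > "artifacts/bgwriter_$1.csv"
    }
    snap before
    pgbench -n -T 600 -c 16 -j 4 bench | tee artifacts/pgbench_run.log   # keep the driver output too
    snap after
    cp /var/log/postgresql/*.csv artifacts/server_logs/   # server log path varies by distribution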
Starting point is 00:30:24 If everything is automated, you can repeat it. But of course, it's usually quite expensive in terms of resources: hardware and human time, engineering time. I also wanted to mention, in terms of artifact collection, that one of the artifacts I consider is dashboards and charts from monitoring. And I'm very surprised that most monitoring systems still don't think about non-production and benchmarking. If, for example, you have Grafana and register each new instance you have temporarily, you will have spam in your host list, right? Some instance lived only a couple of days and is already destroyed, but we still need the data to be present. And instance names change depending on benchmarking activities. So you have spam.
Starting point is 00:31:11 What I like is to use Netdata, which is installed using a one-liner, and it's like in-place monitoring, already there. And then you can export dashboards manually. Unfortunately, it's only a client-side, in-browser feature, so automation here is not possible, although I raised this with the Netdata developers and they agreed it would be great to have an API for exporting and importing dashboards. But at least right now you can export a dashboard, and then later you can open several dashboards for several runs and directly compare CPU, RAM, all the things. I would like to have this in all monitoring systems, especially automated, with an API.
Starting point is 00:31:55 It would be great. Right now, we can provide in our show notes a couple of examples of how my team conducts this kind of benchmark. It's quite good; we sometimes store it openly. So, for example: let's enable WAL compression. Will it be better or worse? We can conduct experiments, and we store them.
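That particular experiment boils down to toggling a single setting between otherwise identical runs; a sketch (the setting name is real, everything else about the run is up to you):

    psql -c "ALTER SYSTEM SET wal_compression = on"   # 'off' for the control run
    psql -c "SELECT pg_reload_conf()"
    # run the same workload with compression on and off, then compare
    # latencies, WAL volume (pg_stat_wal on PostgreSQL 14+), and CPU usage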
Starting point is 00:32:17 This kind of experiment is publicly available, and you have the artifacts there, so you can load them later into Netdata, compare without and with compression, and play with those dashboards yourself. And of course, we automate everything, so anyone can reproduce the same type of testing in their own situation. So I actually like benchmarking; it's a very powerful thing, but of course it requires a lot of knowledge. So we try to help other companies in this area as we can, in terms of consulting as well. Nice. In terms of resources, you mentioned Brendan Gregg's book. Are there any other
Starting point is 00:32:54 case studies that you've seen people publish that you think are really worth reading through, or any other guides? Oh, yeah. I like Tomas Vondra's micro-benchmarks for file systems in the Postgres context. It's so cool. It's not only micro-benchmarks; I think pgbench was used there too. In terms of benchmarks people conduct,
Starting point is 00:33:18 I very rarely see super informative benchmarks; usually when I see some benchmark, I quite quickly find some issues with it, and I have questions and want to improve it. So I cannot mention particular benchmarks except this file system one, but maybe there are some cases. Yeah, there's only one that comes to mind in addition. Yeah, there are some great ones in the Postgres community, but slightly outside of that,
Starting point is 00:33:48 I think Mark Callaghan's doing some good work publishing comparisons between database management systems for different types of benchmarks. I think that's pretty cool and worth checking out for people. Well, I like if everything is open and can be reproduced and fully automated. And we can mention here this new Postgres-based project called Hydra, like open-source Snowflake.
Starting point is 00:34:10 And we had an episode with them on Postgres TV recently. They used a tool from ClickHouse, but of course the first question there was: why such a small instance, and such a small database size? And they admitted this is so, but this is what that tool does.
Starting point is 00:34:26 And I hope they will publish a new round of benchmarks with a more realistic situation for analytical cases, because in an analytical case we usually have a lot of data, not enough RAM, and so on, and the instances are usually bigger. But what I like there is that a kind of de facto standard tool for analytical databases was used, so you can compare a lot of different systems, including Postgres itself, Timescale, and this new Hydra project. And you can reproduce it yourself.
Starting point is 00:34:59 It's great. So this is the kind of approach that I do like, although I still have questions about how realistic it is. Yeah, of course. Wonderful. Any last things you wanted to make sure we covered? Yeah, read Brendan Gregg's book; I have it here, actually.
Starting point is 00:35:17 The second edition of Systems Performance. A lot of good things; become a better performance engineer if you want to conduct benchmarks and understand what's happening under the hood, because without an understanding of the internals it's quite black-box benchmarking, and it doesn't work properly. I did it many times myself and failed,
Starting point is 00:35:39 in terms of: okay, we see A, but we conclude B, while what happened in reality was C, right? So... Awesome. Well, thank you so much. Thanks, everyone, for listening, and catch you next week. Thank you. As usual, a reminder: please share the show, help us grow. Michael showed me the numbers this morning and they look very good, I would say. So we are growing, but I would like to grow more and more, of course, because I like feedback, and it drives me to share.
Starting point is 00:36:16 Of course, I like it when people suggest ideas. Actually, we had a few more suggestions last week, so probably we should review them and use them in the next episodes. But I particularly ask everyone to consider sharing with your colleagues, at least those episodes which you think can be helpful to them. And as usual, please subscribe. Don't forget to subscribe and like, but sharing is most important. Right. Thank you.
Starting point is 00:36:39 We love your feedback. Thank you so much. Yeah, absolutely. Thanks everyone. Bye now. Bye bye.
