Postgres FM - Benchmarking
Episode Date: February 10, 2023

Nikolay and Michael discuss benchmarking — reasons to do it, and some approaches, tools, and resources that can help.

Here are links to a few things we mentioned:
Towards Millions TPS (blog post by Alexander Korotkov)
Episode on testing
Episode on buffers
pgbench
sysbench
Improving Postgres Connection Scalability (blog post by Andres Freund)
pgreplay
pgreplay-go
JMeter
pg_qualstats
pg_query
Database experimenting/benchmarking (talk by Nikolay, 2018)
Database testing (talk by Nikolay at PGCon, 2022)
Systems Performance (Brendan Gregg's book, chapter 12)
fio
Netdata
Subtransactions Considered Harmful (blog post by Nikolay including Netdata exports)
WAL compression benchmarks (by Vitaly from Postgres.ai)
Dumping/restoring a 1 TiB database benchmarks (by Vitaly from Postgres.ai)
PostgreSQL on EXT3/4, XFS, BTRFS and ZFS (talk slides from Tomas Vondra)
Insert benchmark on ARM and x86 cloud servers (blog post by Mark Callaghan)

What did you like or not like? What should we discuss next time? Let us know by tweeting us on @samokhvalov / @michristofides / @PostgresFM, or by commenting on our Google doc.

If you would like to share this episode, here's a good link (and thank you!)

Postgres FM is brought to you by:
Nikolay Samokhvalov, founder of Postgres.ai
Michael Christofides, founder of pgMustard

With special thanks to:
Jessie Draws for the amazing artwork
Transcript
Hello and welcome to Postgres FM, a weekly show about all things PostgreSQL.
I'm Michael, founder of pgMustard. This is my co-host Nikolay, founder of Postgres.ai.
Hey Nikolay, what are we talking about today?
Hi Michael, benchmarking.
Simple as that.
Yeah.
Very easy word.
Okay, database benchmarking.
But actually not only database benchmarking.
I think you cannot conduct database experiments and benchmarks without also involving smaller micro-benchmarks.
So let's talk about them as well.
And of course, in the context of databases.
Yeah, it's an extremely complicated topic, but I'm sure we can at least start simply.
And I'm looking forward to seeing how far we get.
It's not something I have a ton of experience with personally, but I have tried a little bit, I've dipped my toe in, so I'm looking forward to hearing what the more complex side of it can look like.
Why difficult? You just run pgbench and see how many TPS you have, that's it, no? It's easy, right?
The more I learn about it, the trickier it gets.
Yeah, actually, I absolutely agree with you. I consider myself as knowing 1% of database benchmarking,
not knowing the other 99%,
trying to constantly learn and study more areas.
So I agree, this is a very deep topic
where the more you know,
the more you understand what you don't know, right?
Yeah. Awesome. So where should we start?
We should start with the goals, I think, for benchmarking. There are several cases I can think
of. Number one goal, usually: let's check the limits. And why is it so? Because by default,
PgBench does exactly that. It doesn't limit your TPS, doesn't limit anything. So you are checking,
you're basically performing stress testing, not just regular load testing, but stress testing.
You're exploring the limits of your system. And the limits, of course, depend on various factors,
on the database itself and the workload itself. But if you just take pgbench and just create, I don't know, 100 million records in pgbench_accounts (I think it's scale 1000 or so, I don't remember; by default, the -i initialization step at scale 1 means 100,000 rows in the table), right? So basically only one table is big, the other three tables are small. And then, if you run it as is, it will run as many TPS as possible. If you remember that you have not only one core but many cores, you will probably try to use -c and -j, and then it's already more interesting. But basically, you're performing stress testing. And exploring limits is definitely a good goal.
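A minimal sketch of such an unthrottled stress run; the connection string, scale, and client counts are just placeholders:

    # initialize: scale 1000 gives roughly 100 million rows in pgbench_accounts
    pgbench -i -s 1000 "$DATABASE_URL"
    # unthrottled stress test: 50 clients, 8 worker threads, 10 minutes, progress every 10 s
    pgbench -c 50 -j 8 -T 600 -P 10 "$DATABASE_URL"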
But I think it should not be the number one goal for database engineers. The number one goal should be: let's make data-driven decisions when comparing several situations. For example, we want to perform a major upgrade,
and we want to compare the system behavior before and after.
So, second goal, and I think it's more important than exploring limits.
Knowing limits is definitely a good thing,
but we don't want our production system to live near its limits, in saturation. We want it to
live with CPU below 50%, and the other resources, network, memory, and disk I/O, should also
be below some thresholds. So that's why the second goal, supporting decisions, I think
is more important. And I wish pgbench's default behavior was to limit TPS, to have a more
realistic situation and focus on latencies. So the second goal is making decisions.
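A throttled variant, as a sketch, using pgbench's -R (--rate) option to cap the transaction rate and focus on latencies; the numbers are placeholders:

    # target 5,000 TPS; transactions slower than 100 ms are counted and reported as late
    pgbench -c 50 -j 8 -T 600 -P 10 -R 5000 -L 100 "$DATABASE_URL"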
What about other goals? What do you think? There are a couple more, maybe.
Well, on the decisions front, I think that's super interesting, and I think there are maybe even a couple of categories within that. We might be checking something we hope would be an improvement, so it's kind of like an experiment, like you have a hypothesis. I might be checking something that I hope has improved matters, or I might be checking something like the upgrade and trying to check that there aren't any regressions. So either way I have kind of like a null hypothesis, and I might be hoping for an improvement, or I might just be checking that there isn't a big change in the other direction. The other time I see this is new feature work, people planning ahead, so if people want to kind of plan ahead and see how much headroom they have. But I guess that's the seeing-where-limits-are case, and I guess it's the micro-benchmark side of things that you were talking about. But even in just little performance optimizations, if people are doing performance work, they might want to check on a smaller scale.
I have an idea. Let's keep goal number one as a stress test to understand limits. And let's split goal number two into two goals,
like number two and number three. Number two will be exactly regression testing. So you want to ensure that it won't become worse at least,
but maybe it will be improved in various aspects.
And the goal number three is, for example,
imagine a situation when you need to deploy a new system
and you need to make a choice between various types of instances.
So it's not regression.
It's making a decision about which type of platform to use, Intel versus AMD or ARM and so on, which type of disks to choose. So you're comparing various things and making a choice. As for micro-benchmarks, I think this is an underlying goal which needs to be involved in each of the others. But I have a number four goal, actually: benchmarking to advertise some products.
Yeah, the famous benchmarking. But because a lot of the other ones are internal benchmarks, it's the one that a lot of us see other companies publishing. The majority of benchmarks we'll see out in the world are these kinds of marketing posts, often vendors comparing themselves to others based on some standard.
Sometimes I like this. For example, I remember the post from Alexander Korotkov about how to make it possible to have 1 million TPS with Postgres. It was 9.4 or 9.5,
and some improvements were made in the buffer pool behavior and so on
to reduce contention and so on.
It was a good post.
And it was also done in pairing mode with MySQL folks.
So MySQL also achieved 1 million TPS.
Select-only, but still, both systems achieved that at around the same time.
So four goals, right?
Big goals.
And I think number two and three, maybe two is more important.
Because you do upgrades at least once per year or two years usually to keep up with the pace of Postgres development.
At least it's recommended.
And also, sometimes you switch to a new type of hardware or cloud instances, sometimes you upgrade the operating system and the glibc version changes, and so on and so on. Sometimes you change provider. But what I would mention here is that I think it's not a good idea to use full-fledged benchmarking to check every product update. If you just released a feature (first of all, that's a very frequent situation; some people have a couple of deployments or more per day), benchmarking is an expensive task to do properly. So in this case, as usual, I can mention our episode about testing. I recommend doing it in a shared environment.
I would not call it benchmarking,
but it's kind of performance testing as well.
In a shared environment, using a single connection,
and focusing on I/O, the number of rows,
and buffers, as in another episode.
So this is for developers,
but for infrastructure folks
who need to upgrade Postgres annually or biannually
and upgrade the operating system once every several years,
benchmarking is very important, right?
Yeah, and just to check, you mean biannually,
like every two years, you don't mean people try and upgrade twice a year?
Yeah. And exploring limits is also important, but I would say this is an extremely hard topic. Because, first of all, let's discuss how we do it, how we do benchmarking. Straight to the point: the most difficult part is the workload, right?
As in making it realistic, or what do you mean by that exactly?
So there are several approaches to having a proper workload. The first is very simple: you take pgbench, and it has been criticized heavily for not having various kinds of distributions.
When it sends selects, like single-row selects based on a primary key lookup, it chooses IDs randomly, and this is an evenly distributed choice. And it's not what happens in reality.
In reality, we usually have like some hot area, right, of users or of posts or some items and a kind
of cold area.
I remember the sysbench author criticized it, saying, sysbench supports Postgres.
He's a MySQL guy, so it was quite heavy criticism.
I remember quite emotional articles, in Russian, actually.
And the idea was right:
sysbench supports Postgres, and it supports Zipfian distribution.
Zipfian distribution is closer to the reality of how social media works, with this hot area.
But since then, PgBench already got this support.
And when we run PgBench, we can choose Zipfian distribution to be closer to reality.
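As a sketch, a custom pgbench script can use random_zipfian for the key lookup; the skew parameter, row count, and run options here are only illustrative:

    # write a custom script that picks IDs with a Zipfian (hot/cold) distribution
    printf '%s\n' \
      '\set aid random_zipfian(1, 100000000, 1.07)' \
      'SELECT abalance FROM pgbench_accounts WHERE aid = :aid;' \
      > zipf_select.sql
    # run it (100 million rows assumed, matching the scale used at initialization)
    pgbench -n -c 16 -j 4 -T 300 -f zipf_select.sql "$DATABASE_URL"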
But of course, it will be very far from reality still, right?
Because it's some kind of synthetic workload.
And how can we do better?
Is it worth noting that for some things, PgBench is fine?
Like, depending on exactly what we're testing,
I saw Andres Freund, for example,
when he published about the improvements
to connection management on the Microsoft blog,
very, very good blog post.
But just to show how the patch worked well
with memory management at high numbers of connections,
pgbench is great. So depending on exactly what you're testing, the nature of the workload may or may not matter so much, right?
Right, but this is for Postgres development itself, and it's about limits again. I'm talking about a general situation where a database engineer, like a DBA or DBRE, or some backend developer who wants to be more active in the database area, usually needs to compare, and they need to talk about their own system.
So in this case, PgBench can still be used, but it should be used very differently.
So we need to move from purely synthetic workloads, like something living in space, far from reality, closer to reality. And on the other side of the spectrum of possible workload types is the real production workload. And the main approach here is to have mirroring of the production workload, which should be very low overhead, which is tricky. Imagine some proxy which receives all queries from the application and sends these queries to the main node, which is the production node, and also sends them to a shadow production node, ignoring the responses from the second. It would be great to have, right?
It doesn't exist; I don't know of any such tool developed yet, but hopefully it will
be created soon.
There are some ideas.
A previous employer of mine, GoCardless, released a tool called pgreplay-go.
And that... this is different, I know it's different, but the aim is the same, right? It's to replay, I mean...
Yeah, well, if we talk about mirroring, of course the aim is the same: to have a workload similar to production, ideally identical, right? But with mirroring, you can achieve identical.
With replaying, it's much more difficult.
But I would say that most people's workload,
like yesterday's workload is very likely to be similar
to today's and to tomorrow's, right?
Like if you're talking about getting close to...
Right.
But yeah, I do appreciate that you can't... I think also, even mirroring production isn't quite what we need, right? Like, we kind of want to anticipate, right? When we benchmark, we want to know how it will perform not with today's workload but with tomorrow's, you know, or in a month's time.
If we talk about regression,
I would choose replay, because in this case I can directly compare, for example, the behavior of the current Postgres and the next major Postgres version, or the behavior of Postgres on different Ubuntu versions, right? Or, for example, compare different instance types. And of course, you can take it and put it into your fleet, like a standby node in this case, if the glibc version didn't change and a major Postgres upgrade didn't happen. For some kinds of testing, you can do that, and it would be a kind of A/B testing or blue-green deployment, because some part of your customers will really use it. But mirroring gives you fewer risks, because if something goes wrong, customers will not notice, but you can still compare all characteristics, like CPU level and so on and so on.
Replay was the next item on my list.
I like it.
Okay, great.
I spent a couple of years exploring this path.
Right now, I don't use it directly,
like replaying from logs.
I believe in this idea less.
My belief dropped a little bit
after I spent some time with pgreplay and pgreplay-go.
Why?
Because the hardest part of replaying workload
is how to collect the logs.
Like logs collection under heavy load is a big issue
because it's quickly becoming a bottleneck
and the observer effect hits you very quickly.
Fortunately, these days we can log only some fraction of the whole stream of queries, because there are a couple of settings for sampling, right? For example, we can say we want to log all queries, but only 1% of them. And we can do it at the transaction level, which is good. And then when replaying, we can say, okay, let's go 100x in terms of what we have. But in this case, the problem will be that you will work with a smaller set of IDs, for example, right?
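A sketch of the sampling settings being referred to here (available in recent Postgres versions; the 1% value is only an example):

    # log every statement of roughly 1% of transactions
    psql -Xc "ALTER SYSTEM SET log_transaction_sample_rate = 0.01"
    psql -Xc "SELECT pg_reload_conf()"
    # alternatively, per-statement sampling: log ~1% of statements taking at least 10 ms
    # psql -Xc "ALTER SYSTEM SET log_min_duration_sample = '10ms'"
    # psql -Xc "ALTER SYSTEM SET log_statement_sample_rate = 0.01"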
Yeah.
You need to find a way.
And we come to the same problem of distribution,
how to find proper distribution.
So in reality, what we usually do is combine a synthetic approach, like simulation: we take information from pg_stat_statements, right? And then we try to find a way to get some query parameters. And then we can use various tools, for example, JMeter, or even pgbench. With pgbench, you can feed multiple files using -f, and you can balance them, you can put some weights in terms of how often each file should be used, an at sign (@) and some number.
So in this case, you can say, I want this file to be used two times more often than the other file, for example.
And you can have some kind of workload.
You can try to approximate the situation you have in production, at least having the top 100 queries by calls and the top 100 queries by total time.
Usually we combine both of these.
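A sketch of such a weighted run; the script file names are made up, standing in for the top query groups extracted from pg_stat_statements:

    # top_by_calls.sql is chosen twice as often as top_by_total_time.sql (@weight syntax)
    pgbench -n -c 20 -j 4 -T 600 -P 10 \
      -f top_by_calls.sql@2 \
      -f top_by_total_time.sql@1 \
      "$DATABASE_URL"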
And you can replay quite reliably, in terms of reproduction.
Also, a big point: replay gives you the ability to replay multiple times, right?
So you can use the workload you already created, replay it again and again, and have a whole week playing with it.
With mirroring, you don't have that.
Usually, it's only right now.
If you want to compare 10 options, you need to run 10 instances right now.
And if tomorrow you have some other idea, you already don't have exactly the same workload.
So there are pros and cons for these options as well.
But I still think
mirroring is very powerful for regression testing in some cases. Replay, well, replay is hard,
but possible if you understand the downsides of it, right?
Yep. Makes sense. On the pros and cons side of things, where do you end up? Is there a size? I'm thinking replay might be good up to a certain volume, like for smaller shops, where checking performance is still important, but they haven't got a huge number of transactions per second. I'm guessing replay is a nicer way to go there.
Yeah. I just want to warn folks about this
observer effect when you enable a lot of logging.
There you definitely need to understand the limits.
So I would test it properly, understand how many lines per second we can afford in production in terms of disk IO, first of all.
And of course, the logging collector should be enabled, with CSV format; it's easier to work with. And I've
myself put a couple of very important production systems down just because of this type of mistake. So I would like to share this: don't do it. I knew it would happen, and still I was not careful enough, and I had a few minutes of downtime in a couple of cases.
But in one case, we implemented one kind of crazy idea. Also, any change like enabling the logging collector, and changes to the logging system in Postgres, requires a restart. This is a downside as well. So you cannot, for example, say: I want to temporarily log 100% of queries to RAM using a RAM disk, tmpfs, and then switch back. But in one case, it was my own social media startup, so I decided I could afford that risk and decided to go that route. We implemented quite an interesting idea. We allocated some part of memory to that, and we started quite intensive log rotation, so we archived logs as each new log was created, so as not to reach the limit in terms of memory. And in this case, we enabled 100% of queries to be logged, and we had good results. So for replay, we collected a lot of logs and did it right. So it's possible, if you can afford restarts. Actually, the downside was that in that system we enabled it permanently, so just one restart and it's working, but of course there is some risk of reaching the limit and having downtime. It's quite a dangerous approach.
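A rough sketch of that kind of setup, with made-up paths and sizes; logging_collector needs a restart, and a real script must be careful never to touch the file Postgres is currently writing to:

    mount -t tmpfs -o size=2G tmpfs /var/lib/postgresql/log_ram   # RAM disk for logs
    psql -Xc "ALTER SYSTEM SET logging_collector = on"            # requires restart
    psql -Xc "ALTER SYSTEM SET log_destination = 'csvlog'"
    psql -Xc "ALTER SYSTEM SET log_directory = '/var/lib/postgresql/log_ram'"
    psql -Xc "ALTER SYSTEM SET log_rotation_size = '64MB'"        # rotate aggressively
    psql -Xc "ALTER SYSTEM SET log_min_duration_statement = 0"    # log 100% of queries
    # after restarting Postgres, keep moving rotated files off the RAM disk, e.g.:
    while true; do
      find /var/lib/postgresql/log_ram -name '*.csv' -mmin +1 -exec mv -t /mnt/log_archive/ {} +
      sleep 10
    done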
So let's rewind to the choice: replay versus mirror versus simulation. I actually call it a crafted workload.
Crafted workload? Oh, like the hybrid approach?
Yeah, you use pg_stat_statements, understanding the proportions of each query group, but pg_stat_statements doesn't have parameters. You extract parameters from somewhere else. For example, you can use pg_stat_activity, or actually you can use eBPF to extract parameters. It's possible. It's a very new approach, but I think it's very promising.
Have you ever used... there's a Postgres extension, I think by the PoWA team, in a monitoring tool?
Yes, if that's... yeah, that's a good idea. I haven't explored that path. I think it's also a valid idea. But the problem will be correlation, if a query has many parameters. All the ideas to use Postgres statistics, for example, to invent some parameters, some kind of almost-random ones but with some understanding of the distribution.
They are not so good, because correlation is a big thing, and usually you don't have CREATE STATISTICS.
Not many folks these days use CREATE STATISTICS, so it's hard.
But extracting from pg_stat_activity is, I think, my default choice.
But the default there for the query column is 1,024 characters only.
I think it should usually be increased to 10k, because if you have large queries, you definitely want to have them.
This is one of the defaults that also should be adjusted,
recalling our last episode.
Usually people do it, but unfortunately, again, you need a restart.
But once you increase it, you can collect samples from there.
The majority of queries will be captured.
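The setting behind that query column is track_activity_query_size; increasing it from the 1,024-byte default looks roughly like this (and, as mentioned, it needs a restart):

    psql -Xc "ALTER SYSTEM SET track_activity_query_size = 10240"  # bytes; default is 1024
    # restart Postgres for this to take effect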
The only problem will be how to join these sets of data.
And the latest Postgres versions have the query ID both in pg_stat_activity and pg_stat_statements, right? And maybe in logs, I don't remember exactly. But if you have older Postgres, for example Postgres 13, 12, 11, you need to use a library called pg_query to join these sets, and for each entry in pg_stat_statements to have multiple examples from pg_stat_activity.
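On Postgres 14 and newer, a sketch of that join could look like this, assuming compute_query_id is enabled so pg_stat_activity exposes query_id:

    psql -Xc "ALTER SYSTEM SET compute_query_id = on"
    psql -Xc "SELECT pg_reload_conf()"
    # sample parameterized query texts and attach them to normalized pg_stat_statements entries
    psql -Xc "
      SELECT s.queryid, s.calls, a.query AS sample_with_parameters
      FROM pg_stat_statements s
      JOIN pg_stat_activity a ON a.query_id = s.queryid
      WHERE a.state = 'active';"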
So this is how we can do
some kind of crafted workload
and then we use some tool, even pgbench is possible, to generate, to simulate the workload.
We understand it's far from reality,
but it's already much better
than purely synthetic workload.
So let's talk about the general picture very briefly.
I have a couple of talks about benchmarking.
I call them database experiments.
Regression is number one case for me usually.
But if you imagine something as input and something as output, the input is quite simple. It's just the environment: instance type, Postgres version, the database you have (actually, the database you have is already an object), then the workload, and maybe some delta: if you want to compare, like, before and after, you need to describe what the difference is. You can have multiple options there, of course, if you compare multiple options, right?
I mean it's just like a science experiment, right?
You have all the things that you're keeping the same,
and then hopefully just the one thing that you're changing,
or, you know, if it's multiple things,
the list of things that you're changing,
and then you've got your null hypothesis,
and you've got your, you know, checking that.
In fact, actually, that's a good question.
Do you take a statistical approach to this? Like, are you calculating how long you need to run these for it to be significant?
We will come to the output and analysis. We discussed workload too much, because it's very hard to get the workload right in each case, but the most interesting part is the output and analysis. And for that, you need to understand performance in general. Without that understanding, you will have something measured, like, okay, we have this number of TPS. But, for example, a very basic example: we have this number of TPS, or we had controlled TPS and we have these latencies for each query, with average latency or percentiles involved. But the question is,
where was the bottleneck?
And referring to Brendan Gregg's book, chapter number 12,
question number one is: can we double throughput and reduce latencies?
Why not, right?
To answer that, you need to understand the bottleneck.
And to understand the bottleneck, we need to be experts in performance,
system performance in general, and databases particularly. So we need to be able to understand: are we I/O-bound or CPU-bound? If CPU, what exactly happened? If I/O, what is causing it? To give you an example: a couple of years ago, with one client, we considered switching from Intel to AMD on Google Cloud.
And we somehow jumped straight to database benchmarking.
And it showed that on AMD all latencies were higher, and throughput, if we go with stress testing,
was worse.
Even if throughput was controlled to be the same, latencies were higher. Like, what's happening?
And without analysis, we would just conclude that it's worse. But with analysis, we quite quickly
understood, we suspected, that disk I/O had issues there on the AMD instances, EPYC, I think it was the second generation of EPYCs. So then we went down to
micro-benchmarks, finally, and checked with fio, and understood that, indeed, if we take Postgres out of the picture and compare just basic fio tests (everyone should know how to do them, it's quite easy: random reads, random writes, sequential reads, sequential writes), we see that indeed these disks behave differently. We cannot reach the advertised limits, like two gigabytes per second of throughput, 100,000 IOPS. We checked the documentation and saw we should have them. On Intel, we have them. On AMD, we don't have them. We went to Google engineers, Google support,
and at some point they admitted that there was a problem. By the way, a disclaimer: right now it looks like they don't have it. Recently we checked once again, and it looks like they don't have it anymore, probably fixed. And it was definitely not a problem with AMD itself; it was something in Google Cloud particularly. But this raises the question: why didn't we use fio checks in the very beginning, right?
And the answer is, I think we should always start with micro-benchmarks, because they are actually much easier to conduct.
Less work, right?
And faster, right?
You just take, for instance, your virtual machine or a real machine with disks, and you run fio for a disk I/O check.
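A basic fio check of the kind described here, as a sketch; the test file path, size, and durations are placeholders:

    # 4k random reads with direct I/O; repeat with --rw=randwrite, --rw=read, --rw=write
    fio --name=randread --filename=/var/lib/postgresql/fio.test --size=10G \
        --rw=randread --bs=4k --iodepth=64 --ioengine=libaio --direct=1 \
        --runtime=60 --time_based --group_reporting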
And you run, for example, sysbench to check CPU and memory and compare.
Same tests lasting some minutes.
It's very fast.
And compare.
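And the corresponding CPU and memory micro-benchmarks with sysbench can be as simple as this (durations are placeholders):

    sysbench cpu --threads="$(nproc)" --time=60 run
    sysbench memory --threads="$(nproc)" --time=60 run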
And in this case, you don't need to think about a complex workload or about filling all the caches.
Well, okay, in the case of memory you probably do, but with fio disk checks, it's easy.
I mean, you don't need to fill the buffer pool and so on.
So this will already give you insights about the difference in machines. And also an important point,
we actually had it in a couple of cases where we automated benchmarks: I think it should always be done, even if you don't change the instance type or disk type. If you run benchmarks on different actual machines, even if it's the same type, you need to compare them, because sometimes you get a different CPU even with the same instance type.
Or you can also have some faulty RAM, for example, or faulty disks.
So micro-benchmarks help you check that you are comparing apples to apples in terms of environment.
Yeah, and I guess you made some really good points.
It's not just about not trusting the cloud providers.
It could also be something else that's gone wrong.
Yeah, especially since it's becoming more popular
to consider moving out of clouds.
I just read another article this morning about this.
So cost optimization, yeah.
So, yeah, and then analysis.
Analysis is a huge topic, really huge.
But my approach is: let's collect as many artifacts as possible and store them.
Let's automate collection of artifacts.
So, for example, let's have snapshots before and after each run for all pg_stat system views, including the pg_stat_statements extension, of course, so we are able to compare them. And of course, all logs should be collected. For example, one of the mistakes is: okay, we have better TPS, but if you check the logs, you see that some of the transactions failed; that's why it was so fast on average.
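A minimal sketch of such before/after snapshots, dumping a couple of pg_stat views to CSV files (the file names are made up):

    psql -Xc "\copy (SELECT * FROM pg_stat_statements) TO 'pgss_before.csv' CSV HEADER"
    psql -Xc "\copy (SELECT * FROM pg_stat_database) TO 'pgsd_before.csv' CSV HEADER"
    # ... run the benchmark ...
    psql -Xc "\copy (SELECT * FROM pg_stat_statements) TO 'pgss_after.csv' CSV HEADER"
    psql -Xc "\copy (SELECT * FROM pg_stat_database) TO 'pgsd_after.csv' CSV HEADER"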
Error checking is very important. And if you don't collect logs automatically,
you will probably be very sad afterwards, because if you have already destroyed the machine or
reinitialized it, you don't have those logs. Artifact collection should be automated,
and as much as possible, you need
to collect everything.
Yeah, I was going to say, because you could be spending hours on these things, in terms of even just letting them run, never mind all the setup and everything. It's not trivial to run it again to get these things. Or if something looks off, if you're presenting it to your team or somebody doesn't trust the results, they can look into it as well.
Yeah.
Actually, one of the concepts Brendan Gregg explains in his book and talks is that benchmarks should not be run without humans involved. Because you can have good thoughts just checking artifacts afterwards, but during the benchmark you can have some idea to check something else, and it's better to be there, at least in the beginning, for the first several runs.
But still, benchmarks can be automated and run without humans, but it just requires a better level of implementation.
And I agree with you.
If you collect artifacts, your colleagues may raise new questions.
If everything is automated, you can repeat it.
But of course, it's usually quite expensive in terms of resources, hardware and human time, engineering time.
But I also wanted to mention, in terms of artifact collection, one of the artifacts I consider is dashboards and charts from monitoring. And I'm very surprised that most monitoring systems still don't think about non-production and benchmarking. If you, for example, have Grafana and register each new instance, which you have temporarily, you will have spam in your host list, right? But some instance lived only a couple of days, for example, and is already destroyed, but we still need the data to be present. And instance names change depending on benchmarking activities.
And you have spam.
What I like is to use Netdata, which is installed using a one-liner.
And it's like in-place monitoring, already there.
And then you can export dashboards manually.
Unfortunately, it's only a client-side, browser feature, so automation here is not possible, although I raised this with the Netdata developers and they agreed that it would be great to have an API for exporting and importing dashboards. But at least right now you can export to a file, and then later you can open several dashboards for several runs and directly compare CPU, RAM, all the things.
I would like to have this in all monitoring systems, especially automated with an API.
It would be great.
Right now, we can provide in our show notes a couple of examples of how my team conducts this kind of benchmark.
It's quite good.
We sometimes store them openly.
So, for example: let's enable WAL compression.
Will it be better or worse?
And we can conduct experiments.
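For that particular experiment, the change under test is a single setting, something like:

    psql -Xc "ALTER SYSTEM SET wal_compression = on"
    psql -Xc "SELECT pg_reload_conf()"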
We store them.
This kind of experiment, it's publicly available.
And you have artifacts there.
So you can load them later into Netdata and compare with and without compression, and play with those dashboards yourself. And of course, we automate everything, so anyone can reproduce the same type of testing in their own situation. So I actually like benchmarking; it's a very powerful thing, but of course it requires a lot of knowledge. So we try to help other companies in this area as we can, in terms of consulting as well.
Nice. In terms of resources, you mentioned Brendan Gregg's book. Are there any other
case studies that you've seen people publish that you think are really worth reading through
or any other guides?
Oh, yes, I like Tomas Vondra's micro-benchmarks for file systems in the Postgres context.
It's so cool.
It's not only microbenchmarks.
I think PgBench was used there.
In terms of benchmarks people conduct,
I very rarely see super informative benchmarks,
but usually when I see some benchmark,
I quite quickly find some
issues with it. And, like, I have questions and want to improve it. So I cannot mention
particular benchmarks except this file system benchmark. But maybe there are some cases.
Yeah, there's only one that comes to mind. In addition, yeah, there's some great ones in the
Postgres community,
but slightly outside of that,
I think Mark Callaghan's doing some good work publishing comparisons between database management systems
for different types of benchmarks.
I think that's pretty cool and worth checking out for people.
Well, I like if everything is open and can be reproduced
and fully automated.
And we can mention here this new Postgres-based project
called Hydra,
like open-source Snowflake.
And we had an episode with them
on Postgres TV recently.
And they used a tool from ClickHouse.
But of course,
the first question there was
why such a small instance and such a small database size?
And they admitted this is so, but this is what this tool does.
And I hope they will publish a new episode of benchmarks with a more realistic situation for analytical cases.
Because in an analytical case, we usually have a lot of data, not enough RAM, and so on.
And instances are usually bigger, and so on.
But what I like there is that a kind of de facto standard tool for analytical databases was used.
And you can compare a lot of different systems, including Postgres itself, Timescale, and this new Hydra project.
And you can reproduce yourself.
It's great.
So this is the kind of approach that I do like.
Although I still have questions about whether it's realistic or not.
Yeah, of course.
Wonderful.
Any last things you wanted to make sure we covered?
Yeah, Brendan Gregg's book, I have it here, actually: the second edition of Systems Performance. A lot of good things; you'll become a better performance engineer, if you want to conduct benchmarks and understand what's happening under the hood. Without an understanding of internals, it's quite black-box benchmarking, and it doesn't work properly. And I did it also many times and failed, in terms of: okay, we see A, but we conclude B, but what happened in reality was C, right?
So, awesome. Well, thank you so much. Thanks everyone for listening, and catch you next week.
Thank you. As usual, a reminder: please distribute, help us grow. Michael showed me the numbers this morning, and they look very good, I would say.
So we are growing, but I would like to grow more and more, of course,
because I like feedback, and it drives me to share.
Of course, I like when people suggest ideas.
Actually, we have a few more suggestions that happened last week, so probably we should review them and use them in the next episodes.
But I particularly ask everyone to consider sharing with your colleagues,
at least those episodes which you think can be helpful to them.
And as usual, please subscribe.
Don't forget to subscribe and like, but sharing is most important.
Right.
Thank you.
We love your feedback.
Thank you so much.
Yeah, absolutely.
Thanks everyone.
Bye now.
Bye bye.