Add a new service pgss_dealloc #331
base: master
If the number of deallocs per second is too low, we can change it to the number of deallocs per millisecond.
I'm a bit dubious about this service. I agree that this is something you should keep your eyes on, but I don't think it would play very well as a check here.
The main problem is that if you schedule the check too frequently it will be a bit useless. For instance, if you schedule it every 5 minutes, how do you differentiate between "there was once 1 deallocate and then none" and "there is 1 deallocate every 5 minutes" from this service's point of view? The only way to know if there's really a problem is either:
- the service is constantly raising a problem
- the frequency of the service moving from ok to problem is high
But if you're in the first case, it's likely that the global performance will immediately drop by a huge factor, so it's unlikely that you won't notice there's a problem. And the second isn't a good way to spot a problem.
The fact that you only return (and handle thresholds as) a rate and not also the raw number probably exacerbates this problem.
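To make the ambiguity concrete, here is a minimal sketch in Python (not the project's Perl) of what a check scheduled every 5 minutes would report in the two scenarios above. The counter samples are made up, and `check_states` and `warn_delta` are hypothetical names used only for illustration.

```python
def check_states(samples, warn_delta=1):
    """Return the state reported at each run, given the cumulative
    dealloc counter value sampled at each run (hypothetical helper)."""
    states = []
    for prev, cur in zip(samples, samples[1:]):
        delta = cur - prev  # deallocs since the previous run
        states.append("WARNING" if delta >= warn_delta else "OK")
    return states


# Scenario 1: a single dealloc, then none.
one_off = [0, 1, 1, 1, 1]
# Scenario 2: one dealloc every 5 minutes.
recurring = [0, 1, 2, 3, 4]

print(check_states(one_off))
print(check_states(recurring))
```

The first run reports the same state in both scenarios. Only the history of later runs tells them apart, which is the point made above about a constantly raised problem versus a service flapping between ok and problem.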
-exitval => 127
) if @hosts != 1;

is_compat $hosts[0], 'check_pg_stat_statements_dealloc', $PG_VERSION_140 or exit 1;
Note that having postgres 14 doesn't mean that you updated the pg_stat_statements extension to get the needed field.
Indeed, I will add a test to check pgss' version.
I added several tests:
- pg_stat_statements version must be greater than or equal to 1.9
- pg_stat_statements has been created on target database
- pg_stat_statements has been loaded in shared_preload_libraries
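The three tests listed above could look roughly like the following sketch. It is written in Python for illustration and is not the code added in this PR. The function names are hypothetical, the SQL strings are only shown (not executed), and the parsing of `shared_preload_libraries` assumes a plain comma-separated list of library names.

```python
import re

# Queries one might run on the target database (shown for context only).
PGSS_VERSION_SQL = (
    "SELECT extversion FROM pg_extension "
    "WHERE extname = 'pg_stat_statements'"
)
PRELOAD_SQL = "SELECT current_setting('shared_preload_libraries')"


def version_at_least(extversion, minimum=(1, 9)):
    """Compare a dotted version string numerically, so '1.10' >= '1.9'."""
    parts = tuple(int(p) for p in re.findall(r"\d+", extversion))
    return parts >= minimum


def is_preloaded(setting, library="pg_stat_statements"):
    """Assumes a simple comma-separated list of library names."""
    names = [name.strip().strip('"') for name in setting.split(",")]
    return library in names


def pgss_dealloc_available(extversion, preload_setting):
    """extversion is None when the extension is not created in the database."""
    if extversion is None:
        return False, "pg_stat_statements is not created in this database"
    if not version_at_least(extversion):
        return False, "pg_stat_statements version is lower than 1.9"
    if not is_preloaded(preload_setting):
        return False, "pg_stat_statements is not in shared_preload_libraries"
    return True, "ok"


print(pgss_dealloc_available("1.9", "pg_stat_statements"))
print(pgss_dealloc_available("1.8", "pg_stat_statements"))
```

Comparing the version as a tuple of integers rather than as a string or float avoids treating a version such as 1.10 as older than 1.9.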
Yeah, I shouldn't report the rate as perfdata. I will change it to a counter. But for the threshold, I don't see any other way. My idea is to first graph the dealloc rate. For example, it will give you a mean rate of 100 deallocs per 5 minutes. Then, you add a threshold at 500. If you reach this threshold, your workload has changed, and you should understand why the dealloc rate increased before hitting a production issue.
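A rough sketch of that idea, in Python for illustration: the raw counter is reported as perfdata while the threshold is applied to the number of deallocs since the previous run. The names `dealloc_delta` and `evaluate` are hypothetical, the critical threshold is an assumption (only the 500 threshold comes from the example above), and the reset handling assumes the counter restarts from zero when the statistics are reset.

```python
def dealloc_delta(prev, cur):
    """Deallocs between two samples of the cumulative counter.
    If the counter went backwards (assumed to be a statistics reset),
    count from zero instead of returning a negative number."""
    return cur - prev if cur >= prev else cur


def evaluate(prev, cur, warning, critical):
    """Apply thresholds to the delta, report the raw counter as perfdata."""
    delta = dealloc_delta(prev, cur)
    if delta >= critical:
        state = "CRITICAL"
    elif delta >= warning:
        state = "WARNING"
    else:
        state = "OK"
    # 'c' is the Nagios unit of measurement for a continuous counter.
    perfdata = f"dealloc={cur}c"
    return state, delta, perfdata


# Usual workload: about 100 deallocs between two runs, threshold at 500.
print(evaluate(1000, 1100, warning=500, critical=1000))
# Workload change: 600 deallocs between two runs.
print(evaluate(1100, 1700, warning=500, critical=1000))
```

Keeping the raw counter in perfdata lets the graphing side derive any rate it wants, while the check itself only needs the previous sample to compute the delta.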
Hello!
This conversation on hackers reminded me that pg_stat_statements deallocs should be monitored.
I suggest adding such a service.
Cheers