Do not "STATE_UNKNOWN" on unfinished backups #12
base: master
check_borg
Outdated
[ "$?" = 0 ] || error "Cannot list repository archives. Repo Locked?"
if [ "$?" = 0 ]
then
if not ps aux | grep "${BORG}" | grep 'create' | grep "$BORG_REPO" >> /dev/null
You may want to match the patterns in their exact order with a single expression, using grep -E 'PATTERN1.*PATTERN2', awk '/PATTERN1.*PATTERN2/', or sed '/PATTERN1.*PATTERN2/!d'.
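A minimal sketch of that suggestion, assuming the process listing is fed in on standard input. The helper name `is_borg_create_line` is hypothetical and not part of check_borg, and the sketch assumes neither argument contains regular-expression metacharacters.

```shell
# Hypothetical helper: succeeds if any line on stdin contains the borg
# binary, then "create", then the repository path, in that order.
# Assumption: $1 and $2 contain no regex metacharacters.
is_borg_create_line() {
    borg_bin=$1
    repo=$2
    # -E: extended regex, -q: quiet (exit status only), --: end of options
    grep -Eq -- "${borg_bin}.*create.*${repo}"
}

# Possible use with a saved process listing:
#   ps -eo args > /tmp/ps.out
#   is_borg_create_line "${BORG}" "${BORG_REPO}" < /tmp/ps.out
```

One caveat with the single-pattern form: when `ps` is piped straight into `grep`, the `grep` process itself carries the whole pattern on its command line and can match its own entry. `pgrep -f`, which matches against the full command line and never reports itself, may be a simpler alternative where it is available.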
check_borg
Outdated
;;
esac

if [ -z "${last}" ]; then
echo "BORG CRITICAL, no archive in repository"
exit "${STATE_CRITICAL}"
if ps aux | grep "${BORG}" | grep 'create' | grep "$BORG_REPO" >> /dev/null
Same here.
check_borg
Outdated
if ps aux | grep "${BORG}" | grep 'create' | grep "$BORG_REPO" >> /dev/null
then
# A process most likely on the same repo is running
hours=$(ps -xo etime,cmd | grep "${BORG}" | grep 'create' | grep -v 'python' | grep "${BORG_REPO}" | sed 's/^[ ]*//g' | cut -d ' ' -f 1)
Nearly impossible to read IMHO. Needs some tweaking.
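One way to make the elapsed-time handling easier to follow might be to isolate the conversion in a small function. The sketch below assumes the usual `ps` etime layout of `[[DD-]HH:]MM:SS`; the name `etime_to_seconds` is hypothetical and does not come from the PR.

```shell
# Hypothetical helper: convert a ps "etime" value ([[DD-]HH:]MM:SS)
# into a number of seconds.
etime_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0
        t = $1
        # Optional "DD-" prefix
        if (index(t, "-") > 0) {
            split(t, d, "-")
            days = d[1]
            t = d[2]
        }
        # Remaining part is HH:MM:SS, MM:SS, or SS
        n = split(t, p, ":")
        if (n == 3)      secs = p[1] * 3600 + p[2] * 60 + p[3]
        else if (n == 2) secs = p[1] * 60 + p[2]
        else             secs = p[1]
        print days * 86400 + secs
    }'
}

# Example: etime_to_seconds "2-03:04:05" prints 183845
```

On systems whose `ps` supports the `etimes` output field (procps-ng does), the elapsed time is already reported in seconds and this conversion may not be needed at all.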
What about If there's an operation running on the borg repository blocking our commands,
I tried that first but I couldn't get with-lock to work (on borg 1.1.8). Could anyone point me to a command that would get the status of a locked backup?
I've at least documented the complexity, and realized the code had issues on finished backups (crashing).
Thanks a lot! It's still not the best approach, but better than the first for sure. The best approach IMHO would be to use borg itself to check for locks, as @bebehei already mentioned. Due to lack of time, I have not been able to check
The current implementation of this PR only checks whether a process is running on your local machine. It doesn't catch a process running on another machine. But let's take a step back first and think about the design of the plugin. I've got a few thoughts on this
So we've got a matrix of possibilities, and we should check either that a backup is running and in time, or that there is a list and it contains current snapshots. We could implement this either in a Nagios-specific way or in the plugin. In Nagios this would be an easy quick fix. We just explicitly use If we implement this in the plugin, we have to check the timestamp of the lock. According to my research, there is no
You have to remove the last 5 chars of the timestamp to get a Unix timestamp. So whenever a backup is running, you check the roster and check whether it is in range. Depending on the range, the plugin then states OK/WARN/CRIT. I like the second solution. It bloats the plugin, but it works correctly. And with an interface from the actual borg executable (e.g.
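The range check described above could be sketched roughly as follows. This only illustrates the idea: how the raw timestamp is read from the lock is left out, and the function name `lock_age_state`, its arguments, and the threshold values are all hypothetical. The state codes follow the usual Nagios plugin convention.

```shell
# Standard Nagios plugin return codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

# Hypothetical helper. Arguments:
#   $1 raw lock timestamp (Unix timestamp followed by 5 extra characters)
#   $2 current Unix time, e.g. "$(date +%s)"
#   $3 warning threshold in seconds (assumed value, not from the PR)
#   $4 critical threshold in seconds (assumed value, not from the PR)
# Prints the state code that matches the age of the lock.
lock_age_state() {
    raw=$1
    now=$2
    warn=$3
    crit=$4
    ts=${raw%?????}        # drop the last five characters
    age=$(( now - ts ))
    if [ "$age" -ge "$crit" ]; then
        echo "$STATE_CRITICAL"
    elif [ "$age" -ge "$warn" ]; then
        echo "$STATE_WARNING"
    else
        echo "$STATE_OK"
    fi
}
```

A caller would then do something like `exit "$(lock_age_state "$raw" "$(date +%s)" 7200 14400)"`, where `$raw` is whatever value was read from the lock.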
Hi there, Thanks for being so responsive! You are right. This patch works for us, as we run it as a local check on the host that pushes the backups. I was sending it to y'all as a courtesy in case it helps. I'll read the code for with-lock on the borg side to figure out how that works and maybe make a feature request to get the "running since" value from borg. G
Hi there,
I've made a hack so that check_borg sends "STATE_OK" when the backups aren't finished and there is a running process that has the same BORG_REPO as the one we're looking to check.
The date handling needed to convert the ps output might be a bit overly complicated, but it works for us.
G