This tool monitors and restarts unhealthy docker containers.
This functionality was proposed to be included with the addition of HEALTHCHECK
, however didn't go through.
This tool is a workaround till there is native support for --restart-on-unhealthy
or similar (moby/moby#22719).
This docker-autoheal
is a rewrite of the excellent docker-autoheal
from Will Farrell, but in golang.
It is fully compliant with willfarrell/docker-autoheal
, plus few goodies such as metrics (see below)
AUTOHEAL_INTERVAL
,AUTOHEAL_START_PERIOD
,AUTOHEAL_DEFAULT_STOP_TIMEOUT
andCURL_TIMEOUT
are duration and are treated as seconds if no unit provided. For more precision5
can be written5s
.DOCKER_SOCK
env variable can be written to more standardDOCKER_HOST
.DOCKER_SOCK
can still be used.
Env var name | Default | Description |
---|---|---|
AUTOHEAL_CONTAINER_LABEL | autoheal | Specify the name of the label optin'ing to autoheal process. A special value all means all containers are watched |
AUTOHEAL_INTERVAL | 5s | Interval at which containers health is checked |
AUTOHEAL_START_PERIOD | 0s | Warmup time before running the first check |
AUTOHEAL_DEFAULT_STOP_TIMEOUT | 10s | Time to give to containers before shutting down |
DOCKER_HOST (or DOCKER_SOCK) | /var/run/docker.sock | Path/URI of docker socket |
CURL_TIMEOUT | 30s | Timeout when interacting with docker |
Label name | Description |
---|---|
autoheal | if true or AUTOHEAL_CONTAINER_LABEL=all , means this container is watched |
autoheal.stop.timeout | Per containers override for stop timeout seconds during restart, as a duration, e.g. 120s |
Metrics name | Type | Description |
---|---|---|
check_count | counter | Count how many times containers have been checked |
check_failure_count | counter | Count how many failures happened trying to check containers |
restart_count | counter | Count how many containers have been restated |
restart_failure_count | counter | Count how many restart failures happened |