Zero downtime deployment with cord file #439

Merged: 1 commit, Sep 7, 2023

djmb (Collaborator) commented Aug 31, 2023

Currently, when replacing a container, we:

  1. Boot the new container
  2. Wait for it to become healthy
  3. Stop the old container

Traefik keeps sending requests to the old container until it notices that the container is unhealthy. But the container may have stopped serving requests before that point, which can result in errors.

To get around that, the new boot process is:

  1. Create a directory with a single file on the host
  2. Boot the new container, mounting the cord file into /tmp and
    including a check for the file in the docker healthcheck
  3. Wait for it to become healthy
  4. Delete the healthcheck file ("cut the cord") for the old container
  5. Wait for it to become unhealthy and give Traefik a couple of seconds
    to notice
  6. Stop the old container

The extra steps ensure that Traefik stops sending requests before the old container is shut down.
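
The ordering of the new boot process can be sketched as a small simulation. This is only an illustration of the sequence, not Kamal's implementation: it runs no Docker or Traefik commands, and the function names, the `cord` file name, and the event strings are all hypothetical. The healthcheck is modelled as "the app check passes and the cord file exists", and the pause that gives Traefik time to notice is left as a comment.

```python
import tempfile
from pathlib import Path

CORD_NAME = "cord"  # hypothetical file name, chosen for this sketch


def create_cord(host_dir: Path) -> Path:
    """Step 1: create a directory on the host containing a single cord file."""
    host_dir.mkdir(parents=True, exist_ok=True)
    cord = host_dir / CORD_NAME
    cord.touch()
    return cord


def is_healthy(cord: Path, app_ok: bool = True) -> bool:
    """Stand-in for the Docker healthcheck: app check AND cord file present."""
    return app_ok and cord.exists()


def cut_cord(cord: Path) -> None:
    """Step 4: delete the cord file so the old container's healthcheck fails."""
    cord.unlink(missing_ok=True)


def replace_container(old_cord: Path, new_host_dir: Path) -> list:
    """Simulate the six steps in order and return a log of events."""
    events = []
    new_cord = create_cord(new_host_dir)               # step 1
    events.append("created cord for new container")
    events.append("booted new container")              # step 2 (simulated)
    if not is_healthy(new_cord):                       # step 3
        raise RuntimeError("new container never became healthy")
    events.append("new container healthy")
    cut_cord(old_cord)                                 # step 4
    events.append("cut cord for old container")
    if is_healthy(old_cord):                           # step 5
        raise RuntimeError("old container still healthy")
    # A real deploy would also pause here so Traefik can notice.
    events.append("old container unhealthy")
    events.append("stopped old container")             # step 6 (simulated)
    return events


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old_cord = create_cord(root / "old")
    for event in replace_container(old_cord, root / "new"):
        print(event)
```

The point of the sketch is the ordering: the old container is only stopped after its cord file is gone and its healthcheck has failed, so the proxy has a chance to drop it from rotation first.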

Doc PR: basecamp/kamal-site#21

djmb added a commit to basecamp/kamal-site that referenced this pull request Aug 31, 2023
@djmb djmb force-pushed the zero-downtime-deploy-file branch from 73943ae to 8a41d15 Compare September 6, 2023 13:35
Base automatically changed from remote-env-file to main September 7, 2023 08:34
@djmb djmb merged commit aa99998 into main Sep 7, 2023
6 checks passed
@djmb djmb deleted the zero-downtime-deploy-file branch September 7, 2023 08:34