[8.14](backport #39332) More resilient DRA packaging (#39341)
Occasionally, packaging steps in the DRA pipeline get stuck[^1].
The build then exceeds the global pipeline timeout (currently 1 hour)
and the job is cancelled.

This commit increases the global timeout to 90min, adds one retry per
step and limits the runtime per step to 40min (so that a single stuck
step doesn't exhaust the entire global timeout).
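
As a minimal sketch of the per-step settings described above, a single step could look like the fragment below. The label and command are placeholders rather than names from the actual pipeline; only `timeout_in_minutes` and the `retry` block mirror what this commit adds.

```yaml
# Illustrative fragment only; the label and command are hypothetical.
steps:
  - label: "example packaging step"   # placeholder label
    command: "make example-target"    # placeholder command
    timeout_in_minutes: 40            # cancel this step's job once it has run for 40 minutes
    retry:
      automatic:
        - limit: 1                    # automatically retry the job at most once
```

Under these settings a single stuck step can take at most 2 × 40 = 80 minutes across both attempts, which fits inside the 90 minute global timeout. This ignores agent wait time and any time spent in earlier steps.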

Finally, we suppress Slack notifications if the retry recovered the step.

In a future PR we will consider also adding a daily DRA build to cover
for cases where the retries didn't help and there were no subsequent
commits to trigger a new build.

[^1]: https://buildkite.com/elastic/beats-packaging-pipeline/builds/114

(cherry picked from commit 726f6e9)

---------

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
mergify[bot] and dliappis committed May 1, 2024
1 parent 6947755 commit ac4e882
Showing 1 changed file with 32 additions and 0 deletions.
32 changes: 32 additions & 0 deletions .buildkite/packaging.pipeline.yml
@@ -44,6 +44,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
commands:
- make build/distributions/dependencies.csv
- make beats-dashboards
@@ -62,6 +66,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
commands:
- make build/distributions/dependencies.csv
- make beats-dashboards
@@ -86,6 +94,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -116,6 +128,10 @@ steps:
provider: "aws"
imagePrefix: "${AWS_IMAGE_UBUNTU_ARM_64}"
instanceType: "${AWS_ARM_INSTANCE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -142,6 +158,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "c2-standard-16"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*

@@ -161,6 +181,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "${GCP_DEFAULT_MACHINE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -191,6 +215,10 @@ steps:
provider: "aws"
imagePrefix: "${AWS_IMAGE_UBUNTU_ARM_64}"
instanceType: "${AWS_ARM_INSTANCE_TYPE}"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
matrix:
@@ -217,6 +245,10 @@ steps:
provider: gcp
image: "${IMAGE_UBUNTU_X86_64}"
machineType: "c2-standard-16"
timeout_in_minutes: 40
retry:
automatic:
- limit: 1
artifact_paths:
- build/distributions/**/*
