Describe the bug
The notification engine fails to remove the on-deployed key from the notified.notifications.argoproj.io annotation before the ArgoCD app state moves back to Synced & Healthy. As a result, the on-deployed notification is intermittently not sent.
This happens when notification-engine picks up an app from the queue only after its state has already moved to Synced and Healthy. It can happen when the queue is large and/or the notification services are slow to respond.
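The race described above can be illustrated with a small model. This is a simplified sketch, not the notifications-engine code: the `process` and `run` helpers, the state dictionaries, and the use of a plain set in place of the notified.notifications.argoproj.io annotation are all assumptions made for illustration. It assumes the key is cleared only when the engine observes a state in which the trigger condition is false.

```python
# Simplified model of the "already notified" bookkeeping (illustration only,
# not the actual notifications-engine implementation).

def process(app_state, notified, trigger="on-deployed"):
    """Process one observed app state; return True if a notification is sent."""
    triggered = (
        app_state["phase"] == "Succeeded" and app_state["health"] == "Healthy"
    )
    if not triggered:
        # The key is cleared only when a non-matching state is observed.
        notified.discard(trigger)
        return False
    if trigger in notified:
        # Key still present from the previous deployment: nothing is sent.
        return False
    notified.add(trigger)
    return True


def run(observed_states, notified):
    """Feed a sequence of observed states through process()."""
    return [process(state, notified) for state in observed_states]


DEPLOYED = {"phase": "Succeeded", "health": "Healthy"}
SYNCING = {"phase": "Running", "health": "Progressing"}

# Engine keeps up: it sees the intermediate state, clears the key, then notifies.
fast = run([SYNCING, DEPLOYED], notified={"on-deployed"})  # -> [False, True]

# Engine is behind: it only sees the final state, so the stale key suppresses
# the notification for the new deployment.
slow = run([DEPLOYED], notified={"on-deployed"})  # -> [False]
```

In the second run the engine never observes the app outside Synced & Healthy, which matches the intermittent missing notifications reported here.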
Our system
number of applications: >160
notification services:
slack
opslevel (webhook)
internal deployment registry (webhook)
trigger.on-deployed: |
  - description: Application is synced and healthy.
    send:
    - app-deployed
    when: app.status.operationState.phase in ['Succeeded'] and app.status.health.status == 'Healthy'
Reproducible steps
add (1) a Slack notification service and (2) a notification service that delays its response (e.g. by 30s)
annotate all apps with subscriptions to on-sync-running, on-sync-succeeded and on-deployed
kubectl annotate application --all -n argocd notifications.argoproj.io/subscribe.on-sync-running.test=""
kubectl annotate application --all -n argocd notifications.argoproj.io/subscribe.on-sync-succeeded.test=""
kubectl annotate application --all -n argocd notifications.argoproj.io/subscribe.on-deployed.test=""
kubectl annotate application --all -n argocd notifications.argoproj.io/subscribe.on-deployed.slack="#my-channel"
Note: this leaves notification-engine stuck waiting for the response from the slow notification service during sync-running and sync-succeeded while ArgoCD completes the sync and rollout of the test app, whose state has by then already transitioned to Synced and Healthy.
PS. This issue occurs more easily with:
more apps (with frequent state changes)
more notification services (with slow response)
smaller sync window (from OutOfSync to Synced & Healthy)
chatchai-outreach changed the title from "Notifications are not being sent intermittently when notification queue is overloaded and/or notification services are slow to respond" to "Notifications intermittently not being sent when notification queue is overloaded and/or notification services are slow to respond" on Jan 11, 2023.
Notification services used in the reproducible steps:
service.slack: |
  token: $slack-token
  icon: ":argo:"
  signingSecret: $slack-signing-secret
service.webhook.test: |
  url: <slow-notification-service-url>
  headers:
  - name: X-Delay-Duration
    value: 30s
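For anyone trying to reproduce this, the slow notification service behind the `test` webhook could be approximated with a small HTTP server. The sketch below is an assumption, not the service used in this report: it assumes the X-Delay-Duration header holds a value such as `30s`, and the `parse_delay`, `SlowHandler` and `serve` names, the accepted units and the port are all hypothetical.

```python
import re
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_delay(value, default=0.0):
    """Parse a duration such as '30s', '500ms' or '2m' into seconds.

    The accepted format is an assumption; unknown values fall back to default.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m)", (value or "").strip())
    if not match:
        return default
    amount, unit = float(match.group(1)), match.group(2)
    return amount * {"ms": 0.001, "s": 1.0, "m": 60.0}[unit]


class SlowHandler(BaseHTTPRequestHandler):
    """Answers every request with 200, after the requested delay."""

    def _respond(self):
        # Drain the request body (if any) before sleeping.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        time.sleep(parse_delay(self.headers.get("X-Delay-Duration")))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

    # Accept both methods, since the webhook template may use either.
    do_GET = _respond
    do_POST = _respond


def serve(port=8080):
    """Run the slow service until interrupted (hypothetical port)."""
    HTTPServer(("0.0.0.0", port), SlowHandler).serve_forever()
```

Calling `serve()` and setting `<slow-notification-service-url>` to the address of this server should make every notification sent through the `test` service take about 30 seconds.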
Version
notifications-engine: v0.3.1
argocd:
{
  "Version": "v2.4.16+7b5899b",
  "BuildDate": "2022-11-01T21:17:46Z",
  "GitCommit": "7b5899be33d16af7c57523d85ebacaa6f345cb95",
  "GitTreeState": "clean",
  "GoVersion": "go1.18.8",
  "Compiler": "gc",
  "Platform": "linux/amd64",
  "KustomizeVersion": "v4.4.1 2021-11-11T23:36:27Z",
  "HelmVersion": "v3.8.1+g5cb9af4",
  "KubectlVersion": "v0.23.1",
  "JsonnetVersion": "v0.18.0"
}