Describe the bug
Very infrequently, two run completion events are created when a single pipeline job completes in Kubeflow.
This happens because the Argo workflow watched by the event processor receives two updates within a short period of time. If the gap is small enough, the patch that marks the event as processed has not yet been applied for the first update when the second one arrives, so both updates create a run completion event.
To Reproduce
It is only reproducible when the webhook call from the event processor is slow enough for the second update to arrive before the first has been marked as processed. I have managed to reproduce it only once, after restarting the manager, which seemed to slow the call down enough.
Expected behavior
Only one run completion event should be produced when a pipeline run completes.
Version and configuration
Master version e5c5150
Solutions
Could patch straight away, but if the webhook call then fails the patch would need to be rolled back, so this is not viable.
Possibly use a mutex lock on a struct keyed by the run id; this needs investigation.