feat: add timeout_seconds option to Kubeclient::Client#watch_entities #624
Conversation
Huh. I was expecting this would set some network timeout settings on our side (we have these configurable for non-watch actions; watch was originally deliberately excluded to be unlimited), but instead I see this is sending the API's server-side timeoutSeconds parameter (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/).
Do you know how this differs from client-side timeouts? If you want a timeout on a watch, when would you want it server-side / client-side / both? In any case, if the k8s API takes this param, we should support it 👍
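To illustrate what "server-side" means here: timeoutSeconds travels as a query parameter on the watch request, and the apiserver closes the stream after that many seconds, whereas a client-side timeout is enforced on the local socket. A minimal sketch of the parameter mapping, using a hypothetical watch_uri helper (not kubeclient's actual request-building code):

```ruby
require 'uri'

# Hypothetical helper (not kubeclient's real implementation): builds a watch
# URL, mapping a Ruby-style timeout_seconds keyword onto the API's
# timeoutSeconds query parameter.
def watch_uri(base_url, resource, timeout_seconds: nil)
  params = { 'watch' => 'true' }
  # With this set, the apiserver ends the watch after timeout_seconds seconds.
  params['timeoutSeconds'] = timeout_seconds if timeout_seconds
  URI("#{base_url}/#{resource}?#{URI.encode_www_form(params)}")
end

watch_uri('https://k8s.example.com/api/v1', 'pods', timeout_seconds: 300).to_s
# => "https://k8s.example.com/api/v1/pods?watch=true&timeoutSeconds=300"
```

Nothing on the client has to track time in this model; the server simply ends the stream, and the client sees EOF.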
Hello @cben, I am actually fighting to work around zombification when watching the k8s API on Azure for changes in a CRD. After a while, the watch appears to still be up, but we don't receive any more updates. My first attempt was making the watch http_client persistent (by hacking the Kubeclient::Common::WatchStream#build_client method), and it didn't solve my problem. As far as I know, the http gem does not allow setting a keep-alive on its underlying TCP socket. My second attempt uses this PR: I pass a timeoutSeconds when creating a watch. It limits the maximum duration of a watch, reducing (but not eliminating) the zombification. I think this PR should be merged, since timeoutSeconds is part of the Kubernetes watch API. What I'm going to try next:
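The timeoutSeconds workaround described above can be sketched as a restart loop: since the server-side timeout only bounds a single watch, the client re-opens the watch whenever the server closes the stream. watch_with_restart below is a hypothetical helper, assuming client.watch_entities accepts a timeout_seconds keyword (as in this PR) and returns an enumerable stream that ends when the connection closes:

```ruby
# Hypothetical helper: bound each watch with the server-side timeout, and
# re-open the watch whenever the server closes the stream. A zombified
# connection can then last at most timeout_seconds before we notice.
def watch_with_restart(client, resource, timeout_seconds: 300)
  loop do
    client.watch_entities(resource, timeout_seconds: timeout_seconds).each do |notice|
      yield notice
    end
    # The stream ended (server hit timeoutSeconds); loop around and re-watch.
  end
end
```

A production version would also remember the last seen resourceVersion and pass it when restarting, so the new watch resumes where the old one stopped instead of replaying state.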
@cben I just updated the README to mention how to force watches to have a maximum duration with the timeout_seconds keyword argument. If you know a way to make watches time out on the client side when inactive, I am interested. EDIT: I just found out about Kubeclient::Informer, and I managed to code something that works (even if the Azure K8s apiserver is very aggressive with connection resets).
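On the client-side-inactivity question above: one generic approach (independent of kubeclient; watch_with_idle_timeout and its parameters are hypothetical) is to read the stream on a separate thread and abort if no notice arrives within a given window:

```ruby
require 'timeout'

# Hypothetical client-side inactivity guard: raise Timeout::Error if no notice
# arrives from the stream within idle_timeout seconds. `stream` can be any
# enumerable of notices (e.g. the object returned by a watch call).
def watch_with_idle_timeout(stream, idle_timeout: 60)
  queue = Queue.new
  reader = Thread.new do
    stream.each { |notice| queue << notice }
    queue << :done # stream closed normally
  end
  loop do
    # Block for at most idle_timeout waiting for the next notice.
    notice = Timeout.timeout(idle_timeout) { queue.pop }
    break if notice == :done
    yield notice
  end
ensure
  reader&.kill
end
```

The caller can rescue Timeout::Error and restart the watch, which turns a silently zombified connection into a bounded outage.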
Oh wow, I dropped the ball on this one. LGTM, merging. 👍 Yeah, Kubeclient::Informer strives to handle watch restarts in a correct way, though it's new-ish and there might be loose ends (search issues for "informer"). [I have to confess it's been a long time since I've used kubeclient myself; I'm acting as limping-along-barely-maintainer, and that's why I'm trying to "crowdsource" things like the README etc. to contributors, but it would be good if active user(s) stepped up to replace me...]
Close-cycled to re-run CI. It now fails to run on ruby 2.7:
but this passes tests on linux 2.7 locally.
This small PR adds a "timeout_seconds" option to the "Kubeclient::Client#watch_entities" method (not set by default).
The objective is to limit the duration of a watch, to prevent zombification if the watch lasts too long (see #512).
We often see this zombification on Azure Kubernetes clusters, where our watch stays up but no longer receives notifications. By setting timeout_seconds to X, we are sure that we are not zombified for more than X seconds.
Happy to take feedback to improve this PR.