Unexpected interaction between :single_use pool option and :min runners #60

Open
nickdichev-firework opened this issue Sep 5, 2024 · 0 comments


Runners in a pool that uses the :single_use option terminate after they execute a function call. However, this behavior causes the pool to scale below the configured :min value.
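
For context, a pool set up roughly like the following would hit this. It is only a minimal sketch: MyApp.Runner and the numbers are placeholders rather than my actual configuration.

```elixir
# Hypothetical child spec for an application's supervision tree.
children = [
  {FLAME.Pool,
   name: MyApp.Runner,          # placeholder pool name
   min: 2,                      # two runners are booted up front
   max: 10,
   max_concurrency: 5,
   idle_shutdown_after: 30_000, # idle timeout in milliseconds
   single_use: true}            # each runner terminates after one call
]
```

Once each of the two initial runners has served a single call, the pool holds fewer than :min runners until new demand arrives.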

I took a stab at fixing it, but I encountered some tricky bits that I wasn't sure how to work around.

To fix it in a backwards-compatible way, I think the :single_use option needs to change from a boolean with default false to an enum with values like :all, :only_dynamic, and :none, with default value :none. :all means that all runners, including the initially booted ones, have the single-use behavior applied. :only_dynamic means that only runners booted in response to demand beyond the initial runners have the single-use behavior applied.

The issue I see with this is that the Terminator now needs to know what "kind" of runner it is: one that was booted to satisfy the :min pool or one that was booted to satisfy demand. This would couple the Terminator/Pool/Runner a little more closely, but it would allow the Terminator to leave initial runners alive after a call, even in a single-use pool.
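
To make the proposed semantics concrete, here is a rough sketch of the decision the Terminator would have to make. The module, the function names, and the :initial/:dynamic runner kinds are all hypothetical and do not exist in the library. Mapping the old boolean values onto the enum is my assumption about how backwards compatibility could work.

```elixir
defmodule SingleUseSketch do
  @moduledoc "Hypothetical sketch; none of these names exist in the library."

  # Assumption: map the existing boolean values onto the proposed enum
  # so that current configurations keep their meaning.
  def normalize(true), do: :all
  def normalize(false), do: :none
  def normalize(mode) when mode in [:all, :only_dynamic, :none], do: mode

  # kind is :initial for runners booted to satisfy :min and
  # :dynamic for runners booted in response to demand.
  def terminate_after_call?(mode, kind) when kind in [:initial, :dynamic] do
    case {normalize(mode), kind} do
      {:all, _} -> true
      {:only_dynamic, :dynamic} -> true
      _ -> false
    end
  end
end
```

The kind argument is exactly the extra piece of information that creates the coupling described above.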

Another idea is to change when the pool queries for unmet demand. Currently, the pool only scales on the "leading edge" of demand on the pool, in the pool's checkout_runner/{3,4}. The pool could instead also check its desired size on the "trailing edge" of the runner's lifecycle, when the parent gets the runner's down message. Then, when any runner is terminated due to the single-use property, the pool can correct itself.
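
The trailing-edge check could look something like this sketch. The struct and function are hypothetical stand-ins for the pool's real state and its down-message handling, reduced to the fields needed for the illustration.

```elixir
defmodule TrailingEdgeSketch do
  # Hypothetical pool state: only the fields needed for the illustration.
  defstruct min: 0, runners: %{}

  # Intended to run when the pool receives the down message for a runner,
  # identified here by ref. Returns the updated state and the number of
  # replacement runners needed to get back up to :min.
  def handle_runner_down(%__MODULE__{} = state, ref) do
    runners = Map.delete(state.runners, ref)
    deficit = max(state.min - map_size(runners), 0)
    {%{state | runners: runners}, deficit}
  end
end
```

A deficit greater than zero is the signal to boot replacements immediately rather than waiting for the next checkout.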

The problem with that strategy is that it leads to constant recycling of workers: the min workers go away due to the single-use policy, and dynamic runners replace them. At that point the pool's idle_shutdown_after applies to the dynamic runners, so they go idle and die, and the pool boots new runners to replace them. This is especially brutal in environments where there aren't many requests keeping the workers occupied, because the workers end up restarting within a small interval of each other.

I've actually implemented that behavior on top of my change from #51, and it's working pretty well apart from the constant recycling of workers. It also has me thinking about other transition states that we might want to support for scaling the pool. For my desired "constant number of overprovisioned workers" custom strategy, it might make sense to ensure the pool is in the desired state on a fixed interval.
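
As a rough idea of what interval-based checking could look like, here is a generic GenServer sketch using Process.send_after/3. The module name and the :interval and :reconcile options are hypothetical. In a real implementation the reconcile function would compare the live runner count against the desired count and boot or stop runners accordingly.

```elixir
defmodule ReconcilerSketch do
  use GenServer

  # opts (both hypothetical): :interval in milliseconds and
  # :reconcile, a zero-arity function invoked on every tick.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      interval: Keyword.fetch!(opts, :interval),
      reconcile: Keyword.fetch!(opts, :reconcile)
    }

    schedule(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:reconcile, state) do
    # Bring the pool back to its desired state, then schedule the next tick.
    state.reconcile.()
    schedule(state.interval)
    {:noreply, state}
  end

  defp schedule(interval), do: Process.send_after(self(), :reconcile, interval)
end
```

In the actual pool this would more likely be a timer message handled by the pool process itself rather than a separate process.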

My final observation is that the single use option could be implemented in a different way by making the callers end their function calls with System.stop. I'm curious why it was implemented as an option on the pool. I suppose such a strategy might not work for place_child in some cases.
