Feature: Configuration to set maximum parallelization of :parallel runner #414
Comments
@mattbrictson
affects every command sent to servers and I wanted to be able to only limit git-related tasks, so only SCM plugin would be limited, so I do not slow down the rest of deployment process. |
I think we do have an issue where we need to slow down a class of operations, but not all operations. For example, in a Rails deploy, if we just slow down Git operations we add a lot less overall runtime as compared to slowing down everything including asset precompilation and such. I wonder whether it makes any sense to do something like tagging specific groups, e.g.:
on release_roles(:all), type: %i[scm bundle precompilation] do; end
And then allow the user to limit by type…
SSHKit.config.default_runner_config = { type: { scm: { limit: 10 } } }
# OR
SSHKit.config.typed_runner_config(:scm) = { limit: 10 } |
Good point. Unfortunately this kind of puts us back where we started, in that we have to modify every task that could potentially run into this type of problem (e.g. all SCM tasks, Perhaps the design of SSHKit/Capistrano is not such that we can easily address this in a common way without significant modifications. Any ideas, @leehambley? |
I'm wondering if we can use some annotations on the classes ( It'd be a pretty invasive change in any case, but it does keep coming up... |
You're right, this does come up a lot. Conceptually, users think about execution in terms of Rake tasks and would like to configure things at that level. For example:
But in reality, this configuration is done in the As a result, whenever someone wants to alter the execution behavior (i.e. where the commands are executed or how (parallel, groups) the commands are run), they essentially have no choice but to reimplement the entire task. Alternatively the task has to be written in such a way that all anticipated execution customizations can be controlled via Capistrano variables, like I guess I am just restating what we all already know, but this is what I am wrestling with when trying to come up with a good solution that fits into the current design. Another possibility would be to establish a convention that there is a standard set of configuration variables for each set of tasks to control execution behavior:
namespace :git do
  task :wrapper do
    on fetch(:git_roles), fetch(:git_execution_options) do
      # ...
    end
  end
end
# defaults
set_if_empty :git_roles, -> { release_roles(:all) }
set_if_empty :git_execution_options, {}
# example customization
set :git_execution_options, { in: :groups, limit: 10, wait: 2 }
But it might be too late in the development of Capistrano and its many plugins to introduce such a convention, and while it does offer a lot of fine-grained control, the concepts might overwhelm new users. |
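To make the defaults-versus-customization convention above concrete, here is a minimal, self-contained sketch. The `Settings` class is a hypothetical stand-in for Capistrano's variable store (it is not Capistrano's implementation), and the role list returned for `:git_roles` is an assumption for illustration only. It shows that a default registered with `set_if_empty` applies only when the user has not already set the key.

```ruby
# Hypothetical stand-in for Capistrano's variable store; an assumption
# for illustration, not the real implementation.
class Settings
  def initialize
    @values = {}
  end

  # Always overwrite the stored value.
  def set(key, value)
    @values[key] = value
  end

  # Store the value only if the key has not been set yet.
  def set_if_empty(key, value)
    @values[key] = value unless @values.key?(key)
  end

  # Return the stored value, evaluating it first if it is a lambda.
  def fetch(key)
    value = @values[key]
    value.respond_to?(:call) ? value.call : value
  end
end

settings = Settings.new

# Example customization made by the user.
settings.set(:git_execution_options, { in: :groups, limit: 10, wait: 2 })

# Plugin defaults; they apply only to keys the user left unset.
settings.set_if_empty(:git_roles, -> { [:app, :web] }) # assumed role list
settings.set_if_empty(:git_execution_options, {})

p settings.fetch(:git_roles)
p settings.fetch(:git_execution_options)
```

In this sketch the user's `:git_execution_options` survive the later `set_if_empty` call, while `:git_roles` falls back to the lazily evaluated default.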
If we are happy with this style then I'll ask for a revision of capistrano/capistrano#1957 to use it and get that merged in. Perhaps we can do some quick PRs to implement the same for SVN and Hg as well. 👇
on release_roles(fetch(:git_roles)), fetch(:git_execution_options) do
  # ...
end
# defaults
set_if_empty :git_roles, :all
set_if_empty :git_execution_options, {}
|
Or maybe it should be:
on release_roles(fetch(:git_roles)), fetch(:git_runner_config) do
  # ...
end
# defaults
set_if_empty :git_roles, :all
set_if_empty :git_runner_config, {} |
Hey @mattbrictson, thanks for making a strong suggestion. I'm not sure I'm keen on the solution, but I don't really have anything better to suggest. Given that the user thinks of these things on a rake task level, I'd like to modify the Rake API, or set something on the For lack of a better example:
This would also keep the option for doing I don't really care if we have a simple bool flag as above, or something like "runner options" like you suggest. I would prefer to keep it very simple, like a hint to the system that we can choose to interpret in our own way, rather than adding another toolkit for tuning behaviour. Hence my preference for a simple "uses contended resource" flag, which would perhaps make us rein in the parallelism slightly. Just food for thought, anyway. Even in my simple-bool proposal, we'd still have to have a set of params somewhere that dictates what that means. That's where your idea and mine would align, and we could get "contended resource run opts" from the settings hash? |
@leehambley thanks for the example, that helps me understand what you are going for. Do you have some ideas on how we can establish a link between the Also, is this what you have in mind for usage?
Rake::Task["git:check"].contended_resource = true
Rake::Task["git:clone"].contended_resource = true
Rake::Task["git:update"].contended_resource = true
set :contended_resource_runner_config, { in: :groups, limit: 10, wait: 2 } |
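As a rough illustration of the "contended resource" flag discussed here, the sketch below adds a boolean attribute to `Rake::Task` and picks runner options based on it. Everything beyond `Rake::Task`, `Rake::Task.define_task` and `Rake::Task[]` is an assumption made for this example: the `ContendedResource` module, the `CONTENDED_RUNNER_CONFIG` constant and the `runner_options_for` helper are hypothetical names, and neither Rake nor Capistrano ships anything like them.

```ruby
require "rake"

# Hypothetical extension: Rake itself has no contended_resource attribute.
module ContendedResource
  attr_writer :contended_resource

  # True only when the task has been flagged explicitly.
  def contended_resource?
    @contended_resource == true
  end
end
Rake::Task.include(ContendedResource)

# Hypothetical settings a runner could consult for flagged tasks.
CONTENDED_RUNNER_CONFIG = { in: :groups, limit: 10, wait: 2 }.freeze

# Pick throttled options for flagged tasks, full parallelism otherwise.
def runner_options_for(task)
  task.contended_resource? ? CONTENDED_RUNNER_CONFIG : { in: :parallel }
end

# Empty placeholder tasks so the example is self-contained.
Rake::Task.define_task("git:check")
Rake::Task.define_task("git:clone")

Rake::Task["git:clone"].contended_resource = true

p runner_options_for(Rake::Task["git:check"])
p runner_options_for(Rake::Task["git:clone"])
```

The open question from the thread remains: something still has to connect the flag on the task to the `on` call that runs inside it, and this sketch does not attempt that link.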
Sure, actually - I thought that since our I'd imagined since I haven't tried any of this out - but I suspect it ought to work: https://github.com/ruby/rake/blob/06381f62847b32b04db0362c174426ca5299c63f/lib/rake/task.rb#L242-L252 defines the base - you can see where the actions are called. I think at that point they're already bound/etc and I don't know if it's too late to change the way the Fun discovery, the first 100 line spike of Rake: https://github.com/ruby/rake/blob/93e55a4ef1dbaee42f0f355f86d837c4e2551fc1/doc/proto_rake.rdoc#L99 |
There have been multiple requests to set an upper limit on the number of git operations that are executed in parallel in the default Capistrano git strategy. Similarly, users are also asking for a limit on the number of parallel bundle install executions in capistrano-bundler.

What these tasks have in common is that they all use the default :parallel runner provided by SSHKit. When using Capistrano to deploy to a large number of servers, firing off these operations to all servers in parallel can overload shared resources like a git server or private gem repository.

Rather than implement rate limiting for each SCM, capistrano-bundler, etc., I feel like a more general solution should be provided by SSHKit itself.

My proposal would be to change the implementation of the :parallel runner to essentially be a subclass of the :groups runner, except with defaults of wait: 0 and limit: INFINITY. Then, if a user wants to limit the amount of parallelization, they could simply do this:

If sharing implementation and configuration keys between :parallel and :groups is too confusing, then perhaps the :parallel runner could use a different configuration key (but to the same effect):

Thoughts?
See also: