Unify multithreading handling #81
I see major disadvantages with the suggested solutions:
My idea is to not use

public interface MultiThreadSetting
{
	boolean useMultiThreading(); // returns true if multi-threading should be used

	int suggestNumberOfTasks(); // returns 1 for single-threading, or 4 * availableProcessors() for multi-threading; you may also manually specify this value

	< T > void forEach( Collection< T > values, Consumer< T > action ); // executes the action for each value, sequentially for single-threading, in parallel for multi-threading
}

See PR imglib/imglib2#250. For this interface I have three implementations:
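The three implementations themselves are not quoted in this thread. Purely as an illustration of what implementations of such an interface could look like (the class names and details below are assumptions, not the code from the PR), a sequential variant and an ExecutorService-backed variant might be sketched like this:

```java
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.stream.Collectors;

// Hypothetical sequential implementation of the MultiThreadSetting interface quoted above:
// runs every action in the calling thread.
class SequentialSetting implements MultiThreadSetting
{
	@Override
	public boolean useMultiThreading() { return false; }

	@Override
	public int suggestNumberOfTasks() { return 1; }

	@Override
	public < T > void forEach( final Collection< T > values, final Consumer< T > action )
	{
		values.forEach( action );
	}
}

// Hypothetical multi-threaded implementation backed by an ExecutorService.
class ExecutorServiceSetting implements MultiThreadSetting
{
	private final ExecutorService executor;

	private final int parallelism;

	ExecutorServiceSetting( final ExecutorService executor, final int parallelism )
	{
		this.executor = executor;
		this.parallelism = parallelism;
	}

	@Override
	public boolean useMultiThreading() { return true; }

	@Override
	public int suggestNumberOfTasks() { return 4 * parallelism; }

	@Override
	public < T > void forEach( final Collection< T > values, final Consumer< T > action )
	{
		try
		{
			// wrap each value in a Callable and wait for all of them to finish
			final List< Callable< Void > > tasks = values.stream()
					.map( v -> ( Callable< Void > ) () -> { action.accept( v ); return null; } )
					.collect( Collectors.toList() );
			for ( final Future< Void > f : executor.invokeAll( tasks ) )
				f.get(); // propagates exceptions thrown by the action
		}
		catch ( final InterruptedException | ExecutionException e )
		{
			throw new RuntimeException( e );
		}
	}
}
```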
Thanks a lot @maarzt for digging into this issue! I am 100% on board with using our own interface. However, the tricky question is: where shall this interface live? Because I want to use it in both SciJava Common and ImgLib2. Do you have a preference? @tpietzsch what do you think? Maybe a new

And @maarzt: have you looked in detail at the SciJava
@maarzt some quick comments
@ctrueden @axtimwalde @hanslovsky @StephanPreibisch This can mean either
The decision is made by the caller at the top level, by running

It ties nicely into existing stuff. For example, if you call

It supports recursive tasks like ForkJoinPool.

@ctrueden This brings us to a remaining question: The "user" would be for example the

Any thoughts, comments?
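As a hedged illustration of the recursive-task point (the SumTask example below is hypothetical and not taken from the proposal), a ForkJoinPool lets the caller fix the parallelism once at the top level, while subtasks forked inside the pool are balanced by work-stealing:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical recursive task: the caller decides the parallelism by choosing the pool;
// forked subtasks stay in that pool and are balanced by work-stealing.
class SumTask extends RecursiveTask< Long >
{
	private final long[] data;

	private final int from, to;

	SumTask( final long[] data, final int from, final int to )
	{
		this.data = data;
		this.from = from;
		this.to = to;
	}

	@Override
	protected Long compute()
	{
		if ( to - from < 10_000 )
		{
			long sum = 0;
			for ( int i = from; i < to; i++ )
				sum += data[ i ];
			return sum;
		}
		final int mid = ( from + to ) >>> 1;
		final SumTask left = new SumTask( data, from, mid );
		left.fork(); // queued in the same pool, available for work-stealing
		final long rightSum = new SumTask( data, mid, to ).compute();
		return rightSum + left.join();
	}

	public static void main( final String[] args )
	{
		final long[] data = new long[ 1_000_000 ];
		// the top-level decision: parallelism 1 for (almost) sequential execution,
		// a larger pool or ForkJoinPool.commonPool() for multi-threading
		final ForkJoinPool pool = new ForkJoinPool( 4 );
		System.out.println( pool.invoke( new SumTask( data, 0, data.length ) ) );
		pool.shutdown();
	}
}
```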
Hi @maarzt & @tpietzsch,

First: thanks very much for trying. Getting abstracted parallelization right is perhaps impossible. It has, though, its uses, but for the general case, I have yet to stumble upon a good solution.

The first major issue is that algorithms have different memory requirements and different contention issues. There are of course groups of algorithms that share such properties (i.e. filtering an image), and therefore there can be well-suited solutions for them.

An example: suppose I am running a memory-heavy algorithm that uses e.g. 25% of my desktop's RAM. I can only effectively run 4 threads. But some steps within each parallel processing branch can be done with more threads. Therefore each would use 1/4th of all available threads. When done with this subprocess, the main, memory-heavy thread would resume (for example, moving on to the next 3D image to process in a 4D series). An example of a subtask would be to copy a resulting image into an

If the abstracted parallelization framework assumes that allocating as many free threads as possible to a task is a good idea, the above will fail, or result in suboptimal performance.

Another example: suppose there are multiple data structures that have to be accessed for every iteration in the loop. If one naively uses all available threads to process the next available loop index, contention is high: all threads try to access all data structures at once. Instead, one can chunk the data structures and have each thread process a chunk that the others never access. Is chunking included in the abstract parallelization? For some cases, it would be preferable, because it is cumbersome, and could be done automatically via e.g.

All that said: "Those who think something can't be done should not stop those who are doing it from doing it." So carry on and let's see how far this goes. Looking forward to the results.

Specific comments on the code: statically instantiating both an

On the "problem" that an
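For the contention example, a minimal chunking sketch (hypothetical names, not part of any proposal in this thread) would give each task a disjoint index range, so the threads never compete for the same part of the data structure:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical chunked sum: each task owns a disjoint index range, so per-thread
// work never has to be shared or synchronized.
class ChunkedSum
{
	static long sum( final long[] data, final ExecutorService executor, final int numTasks )
			throws InterruptedException, ExecutionException
	{
		final int chunkSize = ( data.length + numTasks - 1 ) / numTasks;
		final List< Future< Long > > futures = new ArrayList<>();
		for ( int t = 0; t < numTasks; t++ )
		{
			final int from = t * chunkSize;
			final int to = Math.min( data.length, from + chunkSize );
			futures.add( executor.submit( () -> {
				long s = 0;
				for ( int i = from; i < to; i++ )
					s += data[ i ];
				return s;
			} ) );
		}
		long total = 0;
		for ( final Future< Long > f : futures )
			total += f.get();
		return total;
	}
}
```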
Let's discuss when I visit in June. Added to my list for that week.
Sounds good.

@acardona Specifically, the first scenario you mentioned would be possible to do with a ForkJoinPool (resp. the proposed

Regarding the second scenario, chunking is not included in the abstract parallelization. The idea is that you say
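One way the first scenario could be approximated is sketched below. It rests on the assumption that parallel streams started from inside a ForkJoinPool worker run in that same pool, which is common JDK behavior but not guaranteed by the Stream specification; the class and loop bounds are made up for illustration. The pool's parallelism encodes the memory budget, and nested parallel work shares the same workers via work-stealing:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Hypothetical sketch: a dedicated pool sized to the memory budget (here at most 4
// memory-heavy branches in flight); nested parallel work started inside a worker is
// executed by the same 4 workers and balanced by work-stealing.
class MemoryBudgetedProcessing
{
	public static void main( final String[] args ) throws Exception
	{
		final ForkJoinPool pool = new ForkJoinPool( 4 );
		pool.submit( () ->
				IntStream.range( 0, 100 ).parallel().forEach( timepoint -> {
					// memory-heavy per-timepoint work would go here ...
					IntStream.range( 0, 1000 ).parallel().forEach( slice -> {
						// ... light-weight inner work, still running inside the same pool
					} );
				} ) ).get();
		pool.shutdown();
	}
}
```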
Hmm, why? These are two super light-weight objects. Neither the
Many of the implementations in imglib2-algorithm have an int numThreads parameter for controlling how many threads to spawn. Others take an ExecutorService. Accepting an ExecutorService is more flexible and would mesh better with SciJava Common's ThreadService and therefore ImageJ Ops. See also imagej/imagej-ops#599. On the other hand, an ExecutorService alone is not enough to identify the intended number of threads to use when partitioning the task.

Note that currently, we often infer numTasks from Runtime.availableProcessors() by default, which is not a good practice in general because multiple tasks may be ongoing simultaneously, which can result in more threads than processors.

@tpietzsch points out:

- In ForkJoinPool there is a "parallelism level" that corresponds to this roughly. We should consider using ForkJoinPool throughout.
- ForkJoinPool extends ExecutorService, so it would at least be backwards compatible in some ways. ForkJoinPool.getParallelism() could replace / augment numTasks.
- ForkJoinPool supports work-stealing, which would be important if submitted tasks spawn new subtasks for whose completion they wait. This allows handing down the pool through algorithms that parallelize in chunks and for each chunk call another algorithm that parallelizes internally. (With handing down an ExecutorService, that wouldn't work.)
- ForkJoinPool.commonPool() is used by streams etc. We could fall back to it if there is no user-provided pool.

Based on a chat on Gitter.
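A minimal sketch of how an algorithm entry point could combine these ideas (the GaussFilterExample name and signature are made up for illustration): accept an optional ForkJoinPool, fall back to ForkJoinPool.commonPool() when none is given, and derive the partitioning hint from getParallelism() instead of Runtime.availableProcessors():

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical entry point: the pool replaces both the ExecutorService and the
// numThreads parameter; the partitioning hint comes from getParallelism(), and the
// common pool is the fallback when the caller provides no pool.
class GaussFilterExample
{
	public static void run( final float[] image )
	{
		run( image, null ); // no pool given: use the common pool
	}

	public static void run( final float[] image, final ForkJoinPool suppliedPool )
	{
		final ForkJoinPool pool = suppliedPool != null ? suppliedPool : ForkJoinPool.commonPool();
		final int numTasks = pool.getParallelism(); // instead of Runtime.availableProcessors()
		pool.invoke( new RecursiveAction()
		{
			@Override
			protected void compute()
			{
				// split the image into numTasks chunks, fork one sub-action per chunk,
				// and let work-stealing balance nested parallel calls
			}
		} );
	}
}
```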