Multi-core performance and memory consumption #41
Comments
Dear Jiajun, as you have discovered, although NBLAST is in theory an embarrassingly parallel problem that could scale perfectly across cores, in practice non-CPU resources can be limiting, and we normally find that memory is the bottleneck. In terms of your server spec, you have a lot of cores for the amount of memory. I suspect that you will not want to run more than 10 jobs, but the only way to be sure is to test. Note that running 1 query neuron against the flyem set will be less efficient than running 10 neurons; the memory usage is likely to be similar for both.
A few additional questions. Is that 2.3 GB the on-disk or in-memory footprint of your dps_flyEM object?
> load("fib.twigs5.dps.rda")
> object.size(fib.twigs5.dps)
1594104384 bytes
> length(fib.twigs5.dps)
[1] 27411
> stem(nvertices(fib.twigs5.dps))
The decimal point is 3 digit(s) to the right of the |
0 | 00000000000000000000000000000000000000000000000000000000000000000000+24861
2 | 00000000000000000000000000000000000000000000000000000000000000000000+1810
4 | 00000000000000000000000000000000000111111111111111111111111111122222+289
6 | 00000000001111111111222222222333333344445555555556677778888890000000+27
8 | 111122234444444455566677777788899900233333566788999
10 | 1222223466677779901688
12 | 0123445599935
14 | 112490234
16 | 3938
18 | 24
20 | 28
22 |
24 |
26 |
28 |
30 |
32 |
34 |
36 |
38 | 3
> mean(nvertices(fib.twigs5.dps))
[1] 936.0071
Perhaps you could also add the output of dput(hist(nvertices(fib.twigs5.dps), plot=FALSE, breaks=100)) to give a slightly more fine-grained view on this.
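To tell the on-disk and in-memory footprints apart, something like the following could be used. This is only a minimal sketch in base R: toy_dps is made-up stand-in data rather than a real dotprops set, and the file is written to a temporary path.

```r
# Toy stand-in for a list of dotprops (made-up data, not the FlyEM set).
set.seed(42)
toy_dps <- replicate(200, matrix(rnorm(3 * 1000), ncol = 3), simplify = FALSE)

# On-disk footprint: save() compresses by default, so the size of the
# .rda file need not match the size of the object once loaded.
rda_file <- tempfile(fileext = ".rda")
save(toy_dps, file = rda_file)
disk_bytes <- file.size(rda_file)

# In-memory footprint: the figure that matters when several workers
# each need access to the target set.
mem_bytes <- as.numeric(object.size(toy_dps))

cat(sprintf("on disk:   %.2f MB\n", disk_bytes / 1024^2))
cat(sprintf("in memory: %.2f MB\n", mem_bytes / 1024^2))
```

With a real set, the same two calls (file.size on the .rda file and object.size on the loaded object) give the numbers asked for above.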
Finally it would be worth checking a few specific neurons (here identified by bodyid):
You can do this by:
@lankiszhang I'm sorry I missed replying to this at the time. Your dotprops appear to be in a plain list object rather than a neuronlist. If you call as.neuronlist on it, then nvertices should work again. As for your benchmarks, it is hard to interpret them without knowing how many neurons you are using or how many vertices there are in the dotprops objects. If you are using relatively few neurons, then the overhead of forking and setting up the parallel environment will be large compared with the time saved by running in parallel. This is likely why 4 cores gives you only a 2x speed-up over one core but is actually slightly faster than 16. At the other end, as discussed earlier, if you have many neurons, then running on many cores will cause problems due to memory use.
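As a rough way to think about the memory side, the number of workers can be capped by available RAM as well as by core count. The sketch below is only an illustration: estimate_workers is a hypothetical helper, and per_worker_factor and reserve_gb are assumed values that would need to be checked by testing on the actual machine.

```r
# Hypothetical helper: cap the number of parallel workers by memory
# as well as by the number of cores.
estimate_workers <- function(ram_gb, target_gb, cores,
                             per_worker_factor = 2, reserve_gb = 4) {
  # per_worker_factor: assumed multiple of the target set that each
  #   worker may end up holding (a guess, not a measured value).
  # reserve_gb: assumed memory kept free for the OS and the main session.
  usable_gb <- ram_gb - reserve_gb
  by_memory <- floor(usable_gb / (target_gb * per_worker_factor))
  # Never return fewer than 1 worker or more than the available cores.
  max(1, min(cores, by_memory))
}

# Figures from the original post: 48 GB RAM, 2.32 GB target set, 40 cores.
n_workers <- estimate_workers(ram_gb = 48, target_gb = 2.32, cores = 40)
print(n_workers)
```

The result could then be passed to doParallel::registerDoParallel(cores = n_workers) before calling nblast with .parallel = TRUE, as in the original post. Timing a few settings around that value is still the only way to be sure.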
Dear all,
We have found nblast really helpful for our current project, especially when running nblast against the FlyEM database.
On my laptop (6 cores, 12 threads), a one-against-all NBLAST takes about 4 min when running on a single core.
To reduce the time, I used doParallel to define a multi-core backend and ran NBLAST with .parallel = TRUE. Interestingly, I could confirm that all 12 threads were running, but RAM consumption reached 100% and the same task took more than 10 min.
Then I tried running NBLAST on only two cores to avoid the high memory consumption, and the task took 5 min.
Given the longer run time and high memory consumption, I am a little confused about how exactly nblast uses .parallel. As I have a server with 4 processors (40 cores, 80 threads) and 48 GB RAM, and my dps_flyEM is 2.32 GB, would it be best to run NBLAST on only 16 cores rather than 80?
Best wishes,
Jiajun Zhang