TCLB currently does not support cases in which the overall offset overflows a 32-bit integer. This affects:
- `load_` functions (dynamic and static access of fields), anywhere `nx*ny*nz*fields` is larger than 2^31
- `pop_` functions (loading fields through densities), anywhere `nx*ny*nz` is larger than 2^31
This can be fixed by appropriate casting in the offset calculations in `LatticeAccess`, but care has to be taken not to hurt performance by introducing unnecessary `int64_t` operations.
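To make the casting idea concrete, here is a minimal sketch in plain C++. It is not TCLB's actual `LatticeAccess` code: the helper names (`fits_in_int32`, `offset64`, `offset_mixed`) and the memory layout (x fastest, then y, then z, with each field stored as one contiguous block) are assumptions for illustration only. It shows a cheap host-side check for whether 32-bit offsets are sufficient, a fully 64-bit offset, and a mixed variant that keeps the in-field index in 32 bits and widens only for the field stride.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not a TCLB function): true when every linear offset
// into nx*ny*nz*fields elements is representable as a signed 32-bit integer.
inline bool fits_in_int32(int nx, int ny, int nz, int fields) {
    const std::int64_t total = static_cast<std::int64_t>(nx) * ny * nz * fields;
    return total <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: offset computed entirely in 64-bit arithmetic.
// Assumed layout: x fastest, then y, then z, one contiguous block per field.
inline std::int64_t offset64(int x, int y, int z, int f, int nx, int ny, int nz) {
    const std::int64_t plane = static_cast<std::int64_t>(nx) * ny;
    const std::int64_t volume = plane * nz;
    return x + static_cast<std::int64_t>(y) * nx + z * plane + f * volume;
}

// Hypothetical helper: the in-field index stays in 32-bit arithmetic and only
// the field stride is widened. Valid only while nx*ny*nz itself fits in int32.
inline std::int64_t offset_mixed(int x, int y, int z, int f, int nx, int ny, int nz) {
    const std::int64_t volume = static_cast<std::int64_t>(nx) * ny * nz;
    assert(volume <= std::numeric_limits<std::int32_t>::max());
    const std::int32_t local = x + nx * (y + ny * z);   // 32-bit multiplications
    return local + static_cast<std::int64_t>(f) * volume; // one 64-bit multiplication
}
```

The mixed variant only covers the `load_` case, where the number of fields pushes the total past 2^31; once `nx*ny*nz` alone exceeds that limit (the `pop_` case), the in-field index needs 64-bit arithmetic as well. Whether the mixed form is measurably faster on a GPU would have to be benchmarked.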
Originally posted by @shkodm in #496 (comment)
[...] Some things still don't work as expected (the same happens on the `master` branch). I run on 2 V100 GPUs on Bunya, each with 80GB; my case is large, so I split it between the two.
I get: `Cumulative allocation of 63.GB)`
and then `an illegal memory access was encountered in Lattice.hpp at line 279`
The error is the same even if I try to split between 3 GPUs (40GB each, so there is plenty of space even if some memory is unaccounted for).
@shkodm @TravisMitchell After some investigation, we should think carefully about whether we want to calculate 64-bit indexes, as a 64-bit multiplication on CUDA costs around 20x as much as a 32-bit multiplication.
@shkodm can you check the sizes in your case? `nx`, `ny`, `nz`, and also the number of fields?