
Conditional element-wise assignment still won't compile #789

Closed
zhangcx93 opened this issue Mar 26, 2021 · 3 comments

zhangcx93 commented Mar 26, 2021

a = CuArray([1,2,3,4,5])
a[a .> 3] .= 10

This won't compile, while the CPU version works:

a = [1,2,3,4,5]
a[a .> 3] .= 10

Error message:

ERROR: GPU compilation of kernel broadcast_kernel(CUDA.CuKernelContext, SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, typeof(identity), Tuple{Int64}}, Int64) failed
KernelError: passing and using non-bitstype argument

Argument 3 to your kernel function is of type SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, which is not isbits:
.indices is of type Tuple{Vector{Int64}} which is not isbits.
.1 is of type Vector{Int64} which is not isbits.

Stacktrace:
[1] check_invocation(job::GPUCompiler.CompilerJob, entry::LLVM.Function)
@ GPUCompiler ~.julia\packages\GPUCompiler\XwWPj\src\validation.jl:68
[2] macro expansion
@ ~.julia\packages\GPUCompiler\XwWPj\src\driver.jl:287 [inlined]
[3] macro expansion
@ ~.julia\packages\TimerOutputs\4QAIk\src\TimerOutput.jl:206 [inlined]
[4] macro expansion
@ ~.julia\packages\GPUCompiler\XwWPj\src\driver.jl:286 [inlined]
[5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module, kernel::LLVM.Function; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
@ GPUCompiler ~.julia\packages\GPUCompiler\XwWPj\src\utils.jl:62
[6] cufunction_compile(job::GPUCompiler.CompilerJob)
@ CUDA ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:306
[7] check_cache
@ ~.julia\packages\GPUCompiler\XwWPj\src\cache.jl:44 [inlined]
[8] cached_compilation
@ ~.julia\packages\GPUArrays\WV76E\src\host\broadcast.jl:60 [inlined]
[9] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{GPUArrays.var"#broadcast_kernel#12", Tuple{CUDA.CuKernelContext, SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, typeof(identity), Tuple{Int64}}, Int64}}}, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler ~.julia\packages\GPUCompiler\XwWPj\src\cache.jl:0
[10] cufunction(f::GPUArrays.var"#broadcast_kernel#12", tt::Type{Tuple{CUDA.CuKernelContext, SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, typeof(identity), Tuple{Int64}}, Int64}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:294
[11] cufunction
@ ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:288 [inlined]
[12] macro expansion
@ ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:102 [inlined]
[13] #launch_heuristic#280
@ ~.julia\packages\CUDA\qEV3Y\src\gpuarrays.jl:17 [inlined]
[14] launch_heuristic
@ ~.julia\packages\CUDA\qEV3Y\src\gpuarrays.jl:17 [inlined]
[15] copyto!
@ ~.julia\packages\GPUArrays\WV76E\src\host\broadcast.jl:66 [inlined]
[16] copyto!
@ ~.julia\packages\GPUArrays\WV76E\src\host\broadcast.jl:76 [inlined]
[17] materialize!
@ .\broadcast.jl:894 [inlined]
[18] materialize!(dest::SubArray{Int64, 1, CuArray{Int64, 1}, Tuple{Vector{Int64}}, false}, bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(identity), Tuple{Int64}})
@ Base.Broadcast .\broadcast.jl:891
[19] top-level scope
@ REPL[5]:1
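The error says that logical indexing produces a SubArray whose index field is a plain CPU `Vector{Int64}`, which is not isbits and therefore cannot be passed to a GPU kernel. Until a fixed release is available, one workaround (a sketch, not from this thread) is to express the conditional assignment as a pure broadcast with `ifelse`, which fuses into a single GPU kernel and never materializes a view:

```julia
using CUDA

a = CuArray([1, 2, 3, 4, 5])

# Replace elements greater than 3 with 10, element-wise on the GPU.
# ifelse.(cond, x, y) broadcasts over the whole array, so no SubArray
# with CPU-side indices is ever created.
a .= ifelse.(a .> 3, 10, a)

@assert Array(a) == [1, 2, 3, 10, 10]
```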

Version info:

CUDA toolkit 11.1.1, artifact installation
CUDA driver 11.2.0
NVIDIA driver 461.92.0

Libraries:

  • CUBLAS: 11.3.0
  • CURAND: 10.2.2
  • CUFFT: 10.3.0
  • CUSOLVER: 11.0.1
  • CUSPARSE: 11.3.0
  • CUPTI: 14.0.0
  • NVML: 11.0.0+461.92
  • CUDNN: 8.10.0 (for CUDA 11.2.0)
  • CUTENSOR: 1.2.2 (for CUDA 11.1.0)

Toolchain:

  • Julia: 1.6.0
  • LLVM: 11.0.1
  • PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
  • Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80

1 device:
0: GeForce RTX 2080 Ti (sm_75, 8.513 GiB / 11.000 GiB available)

I did a fresh install of Julia 1.6 stable and CUDA.jl 2.6.2, on Windows 10, if that matters.

I read that this issue was fixed and merged into 2.6.2 with a PR.

@zhangcx93 zhangcx93 added the bug Something isn't working label Mar 26, 2021

maleadt commented Mar 26, 2021

merged into the 2.6.2 with PR.

eeb7027 is only on master. Please try the development branch first.
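One way to try the development branch (a sketch using standard Pkg syntax; assumes Julia 1.6):

```julia
using Pkg

# Track the master branch of CUDA.jl instead of the registered release.
Pkg.add(name="CUDA", rev="master")
```

Equivalently, `] add CUDA#master` at the Pkg REPL prompt.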

@maleadt maleadt added the needs information Further information is requested label Mar 26, 2021
zhangcx93 commented Mar 26, 2021

Please try the development branch first.

It works on master, thanks.

By the way, according to the changelog of 2.6.2, #724 is merged. Since it's only on master, maybe it should not show up in that version's release changelog?


maleadt commented Mar 26, 2021

I agree; that's a tagbot issue: JuliaRegistries/TagBot#181

@maleadt maleadt closed this as completed Mar 26, 2021
@maleadt maleadt removed bug Something isn't working needs information Further information is requested labels Mar 26, 2021