
Conditional element-wise assignment still won't compile #789

Closed
zhangcx93 opened this issue Mar 26, 2021 · 3 comments

zhangcx93 commented Mar 26, 2021

a = CuArray([1,2,3,4,5])
a[a .> 3] .= 10

This won't compile, while the CPU version works:

a = [1,2,3,4,5]
a[a .> 3] .= 10

Error message:

ERROR: GPU compilation of kernel broadcast_kernel(CUDA.CuKernelContext, SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, typeof(identity), Tuple{Int64}}, Int64) failed
KernelError: passing and using non-bitstype argument

Argument 3 to your kernel function is of type SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, which is not isbits:
.indices is of type Tuple{Vector{Int64}} which is not isbits.
.1 is of type Vector{Int64} which is not isbits.

Stacktrace:
[1] check_invocation(job::GPUCompiler.CompilerJob, entry::LLVM.Function)
@ GPUCompiler ~.julia\packages\GPUCompiler\XwWPj\src\validation.jl:68
[2] macro expansion
@ ~.julia\packages\GPUCompiler\XwWPj\src\driver.jl:287 [inlined]
[3] macro expansion
@ ~.julia\packages\TimerOutputs\4QAIk\src\TimerOutput.jl:206 [inlined]
[4] macro expansion
@ ~.julia\packages\GPUCompiler\XwWPj\src\driver.jl:286 [inlined]
[5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module, kernel::LLVM.Function; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
@ GPUCompiler ~.julia\packages\GPUCompiler\XwWPj\src\utils.jl:62
[6] cufunction_compile(job::GPUCompiler.CompilerJob)
@ CUDA ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:306
[7] check_cache
@ ~.julia\packages\GPUCompiler\XwWPj\src\cache.jl:44 [inlined]
[8] cached_compilation
@ ~.julia\packages\GPUArrays\WV76E\src\host\broadcast.jl:60 [inlined]
[9] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{GPUArrays.var"#broadcast_kernel#12", Tuple{CUDA.CuKernelContext, SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, typeof(identity), Tuple{Int64}}, Int64}}}, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler ~.julia\packages\GPUCompiler\XwWPj\src\cache.jl:0
[10] cufunction(f::GPUArrays.var"#broadcast_kernel#12", tt::Type{Tuple{CUDA.CuKernelContext, SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{Vector{Int64}}, false}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}}, typeof(identity), Tuple{Int64}}, Int64}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CUDA ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:294
[11] cufunction
@ ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:288 [inlined]
[12] macro expansion
@ ~.julia\packages\CUDA\qEV3Y\src\compiler\execution.jl:102 [inlined]
[13] #launch_heuristic#280
@ ~.julia\packages\CUDA\qEV3Y\src\gpuarrays.jl:17 [inlined]
[14] launch_heuristic
@ ~.julia\packages\CUDA\qEV3Y\src\gpuarrays.jl:17 [inlined]
[15] copyto!
@ ~.julia\packages\GPUArrays\WV76E\src\host\broadcast.jl:66 [inlined]
[16] copyto!
@ ~.julia\packages\GPUArrays\WV76E\src\host\broadcast.jl:76 [inlined]
[17] materialize!
@ .\broadcast.jl:894 [inlined]
[18] materialize!(dest::SubArray{Int64, 1, CuArray{Int64, 1}, Tuple{Vector{Int64}}, false}, bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(identity), Tuple{Int64}})
@ Base.Broadcast .\broadcast.jl:891
[19] top-level scope
@ REPL[5]:1
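The error says that logical indexing produces a SubArray whose index field is a plain CPU `Vector{Int64}`, which is not isbits and therefore cannot be passed to a GPU kernel. Until a fixed release is available, one workaround (a sketch, not from this thread) is to express the conditional assignment as a pure broadcast with `ifelse`, which fuses into a single GPU kernel and never materializes a view:

```julia
using CUDA

a = CuArray([1, 2, 3, 4, 5])

# Replace elements greater than 3 with 10, element-wise on the GPU.
# ifelse.(cond, x, y) broadcasts over the whole array, so no SubArray
# with CPU-side indices is ever created.
a .= ifelse.(a .> 3, 10, a)

@assert Array(a) == [1, 2, 3, 10, 10]
```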

Version info:

CUDA toolkit 11.1.1, artifact installation
CUDA driver 11.2.0
NVIDIA driver 461.92.0

Libraries:

  • CUBLAS: 11.3.0
  • CURAND: 10.2.2
  • CUFFT: 10.3.0
  • CUSOLVER: 11.0.1
  • CUSPARSE: 11.3.0
  • CUPTI: 14.0.0
  • NVML: 11.0.0+461.92
  • CUDNN: 8.10.0 (for CUDA 11.2.0)
  • CUTENSOR: 1.2.2 (for CUDA 11.1.0)

Toolchain:

  • Julia: 1.6.0
  • LLVM: 11.0.1
  • PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
  • Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80

1 device:
0: GeForce RTX 2080 Ti (sm_75, 8.513 GiB / 11.000 GiB available)

I did a fresh install of Julia 1.6 stable and CUDA.jl 2.6.2, on Windows 10, if that matters.

I read that this issue was fixed and merged into 2.6.2 with a PR.

@zhangcx93 zhangcx93 added the bug Something isn't working label Mar 26, 2021

maleadt commented Mar 26, 2021

merged into the 2.6.2 with PR.

eeb7027 is only on master. Please try the development branch first.
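One way to try the development branch (a sketch using standard Pkg syntax; assumes Julia 1.6):

```julia
using Pkg

# Track the master branch of CUDA.jl instead of the registered release.
Pkg.add(name="CUDA", rev="master")
```

Equivalently, `] add CUDA#master` at the Pkg REPL prompt.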

@maleadt maleadt added the needs information Further information is requested label Mar 26, 2021
zhangcx93 commented Mar 26, 2021

Please try the development branch first.

It works on master, thanks.

By the way, according to the changelog of 2.6.2, #724 is merged. Since it's only on master, maybe it should not show up in that version's release changelog?


maleadt commented Mar 26, 2021

I agree; that's a tagbot issue: JuliaRegistries/TagBot#181

@maleadt maleadt closed this as completed Mar 26, 2021
@maleadt maleadt removed bug Something isn't working needs information Further information is requested labels Mar 26, 2021