Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create CUDA extension #259

Merged
merged 8 commits into from
Oct 10, 2023
Merged

Create CUDA extension #259

merged 8 commits into from
Oct 10, 2023

Conversation

devmotion
Copy link
Contributor

I moved the CUDA-specific code to an extension. On Julia < 1.9, CUDA will continue to be a dependency and CUDA is supported automatically; on Julia >= 1.9, the package will not install CUDA by default but CUDA support can be enabled by loading CUDA (explicitly with e.g. using CUDA or when loaded by another dependency).

It was reasonably straightforward:

  • I moved the CUDA-specific files to the ext/EvoTreesCUDAExt.jl directory
  • Added name qualifications to (hopefully) everything in the extension that is not exported by EvoTrees
  • Searched for "GPU", "Cu", and "CUDA" in the remaining src directory and moved GPU-specific branches to functions that can dispatch on the device type. It seems three functions (device_ones -> ones on the CPU, CUDA.ones on the GPU, device_array_type -> Array on the CPU, CuArray on the GPU, and post_fit_gc -> no-op on the CPU and garbage collection on the GPU) were sufficient.

The main problem is that I don't have access to an NVIDIA GPU, and hence can't verify that I did not break CUDA support 😅 I would recommend adding CUDA-specific tests with buildkite to the repo (see https://github.com/JuliaGPU/buildkite#adding-a-repository, the setup is quite straightforward, I did add it to a few repos successfully).

Fixes #226.

@codecov
Copy link

codecov bot commented Oct 5, 2023

Codecov Report

Attention: 76 lines in your changes are missing coverage. Please review.

Comparison is base (cf6e3c0) 50.80% compared to head (6d33c71) 51.08%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #259      +/-   ##
==========================================
+ Coverage   50.80%   51.08%   +0.27%     
==========================================
  Files          21       22       +1     
  Lines        1986     1985       -1     
==========================================
+ Hits         1009     1014       +5     
+ Misses        977      971       -6     
Files Coverage Δ
src/EvoTrees.jl 16.66% <ø> (ø)
src/callback.jl 70.12% <100.00%> (+1.77%) ⬆️
src/fit.jl 96.36% <100.00%> (+1.74%) ⬆️
src/init.jl 94.64% <100.00%> (+3.41%) ⬆️
ext/EvoTreesCUDAExt/subsample.jl 0.00% <0.00%> (ø)
ext/EvoTreesCUDAExt/fit-utils.jl 0.00% <0.00%> (ø)
ext/EvoTreesCUDAExt/EvoTreesCUDAExt.jl 33.33% <33.33%> (ø)
ext/EvoTreesCUDAExt/loss.jl 0.00% <0.00%> (ø)
ext/EvoTreesCUDAExt/eval.jl 0.00% <0.00%> (ø)
ext/EvoTreesCUDAExt/predict.jl 0.00% <0.00%> (ø)
... and 2 more

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ext/EvoTreesCUDAExt/predict.jl Outdated Show resolved Hide resolved
ext/EvoTreesCUDAExt/predict.jl Outdated Show resolved Hide resolved
ext/EvoTreesCUDAExt/fit.jl Show resolved Hide resolved
Project.toml Outdated
Comment on lines 46 to 47
[weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From Julia's formatter, it appears that expected order of project keys to be:

  • deps
  • compat
  • weakdeps
  • extensions
  • compat
  • extras
  • targets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on the Julia version you use: On Julia 1.6, the order Pkg uses is

  • deps
  • compat
  • extensions
  • extras
  • targets
  • weakdeps
    whereas on Julia 1.7, 1.8, and 1.9 it is
  • deps
  • weakdeps
  • extensions
  • compat
  • extras
  • targets

See eg the discussion in JuliaTesting/Aqua.jl#105 and in particular JuliaTesting/Aqua.jl#105 (comment)

I guess you were referring to the order in Julia > 1.6 and accidentally included the compat section twice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I meant to refer to Julia > 1.6, sorry about the confusion for the duplicated compat.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the order to the convention in Julia > 1.6.

@jeremiedb
Copy link
Member

Thanks for the PR!

  1. I'm not clear whether CUDA should be kept in deps. It went fine on Julia 1.8, but I encountered an issue on 1.10 which I hadn't time to investigate
  2. Regarding: CompatHelper: bump compat for EvoTrees to 0.16 for package test, (keep existing compat) TuringLang/MCMCDiagnosticTools.jl#93, I'm not clear why the CI test did fail on nightly. My understanding is that CUDA should load fine regardless if the machine isn't NVIDIA compatible. Has there been a change to that in Julia 1.10?
  3. Is the fit from EvoTrees poorer than for other classifiers in rtstar test? The need to pass 1_000 rather than the default 100 appears suspicious. I wouldn't expect a significant discrepancy compared to XGBoost (with its default to max_round=10 and learning_rate=0.3, I'd actually expects its fit to be inferior). If you'd have a data to share, I'd be curious to see if there might be an issue in the classifier routine.

@devmotion
Copy link
Contributor Author

devmotion commented Oct 7, 2023

  1. I'm not clear whether CUDA should be kept in deps. It went fine on Julia 1.8, but I encountered an issue on 1.10 which I hadn't time to investigate

Yes, CUDA should be kept in the deps section for backwards compatibility with Julia < 1.9. https://pkgdocs.julialang.org/v1/creating-packages/#Transition-from-normal-dependency-to-extension state explicitly

Make sure that the package is both in the [deps] and [weakdeps] section. Newer Julia versions will ignore dependencies in [deps] that are also in [weakdeps].

Problems with 1.10 should be caused by something else.

@devmotion
Copy link
Contributor Author

2. I'm not clear why the CI test did fail on nightly. My understanding is that CUDA should load fine regardless if the machine isn't NVIDIA compatible. Has there been a change to that in Julia 1.10?

No, it should work fine (the main problem with CUDA is that it pulls in many dependencies and in particular CUDA jlls). The problem with the current nightlies (https://github.com/TuringLang/MCMCDiagnosticTools.jl/actions/runs/6428712477/job/17456380934#step:6:335) is that CUDA defines GPU overrides for Random.make_seed (https://github.com/JuliaGPU/CUDA.jl/blob/da140359fc68f453e511608d9af552c90d699d9a/src/device/random.jl#L53) but Random.make_seed was removed last week (JuliaLang/julia#51436).

@jeremiedb
Copy link
Member

Thanks for the clarifications. Regarding the Project.toml keys order, I'll likely move towards the Julia v1.6 order in subsequent release. Let me know if that's an issue for you. Or feel free to make an adjustment in this PR.
Otherwise, good to go!

@devmotion
Copy link
Contributor Author

I updated the order of the sections in the Project.toml file to the convention in Julia > 1.6.

@jeremiedb jeremiedb merged commit 16db639 into Evovest:main Oct 10, 2023
7 of 8 checks passed
@jeremiedb
Copy link
Member

jeremiedb commented Oct 10, 2023

Registraion of new release v0.16.3 is underway. Thanks for your patience!

@devmotion devmotion deleted the dw/cudaext branch October 10, 2023 05:49
@devmotion
Copy link
Contributor Author

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make CUDA a weak dependency
2 participants