-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Initial Java Support for GDS to KvikIO #396
base: branch-24.10
Are you sure you want to change the base?
Add Initial Java Support for GDS to KvikIO #396
Conversation
java/README.md
Outdated
Java Bindings | ||
|
||
Summary | ||
These Java KvikIO bindings for GDS currently support only synchronous read and write IO operations using the underlying CuFile API. Support for batch IO and asynchronous operations are not yet supported. | ||
|
||
Dependencies | ||
The Java KvikIO bindings have been developed to work on Linux based systems and require CUDA to be installed and for GDS to be properly enabled. Instructions for how to install and enable GDS can be found on NVIDIA's website. To compile the shared library it is also necessary to have a JDK installed. To run the included example, it is also necessary to install JCuda as it is used to handle memory allocations and the transfer of data between host and GPU memory. JCuda jar files supporting CUDA 12.x can be found here: | ||
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar | ||
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar | ||
|
||
Compilation | ||
To recompile the .so file for your local system run the following command. Note: Update the command to reflect the directory where you have installed CUDA and your JDK. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use markdown formatting here, so that this will look a bit better when rendered.
Java Bindings | |
Summary | |
These Java KvikIO bindings for GDS currently support only synchronous read and write IO operations using the underlying CuFile API. Support for batch IO and asynchronous operations are not yet supported. | |
Dependencies | |
The Java KvikIO bindings have been developed to work on Linux based systems and require CUDA to be installed and for GDS to be properly enabled. Instructions for how to install and enable GDS can be found on NVIDIA's website. To compile the shared library it is also necessary to have a JDK installed. To run the included example, it is also necessary to install JCuda as it is used to handle memory allocations and the transfer of data between host and GPU memory. JCuda jar files supporting CUDA 12.x can be found here: | |
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar | |
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar | |
Compilation | |
To recompile the .so file for your local system run the following command. Note: Update the command to reflect the directory where you have installed CUDA and your JDK. | |
# Java Bindings | |
## Summary | |
These Java KvikIO bindings for GDS currently support only synchronous read and write IO operations using the underlying CuFile API. Support for batch IO and asynchronous operations are not yet supported. | |
## Dependencies | |
The Java KvikIO bindings have been developed to work on Linux based systems and require CUDA to be installed and for GDS to be properly enabled. Instructions for how to install and enable GDS can be found on NVIDIA's website. To compile the shared library it is also necessary to have a JDK installed. To run the included example, it is also necessary to install JCuda as it is used to handle memory allocations and the transfer of data between host and GPU memory. JCuda jar files supporting CUDA 12.x can be found here: | |
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar | |
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar | |
## Compilation | |
To recompile the .so file for your local system run the following command. Note: Update the command to reflect the directory where you have installed CUDA and your JDK. | |
(See https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax for further syntax).
java/README.md
Outdated
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar | ||
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Prefer not to link to explicit files since this can easily go out of date without us remembering to update things. For preference, can we link to an appropriate installation document for JCuda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, I can link to the installation instructions as well but found that the installation instructions themselves are a bit unclear/potentially out of date. I may keep these links but just add a note to confirm they are still up to date and reference the installation instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused by these bindings. They look to be bindings to the cufile API, and not the kvikio API. Is that deliberate? It seems like if so one will have to replicate a lot of the implementation that already exists in kvikio here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct, these are directly bound to the underlying cufile API. In the initial discussions with the rest of the GDS team there was some concern about potential performance overhead of Java bindings -> C++ bindings -> C API. I have not investigated the overhead yet, but definitely see the benefit you are suggesting to sharing one underlying set of bindings. If you don't mind, I will add an issue to my backlog to investigate the potential performance differences and, if they are minimal, update these to point to the C++ bindings at a later date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, let's continue with raw java bindings for now.
Side note, I would be very surprised if the basic C++ API introduce any significant overhead. E.g. the overhead of calling read()
vs. calling cuFileRead()
is tiny.
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few high level comments before digging too deep into this code:
- I have a weak preference to wrap the C++ bindings rather than re-implement kvikio in Java. This will probably be easier than reimplementing the code as new features are introduced to kvikio, and gives more confidence in testing (bindings are probably less bug-prone than a full reimplementation).
- This will need to pass style checks. See the cuDF guide on Code Formatting. The same guidance applies to using
pre-commit
with kvikio development. - This will need a build system, probably Maven. Long commands in the README that explain how to supply all the dependencies are not a good solution if those dependencies can be fetched and built automatically. See how cuDF's Java bindings use Maven, for instance. https://github.com/rapidsai/cudf/tree/branch-24.08/java
- The end result should be that commands like
mvn clean install
work properly.
- The end result should be that commands like
- This will need CI builds and tests. Review the following resources on cuDF's Java CI and try to duplicate them:
- GitHub Action Job for PRs
- GitHub Action Job for Nightly Tests
- Java build/test script
- This uses conda to supply the libcudf package, and builds the Java bindings from that. If you use direct bindings and do not depend on the kvikio C++ library, then you can do something simpler that only requires installing cuFile from conda.
- Java dependencies.yaml file key
- This file key gives a list of what dependency lists to include. This file key is referenced by the CI build/test script. This file key should include the dependency list
cuda
, so you get the cufile package. See the rapids-dependency-file-generator README for more information.
- This file key gives a list of what dependency lists to include. This file key is referenced by the CI build/test script. This file key should include the dependency list
- Java dependencies.yaml list
- This dependency list should probably include
maven
andopenjdk
. Not sure what else.
- This dependency list should probably include
Give this a start and please feel free to schedule time with me to discuss anything that is unclear!
Is this still planned for 24.08 or should we move to 24.10? |
I will not likely finish the updates here for a couple weeks due to other priorities, so this should be moved to 24.10 |
Ok thanks Alex! 🙏 Have moved to 24.10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick first pass of review -- mostly focused on packaging and CI. Give my suggestions a try and I'll comment /ok to test
to run your next commits on CI.
.github/workflows/build.yaml
Outdated
@@ -34,6 +34,14 @@ jobs: | |||
branch: ${{ inputs.branch }} | |||
date: ${{ inputs.date }} | |||
sha: ${{ inputs.sha }} | |||
java-build: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build.yaml
is only used for building packages that we distribute as a part of nightly builds. Do you expect this to distribute a new conda package? It should probably be removed, see other comments below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we expect this to distribute a new conda package, so I have removed this bit
conda-java-build: | ||
needs: checks | ||
secrets: inherit | ||
uses: rapidsai/shared-workflows/.github/workflows/conda-java-build.yaml@branch-24.10 | ||
with: | ||
build_type: pull-request | ||
conda-java-tests: | ||
needs: conda-java-build | ||
secrets: inherit | ||
uses: rapidsai/shared-workflows/.github/workflows/conda-java-tests.yaml@branch-24.10 | ||
with: | ||
build_type: pull-request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no conda-java-build or conda-java-tests workflows. We use a single "custom job" workflow for Java builds/tests in cuDF. Copy from here: https://github.com/rapidsai/cudf/blob/afd3a4b4776adf738284c9f0b99e1fc2fcefeec8/.github/workflows/pr.yaml#L153-L163
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to use custom job
.github/workflows/test.yaml
Outdated
@@ -30,3 +30,11 @@ jobs: | |||
branch: ${{ inputs.branch }} | |||
date: ${{ inputs.date }} | |||
sha: ${{ inputs.sha }} | |||
java-tests: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do run Java tests in nightlies, so this can stay, but it needs to use a custom job as noted in my other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added custom job
#### Run example | ||
|
||
cd kvikio/java/ | ||
java -cp target/cufile-24.10.0-SNAPSHOT.jar:$HOME/.m2/repository/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar:$HOME/.m2/repository/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar -Djava.library.path=./target bindings.kvikio.example.Main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cufile-24.10.0-SNAPSHOT.jar
cuFile is not versioned like RAPIDS. Is this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the version, the jar that is generated is not aware of what version of libcufile was used to generate it... that just depends on what version of the cuda toolkit the host machine has installed. It appears I can't make the jar not have a version number at all, so I'm inclined to have the version represent the version of kvikio. If you think there's a way I can inject the version of libcufile into maven at runtime I am open to that, but I have not found any information on how that would be done so far.
* limitations under the License. | ||
*/ | ||
|
||
package bindings.kvikio.cufile; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there code formatting that can be applied? There are some extraneous spaces in this PR.
package bindings.kvikio.cufile; | |
package bindings.kvikio.cufile; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, these should be resolved now.
@bdice I believe I have updated the PR with all of your suggestions, except for the versioning question you had. Please let me know if you had further thoughts there or if it seems that I missed anything! |
void read(void* buffer, | ||
std::size_t size, | ||
std::size_t file_offset, | ||
std::size_t device_offset) const | ||
{ | ||
auto const status = cuFileRead(cufile_handle_, buffer, size, file_offset, device_offset); | ||
|
||
if (status < 0) { | ||
if (IS_CUFILE_ERR(status)) { | ||
throw std::logic_error("Failed to read file into buffer: " + cuFileGetErrorString(status)); | ||
} else { | ||
throw std::logic_error("Failed to read file into buffer: " + cuFileGetErrorString(errno)); | ||
} | ||
} | ||
} | ||
|
||
void write(void* buffer, std::size_t size, std::size_t file_offset, std::size_t buffer_offset) | ||
{ | ||
auto const status = cuFileWrite(cufile_handle_, buffer, size, file_offset, buffer_offset); | ||
if (status < 0) { | ||
if (IS_CUFILE_ERR(status)) { | ||
throw std::logic_error("Failed to write file from buffer: " + cuFileGetErrorString(status)); | ||
} else { | ||
throw std::logic_error("Failed to write file from buffer: " + cuFileGetErrorString(errno)); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a sync_default_stream
argument or always sync.
Contrary to most of the non-async CUDA API, cuFile does not have the semantic of being ordered
with respect to other non-cuFile work in the default stream. By enablingsync_default_stream
,
KvikIO will synchronize the default stream and order the operation with respect to other work
in the null stream. When in KvikIO's compatibility mode or when accessing host memory, the
operation is always default stream ordered like the rest of the non-async CUDA API. In this
case, the value ofsync_default_stream
is ignored.
See https://github.com/rapidsai/kvikio/blob/branch-24.10/cpp/include/kvikio/file_handle.hpp#L317-L323
This PR is intended to add initial support for Java binding to GDS as part of the KvikIO library. In this PR are the minimal set of bindings required to support synchronous read and write IO operations via GDS as well as a single example to demonstrate how the bindings can be used alongside other CUDA libraries, such as JCuda. Full support for the GDS CuFile API, including batch and asynchronous IO, has not yet been implemented and more sophisticated error/exception handling is not yet in place. There is a README located within kvikio/java detailing how this new functionality can be compiled and built locally, along with detailed instructions on how to run the included usage example.