Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Initial Java Support for GDS to KvikIO #396

Open
wants to merge 15 commits into
base: branch-24.10
Choose a base branch
from

Conversation

aslobodaNV
Copy link

@aslobodaNV aslobodaNV commented Jun 25, 2024

This PR is intended to add initial support for Java binding to GDS as part of the KvikIO library. In this PR are the minimal set of bindings required to support synchronous read and write IO operations via GDS as well as a single example to demonstrate how the bindings can be used alongside other CUDA libraries, such as JCuda. Full support for the GDS CuFile API, including batch and asynchronous IO, has not yet been implemented and more sophisticated error/exception handling is not yet in place. There is a README located within kvikio/java detailing how this new functionality can be compiled and built locally, along with detailed instructions on how to run the included usage example.

Copy link

copy-pr-bot bot commented Jun 25, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@aslobodaNV aslobodaNV marked this pull request as draft June 25, 2024 20:48
@madsbk madsbk added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jun 26, 2024
java/README.md Outdated
Comment on lines 1 to 13
Java Bindings

Summary
These Java KvikIO bindings for GDS currently support only synchronous read and write IO operations using the underlying CuFile API. Support for batch IO and asynchronous operations are not yet supported.

Dependencies
The Java KvikIO bindings have been developed to work on Linux based systems and require CUDA to be installed and for GDS to be properly enabled. Instructions for how to install and enable GDS can be found on NVIDIA's website. To compile the shared library it is also necessary to have a JDK installed. To run the included example, it is also necessary to install JCuda as it is used to handle memory allocations and the transfer of data between host and GPU memory. JCuda jar files supporting CUDA 12.x can be found here:
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar

Compilation
To recompile the .so file for your local system run the following command. Note: Update the command to reflect the directory where you have installed CUDA and your JDK.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use markdown formatting here, so that this will look a bit better when rendered.

Suggested change
Java Bindings
Summary
These Java KvikIO bindings for GDS currently support only synchronous read and write IO operations using the underlying CuFile API. Support for batch IO and asynchronous operations are not yet supported.
Dependencies
The Java KvikIO bindings have been developed to work on Linux based systems and require CUDA to be installed and for GDS to be properly enabled. Instructions for how to install and enable GDS can be found on NVIDIA's website. To compile the shared library it is also necessary to have a JDK installed. To run the included example, it is also necessary to install JCuda as it is used to handle memory allocations and the transfer of data between host and GPU memory. JCuda jar files supporting CUDA 12.x can be found here:
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar
Compilation
To recompile the .so file for your local system run the following command. Note: Update the command to reflect the directory where you have installed CUDA and your JDK.
# Java Bindings
## Summary
These Java KvikIO bindings for GDS currently support only synchronous read and write IO operations using the underlying CuFile API. Support for batch IO and asynchronous operations are not yet supported.
## Dependencies
The Java KvikIO bindings have been developed to work on Linux based systems and require CUDA to be installed and for GDS to be properly enabled. Instructions for how to install and enable GDS can be found on NVIDIA's website. To compile the shared library it is also necessary to have a JDK installed. To run the included example, it is also necessary to install JCuda as it is used to handle memory allocations and the transfer of data between host and GPU memory. JCuda jar files supporting CUDA 12.x can be found here:
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar
## Compilation
To recompile the .so file for your local system run the following command. Note: Update the command to reflect the directory where you have installed CUDA and your JDK.

(See https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax for further syntax).

java/README.md Outdated
Comment on lines 8 to 9
https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar
https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Prefer not to link to explicit files since this can easily go out of date without us remembering to update things. For preference, can we link to an appropriate installation document for JCuda?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, I can link to the installation instructions as well but found that the installation instructions themselves are a bit unclear/potentially out of date. I may keep these links but just add a note to confirm they are still up to date and reference the installation instructions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused by these bindings. They look to be bindings to the cufile API, and not the kvikio API. Is that deliberate? It seems like if so one will have to replicate a lot of the implementation that already exists in kvikio here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, these are directly bound to the underlying cufile API. In the initial discussions with the rest of the GDS team there was some concern about potential performance overhead of Java bindings -> C++ bindings -> C API. I have not investigated the overhead yet, but definitely see the benefit you are suggesting to sharing one underlying set of bindings. If you don't mind, I will add an issue to my backlog to investigate the potential performance differences and, if they are minimal, update these to point to the C++ bindings at a later date.

Copy link
Member

@madsbk madsbk Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, let's continue with raw java bindings for now.

Side note, I would be very surprised if the basic C++ API introduce any significant overhead. E.g. the overhead of calling read() vs. calling cuFileRead() is tiny.

@madsbk
Copy link
Member

madsbk commented Jun 27, 2024

/ok to test

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few high level comments before digging too deep into this code:

  • I have a weak preference to wrap the C++ bindings rather than re-implement kvikio in Java. This will probably be easier than reimplementing the code as new features are introduced to kvikio, and gives more confidence in testing (bindings are probably less bug-prone than a full reimplementation).
  • This will need to pass style checks. See the cuDF guide on Code Formatting. The same guidance applies to using pre-commit with kvikio development.
  • This will need a build system, probably Maven. Long commands in the README that explain how to supply all the dependencies are not a good solution if those dependencies can be fetched and built automatically. See how cuDF's Java bindings use Maven, for instance. https://github.com/rapidsai/cudf/tree/branch-24.08/java
    • The end result should be that commands like mvn clean install work properly.
  • This will need CI builds and tests. Review the following resources on cuDF's Java CI and try to duplicate them:

Give this a start and please feel free to schedule time with me to discuss anything that is unclear!

@jakirkham
Copy link
Member

Is this still planned for 24.08 or should we move to 24.10?

@aslobodaNV
Copy link
Author

I will not likely finish the updates here for a couple weeks due to other priorities, so this should be moved to 24.10

@jakirkham jakirkham changed the base branch from branch-24.08 to branch-24.10 July 24, 2024 20:27
@jakirkham
Copy link
Member

Ok thanks Alex! 🙏

Have moved to 24.10

@aslobodaNV aslobodaNV marked this pull request as ready for review September 4, 2024 21:21
@aslobodaNV aslobodaNV requested review from a team as code owners September 4, 2024 21:21
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick first pass of review -- mostly focused on packaging and CI. Give my suggestions a try and I'll comment /ok to test to run your next commits on CI.

@@ -34,6 +34,14 @@ jobs:
branch: ${{ inputs.branch }}
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
java-build:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build.yaml is only used for building packages that we distribute as a part of nightly builds. Do you expect this to distribute a new conda package? It should probably be removed, see other comments below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we expect this to distribute a new conda package, so I have removed this bit

Comment on lines +44 to +55
conda-java-build:
needs: checks
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-java-build.yaml@branch-24.10
with:
build_type: pull-request
conda-java-tests:
needs: conda-java-build
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-java-tests.yaml@branch-24.10
with:
build_type: pull-request
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to use custom job

@@ -30,3 +30,11 @@ jobs:
branch: ${{ inputs.branch }}
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
java-tests:
Copy link
Contributor

@bdice bdice Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do run Java tests in nightlies, so this can stay, but it needs to use a custom job as noted in my other comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added custom job

#### Run example

cd kvikio/java/
java -cp target/cufile-24.10.0-SNAPSHOT.jar:$HOME/.m2/repository/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar:$HOME/.m2/repository/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar -Djava.library.path=./target bindings.kvikio.example.Main
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cufile-24.10.0-SNAPSHOT.jar

cuFile is not versioned like RAPIDS. Is this correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the version, the jar that is generated is not aware of what version of libcufile was used to generate it... that just depends on what version of the cuda toolkit the host machine has installed. It appears I can't make the jar not have a version number at all, so I'm inclined to have the version represent the version of kvikio. If you think there's a way I can inject the version of libcufile into maven at runtime I am open to that, but I have not found any information on how that would be done so far.

* limitations under the License.
*/

package bindings.kvikio.cufile;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there code formatting that can be applied? There are some extraneous spaces in this PR.

Suggested change
package bindings.kvikio.cufile;
package bindings.kvikio.cufile;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, these should be resolved now.

@aslobodaNV
Copy link
Author

Quick first pass of review -- mostly focused on packaging and CI. Give my suggestions a try and I'll comment /ok to test to run your next commits on CI.

@bdice I believe I have updated the PR with all of your suggestions, except for the versioning question you had. Please let me know if you had further thoughts there or if it seems that I missed anything!

Comment on lines +114 to +140
void read(void* buffer,
std::size_t size,
std::size_t file_offset,
std::size_t device_offset) const
{
auto const status = cuFileRead(cufile_handle_, buffer, size, file_offset, device_offset);

if (status < 0) {
if (IS_CUFILE_ERR(status)) {
throw std::logic_error("Failed to read file into buffer: " + cuFileGetErrorString(status));
} else {
throw std::logic_error("Failed to read file into buffer: " + cuFileGetErrorString(errno));
}
}
}

void write(void* buffer, std::size_t size, std::size_t file_offset, std::size_t buffer_offset)
{
auto const status = cuFileWrite(cufile_handle_, buffer, size, file_offset, buffer_offset);
if (status < 0) {
if (IS_CUFILE_ERR(status)) {
throw std::logic_error("Failed to write file from buffer: " + cuFileGetErrorString(status));
} else {
throw std::logic_error("Failed to write file from buffer: " + cuFileGetErrorString(errno));
}
}
}
Copy link
Member

@madsbk madsbk Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a sync_default_stream argument or always sync.

Contrary to most of the non-async CUDA API, cuFile does not have the semantic of being ordered
with respect to other non-cuFile work in the default stream. By enabling sync_default_stream,
KvikIO will synchronize the default stream and order the operation with respect to other work
in the null stream. When in KvikIO's compatibility mode or when accessing host memory, the
operation is always default stream ordered like the rest of the non-async CUDA API. In this
case, the value of sync_default_stream is ignored.

See https://github.com/rapidsai/kvikio/blob/branch-24.10/cpp/include/kvikio/file_handle.hpp#L317-L323

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants