Add Initial Java Support for GDS to KvikIO #396
base: branch-24.10
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
#!/bin/bash | ||
# Copyright (c) 2024, NVIDIA CORPORATION. | ||
|
||
set -euo pipefail | ||
|
||
. /opt/conda/etc/profile.d/conda.sh | ||
|
||
rapids-logger "Generate java testing dependencies" | ||
rapids-dependency-file-generator \ | ||
--output conda \ | ||
--file-key test_java \ | ||
--matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch)" | tee env.yaml | ||
|
||
rapids-mamba-retry env create --yes -f env.yaml -n test | ||
|
||
# Temporarily allow unbound variables for conda activation. | ||
set +u | ||
conda activate test | ||
set -u | ||
|
||
rapids-logger "Downloading artifacts from previous jobs" | ||
CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp) | ||
|
||
rapids-print-env | ||
|
||
rapids-mamba-retry install \ | ||
--channel "${CPP_CHANNEL}" \ | ||
libkvikio libkvikio-tests | ||
|
||
rapids-logger "Check GPU usage" | ||
nvidia-smi | ||
|
||
EXITCODE=0 | ||
trap "EXITCODE=1" ERR | ||
set +e | ||
|
||
rapids-logger "Run Java tests" | ||
pushd java | ||
mvn test -B | ||
popd | ||
|
||
rapids-logger "Test script exiting with value: $EXITCODE" | ||
exit ${EXITCODE} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
# Java KvikIO Bindings | ||
|
||
## Summary | ||
These Java KvikIO bindings for GDS currently support only synchronous read and write IO operations using the underlying cuFile API. Batch IO and asynchronous operations are not yet supported. | ||
|
||
## Dependencies | ||
The Java KvikIO bindings were developed for Linux-based systems and require [CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) to be installed and [GDS](https://docs.nvidia.com/gpudirect-storage/troubleshooting-guide/index.html) to be properly enabled. Compiling the shared library also requires a JDK. Running the included example additionally requires JCuda, which handles memory allocation and the transfer of data between host and GPU memory. JCuda jar files supporting CUDA 12.x can be found here: | ||
[jcuda-12.0.0.jar](https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar), | ||
[jcuda-natives-12.0.0.jar](https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar) | ||
|
||
For more information on JCuda, including potentially more up-to-date installation instructions and jar files, see: | ||
[JCuda](http://javagl.de/jcuda.org/), [JCuda Usage](https://github.com/jcuda/jcuda-main/blob/master/USAGE.md), [JCuda Maven Repo](https://mvnrepository.com/artifact/org.jcuda) | ||
|
||
## Compilation | ||
To recompile the .so file for your local system, run the following command. Note: Update the command to reflect the directories where CUDA and your JDK are installed. | ||
|
||
/usr/local/cuda/bin/nvcc -shared -o libCuFileJNI.so -I/usr/local/cuda/include/ -I/usr/lib/jvm/java-21-openjdk-amd64/include/ -I/usr/lib/jvm/java-21-openjdk-amd64/include/linux src/main/native/src/CuFileJni.cpp --compiler-options "-fPIC" -lcufile | ||
|
||
The resulting .so file must be on your JVM library path when running Java programs that use the bindings. If it is not already on that path, it can be added by passing an argument like the following: | ||
|
||
-Djava.library.path={path/to/your/so/file/} | ||
|
||
## Examples | ||
An example of how to use the Java KvikIO bindings can be found in src/main/java/bindings/kvikio/example. Note: This example depends on JCuda, so when running it, ensure that the JCuda shared library files are on the JVM library path along with the libCuFileJNI.so file. | ||
|
||
### Specific instructions to run the example using Maven | ||
|
||
#### Compile the shared library and Java files with Maven | ||
|
||
cd kvikio/java/ | ||
mvn clean install | ||
|
||
#### Set up a test file target | ||
|
||
NOTE: Your mount directory may differ from /mnt/nvme, so update this command, as well as example/Main.java, to point to the correct file path. | ||
|
||
touch /mnt/nvme/java_test | ||
|
||
#### Run example | ||
|
||
cd kvikio/java/ | ||
java -cp target/cufile-24.10.0-SNAPSHOT.jar:$HOME/.m2/repository/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar:$HOME/.m2/repository/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar -Djava.library.path=./target bindings.kvikio.example.Main | ||
cuFile is not versioned like RAPIDS. Is this correct?
Regarding the version, the jar that is generated is not aware of what version of libcufile was used to generate it... that just depends on what version of the cuda toolkit the host machine has installed. It appears I can't make the jar not have a version number at all, so I'm inclined to have the version represent the version of kvikio. If you think there's a way I can inject the version of libcufile into maven at runtime I am open to that, but I have not found any information on how that would be done so far.
||
|
||
### Specific instructions to run the example from a terminal | ||
|
||
#### Compile class files | ||
|
||
cd kvikio/java/src/main/java/bindings/kvikio/cufile | ||
javac *.java | ||
|
||
#### Retrieve JCuda jar files | ||
|
||
cd kvikio/java/ | ||
mkdir lib | ||
cd lib | ||
wget https://repo1.maven.org/maven2/org/jcuda/jcuda/12.0.0/jcuda-12.0.0.jar | ||
wget https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/12.0.0/jcuda-natives-12.0.0.jar | ||
|
||
#### Compile shared library | ||
|
||
cd kvikio/java/lib | ||
/usr/local/cuda/bin/nvcc -shared -o libCuFileJNI.so -I/usr/local/cuda/include/ -I/usr/lib/jvm/java-21-openjdk-amd64/include/ -I/usr/lib/jvm/java-21-openjdk-amd64/include/linux ../src/main/native/src/CuFileJni.cpp --compiler-options "-fPIC" -lcufile | ||
|
||
#### Set up a test file target | ||
|
||
NOTE: Your mount directory may differ from /mnt/nvme, so update this command, as well as example/Main.java, to point to the correct file path. | ||
|
||
touch /mnt/nvme/java_test | ||
|
||
#### Compile example file | ||
|
||
cd kvikio/java/src/main/java | ||
javac -cp .:../../../lib/jcuda-12.0.0.jar:../../../lib/jcuda-natives-12.0.0.jar bindings/kvikio/example/Main.java | ||
|
||
#### Run example | ||
|
||
cd kvikio/java/src/main/java | ||
java -cp .:../../../lib/jcuda-12.0.0.jar:../../../lib/jcuda-natives-12.0.0.jar -Djava.library.path=../../../lib/ bindings.kvikio.example.Main |
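The README requires libCuFileJNI.so to be on the JVM library path. A small pre-flight check can confirm that before the example is launched. The following is a minimal sketch, not part of this PR: the class name `LibraryPathCheck` and the method `findLibrary` are hypothetical, and it only inspects the directories listed in `java.library.path`, which is narrower than the JVM's full lookup behavior.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the PR): looks for a native library file
// in the directories of a java.library.path-style string.
final class LibraryPathCheck {

    // Returns the first matching file, or Optional.empty() if none is found.
    static Optional<Path> findLibrary(String libraryPath, String libName) {
        // Maps "CuFileJNI" to the platform file name, e.g. "libCuFileJNI.so" on Linux.
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String libraryPath = System.getProperty("java.library.path", "");
        Optional<Path> hit = findLibrary(libraryPath, "CuFileJNI");
        System.out.println(hit
                .map(p -> "found " + p)
                .orElse("CuFileJNI not found on java.library.path: " + libraryPath));
    }
}
```

Running it with the same `-Djava.library.path=...` argument as the example shows whether that argument points at the directory that actually contains the shared library.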
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,147 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
|
||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
|
||
<groupId>bindings.kvikio</groupId> | ||
<artifactId>cufile</artifactId> | ||
<version>24.10.0-SNAPSHOT</version> | ||
|
||
<name>cufile</name> | ||
<description> | ||
This project provides java bindings for the GPUDirect Storage cufile library, enabling the GPU to load and | ||
save large amounts of data to and from persistent storage. This is still a work in progress so some APIs may change. | ||
</description> | ||
<url>http://ai.rapids</url> | ||
|
||
<properties> | ||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | ||
<maven.compiler.source>21</maven.compiler.source> | ||
<maven.compiler.target>21</maven.compiler.target> | ||
<junit.version>5.4.2</junit.version> | ||
</properties> | ||
|
||
<dependencies> | ||
<dependency> | ||
<groupId>org.jcuda</groupId> | ||
<artifactId>jcuda</artifactId> | ||
<version>12.0.0</version> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.jcuda</groupId> | ||
<artifactId>jcuda-natives</artifactId> | ||
<version>12.0.0</version> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.junit.jupiter</groupId> | ||
<artifactId>junit-jupiter-api</artifactId> | ||
<version>${junit.version}</version> | ||
<scope>test</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.junit.jupiter</groupId> | ||
<artifactId>junit-jupiter-params</artifactId> | ||
<version>${junit.version}</version> | ||
<scope>test</scope> | ||
</dependency> | ||
</dependencies> | ||
|
||
<build> | ||
<pluginManagement> | ||
<plugins> | ||
<plugin> | ||
<groupId>org.codehaus.mojo</groupId> | ||
<artifactId>exec-maven-plugin</artifactId> | ||
<version>1.6.0</version> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-clean-plugin</artifactId> | ||
<version>3.1.0</version> | ||
<configuration> | ||
<createDirs>true</createDirs> | ||
</configuration> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-compiler-plugin</artifactId> | ||
<version>3.8.0</version> | ||
<configuration> | ||
<source>21</source> | ||
<target>21</target> | ||
</configuration> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-surefire-plugin</artifactId> | ||
<version>2.22.1</version> | ||
<configuration> | ||
<argLine>-Djava.library.path=${project.build.directory}:${java.library.path}</argLine> | ||
</configuration> | ||
<dependencies> | ||
<dependency> | ||
<groupId>org.junit.platform</groupId> | ||
<artifactId>junit-platform-surefire-provider</artifactId> | ||
<version>1.2.0</version> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.junit.jupiter</groupId> | ||
<artifactId>junit-jupiter-engine</artifactId> | ||
<version>5.4.2</version> | ||
</dependency> | ||
</dependencies> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-jar-plugin</artifactId> | ||
<version>3.0.2</version> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-install-plugin</artifactId> | ||
<version>2.5.2</version> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-deploy-plugin</artifactId> | ||
<version>2.8.2</version> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-site-plugin</artifactId> | ||
<version>3.7.1</version> | ||
</plugin> | ||
<plugin> | ||
<artifactId>maven-project-info-reports-plugin</artifactId> | ||
<version>3.0.0</version> | ||
</plugin> | ||
</plugins> | ||
</pluginManagement> | ||
<plugins> | ||
<plugin> | ||
<artifactId>maven-antrun-plugin</artifactId> | ||
<version>3.0.0</version> | ||
<executions> | ||
<execution> | ||
<id>compile-native-code</id> | ||
<phase>generate-sources</phase> | ||
<goals> | ||
<goal>run</goal> | ||
</goals> | ||
<configuration> | ||
<target> | ||
<!-- Compile native code using nvcc --> | ||
<exec executable="/usr/local/cuda/bin/nvcc"> | ||
<arg value="-shared"/> | ||
<arg value="-o"/> | ||
<arg value="${project.build.directory}/libCuFileJNI.so"/> | ||
<arg value="-I/usr/local/cuda/include/"/> | ||
<arg value="-I/usr/lib/jvm/java-21-openjdk-amd64/include/"/> | ||
<arg value="-I/usr/lib/jvm/java-21-openjdk-amd64/include/linux"/> | ||
<arg value="${project.basedir}/src/main/native/src/CuFileJni.cpp"/> | ||
<arg value="--compiler-options"/> | ||
<arg value="-fPIC"/> | ||
<arg value="-lcufile"/> | ||
</exec> | ||
</target> | ||
</configuration> | ||
</execution> | ||
</executions> | ||
</plugin> | ||
</plugins> | ||
</build> | ||
</project> |
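The `compile-native-code` execution hard-codes the CUDA and JDK include paths, and the README asks users to adapt them by hand. The same `nvcc` argument list can instead be derived from the locations of the local toolchain. The sketch below is illustrative only: the class `NvccCommand` is hypothetical, reading `CUDA_HOME` from the environment is an assumed convention rather than something this PR defines, and using `java.home` as the JDK root assumes JDK 9 or later on Linux.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the PR): assembles the nvcc arguments
// used in the POM from a CUDA root and a JDK root.
final class NvccCommand {

    static List<String> build(String cudaHome, String javaHome, String source, String output) {
        List<String> cmd = new ArrayList<>();
        cmd.add(cudaHome + "/bin/nvcc");
        cmd.add("-shared");
        cmd.add("-o");
        cmd.add(output);
        cmd.add("-I" + cudaHome + "/include");
        cmd.add("-I" + javaHome + "/include");       // jni.h
        cmd.add("-I" + javaHome + "/include/linux"); // jni_md.h on Linux
        cmd.add(source);
        cmd.add("--compiler-options");
        cmd.add("-fPIC");
        cmd.add("-lcufile");
        return cmd;
    }

    public static void main(String[] args) {
        // Assumption: CUDA_HOME is set; otherwise fall back to the path used in the POM.
        String cudaHome = System.getenv().getOrDefault("CUDA_HOME", "/usr/local/cuda");
        // Assumption: the running JVM belongs to a full JDK (java.home is the JDK root on JDK 9+).
        String javaHome = System.getProperty("java.home");
        System.out.println(String.join(" ",
                build(cudaHome, javaHome, "src/main/native/src/CuFileJni.cpp", "libCuFileJNI.so")));
    }
}
```

The printed command can be compared with the one in the README to see which paths differ on the local machine.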
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
/* | ||
* Copyright (c) 2024, NVIDIA CORPORATION. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
package bindings.kvikio.cufile; | ||
|
||
public class CuFile { | ||
private static boolean initialized = false; | ||
private static CuFileDriver driver; | ||
|
||
static { | ||
initialize(); | ||
} | ||
|
||
static synchronized void initialize() { | ||
if (!initialized) { | ||
try { | ||
System.loadLibrary("CuFileJNI"); | ||
driver = new CuFileDriver(); | ||
Runtime.getRuntime().addShutdownHook(new Thread(() -> { | ||
driver.close(); | ||
})); | ||
initialized = true; | ||
} catch (Throwable t) { | ||
// Report the failure on stderr and keep the cause; libraryLoaded() stays false. | ||
System.err.println("could not load cufile jni library: " + t); | ||
} | ||
} | ||
} | ||
|
||
public static boolean libraryLoaded() { | ||
return initialized; | ||
} | ||
} |
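`CuFile` records whether the native library loaded and exposes the result through `libraryLoaded()`, so callers are expected to check that flag before using the bindings. The sketch below shows the same guard pattern in isolation so it can run without GDS installed. The class `NativeLoadGuard` and its methods are hypothetical and are not the PR's API. Unlike the class above, it keeps the failure cause so callers can report why loading failed.

```java
// Hypothetical, self-contained variant of the load guard (not the PR's API).
final class NativeLoadGuard {
    private static Throwable failure;

    // Attempts to load a native library by name; returns false instead of throwing.
    static synchronized boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            failure = null;
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            failure = e; // keep the cause for diagnostics
            return false;
        }
    }

    // Returns the cause of the most recent failed load, or null if the last load succeeded.
    static synchronized Throwable lastFailure() {
        return failure;
    }

    public static void main(String[] args) {
        if (tryLoad("CuFileJNI")) {
            System.out.println("CuFileJNI loaded");
        } else {
            System.err.println("CuFileJNI unavailable: " + lastFailure());
        }
    }
}
```

A caller that wants to degrade gracefully can branch on the boolean result, for example by falling back to ordinary file IO when the library is missing.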
There are no conda-java-build or conda-java-tests workflows. We use a single "custom job" workflow for Java builds/tests in cuDF. Copy from here: https://github.com/rapidsai/cudf/blob/afd3a4b4776adf738284c9f0b99e1fc2fcefeec8/.github/workflows/pr.yaml#L153-L163
Moved to use custom job