Merge pull request #1195 from NvTimLiu/release-tmp
Merge branch 'branch-23.06' into main [skip ci]
NvTimLiu authored Jun 8, 2023
2 parents acbfc50 + 17596ab commit 1ad41b4
Showing 33 changed files with 1,111 additions and 163 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/action-helper/Dockerfile
@@ -1,4 +1,4 @@
-# Copyright (c) 2022, NVIDIA CORPORATION.
+# Copyright (c) 2022-2023, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,6 +18,7 @@ WORKDIR /
 COPY python /python
 COPY entrypoint.sh .
 RUN chmod -R +x /python /entrypoint.sh
-RUN pip install requests
+# pin urllib3<2.0 for https://github.com/psf/requests/issues/6432
+RUN pip install requests "urllib3<2.0"
 
 ENTRYPOINT ["/entrypoint.sh"]
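A quick way to sanity-check the pin outside the image (illustrative only; assumes a throwaway Python 3 virtual environment, not part of this repo's scripts):

```bash
# Install with the same constraint and confirm the resolved urllib3 version.
python3 -m venv /tmp/pin-check && . /tmp/pin-check/bin/activate
pip install requests 'urllib3<2.0'
python -c 'import urllib3; print(urllib3.__version__)'   # expect a 1.26.x release
```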
6 changes: 3 additions & 3 deletions .github/workflows/auto-merge.yml
@@ -18,12 +18,12 @@ name: auto-merge HEAD to BASE
 on:
   pull_request_target:
     branches:
-      - branch-23.04
+      - branch-23.06
     types: [closed]
 
 env:
-  HEAD: branch-23.04
-  BASE: branch-23.06
+  HEAD: branch-23.06
+  BASE: branch-23.08
 
 jobs:
   auto-merge:
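For context, this workflow forward-ports the HEAD release branch into BASE whenever a pull request targeting HEAD closes. A rough manual equivalent is sketched below — a sketch only, assuming a standard 'origin' remote; the real job runs in GitHub Actions and may open an intermediate PR rather than pushing directly:

```bash
# Bring branch-23.06 (HEAD) into branch-23.08 (BASE) by hand.
git fetch origin branch-23.06 branch-23.08
git checkout -B branch-23.08 origin/branch-23.08
git merge origin/branch-23.06   # forward-port HEAD's history into BASE
git push origin branch-23.08
```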
2 changes: 1 addition & 1 deletion .gitmodules
@@ -1,4 +1,4 @@
 [submodule "thirdparty/cudf"]
 	path = thirdparty/cudf
 	url = https://github.com/rapidsai/cudf.git
-	branch = branch-23.04
+	branch = branch-23.06
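Existing clones do not pick up a submodule branch change automatically. A minimal resync sketch, assuming the default 'origin' remote (note `--remote` moves the submodule to the branch tip, whereas a plain `git submodule update` stays on the SHA pinned by the superproject):

```bash
# Re-read .gitmodules and update the cudf submodule to the new tracked branch.
git submodule sync thirdparty/cudf
git submodule update --init --remote thirdparty/cudf
```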
30 changes: 17 additions & 13 deletions CONTRIBUTING.md
@@ -35,6 +35,10 @@ Maven `package` goal can be used to build the RAPIDS Accelerator JNI jar. After
 build the RAPIDS Accelerator JNI jar will be in the `spark-rapids-jni/target/` directory.
 Be sure to select the jar with the CUDA classifier.
 
+When building spark-rapids-jni, the pom.xml in the submodule thirdparty/cudf is completely
+bypassed. For a detailed explanation please read
+[this](https://github.com/NVIDIA/spark-rapids-jni/issues/1084#issuecomment-1513471739).
+
 ### Building in the Docker Container
 
 The `build/build-in-docker` script will build the spark-rapids-jni artifact within a Docker
@@ -67,18 +71,18 @@ settings. If an explicit reconfigure of libcudf is needed (e.g.: when changing c
 The following build properties can be set on the Maven command-line (e.g.: `-DCPP_PARALLEL_LEVEL=4`)
 to control aspects of the build:
 
-|Property Name                       |Description                            |Default|
-|------------------------------------|---------------------------------------|-------|
-|`CPP_PARALLEL_LEVEL`                |Parallelism of the C++ builds          |10     |
-|`GPU_ARCHS`                         |CUDA architectures to target           |ALL    |
-|`CUDF_USE_PER_THREAD_DEFAULT_STREAM`|CUDA per-thread default stream         |ON     |
-|`RMM_LOGGING_LEVEL`                 |RMM logging control                    |OFF    |
-|`USE_GDS`                           |Compile with GPU Direct Storage support|OFF    |
-|`BUILD_TESTS`                       |Compile tests                          |OFF    |
-|`BUILD_BENCHMARKS`                  |Compile benchmarks                     |OFF    |
-|`libcudf.build.configure`           |Force libcudf build to configure       |false  |
-|`libcudf.clean.skip`                |Whether to skip cleaning libcudf build |true   |
-|`submodule.check.skip`              |Whether to skip checking git submodules|false  |
+|Property Name                       |Description                             | Default |
+|------------------------------------|----------------------------------------|---------|
+|`CPP_PARALLEL_LEVEL`                |Parallelism of the C++ builds           | 10      |
+|`GPU_ARCHS`                         |CUDA architectures to target            | RAPIDS  |
+|`CUDF_USE_PER_THREAD_DEFAULT_STREAM`|CUDA per-thread default stream          | ON      |
+|`RMM_LOGGING_LEVEL`                 |RMM logging control                     | OFF     |
+|`USE_GDS`                           |Compile with GPU Direct Storage support | OFF     |
+|`BUILD_TESTS`                       |Compile tests                           | OFF     |
+|`BUILD_BENCHMARKS`                  |Compile benchmarks                      | OFF     |
+|`libcudf.build.configure`           |Force libcudf build to configure        | false   |
+|`libcudf.clean.skip`                |Whether to skip cleaning libcudf build  | true    |
+|`submodule.check.skip`              |Whether to skip checking git submodules | false   |
 
 
 ### Local testing of cross-repo contributions cudf, spark-rapids-jni, and spark-rapids
@@ -144,7 +148,7 @@ $ ./build/build-in-docker install ...
 ```
 
 Now cd to ~/repos/NVIDIA/spark-rapids and build with one of the options from
-[spark-rapids instructions](https://github.com/NVIDIA/spark-rapids/blob/branch-23.04/CONTRIBUTING.md#building-from-source).
+[spark-rapids instructions](https://github.com/NVIDIA/spark-rapids/blob/branch-23.06/CONTRIBUTING.md#building-from-source).
 
 ```bash
 $ ./build/buildall
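The table above lists the properties that can be overridden on the Maven command line. An illustrative local invocation combining a few of them (values are examples only):

```bash
# Build with 4-way C++ parallelism, the RAPIDS-curated GPU architecture list,
# and tests compiled in; each -D flag mirrors a documented property.
mvn clean package -DCPP_PARALLEL_LEVEL=4 -DGPU_ARCHS=RAPIDS -DBUILD_TESTS=ON
```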
3 changes: 2 additions & 1 deletion ci/Dockerfile
@@ -29,7 +29,8 @@ RUN yum install -y devtoolset-${DEVTOOLSET_VERSION} rh-python38 epel-release
 RUN yum install -y zlib-devel maven tar wget patch ninja-build
 # require git 2.18+ to keep consistent submodule operations
 RUN yum -y install https://packages.endpointdev.com/rhel/7/os/x86_64/endpoint-repo.x86_64.rpm && yum install -y git
-RUN scl enable rh-python38 "pip install requests"
+# pin urllib3<2.0 for https://github.com/psf/requests/issues/6432
+RUN scl enable rh-python38 "pip install requests 'urllib3<2.0'"
 
 ## pre-create the CMAKE_INSTALL_PREFIX folder, set writable by any user for Jenkins
 RUN mkdir /usr/local/rapids && mkdir /rapids && chmod 777 /usr/local/rapids && chmod 777 /rapids
2 changes: 1 addition & 1 deletion ci/Jenkinsfile.premerge
@@ -151,7 +151,7 @@ pipeline {
             if (TEMP_IMAGE_BUILD) {
                 PREMERGE_TAG = "centos7-cuda11.8.0-blossom-dev-${BUILD_TAG}"
                 IMAGE_PREMERGE = "${ARTIFACTORY_NAME}/sw-spark-docker-local/plugin-jni:${PREMERGE_TAG}"
-                docker.build(IMAGE_PREMERGE, "-f ${PREMERGE_DOCKERFILE} -t $IMAGE_PREMERGE .")
+                docker.build(IMAGE_PREMERGE, "--network=host -f ${PREMERGE_DOCKERFILE} -t $IMAGE_PREMERGE .")
                 uploadDocker(IMAGE_PREMERGE)
             }
         }
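The premerge image is now built with the host network stack, which is often needed behind CI proxies or restricted DNS. At the CLI the change amounts to roughly the following (illustrative Dockerfile path and tag — `PREMERGE_DOCKERFILE` in the pipeline may point elsewhere):

```bash
# Standalone equivalent of the Jenkins docker.build call with host networking.
docker build --network=host -f ci/Dockerfile -t plugin-jni:premerge-dev .
```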
2 changes: 1 addition & 1 deletion ci/nightly-build.sh
@@ -30,5 +30,5 @@ ${MVN} clean package ${MVN_MIRROR} \
     -Psource-javadoc \
     -DCPP_PARALLEL_LEVEL=${PARALLEL_LEVEL} \
     -Dlibcudf.build.configure=true \
-    -DUSE_GDS=${USE_GDS} -Dtest=*,!CuFileTest,!CudaFatalTest \
+    -DUSE_GDS=${USE_GDS} -Dtest=*,!CuFileTest,!CudaFatalTest,!ColumnViewNonEmptyNullsTest \
     -DBUILD_TESTS=ON -Dcuda.version=$CUDA_VER
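The `-Dtest=...` argument is Maven Surefire's test filter: `*` selects every test class and a leading `!` excludes one, so the CI runs now also skip `ColumnViewNonEmptyNullsTest` (it gets a dedicated execution in `pom.xml`, shown further down). The same filter works in a one-off run (illustrative):

```bash
# Run the suite while excluding the same three classes; single quotes keep the
# shell from expanding '*' and from treating '!' as history expansion.
mvn test '-Dtest=*,!CuFileTest,!CudaFatalTest,!ColumnViewNonEmptyNullsTest'
```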
2 changes: 1 addition & 1 deletion ci/premerge-build.sh
@@ -26,5 +26,5 @@ PARALLEL_LEVEL=${PARALLEL_LEVEL:-4}
 ${MVN} verify ${MVN_MIRROR} \
     -DCPP_PARALLEL_LEVEL=${PARALLEL_LEVEL} \
     -Dlibcudf.build.configure=true \
-    -DUSE_GDS=ON -Dtest=*,!CuFileTest,!CudaFatalTest \
+    -DUSE_GDS=ON -Dtest=*,!CuFileTest,!CudaFatalTest,!ColumnViewNonEmptyNullsTest \
     -DBUILD_TESTS=ON
2 changes: 1 addition & 1 deletion ci/submodule-sync.sh
@@ -69,7 +69,7 @@ set +e
 ${MVN} verify ${MVN_MIRROR} \
     -DCPP_PARALLEL_LEVEL=${PARALLEL_LEVEL} \
     -Dlibcudf.build.configure=true \
-    -DUSE_GDS=ON -Dtest=*,!CuFileTest,!CudaFatalTest \
+    -DUSE_GDS=ON -Dtest=*,!CuFileTest,!CudaFatalTest,!ColumnViewNonEmptyNullsTest \
     -DBUILD_TESTS=ON
 verify_status=$?
 set -e
15 changes: 13 additions & 2 deletions pom.xml
@@ -21,7 +21,7 @@
 
   <groupId>com.nvidia</groupId>
   <artifactId>spark-rapids-jni</artifactId>
-  <version>23.04.0</version>
+  <version>23.06.0</version>
   <packaging>jar</packaging>
   <name>RAPIDS Accelerator JNI for Apache Spark</name>
   <description>
@@ -76,7 +76,7 @@
   <properties>
     <arrow.version>0.15.1</arrow.version>
     <CPP_PARALLEL_LEVEL>10</CPP_PARALLEL_LEVEL>
-    <GPU_ARCHS>ALL</GPU_ARCHS>
+    <GPU_ARCHS>RAPIDS</GPU_ARCHS>
     <CUDF_USE_PER_THREAD_DEFAULT_STREAM>ON</CUDF_USE_PER_THREAD_DEFAULT_STREAM>
     <RMM_LOGGING_LEVEL>OFF</RMM_LOGGING_LEVEL>
     <SPARK_RAPIDS_JNI_CXX_FLAGS/>
@@ -535,9 +535,20 @@
             <configuration>
               <excludes>
                 <exclude>**/CudaFatalTest.java</exclude>
+                <exclude>**/ColumnViewNonEmptyNullsTest.java</exclude>
               </excludes>
             </configuration>
           </execution>
+          <execution>
+            <id>non-empty-null-test</id>
+            <goals>
+              <goal>test</goal>
+            </goals>
+            <configuration>
+              <argLine>-da:ai.rapids.cudf.AssertEmptyNulls</argLine>
+              <test>ColumnViewNonEmptyNullsTest</test>
+            </configuration>
+          </execution>
           <execution>
             <id>fatal-cuda-test</id>
             <goals>
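The new `non-empty-null-test` execution re-runs `ColumnViewNonEmptyNullsTest` with Java assertions disabled for `ai.rapids.cudf.AssertEmptyNulls` only, so the test can construct the non-empty nulls that the assertion would otherwise reject. The JVM switches behave as sketched below (the main class is a placeholder, shown only to illustrate the flags):

```bash
# -ea enables assertions everywhere; -da:<class> switches them back off for
# a single class. com.example.Main is hypothetical, not a real entry point.
java -ea -da:ai.rapids.cudf.AssertEmptyNulls com.example.Main
```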
9 changes: 8 additions & 1 deletion src/main/cpp/CMakeLists.txt
@@ -32,7 +32,7 @@ rapids_cuda_init_architectures(SPARK_RAPIDS_JNI)
 
 project(
   SPARK_RAPIDS_JNI
-  VERSION 23.04.00
+  VERSION 23.06.00
   LANGUAGES C CXX CUDA
 )
 
@@ -91,6 +91,12 @@ include(cmake/Modules/ConfigureCUDA.cmake) # set other CUDA compilation flags
 # ##################################################################################################
 # * dependencies ----------------------------------------------------------------------------------
 
+# find libcu++
+include(${rapids-cmake-dir}/cpm/libcudacxx.cmake)
+
+# find thrust/cub
+include(${CUDF_DIR}/cpp/cmake/thirdparty/get_thrust.cmake)
+
 # JNI
 find_package(JNI REQUIRED)
 if(JNI_FOUND)
@@ -147,6 +153,7 @@ add_library(
   src/RowConversionJni.cpp
   src/SparkResourceAdaptorJni.cpp
   src/ZOrderJni.cpp
+  src/cast_decimal_to_string.cu
   src/cast_string.cu
   src/cast_string_to_float.cu
   src/decimal_utils.cu
33 changes: 26 additions & 7 deletions src/main/cpp/src/CastStringJni.cpp
@@ -46,8 +46,7 @@ constexpr char const* JNI_CAST_ERROR_CLASS = "com/nvidia/spark/rapids/jni/CastEx
 extern "C" {
 
 JNIEXPORT jlong JNICALL Java_com_nvidia_spark_rapids_jni_CastStrings_toInteger(
-  JNIEnv* env, jclass, jlong input_column, jboolean ansi_enabled, jboolean strip,
-  jint j_dtype)
+  JNIEnv* env, jclass, jlong input_column, jboolean ansi_enabled, jboolean strip, jint j_dtype)
 {
   JNI_NULL_CHECK(env, input_column, "input column is null", 0);
 
@@ -56,15 +55,19 @@ JNIEXPORT jlong JNICALL Java_com_nvidia_spark_rapids_jni_CastStrings_toInteger(
 
     cudf::strings_column_view scv{*reinterpret_cast<cudf::column_view const*>(input_column)};
     return cudf::jni::release_as_jlong(spark_rapids_jni::string_to_integer(
-      cudf::jni::make_data_type(j_dtype, 0), scv, ansi_enabled, strip,
-      cudf::get_default_stream()));
+      cudf::jni::make_data_type(j_dtype, 0), scv, ansi_enabled, strip, cudf::get_default_stream()));
   }
   CATCH_CAST_EXCEPTION(env, 0);
 }
 
-JNIEXPORT jlong JNICALL Java_com_nvidia_spark_rapids_jni_CastStrings_toDecimal(
-  JNIEnv* env, jclass, jlong input_column, jboolean ansi_enabled, jboolean strip,
-  jint precision, jint scale)
+JNIEXPORT jlong JNICALL
+Java_com_nvidia_spark_rapids_jni_CastStrings_toDecimal(JNIEnv* env,
+                                                       jclass,
+                                                       jlong input_column,
+                                                       jboolean ansi_enabled,
+                                                       jboolean strip,
+                                                       jint precision,
+                                                       jint scale)
 {
   JNI_NULL_CHECK(env, input_column, "input column is null", 0);
 
@@ -92,4 +95,20 @@ JNIEXPORT jlong JNICALL Java_com_nvidia_spark_rapids_jni_CastStrings_toFloat(
   }
   CATCH_CAST_EXCEPTION(env, 0);
 }
+
+JNIEXPORT jlong JNICALL Java_com_nvidia_spark_rapids_jni_CastStrings_fromDecimal(JNIEnv* env,
+                                                                                 jclass,
+                                                                                 jlong input_column)
+{
+  JNI_NULL_CHECK(env, input_column, "input column is null", 0);
+
+  try {
+    cudf::jni::auto_set_device(env);
+
+    cudf::column_view cv{*reinterpret_cast<cudf::column_view const*>(input_column)};
+    return cudf::jni::release_as_jlong(
+      spark_rapids_jni::decimal_to_non_ansi_string(cv, cudf::get_default_stream()));
+  }
+  CATCH_CAST_EXCEPTION(env, 0);
+}
 }
18 changes: 17 additions & 1 deletion src/main/cpp/src/DecimalUtilsJni.cpp
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2022, NVIDIA CORPORATION.
+ * Copyright (c) 2022-2023, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -58,6 +58,22 @@ JNIEXPORT jlongArray JNICALL Java_com_nvidia_spark_rapids_jni_DecimalUtils_divid
   CATCH_STD(env, 0);
 }
 
+JNIEXPORT jlongArray JNICALL Java_com_nvidia_spark_rapids_jni_DecimalUtils_remainder128(JNIEnv *env, jclass,
+                                                                                        jlong j_view_a,
+                                                                                        jlong j_view_b,
+                                                                                        jint j_remainder_scale) {
+  JNI_NULL_CHECK(env, j_view_a, "column is null", 0);
+  JNI_NULL_CHECK(env, j_view_b, "column is null", 0);
+  try {
+    cudf::jni::auto_set_device(env);
+    auto view_a = reinterpret_cast<cudf::column_view const *>(j_view_a);
+    auto view_b = reinterpret_cast<cudf::column_view const *>(j_view_b);
+    auto scale = static_cast<int>(j_remainder_scale);
+    return cudf::jni::convert_table_for_return(env, cudf::jni::remainder_decimal128(*view_a, *view_b, scale));
+  }
+  CATCH_STD(env, 0);
+}
+
 JNIEXPORT jlongArray JNICALL Java_com_nvidia_spark_rapids_jni_DecimalUtils_add128(JNIEnv *env, jclass,
                                                                                   jlong j_view_a,
                                                                                   jlong j_view_b,
34 changes: 14 additions & 20 deletions src/main/cpp/src/RowConversionJni.cpp
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2022, NVIDIA CORPORATION.
+ * Copyright (c) 2022-2023, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -14,15 +14,15 @@
  * limitations under the License.
  */
 
+#include "row_conversion.hpp"
+
 #include "cudf_jni_apis.hpp"
 #include "dtype_utils.hpp"
-#include "row_conversion.hpp"
 
 extern "C" {
 
 JNIEXPORT jlongArray JNICALL
-Java_com_nvidia_spark_rapids_jni_RowConversion_convertToRowsFixedWidthOptimized(JNIEnv *env, jclass, jlong input_table) {
+Java_com_nvidia_spark_rapids_jni_RowConversion_convertToRowsFixedWidthOptimized(JNIEnv *env, jclass,
+                                                                                jlong input_table) {
   JNI_NULL_CHECK(env, input_table, "input table is null", 0);
 
   try {
@@ -39,15 +39,15 @@ Java_com_nvidia_spark_rapids_jni_RowConversion_convertToRowsFixedWidthOptimized(
   CATCH_STD(env, 0);
 }
 
-JNIEXPORT jlongArray JNICALL
-Java_com_nvidia_spark_rapids_jni_RowConversion_convertToRows(JNIEnv *env, jclass, jlong input_table)
-{
+JNIEXPORT jlongArray JNICALL Java_com_nvidia_spark_rapids_jni_RowConversion_convertToRows(
+    JNIEnv *env, jclass, jlong input_table) {
   JNI_NULL_CHECK(env, input_table, "input table is null", 0);
 
   try {
     cudf::jni::auto_set_device(env);
     cudf::table_view const *n_input_table = reinterpret_cast<cudf::table_view const *>(input_table);
-    std::vector<std::unique_ptr<cudf::column>> cols = spark_rapids_jni::convert_to_rows(*n_input_table);
+    std::vector<std::unique_ptr<cudf::column>> cols =
+        spark_rapids_jni::convert_to_rows(*n_input_table);
     int const num_columns = cols.size();
     cudf::jni::native_jlongArray outcol_handles(env, num_columns);
     std::transform(cols.begin(), cols.end(), outcol_handles.begin(),
@@ -69,8 +69,7 @@ Java_com_nvidia_spark_rapids_jni_RowConversion_convertFromRowsFixedWidthOptimize
     cudf::jni::native_jintArray n_types(env, types);
     cudf::jni::native_jintArray n_scale(env, scale);
     if (n_types.size() != n_scale.size()) {
-      JNI_THROW_NEW(env, "java/lang/IllegalArgumentException", "types and scales must match size",
-                    NULL);
+      JNI_THROW_NEW(env, cudf::jni::ILLEGAL_ARG_CLASS, "types and scales must match size", NULL);
     }
     std::vector<cudf::data_type> types_vec;
     std::transform(n_types.begin(), n_types.end(), n_scale.begin(), std::back_inserter(types_vec),
@@ -82,12 +81,8 @@
   CATCH_STD(env, 0);
 }
 
-JNIEXPORT jlongArray JNICALL
-Java_com_nvidia_spark_rapids_jni_RowConversion_convertFromRows(JNIEnv *env, jclass,
-                                                               jlong input_column,
-                                                               jintArray types,
-                                                               jintArray scale)
-{
+JNIEXPORT jlongArray JNICALL Java_com_nvidia_spark_rapids_jni_RowConversion_convertFromRows(
+    JNIEnv *env, jclass, jlong input_column, jintArray types, jintArray scale) {
   JNI_NULL_CHECK(env, input_column, "input column is null", 0);
   JNI_NULL_CHECK(env, types, "types is null", 0);
 
@@ -97,16 +92,15 @@ Java_com_nvidia_spark_rapids_jni_RowConversion_convertFromRows(JNIEnv *env, jcla
     cudf::jni::native_jintArray n_types(env, types);
     cudf::jni::native_jintArray n_scale(env, scale);
     if (n_types.size() != n_scale.size()) {
-      JNI_THROW_NEW(env, "java/lang/IllegalArgumentException", "types and scales must match size",
-                    NULL);
+      JNI_THROW_NEW(env, cudf::jni::ILLEGAL_ARG_CLASS, "types and scales must match size", NULL);
    }
     std::vector<cudf::data_type> types_vec;
     std::transform(n_types.begin(), n_types.end(), n_scale.begin(), std::back_inserter(types_vec),
                    [](jint type, jint scale) { return cudf::jni::make_data_type(type, scale); });
-    std::unique_ptr<cudf::table> result = spark_rapids_jni::convert_from_rows(list_input, types_vec);
+    std::unique_ptr<cudf::table> result =
+        spark_rapids_jni::convert_from_rows(list_input, types_vec);
     return cudf::jni::convert_table_for_return(env, result);
   }
   CATCH_STD(env, 0);
 }
 
 }