refresh website design and document to v1.2.0 (#21)
weiting-chen authored Aug 18, 2024
1 parent 41f97f8 commit bffb8fc
Showing 54 changed files with 411 additions and 131 deletions.
4 changes: 2 additions & 2 deletions 404.html
@@ -20,6 +20,6 @@
<div class="container">
<h1>404</h1>

<p><strong>Page not found :(</strong></p>
<p>The requested page could not be found.</p>
<p><strong>Gluten Page not found :(</strong></p>
<p>The requested gluten website's page could not be found.</p>
</div>
4 changes: 2 additions & 2 deletions _config.yml
@@ -40,7 +40,7 @@ plugins:
- jekyll-readme-index # GitHub Pages
- jekyll-relative-links # GitHub Pages

logo: "/images/gluten-logo-blue.png"
logo: "/assets/images/gluten-logo-blue.png"

# Exclude from processing.
# The following items will not be processed, by default.
@@ -120,7 +120,7 @@ heading_anchors: true
# hide_icon: false # set to true to hide the external link icon - defaults to false
# opens_in_new_tab: false # set to true to open this link in a new tab - defaults to false

color_scheme: dark
color_scheme: light

# Footer content
# appears at the bottom of every page's main content
4 changes: 2 additions & 2 deletions archives/v1.1.1/developers/HowTo.md
@@ -117,7 +117,7 @@ gdb gluten_home/cpp/build/releases/libgluten.so 'core-Executor task l-2000883-16
Both Parquet and DWRF file formats are now supported; the related scripts and files are under the directory `gluten_home/backends-velox/workload/tpch`.
The file `README.md` under `gluten_home/backends-velox/workload/tpch` offers some useful help, but it is neither complete nor precise.

One way of run TPC-H test is to run velox-be by workflow, you can refer to [velox_be.yml](https://github.com/oap-project/gluten/blob/main/.github/workflows/velox_be.yml#L90)
One way of run TPC-H test is to run velox-be by workflow, you can refer to [velox_be.yml](https://github.com/apache/incubator-gluten/blob/branch-1.1.1/.github/workflows/velox_be.yml#L90)

Here will explain how to run TPC-H on Velox backend with the Parquet file format.
1. First step, prepare the datasets, you have two choices.
@@ -136,7 +136,7 @@ Here will explain how to run TPC-H on Velox backend with the Parquet file format
var gluten_root = "/home/gluten"
```
- Modify `gluten_home/backends-velox/workload/tpch/run_tpch/tpch_parquet.sh`.
- Set `GLUTEN_JAR` correctly. Please refer to the section of [Build Gluten with Velox Backend](../get-started/Velox.md/#2-build-gluten-with-velox-backend)
- Set `GLUTEN_JAR` correctly. Please refer to the section of [Build Gluten with Velox Backend](https://gluten.apache.org/archives/v1.1.1/docs/velox/getting-started/#build-gluten-with-velox-backend)
- Set `SPARK_HOME` correctly.
- Set the memory configurations appropriately.
- Execute `tpch_parquet.sh` using the below command.
8 changes: 4 additions & 4 deletions archives/v1.1.1/developers/MicroBenchmarks.md
@@ -258,15 +258,15 @@ done

### Run Examples

We also provide some example inputs in [cpp/velox/benchmarks/data](../../cpp/velox/benchmarks/data).
E.g. [generic_q5/q5_first_stage_0.json](../../cpp/velox/benchmarks/data/generic_q5/q5_first_stage_0.json) simulates a
We also provide some example inputs in [cpp/velox/benchmarks/data](https://github.com/apache/incubator-gluten/tree/branch-1.1.1/cpp/velox/benchmarks/data).
E.g. [generic_q5/q5_first_stage_0.json](https://github.com/apache/incubator-gluten/blob/branch-1.1.1/cpp/velox/benchmarks/data/generic_q5/q5_first_stage_0.json) simulates a
first-stage in TPCH Q5, which has the heaviest table scan. You can follow the steps below to run this example.

1. Open [generic_q5/q5_first_stage_0.json](../../cpp/velox/benchmarks/data/generic_q5/q5_first_stage_0_split.json) with
1. Open [generic_q5/q5_first_stage_0.json](https://github.com/apache/incubator-gluten/blob/branch-1.1.1/cpp/velox/benchmarks/data/generic_q5/q5_first_stage_0_split.json) with
file editor. Search for `"uriFile": "LINEITEM"` and replace `LINEITEM` with the URI to one partition file in
lineitem. In the next line, replace the number in `"length": "..."` with the actual file length. Suppose you are
using the provided small TPCH table
in [cpp/velox/benchmarks/data/tpch_sf10m](../../cpp/velox/benchmarks/data/tpch_sf10m), the replaced JSON should be
in [cpp/velox/benchmarks/data/tpch_sf10m](https://github.com/apache/incubator-gluten/tree/branch-1.1.1/cpp/velox/benchmarks/data/tpch_sf10m), the replaced JSON should be
like:

```
26 changes: 13 additions & 13 deletions archives/v1.1.1/docs/GettingStarted_Velox.md
@@ -379,7 +379,7 @@ The following steps demonstrate how to set up a UDF library project:

- The interface functions are mapped to macros in [Udf.h](../../cpp/velox/udf/Udf.h). Here's an example of how to implement these functions:

```
```cpp
// Filename MyUDF.cpp

#include <velox/expression/VectorFunction.h>
@@ -415,7 +415,7 @@ The following steps demonstrate how to set up a UDF library project:
## Building the UDF library
To build the UDF library, users need to compile the C++ code and link to `libvelox.so`. It's recommended to create a CMakeLists.txt for the project. Here's an example:
```
```cpp
project(myudf)
set(CMAKE_CXX_STANDARD 17)
@@ -465,16 +465,16 @@ You can also specify the local or HDFS URIs to the UDF libraries or archives. Lo
We provide a Velox UDF example file [MyUDF.cpp](../../cpp/velox/udf/examples/MyUDF.cpp). After building gluten cpp, you can find the example library at /path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so

Start spark-shell or spark-sql with below configuration
```
```shell
--files /path/to/gluten/cpp/build/velox/udf/examples/libmyudf.so
--conf spark.gluten.sql.columnar.backend.velox.udfLibraryPaths=libmyudf.so
```
Run query. The functions `myudf1` and `myudf2` increment the input value by a constant of 5
```
```sql
select myudf1(1), myudf2(100L)
```
The output from spark-shell will be like
```
```sql
+----------------+------------------+
|udfexpression(1)|udfexpression(100)|
+----------------+------------------+
@@ -629,14 +629,14 @@ There is 8 QAT acceleration device(s) in the system:

3. Extra Gluten configurations are required when starting Spark application

```
```shell
--conf spark.gluten.sql.columnar.shuffle.codec=gzip # Valid options are gzip and zstd
--conf spark.gluten.sql.columnar.shuffle.codecBackend=qat
```

4. You can use below command to check whether QAT is working normally at run-time. The value of fw_counters should continue to increase during shuffle.

```
```shell
while :; do cat /sys/kernel/debug/qat_4xxx_0000:6b:00.0/fw_counters; sleep 1; done
```

@@ -697,7 +697,7 @@ sudo ls -l /dev/iax
```

The output should be like:
```
```bash
total 0
crw-rw---- 1 root iaa 509, 0 Apr 5 18:54 wq1.0
crw-rw---- 1 root iaa 509, 5 Apr 5 18:54 wq11.0
@@ -711,7 +711,7 @@ crw-rw---- 1 root iaa 509, 4 Apr 5 18:54 wq9.0

2. Extra Gluten configurations are required when starting Spark application

```
```bash
--conf spark.gluten.sql.columnar.shuffle.codec=gzip
--conf spark.gluten.sql.columnar.shuffle.codecBackend=iaa
```
@@ -746,7 +746,7 @@ Some other versions of TPC-DS queries are also provided, but are **not** recomme
Submit test script from spark-shell. You can find the scala code to [Run TPC-H](../../tools/workload/tpch/run_tpch/tpch_parquet.scala) as an example. Please remember to modify
the location of TPC-H files as well as TPC-H queries before you run the testing.

```
```scala
var parquet_file_path = "/PATH/TO/TPCH_PARQUET_PATH"
var gluten_root = "/PATH/TO/GLUTEN"
```
@@ -777,7 +777,7 @@ Refer to [Gluten configuration](../Configuration.md) for more details.
## Result
*wholestagetransformer* indicates that the offload works.

![TPC-H Q6](../image/TPC-H_Q6_DAG.png)
![TPC-H Q6](/assets/images/TPC-H_Q6_DAG.png)

## Performance

@@ -811,7 +811,7 @@ Developers can register `SparkListener` to handle these two Gluten events.

Gluten provides a tab based on Spark UI, named `Gluten SQL / DataFrame`

![Gluten-UI](../image/gluten-ui.png)
![Gluten-UI](/assets/images/gluten-ui.png)

This tab contains two parts:

@@ -906,7 +906,7 @@ ashProbe: Input: 9 rows (864B, 3 batches), Output: 27 rows (3.56KB, 3 batches),

Gluten provides a helper class to get the fallback summary from a Spark Dataset.

```
```scala
import org.apache.spark.sql.execution.GlutenImplicits._
val df = spark.sql("SELECT * FROM t")
df.fallbackSummary
2 changes: 1 addition & 1 deletion asf.md
@@ -1,7 +1,7 @@
---
layout: page
title: Apache Software Foundation
nav_order: 7
nav_order: 8
permalink: /asf/
---

Binary file added assets/images/Gazelle-jni.png
Binary file added assets/images/TPC-H_Q6_DAG.png
Binary file added assets/images/TPCH-q5-first-stage.png
File renamed without changes
Binary file added assets/images/flow.png
File renamed without changes
9 changes: 9 additions & 0 deletions assets/images/gluten-logo.svg
Binary file added assets/images/gluten-ui.png
Binary file added assets/images/gluten.png
Binary file added assets/images/gluten_golden_file_upload.png
Binary file added assets/images/operators.png
Binary file added assets/images/overall_design.png
Binary file added assets/images/reproduce_natively.png
Binary file added assets/images/support.png
Binary file added assets/images/veloxbe_memory_layout.png
2 changes: 1 addition & 1 deletion contact-us.md
@@ -1,7 +1,7 @@
---
layout: page
title: Contact Us
nav_order: 7
nav_order: 9
---
# Contact Us

4 changes: 2 additions & 2 deletions contributing.md
@@ -1,10 +1,10 @@
---
layout: page
title: Contributing to Gluten
nav_order: 6
nav_order: 7
---

## How to become a committer
# How to become a committer

To initiate your contributions to Gluten, understand the contribution process—any individual can submit patches, documentation, and examples to the project.

18 changes: 9 additions & 9 deletions docs/developers/HowTo.md
@@ -44,14 +44,14 @@ To debug C++, you have to generate the example files, the example files consist
You can generate the example files by the following steps:

1. build Velox and Gluten CPP
```
```bash
gluten_home/dev/builddeps-veloxbe.sh --build_tests=ON --build_benchmarks=ON --build_type=Debug
```
- Compiling with `--build_type=Debug` is good for debugging.
- The executable file `generic_benchmark` will be generated under the directory of `gluten_home/cpp/build/velox/benchmarks/`.

2. build Gluten and generate the example files
```
```bash
cd gluten_home
mvn clean package -Pspark-3.2 -Pbackends-velox -Prss
mvn test -Pspark-3.2 -Pbackends-velox -Prss -pl backends-velox -am -DtagsToInclude="io.glutenproject.tags.GenerateExample" -Dtest=none -DfailIfNoTests=false -Darrow.version=11.0.0-gluten -Dexec.skip
@@ -72,7 +72,7 @@ gluten_home/backends-velox/generated-native-benchmark/
```

3. now, run benchmarks with GDB
```
```bash
cd gluten_home/cpp/build/velox/benchmarks/
gdb generic_benchmark
```
@@ -91,7 +91,7 @@ gdb generic_benchmark
will be used as default.
- You can also edit the file `example.json` to customize the Substrait plan or to specify input files placed in another directory.

6. get more detail information about benchmarks from [MicroBenchmarks](./MicroBenchmarks.md)
6. get more detail information about benchmarks from [MicroBenchmarks](https://gluten.apache.org/docs/developers/microbenchmarks/#generate-micro-benchmarks-for-velox-backend)

## 2 How to debug plan validation process
Gluten validates the generated plan before executing it. Validation usually happens on the native side, so we provide a utility to help debug the validation process there.
@@ -105,7 +105,7 @@ wait to add

## 4 How to debug with core-dump
wait to complete
```
```bash
cd the_directory_of_core_file_generated
gdb gluten_home/cpp/build/releases/libgluten.so 'core-Executor task l-2000883-1671542526'

@@ -117,7 +117,7 @@ gdb gluten_home/cpp/build/releases/libgluten.so 'core-Executor task l-2000883-16
Both Parquet and DWRF file formats are now supported; the related scripts and files are under the directory `gluten_home/backends-velox/workload/tpch`.
The file `README.md` under `gluten_home/backends-velox/workload/tpch` offers some useful help, but it is neither complete nor precise.

One way of run TPC-H test is to run velox-be by workflow, you can refer to [velox_be.yml](https://github.com/oap-project/gluten/blob/main/.github/workflows/velox_be.yml#L90)
One way of run TPC-H test is to run velox with docker by workflow, you can refer to [velox_docker.yml](https://github.com/apache/incubator-gluten/blob/main/.github/workflows/velox_docker.yml)

Here will explain how to run TPC-H on Velox backend with the Parquet file format.
1. First step, prepare the datasets, you have two choices.
@@ -128,15 +128,15 @@ Here will explain how to run TPC-H on Velox backend with the Parquet file format
2. Second step, run TPC-H on Velox backend testing.
- Modify `gluten_home/backends-velox/workload/tpch/run_tpch/tpch_parquet.scala`.
- set `var parquet_file_path` to correct directory. If using the small dataset directly in the step one, then modify it as below
```
```scala
var parquet_file_path = "gluten_home/backends-velox/src/test/resources/tpch-data-parquet-velox"
```
- set `var gluten_root` to correct directory. If `gluten_home` is the directory of `/home/gluten`, then modify it as below
```
```scala
var gluten_root = "/home/gluten"
```
- Modify `gluten_home/backends-velox/workload/tpch/run_tpch/tpch_parquet.sh`.
- Set `GLUTEN_JAR` correctly. Please refer to the section of [Build Gluten with Velox Backend](../get-started/Velox.md/#2-build-gluten-with-velox-backend)
- Set `GLUTEN_JAR` correctly. Please refer to the section of [Build Gluten with Velox Backend](http://gluten.apache.org/docs/getting-started/velox-backend/#build-gluten-with-velox-backend)
- Set `SPARK_HOME` correctly.
- Set the memory configurations appropriately.
- Execute `tpch_parquet.sh` using the below command.
14 changes: 8 additions & 6 deletions docs/developers/HowToRelease.md
@@ -3,6 +3,8 @@ layout: page
title: How To Release
nav_order: 10
parent: Developers
grand_parent: Documentations
permalink: /docs/developers/how-to-release/
---
# How to Release

@@ -42,7 +44,7 @@ All projects under the Apache umbrella must adhere to the [Apache Release Policy

3. Sign the release artifacts with the GPG key.

```
```bash
# create a GPG key; after executing this command, select the first option, RSA and RSA
$ gpg --full-generate-key

@@ -66,7 +68,7 @@ $ for i in *.tar.gz; do echo $i; gpg --local-user xxxx --armor --output $i.asc -

#### How to Generate checksums for the release artifacts.

```
```bash
# create the checksums
$ for i in *.tar.gz; do echo $i; sha512sum $i > $i.sha512 ; done
```
@@ -82,7 +84,7 @@ $ for i in *.tar.gz; do echo $i; sha512sum $i > $i.sha512 ; done
release-version format: apache-gluten-#.#.#-rc#

3. Upload the release artifacts to the SVN repository.
```
```bash
$ svn co https://dist.apache.org/repos/dist/dev/incubator/gluten/
$ cp /path/to/release/artifacts/* ./{release-version}/
$ svn add ./{release-version}/*
@@ -91,7 +93,7 @@ $ svn commit -m "add Apache Answer release artifacts for {release-version}"

4. After the upload, please visit the link `https://dist.apache.org/repos/dist/dev/incubator/gluten/{release-version}` to verify if the file upload is successful or not.
The uploaded release artifacts should include
```
```bash
* apache-gluten-#.#.#-incubating-src.tar.gz
* apache-gluten-#.#.#-incubating-src.tar.gz.asc
* apache-gluten-#.#.#-incubating-src.tar.gz.sha512
@@ -119,7 +121,7 @@ Please follow below steps to verify the release artifacts.

Please follow below steps to verify the signatures.

```
```bash
# download KEYS
$ curl https://dist.apache.org/repos/dist/release/incubator/gluten/KEYS > KEYS

@@ -144,7 +146,7 @@ $ for i in *.tar.gz; do echo $i; gpg --verify $i.asc $i ; done
#### How to Verify the checksums

Please follow below steps to verify the checksums
```
```bash
# verify the checksums
$ for i in *.tar.gz; do echo $i; sha512sum --check $i.sha512; done
```
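
The loop above prints the raw `sha512sum` output for each archive. When checking a release candidate, a single pass/fail result per file can be easier to read. The following is a minimal sketch, assuming GNU coreutils `sha512sum` and `.sha512` files generated from inside the artifact directory as in the generation step; the function name `verify_checksums` is hypothetical and not part of the Gluten repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the Gluten repo): verify every *.tar.gz in a
# directory against its companion .sha512 file and return non-zero on any failure.
verify_checksums() {
  local dir="${1:-.}" status=0 f name
  for f in "$dir"/*.tar.gz; do
    [ -e "$f" ] || continue            # the glob matched nothing
    name="$(basename "$f")"
    if [ ! -f "$f.sha512" ]; then
      echo "MISSING $name.sha512"
      status=1
      continue
    fi
    # The .sha512 file records a relative file name, so check from inside the directory.
    if (cd "$dir" && sha512sum --check --status "$name.sha512"); then
      echo "OK $name"
    else
      echo "FAILED $name"
      status=1
    fi
  done
  return "$status"
}
```

Calling `verify_checksums /path/to/release/artifacts` prints one line per archive; the exit status can then gate later steps of a verification script. Note that this sketch returns success when the directory contains no archives at all, which a stricter script may want to treat as an error.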