From 56fecc2d1d50ccc916b4065c1ba7307a7eeb26b7 Mon Sep 17 00:00:00 2001
From: Dongjoon Hyun
Date: Sun, 3 Dec 2023 14:54:05 -0800
Subject: [PATCH] ORC-1536: Remove `hive-storage-api` link from `maven-javadoc-plugin`

This PR aims to remove `hive-storage-api` link from `maven-javadoc-plugin`. In addition, all document links are updated with the official Apache Hive library java doc.

This is reported here - https://github.com/apache/orc/pull/1663#issuecomment-1826214376

**BEFORE**
```
$ ./mvnw javadoc:javadoc -pl shims | grep ERROR
Using `mvn` from path: /opt/homebrew/bin/mvn
[ERROR] The given File link: /Users/dongjoon/APACHE/orc-merge/java/shims/../target/javadoc/api/hive-storage-api is not a dir.
[ERROR] Error fetching link: /Users/dongjoon/APACHE/orc-merge/java/shims/../target/javadoc/api/hive-storage-api. Ignored it.
```

**AFTER**
```
$ ./mvnw javadoc:javadoc -pl shims | grep ERROR
Using `mvn` from path: /opt/homebrew/bin/mvn
```

Manual tests.

Closes #1671 from dongjoon-hyun/ORC-1536.

Authored-by: Dongjoon Hyun
Signed-off-by: Dongjoon Hyun
(cherry picked from commit 0a710a7049f6c2dbb2adc2f18f7de9be59ea3f2c)
Signed-off-by: Dongjoon Hyun
---
 java/pom.xml | 5 ----
 site/_docs/core-java.md | 62 ++++++++++++++++++++---------------------
 site/_docs/mapred.md | 4 +--
 site/_docs/mapreduce.md | 4 +--
 4 files changed, 35 insertions(+), 40 deletions(-)

diff --git a/java/pom.xml b/java/pom.xml
index be4f9fb2de..88a60d665f 100644
--- a/java/pom.xml
+++ b/java/pom.xml
@@ -788,16 +788,11 @@
 https://hadoop.apache.org/docs/r${hadoop.version}/api
- https://orc.apache.org/api/hive-storage-api
 https://orc.apache.org/api/orc-core
 https://orc.apache.org/api/orc-mapreduce
 https://orc.apache.org/api/orc-tools
-
- https://orc.apache.org/api/hive-storage-api
- ${project.basedir}/../../site/api/hive-storage-api
-
 https://orc.apache.org/api/orc-core
 ${project.basedir}/../../site/api/orc-core
diff --git a/site/_docs/core-java.md b/site/_docs/core-java.md
index c4211a9e4c..3bb66f9747 100644
--- a/site/_docs/core-java.md
+++ b/site/_docs/core-java.md
@@ -11,10 +11,10 @@ read and write the data.
 ## Vectorized Row Batch
 Data is passed to ORC as instances of
-[VectorizedRowBatch](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.html)
+[VectorizedRowBatch](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.html)
 that contain the data for 1024 rows. The focus is on speed and accessing the data fields directly. `cols` is an array of
-[ColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/ColumnVector.html)
+[ColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.html)
 and `size` is the number of rows.
 ~~~ java
@@ -27,7 +27,7 @@ public class VectorizedRowBatch {
 }
 ~~~
-[ColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/ColumnVector.html)
+[ColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.html)
 is the parent type of the different kinds of columns and has some fields that are shared across all of the column types.
 In particular, the `noNulls` flag if there are no nulls in this column for this batch
@@ -58,26 +58,26 @@ The subtypes of ColumnVector are:
 | ORC Type | ColumnVector |
 | -------- | ------------- |
-| array | [ListColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/ListColumnVector.html) |
-| binary | [BytesColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
-| bigint | [LongColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
-| boolean | [LongColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
-| char | [BytesColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
-| date | [LongColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
-| decimal | [DecimalColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.html) |
-| double | [DoubleColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.html) |
-| float | [DoubleColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.html) |
-| int | [LongColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
-| map | [MapColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/MapColumnVector.html) |
-| smallint | [LongColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
-| string | [BytesColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
-| struct | [StructColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/StructColumnVector.html) |
-| timestamp | [TimestampColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/TimestampColumnVector.html) |
-| tinyint | [LongColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
-| uniontype | [UnionColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/UnionColumnVector.html) |
-| varchar | [BytesColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
-
-[LongColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) handles all of the integer types (boolean, bigint,
+| array | [ListColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/ListColumnVector.html) |
+| binary | [BytesColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
+| bigint | [LongColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
+| boolean | [LongColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
+| char | [BytesColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
+| date | [LongColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
+| decimal | [DecimalColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.html) |
+| double | [DoubleColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.html) |
+| float | [DoubleColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.html) |
+| int | [LongColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
+| map | [MapColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/MapColumnVector.html) |
+| smallint | [LongColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
+| string | [BytesColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
+| struct | [StructColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/StructColumnVector.html) |
+| timestamp | [TimestampColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/TimestampColumnVector.html) |
+| tinyint | [LongColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) |
+| uniontype | [UnionColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/UnionColumnVector.html) |
+| varchar | [BytesColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html) |
+
+[LongColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.html) handles all of the integer types (boolean, bigint,
 date, int, smallint, and tinyint). The data is represented as an array of longs where each value is sign-extended as necessary.
@@ -88,7 +88,7 @@ public class LongColumnVector extends ColumnVector {
 }
 ~~~
-[TimestampColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/TimestampColumnVector.html)
+[TimestampColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/TimestampColumnVector.html)
 handles timestamp values. The data is represented as an array of longs and an array of ints.
@@ -104,7 +104,7 @@ public class TimestampColumnVector extends ColumnVector {
 }
 ~~~
-[DoubleColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.html)
+[DoubleColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.html)
 handles all of the floating point types (double, and float). The data is represented as an array of doubles.
@@ -115,7 +115,7 @@ public class DoubleColumnVector extends ColumnVector {
 }
 ~~~
-[DecimalColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.html)
+[DecimalColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.html)
 handles decimal columns. The data is represented as an array of HiveDecimalWritable. Note that this implementation is not performant and will likely be replaced.
@@ -127,7 +127,7 @@ public class DecimalColumnVector extends ColumnVector {
 }
 ~~~
-[BytesColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html)
+[BytesColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.html)
 handles all of the binary types (binary, char, string, and varchar). The data is represented as a byte array, offset, and length. The byte arrays may or may not be shared between values.
@@ -141,7 +141,7 @@ public class BytesColumnVector extends ColumnVector {
 }
 ~~~
-[StructColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/StructColumnVector.html)
+[StructColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/StructColumnVector.html)
 handles the struct columns and represents the data as an array of `ColumnVector`. The value for row 5 consists of the fifth value from each of the `fields` values.
@@ -153,7 +153,7 @@ public class StructColumnVector extends ColumnVector {
 }
 ~~~
-[UnionColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/UnionColumnVector.html)
+[UnionColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/UnionColumnVector.html)
 handles the union columns and represents the data as an array of integers that pick the subtype and a `fields` array one per a subtype. Only the value of the `fields` that corresponds to
@@ -167,7 +167,7 @@ public class UnionColumnVector extends ColumnVector {
 }
 ~~~
-[ListColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/ListColumnVector.html)
+[ListColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/ListColumnVector.html)
 handles the array columns and represents the data as two arrays of integers for the offset and lengths and a `ColumnVector` for the children values.
@@ -187,7 +187,7 @@ public class ListColumnVector extends ColumnVector {
 }
 ~~~
-[MapColumnVector](/api/hive-storage-api/index.html?org/apache/hadoop/hive/ql/exec/vector/MapColumnVector.html)
+[MapColumnVector](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/ql/exec/vector/MapColumnVector.html)
 handles the map columns and represents the data as two arrays of integers for the offset and lengths and two `ColumnVector`s for the keys and values.
diff --git a/site/_docs/mapred.md b/site/_docs/mapred.md
index 137ec5c656..bf84a4c922 100644
--- a/site/_docs/mapred.md
+++ b/site/_docs/mapred.md
@@ -49,8 +49,8 @@ the key and a value based on the table below expanded recursively.
 | bigint | org.apache.hadoop.io.LongWritable |
 | boolean | org.apache.hadoop.io.BooleanWritable |
 | char | org.apache.hadoop.io.Text |
-| date | [org.apache.hadoop.hive.serde2.io.DateWritable](/api/hive-storage-api/index.html?org/apache/hadoop/hive/serde2/io/DateWritable.html) |
-| decimal | [org.apache.hadoop.hive.serde2.io.HiveDecimalWritable](/api/hive-storage-api/index.html?org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.html) |
+| date | [org.apache.hadoop.hive.serde2.io.DateWritable](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/serde2/io/DateWritable.html) |
+| decimal | [org.apache.hadoop.hive.serde2.io.HiveDecimalWritable](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.html) |
 | double | org.apache.hadoop.io.DoubleWritable |
 | float | org.apache.hadoop.io.FloatWritable |
 | int | org.apache.hadoop.io.IntWritable |
diff --git a/site/_docs/mapreduce.md b/site/_docs/mapreduce.md
index 2a88de6da4..66f52c0456 100644
--- a/site/_docs/mapreduce.md
+++ b/site/_docs/mapreduce.md
@@ -49,8 +49,8 @@ the key and a value based on the table below expanded recursively.
 | bigint | org.apache.hadoop.io.LongWritable |
 | boolean | org.apache.hadoop.io.BooleanWritable |
 | char | org.apache.hadoop.io.Text |
-| date | [org.apache.hadoop.hive.serde2.io.DateWritable](/api/hive-storage-api/index.html?org/apache/hadoop/hive/serde2/io/DateWritable.html) |
-| decimal | [org.apache.hadoop.hive.serde2.io.HiveDecimalWritable](/api/hive-storage-api/index.html?org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.html) |
+| date | [org.apache.hadoop.hive.serde2.io.DateWritable](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/serde2/io/DateWritable.html) |
+| decimal | [org.apache.hadoop.hive.serde2.io.HiveDecimalWritable](https://javadoc.io/static/org.apache.hive/hive-storage-api/2.8.1/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.html) |
 | double | org.apache.hadoop.io.DoubleWritable |
 | float | org.apache.hadoop.io.FloatWritable |
 | int | org.apache.hadoop.io.IntWritable |