From 25718699944203c38516bf4bb84f1254335f2704 Mon Sep 17 00:00:00 2001 From: kassiez Date: Fri, 20 Dec 2024 16:46:50 +0800 Subject: [PATCH 1/5] fix-deadlink-1220 --- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../building-lakehouse/doris-iceberg.md | 164 ----- .../gettingStarted/what-is-apache-doris.md | 2 +- .../async-materialized-view/faq.md | 2 +- .../functions-and-demands.md | 2 +- .../async-materialized-view/overview.md | 2 +- .../async-materialized-view/use-guide.md | 20 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 8 +- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 12 +- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- 
.../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 12 +- .../gettingStarted/demo-block/latest.tsx | 2 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../async-materialized-view/faq.md | 2 +- .../functions-and-demands.md | 2 +- .../async-materialized-view/overview.md | 2 +- .../async-materialized-view/use-guide.md | 16 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 3 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 13 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 12 +- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../async-materialized-view/faq.md | 2 +- .../functions-and-demands.md | 2 +- .../async-materialized-view/overview.md | 2 +- .../async-materialized-view/use-guide.md | 20 +- .../version-3.0/releasenotes/all-release.md | 6 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 6 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 
24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 12 +- sidebars.json | 11 +- .../version-1.2/releasenotes/all-release.md | 88 +++ .../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 +++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v2.0/release-2.0.0.md | 236 +++++++ .../releasenotes/v2.0/release-2.0.1.md | 224 +++++++ .../releasenotes/v2.0/release-2.0.10.md | 59 ++ .../releasenotes/v2.0/release-2.0.11.md | 60 ++ .../releasenotes/v2.0/release-2.0.12.md | 58 ++ .../releasenotes/v2.0/release-2.0.13.md | 61 ++ .../releasenotes/v2.0/release-2.0.14.md | 59 ++ .../releasenotes/v2.0/release-2.0.15.md | 91 +++ .../releasenotes/v2.0/release-2.0.2.md | 157 +++++ .../releasenotes/v2.0/release-2.0.3.md | 253 ++++++++ .../releasenotes/v2.0/release-2.0.4.md | 67 ++ .../releasenotes/v2.0/release-2.0.5.md | 73 +++ .../releasenotes/v2.0/release-2.0.6.md | 59 ++ .../releasenotes/v2.0/release-2.0.7.md | 84 +++ .../releasenotes/v2.0/release-2.0.8.md | 76 +++ .../releasenotes/v2.0/release-2.0.9.md | 75 +++ .../releasenotes/v2.1/release-2.1.0.md | 159 +++++ .../releasenotes/v2.1/release-2.1.1.md | 251 ++++++++ .../releasenotes/v2.1/release-2.1.2.md | 110 ++++ .../releasenotes/v2.1/release-2.1.3.md | 191 ++++++ .../releasenotes/v2.1/release-2.1.4.md | 289 +++++++++ .../releasenotes/v2.1/release-2.1.5.md | 395 ++++++++++++ .../releasenotes/v2.1/release-2.1.6.md | 524 +++++++++++++++ .../releasenotes/v2.1/release-2.1.7.md | 180 ++++++ .../releasenotes/v3.0/release-3.0.0.md | 469 ++++++++++++++ .../releasenotes/v3.0/release-3.0.1.md | 604 ++++++++++++++++++ .../releasenotes/v3.0/release-3.0.2.md | 341 ++++++++++ .../releasenotes/v3.0/release-3.0.3.md | 226 +++++++ 
.../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 +++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v1.2/release-1.2.0.md | 563 ++++++++++++++++ .../releasenotes/v1.2/release-1.2.1.md | 196 ++++++ .../releasenotes/v1.2/release-1.2.2.md | 254 ++++++++ .../releasenotes/v1.2/release-1.2.3.md | 109 ++++ .../releasenotes/v1.2/release-1.2.4.md | 81 +++ .../releasenotes/v1.2/release-1.2.5.md | 199 ++++++ .../releasenotes/v1.2/release-1.2.6.md | 135 ++++ .../releasenotes/v1.2/release-1.2.7.md | 46 ++ .../releasenotes/v1.2/release-1.2.8.md | 47 ++ .../releasenotes/v2.1/release-2.1.0.md | 159 +++++ .../releasenotes/v2.1/release-2.1.1.md | 251 ++++++++ .../releasenotes/v2.1/release-2.1.2.md | 110 ++++ .../releasenotes/v2.1/release-2.1.3.md | 191 ++++++ .../releasenotes/v2.1/release-2.1.4.md | 289 +++++++++ .../releasenotes/v2.1/release-2.1.5.md | 395 ++++++++++++ .../releasenotes/v2.1/release-2.1.6.md | 524 +++++++++++++++ .../releasenotes/v2.1/release-2.1.7.md | 180 ++++++ .../releasenotes/v3.0/release-3.0.0.md | 469 ++++++++++++++ .../releasenotes/v3.0/release-3.0.1.md | 604 ++++++++++++++++++ .../releasenotes/v3.0/release-3.0.2.md | 341 ++++++++++ .../releasenotes/v3.0/release-3.0.3.md | 226 +++++++ .../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 +++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v1.2/release-1.2.0.md | 563 ++++++++++++++++ .../releasenotes/v1.2/release-1.2.1.md | 196 ++++++ .../releasenotes/v1.2/release-1.2.2.md | 254 ++++++++ .../releasenotes/v1.2/release-1.2.3.md | 109 ++++ .../releasenotes/v1.2/release-1.2.4.md | 81 +++ 
.../releasenotes/v1.2/release-1.2.5.md | 199 ++++++ .../releasenotes/v1.2/release-1.2.6.md | 135 ++++ .../releasenotes/v1.2/release-1.2.7.md | 46 ++ .../releasenotes/v1.2/release-1.2.8.md | 47 ++ .../releasenotes/v2.0/release-2.0.0.md | 236 +++++++ .../releasenotes/v2.0/release-2.0.1.md | 224 +++++++ .../releasenotes/v2.0/release-2.0.10.md | 59 ++ .../releasenotes/v2.0/release-2.0.11.md | 60 ++ .../releasenotes/v2.0/release-2.0.12.md | 58 ++ .../releasenotes/v2.0/release-2.0.13.md | 61 ++ .../releasenotes/v2.0/release-2.0.14.md | 59 ++ .../releasenotes/v2.0/release-2.0.15.md | 91 +++ .../releasenotes/v2.0/release-2.0.2.md | 157 +++++ .../releasenotes/v2.0/release-2.0.3.md | 253 ++++++++ .../releasenotes/v2.0/release-2.0.4.md | 67 ++ .../releasenotes/v2.0/release-2.0.5.md | 73 +++ .../releasenotes/v2.0/release-2.0.6.md | 59 ++ .../releasenotes/v2.0/release-2.0.7.md | 84 +++ .../releasenotes/v2.0/release-2.0.8.md | 76 +++ .../releasenotes/v2.0/release-2.0.9.md | 75 +++ .../releasenotes/v3.0/release-3.0.0.md | 469 ++++++++++++++ .../releasenotes/v3.0/release-3.0.1.md | 604 ++++++++++++++++++ .../releasenotes/v3.0/release-3.0.2.md | 341 ++++++++++ .../releasenotes/v3.0/release-3.0.3.md | 226 +++++++ .../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 +++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v1.2/release-1.2.0.md | 563 ++++++++++++++++ .../releasenotes/v1.2/release-1.2.1.md | 196 ++++++ .../releasenotes/v1.2/release-1.2.2.md | 254 ++++++++ .../releasenotes/v1.2/release-1.2.3.md | 109 ++++ .../releasenotes/v1.2/release-1.2.4.md | 81 +++ .../releasenotes/v1.2/release-1.2.5.md | 199 ++++++ .../releasenotes/v1.2/release-1.2.6.md | 135 ++++ .../releasenotes/v1.2/release-1.2.7.md | 46 ++ .../releasenotes/v1.2/release-1.2.8.md | 47 ++ 
.../releasenotes/v2.0/release-2.0.0.md | 236 +++++++ .../releasenotes/v2.0/release-2.0.1.md | 224 +++++++ .../releasenotes/v2.0/release-2.0.10.md | 59 ++ .../releasenotes/v2.0/release-2.0.11.md | 60 ++ .../releasenotes/v2.0/release-2.0.12.md | 58 ++ .../releasenotes/v2.0/release-2.0.13.md | 61 ++ .../releasenotes/v2.0/release-2.0.14.md | 59 ++ .../releasenotes/v2.0/release-2.0.15.md | 91 +++ .../releasenotes/v2.0/release-2.0.2.md | 157 +++++ .../releasenotes/v2.0/release-2.0.3.md | 253 ++++++++ .../releasenotes/v2.0/release-2.0.4.md | 67 ++ .../releasenotes/v2.0/release-2.0.5.md | 73 +++ .../releasenotes/v2.0/release-2.0.6.md | 59 ++ .../releasenotes/v2.0/release-2.0.7.md | 84 +++ .../releasenotes/v2.0/release-2.0.8.md | 76 +++ .../releasenotes/v2.0/release-2.0.9.md | 75 +++ .../releasenotes/v2.1/release-2.1.0.md | 159 +++++ .../releasenotes/v2.1/release-2.1.1.md | 251 ++++++++ .../releasenotes/v2.1/release-2.1.2.md | 110 ++++ .../releasenotes/v2.1/release-2.1.3.md | 191 ++++++ .../releasenotes/v2.1/release-2.1.4.md | 289 +++++++++ .../releasenotes/v2.1/release-2.1.5.md | 395 ++++++++++++ .../releasenotes/v2.1/release-2.1.6.md | 524 +++++++++++++++ .../releasenotes/v2.1/release-2.1.7.md | 180 ++++++ .../releasenotes/v3.0/release-3.0.3.md | 2 +- versioned_sidebars/version-1.2-sidebars.json | 134 +++- versioned_sidebars/version-2.0-sidebars.json | 143 ++++- versioned_sidebars/version-2.1-sidebars.json | 83 ++- versioned_sidebars/version-3.0-sidebars.json | 79 ++- 226 files changed, 25239 insertions(+), 676 deletions(-) create mode 100644 versioned_docs/version-1.2/releasenotes/all-release.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md create mode 100644 
versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md create mode 100644 
versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md create mode 100644 
versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md create mode 100644 versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md create mode 100644 versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md create mode 100644 versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md create mode 100644 
versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md create mode 100644 
versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md create mode 100644 
versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: 
cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: #fafafa; */ border: 0.3px solid #dcdcdc; diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() { */} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md index 3653581432726..3cc43ab17e47e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md @@ -304,167 +304,3 @@ mysql> SELECT * FROM iceberg.nyc.taxis FOR TIME AS OF "2024-07-29 03:40:22"; +-----------+---------+---------------+-------------+--------------------+----------------------------+ 4 rows in set (0.05 sec) ``` - -### 07 与 PyIceberg 交互 - -加载 Iceberg 表: - -```python -from pyiceberg.catalog import load_catalog - -catalog = load_catalog( - "iceberg", - **{ - "warehouse" = "warehouse", - "uri" = "http://rest:8181", - "s3.access-key-id" = "admin", - "s3.secret-access-key" = "password", - "s3.endpoint" = "http://minio:9000" - }, -) -table = catalog.load_table("nyc.taxis") -``` - -读取为 Arrow Table: - -```python -print(table.scan().to_arrow()) - -pyarrow.Table -vendor_id: int64 -trip_id: int64 -trip_distance: float -fare_amount: double -store_and_fwd_flag: large_string -ts: timestamp[us] ----- -vendor_id: [[1],[1],[2],[2]] -trip_id: [[1000371],[1000374],[1000373],[1000372]] -trip_distance: [[1.8],[8.4],[0.9],[2.5]] -fare_amount: [[15.32],[42.13],[9.01],[22.15]] -store_and_fwd_flag: [["N"],["Y"],["N"],["N"]] -ts: [[2024-01-01 09:15:23.000000],[2024-01-03 07:12:33.000000],[2024-01-01 03:25:15.000000],[2024-01-02 12:10:11.000000]] -``` - -读取为 Pandas DataFrame: - -```python -print(table.scan().to_pandas()) - -vendor_id trip_id trip_distance fare_amount store_and_fwd_flag ts -0 1 1000371 1.8 15.32 N 2024-01-01 09:15:23 -1 1 1000374 8.4 42.13 Y 2024-01-03 07:12:33 -2 2 1000373 0.9 9.01 N 2024-01-01 03:25:15 -3 2 1000372 2.5 22.15 N 2024-01-02 12:10:11 -``` 
- -读取为 Polars DataFrame: - -```python -import polars as pl - -print(pl.scan_iceberg(table).collect()) - -shape: (4, 6) -┌───────────┬─────────┬───────────────┬─────────────┬────────────────────┬─────────────────────┐ -│ vendor_id ┆ trip_id ┆ trip_distance ┆ fare_amount ┆ store_and_fwd_flag ┆ ts │ -│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │ -│ i64 ┆ i64 ┆ f32 ┆ f64 ┆ str ┆ datetime[μs] │ -╞═══════════╪═════════╪═══════════════╪═════════════╪════════════════════╪═════════════════════╡ -│ 1 ┆ 1000371 ┆ 1.8 ┆ 15.32 ┆ N ┆ 2024-01-01 09:15:23 │ -│ 1 ┆ 1000374 ┆ 8.4 ┆ 42.13 ┆ Y ┆ 2024-01-03 07:12:33 │ -│ 2 ┆ 1000373 ┆ 0.9 ┆ 9.01 ┆ N ┆ 2024-01-01 03:25:15 │ -│ 2 ┆ 1000372 ┆ 2.5 ┆ 22.15 ┆ N ┆ 2024-01-02 12:10:11 │ -└───────────┴─────────┴───────────────┴─────────────┴────────────────────┴─────────────────────┘ -``` - -> 通过 pyiceberg 写入 iceberg 数据,请参阅[步骤](#通过-pyiceberg-写入数据) - -### 08 附录 - -#### 通过 PyIceberg 写入数据 - -加载 Iceberg 表: - -```python -from pyiceberg.catalog import load_catalog - -catalog = load_catalog( - "iceberg", - **{ - "warehouse" = "warehouse", - "uri" = "http://rest:8181", - "s3.access-key-id" = "admin", - "s3.secret-access-key" = "password", - "s3.endpoint" = "http://minio:9000" - }, -) -table = catalog.load_table("nyc.taxis") -``` - -Arrow Table 写入 Iceberg: - -```python -import pyarrow as pa - -df = pa.Table.from_pydict( - { - "vendor_id": pa.array([1, 2, 2, 1], pa.int64()), - "trip_id": pa.array([1000371, 1000372, 1000373, 1000374], pa.int64()), - "trip_distance": pa.array([1.8, 2.5, 0.9, 8.4], pa.float32()), - "fare_amount": pa.array([15.32, 22.15, 9.01, 42.13], pa.float64()), - "store_and_fwd_flag": pa.array(["N", "N", "N", "Y"], pa.string()), - "ts": pa.compute.strptime( - ["2024-01-01 9:15:23", "2024-01-02 12:10:11", "2024-01-01 3:25:15", "2024-01-03 7:12:33"], - "%Y-%m-%d %H:%M:%S", - "us", - ), - } -) -table.append(df) -``` - -Pandas DataFrame 写入 Iceberg: - -```python -import pyarrow as pa -import pandas as pd - -df = pd.DataFrame( - { - "vendor_id": 
pd.Series([1, 2, 2, 1]).astype("int64[pyarrow]"), - "trip_id": pd.Series([1000371, 1000372, 1000373, 1000374]).astype("int64[pyarrow]"), - "trip_distance": pd.Series([1.8, 2.5, 0.9, 8.4]).astype("float32[pyarrow]"), - "fare_amount": pd.Series([15.32, 22.15, 9.01, 42.13]).astype("float64[pyarrow]"), - "store_and_fwd_flag": pd.Series(["N", "N", "N", "Y"]).astype("string[pyarrow]"), - "ts": pd.Series(["2024-01-01 9:15:23", "2024-01-02 12:10:11", "2024-01-01 3:25:15", "2024-01-03 7:12:33"]).astype("timestamp[us][pyarrow]"), - } -) -table.append(pa.Table.from_pandas(df)) -``` - -Polars DataFrame 写入 Iceberg: - -```python -import polars as pl - -df = pl.DataFrame( - { - "vendor_id": [1, 2, 2, 1], - "trip_id": [1000371, 1000372, 1000373, 1000374], - "trip_distance": [1.8, 2.5, 0.9, 8.4], - "fare_amount": [15.32, 22.15, 9.01, 42.13], - "store_and_fwd_flag": ["N", "N", "N", "Y"], - "ts": ["2024-01-01 9:15:23", "2024-01-02 12:10:11", "2024-01-01 3:25:15", "2024-01-03 7:12:33"], - }, - { - "vendor_id": pl.Int64, - "trip_id": pl.Int64, - "trip_distance": pl.Float32, - "fare_amount": pl.Float64, - "store_and_fwd_flag": pl.String, - "ts": pl.String, - }, -).with_columns(pl.col("ts").str.strptime(pl.Datetime, "%Y-%m-%d %H:%M:%S")) -table.append(df.to_arrow()) -``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md index 468d60e1b104c..809ad7b7b6e26 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md @@ -104,6 +104,6 @@ Apache Doris 也支持**强一致的物化视图**,物化视图的更新和选 ![Doris 查询引擎是向量化](/images/apache-doris-query-engine-2.png) -**Apache Doris 采用了自适应查询执行(Adaptive Query Execution)技术,** 可以根据 Runtime Statistics 来动态调整执行计划,比如通过 Runtime Filter 技术能够在运行时生成 Filter 推到 Probe 侧,并且能够将 Filter 自动穿透到 Probe 侧最底层的 Scan 节点,从而大幅减少 
Probe 的数据量,加速 Join 性能。Apache Doris 的 Runtime Filter 支持 In/Min/Max/Bloom Filter。 +**Apache Doris 采用了自适应查询执行(Adaptive Query Execution)技术,**可以根据 Runtime Statistics 来动态调整执行计划,比如通过 Runtime Filter 技术能够在运行时生成 Filter 推到 Probe 侧,并且能够将 Filter 自动穿透到 Probe 侧最底层的 Scan 节点,从而大幅减少 Probe 的数据量,加速 Join 性能。Apache Doris 的 Runtime Filter 支持 In/Min/Max/Bloom Filter。 在**优化器**方面,Apache Doris 使用 CBO 和 RBO 结合的优化策略,RBO 支持常量折叠、子查询改写、谓词下推等,CBO 支持 Join Reorder。目前 CBO 还在持续优化中,主要集中在更加精准的统计信息收集和推导,更加精准的代价模型预估等方面。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md index 69a7f275c0ea0..72068e39c7cec 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md @@ -1,6 +1,6 @@ --- { - "title": "常见问题", + "title": "异步物化视图常见问题", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md index b0d4b9cabecfb..b470ff4a83deb 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md @@ -1,6 +1,6 @@ --- { - "title": "功能描述", + "title": "异步物化视图功能描述", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md index 830ad751e2bd0..a140b4a871859 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md @@ -1,6 +1,6 @@ --- { - "title": "原理介绍", + "title": "异步物化视图原理介绍", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md index 382b9ad525f9a..0382278609121 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md @@ -1,6 +1,6 @@ --- { - "title": "使用与实践", + "title": "异步物化视图使用与实践", "language": "zh-CN" } --- @@ -26,9 +26,9 @@ under the License. ## 异步物化视图使用原则 -1. **时效性考虑:**异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 +1. **时效性考虑:** 异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 -2. **加速效果与一致性考虑:**在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 +2. **加速效果与一致性考虑:** 在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 3. **物化视图定义与构建成本考虑:** @@ -38,11 +38,11 @@ under the License. 需要注意: -1. **物化视图数量控制:**物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 +1. **物化视图数量控制:** 物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 -2. 
**定期检查物化视图使用状态:**如果未使用,应及时删除。 +2. **定期检查物化视图使用状态:** 如果未使用,应及时删除。 -3. **基表数据更新频率:**如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 +3. **基表数据更新频率:** 如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 ## 物化视图刷新方式选择原则 @@ -184,9 +184,9 @@ GROUP BY 通常物化视图会出现两种状态: -- **状态正常:**指的是当前物化视图是否可用于透明改写。 +- **状态正常:** 指的是当前物化视图是否可用于透明改写。 -- **不可用、状态不正常:**指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 +- **不可用、状态不正常:** 指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 ### 查看物化视图元数据 @@ -222,9 +222,9 @@ SyncWithBaseTables: 1 - 对于分区增量的物化视图,分区物化视图是否可用,是以分区粒度去看的。也就是说,即使物化视图的部分分区不可用,但只要查询的是有效分区,那么此物化视图依旧可用于透明改写。是否能透明改写,主要看查询所用分区的 `SyncWithBaseTables` 字段是否一致。如果 `SyncWithBaseTables` 是 1,此分区可用于透明改写;如果是 0,则不能用于透明改写。 -- **JobName:**物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 +- **JobName:** 物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 -- **State:**如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 +- **State:** 如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 - **SchemaChangeDetail:** 表示 SCHEMA_CHANGE 发生的原因。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. 
![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 @@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: 
https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note -参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 
内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 
支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. - **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. 
- **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. - 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. 
- 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. - 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. 
- 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. ### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 
文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..40919bb5e2054 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. :::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. 
:::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. 
- SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../../admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. - 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md index 99a49f6207103..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: #fafafa; */ border: 0.3px solid #dcdcdc; diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() {
*/} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 
@@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note 
-参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. 
Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. 
- **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. - **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. - 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. 
- 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. - 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. 
- 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. ### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 
文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. :::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. 
:::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. 
- SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../../admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. - 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: #fafafa; */ border: 0.3px solid #dcdcdc; diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() {
*/} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 
@@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note 
-参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. 
Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. 
- **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. - **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. - 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. 
- 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. - 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. 
- 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. ### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 
文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. :::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. 
:::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. 
- SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../../admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. - 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License.
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx index acaf64e6c44b3..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx @@ -38,7 +38,7 @@ export default function Latest() {
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
回放生成中
- +
版本发布
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md index c850659d3b047..905111aa22c7f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md @@ -1,6 +1,6 @@ --- { - "title": "常见问题", + "title": "异步物化视图常见问题", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md index 9c5b27d37c38d..aa80b6df129d2 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md @@ -1,6 +1,6 @@ --- { - "title": "功能描述", + "title": "异步物化视图功能描述", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md index 830ad751e2bd0..a140b4a871859 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md @@ -1,6 +1,6 @@ --- { - "title": "原理介绍", + "title": 
"异步物化视图原理介绍", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md index c9f6e28fe620e..49caa7caf98ed 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md @@ -1,6 +1,6 @@ --- { - "title": "使用与实践", + "title": "异步物化视图使用与实践", "language": "zh-CN" } --- @@ -26,9 +26,9 @@ under the License. ## 异步物化视图使用原则 -1. **时效性考虑:**异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 +1. **时效性考虑:** 异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 -2. **加速效果与一致性考虑:**在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 +2. **加速效果与一致性考虑:** 在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 3. **物化视图定义与构建成本考虑:** @@ -38,11 +38,11 @@ under the License. 需要注意: -1. **物化视图数量控制:**物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 +1. **物化视图数量控制:** 物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 -2. **定期检查物化视图使用状态:**如果未使用,应及时删除。 +2. **定期检查物化视图使用状态:** 如果未使用,应及时删除。 -3. **基表数据更新频率:**如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 +3. 
**基表数据更新频率:** 如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 ## 物化视图刷新方式选择原则 @@ -222,9 +222,9 @@ SyncWithBaseTables: 1 - 对于分区增量的物化视图,分区物化视图是否可用,是以分区粒度去看的。也就是说,即使物化视图的部分分区不可用,但只要查询的是有效分区,那么此物化视图依旧可用于透明改写。是否能透明改写,主要看查询所用分区的 `SyncWithBaseTables` 字段是否一致。如果 `SyncWithBaseTables` 是 1,此分区可用于透明改写;如果是 0,则不能用于透明改写。 -- **JobName:**物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 +- **JobName:** 物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 -- **State:**如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 +- **State:** 如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 - **SchemaChangeDetail:** 表示 SCHEMA_CHANGE 发生的原因。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md index 434677f520819..6eee28debcab2 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 
该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 @@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 
A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note -参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/job/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..2eebec69e5e3b 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,8 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) + **2. 
支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. - **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. 
- **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..738e6b4b0d326 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md @@ -24,6 +24,7 @@ specific language governing permissions and limitations under the License. --> + 亲爱的社区小伙伴们,**Apache Doris 2.1.6 版本已于 2024 年 9 月 10 日正式发布。**2.1.6 版本在湖仓一体、异步物化视图、半结构化数据管理持续升级改进,同时在查询优化器、执行引擎、存储管理、数据导入与导出以及权限管理等方面完成了若干修复。欢迎大家下载使用。 - 官网下载:https://doris.apache.org/download @@ -56,15 +57,15 @@ under the License. 
- 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +109,10 @@ under the License. - 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +120,7 @@ under the License. 
- 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. - 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. 
### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. 
:::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. :::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 
性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. - SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../../admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. 
- 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: #fafafa; */ border: 0.3px solid #dcdcdc; diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() {
*/} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md index c850659d3b047..905111aa22c7f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md @@ -1,6 +1,6 @@ --- { - "title": "常见问题", + "title": "异步物化视图常见问题", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md index 0f7f67f0cdf3c..68da377d09ab3 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md @@ -1,6 +1,6 @@ --- { - "title": "功能描述", + "title": "异步物化视图功能描述", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md index 830ad751e2bd0..a140b4a871859 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md @@ -1,6 +1,6 @@ --- { - "title": "原理介绍", + "title": 
"异步物化视图原理介绍", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md index c9f6e28fe620e..c40439a15c37a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md @@ -1,6 +1,6 @@ --- { - "title": "使用与实践", + "title": "异步物化视图使用与实践", "language": "zh-CN" } --- @@ -26,9 +26,9 @@ under the License. ## 异步物化视图使用原则 -1. **时效性考虑:**异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 +1. **时效性考虑:** 异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 -2. **加速效果与一致性考虑:**在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 +2. **加速效果与一致性考虑:** 在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 3. **物化视图定义与构建成本考虑:** @@ -38,11 +38,11 @@ under the License. 需要注意: -1. **物化视图数量控制:**物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 +1. **物化视图数量控制:** 物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 -2. **定期检查物化视图使用状态:**如果未使用,应及时删除。 +2. **定期检查物化视图使用状态:** 如果未使用,应及时删除。 -3. **基表数据更新频率:**如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 +3. 
**基表数据更新频率:** 如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 ## 物化视图刷新方式选择原则 @@ -184,9 +184,9 @@ GROUP BY 通常物化视图会出现两种状态: -- **状态正常:**指的是当前物化视图是否可用于透明改写。 +- **状态正常:** 指的是当前物化视图是否可用于透明改写。 -- **不可用、状态不正常:**指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 +- **不可用、状态不正常:** 指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 ### 查看物化视图元数据 @@ -222,9 +222,9 @@ SyncWithBaseTables: 1 - 对于分区增量的物化视图,分区物化视图是否可用,是以分区粒度去看的。也就是说,即使物化视图的部分分区不可用,但只要查询的是有效分区,那么此物化视图依旧可用于透明改写。是否能透明改写,主要看查询所用分区的 `SyncWithBaseTables` 字段是否一致。如果 `SyncWithBaseTables` 是 1,此分区可用于透明改写;如果是 0,则不能用于透明改写。 -- **JobName:**物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 +- **JobName:** 物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 -- **State:**如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 +- **State:** 如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 - **SchemaChangeDetail:** 表示 SCHEMA_CHANGE 发生的原因。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md index 2cb8a1e320631..14e9ae2cc665c 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md @@ -39,13 +39,13 @@ under the License.
-- [2024-12-02, Apache Doris 3.0.2 版本发布](../releasenotes/v3.0/release-3.0.2.md) +- [2024-12-02, Apache Doris 3.0.3 版本发布](../releasenotes/v3.0/release-3.0.3.md) -- [2024-11-10, Apache Doris 2.1.7 版本发布](../releasenotes/v2.1/release-2.1.7.md) +- [2024-11-10, Apache Doris 2.1.7 版本发布](../releasenotes/v2.1/release-2.1.7) - [2024-10-15, Apache Doris 3.0.2 版本发布](../releasenotes/v3.0/release-3.0.2.md) -- [2024-09-30, Apache Doris 2.0.15 版本发布](../releasenotes/v2.0/release-2.0.15.md) +- [2024-09-30, Apache Doris 2.0.15 版本发布](/releasenotes/v2.0/release-2.0.15.md) - [2024-09-10, Apache Doris 2.1.6 版本发布](../releasenotes/v2.1/release-2.1.6.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 
该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 @@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 
A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note -参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 
支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. - **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. 
- **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md index b463d42968326..d8b86761d77fc 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md @@ -56,7 +56,7 @@ under the License. 
- 会话变量 `read_csv_empty_line_as_null` 用于控制在读取 CSV 格式文件时,是否忽略空行。默认情况下忽略空行,当设置为 true 时,空行将被读取为所有列均为 Null 的行。[#37153](https://github.com/apache/doris/pull/37153) - - 更多信息,请参考[文档](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../lakehouse/datalake-analytics/hive?_highlight=compress_type)。 - 新增兼容 Presto 的复杂类型输出格式。通过设置 `set serde_dialect="presto"`,可以控制复杂类型的输出格式 与 Presto 一致,用于平滑迁移 Presto 业务。[#37253](https://github.com/apache/doris/pull/37253) @@ -131,7 +131,7 @@ under the License. - 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. 
- 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. - 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. 
- 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. - 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. 
### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. 
:::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. :::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 
性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. - SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../..//admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. 
- 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/sidebars.json b/sidebars.json index 0497b7d49365f..aa9ea56e057e8 100644 --- a/sidebars.json +++ b/sidebars.json @@ -117,10 +117,11 @@ "type": "category", "label": "Data Models", "items": [ - "admin-manual/data-admin/ccr/overview", - "admin-manual/data-admin/ccr/quickstart", - "admin-manual/data-admin/ccr/feature", - "admin-manual/data-admin/ccr/manual" + "table-design/data-model/overview", + "table-design/data-model/duplicate", + "table-design/data-model/unique", + "table-design/data-model/aggregate", + "table-design/data-model/tips" ] }, "table-design/row-store", @@ -1893,4 +1894,4 @@ ] } ] -} +} \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/all-release.md b/versioned_docs/version-1.2/releasenotes/all-release.md new file mode 100644 index 0000000000000..392e8e1a9562e --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/all-release.md @@ -0,0 +1,88 @@ +--- +{ + "title": "All Releases", + "language": "en" +} +--- + + + +This document presents a summary of Apache Doris versions released within one year, listed in reverse chronological order. + +:::tip Latest Release + +🎉 Version 3.0.3 released now. Check out the 🔗[Release Notes](../releasenotes/v3.0/release-3.0.3) here. Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. + +
+ +🎉 Version 2.1.7 is now released. Check out the 🔗[Release Notes](../releasenotes/v2.1/release-2.1.7) here. Version 2.1 delivers 100% higher out-of-the-box query performance, as shown by TPC-DS 1TB tests, data lake analytics that are 4-6 times faster than Trino and Spark, solid support for semi-structured data analysis with the new Variant type and a suite of analytical functions, asynchronous materialized views for query acceleration, optimized real-time writing at scale, and better workload management with improved stability and runtime SQL resource tracking. + +::: + + +
+ +- [2024-12-02, Apache Doris 3.0.3 is released](../releasenotes/v3.0/release-3.0.3.md) + +- [2024-11-10, Apache Doris 2.1.7 is released](../releasenotes/v2.1/release-2.1.7.md) + +- [2024-10-15, Apache Doris 3.0.2 is released](../releasenotes/v3.0/release-3.0.2.md) + +- [2024-09-30, Apache Doris 2.0.15 is released](../releasenotes/v2.0/release-2.0.15.md) + +- [2024-09-10, Apache Doris 2.1.6 is released](../releasenotes/v2.1/release-2.1.6.md) + +- [2024-08-23, Apache Doris 3.0.1 is released](../releasenotes/v3.0/release-3.0.1.md) + +- [2024-07-24, Apache Doris 2.1.5 is released](../releasenotes/v2.1/release-2.1.5.md) + +- [2024-07-17, Apache Doris 2.0.13 is released](../releasenotes/v2.0/release-2.0.13.md) + +- [2024-06-27, Apache Doris 2.0.12 is released](../releasenotes/v2.0/release-2.0.12.md) + +- [2024-06-26, Apache Doris 2.1.4 is released](../releasenotes/v2.1/release-2.1.4.md) + +- [2024-06-05, Apache Doris 2.0.11 is released](../releasenotes/v2.0/release-2.0.11.md) + +- [2024-05-21, Apache Doris 2.1.3 is released](../releasenotes/v2.1/release-2.1.3.md) + +- [2024-05-16, Apache Doris 2.0.10 is released](../releasenotes/v2.0/release-2.0.10.md) + +- [2024-04-23, Apache Doris 2.0.9 is released](../releasenotes/v2.0/release-2.0.9.md) + +- [2024-04-12, Apache Doris 2.1.2 is released](../releasenotes/v2.1/release-2.1.2.md) + +- [2024-04-09, Apache Doris 2.0.8 is released](../releasenotes/v2.0/release-2.0.8.md) + +- [2024-04-03, Apache Doris 2.1.1 is released](../releasenotes/v2.1/release-2.1.1.md) + +- [2024-03-26, Apache Doris 2.0.7 is released](../releasenotes/v2.0/release-2.0.7.md) + +- [2024-03-12, Apache Doris 2.1.0 is released](../releasenotes/v2.1/release-2.1.0.md) + +- [2024-03-11, Apache Doris 2.0.6 is released](../releasenotes/v2.0/release-2.0.6.md) + +- [2024-02-28, Apache Doris 2.0.5 is released](../releasenotes/v2.0/release-2.0.5.md) + +- [2024-01-26, Apache Doris 2.0.4 is released](../releasenotes/v2.0/release-2.0.4.md) + + + + diff --git 
a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we completed the full vectorization of the computing layer and the storage layer, and officially enabled the vectorized execution engine as a stable feature. All queries are executed by the vectorized execution engine by default, and performance is 3-5 times higher than in the previous version. This version adds the ability to access Apache Iceberg external tables and supports federated queries across data in Doris and Iceberg, expanding the analysis capabilities of Apache Doris on the data lake. In addition to the original LZ4, the ZSTD compression algorithm has been added, which further improves the data compression rate. Many performance and stability problems from previous versions have been fixed, greatly improving system stability. We recommend downloading and using this version. + +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users needed to enable it manually when executing queries by setting the session variables `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, we officially enabled the vectorized execution engine as a stable feature. The session variable `enable_vectorized_engine` is set to true by default. All queries are executed by default through the vectorized execution engine. + +### BE Binary File Renaming + +The BE binary file has been renamed from palo_be to doris_be. Please update the relevant scripts if you rely on process names for cluster management and other operations. 
+ +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while compatibility with the Segment V1 format is maintained. + +To keep the code structure maintainable and reduce the additional learning and development costs caused by redundant legacy code, we have decided to no longer support the Segment V1 storage format from the next version. This part of the code is expected to be deleted in Apache Doris 1.2. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website. + +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table, you can set `DISTRIBUTED BY random BUCKETS number` to use random distribution. With random distribution, data is randomly written to a single tablet during import, which reduces data fanout during loading, lowers resource overhead, and improves system stability. + +### Support for creating Iceberg external tables [experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. 
Through Iceberg external tables, federated queries on data stored in local storage and Iceberg can be implemented, which saves tedious data loading work, simplifies the system architecture for data analysis, and enables more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying data, and supports automatic synchronization of all table schemas in the Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. For some scenarios that are sensitive to data storage costs, the default compression ratio cannot meet requirements. + +In version 1.1, users can set `"compression"="zstd"` in the table properties to specify ZSTD as the compression method when creating a table. In a test on 25GB of text log data (110 million lines), the highest compression ratio is nearly 10 times, which is 53% higher than the original compression ratio, and the speed of reading data from disk and decompressing it is increased by 30%. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +- Vectorization of all built-in functions. + +- Vectorization of the storage layer, with dictionary optimization for low-cardinality string columns. + +- Optimization and resolution of numerous performance and stability issues in the vectorized engine. 
+ +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 queries in the SSB test dataset, version 1.1 outperforms version 0.15, with overall performance improved by about 3 times, which resolves the performance degradation seen in some scenarios in version 1.0. + +On all 22 queries in the TPC-H test dataset, version 1.1 outperforms version 0.15, with overall performance improved by about 4.5 times and performance in some scenarios improved by more than ten times. + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit generates a data version. In high-concurrency write scenarios, -235 errors are prone to occur because of too many data versions and untimely compaction, and query performance decreases accordingly. + +In version 1.1, we introduced QuickCompaction, which actively triggers compaction when the number of data versions increases. At the same time, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Through active triggering and passive scanning, the timeliness problem of data merging is completely solved. + +At the same time, for high-frequency cumulative compaction of small files, compaction tasks are scheduled and isolated to prevent heavyweight base compaction from affecting the merging of new data. + +Finally, the strategy for merging small files is optimized by adopting gradient merging. Each time, the files participating in a merge belong to the same order of magnitude in size, which prevents versions with large differences in size from being merged together. Files are merged gradually, level by level, reducing the number of times a single file participates in merging, which can greatly reduce the CPU consumption of the system. + +When the data upstream maintains a write frequency of 100,000 rows per second (20 concurrent write tasks, 5000 rows per job, and a checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable. 
Compared with the -235 errors that frequently occurred during high-concurrency writing in the previous version, compaction merge efficiency has improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The strategy has been optimized for small file compaction. In the above high-concurrency write scenario, CPU resource consumption is reduced by 25%. + +- Stable query latency: The overall orderliness of data is improved, and the fluctuation of query latency is greatly reduced. Query latency during high-concurrency writing is the same as when only querying, and query performance is improved by 3-4 times compared with the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting Arrow parameters, Arrow's multi-threaded read capability is used to speed up the reading of each row_group, and the reader is changed to an SPSC model so that prefetching reduces the cost of waiting for the network. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after the metadata checkpoint and retaining historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that the data cannot be queried due to the missing data version. (Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that the resource isolation is invalid for the resource usage limit of loading tasks (Moderate) + +In 1.1, broker load and routine load use Backends with the specified resource tags to perform the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +it could cause data transmission errors. 
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default. Please use the `/_stream_load` interface instead. +Of course, you can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format is no longer allowed to be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster. If it does, convert it to SegmentV2 with the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +In version 1.1, we limit the maximum length of the String type to 1MB. Strings longer than this can no longer be written. +At the same time, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Updated the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems during use, please feel free to contact us through the GitHub discussion forum or the Dev mailing list at any time. 
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature is available in the non-vectorized engine but was missing from the vectorized engine in 1.1, so we added it back in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +There is no memtracker in BE for the vectorized engine in 1.1, so memory usage is uncontrolled and can cause OOM. In 1.1.1, a simple memtracker is added to BE that can control memory usage and cancel a query when its memory limit is exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is compressed using bitshuffle, and decompressing it during queries takes a lot of time. In 1.1.1, Doris caches the decompressed bitshuffle-encoded data in the page cache to accelerate queries, and we found that this can reduce latency by 30% for some queries in ssb-flat. + +## Bug Fix + +### Fix the problem that rolling upgrade from 1.0 could not be performed. (Serious) + +This issue was introduced in version 1.1 and may cause BE to core dump when BE is upgraded but FE is not. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some queries do not fall back to the non-vectorized engine, causing BE to core dump. + +Currently, the vectorized engine cannot handle all SQL queries, and some queries (such as left outer join) run on the non-vectorized engine instead. Some cases were not covered in 1.1, which caused BE to crash. + +### Compaction does not work correctly and causes the -235 error. + +In unique key compaction of a rowset with multiple segments, segment rows are merged in generic_iterator but merged_rows is not increased. Compaction then fails in check_correctness, leaving the tablet with too many versions, which leads to the -235 load error. + +### Some segmentation fault cases during query. 
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, the Doris team has made more than 170 fixes and performance improvements since 1.1.1. This is a bugfix release for 1.1, and all users are encouraged to upgrade to it. + +# Features + +### New MemTracker + +Introduced a new, more accurate MemTracker for both the vectorized and non-vectorized engines. + +### Add API for showing current queries and killing a query + +### Support reading and writing UTF-16 emoji via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC file scan performance by about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when querying Iceberg tables. + +- Support querying Hive data created by [Apache Tez](https://tez.apache.org/) + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add block reuse in the non-vectorized engine, with a 50% performance improvement in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve like or regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +This brings a 40% performance gain in load or query. + +Currently it is a config; you can change it by setting the config `tc_enable_aggressive_memory_decommit`. 
+ +# Bug Fix + +### Some issues about FE that will cause FE failure or data corrupt. + +- Add reserved disk config to avoid too many reserved BDB-JE files. **(Serious)** In an HA environment, BDB-JE retains as many reserved files as it can, and its log files are not deleted until a disk limit is approached. + +- Fix a fatal bug in BDB-JE that could prevent an FE replica from starting correctly or corrupt data. **(Serious)** + +### Fe will hang on waitFor_rpc during query and BE will hang in high concurrent scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong result. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause BE core or in abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are concerned about the impact of this change on your business after upgrading, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* `disable_storage_page_cache`: Whether to disable Storage Page Cache. In version 1.1.2 and earlier, the default is false (the cache is on). In version 1.1.3, the default is true (the cache is off). +* `chunk_reserved_bytes_limit`: Reserved memory size of the Chunk Allocator. In version 1.1.2 and earlier, the default is 10% of the overall memory. In version 1.1.3, the default is 209715200 (200MB). + diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, the Doris team has made about 60 bug fixes and performance improvements since 1.1.3. This is a bugfix release on 1.1, and all users are encouraged to upgrade to it. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- Spark Load supports Parquet and ORC files. [#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load. [#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- The if function returns a wrong result when there is a nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates.
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crashes when calling function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name are not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column error when using MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader releases memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix the issue that allowed creating an MV using to_bitmap() on negative value columns when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after substitution using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump on case when with 1000 conditions. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that the last line of data is lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, the Doris team has made about 36 bug fixes and performance improvements since 1.1.4. This is a bugfix release on 1.1, and all users are encouraged to upgrade to it.
+ +# Behavior Changes + +When an alias has the same name as the original column, as in "select year(birthday) as birthday", and is used in a group by, order by, or having clause, Doris previously behaved differently from MySQL. In this release, Doris follows MySQL's behavior: the group by and having clauses resolve the name to the original column first, and order by resolves it to the alias first. Because this can be confusing, we recommend not using an alias that is the same as an original column name. + +# Features + +Add support of murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort the result by table name when calling a show clause. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix coredump when there is an if constant expr in select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update high_priority_flush_thread_num_per_store default value to 6 to improve the load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +Spark load throws an IndexOutOfBounds error when the partition column is not a duplicate key. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when having sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Use avg rowset to calculate batch size instead of total_bytes, since the latter costs a lot of CPU. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc.
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md new file mode 100644 index 0000000000000..61ba6c5c60890 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md @@ -0,0 +1,236 @@ +--- +{ + "title": "Release 2.0.0", + "language": "en" +} +--- + + + + +We are more than excited to announce that, after six months of coding, testing, and fine-tuning, Apache Doris 2.0.0 is now production-ready. Special thanks to the 275 committers who altogether contributed over 4100 optimizations and fixes to the project. + +This new version highlights: + +- 10 times faster data queries +- Enhanced log analytic and federated query capabilities +- More efficient data writing and updates +- Improved multi-tenant and resource isolation mechanisms +- Progresses in elastic scaling of resources and storage-compute separation +- Enterprise-facing features for higher usability + +> Download: https://doris.apache.org/download +> +> GitHub source code: https://github.com/apache/doris/releases/tag/2.0.0-rc04 + +## **A 10 Times Performance Increase** + +In SSB-Flat and TPC-H benchmarking, Apache Doris 2.0.0 delivered **over 10-time faster query performance** compared to an early version of Apache Doris. + +![](/images/release-note-2.0.0-1.png) + +This is realized by the introduction of a smarter query optimizer, inverted index, a parallel execution model, and a series of new functionalities to support high-concurrency point queries. + +### A smarter query optimizer + +The brand new query optimizer, Nereids, has a richer statistical base and adopts the Cascades framework. 
It is capable of self-tuning in most query scenarios and supports all 99 SQLs in TPC-DS, so users can expect high performance without any fine-tuning or SQL rewriting. + +TPC-H tests showed that Nereids, with no human intervention, outperformed the old query optimizer by a wide margin. Over 100 users have tried Apache Doris 2.0.0 in their production environment and the vast majority of them reported huge speedups in query execution. + +![](/images/release-note-2.0.0-2.png) + +**Doc**: https://doris.apache.org/docs/dev/query-acceleration/nereids/ + +Nereids is enabled by default in Apache Doris 2.0.0: `SET enable_nereids_planner=true`. Nereids collects statistical data by calling the Analyze command. + +### Inverted Index + +In Apache Doris 2.0.0, we introduced inverted index to better support fuzzy keyword search, equivalence queries, and range queries. + +A smartphone manufacturer tested Apache Doris 2.0.0 in their user behavior analysis scenarios. With inverted index enabled, v2.0.0 was able to finish the queries within milliseconds and maintain stable performance as the query concurrency level went up. In this case, it is 5 to 90 times faster than its old version. + +![](/images/release-note-2.0.0-3.png) + +### 20 times higher concurrency capability + +In scenarios like e-commerce order queries and express tracking, a huge number of end data users search for a certain data record simultaneously. These are what we call high-concurrency point queries, which can bring huge pressure on the system. A traditional solution is to introduce Key-Value stores like Apache HBase for such queries, and Redis as a cache layer to ease the burden, but that means redundant storage and higher maintenance costs. + +For a column-oriented DBMS like Apache Doris, the I/O usage of point queries will be multiplied. We need neater execution. 
Thus, on the basis of columnar storage, we added row storage format and row cache to increase row reading efficiency, short-circuit plans to speed up data retrieval, and prepared statements to reduce frontend overheads. + +After these optimizations, Apache Doris 2.0 reached a concurrency level of **30,000 QPS per node** on YCSB on a 16 Core 64G cloud server with 4×1T hard drives, representing an improvement of **20 times** compared to its older version. This makes Apache Doris a good alternative to HBase in high-concurrency scenarios, so that users don't need to endure extra maintenance costs and redundant storage brought by complicated tech stacks. + +Read more: https://doris.apache.org/blog/High_concurrency + +### A self-adaptive parallel execution model + +Apache Doris 2.0 brought in a Pipeline execution model for higher efficiency and stability in hybrid analytic workloads. In this model, the execution of queries is driven by data. The blocking operators in all query execution processes are split into pipelines. Whether a pipeline gets an execution thread depends on whether its relevant data is ready. This enables asynchronous blocking operations and more flexible system resource management. It also improves CPU efficiency, as the system doesn't have to create and destroy threads as often. + +Doc: https://doris.apache.org/docs/dev/query-acceleration/pipeline-execution-engine/ + +**How to enable the Pipeline execution model** + +- The Pipeline execution engine is enabled by default in Apache Doris 2.0: `Set enable_pipeline_engine = true`. +- `parallel_pipeline_task_num` represents the number of pipeline tasks that are executed in parallel in SQL queries. Its default value is `0`, which means Apache Doris will automatically set the concurrency level to half the number of CPUs in each backend node. Users can change this value as needed.
+- For those who are upgrading to Apache Doris 2.0 from an older version, it is recommended to set the value of `parallel_pipeline_task_num` to that of `parallel_fragment_exec_instance_num` in the old version. + +## A Unified Platform for Multiple Analytic Workloads + +Apache Doris has been pushing its boundaries. Starting as an OLAP engine for reporting, it is now a data warehouse capable of ETL/ELT and more. Version 2.0 is making advancements in its log analysis and data lakehousing capabilities. + +### A 10 times more cost-effective log analysis solution + +Apache Doris 2.0.0 provides native support for semi-structured data. In addition to JSON and Array, it now supports a complex data type: Map. Based on Light Schema Change, it also supports Schema Evolution, which means you can adjust the schema as your business changes. You can add or delete fields and indexes, and change the data types for fields. With the inverted index and a high-performance text analysis algorithm, it can execute full-text search and dimensional analysis of logs more efficiently. With faster data writing and query speed and lower storage cost, it is 10 times more cost-effective than common log analytic solutions in the industry. + +![](/images/release-note-2.0.0-4.png) + +### Enhanced data lakehousing capabilities + +In Apache Doris 1.2, we introduced Multi-Catalog to allow for auto-mapping and auto-synchronization of data from heterogeneous sources. In version 2.0.0, we extended the list of supported data sources and optimized Doris based on users' needs in production environments. + +![](/images/release-note-2.0.0-5.png) + +Apache Doris 2.0.0 supports dozens of data sources including Hive, Hudi, Iceberg, Paimon, MaxCompute, Elasticsearch, Trino, ClickHouse, and almost all open lakehouse formats. It also supports snapshot queries on Hudi Copy-on-Write tables and read-optimized queries on Hudi Merge-on-Read tables.
It allows for authorization of Hive Catalog using Apache Ranger, so users can reuse their existing privilege control system. It also supports extensible authorization plug-ins to enable user-defined authorization methods for any catalog. + +TPC-H benchmark tests showed that Apache Doris 2.0.0 is 3 to 5 times faster than Presto/Trino in queries on Hive tables. This is realized by all-around optimizations (in small file reading, flat table reading, local file cache, ORC/Parquet file reading, Compute Nodes, and information collection of external tables) finished in this development cycle, together with the distributed execution framework, vectorized execution engine, and query optimizer of Apache Doris. + +![](/images/release-note-2.0.0-6.png) + +All this gives Apache Doris 2.0.0 an edge in data lakehousing scenarios. With Doris, you can do incremental or full synchronization of multiple upstream data sources in one place, and expect much higher data query performance than other query engines. The processed data can be written back to the sources or provided for downstream systems. In this way, you can make Apache Doris your unified data analytic gateway. + +## Efficient Data Update + +Data update is important in real-time analysis, since users want to always have access to the latest data and to update data flexibly, such as updating a row or just a few columns, batch updating or deleting specified data, or even overwriting a whole data partition. + +Efficient data updating has been another hill to climb in data analysis. Apache Hive only supports updates on the partition level, while Hudi and Iceberg are better suited to low-frequency batch updates than to real-time updates due to their Merge-on-Read and Copy-on-Write implementations.
+ +As for data updating, Apache Doris 2.0.0 is capable of: + +- **Faster data writing**: In the pressure tests with an online payment platform, under 20 concurrent data writing tasks, Doris reached a writing throughput of 300,000 records per second and maintained stability throughout a continuous writing process of more than 10 hours. +- **Partial column update**: Older versions of Doris implement partial column update by `replace_if_not_null` in the Aggregate Key model. In 2.0.0, we enable partial column updates in the Unique Key model. That means you can directly write data from multiple source tables into a flat table, without having to concatenate them into one output stream using Flink before writing. This method avoids a complicated processing pipeline and the extra resource consumption. You can simply specify the columns you need to update. +- **Conditional update and deletion**: In addition to the simple Update and Delete operations, we implement complex conditional update and delete operations on the basis of Merge-on-Write. + +## Faster, Stabler, and Smarter Data Writing + +### Higher speed in data writing + +As part of our continuing effort to strengthen the real-time analytic capability of Apache Doris, we have improved the end-to-end real-time data writing capability of version 2.0.0.
Benchmark tests reported higher throughput in various writing methods: + +- Stream Load, TPC-H 144G lineitem table, 48-bucket Duplicate table, triple-replica writing: throughput increased by 100% +- Stream Load, TPC-H 144G lineitem table, 48-bucket Unique Key table, triple-replica writing: throughput increased by 200% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Duplicate table: throughput increased by 50% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Unique Key table: throughput increased by 150% + +### Greater stability in high-concurrency data writing + +The sources of system instability often include small file merging, write amplification, and the resulting disk I/O and CPU overheads. Hence, we introduced Vertical Compaction and Segment Compaction in version 2.0.0 to eliminate OOM errors in compaction and avoid the generation of too many segment files during data writing. After such improvements, Apache Doris can write data 50% faster while **using only 10% of the memory that it previously used**. + +Read more: https://doris.apache.org/blog/Compaction + +### Auto-synchronization of table schema + +The latest Flink-Doris-Connector allows users to synchronize an entire database (such as MySQL and Oracle) to Apache Doris in one simple step. According to our test results, a single synchronization task can support the real-time concurrent writing of thousands of tables. Users no longer need to go through a complicated synchronization procedure because Apache Doris has automated the process. Changes in the upstream data schema will be automatically captured and dynamically updated to Apache Doris in a seamless manner. + +Read more: https://doris.apache.org/blog/FDC + +## A New Multi-Tenant Resource Isolation Solution + +The purpose of multi-tenant resource isolation is to avoid resource preemption in the case of heavy loads.
For that sake, older versions of Apache Doris adopted a hard isolation plan featured by Resource Group: Backend nodes of the same Doris cluster would be tagged, and those of the same tag formed a Resource Group. As data was ingested into the database, different data replicas would be written into different Resource Groups, which will be responsible for different workloads. For example, data reading and writing will be conducted on different data tablets, so as to realize read-write separation. Similarly, you can also put online and offline business on different Resource Groups. + +![](/images/release-note-2.0.0-7.png) + +This is an effective solution, but in practice, it happens that some Resource Groups are heavily occupied while others are idle. We want a more flexible way to reduce vacancy rate of resources. Thus, in 2.0.0, we introduce Workload Group resource soft limit. + +![](/images/release-note-2.0.0-8.png) + +The idea is to divide workloads into groups to allow for flexible management of CPU and memory resources. Apache Doris associates a query with a Workload Group, and limits the percentage of CPU and memory that a single query can use on a backend node. The memory soft limit can be configured and enabled by the user. + +When there is a cluster resource shortage, the system will kill the largest memory-consuming query tasks; when there are sufficient cluster resources, once a Workload Group uses more resources than expected, the idle cluster resources will be shared among all the Workload Groups to give full play to the system memory and ensure stable execution of queries. You can also prioritize the Workload Groups in terms of resource allocation. In other words, you can decide which tasks can be assigned with adequate resources and which not. + +Meanwhile, we introduced Query Queue in 2.0.0. Upon Workload Group creation, you can set a maximum query number for a query queue. Queries beyond that limit will wait for execution in the queue. 
This is to reduce system burden under heavy workloads. + +## Elastic Scaling and Storage-Compute Separation + +When it comes to computation and storage resources, what do users want? + +- **Elastic scaling of computation resources**: Scale up resources quickly in peak times to increase efficiency and scale down in valley times to reduce costs. +- **Lower storage costs**: Use low-cost storage media and separate storage from computation. +- **Separation of workloads**: Isolate the computation resources of different workloads to avoid preemption. +- **Unified management of data**: Simply manage catalogs and data in one place. + +To separate storage and computation is a way to realize elastic scaling of resources, but it demands more efforts in maintaining storage stability, which determines the stability and continuity of OLAP services. To ensure storage stability, we introduced mechanisms including cache management, computation resource management, and garbage collection. + + In this respect, we divide our users into three groups after investigation: + +1. Users with no need for resource scaling +2. Users requiring resource scaling, low storage costs, and workload separation from Apache Doris +3. Users who already have a stable large-scale storage system and thus require an advanced compute-storage-separated architecture for efficient resource scaling + +Apache Doris 2.0 provides two solutions to address the needs of the first two types of users. + +1. **Compute nodes**. We introduced stateless compute nodes in version 2.0. Unlike the mix nodes, the compute nodes do not save any data and are not involved in workload balancing of data tablets during cluster scaling. Thus, they are able to quickly join the cluster and share the computing pressure during peak times. 
In addition, in data lakehouse analysis, these nodes will be the first ones to execute queries on remote storage (HDFS/S3) so there will be no resource competition between internal tables and external tables. + 1. Doc: https://doris.apache.org/docs/dev/advanced/compute_node/ +2. **Hot-cold data separation**. Hot/cold data refers to data that is frequently/seldom accessed, respectively. Generally, it makes more sense to store cold data in low-cost storage. Older versions of Apache Doris supported lifecycle management of table partitions: As hot data cooled down, it would be moved from SSD to HDD. However, data was stored with multiple replicas on HDD, which was still a waste. Now, in Apache Doris 2.0, cold data can be stored in object storage, which is even cheaper and allows single-copy storage. That reduces the storage costs by 70% and cuts down the computation and network overheads that come with storage. + 1. Read more: https://doris.apache.org/blog/HCDS/ + +For a cleaner separation of computation and storage, the VeloDB team is going to contribute the Cloud Compute-Storage-Separation solution to the Apache Doris project. Its performance and stability have stood the test of hundreds of companies in their production environments. The merging of code will be finished by October this year, and all Apache Doris users will be able to get an early taste of it in September. + +## Enhanced Usability + +Apache Doris 2.0.0 also highlights some enterprise-facing functionalities. + +### Support for Kubernetes Deployment + +Older versions of Apache Doris communicated based on IP, so any host failure in a Kubernetes deployment that caused a Pod IP drift would lead to cluster unavailability. Now, version 2.0 supports FQDN. That means the failed Doris nodes can recover automatically without human intervention, which lays the foundation for Kubernetes deployment and elastic scaling.
+ +### Support for Cross-Cluster Replication (CCR) + +Apache Doris 2.0.0 supports cross-cluster replication (CCR). Data changes at the database/table level in the source cluster will be synchronized to the target cluster. You can choose to replicate the incremental data or the full data. + +It also supports synchronization of DDL, which means DDL statements executed by the source cluster can also be automatically replicated to the target cluster. + +It is simple to configure and use CCR in Doris. Leveraging this functionality, you can implement read-write separation and multi-datacenter replication. + +This feature enables higher data availability, read/write workload separation, and more efficient cross-data-center replication. + +## Behavior Change + +- Use rolling upgrade from 1.2-LTS to 2.0.0, and restart upgrade from preview versions of 2.0 to 2.0.0; +- The new query optimizer (Nereids) is enabled by default: `enable_nereids_planner=true`; +- All non-vectorized code has been removed from the system, so the `enable_vectorized_engine` parameter no longer works; +- A new parameter `enable_single_replica_compaction` has been added; +- datev2, datetimev2, and decimalv3 are the default data types in table creation; datev1, datetimev1, and decimalv2 are not supported in table creation; +- decimalv3 is the default data type for JDBC and Iceberg Catalog; +- A new data type `AGG_STATE` has been added; +- The cluster column has been removed from backend tables; +- For better compatibility with BI tools, datev2 and datetimev2 are displayed as date and datetime in the output of `show create table`; +- max_openfiles and swaps checks are added to the backend startup script so inappropriate system configuration might lead to backend failure; +- Password-free login is not allowed when accessing frontend on localhost; +- If there is a Multi-Catalog in the system, by default, only data of the internal catalog will be displayed when querying information schema; +- A limit has been
imposed on the depth of the expression tree. The default value is 200; +- The single quotes in the return value of array string have been changed to double quotes; +- The Doris processes are renamed to DorisFE and DorisBE. +- The behavior of the AES and SM4 functions with two arguments has changed. For more information, see the [related function docs](../../sql-manual/sql-functions/encrypt-digest-functions/sm4-encrypt.md) + +## Embarking on the 2.0.0 Journey + +To make Apache Doris 2.0.0 production-ready, we invited hundreds of enterprise users to engage in the testing and optimized it for better performance, stability, and usability. In the next phase, we will continue responding to user needs with agile release planning. We plan to launch 2.0.1 in late August and 2.0.2 in September, as we keep fixing bugs and adding new features. We also plan to release an early version of 2.1 in September to bring a few long-requested capabilities to you. For example, in Doris 2.1, the Variant data type will better serve the schema-free analytic needs of semi-structured data; multi-table materialized views will simplify the data scheduling and processing pipeline while speeding up queries; more streamlined data ingestion methods will be added, and nested composite data types will be supported. + +If you have any questions or ideas when investigating, testing, and deploying Apache Doris, please find us on [Slack](https://t.co/ZxJuNJHXb2). Our developers will be happy to hear them and provide targeted support. + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md new file mode 100644 index 0000000000000..d8c19fb67525b --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md @@ -0,0 +1,224 @@ +--- +{ + "title": "Release 2.0.1", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, 383 improvements and bug fixes have been made in Doris 2.0.1.
+ +## Behavior Changes + +- [https://github.com/apache/doris/pull/21302](https://github.com/apache/doris/pull/21302) + +## Improvements + +### functionality and stability of array and map datatypes +- [https://github.com/apache/doris/pull/22793](https://github.com/apache/doris/pull/22793) +- [https://github.com/apache/doris/pull/22927](https://github.com/apache/doris/pull/22927) +- https://github.com/apache/doris/pull/22738 +- https://github.com/apache/doris/pull/22347 +- https://github.com/apache/doris/pull/23250 +- https://github.com/apache/doris/pull/22300 + +### performance for inverted index query +- https://github.com/apache/doris/pull/22836 +- https://github.com/apache/doris/pull/23381 +- https://github.com/apache/doris/pull/23389 +- https://github.com/apache/doris/pull/22570 + +### performance for bitmap, like, scan, agg functions +- https://github.com/apache/doris/pull/23172 +- https://github.com/apache/doris/pull/23495 +- https://github.com/apache/doris/pull/23476 +- https://github.com/apache/doris/pull/23396 +- https://github.com/apache/doris/pull/23182 +- https://github.com/apache/doris/pull/22216 + +### functionality and stability of CCR +- https://github.com/apache/doris/pull/22447 +- https://github.com/apache/doris/pull/22559 +- https://github.com/apache/doris/pull/22173 +- https://github.com/apache/doris/pull/22678 + +### merge on write unique table + +- https://github.com/apache/doris/pull/22282 +- https://github.com/apache/doris/pull/22984 +- https://github.com/apache/doris/pull/21933 +- https://github.com/apache/doris/pull/22874 + +### optimizer table stats and analyze + +- https://github.com/apache/doris/pull/22658 +- https://github.com/apache/doris/pull/22211 +- https://github.com/apache/doris/pull/22775 +- https://github.com/apache/doris/pull/22896 +- https://github.com/apache/doris/pull/22788 +- https://github.com/apache/doris/pull/22882 +- + +### functionality and performance of multi catalog + +- https://github.com/apache/doris/pull/22949 
+- https://github.com/apache/doris/pull/22923 +- https://github.com/apache/doris/pull/22336 +- https://github.com/apache/doris/pull/22915 +- https://github.com/apache/doris/pull/23056 +- https://github.com/apache/doris/pull/23297 +- https://github.com/apache/doris/pull/23279 + + +## Important Bug fixes + +- https://github.com/apache/doris/pull/22673 +- https://github.com/apache/doris/pull/22656 +- https://github.com/apache/doris/pull/22892 +- https://github.com/apache/doris/pull/22959 +- https://github.com/apache/doris/pull/22902 +- https://github.com/apache/doris/pull/22976 +- https://github.com/apache/doris/pull/22734 +- https://github.com/apache/doris/pull/22840 +- https://github.com/apache/doris/pull/23008 +- https://github.com/apache/doris/pull/23003 +- https://github.com/apache/doris/pull/22966 +- https://github.com/apache/doris/pull/22965 +- https://github.com/apache/doris/pull/22784 +- https://github.com/apache/doris/pull/23049 +- https://github.com/apache/doris/pull/23084 +- https://github.com/apache/doris/pull/22947 +- https://github.com/apache/doris/pull/22919 +- https://github.com/apache/doris/pull/22979 +- https://github.com/apache/doris/pull/23096 +- https://github.com/apache/doris/pull/23113 +- https://github.com/apache/doris/pull/23062 +- https://github.com/apache/doris/pull/22918 +- https://github.com/apache/doris/pull/23026 +- https://github.com/apache/doris/pull/23175 +- https://github.com/apache/doris/pull/23167 +- https://github.com/apache/doris/pull/23015 +- https://github.com/apache/doris/pull/23165 +- https://github.com/apache/doris/pull/23264 +- https://github.com/apache/doris/pull/23246 +- https://github.com/apache/doris/pull/23198 +- https://github.com/apache/doris/pull/23221 +- https://github.com/apache/doris/pull/23277 +- https://github.com/apache/doris/pull/23249 +- https://github.com/apache/doris/pull/23272 +- https://github.com/apache/doris/pull/23383 +- https://github.com/apache/doris/pull/23372 +- 
https://github.com/apache/doris/pull/23399 +- https://github.com/apache/doris/pull/23295 +- https://github.com/apache/doris/pull/23446 +- https://github.com/apache/doris/pull/23406 +- https://github.com/apache/doris/pull/23387 +- https://github.com/apache/doris/pull/23421 +- https://github.com/apache/doris/pull/23456 +- https://github.com/apache/doris/pull/23361 +- https://github.com/apache/doris/pull/23402 +- https://github.com/apache/doris/pull/23369 +- https://github.com/apache/doris/pull/23245 +- https://github.com/apache/doris/pull/23532 +- https://github.com/apache/doris/pull/23529 +- https://github.com/apache/doris/pull/23601 + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.1-merged+is%3Aclosed) . + + +## Big Thanks + +Thanks all who contribute to this release: + +@adonis0147 +@airborne12 +@amorynan +@AshinGau +@BePPPower +@BiteTheDDDDt +@bobhan1 +@ByteYue +@caiconghui +@CalvinKirs +@csun5285 +@DarvenDuan +@deadlinefen +@DongLiang-0 +@Doris-Extras +@dutyu +@englefly +@freemandealer +@Gabriel39 +@GoGoWen +@HappenLee +@hello-stephen +@HHoflittlefish777 +@hubgeter +@hust-hhb +@JackDrogon +@jacktengg +@jackwener +@Jibing-Li +@kaijchen +@kaka11chen +@Kikyou1997 +@Lchangliang +@LemonLiTree +@liaoxin01 +@LiBinfeng-01 +@lsy3993 +@luozenglin +@morningman +@morrySnow +@mrhhsg +@Mryange +@mymeiyi +@shuke987 +@sohardforaname +@starocean999 +@TangSiyang2001 +@Tanya-W +@ucasfl +@vinlee19 +@wangbo +@wsjz +@wuwenchi +@xiaokang +@XieJiann +@xinyiZzz +@yujun777 +@Yukang-Lian +@Yulei-Yang +@zclllyybb +@zddr +@zenoyang +@zgxme +@zhangguoqiang666 +@zhangstar333 +@zhannngchen +@zhiqiang-hhhh +@zxealous +@zy-kkk +@zzzxl1993 +@zzzzzzzs + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md new file mode 100644 index 0000000000000..5d8592a0ee25c --- /dev/null +++ 
b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.10", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 83 improvements and bug fixes have been made in Doris 2.0.10. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## Improvement and Optimizations + +- This enhancement introduces the `read_only` and `super_read_only` variables to the database system, ensuring compatibility with MySQL's read-only modes. + +- When the check status is not IO_ERROR, the disk path should not be added to the broken list. This ensures that only disks with actual I/O errors are marked as broken. + +- When performing a Create Table As Select (CTAS) operation from an external table, convert the `VARCHAR` column to `STRING` type. + +- Support mapping Paimon column type "ROW" to Doris type "STRUCT" + +- Tolerate a small amount of skew when choosing a disk for a new tablet + +- Write an edit log entry for `set replica drop` to avoid confusing status on follower FE + +- Make the schema change memory space adaptive to avoid exceeding the memory limit + +- Inverted index 'unicode' tokenizer supports configuration to exclude stop words + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.9...2.0.10).
+ +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @BePPPower, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @deardeng, @DongLiang-0, @eldenmoon, @felixwluo, @HappenLee, @hubgeter, @jackwener, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @luennng, @morningman, @morrySnow, @Mryange, @nextdreamblue, @qidaye, @starocean999, @suxiaogang223, @SWJTU-ZhangLei, @w41ter, @xiaokang, @xy720, @yujun777, @Yukang-Lian, @zhangstar333, @zxealous, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md new file mode 100644 index 0000000000000..1a2598b0d41a0 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md @@ -0,0 +1,60 @@ +--- +{ + "title": "Release 2.0.11", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 123 improvements and bug fixes have been made in Doris 2.0.11 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## 1 Behavior change + +Since the inverted index is now mature and stable, it can replace the old BITMAP INDEX. Therefore, any newly created `BITMAP INDEX` will automatically switch to an `INVERTED INDEX`, while existing `BITMAP INDEX` will remain unchanged. This entire switching process is transparent to the user, with no changes to writing or querying. Additionally, users can disable this automatic switch by setting the FE configuration `enable_create_bitmap_index_as_inverted_index` to false. 
[#35528](https://github.com/apache/doris/pull/35528) + + +## 2 Improvement and optimizations + +- Add Trino JDBC Catalog type mapping for JSON and TIME + +- FE exits when it fails to transfer to (non-)master, to prevent an unknown state and excessive logging + +- Write audit log while doing drop stats table. + +- Ignore min/max column stats if table is partially analyzed to avoid inefficient query plan + +- Support the minus operation for sets, such as `set1 - set2` + +- Improve performance of LIKE and REGEXP clauses with concat (col, pattern_str), e.g. `col1 LIKE concat('%', col2, '%')` + +- Add query options for short circuit queries for upgrade compatibility + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.10...2.0.11). + +## Credits + +Thanks to all who contributed to this release: + +@AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @Gabriel39, @GoGoWen, @HHoflittlefish777, @hubgeter, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaka11chen, @kobe6th, @LiBinfeng-01, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @nextdreamblue, @qidaye, @sjyango, @starocean999, @SWJTU-ZhangLei, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XieJiann, @xy720, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md new file mode 100644 index 0000000000000..0bc289c91a8ef --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md @@ -0,0 +1,58 @@ +--- +{ + "title": "Release 2.0.12", + "language": "en" +} +--- + + + +Thanks to our community developers and users for their contributions. Doris version 2.0.12 brings 99 improvements and bug fixes.
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The default table comment is no longer set to the table type. Instead, it is empty by default; for example, COMMENT 'OLAP' becomes COMMENT ' '. This new behavior is friendlier to BI software that relies on table comments. [#35855](https://github.com/apache/doris/pull/35855) + +- Change the type of the `@@autocommit` variable from `BOOLEAN` to `BIGINT` to prevent errors from certain MySQL clients (such as .NET MySQL.Data). [#33282](https://github.com/apache/doris/pull/33282) + + +## Improvements + +- Remove the `disable_nested_complex_type` parameter and allow the creation of nested `ARRAY`, `MAP`, and `STRUCT` types by default. [#36255](https://github.com/apache/doris/pull/36255) + +- The HMS catalog supports the `SHOW CREATE DATABASE` command. [#28145](https://github.com/apache/doris/pull/28145) + +- Add more inverted index metrics to the query profile. [#36545](https://github.com/apache/doris/pull/36545) + +- Cross-Cluster Replication (CCR) supports inverted indices. [#31743](https://github.com/apache/doris/pull/31743) + +The key features and improvements are highlighted above; the full list is available on [GitHub](https://github.com/apache/doris/compare/2.0.11...2.0.12).
+ + + +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @amorynan, @BiteTheDDDDt, @cambyzju, @caoliang-web, @dataroaring, @eldenmoon, @feiniaofeiafei, @felixwluo, @gavinchou, @HappenLee, @hello-stephen, @jacktengg, @Jibing-Li, @Johnnyssc, @liaoxin01, @LiBinfeng-01, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @qidaye, @qzsee, @starocean999, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XuPengfei-1020, @xy720, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md new file mode 100644 index 0000000000000..1b6e54d948d7d --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md @@ -0,0 +1,61 @@ +--- +{ + "title": "Release 2.0.13", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 112 improvements and bug fixes have been made in Doris 2.0.13. + +[Quick Download](https://doris.apache.org/download/) + +## Behavior changes + +SQL input is treated as multiple statements only when the `CLIENT_MULTI_STATEMENTS` setting is enabled on the client side, enhancing compatibility with MySQL. [#36759](https://github.com/apache/doris/pull/36759) + +## New features + +- A new BE configuration `allow_zero_date` has been added, allowing dates with all zeros. When set to `false`, `0000-00-00` is parsed as `NULL`, and when set to `true`, it is parsed as `0000-01-01`. The default value is `false` to maintain consistency with previous behavior. [#34961](https://github.com/apache/doris/pull/34961) + +- `LogicalWindow` and `LogicalPartitionTopN` support multi-field predicate pushdown to improve performance.
[#36828](https://github.com/apache/doris/pull/36828) + +- The ES Catalog now maps ES `nested` or `object` types to Doris `JSON` types. [#37101](https://github.com/apache/doris/pull/37101) + +## Improvements + +- Queries with `LIMIT` end reading data earlier to reduce resource consumption and improve performance. [#36535](https://github.com/apache/doris/pull/36535) + +- Special JSON data with empty keys is now supported. [#36762](https://github.com/apache/doris/pull/36762) + +- Stability and usability of routine load have been improved, including load balancing, automatic recovery, exception handling, and more user-friendly error messages. [#36450](https://github.com/apache/doris/pull/36450) [#35376](https://github.com/apache/doris/pull/35376) [#35266](https://github.com/apache/doris/pull/35266) [#33372](https://github.com/apache/doris/pull/33372) [#32282](https://github.com/apache/doris/pull/32282) [#32046](https://github.com/apache/doris/pull/32046) [#32021](https://github.com/apache/doris/pull/32021) [#31846](https://github.com/apache/doris/pull/31846) [#31273](https://github.com/apache/doris/pull/31273) + +- The disk selection strategy and speed of BE load balancing have been optimized. [#36826](https://github.com/apache/doris/pull/36826) [#36795](https://github.com/apache/doris/pull/36795) [#36509](https://github.com/apache/doris/pull/36509) + +- Stability and usability of the JDBC catalog have been improved, including encryption, thread pool connection count configuration, and more user-friendly error messages. [#36940](https://github.com/apache/doris/pull/36940) [#36720](https://github.com/apache/doris/pull/36720) [#30880](https://github.com/apache/doris/pull/30880) [#35692](https://github.com/apache/doris/pull/35692) + +The key features and improvements are highlighted above; the full list is available on [GitHub](https://github.com/apache/doris/compare/2.0.12...2.0.13).
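To illustrate the `allow_zero_date` rule described under New features for 2.0.13, here is a minimal Python sketch that restates the documented outcomes. It is a toy model, not Doris code: the function name `parse_date_literal` and its pass-through handling of other inputs are assumptions made only for this example.

```python
from typing import Optional


def parse_date_literal(text: str, allow_zero_date: bool = False) -> Optional[str]:
    """Toy model of the documented `allow_zero_date` rule (hypothetical helper)."""
    if text == "0000-00-00":
        # Per the release note: NULL when the option is false (the default),
        # and 0000-01-01 when the option is true.
        return "0000-01-01" if allow_zero_date else None
    # Assumption for this sketch: any other input is returned unchanged.
    return text
```

With the default setting the all-zero date maps to `None` (standing in for SQL `NULL`), so existing behavior is preserved unless the option is explicitly enabled.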
+ +## Credits + +Thanks to all who contributed to this release: + +@Gabriel39, @Jibing-Li, @Johnnyssc, @Lchangliang, @LiBinfeng-01, @SWJTU-ZhangLei, @Thearas, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @jacktengg, @kaijchen, @liutang123, @luwei16, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @sollhui, @starocean999, @w41ter, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md new file mode 100644 index 0000000000000..061c5cb7a1093 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.14", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 110 improvements and bug fixes have been made in Doris 2.0.14 version + + +## 1 New features + +- Adds a REST interface to retrieve the most recent query profile: `curl http://user:password@127.0.0.1:8030/api/profile/text` [#38268](https://github.com/apache/doris/pull/38268) + +## 2 Improvements + +- Optimizes the primary key point query performance for MOW tables with sequence columns [#38287](https://github.com/apache/doris/pull/38287) + +- Enhances the performance of inverted index queries with many conditions [#35346](https://github.com/apache/doris/pull/35346) + +- Automatically enables the `support_phrase` option when creating a tokenized inverted index to accelerate `match_phrase` phrase queries [#37949](https://github.com/apache/doris/pull/37949) + +- Supports simplified SQL hints, for example: `SELECT /*+ query_timeout(3000) */ * FROM t;` [#37720](https://github.com/apache/doris/pull/37720) + +- Automatically retries reading from object storage when encountering a `429` error to improve stability 
[#35396](https://github.com/apache/doris/pull/35396) + +- LEFT SEMI / ANTI JOIN terminates subsequent matching execution upon matching a qualifying data row to enhance performance. [#34703](https://github.com/apache/doris/pull/34703) + +- Prevents coredump when returning illegal data to MySQL results. [#28069](https://github.com/apache/doris/pull/28069) + +- Unifies the output of type names in lowercase to maintain compatibility with MySQL and be more friendly to BI tools. [#38521](https://github.com/apache/doris/pull/38521) + + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.13...2.0.14) , with the key features and improvements highlighted below. + +## Credits + +Thanks all who contribute to this release: + +@ByteYue, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Lchangliang, @LiBinfeng-01, @Mryange, @XieJiann, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @biohazard4321, @cambyzju, @csun5285, @eldenmoon, @englefly, @freemandealer, @hello-stephen, @hubgeter, @kaijchen, @liaoxin01, @luwei16, @morningman, @morrySnow, @mymeiyi, @qidaye, @sollhui, @starocean999, @w41ter, @wuwenchi, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md new file mode 100644 index 0000000000000..58237f7c3f097 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md @@ -0,0 +1,91 @@ +--- +{ + "title": "Release 2.0.15", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 157 improvements and bug fixes have been made in Doris 2.0.15 version + +- Quick Download: https://doris.apache.org/download + +- GitHub: https://github.com/apache/doris/releases/tag/2.0.15 + +## 1 Behavior Change + +NA + +## 2 New Features + +- Restore now supports deleting redundant tablets 
and partition options. [#39028](https://github.com/apache/doris/pull/39028) + +- Support JSON function `json_search`.[#40948](https://github.com/apache/doris/pull/40948) + +## 3 Improvement and Optimizations + +### Stability + +- Add a FE configuration `abort_txn_after_lost_heartbeat_time_second` for transaction abort time. [#28662](https://github.com/apache/doris/pull/28662) + +- Abort transactions after a BE loses heartbeat for over 1 minute instead of 5 seconds, to avoid overly sensitive transaction aborts. [#22781](https://github.com/apache/doris/pull/22781) + +- Delay scheduling EOF tasks of routine load to avoid an excessive number of small transactions. [#39975](https://github.com/apache/doris/pull/39975) + +- Prefer querying from online disk services to be more robust. [#39467](https://github.com/apache/doris/pull/39467) + +- Skip checking newly inserted rows in non-strict mode partial updates if the row's delete sign is marked. [#40322](https://github.com/apache/doris/pull/40322) + +- To prevent FE OOM, limit the number of tablets in backup tasks, with a default value of 300,000. [#39987](https://github.com/apache/doris/pull/39987) + +### Performance + +- Optimize slow column updates caused by concurrent column updates and compactions. [#38487](https://github.com/apache/doris/pull/38487) + +- When a NullLiteral exists in a filter condition, it can now be folded into False and further converted to an EmptySet to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Improve performance of `ORDER BY` permutation. [#38985](https://github.com/apache/doris/pull/38985) + +- Improve the performance of string processing in inverted indexes. [#37395](https://github.com/apache/doris/pull/37395) + +### Optimizer and Statistics + +- Added support for statements beginning with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Polish aggregate function signature matching. 
[#39352](https://github.com/apache/doris/pull/39352) + +- Drop column statistics and trigger auto analysis after schema change. [#39101](https://github.com/apache/doris/pull/39101) + +- Support dropping cached stats using `DROP CACHED STATS table_name`. [#39367](https://github.com/apache/doris/pull/39367) + +### Multi Catalog and Others + +- Optimize JDBC Catalog refresh to reduce the frequency of client creation. [#40261](https://github.com/apache/doris/pull/40261) + +- Fix thread leaks in JDBC Catalog under certain conditions. [#39423](https://github.com/apache/doris/pull/39423) + +- ARRAY MAP STRUCT types now support `REPLACE_IF_NOT_NULL`. [#38304](https://github.com/apache/doris/pull/38304) + +- Retry delete jobs for failures that are not `DELETE_INVALID_XXX`. [#37834](https://github.com/apache/doris/pull/37834) + +**Credits** + +@924060929, @BePPPower, @BiteTheDDDDt, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Johnnyssc, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Toms1999, @Vallishp, @Yukang-Lian, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @htyoung, @hubgeter, @justfortaste, @liaoxin01, @liugddx, @liutang123, @luwei16, @mongo360, @morrySnow, @qidaye, @smallx, @sollhui, @starocean999, @w41ter, @xiaokang, @xzj7019, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhannngchen, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md new file mode 100644 index 0000000000000..3f8e89cddf946 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md @@ -0,0 +1,157 @@ +--- +{ + "title": "Release 2.0.2", + "language": "en" +} +--- + + + +# Release 2.0.2 + +Thanks to our community users and developers, 489 improvements and bug fixes have been made in Doris 2.0.2. 
+ +## Behavior Changes + +- [Remove json -> operator convert to json_extract #24679](https://github.com/apache/doris/pull/24679) + + Remove json '->' operator since it is conflicted with lambda function syntax. It's a syntax sugar for function json_extract and can be replaced with the former. +- [Start the script to set metadata_failure_recovery #24308](https://github.com/apache/doris/pull/24308) + + Move metadata_failure_recovery from fe.conf to start_fe.sh argument to prevent being used unexpectedly. +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Optimize priority_ network matching logic for be #23795](https://github.com/apache/doris/pull/23795) +- [Fix cancel load failed because Job could not be cancelled… #17730](https://github.com/apache/doris/pull/17730) + + Allow cancel a retrying load job. + +## Improvements + +### Easier to use + +- [Support custom lib dir to save custom libs #23887](https://github.com/apache/doris/pull/23887) + + Add a custom_lib dir to allow users place custom lib files and custom_lib will not be replaced. +- [Optimize priority_ network matching logic #23784](https://github.com/apache/doris/pull/23784) + + Optimize priority_network logic to avoid error when this config is wrong or not configured. +- [Row policy support role #23022](https://github.com/apache/doris/pull/23022) + + Support role based auth for row policy. + +### New optimizer Nereids statistics collection improvement + +- [Disable file cache while running analysis tasks. #23663](https://github.com/apache/doris/pull/23663) +- [Show column stats even when error occurred. #23703](https://github.com/apache/doris/pull/23703) +- [Support basic jdbc external table stats collection. 
#23965](https://github.com/apache/doris/pull/23965) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) + +### Better support for JDBC, HDFS, Hive, MySQL, Max Compute, Multi-Catalog + +- [Support hadoop viewfs. #24168](https://github.com/apache/doris/pull/24168) +- [Avoid calling checksum when replaying creating jdbc catalog and fix ranger issue #22369](https://github.com/apache/doris/pull/22369) +- [Optimize the JDBC Catalog connection error message #23868](https://github.com/apache/doris/pull/23868) + + Improve property check and error message for JDBC catalog +- [Fix mc decimal type parse, fix wrong obj location #24242](https://github.com/apache/doris/pull/24242) + + Fix some issues for Max Compute catalog +- [Support sql cache for hms catalog #23391](https://github.com/apache/doris/pull/23391) + + SQL cache for Hive catalog +- [Merge hms partition events. #22869](https://github.com/apache/doris/pull/22869) + + Improve performance for Hive metadata sync +- [Add metadata_name_ids for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql #22702](https://github.com/apache/doris/pull/22702) + +### Performance for inverted index query + +- [Add bkd index query cache to improve perf #23952](https://github.com/apache/doris/pull/23952) +- [Improve performance for count on index other than match #24678](https://github.com/apache/doris/pull/24678) +- [Improve match performance without index #24751](https://github.com/apache/doris/pull/24751) +- [Optimize multiple terms conjunction query #23871](https://github.com/apache/doris/pull/23871) + + Improve performance of MATCH_ALL +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) + + Improve performance of MATCH + +### Improve Array functions + +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Improve array union support multi
params #24327](https://github.com/apache/doris/pull/24327) +- [Improve explode func with array nested complex type #24455](https://github.com/apache/doris/pull/24455) + +## Important Bug fixes + +- [The parameter positions of timestamp diff function to sql are reversed #23601](https://github.com/apache/doris/pull/23601) +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Fix query cache returns wrong result after deleting partitions. #23555](https://github.com/apache/doris/pull/23555) +- [Fix potential data loss when clone task's dst tablet is cooldown replica #17644](https://github.com/apache/doris/pull/17644) +- [Fix array map batch append data with right next_array_item_rowid #23779](https://github.com/apache/doris/pull/23779) +- [Fix or to in rule #23940](https://github.com/apache/doris/pull/23940) +- [Fix 'char' function's toSql implementation is wrong #23860](https://github.com/apache/doris/pull/23860) +- [Record wrong best plan properties #23973](https://github.com/apache/doris/pull/23973) +- [Make TVF's distribution spec always be RANDOM #24020](https://github.com/apache/doris/pull/24020) +- [External scan use STORAGE_ANY instead of ANY as distibution #24039](https://github.com/apache/doris/pull/24039) +- [Runtimefilter target is not SlotReference #23958](https://github.com/apache/doris/pull/23958) +- [mv in select materialized_view should disable show table #24104](https://github.com/apache/doris/pull/24104) +- [Fail over to remote file reader if local cache failed #24097](https://github.com/apache/doris/pull/24097) +- [Fix revoke role operation cause fe down #23852](https://github.com/apache/doris/pull/23852) +- [Handle status code correctly and add a new error code `ENTRY_NOT_FOUND` #24139](https://github.com/apache/doris/pull/24139) +- [Fix leaky abstraction and shield the status code `END_OF_FILE` from upper layers #24165](https://github.com/apache/doris/pull/24165) +- [Fix bug that Read 
garbled files caused be crash. #24164](https://github.com/apache/doris/pull/24164) +- [Fix be core when user sepcified empty `column_separator` using hdfs tvf #24369](https://github.com/apache/doris/pull/24369) +- [Fix need to restart BE after replacing the jar package in java-udf #24372](https://github.com/apache/doris/pull/24372) +- [Need to call 'set_version' in nested functions #24381](https://github.com/apache/doris/pull/24381) +- [windown_funnel compatibility issue with multi backends #24385](https://github.com/apache/doris/pull/24385) +- [correlated anti join shouldn't be translated to null aware anti join #24290](https://github.com/apache/doris/pull/24290) +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Fix analyze failed when there are thousands of partitions. #24521](https://github.com/apache/doris/pull/24521) +- [Do not use enum as the data type for JavaUdfDataType. #24460](https://github.com/apache/doris/pull/24460) +- [Fix multi window projection issue temporarily #24568](https://github.com/apache/doris/pull/24568) +- [Make metadata compatible with 2.0.3 #24610](https://github.com/apache/doris/pull/24610) +- [Select outfile column order is wrong #24595](https://github.com/apache/doris/pull/24595) +- [Incorrect result of semi/anti mark join #24616](https://github.com/apache/doris/pull/24616) +- [Fix broker read issue #24635](https://github.com/apache/doris/pull/24635) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) +- [Fixed bug when parsing multi-character delimiters. 
#24572](https://github.com/apache/doris/pull/24572) +- [Fix timezone parse when there is no tzfile #24578](https://github.com/apache/doris/pull/24578) +- [We need to issue an error when starting FE without setting the Java home environment #23943](https://github.com/apache/doris/pull/23943) +- [Enable_unique_key_partial_update should be forwarded to master #24697](https://github.com/apache/doris/pull/24697) +- [Fix paimon file catalog meta issue and replication num analysis issue #24681](https://github.com/apache/doris/pull/24681) +- [Add more log for ingest_binlog && Fix ingest_binlog not rewrite rowset_meta tablet_uid #24617](https://github.com/apache/doris/pull/24617) +- [Do not abort when a disk is broken #24692](https://github.com/apache/doris/pull/24692) +- [colocate join could not work well on full outer join #24700](https://github.com/apache/doris/pull/24700) +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +- [Optimize the reading efficiency of nullable (string) columns. #24698](https://github.com/apache/doris/pull/24698) +- [Fix segment cache core when output rowset is nullptr #24778](https://github.com/apache/doris/pull/24778) +- [Fix duplicate key in schema change #24782](https://github.com/apache/doris/pull/24782) +- [Make metadata compatible for future version after 2.0.2 #24800](https://github.com/apache/doris/pull/24800) +- [Fix map/array deserialize string with quote pair #24808](https://github.com/apache/doris/pull/24808) +- [Failed on arm platform, with clang compiler and pch on, close #24633 #24636](https://github.com/apache/doris/pull/24636) +- [Table column order is changed if add a column and do truncate #24981](https://github.com/apache/doris/pull/24981) +- [Make parser mode coarse grained by default #24949](https://github.com/apache/doris/pull/24949) + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.2-merged+is%3Aclosed) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +[@adonis0147](https://github.com/adonis0147) [@airborne12](https://github.com/airborne12) [@amorynan](https://github.com/amorynan) [@AshinGau](https://github.com/AshinGau) [@BePPPower](https://github.com/BePPPower) [@BiteTheDDDDt](https://github.com/BiteTheDDDDt) [@bobhan1](https://github.com/bobhan1) [@ByteYue](https://github.com/ByteYue) [@caiconghui](https://github.com/caiconghui) [@CalvinKirs](https://github.com/CalvinKirs) [@cambyzju](https://github.com/cambyzju) [@ChengDaqi2023](https://github.com/ChengDaqi2023) [@ChinaYiGuan](https://github.com/ChinaYiGuan) [@CodeCooker17](https://github.com/CodeCooker17) [@csun5285](https://github.com/csun5285) [@dataroaring](https://github.com/dataroaring) [@deadlinefen](https://github.com/deadlinefen) [@DongLiang-0](https://github.com/DongLiang-0) [@Doris-Extras](https://github.com/Doris-Extras) [@dutyu](https://github.com/dutyu) [@eldenmoon](https://github.com/eldenmoon) [@englefly](https://github.com/englefly) [@freemandealer](https://github.com/freemandealer) [@Gabriel39](https://github.com/Gabriel39) [@gnehil](https://github.com/gnehil) [@GoGoWen](https://github.com/GoGoWen) [@gohalo](https://github.com/gohalo) [@HappenLee](https://github.com/HappenLee) [@hello-stephen](https://github.com/hello-stephen) [@HHoflittlefish777](https://github.com/HHoflittlefish777) [@hubgeter](https://github.com/hubgeter) [@hust-hhb](https://github.com/hust-hhb) [@ixzc](https://github.com/ixzc) [@JackDrogon](https://github.com/JackDrogon) [@jacktengg](https://github.com/jacktengg) [@jackwener](https://github.com/jackwener) [@Jibing-Li](https://github.com/Jibing-Li) [@JNSimba](https://github.com/JNSimba) [@kaijchen](https://github.com/kaijchen) [@kaka11chen](https://github.com/kaka11chen) [@Kikyou1997](https://github.com/Kikyou1997) [@Lchangliang](https://github.com/Lchangliang) [@LemonLiTree](https://github.com/LemonLiTree) [@liaoxin01](https://github.com/liaoxin01) 
[@LiBinfeng-01](https://github.com/LiBinfeng-01) [@liugddx](https://github.com/liugddx) [@luwei16](https://github.com/luwei16) [@mongo360](https://github.com/mongo360) [@morningman](https://github.com/morningman) [@morrySnow](https://github.com/morrySnow) @mrhhsg @Mryange @mymeiyi @neuyilan @pingchunzhang @platoneko @qidaye @realize096 @RYH61 @shuke987 @sohardforaname @starocean999 @SWJTU-ZhangLei @TangSiyang2001 @Tech-Circle-48 @w41ter @wangbo @wsjz @wuwenchi @wyx123654 @xiaokang @XieJiann @xinyiZzz @XuJianxu @xutaoustc @xy720 @xyfsjq @xzj7019 @yiguolei @yujun777 @Yukang-Lian @Yulei-Yang @zclllyybb @zddr @zhangguoqiang666 @zhangstar333 @ZhangYu0123 @zhannngchen @zxealous @zy-kkk @zzzxl1993 @zzzzzzzs diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md new file mode 100644 index 0000000000000..a716d6d711fb0 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md @@ -0,0 +1,253 @@ +--- +{ + "title": "Release 2.0.3", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 1000 improvements and bug fixes have been made in Doris 2.0.3 version, including optimizer statistics, inverted index, complex datatypes, data lake, replica management. + + + +## 1 Behavior change + +- The output format of the complex data types array/map/struct has been changed to be consistent with the input format and the JSON specification. The main changes from the previous version are that DATE/DATETIME and STRING/VARCHAR are enclosed in double quotes and null values inside ARRAY/MAP are displayed as `null` instead of `NULL`. + - https://github.com/apache/doris/pull/25946 +- SHOW_VIEW permission is supported. Users with SELECT or LOAD permission will no longer be able to execute the 'SHOW CREATE VIEW' statement and must be granted the SHOW_VIEW permission separately. 
+ - https://github.com/apache/doris/pull/25370 + + +## 2 New features + +### 2.1 Support collecting statistics for optimizer automatically + +Collecting statistics helps the optimizer understand the data distribution characteristics and choose a better plan to greatly improve query performance. It is officially supported starting from version 2.0.3 and is enabled all day by default. + +### 2.2 Support complex datatypes for more data lake sources +- Support complex datatypes for JAVA UDF, JDBC and Hudi MOR + - https://github.com/apache/doris/pull/24810 + - https://github.com/apache/doris/pull/26236 +- Support complex datatypes for Paimon + - https://github.com/apache/doris/pull/25364 +- Support Paimon version 0.5 + - https://github.com/apache/doris/pull/24985 + + +### 2.3 Add more builtin functions +- Support the BitmapAgg function in new optimizer + - https://github.com/apache/doris/pull/25508 +- Support SHA series digest functions + - https://github.com/apache/doris/pull/24342 +- Support the BITMAP datatype in the aggregate functions min_by and max_by + - https://github.com/apache/doris/pull/25430 +- Add milliseconds/microseconds_add/sub/diff functions + - https://github.com/apache/doris/pull/24114 +- Add some json functions: json_insert, json_replace, json_set + - https://github.com/apache/doris/pull/24384 + + +## 3 Improvement and optimizations + +### 3.1 Performance optimizations + +- When the inverted index MATCH WHERE condition with a high filter rate is combined with the common WHERE condition with a low filter rate, the I/O of the index column is greatly reduced. +- Optimize the efficiency of random data access after the where filter. +- Optimize the performance of the old get_json_xx function on JSON data types by 2~4x. +- Support the configuration to reduce the priority of the data read thread, ensuring the CPU resources for real-time writing. 
+- Add the `uuid_numeric` function that returns largeint, which is 20 times faster than the `uuid` function that returns string. +- Optimize the performance of case when by 3x. +- Cut out unnecessary predicate calculations in storage engine execution. +- Accelerate count performance by pushing down count operator to storage tier. +- Optimize the computation performance of the nullable type in and or expressions. +- Support rewriting the limit operator before `join` in more scenarios to improve query performance. +- Eliminate useless `order by` operators from inline view to improve query performance. +- Optimize the accuracy of cardinality estimates and cost models in some cases. +- Optimize jdbc catalog predicate pushdown logic. +- Optimize the read efficiency of the file cache when it is enabled for the first time. +- Optimize the hive table sql cache policy and use the partition update time stored in HMS to improve the cache hit ratio. +- Optimize mow compaction efficiency. +- Optimize thread allocation logic for external table query to reduce memory usage. +- Optimize memory usage for column reader. + + + +### 3.2 Distributed replica management improvements + +Distributed replica management improvements include skipping partition deletion, colocate group deletion, balance failure due to continuous write, and hot and cold separation table balance. + + +### 3.3 Security enhancement +- The audit log plug-in uses a token instead of a plaintext password to enhance security + - https://github.com/apache/doris/pull/26278 +- log4j configures security enhancement + - https://github.com/apache/doris/pull/24861 +- Sensitive user information is not displayed in logs + - https://github.com/apache/doris/pull/26912 + + +## 4 Bugfix and stability + +### 4.1 Complex datatypes +- Fix the issue that fixed-length CHAR(n) was not truncated correctly in map/struct. 
+ - https://github.com/apache/doris/pull/25725 +- Fix write failure for struct datatype nested in map/array + - https://github.com/apache/doris/pull/26973 +- Fix the issue that count distinct did not support array/map/struct + - https://github.com/apache/doris/pull/25483 +- Fix be crash in updating to 2.0.3 after the delete complex type appeared in query + - https://github.com/apache/doris/pull/26006 +- Fix be crash when JSON datatype is in WHERE clause. + - https://github.com/apache/doris/pull/27325 +- Fix be crash when ARRAY datatype is in OUTER JOIN clause. + - https://github.com/apache/doris/pull/25669 +- Fix reading incorrect result for DECIMAL datatype in ORC format. + - https://github.com/apache/doris/pull/26548 + - https://github.com/apache/doris/pull/25977 + - https://github.com/apache/doris/pull/26633 + +### 4.2 Inverted index +- Fix incorrect results for OR NOT combinations in the WHERE clause when inverted index query is disabled. + - https://github.com/apache/doris/pull/26327 +- Fix be crash when write a empty with inverted index + - https://github.com/apache/doris/pull/25984 +- Fix be crash in index compaction when the output of compaction is empty. + - https://github.com/apache/doris/pull/25486 +- Fix the be crash when adding an inverted index while no data has been written to the newly added column. +- Fix be crash when BUILD INDEX after ADD COLUMN without new data written. + - https://github.com/apache/doris/pull/27276 +- Fix missing and leak problem of hardlink for inverted index file. 
+ - https://github.com/apache/doris/pull/26903 +- Fix index file corruption when disk is full temporarily + - https://github.com/apache/doris/pull/28191 +- Fix incorrect result due to optimization for skip reading index column + - https://github.com/apache/doris/pull/28104 + +### 4.3 Materialized View +- Fix the problem of BE crash caused by repeated expressions in the group by statement +- Fix be crash when there are duplicate expressions in `group by` statements. + - https://github.com/apache/doris/pull/27523 +- Disable the float/double type in the `group by` clause when a view is created. + - https://github.com/apache/doris/pull/25823 +- Improve the function of select query matching materialized view + - https://github.com/apache/doris/pull/24691 +- Fix an issue that materialized views could not be matched when a table alias was used + - https://github.com/apache/doris/pull/25321 +- Fix the problem using percentile_approx when creating materialized views + - https://github.com/apache/doris/pull/26528 + +### 4.4 Table sample +- Fix the problem that table sample query does not work on tables with partitions. + - https://github.com/apache/doris/pull/25912 +- Fix the problem that table sample query does not work when a tablet is specified. + - https://github.com/apache/doris/pull/25378 + + +### 4.5 Unique with merge on write +- Fix null pointer exception in conditional update based on primary key + - https://github.com/apache/doris/pull/26881 +- Fix field name capitalization issues in partial update + - https://github.com/apache/doris/pull/27223 +- Fix duplicate keys that occur in mow during schema change repair. 
+ - https://github.com/apache/doris/pull/25705 + + +### 4.6 Load and compaction +- Fix unknown slot descriptor error in routineload for running multiple tables + - https://github.com/apache/doris/pull/25762 +- Fix be crash due to concurrent memory access when calculating memory + - https://github.com/apache/doris/pull/27101 +- Fix be crash on duplicate cancel for load. 
+ - https://github.com/apache/doris/pull/27111 +- Fix broker connection error during broker load + - https://github.com/apache/doris/pull/26050 +- Fix incorrect results of delete predicates when compaction and scan run concurrently. + - https://github.com/apache/doris/pull/24638 +- Fix the problem that compaction tasks would print too many stacktrace logs + - https://github.com/apache/doris/pull/25597 + + +### 4.7 Data Lake compatibility +- Fix query failure when an iceberg table contains special characters + - https://github.com/apache/doris/pull/27108 +- Fix compatibility issues of different hive metastore versions + - https://github.com/apache/doris/pull/27327 +- Fix an error reading max compute partition table + - https://github.com/apache/doris/pull/24911 +- Fix the issue that backup to object storage failed + - https://github.com/apache/doris/pull/25496 + - https://github.com/apache/doris/pull/25803 + + +### 4.8 JDBC external table compatibility + +- Fix Oracle date type format error in jdbc catalog + - https://github.com/apache/doris/pull/25487 +- Fix MySQL 0000-00-00 date exception in jdbc catalog + - https://github.com/apache/doris/pull/26569 +- Fix an exception in reading data from Mariadb where the default value of the time type is current_timestamp + - https://github.com/apache/doris/pull/25016 +- Fix be crash when processing BITMAP datatype in jdbc catalog + - https://github.com/apache/doris/pull/25034 + - https://github.com/apache/doris/pull/26933 + + +### 4.9 SQL Planner and Optimizer + +- Fix partition prune error in some scenes + - https://github.com/apache/doris/pull/27047 + - https://github.com/apache/doris/pull/26873 + - https://github.com/apache/doris/pull/25769 + - https://github.com/apache/doris/pull/27636 + +- Fix incorrect sub-query processing in some scenarios + - https://github.com/apache/doris/pull/26034 + - https://github.com/apache/doris/pull/25492 + - https://github.com/apache/doris/pull/25955 + - 
https://github.com/apache/doris/pull/27177 + +- Fix some semantic parsing errors + - https://github.com/apache/doris/pull/24928 + - https://github.com/apache/doris/pull/25627 + +- Fix data loss during right outer/anti join + - https://github.com/apache/doris/pull/26529 + +- Fix incorrect pushdown of predicates past aggregation operators. + - https://github.com/apache/doris/pull/25525 + +- Fix incorrect result header in some cases + - https://github.com/apache/doris/pull/25372 + +- Fix incorrect plan when the nullsafeEquals expression (<=>) is used as the join condition + - https://github.com/apache/doris/pull/27127 + +- Fix incorrect column pruning in set operation operators. + - https://github.com/apache/doris/pull/26884 + + +### Others + +- Fix BE crash when the order of columns in a table is changed and then upgraded to 2.0.3. + - https://github.com/apache/doris/pull/28205 + + +See the complete list of improvements and bug fixes on [github dev/2.0.3-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.3-merged+is%3Aclosed) . diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md new file mode 100644 index 0000000000000..e1dac58fbf69a --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md @@ -0,0 +1,67 @@ +--- +{ + "title": "Release 2.0.4", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 333 improvements and bug fixes have been made in Doris 2.0.4 version. 
+ +**Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- More reasonable and accurate precision and scale inference for decimal data type + - [https://github.com/apache/doris/pull/28034](https://github.com/apache/doris/pull/28034) + +- Support drop policy for user or role + - [https://github.com/apache/doris/pull/29488](https://github.com/apache/doris/pull/29488) + +## New features + +- Support datev1, datetimev1 and decimalv2 datatypes in new optimizer Nereids. +- Support ODBC table for new optimizer Nereids. +- Add `lower_case` and `ignore_above` option for inverted index +- Support `match_regexp` and `match_phrase_prefix` optimization by inverted index +- Support paimon native reader in datalake +- Support audit-log for `insert into` SQL +- Support reading parquet file in lzo compressed format + +## Improvement and optimizations + +- Improve storage management including balance, migration, publish and others. +- Improve storage cooldown policy to save disk space. +- Performance optimization for substr with ascii string. +- Improve partition prune when date function is used. +- Improve auto analyze visibility and performance. 
+ +See the complete list of improvements and bug fixes on github [dev/2.0.4-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.4-merged+is%3Aclosed) + + + +## Credits +Last but not least, this release would not have been possible without the following contributors: + +airborne12, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, bobhan1, ByteYue, caiconghui,CalvinKirs, cambyzju, caoliang-web, catpineapple, csun5285, dataroaring, deardeng, dutyu, eldenmoon, englefly, feifeifeimoon, fornaix, Gabriel39, gnehil, HappenLee, hello-stephen, HHoflittlefish777,hubgeter, hust-hhb, ixzc, jacktengg, jackwener, Jibing-Li, kaka11chen, KassieZ, LemonLiTree,liaoxin01, LiBinfeng-01, lihuigang, liugddx, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, Nitin-Kashyap, platoneko, py023, qidaye, shuke987, starocean999, SWJTU-ZhangLei, w41ter, wangbo, wsjz, wuwenchi, Xiaoccer, xiaokang, XieJiann, xingyingone, xinyiZzz, xuwei0912, xy720, xzj7019, yujun777, zclllyybb, zddr, zhangguoqiang666, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md new file mode 100644 index 0000000000000..20d6bd9302b2c --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md @@ -0,0 +1,73 @@ +--- +{ + "title": "Release 2.0.5", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 217 improvements and bug fixes have been made in Doris 2.0.5 version. 
+ +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- Change char function behaviour: `select char(0) = '\0'` returns true, as in MySQL + - https://github.com/apache/doris/pull/30034 +- Allow exporting empty data + - https://github.com/apache/doris/pull/30703 + +## New features +- Eliminate left outer join with `is null` condition +- Add `show-tablets-belong` stmt for analyzing a batch of tablet-ids +- InferPredicates support In, such as `a = b & a in [1, 2] -> b in [1, 2]` +- Optimize plan when column stats are unavailable +- Optimize plan using rollup column stats +- Support analyze materialized view +- Support ShowProcessStmt Show all FE connection + +## Improvement and optimizations +- Optimize query plan when column stats are unavailable +- Optimize query plan using rollup column stats +- Stop analyze quickly after user close auto analyze +- Catch load column stats exception, avoid printing too much stack info to fe.out +- Select materialized view by specifying the view name in SQL +- Change auto analyze max table width default value to 100 +- Escape characters for columns in recovery predicate pushdown in JDBC Catalog +- Fix JDBC MYSQL Catalog `to_date` function pushdown +- Optimize the close logic of JDBC client +- Optimize JDBC connection pool parameter settings +- Obtain hudi partition information through HMS's API +- Optimize routine load job error msg and memory +- Skip all backup/restore jobs if max allowed option is set to 0 + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.4-rc06...2.0.5-rc02). 
+ + +## Credits +Thanks all who contribute to this release: + +airborne12, alexxing662, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, ByteYue, caiconghui, cambyzju, catpineapple, dataroaring, eldenmoon, Emor-nj, englefly, felixwluo, GoGoWen, HappenLee, hello-stephen, HHoflittlefish777, HowardQin, JackDrogon, jacktengg, jackwener, Jibing-Li, KassieZ, LemonLiTree, liaoxin01, liugddx, LuGuangming, morningman, morrySnow, mrhhsg, Mryange, mymeiyi, nextdreamblue, qidaye, ryanzryu, seawinde,starocean999, TangSiyang2001, vinlee19, w41ter, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xingyingone, xy720,xzj7019, yujun777, zclllyybb, zhangstar333, zhannngchen, zhiqiang-hhhh, zxealous, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md new file mode 100644 index 0000000000000..9591ed8d3fab8 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.6", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 114 improvements and bug fixes have been created by 51 contributors in Doris 2.0.6 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- N/A + +## New features +- Support match a function with alias in materialized-view +- Add a command to drop a tablet replica safely on backend +- Add row count cache for external table. 
+- Support analyze rollup to gather statistics for optimizer + +## Improvement and optimizations +- Improve tablet schema cache memory by using deterministic way to serialize protobuf +- Improve show column stats performance +- Support estimate row count for iceberg and paimon +- Support sqlserver timestamp type read for JDBC catalog + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.5-rc02...2.0.6). + + +## Credits +Thanks all who contribute to this release: + +924060929, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, deardeng, DongLiang-0, eldenmoon, englefly, feelshana, feiniaofeiafei, felixwluo, HappenLee, hust-hhb, iwanttobepowerful, ixzc, JackDrogon, Jibing-Li, KassieZ, larshelge, liaoxin01, LiBinfeng-01, liutang123, luennng, morningman, morrySnow, mrhhsg, qidaye, starocean999, TangSiyang2001, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xuwei0912, xy720, xzj7019, yiguolei, yujun777, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md new file mode 100644 index 0000000000000..10f226dbd63b4 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 2.0.7", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 80 improvements and bug fixes have been made in Doris 2.0.7 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +- `round` function defaults to rounding normally as MySQL, eg. round(5/2) return 3 instead of 2. 
+ + - https://github.com/apache/doris/pull/31583 + +- `round` datetime with scale from string literal as MySQL, eg. round '2023-10-12 14:31:49.666' to '2023-10-12 14:31:50' . + + - https://github.com/apache/doris/pull/27965 + + +## 2 New features +- Support make miss slot as null alias when converting outer join to anti join to speed up query + + - https://github.com/apache/doris/pull/31854 + +- Enable proxy protocol to support IP transparency for Nginx and HAProxy. + + - https://github.com/apache/doris/pull/32338 + + +## 3 Improvement and optimizations + +- Add DEFAULT_ENCRYPTION column in `information_schema` table and add `processlist` table for better compatibility for BI tools + +- Automatically test connectivity by default when creating a JDBC Catalog. + +- Enhance auto resume to keep routine load stable + +- Use lowercase by default for Chinese tokenizer in inverted index + +- Add error msg if exceeded maximum default value in repeat function + +- Skip hidden file and dir in Hive table + +- Reduce file meta cache size and disable cache for some cases to avoid OOM + +- Reduce jvm heap memory consumed by profiles of BrokerLoadJob + +- Remove sort which is under table sink to speed up query like `INSERT INTO t1 SELECT * FROM t2 ORDER BY k`. + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.6...2.0.7) . 
+ + +## 4 Credits + +Thanks all who contribute to this release: + +924060929,airborne12,amorynan,ByteYue,dataroaring,deardeng,feiniaofeiafei,felixwluo,freemandealer,gavinchou,hello-stephen,HHoflittlefish777,jacktengg,jackwener,jeffreys-cat,Jibing-Li,KassieZ,LiBinfeng-01,luwei16,morningman,mrhhsg,Mryange,nextdreamblue,platoneko,qidaye,rohitrs1983,seawinde,shuke987,starocean999,SWJTU-ZhangLei,w41ter,wsjz,wuwenchi,xiaokang,XieJiann,XuJianxu,yujun777,Yulei-Yang,zhangstar333,zhiqiang-hhhh,zy-kkk,zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md new file mode 100644 index 0000000000000..d881a80628b44 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md @@ -0,0 +1,76 @@ +--- +{ + "title": "Release 2.0.8", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 65 improvements and bug fixes have been made in Doris 2.0.8 version. + +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## 1 Behavior change + +The `ADMIN SHOW` statements cannot be executed with newer versions of the MySQL 8.x JDBC driver, so these statements have been renamed to drop the `ADMIN` keyword. 
+ +- https://github.com/apache/doris/pull/29492 + +```sql +ADMIN SHOW CONFIG -> SHOW CONFIG +ADMIN SHOW REPLICA -> SHOW REPLICA +ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS +ADMIN SHOW TABLET -> SHOW TABLET +``` + + +## 2 New features + +N/A + + + +## 3 Improvement and optimizations + +- Make Inverted Index work with TopN opt in Nereids + +- Limit the max string length to 1024 while collecting column stats to control BE memory usage + +- JDBC Catalog close when JDBC client is not empty + +- Accept all Iceberg database and do not check the name format of database + +- Refresh external table's rowcount async to avoid cache miss and unstable query plan + +- Simplify the isSplitable method of hive external table to avoid too many hadoop metrics + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.7...2.0.8) . + +## 4 Credits + +Thanks all who contribute to this release: + +924060929, AcKing-Sam, amorynan, AshinGau, BePPPower, BiteTheDDDDt, ByteYue, cambyzju, dongsilun, eldenmoon, feiniaofeiafei, gnehil, Jibing-Li, liaoxin01, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, platoneko, starocean999, SWJTU-ZhangLei, wuwenchi, xiaokang, xinyiZzz, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhiqiang-hhhh, ziyanTOP, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md new file mode 100644 index 0000000000000..04048fc060461 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md @@ -0,0 +1,75 @@ +--- +{ + "title": "Release 2.0.9", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 68 improvements and bug fixes have been made in Doris 2.0.9 version. 
+ +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +N/A + +## 2 New features + +- Support predicates that appear on both key and value mv columns + +- Support mv with `bitmap_union(bitmap_from_array())` + +- Add a FE config to force replicate allocation for OLAP tables in the cluster + +- Support timezone in date literals in new optimizer Nereids + +- Support slop in fulltext search `match_phrase` to specify word distance + +- Show index id in `SHOW PROC INDEXES` + +## 3 Improvement and optimizations + +- Add a secondary argument in `first_value` / `last_value` to ignore NULL values + +- Allow 0 as the offset parameter in `LEAD` / `LAG` functions + +- Adjust priority of materialized view match rule + +- TopN opt reads only limit number of records for better performance + +- Add profile for delete_bitmap get_agg function + +- Refine the Meta cache to get better performance + +- Add FE config `autobucket_max_buckets` + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.8...2.0.9) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +adonis0147, airborne12, amorynan, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, csun5285, eldenmoon, englefly, feiniaofeiafei, HHoflittlefish777, htyoung, hust-hhb, jackwener, Jibing-Li, kaijchen, kylinmac, liaoxin01, luwei16, morningman, mrhhsg, qidaye, starocean999, SWJTU-ZhangLei, w41ter, xiaokang, xiedeyantu, xy720, zclllyybb, zhangstar333, zhannngchen, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md new file mode 100644 index 0000000000000..b0b88f715ee51 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md @@ -0,0 +1,159 @@ +--- +{ + "title": "Release 2.1.0", + "language": "en" +} +--- + + + +Dear community, we are pleased to share with you the official release of Apache Doris 2.1.0, now available for download and use as of March 8th. This latest version marks a significant milestone in our journey towards enhancing data analysis capabilities, particularly for handling massive and complex datasets. + +With Doris 2.1.0, our primary focus has been on optimizing analysis performance, and the results speak for themselves. We have achieved an impressive performance improvement of over 100% on the TPC-DS 1TB test dataset, making Apache Doris more capable of challenging real-world business scenarios. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Performance improvement + +### Smarter optimizer + +On the basis of V2.0, the query optimizer in Doris V2.1 comes with enhanced statistics-based inference and enumeration framework. 
We have upgraded the cost model and expanded the optimization rules to serve the needs of more use cases. + +### Better heuristic optimization + +For data analytics at scale or data lake scenarios, Doris V2.1 provides better heuristic query plans. Meanwhile, the RuntimeFilter is more self-adaptive to enable higher performance even without statistical information. + +### Parallel adaptive scan + +Doris V2.1 has adopted parallel adaptive scan to optimize scan I/O and thus improve query performance. It can avoid the negative impact of unreasonable numbers of buckets. (This feature is currently available on the Duplicate Key model and Merge-on-Write Unique Key model.) + +### Local shuffle + +We have introduced Local Shuffle to prevent uneven data distribution. Benchmark tests show that Local Shuffle in combination with Parallel Adaptive Scan can guarantee fast query performance in spite of unreasonable bucket number settings upon table creation. + +### Faster INSERT INTO SELECT + +To further improve the performance of INSERT INTO SELECT, which is a frequent operation in ETL, we have moved forward the MemTable execution-wise to reduce data ingestion overheads. Tests show that this can double the data ingestion speed in most cases compared to V2.0. +Improved data lake analytics capabilities + +## Data lake analytic performance + +### TPC-DS Benchmark + +According to TPC-DS benchmark tests (1TB) of Doris V2.1 against Trino, + +- Without caching, the total execution time of Doris is 56% of that of Trino V435. (717s VS 1296s) +- Enabling file cache can further increase the overall performance of Doris by 2.2 times. (323s) + This is achieved by a series of optimizations in I/O, parquet/ORC file reading, predicate pushdown, caching, and scan task scheduling, etc. + +### SQL dialects compatibility + +To facilitate migration to Doris and increase its compatibility with other DBMS, we have enabled SQL dialect conversion in V2.1. 
([read more](../../lakehouse/sql-dialect)) For example, by setting sql_dialect = "trino" in Doris, you can use the Trino SQL dialect as you're used to, without modifying your current business logic, and Doris will execute the corresponding queries for you. Tests in user production environments show that Doris V2.1 is compatible with 99% of Trino SQL. + +### Arrow Flight SQL protocol + +As a column-oriented database compatible with MySQL 8.0 protocol, Doris V2.1 now supports the Arrow Flight SQL protocol as well so users can have fast access to Doris data via Pandas/Numpy without data serialization and deserialization. For most common data types, the Arrow Flight protocol enables tens of times faster performance than the MySQL protocol. + +## Asynchronous materialized view + +V2.1 allows creating a materialized view based on multiple tables. This feature currently supports: + +- Transparent rewriting: supports transparent rewriting of common operators including Select, Where, Join, Group By, and Aggregation. +- Auto refresh: supports regular refresh, manual refresh, full refresh, incremental refresh, and partition-based refresh. +- Materialized view of external tables: supports materialized views based on external data tables such as those on Hive, Hudi, and Iceberg; supports synchronizing data from data lakes into Doris internal tables via materialized views. +- Direct query on materialized views: Materialized views can be regarded as the result set after ETL. In this sense, materialized views are data tables, so users can conduct queries on them directly. + +## Enhanced storage + +### Auto-increment column + +V2.1 supports auto-increment columns, which can ensure data uniqueness of each row. This lays the foundation for efficient dictionary encoding and query pagination. For example, for precise UV calculation and customer grouping, users often apply the bitmap type in Doris, the process of which entails dictionary encoding.
With V2.1, users can first create a dictionary table using the auto-increment column, and then simply load user data into it. + +### Auto partition + +To further reduce the burden of operation and maintenance, V2.1 allows auto data partitioning. Upon data ingestion, it detects whether a partition exists for the data based on the partitioning column. If not, it automatically creates one and starts data ingestion. + +### High-concurrency real-time data ingestion + +For data writing, a back pressure mechanism is in place to avoid excessive data versions, so as to reduce resource consumption by data version merging. In addition, V2.1 supports group commit ([read more](../../data-operate/import/import-way/group-commit-manual)), which accumulates multiple writes and commits them as one. Benchmark tests on group commit with JDBC ingestion and the Stream Load method present great results. + +## Semi-structured data analysis + +### A new data type: Variant + +V2.1 supports a new data type named Variant. It can accommodate semi-structured data such as JSON as well as compound data types that contain integers, strings, booleans, etc. Users don't have to pre-define the exact data types for a Variant column in the table schema. The Variant type is handy when processing nested data structures. +You can include Variant columns and static columns with pre-defined data types in the same table. This will provide you with more flexibility in storage and queries. +Tests with ClickBench datasets prove that data in Variant columns takes up the same storage space as data in static columns, which is half of that in JSON format. In terms of query performance, the Variant type enables 8 times higher query speed than JSON in hot runs and even more in cold runs. + +### IP types + +Doris V2.1 provides native support for IPv4 and IPv6. It stores IP data in binary format, which cuts down storage space usage by 60% compared to IP strings in plain text.
Along with these IP types, we have added over 20 functions for IP data processing. + +### More powerful functions for compound data types + +- explode_map: supports exploding rows into columns for the Map data type. +- Supports the STRUCT data type in the IN predicates + +## Workload Management + +### Hard isolation of resources + +On the basis of the Workload Group mechanism, which imposes a soft limit on the resources that a workload group can use, Doris 2.1 introduces a hard limit on CPU resource consumption for workload groups as a way to ensure higher stability in query performance. + +### TopSQL + +V2.1 allows users to check the most resource-consuming SQL queries in the runtime. This can be a big help when handling cluster load spike caused by unexpected large queries. + + +## Others + +### Decimal 256 + +For users in the financial sector or high-end manufacturing, V2.1 supports a high-precision data type: Decimal, which supports up to 76 significant digits (an experimental feature, please set enable_decimal256=true.) + +### Job scheduler + +V2.1 provides a good option for regular task scheduling: Doris Job Scheduler. It can trigger the pre-defined operations on schedule or at fixed intervals. The Doris Job Scheduler is accurate to the second. It provides consistency guarantee for data writing, high efficiency and flexibility, high-performance processing queues, retraceable scheduling records, and high availability of jobs. + +### Support Docker fast start to experience the new version + +Starting from version 2.1.0, we will provide a separate Docker Image to support the rapid creation of a 1FE, 1BE Docker container to experience the new version of Doris. The container will complete the initialization of FE and BE, BE registration and other steps by default. 
After creating the container, you can access and use the Doris cluster in about 1 minute. In this image version, the default `max_map_count`, `ulimit`, `Swap` and other hard limits are removed. It supports X64 (avx2) machines and ARM machines for deployment. The default open ports are 8000, 8030, 8040, 9030. If you need to experience the Broker component, you can add the environment variable `--env BROKER=true` at startup to start the Broker process synchronously. After startup, it will automatically complete the registration. The Broker name is `test`. + +Please note that this version is only suitable for quick experience and functional testing, not for production environments! + +## Behavior changed + +- The default data model is the Merge-on-Write Unique Key model. enable_unique_key_merge_on_write will be included as a default setting when a table is created in the Unique Key model. +- As inverted index has proven to be more performant than bitmap index, V2.1 stops supporting bitmap index. Existing bitmap indexes will remain effective but new creation is not allowed. We will remove bitmap index-related code in the future. +- cpu_resource_limit is no longer supported. It was used to limit the number of scanner threads on Doris BE. Since the workload group mechanism also supports such settings, the already configured cpu_resource_limit will be invalid. +- The default value of enable_segcompaction is true. This means Doris supports compaction of multiple segments in the same rowset. +- Audit log plug-in + - Since V2.1.0, Doris has a built-in audit log plug-in. Users can simply enable or disable it by setting the enable_audit_plugin parameter. + - If you have already installed your own audit log plug-in, you can either continue using it after upgrading to Doris V2.1, or uninstall it and use the one in Doris. Please note that the audit log table will be relocated after switching plug-ins.
+ - For more details, please see the [docs](../../admin-manual/audit-plugin). + + +## Credits +Thanks all who contribute to this release: + +467887319, 924060929, acnot, airborne12, AKIRA, alan_rodriguez, AlexYue, allenhooo, amory, amory, AshinGau, beat4ocean, BePPPower, bigben0204, bingquanzhao, BirdAmosBird, BiteTheDDDDt, bobhan1, caiconghui, camby, camby, CanGuan, caoliang-web, catpineapple, Centurybbx, chen, ChengDaqi2023, ChenyangSunChenyang, Chester, ChinaYiGuan, ChouGavinChou, chunping, colagy, CSTGluigi, czzmmc, daidai, dalong, dataroaring, DeadlineFen, DeadlineFen, deadlinefen, deardeng, didiaode18, DongLiang-0, dong-shuai, Doris-Extras, Dragonliu2018, DrogonJackDrogon, DuanXujianDuan, DuRipeng, dutyu, echo-dundun, ElvinWei, englefly, Euporia, feelshana, feifeifeimoon, feiniaofeiafei, felixwluo, figurant, flynn, fornaix, FreeOnePlus, Gabriel39, gitccl, gnehil, GoGoWen, gohalo, guardcrystal, hammer, HappenLee, HB, hechao, HelgeLarsHelge, herry2038, HeZhangJianHe, HHoflittlefish777, HonestManXin, hongkun-Shao, HowardQin, hqx871, httpshirley, htyoung, huanghaibin, HuJerryHu, HuZhiyuHu, Hyman-zhao, i78086, irenesrl, ixzc, jacktengg, jacktengg, jackwener, jayhua, Jeffrey, jiafeng.zhang, Jibing-Li, JingDas, julic20s, kaijchen, kaka11chen, KassieZ, kindred77, KirsCalvinKirs, KirsCalvinKirs, kkop, koarz, LemonLiTree, LHG41278, liaoxin01, LiBinfeng-01, LiChuangLi, LiDongyangLi, Lightman, lihangyu, lihuigang, LingAdonisLing, liugddx, LiuGuangdongLiu, LiuHongLiu, liuJiwenliu, LiuLijiaLiu, lsy3993, LuGuangmingLu, LuoMetaLuo, luozenglin, Luwei, Luzhijing, lxliyou001, Ma1oneZhang, mch_ucchi, Miaohongkai, morningman, morrySnow, Mryange, mymeiyi, nanfeng, nanfeng, Nitin-Kashyap, PaiVallishPai, Petrichor, plat1ko, py023, q763562998, qidaye, QiHouliangQi, ranxiang327, realize096, rohitrs1983, sdhzwc, seawinde, seuhezhiqiang, seuhezhiqiang, shee, shuke987, shysnow, songguangfan, Stalary, starocean999, SunChenyangSun, sunny, SWJTU-ZhangLei, TangSiyang2001, Tanya-W, taoxutao, 
Uniqueyou, vhwzIs, walter, walter, wangbo, Wanghuan, wangqt, wangtao, wangtianyi2004, wenluowen, whuxingying, wsjz, wudi, wudongliang, wuwenchihdu, wyx123654, xiangran0327, Xiaocc, XiaoChangmingXiao, xiaokang, XieJiann, Xinxing, xiongjx, xuefengze, xueweizhang, XueYuhai, XuJianxu, xuke-hat, xy, xy720, xyfsjq, xzj7019, yagagagaga, yangshijie, YangYAN, yiguolei, yiguolei, yimeng, YinShaowenYin, Yoko, yongjinhou, ytwp, yuanyuan8983, yujian, yujun777, Yukang-Lian, Yulei-Yang, yuxuan-luo, zclllyybb, ZenoYang, zfr95, zgxme, zhangdong, zhangguoqiang, zhangstar333, zhangstar333, zhangy5, ZhangYu0123, zhannngchen, ZhaoLongZhao, zhaoshuo, zhengyu, zhiqqqq, ZhongJinHacker, ZhuArmandoZhu, zlw5307, ZouXinyiZou, zxealous, zy-kkk, zzwwhh, zzzxl1993, zzzzzzzs diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md new file mode 100644 index 0000000000000..384bccdceb414 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md @@ -0,0 +1,251 @@ +--- +{ + "title": "Release 2.1.1", + "language": "en" +} +--- + + + +Dear community members, Apache Doris 2.1.1 has been officially released on April 3, 2024, with several enhancements and bug fixes based on 2.1.0, enabling smoother user experience. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## Behavior Changed + +1. Change float type output format to improve float type serialization performance. + +- https://github.com/apache/doris/pull/32049 + +2. Change system table value functions active_queries(), workload_groups() to system tables. + +- https://github.com/apache/doris/pull/32314 + +3. Disable show query/load profile stmt because there are not so many developers use it and the pipeline and pipelinex engine not support it. 
+ +- https://github.com/apache/doris/pull/32467 + +4. Upgrade the Arrow Flight version to 15.0.2 to fix some bugs; please use ADBC 15.0.2 to access Doris. + +- https://github.com/apache/doris/pull/32827. + + + +## Upgrade Problem + +1. BE will core during a rolling upgrade from 2.0.x to 2.1.x + +- https://github.com/apache/doris/pull/32672 + +- https://github.com/apache/doris/pull/32444 + +- https://github.com/apache/doris/pull/32162 + +2. JDBC Catalog will have query errors during a rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32618 + + + +## New Feature + +1. Enable column auth by default. + +- https://github.com/apache/doris/pull/32659 + + +2. Get correct cores for pipeline and pipelinex engine when running within docker or k8s. + +- https://github.com/apache/doris/pull/32370 + +3. Support reading the parquet int96 type. + +- https://github.com/apache/doris/pull/32394 + +4. Enable proxy protocol to support IP transparency. Using this protocol, IP transparency for load balancing can be achieved, so that after load balancing, Doris can still obtain the client's real IP and implement permission control such as whitelisting. + +- https://github.com/apache/doris/pull/32338/files + +5. Add workload group queue related columns for active_queries system table. Users can use this system table to monitor the workload queue usage. + +- https://github.com/apache/doris/pull/32259 + +6. Add new system table backend_active_tasks to monitor the real-time query statistics on every BE. + +- https://github.com/apache/doris/pull/31945 + +7. Add ipv4 and ipv6 support for spark-doris connector. + +- https://github.com/apache/doris/pull/32240 + +8. Add inverted index support for CCR. + +- https://github.com/apache/doris/pull/32101 + +9. Support select experimental session variable. + +- https://github.com/apache/doris/pull/31837 + +10. Support materialized view with bitmap_union(bitmap_from_array()) case. + +- https://github.com/apache/doris/pull/31962 + +11.
Support partition prune for *HIVE_DEFAULT_PARTITION*. + +- https://github.com/apache/doris/pull/31736 + +12. Support function in set variable statement. + +- https://github.com/apache/doris/pull/32492 + +13. Support arrow serialization for variant type. + +- https://github.com/apache/doris/pull/32809 + + + +## Optimization + +1. Automatically resume routine load when BE restarts or during upgrade, and keep the routine load stable. + +- https://github.com/apache/doris/pull/32239 + +2. Routine Load: optimize the algorithm that allocates tasks to BEs for better load balancing. + +- https://github.com/apache/doris/pull/32021 + +3. Spark Load: update the Spark version for Spark Load to resolve a CVE problem. + +- https://github.com/apache/doris/pull/30368 + +4. Skip cooldown if the tablet is dropped. + +- https://github.com/apache/doris/pull/32079 + +5. Support using workload group to manage routine load. + +- https://github.com/apache/doris/pull/31671 + +6. [MTMV] Improve the performance of query rewriting by materialized view. + +- https://github.com/apache/doris/pull/31886 + +7. Reduce jvm heap memory consumed by profiles of BrokerLoadJob. + +- https://github.com/apache/doris/pull/31985 +8. Improve high-QPS query performance by speeding up PartitionPrunner. + +- https://github.com/apache/doris/pull/31970 + +9. Reduce duplicated memory consumption for column name and column path for schema cache. + +- https://github.com/apache/doris/pull/31141 + +10. Support more join types for query rewriting by materialized view such as INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, LEFT SEMI JOIN, RIGHT SEMI JOIN, LEFT ANTI JOIN, RIGHT ANTI JOIN + +- https://github.com/apache/doris/pull/32909 + + + +## Bugfix + + +1. Do not push down topn-filter through right/full outer join if the first orderkey is nulls first. + +- https://github.com/apache/doris/pull/32633 + +2. Fix memory leak in Java UDF + +- https://github.com/apache/doris/pull/32630 + +3.
If some odbc tables use the same resource, and restore not all odbc tables, it will not retain the resource. +and check some conf for backup/restore + +- https://github.com/apache/doris/pull/31989 + +4. Fold constant will core for variant type. + +- https://github.com/apache/doris/pull/32265 + +5. Routine load will pause when transaction fail in some cases. + +- https://github.com/apache/doris/pull/32638 + +6. the result of left semi join with empty right side should be false instead of null. + +- https://github.com/apache/doris/pull/32477 + +7. Fix core when build inverted index for a new column with no data. + +- https://github.com/apache/doris/pull/32669 + +8. Fix be core caused by null-safe-equal join. + +- https://github.com/apache/doris/pull/32623 + +9. Partial update: fix data correctness risk when load delete sign data into a table with sequence col. + +- https://github.com/apache/doris/pull/32574 + +10. Select outfile: Fix the column type mapping in the orc/parquet file format. + +- https://github.com/apache/doris/pull/32281 + +11. Fix BE core during restore stage. + +- https://github.com/apache/doris/pull/32489 + +12. Use array_agg func after other agg func like count, sum, may make be core. + +- https://github.com/apache/doris/pull/32387 + +13. Variant type should always nullable or there will some bugs. + +- https://github.com/apache/doris/pull/32248 + +14. Fix the bug of handling empty blocks in schema change. + +- https://github.com/apache/doris/pull/32396 + +15. Fix BE will core when use json_length() in some cases. + +- https://github.com/apache/doris/pull/32145 + +16. Fix error when query iceberg table using date cast predicate + +- https://github.com/apache/doris/pull/32194 + +17. Fix some bugs when build inverted index for variant type. + +- https://github.com/apache/doris/pull/31992 + +18. Wrong result of two or more map_agg functions in query. + +- https://github.com/apache/doris/pull/31928 + +19. Fix wrong result of money_format function. 
+ +- https://github.com/apache/doris/pull/31883 + +20. Fix connection hang after too many connections. + +- https://github.com/apache/doris/pull/31594 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md new file mode 100644 index 0000000000000..6116bd9984632 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md @@ -0,0 +1,110 @@ +--- +{ + "title": "Release 2.1.2", + "language": "en" +} +--- + + + +## Behavior Changed + +1. Set the default value of the `data_consistence` property of EXPORT to partition to make export more stable during load. + +- https://github.com/apache/doris/pull/32830 + +2. Some of MySQL Connector (eg, dotnet MySQL.Data) rely on variable's column type to make connection. + + eg, select @[@autocommit]([@autocommit](https://github.com/autocommit)) should with column type BIGINT, not BIT, otherwise it will throw error. So we change column type of @[@autocommit](https://github.com/autocommit) to BIGINT. + + - https://github.com/apache/doris/pull/33282 + + +## Upgrade Problem + +1. Normal workload group is not created when upgrade from 2.0 or other old versions. + + - https://github.com/apache/doris/pull/33197 + +## New Feature + + +1. Add processlist table in information_schema database, users could use this table to query active connections. + + - https://github.com/apache/doris/pull/32511 + +2. Add a new table valued function `LOCAL` to allow access file system like shared storage. + + - https://github.com/apache/doris-website/pull/494 + + +## Optimization + +1. Skip some useless process to make graceful stop more quickly in K8s env. + + - https://github.com/apache/doris/pull/33212 + +2. Add rollup table name in profile to help find the mv selection problem. + + - https://github.com/apache/doris/pull/33137 + +3. 
Add test connection function to DB2 database to allow user check the connection when create DB2 Catalog. + + - https://github.com/apache/doris/pull/33335 + +4. Add DNS Cache for FQDN to accelerate the connect process among BEs in K8s env. + + - https://github.com/apache/doris/pull/32869 + +5. Refresh external table's rowcount async to make the query plan more stable. + + - https://github.com/apache/doris/pull/32997 + + +## Bugfix + + +1. Fix Iceberg Catalog of HMS and Hadoop do not support Iceberg properties like "io.manifest.cache-enabled" to enable manifest cache in Iceberg. + + - https://github.com/apache/doris/pull/33113 + +2. The offset params in `LEAD`/`LAG` function could use 0 as offset. + + - https://github.com/apache/doris/pull/33174 + +3. Fix some timeout issues with load. + + - https://github.com/apache/doris/pull/33077 + + - https://github.com/apache/doris/pull/33260 + +4. Fix core problem related with `ARRAY`/`MAP`/`STRUCT` compaction process. + + - https://github.com/apache/doris/pull/33130 + + - https://github.com/apache/doris/pull/33295 + +5. Fix runtime filter wait timeout. + + - https://github.com/apache/doris/pull/33369 + +6. Fix `unix_timestamp` core for string input in auto partition. + + - https://github.com/apache/doris/pull/32871 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md new file mode 100644 index 0000000000000..e88ec3e94fb6d --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md @@ -0,0 +1,191 @@ +--- +{ + "title": "Release 2.1.3", + "language": "en" +} +--- + + + +Apache Doris 2.1.3 was officially released on May 21, 2024. This version has updated several improvements, including writing data back to Hive, materialized view, permission management and bug fixes. It further enhances the performance and stability of the system. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + + +## Feature Enhancements + +**1. Support writing back data to hive tables via Hive Catalog** + +Starting from version 2.1.3, Apache Doris supports DDL and DML operations on Hive. Users can directly create libraries and tables in Hive through Apache Doris and write data to Hive tables by executing `INSERT INTO` statements. This feature allows users to perform complete data query and write operations on Hive through Apache Doris, further simplifying the integrated lakehouse architecture. + +Please refer: [https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) + +**2. Support building new asynchronous materialized views on top of existing ones** + +Users can create new asynchronous materialized views on top of existing ones, directly reusing pre-computed intermediate results for data processing. This simplifies complex aggregation and computation operations, reducing resource consumption and maintenance costs while further accelerating query performance and improving data availability. [#32984](https://github.com/apache/doris/pull/32984) + +**3. Support rewriting through nested materialized views** + +Materialized View (MV) is a database object used to store query results. Now, Apache Doris supports rewriting through nested materialized views, which helps optimize query performance. [#33362](https://github.com/apache/doris/pull/33362) + +**4. New `SHOW VIEWS` statement** + +The `SHOW VIEWS` statement can be used to query views in the database, facilitating better management and understanding of view objects in the database. [#32358](https://github.com/apache/doris/pull/32358) + +**5. 
Workload Group supports binding to specific BE nodes** + +Workload Group can be bound to specific BE nodes, enabling more refined control over query execution to optimize resource usage and improve performance. [#32874](https://github.com/apache/doris/pull/32874) + +**6. Broker Load supports compressed JSON format** + +Broker Load now supports importing compressed JSON format data, significantly reducing bandwidth requirements for data transmission and accelerating data import performance. [#30809](https://github.com/apache/doris/pull/30809) + +**7. TRUNCATE Function can use columns as scale parameters** + +The TRUNCATE function can now accept columns as scale parameters, providing more flexibility when processing numerical data. [#32746](https://github.com/apache/doris/pull/32746) + +**8. Add new functions `uuid_to_int` and `int_to_uuid`** + +These two functions allow users to convert between UUID and integer, significantly helping in scenarios that require handling UUID data. [#33005](https://github.com/apache/doris/pull/33005) + +**9. Add `bypass_workload_group` session variable to bypass query queue** + +The `bypass_workload_group` session variable allows certain queries to bypass the Workload Group queue and execute directly, which is useful for handling critical queries that require quick responses. [#33101](https://github.com/apache/doris/pull/33101) + +**10. Add strcmp function** + +The strcmp function compares two strings and returns their comparison result, simplifying text data processing. [#33272](https://github.com/apache/doris/pull/33272) + +**11. Support HLL functions `hll_from_base64` and `hll_to_base64`** + +HyperLogLog (HLL) is an algorithm for cardinality estimation. These two functions allow users to decode HLL data from a Base64-encoded string or encode HLL data as a Base64 string, which is very useful for storing and transmitting HLL data. [#32089](https://github.com/apache/doris/pull/32089) + +## Optimization and Improvements + +**1. 
Replace SipHash with XXHash to improve shuffle performance** + +Both SipHash and XXHash are hashing functions, but XXHash may provide faster hashing speeds and better performance in certain scenarios. This optimization aims to improve performance during data shuffling by adopting XXHash. [#32919](https://github.com/apache/doris/pull/32919) + +**2. Asynchronous materialized views support NULL partition columns in OLAP tables** + +This enhancement allows asynchronous materialized views to support NULL partition columns in OLAP tables, enhancing data processing flexibility.[#32698](https://github.com/apache/doris/pull/32698) + +**3. Limit maximum string length to 1024 when collecting column statistics to control BE memory usage** + +Limiting the string length when collecting column statistics prevents excessive data from consuming too much BE memory, helping maintain system stability and performance. [#32470](https://github.com/apache/doris/pull/32470) + +**4. Support dynamic deletion of Bitmap cache to improve performance** + +Dynamically deleting no longer needed Bitmap Cache can free up memory and improve system performance. [#32991](https://github.com/apache/doris/pull/32991) + +**5. Reduce memory usage during ALTER operations** + +Reducing memory usage during ALTER operations improves the efficiency of system resource utilization. [#33474](https://github.com/apache/doris/pull/33474) + +**6. Support constant folding for complex types** + +Supports constant folding for Array/Map/Struct complex types.[#32867](https://github.com/apache/doris/pull/32867) + +**7. Add support for Variant type in Aggregate Key Model** + +The Variant data type can store multiple data types. This optimization allows aggregation operations on Variant type data, enhancing the flexibility of semi-structured data analysis. [#33493](https://github.com/apache/doris/pull/33493) + +**8. Support new inverted index format in CCR** [#33415](https://github.com/apache/doris/pull/33415) + +**9. 
Optimize rewriting performance for nested materialized views** [#34127](https://github.com/apache/doris/pull/34127) + +**10. Support decimal256 type in row-based storage format** + +Supporting the decimal256 type in row-based storage extends the system's ability to handle high-precision numerical data. [#34887](https://github.com/apache/doris/pull/34887) + +## Behavioral Changes + +**1. Authorization** + +- **Grant_priv permission changes**: `Grant_priv` can no longer be arbitrarily granted. When performing a `GRANT` operation, the user not only needs to have `Grant_priv` but also the permissions to be granted. For example, to grant `SELECT` permission on `table1`, the user needs both `GRANT` permission and `SELECT` permission on `table1`, enhancing security and consistency in permission management. [#32825](https://github.com/apache/doris/pull/32825) + +- **Workload group and resource usage_priv**: `Usage_priv` for Workload Group and Resource is no longer global but limited to Resource and Workload Group, making permission granting and usage more specific. [#32907](https://github.com/apache/doris/pull/32907) + +- **Authorization for operations**: Operations that were previously unauthorized now have corresponding authorizations for more detailed and comprehensive operational permission control. [#33347](https://github.com/apache/doris/pull/33347) + +**2. LOG directory configuration** + +The log directory configuration for FE and BE now uniformly uses the `LOG_DIR` environment variable. All other different types of logs will be stored with `LOG_DIR` as the root directory. To maintain compatibility between versions, the previous configuration item `sys_log_dir` can still be used. [#32933](https://github.com/apache/doris/pull/32933) + +**3. S3 Table Function (TVF)** + +Due to issues with correctly recognizing or processing S3 URLs in certain cases, the parsing logic for object storage paths has been refactored. 
For file paths in S3 table functions, the `force_parsing_by_standard_uri` parameter needs to be passed to ensure correct parsing. [#33858](https://github.com/apache/doris/pull/33858) + +## Upgrade Issues + +Since many users use certain keywords as column names or attribute values, the following keywords have been set as non-reserved, allowing users to use them as identifiers. [#34613](https://github.com/apache/doris/pull/34613) + +## Bug Fixes + +**1. Fix no data error when reading Hive tables on Tencent Cloud COSN** + +Resolved the no data error that could occur when reading Hive tables on Tencent Cloud COSN, enhancing compatibility with Tencent Cloud storage services. + +**2. Fix incorrect results returned by `milliseconds_diff` function** + +Fixed an issue where the `milliseconds_diff` function returned incorrect results in some cases, ensuring the accuracy of time difference calculations. [#32897](https://github.com/apache/doris/pull/32897) + +**3. User-defined variables should be forwarded to the Master node** + +Ensured that user-defined variables are correctly passed to the Master node for consistency and correct execution logic across the entire system. [#33013](https://github.com/apache/doris/pull/33013) + +**4. Fix Schema Change issues when adding complex type columns** + +Resolved Schema Change issues that could arise when adding complex type columns, ensuring the correctness of Schema Changes. [#31824](https://github.com/apache/doris/pull/31824) + +**5. Fix data loss issue in Routine Load when FE Master node changes** + +`Routine Load` is often used to subscribe to Kafka message queues. This fix addresses potential data loss issues that may occur during FE Master node changes. [#33678](https://github.com/apache/doris/pull/33678) + +**6. Fix Routine Load failure when Workload Group cannot be found** + +Resolved an issue where `Routine Load` would fail if the specified Workload Group could not be found.
[#33596](https://github.com/apache/doris/pull/33596) + +**7. Support column string64 to avoid join failures when string size overflows uint32** + +In some cases, string sizes may exceed the uint32 limit. Supporting the `string64` type ensures correct execution of string JOIN operations. [#33850](https://github.com/apache/doris/pull/33850) + +**8. Allow Hadoop users to create Paimon Catalog** + +Permitted authorized Hadoop users to create Paimon Catalogs. [#33833](https://github.com/apache/doris/pull/33833) + +**9. Fix `function_ipxx_cidr` function issues with constant parameters** + +Resolved problems with the `function_ipxx_cidr` function when handling constant parameters, ensuring the correctness of function execution. [#33968](https://github.com/apache/doris/pull/33968) + +**10. Fix file download errors when restoring using HDFS** + +Resolved "failed to download" errors encountered during data restoration using HDFS, ensuring the accuracy and reliability of data recovery. [#33303](https://github.com/apache/doris/issues/33303) + +**11. Fix column permission issues related to hidden columns** + +In some cases, permission settings for hidden columns may be incorrect. This fix ensures the correctness and security of column permission settings. [#34849](https://github.com/apache/doris/pull/34849) + +**12.
Fix issue where Arrow Flight cannot obtain the correct IP in K8s deployments** + +This fix resolves an issue where Arrow Flight cannot correctly obtain the IP address in Kubernetes deployment environments.[#34850](https://github.com/apache/doris/pull/34850) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md new file mode 100644 index 0000000000000..521694ffa60fa --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md @@ -0,0 +1,289 @@ +--- +{ + "title": "Release 2.1.4", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.4 was officially released on June 26, 2024.** In this update, we have optimized various functional experiences for data lakehouse scenarios, with a focus on resolving the abnormal memory usage issue in the previous version. Additionally, we have implemented several improvements and bug fixes to enhance the stability. Welcome to download and use it. + + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + +## Behavior changes + +- Non-existent files will be ignored when querying external tables such as Hive. [#35319](https://github.com/apache/doris/pull/35319) + + The file list is obtained from the meta cache, and it may not be consistent with the actual file list. + + Ignoring non-existent files helps to avoid query errors. + +- By default, creating a Bitmap Index will no longer be automatically changed to an Inverted Index. [#35521](https://github.com/apache/doris/pull/35521) + + This behavior is controlled by the FE configuration item `enable_create_bitmap_index_as_inverted_index`, which defaults to false. + +- When starting FE and BE processes using `--console`, all logs will be output to the standard output and differentiated by prefixes indicating the log type.
[#35679](https://github.com/apache/doris/pull/35679) + + For more information, please see the documentation: + + - [Log Management - FE Log](../admin-manual/log-management/fe-log.md) + + - [Log Management - BE Log](../admin-manual/log-management/be-log.md) + +- If no table comment is provided when creating a table, the default comment will be empty instead of using the table type as the default comment. [#36025](https://github.com/apache/doris/pull/36025) + +- The default precision of DECIMALV3 has been adjusted from (9, 0) to (38, 9) to maintain compatibility with the version in which this feature was initially released. [#36316](https://github.com/apache/doris/pull/36316) + +## New features + +### Query optimizer + +- Support FE flame graph tool + + For more information, see the [documentation](/community/developer-guide/fe-profiler.md) + +- Support `SELECT DISTINCT` to be used with aggregation. + +- Support single table query rewrite without `GROUP BY`. This is useful for complex filters or expressions. [#35242](https://github.com/apache/doris/pull/35242). + +- The new optimizer fully supports point query functionality [#36205](https://github.com/apache/doris/pull/36205). + +### Data Lakehouse + +- Support native reader of Apache Paimon deletion vector [#35241](https://github.com/apache/doris/pull/35241) + +- Support using Resource in Table Valued Functions [#35139](https://github.com/apache/doris/pull/35139) + +- Access controller with Hive Ranger plugin supports Data Mask + +### Asynchronous materialized views + +- Build support for internal table triggered updates, where if a materialized view uses an internal table and the data in the internal table changes, it can trigger a refresh of the materialized view, specifying REFRESH ON COMMIT when creating the materialized view. + +- Support transparent rewriting for single tables. For more information, see [Querying Async Materialized View](../query/view-materialized-view/query-async-materialized-view.md).
+ +- Transparent rewriting supports aggregation roll-up for agg_state, agg_union types; materialized views can be defined as agg_state or agg_union, queries can use specific aggregation functions, or use agg_merge. For more information, see [AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md). + +### Others + +- Added function `replace_empty`. + + For more information, see [documentation](../sql-manual/sql-functions/string-functions/replace_empty). + +- Support `show storage policy using` statement. + + For more information, see [documentation](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md). + +- Support JVM metrics on the BE side. + + Set `enable_jvm_monitor=true` in `be.conf` to enable this feature. + +## Improvements + +- Supported creating inverted indexes for columns with Chinese names. [#36321](https://github.com/apache/doris/pull/36321) + +- Estimate memory consumed by segment cache more accurately so that unused memory can be released more quickly. [#35751](https://github.com/apache/doris/pull/35751) + +- Filter empty partitions before exporting tables to remote storage. [#35542](https://github.com/apache/doris/pull/35542) + +- Optimize routine load task allocation algorithm to balance the load among Backends. [#34778](https://github.com/apache/doris/pull/34778) + +- Provide hints when a related variable is not found during a set operation. [#35775](https://github.com/apache/doris/pull/35775) + +- Support placing Java UDF jar files in the FE's `custom_lib` directory for default loading. [#35984](https://github.com/apache/doris/pull/35984) + +- Add a timeout global variable `audit_plugin_load_timeout` for audit log load jobs. + +- Optimize the performance of transparent rewrite planning for asynchronous materialized views. + +- Optimize the `INSERT` operation so that the BE does not execute it when the source is empty.
[#34418](https://github.com/apache/doris/pull/34418) + +- Support fetching file lists of Hive/Hudi tables in batches. In a scenario with 1.2 million files, the time taken to obtain the list of files has been reduced from 390 seconds to 46 seconds. [#35107](https://github.com/apache/doris/pull/35107) + +- Forbid dynamic partitioning when creating asynchronous materialized views. + +- Support detecting whether the partition data of external tables in Hive is synchronized with asynchronous materialized views. + +- Allow creating indexes on asynchronous materialized views. + +## Bug fixes + +### Query optimizer + +- Fixed the issue where SQL cache returns old results after truncating a partition. [#34698](https://github.com/apache/doris/pull/34698) + +- Fixed the issue where casting from JSON to other types did not correctly handle nullable attributes. [#34707](https://github.com/apache/doris/pull/34707) + +- Fixed occasional DATETIMEV2 literal simplification errors. [#35153](https://github.com/apache/doris/pull/35153) + +- Fixed the issue where `COUNT(*)` could not be used in window functions. [#35220](https://github.com/apache/doris/pull/35220) + +- Fixed the issue where nullable attributes could be incorrect when all `SELECT` statements under `UNION ALL` have no `FROM` clause. [#35074](https://github.com/apache/doris/pull/35074) + +- Fixed the issue where `bitmap in join` and subquery unnesting could not be used simultaneously. [#35435](https://github.com/apache/doris/pull/35435) + +- Fixed the performance issue where filter conditions could not be pushed down to the CTE producer in specific situations. [#35463](https://github.com/apache/doris/pull/35463) + +- Fixed the issue where aggregate combinators written in uppercase could not be found. [#35540](https://github.com/apache/doris/pull/35540) + +- Fixed the performance issue where window functions were not properly pruned by column pruning.
[#35504](https://github.com/apache/doris/pull/35504) + +- Fixed the issue where queries might parse incorrectly leading to wrong results when multiple tables with the same name but in different databases appeared simultaneously in the query. [#35571](https://github.com/apache/doris/pull/35571) + +- Fixed the query error caused by generating runtime filters during schema table scans. [#35655](https://github.com/apache/doris/pull/35655) + +- Fixed the issue where nested correlated subqueries could not execute because the join condition was folded into a null literal. [#35811](https://github.com/apache/doris/pull/35811) + +- Fixed the occasional issue where decimal literals were set with incorrect precision during planning. [#36055](https://github.com/apache/doris/pull/36055) + +- Fixed the occasional issue where multiple layers of aggregation were merged incorrectly during planning. [#36145](https://github.com/apache/doris/pull/36145) + +- Fixed the occasional issue where the input-output mismatch error occurred after aggregate expansion planning. [#36207](https://github.com/apache/doris/pull/36207) + +- Fixed the occasional issue where `<=>` was incorrectly converted to `=`. [#36521](https://github.com/apache/doris/pull/36521) + +### Query execution + +- Fixed the issue where the query hangs if the limited rows are reached on the pipeline engine and memory is not released. [#35746](https://github.com/apache/doris/pull/35746) + +- Fixed the BE coredump when `enable_decimal256` is true but falls back to the old planner. [#35731](https://github.com/apache/doris/pull/35731) + +### Asynchronous materialized views + +- Fixed the issue in the asynchronous materialized view build where the store_row_column attribute specified was not being recognized by the core. + +- Fixed the problem in the asynchronous materialized view build where specifying the storage_medium was not taking effect. 
+ +- Resolved the error occurring in the asynchronous materialized view show partitions after the base table is deleted. + +- Fixed the issue where asynchronous materialized views caused backup and restore exceptions. [#35703](https://github.com/apache/doris/pull/35703) + +- Fixed the issue where partition rewrite could lead to incorrect results. [#35236](https://github.com/apache/doris/pull/35236) + +### Semi-structured + +- Fixed the core dump problem when a VARIANT with an empty key is used. [#35671](https://github.com/apache/doris/pull/35671) +- Bitmap and BloomFilter index should not perform light index changes. [#35225](https://github.com/apache/doris/pull/35225) + +### Primary key + +- Fixed the issue where an exception BE restart occurred in the case of partial column updates during import, which could result in duplicate keys. [#35678](https://github.com/apache/doris/pull/35678) + +- Fixed the issue where BE might core dump during clone operations when memory is tight. [#34702](https://github.com/apache/doris/pull/34702) + +### Data Lakehouse + +- Fixed the issue where a Hive table could not be created with a fully qualified name such as `ctl.db.tbl` [#34984](https://github.com/apache/doris/pull/34984) + +- Fixed the issue where the Hive metastore connection did not close when refreshing [#35426](https://github.com/apache/doris/pull/35426) + +- Fixed a potential meta replay issue when upgrading from 2.0.x to 2.1.x. [#35532](https://github.com/apache/doris/pull/35532) + +- Fixed the issue where the Table Valued Function could not read an empty snappy compressed file. 
[#34926](https://github.com/apache/doris/pull/34926) + +- Fixed the issue where unable to read Parquet files with invalid min-max column statistics [#35041](https://github.com/apache/doris/pull/35041) + +- Fixed the issue where unable to handle pushdown predicates with null-aware functions in the Parquet/ORC reader [#35335](https://github.com/apache/doris/pull/35335) + +- Fixed the issue about the order of partition columns when creating a Hive table [#35347](https://github.com/apache/doris/pull/35347) + +- Fixed the issue where writing to a Hive table on S3 failed when partition values contained spaces [#35645](https://github.com/apache/doris/pull/35645) + +- Fixed the issue about incorrect scheme of Aliyun OSS endpoint [#34907](https://github.com/apache/doris/pull/34907) + +- Fixed the issue where the Parquet format Hive table written by Doris could not be read by Hive [#34981](https://github.com/apache/doris/pull/34981) + +- Fixed the issue where unable to read ORC files after the schema change of a Hive table [#35583](https://github.com/apache/doris/pull/35583) + +- Fixed the issue where unable to read Paimon tables via JNI after the schema change of the Paimon table [#35309](https://github.com/apache/doris/pull/35309) + +- Fixed the issue of too small Row Groups in Parquet format files written out. 
[#36042](https://github.com/apache/doris/pull/36042) [#36143](https://github.com/apache/doris/pull/36143) + +- Fixed the issue where unable to read Paimon tables after schema changes [#36049](https://github.com/apache/doris/pull/36049) + +- Fixed the issue where unable to read Hive Parquet format tables after schema changes [#36182](https://github.com/apache/doris/pull/36182) + +- Fixed the FE OOM issue caused by Hadoop FS cache [#36403](https://github.com/apache/doris/pull/36403) + +- Fixed the issue where FE could not start after enabling the Hive Metastore Listener [#36533](https://github.com/apache/doris/pull/36533) + +- Fixed the issue of query performance degradation with a large number of files [#36431](https://github.com/apache/doris/pull/36431) + +- Fixed the timezone issue when reading the timestamp column type in Iceberg [#36435](https://github.com/apache/doris/pull/36435) + +- Fixed DATETIME conversion error and data path error on Iceberg Table. [#35708](https://github.com/apache/doris/pull/35708) + +- Support retaining and passing additional user-defined properties of Table Valued Functions to the S3 SDK. [#35515](https://github.com/apache/doris/pull/35515) + + +### Data import + +- Fixed the issue where `CANCEL LOAD` did not work [#35352](https://github.com/apache/doris/pull/35352) + +- Fixed the issue where a null pointer error in the Publish phase of load transactions prevented the load from completing [#35977](https://github.com/apache/doris/pull/35977) + +- Fixed the issue with bRPC serializing large data files when sent via HTTP [#36169](https://github.com/apache/doris/pull/36169) + +### Data management + +- Fixed the issue where the resource tag in ConnectionContext was not set after forwarding DDL or DML to master FE.
[#35618](https://github.com/apache/doris/pull/35618) + +- Fixed the issue where the restored table name was incorrect when `lower_case_table_names` was enabled [#35508](https://github.com/apache/doris/pull/35508) + +- Fixed the issue where `admin clean trash` could not work [#35271](https://github.com/apache/doris/pull/35271) + +- Fixed the issue where a storage policy could not be deleted from a partition [#35874](https://github.com/apache/doris/pull/35874) + +- Fixed the issue of data loss when importing into a multi-replica automatic partition table [#36586](https://github.com/apache/doris/pull/36586) + +- Fixed the issue where the partition column of a table changed when querying or inserting into an automatic partition table using the old optimizer [#36514](https://github.com/apache/doris/pull/36514) + +### Memory management + +- Fixed the issue of frequent errors in the logs due to failure in obtaining Cgroup meminfo. [#35425](https://github.com/apache/doris/pull/35425) + +- Fixed the issue where the Segment cache size was uncontrolled when using BloomFilter, leading to abnormal process memory growth. [#34871](https://github.com/apache/doris/pull/34871) + +### Permissions + +- Fixed the issue where permission settings were ineffective after enabling case-insensitive table names. [#36557](https://github.com/apache/doris/pull/36557) + +- Fixed the issue where setting LDAP passwords through non-Master FE nodes did not take effect. [#36598](https://github.com/apache/doris/pull/36598) + +- Fixed the issue where authorization could not be checked for the `SELECT COUNT(*)` statement. [#35465](https://github.com/apache/doris/pull/35465) + +### Others + +- Fixed the issue where the client JDBC program could not close the connection if the MySQL connection was broken. [#36616](https://github.com/apache/doris/pull/36616) + +- Fixed MySQL protocol compatibility issue with the `SHOW PROCEDURE STATUS` statement. 
[#35350](https://github.com/apache/doris/pull/35350) + +- The `libevent` now forces Keepalive to solve the issue of connection leaks in certain situations. [#36088](https://github.com/apache/doris/pull/36088) + +## Credits + +Thanks to every one who contributes to this release. + +@airborne12, @amorynan, @AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @caiconghui, @CalvinKirs, @cambyzju, @catpineapple, @cjj2010, @csun5285, @DarvenDuan, @dataroaring, @deardeng, @Doris-Extras, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @Gabriel39, @gavinchou, @GoGoWen, @HappenLee, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @lide-reed, @luennng, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @nextdreamblue, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @TangSiyang2001, @Thearas, @Vallishp, @w41ter, @wangbo, @whutpencil, @wsjz, @wuwenchi, @xiaokang, @xiedeyantu, @XieJiann, @xinyiZzz, @XuPengfei-1020, @xy720, @xzj7019, @yiguolei, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zfr9527, @zgxme, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md new file mode 100644 index 0000000000000..7c1910eeae8c5 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md @@ -0,0 +1,395 @@ +--- +{ + "title": "Release 2.1.5", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.5 was officially released on July 24, 2024.** In this update, we have optimized various functional experiences for data lakehouse and high concurrency scenarios, functionalities of asynchronous materialized views. 
Additionally, we have implemented several improvements and bug fixes to enhance the stability. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The default connection pool size for the JDBC Catalog has been increased from 10 to 30 to prevent connection exhaustion in high-concurrency scenarios. [#37023](https://github.com/apache/doris/pull/37023). + +- The system's reserved memory (low water mark) has been adjusted to `min(6.4GB, MemTotal * 5%)` to mitigate BE OOM issues. + +- When processing multiple statements in a single request, only the last statement's result is returned if the `CLIENT_MULTI_STATEMENTS` flag is not set. + +- Direct modifications to data in asynchronous materialized views are no longer permitted.[#37129](https://github.com/apache/doris/pull/37129) + +- A session variable `use_max_length_of_varchar_in_ctas` has been added to control the behavior of varchar and char type length generation during CTAS (Create Table As Select). The default value is true. When set to false, the derived varchar length is used instead of the maximum length. [#37284](https://github.com/apache/doris/pull/37284) + +- Statistics collection now defaults to enabling the functionality of estimating the number of rows in Hive tables based on file size. [#37694](https://github.com/apache/doris/pull/37694) + +- Transparent rewrite for asynchronous materialized views is now enabled by default. [#35897](https://github.com/apache/doris/pull/35897) + +- Transparent rewrite utilizes partitioned materialized views. If partitions fail, the base tables are unioned with the materialized view to ensure data correctness. [#35897](https://github.com/apache/doris/pull/35897) + +## New features + +### Lakehouse + +- The session variable `read_csv_empty_line_as_null` can be used to control whether empty lines are ignored when reading CSV format files.
[#37153](https://github.com/apache/doris/pull/37153) + + By default, empty lines are ignored. When set to true, empty lines will be read as rows where all columns are null. + +- Compatibility with Presto's complex type output format can be enabled by setting `serde_dialect="presto"`. [#37253](https://github.com/apache/doris/pull/37253) + +### Multi-Table Materialized View + +- Supports non-deterministic functions in materialized view building. [#37651](https://github.com/apache/doris/pull/37651) + +- Atomically replaces definitions of asynchronous materialized views. [#37147](https://github.com/apache/doris/pull/37147) + +- Views creation statements can be viewed via `SHOW CREATE MATERIALIZED VIEW`. [#37125](https://github.com/apache/doris/pull/37125) + +- Transparent rewrites for multi-dimensional aggregation and non-aggregate queries. [#37436](https://github.com/apache/doris/pull/37436) [#37497](https://github.com/apache/doris/pull/37497) + +- Supports DISTINCT aggregations with key columns and partitioning for roll-ups. [#37651](https://github.com/apache/doris/pull/37651) + +- Support for partitioning materialized views to roll up partitions using `date_trunc` [#31812](https://github.com/apache/doris/pull/31812) [#35562](https://github.com/apache/doris/pull/35562) + +- Partitioned table-valued functions (TVFs) are supported. [#36479](https://github.com/apache/doris/pull/36479) + +### Semi-Structured Data Management + +- Tables using the VARIANT type now support partial column updates. [#34925](https://github.com/apache/doris/pull/34925) + +- PreparedStatement support is now enabled by default. [#36581](https://github.com/apache/doris/pull/36581) + +- The VARIANT type can be exported to CSV format. [#37857](https://github.com/apache/doris/pull/37857) + +- `explode_json_object` function transposes JSON Object rows into columns. 
[#36887](https://github.com/apache/doris/pull/36887) + +- The ES Catalog now maps ES NESTED or OBJECT types to the Doris JSON type.[#37101](https://github.com/apache/doris/pull/37101) + +- By default, support_phrase is enabled for inverted indexes with specified analyzers to improve the performance of match_phrase series queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Query Optimizer + +- Support for explaining `DELETE FROM` statements. [#37100](https://github.com/apache/doris/pull/37100) + +- Support for hint form of constant expression parameters [#37988](https://github.com/apache/doris/pull/37988) + +### Memory Management + +- Added an HTTP API to clear the cache. [#36599](https://github.com/apache/doris/pull/36599) + +### Permissions + +- Support for authorization of resources within Table-Valued Functions (TVFs) [#37132](https://github.com/apache/doris/pull/37132) + +## Improvements + +### Lakehouse + +- Upgraded Paimon to version 0.8.1 + +- Fixes ClassNotFoundException for org.apache.commons.lang.StringUtils when querying Paimon tables. [#37512](https://github.com/apache/doris/pull/37512) + +- Added support for Tencent Cloud LakeFS. [#36891](https://github.com/apache/doris/pull/36891) + +- Optimized the timeout duration when fetching file lists for external table queries. [#36842](https://github.com/apache/doris/pull/36842) + +- Configurable via the session variable `fetch_splits_max_wait_time_ms`. + +- Improved default connection logic for SQLServer JDBC Catalog. [#36971](https://github.com/apache/doris/pull/36971) + + By default, the connection encryption settings are not intervened. Only when `force_sqlserver_jdbc_encrypt_false` is set to true, encrypt=false is forcibly added to the JDBC URL to reduce authentication errors. This allows for more flexible control over encryption behavior, enabling it to be turned on or off as needed. + +- Added serde properties to the show create table statements for Hive tables. 
[#37096](https://github.com/apache/doris/pull/37096) + +- Changed the default cache time for Hive table lists on the FE from 1 day to 4 hours + +- Data export (Export/Outfile) now supports specifying compression formats for Parquet and ORC + + For more information, please refer to [docs](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type). + +- When creating a table using CTAS+TVF, partition columns in the TVF are automatically mapped to Varchar(65533) instead of String, allowing them to be used as partition columns for internal tables [#37161](https://github.com/apache/doris/pull/37161) + +- Optimized the number of metadata accesses for Hive write operations [#37127](https://github.com/apache/doris/pull/37127) + +- ES Catalog now supports mapping nested/object types to Doris's Json type. [#37182](https://github.com/apache/doris/pull/37182) + +- Improved error messages when connecting to Oracle using older versions of the ojdbc driver [#37634](https://github.com/apache/doris/pull/37634) + +- When Hudi tables return an empty set during Incremental Read, Doris now also returns an empty set instead of an error [#37636](https://github.com/apache/doris/pull/37636) + +- Fixed an issue where inner-outer table join queries could lead to FE timeouts in some cases [#37757](https://github.com/apache/doris/pull/37757) + +- Fixed an issue with FE metadata replay errors during upgrades from older versions to newer versions when the Hive metastore event listener is enabled. [#37757](https://github.com/apache/doris/pull/37757) + +### Multi-Table Materialized View + +- Automate key column selection for asynchronous materialized views. [#36601](https://github.com/apache/doris/pull/36601) + +- Support date_trunc in materialized view partition definitions. [#35562](https://github.com/apache/doris/pull/35562) + +- Enable transparent rewrites across nested materialized view aggregations.
[#37651](https://github.com/apache/doris/pull/37651) + +- Asynchronous materialized views remain available when schema changes do not affect the correctness of their data. [#37122](https://github.com/apache/doris/pull/37122) + +- Improve planning speed for transparent rewrites. [#37935](https://github.com/apache/doris/pull/37935) + +- When calculating the availability of asynchronous materialized views, the current refresh status is no longer taken into account. [#36617](https://github.com/apache/doris/pull/36617) + +### Semi-Structured Data Management + +- Optimize DESC performance for viewing VARIANT sub-columns through sampling. [#37217](https://github.com/apache/doris/pull/37217) + +- Support for special JSON data with empty keys in the JSON type. [#36762](https://github.com/apache/doris/pull/36762) + +### Inverted Index + +- Reduce latency by minimizing the invocation of inverted index exists to avoid delays in accessing object storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Optimize the overhead of the inverted index query process. [#35357](https://github.com/apache/doris/pull/35357) + +- Prevent inverted indices in materialized views. [#36869](https://github.com/apache/doris/pull/36869) + +### Query Optimizer + +- When both sides of a comparison expression are literals, the string literal will attempt to convert to the type of the other side. [#36921](https://github.com/apache/doris/pull/36921) + +- Refactored the sub-path pushdown functionality for the variant type, now better supporting complex pushdown scenarios. [#36923](https://github.com/apache/doris/pull/36923) + +- Optimized the logic for calculating the cost of materialized views, enabling more accurate selection of lower-cost materialized views. [#37098](https://github.com/apache/doris/pull/37098) + +- Improved the SQL cache planning speed when using user variables in SQL. 
[#37119](https://github.com/apache/doris/pull/37119) + +- Optimized the row estimation logic for NOT NULL expressions, resulting in better performance when NOT NULL is present in queries. [#37498](https://github.com/apache/doris/pull/37498) + +- Optimized the null rejection derivation logic for LIKE expressions. [#37864](https://github.com/apache/doris/pull/37864) + +- Improved error messages when querying a specific partition fails, making it clearer which table is causing the issue. [#37280](https://github.com/apache/doris/pull/37280) + +### Query Execution + +- Improved the performance of the bitmap_union operator up to 3 times in certain scenarios. + +- Enhanced the reading performance of Arrow Flight in ARM environments. + +- Optimized the execution performance of the explode, explode_map, and explode_json functions. + +### Data Loading + +- Support setting `max_filter_ratio` for `INSERT INTO ... FROM TABLE VALUE FUNCTION` + +## Bug fixes + +### Lakehouse + +- Fixed an issue that caused BE crashes in some cases when querying Parquet format [#37086](https://github.com/apache/doris/pull/37086) + +- Fixed an issue where BE printed excessive logs when querying Parquet format. [#37012](https://github.com/apache/doris/pull/37012) + +- Fixed an issue where the FE side created a large number of duplicate FileSystem objects in some cases. [#37142](https://github.com/apache/doris/pull/37142) + +- Fixed an issue where transaction information was not cleaned up after writing to Hive in some cases. [#37172](https://github.com/apache/doris/pull/37172) + +- Fixed a thread leak issue caused by Hive table write operations in some cases. [#37247](https://github.com/apache/doris/pull/37247) + +- Fixed an issue where Hive Text format row and column delimiters could not be correctly obtained in some cases. [#37188](https://github.com/apache/doris/pull/37188) + +- Fixed a concurrency issue when reading lz4 compressed blocks in some cases. 
[#37187](https://github.com/apache/doris/pull/37187) + +- Fixed an issue where `count(*)` on Iceberg tables returned incorrect results in some cases. [#37810](https://github.com/apache/doris/pull/37810) + +- Fixed an issue where creating a Paimon catalog based on MinIO caused FE metadata replay errors in some cases. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed an issue where using Ranger to create a catalog caused the client to hang in some cases. [#37551](https://github.com/apache/doris/pull/37551) + +### Multi-Table Materialized View + +- Fixed an issue where adding new partitions to the base table could lead to incorrect results after partition aggregation roll-up rewrites. [#37651](https://github.com/apache/doris/pull/37651) + +- Fixed an issue where the materialized view partition status was not set to out-of-sync after deleting associated base table partitions. [#36602](https://github.com/apache/doris/pull/36602) + +- Fixed an occasional deadlock issue during asynchronous materialized view builds. [#37133](https://github.com/apache/doris/pull/37133) + +- Fixed an occasional "nereids cost too much time" error when refreshing a large number of partitions in a single asynchronous materialized view refresh. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed an issue where an asynchronous materialized view could not be created if the final select list contained a null literal. [#37281](https://github.com/apache/doris/pull/37281) + +- Fixed an issue with single-table materialized views where, even though the aggregation materialized view was successfully rewritten, the CBO did not select it. [#35721](https://github.com/apache/doris/pull/35721) [#36058](https://github.com/apache/doris/pull/36058) + +- Fixed an issue where partition derivation failed when building a partitioned materialized view with both join inputs being aggregations. 
[#34781](https://github.com/apache/doris/pull/34781) + +### Semi-Structured Data Management + +- Fixed issues with VARIANT in special cases such as concurrency and abnormal data.[#37976](https://github.com/apache/doris/pull/37976) [#37839](https://github.com/apache/doris/pull/37839) [#37794](https://github.com/apache/doris/pull/37794) [#37674](https://github.com/apache/doris/pull/37674) [#36997](https://github.com/apache/doris/pull/36997) + +- Fixed coredump issues when using VARIANT in unsupported SQL. [#37640](https://github.com/apache/doris/pull/37640) + +- Fixed coredump issues related to MAP data type when upgrading from 1.x to 2.x or higher versions. [#36937](https://github.com/apache/doris/pull/36937) + +- Improved ES Catalog support for Array types. [#36936](https://github.com/apache/doris/pull/36936) + +### Inverted Index + +- Fixed an issue where DROP INDEX for Inverted Index v2 did not delete metadata. [#37646](https://github.com/apache/doris/pull/37646) + +- Fixed query accuracy issues when string length exceeded the "ignore above" threshold. [#37679](https://github.com/apache/doris/pull/37679) + +- Fixed issues with index size statistics. [#37232](https://github.com/apache/doris/pull/37232) [#37564](https://github.com/apache/doris/pull/37564) + +### Query Optimizer + +- Fixed an issue that prevented import operations from executing due to the use of reserved keywords. [#35938](https://github.com/apache/doris/pull/35938) + +- Fixed a type error where char(255) was incorrectly recorded as char(1) when creating a table. [#37671](https://github.com/apache/doris/pull/37671) + +- Fixed incorrect results when the join expression in a correlated subquery was a complex expression. [#37683](https://github.com/apache/doris/pull/37683) + +- Fixed a potential issue with incorrect bucket pruning for decimal types. 
[#38013](https://github.com/apache/doris/pull/38013) + +- Fixed incorrect aggregation operator results when pipeline local shuffle was enabled in certain scenarios. [#38016](https://github.com/apache/doris/pull/38016) + +- Fixed planning errors that could occur when equal expressions existed in aggregation operators. [#36622](https://github.com/apache/doris/pull/36622) + +- Fixed planning errors that could occur when lambda expressions were present in aggregation operators. [#37285](https://github.com/apache/doris/pull/37285) + +- Fixed an issue where a literal generated from a window function being optimized to a literal had the wrong type, preventing execution. [#37283](https://github.com/apache/doris/pull/37283) + +- Fixed an issue with the null attribute being incorrectly output by the aggregate function foreach combinator. [#37980](https://github.com/apache/doris/pull/37980) + +- Fixed an issue where the acos function could not be planned when its parameter was a literal out of range. [#37996](https://github.com/apache/doris/pull/37996) + +- Fixed planning errors when specifying partitions for a query on a synchronized materialized view. [#36982](https://github.com/apache/doris/pull/36982) + +- Fixed occasional Null Pointer Exceptions (NPEs) during planning. [#38024](https://github.com/apache/doris/pull/38024) + +### Query Execution + +- Fixed an error in delete where statements when using decimal data types as conditions. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed an issue where BE memory was not released after query execution ended. [#37792](https://github.com/apache/doris/pull/37792) [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed a problem where audit logs occupied too much FE memory under high QPS scenarios. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed BE core dumps when the sleep function received illegal input values. 
[#37681](https://github.com/apache/doris/pull/37681) + +- Fixed an error encountered during sync filter size execution. [#37103](https://github.com/apache/doris/pull/37103) + +- Fixed incorrect results when using time zones during execution. [#37062](https://github.com/apache/doris/pull/37062) + +- Fixed incorrect results when casting strings to integers. [#36788](https://github.com/apache/doris/pull/36788) + +- Fixed query errors when using the Arrow Flight protocol with pipelinex enabled. [#35804](https://github.com/apache/doris/pull/35804) + +- Fixed errors when casting strings to dates/datetimes. [#35637](https://github.com/apache/doris/pull/35637) + +- Fixed BE core dumps during large table join queries using <=>. [#36263](https://github.com/apache/doris/pull/36263) + +### Storage Management + +- Fixed the issue of invisible DELETE SIGN data encountered during column update and write operations. [#36755](https://github.com/apache/doris/pull/36755) + +- Optimized FE's memory usage during schema changes. [#36756](https://github.com/apache/doris/pull/36756) + +- Fixed the issue where BE would hang during restart due to transactions not being aborted [#36437](https://github.com/apache/doris/pull/36437) + +- Fixed occasional errors when changing from NOT NULL to NULL data types. [#36389](https://github.com/apache/doris/pull/36389) + +- Optimized replica repair scheduling when BE goes down. [#36897](https://github.com/apache/doris/pull/36897) + +- Supported round-robin disk selection for tablet creation on a single BE. [#36900](https://github.com/apache/doris/pull/36900) + +- Fixed query error -230 caused by slow publishing. [#36222](https://github.com/apache/doris/pull/36222) + +- Improved the speed of partition balancing. [#36976](https://github.com/apache/doris/pull/36976) + +- Controlled segment cache using the number of file descriptors (FDs) and memory to avoid FD exhaustion. 
[#37035](https://github.com/apache/doris/pull/37035) + +- Fixed potential replica loss caused by concurrent clone and alter operations. [#36858](https://github.com/apache/doris/pull/36858) + +- Fixed the issue of not being able to adjust column order. [#37226](https://github.com/apache/doris/pull/37226) + +- Prohibited certain schema change operations on auto-increment columns. [#37331](https://github.com/apache/doris/pull/37331) + +- Fixed inaccurate error reporting for DELETE operations. [#37374](https://github.com/apache/doris/pull/37374) + +- Adjusted the trash expiration time on BE side to one day. [#37409](https://github.com/apache/doris/pull/37409) + +- Optimized compaction memory usage and scheduling. [#37491](https://github.com/apache/doris/pull/37491) + +- Checked for potential oversized backups causing FE restarts. [#37466](https://github.com/apache/doris/pull/37466) + +- Restored dynamic partition deletion policies and cross-partition behaviors to 2.1.3. [#37570](https://github.com/apache/doris/pull/37570) [#37506](https://github.com/apache/doris/pull/37506) + +- Fixed errors related to decimal types in DELETE predicates. [#37710](https://github.com/apache/doris/pull/37710) + +### Data Loading + +- Fixed data invisibility issues caused by race conditions in error handling during imports. [#36744](https://github.com/apache/doris/pull/36744) + +- Added support for `hll_from_base64` in streamload imports. [#36819](https://github.com/apache/doris/pull/36819) + +- Fixed potential FE OOM issues when importing very large numbers of tablets for a single table. [#36944](https://github.com/apache/doris/pull/36944) + +- Fixed possible auto-increment column duplication during FE master-slave switchovers. [#36961](https://github.com/apache/doris/pull/36961) + +- Fixed errors in `insert into select` with auto-increment columns. [#37029](https://github.com/apache/doris/pull/37029) + +- Reduced the number of data flush threads to optimize memory usage.
[#37092](https://github.com/apache/doris/pull/37092) + +- Improved automatic recovery and error messaging for routine load tasks. [#37371](https://github.com/apache/doris/pull/37371) + +- Increased the default batch size for routine load. [#37388](https://github.com/apache/doris/pull/37388) + +- Fixed routine load task stoppage due to Kafka EOF expiration. [#37983](https://github.com/apache/doris/pull/37983) + +- Fixed coredump issues in multi-table streaming. [#37370](https://github.com/apache/doris/pull/37370) + +- Fixed premature backpressure caused by inaccurate memory estimation in groupcommit. [#37379](https://github.com/apache/doris/pull/37379) + +- Optimized BE-side thread usage in groupcommit. [#37380](https://github.com/apache/doris/pull/37380) + +- Fixed the issue of no error URL when data was not partitioned. [#37401](https://github.com/apache/doris/pull/37401) + +- Fixed potential memory misoperations during imports. [#38021](https://github.com/apache/doris/pull/38021) + +### Merge on Write Unique Key + +- Reduced memory usage during compaction for primary key tables. [#36968](https://github.com/apache/doris/pull/36968) + +- Fixed potential duplicate data issues when primary key replica cloning fails. [#37229](https://github.com/apache/doris/pull/37229) + +### Permissions + +- Fixed the issue of missing authorization when a table-valued function references a resource. [#37132](https://github.com/apache/doris/pull/37132) + +- Fixed the issue where the SHOW ROLE statement did not include workload group permissions. [#36032](https://github.com/apache/doris/pull/36032) + +- Fixed the issue where executing two statements simultaneously when creating a row policy could cause FE to fail to restart. [#37342](https://github.com/apache/doris/pull/37342) + +- Fixed the issue where, in some cases, upgrading from an older version could result in FE metadata replay failures due to row policies. 
[#37342](https://github.com/apache/doris/pull/37342) + +### Others + +- Fixed the issue of compute nodes participating in internal table creation. [#37961](https://github.com/apache/doris/pull/37961) + +- Fixed the read lag issue when `enable_strong_read_consistency` is set to true. [#37641](https://github.com/apache/doris/pull/37641) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md new file mode 100644 index 0000000000000..c14d25b52573f --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md @@ -0,0 +1,524 @@ +--- +{ + "title": "Release 2.1.6", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.6 was officially released on September 10, 2024.** This version brings continuous upgrades and improvements to the Lakehouse, Async Materialized Views, and Semi-Structured Data Management. Additionally, several fixes have been implemented in areas such as the query optimizer, execution engine, storage management, and permission management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- Removed the `delete_if_exists` option from create repository. [#38192](https://github.com/apache/doris/pull/38192) + +- Added the `enable_prepared_stmt_audit_log` session variable to control whether JDBC prepared statements record audit logs, with the default being no recording. [#38624](https://github.com/apache/doris/pull/38624) [#39009](https://github.com/apache/doris/pull/39009) + +- Implemented fd limit and memory constraints for segment cache. [#39689](https://github.com/apache/doris/pull/39689) + +- When the FE configuration item `sys_log_mode` is set to BRIEF, file location information is added to the logs.
[#39571](https://github.com/apache/doris/pull/39571) + +- Changed the default value of the session variable `max_allowed_packet` to 16MB. [#38697](https://github.com/apache/doris/pull/38697) + +- When a single request contains multiple statements, semicolons must be used to separate them. [#38670](https://github.com/apache/doris/pull/38670) + +- Added support for statements to begin with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Aligned type formatting with MySQL in statements such as `show create table`. [#38012](https://github.com/apache/doris/pull/38012) + +- When planning in the new optimizer times out, it no longer falls back to the old optimizer, which avoids even longer planning times. [#39499](https://github.com/apache/doris/pull/39499) + +## New features + +### Lakehouse + +- Supported writeback for Iceberg tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/lakehouse/datalake-building/iceberg-build). + +- SQL interception rules now support external tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Added the system table `file_cache_statistics` to view BE data cache metrics. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics). + +### Async Materialized View + +- Supported transparent rewriting during inserts. [#38115](https://github.com/apache/doris/pull/38115) + +- Supported transparent rewriting when variant types exist in queries. [#37929](https://github.com/apache/doris/pull/37929) + +### Semi-Structured Data Management + +- Supported casting ARRAY and MAP to JSON type. [#36548](https://github.com/apache/doris/pull/36548) + +- Supported the `json_keys` function. [#36411](https://github.com/apache/doris/pull/36411) + +- Supported specifying the JSON path `$.` when importing JSON.
[#38213](https://github.com/apache/doris/pull/38213) + +- ARRAY, MAP, STRUCT types now support `replace_if_not_null`[#38304](https://github.com/apache/doris/pull/38304) + +- ARRAY, MAP, STRUCT types now support adjusting column order.[#39210](https://github.com/apache/doris/pull/39210) + +- Added the `multi_match` function to match keywords across multiple fields, with support for inverted index acceleration. [#37722](https://github.com/apache/doris/pull/37722) + +### Query Optimizer + +- Filled in the original database name, table name, column name, and alias for returned columns in the MySQL protocol. [ #38126](https://github.com/apache/doris/pull/38126) + +- Supported the aggregation function `group_concat` with both order by and distinct simultaneously. [#38080](https://github.com/apache/doris/pull/38080) + +- SQL cache now supports reusing cached results for queries with different comments. [#40049](https://github.com/apache/doris/pull/40049) + +- In partition pruning, supported including `date_trunc` and date functions in filter conditions. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- Allowed using the database name where the table resides as a qualifier prefix for table aliases. [#38640](https://github.com/apache/doris/pull/38640) + +- Supported hint-style comments.[#39113](https://github.com/apache/doris/pull/39113) + +### Others + +- Added the system table `table_properties` for viewing table properties. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_properties). + +- Introduced deadlock and slow lock detection in FE. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/maint-monitor/frontend-lock-manager). + +## Improvements + +### Lakehouse + +- Reimplemented the external table metadata caching mechanism. 
+ + - For details, refer to the [documentation](https://doris.apache.org/docs/lakehouse/metacache). + +- Added the session variable `keep_carriage_return` with a default value of false. By default, reading Hive Text format tables treats both `\r\n` and `\n` as newline characters. [#38099](https://github.com/apache/doris/pull/38099) + +- Optimized memory statistics for Parquet/ORC file read/write operations.[#37257](https://github.com/apache/doris/pull/37257) + +- Supported pushing down IN/NOT IN predicates for Paimon tables. [#38390](https://github.com/apache/doris/pull/38390) + +- Enhanced the optimizer to support Time Travel syntax for Hudi tables. [#38591](https://github.com/apache/doris/pull/38591) + +- Optimized Kerberos authentication-related processes. [ #37301](https://github.com/apache/doris/pull/37301) + +- Enabled reading Hive tables after renaming column operations. [#38809](https://github.com/apache/doris/pull/38809) + +- Optimized the reading performance of partition columns for external tables. [#38810](https://github.com/apache/doris/pull/38810) + +- Improved the data shard merging strategy during external table query planning to avoid performance degradation caused by a large number of small shards.[#38964](https://github.com/apache/doris/pull/38964) + +- Added attributes such as location to `SHOW CREATE DATABASE/TABLE`. [#39644](https://github.com/apache/doris/pull/39644) + +- Supported complex types in MaxCompute Catalog. [#39822](https://github.com/apache/doris/pull/39822) + +- Optimized the file cache loading strategy by using asynchronous loading to avoid long BE startup times. [#39036](https://github.com/apache/doris/pull/39036) + +- Improved the file cache eviction strategy, such as evicting locks held for extended periods. [#39721](https://github.com/apache/doris/pull/39721) + +### Async Materialized View + +- Supported hourly, weekly, and quarterly partition roll-up construction. 
[#37678](https://github.com/apache/doris/pull/37678) + +- For materialized views based on Hive external tables, the metadata cache is now updated before refresh to ensure the latest data is obtained during each refresh. [#38212](https://github.com/apache/doris/pull/38212) + +- Improved the performance of transparent rewrite planning in storage-compute decoupled mode by batch fetching metadata. [#39301](https://github.com/apache/doris/pull/39301) + +- Enhanced the performance of transparent rewrite planning by prohibiting duplicate enumerations. [#39541](https://github.com/apache/doris/pull/39541) + +- Improved the performance of transparent rewrite for refreshing materialized views based on Hive external table partitions.[#38525](https://github.com/apache/doris/pull/38525) + +### Semi-Structured Data Management + +- Optimized memory allocation for TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- Enhanced the performance of string processing in inverted indexes.[#37395](https://github.com/apache/doris/pull/37395) + +- Optimized the performance of inverted indexes in MOW tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Supported specifying the row-store `page_size` during table creation to control compression effectiveness. [#37145](https://github.com/apache/doris/pull/37145) + +### Query Optimizer + +- Adjusted the row count estimation algorithm for mark joins, resulting in more accurate cardinality estimates for mark joins. [#38270](https://github.com/apache/doris/pull/38270) + +- Optimized the cost estimation algorithm for semi/anti joins, enabling more accurate selection of semi/anti join orders. [#37951](https://github.com/apache/doris/pull/37951) + +- Adjusted the filter estimation algorithm for cases where some columns have no statistical information, leading to more accurate cardinality estimates. 
[#39592](https://github.com/apache/doris/pull/39592) + +- Modified the instance calculation logic for set operation operators to prevent insufficient parallelism in extreme cases. [#39999](https://github.com/apache/doris/pull/39999) + +- Adjusted the usage strategy of bucket shuffle, achieving better performance when data is not sufficiently shuffled. [#36784](https://github.com/apache/doris/pull/36784) + +- Enabled early filtering of window function data, supporting multiple window functions in a single projection. [#38393](https://github.com/apache/doris/pull/38393) + +- When a `NullLiteral` exists in a filter condition, it can now be folded into false, further converted to an `EmptySet` to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Expanded the scope of predicate derivation, reducing data scanning in queries with specific patterns. [#37314](https://github.com/apache/doris/pull/37314) + +- Supported partial short-circuit evaluation logic in partition pruning to improve partition pruning performance, achieving over 100% improvement in specific scenarios. [#38191](https://github.com/apache/doris/pull/38191) + +- Enabled the computation of arbitrary scalar functions within user variables. [#39144](https://github.com/apache/doris/pull/39144) + +- Maintained error messages consistent with MySQL when alias conflicts exist in queries. [#38104](https://github.com/apache/doris/pull/38104) + +### Query Execution + +- Adapted AggState for compatibility from 2.1 to 3.x and fixed coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle when no joins are involved. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to an asynchronous approach to prevent blocking during internal table queries. 
[#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process when building hash tables in Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Reduced the lock holding time for MultiCast operations. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC's keepAliveTime and added a connection monitoring mechanism, reducing the probability of query failures due to RPC errors during query execution. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits are exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Improved the performance of `aes_encrypt`/`aes_decrypt` functions when handling constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the performance of `json_extract` functions when processing constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the performance of ParseURL functions when processing constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Backup Recovery / CCR + +- Restore now supports deleting redundant tablets and partition options. [#39363](https://github.com/apache/doris/pull/39363) + +- Checks storage connectivity when creating a repository. [#39538](https://github.com/apache/doris/pull/39538) + +- Enables binlog to support `DROP TABLE`, allowing CCR to incrementally synchronize `DROP TABLE` operations. [#38541](https://github.com/apache/doris/pull/38541) + +### Compaction + +- Improves the issue where high-priority compaction tasks were not subject to task concurrency control limits. [#38189](https://github.com/apache/doris/pull/38189) + +- Automatically reduces compaction memory consumption based on data characteristics. [#37486](https://github.com/apache/doris/pull/37486) + +- Fixes an issue where the sequential data optimization strategy could lead to incorrect data in aggregate tables or MOR UNIQUE tables.
[#38299](https://github.com/apache/doris/pull/38299) + +- Optimizes the rowset selection strategy for compaction during replica replenishment to avoid triggering -235 errors. [#39262](https://github.com/apache/doris/pull/39262) + +### MOW (Merge-On-Write) + +- Optimizes slow column updates caused by concurrent column updates and compactions. [#38682](https://github.com/apache/doris/pull/38682) + +- Fixes an issue where segcompaction during bulk data imports could lead to incorrect MOW data. [#38992](https://github.com/apache/doris/pull/38992) [#39707](https://github.com/apache/doris/pull/39707) + +- Fixes data loss in column updates that may occur after BE restarts. [#39035](https://github.com/apache/doris/pull/39035) + +### Storage Management + +- Adds FE configuration to control whether queries under hot-cold tiering prefer local data replicas. [#38322](https://github.com/apache/doris/pull/38322) + +- Optimizes expired BE report messages to include newly created tablets. [#38839](https://github.com/apache/doris/pull/38839) [#39605](https://github.com/apache/doris/pull/39605) + +- Optimizes replica scheduling priority strategy to prioritize replicas with missing data. [#38884](https://github.com/apache/doris/pull/38884) + +- Prevents tablets with unfinished ALTER jobs from being balanced. [#39202](https://github.com/apache/doris/pull/39202) + +- Enables modifying the number of buckets for tables with list partitioning. [#39688](https://github.com/apache/doris/pull/39688) + +- Prefers querying from online disk services. [#39654](https://github.com/apache/doris/pull/39654) + +- Improves error messages for materialized view base tables that do not support deletion during synchronization. [#39857](https://github.com/apache/doris/pull/39857) + +- Improves error messages for single columns exceeding 4GB.
[#39897](https://github.com/apache/doris/pull/39897) + +- Fixes an issue where aborted transactions were omitted when plan errors occurred during `INSERT` statements.[#38260](https://github.com/apache/doris/pull/38260) + +- Fixes exceptions during SSL connection closure.[#38677](https://github.com/apache/doris/pull/38677) + +- Fixes an issue where table locks were not held when aborting transactions using labels. [#38842](https://github.com/apache/doris/pull/38842) + +- Fixes `gson pretty` causing large image issues. [#39135](https://github.com/apache/doris/pull/39135) + +- Fixes an issue where the new optimizer did not check for bucket values of 0 in `CREATE TABLE` statements.[#38999](https://github.com/apache/doris/pull/38999) + +- Fixes errors when Chinese column names are included in `DELETE` condition predicates. [#39500](https://github.com/apache/doris/pull/39500) + +- Fixes frequent tablet balancing issues in partition balancing mode. [#39606](https://github.com/apache/doris/pull/39606) + +- Fixes an issue where partition storage policy attributes were lost. [#39677](https://github.com/apache/doris/pull/39677) + +- Fixes incorrect statistics when importing multiple tables within a transaction. [#39548](https://github.com/apache/doris/pull/39548) + +- Fixes errors when deleting random bucket tables. [#39830](https://github.com/apache/doris/pull/39830) + +- Fixes issues where FE fails to start due to non-existent UDFs. [#39868](https://github.com/apache/doris/pull/39868) + +- Fixes inconsistencies in the last failed version between FE master and slave. [#39947](https://github.com/apache/doris/pull/39947) + +- Fixes an issue where related tablets may still be in schema change state when schema change jobs are canceled. [ #39327](https://github.com/apache/doris/pull/39327) + +- Fixes errors when modifying type and column order in a single statement schema change (SC). 
[#39107](https://github.com/apache/doris/pull/39107) + +### Data Loading + +- Improves error messages for -238 errors during imports. [#39182](https://github.com/apache/doris/pull/39182) + +- Allows importing to other partitions while restoring a partition. [#39915](https://github.com/apache/doris/pull/39915) + +- Optimizes the strategy for FE to select BEs during group commit. [#37830](https://github.com/apache/doris/pull/37830) [#39010](https://github.com/apache/doris/pull/39010) + +- Avoids printing stack traces for some common streamload error messages. [#38418](https://github.com/apache/doris/pull/38418) + +- Improves handling of issues where offline BEs may affect import errors. [#38256](https://github.com/apache/doris/pull/38256) + +### Permissions + +- Optimizes access performance after enabling the Ranger authentication plugin. [#38575](https://github.com/apache/doris/pull/38575) +- Optimizes permission strategies for Refresh Catalog/Database/Table operations, allowing users to perform these operations with only SHOW permissions. [#39008](https://github.com/apache/doris/pull/39008) + +## Bug fixes + +### Lakehouse + +- Fixes the issue where switching catalogs may result in an error of not finding the database. [#38114](https://github.com/apache/doris/pull/38114) + +- Addresses exceptions caused by attempting to read non-existent data on S3. [#38253](https://github.com/apache/doris/pull/38253) + +- Resolves the issue where specifying an abnormal path during export operations may lead to incorrect export locations. [#38602](https://github.com/apache/doris/pull/38602) + +- Fixes the timezone issue for time columns in Paimon tables. [#37716](https://github.com/apache/doris/pull/37716) + +- Temporarily disables the Parquet PageIndex feature to avoid certain erroneous behaviors. + +- Corrects the selection of Backend nodes in the blacklist during external table queries. 
[#38984](https://github.com/apache/doris/pull/38984) + +- Resolves errors caused by missing subcolumns in Parquet Struct column types.[#39192](https://github.com/apache/doris/pull/39192) + +- Addresses several issues with predicate pushdown in JDBC Catalog. [#39082](https://github.com/apache/doris/pull/39082) + +- Fixes issues where some historical Parquet formats led to incorrect query results. [#39375](https://github.com/apache/doris/pull/39375) + +- Improves compatibility with ojdbc6 drivers for Oracle JDBC Catalog. [#39408](https://github.com/apache/doris/pull/39408) + +- Resolves potential FE memory leaks caused by Refresh Catalog/Database/Table operations. [#39186](https://github.com/apache/doris/pull/39186) [#39871](https://github.com/apache/doris/pull/39871) + +- Fixes thread leaks in JDBC Catalog under certain conditions. [#39666](https://github.com/apache/doris/pull/39666) [#39582](https://github.com/apache/doris/pull/39582) + +- Addresses potential event processing failures after enabling Hive Metastore event subscription. [#39239](https://github.com/apache/doris/pull/39239) + +- Disables reading Hive Text format tables with custom escape characters and null formats to prevent data errors. [#39869](https://github.com/apache/doris/pull/39869) + +- Resolves issues accessing Iceberg tables created via the Iceberg API under certain conditions. [#39203](https://github.com/apache/doris/pull/39203) + +- Fixes the inability to read Paimon tables stored on HDFS clusters with high availability enabled. [#39876](https://github.com/apache/doris/pull/39876) + +- Addresses errors that may occur when reading Paimon table deletion vectors after enabling file caching. [#39875](https://github.com/apache/doris/pull/39875) + +- Resolves potential deadlocks when reading Parquet files under certain conditions. [#39945](https://github.com/apache/doris/pull/39945) + +### Async Materialized View + +- Fixes the inability to use `SHOW CREATE MATERIALIZED VIEW` on follower FEs. 
[#38794](https://github.com/apache/doris/pull/38794) + +- Unifies the object type of asynchronous materialized views in metadata as tables to enable proper display in data tools. [#38797](https://github.com/apache/doris/pull/38797) + +- Resolves the issue where nested asynchronous materialized views always perform full refreshes. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixes the issue where canceled tasks may show as running after restarting FEs. [ #39424](https://github.com/apache/doris/pull/39424) + +- Addresses incorrect use of contexts, which may lead to unexpected failures of materialized view refresh tasks. [#39690](https://github.com/apache/doris/pull/39690) + +- Resolves issues that may cause varchar type write failures due to unreasonable lengths when creating asynchronous materialized views based on external tables.[#37668](https://github.com/apache/doris/pull/37668) + +- Fixes the potential invalidation of asynchronous materialized views based on external tables after FE restarts or catalog rebuilds. [#39355](https://github.com/apache/doris/pull/39355) + +- Prohibits the use of partition rollup for materialized views with list partitions to prevent the generation of incorrect data. [#38124](https://github.com/apache/doris/pull/38124) + +- Fixes incorrect results when literals exist in the select list during transparent rewriting for aggregation rollup. [#38958](https://github.com/apache/doris/pull/38958) + +- Addresses potential errors during transparent rewriting when queries contain filters like `a = a`. [#39629](https://github.com/apache/doris/pull/39629) + +- Fixes issues where transparent rewriting for direct external table queries fails. [#39041](https://github.com/apache/doris/pull/39041) + +### Semi-Structured Data Management + +- Removes support for prepared statements in the old optimizer. [#39465](https://github.com/apache/doris/pull/39465) + +- Fixes issues with JSON escape character handling. 
[#37251](https://github.com/apache/doris/pull/37251) + +- Resolves issues with duplicate processing of JSON fields. [#38490](https://github.com/apache/doris/pull/38490) + +- Fixes issues with some ARRAY and MAP functions. [#39307](https://github.com/apache/doris/pull/39307) [#39699](https://github.com/apache/doris/pull/39699) [#39757](https://github.com/apache/doris/pull/39757) + +- Resolves complex combinations of inverted index queries and LIKE queries. [#36687](https://github.com/apache/doris/pull/36687) + +### Query Optimizer + +- Fixed the potential partition pruning error issue when the 'OR' condition exists in partition filter conditions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the potential partition pruning error issue when complex expressions are involved. [#39298](https://github.com/apache/doris/pull/39298) + +- Fixed the issue where nullable in `agg_state` subtypes might be planned incorrectly, leading to execution errors. [#37489](https://github.com/apache/doris/pull/37489) + +- Fixed the issue where nullable in set operation operators might be planned incorrectly, leading to execution errors. [#39109](https://github.com/apache/doris/pull/39109) + +- Fixed the incorrect execution priority issue of intersect operator. [#39095](https://github.com/apache/doris/pull/39095) + +- Fixed the NPE issue that may occur when the maximum valid date literal exists in the query. [#39482](https://github.com/apache/doris/pull/39482) + +- Fixed the occasional planning error that results in an illegal slot error during execution. [#39640](https://github.com/apache/doris/pull/39640) + +- Fixed the issue where repeatedly referencing columns in cte may lead to missing data in some columns in the result. [#39850](https://github.com/apache/doris/pull/39850) + +- Fixed the occasional planning error issue when 'case when' exists in the query. 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where IP types cannot be implicitly converted to string types. [#39318](https://github.com/apache/doris/pull/39318) + +- Fixed the potential planning error issue when using multi-dimensional aggregation and the same column and its alias exist in the select list. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where boolean types might be handled incorrectly when using BE constant folding. [#39019](https://github.com/apache/doris/pull/39019) + +- Fixed the planning error issue caused by `default_cluster:` as a prefix for the database name in expressions. [#39114](https://github.com/apache/doris/pull/39114) + +- Fixed the potential deadlock issue caused by `insert into`. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the potential planning error issue caused by not holding table locks throughout the planning process. [#38950](https://github.com/apache/doris/pull/38950) + +- Fixed the issue where CHAR(0), VARCHAR(0) are not handled correctly when creating tables. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the issue where `show create table` may incorrectly display hidden columns. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the issue where columns with the same name as hidden columns are not prohibited when creating tables. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the occasional planning error issue when executing `insert into as select` with CTEs. [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where `insert into values` cannot automatically fill null default values. [#39122](https://github.com/apache/doris/pull/39122) + +- Fixed the NPE issue caused by defining a CTE in a `delete` statement without referencing it.
[#39379](https://github.com/apache/doris/pull/39379) + +- Fixed the issue where deleting from a randomly distributed aggregation model table fails. [#37985](https://github.com/apache/doris/pull/37985) + +### Query Execution + +- Fixed the issue where the pipeline execution engine gets stuck in multiple scenarios, causing queries not to end. [#38657](https://github.com/apache/doris/pull/38657) [#38206](https://github.com/apache/doris/pull/38206) [#38885](https://github.com/apache/doris/pull/38885) + +- Fixed the coredump issue caused by null and non-null columns in set difference calculations.[#38737](https://github.com/apache/doris/pull/38737) + +- Fixed the incorrect result issue of the `width_bucket` function. [#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error issue when a single row of data is large and the result set is also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the incorrect result issue of `stddev` with DecimalV2 type. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the issue where `insert overwrite auto partition` causes transaction rollback. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the incorrect result issue of the `convert_tz` function. [#37358](https://github.com/apache/doris/pull/37358) [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by the mod function with abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where executing the same expression in multiple threads may lead to incorrect Java UDF results. 
[#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by the incorrect return type of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the unstable result issue of the histogram function. [#38608](https://github.com/apache/doris/pull/38608) + +### Backup & Recovery / CCR + +- Fixed the issue where the data version after backup and recovery may be incorrect, leading to unreadability. [#38343](https://github.com/apache/doris/pull/38343) + +- Fixed the issue of using restore version across versions. [#38396](https://github.com/apache/doris/pull/38396) + +- Fixed the issue where the job is not canceled when backup fails. [#38993](https://github.com/apache/doris/pull/38993) + +- Fixed the NPE issue in CCR during the upgrade from 2.1.4 to 2.1.5, which caused the FE to fail to start. [#39910](https://github.com/apache/doris/pull/39910) + +- Fixed the issue where views and materialized views cannot be used after restoration. [#38072](https://github.com/apache/doris/pull/38072) [#39848](https://github.com/apache/doris/pull/39848) + +### Storage Management + +- Fixed possible memory leaks in routine load when loading multiple tables from a single stream. [#38824](https://github.com/apache/doris/pull/38824) + +- Fixed the issue where delimiters and escape characters in routine load were not effective. [#38825](https://github.com/apache/doris/pull/38825) + +- Fixed incorrect `show routine load` results when the routine load task name contained uppercase letters. [#38826](https://github.com/apache/doris/pull/38826) + +- Fixed the issue where the offset cache was not reset when changing the routine load topic. [#38474](https://github.com/apache/doris/pull/38474) + +- Fixed the potential exception triggered by `show routine load` under concurrent scenarios. [#39525](https://github.com/apache/doris/pull/39525) + +- Fixed the issue where routine load might import data repeatedly.
[#39526](https://github.com/apache/doris/pull/39526) + +- Fixed the data error caused by `setNull` when enabling group commit via JDBC. [#38276](https://github.com/apache/doris/pull/38276) + +- Fixed the potential NPE issue when enabling group commit insert to a non-master FE. [#38345](https://github.com/apache/doris/pull/38345) + +- Fixed incorrect error handling during internal data writing in group commit. [#38997](https://github.com/apache/doris/pull/38997) + +- Fixed the coredump that might be triggered when the group commit execution plan failed. [#39396](https://github.com/apache/doris/pull/39396) + +- Fixed the issue where concurrent imports into auto partition tables might report non-existent tablets. [#38793](https://github.com/apache/doris/pull/38793) + +- Fixed potential load stream leakage issues. [#39039](https://github.com/apache/doris/pull/39039) + +- Fixed the issue where transactions were opened for `insert into select` with no data. [#39108](https://github.com/apache/doris/pull/39108) + +- Ignored the single-replica import configuration when using memtable prefetching. [#39154](https://github.com/apache/doris/pull/39154) + +- Fixed the issue where background imports of stream load records might be abnormally aborted upon encountering db deletion. [#39527](https://github.com/apache/doris/pull/39527) + +- Fixed inaccurate error messages when data errors occurred in strict mode. [#39587](https://github.com/apache/doris/pull/39587) + +- Fixed the issue where streamload did not return an error URL upon encountering erroneous data. [#38417](https://github.com/apache/doris/pull/38417) + +- Fixed the issue with the combined use of insert overwrite and auto partition. [#38442](https://github.com/apache/doris/pull/38442) + +- Fixed parsing errors when CSV encountered data where the line delimiter was enclosed by the enclosing character. 
[#38445](https://github.com/apache/doris/pull/38445) + +### Data Exporting + +- Fixed the issue where enabling the `delete_existing_files` property during export operations might result in duplicate deletion of exported data. [#39304](https://github.com/apache/doris/pull/39304) + +### Permissions + +- Fixed the incorrect requirement of ALTER TABLE permission when creating a materialized view. [#38011](https://github.com/apache/doris/pull/38011) + +- Fixed the issue where the db was explicitly displayed as empty when showing routine load. [#38365](https://github.com/apache/doris/pull/38365) + +- Fixed the incorrect requirement of CREATE permission on the original table when using CREATE TABLE LIKE. [#37879](https://github.com/apache/doris/pull/37879) + +- Fixed the issue where grant operations did not check if the object existed. [#39597](https://github.com/apache/doris/pull/39597) + +## Upgrade suggestions + +When upgrading Doris, please follow the principle of not skipping two minor versions and upgrade sequentially. + +For example, if you are upgrading from version 0.15.x to 2.0.x, it is recommended to first upgrade to the latest version of 1.1, then upgrade to the latest version of 1.2, and finally upgrade to the latest version of 2.0. + +For more upgrade information, see the documentation: [Cluster Upgrade](../../admin-manual/cluster-management/upgrade) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md new file mode 100644 index 0000000000000..414229276e6b0 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md @@ -0,0 +1,180 @@ +--- +{ + "title": "Release 2.1.7", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.7 was officially released on November 10, 2024.** This version brings continuous upgrades and improvements.
Additionally, several fixes have been implemented in areas such as Lakehouse, Async Materialized Views, Semi-Structured Data Management, Query Optimizer, and Permission Management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The following global variables are forcibly set to the following values: + - enable_nereids_dml: true + - enable_nereids_dml_with_pipeline: true + - enable_nereids_planner: true + - enable_fallback_to_original_planner: true + - enable_pipeline_x_engine: true +- New columns have been added to the audit log. [#42262](https://github.com/apache/doris/pull/42262) + - For more information, please refer to [docs](https://doris.apache.org/docs/admin-manual/audit-plugin/) + +## New features + +### Async Materialized View + +- Asynchronous materialized views now have a `use_for_rewrite` property that controls whether they participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Execution + +- The list of changed session variables is now output in the Profile. [#41016](https://github.com/apache/doris/pull/41016) +- Support for `trim_in`, `ltrim_in`, and `rtrim_in` functions has been added. [#42641](https://github.com/apache/doris/pull/42641) +- Support for several URL functions (`top_level_domain`, `first_significant_subdomain`, `cut_to_first_significant_subdomain`) has been added. [#42916](https://github.com/apache/doris/pull/42916) +- The `bit_set` function has been added. [#42916](https://github.com/apache/doris/pull/42099) +- The `count_substrings` function has been added. [#42055](https://github.com/apache/doris/pull/42055) +- The `translate` and `url_encode` functions have been added.
[#41051](https://github.com/apache/doris/pull/41051) +- The `normal_cdf`, `to_iso8601`, and `from_iso8601_date` functions have been added. [#40695](https://github.com/apache/doris/pull/40695) + + +### Storage Management + +- The `information_schema.table_options` and `table_properties` system tables have been added, supporting the querying of attributes set during table creation. [#34384](https://github.com/apache/doris/pull/34384) +- Support for `bitmap_empty` as a default value has been implemented. [#40364](https://github.com/apache/doris/pull/40364) +- A new session variable `require_sequence_in_insert` has been introduced to control whether a sequence column must be provided when performing `INSERT INTO SELECT` writes to a unique key table. [#41655](https://github.com/apache/doris/pull/41655) + +### Others + +- Allow for generating flame graphs on the BE WebUI page.[#41044](https://github.com/apache/doris/pull/41044) + +## Improvements + +### Lakehouse + +- Support for writing data to Hive text format tables. [#40537](https://github.com/apache/doris/pull/40537) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build) +- Access MaxCompute data using MaxCompute Open Storage API. [#41610](https://github.com/apache/doris/pull/41610) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/database/max-compute) +- Support for Paimon DLF Catalog. 
[#41694](https://github.com/apache/doris/pull/41694) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/paimon) +- Added `table$partitions` syntax to directly query Hive partition information. [#41230](https://github.com/apache/doris/pull/41230) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive) +- Support for reading Parquet files in brotli compression format. [#42162](https://github.com/apache/doris/pull/42162) +- Support for reading DECIMAL 256 types in Parquet files. [#42241](https://github.com/apache/doris/pull/42241) +- Support for reading Hive tables in OpenCsvSerde format. [#42939](https://github.com/apache/doris/pull/42939) + +### Async Materialized View + +- Refined the granularity of lock holding during the build process for asynchronous materialized views. [#40402](https://github.com/apache/doris/pull/40402) [#41010](https://github.com/apache/doris/pull/41010) + +### Query optimizer + +- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. [#40457](https://github.com/apache/doris/pull/40457) +- Runtime filters can now be generated in more scenarios to improve query performance. [#40815](https://github.com/apache/doris/pull/40815) +- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. [#40820](https://github.com/apache/doris/pull/40820) +- Optimized the column pruning algorithm to enhance query performance. [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Supported parallel preparation to reduce the time consumed by short queries. [#40270](https://github.com/apache/doris/pull/40270) +- Corrected the names of some counters in the profile to match the audit logs. [#41993](https://github.com/apache/doris/pull/41993) +- Added new local shuffle rules to speed up certain queries.
[#40637](https://github.com/apache/doris/pull/40637) + +### Storage Management + +- The `SHOW PARTITIONS` command now supports displaying the commit version. [#28274](https://github.com/apache/doris/pull/28274) +- Checked for unreasonable partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Optimized the scheduling logic when encountering EOF in Routine Load. [#40509](https://github.com/apache/doris/pull/40509) +- Made Routine Load aware of schema changes. [#40508](https://github.com/apache/doris/pull/40508) +- Improved the timeout logic for Routine Load tasks. [#41135](https://github.com/apache/doris/pull/41135) + +### Others + +- Allowed closing the built-in service port of BRPC via BE configuration. [#41047](https://github.com/apache/doris/pull/41047) +- Fixed issues with missing fields and duplicate records in audit logs. [#43015](https://github.com/apache/doris/pull/43015) + +## Bug fixes + +### Lakehouse + +- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. [#39840](https://github.com/apache/doris/pull/39840) +- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. [#40424](https://github.com/apache/doris/pull/40424) +- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. [#40923](https://github.com/apache/doris/pull/40923) +- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. [#41266](https://github.com/apache/doris/pull/41266) +- Fixed errors in reading Snappy compressed formats in certain scenarios. [#40862](https://github.com/apache/doris/pull/40862) +- Addressed potential FileSystem leaks on the FE side in certain scenarios. [#41108](https://github.com/apache/doris/pull/41108) +- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. 
[#41231](https://github.com/apache/doris/pull/41231) +- Fixed the inability to read tables in Paimon parquet format. [#41487](https://github.com/apache/doris/pull/41487) +- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. [#41407](https://github.com/apache/doris/pull/41407) +- Disabled predicate pushing down after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. [#42242](https://github.com/apache/doris/pull/42242) +- Fixed issues with case-sensitive access to table names in the External Catalog. [#42261](https://github.com/apache/doris/pull/42261) + +### Async Materialized View + +- Fixed the issue where user-specified start times were not effective. [#39573](https://github.com/apache/doris/pull/39573) +- Resolved the issue of nested materialized views not refreshing. [#40433](https://github.com/apache/doris/pull/40433) +- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. [#41762](https://github.com/apache/doris/pull/41762) +- Addressed issues where partition compensation rewrites could lead to incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed potential errors in rewrite results when `sql_select_limit` was set. [#40106](https://github.com/apache/doris/pull/40106) + +### Semi-Structured Data Management + +- Fixed the issue of index file handle leaks. [#41915](https://github.com/apache/doris/pull/41915) +- Addressed inaccuracies in the `count()` function of inverted indexes in special cases. [#41127](https://github.com/apache/doris/pull/41127) +- Fixed exceptions with variant when light schema change was not enabled. [#40908](https://github.com/apache/doris/pull/40908) +- Resolved memory leaks when variant returns arrays.
[#41339](https://github.com/apache/doris/pull/41339) + +### Query optimizer + +- Corrected potential errors in nullable calculations for filter conditions during external table queries, which could lead to execution exceptions. [#41014](https://github.com/apache/doris/pull/41014) +- Fixed potential errors in optimizing range comparison expressions. [#41356](https://github.com/apache/doris/pull/41356) + +### Query Execution + +- Fixed the issue where the `match_regexp` function could not correctly handle empty strings. [#39503](https://github.com/apache/doris/pull/39503) +- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Fixed errors in the results of the `date_floor` function. [#41948](https://github.com/apache/doris/pull/41948) +- Addressed incorrect cancel messages in some scenarios. [#41798](https://github.com/apache/doris/pull/41798) +- Fixed issues with excessive warning logs printed by arrow flight. [#41770](https://github.com/apache/doris/pull/41770) +- Resolved issues where runtime filters failed to send in some scenarios. [#41698](https://github.com/apache/doris/pull/41698) +- Fixed problems where some system table queries could not end normally or became stuck. [#41592](https://github.com/apache/doris/pull/41592) +- Addressed incorrect results from window functions. [#40761](https://github.com/apache/doris/pull/40761) +- Fixed issues where the encrypt and decrypt functions caused BE cores. [#40726](https://github.com/apache/doris/pull/40726) +- Resolved errors in the results of the `conv` function. [#40530](https://github.com/apache/doris/pull/40530) + +### Storage Management + +- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. [#38003](https://github.com/apache/doris/pull/38003) +- Addressed inaccurate memory statistics during the Memtable flush phase during imports.
[#39536](https://github.com/apache/doris/pull/39536) +- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. [#40477](https://github.com/apache/doris/pull/40477) +- Resolved inaccurate bvar statistics with Memtable migration. [#40985](https://github.com/apache/doris/pull/40985) +- Fixed inaccurate progress reporting for S3 loads. [#40987](https://github.com/apache/doris/pull/40987) + +### Permissions + +- Fixed permission issues related to show columns, show sync, and show data from db.table. [#39726](https://github.com/apache/doris/pull/39726) + +### Others + +- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. [#41400](https://github.com/apache/doris/pull/41400) diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md new file mode 100644 index 0000000000000..baa62b37e1e75 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md @@ -0,0 +1,469 @@ +--- +{ + "title": "Release 3.0.0", + "language": "en" +} +--- + + + + +We are excited to announce the release of Apache Doris 3.0! + +**Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. Additionally, users can take advantage of low-cost shared storage systems such as object storage or HDFS to significantly reduce storage costs.** + +Version 3.0 marks a milestone in the evolution of Apache Doris towards a unified data lake and data warehouse architecture. 
This version introduces the ability to write data back to data lakes, allowing users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Apache Doris. With capabilities such as asynchronous materialized views, Apache Doris can serve as a unified data processing engine for enterprises, helping users better manage data across lakes, warehouses, and databases. Also, Apache Doris 3.0 introduces the Trino Connector. It allows users to quickly connect or adapt to more data sources, and leverage the high-performance compute engine of Doris to deliver faster query results than Trino. + +Version 3.0 also enhances support for ETL batch processing scenarios, adding explicit transaction support for operations like `insert into select`, `delete` and `update`. The observability of query execution has also been improved. + +In terms of performance, we have improved the framework capabilities, infrastructure, and rules of the query optimizer in version 3.0. This provides optimized performance, which has been proven by blind testing in more complex and diverse business scenarios. + +The adaptive Runtime Filter computation method now accurately estimates filters based on data size during execution, delivering better performance under large data volumes and high loads. Additionally, asynchronous materialized view has been more stable and user-friendly in query acceleration and data modeling. + +**During the development of version 3.0, over 200 contributors submitted nearly 5,000 optimizations** and fixes to Apache Doris. Contributors from companies such as VeloDB, Baidu, Meituan, ByteDance, Tencent, Alibaba, Kwai, Huawei, and Tianyi Cloud actively collaborated with the community, contributing test cases from real-world use cases to help us improve Apache Doris. We extend our heartfelt thanks to all the contributors involved in the development, testing, and feedback process for this release. 
+ +- **GitHub**: https://github.com/apache/doris/releases + +- **Website**: https://doris.apache.org/download + +## 1. Compute-storage decoupled mode + +Since V3.0, Apache Doris supports the compute-storage decoupled mode. Users can choose between it and the compute-storage coupled mode during cluster deployment. + +In the compute-storage decoupled mode, the BE nodes no longer store the data, but instead, a shared storage layer (HDFS and object storage) is introduced as the shared data storage layer. The computing and storage resources can be scaled independently, bringing multiple benefits to users: + +- **Workload isolation**: Multiple compute clusters can share the same data, allowing users to isolate different business workloads or offline loads using separate compute clusters. + +- **Reduced storage costs**: The full dataset is stored in the more cost-effective and highly reliable shared storage, with only hot data cached locally. Compared to the compute-storage coupled mode with three data replicas, the storage cost can be reduced by up to 90%. + +- **Elastic computing resources**: Since no data is stored on the BE nodes, the computing resources can be scaled flexibly based on the load requirements. Users can scale in or out an individual compute cluster or increase/decrease the number of compute clusters. This also leads to cost savings. + +- **Improved system robustness**: By storing the data in shared storage, Doris no longer needs to handle the complex logic of multi-replica consistency, thus simplifying distributed storage complexity and improving the overall system robustness. + +- **Flexible data sharing and cloning**: The flexibility of the compute-storage decoupled mode extends beyond a single Doris cluster. Tables from one Doris cluster can be easily cloned to another Doris cluster, with just metadata replication. + +### 1-1. 
From coupled to decoupled + +In the compute-storage coupled mode, the Apache Doris architecture consists of two main process types: Frontend (FE) and Backend (BE). The FE is primarily responsible for user request access, query parsing and planning, metadata management, and node management. The BE is responsible for data storage and query plan execution. + +The BE nodes employ an MPP (Massively Parallel Processing) distributed computing architecture, leveraging a multi-replica consistency protocol to ensure high service availability and high data reliability. + +![From coupled to decoupled](/images/storage-compute-decoupled.PNG) + + +The maturation of emerging cloud computing infrastructure, including public clouds, private clouds, and Kubernetes-based container platforms, has driven the need for cloud-native capabilities. Increasingly, users are seeking deeper integration between Apache Doris and cloud computing infrastructure to provide more elasticity. + +**To address this need, the VeloDB team has designed and implemented a cloud-native version of Apache Doris that decouples compute and storage, known as VeloDB Cloud. After extensive production testing and refinement across hundreds of enterprises over a long time, this cloud-native solution has now been contributed to the Apache Doris community, manifesting as the Apache Doris 3.0 in the compute-storage decoupled mode.** + +In the compute-storage decoupled mode, the Apache Doris architecture consists of three layers: + +- **Meta data layer**: A new Meta Service module has been introduced to provide meta data services, such as processing database and table information, schemas, rowset meta, and transactions. The Meta Service is stateless and horizontally scalable. In V3.0, all of the BE's meta data and parts of the FE's meta data have been migrated to the Meta Service. We will finish the migration of the remains in future versions. 
+- **Computation layer**: The stateless BE nodes execute query plans and cache a portion of the data and tablet meta data locally to improve query performance. Multiple stateless BE nodes can be organized into a computing resource pool (i.e., compute cluster), and multiple compute clusters can share the same data and metadata service. The compute clusters can be elastically scaled by adding or removing nodes as needed. +- **Shared storage layer**: Data is persisted to the shared storage layer, which currently supports HDFS as well as various cloud-based object storage systems that are compatible with the S3 protocol, such as S3, OSS, GCS, Azure Blob, COS, BOS, and MinIO. + +![From coupled to decoupled-2](/images/storage-compute-decoupled-2.JPEG) + +### 1-2 Design highlight + +The design of the compute-storage decoupled mode of Apache Doris highlights the transformation of the FE's in-memory metadata model into a shared metadata service. This approach offers a globally consistent state view, allowing any node to directly submit writes without needing to go through the FE for publishing. During write operations, data is stored in shared storage, while metadata is managed by the metadata service. **This effectively controls the number of small files in shared storage. Meanwhile, the real-time write performance for individual tables is nearly on par with that in the compute-storage coupled mode. The system's overall write capacity is no longer limited by the processing power of a single FE node.** + +![Design highlight](/images/design-hightlight.PNG) + +Based on the globally consistent state view, for data garbage collection, we have adopted a design approach for data deletion that is easier to prove correct and more efficient. + +Specifically, data in the shared storage is incorporated into the globally consistent view offered by the shared meta data service. Whenever data is generated, we bind it to a separate, independent transaction. 
Similarly, for a meta data deletion operation, we also bind it to a separate, independent transaction. The purpose of this approach is to ensure that deletion and write operations cannot succeed together. The view records which data needs to be deleted, and the asynchronous deletion process can simply perform a forward deletion of the data based on the transaction records, without the need for reverse garbage collection. + +As the tablet-related meta data in the FE is gradually migrated to the shared meta data service, the scalability of the Doris cluster will no longer be constrained by the memory capacity of a single FE node. Building upon the shared meta data service and the forward data deletion technique, we can conveniently expand functionality such as data sharing and lightweight cloning. + +### 1-3 Comparison with alternative solutions + +Another design of decoupling compute and storage in the industry is to store the data and BE node meta data in a shared object storage or HDFS. However, this approach brings the following problems: + +- **Inability to support real-time writes**: During data writes, the data is mapped to tablets based on the partitioning and bucketing rules, generating segment files and rowset meta data. During the write process, a two-phase commit (Publish) is performed through the FE. When a BE node receives the Publish request, it then sets the rowset as visible. The Publish operation must not fail. If the rowset meta data is stored in the shared storage, the total small file data during the real-time write process would triple the size of the actual data files - one replica of data files, one for rowset meta data, and another for rowset meta data changes during Publish. The Publish operation is driven by a single FE node, so the write capacity of a single table or even the entire system is limited by the FE node's capabilities. 
+ + ![Comparison with alternative solutions](/images/comparison-with-alternative-solutions.png) + + We compared the real-time data write performance of Apache Doris 3.0 with the above-described solution. We simulated 500 concurrent tasks writing 10,000 data files with 500 rows each, and 50 concurrent tasks writing 250 data files with 20,000 rows each, using the same computational resources. + + **The results showed that at 50 concurrent tasks, the micro-batch write performance of Apache Doris in the compute-storage coupled and decoupled modes was almost identical, while the industry solution lagged behind Apache Doris by a factor of 100.** + + At 500 concurrent tasks, the performance of Apache Doris in the compute-storage decoupled mode showed slight degradation, but it still maintained an 11X advantage over the industry solution. To ensure a fair test, Apache Doris did not enable the Group Commit feature (which the industry solution lacks). Enabling Group Commit would further enhance real-time write performance. + + ![Comparison with alternative solutions](/images/real-time-write-performance..png) + + The industry solution also faces stability and cost issues with real-time data ingestion: + + - Stability concerns: A large number of small files can put pressure on the shared storage, especially HDFS, and introduce stability risks. + + - High object storage request costs: Some public cloud object storage services charge 10 times more for Put and Delete operations than for Get operations. A large number of small files can lead to a significant increase in object storage request costs, which can even exceed the storage costs. + +- **Limited scalability**: Use cases of the compute-storage decoupled mode often involve larger data volumes. Because the FE (Frontend) meta data is held entirely in memory, when the number of tablets reaches a certain high level (e.g.
tens of millions), the FE's memory pressure can become a bottleneck that limits the overall write throughput of the system. + +- **Potential data deletion logic issues**: In the compute-storage decoupled architecture, data is stored as a single replica, so the data deletion logic is critical to the system's reliability. The conventional approach of deleting data across systems by comparing differences is hard to get right. During the write process, there is no way to completely prevent a deletion and a write from both succeeding, which can lead to data loss. Additionally, when the storage system experiences anomalies, the input used for the difference calculation may be incorrect, which can lead to unintended data deletion. + +- **Data sharing and lightweight cloning**: The flexibility of the decoupled storage-compute architecture can enable future data sharing and lightweight data cloning, reducing the burden of enterprise data management. However, if each cluster has a separate FE, after cloning data across clusters, it becomes difficult to accurately determine which data is no longer referenced and can be safely deleted, as calculating cross-cluster references can easily lead to unintended data deletion. + +By evolving the FE's full in-memory meta data model into a shared meta data service, Apache Doris 3.0 avoids all the aforementioned issues. + +### 1-4 Query performance comparison + +In the compute-storage decoupled mode, data needs to be read from the remote shared storage system, so the main bottleneck shifts from disk I/O, as in the compute-storage coupled mode, to network bandwidth. + +To accelerate data access, Apache Doris has implemented a high-speed caching mechanism based on local disks, and provides two cache management policies: LRU (Least Recently Used) and TTL (Time-To-Live). The newly imported data is asynchronously written to the cache to accelerate the first-time access to the latest data.
If the data required by a query is not in the cache, the system will read the data from the remote storage into memory and synchronously write it to the cache for subsequent queries. + +In use cases involving multiple compute clusters, Apache Doris provides a cache preheating function. When a new compute cluster is established, users can choose to preheat specific data (such as tables or partitions) to further improve query efficiency. + +In this context, we have conducted performance tests with different caching strategies in both the compute-storage coupled and decoupled modes, using the TPC-DS 1TB test dataset. The results are concluded as follows: + +- When the cache is fully hit (i.e., all the data required for the query is loaded into the cache), **the query performance of the compute-storage decoupled mode is on par with that of the compute-storage coupled mode**. + +- When the cache is partially hit (i.e., the cache is cleared before the test, and data is gradually loaded into the cache during the test, with performance continuously improving), the query performance of the compute-storage decoupled mode is about 10% lower than that of the compute-storage coupled mode. This test scenario is the most similar to the real-life use cases. + +- When the cache is completely missed (i.e., the cache is cleared before every SQL execution, simulating an extreme case), the performance loss is around 35%. **Even so, Apache Doris in the compute-storage decoupled mode delivers much higher performance than its alternative solutions.** + +![Query performance comparison](/images/query-performance-comparison.png) + +### 1-5 Write speed comparison + +In terms of write performance, we have simulated two test cases under the same computing resources: batch import and high-concurrency real-time import. 
The comparison of write performance between the compute-storage coupled mode and the compute-storage decoupled mode is as follows: + +- **Batch import**: When importing the 1TB TPC-H and 1TB TPC-DS test datasets, **the write performance of the compute-storage decoupled mode is 20.05% and 27.98% higher than the compute-storage coupled mode**, respectively, under the single-replica configuration. During batch import, the segment file size is generally in the range of tens to hundreds of MB. In the compute-storage decoupled mode, the segment files are split into smaller files and concurrently uploaded to the object storage, which can result in higher throughput compared to writing to local disks. In real-life deployments, the compute-storage coupled mode typically uses three replicas, which means the write speed advantage of the compute-storage decoupled mode will be even more pronounced. + +- **High-concurrency real-time import**: as described in the "Comparison with alternative solutions" section. + +![Write speed comparison](/images/write-speed-comparison.png) + +### 1-6 Tips for production environment + +- **Performance**: For real-time data analysis, users can achieve query performance comparable to the compute-storage coupled mode by specifying a TTL (Time-To-Live) for the cache and writing newly ingested data into the cache. To prevent query jitter, users can cache the data generated by background tasks such as compaction and schema changes based on how frequently used the data is. + +- **Workload isolation**: Users can achieve physical resource isolation for different business using multiple compute clusters. For workload isolation within a single compute cluster, users can utilize the Workload Group mechanism to limit and isolate resources for different queries. + +### 1-7 Notes + +- Apache Doris 3.0 does not support the co-existence of the compute-storage coupled mode and the compute-storage decoupled mode. 
Users need to specify one of them during cluster deployment. + +- If users need the compute-storage coupled mode, they can follow the [documentation](https://doris.apache.org/docs/3.0/install/source-install/compilation-with-docker/) for its deployment and upgrade. We recommend using Doris Manager for quick deployment and cluster upgrades. However, the compute-storage decoupled mode does not yet support Doris Manager deployment and upgrade. We will keep iterating to improve support in future versions. + +- Currently, Apache Doris does not support an in-place upgrade from V2.1 to the compute-storage decoupled mode of V3.0. For this purpose, users need to perform data migration using tools like X2Doris after deploying the compute-storage decoupled clusters. In the future, we will support migration without service interruption through the CCR (Cross-Cluster Replication) capability. + +:::info +See doc: +https://doris.apache.org/docs/3.0/compute-storage-decoupled/overview/ +::: + +## 2. Data lakehouse + +Apache Doris is positioned as a real-time data warehouse, but it is much more than that. In previous versions, we have consistently pushed beyond the boundaries of traditional data warehouse capabilities, advancing towards a unified data lakehouse. Version 3.0 marks a milestone in this journey, with its capabilities in the lakehouse architecture becoming fully mature. We believe that a unified lakehouse is identified by **boundaryless data** and **lakehouse fusion**: + +**Boundaryless data: Apache Doris serves as a unified query processing engine, breaking down data barriers across different systems.
It provides a consistent and ultra-fast analysis experience across all data sources, including data warehouses, data lakes, data streams, and local data files.** + +- **Lakehouse query acceleration**: Without the need to migrate data to Apache Doris, users can leverage Doris’ efficient query engine to directly query data stored in data lakes such as Iceberg, Hudi, Paimon, and offline data warehouses like Hive, thereby accelerating query analysis. + +- **Federated analysis**: By extending its catalog and storage plugins, Apache Doris enhances its federated analysis capabilities, allowing users to perform unified analysis across multiple heterogeneous data sources without physically centralizing the data in a single storage system. This enables external table queries and federated joins between internal and external tables, breaking down data silos and providing globally consistent data insights. + +- **Data lake construction**: Apache Doris introduces write-back functionality for Hive and Iceberg, allowing users to directly create Hive and Iceberg tables through Doris and write data into them. This allows users to write internal table data back to the offline lakehouse or process offline lakehouse data using Doris and save the results back into the lakehouse, simplifying and streamlining the data lake construction process. + +**Lakehouse fusion: As data lake architectures become increasingly complex, the costs of technology selection and maintenance rise for users. Achieving consistent fine-grained access control across multiple systems also becomes challenging, and real-time performance suffers. To address this, Apache Doris integrates core features of the data lake, transforming itself into a lightweight, efficient, native real-time lakehouse.** + +- **Real-time data updates**: Starting with version 1.2, Apache Doris enhanced the primary key model by introducing Merge-on-Write, supporting real-time updates. 
This feature allows high-frequency, real-time data updates based on primary key changes from upstream data sources. + +- **Data science and** **AI** **computation support**: From version 2.1, Apache Doris, using the efficient Arrow Flight protocol, increased the openness of its storage system and its support for various compute loads, enabling data science and AI computations. + +- **Enhancements for semi-structured and unstructured Data**: Apache Doris has introduced support for data types like Array, Map, Struct, JSON, and Variant, with plans to support vector indexing in the future. + +- **Improved resource efficiency by decoupling storage and compute**: With version 3.0, Apache Doris supports a decoupled storage and compute mode, further improving resource efficiency and scalability. + +### 2-1 Faster queries in the data lakehouse + +TPC-H and TPC-DS benchmarking proves that Apache Doris achieves average query performance that is 3 to 5 times faster than Trino/Presto. + +In V3.0, we have focused on optimizing query performance for production environments, including: + +- **More granular task splitting strategy**: By adjusting the consistent hashing algorithm and introducing a task sharding weighting mechanism, we ensure balanced query loads across all nodes. + +- **Scheduling optimizations for use cases with numerous partitions and files**: For cases with a large number of files (over 1 million), we have largely reduced query latency (from 100 seconds to 10 seconds) and alleviated memory pressure on the Frontend (FE) by asynchronously and batch-fetching file shards. + +We will continue to specifically enhance query acceleration performance in real-world business scenarios, improve the actual user experience, and build an industry-leading lakehouse query acceleration engine. + +### 2-2 Federated analysis: more data connectors + +Previous versions of Apache Doris support connectors for over 10 mainstream data lakehouses, warehouses, and relational databases. 
In V3.0, we have introduced the Trino Connector compatibility framework, which expands the range of data sources that Apache Doris can connect to. With this framework, users can easily adapt their existing setups to access corresponding data sources using Doris and leverage its high-speed computing engine for data analysis. + +Currently, Doris has completed adaptations for Delta Lake, Kudu, BigQuery, Kafka, TPC-H, and TPC-DS. We also encourage developers to contribute and extend this list. + +:::info Note + +See doc: + +- Trino Connector: https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/ + +- TPC-H: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpch/ + +- TPC-DS: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpcds/ + +- Delta Lake: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/deltalake/ + +- Kudu: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/kudu/ + +- BigQuery: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/bigquery/ +::: + + +### 2-3 Data lake building + +In V3.0, we have introduced data writeback functionality for Hive and Iceberg. This allows users to create Hive and Iceberg tables directly through Doris and write data into these tables, and enables users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Doris. + +In future iterations, Apache Doris will further enhance support for data lake table formats and improve the openness of storage APIs. + +:::info Note +See doc: https://doris.apache.org/docs/3.0/lakehouse/datalake-building/hive-build/ +::: + +## 3. Upgraded semi-structured data analysis capabilities + +In versions 2.0 and 2.1, Apache Doris introduced some well-received features such as the inverted index, NGram Bloom Filter, and the Variant data type to support high-performance full-text search and multi-dimensional analysis.
With them, the storage and processing of complex semi-structured data have been more flexible and efficient. + +In V3.0, we have further enhanced the capabilities in this scenario. + +After extensive testing in production environments, the Variant data type has gained sufficient stability and become the preferred choice for JSON data storage and analysis. In V3.0, we have made multiple optimizations to it: + +- Support for indexing of the Variant data type to accelerate queries, including inverted index, Bloom Filter index, and the built-in ZoneMap index. + +- Support for flexible partial column updates for Unique Key tables containing the Variant data type. + +- Support for the use of the Variant data type in the compute-storage decoupled mode, with optimizations of its metadata storage. + +- Support for exporting the Variant data type to formats such as Parquet and CSV. + +The inverted index, introduced since V2.0, has reached a high level of maturity after more than a year of refinement and is now running in production environments of hundreds of enterprises. In V3.0, we have made multiple optimizations to the inverted index: + +- After performance optimizations, including lock concurrency, Apache Doris outperforms Elasticsearch in key metrics such as query latency and concurrency in real-time reporting analysis. + +- Optimized index file in the compute-storage decoupled mode to reduce remote storage calls and decrease index query latency. + +- Support for the Array data type to accelerate the `array_contains` queries. + +- Enhanced the `match_phrase_*` functionality, including support for slop and phrase prefix matching `match_phrase_prefix`. + +## 4. Enhanced ETL capabilities + +### 4-1. Transaction improvements + +Data processing in data warehouses often involves multiple data changes that need to be handled as a single transaction. V3.0 provides explicit transaction support for `insert into select`, `delete`, and `update` operations. 
Example cases include: + +- **Transactional requirements**: For example, when updating data within a time range, the typical approach is to first delete the data in that time range, and then insert the new data. Because the data might already be in service, queries must see either the old data or the new data, but not an intermediate state. This can be achieved by executing the `delete` and `insert into select` operations in a single transaction. + + ```sql + BEGIN; + DELETE FROM table WHERE date >= "2024-07-01" AND date <= "2024-07-31"; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +- **Simplified handling of failed tasks**: For example, when two `insert into select` operations are executed within a single transaction, if either operation fails, the transaction can be retried directly. + + ```sql + BEGIN WITH LABEL label_etl_1; + INSERT INTO table1 SELECT * FROM stage_table1; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +:::info Note +See doc: https://doris.apache.org/docs/3.0/data-operate/transaction/ +Currently, explicit transaction synchronization is not supported in Cross-Cluster Replication (CCR). +::: + +### 4-2. Improved observability + +- **Real-time profile retrieval**: In previous versions, some complex queries could become computationally expensive because of issues with the execution plan or the data, and developers could only access the query profile for performance analysis after the query completed. This made it hard to promptly identify issues in query execution and to guarantee the stability of the production environment. Now, with the ability to retrieve real-time profiles, V3.0 allows users to monitor query execution while the query is running. It also allows them to better monitor the progress of each ETL job. + +- **`backend_active_tasks` system table**: The `backend_active_tasks` system table provides real-time resource consumption information for each query on each BE node.
Users can analyze this system table using SQL to obtain the resource usage of each query, which helps identify large queries or abnormal workloads. + +## 5. Asynchronous materialized view + +In V3.0, asynchronous materialized view is faster and more stable. It is also more user-friendly for query acceleration and data modeling scenarios. We have restructured the logic for transparent rewrite and expanded its capabilities, making it 2X faster. + +### 5-1 Refresh + +- Support for incremental update of materialized views by partitions and partition roll-ups on materialized views to allow refreshes at different granularities. + +- Support for nested materialized views, which is useful in data modeling scenarios. + +- Support for index creation and sort key specification in asynchronous materialized views, which will improve query performance after the materialized view is hit. + +- Higher usability of materialized view DDL with support for atomically replacing materialized views, allowing modifications to the materialized view definition SQL while keeping the materialized view available. + +- Support for non-deterministic functions in materialized views to better serve daily materialized view creation. + +- Support for trigger-based materialized view refresh, which ensures data consistency in data modeling with nested materialized views. + +- Support for a broader range of SQL patterns for building partitioned materialized views, making the incremental update capability available to more use cases. + +### 5-2 Refresh stability + +- V3.0 supports specifying a Workload Group for building materialized views. This is to limit the resources used by the materialized view build process and ensure that sufficient resources remain available for ongoing queries. + +### 5-3 Transparent rewrite + +- Support for transparent rewrite of more Join types, including derived Joins. 
Even when there is a mismatch of Join types between the query and materialized view, transparent rewrite can still be performed by compensating with additional predicates, as long as the materialized view can provide all the data needed for the query. + +- Support for more aggregate functions for roll-up as well as rewrite of multi-dimensional aggregations like GROUPING SETS, ROLLUP, and CUBE; support rewriting queries with aggregations when the materialized view does not contain aggregations, simplifying Join operations and expression computation. + +- Support for transparent rewrite of nested materialized views, enabling higher performance for complex queries. + +- For partially invalid partitioned materialized views, V3.0 supports `Union All` the base tables for data completion, expanding the applicability of partitioned materialized views. + +### 5-4 Transparent rewrite performance + +- Continuous optimization has been done to improve the transparent rewrite performance, achieving 2X the speed compared to version 2.1.0. + +:::info Note + +See doc: + +https://doris.apache.org/docs/3.0/query/view-materialized-view/query-async-materialized-view + +https://doris.apache.org/docs/3.0/query/view-materialized-view/async-materialized-view/ + +::: + +## 6. Performance improvement + +### 6-1 Smarter optimizer + +In V3.0, the query optimizer has been enhanced in terms of framework capabilities, distributed plan support, optimizer infrastructure, and rule expansion. It provides better optimization capabilities for more complex and diverse business scenarios, with higher blind test performance for complex SQL: + +- **Improved plan enumeration capability**: The key structure Memo for plan enumeration has been restructured and normalized. This improves the efficiency of the Cascades framework in plan enumeration and the possibility of producing better plans. 
Additionally, it fixes incomplete column pruning during the Join Reorder process in older versions, which led to unnecessary overhead of the Join operator, thus improving the execution performance in the relevant scenarios. + +- **Improved distributed plan support**: The distributed query plan has been enhanced to allow aggregation, join, and window function operations to more intelligently identify the data characteristics of intermediate computation results, avoiding ineffective data redistribution operations. Meanwhile, we have optimized the execution under the multi-replica continuous execution mode, making it more data cache-friendly. + +- **Improved optimizer infrastructure**: V3 has fixed several issues in cost model and statistics information estimation. The fixes to the cost model are more adaptable to the evolution of the execution engine, making the execution plan more stable compared to previous versions. + +- **Enhanced Runtime Filter plan support**: On the basis of Join Runtime Filter, V3.0 has expanded the capability of the TopN Runtime Filter to achieve better performance in use cases that involve a TopN operator. + +- **Enriched optimization rule library**: Based on user feedback and internal testing results, we have introduced optimization rules such as Intersect Reorder to enrich the rule set of the optimizer. + +### 6-2 Self-adaptive Runtime Filter + +In previous versions, the generation of Runtime Filter relies on manual setting by users based on statistical information. However, inaccurate settings in certain cases could lead to performance instability. + +In V3.0, Doris implements a self-adaptive Runtime Filter calculation approach. It can estimate the Runtime Filter at runtime based on the data size with high accuracy, enabling better performance in use cases with large data volumes and high workloads. 
+ +### 6-3 Function performance optimization + +- V3.0 has improved the vectorized implementation of dozens of functions, enabling a performance improvement of over 50% for some commonly used functions. +- V3.0 has also made extensive optimizations to the aggregation of nullable data types, enabling a 30% performance improvement. + +### 6-4 Blind test performance improvement + +Our blind tests on V3.0 and V2.1 show that the new version is 7.3% and 6.2% faster in TPC-DS and TPC-H benchmark tests, respectively. + +![Blind test performance improvement](/images/blind-test-performance-improvement.png) + +## 7. New features + +### 7-1 Java UDTF + +Version 3.0 has added support for Java UDTFs. The key operations are as follows: + +- Implementing a UDTF: Similar to a UDF, a UDTF requires the user to implement an `evaluate` method. Note that the return value of a UDTF function must be of the `Array` data type. + + ```java + public class UDTFStringTest { + public ArrayList<String> evaluate(String value, String separator) { + if (value == null || separator == null) { + return null; + } else { + return new ArrayList<>(Arrays.asList(value.split(separator))); + } + } + } + ``` + +- Creating a UDTF: By default, two corresponding functions will be created: `java-utdf` and `java-utdf_outer`. The `_outer` variant outputs a single row of `NULL` data when the table function generates 0 rows of output. + + ```sql + CREATE TABLES FUNCTION java-utdf(string, string) RETURNS array<string> PROPERTIES ( + "file"="file:///pathTo/java-udaf.jar", + "symbol"="org.apache.doris.udf.demo.UDTFStringTest", + "always_nullable"="true", + "type"="JAVA_UDF" + ); + ``` + +:::info + +See doc: https://doris.apache.org/docs/3.0/query/udf/java-user-defined-function/#udtf-1 + +::: + +### 7-2 Generated column + +A generated column is a special column whose value is calculated from the values of other columns rather than directly inserted or updated by the user.
It supports pre-computing the results of expressions and storing them in the database, which is suitable for scenarios that require frequent queries or complex calculations. + +Results can be automatically calculated based on predefined expressions when data is imported or updated, and then stored persistently. In this way, during subsequent queries, the system can directly access these calculated results without performing complex calculations, thereby improving query performance. + +Generated columns are supported since V3.0. When creating a table, you can specify a column as generated column. A generated column automatically calculates values based on the defined expression when data is written. Generated columns allow for more complex expressions to be defined, but the value cannot be explicitly written or set. + +:::info + +See doc: https://doris.apache.org/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/ + +::: + +## 8. Functional improvements + +### 8-1. Materialized view + +We have refactored the selection logic for materialized views and migrated it from the rule-based optimizer (RBO) to the cost-based optimizer (CBO). This aligns the selection logic with that of asynchronous materialized views. This functionality is enabled by default. If any issues are encountered, you can revert to the RBO mode using `set global enable_sync_mv_cost_based_rewrite = false`. + +### 8-2. Routine Load + +In previous versions, the Routine Load functionality faced some usability challenges, such as uneven task scheduling across BE nodes, untimely task scheduling, complex configuration requirements (the need to change multiple FE and BE settings for optimization), insufficient overall stability (where restarts or upgrades could frequently pause Routine Load jobs, requiring manual user intervention to resume). 
+ +To address these issues, we have made extensive optimizations to the Routine Load feature: + +- **Resource scheduling**: We have improved the scheduling balance to make sure that tasks are more evenly distributed across BE nodes. Jobs that encounter unrecoverable errors will be promptly paused to avoid wasting resources on futile scheduling attempts. Additionally, we have improved the timeliness of the scheduling process, which has enhanced the import performance of Routine Load. + +- **Parameter configuration**: Users in most environments no longer need to modify FE and BE configurations for optimization. An automatic adjustment mechanism for the timeout parameter has been introduced to prevent tasks from constantly retrying when cluster pressure increases. + +- **Stability**: We have enhanced the robustness of Doris in various exceptional scenarios, such as FE failovers, BE rolling upgrades, and Kafka cluster anomalies, ensuring continuous stable operation. We have also optimized the Auto Resume mechanism, allowing Routine Load to automatically resume operation after faults are repaired, reducing the need for manual user intervention. + +## 9. Behavior changed + +- `cpu_resource_limit` will no longer be supported, and all types of resource isolation will be implemented through Workload Groups. + +- Please use JDK 17 for Apache Doris 3.0 and later versions. The recommended version is `jdk-17.0.10_linux-x64_bin.tar.gz`. + +## Try Apache Doris 3.0 now! + +Before the official release of version 3.0, the compute-storage decoupled mode of Apache Doris underwent nearly two years of extensive testing and optimization in the production environments of hundreds of enterprises. Contributors from many tech giants have collaborated with the community to provide a significant number of test cases based on their real-world business needs. This has rigorously validated the usability and stability of version 3.0.
+ +We highly recommend users with compute-storage decoupling needs to download version 3.0 and experience it firsthand. + +Going forward, we will accelerate our release iteration cycle to deliver a more stable version experience for all users. Feel free to join us in the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ) and engage directly with the core developers. + +## Credits + +Special thanks to the following contributors who participated in the development, testing, and provided feedback for this version: + +@133tosakarin、@390008457、@924060929、@AcKing-Sam、@AshinGau、@BePPPower、@BiteTheDDDDt、@ByteYue、@CSTGluigi、@CalvinKirs、@Ceng23333、@DarvenDuan、@DongLiang-0、@Doris-Extras、@Dragonliu2018、@Emor-nj、@FreeOnePlus、@Gabriel39、@GoGoWen、@HappenLee、@HowardQin、@Hyman-zhao、@INNOCENT-BOY、@JNSimba、@JackDrogon、@Jibing-Li、@KassieZ、@Lchangliang、@LemonLiTree、@LiBinfeng-01、@LompleZ、@M1saka2003、@Mryange、@Nitin-Kashyap、@On-Work-Song、@SWJTU-ZhangLei、@StarryVerse、@TangSiyang2001、@Tech-Circle-48、@Thearas、@Vallishp、@WinkerDu、@XieJiann、@XuJianxu、@XuPengfei-1020、@Yukang-Lian、@Yulei-Yang、@Z-SWEI、@ZhongJinHacker、@adonis0147、@airborne12、@allenhooo、@amorynan、@bingquanzhao、@biohazard4321、@bobhan1、@caiconghui、@cambyzju、@caoliang-web、@catpineapple、@cjj2010、@csun5285、@dataroaring、@deardeng、@dongsilun、@dutyu、@echo-hhj、@eldenmoon、@elvestar、@englefly、@feelshana、@feifeifeimoon、@feiniaofeiafei、@felixwluo、@freemandealer、@gavinchou、@ghkang98、@gnehil、@hechao-ustc、@hello-stephen、@httpshirley、@hubgeter、@hust-hhb、@iszhangpch、@iwanttobepowerful、@ixzc、@jacktengg、@jackwener、@jeffreys-cat、@kaijchen、@kaka11chen、@kindred77、@koarz、@kobe6th、@kylinmac、@larshelge、@liaoxin01、@lide-reed、@liugddx、@liujiwen-up、@liutang123、@lsy3993、@luwei16、@luzhijing、@lxliyou001、@mongo360、@morningman、@morrySnow、@mrhhsg、@my-vegetable-has-exploded、@mymeiyi、@nanfeng1999、@nextdreamblue、@pingchunzhang、@platoneko、@py023、@qidaye、@qzsee、@raboof、@rohitrs1983、@rotkang、@ryanzryu
、@seawinde、@shoothzj、@shuke987、@sjyango、@smallhibiscus、@sollhui、@spaces-X、@stalary、@starocean999、@superdiaodiao、@suxiaogang223、@taptao、@vhwzx、@vinlee19、@w41ter、@wangbo、@wangshuo128、@whutpencil、@wsjz、@wuwenchi、@wyxxxcat、@xiaokang、@xiedeyantu、@xingyingone、@xinyiZzz、@xy720、@xzj7019、@yagagagaga、@yiguolei、@yongjinhou、@ytwp、@yuanyuan8983、@yujun777、@yuxuan-luo、@zclllyybb、@zddr、@zfr9527、@zgxme、@zhangbutao、@zhangstar333、@zhannngchen、@zhiqiang-hhhh、@ziyanTOP、@zxealous、@zy-kkk、@zzzxl1993、@zzzzzzzs \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md new file mode 100644 index 0000000000000..9b9007e4391aa --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md @@ -0,0 +1,604 @@ +--- +{ + "title": "Release 3.0.1", + "language": "en" +} +--- + + + +Dear community members, Apache Doris 3.0.1 was officially released on August 23, 2024, featuring updates and improvements in compute-storage decoupling, lakehouse, semi-structured data analysis, asynchronous materialized views, and more. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior Changes + +### Query Optimizer + +- Added the variable `use_max_length_of_varchar_in_ctas` to control the length behavior of VARCHAR type when executing `CREATE TABLE AS SELECT` (CTAS) operations. [#37069](https://github.com/apache/doris/pull/37069) + + - This variable is set to true by default. + + - When set to true, if the VARCHAR type column originates from a table, the derived length is used; otherwise, the maximum length is used. + + - When set to false, the VARCHAR type will always use the derived length. + +- All data types will now be displayed in lowercase to maintain compatibility with MySQL format.
[#38012](https://github.com/apache/doris/pull/38012) + +- Multiple query statements in the same query request must now be separated by semicolons. [#38670](https://github.com/apache/doris/pull/38670) + +### Query Execution + +- The default number of parallel tasks after shuffle operations in the cluster is set to 100, which will improve query stability and concurrent processing capability in large clusters. [#38196](https://github.com/apache/doris/pull/38196) + +### Storage + +- The default value of `trash_file_expire_time_sec` has been changed from 86400 seconds to 0 seconds, which means that if files are deleted by mistake and the FE trash is cleared, the data cannot be recovered. + +- The table attribute `enable_mow_delete_on_delete_predicate` (introduced in version 3.0.0) has been renamed to `enable_mow_light_delete`. + +- Explicit transactions are now prohibited from performing delete operations on tables with written data. + +- Heavy schema change operations are prohibited on tables with auto-increment fields. + + + +## New Features + +### Job Scheduling + +- Optimized the execution logic of internal scheduling jobs, decoupling the strong association between start time and immediate execution parameters. Now, tasks can be created with a specified start time or selected for immediate execution, without conflict, enhancing scheduling flexibility. [#36805](https://github.com/apache/doris/pull/36805) + +### Compute-Storage Decoupled + +- Supports dynamic modification of the upper limit for file cache usage. [#37484](https://github.com/apache/doris/pull/37484) + +- Recycler now supports object storage rate limiting and server-side rate limiting retry functionality. [#37663](https://github.com/apache/doris/pull/37663) [#37680](https://github.com/apache/doris/pull/37680) + +### Lakehouse + +- Added the session variable `serde_dialect` to set the output format for complex types. 
[#37039](https://github.com/apache/doris/pull/37039) + +- SQL interception now supports external tables. + + - For more information, refer to the documentation on [SQL Interception](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Insert overwrite now supports Iceberg tables. [#37191](https://github.com/apache/doris/pull/37191) + +### Asynchronous Materialized Views + +- Supports partition roll-up and build at the hourly level. [#37678](https://github.com/apache/doris/pull/37678) + +- Supports atomic replacement of asynchronous materialized view definition statements. [#36749](https://github.com/apache/doris/pull/36749) + +- Transparent rewriting now supports Insert statements. [#38115](https://github.com/apache/doris/pull/38115) + +- Transparent rewriting now supports the VARIANT type. [#37929](https://github.com/apache/doris/pull/37929) + +### Query Execution + +- The group concat function now supports DISTINCT and ORDER BY options. [#38744](https://github.com/apache/doris/pull/38744) + +### Semi-Structured Data Management + +- The ES Catalog now maps `nested` or `object` types in Elasticsearch to the JSON type in Doris. [#37101](https://github.com/apache/doris/pull/37101) + +- Added the `MULTI_MATCH` function, which supports matching keywords across multiple fields and can leverage inverted indexes to accelerate searches. [#37722](https://github.com/apache/doris/pull/37722) + +- Added the `explode_json_object` function, which can unfold objects in JSON data into multiple rows. [#36887](https://github.com/apache/doris/pull/36887) + +- Inverted indexes now support memtable advancement, requiring index construction only once during multi-replica writes, reducing CPU consumption and improving performance. 
[#35891](https://github.com/apache/doris/pull/35891) + +- Added `MATCH_PHRASE` support for positive slop, e.g., `msg MATCH_PHRASE 'a b 2+'` can match instances containing words a and b with a slop of no more than two, and a preceding b; regular slop without the final `+` does not guarantee this order. [#36356](https://github.com/apache/doris/pull/36356) + +### Other + +- Added the FE parameter `skip_audit_user_list`, where user operations specified in this configuration will not be recorded in the audit log. [#38310](https://github.com/apache/doris/pull/38310) + + - For more information, refer to the documentation on [Audit Plugin](https://doris.apache.org/docs/admin-manual/audit-plugin/). + + + +## Improvements + +### Storage + +- Reduced the likelihood of write failures caused by disk balancing within a single BE. [#38000](https://github.com/apache/doris/pull/38000) + +- Decreased memory consumption by the memtable limiter. [#37511](https://github.com/apache/doris/pull/37511) + +- Moved old partitions to the FE trash during partition replacement operations. [#36361](https://github.com/apache/doris/pull/36361) + +- Optimized memory consumption during compaction. [#37099](https://github.com/apache/doris/pull/37099) + +- Added a session variable to control audit logs for JDBC PreparedStatement, with default setting to not print. [#38419](https://github.com/apache/doris/pull/38419) + +- Optimized the logic for selecting BEs for group commits. [#35558](https://github.com/apache/doris/pull/35558) + +- Improved the performance of column updates. [#38487](https://github.com/apache/doris/pull/38487) + +- Optimized the use of `delete bitmap cache`. [#38761](https://github.com/apache/doris/pull/38761) + +- Added a configuration to control query affinity during hot and cold tiering. [#37492](https://github.com/apache/doris/pull/37492) + +### Compute-Storage Decoupled + +- Implemented automatic retries when encountering object storage server rate limiting. 
[#37199](https://github.com/apache/doris/pull/37199) + +- Adapted the number of threads for memtable flush in the compute-storage decoupled mode. [#38789](https://github.com/apache/doris/pull/38789) + +- Added Azure as a compile option to support compilation in environments without Azure support. + +- Optimized the observability of object storage access rate limiting. [#38294](https://github.com/apache/doris/pull/38294) + +- Allowed the file cache TTL queue to perform LRU eviction, enhancing TTL queue usability. [#37312](https://github.com/apache/doris/pull/37312) + +- Reduced the number of edit log write IO operations during balancing in the compute-storage decoupled mode. [#37787](https://github.com/apache/doris/pull/37787) + +- Improved table creation speed in the compute-storage decoupled mode by sending tablet creation requests in batches. [#36786](https://github.com/apache/doris/pull/36786) + +- Mitigated read failures caused by potential inconsistencies in the local file cache by retrying with backoff. [#38645](https://github.com/apache/doris/pull/38645) + +### Lakehouse + +- Optimized memory statistics for Parquet/ORC format read and write operations. [#37234](https://github.com/apache/doris/pull/37234) + +- Trino Connector Catalog now supports predicate pushdown. [#37874](https://github.com/apache/doris/pull/37874) + +- Added a session variable `enable_count_push_down_for_external_table` to control whether to enable `count(*)` pushdown optimization for external tables. [#37046](https://github.com/apache/doris/pull/37046) + +- Optimized the read logic for Hudi snapshot reads, returning an empty set when the snapshot is empty, consistent with Spark behavior. [#37702](https://github.com/apache/doris/pull/37702) + +- Improved the read performance of partition columns for Hive tables. [#37377](https://github.com/apache/doris/pull/37377) + +### Asynchronous Materialized Views + +- Improved transparent rewrite plan speed by 20%.
[#37197](https://github.com/apache/doris/pull/37197) + +- Eliminated roll-up during transparent rewrite if the group key satisfies data uniqueness for better nested matching. [#38387](https://github.com/apache/doris/pull/38387) + +- Transparent rewrite now performs better aggregation elimination to improve the matching success rate of nested materialized views. [#36888](https://github.com/apache/doris/pull/36888) + +### MySQL Compatibility + +- Now correctly populates the database name, table name, and original name in the MySQL protocol result columns. [#38126](https://github.com/apache/doris/pull/38126) + +- Supported the hint format `/*+ func(value) */`. [#37720](https://github.com/apache/doris/pull/37720) + +### Query Optimizer + +- Significantly improved the plan speed for complex queries. [#38317](https://github.com/apache/doris/pull/38317) + +- Adaptively chose whether to perform bucket shuffle based on the number of data buckets to avoid performance degradation in extreme cases. [#36784](https://github.com/apache/doris/pull/36784) + +- Optimized the cost estimation logic for SEMI / ANTI JOIN. [#37951](https://github.com/apache/doris/pull/37951) [#37060](https://github.com/apache/doris/pull/37060) + +- Supported pushing Limit down to the first stage of aggregation to improve performance. [#34853](https://github.com/apache/doris/pull/34853) + +- Partition pruning now supports filter conditions containing the `date_trunc` or `date` function. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- SQL cache now supports query scenarios that include user variables. [#37915](https://github.com/apache/doris/pull/37915) + +- Optimized error messages for invalid aggregation semantics. [#38122](https://github.com/apache/doris/pull/38122) + +### Query Execution + +- Adapted AggState compatibility from 2.1 to 3.x and fixed Coredump issues. 
[#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle without Join. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to be asynchronous to prevent stalling during such queries. [#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process during Hash table construction for Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Optimized the duration of lock holding for MultiCast. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC keepAliveTime and added link monitoring to reduce the probability of query failure due to RPC errors. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits were exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Optimized the processing performance of `aes_encrypt`/`decrypt` functions for constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the processing performance of the `json_extract` function for constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the processing performance of the `ParseUrl` function for constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Semi-Structured Data Management + +- Bitmap indexes now default to using inverted indexes, with `enable_create_bitmap_index_as_inverted_index` set to true by default. [#36692](https://github.com/apache/doris/pull/36692) + +- In the compute-storage decoupled mode, DESC can now view sub-columns of VARIANT type. [#38143](https://github.com/apache/doris/pull/38143) + +- Removed the step of checking file existence during inverted index queries to reduce access latency to remote storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Complex types ARRAY / MAP / STRUCT now support `replace_if_not_null` for AGG tables. 
[#38304](https://github.com/apache/doris/pull/38304) + +- Escape characters for JSON data are now supported. [#37176](https://github.com/apache/doris/pull/37176) [#37251](https://github.com/apache/doris/pull/37251) + +- Inverted index queries now behave consistently on MOW tables and DUP tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Optimized the performance of inverted index acceleration for IN queries. [#37395](https://github.com/apache/doris/pull/37395) + +- Reduced unnecessary memory allocation during TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- When creating an inverted index with tokenization, the `support_phrase` option is now automatically enabled to accelerate `match_phrase` series phrase queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Other + +- The audit log can now record SQL types. [#37790](https://github.com/apache/doris/pull/37790) + +- Added support for `information_schema.processlist` to show all FEs. [#38701](https://github.com/apache/doris/pull/38701) + +- Cached Ranger's `datamask` and `rowpolicy` to speed up queries. [#37723](https://github.com/apache/doris/pull/37723) + +- Optimized metadata management in job manager to release locks immediately after modifying metadata, reducing lock holding time. [#38162](https://github.com/apache/doris/pull/38162) + + + +## Bug Fixes + +### Upgrade + +- Fix the issue where `mtmv load` fails during upgrade from version 2.1. [#38799](https://github.com/apache/doris/pull/38799) + +- Resolve the issue where `null_type` cannot be found during the upgrade to version 2.1. [#39373](https://github.com/apache/doris/pull/39373) + +- Address the compatibility issue with permission persistence during the upgrade from version 2.1 to 3.0. [#39288](https://github.com/apache/doris/pull/39288) + +### Load + +- Fix the issue where parsing fails when the newline character is surrounded by delimiters in CSV format parsing.
[#38347](https://github.com/apache/doris/pull/38347) +- Resolve potential exception issues when FE forwards group commit. [#38228](https://github.com/apache/doris/pull/38228) [#38265](https://github.com/apache/doris/pull/38265) + +- Group commit now supports the new optimizer. [#37002](https://github.com/apache/doris/pull/37002) + +- Fix the issue where group commit reports data errors when JDBC setNull is used. [#38262](https://github.com/apache/doris/pull/38262) + +- Optimize the retry logic for group commit when encountering `delete bitmap lock` errors. [#37600](https://github.com/apache/doris/pull/37600) + +- Resolve the issue where routine load cannot use CSV delimiters and escape characters. [#38402](https://github.com/apache/doris/pull/38402) + +- Fix the issue where routine load job names with mixed case cannot be displayed. [#38523](https://github.com/apache/doris/pull/38523) + +- Optimize the logic for actively recovering routine load during FE master-slave switching. [#37876](https://github.com/apache/doris/pull/37876) + +- Resolve the issue where routine load pauses when all data in Kafka is expired. [#37288](https://github.com/apache/doris/pull/37288) + +- Fix the issue where `show routine load` returns empty results. [#38199](https://github.com/apache/doris/pull/38199) + +- Resolve the memory leak issue during multi-table stream import in routine load. [#38255](https://github.com/apache/doris/pull/38255) + +- Fix the issue where stream load does not return the error URL. [#38325](https://github.com/apache/doris/pull/38325) + +- Resolve potential load channel leak issues. [#38031](https://github.com/apache/doris/pull/38031) [#37500](https://github.com/apache/doris/pull/37500) + +- Fix the issue where no error may be reported when importing fewer segments than expected. [#36753](https://github.com/apache/doris/pull/36753) + +- Resolve the load stream leak issue. 
[#38912](https://github.com/apache/doris/pull/38912) + +- Optimize the impact of offline nodes on import operations. [#38198](https://github.com/apache/doris/pull/38198) + +- Fix the issue where transactions do not end when inserting into empty data. [#38991](https://github.com/apache/doris/pull/38991) + +### Storage + +**01 Backup and Restoration** + +- Fix the issue where tables cannot be written after backup and restoration. [#37089](https://github.com/apache/doris/pull/37089) + +- Resolve the issue where view database names are incorrect after backup and restoration. [#37412](https://github.com/apache/doris/pull/37412) + +**02 Compaction** + +- Fix the issue where cumulative compaction handles delete errors incorrectly during ordered data compaction. [#38742](https://github.com/apache/doris/pull/38742) + +- Resolve the issue of duplicate keys in aggregate tables caused by sequential compaction optimization. [#38224](https://github.com/apache/doris/pull/38224) + +- Fix the issue where compaction operations cause coredump in large wide tables. [#37960](https://github.com/apache/doris/pull/37960) + +- Resolve the compaction starvation issue caused by inaccurate concurrent statistics of compaction tasks. [#37318](https://github.com/apache/doris/pull/37318) + +**03 MOW Unique Key** + +- Resolve the issue of inconsistent data between replicas caused by cumulative compaction deleting the delete sign. [#37950](https://github.com/apache/doris/pull/37950) + +- MOW delete now uses partial column updates with the new optimizer. [#38751](https://github.com/apache/doris/pull/38751) + +- Fix the potential duplicate key issue in MOW tables in compute-storage decoupled mode. [#39018](https://github.com/apache/doris/pull/39018) + +- Resolve the issue where MOW unique and duplicate tables cannot modify column order. [#37067](https://github.com/apache/doris/pull/37067) + +- Fix the potential data correctness issue caused by segcompaction.
[#37760](https://github.com/apache/doris/pull/37760) + +- Resolve the potential memory leak issue during column updates. [#37706](https://github.com/apache/doris/pull/37706) + +**04 Other** + +- Fix the small probability of exceptions in TOPN queries. [#39119](https://github.com/apache/doris/pull/39119) [#39199](https://github.com/apache/doris/pull/39199) + +- Resolve the issue where auto-increment IDs may duplicate during FE restart. [#37306](https://github.com/apache/doris/pull/37306) + +- Fix the potential queuing issue in the delete operation priority queue. [#37169](https://github.com/apache/doris/pull/37169) + +- Optimize the delete retry logic. [#37363](https://github.com/apache/doris/pull/37363) + +- Resolve the issue with `bucket = 0` in table creation statements under the new optimizer. [#38971](https://github.com/apache/doris/pull/38971) + +- Fix the issue where FE reports success incorrectly when image generation fails. [#37508](https://github.com/apache/doris/pull/37508) + +- Resolve the issue where using the wrong nodename during FE offline nodes may cause inconsistent FE members. [#37987](https://github.com/apache/doris/pull/37987) + +- Fix the issue where CCR partition addition may fail. [#37295](https://github.com/apache/doris/pull/37295) + +- Resolve the `int32` overflow issue in inverted index files. [#38891](https://github.com/apache/doris/pull/38891) + +- Fix the issue where TRUNCATE TABLE failure may cause BE to fail to go offline. [#37334](https://github.com/apache/doris/pull/37334) + +- Resolve the issue where publish cannot continue due to null pointers. [#37724](https://github.com/apache/doris/pull/37724) [#37531](https://github.com/apache/doris/pull/37531) + +- Fix the potential coredump issue when manually triggering disk migration. [#37712](https://github.com/apache/doris/pull/37712) + +### Compute-Storage Decoupled + +- Fixed the issue where `show create table` might display the `file_cache_ttl_seconds` attribute twice. 
[#38052](https://github.com/apache/doris/pull/38052) + +- Fixed the issue where segment Footer TTL was not set correctly after setting file cache TTL. [#37485](https://github.com/apache/doris/pull/37485) + +- Fixed the issue where file cache might cause coredump due to massive conversion of cache types. [#38518](https://github.com/apache/doris/pull/38518) + +- Fixed the potential file descriptor (fd) leak in file cache. [#38051](https://github.com/apache/doris/pull/38051) + +- Fixed the issue where schema change Job overwriting compaction Job prevented base tablet compaction from completing normally. [#38210](https://github.com/apache/doris/pull/38210) + +- Fixed the potential inaccuracy of base compaction score due to data race. [#38006](https://github.com/apache/doris/pull/38006) + +- Fixed the issue where error messages from imports might not be uploaded correctly to object storage. [#38359](https://github.com/apache/doris/pull/38359) + +- Fixed the inconsistency in return information between compute-storage decoupled mode and storage and compute integration mode for 2PC imports. [#38076](https://github.com/apache/doris/pull/38076) + +- Fix the issue where incorrect file size setting during file cache warm-up leads to coredump. [#38939](https://github.com/apache/doris/pull/38939) + +- Fixed the issue where partial column updates did not correctly dequeue delete operations. [#37151](https://github.com/apache/doris/pull/37151) + +- Fixed compatibility issues with permission persistence in compute-storage decoupled mode. [#38136](https://github.com/apache/doris/pull/38136) [#37708](https://github.com/apache/doris/pull/37708) + +- Fixed the issue where observer did not retry correctly when encountering a `-230` error. [#37625](https://github.com/apache/doris/pull/37625) + +- Fixed the issue where `show load` with conditions did not perform correct analysis. 
[#37656](https://github.com/apache/doris/pull/37656) + +- Fixed the issue where `show streamload` in compute-storage decoupled mode caused BE coredump. [#37903](https://github.com/apache/doris/pull/37903) + +- Fixed the issue where `copy into` did not correctly verify column names in strict mode. [#37650](https://github.com/apache/doris/pull/37650) + +- Fixed the issue where multi-stream imports into a single table lacked permissions. [#38878](https://github.com/apache/doris/pull/38878) + +- Fixed the potential overflow issue in `getVersionUpdateTimeMs`. [#38074](https://github.com/apache/doris/pull/38074) + +- Fixed the issue where FE azure blob list was not implemented correctly. [#37986](https://github.com/apache/doris/pull/37986) + +- Fixed the issue where inaccurate azure blob recycling time calculation prevented recycling. [#37535](https://github.com/apache/doris/pull/37535) + +- Fixed the issue where inverted index files were not deleted in compute-storage decoupled mode. [#38306](https://github.com/apache/doris/pull/38306) + +### Lakehouse + +- Fixed the issue with reading binary data from Oracle Catalog. [#37078](https://github.com/apache/doris/pull/37078) + +- Fixed the potential deadlock issue when acquiring external table metadata in multi-FE scenarios. [#37756](https://github.com/apache/doris/pull/37756) + +- Fixed the issue where JNI scanner failure caused BE nodes to crash. [#37697](https://github.com/apache/doris/pull/37697) + +- Fixed the issue with slow reading of date types from Trino Connector Catalog. [#37266](https://github.com/apache/doris/pull/37266) + +- Optimized kerberos authentication logic for Hive Catalog. [#37301](https://github.com/apache/doris/pull/37301) + +- Fixed the issue where region attributes might be parsed incorrectly when parsing MinIO properties. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed the issue where creating too many FileSystems by FE caused memory leaks. 
[#36954](https://github.com/apache/doris/pull/36954) + +- Fixed the issue with reading incorrect time zone information from Paimon. [#37716](https://github.com/apache/doris/pull/37716) + +- Fixed the potential thread leak issue caused by Hive write-back operations. [#36990](https://github.com/apache/doris/pull/36990) + +- Fixed the null pointer issue caused by enabling Hive metastore event synchronization. [#38421](https://github.com/apache/doris/pull/38421) + +- Fixed the issue where error messages were unclear or caused stalling when creating catalogs. [#37551](https://github.com/apache/doris/pull/37551) + +- Fixed the issue where reading Hive text format tables behaved differently from Hive. [#37638](https://github.com/apache/doris/pull/37638) + +- Fixed the logic error when switching between catalogs and databases. [#37828](https://github.com/apache/doris/pull/37828) + +### MySQL Compatibility + +- Fixed the issue where certain flags in the MySQL protocol were set incorrectly when SSL was enabled. [#38086](https://github.com/apache/doris/pull/38086) + +### Asynchronous Materialized Views + +- Fixed the issue where construction might fail when the base table had a very large number of partitions. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed the issue where nested materialized views incorrectly performed full table refreshes even when partition refreshes were possible. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixed the issue where partition refresh could not handle the simultaneous existence of valid and invalid dependencies when analyzing partition dependencies. [#38367](https://github.com/apache/doris/pull/38367) + +- Fixed the issue where the final result containing NULL type might cause asynchronous materialized views to fail. 
[#37019](https://github.com/apache/doris/pull/37019) + +- Fixed the planning error that might occur during transparent rewriting when both synchronous and asynchronous materialized views with the same name were present. [#37311](https://github.com/apache/doris/pull/37311) + +### Synchronous Materialized Views + +- The rewritten synchronous materialized views now can correctly perform partition pruning. [#38527](https://github.com/apache/doris/pull/38527) + +- When rewriting synchronous materialized views, those with unready data are no longer selected. [#38148](https://github.com/apache/doris/pull/38148) + +### Query Optimizer + +- Fixed the deadlock issue that might occur when queries and delete operations are performed simultaneously. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the issue where bucket pruning might incorrectly prune on decimal column buckets. [#37889](https://github.com/apache/doris/pull/37889) + +- Fixed the issue where planning might be incorrect when mark join participates in join reorder. [#39152](https://github.com/apache/doris/pull/39152) + +- Fixed the issue where the result is incorrect when the correlation condition of a correlated subquery is not a simple column. [#37644](https://github.com/apache/doris/pull/37644) + +- Fixed the issue where partition pruning cannot correctly handle or expressions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the planning error that might occur when optimizing the execution order of JOIN and AGG. [#37343](https://github.com/apache/doris/pull/37343) + +- Fixed the issue where `str_to_date` performs incorrect constant folding calculations on datev1 types. [#37360](https://github.com/apache/doris/pull/37360) + +- Fixed the issue where the ACOS function's constant folding returns non-NaN values. [#37932](https://github.com/apache/doris/pull/37932) + +- Fixed the occasional planning error: "The children format needs to be [WhenClause+, DefaultValue?]". 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where planning might be incorrect when the projection includes window functions and there is both the original column and its alias. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where planning might report an error when the aggregation parameter contains a lambda expression. [#37109](https://github.com/apache/doris/pull/37109) + +- Fixed the insert error that might occur in extreme cases: "MultiCastDataSink cannot be cast to DataStreamSink". [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where the new optimizer does not correctly handle `char(0)/varchar(0)` when creating a table. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the incorrect behavior of `char(255) toSql`. [#37340](https://github.com/apache/doris/pull/37340) + +- Fixed the issue where the nullable attribute within the `agg_state` type might lead to planning errors. [#37489](https://github.com/apache/doris/pull/37489) +- Fixed the issue where row count statistics are inaccurate during mark Join. [#38270](https://github.com/apache/doris/pull/38270) + +### Query Execution + +- Fixed issues where the Pipeline execution engine was stuck, causing queries to not end, in multiple scenarios. [#38657](https://github.com/apache/doris/pull/38657), [#38206](https://github.com/apache/doris/pull/38206), [#38885](https://github.com/apache/doris/pull/38885), [#38151](https://github.com/apache/doris/pull/38151), [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed the coredump issue caused by NULL and non-NULL columns during set difference calculations. [#38750](https://github.com/apache/doris/pull/38750) + +- Fixed the error when using the DECIMAL type with pure decimals in delete statements. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed the issue where the `width_bucket` function returned incorrect results. 
[#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error when a single row of data was very large and the result set was also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the coredump issue caused by incorrect release of rpc connections during single-replica imports. [#38087](https://github.com/apache/doris/pull/38087) + +- Fixed the coredump issue caused by processing NULL values with the `foreach` function. [#37349](https://github.com/apache/doris/pull/37349) + +- Fixed the issue where stddev returned incorrect results for DECIMALV2 types. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the slow performance of `bitmap union` calculations. [#37816](https://github.com/apache/doris/pull/37816) + +- Fixed the issue where RowsProduced for aggregation operators was not set in the profile. [#38271](https://github.com/apache/doris/pull/38271) + +- Fixed the overflow issue when calculating the number of buckets for the hash table under hash join. [#37193](https://github.com/apache/doris/pull/37193), [#37493](https://github.com/apache/doris/pull/37493) + +- Fixed the inaccurate recording of the `jemalloc cache memory tracker`. [#37464](https://github.com/apache/doris/pull/37464) + +- Added the `enable_stacktrace` configuration option, allowing users to control whether exception stacks are output in BE logs. [#37713](https://github.com/apache/doris/pull/37713) + +- Fixed the issue where Arrow Flight SQL did not work correctly when `enable_parallel_result_sink` was set to false. [#37779](https://github.com/apache/doris/pull/37779) + +- Fixed the incorrect use of colocate Join. [#37361](https://github.com/apache/doris/pull/37361), [#37729](https://github.com/apache/doris/pull/37729) + +- Fixed the calculation overflow issue of the `round` function on DECIMAL128 types. 
[#37733](https://github.com/apache/doris/pull/37733), [#38106](https://github.com/apache/doris/pull/38106) + +- Fixed the coredump issue when passing a const string to the `sleep` function. [#37681](https://github.com/apache/doris/pull/37681) + +- Increased the queue length for audit logs, solving the issue where audit logs could not be recorded normally under high concurrency scenarios with thousands of concurrent connections. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed the issue where creating a workload group caused too many threads, leading to BE coredump. [#38096](https://github.com/apache/doris/pull/38096) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the transaction rollback issue caused by `insert overwrite auto partition`. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the issue where the TimeUtils formatter did not use the correct time zone. [#37465](https://github.com/apache/doris/pull/37465) + +- Fixed the issue where results were incorrect under constant folding scenarios for week/yearweek. [#37376](https://github.com/apache/doris/pull/37376) + +- Fixed the issue where the `convert_tz` function returned incorrect results. [#37358](https://github.com/apache/doris/pull/37358), [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by `percentile_approx` during rolling upgrades. [#39321](https://github.com/apache/doris/pull/39321) + +- Fixed the coredump issue caused by the `mod` function when encountering abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where the hash table was not fully built when the broadcast join probe started running. 
[#37643](https://github.com/apache/doris/pull/37643) + +- Fixed the issue where executing the same expression in multithreaded environments might lead to incorrect results for Java UDFs. [#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by incorrect return types of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the issue where the `json_replace` function returned incorrect types. [#37014](https://github.com/apache/doris/pull/37014) + +- Fixed the issue where the nullable attribute setting was unreasonable for the `percentile` aggregation function. [#37330](https://github.com/apache/doris/pull/37330) + +- Fixed the issue where the results of the `histogram` function were unstable. [#38608](https://github.com/apache/doris/pull/38608) + +- Fixed the issue where task state was displayed incorrectly in the profile. [#38082](https://github.com/apache/doris/pull/38082) + +- Fixed the issue where some queries were incorrectly canceled when the system just started. [#37662](https://github.com/apache/doris/pull/37662) + +### Semi-Structured Data Management + +- Fix some issues with time series compression. [#39170](https://github.com/apache/doris/pull/39170) [#39176](https://github.com/apache/doris/pull/39176) + +- Fix the issue of incorrect index size statistics during compression. [#37232](https://github.com/apache/doris/pull/37232) + +- Fix the potential incorrect matching of ultra-long strings without tokenization in inverted indexes. [#37679](https://github.com/apache/doris/pull/37679) [#38218](https://github.com/apache/doris/pull/38218) + +- Fix the high memory usage issue of `array_range` and `array_with_const` functions when dealing with large data volumes. [#38284](https://github.com/apache/doris/pull/38284) [#37495](https://github.com/apache/doris/pull/37495) + +- Fix the potential coredump issue when selecting columns of ARRAY / MAP / STRUCT types.
[#37936](https://github.com/apache/doris/pull/37936) + +- Fix the import failure issue caused by simdjson parsing errors when specifying jsonpath in Stream Load. [#38490](https://github.com/apache/doris/pull/38490) + +- Fix the exception handling issue when there are duplicate keys in JSON data. [#38146](https://github.com/apache/doris/pull/38146) + +- Fix the potential query error after DROP INDEX. [#37646](https://github.com/apache/doris/pull/37646) + +- Fix the error return issue in row merging checks during index compression. [#38732](https://github.com/apache/doris/pull/38732) + +- Inverted index v2 format now supports renaming columns. [#38079](https://github.com/apache/doris/pull/38079) + +- Fix the coredump issue when the `MATCH` function matches an empty string without an index. [#37947](https://github.com/apache/doris/pull/37947) + +- Fix the handling of NULL values in inverted indexes. [#37921](https://github.com/apache/doris/pull/37921) [#37842](https://github.com/apache/doris/pull/37842) [#38741](https://github.com/apache/doris/pull/38741) + +- Fix the incorrect `row_store_page_size` after FE restart. [#38240](https://github.com/apache/doris/pull/38240) + +### Other + +- Fix the timezone configuration issue. The default timezone is no longer fixed at UTC+8 and is now obtained from system configuration. [#37294](https://github.com/apache/doris/pull/37294) + +- Fix the class conflict issue when using ranger due to multiple JSR specification implementations. [#37575](https://github.com/apache/doris/pull/37575) + +- Fix the potential uninitialized field issue in some BE code. [#37403](https://github.com/apache/doris/pull/37403) + +- Fix the error in delete statements for random distributed tables. [#37985](https://github.com/apache/doris/pull/37985) + +- Fix the incorrect requirement for `alter_priv` permission on the base table when creating a synchronized materialized view. 
[#38011](https://github.com/apache/doris/pull/38011) + +- Fix the issue of not authenticating resources when used in TVF. [#36928](https://github.com/apache/doris/pull/36928) + + +## Credits + +Thanks to all who contributed to this release: + +@133tosakarin, @924060929, @AshinGau, @Baymine, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @Ceng23333, @DarvenDuan, @FreeOnePlus, @Gabriel39, @HappenLee, @JNSimba, @Jibing-Li, @KassieZ, @Lchangliang, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Tech-Circle-48, @Vallishp, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @cjj2010, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @gavinchou, @ghkang98, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @kaijchen, @kaka11chen, @keanji-x, @liaoxin01, @liutang123, @luwei16, @luzhijing, @lxr599, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @w41ter, @wangbo, @wangshuo128, @whutpencil, @wsjz, @wuwenchi, @wyxxxcat, @xiaokang, @xiedeyantu, @xinyiZzz, @xy720, @xzj7019, @yagagagaga, @yiguolei, @yujun777, @z404289981, @zclllyybb, @zddr, @zfr9527, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zjj, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md new file mode 100644 index 0000000000000..0ab6a828ab95d --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md @@ -0,0 +1,341 @@ +--- +{ + "title": "Release 3.0.2", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.2 version was officially released on October 15, 2024, featuring updates and improvements in compute-storage decoupling, data storage, lakehouse, query optimizer, query execution and more.
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +### Storage + +- Limited the number of tablets in a single backup task to prevent FE memory overflow. [#40518](https://github.com/apache/doris/pull/40518) +- The `SHOW PARTITIONS` command now displays the `CommittedVersion` of partitions. [#28274](https://github.com/apache/doris/pull/28274) + +### Other + +- The default printing mode (asynchronous) of `fe.log` now includes file line number information. If performance issues are encountered due to line number output, please switch to BRIEF mode. [#39419](https://github.com/apache/doris/pull/39419) +- The default value of the session variable `ENABLE_PREPARED_STMT_AUDIT_LOG` has been changed from `true` to `false`, and the audit log of prepare statements will no longer be printed. [#38865](https://github.com/apache/doris/pull/38865) +- The default value of the session variable `max_allowed_packet` has been adjusted from 1MB to 16MB to align with MySQL 8.4. [#38697](https://github.com/apache/doris/pull/38697) +- The JVM of FE and BE defaults to using the UTF-8 character set. [#39521](https://github.com/apache/doris/pull/39521) + +## New Features + +### Storage + +- Backup and recovery now support clearing tables or partitions that are not in the backup. [#39028](https://github.com/apache/doris/pull/39028) + +### Compute-Storage Decoupled + +- Support for parallel recycling of expired data on multiple tablets. [#37630](https://github.com/apache/doris/pull/37630) +- Support for changing storage vaults through `ALTER` statements. [#38685](https://github.com/apache/doris/pull/38685) [#37606](https://github.com/apache/doris/pull/37606) +- Support for importing a large number of tablets (5000+) in a single transaction (experimental feature). 
[#38243](https://github.com/apache/doris/pull/38243) +- Support for automatically aborting pending transactions caused by reasons such as node restarts, solving the issue of pending transactions blocking decommission or schema change. [#37669](https://github.com/apache/doris/pull/37669) +- A new session variable `enable_segment_cache` has been added to control whether to use segment cache during queries (default is `true`). [#37141](https://github.com/apache/doris/pull/37141) +- Resolved the issue of not being able to import a large amount of data during schema changes in compute-storage decoupled mode. [#39558](https://github.com/apache/doris/pull/39558) +- Support for adding multiple follower roles of FE in compute-storage decoupled mode. [#38388](https://github.com/apache/doris/pull/38388) +- Support for using memory as file cache to accelerate queries in environments with no disks or low-performance HDDs. [#38811](https://github.com/apache/doris/pull/38811) + +### Lakehouse + +- New Lakesoul Catalog has been added. [Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- A new system table `catalog_meta_cache_statistics` has been added to view the usage of various metadata caches in external catalog. [#40155](https://github.com/apache/doris/pull/40155) + +### Query Optimizer + +- Support for `is [not] true/false` expressions. [#38623](https://github.com/apache/doris/pull/38623) + +### Query Execution + +- A new CRC32 function has been added. [#38204](https://github.com/apache/doris/pull/38204) +- New aggregate functions skew and kurt have been added. [#41277](https://github.com/apache/doris/pull/41277) +- Profiles are now persisted to the FE's disk to retain more profiles. [#33690](https://github.com/apache/doris/pull/33690) +- A new system table `workload_group_privileges` has been added to view permission information related to workload groups. 
[#38436](https://github.com/apache/doris/pull/38436) +- A new system table `workload_group_resource_usage` has been added to monitor resource statistics of workload groups. [#39177](https://github.com/apache/doris/pull/39177) +- Workload groups now support limiting reads of local IO and remote IO. [#39012](https://github.com/apache/doris/pull/39012) +- Workload groups now support cgroupv2 to limit CPU usage. [#39374](https://github.com/apache/doris/pull/39374) +- A new system table `information_schema.partitions` has been added to view some table creation attributes. [#40636](https://github.com/apache/doris/pull/40636) + +### Other + +- Support for using the `SHOW` statement to display BE's configuration information, such as `SHOW BACKEND CONFIG LIKE ${pattern}`. [#36525](https://github.com/apache/doris/pull/36525) + +## Improvements + +### Load + +- Improved the import efficiency of routine load when encountering frequent EOFs from Kafka. [#39975](https://github.com/apache/doris/pull/39975) +- The stream load result now includes the time taken to read HTTP data, `ReceiveDataTimeMs`, which can quickly determine slow stream load issues caused by network reasons. [#40735](https://github.com/apache/doris/pull/40735) +- Optimized the routine load timeout logic to avoid frequent timeouts during inverted index and mow writes. [#40818](https://github.com/apache/doris/pull/40818) + +### Storage + +- Support for batch addition of partitions. [#37114](https://github.com/apache/doris/pull/37114) + +### Compute-Storage Decoupled + +- Added the meta-service HTTP interface `/MetaService/http/show_meta_ranges` to facilitate the statistics of KV distribution in FDB. [#39208](https://github.com/apache/doris/pull/39208) +- The meta-service/recycler stop script ensures that the process fully exits before returning. 
[#40218](https://github.com/apache/doris/pull/40218) +- Support for using the session variable `version_comment` (Cloud Mode) to display the current deployment mode as compute-storage decoupled. [#38269](https://github.com/apache/doris/pull/38269) +- Fixed the detailed message returned when transaction submission fails. [#40584](https://github.com/apache/doris/pull/40584) +- Support for using one meta-service process to provide both metadata services and data recycling services. [#40223](https://github.com/apache/doris/pull/40223) +- Optimized the default configuration of file_cache to avoid potential issues when not set. [#41421](https://github.com/apache/doris/pull/41421) [#41507](https://github.com/apache/doris/pull/41507) +- Improved query performance by batch retrieving the version of multiple partitions. [#38949](https://github.com/apache/doris/pull/38949) +- Delayed the redistribution of tablets to avoid query performance issues caused by temporary network fluctuations. [#40371](https://github.com/apache/doris/pull/40371) +- Optimized the read-write lock logic in the balance. [#40633](https://github.com/apache/doris/pull/40633) +- Enhanced the robustness of file cache in handling TTL filenames during restarts/crashes. [#40226](https://github.com/apache/doris/pull/40226) +- Added the BE HTTP interface `/api/file_cache?op=hash` to facilitate the calculation of the hash file names of segment files on disk. [#40831](https://github.com/apache/doris/pull/40831) +- Optimized the unified naming to be compatible with using compute group to represent BE groups (original cloud cluster). [#40767](https://github.com/apache/doris/pull/40767) +- Optimized the waiting time for obtaining locks when calculating delete bitmaps in primary key tables. [#40341](https://github.com/apache/doris/pull/40341) +- When there are many delete bitmaps in primary key tables, optimized the high CPU consumption during queries by pre-merging multiple delete bitmaps. 
[#40204](https://github.com/apache/doris/pull/40204) +- Support for managing FE/BE nodes in compute-storage decoupled mode through SQL statements, hiding the logic of direct interaction with meta-service when deploying in compute-storage decoupled mode. [#40264](https://github.com/apache/doris/pull/40264) +- Added a script for rapid deployment of FDB. [#39803](https://github.com/apache/doris/pull/39803) +- Optimized the output of `SHOW CACHE HOTSPOT` to unify the column name style with other `SHOW` statements. [#41322](https://github.com/apache/doris/pull/41322) +- When using a storage vault as the storage backend, disallowed the use of `latest_fs()` to avoid binding different storage backends to the same table. [#40516](https://github.com/apache/doris/pull/40516) +- Optimized the timeout strategy for calculating delete bitmaps when importing mow tables. [#40562](https://github.com/apache/doris/pull/40562) [#40333](https://github.com/apache/doris/pull/40333) +- The enable_file_cache in be.conf is now enabled by default in compute-storage decoupled mode. [#41502](https://github.com/apache/doris/pull/41502) + +### Lakehouse + +- When reading tables in CSV format, support for the session `keep_carriage_return` setting to control the reading behavior of the `\r` symbol. [#39980](https://github.com/apache/doris/pull/39980) +- The default maximum memory of BE's JVM has been adjusted to 2GB (affecting only new deployments). [#41403](https://github.com/apache/doris/pull/41403) +- Hive Catalog has added `hive.recursive_directories_table` and `hive.ignore_absent_partitions` properties to specify whether to recursively traverse data directories and whether to ignore missing partitions. [#39494](https://github.com/apache/doris/pull/39494) +- Optimized the Catalog refresh logic to avoid generating a large number of connections during refresh. 
[#39205](https://github.com/apache/doris/pull/39205) +- `SHOW CREATE DATABASE` and `SHOW CREATE TABLE` for external data sources now display location information. [#39179](https://github.com/apache/doris/pull/39179) +- The new optimizer supports inserting data into JDBC external tables using the `INSERT INTO` statement. [#41511](https://github.com/apache/doris/pull/41511) +- MaxCompute Catalog now supports complex data types. [#39259](https://github.com/apache/doris/pull/39259) +- Optimized the logic for reading and merging data shards of external tables. [#38311](https://github.com/apache/doris/pull/38311) +- Optimized some refresh strategies for metadata caches of external tables. [#38506](https://github.com/apache/doris/pull/38506) +- Paimon tables now support pushing down `IN/NOT IN` predicates. [#38390](https://github.com/apache/doris/pull/38390) +- Compatible with tables created in Parquet format by Paimon version 0.9. [#41020](https://github.com/apache/doris/pull/41020) + +### Asynchronous Materialized Views + +- Building asynchronous materialized views now supports the use of both immediate and starttime. [#39573](https://github.com/apache/doris/pull/39573) +- Asynchronous materialized views based on external tables will refresh the metadata cache of the external tables before refreshing the materialized views, ensuring construction based on the latest external table data. [#38212](https://github.com/apache/doris/pull/38212) +- Partition incremental construction now supports rolling up according to weekly and quarterly granularities. [#39286](https://github.com/apache/doris/pull/39286) + +### Query Optimizer + +- The aggregate function `GROUP_CONCAT` now supports the use of both `DISTINCT` and `ORDER BY`. [#38080](https://github.com/apache/doris/pull/38080) +- Optimized the collection and use of statistical information, as well as the logic for estimating row counts and cost calculations, to generate more efficient and stable execution plans. 
+- Window function partition data pre-filtering now supports cases containing multiple window functions. [#38393](https://github.com/apache/doris/pull/38393) + +### Query Execution + +- Reduced query latency by running prepare pipeline tasks in parallel. [#40874](https://github.com/apache/doris/pull/40874) +- Display Catalog information in Profile. [#38283](https://github.com/apache/doris/pull/38283) +- Optimized the computational performance of `IN` filtering conditions. [#40917](https://github.com/apache/doris/pull/40917) +- Supported cgroupv2 in K8S to limit Doris's memory usage. [#39256](https://github.com/apache/doris/pull/39256) +- Optimized the performance of converting strings to datetime types. [#38385](https://github.com/apache/doris/pull/38385) +- When a `string` is a decimal number, support casting it to an `int`, which will be more compatible with certain behaviors of MySQL. [#38847](https://github.com/apache/doris/pull/38847) + +### Semi-Structured Data Management + +- Optimized the performance of inverted index matching. [#41122](https://github.com/apache/doris/pull/41122) +- Temporarily prohibited the creation of inverted indexes with tokenization on arrays. [#39062](https://github.com/apache/doris/pull/39062) +- `explode_json_array` now supports binary JSON types. [#37278](https://github.com/apache/doris/pull/37278) +- IP data types now support bloomfilter indexes. [#39253](https://github.com/apache/doris/pull/39253) +- IP data types now support row storage. [#39258](https://github.com/apache/doris/pull/39258) +- Nested data types such as ARRAY, MAP, and STRUCT now support schema changes. [#39210](https://github.com/apache/doris/pull/39210) +- When creating MTMV, automatically truncate KEYs encountered in VARIANT data types. [#39988](https://github.com/apache/doris/pull/39988) +- Lazy loading of inverted indexes during queries to improve performance. 
[#38979](https://github.com/apache/doris/pull/38979) +- `add inverted index file size for open file`. [#37482](https://github.com/apache/doris/pull/37482) +- Reduced access to object storage interfaces during compaction to improve performance. [#41079](https://github.com/apache/doris/pull/41079) +- Added three new query profile metrics related to inverted indexes. [#36696](https://github.com/apache/doris/pull/36696) +- Reduced cache overhead for non-PreparedStatement SQL to improve performance. [#40910](https://github.com/apache/doris/pull/40910) +- Pre-warming cache now supports inverted indexes. [#38986](https://github.com/apache/doris/pull/38986) +- Inverted indexes are now cached immediately after writing. [#39076](https://github.com/apache/doris/pull/39076) + +### Compatibility + +- Fixed the issue of Thrift ID incompatibility on the master with branch-2.1. [#41057](https://github.com/apache/doris/pull/41057) + +### Other + +- BE HTTP API now supports authentication; set config::enable_all_http_auth to true (default is false) when authentication is required. [#39577](https://github.com/apache/doris/pull/39577) +- Optimized the user permissions required for the REFRESH operation. Permissions have been relaxed from ALTER to SHOW. [#39008](https://github.com/apache/doris/pull/39008) +- Reduced the range of nextId when calling advanceNextId(). [#40160](https://github.com/apache/doris/pull/40160) +- Optimized the caching mechanism for Java UDFs. [#40404](https://github.com/apache/doris/pull/40404) + +## Bug Fixes + +### Load + +- Fixed the issue where `abortTransaction` did not handle return codes. [#41275](https://github.com/apache/doris/pull/41275) +- Fixed the issue where transactions failed to commit or abort in compute-storage decoupled mode without calling `afterCommit/afterAbort`. 
[#41267](https://github.com/apache/doris/pull/41267) +- Fixed the issue where Routine Load could not work properly when modifying consumer offsets in compute-storage decoupled mode. [#39159](https://github.com/apache/doris/pull/39159) +- Fixed the issue of repeatedly closing file handles when obtaining error log file paths. [#41320](https://github.com/apache/doris/pull/41320) +- Fixed the issue of incorrect job progress caching for Routine Load in compute-storage decoupled mode. [#39313](https://github.com/apache/doris/pull/39313) +- Fixed the issue where Routine Load could get stuck when failing to commit transactions in compute-storage decoupled mode. [#40539](https://github.com/apache/doris/pull/40539) +- Fixed the issue where Routine Load kept reporting data quality check errors in compute-storage decoupled mode. [#39790](https://github.com/apache/doris/pull/39790) +- Fixed the issue where Routine Load did not check transactions before committing in compute-storage decoupled mode. [#39775](https://github.com/apache/doris/pull/39775) +- Fixed the issue where Routine Load did not check transactions before aborting in compute-storage decoupled mode. [#40463](https://github.com/apache/doris/pull/40463) +- Fixed the issue where cluster keys did not support certain data types. [#38966](https://github.com/apache/doris/pull/38966) +- Fixed the issue of transactions being repeatedly committed. [#39786](https://github.com/apache/doris/pull/39786) +- Fixed the issue of use after free with WAL when BE exits. [#33131](https://github.com/apache/doris/pull/33131) +- Fixed the issue where WAL playback did not skip completed import transactions in compute-storage decoupled mode. [#41262](https://github.com/apache/doris/pull/41262) +- Fixed the logic for selecting BE in group commit in compute-storage decoupled mode. 
[#39986](https://github.com/apache/doris/pull/39986) [#38644](https://github.com/apache/doris/pull/38644) +- Fixed the issue where BE might crash when group commit was enabled for insert into. [#39339](https://github.com/apache/doris/pull/39339) +- Fixed the issue where insert into with group commit enabled might get stuck. [#39391](https://github.com/apache/doris/pull/39391) +- Fixed the issue where not enabling the group commit option during import might result in a table not found error. [#39731](https://github.com/apache/doris/pull/39731) +- Fixed the issue of transaction submission timeouts due to too many tablets. [#40031](https://github.com/apache/doris/pull/40031) +- Fixed the issue of concurrent opens with Auto Partition. [#38605](https://github.com/apache/doris/pull/38605) +- Fixed the issue of import lock granularity being too large. [#40134](https://github.com/apache/doris/pull/40134) +- Fixed the issue of coredumps caused by zero-length varchars. [#40940](https://github.com/apache/doris/pull/40940) +- Fixed the issue of incorrect index Id values in log prints. [#38790](https://github.com/apache/doris/pull/38790) +- Fixed the issue of memtable shifting not closing BRPC streaming. [#40105](https://github.com/apache/doris/pull/40105) +- Fixed the issue of inaccurate bvar statistics during memtable shifting. [#39075](https://github.com/apache/doris/pull/39075) +- Fixed the issue of multi-replication fault tolerance during memtable shifting. [#38003](https://github.com/apache/doris/pull/38003) +- Fixed the issue of incorrect message length calculations for Routine Load with multiple tables in one stream. [#40367](https://github.com/apache/doris/pull/40367) +- Fixed the issue of inaccurate progress reporting for Broker Load. [#40325](https://github.com/apache/doris/pull/40325) +- Fixed the issue of inaccurate data scan volume reporting for Broker Load. 
[#40694](https://github.com/apache/doris/pull/40694) +- Fixed the issue of concurrency with Routine Load in compute-storage decoupled mode. [#39242](https://github.com/apache/doris/pull/39242) +- Fixed the issue of Routine Load jobs being canceled in compute-storage decoupled mode. [#39514](https://github.com/apache/doris/pull/39514) +- Fixed the issue of progress not being reset when deleting Kafka topics. [#38474](https://github.com/apache/doris/pull/38474) +- Fixed the issue of updating progress during transaction state transitions in Routine Load. [#39311](https://github.com/apache/doris/pull/39311) +- Fixed the issue of Routine Load switching from a paused state to a paused state. [#40728](https://github.com/apache/doris/pull/40728) +- Fixed the issue of Stream Load records being missed due to database deletion. [#39360](https://github.com/apache/doris/pull/39360) + +### Storage + +- Fixed the issue of missing storage policies. [#38700](https://github.com/apache/doris/pull/38700) +- Fixed the issue of errors during cross-version backup and recovery. [#38370](https://github.com/apache/doris/pull/38370) +- Fixed the NPE issue with ccr binlog. [#39909](https://github.com/apache/doris/pull/39909) +- Fixed potential issues with duplicate keys in mow. [#41309](https://github.com/apache/doris/pull/41309) [#39791](https://github.com/apache/doris/pull/39791) [#39958](https://github.com/apache/doris/pull/39958) [#38369](https://github.com/apache/doris/pull/38369) [#38331](https://github.com/apache/doris/pull/38331) +- Fixed the issue of not being able to write after backup and recovery in high-frequency write scenarios. [#40118](https://github.com/apache/doris/pull/40118) [#38321](https://github.com/apache/doris/pull/38321) +- Fixed the issue of data errors potentially triggered by deleting empty strings and schema changes. [#41064](https://github.com/apache/doris/pull/41064) +- Fixed the issue of incorrect statistics due to column updates. 
[#40880](https://github.com/apache/doris/pull/40880) +- Limited the size of tablet meta pb to prevent BE crashes due to oversized meta. [#39455](https://github.com/apache/doris/pull/39455) +- Fixed the potential column misalignment issue with the new optimizer in `begin; insert into values; commit`. [#39295](https://github.com/apache/doris/pull/39295) + +### Compute-Storage Decoupled + +- Fixed the issue where the tablet distribution might be inconsistent across multiple FEs in compute-storage decoupled mode. [#41458](https://github.com/apache/doris/pull/41458) +- Fixed the issue where TVF might not work in multi-computing group environments. [#39249](https://github.com/apache/doris/pull/39249) +- Fixed the issue where compaction used resources that had already been released when BE exited in compute-storage decoupled mode. [#39302](https://github.com/apache/doris/pull/39302) +- Fixed the issue where automatic start-stop might cause FE replay to get stuck. [#40027](https://github.com/apache/doris/pull/40027) +- Fixed the issue where the BE status and the stored status in meta-service were inconsistent. [#40799](https://github.com/apache/doris/pull/40799) +- Fixed the issue where the FE->meta-service connection pool could not automatically expire and reconnect. [#41202](https://github.com/apache/doris/pull/41202) [#40661](https://github.com/apache/doris/pull/40661) +- Fixed the issue where some tablets might repeatedly undergo unexpected balance processes during rebalance. [#39792](https://github.com/apache/doris/pull/39792) +- Fixed the issue where storage vault permissions were lost after FE restarted. [#40260](https://github.com/apache/doris/pull/40260) +- Fixed the issue where tablet row counts and other statistical information might be incomplete due to FDB scan range pagination. [#40494](https://github.com/apache/doris/pull/40494) +- Fixed the performance issue caused by a large number of aborted transactions associated with the same label. 
[#40606](https://github.com/apache/doris/pull/40606) +- Fixed the issue where `commit_txn` did not automatically re-enter, maintaining consistent behavior between compute-storage decoupled and integrated modes. [#39615](https://github.com/apache/doris/pull/39615) +- Fixed the issue where the number of projected columns increased when dropping columns. [#40187](https://github.com/apache/doris/pull/40187) +- Fixed the issue where delete statements did not correctly handle return values, causing data to still be visible after deletion. [#39428](https://github.com/apache/doris/pull/39428) +- Fixed the coredump issue caused by rowset metadata competition during file cache preheating. [#39361](https://github.com/apache/doris/pull/39361) +- Fixed the issue where the entire cache space would be used up when TTL cache enabled LRU eviction. [#39814](https://github.com/apache/doris/pull/39814) +- Fixed the issue where temporary files could not be recycled when importing commit rowset failed with HDFS storage backend. [#40215](https://github.com/apache/doris/pull/40215) + +### Lakehouse + +- Fixed some issues with predicate pushdown in JDBC Catalog. [#39064](https://github.com/apache/doris/pull/39064) +- Fixed the issue of not being able to read when `STRUCT` type columns are missing in Parquet format. [#38718](https://github.com/apache/doris/pull/38718) +- Fixed the issue of FileSystem leaks on the FE side in some cases. [#38610](https://github.com/apache/doris/pull/38610) +- Fixed the issue of metadata cache information being inconsistent when Hive/Iceberg tables write back in some cases. [#40729](https://github.com/apache/doris/pull/40729) +- Fixed the issue of unstable partition ID generation for external tables in some cases. [#39325](https://github.com/apache/doris/pull/39325) +- Fixed the issue of external table queries selecting BE nodes in the blacklist in some cases.
[#39451](https://github.com/apache/doris/pull/39451) +- Optimized the timeout time for batch retrieval of external table partition information to avoid long-term thread occupation. [#39346](https://github.com/apache/doris/pull/39346) +- Fixed the issue of memory leaks when querying Hudi tables in some cases. [#41256](https://github.com/apache/doris/pull/41256) +- Fixed the issue of connection pool connection leaks in JDBC Catalog in some cases. [#39582](https://github.com/apache/doris/pull/39582) +- Fixed the issue of BE memory leaks in JDBC Catalog in some cases. [#41041](https://github.com/apache/doris/pull/41041) +- Fixed the issue of not being able to query Hudi data on Alibaba Cloud OSS. [#41316](https://github.com/apache/doris/pull/41316) +- Fixed the issue of not being able to read empty partitions in MaxCompute. [#40046](https://github.com/apache/doris/pull/40046) +- Fixed the issue of poor performance when querying Oracle through JDBC Catalog. [#41513](https://github.com/apache/doris/pull/41513) +- Fixed the issue of BE crashes when querying deletion vector of Paimon tables after enabling file cache features. [#39877](https://github.com/apache/doris/pull/39877) +- Fixed the issue of not being able to access Paimon tables on HDFS clusters with HA enabled. [#39806](https://github.com/apache/doris/pull/39806) +- Temporarily disabled the page index filtering feature of Parquet to avoid potential issues. [#38691](https://github.com/apache/doris/pull/38691) +- Fixed the issue of not being able to read unsigned types in Parquet files. [#39926](https://github.com/apache/doris/pull/39926) +- Fixed the issue of potential infinite loops when reading Parquet files in some cases. [#39523](https://github.com/apache/doris/pull/39523) + +### Asynchronous Materialized Views + +- Fixed the issue where partition construction might select the wrong table to track partitions if both sides have the same column names. 
[#40810](https://github.com/apache/doris/pull/40810) +- Fixed the issue where transparent rewrite partition compensation might result in incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed the issue where transparent rewrite did not take effect on external tables. [#38909](https://github.com/apache/doris/pull/38909) +- Fixed the issue where nested materialized views might not refresh properly. [#40433](https://github.com/apache/doris/pull/40433) + +### Synchronous Materialized Views + +- Fixed the issue where creating synchronous materialized views on MOW tables might result in incorrect query results. [#39171](https://github.com/apache/doris/pull/39171) + +### Query Optimizer + +- Fixed the issue where existing synchronous materialized views might not be usable after upgrading. [#41283](https://github.com/apache/doris/pull/41283) +- Fixed the issue of not correctly handling milliseconds when comparing datetime literals. [#40121](https://github.com/apache/doris/pull/40121) +- Fixed the issue of potential errors in conditional function partition pruning. [#39298](https://github.com/apache/doris/pull/39298) +- Fixed the issue where MOW tables with synchronous materialized views could not perform delete operations. [#39578](https://github.com/apache/doris/pull/39578) +- Fixed the issue where the nullable of slots in JDBC external table query predicates might be incorrectly planned, causing query errors. [#41014](https://github.com/apache/doris/pull/41014) + +### Query Execution + +- Fixed the memory leak issue caused by the use of runtime filters. [#39155](https://github.com/apache/doris/pull/39155) +- Fixed the issue of excessive memory usage by window functions. [#39581](https://github.com/apache/doris/pull/39581) +- Fixed a series of function compatibility issues during rolling upgrades. 
[#41023](https://github.com/apache/doris/pull/41023) [#40438](https://github.com/apache/doris/pull/40438) [#39648](https://github.com/apache/doris/pull/39648) +- Fixed the issue of incorrect results with `encryption_function` when used with constants. [#40201](https://github.com/apache/doris/pull/40201) +- Fixed the issue of errors when importing single-table materialized views. [#39061](https://github.com/apache/doris/pull/39061) +- Fixed the issue of incorrect partition result calculations for window functions. [#39100](https://github.com/apache/doris/pull/39100) [#40761](https://github.com/apache/doris/pull/40761) +- Fixed the issue of incorrect calculations for topn when null values are present. [#39497](https://github.com/apache/doris/pull/39497) +- Fixed the issue of incorrect results with the `map_agg` function. [#39743](https://github.com/apache/doris/pull/39743) +- Fixed the issue of incorrect messages returned by cancel. [#38982](https://github.com/apache/doris/pull/38982) +- Fixed the issue of BE core dumps caused by encrypt and decrypt functions. [#40726](https://github.com/apache/doris/pull/40726) +- Fixed the issue of queries getting stuck due to too many scanners in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Supported time types in runtime filters. [#38258](https://github.com/apache/doris/pull/38258) +- Fixed the issue of incorrect results with window funnel functions. [#40960](https://github.com/apache/doris/pull/40960) + +### Semi-Structured Data Management + +- Fixed the issue of match function errors when no indexes were present. [#38989](https://github.com/apache/doris/pull/38989) +- Fixed the issue of crashes when ARRAY data types were used as parameters for array_min/array_max functions. [#39492](https://github.com/apache/doris/pull/39492) +- Fixed the issue of nullable with the `array_enumerate_uniq` function. 
[#38384](https://github.com/apache/doris/pull/38384) +- Fixed the issue of bloomfilter indexes not being updated when adding or deleting columns. [#38431](https://github.com/apache/doris/pull/38431) +- Fixed the issue of es-catalog parsing exceptions with array data. [#39104](https://github.com/apache/doris/pull/39104) +- Fixed the issue of improper predicate push-down in es-catalog. [#40111](https://github.com/apache/doris/pull/40111) +- Fixed the issue of exceptions caused by modifying input data with `map()` and `struct()` functions. [#39699](https://github.com/apache/doris/pull/39699) +- Fixed the issue of index compaction crashes in special cases. [#40294](https://github.com/apache/doris/pull/40294) +- Fixed the issue of ARRAY type inverted indexes missing nullbitmaps. [#38907](https://github.com/apache/doris/pull/38907) +- Fixed the issue of incorrect results with the `count()` function on inverted indexes. [#41152](https://github.com/apache/doris/pull/41152) +- Fixed the issue of incorrect results with the `explode_map` function when using aliases. [#39757](https://github.com/apache/doris/pull/39757) +- Fixed the issue of VARIANT type not being able to use row storage for exceptional JSON data. [#39394](https://github.com/apache/doris/pull/39394) +- Fixed the issue of memory leaks when returning ARRAY results with VARIANT type. [#41358](https://github.com/apache/doris/pull/41358) +- Fixed the issue of changing column names with VARIANT type. [#40320](https://github.com/apache/doris/pull/40320) +- Fixed the issue of potential precision loss when converting VARIANT type to DECIMAL type. [#39650](https://github.com/apache/doris/pull/39650) +- Fixed the issue of nullable handling with VARIANT type. [#39732](https://github.com/apache/doris/pull/39732) +- Fixed the issue of sparse column reading with VARIANT type. [#40295](https://github.com/apache/doris/pull/40295) + +### Other + +- Fixed the compatibility issue between new and old audit log plugins.
[#41401](https://github.com/apache/doris/pull/41401) +- Fixed the issue where users could see processes of others in certain cases. [#39747](https://github.com/apache/doris/pull/39747) +- Fixed the issue where users with permissions could not export. [#38365](https://github.com/apache/doris/pull/38365) +- Fixed the issue where create table like required create permissions for the existing table. [#37879](https://github.com/apache/doris/pull/37879) +- Fixed the issue where some features did not verify permissions. [#39726](https://github.com/apache/doris/pull/39726) +- Fixed the issue of not correctly closing connections when using SSL. [#38587](https://github.com/apache/doris/pull/38587) +- Fixed the issue where executing ALTER VIEW operations in some cases caused FE to fail to start. [#40872](https://github.com/apache/doris/pull/40872) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md new file mode 100644 index 0000000000000..b15777212b400 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md @@ -0,0 +1,226 @@ +--- +{ + "title": "Release 3.0.3", + "language": "en" +} +--- + + + + +Dear community members, Apache Doris 3.0.3 was officially released on December 02, 2024. This version further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +- Prohibited column updates on MOW tables with synchronous materialized views. [#40190](https://github.com/apache/doris/pull/40190) +- Adjusted the default parameters of RoutineLoad to improve import efficiency. [#42968](https://github.com/apache/doris/pull/42968) +- When StreamLoad fails, the return value of LoadedRows is adjusted to 0.
[#41946](https://github.com/apache/doris/pull/41946) [#42291](https://github.com/apache/doris/pull/42291) +- Adjusted the default memory limit of Segment cache to 5%. [#42308](https://github.com/apache/doris/pull/42308) [#42436](https://github.com/apache/doris/pull/42436) + +## New Features + +- Introduced the session variable `enable_cooldown_replica_affinity` to control the affinity of cold and hot tiered replicas. [#42677](https://github.com/apache/doris/pull/42677) + +- Added `table$partition` syntax for querying partition information of Hive tables. [#40774](https://github.com/apache/doris/pull/40774) + + - [View Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/hive) + +- Supported creation of Hive tables in Text format. [#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) + + - [View Documentation](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + +### Asynchronous Materialized Views + +- Introduced new materialized view attribute `use_for_rewrite`. When `use_for_rewrite` is set to false, the materialized view does not participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Optimizer + +- Supported correlated non-aggregate subqueries. [#42236](https://github.com/apache/doris/pull/42236) + +### Query Execution + +- Added functions `ngram_search`, `normal_cdf`, `to_iso8601`, `from_iso8601_date`, `SESSION_USER()`, `last_query_id`. [#38226](https://github.com/apache/doris/pull/38226) [#40695](https://github.com/apache/doris/pull/40695) [#41075](https://github.com/apache/doris/pull/41075) [#41600](https://github.com/apache/doris/pull/41600) [#39575](https://github.com/apache/doris/pull/39575) [#40739](https://github.com/apache/doris/pull/40739) +- The `aes_encrypt` and `aes_decrypt` functions support GCM mode. 
[#40004](https://github.com/apache/doris/pull/40004) +- Profile outputs the changed session variable values. [#41016](https://github.com/apache/doris/pull/41016) [#41318](https://github.com/apache/doris/pull/41318) + +### Semi-structured Data Management + +- Added array functions `array_match_all` and `array_match_any`. [#40605](https://github.com/apache/doris/pull/40605) [#43514](https://github.com/apache/doris/pull/43514) +- The array function `array_agg` supports nesting ARRAY/MAP/STRUCT within ARRAY. [#42009](https://github.com/apache/doris/pull/42009) +- Added approximate aggregate statistical functions `approx_top_k` and `approx_top_sum`. [#44082](https://github.com/apache/doris/pull/44082) + +## Improvements + +### Storage + +- Supported `bitmap_empty` as the default value. [#40364](https://github.com/apache/doris/pull/40364) +- Introduced the session variable `insert_timeout` to control the timeout of DELETE statements. [#41063](https://github.com/apache/doris/pull/41063) +- Improved some error message prompts. [#41048](https://github.com/apache/doris/pull/41048) [#39631](https://github.com/apache/doris/pull/39631) +- Improved the priority scheduling of replica repair. [#41076](https://github.com/apache/doris/pull/41076) +- Enhanced the robustness of timezone handling when creating tables. [#41926](https://github.com/apache/doris/pull/41926) [#42389](https://github.com/apache/doris/pull/42389) +- Checked the validity of partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Supported Unicode-encoded column names in DELETE operations. [#39381](https://github.com/apache/doris/pull/39381) + +### Compute-Storage Decoupled + +- Supported ARM architecture deployment in storage and compute separation mode. 
[#42467](https://github.com/apache/doris/pull/42467) [#43377](https://github.com/apache/doris/pull/43377) +- Optimized the eviction strategy and lock competition of file cache, improving hit rate and high concurrency point query performance. [#42451](https://github.com/apache/doris/pull/42451) [#43201](https://github.com/apache/doris/pull/43201) [#41818](https://github.com/apache/doris/pull/41818) [#43401](https://github.com/apache/doris/pull/43401) +- S3 storage vault supported `use_path_style`, solving the problem of using custom domain names for object storage. [#43060](https://github.com/apache/doris/pull/43060) [#43343](https://github.com/apache/doris/pull/43343) [#43330](https://github.com/apache/doris/pull/43330) +- Optimized storage and compute separation configuration and deployment, preventing misoperations in different modes. [#43381](https://github.com/apache/doris/pull/43381) [#43522](https://github.com/apache/doris/pull/43522) [#43434](https://github.com/apache/doris/pull/43434) [#40764](https://github.com/apache/doris/pull/40764) [#43891](https://github.com/apache/doris/pull/43891) +- Optimized observability and provided an interface for deleting specified segment file cache. [#38489](https://github.com/apache/doris/pull/38489) [#42896](https://github.com/apache/doris/pull/42896) [#41037](https://github.com/apache/doris/pull/41037) [#43412](https://github.com/apache/doris/pull/43412) +- Optimized Meta-service operation and maintenance interface: RPC rate limiting and tablet metadata correction. [#42413](https://github.com/apache/doris/pull/42413) [#43884](https://github.com/apache/doris/pull/43884) [#41782](https://github.com/apache/doris/pull/41782) [#43460](https://github.com/apache/doris/pull/43460) + +### Lakehouse + +- Paimon Catalog supported Alibaba Cloud DLF and OSS-HDFS storage. 
[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) + + - View [Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/paimon) + +- Supported reading of Hive tables in OpenCSV format. [#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) +- Optimized the performance of accessing the `information_schema.columns` table in External Catalog. [#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) +- Used the new Max Compute open storage API to access Max Compute data sources. [#41614](https://github.com/apache/doris/pull/41614) +- Optimized the scheduling policy of the JNI part of Paimon tables, making scan tasks more balanced. [#43310](https://github.com/apache/doris/pull/43310) +- Optimized the read performance of small ORC files. [#42004](https://github.com/apache/doris/pull/42004) [#43467](https://github.com/apache/doris/pull/43467) +- Supported reading of parquet files in brotli compressed format. [#42177](https://github.com/apache/doris/pull/42177) +- Added `file_cache_statistics` table under the `information_schema` library to view metadata cache statistics. [#42160](https://github.com/apache/doris/pull/42160) + +### Query Optimizer + +- Optimization: When queries only differ in comments, the same SQL Cache can be reused. [#40049](https://github.com/apache/doris/pull/40049) +- Optimization: Improved the stability of statistical information when data is frequently updated. [#43865](https://github.com/apache/doris/pull/43865) [#39788](https://github.com/apache/doris/pull/39788) [#43009](https://github.com/apache/doris/pull/43009) [#40457](https://github.com/apache/doris/pull/40457) [#42409](https://github.com/apache/doris/pull/42409) [#41894](https://github.com/apache/doris/pull/41894) +- Optimization: Enhanced the stability of constant folding. 
[#42910](https://github.com/apache/doris/pull/42910) [#41164](https://github.com/apache/doris/pull/41164) [#39723](https://github.com/apache/doris/pull/39723) [#41394](https://github.com/apache/doris/pull/41394) [#42256](https://github.com/apache/doris/pull/42256) [#40441](https://github.com/apache/doris/pull/40441) +- Optimization: Column pruning can generate better execution plans. [#41719](https://github.com/apache/doris/pull/41719) [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Optimized the memory usage of the sort operator. [#39306](https://github.com/apache/doris/pull/39306) +- Optimized the performance of computations on ARM. [#38888](https://github.com/apache/doris/pull/38888) [#38759](https://github.com/apache/doris/pull/38759) +- Optimized the computational performance of a series of functions. [#40366](https://github.com/apache/doris/pull/40366) [#40821](https://github.com/apache/doris/pull/40821) [#40670](https://github.com/apache/doris/pull/40670) [#41206](https://github.com/apache/doris/pull/41206) [#40162](https://github.com/apache/doris/pull/40162) +- Used SSE instructions to optimize the performance of the `match_ipv6_subnet` function. [#38755](https://github.com/apache/doris/pull/38755) +- Supported automatic creation of new partitions during insert overwrite. [#38628](https://github.com/apache/doris/pull/38628) [#42645](https://github.com/apache/doris/pull/42645) +- Added the status of each PipelineTask in Profile. [#42981](https://github.com/apache/doris/pull/42981) +- IP type supported runtime filter. [#39985](https://github.com/apache/doris/pull/39985) + +### Semi-structured Data Management + +- Output the real SQL of prepared statements in audit logs. [#43321](https://github.com/apache/doris/pull/43321) +- The filebeat doris output plugin supports fault tolerance and progress reporting. [#36355](https://github.com/apache/doris/pull/36355) +- Optimized the performance of inverted index queries. 
[#41547](https://github.com/apache/doris/pull/41547) [#41585](https://github.com/apache/doris/pull/41585) [#41567](https://github.com/apache/doris/pull/41567) [#41577](https://github.com/apache/doris/pull/41577) [#42060](https://github.com/apache/doris/pull/42060) [#42372](https://github.com/apache/doris/pull/42372) +- The array function `array overlaps` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- The IP function `is_ip_address_in_range` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- Optimized the CAST performance of the VARIANT data type. [#41775](https://github.com/apache/doris/pull/41775) [#42438](https://github.com/apache/doris/pull/42438) [#43320](https://github.com/apache/doris/pull/43320) +- Optimized the CPU resource consumption of the Variant data type. [#42856](https://github.com/apache/doris/pull/42856) [#43062](https://github.com/apache/doris/pull/43062) [#43634](https://github.com/apache/doris/pull/43634) +- Optimized the metadata and execution memory resource consumption of the Variant data type. [#42448](https://github.com/apache/doris/pull/42448) [#43326](https://github.com/apache/doris/pull/43326) [#41482](https://github.com/apache/doris/pull/41482) [#43093](https://github.com/apache/doris/pull/43093) [#43567](https://github.com/apache/doris/pull/43567) [#43620](https://github.com/apache/doris/pull/43620) + +### Permissions + +- Added a new configuration item `ldap_group_filter` in LDAP for custom group filtering. [#43292](https://github.com/apache/doris/pull/43292) + +### Other + +- Supported displaying connection count information by user in FE monitoring items. [#39200](https://github.com/apache/doris/pull/39200) + +## Bug Fixes + +### Storage + +- Fixed the issue with using IPv6 hostnames. [#40074](https://github.com/apache/doris/pull/40074) +- Fixed the inaccurate display of broker/s3 load progress. 
[#43535](https://github.com/apache/doris/pull/43535) +- Fixed the issue where queries might hang from FE. [#41303](https://github.com/apache/doris/pull/41303) [#42382](https://github.com/apache/doris/pull/42382) +- Fixed the issue of duplicate auto-increment IDs under exceptional circumstances. [#43774](https://github.com/apache/doris/pull/43774) [#43983](https://github.com/apache/doris/pull/43983) +- Fixed occasional NPE issues with groupcommit. [#43635](https://github.com/apache/doris/pull/43635) +- Fixed the inaccurate calculation of auto bucket. [#41675](https://github.com/apache/doris/pull/41675) [#41835](https://github.com/apache/doris/pull/41835) +- Fixed the issue where FE might not correctly plan multi-table flows after restart. [#41677](https://github.com/apache/doris/pull/41677) [#42290](https://github.com/apache/doris/pull/42290) + +### Compute-Storage Decoupled + +- Fixed the issue that MOW primary key tables with large delete bitmaps might cause coredump. [#43088](https://github.com/apache/doris/pull/43088) [#43457](https://github.com/apache/doris/pull/43457) [#43479](https://github.com/apache/doris/pull/43479) [#43407](https://github.com/apache/doris/pull/43407) [#43297](https://github.com/apache/doris/pull/43297) [#43613](https://github.com/apache/doris/pull/43613) [#43615](https://github.com/apache/doris/pull/43615) [#43854](https://github.com/apache/doris/pull/43854) [#43968](https://github.com/apache/doris/pull/43968) [#44074](https://github.com/apache/doris/pull/44074) [#41793](https://github.com/apache/doris/pull/41793) [#42142](https://github.com/apache/doris/pull/42142) +- Fixed the issue that segment files, when being a multiple of 5MB, would fail to upload objects. [#43254](https://github.com/apache/doris/pull/43254) +- Fixed the issue that the default retry policy of aws sdk did not take effect. 
[#43575](https://github.com/apache/doris/pull/43575) [#43648](https://github.com/apache/doris/pull/43648) +- Fixed the issue that altering storage vault could continue execution even when the wrong type was specified. [#43489](https://github.com/apache/doris/pull/43489) [#43352](https://github.com/apache/doris/pull/43352) [#43495](https://github.com/apache/doris/pull/43495) +- Fixed the issue that tablet_id might be 0 during the delayed commit process of large transactions. [#42043](https://github.com/apache/doris/pull/42043) [#42905](https://github.com/apache/doris/pull/42905) +- Fixed the issue that constant folding RPC and FE forwarding SQL might not be executed in the expected computation group. [#43110](https://github.com/apache/doris/pull/43110) [#41819](https://github.com/apache/doris/pull/41819) [#41846](https://github.com/apache/doris/pull/41846) +- Fixed the issue that meta-service did not strictly check instance_id upon receiving RPC. [#43253](https://github.com/apache/doris/pull/43253) [#43832](https://github.com/apache/doris/pull/43832) +- Fixed the issue that FE follower information_schema version did not update in time. [#43496](https://github.com/apache/doris/pull/43496) +- Fixed the issue of atomicity in file cache rename and inaccurate metrics. [#42869](https://github.com/apache/doris/pull/42869) [#43504](https://github.com/apache/doris/pull/43504) [#43220](https://github.com/apache/doris/pull/43220) + +### Lakehouse + +- Prohibited implicit conversion predicates from being pushed down to JDBC data sources to avoid inconsistent query results. [#42102](https://github.com/apache/doris/pull/42102) +- Fixed some read issues with high-version Hive transactional tables. [#42226](https://github.com/apache/doris/pull/42226) +- Fixed the issue that the Export command might cause deadlocks.
[#43083](https://github.com/apache/doris/pull/43083) [#43402](https://github.com/apache/doris/pull/43402) +- Fixed the issue of being unable to query Hive views created by Spark. [#43552](https://github.com/apache/doris/pull/43552) +- Fixed the issue that Hive partition paths containing special characters led to incorrect partition pruning. [#42906](https://github.com/apache/doris/pull/42906) +- Fixed the issue that Iceberg Catalog could not use AWS Glue. [#41084](https://github.com/apache/doris/pull/41084) + +### Asynchronous Materialized Views + +- Fixed the issue that asynchronous materialized views might not refresh after the base table is rebuilt. [#41762](https://github.com/apache/doris/pull/41762) + +### Query Optimizer + +- Fixed the issue that partition pruning results might be incorrect when using multi-column range partitioning. [#43332](https://github.com/apache/doris/pull/43332) +- Fixed the issue of incorrect calculation results in some limit offset scenarios. [#42576](https://github.com/apache/doris/pull/42576) + +### Query Execution + +- Fixed the issue that hash join with array types larger than 4G could cause BE Core. [#43861](https://github.com/apache/doris/pull/43861) +- Fixed the issue that is null predicate operations might yield incorrect results in some scenarios. [#43619](https://github.com/apache/doris/pull/43619) +- Fixed the issue that bitmap types might produce incorrect output results in hash join. [#43718](https://github.com/apache/doris/pull/43718) +- Fixed some issues where function results were calculated incorrectly. 
[#40710](https://github.com/apache/doris/pull/40710) [#39358](https://github.com/apache/doris/pull/39358) [#40929](https://github.com/apache/doris/pull/40929) [#40869](https://github.com/apache/doris/pull/40869) [#40285](https://github.com/apache/doris/pull/40285) [#39891](https://github.com/apache/doris/pull/39891) [#40530](https://github.com/apache/doris/pull/40530) [#41948](https://github.com/apache/doris/pull/41948) [#43588](https://github.com/apache/doris/pull/43588) +- Fixed some issues with JSON type parsing. [#39937](https://github.com/apache/doris/pull/39937) +- Fixed issues with varchar and char types in runtime filter operations. [#43758](https://github.com/apache/doris/pull/43758) [#43919](https://github.com/apache/doris/pull/43919) +- Fixed some issues with the use of decimal256 in scalar and aggregate functions. [#42136](https://github.com/apache/doris/pull/42136) [#42356](https://github.com/apache/doris/pull/42356) +- Fixed the issue that arrow flight reported `Reach limit of connections` errors upon connection. [#39127](https://github.com/apache/doris/pull/39127) +- Fixed the issue of incorrect memory usage statistics for BE in k8s environments. [#41123](https://github.com/apache/doris/pull/41123) + +### Semi-structured Data Management + +- Adjusted the default values of `segment_cache_fd_percentage` and `inverted_index_fd_number_limit_percent`. [#42224](https://github.com/apache/doris/pull/42224) +- logstash now supports group_commit. [#40450](https://github.com/apache/doris/pull/40450) +- Fixed the issue of coredump when building index. [#43246](https://github.com/apache/doris/pull/43246) [#43298](https://github.com/apache/doris/pull/43298) +- Fixed issues with variant index. [#43375](https://github.com/apache/doris/pull/43375) [#43773](https://github.com/apache/doris/pull/43773) +- Fixed potential fd and memory leaks under abnormal compaction circumstances. 
[#42374](https://github.com/apache/doris/pull/42374) +- Inverted index match null now correctly returns null instead of false. [#41786](https://github.com/apache/doris/pull/41786) +- Fixed the issue of coredump when ngram bloomfilter index bf_size is set to 65536. [#43645](https://github.com/apache/doris/pull/43645) +- Fixed the issue of potential coredump during complex data type JOINs. [#40398](https://github.com/apache/doris/pull/40398) +- Fixed the issue of coredump with TVF JSON data. [#43187](https://github.com/apache/doris/pull/43187) +- Fixed the precision issue of bloom filter calculations for dates and times. [#43612](https://github.com/apache/doris/pull/43612) +- Fixed the issue of coredump with IPv6 type storage. [#43251](https://github.com/apache/doris/pull/43251) +- Fixed the issue of coredump when using VARIANT type with light_schema_change disabled. [#40908](https://github.com/apache/doris/pull/40908) +- Improved cache performance for high-concurrency point queries. [#44077](https://github.com/apache/doris/pull/44077) +- Fixed the issue that bloom filter indexes were not synchronized when columns were deleted. [#43378](https://github.com/apache/doris/pull/43378) +- Fixed instability issues with es catalog under special circumstances such as mixed array and scalar data. [#40314](https://github.com/apache/doris/pull/40314) [#40385](https://github.com/apache/doris/pull/40385) [#43399](https://github.com/apache/doris/pull/43399) [#40614](https://github.com/apache/doris/pull/40614) +- Fixed coredump issues caused by abnormal regular pattern matching. [#43394](https://github.com/apache/doris/pull/43394) + +### Permissions + +- Fixed several issues where permissions were not properly restricted after authorization. 
[#43193](https://github.com/apache/doris/pull/43193) [#41723](https://github.com/apache/doris/pull/41723) [#42107](https://github.com/apache/doris/pull/42107) [#43306](https://github.com/apache/doris/pull/43306) +- Enhanced several permission checks. [#40688](https://github.com/apache/doris/pull/40688) [#40533](https://github.com/apache/doris/pull/40533) [#41791](https://github.com/apache/doris/pull/41791) [#42106](https://github.com/apache/doris/pull/42106) + +### Other + +- Supplemented missing audit log fields in audit log tables and files. [#43303](https://github.com/apache/doris/pull/43303) + + - [View Documentation](https://doris.apache.org/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we completed the full vectorization of the computing and storage layers and officially enabled the vectorized execution engine as a stable feature. All queries are executed by the vectorized execution engine by default, and performance is 3-5 times higher than in the previous version. This release adds the ability to access Apache Iceberg external tables and supports federated queries across data in Doris and Iceberg, expanding the analysis capabilities of Apache Doris on the data lake. In addition to the original LZ4, the ZSTD compression algorithm is now available, further improving the data compression ratio. Many performance and stability problems from previous versions have been fixed, greatly improving system stability. We recommend downloading and using this version.
+ +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users had to enable it manually when executing queries by configuring the session variables through `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, we officially enabled the vectorized execution engine as a stable feature. The session variable `enable_vectorized_engine` is set to true by default, so all queries are executed through the vectorized execution engine by default. + +### BE Binary File Renaming + +The BE binary file has been renamed from palo_be to doris_be. If you rely on process names for cluster management or other operations, remember to update the relevant scripts. + +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while maintaining compatibility with the Segment V1 format. + +To keep the code structure maintainable and reduce the additional learning and development costs caused by redundant historical code, we have decided to stop supporting the Segment V1 storage format starting from the next version. This part of the code is expected to be removed in Apache Doris 1.2. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website.
+ +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table you can set `DISTRIBUTED BY random BUCKETS number`to use random distribution, the data will be randomly written to a single tablet when importing to reduce the data fanout during the loading process. And reduce resource overhead and improve system stability. + +### Support for creating Iceberg external tables[experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. Through Iceberg external tables, federated queries on data stored in local storage and Iceberg can be implemented, which saves tedious data loading work, simplifies the system architecture for data analysis, and performs more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying data, and supports automatic synchronization of all table schemas in the Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. For some scenarios that are sensitive to data storage costs, the original data compression ratio requirements cannot be met. + +In version 1.1, users can set "compression"="zstd" in the table properties to specify the compression method as ZSTD when creating a table. 
In the 25GB 110 million lines of text log test data, the highest compression rate is nearly 10 times, which is 53% higher than the original compression rate, and the speed of reading data from disk and decompressing it is increased by 30%. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +Implemented vectorization of all built-in functions + +The storage layer implements vectorization and supports dictionary optimization for low-cardinality string columns + +Optimized and resolved numerous performance and stability issues with the vectorization engine. + +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 SQLs in the SSB test data set, version 1.1 is better than version 0.15, and the overall performance is improved by about 3 times, which solves the problem of performance degradation in some scenarios in version 1.0; + +On all 22 SQLs in the TPC-H test data set, version 1.1 is better than version 0.15, the overall performance is improved by about 4.5 times, and the performance of some scenarios is improved by more than ten times; + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit will generate a data version. In high concurrent write scenarios, -235 errors are prone to occur due to too many data versions and untimely compaction, and query performance will also decrease accordingly. + +In version 1.1, we introduced QuickCompaction, which will actively trigger compaction when the data version increases. At the same time, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Through active triggering and passive scanning, the real-time problem of data merging is completely solved. + +At the same time, for high-frequency small file cumulative compaction, the scheduling and isolation of compaction tasks is implemented to prevent the heavyweight base compaction from affecting the merging of new data. + +Finally, for the merging of small files, the strategy of merging small files is optimized, and the method of gradient merging is adopted. Each time the files participating in the merging belong to the same data magnitude, it prevents versions with large differences in size from merging, and gradually merges hierarchically. , reducing the number of times a single file participates in merging, which can greatly save the CPU consumption of the system. + +When the data upstream maintains a write frequency of 10w per second (20 concurrent write tasks, 5000 rows per job, and checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable. 
Compared with the -235 problem that frequently occurred during high concurrent writing in the previous version, the compaction merge efficiency has been improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The strategy has been optimized for small file Compaction. In the above scenario of high concurrent writing, CPU resource consumption is reduced by 25%; + +- Stable query time consumption: The overall orderliness of data is improved, and the fluctuation of query time consumption is greatly reduced. The query time consumption during high concurrent writing is the same as that of only querying, and the query performance is improved by 3-4 times compared with the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting arrow parameters, arrow's multi-threaded read capability is used to speed up Arrow's reading of each row_group, and it is modified to SPSC model to reduce the cost of waiting for the network through prefetching. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after the metadata checkpoint and retaining the function of historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that the data cannot be queried due to the missing data version.(Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that the resource isolation is invalid for the resource usage limit of loading tasks (Moderate) + +In 1.1, the broker load and routine load will use Backends with specified resource tags to do the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +it may cause data transmission errors. 
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default; please use the `/_stream_load` interface instead. +You can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format is no longer allowed to be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster. If it does, convert it to SegmentV2 with the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +In version 1.1, the maximum length of the String type is limited to 1MB. Strings longer than this can no longer be written. +At the same time, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Update the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems during use, please feel free to contact us through the GitHub discussion forum or the dev mailing list at any time. 
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature is enabled in non-vectorized engine but it is missed in vectorized engine in 1.1. So that we add back this feature in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +There is no memtracker in BE for vectorized engine in 1.1, so that the memory is out of control and cause OOM. In 1.1.1, a simple memtracker is added to BE and could control the memory and cancel the query when memory exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is compressed using bitshuffle and it costs a lot of time to decompress it during query. In 1.1.1, doris will decompress the data that encoded by bitshuffle to accelerate query and we find it could reduce 30% latency for some query in ssb-flat. + +## Bug Fix + +### Fix the problem that could not do rolling upgrade from 1.0.(Serious) + +This issue was introduced in version 1.1 and may cause BE core when upgrade BE but not upgrade FE. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some query not fall back to non-vectorized engine, and BE will core. + +Currently, vectorized engine could not deal with all sql queries and some queries (like left outer join) will use non-vectorized engine to run. But there are some cases not covered in 1.1. And it will cause be crash. + +### Compaction not work correctly and cause -235 Error. + +One rowset multi segments in uniq key compaction, segments rows will be merged in generic_iterator but merged_rows not increased. Compaction will failed in check_correctness, and make a tablet with too much versions which lead to -235 load error. + +### Some segment fault cases during query. 
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 170 issues or performance improvement since 1.1.1. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + +# Features + +### New MemTracker + +Introduced new MemTracker for both vectorized engine and non-vectorized engine which is more accurate. + +### Add API for showing current queries and kill query + +### Support read/write emoji of UTF16 via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC File scan performance about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when query Iceberg table. + +- Support query Hive data created by [Apache Tez](https://tez.apache.org/) + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add reuse block in non-vectorized engine and have 50% performance improvement in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve like or regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +It will have 40% performance gains in load or query. + +Currently it is a config, you can change it by set config `tc_enable_aggressive_memory_decommit`. 
+ +# Bug Fix + +### Some issues about FE that will cause FE failure or data corrupt. + +- Add reserved disk config to avoid too many reserved BDB-JE files.**(Serious)** In an HA environment, BDB JE will retains as many reserved files. The BDB-je log doesn't delete until approaching a disk limit. + +- Fix fatal bug in BDB-JE which will cause FE replica could not start correctly or data corrupted.** (Serious)** + +### Fe will hang on waitFor_rpc during query and BE will hang in high concurrent scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong result. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause BE core or in abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index 
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are worried about the impact of this change on your business after the upgrade, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* `disable_storage_page_cache`: Whether to disable Storage Page Cache. In version 1.1.2 and earlier, the default is false, i.e., the cache is on. In version 1.1.3, the default is true, i.e., the cache is off. +* `chunk_reserved_bytes_limit`: Memory size reserved by the Chunk Allocator. In version 1.1.2 and earlier, the default is 10% of the overall memory. In version 1.1.3, the default is 209715200 (200MB). + diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 60 issues or performance improvement since 1.1.3. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- SparkLoad supports parquet and orc files. [#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load. [#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- The if function returns a wrong result when there is a nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates. 
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crash when call function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name is not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column when use MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader release memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix allow create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after subsitute using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump on case when have 1000 condition. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that last line of data lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 36 issues or performance improvement since 1.1.4. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. 
+ +# Behavior Changes + +When an alias is the same as the original column name, as in "select year(birthday) as birthday", and it is used in a group by, order by, or having clause, Doris previously behaved differently from MySQL. In this release, Doris follows MySQL's behavior: the group by and having clauses resolve to the original column first, while order by resolves to the alias first. Because this can be confusing, we recommend not using an alias that is the same as the original column name. + +# Features + +Add support of murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort result by tablename when call show clause. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix coredump when there is an if constant expr in select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update high_priority_flush_thread_num_per_store default value to 6 and it will improve the load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +Partition column is not duplicate key, spark load will throw IndexOutOfBounds error. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when having sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Using avg rowset to calculate batch size instead of using total_bytes since it costs a lot of cpu. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc. 
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md new file mode 100644 index 0000000000000..2529ce7e58aa2 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md @@ -0,0 +1,563 @@ +--- +{ + "title": "Release 1.2.0", + "language": "en" +} +--- + + + + + +# Feature +## Highlight + +1. Full Vectorizied-Engine support, greatly improved performance + + In the standard ssb-100-flat benchmark, the performance of 1.2 is 2 times faster than that of 1.1; in complex TPCH 100 benchmark, the performance of 1.2 is 3 times faster than that of 1.1. + +2. Merge-on-Write Unique Key + + Support Merge-On-Write on Unique Key Model. This mode marks the data that needs to be deleted or updated when the data is written, thereby avoiding the overhead of Merge-On-Read when querying, and greatly improving the reading efficiency on the updateable data model. + +3. Multi Catalog + + The multi-catalog feature provides Doris with the ability to quickly access external data sources for access. Users can connect to external data sources through the `CREATE CATALOG` command. Doris will automatically map the library and table information of external data sources. After that, users can access the data in these external data sources just like accessing ordinary tables. It avoids the complicated operation that the user needs to manually establish external mapping for each table. + + Currently this feature supports the following data sources: + + 1. Hive Metastore: You can access data tables including Hive, Iceberg, and Hudi. It can also be connected to data sources compatible with Hive Metastore, such as Alibaba Cloud's DataLake Formation. Supports data access on both HDFS and object storage. + 2. 
Elasticsearch: Access ES data sources. + 3. JDBC: Access MySQL through the JDBC protocol. + + Documentation: https://doris.apache.org//docs/dev/lakehouse/multi-catalog + + > Note: The corresponding permission level will also be changed automatically, see the "Upgrade Notes" section for details. + +4. Light table structure changes + +In the new version, adding or dropping columns on a table no longer requires changing the data files synchronously. Only the metadata in FE needs to be updated, which makes Schema Change a millisecond-level operation. This also enables DDL synchronization for upstream CDC data. For example, users can use Flink CDC to synchronize both DML and DDL from an upstream database to Doris. + +Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +When creating a table, set `"light_schema_change"="true"` in properties. + +5. JDBC facade + + Users can connect to external data sources through JDBC. Currently supported: + + - MySQL + - PostgreSQL + - Oracle + - SQL Server + - Clickhouse + + Documentation: [https://doris.apache.org/en/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/) + + > Note: The ODBC feature will be removed in a later version, please try to switch to JDBC. + +6. JAVA UDF + + Supports writing UDF/UDAF in Java, which makes it convenient for users to use custom functions in the Java ecosystem. At the same time, through technologies such as off-heap memory and Zero Copy, the efficiency of cross-language data access has been greatly improved. + + Document: https://doris.apache.org//docs/dev/ecosystem/udf/java-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +7. 
Remote UDF + + Supports accessing remote user-defined function services through RPC, which removes language restrictions on writing UDFs. Users can implement custom functions in any programming language to complete complex data analysis work. + + Documentation: https://doris.apache.org//docs/ecosystem/udf/remote-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +8. More data types support + + - Array type + + Array types are supported, including nested array types. In scenarios such as user portraits and tags, the Array type can be used to better fit business needs. At the same time, in the new version, we have also implemented a large number of data-related functions to better support the application of data types in actual scenarios. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/ARRAY + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/array-functions/array_max + + - Jsonb type + + Supports the binary Json data type: Jsonb. This type provides a more compact json encoding format and allows data to be accessed directly in that encoded format. Compared with json data stored in strings, it can be several times faster. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/JSONB + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/json-functions/jsonb_parse + + - Date V2 + + Scope of impact: + + 1. The user needs to specify datev2 and datetimev2 when creating the table, and the date and datetime of the original table will not be affected. + 2. When datev2 and datetimev2 are calculated with the original date and datetime (for example, in an equi-join), the original type will be cast into the new type for calculation. + 3. 
The example is in the documentation + + Documentation: https://doris.apache.org/docs/1.2/sql-manual/sql-reference/Data-Types/DATEV2 + + +## More + +1. A new memory management framework + + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/memory-management/memory-tracker + +2. Table Valued Function + + Doris implements a set of Table Valued Function (TVF). TVF can be regarded as an ordinary table, which can appear in all places where "table" can appear in SQL. + + For example, we can use S3 TVF to implement data import on object storage: + + ``` + insert into tbl select * from s3("s3://bucket/file.*", "ak" = "xx", "sk" = "xxx") where c1 > 2; + ``` + + Or directly query data files on HDFS: + + ``` + insert into tbl select * from hdfs("hdfs://bucket/file.*") where c1 > 2; + ``` + + TVF can help users make full use of the rich expressiveness of SQL and flexibly process various data. + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/s3 + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/hdfs + +3. A more convenient way to create partitions + + Support for creating multiple partitions within a time range via the `FROM TO` command. + +4. Column renaming + + For tables with Light Schema Change enabled, column renaming is supported. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-RENAME + +5. Richer permission management + + - Support row-level permissions + + Row-level permissions can be created with the `CREATE ROW POLICY` command. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY + + - Support specifying password strength, expiration time, etc. + + - Support for locking accounts after multiple failed logins. 
+ + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Account-Management-Statements/ALTER-USER + +6. Import + + - CSV import supports csv files with header. + + Search for `csv_with_names` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD/ + + - Stream Load adds `hidden_columns`, which can explicitly specify the delete flag column and sequence column. + + Search for `hidden_columns` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD + + - Spark Load supports Parquet and ORC file import. + + - Support for cleaning completed imported Labels + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CLEAN-LABEL + + - Support batch cancellation of import jobs by status + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD + + - Added support for Alibaba Cloud oss, Tencent Cloud cos/chdfs and Huawei Cloud obs in broker load. + + Documentation: https://doris.apache.org//docs/dev/advanced/broker + + - Support access to hdfs through hive-site.xml file configuration. + + Documentation: https://doris.apache.org//docs/dev/admin-manual/config/config-dir + +7. Support viewing the contents of the catalog recycle bin through `SHOW CATALOG RECYCLE BIN` function. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Show-Statements/SHOW-CATALOG-RECYCLE-BIN + +8. Support `SELECT * EXCEPT` syntax. + + Documentation: https://doris.apache.org//docs/dev/data-table/basic-usage + +9. OUTFILE supports ORC format export. And supports multi-byte delimiters. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE + +10. Support to modify the number of Query Profiles that can be saved through configuration. 
+ + Document search FE configuration item: max_query_profile_num + +11. The DELETE statement supports IN predicate conditions. And it supports partition pruning. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Manipulation/DELETE + +12. The default value of the time column supports using `CURRENT_TIMESTAMP` + + Search for "CURRENT_TIMESTAMP" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +13. Add two system tables: backends, rowsets + + Documentation: + + https://doris.apache.org//docs/dev/admin-manual/system-table/backends + + https://doris.apache.org//docs/dev/admin-manual/system-table/rowsets + +14. Backup and restore + + - The Restore job supports the `reserve_replica` parameter, so that the number of replicas of the restored table is the same as that of the backup. + + - The Restore job supports `reserve_dynamic_partition_enable` parameter, so that the restored table keeps the dynamic partition enabled. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/RESTORE + + - Support backup and restore operations through the built-in libhdfs, no longer rely on broker. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY + +15. Support data balance between multiple disks on the same machine + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK + +16. Routine Load supports subscribing to Kerberos-authenticated Kafka services. 
+ + Search for kerberos in the documentation: https://doris.apache.org//docs/dev/data-operate/import/import-way/routine-load-manual + +17. New built-in-function + + Added the following built-in functions: + + - `cbrt` + - `sequence_match/sequence_count` + - `mask/mask_first_n/mask_last_n` + - `elt` + - `any/any_value` + - `group_bitmap_xor` + - `ntile` + - `nvl` + - `uuid` + - `initcap` + - `regexp_replace_one/regexp_extract_all` + - `multi_search_all_positions/multi_match_any` + - `domain/domain_without_www/protocol` + - `running_difference` + - `bitmap_hash64` + - `murmur_hash3_64` + - `to_monday` + - `not_null_or_empty` + - `window_funnel` + - `group_bit_and/group_bit_or/group_bit_xor` + - `outer combine` + - and all array functions + +# Upgrade Notice + +## Known Issues + +- Use JDK11 will cause BE crash, please use JDK8 instead. + +## Behavior Changed + +- Permission level changes + + Because the catalog level is introduced, the corresponding user permission level will also be changed automatically. The rules are as follows: + + - GlobalPrivs and ResourcePrivs remain unchanged + - Added CatalogPrivs level. + - The original DatabasePrivs level is added with the internal prefix (indicating the db in the internal catalog) + - Add the internal prefix to the original TablePrivs level (representing tbl in the internal catalog) + +- In GroupBy and Having clauses, match on column names in preference to aliases. (#14408) + +- Creating columns starting with `mv_` is no longer supported. `mv_` is a reserved keyword in materialized views (#14361) + +- Removed the default limit of 65535 rows added by the order by statement, and added the session variable `default_order_by_limit` to configure this limit. (#12478) + +- In the table generated by "Create Table As Select", all string columns use the string type uniformly, and no longer distinguish varchar/char/string (#14382) + +- In the audit log, remove the word `default_cluster` before the db and user names. 
(#13499) (#11408) + +- Add sql digest field in audit log (#8919) + +- The order by logic of the union clause has changed. In the new version, the order by clause will be executed after the union is executed, unless explicitly associated by parentheses. (#9745) + +- During the decommission operation, the tablet in the recycle bin will be ignored to ensure that the decommission can be completed. (#14028) + +- The returned result of Decimal will be displayed according to the precision declared in the original column, or according to the precision specified in the cast function. (#13437) + +- Changed column name length limit from 64 to 256 (#14671) + +- Changes to FE configuration items + + - The `enable_vectorized_load` parameter is enabled by default. (#11833) + + - Increased `create_table_timeout` value. The default timeout for table creation operations will be increased. (#13520) + + - Modify `stream_load_default_timeout_second` default value to 3 days. + + - Modify the default value of `alter_table_timeout_second` to one month. + + - Add the parameter `max_replica_count_when_schema_change` to limit the number of replicas involved in the alter job; the default is 100000. (#12850) + + - Add `disable_iceberg_hudi_table`. Iceberg and Hudi external tables are disabled by default, and the multi catalog function is recommended instead. (#13932) + +- Changes to BE configuration items + + - Removed `disable_stream_load_2pc` parameter. Stream Load with 2PC can be used directly. (#13520) + + - Modify `tablet_rowset_stale_sweep_time_sec` from 1800 seconds to 300 seconds. + + - Redesigned configuration item names about compaction (#13495) + + - Revisited parameters about memory optimization (#13781) + +- Session variable changes + + - Modify the variable `enable_insert_strict` to true by default. As a result, some insert operations that previously succeeded while inserting illegal values will no longer be executed.
(#11866) + + - Modified variable `enable_local_exchange` to default to true (#13292) + + - Default data transmission via lz4 compression, controlled by variable `fragment_transmission_compression_codec` (#11955) + + - Add `skip_storage_engine_merge` variable for debugging unique or agg model data (#11952) + + Documentation: https://doris.apache.org//docs/dev/advanced/variables + +- The BE startup script will check whether the value in `/proc/sys/vm/max_map_count` is greater than 2,000,000. Otherwise, the startup fails. (#11052) + +- Removed mini load interface (#10520) + +- FE Metadata Version + + FE Meta Version changed from 107 to 114, and cannot be rolled back after upgrading. + +## During Upgrade + +1. Upgrade preparation + + - Need to replace: lib, bin directory (start/stop scripts have been modified) + + - BE also needs to configure JAVA_HOME, and already supports JDBC Table and Java UDF. + + - The default JVM Xmx parameter in fe.conf is changed to 8GB. + +2. Possible errors during the upgrade process + + - The repeat function cannot be used and an error is reported: `vectorized repeat function cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13868) + + - Schema change fails with error: `desc_tbl is not set. Maybe the FE version is not equal to the BE` (#13822) + + - Vectorized hash join cannot be used and an error is reported: `vectorized hash join cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13753) + + The above errors will return to normal after a full upgrade. + +## Performance Impact + +- By default, JeMalloc is used as the memory allocator of the new version BE, replacing TcMalloc (#13367) + +- The batch size in the tablet sink is modified to be at least 8K.
(#13912) + +- Disable chunk allocator by default (#13285) + +## API Change + +- BE's http api error return information changed from `{"status": "Fail", "msg": "xxx"}` to more specific ``{"status": "Not found", "msg": "Tablet not found. tablet_id=1202"}``(#9771) + +- In `SHOW CREATE TABLE`, the content of comment is changed from double quotes to single quotes (#10327) + +- Support ordinary users to obtain query profile through http command. (#14016) +Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/manager/query-profile-action + +- Optimized the way to specify the sequence column; you can directly specify the column name. (#13872) +Documentation: https://doris.apache.org//docs/dev/data-operate/update-delete/sequence-column-manual + +- Add the space usage of remote storage to the results returned by `show backends` and `show tablets` (#11450) + +- Removed Num-Based Compaction related code (#13409) + +- Refactored BE's error code mechanism; some returned error messages will change (#8855) +## Other + +- Support Docker official image. + +- Support compiling Doris on MacOS(x86/M1) and Ubuntu 22.04 + Documentation: https://doris.apache.org//docs/dev/install/source-install/compilation-mac/ + +- Support for image file verification.
+ + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/metadata-operation/ + +- Script related + + - The stop scripts of FE and BE support exiting FE and BE via the `--grace` parameter (use kill -15 signal instead of kill -9) + + - FE start script supports checking the current FE version via `--version` (#11563) + + - Support getting the data and related table creation statement of a tablet through the `ADMIN COPY TABLET` command, for local problem debugging (#12176) + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET + +- Support obtaining a table creation statement related to a SQL statement through the http api for local problem reproduction (#11979) + + Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/query-schema-action + +- Support disabling compaction for a table when creating it, for testing (#11743) + + Search for "disable_auto_compaction" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +# Big Thanks + +Thanks to ALL who contributed to this release!
(alphabetically) +``` +@924060929 +@a19920714liou +@adonis0147 +@Aiden-Dong +@aiwenmo +@AshinGau +@b19mud +@BePPPower +@BiteTheDDDDt +@bridgeDream +@ByteYue +@caiconghui +@CalvinKirs +@cambyzju +@caoliang-web +@carlvinhust2012 +@catpineapple +@ccoffline +@chenlinzhong +@chovy-3012 +@coderjiang +@cxzl25 +@dataalive +@dataroaring +@dependabot[bot] +@dinggege1024 +@DongLiang-0 +@Doris-Extras +@eldenmoon +@EmmyMiao87 +@englefly +@FreeOnePlus +@Gabriel39 +@gaodayue +@geniusjoe +@gj-zhang +@gnehil +@GoGoWen +@HappenLee +@hello-stephen +@Henry2SS +@hf200012 +@huyuanfeng2018 +@jacktengg +@jackwener +@jeffreys-cat +@Jibing-Li +@JNSimba +@Kikyou1997 +@Lchangliang +@LemonLiTree +@lexoning +@liaoxin01 +@lide-reed +@link3280 +@liutang123 +@liuyaolin +@LOVEGISER +@lsy3993 +@luozenglin +@luzhijing +@madongz +@morningman +@morningman-cmy +@morrySnow +@mrhhsg +@Myasuka +@myfjdthink +@nextdreamblue +@pan3793 +@pangzhili +@pengxiangyu +@platoneko +@qidaye +@qzsee +@SaintBacchus +@SeekingYang +@smallhibiscus +@sohardforaname +@song7788q +@spaces-X +@ssusieee +@stalary +@starocean999 +@SWJTU-ZhangLei +@TaoZex +@timelxy +@Wahno +@wangbo +@wangshuo128 +@wangyf0555 +@weizhengte +@weizuo93 +@wsjz +@wunan1210 +@xhmz +@xiaokang +@xiaokangguo +@xinyiZzz +@xy720 +@yangzhg +@Yankee24 +@yeyudefeng +@yiguolei +@yinzhijian +@yixiutt +@yuanyuan8983 +@zbtzbtzbt +@zenoyang +@zhangboya1 +@zhangstar333 +@zhannngchen +@ZHbamboo +@zhengshiJ +@zhenhb +@zhqu1148980644 +@zuochunwei +@zy-kkk +``` diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md new file mode 100644 index 0000000000000..d5adb31eb5256 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md @@ -0,0 +1,196 @@ +--- +{ + "title": "Release 1.2.1", + "language": "en" +} +--- + + + +# Improvement + +### Supports new type DecimalV3 + +DecimalV3, which supports higher precision and better performance, has the following advantages over 
past versions. + +- Larger representable range: the range of values is significantly expanded, and the valid precision range is [1,38]. + +- Higher performance: the storage space occupied is adjusted adaptively according to the precision. + +- More complete precision derivation support: for different expressions, different precision derivation rules are applied to determine the precision of the result. + +[DecimalV3](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Types/DECIMALV3/) + +### Support Iceberg V2 + +Support Iceberg V2 (only Position Delete is supported, Equality Delete will be supported in subsequent versions). + +Tables in Iceberg V2 format can be accessed through the Multi-Catalog feature. + +### Support converting OR conditions to IN + +Support converting OR conditions to IN conditions, which can improve the execution efficiency in some scenarios. [#15437](https://github.com/apache/doris/pull/15437) [#12872](https://github.com/apache/doris/pull/12872) + +### Optimize the import and query performance of JSONB type + +Optimize the import and query performance of JSONB type. [#15219](https://github.com/apache/doris/pull/15219) + +### Stream load supports quoted csv data + +Search for `trim_double_quotes` in the documentation: [https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD) + +### Broker supports Tencent Cloud CHDFS and Baidu Cloud BOS, AFS + +Data on CHDFS, BOS, and AFS can be accessed through Broker. [#15297](https://github.com/apache/doris/pull/15297) [#15448](https://github.com/apache/doris/pull/15448) + +### New function + +Add function `substring_index`. [#15373](https://github.com/apache/doris/pull/15373) + +# Bug Fix + +- Fix the problem that, in some cases, user permission information is lost after upgrading from version 1.1 to version 1.2.
[#15144](https://github.com/apache/doris/pull/15144) + +- Fix the problem that the partition value is wrong when using datev2/datetimev2 type for partitioning. [#15094](https://github.com/apache/doris/pull/15094) + +- Bug fixes for a large number of released features. For a complete list see: [PR List](https://github.com/apache/doris/pulls?q=is%3Apr+label%3Adev%2F1.2.1-merged+is%3Aclosed) + +# Upgrade Notice + +### Known Issues + +- Do not use JDK11 as the runtime JDK of BE, it will cause BE Crash. +- The reading performance of the csv format in this version has declined, which will affect the import and reading efficiency of the csv format. We will fix it as soon as possible in the next three-digit version + +### Behavior Changed + +- The default value of the BE configuration item `high_priority_flush_thread_num_per_store` is changed from 1 to 6, to improve the write efficiency of Routine Load. (https://github.com/apache/doris/pull/14775) + +- The default value of the FE configuration item `enable_new_load_scan_node` is changed to true. Import tasks will be performed using the new File Scan Node. No impact on users.[#14808](https://github.com/apache/doris/pull/14808) + +- Delete the FE configuration item `enable_multi_catalog`. The Multi-Catalog function is enabled by default. + +- The vectorized execution engine is forced to be enabled by default.[#15213](https://github.com/apache/doris/pull/15213) + +The session variable enable_vectorized_engine will no longer take effect. Enabled by default. + +To make it valid again, set the FE configuration item `disable_enable_vectorized_engine` to false, and restart FE to make `enable_vectorized_engine` valid again. + + +# Big Thanks + +Thanks to ALL who contributed to this release! 
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@dataroaring + +@Doris-Extras + +@dutyu + +@eldenmoon + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@hf200012 + +@jacktengg + +@Jibing-Li + +@Kikyou1997 + +@liaoxin01 + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@spaces-X + +@starocean999 + +@wangshuo128 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xutaoustc + +@yangzhg + +@yiguolei + +@yixiutt + +@Yulei-Yang + +@yuxuan-luo + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zhengshengjun + + + + + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md new file mode 100644 index 0000000000000..08fd22571a03f --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md @@ -0,0 +1,254 @@ +--- +{ + "title": "Release 1.2.2", + "language": "en" +} +--- + + + +# New Features + +### Lakehouse + +- Support automatic synchronization of Hive metastore. + +- Support reading the Iceberg Snapshot, and viewing the Snapshot history. + +- JDBC Catalog supports PostgreSQL, Clickhouse, Oracle, SQLServer + +- JDBC Catalog supports Insert operation + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/) + +### Auto Bucket + + Set and scale the number of buckets for different partitions to keep the number of tablet in a relatively appropriate range. + +### New Functions + +Add the new function `width_bucket`. 
+ +Reference: [https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description) + +# Behavior Changes + +- Disable BE's page cache by default: `disable_storage_page_cache=true` + +Turning off the page cache optimizes memory usage and reduces the risk of OOM. +But it will increase the query latency of some small queries. +If you are sensitive to query latency, or have high-concurrency, small-query scenarios, you can configure *disable_storage_page_cache=false* to enable page cache again. + +- Add new session variable `group_by_and_having_use_alias_first`, used to control whether the group by and having clauses use aliases first. + +Reference: [https://doris.apache.org/docs/dev/advanced/variables](https://doris.apache.org/docs/dev/advanced/variables) + +# Improvement + +### Compaction + +- Support `Vertical Compaction`, to optimize the compaction overhead and efficiency of wide tables. + +- Support `Segment Compaction`. Fix -238 and -235 issues with high frequency imports. + +### Lakehouse + +- Hive Catalog is compatible with Hive versions 1/2/3. + +- Hive Catalog can access JuiceFS based HDFS with Broker. + +- Iceberg Catalog supports Hive Metastore and Rest Catalog types. + +- ES Catalog supports _id column mapping. + +- Optimize Iceberg V2 read performance with a large number of deleted rows. + +- Support for reading Iceberg tables after Schema Evolution + +- Parquet Reader handles column name case correctly. + +### Other + +- Support for accessing Hadoop KMS-encrypted HDFS. + +- Support canceling an Export task in progress. + +- Optimize the performance of `explode_split` by 1x. + +- Optimize the read performance of nullable columns by 3x. + +- Optimize some problems of Memtracker, improve memory management accuracy, and optimize memory allocation. + + + +# Bug Fix + +- Fixed memory leak when loading data with Doris Flink Connector.
+ +- Fixed the possible thread scheduling problem of BE and reduced the `Fragment sent timeout` error caused by BE thread exhaustion. + +- Fixed various correctness and precision issues of column type datetimev2/decimalv3. + +- Fixed the data correctness issue with Unique Key Merge-on-Read table. + +- Fixed various known issues with the Light Schema Change feature. + +- Fixed various data correctness issues of bitmap type Runtime Filter. + +- Fixed the problem of poor reading performance of csv reader introduced in version 1.2.1. + +- Fixed the problem of BE OOM caused by Spark Load data download phase. + +- Fixed possible metadata compatibility issues when upgrading from version 1.1 to version 1.2. + +- Fixed the metadata problem when creating JDBC Catalog with Resource. + +- Fixed the problem of high CPU usage caused by load operation. + +- Fixed the problem of FE OOM caused by a large number of failed Broker Load jobs. + +- Fixed the problem of precision loss when loading floating-point types. + +- Fixed the problem of memory leak when using 2PC stream load + +# Other + +Add metrics to view the total rowset and segment numbers on BE + +- doris_be_all_rowsets_num and doris_be_all_segments_num + + +# Big Thanks + +Thanks to ALL who contributed to this release!
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@DarvenDuan + +@dataroaring + +@Doris-Extras + +@dutyu + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@htyoung + +@isHuangXin + +@JackDrogon + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@Lchangliang + +@LemonLiTree + +@liaoxin01 + +@liqing-coder + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@qzsee + +@spaces-X + +@stalary + + +@starocean999 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xy720 + +@yangzhg + +@yiguolei + +@yixiutt + +@Yukang-Lian + +@Yulei-Yang + +@zclllyybb + +@zddr + +@zhangstar333 + +@zhannngchen + +@zy-kkk + + + + + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md new file mode 100644 index 0000000000000..cd9226b15e14f --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md @@ -0,0 +1,109 @@ +--- +{ + "title": "Release 1.2.3", + "language": "en" +} +--- + + + +# Improvement + +### JDBC Catalog + +- Support connecting to Doris clusters through JDBC Catalog. + +Currently, Jdbc Catalog only support to use 5.x version of JDBC jar package to connect another Doris database. If you use 8.x version of JDBC jar package, the data type of column may not be matched. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris) + +- Support to synchronize only the specified database through the `only_specified_database` attribute. + +- Support synchronizing table names in the form of lowercase through `lower_case_table_names` to solve the problem of case sensitivity of table names. 
+ +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc) + +- Optimize the read performance of JDBC Catalog. + +### Elasticsearch Catalog + +- Support Array type mapping. + +- Support controlling whether to push down the like expression through the `like_push_down` attribute, to limit the CPU overhead of the ES cluster. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es) + +### Hive Catalog + +- Support Hive table default partition `_HIVE_DEFAULT_PARTITION_`. + +- Hive Metastore metadata automatic synchronization supports notification events in compressed format. + +### Dynamic Partition Improvement + +- Dynamic partition supports specifying the `storage_medium` parameter to control the storage medium of the newly added partition. + +Reference: [https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition](https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition) + + +### Optimize BE's Threading Model + +- Optimize BE's threading model to avoid stability problems caused by frequent thread creation and destruction. + +# Bugfix + +- Fixed issues with Merge-On-Write Unique Key tables. + +- Fixed compaction related issues. + +- Fixed some delete statement issues causing data errors. + +- Fixed several query execution errors. + +- Fixed the problem that using JDBC catalog caused BE to crash on some operating systems. + +- Fixed Multi-Catalog issues. + +- Fixed memory statistics and optimization issues. + +- Fixed decimalV3 and date/datetimev2 related issues. + +- Fixed load transaction stability issues. + +- Fixed light-weight schema change issues. + +- Fixed the issue of using `datetime` type for batch partition creation. + +- Fixed the problem that a large number of failed broker loads would cause the FE memory usage to be too high.
+ +- Fixed the problem that stream load cannot be canceled after dropping the table. + +- Fixed querying `information_schema` timeout in some cases. + +- Fixed the problem of BE crash caused by concurrent data export using `select outfile`. + +- Fixed transactional insert operation memory leak. + +- Fixed several query/load profile issues, and supports direct download of profiles through FE web ui. + +- Fixed the problem that the BE tablet GC thread caused the IO util to be too high. + +- Fixed the problem that the commit offset is inaccurate in Kafka routine load. + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md new file mode 100644 index 0000000000000..a959a323d06d1 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md @@ -0,0 +1,81 @@ +--- +{ + "title": "Release 1.2.4", + "language": "en" +} +--- + + + + +# Behavior Changed + +- For `DateV2`/`DatetimeV2` and `DecimalV3` type, in the results of `DESCRIBE` and `SHOW CREATE TABLE` statements, they will no longer be displayed as `DateV2`/`DatetimeV2` or `DecimalV3`, but directly displayed as `Date`/`Datetime` or `Decimal`. + + - This change is for compatibility with some BI tools. If you want to see the actual type of the column, you can check it with the `DESCRIBE ALL` statement. + +- When querying tables in the `information_schema` database, the meta information (database, table, column, etc.) in the external catalog is no longer returned by default. + + - This change avoids the problem that the `information_schema` database cannot be queried due to the connection problem of some external catalog, so as to solve the problem of using some BI tools with Doris. It can be controlled by the FE configuration `infodb_support_ext_catalog`, and the default value is `false`, that is, the meta information of external catalog will not be returned.
+ +# Improvement + +### JDBC Catalog + +- Supports connecting to Trino/Presto via JDBC Catalog + +Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino) + +- JDBC Catalog connects to Clickhouse data source and supports Array type mapping + +Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse) + +### Spark Load + +- Spark Load supports Resource Manager HA related configuration + +Refer to: https://github.com/apache/doris/pull/15000 + +## Bug Fixes + +- Fixed several connectivity issues with Hive Catalog. + +- Fixed ClassNotFound issues with Hudi Catalog. + +- Optimized the connection pool of JDBC Catalog to avoid too many connections. + +- Fixed the problem that OOM will occur when importing data from another Doris cluster through JDBC Catalog. + +- Fixed several query and import planning issues. + +- Fixed several issues with Unique Key Merge-On-Write data model. + +- Fixed several BDBJE issues and solved the problem of abnormal FE metadata in some cases. + +- Fixed the problem that the `CREATE VIEW` statement does not support Table Valued Function. + +- Fixed several memory statistics issues. + +- Fixed several issues reading Parquet/ORC format. + +- Fixed several issues with DecimalV3. + +- Fixed several issues with SHOW QUERY/LOAD PROFILE. + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md new file mode 100644 index 0000000000000..55af863ba47d6 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md @@ -0,0 +1,199 @@ +--- +{ + "title": "Release 1.2.5", + "language": "en" +} +--- + + + +In version 1.2.5, the Doris team has made nearly 210 fixes and performance improvements since the release of version 1.2.4.
At the same time, version 1.2.5 is also an iterative version of version 1.2.4, which has higher stability. It is recommended that all users upgrade to this version. + +# Behavior Changed + +- The `start_be.sh` script will check that the maximum number of file handles in the system must be greater than or equal to 65536, otherwise the startup will fail. + +- The BE configuration item `enable_quick_compaction` is set to true by default. The Quick Compaction is enabled by default. This feature is used to optimize the problem of small files in the case of large batch import. + +- After modifying the dynamic partition attribute of the table, it will no longer take effect immediately, but wait for the next task scheduling of the dynamic partition table to avoid some deadlock problems. + +# Improvement + +- Optimize the use of bthread and pthread to reduce the RPC blocking problem during the query process. + +- A button to download Profile is added to the Profile page of the FE web UI. + +- Added FE configuration `recover_with_skip_missing_version`, which is used to query to skip the problematic replica under certain failure conditions. + +- The row-level permission function supports external Catalog. + +- Hive Catalog supports automatic refreshing of kerberos tickets on the BE side without manual refreshing. + +- JDBC Catalog supports tables under the MySQL/ClickHouse system database (`information_schema`). + +# Bug Fixes + +- Fixed the problem of incorrect query results caused by low-cardinality column optimization + +- Fixed several authentication and compatibility issues accessing HDFS. + +- Fixed several issues with float/double and decimal types. + +- Fixed several issues with date/datetimev2 types. + +- Fixed several query execution and planning issues. + +- Fixed several issues with JDBC Catalog. + +- Fixed several query-related issues with Hive Catalog, and Hive Metastore metadata synchronization issues. 
+ +- Fix the problem that the result of `SHOW LOAD PROFILE` statement is incorrect. + +- Fixed several memory related issues. + +- Fixed several issues with `CREATE TABLE AS SELECT` functionality. + +- Fix the problem that the jsonb type causes BE to crash on CPU that do not support avx2. + +- Fixed several issues with dynamic partitions. + +- Fixed several issues with TOPN query optimization. + +- Fixed several issues with the Unique Key Merge-on-Write table model. + +# Big Thanks + +58 contributors participated in the improvement and release of 1.2.5, and thank them for their hard work and dedication: + +@adonis0147 + +@airborne12 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@caiconghui + +@CalvinKirs + +@cambyzju + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dujl + +@dutyu + +@fsilent + +@Gabriel39 + +@gitccl + +@gnehil + +@GoGoWen + +@gongzexin + +@HappenLee + +@herry2038 + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@LemonLiTree + +@liaoxin01 + +@LiBinfeng-01 + +@luwei16 + +@Moonm3n + +@morningman + +@mrhhsg + +@Mryange + +@nextdreamblue + +@nsnhuang + +@qidaye + +@Shoothzj + +@sohardforaname + +@stalary + +@starocean999 + +@SWJTU-ZhangLei + +@wsjz + +@xiaokang + +@xinyiZzz + +@yangzhg + +@yiguolei + +@yixiutt + +@yujun777 + +@Yulei-Yang + +@yuxuan-luo + +@zclllyybb + +@zddr + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zxealous + +@zy-kkk + +@zzzzzzzs diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md new file mode 100644 index 0000000000000..39146b35b15ac --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md @@ -0,0 +1,135 @@ +--- +{ + "title": "Release 1.2.6", + "language": "en" +} +--- + + + + +# Behavior Change + +- Add a BE configuration item `allow_invalid_decimalv2_literal` to control whether can import data that exceeding the decimal's precision, for compatibility with previous logic. 
+ +# Query + +- Fix several query planning issues. +- Support `sql_select_limit` session variable. +- Optimize query cold run performance. +- Fix expr context memory leak. +- Fix the issue that the `explode_split` function was executed incorrectly in some cases. + +## Multi Catalog + +- Fix the issue that synchronizing hive metadata caused FE replay edit log to fail. +- Fix `refresh catalog` operation causing FE OOM. +- Fix the issue that jdbc catalog cannot handle `0000-00-00` correctly. +- Fixed the issue that the kerberos ticket cannot be refreshed automatically. +- Optimize the partition pruning performance of hive. +- Fix the inconsistent behavior of trino and presto in jdbc catalog. +- Fix the issue that hdfs short-circuit read could not be used to improve query efficiency in some environments. +- Fix the issue that the iceberg table on CHDFS could not be read. + +# Storage + +- Fix the wrong calculation of delete bitmap in MOW table. +- Fix several BE memory issues. +- Fix snappy compression issue. +- Fix the issue that jemalloc may cause BE to crash in some cases. + +# Others + +- Fix several java udf related issues. +- Fix the issue that the `recover table` operation incorrectly triggered the creation of dynamic partitions. +- Fix timezone when importing orc files via broker load. +- Fix the issue that the newly added `PERCENT` keyword caused the replay metadata of the routine load job to fail. +- Fix the issue that the `truncate` operation failed to act on a non-partitioned table. +- Fix the issue that the mysql connection was lost due to the `show snapshot` operation. +- Optimize the lock logic to reduce the probability of lock timeout errors when creating tables. +- Add session variable `have_query_cache` to be compatible with some old mysql clients. +- Optimize the error message shown when a load fails.
+ +# Big Thanks + +Thanks all who contribute to this release: + +@amorynan + +@BiteTheDDDDt + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dutyu + +@Gabriel39 + +@HHoflittlefish777 + +@htyoung + +@jacktengg + +@jeffreys-cat + +@kaijchen + +@kaka11chen + +@Kikyou1997 + +@KnightLiJunLong + +@liaoxin01 + +@LiBinfeng-01 + +@morningman + +@mrhhsg + +@sohardforaname + +@starocean999 + +@vinlee19 + +@wangbo + +@wsjz + +@xiaokang + +@xinyiZzz + +@yiguolei + +@yujun777 + +@Yulei-Yang + +@zhangstar333 + +@zy-kkk + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md new file mode 100644 index 0000000000000..cd47282f4688d --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md @@ -0,0 +1,46 @@ +--- +{ + "title": "Release 1.2.7", + "language": "en" +} +--- + + + +# Bug Fixes + +- Fixed some query issues. +- Fix some storage issues. +- Fix some decimal precision issues. +- Fix query error caused by invalid `sql_select_limit` session variable's value. +- Fix the problem that hdfs short-circuit read cannot be used. +- Fix the problem that Tencent Cloud cosn cannot be accessed. +- Fix several issues with hive catalog kerberos access. +- Fix the problem that stream load profile cannot be used. +- Fix promethus monitoring parameter format problem. +- Fix the table creation timeout issue when creating a large number of tablets. + +# New Features + +- Unique Key model supports array type as value column +- Added `have_query_cache` variable for compatibility with MySQL ecosystem. 
+- Added `enable_strong_consistency_read` to support strong consistent read between sessions +- FE metrics supports user-level query counter + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md new file mode 100644 index 0000000000000..35cbb7a3cdcf1 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md @@ -0,0 +1,47 @@ +--- +{ + "title": "Release 1.2.8", + "language": "en" +} +--- + + + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Bug Fixes +- Fixed several issues with query execution. +- Fixed several issues with Spark Load. +- Fixed several issues with Parquet Reader. +- Fixed several issues with Orc Reader. +- Fixed Broker "FileSystem closed" problem. +- Fixed several issues with Broker Load. +- Fixed several issues with CTAS execution. +- Fixed several issues with backup and restore. +- Added "Catalog" column in audit log. +- Optimized the metadata cache of Iceberg Catalog. +- Fixed several issues with outfile/export feature. +- Fixed an issue with "replayEraseTable" edit log causing FE start to fail. +- Fixed some security issues. + + diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md new file mode 100644 index 0000000000000..b0b88f715ee51 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md @@ -0,0 +1,159 @@ +--- +{ + "title": "Release 2.1.0", + "language": "en" +} +--- + + + +Dear community, we are pleased to share with you the official release of Apache Doris 2.1.0, now available for download and use as of March 8th. 
This latest version marks a significant milestone in our journey towards enhancing data analysis capabilities, particularly for handling massive and complex datasets. + +With Doris 2.1.0, our primary focus has been on optimizing analysis performance, and the results speak for themselves. We have achieved an impressive performance improvement of over 100% on the TPC-DS 1TB test dataset, making Apache Doris more capable of challenging real-world business scenarios. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Performance improvement + +### Smarter optimizer + +On the basis of V2.0, the query optimizer in Doris V2.1 comes with enhanced statistics-based inference and enumeration framework. We have upgraded the cost model and expanded the optimization rules to serve the needs of more use cases + +### Better heuristic optimization + +For data analytics at scale or data lake scenarios, Doris V2.1 provides better heuristic query plans. Meanwhile, the RuntimeFilter is more self-adaptive to enable higher performance even without statistical information. + +### Parallel adaptive scan + +Doris V2.1 has adopted parallel adaptive scan to optimize scan I/O and thus improve query performance. It can avoid the negative impact of unreasonable numbers of buckets. (This feature is currently available on the Duplicate Key model and Merge-on-Write Unique Key model.) + +### Local shuffle + +We have introduced Local Shuffle to prevent uneven data distribution. Benchmark tests show that Local Shuffle in combination with Parallel Adaptive Scan can guarantee fast query performance in spite of unreasonable bucket number settings upon table creation. 
+ +### Faster INSERT INTO SELECT + +To further improve the performance of INSERT INTO SELECT, which is a frequent operation in ETL, we have moved forward the MemTable execution-wise to reduce data ingestion overheads. Tests show that this can double the data ingestion speed in most cases compared to V2.0. +Improved data lake analytics capabilities + +## Data lake analytic performance + +### TPC-DS Benchmark + +According to TPC-DS benchmark tests (1TB) of Doris V2.1 against Trino, + +- Without caching, the total execution time of Doris is 56% of that of Trino V435. (717s VS 1296s) +- Enabling file cache can further increase the overall performance of Doris by 2.2 times. (323s) + This is achieved by a series of optimizations in I/O, parquet/ORC file reading, predicate pushdown, caching, and scan task scheduling, etc. + +### SQL dialects compatibility + +To facilitate migration to Doris and increase its compatibility with other DBMS, we have enabled SQL dialect conversion in V2.1. ([read more](../../lakehouse/sql-dialect)) For example, by set sql_dialect = "trino" in Doris, you can use the Trino SQL dialect as you're used to, without modifying your current business logic, and Doris will execute the corresponding queries for you. Tests in user production environment show that Doris V2.1 is compatible with 99% of Trino SQL. + +### Arrow Flight SQL protocol + +As a column-oriented database compatible with MySQL 8.0 protocol, Doris V2.1 now supports the Arrow Flight SQL protocol as well so users can have fast access to Doris data via Pandas/Numpy without data serialization and deserialization. For most common data types, the Arrow Flight protocol enables tens of times faster performance than the MySQL protocol. + +## Asynchronous materialized view + +V2.1 allows creating a materialized view based on multiple tables. 
This feature currently supports: + +- Transparent rewriting: supports transparent rewriting of common operators including Select, Where, Join, Group By, and Aggregation. +- Auto refresh: supports regular refresh, manual refresh, full refresh, incremental refresh, and partition-based refresh. +- Materialized view of external tables: supports materialized views based on external data tables such as those on Hive, Hudi, and Iceberg; supports synchronizing data from data lakes into Doris internal tables via materialized views. +- Direct query on materialized views: Materialized views can be regarded as the result set after ETL. In this sense, materialized views are data tables, so users can conduct queries on them directly. + +## Enhanced storage + +### Auto-increment column + +V2.1 supports auto-increment columns, which can ensure data uniqueness of each row. This lays the foundation for efficient dictionary encoding and query pagination. For example, for precise UV calculation and customer grouping, users often apply the bitmap type in Doris, the process of which entails dictionary encoding. With V2.1, users can first create a dictionary table using the auto-increment column, and then simply load user data into it. + +### Auto partition + +To further reduce the burden of operation and maintenance, V2.1 allows auto data partitioning. Upon data ingestion, it detects whether a partition exists for the data based on the partitioning column. If not, it automatically creates one and starts data ingestion. + +### High-concurrency real-time data ingestion + +For data writing, a back pressure mechanism is in place to avoid excessive data versions, so as to reduce resource consumption by data version merging. In addition, V2.1 supports group commit ([read more](../../data-operate/import/import-way/group-commit-manual)), which accumulates multiple writes and commits them as one.
Benchmark tests on group commit with JDBC ingestion and the Stream Load method present great results. + +## Semi-structured data analysis + +### A new data type: Variant + +V2.1 supports a new data type named Variant. It can accommodate semi-structured data such as JSON as well as compound data types that contain integers, strings, booleans, etc. Users don't have to pre-define the exact data types for a Variant column in the table schema. The Variant type is handy when processing nested data structures. +You can include Variant columns and static columns with pre-defined data types in the same table. This will provide you with more flexibility in storage and queries. +Tests with ClickBench datasets prove that data in Variant columns takes up the same storage space as data in static columns, which is half of that in JSON format. In terms of query performance, the Variant type enables 8 times higher query speed than JSON in hot runs and even more in cold runs. + +### IP types + +Doris V2.1 provides native support for IPv4 and IPv6. It stores IP data in binary format, which cuts down storage space usage by 60% compared to IP strings in plain text. Along with these IP types, we have added over 20 functions for IP data processing. + +### More powerful functions for compound data types + +- explode_map: supports exploding rows into columns for the Map data type. +- Supports the STRUCT data type in the IN predicates + +## Workload Management + +### Hard isolation of resources + +On the basis of the Workload Group mechanism, which imposes a soft limit on the resources that a workload group can use, Doris 2.1 introduces a hard limit on CPU resource consumption for workload groups as a way to ensure higher stability in query performance. + +### TopSQL + +V2.1 allows users to check the most resource-consuming SQL queries at runtime. This can be a big help when handling cluster load spikes caused by unexpected large queries.
+ + +## Others + +### Decimal 256 + +For users in the financial sector or high-end manufacturing, V2.1 supports a high-precision data type: Decimal, which supports up to 76 significant digits (an experimental feature, please set enable_decimal256=true.) + +### Job scheduler + +V2.1 provides a good option for regular task scheduling: Doris Job Scheduler. It can trigger the pre-defined operations on schedule or at fixed intervals. The Doris Job Scheduler is accurate to the second. It provides consistency guarantee for data writing, high efficiency and flexibility, high-performance processing queues, retraceable scheduling records, and high availability of jobs. + +### Support Docker fast start to experience the new version + +Starting from version 2.1.0, we will provide a separate Docker Image to support the rapid creation of a 1FE, 1BE Docker container to experience the new version of Doris. The container will complete the initialization of FE and BE, BE registration and other steps by default. After the container is created, the Doris cluster can be accessed and used directly in about 1 minute. In this image version, the default `max_map_count`, `ulimit`, `Swap` and other hard limits are removed. It supports X64 (avx2) machines and ARM machines for deployment. The default open ports are 8000, 8030, 8040, 9030. If you need to experience the Broker component, you can add the environment variable `--env BROKER=true` at startup to start the Broker process synchronously. After startup, it will automatically complete the registration. The Broker name is `test`. + +Please note that this version is only suitable for quick experience and functional testing, not for production environment! + +## Behavior changed + +- The default data model is the Merge-on-Write Unique Key model. enable_unique_key_merge_on_write will be included as a default setting when a table is created in the Unique Key model.
+- As inverted index has proven to be more performant than bitmap index, V2.1 stops supporting bitmap index. Existing bitmap indexes will remain effective but new creation is not allowed. We will remove bitmap index-related code in the future. +- cpu_resource_limit is no longer supported. It is to put a limit on the number of scanner threads on Doris BE. Since the workload group mechanism also supports such settings, the already configured cpu_resource_limit will be invalid. +- The default value of enable_segcompaction is true. This means Doris supports compaction of multiple segments in the same rowset. +- Audit log plug-in + - Since V2.1.0, Doris has a built-in audit log plug-in. Users can simply enable or disable it by setting the enable_audit_plugin parameter. + - If you have already installed your own audit log plug-in, you can either continue using it after upgrading to Doris V2.1, or uninstall it and use the one in Doris. Please note that the audit log table will be relocated after switching plug-in. + - For more details, please see the [docs](../../admin-manual/audit-plugin). 
+ + +## Credits +Thanks all who contribute to this release: + +467887319, 924060929, acnot, airborne12, AKIRA, alan_rodriguez, AlexYue, allenhooo, amory, amory, AshinGau, beat4ocean, BePPPower, bigben0204, bingquanzhao, BirdAmosBird, BiteTheDDDDt, bobhan1, caiconghui, camby, camby, CanGuan, caoliang-web, catpineapple, Centurybbx, chen, ChengDaqi2023, ChenyangSunChenyang, Chester, ChinaYiGuan, ChouGavinChou, chunping, colagy, CSTGluigi, czzmmc, daidai, dalong, dataroaring, DeadlineFen, DeadlineFen, deadlinefen, deardeng, didiaode18, DongLiang-0, dong-shuai, Doris-Extras, Dragonliu2018, DrogonJackDrogon, DuanXujianDuan, DuRipeng, dutyu, echo-dundun, ElvinWei, englefly, Euporia, feelshana, feifeifeimoon, feiniaofeiafei, felixwluo, figurant, flynn, fornaix, FreeOnePlus, Gabriel39, gitccl, gnehil, GoGoWen, gohalo, guardcrystal, hammer, HappenLee, HB, hechao, HelgeLarsHelge, herry2038, HeZhangJianHe, HHoflittlefish777, HonestManXin, hongkun-Shao, HowardQin, hqx871, httpshirley, htyoung, huanghaibin, HuJerryHu, HuZhiyuHu, Hyman-zhao, i78086, irenesrl, ixzc, jacktengg, jacktengg, jackwener, jayhua, Jeffrey, jiafeng.zhang, Jibing-Li, JingDas, julic20s, kaijchen, kaka11chen, KassieZ, kindred77, KirsCalvinKirs, KirsCalvinKirs, kkop, koarz, LemonLiTree, LHG41278, liaoxin01, LiBinfeng-01, LiChuangLi, LiDongyangLi, Lightman, lihangyu, lihuigang, LingAdonisLing, liugddx, LiuGuangdongLiu, LiuHongLiu, liuJiwenliu, LiuLijiaLiu, lsy3993, LuGuangmingLu, LuoMetaLuo, luozenglin, Luwei, Luzhijing, lxliyou001, Ma1oneZhang, mch_ucchi, Miaohongkai, morningman, morrySnow, Mryange, mymeiyi, nanfeng, nanfeng, Nitin-Kashyap, PaiVallishPai, Petrichor, plat1ko, py023, q763562998, qidaye, QiHouliangQi, ranxiang327, realize096, rohitrs1983, sdhzwc, seawinde, seuhezhiqiang, seuhezhiqiang, shee, shuke987, shysnow, songguangfan, Stalary, starocean999, SunChenyangSun, sunny, SWJTU-ZhangLei, TangSiyang2001, Tanya-W, taoxutao, Uniqueyou, vhwzIs, walter, walter, wangbo, Wanghuan, wangqt, wangtao, 
wangtianyi2004, wenluowen, whuxingying, wsjz, wudi, wudongliang, wuwenchihdu, wyx123654, xiangran0327, Xiaocc, XiaoChangmingXiao, xiaokang, XieJiann, Xinxing, xiongjx, xuefengze, xueweizhang, XueYuhai, XuJianxu, xuke-hat, xy, xy720, xyfsjq, xzj7019, yagagagaga, yangshijie, YangYAN, yiguolei, yiguolei, yimeng, YinShaowenYin, Yoko, yongjinhou, ytwp, yuanyuan8983, yujian, yujun777, Yukang-Lian, Yulei-Yang, yuxuan-luo, zclllyybb, ZenoYang, zfr95, zgxme, zhangdong, zhangguoqiang, zhangstar333, zhangstar333, zhangy5, ZhangYu0123, zhannngchen, ZhaoLongZhao, zhaoshuo, zhengyu, zhiqqqq, ZhongJinHacker, ZhuArmandoZhu, zlw5307, ZouXinyiZou, zxealous, zy-kkk, zzwwhh, zzzxl1993, zzzzzzzs diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md new file mode 100644 index 0000000000000..384bccdceb414 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md @@ -0,0 +1,251 @@ +--- +{ + "title": "Release 2.1.1", + "language": "en" +} +--- + + + +Dear community members, Apache Doris 2.1.1 has been officially released on April 3, 2024, with several enhancements and bug fixes based on 2.1.0, enabling smoother user experience. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## Behavior Changed + +1. Change float type output format to improve float type serialization performance. + +- https://github.com/apache/doris/pull/32049 + +2. Change system table value functions active_queries(), workload_groups() to system tables. + +- https://github.com/apache/doris/pull/32314 + +3. Disable show query/load profile stmt because there are not so many developers use it and the pipeline and pipelinex engine not support it. + +- https://github.com/apache/doris/pull/32467 + +4. 
Upgrade the Arrow Flight version to 15.0.2 to fix some bugs; please use ADBC version 15.0.2 to access Doris. + +- https://github.com/apache/doris/pull/32827 + + + +## Upgrade Problem + +1. BE will core during rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32672 + +- https://github.com/apache/doris/pull/32444 + +- https://github.com/apache/doris/pull/32162 + +2. JDBC Catalog will have query errors during rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32618 + + + +## New Feature + +1. Enable column auth by default. + +- https://github.com/apache/doris/pull/32659 + + +2. Get correct cores for pipeline and pipelinex engine when running within docker or k8s. + +- https://github.com/apache/doris/pull/32370 + +3. Support read parquet int96 type. + +- https://github.com/apache/doris/pull/32394 + +4. Enable proxy protocol to support IP transparency. Using this protocol, IP transparency for load balancing can be achieved, so that after load balancing, Doris can still obtain the client's real IP and implement permission control such as whitelisting. + +- https://github.com/apache/doris/pull/32338/files + +5. Add workload group queue related columns for active_queries system table. Users can use this system table to monitor the workload queue usage. + +- https://github.com/apache/doris/pull/32259 + +6. Add new system table backend_active_tasks to monitor the real-time query statistics on every BE. + +- https://github.com/apache/doris/pull/31945 + +7. Add ipv4 and ipv6 support for spark-doris connector. + +- https://github.com/apache/doris/pull/32240 + +8. Add inverted index support for CCR. + +- https://github.com/apache/doris/pull/32101 + +9. Support select experimental session variable. + +- https://github.com/apache/doris/pull/31837 + +10. Support materialized view with bitmap_union(bitmap_from_array()) case. + +- https://github.com/apache/doris/pull/31962 + +11. Support partition prune for *HIVE_DEFAULT_PARTITION*.
+ +- https://github.com/apache/doris/pull/31736 + +12. Support function in set variable statement. + +- https://github.com/apache/doris/pull/32492 + +13. Support arrow serialization for varint type. + +- https://github.com/apache/doris/pull/32809 + + + +## Optimization + +1. Automatically resume routine load when BE restarts or during upgrade, keeping routine load stable. + +- https://github.com/apache/doris/pull/32239 + +2. Routine Load: optimize the algorithm that allocates tasks to BEs for load balancing. + +- https://github.com/apache/doris/pull/32021 + +3. Spark Load: update spark version for spark load to resolve cve problem. + +- https://github.com/apache/doris/pull/30368 + +4. Skip cooldown if the tablet is dropped. + +- https://github.com/apache/doris/pull/32079 + +5. Support using workload group to manage routine load. + +- https://github.com/apache/doris/pull/31671 + +6. [MTMV] Improve the performance of query rewriting by materialized view. + +- https://github.com/apache/doris/pull/31886 + +7. Reduce jvm heap memory consumed by profiles of BrokerLoadJob. + +- https://github.com/apache/doris/pull/31985 +8. Improve high-QPS query performance by speeding up PartitionPrunner. + +- https://github.com/apache/doris/pull/31970 + +9. Reduce duplicated memory consumption for column name and column path for schema cache. + +- https://github.com/apache/doris/pull/31141 + +10. Support more join types for query rewriting by materialized view such as INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, LEFT SEMI JOIN, RIGHT SEMI JOIN, LEFT ANTI JOIN, RIGHT ANTI JOIN + +- https://github.com/apache/doris/pull/32909 + + + +## Bugfix + + +1. Do not push down topn-filter through right/full outer join if the first orderkey is nulls first. + +- https://github.com/apache/doris/pull/32633 + +2. Fix memory leak in Java UDF + +- https://github.com/apache/doris/pull/32630 + +3. If some odbc tables use the same resource, and restore not all odbc tables, it will not retain the resource.
+and check some conf for backup/restore + +- https://github.com/apache/doris/pull/31989 + +4. Fold constant will core for variant type. + +- https://github.com/apache/doris/pull/32265 + +5. Routine load will pause when transaction fail in some cases. + +- https://github.com/apache/doris/pull/32638 + +6. the result of left semi join with empty right side should be false instead of null. + +- https://github.com/apache/doris/pull/32477 + +7. Fix core when build inverted index for a new column with no data. + +- https://github.com/apache/doris/pull/32669 + +8. Fix be core caused by null-safe-equal join. + +- https://github.com/apache/doris/pull/32623 + +9. Partial update: fix data correctness risk when load delete sign data into a table with sequence col. + +- https://github.com/apache/doris/pull/32574 + +10. Select outfile: Fix the column type mapping in the orc/parquet file format. + +- https://github.com/apache/doris/pull/32281 + +11. Fix BE core during restore stage. + +- https://github.com/apache/doris/pull/32489 + +12. Use array_agg func after other agg func like count, sum, may make be core. + +- https://github.com/apache/doris/pull/32387 + +13. Variant type should always nullable or there will some bugs. + +- https://github.com/apache/doris/pull/32248 + +14. Fix the bug of handling empty blocks in schema change. + +- https://github.com/apache/doris/pull/32396 + +15. Fix BE will core when use json_length() in some cases. + +- https://github.com/apache/doris/pull/32145 + +16. Fix error when query iceberg table using date cast predicate + +- https://github.com/apache/doris/pull/32194 + +17. Fix some bugs when build inverted index for variant type. + +- https://github.com/apache/doris/pull/31992 + +18. Wrong result of two or more map_agg functions in query. + +- https://github.com/apache/doris/pull/31928 + +19. Fix wrong result of money_format function. + +- https://github.com/apache/doris/pull/31883 + +20. Fix connection hang after too many connections. 
+ +- https://github.com/apache/doris/pull/31594 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md new file mode 100644 index 0000000000000..6116bd9984632 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md @@ -0,0 +1,110 @@ +--- +{ + "title": "Release 2.1.2", + "language": "en" +} +--- + + + +## Behavior Changed + +1. Set the default value of the `data_consistence` property of EXPORT to partition to make export more stable during load. + +- https://github.com/apache/doris/pull/32830 + +2. Some of MySQL Connector (eg, dotnet MySQL.Data) rely on variable's column type to make connection. + + eg, select @[@autocommit]([@autocommit](https://github.com/autocommit)) should with column type BIGINT, not BIT, otherwise it will throw error. So we change column type of @[@autocommit](https://github.com/autocommit) to BIGINT. + + - https://github.com/apache/doris/pull/33282 + + +## Upgrade Problem + +1. Normal workload group is not created when upgrade from 2.0 or other old versions. + + - https://github.com/apache/doris/pull/33197 + +## New Feature + + +1. Add processlist table in information_schema database, users could use this table to query active connections. + + - https://github.com/apache/doris/pull/32511 + +2. Add a new table valued function `LOCAL` to allow access file system like shared storage. + + - https://github.com/apache/doris-website/pull/494 + + +## Optimization + +1. Skip some useless process to make graceful stop more quickly in K8s env. + + - https://github.com/apache/doris/pull/33212 + +2. Add rollup table name in profile to help find the mv selection problem. + + - https://github.com/apache/doris/pull/33137 + +3. Add test connection function to DB2 database to allow user check the connection when create DB2 Catalog. + + - https://github.com/apache/doris/pull/33335 + +4. 
Add DNS Cache for FQDN to accelerate the connect process among BEs in K8s env. + + - https://github.com/apache/doris/pull/32869 + +5. Refresh external table's rowcount async to make the query plan more stable. + + - https://github.com/apache/doris/pull/32997 + + +## Bugfix + + +1. Fix Iceberg Catalog of HMS and Hadoop do not support Iceberg properties like "io.manifest.cache-enabled" to enable manifest cache in Iceberg. + + - https://github.com/apache/doris/pull/33113 + +2. The offset params in `LEAD`/`LAG` function could use 0 as offset. + + - https://github.com/apache/doris/pull/33174 + +3. Fix some timeout issues with load. + + - https://github.com/apache/doris/pull/33077 + + - https://github.com/apache/doris/pull/33260 + +4. Fix core problem related with `ARRAY`/`MAP`/`STRUCT` compaction process. + + - https://github.com/apache/doris/pull/33130 + + - https://github.com/apache/doris/pull/33295 + +5. Fix runtime filter wait timeout. + + - https://github.com/apache/doris/pull/33369 + +6. Fix `unix_timestamp` core for string input in auto partition. + + - https://github.com/apache/doris/pull/32871 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md new file mode 100644 index 0000000000000..e88ec3e94fb6d --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md @@ -0,0 +1,191 @@ +--- +{ + "title": "Release 2.1.3", + "language": "en" +} +--- + + + +Apache Doris 2.1.3 was officially released on May 21, 2024. This version has updated several improvements, including writing data back to Hive, materialized view, permission management and bug fixes. It further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + + +## Feature Enhancements + +**1. 
Support writing back data to hive tables via Hive Catalog** + +Starting from version 2.1.3, Apache Doris supports DDL and DML operations on Hive. Users can directly create libraries and tables in Hive through Apache Doris and write data to Hive tables by executing `INSERT INTO` statements. This feature allows users to perform complete data query and write operations on Hive through Apache Doris, further simplifying the integrated lakehouse architecture. + +Please refer: [https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) + +**2. Support building new asynchronous materialized views on top of existing ones** + +Users can create new asynchronous materialized views on top of existing ones, directly reusing pre-computed intermediate results for data processing. This simplifies complex aggregation and computation operations, reducing resource consumption and maintenance costs while further accelerating query performance and improving data availability. [#32984](https://github.com/apache/doris/pull/32984) + +**3. Support rewriting through nested materialized views** + +Materialized View (MV) is a database object used to store query results. Now, Apache Doris supports rewriting through nested materialized views, which helps optimize query performance. [#33362](https://github.com/apache/doris/pull/33362) + +**4. New `SHOW VIEWS` statement** + +The `SHOW VIEWS` statement can be used to query views in the database, facilitating better management and understanding of view objects in the database. [#32358](https://github.com/apache/doris/pull/32358) + +**5. Workload Group supports binding to specific BE nodes** + +Workload Group can be bound to specific BE nodes, enabling more refined control over query execution to optimize resource usage and improve performance. [#32874](https://github.com/apache/doris/pull/32874) + +**6. 
Broker Load supports compressed JSON format** + +Broker Load now supports importing compressed JSON format data, significantly reducing bandwidth requirements for data transmission and accelerating data import performance. [#30809](https://github.com/apache/doris/pull/30809) + +**7. TRUNCATE Function can use columns as scale parameters** + +The TRUNCATE function can now accept columns as scale parameters, providing more flexibility when processing numerical data. [#32746](https://github.com/apache/doris/pull/32746) + +**8. Add new functions `uuid_to_int` and `int_to_uuid`** + +These two functions allow users to convert between UUID and integer, significantly helping in scenarios that require handling UUID data. [#33005](https://github.com/apache/doris/pull/33005) + +**9. Add `bypass_workload_group` session variable to bypass query queue** + +The `bypass_workload_group` session variable allows certain queries to bypass the Workload Group queue and execute directly, which is useful for handling critical queries that require quick responses. [#33101](https://github.com/apache/doris/pull/33101) + +**10. Add strcmp function** + +The strcmp function compares two strings and returns their comparison result, simplifying text data processing. [#33272](https://github.com/apache/doris/pull/33272) + +**11. Support HLL functions `hll_from_base64` and `hll_to_base64`** + +HyperLogLog (HLL) is an algorithm for cardinality estimation. These two functions allow users to decode HLL data from a Base64-encoded string or encode HLL data as a Base64 string, which is very useful for storing and transmitting HLL data. [#32089](https://github.com/apache/doris/pull/32089) + +## Optimization and Improvements + +**1. Replace SipHash with XXHash to improve shuffle performance** + +Both SipHash and XXHash are hashing functions, but XXHash may provide faster hashing speeds and better performance in certain scenarios. 
This optimization aims to improve performance during data shuffling by adopting XXHash. [#32919](https://github.com/apache/doris/pull/32919) + +**2. Asynchronous materialized views support NULL partition columns in OLAP tables** + +This enhancement allows asynchronous materialized views to support NULL partition columns in OLAP tables, enhancing data processing flexibility.[#32698](https://github.com/apache/doris/pull/32698) + +**3. Limit maximum string length to 1024 when collecting column statistics to control BE memory usage** + +Limiting the string length when collecting column statistics prevents excessive data from consuming too much BE memory, helping maintain system stability and performance. [#32470](https://github.com/apache/doris/pull/32470) + +**4. Support dynamic deletion of Bitmap cache to improve performance** + +Dynamically deleting no longer needed Bitmap Cache can free up memory and improve system performance. [#32991](https://github.com/apache/doris/pull/32991) + +**5. Reduce memory usage during ALTER operations** + +Reducing memory usage during ALTER operations improves the efficiency of system resource utilization. [#33474](https://github.com/apache/doris/pull/33474) + +**6. Support constant folding for complex types** + +Supports constant folding for Array/Map/Struct complex types.[#32867](https://github.com/apache/doris/pull/32867) + +**7. Add support for Variant type in Aggregate Key Model** + +The Variant data type can store multiple data types. This optimization allows aggregation operations on Variant type data, enhancing the flexibility of semi-structured data analysis. [#33493](https://github.com/apache/doris/pull/33493) + +**8. Support new inverted index format in CCR** [#33415](https://github.com/apache/doris/pull/33415) + +**9. Optimize rewriting performance for nested materialized views** [#34127](https://github.com/apache/doris/pull/34127) + +**10. 
Support decimal256 type in row-based storage format** + +Supporting the decimal256 type in row-based storage extends the system's ability to handle high-precision numerical data. [#34887](https://github.com/apache/doris/pull/34887) + +## Behavioral Changes + +**1. Authorization** + +- **Grant_priv permission changes**: `Grant_priv` can no longer be arbitrarily granted. When performing a `GRANT` operation, the user not only needs to have `Grant_priv` but also the permissions to be granted. For example, to grant `SELECT` permission on `table1`, the user needs both `GRANT` permission and `SELECT` permission on `table1`, enhancing security and consistency in permission management. [#32825](https://github.com/apache/doris/pull/32825) + +- **Workload group and resource usage_priv**: `Usage_priv` for Workload Group and Resource is no longer global but limited to Resource and Workload Group, making permission granting and usage more specific. [#32907](https://github.com/apache/doris/pull/32907) + +- **Authorization for operations**: Operations that were previously unauthorized now have corresponding authorizations for more detailed and comprehensive operational permission control. [#33347](https://github.com/apache/doris/pull/33347) + +**2. LOG directory configuration** + +The log directory configuration for FE and BE now uniformly uses the `LOG_DIR` environment variable. All other different types of logs will be stored with `LOG_DIR` as the root directory. To maintain compatibility between versions, the previous configuration item `sys_log_dir` can still be used. [#32933](https://github.com/apache/doris/pull/32933) + +**3. S3 Table Function (TVF)** + +Due to issues with correctly recognizing or processing S3 URLs in certain cases, the parsing logic for object storage paths has been refactored. For file paths in S3 table functions, the `force_parsing_by_standard_uri` parameter needs to be passed to ensure correct parsing. 
[#33858](https://github.com/apache/doris/pull/33858) + +## Upgrade Issues + +Since many users use certain keywords as column names or attribute values, the following keywords have been set as non-reserved, allowing users to use them as identifiers. [#34613](https://github.com/apache/doris/pull/34613) + +## Bug Fixes + +**1. Fix no data error when reading Hive tables on Tencent Cloud COSN** + +Resolved the no data error that could occur when reading Hive tables on Tencent Cloud COSN, enhancing compatibility with Tencent Cloud storage services. + +**2. Fix incorrect results returned by `milliseconds_diff` function** + +Fixed an issue where the `milliseconds_diff` function returned incorrect results in some cases, ensuring the accuracy of time difference calculations. [#32897](https://github.com/apache/doris/pull/32897) + +**3. User-defined variables should be forwarded to the Master node** + +Ensured that user-defined variables are correctly passed to the Master node for consistency and correct execution logic across the entire system. [#33013](https://github.com/apache/doris/pull/33013) + +**4. Fix Schema Change issues when adding complex type columns** + +Resolved Schema Change issues that could arise when adding complex type columns, ensuring the correctness of Schema Changes. [#31824](https://github.com/apache/doris/pull/31824) + +**5. Fix data loss issue in Routine Load when FE Master node changes** + +`Routine Load` is often used to subscribe to Kafka message queues. This fix addresses potential data loss issues that may occur during FE Master node changes. [#33678](https://github.com/apache/doris/pull/33678) + +**6. Fix Routine Load failure when Workload Group cannot be found** + +Resolved an issue where `Routine Load` would fail if the specified Workload Group could not be found. [#33596](https://github.com/apache/doris/pull/33596) + +**7. 
Support column string64 to avoid join failures when string size overflows uint32** + +In some cases, string sizes may exceed the uint32 limit. Supporting the `string64` type ensures correct execution of string JOIN operations. [#33850](https://github.com/apache/doris/pull/33850) + +**8. Allow Hadoop users to create Paimon Catalog** + +Permitted authorized Hadoop users to create Paimon Catalogs.[#33833](https://github.com/apache/doris/pull/33833) + +**9. Fix `function_ipxx_cidr` function issues with constant parameters** + +Resolved problems with the `function_ipxx_cidr` function when handling constant parameters, ensuring the correctness of function execution.[#33968](https://github.com/apache/doris/pull/33968) + +**10. Fix file download errors when restoring using HDFS** + +Resolved "failed to download" errors encountered during data restoration using HDFS, ensuring the accuracy and reliability of data recovery. [#33303](https://github.com/apache/doris/issues/33303) + +**11. Fix column permission issues related to hidden columns** + +In some cases, permission settings for hidden columns may be incorrect. This fix ensures the correctness and security of column permission settings. [#34849](https://github.com/apache/doris/pull/34849) + +**12. 
Fix issue where Arrow Flight cannot obtain the correct IP in K8s deployments** + +This fix resolves an issue where Arrow Flight cannot correctly obtain the IP address in Kubernetes deployment environments.[#34850](https://github.com/apache/doris/pull/34850) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md new file mode 100644 index 0000000000000..521694ffa60fa --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md @@ -0,0 +1,289 @@ +--- +{ + "title": "Release 2.1.4", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.4 was officially released on June 26, 2024.** In this update, we have optimized various functional experiences for data lakehouse scenarios, with a focus on resolving the abnormal memory usage issue in the previous version. Additionally, we have implemented several improvements and bug fixes to enhance the stability. Welcome to download and use it. + + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + +## Behavior changes + +- Non-existent files will be ignored when querying external tables such as Hive. [#35319](https://github.com/apache/doris/pull/35319) + + The file list is obtained from the meta cache, and it may not be consistent with the actual file list. + + Ignoring non-existent files helps to avoid query errors. + +- By default, creating a Bitmap Index will no longer be automatically changed to an Inverted Index. [#35521](https://github.com/apache/doris/pull/35521) + + This behavior is controlled by the FE configuration item `enable_create_bitmap_index_as_inverted_index`, which defaults to false. + +- When starting FE and BE processes using `--console`, all logs will be output to the standard output and differentiated by prefixes indicating the log type. 
[#35679](https://github.com/apache/doris/pull/35679) + + For more information, please see the documentation: + + - [Log Management - FE Log](../admin-manual/log-management/fe-log.md) + + - [Log Management - BE Log](../admin-manual/log-management/be-log.md) + +- If no table comment is provided when creating a table, the default comment will be empty instead of using the table type as the default comment. [#36025](https://github.com/apache/doris/pull/36025) + +- The default precision of DECIMALV3 has been adjusted from (9, 0) to (38, 9) to maintain compatibility with the version in which this feature was initially released. [#36316](https://github.com/apache/doris/pull/36316) + +## New features + +### Query optimizer + +- Support FE flame graph tool + + For more information, see the [documentation](/community/developer-guide/fe-profiler.md) + +- Support `SELECT DISTINCT` to be used with aggregation. + +- Support single table query rewrite without `GROUP BY`. This is useful for complex filters or expressions. [#35242](https://github.com/apache/doris/pull/35242). + +- The new optimizer fully supports point query functionality [#36205](https://github.com/apache/doris/pull/36205). + +### Data Lakehouse + +- Support native reader of Apache Paimon deletion vector [#35241](https://github.com/apache/doris/pull/35241) + +- Support using Resource in Table Valued Functions [#35139](https://github.com/apache/doris/pull/35139) + +- Access controller with Hive Ranger plugin supports Data Mask + +### Asynchronous materialized views + +- Support refreshes triggered by internal table changes: if a materialized view uses an internal table and the data in that table changes, a refresh of the materialized view can be triggered by specifying REFRESH ON COMMIT when creating the materialized view. + +- Support transparent rewriting for single tables. For more information, see [Querying Async Materialized View](../query/view-materialized-view/query-async-materialized-view.md). 
+ +- Transparent rewriting supports aggregation roll-up for agg_state, agg_union types; materialized views can be defined as agg_state or agg_union, queries can use specific aggregation functions, or use agg_merge. For more information, see [AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md). + +### Others + +- Added function `replace_empty`. + + For more information, see [documentation](../sql-manual/sql-functions/string-functions/replace_empty). + +- Support `show storage policy using` statement. + + For more information, see [documentation](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md). + +- Support JVM metrics on the BE side. + + Set `enable_jvm_monitor=true` in `be.conf` to enable this feature. + +## Improvements + +- Supported creating inverted indexes for columns with Chinese names. [#36321](https://github.com/apache/doris/pull/36321) + +- Estimate memory consumed by segment cache more accurately so that unused memory can be released more quickly. [#35751](https://github.com/apache/doris/pull/35751) + +- Filter empty partitions before exporting tables to remote storage. [#35542](https://github.com/apache/doris/pull/35542) + +- Optimize routine load task allocation algorithm to balance the load among Backends. [#34778](https://github.com/apache/doris/pull/34778) + +- Provide hints when a related variable is not found during a set operation. [#35775](https://github.com/apache/doris/pull/35775) + +- Support placing Java UDF jar files in the FE's `custom_lib` directory for default loading. [#35984](https://github.com/apache/doris/pull/35984) + +- Add a timeout global variable `audit_plugin_load_timeout` for audit log load jobs. + +- Optimize the performance of transparent rewrite planning for asynchronous materialized views. + +- Optimize the `INSERT` operation so that the BE does not execute it when the source is empty. 
[#34418](https://github.com/apache/doris/pull/34418) + +- Support fetching file lists of Hive/Hudi tables in batches. In a scenario with 1.2 million files, the time taken to obtain the list of files has been reduced from 390 seconds to 46 seconds. [#35107](https://github.com/apache/doris/pull/35107) + +- Forbid dynamic partitioning when creating asynchronous materialized views. + +- Support detecting whether the partition data of external tables in Hive is synchronized with asynchronous materialized views. + +- Allow creating indexes on asynchronous materialized views. + +## Bug fixes + +### Query optimizer + +- Fixed the issue where SQL cache returns old results after truncating a partition. [#34698](https://github.com/apache/doris/pull/34698) + +- Fixed the issue where casting from JSON to other types did not correctly handle nullable attributes. [#34707](https://github.com/apache/doris/pull/34707) + +- Fixed occasional DATETIMEV2 literal simplification errors. [#35153](https://github.com/apache/doris/pull/35153) + +- Fixed the issue where `COUNT(*)` could not be used in window functions. [#35220](https://github.com/apache/doris/pull/35220) + +- Fixed the issue where nullable attributes could be incorrect when all `SELECT` statements under `UNION ALL` have no `FROM` clause. [#35074](https://github.com/apache/doris/pull/35074) + +- Fixed the issue where `bitmap in join` and subquery unnesting could not be used simultaneously. [#35435](https://github.com/apache/doris/pull/35435) + +- Fixed the performance issue where filter conditions could not be pushed down to the CTE producer in specific situations. [#35463](https://github.com/apache/doris/pull/35463) + +- Fixed the issue where aggregate combinators written in uppercase could not be found. [#35540](https://github.com/apache/doris/pull/35540) + +- Fixed the performance issue where window functions were not properly pruned by column pruning. 
[#35504](https://github.com/apache/doris/pull/35504) + +- Fixed the issue where queries might parse incorrectly leading to wrong results when multiple tables with the same name but in different databases appeared simultaneously in the query. [#35571](https://github.com/apache/doris/pull/35571) + +- Fixed the query error caused by generating runtime filters during schema table scans. [#35655](https://github.com/apache/doris/pull/35655) + +- Fixed the issue where nested correlated subqueries could not execute because the join condition was folded into a null literal. [#35811](https://github.com/apache/doris/pull/35811) + +- Fixed the occasional issue where decimal literals were set with incorrect precision during planning. [#36055](https://github.com/apache/doris/pull/36055) + +- Fixed the occasional issue where multiple layers of aggregation were merged incorrectly during planning. [#36145](https://github.com/apache/doris/pull/36145) + +- Fixed the occasional issue where the input-output mismatch error occurred after aggregate expansion planning. [#36207](https://github.com/apache/doris/pull/36207) + +- Fixed the occasional issue where `<=>` was incorrectly converted to `=`. [#36521](https://github.com/apache/doris/pull/36521) + +### Query execution + +- Fixed the issue where the query hangs if the limited rows are reached on the pipeline engine and memory is not released. [#35746](https://github.com/apache/doris/pull/35746) + +- Fixed the BE coredump when `enable_decimal256` is true but falls back to the old planner. [#35731](https://github.com/apache/doris/pull/35731) + +### Asynchronous materialized views + +- Fixed the issue in the asynchronous materialized view build where the store_row_column attribute specified was not being recognized by the core. + +- Fixed the problem in the asynchronous materialized view build where specifying the storage_medium was not taking effect. 
+ +- Resolved the error occurring in the asynchronous materialized view show partitions after the base table is deleted. + +- Fixed the issue where asynchronous materialized views caused backup and restore exceptions. [#35703](https://github.com/apache/doris/pull/35703) + +- Fixed the issue where partition rewrite could lead to incorrect results. [#35236](https://github.com/apache/doris/pull/35236) + +### Semi-structured + +- Fixed the core dump problem when a VARIANT with an empty key is used. [#35671](https://github.com/apache/doris/pull/35671) +- Bitmap and BloomFilter index should not perform light index changes. [#35225](https://github.com/apache/doris/pull/35225) + +### Primary key + +- Fixed the issue where an exception BE restart occurred in the case of partial column updates during import, which could result in duplicate keys. [#35678](https://github.com/apache/doris/pull/35678) + +- Fixed the issue where BE might core dump during clone operations when memory is tight. [#34702](https://github.com/apache/doris/pull/34702) + +### Data Lakehouse + +- Fixed the issue where a Hive table could not be created with a fully qualified name such as `ctl.db.tbl` [#34984](https://github.com/apache/doris/pull/34984) + +- Fixed the issue where the Hive metastore connection did not close when refreshing [#35426](https://github.com/apache/doris/pull/35426) + +- Fixed a potential meta replay issue when upgrading from 2.0.x to 2.1.x. [#35532](https://github.com/apache/doris/pull/35532) + +- Fixed the issue where the Table Valued Function could not read an empty snappy compressed file. 
[#34926](https://github.com/apache/doris/pull/34926) + +- Fixed the issue where Parquet files with invalid min-max column statistics could not be read [#35041](https://github.com/apache/doris/pull/35041) + +- Fixed the issue where pushdown predicates with null-aware functions could not be handled in the Parquet/ORC reader [#35335](https://github.com/apache/doris/pull/35335) + +- Fixed the issue with the order of partition columns when creating a Hive table [#35347](https://github.com/apache/doris/pull/35347) + +- Fixed the issue where writing to a Hive table on S3 failed when partition values contained spaces [#35645](https://github.com/apache/doris/pull/35645) + +- Fixed the issue with the incorrect scheme of the Aliyun OSS endpoint [#34907](https://github.com/apache/doris/pull/34907) + +- Fixed the issue where the Parquet format Hive table written by Doris could not be read by Hive [#34981](https://github.com/apache/doris/pull/34981) + +- Fixed the issue where ORC files could not be read after a schema change of a Hive table [#35583](https://github.com/apache/doris/pull/35583) + +- Fixed the issue where Paimon tables could not be read via JNI after a schema change of the Paimon table [#35309](https://github.com/apache/doris/pull/35309) + +- Fixed the issue where Row Groups in written Parquet format files were too small. 
[#36042](https://github.com/apache/doris/pull/36042) [#36143](https://github.com/apache/doris/pull/36143) + +- Fixed the issue where Paimon tables could not be read after schema changes [#36049](https://github.com/apache/doris/pull/36049) + +- Fixed the issue where Hive Parquet format tables could not be read after schema changes [#36182](https://github.com/apache/doris/pull/36182) + +- Fixed the FE OOM issue caused by Hadoop FS cache [#36403](https://github.com/apache/doris/pull/36403) + +- Fixed the issue where FE could not start after enabling the Hive Metastore Listener [#36533](https://github.com/apache/doris/pull/36533) + +- Fixed the issue of query performance degradation with a large number of files [#36431](https://github.com/apache/doris/pull/36431) + +- Fixed the timezone issue when reading the timestamp column type in Iceberg [#36435](https://github.com/apache/doris/pull/36435) + +- Fixed DATETIME conversion error and data path error on Iceberg Table. [#35708](https://github.com/apache/doris/pull/35708) + +- Support retaining and passing the additional user-defined properties of Table Valued Functions to the S3 SDK. [#35515](https://github.com/apache/doris/pull/35515) + + +### Data import + +- Fixed the issue where `CANCEL LOAD` did not work [#35352](https://github.com/apache/doris/pull/35352) + +- Fixed the issue where a null pointer error in the Publish phase of load transactions prevented the load from completing [#35977](https://github.com/apache/doris/pull/35977) + +- Fixed the issue with bRPC serializing large data files when sent via HTTP [#36169](https://github.com/apache/doris/pull/36169) + +### Data management + +- Fixed the issue where the resource tag in ConnectionContext was not set after forwarding DDL or DML to master FE. 
[#35618](https://github.com/apache/doris/pull/35618) + +- Fixed the issue where the restored table name was incorrect when `lower_case_table_names` was enabled [#35508](https://github.com/apache/doris/pull/35508) + +- Fixed the issue where `admin clean trash` could not work [#35271](https://github.com/apache/doris/pull/35271) + +- Fixed the issue where a storage policy could not be deleted from a partition [#35874](https://github.com/apache/doris/pull/35874) + +- Fixed the issue of data loss when importing into a multi-replica automatic partition table [#36586](https://github.com/apache/doris/pull/36586) + +- Fixed the issue where the partition column of a table changed when querying or inserting into an automatic partition table using the old optimizer [#36514](https://github.com/apache/doris/pull/36514) + +### Memory management + +- Fixed the issue of frequent errors in the logs due to failure in obtaining Cgroup meminfo. [#35425](https://github.com/apache/doris/pull/35425) + +- Fixed the issue where the Segment cache size was uncontrolled when using BloomFilter, leading to abnormal process memory growth. [#34871](https://github.com/apache/doris/pull/34871) + +### Permissions + +- Fixed the issue where permission settings were ineffective after enabling case-insensitive table names. [#36557](https://github.com/apache/doris/pull/36557) + +- Fixed the issue where setting LDAP passwords through non-Master FE nodes did not take effect. [#36598](https://github.com/apache/doris/pull/36598) + +- Fixed the issue where authorization could not be checked for the `SELECT COUNT(*)` statement. [#35465](https://github.com/apache/doris/pull/35465) + +### Others + +- Fixed the issue where the client JDBC program could not close the connection if the MySQL connection was broken. [#36616](https://github.com/apache/doris/pull/36616) + +- Fixed MySQL protocol compatibility issue with the `SHOW PROCEDURE STATUS` statement. 
[#35350](https://github.com/apache/doris/pull/35350) + +- The `libevent` now forces Keepalive to solve the issue of connection leaks in certain situations. [#36088](https://github.com/apache/doris/pull/36088) + +## Credits + +Thanks to every one who contributes to this release. + +@airborne12, @amorynan, @AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @caiconghui, @CalvinKirs, @cambyzju, @catpineapple, @cjj2010, @csun5285, @DarvenDuan, @dataroaring, @deardeng, @Doris-Extras, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @Gabriel39, @gavinchou, @GoGoWen, @HappenLee, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @lide-reed, @luennng, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @nextdreamblue, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @TangSiyang2001, @Thearas, @Vallishp, @w41ter, @wangbo, @whutpencil, @wsjz, @wuwenchi, @xiaokang, @xiedeyantu, @XieJiann, @xinyiZzz, @XuPengfei-1020, @xy720, @xzj7019, @yiguolei, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zfr9527, @zgxme, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md new file mode 100644 index 0000000000000..7c1910eeae8c5 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md @@ -0,0 +1,395 @@ +--- +{ + "title": "Release 2.1.5", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.5 was officially released on July 24, 2024.** In this update, we have optimized various functional experiences for data lakehouse and high concurrency scenarios, functionalities of asynchronous materialized views. 
Additionally, we have implemented several improvements and bug fixes to enhance the stability. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The default connection pool size for the JDBC Catalog has been increased from 10 to 30 to prevent connection exhaustion in high-concurrency scenarios. [#37023](https://github.com/apache/doris/pull/37023). + +- The system's reserved memory (low water mark) has been adjusted to `min(6.4GB, MemTotal * 5%)` to mitigate BE OOM issues. + +- When processing multiple statements in a single request, only the last statement's result is returned if the `CLIENT_MULTI_STATEMENTS` flag is not set. + +- Direct modifications to data in asynchronous materialized views are no longer permitted.[#37129](https://github.com/apache/doris/pull/37129) + +- A session variable `use_max_length_of_varchar_in_ctas` has been added to control the behavior of varchar and char type length generation during CTAS (Create Table As Select). The default value is true. When set to false, the derived varchar length is used instead of the maximum length. [#37284](https://github.com/apache/doris/pull/37284) + +- Statistics collection now defaults to enabling the functionality of estimating the number of rows in Hive tables based on file size. [#37694](https://github.com/apache/doris/pull/37694) + +- Transparent rewrite for asynchronous materialized views is now enabled by default. [#35897](https://github.com/apache/doris/pull/35897) + +- Transparent rewrite utilizes partitioned materialized views. If partitions fail, the base tables are unioned with the materialized view to ensure data correctness. [#35897](https://github.com/apache/doris/pull/35897) + +## New features + +### Lakehouse + +- The session variable `read_csv_empty_line_as_null` can be used to control whether empty lines are ignored when reading CSV format files. 
[#37153](https://github.com/apache/doris/pull/37153) + + By default, empty lines are ignored. When set to true, empty lines will be read as rows where all columns are null. + +- Compatibility with Presto's complex type output format can be enabled by setting `serde_dialect="presto"`. [#37253](https://github.com/apache/doris/pull/37253) + +### Multi-Table Materialized View + +- Supports non-deterministic functions in materialized view building. [#37651](https://github.com/apache/doris/pull/37651) + +- Atomically replaces definitions of asynchronous materialized views. [#37147](https://github.com/apache/doris/pull/37147) + +- Views creation statements can be viewed via `SHOW CREATE MATERIALIZED VIEW`. [#37125](https://github.com/apache/doris/pull/37125) + +- Transparent rewrites for multi-dimensional aggregation and non-aggregate queries. [#37436](https://github.com/apache/doris/pull/37436) [#37497](https://github.com/apache/doris/pull/37497) + +- Supports DISTINCT aggregations with key columns and partitioning for roll-ups. [#37651](https://github.com/apache/doris/pull/37651) + +- Support for partitioning materialized views to roll up partitions using `date_trunc` [#31812](https://github.com/apache/doris/pull/31812) [#35562](https://github.com/apache/doris/pull/35562) + +- Partitioned table-valued functions (TVFs) are supported. [#36479](https://github.com/apache/doris/pull/36479) + +### Semi-Structured Data Management + +- Tables using the VARIANT type now support partial column updates. [#34925](https://github.com/apache/doris/pull/34925) + +- PreparedStatement support is now enabled by default. [#36581](https://github.com/apache/doris/pull/36581) + +- The VARIANT type can be exported to CSV format. [#37857](https://github.com/apache/doris/pull/37857) + +- `explode_json_object` function transposes JSON Object rows into columns. 
[#36887](https://github.com/apache/doris/pull/36887) + +- The ES Catalog now maps ES NESTED or OBJECT types to the Doris JSON type.[#37101](https://github.com/apache/doris/pull/37101) + +- By default, support_phrase is enabled for inverted indexes with specified analyzers to improve the performance of match_phrase series queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Query Optimizer + +- Support for explaining `DELETE FROM` statements. [#37100](https://github.com/apache/doris/pull/37100) + +- Support for hint form of constant expression parameters [#37988](https://github.com/apache/doris/pull/37988) + +### Memory Management + +- Added an HTTP API to clear the cache. [#36599](https://github.com/apache/doris/pull/36599) + +### Permissions + +- Support for authorization of resources within Table-Valued Functions (TVFs) [#37132](https://github.com/apache/doris/pull/37132) + +## Improvements + +### Lakehouse + +- Upgraded Paimon to version 0.8.1 + +- Fixes ClassNotFoundException for org.apache.commons.lang.StringUtils when querying Paimon tables. [#37512](https://github.com/apache/doris/pull/37512) + +- Added support for Tencent Cloud LakeFS. [#36891](https://github.com/apache/doris/pull/36891) + +- Optimized the timeout duration when fetching file lists for external table queries. [#36842](https://github.com/apache/doris/pull/36842) + +- Configurable via the session variable `fetch_splits_max_wait_time_ms`. + +- Improved default connection logic for SQLServer JDBC Catalog. [#36971](https://github.com/apache/doris/pull/36971) + + By default, the connection encryption settings are not intervened. Only when `force_sqlserver_jdbc_encrypt_false` is set to true, encrypt=false is forcibly added to the JDBC URL to reduce authentication errors. This allows for more flexible control over encryption behavior, enabling it to be turned on or off as needed. + +- Added serde properties to the show create table statements for Hive tables. 
[#37096](https://github.com/apache/doris/pull/37096) + +- Changed the default cache time for Hive table lists on the FE from 1 day to 4 hours + +- Data export (Export/Outfile) now supports specifying compression formats for Parquet and ORC + + For more information, please refer to [docs](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type). + +- When creating a table using CTAS+TVF, partition columns in the TVF are automatically mapped to Varchar(65533) instead of String, allowing them to be used as partition columns for internal tables [#37161](https://github.com/apache/doris/pull/37161) + +- Optimized the number of metadata accesses for Hive write operations [#37127](https://github.com/apache/doris/pull/37127) + +- ES Catalog now supports mapping nested/object types to Doris's Json type. [#37182](https://github.com/apache/doris/pull/37182) + +- Improved error messages when connecting to Oracle using older versions of the ojdbc driver [#37634](https://github.com/apache/doris/pull/37634) + +- When Hudi tables return an empty set during Incremental Read, Doris now also returns an empty set instead of an error [#37636](https://github.com/apache/doris/pull/37636) + +- Fixed an issue where inner-outer table join queries could lead to FE timeouts in some cases [#37757](https://github.com/apache/doris/pull/37757) + +- Fixed an issue with FE metadata replay errors during upgrades from older versions to newer versions when the Hive metastore event listener is enabled. [#37757](https://github.com/apache/doris/pull/37757) + +### Multi-Table Materialized View + +- Automate key column selection for asynchronous materialized views. [#36601](https://github.com/apache/doris/pull/36601) + +- Support date_trunc in materialized view partition definitions. [#35562](https://github.com/apache/doris/pull/35562) + +- Enable transparent rewrites across nested materialized view aggregations.
[#37651](https://github.com/apache/doris/pull/37651) + +- Asynchronous materialized views remain available when schema changes do not affect the correctness of their data. [#37122](https://github.com/apache/doris/pull/37122) + +- Improve planning speed for transparent rewrites. [#37935](https://github.com/apache/doris/pull/37935) + +- When calculating the availability of asynchronous materialized views, the current refresh status is no longer taken into account. [#36617](https://github.com/apache/doris/pull/36617) + +### Semi-Structured Data Management + +- Optimize DESC performance for viewing VARIANT sub-columns through sampling. [#37217](https://github.com/apache/doris/pull/37217) + +- Support for special JSON data with empty keys in the JSON type. [#36762](https://github.com/apache/doris/pull/36762) + +### Inverted Index + +- Reduce latency by minimizing the invocation of inverted index exists to avoid delays in accessing object storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Optimize the overhead of the inverted index query process. [#35357](https://github.com/apache/doris/pull/35357) + +- Prevent inverted indices in materialized views. [#36869](https://github.com/apache/doris/pull/36869) + +### Query Optimizer + +- When both sides of a comparison expression are literals, the string literal will attempt to convert to the type of the other side. [#36921](https://github.com/apache/doris/pull/36921) + +- Refactored the sub-path pushdown functionality for the variant type, now better supporting complex pushdown scenarios. [#36923](https://github.com/apache/doris/pull/36923) + +- Optimized the logic for calculating the cost of materialized views, enabling more accurate selection of lower-cost materialized views. [#37098](https://github.com/apache/doris/pull/37098) + +- Improved the SQL cache planning speed when using user variables in SQL. 
[#37119](https://github.com/apache/doris/pull/37119) + +- Optimized the row estimation logic for NOT NULL expressions, resulting in better performance when NOT NULL is present in queries. [#37498](https://github.com/apache/doris/pull/37498) + +- Optimized the null rejection derivation logic for LIKE expressions. [#37864](https://github.com/apache/doris/pull/37864) + +- Improved error messages when querying a specific partition fails, making it clearer which table is causing the issue. [#37280](https://github.com/apache/doris/pull/37280) + +### Query Execution + +- Improved the performance of the bitmap_union operator up to 3 times in certain scenarios. + +- Enhanced the reading performance of Arrow Flight in ARM environments. + +- Optimized the execution performance of the explode, explode_map, and explode_json functions. + +### Data Loading + +- Support setting `max_filter_ratio` for `INSERT INTO ... FROM TABLE VALUE FUNCTION` + +## Bug fixes + +### Lakehouse + +- Fixed an issue that caused BE crashes in some cases when querying Parquet format [#37086](https://github.com/apache/doris/pull/37086) + +- Fixed an issue where BE printed excessive logs when querying Parquet format. [#37012](https://github.com/apache/doris/pull/37012) + +- Fixed an issue where the FE side created a large number of duplicate FileSystem objects in some cases. [#37142](https://github.com/apache/doris/pull/37142) + +- Fixed an issue where transaction information was not cleaned up after writing to Hive in some cases. [#37172](https://github.com/apache/doris/pull/37172) + +- Fixed a thread leak issue caused by Hive table write operations in some cases. [#37247](https://github.com/apache/doris/pull/37247) + +- Fixed an issue where Hive Text format row and column delimiters could not be correctly obtained in some cases. [#37188](https://github.com/apache/doris/pull/37188) + +- Fixed a concurrency issue when reading lz4 compressed blocks in some cases. 
[#37187](https://github.com/apache/doris/pull/37187) + +- Fixed an issue where `count(*)` on Iceberg tables returned incorrect results in some cases. [#37810](https://github.com/apache/doris/pull/37810) + +- Fixed an issue where creating a Paimon catalog based on MinIO caused FE metadata replay errors in some cases. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed an issue where using Ranger to create a catalog caused the client to hang in some cases. [#37551](https://github.com/apache/doris/pull/37551) + +### Multi-Table Materialized View + +- Fixed an issue where adding new partitions to the base table could lead to incorrect results after partition aggregation roll-up rewrites. [#37651](https://github.com/apache/doris/pull/37651) + +- Fixed an issue where the materialized view partition status was not set to out-of-sync after deleting associated base table partitions. [#36602](https://github.com/apache/doris/pull/36602) + +- Fixed an occasional deadlock issue during asynchronous materialized view builds. [#37133](https://github.com/apache/doris/pull/37133) + +- Fixed an occasional "nereids cost too much time" error when refreshing a large number of partitions in a single asynchronous materialized view refresh. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed an issue where an asynchronous materialized view could not be created if the final select list contained a null literal. [#37281](https://github.com/apache/doris/pull/37281) + +- Fixed an issue with single-table materialized views where, even though the aggregation materialized view was successfully rewritten, the CBO did not select it. [#35721](https://github.com/apache/doris/pull/35721) [#36058](https://github.com/apache/doris/pull/36058) + +- Fixed an issue where partition derivation failed when building a partitioned materialized view with both join inputs being aggregations. 
[#34781](https://github.com/apache/doris/pull/34781) + +### Semi-Structured Data Management + +- Fixed issues with VARIANT in special cases such as concurrency and abnormal data.[#37976](https://github.com/apache/doris/pull/37976) [#37839](https://github.com/apache/doris/pull/37839) [#37794](https://github.com/apache/doris/pull/37794) [#37674](https://github.com/apache/doris/pull/37674) [#36997](https://github.com/apache/doris/pull/36997) + +- Fixed coredump issues when using VARIANT in unsupported SQL. [#37640](https://github.com/apache/doris/pull/37640) + +- Fixed coredump issues related to MAP data type when upgrading from 1.x to 2.x or higher versions. [#36937](https://github.com/apache/doris/pull/36937) + +- Improved ES Catalog support for Array types. [#36936](https://github.com/apache/doris/pull/36936) + +### Inverted Index + +- Fixed an issue where DROP INDEX for Inverted Index v2 did not delete metadata. [#37646](https://github.com/apache/doris/pull/37646) + +- Fixed query accuracy issues when string length exceeded the "ignore above" threshold. [#37679](https://github.com/apache/doris/pull/37679) + +- Fixed issues with index size statistics. [#37232](https://github.com/apache/doris/pull/37232) [#37564](https://github.com/apache/doris/pull/37564) + +### Query Optimizer + +- Fixed an issue that prevented import operations from executing due to the use of reserved keywords. [#35938](https://github.com/apache/doris/pull/35938) + +- Fixed a type error where char(255) was incorrectly recorded as char(1) when creating a table. [#37671](https://github.com/apache/doris/pull/37671) + +- Fixed incorrect results when the join expression in a correlated subquery was a complex expression. [#37683](https://github.com/apache/doris/pull/37683) + +- Fixed a potential issue with incorrect bucket pruning for decimal types. 
[#38013](https://github.com/apache/doris/pull/38013) + +- Fixed incorrect aggregation operator results when pipeline local shuffle was enabled in certain scenarios. [#38016](https://github.com/apache/doris/pull/38016) + +- Fixed planning errors that could occur when equal expressions existed in aggregation operators. [#36622](https://github.com/apache/doris/pull/36622) + +- Fixed planning errors that could occur when lambda expressions were present in aggregation operators. [#37285](https://github.com/apache/doris/pull/37285) + +- Fixed an issue where a literal generated from a window function being optimized to a literal had the wrong type, preventing execution. [#37283](https://github.com/apache/doris/pull/37283) + +- Fixed an issue with the null attribute being incorrectly output by the aggregate function foreach combinator. [#37980](https://github.com/apache/doris/pull/37980) + +- Fixed an issue where the acos function could not be planned when its parameter was a literal out of range. [#37996](https://github.com/apache/doris/pull/37996) + +- Fixed planning errors when specifying partitions for a query on a synchronized materialized view. [#36982](https://github.com/apache/doris/pull/36982) + +- Fixed occasional Null Pointer Exceptions (NPEs) during planning. [#38024](https://github.com/apache/doris/pull/38024) + +### Query Execution + +- Fixed an error in delete where statements when using decimal data types as conditions. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed an issue where BE memory was not released after query execution ended. [#37792](https://github.com/apache/doris/pull/37792) [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed a problem where audit logs occupied too much FE memory under high QPS scenarios. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed BE core dumps when the sleep function received illegal input values. 
[#37681](https://github.com/apache/doris/pull/37681) + +- Fixed an error encountered during sync filter size execution. [#37103](https://github.com/apache/doris/pull/37103) + +- Fixed incorrect results when using time zones during execution. [#37062](https://github.com/apache/doris/pull/37062) + +- Fixed incorrect results when casting strings to integers. [#36788](https://github.com/apache/doris/pull/36788) + +- Fixed query errors when using the Arrow Flight protocol with pipelinex enabled. [#35804](https://github.com/apache/doris/pull/35804) + +- Fixed errors when casting strings to dates/datetimes. [#35637](https://github.com/apache/doris/pull/35637) + +- Fixed BE core dumps during large table join queries using <=>. [#36263](https://github.com/apache/doris/pull/36263) + +### Storage Management + +- Fixed the issue of invisible DELETE SIGN data encountered during column update and write operations. [#36755](https://github.com/apache/doris/pull/36755) + +- Optimized FE's memory usage during schema changes. [#36756](https://github.com/apache/doris/pull/36756) + +- Fixed the issue where BE would hang during restart due to transactions not being aborted [#36437](https://github.com/apache/doris/pull/36437) + +- Fixed occasional errors when changing from NOT NULL to NULL data types. [#36389](https://github.com/apache/doris/pull/36389) + +- Optimized replica repair scheduling when BE goes down. [#36897](https://github.com/apache/doris/pull/36897) + +- Supported round-robin disk selection for tablet creation on a single BE. [#36900](https://github.com/apache/doris/pull/36900) + +- Fixed query error -230 caused by slow publishing. [#36222](https://github.com/apache/doris/pull/36222) + +- Improved the speed of partition balancing. [#36976](https://github.com/apache/doris/pull/36976) + +- Controlled segment cache using the number of file descriptors (FDs) and memory to avoid FD exhaustion. 
[#37035](https://github.com/apache/doris/pull/37035) + +- Fixed potential replica loss caused by concurrent clone and alter operations [#36858](https://github.com/apache/doris/pull/36858) + +- Fixed the issue of not being able to adjust column order. [#37226](https://github.com/apache/doris/pull/37226) + +- Prohibited certain schema change operations on auto-increment columns. [#37331](https://github.com/apache/doris/pull/37331) + +- Fixed inaccurate error reporting for DELETE operations. [#37374](https://github.com/apache/doris/pull/37374) + +- Adjusted the trash expiration time on BE side to one day. [#37409](https://github.com/apache/doris/pull/37409) + +- Optimized compaction memory usage and scheduling. [#37491](https://github.com/apache/doris/pull/37491) + +- Checked for potential oversized backups causing FE restarts. [#37466](https://github.com/apache/doris/pull/37466) + +- Restored dynamic partition deletion policies and cross-partition behaviors to 2.1.3. [#37570](https://github.com/apache/doris/pull/37570) [#37506](https://github.com/apache/doris/pull/37506) + +- Fixed errors related to decimal types in DELETE predicates. [#37710](https://github.com/apache/doris/pull/37710) + +### Data Loading + +- Fixed data invisibility issues caused by race conditions in error handling during imports [#36744](https://github.com/apache/doris/pull/36744) + +- Added support for hll_from_base64 in streamload imports. [#36819](https://github.com/apache/doris/pull/36819) + +- Fixed potential FE OOM issues when importing very large numbers of tablets for a single table. [#36944](https://github.com/apache/doris/pull/36944) + +- Fixed possible auto-increment column duplication during FE master-slave switchovers. [#36961](https://github.com/apache/doris/pull/36961) + +- Fixed errors when inserting into select with auto-increment columns. [#37029](https://github.com/apache/doris/pull/37029) + +- Reduced the number of data flush threads to optimize memory usage.
[#37092](https://github.com/apache/doris/pull/37092) + +- Improved automatic recovery and error messaging for routine load tasks. [#37371](https://github.com/apache/doris/pull/37371) + +- Increased the default batch size for routine load. [#37388](https://github.com/apache/doris/pull/37388) + +- Fixed routine load task stoppage due to Kafka EOF expiration. [#37983](https://github.com/apache/doris/pull/37983) + +- Fixed coredump issues in multi-table streaming. [#37370](https://github.com/apache/doris/pull/37370) + +- Fixed premature backpressure caused by inaccurate memory estimation in groupcommit. [#37379](https://github.com/apache/doris/pull/37379) + +- Optimized BE-side thread usage in groupcommit. [#37380](https://github.com/apache/doris/pull/37380) + +- Fixed the issue of no error URL when data was not partitioned. [#37401](https://github.com/apache/doris/pull/37401) + +- Fixed potential memory misoperations during imports. [#38021](https://github.com/apache/doris/pull/38021) + +### Merge on Write Unique Key + +- Reduced memory usage during compaction for primary key tables. [#36968](https://github.com/apache/doris/pull/36968) + +- Fixed potential duplicate data issues when primary key replica cloning fails. [#37229](https://github.com/apache/doris/pull/37229) + +### Permissions + +- Fixed the issue of missing authorization when a table-valued function references a resource. [#37132](https://github.com/apache/doris/pull/37132) + +- Fixed the issue where the SHOW ROLE statement did not include workload group permissions. [#36032](https://github.com/apache/doris/pull/36032) + +- Fixed the issue where executing two statements simultaneously when creating a row policy could cause FE to fail to restart. [#37342](https://github.com/apache/doris/pull/37342) + +- Fixed the issue where, in some cases, upgrading from an older version could result in FE metadata replay failures due to row policies. 
[#37342](https://github.com/apache/doris/pull/37342) + +### Others + +- Fixed the issue of compute nodes participating in internal table creation. [#37961](https://github.com/apache/doris/pull/37961) + +- Fixed the read lag issue when `enable_strong_read_consistency` is set to true. [#37641](https://github.com/apache/doris/pull/37641) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md new file mode 100644 index 0000000000000..c14d25b52573f --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md @@ -0,0 +1,524 @@ +--- +{ + "title": "Release 2.1.6", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.6 was officially released on September 10, 2024.** This version brings continuous upgrades and improvements to the Lakehouse, Async Materialized Views, and Semi-Structured Data Management. Additionally, several fixes have been implemented in areas such as the query optimizer, execution engine, storage management, and permission management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- Removed the `delete_if_exists` option from create repository. [#38192](https://github.com/apache/doris/pull/38192) + +- Added the `enable_prepared_stmt_audit_log` session variable to control whether JDBC prepared statements record audit logs, with the default being no recording. [#38624](https://github.com/apache/doris/pull/38624) [#39009](https://github.com/apache/doris/pull/39009) + +- Implemented fd limit and memory constraints for segment cache. [#39689](https://github.com/apache/doris/pull/39689) + +- When the FE configuration item `sys_log_mode` is set to BRIEF, file location information is added to the logs.
[#39571](https://github.com/apache/doris/pull/39571) + +- Changed the default value of the session variable `max_allowed_packet` to 16MB. [#38697](https://github.com/apache/doris/pull/38697) + +- When a single request contains multiple statements, semicolons must be used to separate them. [#38670](https://github.com/apache/doris/pull/38670) + +- Added support for statements to begin with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Aligned type formatting with MySQL in statements such as `show create table`. [#38012](https://github.com/apache/doris/pull/38012) + +- When the new optimizer planning times out, it no longer falls back to prevent the old optimizer from using longer planning times. [#39499](https://github.com/apache/doris/pull/39499) + +## New features + +### Lakehouse + +- Supported writeback for Iceberg tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/lakehouse/datalake-building/iceberg-build). + +- SQL interception rules now support external tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Added the system table `file_cache_statistics` to view BE data cache metrics. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics). + +### Async Materialized View + +- Supported transparent rewriting during inserts. [#38115](https://github.com/apache/doris/pull/38115) + +- Supported transparent rewriting when variant types exist in queries.[ #37929](https://github.com/apache/doris/pull/37929) + +### Semi-Structured Data Management + +- Supported casting ARRAY MAP to JSON type.[ #36548](https://github.com/apache/doris/pull/36548) + +- Supported the `json_keys` function.[ #36411](https://github.com/apache/doris/pull/36411) + +- Supported specifying the JSON path $. when importing JSON. 
[#38213](https://github.com/apache/doris/pull/38213) + +- ARRAY, MAP, STRUCT types now support `replace_if_not_null`[#38304](https://github.com/apache/doris/pull/38304) + +- ARRAY, MAP, STRUCT types now support adjusting column order.[#39210](https://github.com/apache/doris/pull/39210) + +- Added the `multi_match` function to match keywords across multiple fields, with support for inverted index acceleration. [#37722](https://github.com/apache/doris/pull/37722) + +### Query Optimizer + +- Filled in the original database name, table name, column name, and alias for returned columns in the MySQL protocol. [ #38126](https://github.com/apache/doris/pull/38126) + +- Supported the aggregation function `group_concat` with both order by and distinct simultaneously. [#38080](https://github.com/apache/doris/pull/38080) + +- SQL cache now supports reusing cached results for queries with different comments. [#40049](https://github.com/apache/doris/pull/40049) + +- In partition pruning, supported including `date_trunc` and date functions in filter conditions. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- Allowed using the database name where the table resides as a qualifier prefix for table aliases. [#38640](https://github.com/apache/doris/pull/38640) + +- Supported hint-style comments.[#39113](https://github.com/apache/doris/pull/39113) + +### Others + +- Added the system table `table_properties` for viewing table properties. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_properties). + +- Introduced deadlock and slow lock detection in FE. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/maint-monitor/frontend-lock-manager). + +## Improvements + +### Lakehouse + +- Reimplemented the external table metadata caching mechanism. 
+ + - For details, refer to the [documentation](https://doris.apache.org/docs/lakehouse/metacache). + +- Added the session variable `keep_carriage_return` with a default value of false. By default, reading Hive Text format tables treats both `\r\n` and `\n` as newline characters. [#38099](https://github.com/apache/doris/pull/38099) + +- Optimized memory statistics for Parquet/ORC file read/write operations.[#37257](https://github.com/apache/doris/pull/37257) + +- Supported pushing down IN/NOT IN predicates for Paimon tables. [#38390](https://github.com/apache/doris/pull/38390) + +- Enhanced the optimizer to support Time Travel syntax for Hudi tables. [#38591](https://github.com/apache/doris/pull/38591) + +- Optimized Kerberos authentication-related processes. [ #37301](https://github.com/apache/doris/pull/37301) + +- Enabled reading Hive tables after renaming column operations. [#38809](https://github.com/apache/doris/pull/38809) + +- Optimized the reading performance of partition columns for external tables. [#38810](https://github.com/apache/doris/pull/38810) + +- Improved the data shard merging strategy during external table query planning to avoid performance degradation caused by a large number of small shards.[#38964](https://github.com/apache/doris/pull/38964) + +- Added attributes such as location to `SHOW CREATE DATABASE/TABLE`. [#39644](https://github.com/apache/doris/pull/39644) + +- Supported complex types in MaxCompute Catalog. [#39822](https://github.com/apache/doris/pull/39822) + +- Optimized the file cache loading strategy by using asynchronous loading to avoid long BE startup times. [#39036](https://github.com/apache/doris/pull/39036) + +- Improved the file cache eviction strategy, such as evicting locks held for extended periods. [#39721](https://github.com/apache/doris/pull/39721) + +### Async Materialized View + +- Supported hourly, weekly, and quarterly partition roll-up construction. 
[#37678](https://github.com/apache/doris/pull/37678) + +- For materialized views based on Hive external tables, the metadata cache is now updated before refresh to ensure the latest data is obtained during each refresh. [#38212](https://github.com/apache/doris/pull/38212) + +- Improved the performance of transparent rewrite planning in storage-compute decoupled mode by batch fetching metadata. [#39301](https://github.com/apache/doris/pull/39301) + +- Enhanced the performance of transparent rewrite planning by prohibiting duplicate enumerations. [#39541](https://github.com/apache/doris/pull/39541) + +- Improved the performance of transparent rewrite for refreshing materialized views based on Hive external table partitions.[#38525](https://github.com/apache/doris/pull/38525) + +### Semi-Structured Data Management + +- Optimized memory allocation for TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- Enhanced the performance of string processing in inverted indexes.[#37395](https://github.com/apache/doris/pull/37395) + +- Optimized the performance of inverted indexes in MOW tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Supported specifying the row-store `page_size` during table creation to control compression effectiveness. [#37145](https://github.com/apache/doris/pull/37145) + +### Query Optimizer + +- Adjusted the row count estimation algorithm for mark joins, resulting in more accurate cardinality estimates for mark joins. [#38270](https://github.com/apache/doris/pull/38270) + +- Optimized the cost estimation algorithm for semi/anti joins, enabling more accurate selection of semi/anti join orders. [#37951](https://github.com/apache/doris/pull/37951) + +- Adjusted the filter estimation algorithm for cases where some columns have no statistical information, leading to more accurate cardinality estimates. 
[#39592](https://github.com/apache/doris/pull/39592) + +- Modified the instance calculation logic for set operation operators to prevent insufficient parallelism in extreme cases. [#39999](https://github.com/apache/doris/pull/39999) + +- Adjusted the usage strategy of bucket shuffle, achieving better performance when data is not sufficiently shuffled. [#36784](https://github.com/apache/doris/pull/36784) + +- Enabled early filtering of window function data, supporting multiple window functions in a single projection. [#38393](https://github.com/apache/doris/pull/38393) + +- When a `NullLiteral` exists in a filter condition, it can now be folded into false, further converted to an `EmptySet` to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Expanded the scope of predicate derivation, reducing data scanning in queries with specific patterns. [#37314](https://github.com/apache/doris/pull/37314) + +- Supported partial short-circuit evaluation logic in partition pruning to improve partition pruning performance, achieving over 100% improvement in specific scenarios. [#38191](https://github.com/apache/doris/pull/38191) + +- Enabled the computation of arbitrary scalar functions within user variables. [#39144](https://github.com/apache/doris/pull/39144) + +- Maintained error messages consistent with MySQL when alias conflicts exist in queries. [#38104](https://github.com/apache/doris/pull/38104) + +### Query Execution + +- Adapted AggState for compatibility from 2.1 to 3.x and fixed coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle when no joins are involved. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to an asynchronous approach to prevent blocking during internal table queries. 
[#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process when building hash tables in Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Reduced the lock holding time for MultiCast operations. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC's keepAliveTime and added a connection monitoring mechanism, reducing the probability of query failures due to RPC errors during query execution. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits are exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Improved the performance of `aes_encrypt`/`decrypt` functions when handling constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the performance of `json_extract` functions when processing constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the performance of ParseURL functions when processing constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Backup Recovery / CCR + +- Restore now supports deleting redundant tablets and partition options. [#39363](https://github.com/apache/doris/pull/39363) + +- Check storage connectivity when creating a repository. [#39538](https://github.com/apache/doris/pull/39538) + +- Enables binlog to support `DROP TABLE`, allowing CCR to incrementally synchronize `DROP TABLE` operations. [#38541](https://github.com/apache/doris/pull/38541) + +### Compaction + +- Fixes the issue where high-priority compaction tasks were not subject to task concurrency control limits. [#38189](https://github.com/apache/doris/pull/38189) + +- Automatically reduces compaction memory consumption based on data characteristics. [#37486](https://github.com/apache/doris/pull/37486) + +- Fixes an issue where the sequential data optimization strategy could lead to incorrect data in aggregate tables or MOR UNIQUE tables.
[ #38299](https://github.com/apache/doris/pull/38299) + +- Optimizes the rowset selection strategy during compaction during replica replenishment to avoid triggering -235 errors. [#39262](https://github.com/apache/doris/pull/39262) + +### MOW (Merge-On-Write) + +- Optimizes slow column updates caused by concurrent column updates and compactions. [#38682](https://github.com/apache/doris/pull/38682) + +- Fixes an issue where segcompaction during bulk data imports could lead to incorrect MOW data. [#38992](https://github.com/apache/doris/pull/38992) [#39707](https://github.com/apache/doris/pull/39707) + +- Fixes data loss in column updates that may occur after BE restarts. [#39035](https://github.com/apache/doris/pull/39035) + +### Storage Management + +- Adds FE configuration to control whether queries under hot-cold tiering prefer local data replicas. [#38322](https://github.com/apache/doris/pull/38322) + +- Optimizes expired BE report messages to include newly created tablets. [#38839](https://github.com/apache/doris/pull/38839) [#39605](https://github.com/apache/doris/pull/39605) + +- Optimizes replica scheduling priority strategy to prioritize replicas with missing data. [#38884](https://github.com/apache/doris/pull/38884) + +- Prevents tablets with unfinished ALTER jobs from being balanced. [#39202](https://github.com/apache/doris/pull/39202) + +- Enables modifying the number of buckets for tables with list partitioning. [#39688](https://github.com/apache/doris/pull/39688) + +- Prefers querying from online disk services. [#39654](https://github.com/apache/doris/pull/39654) + +- Improves error messages for materialized view base tables that do not support deletion during synchronization. [#39857](https://github.com/apache/doris/pull/39857) + +- Improves error messages for single columns exceeding 4GB. 
[#39897](https://github.com/apache/doris/pull/39897) + +- Fixes an issue where aborted transactions were omitted when plan errors occurred during `INSERT` statements. [#38260](https://github.com/apache/doris/pull/38260) + +- Fixes exceptions during SSL connection closure. [#38677](https://github.com/apache/doris/pull/38677) + +- Fixes an issue where table locks were not held when aborting transactions using labels. [#38842](https://github.com/apache/doris/pull/38842) + +- Fixes `gson pretty` causing large image issues. [#39135](https://github.com/apache/doris/pull/39135) + +- Fixes an issue where the new optimizer did not check for bucket values of 0 in `CREATE TABLE` statements. [#38999](https://github.com/apache/doris/pull/38999) + +- Fixes errors when Chinese column names are included in `DELETE` condition predicates. [#39500](https://github.com/apache/doris/pull/39500) + +- Fixes frequent tablet balancing issues in partition balancing mode. [#39606](https://github.com/apache/doris/pull/39606) + +- Fixes an issue where partition storage policy attributes were lost. [#39677](https://github.com/apache/doris/pull/39677) + +- Fixes incorrect statistics when importing multiple tables within a transaction. [#39548](https://github.com/apache/doris/pull/39548) + +- Fixes errors when deleting random bucket tables. [#39830](https://github.com/apache/doris/pull/39830) + +- Fixes issues where FE fails to start due to non-existent UDFs. [#39868](https://github.com/apache/doris/pull/39868) + +- Fixes inconsistencies in the last failed version between FE master and slave. [#39947](https://github.com/apache/doris/pull/39947) + +- Fixes an issue where related tablets may still be in schema change state when schema change jobs are canceled. [#39327](https://github.com/apache/doris/pull/39327) + +- Fixes errors when modifying type and column order in a single statement schema change (SC). 
[#39107](https://github.com/apache/doris/pull/39107) + +### Data Loading + +- Improves error messages for -238 errors during imports. [#39182](https://github.com/apache/doris/pull/39182) + +- Allows importing to other partitions while restoring a partition. [#39915](https://github.com/apache/doris/pull/39915) + +- Optimizes the strategy for FE to select BEs during group commit. [#37830](https://github.com/apache/doris/pull/37830) [#39010](https://github.com/apache/doris/pull/39010) + +- Avoids printing stack traces for some common streamload error messages. [#38418](https://github.com/apache/doris/pull/38418) + +- Improves handling of issues where offline BEs may affect import errors. [#38256](https://github.com/apache/doris/pull/38256) + +### Permissions + +- Optimizes access performance after enabling the Ranger authentication plugin. [#38575](https://github.com/apache/doris/pull/38575) +- Optimizes permission strategies for Refresh Catalog/Database/Table operations, allowing users to perform these operations with only SHOW permissions. [#39008](https://github.com/apache/doris/pull/39008) + +## Bug fixes + +### Lakehouse + +- Fixes the issue where switching catalogs may result in an error of not finding the database. [#38114](https://github.com/apache/doris/pull/38114) + +- Addresses exceptions caused by attempting to read non-existent data on S3. [#38253](https://github.com/apache/doris/pull/38253) + +- Resolves the issue where specifying an abnormal path during export operations may lead to incorrect export locations. [#38602](https://github.com/apache/doris/pull/38602) + +- Fixes the timezone issue for time columns in Paimon tables. [#37716](https://github.com/apache/doris/pull/37716) + +- Temporarily disables the Parquet PageIndex feature to avoid certain erroneous behaviors. + +- Corrects the selection of Backend nodes in the blacklist during external table queries. 
[#38984](https://github.com/apache/doris/pull/38984) + +- Resolves errors caused by missing subcolumns in Parquet Struct column types.[#39192](https://github.com/apache/doris/pull/39192) + +- Addresses several issues with predicate pushdown in JDBC Catalog. [#39082](https://github.com/apache/doris/pull/39082) + +- Fixes issues where some historical Parquet formats led to incorrect query results. [#39375](https://github.com/apache/doris/pull/39375) + +- Improves compatibility with ojdbc6 drivers for Oracle JDBC Catalog. [#39408](https://github.com/apache/doris/pull/39408) + +- Resolves potential FE memory leaks caused by Refresh Catalog/Database/Table operations. [#39186](https://github.com/apache/doris/pull/39186) [#39871](https://github.com/apache/doris/pull/39871) + +- Fixes thread leaks in JDBC Catalog under certain conditions. [#39666](https://github.com/apache/doris/pull/39666) [#39582](https://github.com/apache/doris/pull/39582) + +- Addresses potential event processing failures after enabling Hive Metastore event subscription. [#39239](https://github.com/apache/doris/pull/39239) + +- Disables reading Hive Text format tables with custom escape characters and null formats to prevent data errors. [#39869](https://github.com/apache/doris/pull/39869) + +- Resolves issues accessing Iceberg tables created via the Iceberg API under certain conditions. [#39203](https://github.com/apache/doris/pull/39203) + +- Fixes the inability to read Paimon tables stored on HDFS clusters with high availability enabled. [#39876](https://github.com/apache/doris/pull/39876) + +- Addresses errors that may occur when reading Paimon table deletion vectors after enabling file caching. [#39875](https://github.com/apache/doris/pull/39875) + +- Resolves potential deadlocks when reading Parquet files under certain conditions. [#39945](https://github.com/apache/doris/pull/39945) + +### Async Materialized View + +- Fixes the inability to use `SHOW CREATE MATERIALIZED VIEW` on follower FEs. 
[#38794](https://github.com/apache/doris/pull/38794) + +- Unifies the object type of asynchronous materialized views in metadata as tables to enable proper display in data tools. [#38797](https://github.com/apache/doris/pull/38797) + +- Resolves the issue where nested asynchronous materialized views always perform full refreshes. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixes the issue where canceled tasks may show as running after restarting FEs. [#39424](https://github.com/apache/doris/pull/39424) + +- Addresses incorrect use of contexts, which may lead to unexpected failures of materialized view refresh tasks. [#39690](https://github.com/apache/doris/pull/39690) + +- Resolves issues that may cause varchar type write failures due to unreasonable lengths when creating asynchronous materialized views based on external tables. [#37668](https://github.com/apache/doris/pull/37668) + +- Fixes the potential invalidation of asynchronous materialized views based on external tables after FE restarts or catalog rebuilds. [#39355](https://github.com/apache/doris/pull/39355) + +- Prohibits the use of partition rollup for materialized views with list partitions to prevent the generation of incorrect data. [#38124](https://github.com/apache/doris/pull/38124) + +- Fixes incorrect results when literals exist in the select list during transparent rewriting for aggregation rollup. [#38958](https://github.com/apache/doris/pull/38958) + +- Addresses potential errors during transparent rewriting when queries contain filters like `a = a`. [#39629](https://github.com/apache/doris/pull/39629) + +- Fixes issues where transparent rewriting for direct external table queries fails. [#39041](https://github.com/apache/doris/pull/39041) + +### Semi-Structured Data Management + +- Removes support for prepared statements in the old optimizer. [#39465](https://github.com/apache/doris/pull/39465) + +- Fixes issues with JSON escape character handling. 
[#37251](https://github.com/apache/doris/pull/37251) + +- Resolves issues with duplicate processing of JSON fields. [#38490](https://github.com/apache/doris/pull/38490) + +- Fixes issues with some ARRAY and MAP functions. [#39307](https://github.com/apache/doris/pull/39307) [#39699](https://github.com/apache/doris/pull/39699) [#39757](https://github.com/apache/doris/pull/39757) + +- Resolves complex combinations of inverted index queries and LIKE queries. [#36687](https://github.com/apache/doris/pull/36687) + +### Query Optimizer + +- Fixed the potential partition pruning error issue when the 'OR' condition exists in partition filter conditions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the potential partition pruning error issue when complex expressions are involved. [#39298](https://github.com/apache/doris/pull/39298) + +- Fixed the issue where nullable in `agg_state` subtypes might be planned incorrectly, leading to execution errors. [#37489](https://github.com/apache/doris/pull/37489) + +- Fixed the issue where nullable in set operation operators might be planned incorrectly, leading to execution errors. [#39109](https://github.com/apache/doris/pull/39109) + +- Fixed the incorrect execution priority issue of intersect operator. [#39095](https://github.com/apache/doris/pull/39095) + +- Fixed the NPE issue that may occur when the maximum valid date literal exists in the query. [#39482](https://github.com/apache/doris/pull/39482) + +- Fixed the occasional planning error that results in an illegal slot error during execution. [#39640](https://github.com/apache/doris/pull/39640) + +- Fixed the issue where repeatedly referencing columns in cte may lead to missing data in some columns in the result. [#39850](https://github.com/apache/doris/pull/39850) + +- Fixed the occasional planning error issue when 'case when' exists in the query. 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where IP types cannot be implicitly converted to string types. [#39318](https://github.com/apache/doris/pull/39318) + +- Fixed the potential planning error issue when using multi-dimensional aggregation and the same column and its alias exist in the select list. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where boolean types might be handled incorrectly when using BE constant folding. [#39019](https://github.com/apache/doris/pull/39019) + +- Fixed the planning error issue caused by `default_cluster:` as a prefix for the database name in expressions. [#39114](https://github.com/apache/doris/pull/39114) + +- Fixed the potential deadlock issue caused by `insert into`. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the potential planning error issue caused by not holding table locks throughout the planning process. [#38950](https://github.com/apache/doris/pull/38950) + +- Fixed the issue where CHAR(0), VARCHAR(0) are not handled correctly when creating tables. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the issue where `show create table` may incorrectly display hidden columns. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the issue where columns with the same name as hidden columns are not prohibited when creating tables. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the occasional planning error issue when executing `insert into as select` with CTEs. [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where `insert into values` cannot automatically fill null default values. [#39122](https://github.com/apache/doris/pull/39122) + +- Fixed the NPE issue caused by declaring a CTE in a delete statement without using it. 
[#39379](https://github.com/apache/doris/pull/39379) + +- Fixed the issue where deleting from a randomly distributed aggregation model table fails. [#37985](https://github.com/apache/doris/pull/37985) + +### Query Execution + +- Fixed the issue where the pipeline execution engine gets stuck in multiple scenarios, causing queries not to end. [#38657](https://github.com/apache/doris/pull/38657) [#38206](https://github.com/apache/doris/pull/38206) [#38885](https://github.com/apache/doris/pull/38885) + +- Fixed the coredump issue caused by null and non-null columns in set difference calculations.[#38737](https://github.com/apache/doris/pull/38737) + +- Fixed the incorrect result issue of the `width_bucket` function. [#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error issue when a single row of data is large and the result set is also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the incorrect result issue of `stddev` with DecimalV2 type. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the issue where `insert overwrite auto partition` causes transaction rollback. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the incorrect result issue of the `convert_tz` function. [#37358](https://github.com/apache/doris/pull/37358) [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by the mod function with abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where executing the same expression in multiple threads may lead to incorrect Java UDF results. 
[#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by the incorrect return type of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the unstable result issue of the histogram function. [#38608](https://github.com/apache/doris/pull/38608) + +### Backup & Recovery / CCR + +- Fixed the issue where the data version after backup and recovery may be incorrect, leading to unreadability. [#38343](https://github.com/apache/doris/pull/38343) + +- Fixed the issue of using restore version across versions. [#38396](https://github.com/apache/doris/pull/38396) + +- Fixed the issue where the job is not canceled when backup fails. [#38993](https://github.com/apache/doris/pull/38993) + +- Fixed the NPE issue in CCR during the upgrade from 2.1.4 to 2.1.5, causing the FE to fail to start. [#39910](https://github.com/apache/doris/pull/39910) + +- Fixed the issue where views and materialized views cannot be used after restoration. [#38072](https://github.com/apache/doris/pull/38072) [#39848](https://github.com/apache/doris/pull/39848) + +### Storage Management + +- Fixed possible memory leaks in routine load when loading multiple tables from a single stream. [#38824](https://github.com/apache/doris/pull/38824) + +- Fixed the issue where delimiters and escape characters in routine load were not effective. [#38825](https://github.com/apache/doris/pull/38825) + +- Fixed incorrect `show routine load` results when the routine load task name contained uppercase letters. [#38826](https://github.com/apache/doris/pull/38826) + +- Fixed the issue where the offset cache was not reset when changing the routine load topic. [#38474](https://github.com/apache/doris/pull/38474) + +- Fixed the potential exception triggered by `show routine load` under concurrent scenarios. [#39525](https://github.com/apache/doris/pull/39525) + +- Fixed the issue where routine load might import data repeatedly. 
[#39526](https://github.com/apache/doris/pull/39526) + +- Fixed the data error caused by `setNull` when enabling group commit via JDBC. [#38276](https://github.com/apache/doris/pull/38276) + +- Fixed the potential NPE issue when enabling group commit insert to a non-master FE. [#38345](https://github.com/apache/doris/pull/38345) + +- Fixed incorrect error handling during internal data writing in group commit. [#38997](https://github.com/apache/doris/pull/38997) + +- Fixed the coredump that might be triggered when the group commit execution plan failed. [#39396](https://github.com/apache/doris/pull/39396) + +- Fixed the issue where concurrent imports into auto partition tables might report non-existent tablets. [#38793](https://github.com/apache/doris/pull/38793) + +- Fixed potential load stream leakage issues. [#39039](https://github.com/apache/doris/pull/39039) + +- Fixed the issue where transactions were opened for `insert into select` with no data. [#39108](https://github.com/apache/doris/pull/39108) + +- Ignored the single-replica import configuration when using memtable prefetching. [#39154](https://github.com/apache/doris/pull/39154) + +- Fixed the issue where background imports of stream load records might be abnormally aborted upon encountering db deletion. [#39527](https://github.com/apache/doris/pull/39527) + +- Fixed inaccurate error messages when data errors occurred in strict mode. [#39587](https://github.com/apache/doris/pull/39587) + +- Fixed the issue where streamload did not return an error URL upon encountering erroneous data. [#38417](https://github.com/apache/doris/pull/38417) + +- Fixed the issue with the combined use of insert overwrite and auto partition. [#38442](https://github.com/apache/doris/pull/38442) + +- Fixed parsing errors when CSV encountered data where the line delimiter was enclosed by the enclosing character. 
[#38445](https://github.com/apache/doris/pull/38445) + +### Data Exporting + +- Fixed the issue where enabling the delete_existing_files property during export operations might result in duplicate deletion of exported data. [#39304](https://github.com/apache/doris/pull/39304) + +### Permissions + +- Fixed the incorrect requirement of ALTER TABLE permission when creating a materialized view. [#38011](https://github.com/apache/doris/pull/38011) + +- Fixed the issue where the db was explicitly displayed as empty when showing routine load. [#38365](https://github.com/apache/doris/pull/38365) + +- Fixed the incorrect requirement of CREATE permission on the original table when using CREATE TABLE LIKE. [#37879](https://github.com/apache/doris/pull/37879) + +- Fixed the issue where grant operations did not check if the object existed. [#39597](https://github.com/apache/doris/pull/39597) + +## Upgrade suggestions + +When upgrading Doris, please follow the principle of not skipping two minor versions and upgrade sequentially. + +For example, if you are upgrading from version 0.15.x to 2.0.x, it is recommended to first upgrade to the latest version of 1.1, then upgrade to the latest version of 1.2, and finally upgrade to the latest version of 2.0. + +For more upgrade information, see the documentation: [Cluster Upgrade](../../admin-manual/cluster-management/upgrade) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md new file mode 100644 index 0000000000000..414229276e6b0 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md @@ -0,0 +1,180 @@ +--- +{ + "title": "Release 2.1.7", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.7 was officially released on November 10, 2024.** This version brings continuous upgrades and improvements. 
Additionally, several fixes have been implemented in areas such as Lakehouse, Async Materialized Views, Semi-Structured Data Management, Query Optimizer, and Permission Management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The following global variables will be forcibly set to the following default values: + - enable_nereids_dml: true + - enable_nereids_dml_with_pipeline: true + - enable_nereids_planner: true + - enable_fallback_to_original_planner: true + - enable_pipeline_x_engine: true +- New columns have been added to the audit log. [#42262](https://github.com/apache/doris/pull/42262) + - For more information, please refer to [docs](https://doris.apache.org/docs/admin-manual/audit-plugin/) + +## New features + +### Async Materialized View + +- Asynchronous materialized views now have a property called `use_for_rewrite` to control whether they participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Execution + +- The list of changed session variables is now output in the Profile. [#41016](https://github.com/apache/doris/pull/41016) +- Support for `trim_in`, `ltrim_in`, and `rtrim_in` functions has been added. [#42641](https://github.com/apache/doris/pull/42641) +- Support for several URL functions (top_level_domain, first_significant_subdomain, cut_to_first_significant_subdomain) has been added. [#42916](https://github.com/apache/doris/pull/42916) +- The `bit_set` function has been added. [#42099](https://github.com/apache/doris/pull/42099) +- The `count_substrings` function has been added. [#42055](https://github.com/apache/doris/pull/42055) +- The `translate` and `url_encode` functions have been added. 
[#41051](https://github.com/apache/doris/pull/41051) +- The `normal_cdf`, `to_iso8601`, and `from_iso8601_date` functions have been added. [#40695](https://github.com/apache/doris/pull/40695) + + +### Storage Management + +- The `information_schema.table_options` and `table_properties` system tables have been added, supporting the querying of attributes set during table creation. [#34384](https://github.com/apache/doris/pull/34384) +- Support for `bitmap_empty` as a default value has been implemented. [#40364](https://github.com/apache/doris/pull/40364) +- A new session variable `require_sequence_in_insert` has been introduced to control whether a sequence column must be provided when performing `INSERT INTO SELECT` writes to a unique key table. [#41655](https://github.com/apache/doris/pull/41655) + +### Others + +- Allow for generating flame graphs on the BE WebUI page.[#41044](https://github.com/apache/doris/pull/41044) + +## Improvements + +### Lakehouse + +- Support for writing data to Hive text format tables. [#40537](https://github.com/apache/doris/pull/40537) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build) +- Access MaxCompute data using MaxCompute Open Storage API. [#41610](https://github.com/apache/doris/pull/41610) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/database/max-compute) +- Support for Paimon DLF Catalog. 
[#41694](https://github.com/apache/doris/pull/41694) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/paimon) +- Added `table$partitions` syntax to directly query Hive partition information.[#41230](https://github.com/apache/doris/pull/41230) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive) +- Support for reading Parquet files in brotli compression format.[#42162](https://github.com/apache/doris/pull/42162) +- Support for reading DECIMAL 256 types in Parquet files. [#42241](https://github.com/apache/doris/pull/42241) +- Support for reading Hive tables in OpenCsvSerde format.[#42939](https://github.com/apache/doris/pull/42939) + +### Async Materialized View + +- Refined the granularity of lock holding during the build process for asynchronous materialized views. [#40402](https://github.com/apache/doris/pull/40402) [#41010](https://github.com/apache/doris/pull/41010). + +### Query optimizer + +- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. [#40457](https://github.com/apache/doris/pull/40457) +- Runtime filters can now be generated in more scenarios to improve query performance. [#40815](https://github.com/apache/doris/pull/40815) +- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. [#40820](https://github.com/apache/doris/pull/40820) +- Optimized the column pruning algorithm to enhance query performance. [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Supported parallel preparation to reduce the time consumed by short queries. [#40270](https://github.com/apache/doris/pull/40270) +- Corrected the names of some counters in the profile to match the audit logs. [#41993](https://github.com/apache/doris/pull/41993) +- Added new local shuffle rules to speed up certain queries. 
[#40637](https://github.com/apache/doris/pull/40637) + +### Storage Management + +- The `SHOW PARTITIONS` command now supports displaying the commit version. [#28274](https://github.com/apache/doris/pull/28274) +- Checked for unreasonable partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Optimized the scheduling logic when encountering EOF in Routine Load. [#40509](https://github.com/apache/doris/pull/40509) +- Made Routine Load aware of schema changes. [#40508](https://github.com/apache/doris/pull/40508) +- Improved the timeout logic for Routine Load tasks. [#41135](https://github.com/apache/doris/pull/41135) + +### Others + +- Allowed closing the built-in service port of BRPC via BE configuration. [#41047](https://github.com/apache/doris/pull/41047) +- Fixed issues with missing fields and duplicate records in audit logs. [#43015](https://github.com/apache/doris/pull/43015) + +## Bug fixes + +### Lakehouse + +- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. [#39840](https://github.com/apache/doris/pull/39840) +- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. [#40424](https://github.com/apache/doris/pull/40424) +- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. [#40923](https://github.com/apache/doris/pull/40923) +- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. [#41266](https://github.com/apache/doris/pull/41266) +- Fixed errors in reading Snappy compressed formats in certain scenarios. [#40862](https://github.com/apache/doris/pull/40862) +- Addressed potential FileSystem leaks on the FE side in certain scenarios. [#41108](https://github.com/apache/doris/pull/41108) +- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. 
[#41231](https://github.com/apache/doris/pull/41231) +- Fixed the inability to read tables in Paimon parquet format. [#41487](https://github.com/apache/doris/pull/41487) +- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. [#41407](https://github.com/apache/doris/pull/41407) +- Disabled predicate pushdown after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. [#42242](https://github.com/apache/doris/pull/42242) +- Fixed issues with case-sensitive access to table names in the External Catalog. [#42261](https://github.com/apache/doris/pull/42261) + +### Async Materialized View + +- Fixed the issue where user-specified start times were not effective. [#39573](https://github.com/apache/doris/pull/39573) +- Resolved the issue of nested materialized views not refreshing. [#40433](https://github.com/apache/doris/pull/40433) +- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. [#41762](https://github.com/apache/doris/pull/41762) +- Addressed issues where partition compensation rewrites could lead to incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed potential errors in rewrite results when `sql_select_limit` was set. [#40106](https://github.com/apache/doris/pull/40106) + +### Semi-Structured Data Management + +- Fixed the issue of index file handle leaks. [#41915](https://github.com/apache/doris/pull/41915) +- Addressed inaccuracies in the `count()` function of inverted indexes in special cases. [#41127](https://github.com/apache/doris/pull/41127) +- Fixed exceptions with variant when light schema change was not enabled. [#40908](https://github.com/apache/doris/pull/40908) +- Resolved memory leaks when variant returns arrays. 
[#41339](https://github.com/apache/doris/pull/41339) + +### Query optimizer + +- Corrected potential errors in nullable calculations for filter conditions during external table queries, leading to execution exceptions. [#41014](https://github.com/apache/doris/pull/41014) +- Fixed potential errors in optimizing range comparison expressions. [#41356](https://github.com/apache/doris/pull/41356) + +### Query Execution + +- Fixed the issue where the `match_regexp` function could not correctly handle empty strings. [#39503](https://github.com/apache/doris/pull/39503) +- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Fixed errors in the results of the `date_floor` function. [#41948](https://github.com/apache/doris/pull/41948) +- Addressed incorrect cancel messages in some scenarios. [#41798](https://github.com/apache/doris/pull/41798) +- Fixed issues with excessive warning logs printed by arrow flight. [#41770](https://github.com/apache/doris/pull/41770) +- Resolved issues where runtime filters failed to send in some scenarios. [#41698](https://github.com/apache/doris/pull/41698) +- Fixed problems where some system table queries could not end normally or became stuck. [#41592](https://github.com/apache/doris/pull/41592) +- Addressed incorrect results from window functions. [#40761](https://github.com/apache/doris/pull/40761) +- Fixed issues where the encrypt and decrypt functions caused BE cores. [#40726](https://github.com/apache/doris/pull/40726) +- Resolved errors in the results of the conv function. [#40530](https://github.com/apache/doris/pull/40530) + +### Storage Management + +- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. [#38003](https://github.com/apache/doris/pull/38003) +- Addressed inaccurate memory statistics during the Memtable flush phase during imports. 
[#39536](https://github.com/apache/doris/pull/39536) +- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. [#40477](https://github.com/apache/doris/pull/40477) +- Resolved inaccurate bvar statistics with Memtable migration. [#40985](https://github.com/apache/doris/pull/40985) +- Fixed inaccurate progress reporting for S3 loads. [#40987](https://github.com/apache/doris/pull/40987) + +### Permissions + +- Fixed permission issues related to show columns, show sync, and show data from db.table. [#39726](https://github.com/apache/doris/pull/39726) + +### Others + +- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. [#41400](https://github.com/apache/doris/pull/41400) diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md new file mode 100644 index 0000000000000..baa62b37e1e75 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md @@ -0,0 +1,469 @@ +--- +{ + "title": "Release 3.0.0", + "language": "en" +} +--- + + + + +We are excited to announce the release of Apache Doris 3.0! + +**Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. Additionally, users can take advantage of low-cost shared storage systems such as object storage or HDFS to significantly reduce storage costs.** + +Version 3.0 marks a milestone in the evolution of Apache Doris towards a unified data lake and data warehouse architecture. 
This version introduces the ability to write data back to data lakes, allowing users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Apache Doris. With capabilities such as asynchronous materialized views, Apache Doris can serve as a unified data processing engine for enterprises, helping users better manage data across lakes, warehouses, and databases. Also, Apache Doris 3.0 introduces the Trino Connector. It allows users to quickly connect or adapt to more data sources, and leverage the high-performance compute engine of Doris to deliver faster query results than Trino. + +Version 3.0 also enhances support for ETL batch processing scenarios, adding explicit transaction support for operations like `insert into select`, `delete` and `update`. The observability of query execution has also been improved. + +In terms of performance, we have improved the framework capabilities, infrastructure, and rules of the query optimizer in version 3.0. This provides optimized performance, which has been proven by blind testing in more complex and diverse business scenarios. + +The adaptive Runtime Filter computation method now accurately estimates filters based on data size during execution, delivering better performance under large data volumes and high loads. Additionally, asynchronous materialized view has been more stable and user-friendly in query acceleration and data modeling. + +**During the development of version 3.0, over 200 contributors submitted nearly 5,000 optimizations** and fixes to Apache Doris. Contributors from companies such as VeloDB, Baidu, Meituan, ByteDance, Tencent, Alibaba, Kwai, Huawei, and Tianyi Cloud actively collaborated with the community, contributing test cases from real-world use cases to help us improve Apache Doris. We extend our heartfelt thanks to all the contributors involved in the development, testing, and feedback process for this release. 
+ +- **GitHub**: https://github.com/apache/doris/releases + +- **Website**: https://doris.apache.org/download + +## 1. Compute-storage decoupled mode + +Since V3.0, Apache Doris supports the compute-storage decoupled mode. Users can choose between it and the compute-storage coupled mode during cluster deployment. + +In the compute-storage decoupled mode, the BE nodes no longer store the data, but instead, a shared storage layer (HDFS and object storage) is introduced as the shared data storage layer. The computing and storage resources can be scaled independently, bringing multiple benefits to users: + +- **Workload isolation**: Multiple compute clusters can share the same data, allowing users to isolate different business workloads or offline loads using separate compute clusters. + +- **Reduced storage costs**: The full dataset is stored in the more cost-effective and highly reliable shared storage, with only hot data cached locally. Compared to the compute-storage coupled mode with three data replicas, the storage cost can be reduced by up to 90%. + +- **Elastic computing resources**: Since no data is stored on the BE nodes, the computing resources can be scaled flexibly based on the load requirements. Users can scale in or out an individual compute cluster or increase/decrease the number of compute clusters. This also leads to cost savings. + +- **Improved system robustness**: By storing the data in shared storage, Doris no longer needs to handle the complex logic of multi-replica consistency, thus simplifying distributed storage complexity and improving the overall system robustness. + +- **Flexible data sharing and cloning**: The flexibility of the compute-storage decoupled mode extends beyond a single Doris cluster. Tables from one Doris cluster can be easily cloned to another Doris cluster, with just metadata replication. + +### 1-1. 
From coupled to decoupled + +In the compute-storage coupled mode, the Apache Doris architecture consists of two main process types: Frontend (FE) and Backend (BE). The FE is primarily responsible for user request access, query parsing and planning, metadata management, and node management. The BE is responsible for data storage and query plan execution. + +The BE nodes employ an MPP (Massively Parallel Processing) distributed computing architecture, leveraging a multi-replica consistency protocol to ensure high service availability and high data reliability. + +![From coupled to decoupled](/images/storage-compute-decoupled.PNG) + + +The maturation of emerging cloud computing infrastructure, including public clouds, private clouds, and Kubernetes-based container platforms, has driven the need for cloud-native capabilities. Increasingly, users are seeking deeper integration between Apache Doris and cloud computing infrastructure to provide more elasticity. + +**To address this need, the VeloDB team has designed and implemented a cloud-native version of Apache Doris that decouples compute and storage, known as VeloDB Cloud. After extensive production testing and refinement across hundreds of enterprises over a long time, this cloud-native solution has now been contributed to the Apache Doris community, manifesting as the Apache Doris 3.0 in the compute-storage decoupled mode.** + +In the compute-storage decoupled mode, the Apache Doris architecture consists of three layers: + +- **Meta data layer**: A new Meta Service module has been introduced to provide meta data services, such as processing database and table information, schemas, rowset meta, and transactions. The Meta Service is stateless and horizontally scalable. In V3.0, all of the BE's meta data and parts of the FE's meta data have been migrated to the Meta Service. We will finish the migration of the remains in future versions. 
+- **Computation layer**: The stateless BE nodes execute query plans and cache a portion of the data and tablet meta data locally to improve query performance. Multiple stateless BE nodes can be organized into a computing resource pool (i.e., compute cluster), and multiple compute clusters can share the same data and metadata service. The compute clusters can be elastically scaled by adding or removing nodes as needed. +- **Shared storage layer**: Data is persisted to the shared storage layer, which currently supports HDFS as well as various cloud-based object storage systems that are compatible with the S3 protocol, such as S3, OSS, GCS, Azure Blob, COS, BOS, and MinIO. + +![From coupled to decoupled-2](/images/storage-compute-decoupled-2.JPEG) + +### 1-2 Design highlight + +The design of the compute-storage decoupled mode of Apache Doris highlights the transformation of the FE's in-memory metadata model into a shared metadata service. This approach offers a globally consistent state view, allowing any node to directly submit writes without needing to go through the FE for publishing. During write operations, data is stored in shared storage, while metadata is managed by the metadata service. **This effectively controls the number of small files in shared storage. Meanwhile, the real-time write performance for individual tables is nearly on par with that in the compute-storage coupled mode. The system's overall write capacity is no longer limited by the processing power of a single FE node.** + +![Design highlight](/images/design-hightlight.PNG) + +Based on the globally consistent state view, for data garbage collection, we have adopted a design approach for data deletion that is easier to prove correct and more efficient. + +Specifically, data in the shared storage is incorporated into the globally consistent view offered by the shared meta data service. Whenever data is generated, we bind it to a separate, independent transaction. 
Similarly, for a meta data deletion operation, we also bind it to a separate, independent transaction. The purpose of this approach is to ensure that deletion and write operations cannot succeed together. The view records which data needs to be deleted, and the asynchronous deletion process can simply perform a forward deletion of the data based on the transaction records, without the need for reverse garbage collection. + +As the tablet-related meta data in the FE is gradually migrated to the shared meta data service, the scalability of the Doris cluster will no longer be constrained by the memory capacity of a single FE node. Building upon the shared meta data service and the forward data deletion technique, we can conveniently expand functionality such as data sharing and lightweight cloning. + +### 1-3 Comparison with alternative solutions + +Another design of decoupling compute and storage in the industry is to store the data and BE node meta data in a shared object storage or HDFS. However, this approach brings the following problems: + +- **Inability to support real-time writes**: During data writes, the data is mapped to tablets based on the partitioning and bucketing rules, generating segment files and rowset meta data. During the write process, a two-phase commit (Publish) is performed through the FE. When a BE node receives the Publish request, it then sets the rowset as visible. The Publish operation must not fail. If the rowset meta data is stored in the shared storage, the total small file data during the real-time write process would triple the size of the actual data files - one replica of data files, one for rowset meta data, and another for rowset meta data changes during Publish. The Publish operation is driven by a single FE node, so the write capacity of a single table or even the entire system is limited by the FE node's capabilities. 
+ + ![Comparison with alternative solutions](/images/comparison-with-alternative-solutions.png) + + We compared the real-time data write performance of Apache Doris 3.0 with the above-described solution. We simulated 500 concurrent tasks writing 10,000 data files with 500 rows each, and 50 concurrent tasks writing 250 data files with 20,000 rows each, using the same computational resources. + + **The results showed that at 50 concurrent tasks, the micro-batch write performance of Apache Doris in the compute-storage coupled and decoupled modes was almost identical, while the industry solution lagged behind Apache Doris by a factor of 100.** + + At 500 concurrent tasks, the performance of Apache Doris in the compute-storage decoupled mode showed slight degradation, but it still maintained an 11X advantage over the industry solution. To ensure a fair test, Apache Doris did not enable the Group Commit feature (which the industry solution lacks). Enabling Group Commit would further enhance real-time write performance. + + ![Comparison with alternative solutions](/images/real-time-write-performance..png) + + Additionally, the industry solution faces stability and cost issues in real-time data ingestion: + + - Stability concerns: A large number of small files can put pressure on the shared storage, especially HDFS, and introduce stability risks. + + - High object storage request costs: Some public cloud object storage services charge 10 times more for Put and Delete operations than for Get operations. A large number of small files can lead to a significant increase in object storage request costs, which can even exceed the storage costs. + +- **Limited scalability**: Use cases of the compute-storage decoupled mode often handle larger data volumes. Because the FE (Frontend) meta data is held entirely in memory, once the number of tablets reaches a certain high level (e.g.
tens of millions), the FE's memory pressure can become a bottleneck that limits the overall write throughput of the system. + +- **Potential data deletion logic issues**: In the compute-storage decoupled architecture, data is stored with one single replica. Therefore, the data deletion logic is critical for the system's reliability. The conventional approach of cross-system data deletion by comparing the differences can be challenging. During the write process, there is no way to completely prevent a deletion and a write from both succeeding, which can lead to data loss. Additionally, when the storage system experiences anomalies, the input used for difference calculation may be incorrect, which potentially leads to unintended data deletion. + +- **Data sharing and lightweight cloning**: The flexibility of the decoupled storage-compute architecture can enable future data sharing and lightweight data cloning, reducing the burden of enterprise data management. However, if each cluster has a separate FE, after cloning data across clusters, it becomes difficult to accurately determine which data is no longer referenced and can be safely deleted, as calculating cross-cluster references can easily lead to unintended data deletion. + +By evolving the FE's full in-memory meta data model into a shared meta data service, Apache Doris 3.0 avoids all the aforementioned issues. + +### 1-4 Query performance comparison + +In the compute-storage decoupled mode, data needs to be read from the remote shared storage system, so the main bottleneck becomes network bandwidth rather than disk I/O, as it is in the compute-storage coupled mode. + +To accelerate data access, Apache Doris has implemented a high-speed caching mechanism based on local disks, and provides two cache management policies: LRU (Least Recently Used) and TTL (Time-To-Live). The newly imported data is asynchronously written to the cache to accelerate the first-time access to the latest data.
If the data required by a query is not in the cache, the system will read the data from the remote storage into memory and synchronously write it to the cache for subsequent queries. + +In use cases involving multiple compute clusters, Apache Doris provides a cache preheating function. When a new compute cluster is established, users can choose to preheat specific data (such as tables or partitions) to further improve query efficiency. + +In this context, we have conducted performance tests with different caching strategies in both the compute-storage coupled and decoupled modes, using the TPC-DS 1TB test dataset. The results are concluded as follows: + +- When the cache is fully hit (i.e., all the data required for the query is loaded into the cache), **the query performance of the compute-storage decoupled mode is on par with that of the compute-storage coupled mode**. + +- When the cache is partially hit (i.e., the cache is cleared before the test, and data is gradually loaded into the cache during the test, with performance continuously improving), the query performance of the compute-storage decoupled mode is about 10% lower than that of the compute-storage coupled mode. This test scenario is the most similar to the real-life use cases. + +- When the cache is completely missed (i.e., the cache is cleared before every SQL execution, simulating an extreme case), the performance loss is around 35%. **Even so, Apache Doris in the compute-storage decoupled mode delivers much higher performance than its alternative solutions.** + +![Query performance comparison](/images/query-performance-comparison.png) + +### 1-5 Write speed comparison + +In terms of write performance, we have simulated two test cases under the same computing resources: batch import and high-concurrency real-time import. 
The comparison of write performance between the compute-storage coupled mode and the compute-storage decoupled mode is as follows: + +- **Batch import**: When importing the 1TB TPC-H and 1TB TPC-DS test datasets, **the write performance of the compute-storage decoupled mode is 20.05% and 27.98% higher than the compute-storage coupled mode**, respectively, under the single-replica configuration. During batch import, the segment file size is generally in the range of tens to hundreds of MB. In the compute-storage decoupled mode, the segment files are split into smaller files and concurrently uploaded to the object storage, which can result in higher throughput compared to writing to local disks. In real-life deployments, the compute-storage coupled mode typically uses three replicas, which means the write speed advantage of the compute-storage decoupled mode will be even more pronounced. + +- **High-concurrency real-time import**: as described in the "Comparison with alternative solutions" section. + +![Write speed comparison](/images/write-speed-comparison.png) + +### 1-6 Tips for production environment + +- **Performance**: For real-time data analysis, users can achieve query performance comparable to the compute-storage coupled mode by specifying a TTL (Time-To-Live) for the cache and writing newly ingested data into the cache. To prevent query jitter, users can cache the data generated by background tasks such as compaction and schema changes based on how frequently used the data is. + +- **Workload isolation**: Users can achieve physical resource isolation for different business using multiple compute clusters. For workload isolation within a single compute cluster, users can utilize the Workload Group mechanism to limit and isolate resources for different queries. + +### 1-7 Notes + +- Apache Doris 3.0 does not support the co-existence of the compute-storage coupled mode and the compute-storage decoupled mode. 
Users need to specify one of them during cluster deployment. + +- If users need the compute-storage coupled mode, follow the [documentation](https://doris.apache.org/docs/3.0/install/source-install/compilation-with-docker/) for its deployment and upgrade. We recommend using Doris Manager for quick deployment and cluster upgrades. However, the compute-storage decoupled mode does not yet support Doris Manager deployment and upgrade. We will continue iterating for better support in future versions. + +- Currently Apache Doris does not support in-place upgrade from V2.1 to the compute-storage decoupled mode of V3.0. For this purpose, users need to perform data migration using tools like X2Doris after deploying the compute-storage decoupled clusters. In the future, we will support migration without service interruption through the CCR (Cross-Cluster Replication) capability. + +:::info +See doc: +https://doris.apache.org/docs/3.0/compute-storage-decoupled/overview/ +::: + +## 2. Data lakehouse + +Apache Doris is positioned as a real-time data warehouse, but it is much more than that. In previous versions, we have consistently pushed beyond the boundaries of traditional data warehouse capabilities, advancing towards a unified data lakehouse. Version 3.0 marks a milestone in this journey, with its capabilities in the lakehouse architecture becoming fully mature. We believe that a unified lakehouse is identified by **boundaryless data** and **lakehouse fusion**: + +**Boundaryless data: Apache Doris serves as a unified query processing engine, breaking down data barriers across different systems.
It provides a consistent and ultra-fast analysis experience across all data sources, including data warehouses, data lakes, data streams, and local data files.** + +- **Lakehouse query acceleration**: Without the need to migrate data to Apache Doris, users can leverage Doris’ efficient query engine to directly query data stored in data lakes such as Iceberg, Hudi, Paimon, and offline data warehouses like Hive, thereby accelerating query analysis. + +- **Federated analysis**: By extending its catalog and storage plugins, Apache Doris enhances its federated analysis capabilities, allowing users to perform unified analysis across multiple heterogeneous data sources without physically centralizing the data in a single storage system. This enables external table queries and federated joins between internal and external tables, breaking down data silos and providing globally consistent data insights. + +- **Data lake construction**: Apache Doris introduces write-back functionality for Hive and Iceberg, allowing users to directly create Hive and Iceberg tables through Doris and write data into them. This allows users to write internal table data back to the offline lakehouse or process offline lakehouse data using Doris and save the results back into the lakehouse, simplifying and streamlining the data lake construction process. + +**Lakehouse fusion: As data lake architectures become increasingly complex, the costs of technology selection and maintenance rise for users. Achieving consistent fine-grained access control across multiple systems also becomes challenging, and real-time performance suffers. To address this, Apache Doris integrates core features of the data lake, transforming itself into a lightweight, efficient, native real-time lakehouse.** + +- **Real-time data updates**: Starting with version 1.2, Apache Doris enhanced the primary key model by introducing Merge-on-Write, supporting real-time updates. 
This feature allows high-frequency, real-time data updates based on primary key changes from upstream data sources. + +- **Data science and** **AI** **computation support**: From version 2.1, Apache Doris, using the efficient Arrow Flight protocol, increased the openness of its storage system and its support for various compute loads, enabling data science and AI computations. + +- **Enhancements for semi-structured and unstructured Data**: Apache Doris has introduced support for data types like Array, Map, Struct, JSON, and Variant, with plans to support vector indexing in the future. + +- **Improved resource efficiency by decoupling storage and compute**: With version 3.0, Apache Doris supports a decoupled storage and compute mode, further improving resource efficiency and scalability. + +### 2-1 Faster queries in the data lakehouse + +TPC-H and TPC-DS benchmarking proves that Apache Doris achieves average query performance that is 3 to 5 times faster than Trino/Presto. + +In V3.0, we have focused on optimizing query performance for production environments, including: + +- **More granular task splitting strategy**: By adjusting the consistent hashing algorithm and introducing a task sharding weighting mechanism, we ensure balanced query loads across all nodes. + +- **Scheduling optimizations for use cases with numerous partitions and files**: For cases with a large number of files (over 1 million), we have largely reduced query latency (from 100 seconds to 10 seconds) and alleviated memory pressure on the Frontend (FE) by asynchronously and batch-fetching file shards. + +We will continue to specifically enhance query acceleration performance in real-world business scenarios, improve the actual user experience, and build an industry-leading lakehouse query acceleration engine. + +### 2-2 Federated analysis: more data connectors + +Previous versions of Apache Doris support connectors for over 10 mainstream data lakehouses, warehouses, and relational databases. 
In V3.0, we have introduced the Trino Connector compatibility framework, which expands the range of data sources that Apache Doris can connect to. With this framework, users can easily adapt their existing setups to access corresponding data sources using Doris and leverage its high-speed computing engine for data analysis. + +Currently, Doris has completed adaptations for Delta Lake, Kudu, BigQuery, Kafka, TPCH, and TPCDS. We also encourage contributions from developers to prolong this list. + +:::info Note + +See doc: + +- Trino Connector: https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/ + +- TPC-H: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpch/ + +- TPC-DS: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpcds/ + +- Delta Lake: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/deltalake/ + +- Kudu: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/kudu/ + +- BigQuery: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/bigquery/ +::: + + +### 2-3 Data lake building + +In V3.0, we have introduced data writeback functionality for Hive and Iceberg. This allows users to create Hive and Iceberg tables directly through Doris and write data into these tables, and enables users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Doris. + +In future iterations, Apache Doris will further enhance support for data lake table formats and improve the openness of storage APIs. + +:::info Note +See doc: https://doris.apache.org/docs/3.0/lakehouse/datalake-building/hive-build/ +::: + +## 3. Upgraded semi-structured data analysis capabilities + +In versions 2.0 and 2.1, Apache Doris introduced some well-embraced features such as inverted index, NGram Bloom Filter, and Variant data type to support high-performance full-text search and multi-dimensional analysis. 
With them, the storage and processing of complex semi-structured data have been more flexible and efficient. + +In V3.0, we have further enhanced the capabilities in this scenario. + +After extensive testing in production environments, the Variant data type has gained sufficient stability and become the preferred choice for JSON data storage and analysis. In V3.0, we have made multiple optimizations to it: + +- Support for indexing of the Variant data type to accelerate queries, including inverted index, Bloom Filter index, and the built-in ZoneMap index. + +- Support for flexible partial column updates for Unique Key tables containing the Variant data type. + +- Support for the use of the Variant data type in the compute-storage decoupled mode, with optimizations of its metadata storage. + +- Support for exporting the Variant data type to formats such as Parquet and CSV. + +The inverted index, introduced since V2.0, has reached a high level of maturity after more than a year of refinement and is now running in production environments of hundreds of enterprises. In V3.0, we have made multiple optimizations to the inverted index: + +- After performance optimizations, including lock concurrency, Apache Doris outperforms Elasticsearch in key metrics such as query latency and concurrency in real-time reporting analysis. + +- Optimized index file in the compute-storage decoupled mode to reduce remote storage calls and decrease index query latency. + +- Support for the Array data type to accelerate the `array_contains` queries. + +- Enhanced the `match_phrase_*` functionality, including support for slop and phrase prefix matching `match_phrase_prefix`. + +## 4. Enhanced ETL capabilities + +### 4-1. Transaction improvements + +Data processing in data warehouses often involves multiple data changes that need to be handled as a single transaction. V3.0 provides explicit transaction support for `insert into select`, `delete`, and `update` operations. 
Example cases include: + +- **Transactional requirements**: For example, when updating data within a time range, the typical approach is to first delete the data in that time range, and then insert the new data. Because the data might already be in service, queries must see either the old data or the new data, never a mix of the two. This can be achieved by executing the `delete` and `insert into select` operations in a single transaction. + + ```sql + BEGIN; + DELETE FROM table WHERE date >= "2024-07-01" AND date <= "2024-07-31"; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +- **Simplified handling of failed tasks**: For example, when two `insert into select` operations are executed within a single transaction, if either operation fails, the transaction can be retried directly. + + ```sql + BEGIN WITH LABEL label_etl_1; + INSERT INTO table1 SELECT * FROM stage_table1; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +:::info Note +See doc: https://doris.apache.org/docs/3.0/data-operate/transaction/ +Currently, explicit transaction synchronization is not supported in Cross-Cluster Replication (CCR). +::: + +### 4-2. Improved observability + +- **Real-time profile retrieval**: In previous versions, some complex queries could be computationally expensive because of issues with the execution plan or the data, yet developers could only access the query profile for performance analysis after the query completed. This made it hard to promptly identify issues in query execution and guarantee the stability of the production environment. Now, with the ability to retrieve real-time profiles, V3.0 allows users to monitor query execution as the query is running. It also allows them to better monitor the progress of each ETL job. + +- **`backend_active_tasks` system table**: The `backend_active_tasks` system table provides real-time resource consumption information for each query on each BE node.
Users can analyze this system table using SQL to obtain the resource usage of each query, which helps identify large queries or abnormal workloads. + +## 5. Asynchronous materialized view + +In V3.0, asynchronous materialized view is faster and more stable. It is also more user-friendly for query acceleration and data modeling scenarios. We have restructured the logic for transparent rewrite and expanded its capabilities, making it 2X faster. + +### 5-1 Refresh + +- Support for incremental update of materialized views by partitions and partition roll-ups on materialized views to allow refreshes at different granularities. + +- Support for nested materialized views, which is useful in data modeling scenarios. + +- Support for index creation and sort key specification in asynchronous materialized views, which will improve query performance after the materialized view is hit. + +- Higher usability of materialized view DDL with support for atomically replacing materialized views, allowing modifications to the materialized view definition SQL while keeping the materialized view available. + +- Support for non-deterministic functions in materialized views to better serve daily materialized view creation. + +- Support for trigger-based materialized view refresh, which ensures data consistency in data modeling with nested materialized views. + +- Support for a broader range of SQL patterns for building partitioned materialized views, making the incremental update capability available to more use cases. + +### 5-2 Refresh stability + +- V3.0 supports specifying a Workload Group for building materialized views. This is to limit the resources used by the materialized view build process and ensure that sufficient resources remain available for ongoing queries. + +### 5-3 Transparent rewrite + +- Support for transparent rewrite of more Join types, including derived Joins. 
Even when there is a mismatch of Join types between the query and materialized view, transparent rewrite can still be performed by compensating with additional predicates, as long as the materialized view can provide all the data needed for the query. + +- Support for more aggregate functions for roll-up as well as rewrite of multi-dimensional aggregations like GROUPING SETS, ROLLUP, and CUBE; support rewriting queries with aggregations when the materialized view does not contain aggregations, simplifying Join operations and expression computation. + +- Support for transparent rewrite of nested materialized views, enabling higher performance for complex queries. + +- For partially invalid partitioned materialized views, V3.0 supports `Union All` the base tables for data completion, expanding the applicability of partitioned materialized views. + +### 5-4 Transparent rewrite performance + +- Continuous optimization has been done to improve the transparent rewrite performance, achieving 2X the speed compared to version 2.1.0. + +:::info Note + +See doc: + +https://doris.apache.org/docs/3.0/query/view-materialized-view/query-async-materialized-view + +https://doris.apache.org/docs/3.0/query/view-materialized-view/async-materialized-view/ + +::: + +## 6. Performance improvement + +### 6-1 Smarter optimizer + +In V3.0, the query optimizer has been enhanced in terms of framework capabilities, distributed plan support, optimizer infrastructure, and rule expansion. It provides better optimization capabilities for more complex and diverse business scenarios, with higher blind test performance for complex SQL: + +- **Improved plan enumeration capability**: The key structure Memo for plan enumeration has been restructured and normalized. This improves the efficiency of the Cascades framework in plan enumeration and the possibility of producing better plans. 
Additionally, it fixes incomplete column pruning during the Join Reorder process in older versions, which led to unnecessary overhead of the Join operator, thus improving the execution performance in the relevant scenarios. + +- **Improved distributed plan support**: The distributed query plan has been enhanced to allow aggregation, join, and window function operations to more intelligently identify the data characteristics of intermediate computation results, avoiding ineffective data redistribution operations. Meanwhile, we have optimized the execution under the multi-replica continuous execution mode, making it more data cache-friendly. + +- **Improved optimizer infrastructure**: V3 has fixed several issues in cost model and statistics information estimation. The fixes to the cost model are more adaptable to the evolution of the execution engine, making the execution plan more stable compared to previous versions. + +- **Enhanced Runtime Filter plan support**: On the basis of Join Runtime Filter, V3.0 has expanded the capability of the TopN Runtime Filter to achieve better performance in use cases that involve a TopN operator. + +- **Enriched optimization rule library**: Based on user feedback and internal testing results, we have introduced optimization rules such as Intersect Reorder to enrich the rule set of the optimizer. + +### 6-2 Self-adaptive Runtime Filter + +In previous versions, the generation of Runtime Filter relies on manual setting by users based on statistical information. However, inaccurate settings in certain cases could lead to performance instability. + +In V3.0, Doris implements a self-adaptive Runtime Filter calculation approach. It can estimate the Runtime Filter at runtime based on the data size with high accuracy, enabling better performance in use cases with large data volumes and high workloads. 
+ +### 6-3 Function performance optimization + +- V3.0 has improved the vectorized implementation of dozens of functions, enabling a performance improvement of over 50% for some commonly used functions. +- V3.0 has also made extensive optimizations to the aggregation of nullable data types, enabling a 30% performance improvement. + +### 6-4 Blind test performance improvement + +Our blind tests on V3.0 and V2.1 show that the new version is 7.3% and 6.2% faster in TPC-DS and TPC-H benchmark tests, respectively. + +![Blind test performance improvement](/images/blind-test-performance-improvement.png) + +## 7. New features + +### 7-1 Java UDTF + +Version 3.0 has added support for Java UDTFs. The key operations are as follows: + +- Implementing a UDTF: Similar to a UDF, a UDTF requires the user to implement an `evaluate` method. Note that the return value of a UDTF function must be of the `Array` data type. + + ```java + public class UDTFStringTest { + public ArrayList<String> evaluate(String value, String separator) { + if (value == null || separator == null) { + return null; + } else { + return new ArrayList<>(Arrays.asList(value.split(separator))); + } + } + } + ``` + +- Creating a UDTF: By default, two corresponding functions will be created: `java-utdf` and `java-utdf_outer`. The `_outer` variant emits a single row of `NULL` data when the table function generates 0 rows of output. + + ```sql + CREATE TABLES FUNCTION java-utdf(string, string) RETURNS array<string> PROPERTIES ( + "file"="file:///pathTo/java-udaf.jar", + "symbol"="org.apache.doris.udf.demo.UDTFStringTest", + "always_nullable"="true", + "type"="JAVA_UDF" + ); + ``` + +:::info + +See doc: https://doris.apache.org/docs/3.0/query/udf/java-user-defined-function/#udtf-1 + +::: + +### 7-2 Generated column + +A generated column is a special column whose value is calculated from the values of other columns rather than directly inserted or updated by the user.
It supports pre-computing the results of expressions and storing them in the database, which is suitable for scenarios that require frequent queries or complex calculations. + +Results can be automatically calculated based on predefined expressions when data is imported or updated, and then stored persistently. In this way, during subsequent queries, the system can directly access these calculated results without performing complex calculations, thereby improving query performance. + +Generated columns have been supported since V3.0. When creating a table, you can specify a column as a generated column. A generated column automatically calculates values based on the defined expression when data is written. Generated columns allow for more complex expressions to be defined, but their values cannot be explicitly written or set. + +:::info + +See doc: https://doris.apache.org/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/ + +::: + +## 8. Functional improvements + +### 8-1. Materialized view + +We have refactored the selection logic for materialized views and migrated it from the rule-based optimizer (RBO) to the cost-based optimizer (CBO). This aligns the selection logic with that of asynchronous materialized views. This functionality is enabled by default. If any issues are encountered, you can revert to the RBO mode using `set global enable_sync_mv_cost_based_rewrite = false`. + +### 8-2. Routine Load + +In previous versions, the Routine Load functionality faced some usability challenges, such as uneven task scheduling across BE nodes, untimely task scheduling, complex configuration requirements (the need to change multiple FE and BE settings for optimization), and insufficient overall stability (restarts or upgrades could frequently pause Routine Load jobs, requiring manual user intervention to resume).
+ +To address these issues, we have made extensive optimizations to the Routine Load feature: + +- **Resource scheduling**: We have improved the scheduling balance to make sure that tasks are more evenly distributed across BE nodes. Jobs that encounter unrepairable errors will be promptly paused to avoid wasting resources on futile scheduling attempts. Additionally, we have improved the timeliness of the scheduling process, which has enhanced the import performance of Routine Load. + +- **Parameter configuration**: Users in most environments no longer need to modify FE and BE configurations for optimization. An automatic adjustment mechanism for the timeout parameter has been introduced to prevent tasks from constantly retrying when cluster pressure increases. + +- **Stability**: We have enhanced the robustness of Doris in various exceptional scenarios, such as FE failovers, BE rolling upgrades, and Kafka cluster anomalies, ensuring continuous stable operation. We have also optimized the Auto Resume mechanism, allowing Routine Load to automatically resume operation after faults are repaired, reducing the need for manual user intervention. + +## 9. Behavior changed + +- `cpu_resource_limit` is no longer supported. All types of resource isolation are now implemented through Workload Groups. + +- Please use JDK 17 for Apache Doris 3.0 and later versions. The recommended version is `jdk-17.0.10_linux-x64_bin.tar.gz`. + +## Try Apache Doris 3.0 now! + +Before the official release of version 3.0, the compute-storage decoupled mode of Apache Doris has undergone nearly two years of extensive testing and optimization in the production environments of hundreds of enterprises. Contributors from many tech giants have collaborated with the community to provide a significant number of test cases based on their real-world business needs. This has rigorously validated the usability and stability of version 3.0.
+ +We highly recommend users with compute-storage decoupling needs to download version 3.0 and experience it firsthand. + +Going forward, we will accelerate our release iteration cycle to deliver a more stable version experience for all users. Feel free to join us in the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ) and engage directly with the core developers. + +## Credits + +Special thanks to the following contributors who participated in the development, testing, and provided feedback for this version: + +@133tosakarin、@390008457、@924060929、@AcKing-Sam、@AshinGau、@BePPPower、@BiteTheDDDDt、@ByteYue、@CSTGluigi、@CalvinKirs、@Ceng23333、@DarvenDuan、@DongLiang-0、@Doris-Extras、@Dragonliu2018、@Emor-nj、@FreeOnePlus、@Gabriel39、@GoGoWen、@HappenLee、@HowardQin、@Hyman-zhao、@INNOCENT-BOY、@JNSimba、@JackDrogon、@Jibing-Li、@KassieZ、@Lchangliang、@LemonLiTree、@LiBinfeng-01、@LompleZ、@M1saka2003、@Mryange、@Nitin-Kashyap、@On-Work-Song、@SWJTU-ZhangLei、@StarryVerse、@TangSiyang2001、@Tech-Circle-48、@Thearas、@Vallishp、@WinkerDu、@XieJiann、@XuJianxu、@XuPengfei-1020、@Yukang-Lian、@Yulei-Yang、@Z-SWEI、@ZhongJinHacker、@adonis0147、@airborne12、@allenhooo、@amorynan、@bingquanzhao、@biohazard4321、@bobhan1、@caiconghui、@cambyzju、@caoliang-web、@catpineapple、@cjj2010、@csun5285、@dataroaring、@deardeng、@dongsilun、@dutyu、@echo-hhj、@eldenmoon、@elvestar、@englefly、@feelshana、@feifeifeimoon、@feiniaofeiafei、@felixwluo、@freemandealer、@gavinchou、@ghkang98、@gnehil、@hechao-ustc、@hello-stephen、@httpshirley、@hubgeter、@hust-hhb、@iszhangpch、@iwanttobepowerful、@ixzc、@jacktengg、@jackwener、@jeffreys-cat、@kaijchen、@kaka11chen、@kindred77、@koarz、@kobe6th、@kylinmac、@larshelge、@liaoxin01、@lide-reed、@liugddx、@liujiwen-up、@liutang123、@lsy3993、@luwei16、@luzhijing、@lxliyou001、@mongo360、@morningman、@morrySnow、@mrhhsg、@my-vegetable-has-exploded、@mymeiyi、@nanfeng1999、@nextdreamblue、@pingchunzhang、@platoneko、@py023、@qidaye、@qzsee、@raboof、@rohitrs1983、@rotkang、@ryanzryu
、@seawinde、@shoothzj、@shuke987、@sjyango、@smallhibiscus、@sollhui、@sollhui、@spaces-X、@stalary、@starocean999、@superdiaodiao、@suxiaogang223、@taptao、@vhwzx、@vinlee19、@w41ter、@wangbo、@wangshuo128、@whutpencil、@wsjz、@wuwenchi、@wyxxxcat、@xiaokang、@xiedeyantu、@xiedeyantu、@xingyingone、@xinyiZzz、@xy720、@xzj7019、@yagagagaga、@yiguolei、@yongjinhou、@ytwp、@yuanyuan8983、@yujun777、@yuxuan-luo、@zclllyybb、@zddr、@zfr9527、@zgxme、@zhangbutao、@zhangstar333、@zhannngchen、@zhiqiang-hhhh、@ziyanTOP、@zxealous、@zy-kkk、@zzzxl1993、@zzzzzzzs \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md new file mode 100644 index 0000000000000..9b9007e4391aa --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md @@ -0,0 +1,604 @@ +--- +{ + "title": "Release 3.0.1", + "language": "en" +} +--- + + + +Dear community members, the Apache Doris 3.0.1 version was officially released on August 23, 2024, featuring updates and improvements in compute-storage decoupling, lakehouse, semi-structured data analysis, asynchronous materialized views, and more. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior Changes + +### Query Optimizer + +- Added the variable `use_max_length_of_varchar_in_ctas` to control the length behavior of VARCHAR type when executing `CREATE TABLE AS SELECT` (CTAS) operations. [#37069](https://github.com/apache/doris/pull/37069) + + - This variable is set to true by default. + + - When set to true, if the VARCHAR type column originates from a table, the derived length is used; otherwise, the maximum length is used. + + - When set to false, the VARCHAR type will always use the derived length. + +- All data types will now be displayed in lowercase to maintain compatibility with MySQL format. 
[#38012](https://github.com/apache/doris/pull/38012) + +- Multiple query statements in the same query request must now be separated by semicolons. [#38670](https://github.com/apache/doris/pull/38670) + +### Query Execution + +- The default number of parallel tasks after shuffle operations in the cluster is set to 100, which will improve query stability and concurrent processing capability in large clusters. [#38196](https://github.com/apache/doris/pull/38196) + +### Storage + +- The default value of `trash_file_expire_time_sec` has been changed from 86400 seconds to 0 seconds, which means that if files are deleted by mistake and the FE trash is cleared, the data cannot be recovered. + +- The table attribute `enable_mow_delete_on_delete_predicate` (introduced in version 3.0.0) has been renamed to `enable_mow_light_delete`. + +- Explicit transactions are now prohibited from performing delete operations on tables with written data. + +- Heavy schema change operations are prohibited on tables with auto-increment fields. + + + +## New Features + +### Job Scheduling + +- Optimized the execution logic of internal scheduling jobs, decoupling the strong association between start time and immediate execution parameters. Now, tasks can be created with a specified start time or selected for immediate execution, without conflict, enhancing scheduling flexibility. [#36805](https://github.com/apache/doris/pull/36805) + +### Compute-Storage Decoupled + +- Supports dynamic modification of the upper limit for file cache usage. [#37484](https://github.com/apache/doris/pull/37484) + +- Recycler now supports object storage rate limiting and server-side rate limiting retry functionality. [#37663](https://github.com/apache/doris/pull/37663) [#37680](https://github.com/apache/doris/pull/37680) + +### Lakehouse + +- Added the session variable `serde_dialect` to set the output format for complex types. 
[#37039](https://github.com/apache/doris/pull/37039) + +- SQL interception now supports external tables. + + - For more information, refer to the documentation on [SQL Interception](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Insert overwrite now supports Iceberg tables. [#37191](https://github.com/apache/doris/pull/37191) + +### Asynchronous Materialized Views + +- Supports partition roll-up and build at the hourly level. [#37678](https://github.com/apache/doris/pull/37678) + +- Supports atomic replacement of asynchronous materialized view definition statements. [#36749](https://github.com/apache/doris/pull/36749) + +- Transparent rewriting now supports Insert statements. [#38115](https://github.com/apache/doris/pull/38115) + +- Transparent rewriting now supports the VARIANT type. [#37929](https://github.com/apache/doris/pull/37929) + +### Query Execution + +- The group concat function now supports DISTINCT and ORDER BY options. [#38744](https://github.com/apache/doris/pull/38744) + +### Semi-Structured Data Management + +- The ES Catalog now maps `nested` or `object` types in Elasticsearch to the JSON type in Doris. [#37101](https://github.com/apache/doris/pull/37101) + +- Added the `MULTI_MATCH` function, which supports matching keywords across multiple fields and can leverage inverted indexes to accelerate searches. [#37722](https://github.com/apache/doris/pull/37722) + +- Added the `explode_json_object` function, which can unfold objects in JSON data into multiple rows. [#36887](https://github.com/apache/doris/pull/36887) + +- Inverted indexes now support memtable advancement, requiring index construction only once during multi-replica writes, reducing CPU consumption and improving performance. 
[#35891](https://github.com/apache/doris/pull/35891) + +- Added `MATCH_PHRASE` support for positive slop, e.g., `msg MATCH_PHRASE 'a b 2+'` can match instances containing words a and b with a slop of no more than two, and a preceding b; regular slop without the final `+` does not guarantee this order. [#36356](https://github.com/apache/doris/pull/36356) + +### Other + +- Added the FE parameter `skip_audit_user_list`, where user operations specified in this configuration will not be recorded in the audit log. [#38310](https://github.com/apache/doris/pull/38310) + + - For more information, refer to the documentation on [Audit Plugin](https://doris.apache.org/docs/admin-manual/audit-plugin/). + + + +## Improvements + +### Storage + +- Reduced the likelihood of write failures caused by disk balancing within a single BE. [#38000](https://github.com/apache/doris/pull/38000) + +- Decreased memory consumption by the memtable limiter. [#37511](https://github.com/apache/doris/pull/37511) + +- Moved old partitions to the FE trash during partition replacement operations. [#36361](https://github.com/apache/doris/pull/36361) + +- Optimized memory consumption during compaction. [#37099](https://github.com/apache/doris/pull/37099) + +- Added a session variable to control audit logs for JDBC PreparedStatement, with default setting to not print. [#38419](https://github.com/apache/doris/pull/38419) + +- Optimized the logic for selecting BEs for group commits. [#35558](https://github.com/apache/doris/pull/35558) + +- Improved the performance of column updates. [#38487](https://github.com/apache/doris/pull/38487) + +- Optimized the use of `delete bitmap cache`. [#38761](https://github.com/apache/doris/pull/38761) + +- Added a configuration to control query affinity during hot and cold tiering. [#37492](https://github.com/apache/doris/pull/37492) + +### Compute-Storage Decoupled + +- Implemented automatic retries when encountering object storage server rate limiting. 
[#37199](https://github.com/apache/doris/pull/37199) + +- Adapted the number of threads for memtable flush in the compute-storage decoupled mode. [#38789](https://github.com/apache/doris/pull/38789) + +- Made Azure support a compile-time option so that Doris can be built in environments without Azure support. + +- Optimized the observability of object storage access rate limiting. [#38294](https://github.com/apache/doris/pull/38294) + +- Allowed the file cache TTL queue to perform LRU eviction, enhancing TTL queue usability. [#37312](https://github.com/apache/doris/pull/37312) + +- Reduced the number of edit log write I/O operations performed during balancing in the compute-storage decoupled mode. [#37787](https://github.com/apache/doris/pull/37787) + +- Improved table creation speed in the compute-storage decoupled mode by sending tablet creation requests in batches. [#36786](https://github.com/apache/doris/pull/36786) + +- Mitigated read failures caused by potential inconsistencies in the local file cache by retrying with backoff. [#38645](https://github.com/apache/doris/pull/38645) + +### Lakehouse + +- Optimized memory statistics for Parquet/ORC format read and write operations. [#37234](https://github.com/apache/doris/pull/37234) + +- Trino Connector Catalog now supports predicate pushdown. [#37874](https://github.com/apache/doris/pull/37874) + +- Added a session variable `enable_count_push_down_for_external_table` to control whether to enable `count(*)` pushdown optimization for external tables. [#37046](https://github.com/apache/doris/pull/37046) + +- Optimized the read logic for Hudi snapshot reads, returning an empty set when the snapshot is empty, consistent with Spark behavior. [#37702](https://github.com/apache/doris/pull/37702) + +- Improved the read performance of partition columns for Hive tables. [#37377](https://github.com/apache/doris/pull/37377) + +### Asynchronous Materialized Views + +- Improved transparent rewrite plan speed by 20%.
[#37197](https://github.com/apache/doris/pull/37197) + +- Eliminated roll-up during transparent rewrite if the group key satisfies data uniqueness for better nested matching. [#38387](https://github.com/apache/doris/pull/38387) + +- Transparent rewrite now performs better aggregation elimination to improve the matching success rate of nested materialized views. [#36888](https://github.com/apache/doris/pull/36888) + +### MySQL Compatibility + +- Now correctly populates the database name, table name, and original name in the MySQL protocol result columns. [#38126](https://github.com/apache/doris/pull/38126) + +- Supported the hint format `/*+ func(value) */`. [#37720](https://github.com/apache/doris/pull/37720) + +### Query Optimizer + +- Significantly improved the plan speed for complex queries. [#38317](https://github.com/apache/doris/pull/38317) + +- Adaptively chose whether to perform bucket shuffle based on the number of data buckets to avoid performance degradation in extreme cases. [#36784](https://github.com/apache/doris/pull/36784) + +- Optimized the cost estimation logic for SEMI / ANTI JOIN. [#37951](https://github.com/apache/doris/pull/37951) [#37060](https://github.com/apache/doris/pull/37060) + +- Supported pushing Limit down to the first stage of aggregation to improve performance. [#34853](https://github.com/apache/doris/pull/34853) + +- Partition pruning now supports filter conditions containing the `date_trunc` or `date` function. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- SQL cache now supports query scenarios that include user variables. [#37915](https://github.com/apache/doris/pull/37915) + +- Optimized error messages for invalid aggregation semantics. [#38122](https://github.com/apache/doris/pull/38122) + +### Query Execution + +- Adapted AggState compatibility from 2.1 to 3.x and fixed Coredump issues. 
[#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle without Join. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to be asynchronous to prevent stalling during such queries. [#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process during Hash table construction for Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Optimized the duration of lock holding for MultiCast. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC keepAliveTime and added link monitoring to reduce the probability of query failure due to RPC errors. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits were exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Optimized the processing performance of `aes_encrypt`/`decrypt` functions for constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the processing performance of the `json_extract` function for constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the processing performance of the `ParseUrl` function for constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Semi-Structured Data Management + +- Bitmap indexes now default to using inverted indexes, with `enable_create_bitmap_index_as_inverted_index` set to true by default. [#36692](https://github.com/apache/doris/pull/36692) + +- In the compute-storage decoupled mode, DESC can now view sub-columns of VARIANT type. [#38143](https://github.com/apache/doris/pull/38143) + +- Removed the step of checking file existence during inverted index queries to reduce access latency to remote storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Complex types ARRAY / MAP / STRUCT now support `replace_if_not_null` for AGG tables. 
[#38304](https://github.com/apache/doris/pull/38304) + +- Escape characters for JSON data are now supported. [#37176](https://github.com/apache/doris/pull/37176) [#37251](https://github.com/apache/doris/pull/37251) + +- Inverted index queries now behave consistently on MOW tables and DUP tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Optimized the performance of inverted index acceleration for IN queries. [#37395](https://github.com/apache/doris/pull/37395) + +- Reduced unnecessary memory allocation during TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- When creating an inverted index with tokenization, the `support_phrase` option is now automatically enabled to accelerate `match_phrase` series phrase queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Other + +- The audit log can now record SQL types. [#37790](https://github.com/apache/doris/pull/37790) + +- Added support for `information_schema.processlist` to show all FEs. [#38701](https://github.com/apache/doris/pull/38701) + +- Cached Ranger's `datamask` and `rowpolicy` to improve query efficiency. [#37723](https://github.com/apache/doris/pull/37723) + +- Optimized metadata management in job manager to release locks immediately after modifying metadata, reducing lock holding time. [#38162](https://github.com/apache/doris/pull/38162) + + + +## Bug Fixes + +### Upgrade + +- Fix the issue where `mtmv load` fails during upgrade from version 2.1. [#38799](https://github.com/apache/doris/pull/38799) + +- Resolve the issue where `null_type` cannot be found during the upgrade to version 2.1. [#39373](https://github.com/apache/doris/pull/39373) + +- Address the compatibility issue with permission persistence during the upgrade from version 2.1 to 3.0. [#39288](https://github.com/apache/doris/pull/39288) + +### Load + +- Fix the issue where parsing fails when the newline character is surrounded by delimiters in CSV format parsing.
[#38347](https://github.com/apache/doris/pull/38347) +- Resolve potential exception issues when FE forwards group commit. [#38228](https://github.com/apache/doris/pull/38228) [#38265](https://github.com/apache/doris/pull/38265) + +- Group commit now supports the new optimizer. [#37002](https://github.com/apache/doris/pull/37002) + +- Fix the issue where group commit reports data errors when JDBC setNull is used. [#38262](https://github.com/apache/doris/pull/38262) + +- Optimize the retry logic for group commit when encountering `delete bitmap lock` errors. [#37600](https://github.com/apache/doris/pull/37600) + +- Resolve the issue where routine load cannot use CSV delimiters and escape characters. [#38402](https://github.com/apache/doris/pull/38402) + +- Fix the issue where routine load job names with mixed case cannot be displayed. [#38523](https://github.com/apache/doris/pull/38523) + +- Optimize the logic for actively recovering routine load during FE master-slave switching. [#37876](https://github.com/apache/doris/pull/37876) + +- Resolve the issue where routine load pauses when all data in Kafka is expired. [#37288](https://github.com/apache/doris/pull/37288) + +- Fix the issue where `show routine load` returns empty results. [#38199](https://github.com/apache/doris/pull/38199) + +- Resolve the memory leak issue during multi-table stream import in routine load. [#38255](https://github.com/apache/doris/pull/38255) + +- Fix the issue where stream load does not return the error URL. [#38325](https://github.com/apache/doris/pull/38325) + +- Resolve potential load channel leak issues. [#38031](https://github.com/apache/doris/pull/38031) [#37500](https://github.com/apache/doris/pull/37500) + +- Fix the issue where no error may be reported when importing fewer segments than expected. [#36753](https://github.com/apache/doris/pull/36753) + +- Resolve the load stream leak issue. 
[#38912](https://github.com/apache/doris/pull/38912) + +- Reduce the impact of offline nodes on import operations. [#38198](https://github.com/apache/doris/pull/38198) + +- Fix the issue where transactions do not end when inserting into empty data. [#38991](https://github.com/apache/doris/pull/38991) + +### Storage + +**01 Backup and Restoration** + +- Fix the issue where tables cannot be written to after backup and restoration. [#37089](https://github.com/apache/doris/pull/37089) + +- Resolve the issue where view database names are incorrect after backup and restoration. [#37412](https://github.com/apache/doris/pull/37412) + +**02 Compaction** + +- Fix the issue where cumulative compaction handles delete errors incorrectly during ordered data compaction. [#38742](https://github.com/apache/doris/pull/38742) + +- Resolve the issue of duplicate keys in aggregate tables caused by the ordered data compaction optimization. [#38224](https://github.com/apache/doris/pull/38224) + +- Fix the issue where compaction operations cause coredump in large wide tables. [#37960](https://github.com/apache/doris/pull/37960) + +- Resolve the compaction starvation issue caused by inaccurate concurrency statistics for compaction tasks. [#37318](https://github.com/apache/doris/pull/37318) + +**03 MOW Unique Key** + +- Resolve the issue of inconsistent data between replicas caused by cumulative compaction removing the delete sign. [#37950](https://github.com/apache/doris/pull/37950) + +- MOW delete now uses partial column updates with the new optimizer. [#38751](https://github.com/apache/doris/pull/38751) + +- Fix the potential duplicate key issue in MOW tables in the compute-storage decoupled mode. [#39018](https://github.com/apache/doris/pull/39018) + +- Resolve the issue where MOW unique and duplicate tables cannot modify column order. [#37067](https://github.com/apache/doris/pull/37067) + +- Fix the potential data correctness issue caused by segcompaction. 
[#37760](https://github.com/apache/doris/pull/37760) + +- Resolve the potential memory leak issue during column updates. [#37706](https://github.com/apache/doris/pull/37706) + +**04 Other** + +- Fix the small probability of exceptions in TOPN queries. [#39119](https://github.com/apache/doris/pull/39119) [#39199](https://github.com/apache/doris/pull/39199) + +- Resolve the issue where auto-increment IDs may duplicate during FE restart. [#37306](https://github.com/apache/doris/pull/37306) + +- Fix the potential queuing issue in the delete operation priority queue. [#37169](https://github.com/apache/doris/pull/37169) + +- Optimize the delete retry logic. [#37363](https://github.com/apache/doris/pull/37363) + +- Resolve the issue with `bucket = 0` in table creation statements under the new optimizer. [#38971](https://github.com/apache/doris/pull/38971) + +- Fix the issue where FE reports success incorrectly when image generation fails. [#37508](https://github.com/apache/doris/pull/37508) + +- Resolve the issue where using the wrong nodename during FE offline nodes may cause inconsistent FE members. [#37987](https://github.com/apache/doris/pull/37987) + +- Fix the issue where CCR partition addition may fail. [#37295](https://github.com/apache/doris/pull/37295) + +- Resolve the `int32` overflow issue in inverted index files. [#38891](https://github.com/apache/doris/pull/38891) + +- Fix the issue where TRUNCATE TABLE failure may cause BE to fail to go offline. [#37334](https://github.com/apache/doris/pull/37334) + +- Resolve the issue where publish cannot continue due to null pointers. [#37724](https://github.com/apache/doris/pull/37724) [#37531](https://github.com/apache/doris/pull/37531) + +- Fix the potential coredump issue when manually triggering disk migration. [#37712](https://github.com/apache/doris/pull/37712) + +### Compute-Storage Decoupled + +- Fixed the issue where `show create table` might display the `file_cache_ttl_seconds` attribute twice. 
[#38052](https://github.com/apache/doris/pull/38052) + +- Fixed the issue where segment Footer TTL was not set correctly after setting file cache TTL. [#37485](https://github.com/apache/doris/pull/37485) + +- Fixed the issue where file cache might cause coredump due to massive conversion of cache types. [#38518](https://github.com/apache/doris/pull/38518) + +- Fixed the potential file descriptor (fd) leak in file cache. [#38051](https://github.com/apache/doris/pull/38051) + +- Fixed the issue where schema change Job overwriting compaction Job prevented base tablet compaction from completing normally. [#38210](https://github.com/apache/doris/pull/38210) + +- Fixed the potential inaccuracy of base compaction score due to data race. [#38006](https://github.com/apache/doris/pull/38006) + +- Fixed the issue where error messages from imports might not be uploaded correctly to object storage. [#38359](https://github.com/apache/doris/pull/38359) + +- Fixed the inconsistency in return information between compute-storage decoupled mode and storage and compute integration mode for 2PC imports. [#38076](https://github.com/apache/doris/pull/38076) + +- Fix the issue where incorrect file size setting during file cache warm-up leads to coredump. [#38939](https://github.com/apache/doris/pull/38939) + +- Fixed the issue where partial column updates did not correctly dequeue delete operations. [#37151](https://github.com/apache/doris/pull/37151) + +- Fixed compatibility issues with permission persistence in compute-storage decoupled mode. [#38136](https://github.com/apache/doris/pull/38136) [#37708](https://github.com/apache/doris/pull/37708) + +- Fixed the issue where observer did not retry correctly when encountering a `-230` error. [#37625](https://github.com/apache/doris/pull/37625) + +- Fixed the issue where `show load` with conditions did not perform correct analysis. 
[#37656](https://github.com/apache/doris/pull/37656) + +- Fixed the issue where `show streamload` in compute-storage decoupled mode caused BE coredump. [#37903](https://github.com/apache/doris/pull/37903) + +- Fixed the issue where `copy into` did not correctly verify column names in strict mode. [#37650](https://github.com/apache/doris/pull/37650) + +- Fixed the issue where multi-stream imports into a single table lacked permissions. [#38878](https://github.com/apache/doris/pull/38878) + +- Fixed the potential overflow issue in `getVersionUpdateTimeMs`. [#38074](https://github.com/apache/doris/pull/38074) + +- Fixed the issue where FE azure blob list was not implemented correctly. [#37986](https://github.com/apache/doris/pull/37986) + +- Fixed the issue where inaccurate azure blob recycling time calculation prevented recycling. [#37535](https://github.com/apache/doris/pull/37535) + +- Fixed the issue where inverted index files were not deleted in compute-storage decoupled mode. [#38306](https://github.com/apache/doris/pull/38306) + +### Lakehouse + +- Fixed the issue with reading binary data from Oracle Catalog. [#37078](https://github.com/apache/doris/pull/37078) + +- Fixed the potential deadlock issue when acquiring external table metadata in multi-FE scenarios. [#37756](https://github.com/apache/doris/pull/37756) + +- Fixed the issue where JNI scanner failure caused BE nodes to crash. [#37697](https://github.com/apache/doris/pull/37697) + +- Fixed the issue with slow reading of date types from Trino Connector Catalog. [#37266](https://github.com/apache/doris/pull/37266) + +- Optimized kerberos authentication logic for Hive Catalog. [#37301](https://github.com/apache/doris/pull/37301) + +- Fixed the issue where region attributes might be parsed incorrectly when parsing MinIO properties. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed the issue where creating too many FileSystems by FE caused memory leaks. 
[#36954](https://github.com/apache/doris/pull/36954) + +- Fixed the issue with reading incorrect time zone information from Paimon. [#37716](https://github.com/apache/doris/pull/37716) + +- Fixed the potential thread leak issue caused by Hive write-back operations. [#36990](https://github.com/apache/doris/pull/36990) + +- Fixed the null pointer issue caused by enabling Hive metastore event synchronization. [#38421](https://github.com/apache/doris/pull/38421) + +- Fixed the issue where error messages were unclear or caused stalling when creating catalogs. [#37551](https://github.com/apache/doris/pull/37551) + +- Fixed the issue where reading Hive text format tables behaved differently from Hive. [#37638](https://github.com/apache/doris/pull/37638) + +- Fixed the logic error when switching between catalogs and databases. [#37828](https://github.com/apache/doris/pull/37828) + +### MySQL Compatibility + +- Fixed the issue where certain flags in the MySQL protocol were set incorrectly when SSL was enabled. [#38086](https://github.com/apache/doris/pull/38086) + +### Asynchronous Materialized Views + +- Fixed the issue where construction might fail when the base table had a very large number of partitions. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed the issue where nested materialized views incorrectly performed full table refreshes even when partition refreshes were possible. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixed the issue where partition refresh could not handle the simultaneous existence of valid and invalid dependencies when analyzing partition dependencies. [#38367](https://github.com/apache/doris/pull/38367) + +- Fixed the issue where the final result containing NULL type might cause asynchronous materialized views to fail. 
[#37019](https://github.com/apache/doris/pull/37019) + +- Fixed the planning error that might occur during transparent rewriting when both synchronous and asynchronous materialized views with the same name were present. [#37311](https://github.com/apache/doris/pull/37311) + +### Synchronous Materialized Views + +- The rewritten synchronous materialized views now can correctly perform partition pruning. [#38527](https://github.com/apache/doris/pull/38527) + +- When rewriting synchronous materialized views, those with unready data are no longer selected. [#38148](https://github.com/apache/doris/pull/38148) + +### Query Optimizer + +- Fixed the deadlock issue that might occur when queries and delete operations are performed simultaneously. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the issue where bucket pruning might incorrectly prune on decimal column buckets. [#37889](https://github.com/apache/doris/pull/37889) + +- Fixed the issue where planning might be incorrect when mark join participates in join reorder. [#39152](https://github.com/apache/doris/pull/39152) + +- Fixed the issue where the result is incorrect when the correlation condition of a correlated subquery is not a simple column. [#37644](https://github.com/apache/doris/pull/37644) + +- Fixed the issue where partition pruning cannot correctly handle or expressions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the planning error that might occur when optimizing the execution order of JOIN and AGG. [#37343](https://github.com/apache/doris/pull/37343) + +- Fixed the issue where `str_to_date` performs incorrect constant folding calculations on datev1 types. [#37360](https://github.com/apache/doris/pull/37360) + +- Fixed the issue where the ACOS function's constant folding returns non-NaN values. [#37932](https://github.com/apache/doris/pull/37932) + +- Fixed the occasional planning error: "The children format needs to be [WhenClause+, DefaultValue?]". 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where planning might be incorrect when the projection includes window functions and there is both the original column and its alias. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where planning might report an error when the aggregation parameter contains a lambda expression. [#37109](https://github.com/apache/doris/pull/37109) + +- Fixed the insert error that might occur in extreme cases: "MultiCastDataSink cannot be cast to DataStreamSink". [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where the new optimizer does not correctly handle `char(0)/varchar(0)` when creating a table. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the incorrect behavior of `char(255) toSql`. [#37340](https://github.com/apache/doris/pull/37340) + +- Fixed the issue where the nullable attribute within the `agg_state` type might lead to planning errors. [#37489](https://github.com/apache/doris/pull/37489) +- Fixed the issue where row count statistics are inaccurate during mark Join. [#38270](https://github.com/apache/doris/pull/38270) + +### Query Execution + +- Fixed issues where the Pipeline execution engine was stuck, causing queries to not end, in multiple scenarios. [#38657](https://github.com/apache/doris/pull/38657), [#38206](https://github.com/apache/doris/pull/38206), [#38885](https://github.com/apache/doris/pull/38885), [#38151](https://github.com/apache/doris/pull/38151), [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed the coredump issue caused by NULL and non-NULL columns during set difference calculations. [#38750](https://github.com/apache/doris/pull/38750) + +- Fixed the error when using the DECIMAL type with pure decimals in delete statements. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed the issue where the `width_bucket` function returned incorrect results. 
[#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error when a single row of data was very large and the result set was also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the coredump issue caused by incorrect release of rpc connections during single-replica imports. [#38087](https://github.com/apache/doris/pull/38087) + +- Fixed the coredump issue caused by processing NULL values with the `foreach` function. [#37349](https://github.com/apache/doris/pull/37349) + +- Fixed the issue where stddev returned incorrect results for DECIMALV2 types. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the slow performance of `bitmap union` calculations. [#37816](https://github.com/apache/doris/pull/37816) + +- Fixed the issue where RowsProduced for aggregation operators was not set in the profile. [#38271](https://github.com/apache/doris/pull/38271) + +- Fixed the overflow issue when calculating the number of buckets for the hash table under hash join. [#37193](https://github.com/apache/doris/pull/37193), [#37493](https://github.com/apache/doris/pull/37493) + +- Fixed the inaccurate recording of the `jemalloc cache memory tracker`. [#37464](https://github.com/apache/doris/pull/37464) + +- Added the `enable_stacktrace` configuration option, allowing users to control whether exception stacks are output in BE logs. [#37713](https://github.com/apache/doris/pull/37713) + +- Fixed the issue where Arrow Flight SQL did not work correctly when `enable_parallel_result_sink` was set to false. [#37779](https://github.com/apache/doris/pull/37779) + +- Fixed the incorrect use of colocate Join. [#37361](https://github.com/apache/doris/pull/37361), [#37729](https://github.com/apache/doris/pull/37729) + +- Fixed the calculation overflow issue of the `round` function on DECIMAL128 types. 
[#37733](https://github.com/apache/doris/pull/37733), [#38106](https://github.com/apache/doris/pull/38106) + +- Fixed the coredump issue when passing a const string to the `sleep` function. [#37681](https://github.com/apache/doris/pull/37681) + +- Increased the queue length for audit logs, solving the issue where audit logs could not be recorded normally under high concurrency scenarios with thousands of concurrent connections. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed the issue where creating a workload group caused too many threads, leading to BE coredump. [#38096](https://github.com/apache/doris/pull/38096) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the transaction rollback issue caused by `insert overwrite auto partition`. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the issue where the TimeUtils formatter did not use the correct time zone. [#37465](https://github.com/apache/doris/pull/37465) + +- Fixed the issue where results were incorrect under constant folding scenarios for week/yearweek. [#37376](https://github.com/apache/doris/pull/37376) + +- Fixed the issue where the `convert_tz` function returned incorrect results. [#37358](https://github.com/apache/doris/pull/37358), [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by `percentile_approx` during rolling upgrades. [#39321](https://github.com/apache/doris/pull/39321) + +- Fixed the coredump issue caused by the `mod` function when encountering abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where the hash table was not fully built when the broadcast join probe started running. 
[#37643](https://github.com/apache/doris/pull/37643) + +- Fixed the issue where executing the same expression in multithreaded environments might lead to incorrect results for Java UDFs. [#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by incorrect return types of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the issue where the `json_replace` function returned incorrect types. [#37014](https://github.com/apache/doris/pull/37014) + +- Fixed the issue where the nullable attribute setting was unreasonable for the `percentile` aggregation function. [#37330](https://github.com/apache/doris/pull/37330) + +- Fixed the issue where the results of the `histogram` function were unstable. [#38608](https://github.com/apache/doris/pull/38608) + +- Fixed the issue where task state was displayed incorrectly in the profile. [#38082](https://github.com/apache/doris/pull/38082) + +- Fixed the issue where some queries were incorrectly canceled when the system just started. [#37662](https://github.com/apache/doris/pull/37662) + +### Semi-Structured Data Management + +- Fix some issues with time series compression. [#39170](https://github.com/apache/doris/pull/39170) [#39176](https://github.com/apache/doris/pull/39176) + +- Fix the issue of incorrect index size statistics during compression. [#37232](https://github.com/apache/doris/pull/37232) + +- Fix the potential incorrect matching of ultra-long strings without tokenization in inverted indexes. [#37679](https://github.com/apache/doris/pull/37679) [#38218](https://github.com/apache/doris/pull/38218) + +- Fix the high memory usage issue of `array_range` and `array_with_const` functions when dealing with large data volumes. [#38284](https://github.com/apache/doris/pull/38284) [#37495](https://github.com/apache/doris/pull/37495) + +- Fix the potential coredump issue when selecting columns of ARRAY / MAP / STRUCT types. 
[#37936](https://github.com/apache/doris/pull/37936) + +- Fix the import failure issue caused by simdjson parsing errors when specifying jsonpath in Stream Load. [#38490](https://github.com/apache/doris/pull/38490) + +- Fix the exception handling issue when there are duplicate keys in JSON data. [#38146](https://github.com/apache/doris/pull/38146) + +- Fix the potential query error after DROP INDEX. [#37646](https://github.com/apache/doris/pull/37646) + +- Fix the error return issue in row merging checks during index compression. [#38732](https://github.com/apache/doris/pull/38732) + +- Inverted index v2 format now supports renaming columns. [#38079](https://github.com/apache/doris/pull/38079) + +- Fix the coredump issue when the `MATCH` function matches an empty string without an index. [#37947](https://github.com/apache/doris/pull/37947) + +- Fix the handling of NULL values in inverted indexes. [#37921](https://github.com/apache/doris/pull/37921) [#37842](https://github.com/apache/doris/pull/37842) [#38741](https://github.com/apache/doris/pull/38741) + +- Fix the incorrect `row_store_page_size` after FE restart. [#38240](https://github.com/apache/doris/pull/38240) + +### Other + +- Fix the timezone configuration issue. The default timezone is no longer fixed at UTC+8 and is now obtained from system configuration. [#37294](https://github.com/apache/doris/pull/37294) + +- Fix the class conflict issue when using ranger due to multiple JSR specification implementations. [#37575](https://github.com/apache/doris/pull/37575) + +- Fix the potential uninitialized field issue in some BE code. [#37403](https://github.com/apache/doris/pull/37403) + +- Fix the error in delete statements for random distributed tables. [#37985](https://github.com/apache/doris/pull/37985) + +- Fix the incorrect requirement for `alter_priv` permission on the base table when creating a synchronized materialized view. 
[#38011](https://github.com/apache/doris/pull/38011) + +- Fix the issue of not authenticating resources when used in TVF. [#36928](https://github.com/apache/doris/pull/36928) + + +## Credits + +Thanks to everyone who contributed to this release: + +@133tosakarin, @924060929, @AshinGau, @Baymine, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @Ceng23333, @DarvenDuan, @FreeOnePlus, @Gabriel39, @HappenLee, @JNSimba, @Jibing-Li, @KassieZ, @Lchangliang, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Tech-Circle-48, @Vallishp, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @cjj2010, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @gavinchou, @ghkang98, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @kaijchen, @kaka11chen, @keanji-x, @liaoxin01, @liutang123, @luwei16, @luzhijing, @lxr599, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @w41ter, @wangbo, @wangshuo128, @whutpencil, @wsjz, @wuwenchi, @wyxxxcat, @xiaokang, @xiedeyantu, @xinyiZzz, @xy720, @xzj7019, @yagagagaga, @yiguolei, @yujun777, @z404289981, @zclllyybb, @zddr, @zfr9527, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zjj, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md new file mode 100644 index 0000000000000..0ab6a828ab95d --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md @@ -0,0 +1,341 @@ +--- +{ + "title": "Release 3.0.2", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.2 version was officially released on October 15, 2024, featuring updates and improvements in compute-storage decoupling, data storage, lakehouse, query optimizer, query execution and more. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +### Storage + +- Limited the number of tablets in a single backup task to prevent FE memory overflow. [#40518](https://github.com/apache/doris/pull/40518) +- The `SHOW PARTITIONS` command now displays the `CommittedVersion` of partitions. [#28274](https://github.com/apache/doris/pull/28274) + +### Other + +- The default printing mode (asynchronous) of `fe.log` now includes file line number information. If performance issues are encountered due to line number output, please switch to BRIEF mode. [#39419](https://github.com/apache/doris/pull/39419) +- The default value of the session variable `ENABLE_PREPARED_STMT_AUDIT_LOG` has been changed from `true` to `false`, and the audit log of prepare statements will no longer be printed. [#38865](https://github.com/apache/doris/pull/38865) +- The default value of the session variable `max_allowed_packet` has been adjusted from 1MB to 16MB to align with MySQL 8.4. [#38697](https://github.com/apache/doris/pull/38697) +- The JVM of FE and BE defaults to using the UTF-8 character set. [#39521](https://github.com/apache/doris/pull/39521) + +## New Features + +### Storage + +- Backup and recovery now support clearing tables or partitions that are not in the backup. [#39028](https://github.com/apache/doris/pull/39028) + +### Compute-Storage Decoupled + +- Support for parallel recycling of expired data on multiple tablets. [#37630](https://github.com/apache/doris/pull/37630) +- Support for changing storage vaults through `ALTER` statements. [#38685](https://github.com/apache/doris/pull/38685) [#37606](https://github.com/apache/doris/pull/37606) +- Support for importing a large number of tablets (5000+) in a single transaction (experimental feature). 
[#38243](https://github.com/apache/doris/pull/38243) +- Support for automatically aborting pending transactions caused by reasons such as node restarts, solving the issue of pending transactions blocking decommission or schema change. [#37669](https://github.com/apache/doris/pull/37669) +- A new session variable `enable_segment_cache` has been added to control whether to use segment cache during queries (default is `true`). [#37141](https://github.com/apache/doris/pull/37141) +- Resolved the issue of not being able to import a large amount of data during schema changes in compute-storage decoupled mode. [#39558](https://github.com/apache/doris/pull/39558) +- Support for adding multiple follower roles of FE in compute-storage decoupled mode. [#38388](https://github.com/apache/doris/pull/38388) +- Support for using memory as file cache to accelerate queries in environments with no disks or low-performance HDDs. [#38811](https://github.com/apache/doris/pull/38811) + +### Lakehouse + +- New Lakesoul Catalog has been added. [Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- A new system table `catalog_meta_cache_statistics` has been added to view the usage of various metadata caches in external catalog. [#40155](https://github.com/apache/doris/pull/40155) + +### Query Optimizer + +- Support for `is [not] true/false` expressions. [#38623](https://github.com/apache/doris/pull/38623) + +### Query Execution + +- A new CRC32 function has been added. [#38204](https://github.com/apache/doris/pull/38204) +- New aggregate functions skew and kurt have been added. [#41277](https://github.com/apache/doris/pull/41277) +- Profiles are now persisted to the FE's disk to retain more profiles. [#33690](https://github.com/apache/doris/pull/33690) +- A new system table `workload_group_privileges` has been added to view permission information related to workload groups. 
[#38436](https://github.com/apache/doris/pull/38436) +- A new system table `workload_group_resource_usage` has been added to monitor resource statistics of workload groups. [#39177](https://github.com/apache/doris/pull/39177) +- Workload groups now support limiting reads of local IO and remote IO. [#39012](https://github.com/apache/doris/pull/39012) +- Workload groups now support cgroupv2 to limit CPU usage. [#39374](https://github.com/apache/doris/pull/39374) +- A new system table `information_schema.partitions` has been added to view some table creation attributes. [#40636](https://github.com/apache/doris/pull/40636) + +### Other + +- Support for using the `SHOW` statement to display BE's configuration information, such as `SHOW BACKEND CONFIG LIKE ${pattern}`. [#36525](https://github.com/apache/doris/pull/36525) + +## Improvements + +### Load + +- Improved the import efficiency of routine load when encountering frequent EOFs from Kafka. [#39975](https://github.com/apache/doris/pull/39975) +- The stream load result now includes the time taken to read HTTP data, `ReceiveDataTimeMs`, which can quickly determine slow stream load issues caused by network reasons. [#40735](https://github.com/apache/doris/pull/40735) +- Optimized the routine load timeout logic to avoid frequent timeouts during inverted index and mow writes. [#40818](https://github.com/apache/doris/pull/40818) + +### Storage + +- Support for batch addition of partitions. [#37114](https://github.com/apache/doris/pull/37114) + +### Compute-Storage Decoupled + +- Added the meta-service HTTP interface `/MetaService/http/show_meta_ranges` to facilitate the statistics of KV distribution in FDB. [#39208](https://github.com/apache/doris/pull/39208) +- The meta-service/recycler stop script ensures that the process fully exits before returning. 
[#40218](https://github.com/apache/doris/pull/40218) +- Support for using the session variable `version_comment` (Cloud Mode) to display the current deployment mode as compute-storage decoupled. [#38269](https://github.com/apache/doris/pull/38269) +- Fixed the detailed message returned when transaction submission fails. [#40584](https://github.com/apache/doris/pull/40584) +- Support for using one meta-service process to provide both metadata services and data recycling services. [#40223](https://github.com/apache/doris/pull/40223) +- Optimized the default configuration of file_cache to avoid potential issues when not set. [#41421](https://github.com/apache/doris/pull/41421) [#41507](https://github.com/apache/doris/pull/41507) +- Improved query performance by batch retrieving the version of multiple partitions. [#38949](https://github.com/apache/doris/pull/38949) +- Delayed the redistribution of tablets to avoid query performance issues caused by temporary network fluctuations. [#40371](https://github.com/apache/doris/pull/40371) +- Optimized the read-write lock logic in the balance. [#40633](https://github.com/apache/doris/pull/40633) +- Enhanced the robustness of file cache in handling TTL filenames during restarts/crashes. [#40226](https://github.com/apache/doris/pull/40226) +- Added the BE HTTP interface `/api/file_cache?op=hash` to facilitate the calculation of the hash file names of segment files on disk. [#40831](https://github.com/apache/doris/pull/40831) +- Optimized the unified naming to be compatible with using compute group to represent BE groups (original cloud cluster). [#40767](https://github.com/apache/doris/pull/40767) +- Optimized the waiting time for obtaining locks when calculating delete bitmaps in primary key tables. [#40341](https://github.com/apache/doris/pull/40341) +- When there are many delete bitmaps in primary key tables, optimized the high CPU consumption during queries by pre-merging multiple delete bitmaps. 
[#40204](https://github.com/apache/doris/pull/40204) +- Support for managing FE/BE nodes in compute-storage decoupled mode through SQL statements, hiding the logic of direct interaction with meta-service when deploying in compute-storage decoupled mode. [#40264](https://github.com/apache/doris/pull/40264) +- Added a script for rapid deployment of FDB. [#39803](https://github.com/apache/doris/pull/39803) +- Optimized the output of `SHOW CACHE HOTSPOT` to unify the column name style with other `SHOW` statements. [#41322](https://github.com/apache/doris/pull/41322) +- When using a storage vault as the storage backend, disallowed the use of `latest_fs()` to avoid binding different storage backends to the same table. [#40516](https://github.com/apache/doris/pull/40516) +- Optimized the timeout strategy for calculating delete bitmaps when importing mow tables. [#40562](https://github.com/apache/doris/pull/40562) [#40333](https://github.com/apache/doris/pull/40333) +- The enable_file_cache in be.conf is now enabled by default in compute-storage decoupled mode. [#41502](https://github.com/apache/doris/pull/41502) + +### Lakehouse + +- When reading tables in CSV format, support for the session `keep_carriage_return` setting to control the reading behavior of the `\r` symbol. [#39980](https://github.com/apache/doris/pull/39980) +- The default maximum memory of BE's JVM has been adjusted to 2GB (affecting only new deployments). [#41403](https://github.com/apache/doris/pull/41403) +- Hive Catalog has added `hive.recursive_directories_table` and `hive.ignore_absent_partitions` properties to specify whether to recursively traverse data directories and whether to ignore missing partitions. [#39494](https://github.com/apache/doris/pull/39494) +- Optimized the Catalog refresh logic to avoid generating a large number of connections during refresh. 
[#39205](https://github.com/apache/doris/pull/39205) +- `SHOW CREATE DATABASE` and `SHOW CREATE TABLE` for external data sources now display location information. [#39179](https://github.com/apache/doris/pull/39179) +- The new optimizer supports inserting data into JDBC external tables using the `INSERT INTO` statement. [#41511](https://github.com/apache/doris/pull/41511) +- MaxCompute Catalog now supports complex data types. [#39259](https://github.com/apache/doris/pull/39259) +- Optimized the logic for reading and merging data shards of external tables. [#38311](https://github.com/apache/doris/pull/38311) +- Optimized some refresh strategies for metadata caches of external tables. [#38506](https://github.com/apache/doris/pull/38506) +- Paimon tables now support pushing down `IN/NOT IN` predicates. [#38390](https://github.com/apache/doris/pull/38390) +- Compatible with tables created in Parquet format by Paimon version 0.9. [#41020](https://github.com/apache/doris/pull/41020) + +### Asynchronous Materialized Views + +- Building asynchronous materialized views now supports the use of both immediate and starttime. [#39573](https://github.com/apache/doris/pull/39573) +- Asynchronous materialized views based on external tables will refresh the metadata cache of the external tables before refreshing the materialized views, ensuring construction based on the latest external table data. [#38212](https://github.com/apache/doris/pull/38212) +- Partition incremental construction now supports rolling up according to weekly and quarterly granularities. [#39286](https://github.com/apache/doris/pull/39286) + +### Query Optimizer + +- The aggregate function `GROUP_CONCAT` now supports the use of both `DISTINCT` and `ORDER BY`. [#38080](https://github.com/apache/doris/pull/38080) +- Optimized the collection and use of statistical information, as well as the logic for estimating row counts and cost calculations, to generate more efficient and stable execution plans. 
+- Window function partition data pre-filtering now supports cases containing multiple window functions. [#38393](https://github.com/apache/doris/pull/38393) + +### Query Execution + +- Reduced query latency by running prepare pipeline tasks in parallel. [#40874](https://github.com/apache/doris/pull/40874) +- Display Catalog information in Profile. [#38283](https://github.com/apache/doris/pull/38283) +- Optimized the computational performance of `IN` filtering conditions. [#40917](https://github.com/apache/doris/pull/40917) +- Supported cgroupv2 in K8S to limit Doris's memory usage. [#39256](https://github.com/apache/doris/pull/39256) +- Optimized the performance of converting strings to datetime types. [#38385](https://github.com/apache/doris/pull/38385) +- When a `string` is a decimal number, support casting it to an `int`, which will be more compatible with certain behaviors of MySQL. [#38847](https://github.com/apache/doris/pull/38847) + +### Semi-Structured Data Management + +- Optimized the performance of inverted index matching. [#41122](https://github.com/apache/doris/pull/41122) +- Temporarily prohibited the creation of inverted indexes with tokenization on arrays. [#39062](https://github.com/apache/doris/pull/39062) +- `explode_json_array` now supports binary JSON types. [#37278](https://github.com/apache/doris/pull/37278) +- IP data types now support bloomfilter indexes. [#39253](https://github.com/apache/doris/pull/39253) +- IP data types now support row storage. [#39258](https://github.com/apache/doris/pull/39258) +- Nested data types such as ARRAY, MAP, and STRUCT now support schema changes. [#39210](https://github.com/apache/doris/pull/39210) +- When creating MTMV, automatically truncate KEYs encountered in VARIANT data types. [#39988](https://github.com/apache/doris/pull/39988) +- Lazy loading of inverted indexes during queries to improve performance. 
[#38979](https://github.com/apache/doris/pull/38979) +- `add inverted index file size for open file`. [#37482](https://github.com/apache/doris/pull/37482) +- Reduced access to object storage interfaces during compaction to improve performance. [#41079](https://github.com/apache/doris/pull/41079) +- Added three new query profile metrics related to inverted indexes. [#36696](https://github.com/apache/doris/pull/36696) +- Reduced cache overhead for non-PreparedStatement SQL to improve performance. [#40910](https://github.com/apache/doris/pull/40910) +- Pre-warming cache now supports inverted indexes. [#38986](https://github.com/apache/doris/pull/38986) +- Inverted indexes are now cached immediately after writing. [#39076](https://github.com/apache/doris/pull/39076) + +### Compatibility + +- Fixed the issue of Thrift ID incompatibility on the master with branch-2.1. [#41057](https://github.com/apache/doris/pull/41057) + +### Other + +- BE HTTP API now supports authentication; set config::enable_all_http_auth to true (default is false) when authentication is required. [#39577](https://github.com/apache/doris/pull/39577) +- Optimized the user permissions required for the REFRESH operation. Permissions have been relaxed from ALTER to SHOW. [#39008](https://github.com/apache/doris/pull/39008) +- Reduced the range of nextId when calling advanceNextId(). [#40160](https://github.com/apache/doris/pull/40160) +- Optimized the caching mechanism for Java UDFs. [#40404](https://github.com/apache/doris/pull/40404) + +## Bug Fixes + +### Load + +- Fixed the issue where `abortTransaction` did not handle return codes. [#41275](https://github.com/apache/doris/pull/41275) +- Fixed the issue where transactions failed to commit or abort in compute-storage decoupled mode without calling `afterCommit/afterAbort`. 
[#41267](https://github.com/apache/doris/pull/41267) +- Fixed the issue where Routine Load could not work properly when modifying consumer offsets in compute-storage decoupled mode. [#39159](https://github.com/apache/doris/pull/39159) +- Fixed the issue of repeatedly closing file handles when obtaining error log file paths. [#41320](https://github.com/apache/doris/pull/41320) +- Fixed the issue of incorrect job progress caching for Routine Load in compute-storage decoupled mode. [#39313](https://github.com/apache/doris/pull/39313) +- Fixed the issue where Routine Load could get stuck when failing to commit transactions in compute-storage decoupled mode. [#40539](https://github.com/apache/doris/pull/40539) +- Fixed the issue where Routine Load kept reporting data quality check errors in compute-storage decoupled mode. [#39790](https://github.com/apache/doris/pull/39790) +- Fixed the issue where Routine Load did not check transactions before committing in compute-storage decoupled mode. [#39775](https://github.com/apache/doris/pull/39775) +- Fixed the issue where Routine Load did not check transactions before aborting in compute-storage decoupled mode. [#40463](https://github.com/apache/doris/pull/40463) +- Fixed the issue where cluster keys did not support certain data types. [#38966](https://github.com/apache/doris/pull/38966) +- Fixed the issue of transactions being repeatedly committed. [#39786](https://github.com/apache/doris/pull/39786) +- Fixed the issue of use after free with WAL when BE exits. [#33131](https://github.com/apache/doris/pull/33131) +- Fixed the issue where WAL playback did not skip completed import transactions in compute-storage decoupled mode. [#41262](https://github.com/apache/doris/pull/41262) +- Fixed the logic for selecting BE in group commit in compute-storage decoupled mode. 
[#39986](https://github.com/apache/doris/pull/39986) [#38644](https://github.com/apache/doris/pull/38644) +- Fixed the issue where BE might crash when group commit was enabled for insert into. [#39339](https://github.com/apache/doris/pull/39339) +- Fixed the issue where insert into with group commit enabled might get stuck. [#39391](https://github.com/apache/doris/pull/39391) +- Fixed the issue where not enabling the group commit option during import might result in a table not found error. [#39731](https://github.com/apache/doris/pull/39731) +- Fixed the issue of transaction submission timeouts due to too many tablets. [#40031](https://github.com/apache/doris/pull/40031) +- Fixed the issue of concurrent opens with Auto Partition. [#38605](https://github.com/apache/doris/pull/38605) +- Fixed the issue of import lock granularity being too large. [#40134](https://github.com/apache/doris/pull/40134) +- Fixed the issue of coredumps caused by zero-length varchars. [#40940](https://github.com/apache/doris/pull/40940) +- Fixed the issue of incorrect index Id values in log prints. [#38790](https://github.com/apache/doris/pull/38790) +- Fixed the issue of memtable shifting not closing BRPC streaming. [#40105](https://github.com/apache/doris/pull/40105) +- Fixed the issue of inaccurate bvar statistics during memtable shifting. [#39075](https://github.com/apache/doris/pull/39075) +- Fixed the issue of multi-replication fault tolerance during memtable shifting. [#38003](https://github.com/apache/doris/pull/38003) +- Fixed the issue of incorrect message length calculations for Routine Load with multiple tables in one stream. [#40367](https://github.com/apache/doris/pull/40367) +- Fixed the issue of inaccurate progress reporting for Broker Load. [#40325](https://github.com/apache/doris/pull/40325) +- Fixed the issue of inaccurate data scan volume reporting for Broker Load. 
[#40694](https://github.com/apache/doris/pull/40694) +- Fixed the issue of concurrency with Routine Load in compute-storage decoupled mode. [#39242](https://github.com/apache/doris/pull/39242) +- Fixed the issue of Routine Load jobs being canceled in compute-storage decoupled mode. [#39514](https://github.com/apache/doris/pull/39514) +- Fixed the issue of progress not being reset when deleting Kafka topics. [#38474](https://github.com/apache/doris/pull/38474) +- Fixed the issue of updating progress during transaction state transitions in Routine Load. [#39311](https://github.com/apache/doris/pull/39311) +- Fixed the issue of Routine Load switching from a paused state to a paused state. [#40728](https://github.com/apache/doris/pull/40728) +- Fixed the issue of Stream Load records being missed due to database deletion. [#39360](https://github.com/apache/doris/pull/39360) + +### Storage + +- Fixed the issue of missing storage policies. [#38700](https://github.com/apache/doris/pull/38700) +- Fixed the issue of errors during cross-version backup and recovery. [#38370](https://github.com/apache/doris/pull/38370) +- Fixed the NPE issue with ccr binlog. [#39909](https://github.com/apache/doris/pull/39909) +- Fixed potential issues with duplicate keys in mow. [#41309](https://github.com/apache/doris/pull/41309) [#39791](https://github.com/apache/doris/pull/39791) [#39958](https://github.com/apache/doris/pull/39958) [#38369](https://github.com/apache/doris/pull/38369) [#38331](https://github.com/apache/doris/pull/38331) +- Fixed the issue of not being able to write after backup and recovery in high-frequency write scenarios. [#40118](https://github.com/apache/doris/pull/40118) [#38321](https://github.com/apache/doris/pull/38321) +- Fixed the issue of data errors potentially triggered by deleting empty strings and schema changes. [#41064](https://github.com/apache/doris/pull/41064) +- Fixed the issue of incorrect statistics due to column updates. 
[#40880](https://github.com/apache/doris/pull/40880) +- Limited the size of tablet meta pb to prevent BE crashes due to oversized meta. [#39455](https://github.com/apache/doris/pull/39455) +- Fixed the potential column misalignment issue with the new optimizer in `begin; insert into values; commit`. [#39295](https://github.com/apache/doris/pull/39295) + +### Compute-Storage Decoupled + +- Fixed the issue where the tablet distribution might be inconsistent across multiple FEs in compute-storage decoupled mode. [#41458](https://github.com/apache/doris/pull/41458) +- Fixed the issue where TVF might not work in multi-computing group environments. [#39249](https://github.com/apache/doris/pull/39249) +- Fixed the issue where compaction used resources that had already been released when BE exited in compute-storage decoupled mode. [#39302](https://github.com/apache/doris/pull/39302) +- Fixed the issue where automatic start-stop might cause FE replay to get stuck. [#40027](https://github.com/apache/doris/pull/40027) +- Fixed the issue where the BE status and the stored status in meta-service were inconsistent. [#40799](https://github.com/apache/doris/pull/40799) +- Fixed the issue where the FE->meta-service connection pool could not automatically expire and reconnect. [#41202](https://github.com/apache/doris/pull/41202) [#40661](https://github.com/apache/doris/pull/40661) +- Fixed the issue where some tablets might repeatedly undergo unexpected balance processes during rebalance. [#39792](https://github.com/apache/doris/pull/39792) +- Fixed the issue where storage vault permissions were lost after FE restarted. [#40260](https://github.com/apache/doris/pull/40260) +- Fixed the issue where tablet row counts and other statistical information might be incomplete due to FDB scan range pagination. [#40494](https://github.com/apache/doris/pull/40494) +- Fixed the performance issue caused by a large number of aborted transactions associated with the same label. 
[#40606](https://github.com/apache/doris/pull/40606) +- Fixed the issue where `commit_txn` did not automatically re-enter, maintaining consistent behavior between compute-storage decoupled and integrated modes. [#39615](https://github.com/apache/doris/pull/39615) +- Fixed the issue where the number of projected columns increased when dropping columns. [#40187](https://github.com/apache/doris/pull/40187) +- Fixed the issue where delete statements did not correctly handle return values, causing data to still be visible after deletion. [#39428](https://github.com/apache/doris/pull/39428) +- Fixed the coredump issue caused by rowset metadata competition during file cache preheating. [#39361](https://github.com/apache/doris/pull/39361) +- Fixed the issue where the entire cache space would be used up when TTL cache enabled LRU eviction. [#39814](https://github.com/apache/doris/pull/39814) +- Fixed the issue where temporary files could not be recycled when importing commit rowset failed with HDFS storage backend. [#40215](https://github.com/apache/doris/pull/40215) + +### Lakehouse + +- Fixed some issues with predicate pushdown in JDBC Catalog. [#39064](https://github.com/apache/doris/pull/39064) +- Fixed the issue of not being able to read when `STRUCT` type columns are missing in Parquet format. [#38718](https://github.com/apache/doris/pull/38718) +- Fixed the issue of FileSystem leaks on the FE side in some cases. [#38610](https://github.com/apache/doris/pull/38610) +- Fixed the issue of metadata cache information being inconsistent when Hive/Iceberg tables write back in some cases. [#40729](https://github.com/apache/doris/pull/40729) +- Fixed the issue of unstable partition ID generation for external tables in some cases. [#39325](https://github.com/apache/doris/pull/39325) +- Fixed the issue of external table queries selecting BE nodes in the blacklist in some cases. 
[#39451](https://github.com/apache/doris/pull/39451) +- Optimized the timeout time for batch retrieval of external table partition information to avoid long-term thread occupation. [#39346](https://github.com/apache/doris/pull/39346) +- Fixed the issue of memory leaks when querying Hudi tables in some cases. [#41256](https://github.com/apache/doris/pull/41256) +- Fixed the issue of connection pool connection leaks in JDBC Catalog in some cases. [#39582](https://github.com/apache/doris/pull/39582) +- Fixed the issue of BE memory leaks in JDBC Catalog in some cases. [#41041](https://github.com/apache/doris/pull/41041) +- Fixed the issue of not being able to query Hudi data on Alibaba Cloud OSS. [#41316](https://github.com/apache/doris/pull/41316) +- Fixed the issue of not being able to read empty partitions in MaxCompute. [#40046](https://github.com/apache/doris/pull/40046) +- Fixed the issue of poor performance when querying Oracle through JDBC Catalog. [#41513](https://github.com/apache/doris/pull/41513) +- Fixed the issue of BE crashes when querying deletion vector of Paimon tables after enabling file cache features. [#39877](https://github.com/apache/doris/pull/39877) +- Fixed the issue of not being able to access Paimon tables on HDFS clusters with HA enabled. [#39806](https://github.com/apache/doris/pull/39806) +- Temporarily disabled the page index filtering feature of Parquet to avoid potential issues. [#38691](https://github.com/apache/doris/pull/38691) +- Fixed the issue of not being able to read unsigned types in Parquet files. [#39926](https://github.com/apache/doris/pull/39926) +- Fixed the issue of potential infinite loops when reading Parquet files in some cases. [#39523](https://github.com/apache/doris/pull/39523) + +### Asynchronous Materialized Views + +- Fixed the issue where partition construction might select the wrong table to track partitions if both sides have the same column names. 
[#40810](https://github.com/apache/doris/pull/40810) +- Fixed the issue where transparent rewrite partition compensation might result in incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed the issue where transparent rewrite did not take effect on external tables. [#38909](https://github.com/apache/doris/pull/38909) +- Fixed the issue where nested materialized views might not refresh properly. [#40433](https://github.com/apache/doris/pull/40433) + +### Synchronous Materialized Views + +- Fixed the issue where creating synchronous materialized views on MOW tables might result in incorrect query results. [#39171](https://github.com/apache/doris/pull/39171) + +### Query Optimizer + +- Fixed the issue where existing synchronous materialized views might not be usable after upgrading. [#41283](https://github.com/apache/doris/pull/41283) +- Fixed the issue of not correctly handling milliseconds when comparing datetime literals. [#40121](https://github.com/apache/doris/pull/40121) +- Fixed the issue of potential errors in conditional function partition pruning. [#39298](https://github.com/apache/doris/pull/39298) +- Fixed the issue where MOW tables with synchronous materialized views could not perform delete operations. [#39578](https://github.com/apache/doris/pull/39578) +- Fixed the issue where the nullable of slots in JDBC external table query predicates might be incorrectly planned, causing query errors. [#41014](https://github.com/apache/doris/pull/41014) + +### Query Execution + +- Fixed the memory leak issue caused by the use of runtime filters. [#39155](https://github.com/apache/doris/pull/39155) +- Fixed the issue of excessive memory usage by window functions. [#39581](https://github.com/apache/doris/pull/39581) +- Fixed a series of function compatibility issues during rolling upgrades. 
[#41023](https://github.com/apache/doris/pull/41023) [#40438](https://github.com/apache/doris/pull/40438) [#39648](https://github.com/apache/doris/pull/39648) +- Fixed the issue of incorrect results with `encryption_function` when used with constants. [#40201](https://github.com/apache/doris/pull/40201) +- Fixed the issue of errors when importing single-table materialized views. [#39061](https://github.com/apache/doris/pull/39061) +- Fixed the issue of incorrect partition result calculations for window functions. [#39100](https://github.com/apache/doris/pull/39100) [#40761](https://github.com/apache/doris/pull/40761) +- Fixed the issue of incorrect calculations for topn when null values are present. [#39497](https://github.com/apache/doris/pull/39497) +- Fixed the issue of incorrect results with the `map_agg` function. [#39743](https://github.com/apache/doris/pull/39743) +- Fixed the issue of incorrect messages returned by cancel. [#38982](https://github.com/apache/doris/pull/38982) +- Fixed the issue of BE core dumps caused by encrypt and decrypt functions. [#40726](https://github.com/apache/doris/pull/40726) +- Fixed the issue of queries getting stuck due to too many scanners in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Supported time types in runtime filters. [#38258](https://github.com/apache/doris/pull/38258) +- Fixed the issue of incorrect results with window funnel functions. [#40960](https://github.com/apache/doris/pull/40960) + +### Semi-Structured Data Management + +- Fixed the issue of match function errors when no indexes were present. [#38989](https://github.com/apache/doris/pull/38989) +- Fixed the issue of crashes when ARRAY data types were used as parameters for array_min/array_max functions. [#39492](https://github.com/apache/doris/pull/39492) +- Fixed the issue of nullable with the `array_enumerate_uniq` function. 
[#38384](https://github.com/apache/doris/pull/38384) +- Fixed the issue of bloomfilter indexes not being updated when adding or deleting columns. [#38431](https://github.com/apache/doris/pull/38431) +- Fixed the issue of es-catalog parsing exceptions with array data. [#39104](https://github.com/apache/doris/pull/39104) +- Fixed the issue of improper predicate push-down in es-catalog. [#40111](https://github.com/apache/doris/pull/40111) +- Fixed the issue of exceptions caused by modifying input data with `map()` and `struct()` functions. [#39699](https://github.com/apache/doris/pull/39699) +- Fixed the issue of index compaction crashes in special cases. [#40294](https://github.com/apache/doris/pull/40294) +- Fixed the issue of ARRAY type inverted indexes missing nullbitmaps. [#38907](https://github.com/apache/doris/pull/38907) +- Fixed the issue of incorrect results with the `count()` function on inverted indexes. [#41152](https://github.com/apache/doris/pull/41152) +- Fixed the issue of incorrect results with the `explode_map` function when using aliases. [#39757](https://github.com/apache/doris/pull/39757) +- Fixed the issue of VARIANT type not being able to use row storage for exceptional JSON data. [#39394](https://github.com/apache/doris/pull/39394) +- Fixed the issue of memory leaks when returning ARRAY results with VARIANT type. [#41358](https://github.com/apache/doris/pull/41358) +- Fixed the issue of changing column names with VARIANT type. [#40320](https://github.com/apache/doris/pull/40320) +- Fixed the issue of potential precision loss when converting VARIANT type to DECIMAL type. [#39650](https://github.com/apache/doris/pull/39650) +- Fixed the issue of nullable handling with VARIANT type. [#39732](https://github.com/apache/doris/pull/39732) +- Fixed the issue of sparse column reading with VARIANT type. [#40295](https://github.com/apache/doris/pull/40295) + +### Other + +- Fixed the compatibility issue between new and old audit log plugins. 
[#41401](https://github.com/apache/doris/pull/41401) +- Fixed the issue where users could see processes of others in certain cases. [#39747](https://github.com/apache/doris/pull/39747) +- Fixed the issue where users with permissions could not export. [#38365](https://github.com/apache/doris/pull/38365) +- Fixed the issue where create table like required create permissions for the existing table. [#37879](https://github.com/apache/doris/pull/37879) +- Fixed the issue where some features did not verify permissions. [#39726](https://github.com/apache/doris/pull/39726) +- Fixed the issue of not correctly closing connections when using SSL. [#38587](https://github.com/apache/doris/pull/38587) +- Fixed the issue where executing ALTER VIEW operations in some cases caused FE to fail to start. [#40872](https://github.com/apache/doris/pull/40872) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md new file mode 100644 index 0000000000000..b15777212b400 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md @@ -0,0 +1,226 @@ +--- +{ + "title": "Release 3.0.3", + "language": "en" +} +--- + + + + +Dear community members, Apache Doris 3.0.3 was officially released on December 02, 2024. This version further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +- Prohibited column updates on MOW tables with synchronous materialized views. [#40190](https://github.com/apache/doris/pull/40190) +- Adjusted the default parameters of RoutineLoad to improve import efficiency. [#42968](https://github.com/apache/doris/pull/42968) +- When StreamLoad fails, the return value of LoadedRows is adjusted to 0. 
[#41946](https://github.com/apache/doris/pull/41946) [#42291](https://github.com/apache/doris/pull/42291) +- Adjusted the default memory limit of Segment cache to 5%. [#42308](https://github.com/apache/doris/pull/42308) [#42436](https://github.com/apache/doris/pull/42436) + +## New Features + +- Introduced the session variable `enable_cooldown_replica_affinity` to control the affinity of cold and hot tiered replicas. [#42677](https://github.com/apache/doris/pull/42677) + +- Added `table$partition` syntax for querying partition information of Hive tables. [#40774](https://github.com/apache/doris/pull/40774) + + - [View Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/hive) + +- Supported creation of Hive tables in Text format. [#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) + + - [View Documentation](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + +### Asynchronous Materialized Views + +- Introduced new materialized view attribute `use_for_rewrite`. When `use_for_rewrite` is set to false, the materialized view does not participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Optimizer + +- Supported correlated non-aggregate subqueries. [#42236](https://github.com/apache/doris/pull/42236) + +### Query Execution + +- Added functions `ngram_search`, `normal_cdf`, `to_iso8601`, `from_iso8601_date`, `SESSION_USER()`, `last_query_id`. [#38226](https://github.com/apache/doris/pull/38226) [#40695](https://github.com/apache/doris/pull/40695) [#41075](https://github.com/apache/doris/pull/41075) [#41600](https://github.com/apache/doris/pull/41600) [#39575](https://github.com/apache/doris/pull/39575) [#40739](https://github.com/apache/doris/pull/40739) +- The `aes_encrypt` and `aes_decrypt` functions support GCM mode. 
[#40004](https://github.com/apache/doris/pull/40004) +- Profile outputs the changed session variable values. [#41016](https://github.com/apache/doris/pull/41016) [#41318](https://github.com/apache/doris/pull/41318) + +### Semi-structured Data Management + +- Added array functions `array_match_all` and `array_match_any`. [#40605](https://github.com/apache/doris/pull/40605) [#43514](https://github.com/apache/doris/pull/43514) +- The array function `array_agg` supports nesting ARRAY/MAP/STRUCT within ARRAY. [#42009](https://github.com/apache/doris/pull/42009) +- Added approximate aggregate statistical functions `approx_top_k` and `approx_top_sum`. [#44082](https://github.com/apache/doris/pull/44082) + +## Improvements + +### Storage + +- Supported `bitmap_empty` as the default value. [#40364](https://github.com/apache/doris/pull/40364) +- Introduced the session variable `insert_timeout` to control the timeout of DELETE statements. [#41063](https://github.com/apache/doris/pull/41063) +- Improved some error message prompts. [#41048](https://github.com/apache/doris/pull/41048) [#39631](https://github.com/apache/doris/pull/39631) +- Improved the priority scheduling of replica repair. [#41076](https://github.com/apache/doris/pull/41076) +- Enhanced the robustness of timezone handling when creating tables. [#41926](https://github.com/apache/doris/pull/41926) [#42389](https://github.com/apache/doris/pull/42389) +- Checked the validity of partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Supported Unicode-encoded column names in DELETE operations. [#39381](https://github.com/apache/doris/pull/39381) + +### Compute-Storage Decoupled + +- Supported ARM architecture deployment in storage and compute separation mode. 
[#42467](https://github.com/apache/doris/pull/42467) [#43377](https://github.com/apache/doris/pull/43377) +- Optimized the eviction strategy and lock competition of file cache, improving hit rate and high concurrency point query performance. [#42451](https://github.com/apache/doris/pull/42451) [#43201](https://github.com/apache/doris/pull/43201) [#41818](https://github.com/apache/doris/pull/41818) [#43401](https://github.com/apache/doris/pull/43401) +- S3 storage vault supported `use_path_style`, solving the problem of using custom domain names for object storage. [#43060](https://github.com/apache/doris/pull/43060) [#43343](https://github.com/apache/doris/pull/43343) [#43330](https://github.com/apache/doris/pull/43330) +- Optimized storage and compute separation configuration and deployment, preventing misoperations in different modes. [#43381](https://github.com/apache/doris/pull/43381) [#43522](https://github.com/apache/doris/pull/43522) [#43434](https://github.com/apache/doris/pull/43434) [#40764](https://github.com/apache/doris/pull/40764) [#43891](https://github.com/apache/doris/pull/43891) +- Optimized observability and provided an interface for deleting specified segment file cache. [#38489](https://github.com/apache/doris/pull/38489) [#42896](https://github.com/apache/doris/pull/42896) [#41037](https://github.com/apache/doris/pull/41037) [#43412](https://github.com/apache/doris/pull/43412) +- Optimized Meta-service operation and maintenance interface: RPC rate limiting and tablet metadata correction. [#42413](https://github.com/apache/doris/pull/42413) [#43884](https://github.com/apache/doris/pull/43884) [#41782](https://github.com/apache/doris/pull/41782) [#43460](https://github.com/apache/doris/pull/43460) + +### Lakehouse + +- Paimon Catalog supported Alibaba Cloud DLF and OSS-HDFS storage. 
[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) + + - View [Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/paimon) + +- Supported reading of Hive tables in OpenCSV format. [#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) +- Optimized the performance of accessing the `information_schema.columns` table in External Catalog. [#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) +- Used the new Max Compute open storage API to access Max Compute data sources. [#41614](https://github.com/apache/doris/pull/41614) +- Optimized the scheduling policy of the JNI part of Paimon tables, making scan tasks more balanced. [#43310](https://github.com/apache/doris/pull/43310) +- Optimized the read performance of small ORC files. [#42004](https://github.com/apache/doris/pull/42004) [#43467](https://github.com/apache/doris/pull/43467) +- Supported reading of parquet files in brotli compressed format. [#42177](https://github.com/apache/doris/pull/42177) +- Added `file_cache_statistics` table under the `information_schema` library to view metadata cache statistics. [#42160](https://github.com/apache/doris/pull/42160) + +### Query Optimizer + +- Optimization: When queries only differ in comments, the same SQL Cache can be reused. [#40049](https://github.com/apache/doris/pull/40049) +- Optimization: Improved the stability of statistical information when data is frequently updated. [#43865](https://github.com/apache/doris/pull/43865) [#39788](https://github.com/apache/doris/pull/39788) [#43009](https://github.com/apache/doris/pull/43009) [#40457](https://github.com/apache/doris/pull/40457) [#42409](https://github.com/apache/doris/pull/42409) [#41894](https://github.com/apache/doris/pull/41894) +- Optimization: Enhanced the stability of constant folding. 
[#42910](https://github.com/apache/doris/pull/42910) [#41164](https://github.com/apache/doris/pull/41164) [#39723](https://github.com/apache/doris/pull/39723) [#41394](https://github.com/apache/doris/pull/41394) [#42256](https://github.com/apache/doris/pull/42256) [#40441](https://github.com/apache/doris/pull/40441) +- Optimization: Column pruning can generate better execution plans. [#41719](https://github.com/apache/doris/pull/41719) [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Optimized the memory usage of the sort operator. [#39306](https://github.com/apache/doris/pull/39306) +- Optimized the performance of computations on ARM. [#38888](https://github.com/apache/doris/pull/38888) [#38759](https://github.com/apache/doris/pull/38759) +- Optimized the computational performance of a series of functions. [#40366](https://github.com/apache/doris/pull/40366) [#40821](https://github.com/apache/doris/pull/40821) [#40670](https://github.com/apache/doris/pull/40670) [#41206](https://github.com/apache/doris/pull/41206) [#40162](https://github.com/apache/doris/pull/40162) +- Used SSE instructions to optimize the performance of the `match_ipv6_subnet` function. [#38755](https://github.com/apache/doris/pull/38755) +- Supported automatic creation of new partitions during insert overwrite. [#38628](https://github.com/apache/doris/pull/38628) [#42645](https://github.com/apache/doris/pull/42645) +- Added the status of each PipelineTask in Profile. [#42981](https://github.com/apache/doris/pull/42981) +- IP type supported runtime filter. [#39985](https://github.com/apache/doris/pull/39985) + +### Semi-structured Data Management + +- Output the real SQL of prepared statements in audit logs. [#43321](https://github.com/apache/doris/pull/43321) +- The filebeat doris output plugin supports fault tolerance and progress reporting. [#36355](https://github.com/apache/doris/pull/36355) +- Optimized the performance of inverted index queries. 
[#41547](https://github.com/apache/doris/pull/41547) [#41585](https://github.com/apache/doris/pull/41585) [#41567](https://github.com/apache/doris/pull/41567) [#41577](https://github.com/apache/doris/pull/41577) [#42060](https://github.com/apache/doris/pull/42060) [#42372](https://github.com/apache/doris/pull/42372) +- The array function `array overlaps` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- The IP function `is_ip_address_in_range` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- Optimized the CAST performance of the VARIANT data type. [#41775](https://github.com/apache/doris/pull/41775) [#42438](https://github.com/apache/doris/pull/42438) [#43320](https://github.com/apache/doris/pull/43320) +- Optimized the CPU resource consumption of the Variant data type. [#42856](https://github.com/apache/doris/pull/42856) [#43062](https://github.com/apache/doris/pull/43062) [#43634](https://github.com/apache/doris/pull/43634) +- Optimized the metadata and execution memory resource consumption of the Variant data type. [#42448](https://github.com/apache/doris/pull/42448) [#43326](https://github.com/apache/doris/pull/43326) [#41482](https://github.com/apache/doris/pull/41482) [#43093](https://github.com/apache/doris/pull/43093) [#43567](https://github.com/apache/doris/pull/43567) [#43620](https://github.com/apache/doris/pull/43620) + +### Permissions + +- Added a new configuration item `ldap_group_filter` in LDAP for custom group filtering. [#43292](https://github.com/apache/doris/pull/43292) + +### Other + +- Supported displaying connection count information by user in FE monitoring items. [#39200](https://github.com/apache/doris/pull/39200) + +## Bug Fixes + +### Storage + +- Fixed the issue with using IPv6 hostnames. [#40074](https://github.com/apache/doris/pull/40074) +- Fixed the inaccurate display of broker/s3 load progress. 
[#43535](https://github.com/apache/doris/pull/43535) +- Fixed the issue where queries might hang from FE. [#41303](https://github.com/apache/doris/pull/41303) [#42382](https://github.com/apache/doris/pull/42382) +- Fixed the issue of duplicate auto-increment IDs under exceptional circumstances. [#43774](https://github.com/apache/doris/pull/43774) [#43983](https://github.com/apache/doris/pull/43983) +- Fixed occasional NPE issues with groupcommit. [#43635](https://github.com/apache/doris/pull/43635) +- Fixed the inaccurate calculation of auto bucket. [#41675](https://github.com/apache/doris/pull/41675) [#41835](https://github.com/apache/doris/pull/41835) +- Fixed the issue where FE might not correctly plan multi-table flows after restart. [#41677](https://github.com/apache/doris/pull/41677) [#42290](https://github.com/apache/doris/pull/42290) + +### Compute-Storage Decoupled + +- Fixed the issue that MOW primary key tables with large delete bitmaps might cause coredump. [#43088](https://github.com/apache/doris/pull/43088) [#43457](https://github.com/apache/doris/pull/43457) [#43479](https://github.com/apache/doris/pull/43479) [#43407](https://github.com/apache/doris/pull/43407) [#43297](https://github.com/apache/doris/pull/43297) [#43613](https://github.com/apache/doris/pull/43613) [#43615](https://github.com/apache/doris/pull/43615) [#43854](https://github.com/apache/doris/pull/43854) [#43968](https://github.com/apache/doris/pull/43968) [#44074](https://github.com/apache/doris/pull/44074) [#41793](https://github.com/apache/doris/pull/41793) [#42142](https://github.com/apache/doris/pull/42142) +- Fixed the issue that segment files, when being a multiple of 5MB, would fail to upload objects. [#43254](https://github.com/apache/doris/pull/43254) +- Fixed the issue that the default retry policy of aws sdk did not take effect. 
[#43575](https://github.com/apache/doris/pull/43575) [#43648](https://github.com/apache/doris/pull/43648) +- Fixed the issue that altering storage vault could continue execution even when the wrong type was specified. [#43489](https://github.com/apache/doris/pull/43489) [#43352](https://github.com/apache/doris/pull/43352) [#43495](https://github.com/apache/doris/pull/43495) +- Fixed the issue that tablet_id might be 0 during the delayed commit process of large transactions. [#42043](https://github.com/apache/doris/pull/42043) [#42905](https://github.com/apache/doris/pull/42905) +- Fixed the issue that constant folding RPC and FE forwarding SQL might not be executed in the expected computation group. [#43110](https://github.com/apache/doris/pull/43110) [#41819](https://github.com/apache/doris/pull/41819) [#41846](https://github.com/apache/doris/pull/41846) +- Fixed the issue that meta-service did not strictly check instance_id upon receiving RPC. [#43253](https://github.com/apache/doris/pull/43253) [#43832](https://github.com/apache/doris/pull/43832) +- Fixed the issue that FE follower information_schema version did not update in time. [#43496](https://github.com/apache/doris/pull/43496) +- Fixed the issue of atomicity in file cache rename and inaccurate metrics. [#42869](https://github.com/apache/doris/pull/42869) [#43504](https://github.com/apache/doris/pull/43504) [#43220](https://github.com/apache/doris/pull/43220) + +### Lakehouse + +- Prohibited implicit conversion predicates from being pushed down to JDBC data sources to avoid inconsistent query results. [#42102](https://github.com/apache/doris/pull/42102) +- Fixed some read issues with high-version Hive transactional tables. [#42226](https://github.com/apache/doris/pull/42226) +- Fixed the issue that the Export command might cause deadlocks. 
[#43083](https://github.com/apache/doris/pull/43083) [#43402](https://github.com/apache/doris/pull/43402) +- Fixed the issue of being unable to query Hive views created by Spark. [#43552](https://github.com/apache/doris/pull/43552) +- Fixed the issue that Hive partition paths containing special characters led to incorrect partition pruning. [#42906](https://github.com/apache/doris/pull/42906) +- Fixed the issue that Iceberg Catalog could not use AWS Glue. [#41084](https://github.com/apache/doris/pull/41084) + +### Asynchronous Materialized Views + +- Fixed the issue that asynchronous materialized views might not refresh after the base table is rebuilt. [#41762](https://github.com/apache/doris/pull/41762) + +### Query Optimizer + +- Fixed the issue that partition pruning results might be incorrect when using multi-column range partitioning. [#43332](https://github.com/apache/doris/pull/43332) +- Fixed the issue of incorrect calculation results in some limit offset scenarios. [#42576](https://github.com/apache/doris/pull/42576) + +### Query Execution + +- Fixed the issue that hash join with array types larger than 4G could cause BE Core. [#43861](https://github.com/apache/doris/pull/43861) +- Fixed the issue that is null predicate operations might yield incorrect results in some scenarios. [#43619](https://github.com/apache/doris/pull/43619) +- Fixed the issue that bitmap types might produce incorrect output results in hash join. [#43718](https://github.com/apache/doris/pull/43718) +- Fixed some issues where function results were calculated incorrectly. 
[#40710](https://github.com/apache/doris/pull/40710) [#39358](https://github.com/apache/doris/pull/39358) [#40929](https://github.com/apache/doris/pull/40929) [#40869](https://github.com/apache/doris/pull/40869) [#40285](https://github.com/apache/doris/pull/40285) [#39891](https://github.com/apache/doris/pull/39891) [#40530](https://github.com/apache/doris/pull/40530) [#41948](https://github.com/apache/doris/pull/41948) [#43588](https://github.com/apache/doris/pull/43588) +- Fixed some issues with JSON type parsing. [#39937](https://github.com/apache/doris/pull/39937) +- Fixed issues with varchar and char types in runtime filter operations. [#43758](https://github.com/apache/doris/pull/43758) [#43919](https://github.com/apache/doris/pull/43919) +- Fixed some issues with the use of decimal256 in scalar and aggregate functions. [#42136](https://github.com/apache/doris/pull/42136) [#42356](https://github.com/apache/doris/pull/42356) +- Fixed the issue that arrow flight reported `Reach limit of connections` errors upon connection. [#39127](https://github.com/apache/doris/pull/39127) +- Fixed the issue of incorrect memory usage statistics for BE in k8s environments. [#41123](https://github.com/apache/doris/pull/41123) + +### Semi-structured Data Management + +- Adjusted the default values of `segment_cache_fd_percentage` and `inverted_index_fd_number_limit_percent`. [#42224](https://github.com/apache/doris/pull/42224) +- logstash now supports group_commit. [#40450](https://github.com/apache/doris/pull/40450) +- Fixed the issue of coredump when building index. [#43246](https://github.com/apache/doris/pull/43246) [#43298](https://github.com/apache/doris/pull/43298) +- Fixed issues with variant index. [#43375](https://github.com/apache/doris/pull/43375) [#43773](https://github.com/apache/doris/pull/43773) +- Fixed potential fd and memory leaks under abnormal compaction circumstances. 
[#42374](https://github.com/apache/doris/pull/42374) +- Inverted index match null now correctly returns null instead of false. [#41786](https://github.com/apache/doris/pull/41786) +- Fixed the issue of coredump when ngram bloomfilter index bf_size is set to 65536. [#43645](https://github.com/apache/doris/pull/43645) +- Fixed the issue of potential coredump during complex data type JOINs. [#40398](https://github.com/apache/doris/pull/40398) +- Fixed the issue of coredump with TVF JSON data. [#43187](https://github.com/apache/doris/pull/43187) +- Fixed the precision issue of bloom filter calculations for dates and times. [#43612](https://github.com/apache/doris/pull/43612) +- Fixed the issue of coredump with IPv6 type storage. [#43251](https://github.com/apache/doris/pull/43251) +- Fixed the issue of coredump when using VARIANT type with light_schema_change disabled. [#40908](https://github.com/apache/doris/pull/40908) +- Improved cache performance for high-concurrency point queries. [#44077](https://github.com/apache/doris/pull/44077) +- Fixed the issue that bloom filter indexes were not synchronized when columns were deleted. [#43378](https://github.com/apache/doris/pull/43378) +- Fixed instability issues with es catalog under special circumstances such as mixed array and scalar data. [#40314](https://github.com/apache/doris/pull/40314) [#40385](https://github.com/apache/doris/pull/40385) [#43399](https://github.com/apache/doris/pull/43399) [#40614](https://github.com/apache/doris/pull/40614) +- Fixed coredump issues caused by abnormal regular pattern matching. [#43394](https://github.com/apache/doris/pull/43394) + +### Permissions + +- Fixed several issues where permissions were not properly restricted after authorization. 
[#43193](https://github.com/apache/doris/pull/43193) [#41723](https://github.com/apache/doris/pull/41723) [#42107](https://github.com/apache/doris/pull/42107) [#43306](https://github.com/apache/doris/pull/43306) +- Enhanced several permission checks. [#40688](https://github.com/apache/doris/pull/40688) [#40533](https://github.com/apache/doris/pull/40533) [#41791](https://github.com/apache/doris/pull/41791) [#42106](https://github.com/apache/doris/pull/42106) + +### Other + +- Supplemented missing audit log fields in audit log tables and files. [#43303](https://github.com/apache/doris/pull/43303) + + - [View Documentation](https://doris.apache.org/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we realized the full vectorization of the computing layer and storage layer, and officially enabled the vectorized execution engine as a stable function. All queries are executed by the vectorized execution engine by default, and the performance is 3-5 times higher than the previous version. It increases the ability to access the external tables of Apache Iceberg and supports federated query of data in Doris and Iceberg, and expands the analysis capabilities of Apache Doris on the data lake; on the basis of the original LZ4, the ZSTD compression algorithm is added , further improves the data compression rate; fixed many performance and stability problems in previous versions, greatly improving system stability. Downloading and using is recommended. 
+ +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users needed to enable it manually when executing queries by setting the session variables `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, the vectorized execution engine is officially enabled as a stable feature. The session variable `enable_vectorized_engine` is set to true by default, and all queries are executed through the vectorized execution engine by default. + +### BE Binary File Renaming + +The BE binary file has been renamed from palo_be to doris_be. If you rely on process names for cluster management and other operations, remember to update the relevant scripts. + +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while compatibility with the Segment V1 format is maintained. + +To keep the code structure maintainable and reduce the additional learning and development costs caused by redundant legacy code, we have decided to stop supporting the Segment V1 storage format from the next version. This part of the code is expected to be deleted in Apache Doris 1.2. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website. 
+ +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table you can set `DISTRIBUTED BY random BUCKETS number`to use random distribution, the data will be randomly written to a single tablet when importing to reduce the data fanout during the loading process. And reduce resource overhead and improve system stability. + +### Support for creating Iceberg external tables[experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. Through Iceberg external tables, federated queries on data stored in local storage and Iceberg can be implemented, which saves tedious data loading work, simplifies the system architecture for data analysis, and performs more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying data, and supports automatic synchronization of all table schemas in the Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. For some scenarios that are sensitive to data storage costs, the original data compression ratio requirements cannot be met. + +In version 1.1, users can set "compression"="zstd" in the table properties to specify the compression method as ZSTD when creating a table. 
In the 25GB 110 million lines of text log test data, the highest compression rate is nearly 10 times, which is 53% higher than the original compression rate, and the speed of reading data from disk and decompressing it is increased by 30%. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +Implemented vectorization of all built-in functions + +The storage layer implements vectorization and supports dictionary optimization for low-cardinality string columns + +Optimized and resolved numerous performance and stability issues with the vectorization engine. + +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 SQLs in the SSB test data set, version 1.1 is better than version 0.15, and the overall performance is improved by about 3 times, which solves the problem of performance degradation in some scenarios in version 1.0; + +On all 22 SQLs in the TPC-H test data set, version 1.1 is better than version 0.15, the overall performance is improved by about 4.5 times, and the performance of some scenarios is improved by more than ten times; + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit will generate a data version. In high concurrent write scenarios, -235 errors are prone to occur due to too many data versions and untimely compaction, and query performance will also decrease accordingly. + +In version 1.1, we introduced QuickCompaction, which will actively trigger compaction when the data version increases. At the same time, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Through active triggering and passive scanning, the real-time problem of data merging is completely solved. + +At the same time, for high-frequency small file cumulative compaction, the scheduling and isolation of compaction tasks is implemented to prevent the heavyweight base compaction from affecting the merging of new data. + +Finally, for the merging of small files, the strategy of merging small files is optimized, and the method of gradient merging is adopted. Each time the files participating in the merging belong to the same data magnitude, it prevents versions with large differences in size from merging, and gradually merges hierarchically. , reducing the number of times a single file participates in merging, which can greatly save the CPU consumption of the system. + +When the data upstream maintains a write frequency of 10w per second (20 concurrent write tasks, 5000 rows per job, and checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable. 
Compared with the -235 problem that frequently occurred during high concurrent writing in the previous version, the compaction merge efficiency has been improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The strategy has been optimized for small file Compaction. In the above scenario of high concurrent writing, CPU resource consumption is reduced by 25%; + +- Stable query time consumption: The overall orderliness of data is improved, and the fluctuation of query time consumption is greatly reduced. The query time consumption during high concurrent writing is the same as that of only querying, and the query performance is improved by 3-4 times compared with the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting arrow parameters, arrow's multi-threaded read capability is used to speed up Arrow's reading of each row_group, and it is modified to SPSC model to reduce the cost of waiting for the network through prefetching. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after the metadata checkpoint and retaining the function of historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that the data cannot be queried due to the missing data version.(Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that the resource isolation is invalid for the resource usage limit of loading tasks (Moderate) + +In 1.1, the broker load and routine load will use Backends with specified resource tags to do the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +it may cause data transmission errors. 
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default; please use the `/_stream_load` interface instead. +Of course, you can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format is no longer allowed to be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster, and convert it to SegmentV2 through the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +In version 1.1, the maximum length of the String type is limited to 1MB. Strings longer than this can no longer be written. +At the same time, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Updated the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems during use, please feel free to contact us through the GitHub discussion forum or the dev mailing list at any time. 
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature is available in the non-vectorized engine but was missing from the vectorized engine in 1.1, so we added it back in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +There was no memtracker in BE for the vectorized engine in 1.1, so memory usage was uncontrolled and could cause OOM. In 1.1.1, a simple memtracker is added to BE; it can control memory usage and cancel a query when memory is exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is compressed using bitshuffle, and decompressing it during queries takes a lot of time. In 1.1.1, Doris decompresses the data encoded by bitshuffle to accelerate queries, and we found that this can reduce latency by 30% for some queries in ssb-flat. + +## Bug Fix + +### Fix the problem that rolling upgrade from 1.0 could not be performed. (Serious) + +This issue was introduced in version 1.1 and may cause BE to core when BE is upgraded but FE is not. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some queries did not fall back to the non-vectorized engine, causing BE to core. + +Currently, the vectorized engine cannot handle all SQL queries, and some queries (like left outer join) run on the non-vectorized engine. However, some cases were not covered in 1.1, which could cause BE to crash. + +### Compaction did not work correctly and caused -235 errors. + +When one rowset has multiple segments in uniq key compaction, segment rows are merged in generic_iterator but merged_rows is not increased. Compaction then fails in check_correctness, leaving a tablet with too many versions, which leads to -235 load errors. + +### Some segmentation fault cases during query. 
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 170 issues or performance improvement since 1.1.1. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + +# Features + +### New MemTracker + +Introduced new MemTracker for both vectorized engine and non-vectorized engine which is more accurate. + +### Add API for showing current queries and kill query + +### Support read/write emoji of UTF16 via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC File scan performance about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when query Iceberg table. + +- Support query Hive data created by [Apache Tez](https://tez.apache.org/) + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add reuse block in non-vectorized engine and have 50% performance improvement in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve like or regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +It will have 40% performance gains in load or query. + +Currently it is a config, you can change it by set config `tc_enable_aggressive_memory_decommit`. 
+ +# Bug Fix + +### Some issues about FE that will cause FE failure or data corrupt. + +- Add reserved disk config to avoid too many reserved BDB-JE files.**(Serious)** In an HA environment, BDB JE will retains as many reserved files. The BDB-je log doesn't delete until approaching a disk limit. + +- Fix fatal bug in BDB-JE which will cause FE replica could not start correctly or data corrupted.** (Serious)** + +### Fe will hang on waitFor_rpc during query and BE will hang in high concurrent scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong result. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause BE core or in abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index 
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are worried about the impact of this feature on your business after upgrade, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* ``disable_storage_page_cache``: Whether to disable Storage Page Cache. version 1.1.2 (inclusive), the default is false, i.e., on. version 1.1.3 defaults to true, i.e., off. +* `chunk_reserved_bytes_limit`: Chunk allocator reserved memory size. 1.1.2 (and earlier), the default is 10% of the overall memory. 1.1.3 version default is 209715200 (200MB). + diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 60 issues or performance improvement since 1.1.3. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- SparkLoad support parquet and orc file.[#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load.[#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- If function return error result when there is nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates. 
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crashes when calling function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name are not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column when using MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader release memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix allow create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after substitute using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump on case when with 1000 conditions. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that last line of data is lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, the Doris Team has fixed about 36 issues or made performance improvements since 1.1.4. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release.
+ +# Behavior Changes + +When an alias has the same name as the original column, as in "select year(birthday) as birthday", and is used in a group by, order by, or having clause, Doris previously behaved differently from MySQL. In this release, Doris follows MySQL's behavior: group by and having clauses use the original column first, and order by uses the alias first. Because this can be confusing, we recommend not using an alias that is the same as the original column name. + +# Features + +Add support of murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort result by tablename when calling show clause. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix coredump when there is an if constant expr in select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update high_priority_flush_thread_num_per_store default value to 6 to improve the load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +When the partition column is not a duplicate key, spark load throws an IndexOutOfBounds error. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when having sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Use avg rowset to calculate batch size instead of using total_bytes since it costs a lot of cpu. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc.
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md new file mode 100644 index 0000000000000..2529ce7e58aa2 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md @@ -0,0 +1,563 @@ +--- +{ + "title": "Release 1.2.0", + "language": "en" +} +--- + + + + + +# Feature +## Highlight + +1. Full Vectorized-Engine support, greatly improved performance + + In the standard ssb-100-flat benchmark, the performance of 1.2 is 2 times faster than that of 1.1; in the complex TPCH 100 benchmark, the performance of 1.2 is 3 times faster than that of 1.1. + +2. Merge-on-Write Unique Key + + Support Merge-On-Write on Unique Key Model. This mode marks the data that needs to be deleted or updated when the data is written, thereby avoiding the overhead of Merge-On-Read when querying, and greatly improving the reading efficiency on the updateable data model. + +3. Multi Catalog + + The multi-catalog feature provides Doris with the ability to quickly access external data sources. Users can connect to external data sources through the `CREATE CATALOG` command. Doris will automatically map the database and table information of external data sources. After that, users can access the data in these external data sources just like accessing ordinary tables. This avoids the complicated operation of manually establishing an external mapping for each table. + + Currently this feature supports the following data sources: + + 1. Hive Metastore: You can access data tables including Hive, Iceberg, and Hudi. It can also be connected to data sources compatible with Hive Metastore, such as Alibaba Cloud's DataLake Formation. Supports data access on both HDFS and object storage. + 2.
Elasticsearch: Access ES data sources. + 3. JDBC: Access MySQL through the JDBC protocol. + + Documentation: https://doris.apache.org//docs/dev/lakehouse/multi-catalog) + + > Note: The corresponding permission level will also be changed automatically, see the "Upgrade Notes" section for details. + +4. Light table structure changes + +In the new version, it is no longer necessary to change the data file synchronously for the operation of adding and subtracting columns to the data table, and only need to update the metadata in FE, thus realizing the millisecond-level Schema Change operation. Through this function, the DDL synchronization capability of upstream CDC data can be realized. For example, users can use Flink CDC to realize DML and DDL synchronization from upstream database to Doris. + +Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +When creating a table, set `"light_schema_change"="true"` in properties. + +5. JDBC facade + + Users can connect to external data sources through JDBC. Currently supported: + + - MySQL + - PostgreSQL + - Oracle + - SQL Server + - Clickhouse + + Documentation: [https://doris.apache.org/en/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/) + + > Note: The ODBC feature will be removed in a later version, please try to switch to the JDBC. + +6. JAVA UDF + + Supports writing UDF/UDAF in Java, which is convenient for users to use custom functions in the Java ecosystem. At the same time, through technologies such as off-heap memory and Zero Copy, the efficiency of cross-language data access has been greatly improved. + + Document: https://doris.apache.org//docs/dev/ecosystem/udf/java-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +7. 
Remote UDF + + Supports accessing remote user-defined function services through RPC, thus completely eliminating language restrictions for users to write UDFs. Users can use any programming language to implement custom functions to complete complex data analysis work. + + Documentation: https://doris.apache.org//docs/ecosystem/udf/remote-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +8. More data types support + + - Array type + + Array types are supported. It also supports nested array types. In some scenarios such as user portraits and tags, the Array type can be used to better adapt to business scenarios. At the same time, in the new version, we have also implemented a large number of array-related functions to better support the application of this data type in actual scenarios. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/ARRAY + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/array-functions/array_max + + - Jsonb type + + Support binary Json data type: Jsonb. This type provides a more compact json encoding format, and at the same time provides data access in the encoding format. Compared with json data stored in strings, it can improve performance several times over. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/JSONB + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/json-functions/jsonb_parse + + - Date V2 + + Scope of impact: + + 1. The user needs to specify datev2 and datetimev2 when creating the table, and the date and datetime of the original table will not be affected. + 2. When datev2 and datetimev2 are calculated with the original date and datetime (for example, in an equi-join), the original type will be cast into the new type for calculation + 3.
The example is in the documentation + + Documentation: https://doris.apache.org/docs/1.2/sql-manual/sql-reference/Data-Types/DATEV2 + + +## More + +1. A new memory management framework + + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/memory-management/memory-tracker + +2. Table Valued Function + + Doris implements a set of Table Valued Function (TVF). TVF can be regarded as an ordinary table, which can appear in all places where "table" can appear in SQL. + + For example, we can use S3 TVF to implement data import on object storage: + + ``` + insert into tbl select * from s3("s3://bucket/file.*", "ak" = "xx", "sk" = "xxx") where c1 > 2; + ``` + + Or directly query data files on HDFS: + + ``` + insert into tbl select * from hdfs("hdfs://bucket/file.*") where c1 > 2; + ``` + + TVF can help users make full use of the rich expressiveness of SQL and flexibly process various data. + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/s3 + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/hdfs + +3. A more convenient way to create partitions + + Support for creating multiple partitions within a time range via the `FROM TO` command. + +4. Column renaming + + For tables with Light Schema Change enabled, column renaming is supported. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-RENAME + +5. Richer permission management + + - Support row-level permissions + + Row-level permissions can be created with the `CREATE ROW POLICY` command. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY + + - Support specifying password strength, expiration time, etc. + + - Support for locking accounts after multiple failed logins. 
+ + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Account-Management-Statements/ALTER-USER + +6. Import + + - CSV import supports csv files with header. + + Search for `csv_with_names` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD/ + + - Stream Load adds `hidden_columns`, which can explicitly specify the delete flag column and sequence column. + + Search for `hidden_columns` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD + + - Spark Load supports Parquet and ORC file import. + + - Support for cleaning completed imported Labels + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CLEAN-LABEL + + - Support batch cancellation of import jobs by status + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD + + - Added support for Alibaba Cloud oss, Tencent Cloud cos/chdfs and Huawei Cloud obs in broker load. + + Documentation: https://doris.apache.org//docs/dev/advanced/broker + + - Support access to hdfs through hive-site.xml file configuration. + + Documentation: https://doris.apache.org//docs/dev/admin-manual/config/config-dir + +7. Support viewing the contents of the catalog recycle bin through `SHOW CATALOG RECYCLE BIN` function. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Show-Statements/SHOW-CATALOG-RECYCLE-BIN + +8. Support `SELECT * EXCEPT` syntax. + + Documentation: https://doris.apache.org//docs/dev/data-table/basic-usage + +9. OUTFILE supports ORC format export. And supports multi-byte delimiters. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE + +10. Support to modify the number of Query Profiles that can be saved through configuration. 
+ + Document search FE configuration item: max_query_profile_num + +11. The DELETE statement supports IN predicate conditions. And it supports partition pruning. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Manipulation/DELETE + +12. The default value of the time column supports using `CURRENT_TIMESTAMP` + + Search for "CURRENT_TIMESTAMP" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +13. Add two system tables: backends, rowsets + + Documentation: + + https://doris.apache.org//docs/dev/admin-manual/system-table/backends + + https://doris.apache.org//docs/dev/admin-manual/system-table/rowsets + +14. Backup and restore + + - The Restore job supports the `reserve_replica` parameter, so that the number of replicas of the restored table is the same as that of the backup. + + - The Restore job supports `reserve_dynamic_partition_enable` parameter, so that the restored table keeps the dynamic partition enabled. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/RESTORE + + - Support backup and restore operations through the built-in libhdfs, no longer rely on broker. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY + +15. Support data balance between multiple disks on the same machine + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK + +16. Routine Load supports subscribing to Kerberos-authenticated Kafka services. 
+ + Search for kerberos in the documentation: https://doris.apache.org//docs/dev/data-operate/import/import-way/routine-load-manual + +17. New built-in-function + + Added the following built-in functions: + + - `cbrt` + - `sequence_match/sequence_count` + - `mask/mask_first_n/mask_last_n` + - `elt` + - `any/any_value` + - `group_bitmap_xor` + - `ntile` + - `nvl` + - `uuid` + - `initcap` + - `regexp_replace_one/regexp_extract_all` + - `multi_search_all_positions/multi_match_any` + - `domain/domain_without_www/protocol` + - `running_difference` + - `bitmap_hash64` + - `murmur_hash3_64` + - `to_monday` + - `not_null_or_empty` + - `window_funnel` + - `group_bit_and/group_bit_or/group_bit_xor` + - `outer combine` + - and all array functions + +# Upgrade Notice + +## Known Issues + +- Use JDK11 will cause BE crash, please use JDK8 instead. + +## Behavior Changed + +- Permission level changes + + Because the catalog level is introduced, the corresponding user permission level will also be changed automatically. The rules are as follows: + + - GlobalPrivs and ResourcePrivs remain unchanged + - Added CatalogPrivs level. + - The original DatabasePrivs level is added with the internal prefix (indicating the db in the internal catalog) + - Add the internal prefix to the original TablePrivs level (representing tbl in the internal catalog) + +- In GroupBy and Having clauses, match on column names in preference to aliases. (#14408) + +- Creating columns starting with `mv_` is no longer supported. `mv_` is a reserved keyword in materialized views (#14361) + +- Removed the default limit of 65535 rows added by the order by statement, and added the session variable `default_order_by_limit` to configure this limit. (#12478) + +- In the table generated by "Create Table As Select", all string columns use the string type uniformly, and no longer distinguish varchar/char/string (#14382) + +- In the audit log, remove the word `default_cluster` before the db and user names. 
(#13499) (#11408) + +- Add sql digest field in audit log (#8919) + +- The union clause always changes the order by logic. In the new version, the order by clause will be executed after the union is executed, unless explicitly associated by parentheses. (#9745) + +- During the decommission operation, the tablet in the recycle bin will be ignored to ensure that the decomission can be completed. (#14028) + +- The returned result of Decimal will be displayed according to the precision declared in the original column, or according to the precision specified in the cast function. (#13437) + +- Changed column name length limit from 64 to 256 (#14671) + +- Changes to FE configuration items + + - The `enable_vectorized_load` parameter is enabled by default. (#11833) + + - Increased `create_table_timeout` value. The default timeout for table creation operations will be increased. (#13520) + + - Modify `stream_load_default_timeout_second` default value to 3 days. + + - Modify the default value of `alter_table_timeout_second` to one month. + + - Increase the parameter `max_replica_count_when_schema_change` to limit the number of replicas involved in the alter job, the default is 100000. (#12850) + + - Add `disable_iceberg_hudi_table`. The iceberg and hudi appearances are disabled by default, and the multi catalog function is recommended. (#13932) + +- Changes to BE configuration items + + - Removed `disable_stream_load_2pc` parameter. 2PC's stream load can be used directly. (#13520) + + - Modify `tablet_rowset_stale_sweep_time_sec` from 1800 seconds to 300 seconds. + + - Redesigned configuration item name about compaction (#13495) + + - Revisited parameter about memory optimization (#13781) + +- Session variable changes + + - Modify the variable `enable_insert_strict` to true by default. This will cause some insert operations that could be executed before, but inserted illegal values, to no longer be executed. 
(#11866) + + - Modified variable `enable_local_exchange` to default to true (#13292) + + - Default data transmission via lz4 compression, controlled by variable `fragment_transmission_compression_codec` (#11955) + + - Add `skip_storage_engine_merge` variable for debugging unique or agg model data (#11952) + + Documentation: https://doris.apache.org//docs/dev/advanced/variables + +- The BE startup script will check whether the value of `/proc/sys/vm/max_map_count` is greater than 2,000,000 (200W). Otherwise, the startup fails. (#11052) + +- Removed mini load interface (#10520) + +- FE Metadata Version + + FE Meta Version changed from 107 to 114, and cannot be rolled back after upgrading. + +## During Upgrade + +1. Upgrade preparation + + - Need to replace: lib, bin directory (start/stop scripts have been modified) + + - BE also needs to configure JAVA_HOME, and already supports JDBC Table and Java UDF. + + - The default JVM Xmx parameter in fe.conf is changed to 8GB. + +2. Possible errors during the upgrade process + + - The repeat function cannot be used and an error is reported: `vectorized repeat function cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13868) + + - schema change fails with error: `desc_tbl is not set. Maybe the FE version is not equal to the BE` (#13822) + + - Vectorized hash join cannot be used and an error will be reported: `vectorized hash join cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13753) + + The above errors will return to normal after a full upgrade. + +## Performance Impact + +- By default, JeMalloc is used as the memory allocator of the new version BE, replacing TcMalloc (#13367) + +- The batch size in the tablet sink is modified to be at least 8K.
(#13912) + +- Disable chunk allocator by default (#13285) + +## Api change + +- BE's http api error return information changed from `{"status": "Fail", "msg": "xxx"}` to more specific ``{"status": "Not found", "msg": "Tablet not found. tablet_id=1202"}``(#9771) + +- In `SHOW CREATE TABLE`, the content of comment is changed from double quotes to single quotes (#10327) + +- Support ordinary users to obtain query profile through http command. (#14016) +Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/manager/query-profile-action + +- Optimized the way to specify the sequence column, you can directly specify the column name. (#13872) +Documentation: https://doris.apache.org//docs/dev/data-operate/update-delete/sequence-column-manual + +- Increase the space usage of remote storage in the results returned by `show backends` and `show tablets` (#11450) + +- Removed Num-Based Compaction related code (#13409) + +- Refactored BE's error code mechanism, some returned error messages will change (#8855) +other + +- Support Docker official image. + +- Support compiling Doris on MacOS(x86/M1) and ubuntu-22.04 + Documentation: https://doris.apache.org//docs/dev/install/source-install/compilation-mac/ + +- Support for image file verification. 
+ + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/metadata-operation/ + +- script related + + - The stop scripts of FE and BE support exiting FE and BE via the `--grace` parameter (use kill -15 signal instead of kill -9) + + - FE start script supports checking the current FE version via --version (#11563) + + - Support to get the data and related table creation statement of a tablet through the `ADMIN COPY TABLET` command, for local problem debugging (#12176) + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET + +- Support to obtain a table creation statement related to a SQL statement through the http api for local problem reproduction (#11979) + + Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/query-schema-action + +- Support to close the compaction function of this table when creating a table, for testing (#11743) + + Search for "disble_auto_compaction" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +# Big Thanks + +Thanks to ALL who contributed to this release! 
(alphabetically) +``` +@924060929 +@a19920714liou +@adonis0147 +@Aiden-Dong +@aiwenmo +@AshinGau +@b19mud +@BePPPower +@BiteTheDDDDt +@bridgeDream +@ByteYue +@caiconghui +@CalvinKirs +@cambyzju +@caoliang-web +@carlvinhust2012 +@catpineapple +@ccoffline +@chenlinzhong +@chovy-3012 +@coderjiang +@cxzl25 +@dataalive +@dataroaring +@dependabot[bot] +@dinggege1024 +@DongLiang-0 +@Doris-Extras +@eldenmoon +@EmmyMiao87 +@englefly +@FreeOnePlus +@Gabriel39 +@gaodayue +@geniusjoe +@gj-zhang +@gnehil +@GoGoWen +@HappenLee +@hello-stephen +@Henry2SS +@hf200012 +@huyuanfeng2018 +@jacktengg +@jackwener +@jeffreys-cat +@Jibing-Li +@JNSimba +@Kikyou1997 +@Lchangliang +@LemonLiTree +@lexoning +@liaoxin01 +@lide-reed +@link3280 +@liutang123 +@liuyaolin +@LOVEGISER +@lsy3993 +@luozenglin +@luzhijing +@madongz +@morningman +@morningman-cmy +@morrySnow +@mrhhsg +@Myasuka +@myfjdthink +@nextdreamblue +@pan3793 +@pangzhili +@pengxiangyu +@platoneko +@qidaye +@qzsee +@SaintBacchus +@SeekingYang +@smallhibiscus +@sohardforaname +@song7788q +@spaces-X +@ssusieee +@stalary +@starocean999 +@SWJTU-ZhangLei +@TaoZex +@timelxy +@Wahno +@wangbo +@wangshuo128 +@wangyf0555 +@weizhengte +@weizuo93 +@wsjz +@wunan1210 +@xhmz +@xiaokang +@xiaokangguo +@xinyiZzz +@xy720 +@yangzhg +@Yankee24 +@yeyudefeng +@yiguolei +@yinzhijian +@yixiutt +@yuanyuan8983 +@zbtzbtzbt +@zenoyang +@zhangboya1 +@zhangstar333 +@zhannngchen +@ZHbamboo +@zhengshiJ +@zhenhb +@zhqu1148980644 +@zuochunwei +@zy-kkk +``` diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md new file mode 100644 index 0000000000000..d5adb31eb5256 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md @@ -0,0 +1,196 @@ +--- +{ + "title": "Release 1.2.1", + "language": "en" +} +--- + + + +# Improvement + +### Supports new type DecimalV3 + +DecimalV3, which supports higher precision and better performance, has the following advantages over 
past versions. + +- Larger representable range, the range of values are significantly expanded, and the valid number range [1,38]. + +- Higher performance, adaptive adjustment of the storage space occupied according to different precision. + +- More complete precision derivation support, for different expressions, different precision derivation rules are applied to the accuracy of the result. + +[DecimalV3](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Types/DECIMALV3/) + +### Support Iceberg V2 + +Support Iceberg V2 (only Position Delete is supported, Equality Delete will be supported in subsequent versions). + +Tables in Iceberg V2 format can be accessed through the Multi-Catalog feature. + +### Support OR condition to IN + +Support converting OR condition to IN condition, which can improve the execution efficiency in some scenarios.[#15437](https://github.com/apache/doris/pull/15437) [#12872](https://github.com/apache/doris/pull/12872) + +### Optimize the import and query performance of JSONB type + +Optimize the import and query performance of JSONB type. [#15219](https://github.com/apache/doris/pull/15219) [#15219](https://github.com/apache/doris/pull/15219) + +### Stream load supports quoted csv data + +Search trim_double_quotes in Document:[https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD) + +### Broker supports Tencent Cloud CHDFS and Baidu Cloud BOS, AFS + +Data on CHDFS, BOS, and AFS can be accessed through Broker. [#15297](https://github.com/apache/doris/pull/15297) [#15448](https://github.com/apache/doris/pull/15448) + +### New function + +Add function `substring_index`. [#15373](https://github.com/apache/doris/pull/15373) + +# Bug Fix + +- In some cases, after upgrading from version 1.1 to version 1.2, the user permission information will be lost. 
[#15144](https://github.com/apache/doris/pull/15144) + +- Fix the problem that the partition value is wrong when using datev2/datetimev2 type for partitioning. [#15094](https://github.com/apache/doris/pull/15094) + +- Bug fixes for a large number of released features. For a complete list see: [PR List](https://github.com/apache/doris/pulls?q=is%3Apr+label%3Adev%2F1.2.1-merged+is%3Aclosed) + +# Upgrade Notice + +### Known Issues + +- Do not use JDK11 as the runtime JDK of BE, it will cause BE Crash. +- The reading performance of the csv format in this version has declined, which will affect the import and reading efficiency of the csv format. We will fix it as soon as possible in the next three-digit version + +### Behavior Changed + +- The default value of the BE configuration item `high_priority_flush_thread_num_per_store` is changed from 1 to 6, to improve the write efficiency of Routine Load. (https://github.com/apache/doris/pull/14775) + +- The default value of the FE configuration item `enable_new_load_scan_node` is changed to true. Import tasks will be performed using the new File Scan Node. No impact on users.[#14808](https://github.com/apache/doris/pull/14808) + +- Delete the FE configuration item `enable_multi_catalog`. The Multi-Catalog function is enabled by default. + +- The vectorized execution engine is forced to be enabled by default.[#15213](https://github.com/apache/doris/pull/15213) + +The session variable enable_vectorized_engine will no longer take effect. Enabled by default. + +To make it valid again, set the FE configuration item `disable_enable_vectorized_engine` to false, and restart FE to make `enable_vectorized_engine` valid again. + + +# Big Thanks + +Thanks to ALL who contributed to this release! 
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@dataroaring + +@Doris-Extras + +@dutyu + +@eldenmoon + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@hf200012 + +@jacktengg + +@Jibing-Li + +@Kikyou1997 + +@liaoxin01 + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@spaces-X + +@starocean999 + +@wangshuo128 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xutaoustc + +@yangzhg + +@yiguolei + +@yixiutt + +@Yulei-Yang + +@yuxuan-luo + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zhengshengjun + + + + + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md new file mode 100644 index 0000000000000..08fd22571a03f --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md @@ -0,0 +1,254 @@ +--- +{ + "title": "Release 1.2.2", + "language": "en" +} +--- + + + +# New Features + +### Lakehouse + +- Support automatic synchronization of Hive metastore. + +- Support reading the Iceberg Snapshot, and viewing the Snapshot history. + +- JDBC Catalog supports PostgreSQL, Clickhouse, Oracle, SQLServer + +- JDBC Catalog supports Insert operation + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/) + +### Auto Bucket + + Set and scale the number of buckets for different partitions to keep the number of tablet in a relatively appropriate range. + +### New Functions + +Add the new function `width_bucket`. 
+ +Reference: [https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description) + +# Behavior Changes + +- Disable BE's page cache by default: `disable_storage_page_cache=true` + +The page cache is turned off to optimize memory usage and reduce the risk of OOM. +However, this may increase the query latency of some small queries. +If you are sensitive to query latency, or have high concurrency and small query scenarios, you can configure *disable_storage_page_cache=false* to enable page cache again. + +- Add new session variable `group_by_and_having_use_alias_first`, used to control whether the group by and having clauses use the alias first. + +Reference: [https://doris.apache.org/docs/dev/advanced/variables](https://doris.apache.org/docs/dev/advanced/variables) + +# Improvement + +### Compaction + +- Support `Vertical Compaction` to optimize the compaction overhead and efficiency of wide tables. + +- Support `Segment Compaction`. Fix -238 and -235 issues with high frequency imports. + +### Lakehouse + +- Hive Catalog can be compatible with Hive version 1/2/3 + +- Hive Catalog can access JuiceFS based HDFS with Broker. + +- Iceberg Catalog supports Hive Metastore and Rest Catalog type. + +- ES Catalog supports _id column mapping. + +- Optimize Iceberg V2 read performance with large number of delete rows. + +- Support for reading Iceberg tables after Schema Evolution + +- Parquet Reader handles column name case correctly. + +### Other + +- Support for accessing Hadoop KMS-encrypted HDFS. + +- Support canceling an Export task in progress. + +- Optimize the performance of `explode_split` with 1x. + +- Optimize the read performance of nullable columns with 3x. + +- Optimize some problems of Memtracker, improve memory management accuracy, and optimize memory application. + + + +# Bug Fix + +- Fixed memory leak when loading data with Doris Flink Connector.
+ +- Fixed the possible thread scheduling problem of BE and reduced the `Fragment sent timeout` error caused by BE thread exhaustion. + +- Fixed various correctness and precision issues of column type datetimev2/decimalv3. + +- Fixed the data correctness issue with Unique Key Merge-on-Read table. + +- Fixed various known issues with the Light Schema Change feature. + +- Fixed various data correctness issues of bitmap type Runtime Filter. + +- Fixed the problem of poor reading performance of csv reader introduced in version 1.2.1. + +- Fixed the problem of BE OOM caused by Spark Load data download phase. + +- Fixed possible metadata compatibility issues when upgrading from version 1.1 to version 1.2. + +- Fixed the metadata problem when creating JDBC Catalog with Resource. + +- Fixed the problem of high CPU usage caused by load operation. + +- Fixed the problem of FE OOM caused by a large number of failed Broker Load jobs. + +- Fixed the problem of precision loss when loading floating-point types. + +- Fixed the problem of memory leak when using 2PC stream load + +# Other + +Add metrics to view the total rowset and segment numbers on BE + +- doris_be_all_rowsets_num and doris_be_all_segments_num + + +# Big Thanks + +Thanks to ALL who contributed to this release!
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@DarvenDuan + +@dataroaring + +@Doris-Extras + +@dutyu + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@htyoung + +@isHuangXin + +@JackDrogon + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@Lchangliang + +@LemonLiTree + +@liaoxin01 + +@liqing-coder + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@qzsee + +@spaces-X + +@stalary + + +@starocean999 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xy720 + +@yangzhg + +@yiguolei + +@yixiutt + +@Yukang-Lian + +@Yulei-Yang + +@zclllyybb + +@zddr + +@zhangstar333 + +@zhannngchen + +@zy-kkk + + + + + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md new file mode 100644 index 0000000000000..cd9226b15e14f --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md @@ -0,0 +1,109 @@ +--- +{ + "title": "Release 1.2.3", + "language": "en" +} +--- + + + +# Improvement + +### JDBC Catalog + +- Support connecting to Doris clusters through JDBC Catalog. + +Currently, Jdbc Catalog only support to use 5.x version of JDBC jar package to connect another Doris database. If you use 8.x version of JDBC jar package, the data type of column may not be matched. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris) + +- Support to synchronize only the specified database through the `only_specified_database` attribute. + +- Support synchronizing table names in the form of lowercase through `lower_case_table_names` to solve the problem of case sensitivity of table names. 
+ +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc) + +- Optimize the read performance of JDBC Catalog. + +### Elasticsearch Catalog + +- Support Array type mapping. + +- Support whether to push down the like expression through the `like_push_down` attribute to control the CPU overhead of the ES cluster. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es) + +### Hive Catalog + +- Support Hive table default partition `_HIVE_DEFAULT_PARTITION_`. + +- Hive Metastore metadata automatic synchronization supports notification event in compressed format. + +### Dynamic Partition Improvement + +- Dynamic partition supports specifying the `storage_medium` parameter to control the storage medium of the newly added partition. + +Reference: [https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition](https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition) + + +### Optimize BE's Threading Model + +- Optimize BE's threading model to avoid stability problems caused by frequent thread creation and destroy. + +# Bugfix + +- Fixed issues with Merge-On-Write Unique Key tables. + +- Fixed compaction related issues. + +- Fixed some delete statement issues causing data errors. + +- Fixed several query execution errors. + +- Fixed the problem of using JDBC catalog to cause BE crash on some operating system. + +- Fixed Multi-Catalog issues. + +- Fixed memory statistics and optimization issues. + +- Fixed decimalV3 and date/datetimev2 related issues. + +- Fixed load transaction stability issues. + +- Fixed light-weight schema change issues. + +- Fixed the issue of using `datetime` type for batch partition creation. + +- Fixed the problem that a large number of failed broker loads would cause the FE memory usage to be too high. 
+ +- Fixed the problem that stream load cannot be canceled after dropping the table. + +- Fixed querying `information_schema` timeout in some cases. + +- Fixed the problem of BE crash caused by concurrent data export using `select outfile`. + +- Fixed transactional insert operation memory leak. + +- Fixed several query/load profile issues, and supports direct download of profiles through FE web ui. + +- Fixed the problem that the BE tablet GC thread caused the IO util to be too high. + +- Fixed the problem that the commit offset is inaccurate in Kafka routine load. + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md new file mode 100644 index 0000000000000..a959a323d06d1 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md @@ -0,0 +1,81 @@ +--- +{ + "title": "Release 1.2.4", + "language": "en" +} +--- + + + + +# Behavior Changed + +- For `DateV2`/`DatetimeV2` and `DecimalV3` type, in the results of `DESCRIBE` and `SHOW CREATE TABLE` statements, they will no longer be displayed as `DateV2`/`DatetimeV2` or `DecimalV3`, but directly displayed as `Date`/`Datetime` or `Decimal`. + + - This change is for compatibility with some BI tools. If you want to see the actual type of the column, you can check it with the `DESCRIBE ALL` statement. + +- When querying tables in the `information_schema` database, the meta information (database, table, column, etc.) in the external catalog is no longer returned by default. + + - This change avoids the problem that the `information_schema` database cannot be queried due to the connection problem of some external catalog, so as to solve the problem of using some BI tools with Doris. It can be controlled by the FE configuration `infodb_support_ext_catalog`, and the default value is `false`, that is, the meta information of external catalog will not be returned.
+ +# Improvement + +### JDBC Catalog + +- Supports connecting to Trino/Presto via JDBC Catalog + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino) + +- JDBC Catalog connects to Clickhouse data source and supports Array type mapping + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse) + +### Spark Load + +- Spark Load supports Resource Manager HA related configuration + + Refer to: https://github.com/apache/doris/pull/15000 + +## Bug Fixes + +- Fixed several connectivity issues with Hive Catalog. + +- Fixed ClassNotFound issues with Hudi Catalog. + +- Optimize the connection pool of JDBC Catalog to avoid too many connections. + +- Fix the problem that OOM will occur when importing data from another Doris cluster through JDBC Catalog. + +- Fixed several query and import planning issues. + +- Fixed several issues with Unique Key Merge-On-Write data model. + +- Fix several BDBJE issues and solve the problem of abnormal FE metadata in some cases. + +- Fix the problem that the `CREATE VIEW` statement does not support Table Valued Function. + +- Fixed several memory statistics issues. + +- Fixed several issues reading Parquet/ORC format. + +- Fixed several issues with DecimalV3. + +- Fixed several issues with SHOW QUERY/LOAD PROFILE. + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md new file mode 100644 index 0000000000000..55af863ba47d6 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md @@ -0,0 +1,199 @@ +--- +{ + "title": "Release 1.2.5", + "language": "en" +} +--- + + + +In version 1.2.5, the Doris team has fixed nearly 210 issues or performance improvements since the release of version 1.2.4.
At the same time, version 1.2.5 is also an iterative version of version 1.2.4, which has higher stability. It is recommended that all users upgrade to this version. + +# Behavior Changed + +- The `start_be.sh` script will check that the maximum number of file handles in the system must be greater than or equal to 65536, otherwise the startup will fail. + +- The BE configuration item `enable_quick_compaction` is set to true by default. The Quick Compaction is enabled by default. This feature is used to optimize the problem of small files in the case of large batch import. + +- After modifying the dynamic partition attribute of the table, it will no longer take effect immediately, but wait for the next task scheduling of the dynamic partition table to avoid some deadlock problems. + +# Improvement + +- Optimize the use of bthread and pthread to reduce the RPC blocking problem during the query process. + +- A button to download Profile is added to the Profile page of the FE web UI. + +- Added FE configuration `recover_with_skip_missing_version`, which is used to query to skip the problematic replica under certain failure conditions. + +- The row-level permission function supports external Catalog. + +- Hive Catalog supports automatic refreshing of kerberos tickets on the BE side without manual refreshing. + +- JDBC Catalog supports tables under the MySQL/ClickHouse system database (`information_schema`). + +# Bug Fixes + +- Fixed the problem of incorrect query results caused by low-cardinality column optimization + +- Fixed several authentication and compatibility issues accessing HDFS. + +- Fixed several issues with float/double and decimal types. + +- Fixed several issues with date/datetimev2 types. + +- Fixed several query execution and planning issues. + +- Fixed several issues with JDBC Catalog. + +- Fixed several query-related issues with Hive Catalog, and Hive Metastore metadata synchronization issues. 
+ +- Fix the problem that the result of `SHOW LOAD PROFILE` statement is incorrect. + +- Fixed several memory related issues. + +- Fixed several issues with `CREATE TABLE AS SELECT` functionality. + +- Fix the problem that the jsonb type causes BE to crash on CPU that do not support avx2. + +- Fixed several issues with dynamic partitions. + +- Fixed several issues with TOPN query optimization. + +- Fixed several issues with the Unique Key Merge-on-Write table model. + +# Big Thanks + +58 contributors participated in the improvement and release of 1.2.5, and thank them for their hard work and dedication: + +@adonis0147 + +@airborne12 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@caiconghui + +@CalvinKirs + +@cambyzju + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dujl + +@dutyu + +@fsilent + +@Gabriel39 + +@gitccl + +@gnehil + +@GoGoWen + +@gongzexin + +@HappenLee + +@herry2038 + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@LemonLiTree + +@liaoxin01 + +@LiBinfeng-01 + +@luwei16 + +@Moonm3n + +@morningman + +@mrhhsg + +@Mryange + +@nextdreamblue + +@nsnhuang + +@qidaye + +@Shoothzj + +@sohardforaname + +@stalary + +@starocean999 + +@SWJTU-ZhangLei + +@wsjz + +@xiaokang + +@xinyiZzz + +@yangzhg + +@yiguolei + +@yixiutt + +@yujun777 + +@Yulei-Yang + +@yuxuan-luo + +@zclllyybb + +@zddr + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zxealous + +@zy-kkk + +@zzzzzzzs diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md new file mode 100644 index 0000000000000..39146b35b15ac --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md @@ -0,0 +1,135 @@ +--- +{ + "title": "Release 1.2.6", + "language": "en" +} +--- + + + + +# Behavior Change + +- Add a BE configuration item `allow_invalid_decimalv2_literal` to control whether can import data that exceeding the decimal's precision, for compatibility with previous logic. 
+ +# Query + +- Fix several query planning issues. +- Support `sql_select_limit` session variable. +- Optimize query cold run performance. +- Fix expr context memory leak. +- Fix the issue that the `explode_split` function was executed incorrectly in some cases. + +## Multi Catalog + +- Fix the issue that synchronizing hive metadata caused FE replay edit log to fail. +- Fix `refresh catalog` operation causing FE OOM. +- Fix the issue that jdbc catalog cannot handle `0000-00-00` correctly. +- Fixed the issue that the kerberos ticket cannot be refreshed automatically. +- Optimize the partition pruning performance of hive. +- Fix the inconsistent behavior of trino and presto in jdbc catalog. +- Fix the issue that hdfs short-circuit read could not be used to improve query efficiency in some environments. +- Fix the issue that the iceberg table on CHDFS could not be read. + +# Storage + +- Fix the wrong calculation of delete bitmap in MOW table. +- Fix several BE memory issues. +- Fix snappy compression issue. +- Fix the issue that jemalloc may cause BE to crash in some cases. + +# Others + +- Fix several java udf related issues. +- Fix the issue that the `recover table` operation incorrectly triggered the creation of dynamic partitions. +- Fix timezone when importing orc files via broker load. +- Fix the issue that the newly added `PERCENT` keyword caused the replay metadata of the routine load job to fail. +- Fix the issue that the `truncate` operation failed to act on a non-partitioned table. +- Fix the issue that the mysql connection was lost due to the `show snapshot` operation. +- Optimize the lock logic to reduce the probability of lock timeout errors when creating tables. +- Add session variable `have_query_cache` to be compatible with some old mysql clients. +- Optimize the error message shown when a load fails.
+ +# Big Thanks + +Thanks to all who contributed to this release: + +@amorynan + +@BiteTheDDDDt + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dutyu + +@Gabriel39 + +@HHoflittlefish777 + +@htyoung + +@jacktengg + +@jeffreys-cat + +@kaijchen + +@kaka11chen + +@Kikyou1997 + +@KnightLiJunLong + +@liaoxin01 + +@LiBinfeng-01 + +@morningman + +@mrhhsg + +@sohardforaname + +@starocean999 + +@vinlee19 + +@wangbo + +@wsjz + +@xiaokang + +@xinyiZzz + +@yiguolei + +@yujun777 + +@Yulei-Yang + +@zhangstar333 + +@zy-kkk + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md new file mode 100644 index 0000000000000..cd47282f4688d --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md @@ -0,0 +1,46 @@ +--- +{ + "title": "Release 1.2.7", + "language": "en" +} +--- + + + +# Bug Fixes + +- Fixed some query issues. +- Fix some storage issues. +- Fix some decimal precision issues. +- Fix query error caused by invalid `sql_select_limit` session variable's value. +- Fix the problem that hdfs short-circuit read cannot be used. +- Fix the problem that Tencent Cloud cosn cannot be accessed. +- Fix several issues with hive catalog kerberos access. +- Fix the problem that stream load profile cannot be used. +- Fix Prometheus monitoring parameter format problem. +- Fix the table creation timeout issue when creating a large number of tablets. + +# New Features + +- Unique Key model supports array type as value column +- Added `have_query_cache` variable for compatibility with MySQL ecosystem.
+- Added `enable_strong_consistency_read` to support strong consistent read between sessions +- FE metrics supports user-level query counter + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md new file mode 100644 index 0000000000000..35cbb7a3cdcf1 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md @@ -0,0 +1,47 @@ +--- +{ + "title": "Release 1.2.8", + "language": "en" +} +--- + + + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Bug Fixes +- Fixed several issues with query execution. +- Fixed several issues with Spark Load. +- Fixed several issues with Parquet Reader. +- Fixed several issues with Orc Reader. +- Fixed Broker "FileSystem closed" problem. +- Fixed several issues with Broker Load. +- Fixed several issues with CTAS execution. +- Fixed several issues with backup and restore. +- Added "Catalog" column in audit log. +- Optimized the metadata cache of Iceberg Catalog. +- Fixed several issues with outfile/export feature. +- Fixed an issue with "replayEraseTable" edit log causing FE start to fail. +- Fixed some security issues. + + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md new file mode 100644 index 0000000000000..61ba6c5c60890 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md @@ -0,0 +1,236 @@ +--- +{ + "title": "Release 2.0.0", + "language": "en" +} +--- + + + + +We are more than excited to announce that, after six months of coding, testing, and fine-tuning, Apache Doris 2.0.0 is now production-ready. Special thanks to the 275 committers who altogether contributed over 4100 optimizations and fixes to the project. 
+ +This new version highlights: + +- 10 times faster data queries +- Enhanced log analytic and federated query capabilities +- More efficient data writing and updates +- Improved multi-tenant and resource isolation mechanisms +- Progresses in elastic scaling of resources and storage-compute separation +- Enterprise-facing features for higher usability + +> Download: https://doris.apache.org/download +> +> GitHub source code: https://github.com/apache/doris/releases/tag/2.0.0-rc04 + +## **A 10 Times Performance Increase** + +In SSB-Flat and TPC-H benchmarking, Apache Doris 2.0.0 delivered **over 10-time faster query performance** compared to an early version of Apache Doris. + +![](/images/release-note-2.0.0-1.png) + +This is realized by the introduction of a smarter query optimizer, inverted index, a parallel execution model, and a series of new functionalities to support high-concurrency point queries. + +### A smarter query optimizer + +The brand new query optimizer, Nereids, has a richer statistical base and adopts the Cascades framework. It is capable of self-tuning in most query scenarios and supports all 99 SQLs in TPC-DS, so users can expect high performance without any fine-tuning or SQL rewriting. + +TPC-H tests showed that Nereids, with no human intervention, outperformed the old query optimizer by a wide margin. Over 100 users have tried Apache Doris 2.0.0 in their production environment and the vast majority of them reported huge speedups in query execution. + +![](/images/release-note-2.0.0-2.png) + +**Doc**: https://doris.apache.org/docs/dev/query-acceleration/nereids/ + +Nereids is enabled by default in Apache Doris 2.0.0: `SET enable_nereids_planner=true`. Nereids collects statistical data by calling the Analyze command. + +### Inverted Index + +In Apache Doris 2.0.0, we introduced inverted index to better support fuzzy keyword search, equivalence queries, and range queries. 
+ +A smartphone manufacturer tested Apache Doris 2.0.0 in their user behavior analysis scenarios. With inverted index enabled, v2.0.0 was able to finish the queries within milliseconds and maintain stable performance as the query concurrency level went up. In this case, it is 5 to 90 times faster than its old version. + +![](/images/release-note-2.0.0-3.png) + +### 20 times higher concurrency capability + +In scenarios like e-commerce order queries and express tracking, a huge number of end data users search for a certain data record simultaneously. These are what we call high-concurrency point queries, which can bring huge pressure on the system. A traditional solution is to introduce Key-Value stores like Apache HBase for such queries, and Redis as a cache layer to ease the burden, but that means redundant storage and higher maintenance costs. + +For a column-oriented DBMS like Apache Doris, the I/O usage of point queries will be multiplied. We need neater execution. Thus, on the basis of columnar storage, we added row storage format and row cache to increase row reading efficiency, short-circuit plans to speed up data retrieval, and prepared statements to reduce frontend overheads. + +After these optimizations, Apache Doris 2.0 reached a concurrency level of **30,000 QPS per node** on YCSB on a 16 Core 64G cloud server with 4×1T hard drives, representing an improvement of **20 times** compared to its older version. This makes Apache Doris a good alternative to HBase in high-concurrency scenarios, so that users don't need to endure extra maintenance costs and redundant storage brought by complicated tech stacks. + +Read more: https://doris.apache.org/blog/High_concurrency + +### A self-adaptive parallel execution model + +Apache Doris 2.0 brought in a Pipeline execution model for higher efficiency and stability in hybrid analytic workloads. In this model, the execution of queries is driven by data.
The blocking operators in all query execution processes are split into pipelines. Whether a pipeline gets an execution thread depends on whether its relevant data is ready. This enables asynchronous blocking operations and more flexible system resource management. Also, this improves CPU efficiency as the system doesn't have to create and destroy threads that much. + +Doc: https://doris.apache.org/docs/dev/query-acceleration/pipeline-execution-engine/ + +**How to enable the Pipeline execution model** + +- The Pipeline execution engine is enabled by default in Apache Doris 2.0: `Set enable_pipeline_engine = true`. +- `parallel_pipeline_task_num` represents the number of pipeline tasks that are parallelly executed in SQL queries. The default value of it is `0`, which means Apache Doris will automatically set the concurrency level to half the number of CPUs in each backend node. Users can change this value as they need it. +- For those who are upgrading to Apache Doris 2.0 from an older version, it is recommended to set the value of `parallel_pipeline_task_num` to that of `parallel_fragment_exec_instance_num` in the old version. + +## A Unified Platform for Multiple Analytic Workloads + +Apache Doris has been pushing its boundaries. Starting as an OLAP engine for reporting, it is now a data warehouse capable of ETL/ELT and more. Version 2.0 is making advancements in its log analysis and data lakehousing capabilities. + +### A 10 times more cost-effective log analysis solution + +Apache Doris 2.0.0 provides native support for semi-structured data. In addition to JSON and Array, it now supports a complex data type: Map. Based on Light Schema Change, it also supports Schema Evolution, which means you can adjust the schema as your business changes. You can add or delete fields and indexes, and change the data types for fields. 
As we introduced inverted index and a high-performance text analysis algorithm into it, it can execute full-text search and dimensional analysis of logs more efficiently. With faster data writing and query speed and lower storage cost, it is 10 times more cost-effective than the common log analytic solution within the industry. + +![](/images/release-note-2.0.0-4.png) + +### Enhanced data lakehousing capabilities + +In Apache Doris 1.2, we introduced Multi-Catalog to allow for auto-mapping and auto-synchronization of data from heterogeneous sources. In version 2.0.0, we extended the list of data sources supported and optimized Doris based on users' needs in production environments. + +![](/images/release-note-2.0.0-5.png) + +Apache Doris 2.0.0 supports dozens of data sources including Hive, Hudi, Iceberg, Paimon, MaxCompute, Elasticsearch, Trino, ClickHouse, and almost all open lakehouse formats. It also supports snapshot queries on Hudi Copy-on-Write tables and read optimized queries on Hudi Merge-on-Read tables. It allows for authorization of Hive Catalog using Apache Ranger, so users can reuse their existing privilege control system. Besides, it supports extensible authorization plug-ins to enable user-defined authorization methods for any catalog. + +TPC-H benchmark tests showed that Apache Doris 2.0.0 is 3~5 times faster than Presto/Trino in queries on Hive tables. This is realized by all-around optimizations (in small file reading, flat table reading, local file cache, ORC/Parquet file reading, Compute Nodes, and information collection of external tables) finished in this development cycle and the distributed execution framework, vectorized execution engine, and query optimizer of Apache Doris. + +![](/images/release-note-2.0.0-6.png) + +All this gives Apache Doris 2.0.0 an edge in data lakehousing scenarios.
With Doris, you can do incremental or overall synchronization of multiple upstream data sources in one place, and expect much higher data query performance than other query engines. The processed data can be written back to the sources or provided for downstream systems. In this way, you can make Apache Doris your unified data analytic gateway. + +## Efficient Data Update + +Data update is important in real-time analysis, since users want to always have access to the latest data, and be able to update data flexibly, such as updating a row or just a few columns, batch updating or deleting their specified data, or even overwriting a whole data partition. + +Efficient data updating has been another hill to climb in data analysis. Apache Hive only supports updates on the partition level, while Hudi and Iceberg do better in low-frequency batch updates instead of real-time updates due to their Merge-on-Read and Copy-on-Write implementations. + +As for data updating, Apache Doris 2.0.0 is capable of: + +- **Faster data writing**: In the pressure tests with an online payment platform, under 20 concurrent data writing tasks, Doris reached a writing throughput of 300,000 records per second and maintained stability throughout the over 10-hour continuous writing process. +- **Partial column update**: Older versions of Doris implement partial column update by `replace_if_not_null` in the Aggregate Key model. In 2.0.0, we enable partial column updates in the Unique Key model. That means you can directly write data from multiple source tables into a flat table, without having to concatenate them into one output stream using Flink before writing. This method avoids a complicated processing pipeline and the extra resource consumption. You can simply specify the columns you need to update.
+- **Conditional update and deletion**: In addition to the simple Update and Delete operations, we realize complicated conditional update and delete operations on the basis of Merge-on-Write. + +## Faster, Stabler, and Smarter Data Writing + +### Higher speed in data writing + +As part of our continuing effort to strengthen the real-time analytic capability of Apache Doris, we have improved the end-to-end real-time data writing capability of version 2.0.0. Benchmark tests reported higher throughput in various writing methods: + +- Stream Load, TPC-H 144G lineitem table, 48-bucket Duplicate table, triple-replica writing: throughput increased by 100% +- Stream Load, TPC-H 144G lineitem table, 48-bucket Unique Key table, triple-replica writing: throughput increased by 200% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Duplicate table: throughput increased by 50% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Unique Key table: throughput increased by 150% + +### Greater stability in high-concurrency data writing + +The sources of system instability often include small file merging, write amplification, and the consequential disk I/O and CPU overheads. Hence, we introduced Vertical Compaction and Segment Compaction in version 2.0.0 to eliminate OOM errors in compaction and avoid the generation of too many segment files during data writing. After such improvements, Apache Doris can write data 50% faster while **using only 10% of the memory that it previously used**. + +Read more: https://doris.apache.org/blog/Compaction + +### Auto-synchronization of table schema + +The latest Flink-Doris-Connector allows users to synchronize an entire database (such as MySQL and Oracle) to Apache Doris in one simple step. According to our test results, one single synchronization task can support the real-time concurrent writing of thousands of tables.
Users no longer need to go through a complicated synchronization procedure because Apache Doris has automated the process. Changes in the upstream data schema will be automatically captured and dynamically updated to Apache Doris in a seamless manner. + +Read more: https://doris.apache.org/blog/FDC + +## A New Multi-Tenant Resource Isolation Solution + +The purpose of multi-tenant resource isolation is to avoid resource preemption in the case of heavy loads. For that sake, older versions of Apache Doris adopted a hard isolation plan featured by Resource Group: Backend nodes of the same Doris cluster would be tagged, and those of the same tag formed a Resource Group. As data was ingested into the database, different data replicas would be written into different Resource Groups, which will be responsible for different workloads. For example, data reading and writing will be conducted on different data tablets, so as to realize read-write separation. Similarly, you can also put online and offline business on different Resource Groups. + +![](/images/release-note-2.0.0-7.png) + +This is an effective solution, but in practice, it happens that some Resource Groups are heavily occupied while others are idle. We want a more flexible way to reduce vacancy rate of resources. Thus, in 2.0.0, we introduce Workload Group resource soft limit. + +![](/images/release-note-2.0.0-8.png) + +The idea is to divide workloads into groups to allow for flexible management of CPU and memory resources. Apache Doris associates a query with a Workload Group, and limits the percentage of CPU and memory that a single query can use on a backend node. The memory soft limit can be configured and enabled by the user. 
+ +When there is a cluster resource shortage, the system will kill the largest memory-consuming query tasks; when there are sufficient cluster resources, once a Workload Group uses more resources than expected, the idle cluster resources will be shared among all the Workload Groups to give full play to the system memory and ensure stable execution of queries. You can also prioritize the Workload Groups in terms of resource allocation. In other words, you can decide which tasks can be assigned with adequate resources and which not. + +Meanwhile, we introduced Query Queue in 2.0.0. Upon Workload Group creation, you can set a maximum query number for a query queue. Queries beyond that limit will wait for execution in the queue. This is to reduce system burden under heavy workloads. + +## Elastic Scaling and Storage-Compute Separation + +When it comes to computation and storage resources, what do users want? + +- **Elastic scaling of computation resources**: Scale up resources quickly in peak times to increase efficiency and scale down in valley times to reduce costs. +- **Lower storage costs**: Use low-cost storage media and separate storage from computation. +- **Separation of workloads**: Isolate the computation resources of different workloads to avoid preemption. +- **Unified management of data**: Simply manage catalogs and data in one place. + +To separate storage and computation is a way to realize elastic scaling of resources, but it demands more efforts in maintaining storage stability, which determines the stability and continuity of OLAP services. To ensure storage stability, we introduced mechanisms including cache management, computation resource management, and garbage collection. + + In this respect, we divide our users into three groups after investigation: + +1. Users with no need for resource scaling +2. Users requiring resource scaling, low storage costs, and workload separation from Apache Doris +3. 
Users who already have a stable large-scale storage system and thus require an advanced compute-storage-separated architecture for efficient resource scaling + +Apache Doris 2.0 provides two solutions to address the needs of the first two types of users. + +1. **Compute nodes**. We introduced stateless compute nodes in version 2.0. Unlike the mix nodes, the compute nodes do not save any data and are not involved in workload balancing of data tablets during cluster scaling. Thus, they are able to quickly join the cluster and share the computing pressure during peak times. In addition, in data lakehouse analysis, these nodes will be the first ones to execute queries on remote storage (HDFS/S3) so there will be no resource competition between internal tables and external tables. + 1. Doc: https://doris.apache.org/docs/dev/advanced/compute_node/ +2. **Hot-cold data separation**. Hot/cold data refers to data that is frequently/seldom accessed, respectively. Generally, it makes more sense to store cold data in low-cost storage. Older versions of Apache Doris supported lifecycle management of table partitions: As hot data cooled down, it would be moved from SSD to HDD. However, data was stored with multiple replicas on HDD, which was still a waste. Now, in Apache Doris 2.0, cold data can be stored in object storage, which is even cheaper and allows single-copy storage. That reduces the storage costs by 70% and cuts down the computation and network overheads that come with storage. + 1. Read more: https://doris.apache.org/blog/HCDS/ + +For a cleaner separation of computation and storage, the VeloDB team is going to contribute the Cloud Compute-Storage-Separation solution to the Apache Doris project. Its performance and stability have stood the test of hundreds of companies in their production environments. The merging of code will be finished by October this year, and all Apache Doris users will be able to get an early taste of it in September.
+ +## Enhanced Usability + +Apache Doris 2.0.0 also highlights some enterprise-facing functionalities. + +### Support for Kubernetes Deployment + +Older versions of Apache Doris communicated based on IP, so any host failure in a Kubernetes deployment that caused a Pod IP drift would lead to cluster unavailability. Now, version 2.0 supports FQDN. That means the failed Doris nodes can recover automatically without human intervention, which lays the foundation for Kubernetes deployment and elastic scaling. + +### Support for Cross-Cluster Replication (CCR) + +Apache Doris 2.0.0 supports cross-cluster replication (CCR). Data changes at the database/table level in the source cluster will be synchronized to the target cluster. You can choose to replicate the incremental data or the overall data. + +It also supports synchronization of DDL, which means DDL statements executed by the source cluster can also be automatically replicated to the target cluster. + +It is simple to configure and use CCR in Doris. Leveraging this functionality, you can implement read-write separation and multi-datacenter replication. + +This feature enables higher data availability, read/write workload separation, and more efficient cross-data-center replication.
+ +## Behavior Change + +- Use rolling upgrade from 1.2-LTS to 2.0.0, and restart upgrade from preview versions of 2.0 to 2.0.0; +- The new query optimizer (Nereids) is enabled by default: `enable_nereids_planner=true`; +- All non-vectorized code has been removed from the system, so the `enable_vectorized_engine` parameter no longer works; +- A new parameter `enable_single_replica_compaction` has been added; +- datev2, datetimev2, and decimalv3 are the default data types in table creation; datev1, datetimev1, and decimalv2 are not supported in table creation; +- decimalv3 is the default data type for JDBC and Iceberg Catalog; +- A new data type `AGG_STATE` has been added; +- The cluster column has been removed from backend tables; +- For better compatibility with BI tools, datev2 and datetimev2 are displayed as date and datetime in the output of `show create table`; +- max_openfiles and swaps checks are added to the backend startup script, so inappropriate system configuration might lead to backend failure; +- Password-free login is not allowed when accessing frontend on localhost; +- If there is a Multi-Catalog in the system, by default, only data of the internal catalog will be displayed when querying information schema; +- A limit has been imposed on the depth of the expression tree. The default value is 200; +- The single quote in the return value of array string has been changed to double quote; +- The Doris processes are renamed to DorisFE and DorisBE. +- The behavior of the AES and SM4 functions with two arguments has changed. See the [related function docs](../../sql-manual/sql-functions/encrypt-digest-functions/sm4-encrypt.md) for more information. + +## Embarking on the 2.0.0 Journey + +To make Apache Doris 2.0.0 production-ready, we invited hundreds of enterprise users to engage in the testing and optimized it for better performance, stability, and usability. In the next phase, we will continue responding to user needs with agile release planning.
We plan to launch 2.0.1 in late August and 2.0.2 in September, as we keep fixing bugs and adding new features. We also plan to release an early version of 2.1 in September to bring a few long-requested capabilities to you. For example, in Doris 2.1, the Variant data type will better serve the schema-free analytic needs of semi-structured data; the multi-table materialized views will be able to simplify the data scheduling and processing link while speeding up queries; more and neater data ingestion methods will be added and nested composite data types will be realized. + +If you have any questions or ideas when investigating, testing, and deploying Apache Doris, please find us on [Slack](https://t.co/ZxJuNJHXb2). Our developers will be happy to hear them and provide targeted support. + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md new file mode 100644 index 0000000000000..d8c19fb67525b --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md @@ -0,0 +1,224 @@ +--- +{ + "title": "Release 2.0.1", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, 383 improvements and bug fixes have been made in Doris 2.0.1. 
+ +## Behavior Changes + +- [https://github.com/apache/doris/pull/21302](https://github.com/apache/doris/pull/21302) + +## Improvements + +### functionality and stability of array and map datatypes +- [https://github.com/apache/doris/pull/22793](https://github.com/apache/doris/pull/22793) +- [https://github.com/apache/doris/pull/22927](https://github.com/apache/doris/pull/22927) +- https://github.com/apache/doris/pull/22738 +- https://github.com/apache/doris/pull/22347 +- https://github.com/apache/doris/pull/23250 +- https://github.com/apache/doris/pull/22300 + +### performance for inverted index query +- https://github.com/apache/doris/pull/22836 +- https://github.com/apache/doris/pull/23381 +- https://github.com/apache/doris/pull/23389 +- https://github.com/apache/doris/pull/22570 + +### performance for bitmap, like, scan, agg functions +- https://github.com/apache/doris/pull/23172 +- https://github.com/apache/doris/pull/23495 +- https://github.com/apache/doris/pull/23476 +- https://github.com/apache/doris/pull/23396 +- https://github.com/apache/doris/pull/23182 +- https://github.com/apache/doris/pull/22216 + +### functionality and stability of CCR +- https://github.com/apache/doris/pull/22447 +- https://github.com/apache/doris/pull/22559 +- https://github.com/apache/doris/pull/22173 +- https://github.com/apache/doris/pull/22678 + +### merge on write unique table + +- https://github.com/apache/doris/pull/22282 +- https://github.com/apache/doris/pull/22984 +- https://github.com/apache/doris/pull/21933 +- https://github.com/apache/doris/pull/22874 + +### optimizer table stats and analyze + +- https://github.com/apache/doris/pull/22658 +- https://github.com/apache/doris/pull/22211 +- https://github.com/apache/doris/pull/22775 +- https://github.com/apache/doris/pull/22896 +- https://github.com/apache/doris/pull/22788 +- https://github.com/apache/doris/pull/22882 +- + +### functionality and performance of multi catalog + +- https://github.com/apache/doris/pull/22949 
+- https://github.com/apache/doris/pull/22923 +- https://github.com/apache/doris/pull/22336 +- https://github.com/apache/doris/pull/22915 +- https://github.com/apache/doris/pull/23056 +- https://github.com/apache/doris/pull/23297 +- https://github.com/apache/doris/pull/23279 + + +## Important Bug fixes + +- https://github.com/apache/doris/pull/22673 +- https://github.com/apache/doris/pull/22656 +- https://github.com/apache/doris/pull/22892 +- https://github.com/apache/doris/pull/22959 +- https://github.com/apache/doris/pull/22902 +- https://github.com/apache/doris/pull/22976 +- https://github.com/apache/doris/pull/22734 +- https://github.com/apache/doris/pull/22840 +- https://github.com/apache/doris/pull/23008 +- https://github.com/apache/doris/pull/23003 +- https://github.com/apache/doris/pull/22966 +- https://github.com/apache/doris/pull/22965 +- https://github.com/apache/doris/pull/22784 +- https://github.com/apache/doris/pull/23049 +- https://github.com/apache/doris/pull/23084 +- https://github.com/apache/doris/pull/22947 +- https://github.com/apache/doris/pull/22919 +- https://github.com/apache/doris/pull/22979 +- https://github.com/apache/doris/pull/23096 +- https://github.com/apache/doris/pull/23113 +- https://github.com/apache/doris/pull/23062 +- https://github.com/apache/doris/pull/22918 +- https://github.com/apache/doris/pull/23026 +- https://github.com/apache/doris/pull/23175 +- https://github.com/apache/doris/pull/23167 +- https://github.com/apache/doris/pull/23015 +- https://github.com/apache/doris/pull/23165 +- https://github.com/apache/doris/pull/23264 +- https://github.com/apache/doris/pull/23246 +- https://github.com/apache/doris/pull/23198 +- https://github.com/apache/doris/pull/23221 +- https://github.com/apache/doris/pull/23277 +- https://github.com/apache/doris/pull/23249 +- https://github.com/apache/doris/pull/23272 +- https://github.com/apache/doris/pull/23383 +- https://github.com/apache/doris/pull/23372 +- 
https://github.com/apache/doris/pull/23399 +- https://github.com/apache/doris/pull/23295 +- https://github.com/apache/doris/pull/23446 +- https://github.com/apache/doris/pull/23406 +- https://github.com/apache/doris/pull/23387 +- https://github.com/apache/doris/pull/23421 +- https://github.com/apache/doris/pull/23456 +- https://github.com/apache/doris/pull/23361 +- https://github.com/apache/doris/pull/23402 +- https://github.com/apache/doris/pull/23369 +- https://github.com/apache/doris/pull/23245 +- https://github.com/apache/doris/pull/23532 +- https://github.com/apache/doris/pull/23529 +- https://github.com/apache/doris/pull/23601 + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.1-merged+is%3Aclosed) . + + +## Big Thanks + +Thanks all who contribute to this release: + +@adonis0147 +@airborne12 +@amorynan +@AshinGau +@BePPPower +@BiteTheDDDDt +@bobhan1 +@ByteYue +@caiconghui +@CalvinKirs +@csun5285 +@DarvenDuan +@deadlinefen +@DongLiang-0 +@Doris-Extras +@dutyu +@englefly +@freemandealer +@Gabriel39 +@GoGoWen +@HappenLee +@hello-stephen +@HHoflittlefish777 +@hubgeter +@hust-hhb +@JackDrogon +@jacktengg +@jackwener +@Jibing-Li +@kaijchen +@kaka11chen +@Kikyou1997 +@Lchangliang +@LemonLiTree +@liaoxin01 +@LiBinfeng-01 +@lsy3993 +@luozenglin +@morningman +@morrySnow +@mrhhsg +@Mryange +@mymeiyi +@shuke987 +@sohardforaname +@starocean999 +@TangSiyang2001 +@Tanya-W +@ucasfl +@vinlee19 +@wangbo +@wsjz +@wuwenchi +@xiaokang +@XieJiann +@xinyiZzz +@yujun777 +@Yukang-Lian +@Yulei-Yang +@zclllyybb +@zddr +@zenoyang +@zgxme +@zhangguoqiang666 +@zhangstar333 +@zhannngchen +@zhiqiang-hhhh +@zxealous +@zy-kkk +@zzzxl1993 +@zzzzzzzs + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md new file mode 100644 index 0000000000000..5d8592a0ee25c --- /dev/null +++ 
b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.10", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 83 improvements and bug fixes have been made in Doris 2.0.10 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## Improvement and Optimizations + +- This enhancement introduces the `read_only` and `super_read_only` variables to the database system, ensuring compatibility with MySQL's read-only modes. + +- When the check status is not IO_ERROR, the disk path should not be added to the broken list. This ensures that only disks with actual I/O errors are marked as broken. + +- When performing a Create Table As Select (CTAS) operation from an external table, convert the `VARCHAR` column to `STRING` type. + +- Support mapping Paimon column type "ROW" to Doris type "STRUCT" + +- Choose disk tolerate with little skew when creating tablet + +- Write editlog to `set replica drop` to avoid confusing status on follower FE + +- Make the schema change memory space adaptive to avoid memory over limit + +- Inverted index 'unicode' tokenizer supports configuration to exclude stop words + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.9...2.0.10) . 
+ +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @BePPPower, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @deardeng, @DongLiang-0, @eldenmoon, @felixwluo, @HappenLee, @hubgeter, @jackwener, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @luennng, @morningman, @morrySnow, @Mryange, @nextdreamblue, @qidaye, @starocean999, @suxiaogang223, @SWJTU-ZhangLei, @w41ter, @xiaokang, @xy720, @yujun777, @Yukang-Lian, @zhangstar333, @zxealous, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md new file mode 100644 index 0000000000000..1a2598b0d41a0 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md @@ -0,0 +1,60 @@ +--- +{ + "title": "Release 2.0.11", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 123 improvements and bug fixes have been made in Doris 2.0.11 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## 1 Behavior change + +Since the inverted index is now mature and stable, it can replace the old BITMAP INDEX. Therefore, any newly created `BITMAP INDEX` will automatically switch to an `INVERTED INDEX`, while existing `BITMAP INDEX` will remain unchanged. This entire switching process is transparent to the user, with no changes to writing or querying. Additionally, users can disable this automatic switch by setting the FE configuration `enable_create_bitmap_index_as_inverted_index` to false. 
[#35528](https://github.com/apache/doris/pull/35528) + + +## 2 Improvement and optimizations + +- Add Trino JDBC Catalog type mapping for JSON and TIME + +- FE exits when it fails to transfer to (non-)master state, to prevent an unknown state and excessive logging + +- Write audit log while doing drop stats table. + +- Ignore min/max column stats if table is partially analyzed to avoid inefficient query plan + +- Support minus operation for set like `set1 - set2` + +- Improve performance of LIKE and REGEXP clauses with concat (col, pattern_str), e.g. `col1 LIKE concat('%', col2, '%')` + +- Add query options for short circuit queries for upgrade compatibility + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.10...2.0.11). + +## Credits + +Thanks to all who contributed to this release: + +@AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @Gabriel39, @GoGoWen, @HHoflittlefish777, @hubgeter, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaka11chen, @kobe6th, @LiBinfeng-01, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @nextdreamblue, @qidaye, @sjyango, @starocean999, @SWJTU-ZhangLei, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XieJiann, @xy720, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md new file mode 100644 index 0000000000000..0bc289c91a8ef --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md @@ -0,0 +1,58 @@ +--- +{ + "title": "Release 2.0.12", + "language": "en" +} +--- + + + +Thanks to our community developers and users for their contributions. Doris version 2.0.12 will bring 99 improvements and bug fixes.
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- No longer set the default table comment to the table type. Instead, set it to be empty by default, for example, change COMMENT 'OLAP' to COMMENT ' '. This new behavior is more friendly for BI software that relies on table comments. [#35855](https://github.com/apache/doris/pull/35855) + +- Change the type of the `@@autocommit` variable from `BOOLEAN` to `BIGINT` to prevent errors from certain MySQL clients (such as .NET MySQL.Data). [#33282](https://github.com/apache/doris/pull/33282) + + +## Improvements + +- Remove the `disable_nested_complex_type` parameter and allow the creation of nested `ARRAY`, `MAP`, and `STRUCT` types by default. [#36255](https://github.com/apache/doris/pull/36255) + +- The HMS catalog supports the `SHOW CREATE DATABASE` command. [#28145](https://github.com/apache/doris/pull/28145) + +- Add more inverted index metrics to the query profile. [#36545](https://github.com/apache/doris/pull/36545) + +- Cross-Cluster Replication (CCR) supports inverted indices. [#31743](https://github.com/apache/doris/pull/31743) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.11...2.0.12) , with the key features and improvements highlighted below. 
+ + + +## Credits + +Thanks all who contribute to this release: + +@airborne12, @amorynan, @BiteTheDDDDt, @cambyzju, @caoliang-web, @dataroaring, @eldenmoon, @feiniaofeiafei, @felixwluo, @gavinchou, @HappenLee, @hello-stephen, @jacktengg, @Jibing-Li, @Johnnyssc, @liaoxin01, @LiBinfeng-01, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @qidaye, @qzsee, @starocean999, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XuPengfei-1020, @xy720, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md new file mode 100644 index 0000000000000..1b6e54d948d7d --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md @@ -0,0 +1,61 @@ +--- +{ + "title": "Release 2.0.13", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 112 improvements and bug fixes have been made in Doris 2.0.13 version + +[Quick Download](https://doris.apache.org/download/) + +## Behavior changes + +SQL input is treated as multiple statements only when the `CLIENT_MULTI_STATEMENTS` setting is enabled on the client side, enhancing compatibility with MySQL. [#36759](https://github.com/apache/doris/pull/36759) + +## New features + +- A new BE configuration `allow_zero_date` has been added, allowing dates with all zeros. When set to `false`, `0000-00-00` is parsed as `NULL`, and when set to `true`, it is parsed as `0000-01-01`. The default value is `false` to maintain consistency with previous behavior. [#34961](https://github.com/apache/doris/pull/34961) + +- `LogicalWindow` and `LogicalPartitionTopN` support multi-field predicate pushdown to improve performance.
[#36828](https://github.com/apache/doris/pull/36828) + +- The ES Catalog now maps ES `nested` or `object` types to Doris `JSON` types. [#37101](https://github.com/apache/doris/pull/37101) + +## Improvements + +- Queries with `LIMIT` end reading data earlier to reduce resource consumption and improve performance. [#36535](https://github.com/apache/doris/pull/36535) + +- Special JSON data with empty keys is now supported. [#36762](https://github.com/apache/doris/pull/36762) + +- Stability and usability of routine load have been improved, including load balancing, automatic recovery, exception handling, and more user-friendly error messages. [#36450](https://github.com/apache/doris/pull/36450) [#35376](https://github.com/apache/doris/pull/35376) [#35266](https://github.com/apache/doris/pull/35266) [ #33372](https://github.com/apache/doris/pull/33372) [#32282](https://github.com/apache/doris/pull/32282) [#32046](https://github.com/apache/doris/pull/32046) [#32021](https://github.com/apache/doris/pull/32021) [#31846](https://github.com/apache/doris/pull/31846) [#31273](https://github.com/apache/doris/pull/31273) + +- BE load balancing selection of hard disk strategy and speed optimization. [#36826](https://github.com/apache/doris/pull/36826) [#36795](https://github.com/apache/doris/pull/36795) [#36509](https://github.com/apache/doris/pull/36509) + +- Stability and usability of the JDBC catalog have been improved, including encryption, thread pool connection count configuration, and more user-friendly error messages. [#36940](https://github.com/apache/doris/pull/36940) [#36720](https://github.com/apache/doris/pull/36720) [#30880](https://github.com/apache/doris/pull/30880) [#35692](https://github.com/apache/doris/pull/35692) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.12...2.0.13) , with the key features and improvements highlighted below. 
+ +## Credits + +Thanks to all who contributed to this release: + +@Gabriel39, @Jibing-Li, @Johnnyssc, @Lchangliang, @LiBinfeng-01, @SWJTU-ZhangLei, @Thearas, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @jacktengg, @kaijchen, @liutang123, @luwei16, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @sollhui, @starocean999, @w41ter, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md new file mode 100644 index 0000000000000..061c5cb7a1093 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.14", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 110 improvements and bug fixes have been made in Doris 2.0.14 version + + +## 1 New features + +- Adds a REST interface to retrieve the most recent query profile: `curl http://user:password@127.0.0.1:8030/api/profile/text` [#38268](https://github.com/apache/doris/pull/38268) + +## 2 Improvements + +- Optimizes the primary key point query performance for MOW tables with sequence columns [#38287](https://github.com/apache/doris/pull/38287) + +- Enhances the performance of inverted index queries with many conditions [#35346](https://github.com/apache/doris/pull/35346) + +- Automatically enables the `support_phrase` option when creating a tokenized inverted index to accelerate `match_phrase` phrase queries [#37949](https://github.com/apache/doris/pull/37949) + +- Supports simplified SQL hints, for example: `SELECT /*+ query_timeout(3000) */ * FROM t;` [#37720](https://github.com/apache/doris/pull/37720) + +- Automatically retries reading from object storage when encountering a `429` error to improve stability 
[#35396](https://github.com/apache/doris/pull/35396) + +- LEFT SEMI / ANTI JOIN terminates subsequent matching execution upon matching a qualifying data row to enhance performance. [#34703](https://github.com/apache/doris/pull/34703) + +- Prevents coredump when returning illegal data to MySQL results. [#28069](https://github.com/apache/doris/pull/28069) + +- Unifies the output of type names in lowercase to maintain compatibility with MySQL and be more friendly to BI tools. [#38521](https://github.com/apache/doris/pull/38521) + + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.13...2.0.14) , with the key features and improvements highlighted below. + +## Credits + +Thanks all who contribute to this release: + +@ByteYue, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Lchangliang, @LiBinfeng-01, @Mryange, @XieJiann, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @biohazard4321, @cambyzju, @csun5285, @eldenmoon, @englefly, @freemandealer, @hello-stephen, @hubgeter, @kaijchen, @liaoxin01, @luwei16, @morningman, @morrySnow, @mymeiyi, @qidaye, @sollhui, @starocean999, @w41ter, @wuwenchi, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md new file mode 100644 index 0000000000000..58237f7c3f097 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md @@ -0,0 +1,91 @@ +--- +{ + "title": "Release 2.0.15", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 157 improvements and bug fixes have been made in Doris 2.0.15 version + +- Quick Download: https://doris.apache.org/download + +- GitHub: https://github.com/apache/doris/releases/tag/2.0.15 + +## 1 Behavior Change + +NA + +## 2 New Features + +- Restore now supports deleting redundant tablets 
and partition options. [#39028](https://github.com/apache/doris/pull/39028) + +- Support JSON function `json_search`.[#40948](https://github.com/apache/doris/pull/40948) + +## 3 Improvement and Optimizations + +### Stability + +- Add a FE configuration `abort_txn_after_lost_heartbeat_time_second` for transaction abort time. [#28662](https://github.com/apache/doris/pull/28662) + +- Abort transactions after a BE loses heartbeat for over 1 minute instead of 5 seconds, to avoid overly sensitive transaction aborts. [#22781](https://github.com/apache/doris/pull/22781) + +- Delay scheduling EOF tasks of routine load to avoid an excessive number of small transactions. [#39975](https://github.com/apache/doris/pull/39975) + +- Prefer querying from online disk services to be more robust. [#39467](https://github.com/apache/doris/pull/39467) + +- Skip checking newly inserted rows in non-strict mode partial updates if the row's delete sign is marked. [#40322](https://github.com/apache/doris/pull/40322) + +- To prevent FE OOM, limit the number of tablets in backup tasks, with a default value of 300,000. [#39987](https://github.com/apache/doris/pull/39987) + +### Performance + +- Optimize slow column updates caused by concurrent column updates and compactions. [#38487](https://github.com/apache/doris/pull/38487) + +- When a NullLiteral exists in a filter condition, it can now be folded into False and further converted to an EmptySet to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Improve performance of `ORDER BY` permutation. [#38985](https://github.com/apache/doris/pull/38985) + +- Improve the performance of string processing in inverted indexes. [#37395](https://github.com/apache/doris/pull/37395) + +### Optimizer and Statistics + +- Added support for statements beginning with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Polish aggregate function signature matching. 
[#39352](https://github.com/apache/doris/pull/39352) + +- Drop column statistics and trigger auto analysis after schema change. [#39101](https://github.com/apache/doris/pull/39101) + +- Support dropping cached stats using `DROP CACHED STATS table_name`. [#39367](https://github.com/apache/doris/pull/39367) + +### Multi Catalog and Others + +- Optimize JDBC Catalog refresh to reduce the frequency of client creation. [#40261](https://github.com/apache/doris/pull/40261) + +- Fix thread leaks in JDBC Catalog under certain conditions. [#39423](https://github.com/apache/doris/pull/39423) + +- ARRAY MAP STRUCT types now support `REPLACE_IF_NOT_NULL`. [#38304](https://github.com/apache/doris/pull/38304) + +- Retry delete jobs for failures that are not `DELETE_INVALID_XXX`. [#37834](https://github.com/apache/doris/pull/37834) + +**Credits** + +@924060929, @BePPPower, @BiteTheDDDDt, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Johnnyssc, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Toms1999, @Vallishp, @Yukang-Lian, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @htyoung, @hubgeter, @justfortaste, @liaoxin01, @liugddx, @liutang123, @luwei16, @mongo360, @morrySnow, @qidaye, @smallx, @sollhui, @starocean999, @w41ter, @xiaokang, @xzj7019, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhannngchen, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md new file mode 100644 index 0000000000000..3f8e89cddf946 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md @@ -0,0 +1,157 @@ +--- +{ + "title": "Release 2.0.2", + "language": "en" +} +--- + + + +# Release 2.0.2 + +Thanks to our community users and developers, 489 improvements and bug fixes have been made in Doris 2.0.2. 
+ +## Behavior Changes + +- [Remove json -> operator convert to json_extract #24679](https://github.com/apache/doris/pull/24679) + + Remove the json '->' operator because it conflicts with the lambda function syntax. It is syntactic sugar for the function json_extract, which can be used instead. +- [Start the script to set metadata_failure_recovery #24308](https://github.com/apache/doris/pull/24308) + + Move metadata_failure_recovery from fe.conf to a start_fe.sh argument to prevent it from being used unexpectedly. +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Optimize priority_ network matching logic for be #23795](https://github.com/apache/doris/pull/23795) +- [Fix cancel load failed because Job could not be cancelled… #17730](https://github.com/apache/doris/pull/17730) + + Allow cancelling a retrying load job. + +## Improvements + +### Easier to use + +- [Support custom lib dir to save custom libs #23887](https://github.com/apache/doris/pull/23887) + + Add a custom_lib dir where users can place custom lib files; custom_lib will not be replaced. +- [Optimize priority_ network matching logic #23784](https://github.com/apache/doris/pull/23784) + + Optimize priority_network logic to avoid errors when this config is wrong or not configured. +- [Row policy support role #23022](https://github.com/apache/doris/pull/23022) + + Support role-based auth for row policy. + +### New optimizer Nereids statistics collection improvement + +- [Disable file cache while running analysis tasks. #23663](https://github.com/apache/doris/pull/23663) +- [Show column stats even when error occurred. #23703](https://github.com/apache/doris/pull/23703) +- [Support basic jdbc external table stats collection. 
#23965](https://github.com/apache/doris/pull/23965) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) + +### Better support for JDBC, HDFS, Hive, MySQL, Max Compute, Multi-Catalog + +- [Support hadoop viewfs. #24168](https://github.com/apache/doris/pull/24168) +- [Avoid calling checksum when replaying creating jdbc catalog and fix ranger issue #22369](https://github.com/apache/doris/pull/22369) +- [Optimize the JDBC Catalog connection error message #23868](https://github.com/apache/doris/pull/23868) + + Improve property check and error message for JDBC catalog +- [Fix mc decimal type parse, fix wrong obj location #24242](https://github.com/apache/doris/pull/24242) + + Fix some issues for Max Compute catalog +- [Support sql cache for hms catalog #23391](https://github.com/apache/doris/pull/23391) + + SQL cache for Hive catalog +- [Merge hms partition events. #22869](https://github.com/apache/doris/pull/22869) + + Improve performance for Hive metadata sync +- [Add metadata_name_ids for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql #22702](https://github.com/apache/doris/pull/22702) + +### Performance for inverted index query + +- [Add bkd index query cache to improve perf #23952](https://github.com/apache/doris/pull/23952) +- [Improve performance for count on index other than match #24678](https://github.com/apache/doris/pull/24678) +- [Improve match performance without index #24751](https://github.com/apache/doris/pull/24751) +- [Optimize multiple terms conjunction query #23871](https://github.com/apache/doris/pull/23871) +Improve performance of MATCH_ALL +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +Improve performance of MATCH + +### Improve Array functions + +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Improve array union support multi 
params #24327](https://github.com/apache/doris/pull/24327) +- [Improve explode func with array nested complex type #24455](https://github.com/apache/doris/pull/24455) + +## Important Bug fixes + +- [The parameter positions of timestamp diff function to sql are reversed #23601](https://github.com/apache/doris/pull/23601) +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Fix query cache returns wrong result after deleting partitions. #23555](https://github.com/apache/doris/pull/23555) +- [Fix potential data loss when clone task's dst tablet is cooldown replica #17644](https://github.com/apache/doris/pull/17644) +- [Fix array map batch append data with right next_array_item_rowid #23779](https://github.com/apache/doris/pull/23779) +- [Fix or to in rule #23940](https://github.com/apache/doris/pull/23940) +- [Fix 'char' function's toSql implementation is wrong #23860](https://github.com/apache/doris/pull/23860) +- [Record wrong best plan properties #23973](https://github.com/apache/doris/pull/23973) +- [Make TVF's distribution spec always be RANDOM #24020](https://github.com/apache/doris/pull/24020) +- [External scan use STORAGE_ANY instead of ANY as distibution #24039](https://github.com/apache/doris/pull/24039) +- [Runtimefilter target is not SlotReference #23958](https://github.com/apache/doris/pull/23958) +- [mv in select materialized_view should disable show table #24104](https://github.com/apache/doris/pull/24104) +- [Fail over to remote file reader if local cache failed #24097](https://github.com/apache/doris/pull/24097) +- [Fix revoke role operation cause fe down #23852](https://github.com/apache/doris/pull/23852) +- [Handle status code correctly and add a new error code `ENTRY_NOT_FOUND` #24139](https://github.com/apache/doris/pull/24139) +- [Fix leaky abstraction and shield the status code `END_OF_FILE` from upper layers #24165](https://github.com/apache/doris/pull/24165) +- [Fix bug that Read 
garbled files caused be crash. #24164](https://github.com/apache/doris/pull/24164) +- [Fix be core when user sepcified empty `column_separator` using hdfs tvf #24369](https://github.com/apache/doris/pull/24369) +- [Fix need to restart BE after replacing the jar package in java-udf #24372](https://github.com/apache/doris/pull/24372) +- [Need to call 'set_version' in nested functions #24381](https://github.com/apache/doris/pull/24381) +- [windown_funnel compatibility issue with multi backends #24385](https://github.com/apache/doris/pull/24385) +- [correlated anti join shouldn't be translated to null aware anti join #24290](https://github.com/apache/doris/pull/24290) +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Fix analyze failed when there are thousands of partitions. #24521](https://github.com/apache/doris/pull/24521) +- [Do not use enum as the data type for JavaUdfDataType. #24460](https://github.com/apache/doris/pull/24460) +- [Fix multi window projection issue temporarily #24568](https://github.com/apache/doris/pull/24568) +- [Make metadata compatible with 2.0.3 #24610](https://github.com/apache/doris/pull/24610) +- [Select outfile column order is wrong #24595](https://github.com/apache/doris/pull/24595) +- [Incorrect result of semi/anti mark join #24616](https://github.com/apache/doris/pull/24616) +- [Fix broker read issue #24635](https://github.com/apache/doris/pull/24635) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) +- [Fixed bug when parsing multi-character delimiters. 
#24572](https://github.com/apache/doris/pull/24572) +- [Fix timezone parse when there is no tzfile #24578](https://github.com/apache/doris/pull/24578) +- [We need to issue an error when starting FE without setting the Java home environment #23943](https://github.com/apache/doris/pull/23943) +- [Enable_unique_key_partial_update should be forwarded to master #24697](https://github.com/apache/doris/pull/24697) +- [Fix paimon file catalog meta issue and replication num analysis issue #24681](https://github.com/apache/doris/pull/24681) +- [Add more log for ingest_binlog && Fix ingest_binlog not rewrite rowset_meta tablet_uid #24617](https://github.com/apache/doris/pull/24617) +- [Do not abort when a disk is broken #24692](https://github.com/apache/doris/pull/24692) +- [colocate join could not work well on full outer join #24700](https://github.com/apache/doris/pull/24700) +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +- [Optimize the reading efficiency of nullable (string) columns. #24698](https://github.com/apache/doris/pull/24698) +- [Fix segment cache core when output rowset is nullptr #24778](https://github.com/apache/doris/pull/24778) +- [Fix duplicate key in schema change #24782](https://github.com/apache/doris/pull/24782) +- [Make metadata compatible for future version after 2.0.2 #24800](https://github.com/apache/doris/pull/24800) +- [Fix map/array deserialize string with quote pair #24808](https://github.com/apache/doris/pull/24808) +- [Failed on arm platform, with clang compiler and pch on, close #24633 #24636](https://github.com/apache/doris/pull/24636) +- [Table column order is changed if add a column and do truncate #24981](https://github.com/apache/doris/pull/24981) +- [Make parser mode coarse grained by default #24949](https://github.com/apache/doris/pull/24949) + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.2-merged+is%3Aclosed) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +[@adonis0147](https://github.com/adonis0147) [@airborne12](https://github.com/airborne12) [@amorynan](https://github.com/amorynan) [@AshinGau](https://github.com/AshinGau) [@BePPPower](https://github.com/BePPPower) [@BiteTheDDDDt](https://github.com/BiteTheDDDDt) [@bobhan1](https://github.com/bobhan1) [@ByteYue](https://github.com/ByteYue) [@caiconghui](https://github.com/caiconghui) [@CalvinKirs](https://github.com/CalvinKirs) [@cambyzju](https://github.com/cambyzju) [@ChengDaqi2023](https://github.com/ChengDaqi2023) [@ChinaYiGuan](https://github.com/ChinaYiGuan) [@CodeCooker17](https://github.com/CodeCooker17) [@csun5285](https://github.com/csun5285) [@dataroaring](https://github.com/dataroaring) [@deadlinefen](https://github.com/deadlinefen) [@DongLiang-0](https://github.com/DongLiang-0) [@Doris-Extras](https://github.com/Doris-Extras) [@dutyu](https://github.com/dutyu) [@eldenmoon](https://github.com/eldenmoon) [@englefly](https://github.com/englefly) [@freemandealer](https://github.com/freemandealer) [@Gabriel39](https://github.com/Gabriel39) [@gnehil](https://github.com/gnehil) [@GoGoWen](https://github.com/GoGoWen) [@gohalo](https://github.com/gohalo) [@HappenLee](https://github.com/HappenLee) [@hello-stephen](https://github.com/hello-stephen) [@HHoflittlefish777](https://github.com/HHoflittlefish777) [@hubgeter](https://github.com/hubgeter) [@hust-hhb](https://github.com/hust-hhb) [@ixzc](https://github.com/ixzc) [@JackDrogon](https://github.com/JackDrogon) [@jacktengg](https://github.com/jacktengg) [@jackwener](https://github.com/jackwener) [@Jibing-Li](https://github.com/Jibing-Li) [@JNSimba](https://github.com/JNSimba) [@kaijchen](https://github.com/kaijchen) [@kaka11chen](https://github.com/kaka11chen) [@Kikyou1997](https://github.com/Kikyou1997) [@Lchangliang](https://github.com/Lchangliang) [@LemonLiTree](https://github.com/LemonLiTree) [@liaoxin01](https://github.com/liaoxin01) 
[@LiBinfeng-01](https://github.com/LiBinfeng-01) [@liugddx](https://github.com/liugddx) [@luwei16](https://github.com/luwei16) [@mongo360](https://github.com/mongo360) [@morningman](https://github.com/morningman) [@morrySnow](https://github.com/morrySnow) @mrhhsg @Mryange @mymeiyi @neuyilan @pingchunzhang @platoneko @qidaye @realize096 @RYH61 @shuke987 @sohardforaname @starocean999 @SWJTU-ZhangLei @TangSiyang2001 @Tech-Circle-48 @w41ter @wangbo @wsjz @wuwenchi @wyx123654 @xiaokang @XieJiann @xinyiZzz @XuJianxu @xutaoustc @xy720 @xyfsjq @xzj7019 @yiguolei @yujun777 @Yukang-Lian @Yulei-Yang @zclllyybb @zddr @zhangguoqiang666 @zhangstar333 @ZhangYu0123 @zhannngchen @zxealous @zy-kkk @zzzxl1993 @zzzzzzzs diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md new file mode 100644 index 0000000000000..a716d6d711fb0 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md @@ -0,0 +1,253 @@ +--- +{ + "title": "Release 2.0.3", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 1000 improvements and bug fixes have been made in Doris 2.0.3 version, including optimizer statistics, inverted index, complex datatypes, data lake, replica management. + + + +## 1 Behavior change + +- The output format of the complex data type array/map/struct has been changed to be consistent to the input format and JSON specification. The main changes from the previous version are that DATE/DATETIME and STRING/VARCHAR are enclosed in double quotes and null values inside ARRAY/MAP are displayed as `null` instead of `NULL`. + - https://github.com/apache/doris/pull/25946 +- SHOW_VIEW permission is supported. Users with SELECT or LOAD permission will no longer be able to execute the 'SHOW CREATE VIEW' statement and must be granted the SHOW_VIEW permission separately. 
+ - https://github.com/apache/doris/pull/25370 + + +## 2 New features + +### 2.1 Support collecting statistics for optimizer automatically + +Collecting statistics helps the optimizer understand the data distribution characteristics and choose a better plan to greatly improve query performance. It is officially supported starting from version 2.0.3 and is enabled all day by default. + +### 2.2 Support complex datatypes for more datalake sources +- Support complex datatypes for JAVA UDF, JDBC and Hudi MOR + - https://github.com/apache/doris/pull/24810 + - https://github.com/apache/doris/pull/26236 +- Support complex datatypes for Paimon + - https://github.com/apache/doris/pull/25364 +- Support Paimon version 0.5 + - https://github.com/apache/doris/pull/24985 + + +### 2.3 Add more builtin functions +- Support the BitmapAgg function in new optimizer + - https://github.com/apache/doris/pull/25508 +- Support SHA series digest functions + - https://github.com/apache/doris/pull/24342 +- Support the BITMAP datatype in the aggregate functions min_by and max_by + - https://github.com/apache/doris/pull/25430 +- Add milliseconds/microseconds_add/sub/diff functions + - https://github.com/apache/doris/pull/24114 +- Add some json functions: json_insert, json_replace, json_set + - https://github.com/apache/doris/pull/24384 + + +## 3 Improvement and optimizations + +### 3.1 Performance optimizations + +- When the inverted index MATCH WHERE condition with a high filter rate is combined with the common WHERE condition with a low filter rate, the I/O of the index column is greatly reduced. +- Optimize the efficiency of random data access after the where filter. +- Optimize the performance of the old get_json_xx function on JSON data types by 2~4x. +- Support a configuration to lower the priority of the data read thread, reserving CPU resources for real-time writing. 
+- Add the `uuid_numeric` function that returns largeint, which is 20 times faster than the `uuid` function that returns string. +- Optimize the performance of case when by 3x. +- Cut out unnecessary predicate calculations in storage engine execution. +- Accelerate count performance by pushing down the count operator to the storage tier. +- Optimize the computation performance of the nullable type in `and`/`or` expressions. +- Support rewriting the limit operator before `join` in more scenarios to improve query performance. +- Eliminate useless `order by` operators from inline view to improve query performance. +- Optimize the accuracy of cardinality estimates and cost models in some cases. +- Optimize jdbc catalog predicate pushdown logic. +- Optimize the read efficiency of the file cache when it is enabled for the first time. +- Optimize the hive table sql cache policy and use the partition update time stored in HMS to improve the cache hit ratio. +- Optimize mow compaction efficiency. +- Optimize thread allocation logic for external table query to reduce memory usage +- Optimize memory usage for column reader. + + + +### 3.2 Distributed replica management improvements + +Distributed replica management improvements include skipping partition deletion, colocate group deletion, balance failure due to continuous write, and hot and cold separation table balance. + + +### 3.3 Security enhancement +- The audit log plug-in uses a token instead of a plaintext password to enhance security + - https://github.com/apache/doris/pull/26278 +- Enhance the security of the log4j configuration + - https://github.com/apache/doris/pull/24861 +- Sensitive user information is not displayed in logs + - https://github.com/apache/doris/pull/26912 + + +## 4 Bugfix and stability + +### 4.1 Complex datatypes +- Fix the issue that fixed-length CHAR(n) was not truncated correctly in map/struct. 
+ - https://github.com/apache/doris/pull/25725 +- Fix write failure for struct datatype nested in map/array + - https://github.com/apache/doris/pull/26973 +- Fix the issue that count distinct did not support array/map/struct + - https://github.com/apache/doris/pull/25483 +- Fix be crash in updating to 2.0.3 after the delete complex type appeared in query + - https://github.com/apache/doris/pull/26006 +- Fix be crash when JSON datatype is in WHERE clause. + - https://github.com/apache/doris/pull/27325 +- Fix be crash when ARRAY datatype is in OUTER JOIN clause. + - https://github.com/apache/doris/pull/25669 +- Fix reading incorrect result for DECIMAL datatype in ORC format. + - https://github.com/apache/doris/pull/26548 + - https://github.com/apache/doris/pull/25977 + - https://github.com/apache/doris/pull/26633 + +### 4.2 Inverted index +- Fix incorrect results for OR NOT combinations in the WHERE clause when inverted index query is disabled. + - https://github.com/apache/doris/pull/26327 +- Fix be crash when write a empty with inverted index + - https://github.com/apache/doris/pull/25984 +- Fix be crash in index compaction when the output of compaction is empty. + - https://github.com/apache/doris/pull/25486 +- Fix be crash when running BUILD INDEX after ADD COLUMN with no new data written to the newly added column. + - https://github.com/apache/doris/pull/27276 +- Fix missing and leaked hardlinks for inverted index files. 
+ - https://github.com/apache/doris/pull/26903 +- Fix index file corruption when the disk is temporarily full + - https://github.com/apache/doris/pull/28191 +- Fix incorrect result due to optimization for skip reading index column + - https://github.com/apache/doris/pull/28104 + +### 4.3 Materialized View +- Fix be crash when there are duplicate expressions in `group by` statements. + - https://github.com/apache/doris/pull/27523 +- Disable the float/double type in the `group by` clause when a view is created. + - https://github.com/apache/doris/pull/25823 +- Improve the function of select query matching materialized view + - https://github.com/apache/doris/pull/24691 +- Fix an issue that materialized views could not be matched when a table alias was used + - https://github.com/apache/doris/pull/25321 +- Fix the problem using percentile_approx when creating materialized views + - https://github.com/apache/doris/pull/26528 + +### 4.4 Table sample +- Fix the problem that table sample query does not work on tables with partitions. + - https://github.com/apache/doris/pull/25912 +- Fix the problem that table sample query does not work when a tablet is specified. + - https://github.com/apache/doris/pull/25378 + + +### 4.5 Unique with merge on write +- Fix null pointer exception in conditional update based on primary key + - https://github.com/apache/doris/pull/26881 +- Fix field name capitalization issues in partial update + - https://github.com/apache/doris/pull/27223 +- Fix duplicate keys occurring in mow during schema change repair. + - https://github.com/apache/doris/pull/25705 + + +### 4.6 Load and compaction +- Fix unknown slot descriptor error in routineload for running multiple tables + - https://github.com/apache/doris/pull/25762 +- Fix be crash due to concurrent memory access when calculating memory + - https://github.com/apache/doris/pull/27101 +- Fix be crash on duplicate cancel for load. 
+ - https://github.com/apache/doris/pull/27111 +- Fix broker connection error during broker load + - https://github.com/apache/doris/pull/26050 +- Fix incorrect results from delete predicates when compaction and scan run concurrently. + - https://github.com/apache/doris/pull/24638 +- Fix the problem that compaction tasks would print too many stacktrace logs + - https://github.com/apache/doris/pull/25597 + + +### 4.7 Data Lake compatibility +- Fix query failure when an iceberg table contains special characters + - https://github.com/apache/doris/pull/27108 +- Fix compatibility issues of different hive metastore versions + - https://github.com/apache/doris/pull/27327 +- Fix an error reading max compute partition table + - https://github.com/apache/doris/pull/24911 +- Fix the issue that backup to object storage failed + - https://github.com/apache/doris/pull/25496 + - https://github.com/apache/doris/pull/25803 + + +### 4.8 JDBC external table compatibility + +- Fix Oracle date type format error in jdbc catalog + - https://github.com/apache/doris/pull/25487 +- Fix MySQL 0000-00-00 date exception in jdbc catalog + - https://github.com/apache/doris/pull/26569 +- Fix an exception in reading data from Mariadb where the default value of the time type is current_timestamp + - https://github.com/apache/doris/pull/25016 +- Fix be crash when processing BITMAP datatype in jdbc catalog + - https://github.com/apache/doris/pull/25034 + - https://github.com/apache/doris/pull/26933 + + +### 4.9 SQL Planner and Optimizer + +- Fix partition prune error in some scenarios + - https://github.com/apache/doris/pull/27047 + - https://github.com/apache/doris/pull/26873 + - https://github.com/apache/doris/pull/25769 + - https://github.com/apache/doris/pull/27636 + +- Fix incorrect sub-query processing in some scenarios + - https://github.com/apache/doris/pull/26034 + - https://github.com/apache/doris/pull/25492 + - https://github.com/apache/doris/pull/25955 + - 
https://github.com/apache/doris/pull/27177 + +- Fix some semantic parsing errors + - https://github.com/apache/doris/pull/24928 + - https://github.com/apache/doris/pull/25627 + +- Fix data loss during right outer/anti join + - https://github.com/apache/doris/pull/26529 + +- Fix incorrect pushing down of predicates past aggregation operators. + - https://github.com/apache/doris/pull/25525 + +- Fix incorrect result header in some cases + - https://github.com/apache/doris/pull/25372 + +- Fix incorrect plan when the nullsafeEquals expression (<=>) is used as the join condition + - https://github.com/apache/doris/pull/27127 + +- Fix incorrect column pruning in set operation operators. + - https://github.com/apache/doris/pull/26884 + + +### Others + +- Fix BE crash when the order of columns in a table is changed and then upgraded to 2.0.3. + - https://github.com/apache/doris/pull/28205 + + +See the complete list of improvements and bug fixes on [github dev/2.0.3-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.3-merged+is%3Aclosed) . diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md new file mode 100644 index 0000000000000..e1dac58fbf69a --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md @@ -0,0 +1,67 @@ +--- +{ + "title": "Release 2.0.4", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 333 improvements and bug fixes have been made in Doris 2.0.4 version. 
+ +**Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- More reasonable and accurate precision and scale inference for decimal data type + - [https://github.com/apache/doris/pull/28034](https://github.com/apache/doris/pull/28034) + +- Support drop policy for user or role + - [https://github.com/apache/doris/pull/29488](https://github.com/apache/doris/pull/29488) + +## New features + +- Support datev1, datetimev1 and decimalv2 datatypes in new optimizer Nereids. +- Support ODBC table for new optimizer Nereids. +- Add `lower_case` and `ignore_above` options for inverted index +- Support `match_regexp` and `match_phrase_prefix` optimization by inverted index +- Support paimon native reader in datalake +- Support audit-log for `insert into` SQL +- Support reading parquet file in lzo compressed format + +## Improvement and optimizations + +- Improve storage management including balance, migration, publish and others. +- Improve storage cooldown policy to save disk space. +- Performance optimization for substr with ascii string. +- Improve partition prune when date function is used. +- Improve auto analyze visibility and performance. 
+ +See the complete list of improvements and bug fixes on github [dev/2.0.4-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.4-merged+is%3Aclosed) + + + +## Credits +Last but not least, this release would not have been possible without the following contributors: + +airborne12, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, bobhan1, ByteYue, caiconghui,CalvinKirs, cambyzju, caoliang-web, catpineapple, csun5285, dataroaring, deardeng, dutyu, eldenmoon, englefly, feifeifeimoon, fornaix, Gabriel39, gnehil, HappenLee, hello-stephen, HHoflittlefish777,hubgeter, hust-hhb, ixzc, jacktengg, jackwener, Jibing-Li, kaka11chen, KassieZ, LemonLiTree,liaoxin01, LiBinfeng-01, lihuigang, liugddx, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, Nitin-Kashyap, platoneko, py023, qidaye, shuke987, starocean999, SWJTU-ZhangLei, w41ter, wangbo, wsjz, wuwenchi, Xiaoccer, xiaokang, XieJiann, xingyingone, xinyiZzz, xuwei0912, xy720, xzj7019, yujun777, zclllyybb, zddr, zhangguoqiang666, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md new file mode 100644 index 0000000000000..20d6bd9302b2c --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md @@ -0,0 +1,73 @@ +--- +{ + "title": "Release 2.0.5", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 217 improvements and bug fixes have been made in Doris 2.0.5 version. 
+ +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- Change char function behaviour: `select char(0) = '\0'` returns true, as in MySQL + - https://github.com/apache/doris/pull/30034 +- Allow exporting empty data + - https://github.com/apache/doris/pull/30703 + +## New features +- Eliminate left outer join with `is null` condition +- Add `show-tablets-belong` stmt for analyzing a batch of tablet-ids +- InferPredicates support In, such as `a = b & a in [1, 2] -> b in [1, 2]` +- Optimize plan when column stats are unavailable +- Optimize plan using rollup column stats +- Support analyze materialized view +- Support ShowProcessStmt Show all FE connection + +## Improvement and optimizations +- Optimize query plan when column stats are unavailable +- Optimize query plan using rollup column stats +- Stop analyze quickly after the user turns off auto analyze +- Catch load column stats exception to avoid printing too much stack info to fe.out +- Select materialized view by specifying the view name in SQL +- Change auto analyze max table width default value to 100 +- Escape characters for columns in recovery predicate pushdown in JDBC Catalog +- Fix JDBC MYSQL Catalog `to_date` function pushdown +- Optimize the close logic of JDBC client +- Optimize JDBC connection pool parameter settings +- Obtain hudi partition information through HMS's API +- Optimize routine load job error msg and memory +- Skip all backup/restore jobs if max allowed option is set to 0 + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.4-rc06...2.0.5-rc02). 
+ + +## Credits +Thanks all who contribute to this release: + +airborne12, alexxing662, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, ByteYue, caiconghui, cambyzju, catpineapple, dataroaring, eldenmoon, Emor-nj, englefly, felixwluo, GoGoWen, HappenLee, hello-stephen, HHoflittlefish777, HowardQin, JackDrogon, jacktengg, jackwener, Jibing-Li, KassieZ, LemonLiTree, liaoxin01, liugddx, LuGuangming, morningman, morrySnow, mrhhsg, Mryange, mymeiyi, nextdreamblue, qidaye, ryanzryu, seawinde,starocean999, TangSiyang2001, vinlee19, w41ter, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xingyingone, xy720,xzj7019, yujun777, zclllyybb, zhangstar333, zhannngchen, zhiqiang-hhhh, zxealous, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md new file mode 100644 index 0000000000000..9591ed8d3fab8 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.6", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 114 improvements and bug fixes have been created by 51 contributors in Doris 2.0.6 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- N/A + +## New features +- Support match a function with alias in materialized-view +- Add a command to drop a tablet replica safely on backend +- Add row count cache for external table. 
+- Support analyze rollup to gather statistics for optimizer + +## Improvement and optimizations +- Improve tablet schema cache memory by using deterministic way to serialize protobuf +- Improve show column stats performance +- Support estimate row count for iceberg and paimon +- Support sqlserver timestamp type read for JDBC catalog + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.5-rc02...2.0.6). + + +## Credits +Thanks all who contribute to this release: + +924060929, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, deardeng, DongLiang-0, eldenmoon, englefly, feelshana, feiniaofeiafei, felixwluo, HappenLee, hust-hhb, iwanttobepowerful, ixzc, JackDrogon, Jibing-Li, KassieZ, larshelge, liaoxin01, LiBinfeng-01, liutang123, luennng, morningman, morrySnow, mrhhsg, qidaye, starocean999, TangSiyang2001, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xuwei0912, xy720, xzj7019, yiguolei, yujun777, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md new file mode 100644 index 0000000000000..10f226dbd63b4 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 2.0.7", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 80 improvements and bug fixes have been made in Doris 2.0.7 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +- `round` function defaults to rounding normally as MySQL, eg. round(5/2) return 3 instead of 2. 
+ + - https://github.com/apache/doris/pull/31583 + +- `round` datetime with scale from string literal as MySQL, eg. round '2023-10-12 14:31:49.666' to '2023-10-12 14:31:50' . + + - https://github.com/apache/doris/pull/27965 + + +## 2 New features +- Support make miss slot as null alias when converting outer join to anti join to speed up query + + - https://github.com/apache/doris/pull/31854 + +- Enable proxy protocol to support IP transparency for Nginx and HAProxy. + + - https://github.com/apache/doris/pull/32338 + + +## 3 Improvement and optimizations + +- Add DEFAULT_ENCRYPTION column in `information_schema` table and add `processlist` table for better compatibility for BI tools + +- Automatically test connectivity by default when creating a JDBC Catalog. + +- Enhance auto resume to keep routine load stable + +- Use lowercase by default for Chinese tokenizer in inverted index + +- Add error msg if exceeded maximum default value in repeat function + +- Skip hidden file and dir in Hive table + +- Reduce file meta cache size and disable cache for some cases to avoid OOM + +- Reduce jvm heap memory consumed by profiles of BrokerLoadJob + +- Remove sort which is under table sink to speed up query like `INSERT INTO t1 SELECT * FROM t2 ORDER BY k`. + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.6...2.0.7) . 
+ + +## 4 Credits + +Thanks to all who contributed to this release: + +924060929,airborne12,amorynan,ByteYue,dataroaring,deardeng,feiniaofeiafei,felixwluo,freemandealer,gavinchou,hello-stephen,HHoflittlefish777,jacktengg,jackwener,jeffreys-cat,Jibing-Li,KassieZ,LiBinfeng-01,luwei16,morningman,mrhhsg,Mryange,nextdreamblue,platoneko,qidaye,rohitrs1983,seawinde,shuke987,starocean999,SWJTU-ZhangLei,w41ter,wsjz,wuwenchi,xiaokang,XieJiann,XuJianxu,yujun777,Yulei-Yang,zhangstar333,zhiqiang-hhhh,zy-kkk,zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md new file mode 100644 index 0000000000000..d881a80628b44 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md @@ -0,0 +1,76 @@ +--- +{ + "title": "Release 2.0.8", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 65 improvements and bug fixes have been made in Doris 2.0.8 version. + +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## 1 Behavior change + +The `ADMIN SHOW` statements cannot be executed with newer versions of the MySQL 8.x JDBC driver, so these statements are renamed to remove the `ADMIN` keyword. 
+ +- https://github.com/apache/doris/pull/29492 + +```sql +ADMIN SHOW CONFIG -> SHOW CONFIG +ADMIN SHOW REPLICA -> SHOW REPLICA +ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS +ADMIN SHOW TABLET -> SHOW TABLET +``` + + +## 2 New features + +N/A + + + +## 3 Improvement and optimizations + +- Make Inverted Index work with TopN opt in Nereids + +- Limit the max string length to 1024 while collecting column stats to control BE memory usage + +- JDBC Catalog close when JDBC client is not empty + +- Accept all Iceberg database and do not check the name format of database + +- Refresh external table's rowcount async to avoid cache miss and unstable query plan + +- Simplify the isSplitable method of hive external table to avoid too many hadoop metrics + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.7...2.0.8) . + +## 4 Credits + +Thanks all who contribute to this release: + +924060929, AcKing-Sam, amorynan, AshinGau, BePPPower, BiteTheDDDDt, ByteYue, cambyzju, dongsilun, eldenmoon, feiniaofeiafei, gnehil, Jibing-Li, liaoxin01, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, platoneko, starocean999, SWJTU-ZhangLei, wuwenchi, xiaokang, xinyiZzz, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhiqiang-hhhh, ziyanTOP, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md new file mode 100644 index 0000000000000..04048fc060461 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md @@ -0,0 +1,75 @@ +--- +{ + "title": "Release 2.0.9", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 68 improvements and bug fixes have been made in Doris 2.0.9 version. 
+ +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +NA + +## 2 New features + +- Support predicates that appear on both key and value mv columns + +- Support mv with `bitmap_union(bitmap_from_array())` + +- Add a FE config to force replication allocation for OLAP tables in the cluster + +- Support time zones in date literals in the new optimizer Nereids + +- Support slop in fulltext search `match_phrase` to specify word distance + +- Show index id in `SHOW PROC INDEXES` + +## 3 Improvement and optimizations + +- Add a second argument to `first_value` / `last_value` to ignore NULL values + +- Allow 0 as the offset parameter of the `LEAD` / `LAG` functions + +- Adjust priority of materialized view match rule + +- TopN opt reads only limit number of records for better performance + +- Add profile for delete_bitmap get_agg function + +- Refine the Meta cache to get better performance + +- Add FE config `autobucket_max_buckets` + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.8...2.0.9) .
+ +## Big Thanks + +Thanks all who contribute to this release: + +adonis0147, airborne12, amorynan, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, csun5285, eldenmoon, englefly, feiniaofeiafei, HHoflittlefish777, htyoung, hust-hhb, jackwener, Jibing-Li, kaijchen, kylinmac, liaoxin01, luwei16, morningman, mrhhsg, qidaye, starocean999, SWJTU-ZhangLei, w41ter, xiaokang, xiedeyantu, xy720, zclllyybb, zhangstar333, zhannngchen, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md new file mode 100644 index 0000000000000..baa62b37e1e75 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md @@ -0,0 +1,469 @@ +--- +{ + "title": "Release 3.0.0", + "language": "en" +} +--- + + + + +We are excited to announce the release of Apache Doris 3.0! + +**Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. Additionally, users can take advantage of low-cost shared storage systems such as object storage or HDFS to significantly reduce storage costs.** + +Version 3.0 marks a milestone in the evolution of Apache Doris towards a unified data lake and data warehouse architecture. This version introduces the ability to write data back to data lakes, allowing users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Apache Doris. With capabilities such as asynchronous materialized views, Apache Doris can serve as a unified data processing engine for enterprises, helping users better manage data across lakes, warehouses, and databases. 
Also, Apache Doris 3.0 introduces the Trino Connector. It allows users to quickly connect or adapt to more data sources, and leverage the high-performance compute engine of Doris to deliver faster query results than Trino. + +Version 3.0 also enhances support for ETL batch processing scenarios, adding explicit transaction support for operations like `insert into select`, `delete` and `update`. The observability of query execution has also been improved. + +In terms of performance, we have improved the framework capabilities, infrastructure, and rules of the query optimizer in version 3.0. This provides optimized performance, which has been proven by blind testing in more complex and diverse business scenarios. + +The adaptive Runtime Filter computation method now accurately estimates filters based on data size during execution, delivering better performance under large data volumes and high loads. Additionally, asynchronous materialized view has been more stable and user-friendly in query acceleration and data modeling. + +**During the development of version 3.0, over 200 contributors submitted nearly 5,000 optimizations** and fixes to Apache Doris. Contributors from companies such as VeloDB, Baidu, Meituan, ByteDance, Tencent, Alibaba, Kwai, Huawei, and Tianyi Cloud actively collaborated with the community, contributing test cases from real-world use cases to help us improve Apache Doris. We extend our heartfelt thanks to all the contributors involved in the development, testing, and feedback process for this release. + +- **GitHub**: https://github.com/apache/doris/releases + +- **Website**: https://doris.apache.org/download + +## 1. Compute-storage decoupled mode + +Since V3.0, Apache Doris supports the compute-storage decoupled mode. Users can choose between it and the compute-storage coupled mode during cluster deployment. 
+ +In the compute-storage decoupled mode, the BE nodes no longer store the data, but instead, a shared storage layer (HDFS and object storage) is introduced as the shared data storage layer. The computing and storage resources can be scaled independently, bringing multiple benefits to users: + +- **Workload isolation**: Multiple compute clusters can share the same data, allowing users to isolate different business workloads or offline loads using separate compute clusters. + +- **Reduced storage costs**: The full dataset is stored in the more cost-effective and highly reliable shared storage, with only hot data cached locally. Compared to the compute-storage coupled mode with three data replicas, the storage cost can be reduced by up to 90%. + +- **Elastic computing resources**: Since no data is stored on the BE nodes, the computing resources can be scaled flexibly based on the load requirements. Users can scale in or out an individual compute cluster or increase/decrease the number of compute clusters. This also leads to cost savings. + +- **Improved system robustness**: By storing the data in shared storage, Doris no longer needs to handle the complex logic of multi-replica consistency, thus simplifying distributed storage complexity and improving the overall system robustness. + +- **Flexible data sharing and cloning**: The flexibility of the compute-storage decoupled mode extends beyond a single Doris cluster. Tables from one Doris cluster can be easily cloned to another Doris cluster, with just metadata replication. + +### 1-1. From coupled to decoupled + +In the compute-storage coupled mode, the Apache Doris architecture consists of two main process types: Frontend (FE) and Backend (BE). The FE is primarily responsible for user request access, query parsing and planning, metadata management, and node management. The BE is responsible for data storage and query plan execution. 
+ +The BE nodes employ an MPP (Massively Parallel Processing) distributed computing architecture, leveraging a multi-replica consistency protocol to ensure high service availability and high data reliability. + +![From coupled to decoupled](/images/storage-compute-decoupled.PNG) + + +The maturation of emerging cloud computing infrastructure, including public clouds, private clouds, and Kubernetes-based container platforms, has driven the need for cloud-native capabilities. Increasingly, users are seeking deeper integration between Apache Doris and cloud computing infrastructure to provide more elasticity. + +**To address this need, the VeloDB team has designed and implemented a cloud-native version of Apache Doris that decouples compute and storage, known as VeloDB Cloud. After extensive production testing and refinement across hundreds of enterprises over a long time, this cloud-native solution has now been contributed to the Apache Doris community, manifesting as the Apache Doris 3.0 in the compute-storage decoupled mode.** + +In the compute-storage decoupled mode, the Apache Doris architecture consists of three layers: + +- **Meta data layer**: A new Meta Service module has been introduced to provide meta data services, such as processing database and table information, schemas, rowset meta, and transactions. The Meta Service is stateless and horizontally scalable. In V3.0, all of the BE's meta data and parts of the FE's meta data have been migrated to the Meta Service. We will finish the migration of the remains in future versions. +- **Computation layer**: The stateless BE nodes execute query plans and cache a portion of the data and tablet meta data locally to improve query performance. Multiple stateless BE nodes can be organized into a computing resource pool (i.e., compute cluster), and multiple compute clusters can share the same data and metadata service. The compute clusters can be elastically scaled by adding or removing nodes as needed. 
+- **Shared storage layer**: Data is persisted to the shared storage layer, which currently supports HDFS as well as various cloud-based object storage systems that are compatible with the S3 protocol, such as S3, OSS, GCS, Azure Blob, COS, BOS, and MinIO. + +![From coupled to decoupled-2](/images/storage-compute-decoupled-2.JPEG) + +### 1-2 Design highlight + +The design of the compute-storage decoupled mode of Apache Doris highlights the transformation of the FE's in-memory metadata model into a shared metadata service. This approach offers a globally consistent state view, allowing any node to directly submit writes without needing to go through the FE for publishing. During write operations, data is stored in shared storage, while metadata is managed by the metadata service. **This effectively controls the number of small files in shared storage. Meanwhile, the real-time write performance for individual tables is nearly on par with that in the compute-storage coupled mode. The system's overall write capacity is no longer limited by the processing power of a single FE node.** + +![Design highlight](/images/design-hightlight.PNG) + +Based on the globally consistent state view, for data garbage collection, we have adopted a design approach for data deletion that is easier to prove correct and more efficient. + +Specifically, data in the shared storage is incorporated into the globally consistent view offered by the shared meta data service. Whenever data is generated, we bind it to a separate, independent transaction. Similarly, for a meta data deletion operation, we also bind it to a separate, independent transaction. The purpose of this approach is to ensure that deletion and write operations cannot succeed together. The view records which data needs to be deleted, and the asynchronous deletion process can simply perform a forward deletion of the data based on the transaction records, without the need for reverse garbage collection. 
+ +As the tablet-related meta data in the FE is gradually migrated to the shared meta data service, the scalability of the Doris cluster will no longer be constrained by the memory capacity of a single FE node. Building upon the shared meta data service and the forward data deletion technique, we can conveniently expand functionality such as data sharing and lightweight cloning. + +### 1-3 Comparison with alternative solutions + +Another design of decoupling compute and storage in the industry is to store the data and BE node meta data in a shared object storage or HDFS. However, this approach brings the following problems: + +- **Inability to support real-time writes**: During data writes, the data is mapped to tablets based on the partitioning and bucketing rules, generating segment files and rowset meta data. During the write process, a two-phase commit (Publish) is performed through the FE. When a BE node receives the Publish request, it then sets the rowset as visible. The Publish operation must not fail. If the rowset meta data is stored in the shared storage, the total small file data during the real-time write process would triple the size of the actual data files - one replica of data files, one for rowset meta data, and another for rowset meta data changes during Publish. The Publish operation is driven by a single FE node, so the write capacity of a single table or even the entire system is limited by the FE node's capabilities. + + ![Comparison with alternative solutions](/images/comparison-with-alternative-solutions.png) + + We compared the real-time data write performance of Apache Doris 3.0 with the above-described solution. We simulated 500 concurrent tasks writing 10,000 data files with 500 rows each, and 50 concurrent tasks writing 250 data files with 20,000 rows each, using the same computational resources. 
+ + **The results showed that at 50 concurrent tasks, the micro-batch write performance of Apache Doris in the compute-storage coupled and decoupled modes was almost identical, while the industry solution lagged behind Apache Doris by a factor of 100.** + + At 500 concurrent tasks, the performance of Apache Doris in the compute-storage decoupled mode showed slight degradation, but it still maintained an 11X advantage over the industry solution. To ensure a fair test, Apache Doris did not enable the Group Commit feature (which the industry solution lacks). Enabling Group Commit would further enhance real-time write performance. + + ![Comparison with alternative solutions](/images/real-time-write-performance..png) + + Additionally, the industry solution faces stability and cost issues in terms of real-time data ingestion: + + - Stability concerns: A large number of small files can put pressure on the shared storage, especially HDFS, and introduce stability risks. + + - High object storage request costs: Some public cloud object storage services charge 10 times more for Put and Delete operations compared to Get operations. A large number of small files can lead to a significant increase in object storage request costs, which can even exceed the storage costs. + +- **Limited scalability**: Use cases of the compute-storage decoupled mode often handle larger data volumes. Since the FE (Frontend) meta data is entirely in-memory, when the number of tablets reaches a certain high level (e.g. tens of millions), the FE's memory pressure can become a bottleneck that limits the overall write throughput of the system. + +- **Potential data deletion logic issues**: In the compute-storage decoupled architecture, data is stored with a single replica. Therefore, the data deletion logic is critical for the system's reliability. The conventional approach of cross-system data deletion by comparing the differences can be challenging.
During the write process, there is no way to completely prevent a deletion and a write from both succeeding, which can lead to data loss. Additionally, when the storage system experiences anomalies, the input used for difference calculation may be incorrect, which potentially leads to unintended data deletion. + +- **Data sharing and lightweight cloning**: The flexibility of the decoupled storage-compute architecture can enable future data sharing and lightweight data cloning, reducing the burden of enterprise data management. However, if each cluster has a separate FE, after cloning data across clusters, it becomes difficult to accurately determine which data is no longer referenced and can be safely deleted, as calculating cross-cluster references can easily lead to unintended data deletion. + +By evolving the FE's full in-memory meta data model into a shared meta data service, Apache Doris 3.0 avoids all the aforementioned issues. + +### 1-4 Query performance comparison + +In the compute-storage decoupled mode, data needs to be read from the remote shared storage system, so the main bottleneck is network bandwidth rather than disk I/O, as it is in the compute-storage coupled mode. + +To accelerate data access, Apache Doris has implemented a high-speed caching mechanism based on local disks, and provides two cache management policies: LRU (Least Recently Used) and TTL (Time-To-Live). The newly imported data is asynchronously written to the cache to accelerate the first-time access to the latest data. If the data required by a query is not in the cache, the system will read the data from the remote storage into memory and synchronously write it to the cache for subsequent queries. + +In use cases involving multiple compute clusters, Apache Doris provides a cache preheating function. When a new compute cluster is established, users can choose to preheat specific data (such as tables or partitions) to further improve query efficiency.
+ +In this context, we have conducted performance tests with different caching strategies in both the compute-storage coupled and decoupled modes, using the TPC-DS 1TB test dataset. The results are summarized as follows: + +- When the cache is fully hit (i.e., all the data required for the query is loaded into the cache), **the query performance of the compute-storage decoupled mode is on par with that of the compute-storage coupled mode**. + +- When the cache is partially hit (i.e., the cache is cleared before the test, and data is gradually loaded into the cache during the test, with performance continuously improving), the query performance of the compute-storage decoupled mode is about 10% lower than that of the compute-storage coupled mode. This test scenario is the closest to real-life use cases. + +- When the cache is completely missed (i.e., the cache is cleared before every SQL execution, simulating an extreme case), the performance loss is around 35%. **Even so, Apache Doris in the compute-storage decoupled mode delivers much higher performance than its alternative solutions.** + +![Query performance comparison](/images/query-performance-comparison.png) + +### 1-5 Write speed comparison + +In terms of write performance, we have simulated two test cases under the same computing resources: batch import and high-concurrency real-time import. The comparison of write performance between the compute-storage coupled mode and the compute-storage decoupled mode is as follows: + +- **Batch import**: When importing the 1TB TPC-H and 1TB TPC-DS test datasets, **the write performance of the compute-storage decoupled mode is 20.05% and 27.98% higher than the compute-storage coupled mode**, respectively, under the single-replica configuration. During batch import, the segment file size is generally in the range of tens to hundreds of MB.
In the compute-storage decoupled mode, the segment files are split into smaller files and concurrently uploaded to the object storage, which can result in higher throughput compared to writing to local disks. In real-life deployments, the compute-storage coupled mode typically uses three replicas, which means the write speed advantage of the compute-storage decoupled mode will be even more pronounced. + +- **High-concurrency real-time import**: as described in the "Comparison with alternative solutions" section. + +![Write speed comparison](/images/write-speed-comparison.png) + +### 1-6 Tips for production environment + +- **Performance**: For real-time data analysis, users can achieve query performance comparable to the compute-storage coupled mode by specifying a TTL (Time-To-Live) for the cache and writing newly ingested data into the cache. To prevent query jitter, users can cache the data generated by background tasks such as compaction and schema changes based on how frequently the data is used. + +- **Workload isolation**: Users can achieve physical resource isolation for different businesses using multiple compute clusters. For workload isolation within a single compute cluster, users can utilize the Workload Group mechanism to limit and isolate resources for different queries. + +### 1-7 Notes + +- Apache Doris 3.0 does not support the co-existence of the compute-storage coupled mode and the compute-storage decoupled mode. Users need to specify one of them during cluster deployment. + +- If users need the compute-storage coupled mode, follow the [documentation](https://doris.apache.org/docs/3.0/install/source-install/compilation-with-docker/) for its deployment and upgrade. We recommend using Doris Manager for quick deployment and cluster upgrades. However, the compute-storage decoupled mode does not yet support Doris Manager deployment and upgrade. We will continue iterating for better support in future versions.
+ +- Currently, Apache Doris does not support in-place upgrade from V2.1 to the compute-storage decoupled mode of V3.0. For this purpose, users need to perform data migration using tools like X2Doris after deploying the compute-storage decoupled clusters. In the future, we will support migration without service interruption through the CCR (Cross-Cluster Replication) capability. + +:::info +See doc: +https://doris.apache.org/docs/3.0/compute-storage-decoupled/overview/ +::: + +## 2. Data lakehouse + +Apache Doris is positioned as a real-time data warehouse, but it is much more than that. In previous versions, we have consistently pushed beyond the boundaries of traditional data warehouse capabilities, advancing towards a unified data lakehouse. Version 3.0 marks a milestone in this journey, with its capabilities in the lakehouse architecture becoming fully mature. We believe that a unified lakehouse is identified by **boundaryless data** and **lakehouse fusion**: + +**Boundaryless data: Apache Doris serves as a unified query processing engine, breaking down data barriers across different systems. It provides a consistent and ultra-fast analysis experience across all data sources, including data warehouses, data lakes, data streams, and local data files.** + +- **Lakehouse query acceleration**: Without the need to migrate data to Apache Doris, users can leverage Doris’ efficient query engine to directly query data stored in data lakes such as Iceberg, Hudi, Paimon, and offline data warehouses like Hive, thereby accelerating query analysis. + +- **Federated analysis**: By extending its catalog and storage plugins, Apache Doris enhances its federated analysis capabilities, allowing users to perform unified analysis across multiple heterogeneous data sources without physically centralizing the data in a single storage system.
This enables external table queries and federated joins between internal and external tables, breaking down data silos and providing globally consistent data insights. + +- **Data lake construction**: Apache Doris introduces write-back functionality for Hive and Iceberg, allowing users to directly create Hive and Iceberg tables through Doris and write data into them. This allows users to write internal table data back to the offline lakehouse or process offline lakehouse data using Doris and save the results back into the lakehouse, simplifying and streamlining the data lake construction process. + +**Lakehouse fusion: As data lake architectures become increasingly complex, the costs of technology selection and maintenance rise for users. Achieving consistent fine-grained access control across multiple systems also becomes challenging, and real-time performance suffers. To address this, Apache Doris integrates core features of the data lake, transforming itself into a lightweight, efficient, native real-time lakehouse.** + +- **Real-time data updates**: Starting with version 1.2, Apache Doris enhanced the primary key model by introducing Merge-on-Write, supporting real-time updates. This feature allows high-frequency, real-time data updates based on primary key changes from upstream data sources. + +- **Data science and** **AI** **computation support**: From version 2.1, Apache Doris, using the efficient Arrow Flight protocol, increased the openness of its storage system and its support for various compute loads, enabling data science and AI computations. + +- **Enhancements for semi-structured and unstructured Data**: Apache Doris has introduced support for data types like Array, Map, Struct, JSON, and Variant, with plans to support vector indexing in the future. 
+ +- **Improved resource efficiency by decoupling storage and compute**: With version 3.0, Apache Doris supports a decoupled storage and compute mode, further improving resource efficiency and scalability. + +### 2-1 Faster queries in the data lakehouse + +TPC-H and TPC-DS benchmarking proves that Apache Doris achieves average query performance that is 3 to 5 times faster than Trino/Presto. + +In V3.0, we have focused on optimizing query performance for production environments, including: + +- **More granular task splitting strategy**: By adjusting the consistent hashing algorithm and introducing a task sharding weighting mechanism, we ensure balanced query loads across all nodes. + +- **Scheduling optimizations for use cases with numerous partitions and files**: For cases with a large number of files (over 1 million), we have substantially reduced query latency (from 100 seconds to 10 seconds) and alleviated memory pressure on the Frontend (FE) by fetching file shards asynchronously and in batches. + +We will continue to specifically enhance query acceleration performance in real-world business scenarios, improve the actual user experience, and build an industry-leading lakehouse query acceleration engine. + +### 2-2 Federated analysis: more data connectors + +Previous versions of Apache Doris support connectors for over 10 mainstream data lakehouses, warehouses, and relational databases. In V3.0, we have introduced the Trino Connector compatibility framework, which expands the range of data sources that Apache Doris can connect to. With this framework, users can easily adapt their existing setups to access corresponding data sources using Doris and leverage its high-speed computing engine for data analysis. + +Currently, Doris has completed adaptations for Delta Lake, Kudu, BigQuery, Kafka, TPCH, and TPCDS. We also encourage contributions from developers to extend this list.
+ +:::info Note + +See doc: + +- Trino Connector: https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/ + +- TPC-H: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpch/ + +- TPC-DS: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpcds/ + +- Delta Lake: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/deltalake/ + +- Kudu: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/kudu/ + +- BigQuery: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/bigquery/ +::: + + +### 2-3 Data lake building + +In V3.0, we have introduced data writeback functionality for Hive and Iceberg. This allows users to create Hive and Iceberg tables directly through Doris and write data into these tables, and enables users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Doris. + +In future iterations, Apache Doris will further enhance support for data lake table formats and improve the openness of storage APIs. + +:::info Note +See doc: https://doris.apache.org/docs/3.0/lakehouse/datalake-building/hive-build/ +::: + +## 3. Upgraded semi-structured data analysis capabilities + +In versions 2.0 and 2.1, Apache Doris introduced some well-embraced features such as inverted index, NGram Bloom Filter, and Variant data type to support high-performance full-text search and multi-dimensional analysis. With them, the storage and processing of complex semi-structured data have been more flexible and efficient. + +In V3.0, we have further enhanced the capabilities in this scenario. + +After extensive testing in production environments, the Variant data type has gained sufficient stability and become the preferred choice for JSON data storage and analysis. 
In V3.0, we have made multiple optimizations to it: + +- Support for indexing of the Variant data type to accelerate queries, including inverted index, Bloom Filter index, and the built-in ZoneMap index. + +- Support for flexible partial column updates for Unique Key tables containing the Variant data type. + +- Support for the use of the Variant data type in the compute-storage decoupled mode, with optimizations of its metadata storage. + +- Support for exporting the Variant data type to formats such as Parquet and CSV. + +The inverted index, introduced since V2.0, has reached a high level of maturity after more than a year of refinement and is now running in production environments of hundreds of enterprises. In V3.0, we have made multiple optimizations to the inverted index: + +- After performance optimizations, including lock concurrency, Apache Doris outperforms Elasticsearch in key metrics such as query latency and concurrency in real-time reporting analysis. + +- Optimized index file in the compute-storage decoupled mode to reduce remote storage calls and decrease index query latency. + +- Support for the Array data type to accelerate the `array_contains` queries. + +- Enhanced the `match_phrase_*` functionality, including support for slop and phrase prefix matching `match_phrase_prefix`. + +## 4. Enhanced ETL capabilities + +### 4-1. Transaction improvements + +Data processing in data warehouses often involves multiple data changes that need to be handled as a single transaction. V3.0 provides explicit transaction support for `insert into select`, `delete`, and `update` operations. Example cases include: + +- **Transactional requirements**: For example, when updating data within a time range, the typical approach is to first delete the data in that time range, and then insert the new data. Considering that the data might already be in service, there is a need to ensure that queries visit either the old data or the new data. 
This can be achieved by executing the `delete` and `insert into select` operations in a transaction. + + ```sql + BEGIN; + DELETE FROM table WHERE date >= "2024-07-01" AND date <= "2024-07-31"; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +- **Simplified processing of failed tasks**: For example, when two `insert into select` operations are executed within a single transaction, if any of the operations fails, it can be retried directly. + + ```sql + BEGIN WITH LABEL label_etl_1; + INSERT INTO table1 SELECT * FROM stage_table1; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +:::info Note +See doc: https://doris.apache.org/docs/3.0/data-operate/transaction/ +Currently, explicit transaction synchronization is not supported in Cross-Cluster Replication (CCR). +::: + +### 4-2. Improved observability + +- **Real-time profile retrieval**: In previous versions, some complex queries had high computational requirements due to issues with the execution plan or the data, and developers could only access the query profile for performance analysis after the query completed. This made it hard to promptly identify issues in query execution and to guarantee the stability of the production environment. Now, with the ability to retrieve real-time profiles, V3.0 allows users to monitor query execution as the query is running. It also allows them to better monitor the progress of each ETL job. + +- **`backend_active_tasks` system table**: The `backend_active_tasks` system table provides real-time resource consumption information for each query on each BE node. Users can analyze this system table using SQL to obtain the resource usage of each query, which helps identify large queries or abnormal workloads. + +## 5. Asynchronous materialized view + +In V3.0, asynchronous materialized view is faster and more stable. It is also more user-friendly for query acceleration and data modeling scenarios.
We have restructured the logic for transparent rewrite and expanded its capabilities, making it 2X faster. + +### 5-1 Refresh + +- Support for incremental update of materialized views by partitions and partition roll-ups on materialized views to allow refreshes at different granularities. + +- Support for nested materialized views, which is useful in data modeling scenarios. + +- Support for index creation and sort key specification in asynchronous materialized views, which will improve query performance after the materialized view is hit. + +- Higher usability of materialized view DDL with support for atomically replacing materialized views, allowing modifications to the materialized view definition SQL while keeping the materialized view available. + +- Support for non-deterministic functions in materialized views to better serve daily materialized view creation. + +- Support for trigger-based materialized view refresh, which ensures data consistency in data modeling with nested materialized views. + +- Support for a broader range of SQL patterns for building partitioned materialized views, making the incremental update capability available to more use cases. + +### 5-2 Refresh stability + +- V3.0 supports specifying a Workload Group for building materialized views. This is to limit the resources used by the materialized view build process and ensure that sufficient resources remain available for ongoing queries. + +### 5-3 Transparent rewrite + +- Support for transparent rewrite of more Join types, including derived Joins. Even when there is a mismatch of Join types between the query and materialized view, transparent rewrite can still be performed by compensating with additional predicates, as long as the materialized view can provide all the data needed for the query. 
+ +- Support for more aggregate functions for roll-up as well as rewrite of multi-dimensional aggregations like GROUPING SETS, ROLLUP, and CUBE; support rewriting queries with aggregations when the materialized view does not contain aggregations, simplifying Join operations and expression computation. + +- Support for transparent rewrite of nested materialized views, enabling higher performance for complex queries. + +- For partially invalid partitioned materialized views, V3.0 supports `Union All` the base tables for data completion, expanding the applicability of partitioned materialized views. + +### 5-4 Transparent rewrite performance + +- Continuous optimization has been done to improve the transparent rewrite performance, achieving 2X the speed compared to version 2.1.0. + +:::info Note + +See doc: + +https://doris.apache.org/docs/3.0/query/view-materialized-view/query-async-materialized-view + +https://doris.apache.org/docs/3.0/query/view-materialized-view/async-materialized-view/ + +::: + +## 6. Performance improvement + +### 6-1 Smarter optimizer + +In V3.0, the query optimizer has been enhanced in terms of framework capabilities, distributed plan support, optimizer infrastructure, and rule expansion. It provides better optimization capabilities for more complex and diverse business scenarios, with higher blind test performance for complex SQL: + +- **Improved plan enumeration capability**: The key structure Memo for plan enumeration has been restructured and normalized. This improves the efficiency of the Cascades framework in plan enumeration and the possibility of producing better plans. Additionally, it fixes incomplete column pruning during the Join Reorder process in older versions, which led to unnecessary overhead of the Join operator, thus improving the execution performance in the relevant scenarios. 
+ +- **Improved distributed plan support**: The distributed query plan has been enhanced to allow aggregation, join, and window function operations to more intelligently identify the data characteristics of intermediate computation results, avoiding ineffective data redistribution operations. Meanwhile, we have optimized the execution under the multi-replica continuous execution mode, making it more data cache-friendly. + +- **Improved optimizer infrastructure**: V3 has fixed several issues in cost model and statistics information estimation. The fixes to the cost model are more adaptable to the evolution of the execution engine, making the execution plan more stable compared to previous versions. + +- **Enhanced Runtime Filter plan support**: On the basis of Join Runtime Filter, V3.0 has expanded the capability of the TopN Runtime Filter to achieve better performance in use cases that involve a TopN operator. + +- **Enriched optimization rule library**: Based on user feedback and internal testing results, we have introduced optimization rules such as Intersect Reorder to enrich the rule set of the optimizer. + +### 6-2 Self-adaptive Runtime Filter + +In previous versions, the generation of Runtime Filter relies on manual setting by users based on statistical information. However, inaccurate settings in certain cases could lead to performance instability. + +In V3.0, Doris implements a self-adaptive Runtime Filter calculation approach. It can estimate the Runtime Filter at runtime based on the data size with high accuracy, enabling better performance in use cases with large data volumes and high workloads. + +### 6-3 Function performance optimization + +- V3.0 has improved the vectorized implementation of dozens of functions, enabling a performance improvement of over 50% for some commonly used functions. +- V3.0 has also made extensive optimizations to the aggregation of nullable data types, enabling a 30% performance improvement. 
+ +### 6-4 Blind test performance improvement + +Our blind tests on V3.0 and V2.1 show that the new version is 7.3% and 6.2% faster in TPC-DS and TPC-H benchmark tests, respectively. + +![Blind test performance improvement](/images/blind-test-performance-improvement.png) + +## 7. New features + +### 7-1 Java UDTF + +Version 3.0 has added support for Java UDTFs. The key operations are as follows: + +- Implementing a UDTF: Similar to a UDF, a UDTF requires the user to implement an `evaluate` method. Note that the return value of a UDTF function must be of the `Array` data type. + + ```java + import java.util.ArrayList; + import java.util.Arrays; + + public class UDTFStringTest { + // Splits `value` on `separator`; returns null if either argument is null. + public ArrayList<String> evaluate(String value, String separator) { + if (value == null || separator == null) { + return null; + } else { + return new ArrayList<>(Arrays.asList(value.split(separator))); + } + } + } + ``` + +- Creating a UDTF: By default, two corresponding functions will be created: `java-utdf` and `java-utdf_outer`. The `_outer` suffix adds a single row of `NULL` data when the table function generates 0 rows of output. + + ```sql + CREATE TABLES FUNCTION java-utdf(string, string) RETURNS array<string> PROPERTIES ( + "file"="file:///pathTo/java-udaf.jar", + "symbol"="org.apache.doris.udf.demo.UDTFStringTest", + "always_nullable"="true", + "type"="JAVA_UDF" + ); + ``` + +:::info + +See doc: https://doris.apache.org/docs/3.0/query/udf/java-user-defined-function/#udtf-1 + +::: + +### 7-2 Generated column + +A generated column is a special column whose value is calculated from the values of other columns rather than directly inserted or updated by the user. It supports pre-computing the results of expressions and storing them in the database, which is suitable for scenarios that require frequent queries or complex calculations. + +Results can be automatically calculated based on predefined expressions when data is imported or updated, and then stored persistently. 
In this way, during subsequent queries, the system can directly access these calculated results without performing complex calculations, thereby improving query performance. + +Generated columns are supported since V3.0. When creating a table, you can specify a column as a generated column. A generated column automatically calculates values based on the defined expression when data is written. Generated columns allow for more complex expressions to be defined, but the value cannot be explicitly written or set. + +:::info + +See doc: https://doris.apache.org/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/ + +::: + +## 8. Functional improvements + +### 8-1. Materialized view + +We have refactored the selection logic for materialized views and migrated it from the rule-based optimizer (RBO) to the cost-based optimizer (CBO). This aligns the selection logic with that of asynchronous materialized views. This functionality is enabled by default. If any issues are encountered, you can revert to the RBO mode using `set global enable_sync_mv_cost_based_rewrite = false`. + +### 8-2. Routine Load + +In previous versions, the Routine Load functionality faced some usability challenges, such as uneven task scheduling across BE nodes, untimely task scheduling, complex configuration requirements (the need to change multiple FE and BE settings for optimization), and insufficient overall stability (restarts or upgrades could frequently pause Routine Load jobs, requiring manual user intervention to resume). + +To address these issues, we have made extensive optimizations to the Routine Load feature: + +- **Resource scheduling**: We have improved the scheduling balance to make sure that tasks are more evenly distributed across BE nodes. Jobs that encounter unrecoverable errors will be promptly paused to avoid wasting resources on futile scheduling attempts. 
Additionally, we have improved the timeliness of the scheduling process, which has enhanced the import performance of Routine Load. + +- **Parameter configuration**: Users in most environments no longer need to modify FE and BE configurations for optimization. An automatic adjustment mechanism for the timeout parameter has been introduced to prevent tasks from constantly retrying when cluster pressure increases. + +- **Stability**: We have enhanced the robustness of Doris in various exceptional scenarios, such as FE failovers, BE rolling upgrades, and Kafka cluster anomalies, ensuring continuous stable operation. We have also optimized the Auto Resume mechanism, allowing Routine Load to automatically resume operation after faults are repaired, reducing the need for manual user intervention. + +## 9. Behavior changed + +- `cpu_resource_limit` will no longer be supported, and all types of resource isolation will be implemented through Workload Groups. + +- Please use JDK 17 for Apache Doris 3.0 and later versions. The recommended version is `jdk-17.0.10_linux-x64_bin.tar.gz`. + +## Try Apache Doris 3.0 now! + +Before the official release of version 3.0, the compute-storage decoupled mode of Apache Doris had undergone nearly two years of extensive testing and optimization in the production environments of hundreds of enterprises. Contributors from many tech giants have collaborated with the community to provide a significant number of test cases based on their real-world business needs. This has rigorously validated the usability and stability of version 3.0. + +We highly recommend that users with compute-storage decoupling needs download version 3.0 and experience it firsthand. + +Going forward, we will accelerate our release iteration cycle to deliver a more stable version experience for all users. 
Feel free to join us in the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ) and engage directly with the core developers. + +## Credits + +Special thanks to the following contributors who participated in the development, testing, and provided feedback for this version: + +@133tosakarin、@390008457、@924060929、@AcKing-Sam、@AshinGau、@BePPPower、@BiteTheDDDDt、@ByteYue、@CSTGluigi、@CalvinKirs、@Ceng23333、@DarvenDuan、@DongLiang-0、@Doris-Extras、@Dragonliu2018、@Emor-nj、@FreeOnePlus、@Gabriel39、@GoGoWen、@HappenLee、@HowardQin、@Hyman-zhao、@INNOCENT-BOY、@JNSimba、@JackDrogon、@Jibing-Li、@KassieZ、@Lchangliang、@LemonLiTree、@LiBinfeng-01、@LompleZ、@M1saka2003、@Mryange、@Nitin-Kashyap、@On-Work-Song、@SWJTU-ZhangLei、@StarryVerse、@TangSiyang2001、@Tech-Circle-48、@Thearas、@Vallishp、@WinkerDu、@XieJiann、@XuJianxu、@XuPengfei-1020、@Yukang-Lian、@Yulei-Yang、@Z-SWEI、@ZhongJinHacker、@adonis0147、@airborne12、@allenhooo、@amorynan、@bingquanzhao、@biohazard4321、@bobhan1、@caiconghui、@cambyzju、@caoliang-web、@catpineapple、@cjj2010、@csun5285、@dataroaring、@deardeng、@dongsilun、@dutyu、@echo-hhj、@eldenmoon、@elvestar、@englefly、@feelshana、@feifeifeimoon、@feiniaofeiafei、@felixwluo、@freemandealer、@gavinchou、@ghkang98、@gnehil、@hechao-ustc、@hello-stephen、@httpshirley、@hubgeter、@hust-hhb、@iszhangpch、@iwanttobepowerful、@ixzc、@jacktengg、@jackwener、@jeffreys-cat、@kaijchen、@kaka11chen、@kindred77、@koarz、@kobe6th、@kylinmac、@larshelge、@liaoxin01、@lide-reed、@liugddx、@liujiwen-up、@liutang123、@lsy3993、@luwei16、@luzhijing、@lxliyou001、@mongo360、@morningman、@morrySnow、@mrhhsg、@my-vegetable-has-exploded、@mymeiyi、@nanfeng1999、@nextdreamblue、@pingchunzhang、@platoneko、@py023、@qidaye、@qzsee、@raboof、@rohitrs1983、@rotkang、@ryanzryu、@seawinde、@shoothzj、@shuke987、@sjyango、@smallhibiscus、@sollhui、@sollhui、@spaces-X、@stalary、@starocean999、@superdiaodiao、@suxiaogang223、@taptao、@vhwzx、@vinlee19、@w41ter、@wangbo、@wangshuo128、@whutpencil、@wsjz、@wuwenchi、@wyxxxcat、@xiaokang、@xiede
yantu、@xiedeyantu、@xingyingone、@xinyiZzz、@xy720、@xzj7019、@yagagagaga、@yiguolei、@yongjinhou、@ytwp、@yuanyuan8983、@yujun777、@yuxuan-luo、@zclllyybb、@zddr、@zfr9527、@zgxme、@zhangbutao、@zhangstar333、@zhannngchen、@zhiqiang-hhhh、@ziyanTOP、@zxealous、@zy-kkk、@zzzxl1993、@zzzzzzzs \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md new file mode 100644 index 0000000000000..9b9007e4391aa --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md @@ -0,0 +1,604 @@ +--- +{ + "title": "Release 3.0.1", + "language": "en" +} +--- + + + +Dear community members, the Apache Doris 3.0.1 version was officially released on August 23, 2024, featuring updates and improvements in compute-storage decoupling, lakehouse, semi-structured data analysis, asynchronous materialized views, and more. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior Changes + +### Query Optimizer + +- Added the variable `use_max_length_of_varchar_in_ctas` to control the length behavior of VARCHAR type when executing `CREATE TABLE AS SELECT` (CTAS) operations. [#37069](https://github.com/apache/doris/pull/37069) + + - This variable is set to true by default. + + - When set to true, if the VARCHAR type column originates from a table, the derived length is used; otherwise, the maximum length is used. + + - When set to false, the VARCHAR type will always use the derived length. + +- All data types will now be displayed in lowercase to maintain compatibility with MySQL format. [#38012](https://github.com/apache/doris/pull/38012) + +- Multiple query statements in the same query request must now be separated by semicolons. 
[#38670](https://github.com/apache/doris/pull/38670) + +### Query Execution + +- The default number of parallel tasks after shuffle operations in the cluster is set to 100, which will improve query stability and concurrent processing capability in large clusters. [#38196](https://github.com/apache/doris/pull/38196) + +### Storage + +- The default value of `trash_file_expire_time_sec` has been changed from 86400 seconds to 0 seconds, which means that if files are deleted by mistake and the FE trash is cleared, the data cannot be recovered. + +- The table attribute `enable_mow_delete_on_delete_predicate` (introduced in version 3.0.0) has been renamed to `enable_mow_light_delete`. + +- Explicit transactions are now prohibited from performing delete operations on tables with written data. + +- Heavy schema change operations are prohibited on tables with auto-increment fields. + + + +## New Features + +### Job Scheduling + +- Optimized the execution logic of internal scheduling jobs, decoupling the strong association between start time and immediate execution parameters. Now, tasks can be created with a specified start time or selected for immediate execution, without conflict, enhancing scheduling flexibility. [#36805](https://github.com/apache/doris/pull/36805) + +### Compute-Storage Decoupled + +- Supports dynamic modification of the upper limit for file cache usage. [#37484](https://github.com/apache/doris/pull/37484) + +- Recycler now supports object storage rate limiting and server-side rate limiting retry functionality. [#37663](https://github.com/apache/doris/pull/37663) [#37680](https://github.com/apache/doris/pull/37680) + +### Lakehouse + +- Added the session variable `serde_dialect` to set the output format for complex types. [#37039](https://github.com/apache/doris/pull/37039) + +- SQL interception now supports external tables. 
+ + - For more information, refer to the documentation on [SQL Interception](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Insert overwrite now supports Iceberg tables. [#37191](https://github.com/apache/doris/pull/37191) + +### Asynchronous Materialized Views + +- Supports partition roll-up and build at the hourly level. [#37678](https://github.com/apache/doris/pull/37678) + +- Supports atomic replacement of asynchronous materialized view definition statements. [#36749](https://github.com/apache/doris/pull/36749) + +- Transparent rewriting now supports Insert statements. [#38115](https://github.com/apache/doris/pull/38115) + +- Transparent rewriting now supports the VARIANT type. [#37929](https://github.com/apache/doris/pull/37929) + +### Query Execution + +- The group concat function now supports DISTINCT and ORDER BY options. [#38744](https://github.com/apache/doris/pull/38744) + +### Semi-Structured Data Management + +- The ES Catalog now maps `nested` or `object` types in Elasticsearch to the JSON type in Doris. [#37101](https://github.com/apache/doris/pull/37101) + +- Added the `MULTI_MATCH` function, which supports matching keywords across multiple fields and can leverage inverted indexes to accelerate searches. [#37722](https://github.com/apache/doris/pull/37722) + +- Added the `explode_json_object` function, which can unfold objects in JSON data into multiple rows. [#36887](https://github.com/apache/doris/pull/36887) + +- Inverted indexes now support memtable advancement, requiring index construction only once during multi-replica writes, reducing CPU consumption and improving performance. [#35891](https://github.com/apache/doris/pull/35891) + +- Added `MATCH_PHRASE` support for positive slop, e.g., `msg MATCH_PHRASE 'a b 2+'` can match instances containing words a and b with a slop of no more than two, and a preceding b; regular slop without the final `+` does not guarantee this order. 
[#36356](https://github.com/apache/doris/pull/36356) + +### Other + +- Added the FE parameter `skip_audit_user_list`, where user operations specified in this configuration will not be recorded in the audit log. [#38310](https://github.com/apache/doris/pull/38310) + + - For more information, refer to the documentation on [Audit Plugin](https://doris.apache.org/docs/admin-manual/audit-plugin/). + + + +## Improvements + +### Storage + +- Reduced the likelihood of write failures caused by disk balancing within a single BE. [#38000](https://github.com/apache/doris/pull/38000) + +- Decreased memory consumption by the memtable limiter. [#37511](https://github.com/apache/doris/pull/37511) + +- Moved old partitions to the FE trash during partition replacement operations. [#36361](https://github.com/apache/doris/pull/36361) + +- Optimized memory consumption during compaction. [#37099](https://github.com/apache/doris/pull/37099) + +- Added a session variable to control audit logs for JDBC PreparedStatement, with default setting to not print. [#38419](https://github.com/apache/doris/pull/38419) + +- Optimized the logic for selecting BEs for group commits. [#35558](https://github.com/apache/doris/pull/35558) + +- Improved the performance of column updates. [#38487](https://github.com/apache/doris/pull/38487) + +- Optimized the use of `delete bitmap cache`. [#38761](https://github.com/apache/doris/pull/38761) + +- Added a configuration to control query affinity during hot and cold tiering. [#37492](https://github.com/apache/doris/pull/37492) + +### Compute-Storage Decoupled + +- Implemented automatic retries when encountering object storage server rate limiting. [#37199](https://github.com/apache/doris/pull/37199) + +- Adapted the number of threads for memtable flush in the compute-storage decoupled mode. [#38789](https://github.com/apache/doris/pull/38789) + +- Added Azure as a compile option to support compilation in environments without Azure support. 
+ +- Optimized the observability of object storage access rate limiting. [#38294](https://github.com/apache/doris/pull/38294) + +- Allowed the file cache TTL queue to perform LRU eviction, enhancing TTL queue usability. [#37312](https://github.com/apache/doris/pull/37312) + +- Optimized the number of balance writeeditlog IO operations in the storage and compute separation mode. [#37787](https://github.com/apache/doris/pull/37787) + +- Improved table creation speed in the storage and compute separation mode by sending tablet creation requests in batches. [#36786](https://github.com/apache/doris/pull/36786) + +- Optimized read failures caused by potential inconsistencies in the local file cache through backoff retries. [#38645](https://github.com/apache/doris/pull/38645) + +### Lakehouse + +- Optimized memory statistics for Parquet/ORC format read and write operations. [#37234](https://github.com/apache/doris/pull/37234) + +- Trino Connector Catalog now supports predicate pushdown. [#37874](https://github.com/apache/doris/pull/37874) + +- Added a session variable `enable_count_push_down_for_external_table` to control whether to enable `count(*)` pushdown optimization for external tables. [#37046](https://github.com/apache/doris/pull/37046) + +- Optimized the read logic for Hudi snapshot reads, returning an empty set when the snapshot is empty, consistent with Spark behavior. [#37702](https://github.com/apache/doris/pull/37702) + +- Improved the read performance of partition columns for Hive tables. [#37377](https://github.com/apache/doris/pull/37377) + +### Asynchronous Materialized Views + +- Improved transparent rewrite plan speed by 20%. [#37197](https://github.com/apache/doris/pull/37197) + +- Eliminated roll-up during transparent rewrite if the group key satisfies data uniqueness for better nested matching. 
[#38387](https://github.com/apache/doris/pull/38387) + +- Transparent rewrite now performs better aggregation elimination to improve the matching success rate of nested materialized views. [#36888](https://github.com/apache/doris/pull/36888) + +### MySQL Compatibility + +- Now correctly populates the database name, table name, and original name in the MySQL protocol result columns. [#38126](https://github.com/apache/doris/pull/38126) + +- Supported the hint format `/*+ func(value) */`. [#37720](https://github.com/apache/doris/pull/37720) + +### Query Optimizer + +- Significantly improved the plan speed for complex queries. [#38317](https://github.com/apache/doris/pull/38317) + +- Adaptively chose whether to perform bucket shuffle based on the number of data buckets to avoid performance degradation in extreme cases. [#36784](https://github.com/apache/doris/pull/36784) + +- Optimized the cost estimation logic for SEMI / ANTI JOIN. [#37951](https://github.com/apache/doris/pull/37951) [#37060](https://github.com/apache/doris/pull/37060) + +- Supported pushing Limit down to the first stage of aggregation to improve performance. [#34853](https://github.com/apache/doris/pull/34853) + +- Partition pruning now supports filter conditions containing the `date_trunc` or `date` function. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- SQL cache now supports query scenarios that include user variables. [#37915](https://github.com/apache/doris/pull/37915) + +- Optimized error messages for invalid aggregation semantics. [#38122](https://github.com/apache/doris/pull/38122) + +### Query Execution + +- Adapted AggState compatibility from 2.1 to 3.x and fixed Coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle without Join. 
[#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to be asynchronous to prevent stalling during such queries. [#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process during Hash table construction for Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Optimized the duration of lock holding for MultiCast. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC keepAliveTime and added link monitoring to reduce the probability of query failure due to RPC errors. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits were exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Optimized the processing performance of `aes_encrypt`/`decrypt` functions for constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the processing performance of the `json_extract` function for constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the processing performance of the `ParseUrl` function for constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Semi-Structured Data Management + +- Bitmap indexes now default to using inverted indexes, with `enable_create_bitmap_index_as_inverted_index` set to true by default. [#36692](https://github.com/apache/doris/pull/36692) + +- In the compute-storage decoupled mode, DESC can now view sub-columns of VARIANT type. [#38143](https://github.com/apache/doris/pull/38143) + +- Removed the step of checking file existence during inverted index queries to reduce access latency to remote storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Complex types ARRAY / MAP / STRUCT now support `replace_if_not_null` for AGG tables. [#38304](https://github.com/apache/doris/pull/38304) + +- Escape characters for JSON data are now supported. 
[#37176](https://github.com/apache/doris/pull/37176) [#37251](https://github.com/apache/doris/pull/37251) + +- Inverted index queries now behave consistently on MOW tables and DUP tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Optimized the performance of inverted index acceleration for IN queries. [#37395](https://github.com/apache/doris/pull/37395) + +- Reduced unnecessary memory allocation during TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- When creating an inverted index with tokenization, the `support_phrase` option is now automatically enabled to accelerate phrase queries in the `match_phrase` family. [#37949](https://github.com/apache/doris/pull/37949) + +### Other + +- The audit log can now record SQL types. [#37790](https://github.com/apache/doris/pull/37790) + +- Added support for `information_schema.processlist` to show all FEs. [#38701](https://github.com/apache/doris/pull/38701) + +- Cached Ranger's `datamask` and `rowpolicy` to improve query efficiency. [#37723](https://github.com/apache/doris/pull/37723) + +- Optimized metadata management in job manager to release locks immediately after modifying metadata, reducing lock holding time. [#38162](https://github.com/apache/doris/pull/38162) + + + +## Bug Fixes + +### Upgrade + +- Fix the issue where `mtmv load` fails during upgrade from version 2.1. [#38799](https://github.com/apache/doris/pull/38799) + +- Resolve the issue where `null_type` cannot be found during the upgrade to version 2.1. [#39373](https://github.com/apache/doris/pull/39373) + +- Address the compatibility issue with permission persistence during the upgrade from version 2.1 to 3.0. [#39288](https://github.com/apache/doris/pull/39288) + +### Load + +- Fix the issue where parsing fails when the newline character is surrounded by delimiters in CSV format parsing. 
[#38347](https://github.com/apache/doris/pull/38347) +- Resolve potential exception issues when FE forwards group commit. [#38228](https://github.com/apache/doris/pull/38228) [#38265](https://github.com/apache/doris/pull/38265) + +- Group commit now supports the new optimizer. [#37002](https://github.com/apache/doris/pull/37002) + +- Fix the issue where group commit reports data errors when JDBC setNull is used. [#38262](https://github.com/apache/doris/pull/38262) + +- Optimize the retry logic for group commit when encountering `delete bitmap lock` errors. [#37600](https://github.com/apache/doris/pull/37600) + +- Resolve the issue where routine load cannot use CSV delimiters and escape characters. [#38402](https://github.com/apache/doris/pull/38402) + +- Fix the issue where routine load job names with mixed case cannot be displayed. [#38523](https://github.com/apache/doris/pull/38523) + +- Optimize the logic for actively recovering routine load during FE master-slave switching. [#37876](https://github.com/apache/doris/pull/37876) + +- Resolve the issue where routine load pauses when all data in Kafka is expired. [#37288](https://github.com/apache/doris/pull/37288) + +- Fix the issue where `show routine load` returns empty results. [#38199](https://github.com/apache/doris/pull/38199) + +- Resolve the memory leak issue during multi-table stream import in routine load. [#38255](https://github.com/apache/doris/pull/38255) + +- Fix the issue where stream load does not return the error URL. [#38325](https://github.com/apache/doris/pull/38325) + +- Resolve potential load channel leak issues. [#38031](https://github.com/apache/doris/pull/38031) [#37500](https://github.com/apache/doris/pull/37500) + +- Fix the issue where no error may be reported when importing fewer segments than expected. [#36753](https://github.com/apache/doris/pull/36753) + +- Resolve the load stream leak issue. 
[#38912](https://github.com/apache/doris/pull/38912) + +- Optimize the impact of offline nodes on import operations. [#38198](https://github.com/apache/doris/pull/38198) + +- Fix the issue where transactions do not end when inserting empty data. [#38991](https://github.com/apache/doris/pull/38991) + +### Storage + +**01 Backup and Restoration** + +- Fix the issue where tables cannot be written after backup and restoration. [#37089](https://github.com/apache/doris/pull/37089) + +- Resolve the issue where view database names are incorrect after backup and restoration. [#37412](https://github.com/apache/doris/pull/37412) + +**02 Compaction** + +- Fix the issue where cumulative compaction handles delete errors incorrectly during ordered data compaction. [#38742](https://github.com/apache/doris/pull/38742) + +- Resolve the issue of duplicate keys in aggregate tables caused by sequential compaction optimization. [#38224](https://github.com/apache/doris/pull/38224) + +- Fix the issue where compaction operations cause coredump in large wide tables. [#37960](https://github.com/apache/doris/pull/37960) + +- Resolve the compaction starvation issue caused by inaccurate concurrent statistics of compaction tasks. [#37318](https://github.com/apache/doris/pull/37318) + +**03 MOW Unique Key** + +- Resolve the issue of inconsistent data between replicas caused by cumulative compaction removing the delete sign. [#37950](https://github.com/apache/doris/pull/37950) + +- MOW delete now uses partial column updates with the new optimizer. [#38751](https://github.com/apache/doris/pull/38751) + +- Fix the potential duplicate key issue in MOW tables in the compute-storage decoupled mode. [#39018](https://github.com/apache/doris/pull/39018) + +- Resolve the issue where MOW unique and duplicate tables cannot modify column order. [#37067](https://github.com/apache/doris/pull/37067) + +- Fix the potential data correctness issue caused by segcompaction. 
[#37760](https://github.com/apache/doris/pull/37760) + +- Resolve the potential memory leak issue during column updates. [#37706](https://github.com/apache/doris/pull/37706) + +**04 Other** + +- Fix the small probability of exceptions in TOPN queries. [#39119](https://github.com/apache/doris/pull/39119) [#39199](https://github.com/apache/doris/pull/39199) + +- Resolve the issue where auto-increment IDs may duplicate during FE restart. [#37306](https://github.com/apache/doris/pull/37306) + +- Fix the potential queuing issue in the delete operation priority queue. [#37169](https://github.com/apache/doris/pull/37169) + +- Optimize the delete retry logic. [#37363](https://github.com/apache/doris/pull/37363) + +- Resolve the issue with `bucket = 0` in table creation statements under the new optimizer. [#38971](https://github.com/apache/doris/pull/38971) + +- Fix the issue where FE reports success incorrectly when image generation fails. [#37508](https://github.com/apache/doris/pull/37508) + +- Resolve the issue where using the wrong nodename during FE offline nodes may cause inconsistent FE members. [#37987](https://github.com/apache/doris/pull/37987) + +- Fix the issue where CCR partition addition may fail. [#37295](https://github.com/apache/doris/pull/37295) + +- Resolve the `int32` overflow issue in inverted index files. [#38891](https://github.com/apache/doris/pull/38891) + +- Fix the issue where TRUNCATE TABLE failure may cause BE to fail to go offline. [#37334](https://github.com/apache/doris/pull/37334) + +- Resolve the issue where publish cannot continue due to null pointers. [#37724](https://github.com/apache/doris/pull/37724) [#37531](https://github.com/apache/doris/pull/37531) + +- Fix the potential coredump issue when manually triggering disk migration. [#37712](https://github.com/apache/doris/pull/37712) + +### Compute-Storage Decoupled + +- Fixed the issue where `show create table` might display the `file_cache_ttl_seconds` attribute twice. 
[#38052](https://github.com/apache/doris/pull/38052) + +- Fixed the issue where segment Footer TTL was not set correctly after setting file cache TTL. [#37485](https://github.com/apache/doris/pull/37485) + +- Fixed the issue where file cache might cause coredump due to massive conversion of cache types. [#38518](https://github.com/apache/doris/pull/38518) + +- Fixed the potential file descriptor (fd) leak in file cache. [#38051](https://github.com/apache/doris/pull/38051) + +- Fixed the issue where schema change Job overwriting compaction Job prevented base tablet compaction from completing normally. [#38210](https://github.com/apache/doris/pull/38210) + +- Fixed the potential inaccuracy of base compaction score due to data race. [#38006](https://github.com/apache/doris/pull/38006) + +- Fixed the issue where error messages from imports might not be uploaded correctly to object storage. [#38359](https://github.com/apache/doris/pull/38359) + +- Fixed the inconsistency in return information between compute-storage decoupled mode and storage and compute integration mode for 2PC imports. [#38076](https://github.com/apache/doris/pull/38076) + +- Fix the issue where incorrect file size setting during file cache warm-up leads to coredump. [#38939](https://github.com/apache/doris/pull/38939) + +- Fixed the issue where partial column updates did not correctly dequeue delete operations. [#37151](https://github.com/apache/doris/pull/37151) + +- Fixed compatibility issues with permission persistence in compute-storage decoupled mode. [#38136](https://github.com/apache/doris/pull/38136) [#37708](https://github.com/apache/doris/pull/37708) + +- Fixed the issue where observer did not retry correctly when encountering a `-230` error. [#37625](https://github.com/apache/doris/pull/37625) + +- Fixed the issue where `show load` with conditions did not perform correct analysis. 
[#37656](https://github.com/apache/doris/pull/37656) + +- Fixed the issue where `show streamload` in compute-storage decoupled mode caused BE coredump. [#37903](https://github.com/apache/doris/pull/37903) + +- Fixed the issue where `copy into` did not correctly verify column names in strict mode. [#37650](https://github.com/apache/doris/pull/37650) + +- Fixed the issue where multi-stream imports into a single table lacked permissions. [#38878](https://github.com/apache/doris/pull/38878) + +- Fixed the potential overflow issue in `getVersionUpdateTimeMs`. [#38074](https://github.com/apache/doris/pull/38074) + +- Fixed the issue where FE azure blob list was not implemented correctly. [#37986](https://github.com/apache/doris/pull/37986) + +- Fixed the issue where inaccurate azure blob recycling time calculation prevented recycling. [#37535](https://github.com/apache/doris/pull/37535) + +- Fixed the issue where inverted index files were not deleted in compute-storage decoupled mode. [#38306](https://github.com/apache/doris/pull/38306) + +### Lakehouse + +- Fixed the issue with reading binary data from Oracle Catalog. [#37078](https://github.com/apache/doris/pull/37078) + +- Fixed the potential deadlock issue when acquiring external table metadata in multi-FE scenarios. [#37756](https://github.com/apache/doris/pull/37756) + +- Fixed the issue where JNI scanner failure caused BE nodes to crash. [#37697](https://github.com/apache/doris/pull/37697) + +- Fixed the issue with slow reading of date types from Trino Connector Catalog. [#37266](https://github.com/apache/doris/pull/37266) + +- Optimized kerberos authentication logic for Hive Catalog. [#37301](https://github.com/apache/doris/pull/37301) + +- Fixed the issue where region attributes might be parsed incorrectly when parsing MinIO properties. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed the issue where creating too many FileSystems by FE caused memory leaks. 
[#36954](https://github.com/apache/doris/pull/36954) + +- Fixed the issue with reading incorrect time zone information from Paimon. [#37716](https://github.com/apache/doris/pull/37716) + +- Fixed the potential thread leak issue caused by Hive write-back operations. [#36990](https://github.com/apache/doris/pull/36990) + +- Fixed the null pointer issue caused by enabling Hive metastore event synchronization. [#38421](https://github.com/apache/doris/pull/38421) + +- Fixed the issue where error messages were unclear or caused stalling when creating catalogs. [#37551](https://github.com/apache/doris/pull/37551) + +- Fixed the issue where reading Hive text format tables behaved differently from Hive. [#37638](https://github.com/apache/doris/pull/37638) + +- Fixed the logic error when switching between catalogs and databases. [#37828](https://github.com/apache/doris/pull/37828) + +### MySQL Compatibility + +- Fixed the issue where certain flags in the MySQL protocol were set incorrectly when SSL was enabled. [#38086](https://github.com/apache/doris/pull/38086) + +### Asynchronous Materialized Views + +- Fixed the issue where construction might fail when the base table had a very large number of partitions. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed the issue where nested materialized views incorrectly performed full table refreshes even when partition refreshes were possible. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixed the issue where partition refresh could not handle the simultaneous existence of valid and invalid dependencies when analyzing partition dependencies. [#38367](https://github.com/apache/doris/pull/38367) + +- Fixed the issue where the final result containing NULL type might cause asynchronous materialized views to fail. 
[#37019](https://github.com/apache/doris/pull/37019) + +- Fixed the planning error that might occur during transparent rewriting when both synchronous and asynchronous materialized views with the same name were present. [#37311](https://github.com/apache/doris/pull/37311) + +### Synchronous Materialized Views + +- The rewritten synchronous materialized views now can correctly perform partition pruning. [#38527](https://github.com/apache/doris/pull/38527) + +- When rewriting synchronous materialized views, those with unready data are no longer selected. [#38148](https://github.com/apache/doris/pull/38148) + +### Query Optimizer + +- Fixed the deadlock issue that might occur when queries and delete operations are performed simultaneously. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the issue where bucket pruning might incorrectly prune on decimal column buckets. [#37889](https://github.com/apache/doris/pull/37889) + +- Fixed the issue where planning might be incorrect when mark join participates in join reorder. [#39152](https://github.com/apache/doris/pull/39152) + +- Fixed the issue where the result is incorrect when the correlation condition of a correlated subquery is not a simple column. [#37644](https://github.com/apache/doris/pull/37644) + +- Fixed the issue where partition pruning cannot correctly handle or expressions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the planning error that might occur when optimizing the execution order of JOIN and AGG. [#37343](https://github.com/apache/doris/pull/37343) + +- Fixed the issue where `str_to_date` performs incorrect constant folding calculations on datev1 types. [#37360](https://github.com/apache/doris/pull/37360) + +- Fixed the issue where the ACOS function's constant folding returns non-NaN values. [#37932](https://github.com/apache/doris/pull/37932) + +- Fixed the occasional planning error: "The children format needs to be [WhenClause+, DefaultValue?]". 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where planning might be incorrect when the projection includes window functions and there is both the original column and its alias. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where planning might report an error when the aggregation parameter contains a lambda expression. [#37109](https://github.com/apache/doris/pull/37109) + +- Fixed the insert error that might occur in extreme cases: "MultiCastDataSink cannot be cast to DataStreamSink". [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where the new optimizer does not correctly handle `char(0)/varchar(0)` when creating a table. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the incorrect behavior of `char(255) toSql`. [#37340](https://github.com/apache/doris/pull/37340) + +- Fixed the issue where the nullable attribute within the `agg_state` type might lead to planning errors. [#37489](https://github.com/apache/doris/pull/37489) +- Fixed the issue where row count statistics are inaccurate during mark Join. [#38270](https://github.com/apache/doris/pull/38270) + +### Query Execution + +- Fixed issues where the Pipeline execution engine was stuck, causing queries to not end, in multiple scenarios. [#38657](https://github.com/apache/doris/pull/38657), [#38206](https://github.com/apache/doris/pull/38206), [#38885](https://github.com/apache/doris/pull/38885), [#38151](https://github.com/apache/doris/pull/38151), [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed the coredump issue caused by NULL and non-NULL columns during set difference calculations. [#38750](https://github.com/apache/doris/pull/38750) + +- Fixed the error when using the DECIMAL type with pure decimals in delete statements. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed the issue where the `width_bucket` function returned incorrect results. 
[#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error when a single row of data was very large and the result set was also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the coredump issue caused by incorrect release of rpc connections during single-replica imports. [#38087](https://github.com/apache/doris/pull/38087) + +- Fixed the coredump issue caused by processing NULL values with the `foreach` function. [#37349](https://github.com/apache/doris/pull/37349) + +- Fixed the issue where stddev returned incorrect results for DECIMALV2 types. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the slow performance of `bitmap union` calculations. [#37816](https://github.com/apache/doris/pull/37816) + +- Fixed the issue where RowsProduced for aggregation operators was not set in the profile. [#38271](https://github.com/apache/doris/pull/38271) + +- Fixed the overflow issue when calculating the number of buckets for the hash table under hash join. [#37193](https://github.com/apache/doris/pull/37193), [#37493](https://github.com/apache/doris/pull/37493) + +- Fixed the inaccurate recording of the `jemalloc cache memory tracker`. [#37464](https://github.com/apache/doris/pull/37464) + +- Added the `enable_stacktrace` configuration option, allowing users to control whether exception stacks are output in BE logs. [#37713](https://github.com/apache/doris/pull/37713) + +- Fixed the issue where Arrow Flight SQL did not work correctly when `enable_parallel_result_sink` was set to false. [#37779](https://github.com/apache/doris/pull/37779) + +- Fixed the incorrect use of colocate Join. [#37361](https://github.com/apache/doris/pull/37361), [#37729](https://github.com/apache/doris/pull/37729) + +- Fixed the calculation overflow issue of the `round` function on DECIMAL128 types. 
[#37733](https://github.com/apache/doris/pull/37733), [#38106](https://github.com/apache/doris/pull/38106) + +- Fixed the coredump issue when passing a const string to the `sleep` function. [#37681](https://github.com/apache/doris/pull/37681) + +- Increased the queue length for audit logs, solving the issue where audit logs could not be recorded normally under high concurrency scenarios with thousands of concurrent connections. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed the issue where creating a workload group caused too many threads, leading to BE coredump. [#38096](https://github.com/apache/doris/pull/38096) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the transaction rollback issue caused by `insert overwrite auto partition`. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the issue where the TimeUtils formatter did not use the correct time zone. [#37465](https://github.com/apache/doris/pull/37465) + +- Fixed the issue where results were incorrect under constant folding scenarios for week/yearweek. [#37376](https://github.com/apache/doris/pull/37376) + +- Fixed the issue where the `convert_tz` function returned incorrect results. [#37358](https://github.com/apache/doris/pull/37358), [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by `percentile_approx` during rolling upgrades. [#39321](https://github.com/apache/doris/pull/39321) + +- Fixed the coredump issue caused by the `mod` function when encountering abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where the hash table was not fully built when the broadcast join probe started running. 
[#37643](https://github.com/apache/doris/pull/37643) + +- Fixed the issue where executing the same expression in multithreaded environments might lead to incorrect results for Java UDFs. [#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by incorrect return types of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the issue where the `json_replace` function returned incorrect types. [#37014](https://github.com/apache/doris/pull/37014) + +- Fixed the issue where the nullable attribute setting was unreasonable for the `percentile` aggregation function. [#37330](https://github.com/apache/doris/pull/37330) + +- Fixed the issue where the results of the `histogram` function were unstable. [#38608](https://github.com/apache/doris/pull/38608) + +- Fixed the issue where task state was displayed incorrectly in the profile. [#38082](https://github.com/apache/doris/pull/38082) + +- Fixed the issue where some queries were incorrectly canceled when the system just started. [#37662](https://github.com/apache/doris/pull/37662) + +### Semi-Structured Data Management + +- Fix some issues with time series compression. [#39170](https://github.com/apache/doris/pull/39170) [#39176](https://github.com/apache/doris/pull/39176) + +- Fix the issue of incorrect index size statistics during compression. [#37232](https://github.com/apache/doris/pull/37232) + +- Fix the potential incorrect matching of ultra-long strings without tokenization in inverted indexes. [#37679](https://github.com/apache/doris/pull/37679) [#38218](https://github.com/apache/doris/pull/38218) + +- Fix the high memory usage issue of `array_range` and `array_with_const` functions when dealing with large data volumes. [#38284](https://github.com/apache/doris/pull/38284) [#37495](https://github.com/apache/doris/pull/37495) + +- Fix the potential coredump issue when selecting columns of ARRAY / MAP / STRUCT types.
[#37936](https://github.com/apache/doris/pull/37936) + +- Fix the import failure issue caused by simdjson parsing errors when specifying jsonpath in Stream Load. [#38490](https://github.com/apache/doris/pull/38490) + +- Fix the exception handling issue when there are duplicate keys in JSON data. [#38146](https://github.com/apache/doris/pull/38146) + +- Fix the potential query error after DROP INDEX. [#37646](https://github.com/apache/doris/pull/37646) + +- Fix the error return issue in row merging checks during index compression. [#38732](https://github.com/apache/doris/pull/38732) + +- Inverted index v2 format now supports renaming columns. [#38079](https://github.com/apache/doris/pull/38079) + +- Fix the coredump issue when the `MATCH` function matches an empty string without an index. [#37947](https://github.com/apache/doris/pull/37947) + +- Fix the handling of NULL values in inverted indexes. [#37921](https://github.com/apache/doris/pull/37921) [#37842](https://github.com/apache/doris/pull/37842) [#38741](https://github.com/apache/doris/pull/38741) + +- Fix the incorrect `row_store_page_size` after FE restart. [#38240](https://github.com/apache/doris/pull/38240) + +### Other + +- Fix the timezone configuration issue. The default timezone is no longer fixed at UTC+8 and is now obtained from system configuration. [#37294](https://github.com/apache/doris/pull/37294) + +- Fix the class conflict issue when using ranger due to multiple JSR specification implementations. [#37575](https://github.com/apache/doris/pull/37575) + +- Fix the potential uninitialized field issue in some BE code. [#37403](https://github.com/apache/doris/pull/37403) + +- Fix the error in delete statements for random distributed tables. [#37985](https://github.com/apache/doris/pull/37985) + +- Fix the incorrect requirement for `alter_priv` permission on the base table when creating a synchronized materialized view. 
[#38011](https://github.com/apache/doris/pull/38011) + +- Fix the issue of not authenticating resources when used in TVF. [#36928](https://github.com/apache/doris/pull/36928) + + +## Credits + +Thanks all who contribute to this release: + +@133tosakarin, @924060929, @AshinGau, @Baymine, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @Ceng23333, @DarvenDuan, @FreeOnePlus, @Gabriel39, @HappenLee, @JNSimba, @Jibing-Li, @KassieZ, @Lchangliang, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Tech-Circle-48, @Vallishp, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @cjj2010, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @gavinchou, @ghkang98, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @kaijchen, @kaka11chen, @keanji-x, @liaoxin01, @liutang123, @luwei16, @luzhijing, @lxr599, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @w41ter, @wangbo, @wangshuo128, @whutpencil, @wsjz, @wuwenchi, @wyxxxcat, @xiaokang, @xiedeyantu, @xinyiZzz, @xy720, @xzj7019, @yagagagaga, @yiguolei, @yujun777, @z404289981, @zclllyybb, @zddr, @zfr9527, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zjj, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md new file mode 100644 index 0000000000000..0ab6a828ab95d --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md @@ -0,0 +1,341 @@ +--- +{ + "title": "Release 3.0.2", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.2 version was officially released on October 15, 2024, featuring updates and improvements in compute-storage decoupling, data storage, lakehouse, query optimizer, query execution and more. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +### Storage + +- Limited the number of tablets in a single backup task to prevent FE memory overflow. [#40518](https://github.com/apache/doris/pull/40518) +- The `SHOW PARTITIONS` command now displays the `CommittedVersion` of partitions. [#28274](https://github.com/apache/doris/pull/28274) + +### Other + +- The default printing mode (asynchronous) of `fe.log` now includes file line number information. If performance issues are encountered due to line number output, please switch to BRIEF mode. [#39419](https://github.com/apache/doris/pull/39419) +- The default value of the session variable `ENABLE_PREPARED_STMT_AUDIT_LOG` has been changed from `true` to `false`, and the audit log of prepare statements will no longer be printed. [#38865](https://github.com/apache/doris/pull/38865) +- The default value of the session variable `max_allowed_packet` has been adjusted from 1MB to 16MB to align with MySQL 8.4. [#38697](https://github.com/apache/doris/pull/38697) +- The JVM of FE and BE defaults to using the UTF-8 character set. [#39521](https://github.com/apache/doris/pull/39521) + +## New Features + +### Storage + +- Backup and recovery now support clearing tables or partitions that are not in the backup. [#39028](https://github.com/apache/doris/pull/39028) + +### Compute-Storage Decoupled + +- Support for parallel recycling of expired data on multiple tablets. [#37630](https://github.com/apache/doris/pull/37630) +- Support for changing storage vaults through `ALTER` statements. [#38685](https://github.com/apache/doris/pull/38685) [#37606](https://github.com/apache/doris/pull/37606) +- Support for importing a large number of tablets (5000+) in a single transaction (experimental feature). 
[#38243](https://github.com/apache/doris/pull/38243) +- Support for automatically aborting pending transactions caused by reasons such as node restarts, solving the issue of pending transactions blocking decommission or schema change. [#37669](https://github.com/apache/doris/pull/37669) +- A new session variable `enable_segment_cache` has been added to control whether to use segment cache during queries (default is `true`). [#37141](https://github.com/apache/doris/pull/37141) +- Resolved the issue of not being able to import a large amount of data during schema changes in compute-storage decoupled mode. [#39558](https://github.com/apache/doris/pull/39558) +- Support for adding multiple follower roles of FE in compute-storage decoupled mode. [#38388](https://github.com/apache/doris/pull/38388) +- Support for using memory as file cache to accelerate queries in environments with no disks or low-performance HDDs. [#38811](https://github.com/apache/doris/pull/38811) + +### Lakehouse + +- New Lakesoul Catalog has been added. [Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- A new system table `catalog_meta_cache_statistics` has been added to view the usage of various metadata caches in external catalog. [#40155](https://github.com/apache/doris/pull/40155) + +### Query Optimizer + +- Support for `is [not] true/false` expressions. [#38623](https://github.com/apache/doris/pull/38623) + +### Query Execution + +- A new CRC32 function has been added. [#38204](https://github.com/apache/doris/pull/38204) +- New aggregate functions skew and kurt have been added. [#41277](https://github.com/apache/doris/pull/41277) +- Profiles are now persisted to the FE's disk to retain more profiles. [#33690](https://github.com/apache/doris/pull/33690) +- A new system table `workload_group_privileges` has been added to view permission information related to workload groups. 
[#38436](https://github.com/apache/doris/pull/38436) +- A new system table `workload_group_resource_usage` has been added to monitor resource statistics of workload groups. [#39177](https://github.com/apache/doris/pull/39177) +- Workload groups now support limiting reads of local IO and remote IO. [#39012](https://github.com/apache/doris/pull/39012) +- Workload groups now support cgroupv2 to limit CPU usage. [#39374](https://github.com/apache/doris/pull/39374) +- A new system table `information_schema.partitions` has been added to view some table creation attributes. [#40636](https://github.com/apache/doris/pull/40636) + +### Other + +- Support for using the `SHOW` statement to display BE's configuration information, such as `SHOW BACKEND CONFIG LIKE ${pattern}`. [#36525](https://github.com/apache/doris/pull/36525) + +## Improvements + +### Load + +- Improved the import efficiency of routine load when encountering frequent EOFs from Kafka. [#39975](https://github.com/apache/doris/pull/39975) +- The stream load result now includes the time taken to read HTTP data, `ReceiveDataTimeMs`, which can quickly determine slow stream load issues caused by network reasons. [#40735](https://github.com/apache/doris/pull/40735) +- Optimized the routine load timeout logic to avoid frequent timeouts during inverted index and mow writes. [#40818](https://github.com/apache/doris/pull/40818) + +### Storage + +- Support for batch addition of partitions. [#37114](https://github.com/apache/doris/pull/37114) + +### Compute-Storage Decoupled + +- Added the meta-service HTTP interface `/MetaService/http/show_meta_ranges` to facilitate the statistics of KV distribution in FDB. [#39208](https://github.com/apache/doris/pull/39208) +- The meta-service/recycler stop script ensures that the process fully exits before returning. 
[#40218](https://github.com/apache/doris/pull/40218) +- Support for using the session variable `version_comment` (Cloud Mode) to display the current deployment mode as compute-storage decoupled. [#38269](https://github.com/apache/doris/pull/38269) +- Fixed the detailed message returned when transaction submission fails. [#40584](https://github.com/apache/doris/pull/40584) +- Support for using one meta-service process to provide both metadata services and data recycling services. [#40223](https://github.com/apache/doris/pull/40223) +- Optimized the default configuration of file_cache to avoid potential issues when not set. [#41421](https://github.com/apache/doris/pull/41421) [#41507](https://github.com/apache/doris/pull/41507) +- Improved query performance by batch retrieving the version of multiple partitions. [#38949](https://github.com/apache/doris/pull/38949) +- Delayed the redistribution of tablets to avoid query performance issues caused by temporary network fluctuations. [#40371](https://github.com/apache/doris/pull/40371) +- Optimized the read-write lock logic in the balance. [#40633](https://github.com/apache/doris/pull/40633) +- Enhanced the robustness of file cache in handling TTL filenames during restarts/crashes. [#40226](https://github.com/apache/doris/pull/40226) +- Added the BE HTTP interface `/api/file_cache?op=hash` to facilitate the calculation of the hash file names of segment files on disk. [#40831](https://github.com/apache/doris/pull/40831) +- Optimized the unified naming to be compatible with using compute group to represent BE groups (original cloud cluster). [#40767](https://github.com/apache/doris/pull/40767) +- Optimized the waiting time for obtaining locks when calculating delete bitmaps in primary key tables. [#40341](https://github.com/apache/doris/pull/40341) +- When there are many delete bitmaps in primary key tables, optimized the high CPU consumption during queries by pre-merging multiple delete bitmaps. 
[#40204](https://github.com/apache/doris/pull/40204) +- Support for managing FE/BE nodes in compute-storage decoupled mode through SQL statements, hiding the logic of direct interaction with meta-service when deploying in compute-storage decoupled mode. [#40264](https://github.com/apache/doris/pull/40264) +- Added a script for rapid deployment of FDB. [#39803](https://github.com/apache/doris/pull/39803) +- Optimized the output of `SHOW CACHE HOTSPOT` to unify the column name style with other `SHOW` statements. [#41322](https://github.com/apache/doris/pull/41322) +- When using a storage vault as the storage backend, disallowed the use of `latest_fs()` to avoid binding different storage backends to the same table. [#40516](https://github.com/apache/doris/pull/40516) +- Optimized the timeout strategy for calculating delete bitmaps when importing mow tables. [#40562](https://github.com/apache/doris/pull/40562) [#40333](https://github.com/apache/doris/pull/40333) +- The enable_file_cache in be.conf is now enabled by default in compute-storage decoupled mode. [#41502](https://github.com/apache/doris/pull/41502) + +### Lakehouse + +- When reading tables in CSV format, support for the session `keep_carriage_return` setting to control the reading behavior of the `\r` symbol. [#39980](https://github.com/apache/doris/pull/39980) +- The default maximum memory of BE's JVM has been adjusted to 2GB (affecting only new deployments). [#41403](https://github.com/apache/doris/pull/41403) +- Hive Catalog has added `hive.recursive_directories_table` and `hive.ignore_absent_partitions` properties to specify whether to recursively traverse data directories and whether to ignore missing partitions. [#39494](https://github.com/apache/doris/pull/39494) +- Optimized the Catalog refresh logic to avoid generating a large number of connections during refresh. 
[#39205](https://github.com/apache/doris/pull/39205) +- `SHOW CREATE DATABASE` and `SHOW CREATE TABLE` for external data sources now display location information. [#39179](https://github.com/apache/doris/pull/39179) +- The new optimizer supports inserting data into JDBC external tables using the `INSERT INTO` statement. [#41511](https://github.com/apache/doris/pull/41511) +- MaxCompute Catalog now supports complex data types. [#39259](https://github.com/apache/doris/pull/39259) +- Optimized the logic for reading and merging data shards of external tables. [#38311](https://github.com/apache/doris/pull/38311) +- Optimized some refresh strategies for metadata caches of external tables. [#38506](https://github.com/apache/doris/pull/38506) +- Paimon tables now support pushing down `IN/NOT IN` predicates. [#38390](https://github.com/apache/doris/pull/38390) +- Compatible with tables created in Parquet format by Paimon version 0.9. [#41020](https://github.com/apache/doris/pull/41020) + +### Asynchronous Materialized Views + +- Building asynchronous materialized views now supports the use of both immediate and starttime. [#39573](https://github.com/apache/doris/pull/39573) +- Asynchronous materialized views based on external tables will refresh the metadata cache of the external tables before refreshing the materialized views, ensuring construction based on the latest external table data. [#38212](https://github.com/apache/doris/pull/38212) +- Partition incremental construction now supports rolling up according to weekly and quarterly granularities. [#39286](https://github.com/apache/doris/pull/39286) + +### Query Optimizer + +- The aggregate function `GROUP_CONCAT` now supports the use of both `DISTINCT` and `ORDER BY`. [#38080](https://github.com/apache/doris/pull/38080) +- Optimized the collection and use of statistical information, as well as the logic for estimating row counts and cost calculations, to generate more efficient and stable execution plans. 
+- Window function partition data pre-filtering now supports cases containing multiple window functions. [#38393](https://github.com/apache/doris/pull/38393) + +### Query Execution + +- Reduced query latency by running prepare pipeline tasks in parallel. [#40874](https://github.com/apache/doris/pull/40874) +- Display Catalog information in Profile. [#38283](https://github.com/apache/doris/pull/38283) +- Optimized the computational performance of `IN` filtering conditions. [#40917](https://github.com/apache/doris/pull/40917) +- Supported cgroupv2 in K8S to limit Doris's memory usage. [#39256](https://github.com/apache/doris/pull/39256) +- Optimized the performance of converting strings to datetime types. [#38385](https://github.com/apache/doris/pull/38385) +- When a `string` is a decimal number, support casting it to an `int`, which will be more compatible with certain behaviors of MySQL. [#38847](https://github.com/apache/doris/pull/38847) + +### Semi-Structured Data Management + +- Optimized the performance of inverted index matching. [#41122](https://github.com/apache/doris/pull/41122) +- Temporarily prohibited the creation of inverted indexes with tokenization on arrays. [#39062](https://github.com/apache/doris/pull/39062) +- `explode_json_array` now supports binary JSON types. [#37278](https://github.com/apache/doris/pull/37278) +- IP data types now support bloomfilter indexes. [#39253](https://github.com/apache/doris/pull/39253) +- IP data types now support row storage. [#39258](https://github.com/apache/doris/pull/39258) +- Nested data types such as ARRAY, MAP, and STRUCT now support schema changes. [#39210](https://github.com/apache/doris/pull/39210) +- When creating MTMV, automatically truncate KEYs encountered in VARIANT data types. [#39988](https://github.com/apache/doris/pull/39988) +- Lazy loading of inverted indexes during queries to improve performance. 
[#38979](https://github.com/apache/doris/pull/38979) +- `add inverted index file size for open file`. [#37482](https://github.com/apache/doris/pull/37482) +- Reduced access to object storage interfaces during compaction to improve performance. [#41079](https://github.com/apache/doris/pull/41079) +- Added three new query profile metrics related to inverted indexes. [#36696](https://github.com/apache/doris/pull/36696) +- Reduced cache overhead for non-PreparedStatement SQL to improve performance. [#40910](https://github.com/apache/doris/pull/40910) +- Pre-warming cache now supports inverted indexes. [#38986](https://github.com/apache/doris/pull/38986) +- Inverted indexes are now cached immediately after writing. [#39076](https://github.com/apache/doris/pull/39076) + +### Compatibility + +- Fixed the issue of Thrift ID incompatibility on the master with branch-2.1. [#41057](https://github.com/apache/doris/pull/41057) + +### Other + +- BE HTTP API now supports authentication; set config::enable_all_http_auth to true (default is false) when authentication is required. [#39577](https://github.com/apache/doris/pull/39577) +- Optimized the user permissions required for the REFRESH operation. Permissions have been relaxed from ALTER to SHOW. [#39008](https://github.com/apache/doris/pull/39008) +- Reduced the range of nextId when calling advanceNextId(). [#40160](https://github.com/apache/doris/pull/40160) +- Optimized the caching mechanism for Java UDFs. [#40404](https://github.com/apache/doris/pull/40404) + +## Bug Fixes + +### Load + +- Fixed the issue where `abortTransaction` did not handle return codes. [#41275](https://github.com/apache/doris/pull/41275) +- Fixed the issue where transactions failed to commit or abort in compute-storage decoupled mode without calling `afterCommit/afterAbort`. 
[#41267](https://github.com/apache/doris/pull/41267) +- Fixed the issue where Routine Load could not work properly when modifying consumer offsets in compute-storage decoupled mode. [#39159](https://github.com/apache/doris/pull/39159) +- Fixed the issue of repeatedly closing file handles when obtaining error log file paths. [#41320](https://github.com/apache/doris/pull/41320) +- Fixed the issue of incorrect job progress caching for Routine Load in compute-storage decoupled mode. [#39313](https://github.com/apache/doris/pull/39313) +- Fixed the issue where Routine Load could get stuck when failing to commit transactions in compute-storage decoupled mode. [#40539](https://github.com/apache/doris/pull/40539) +- Fixed the issue where Routine Load kept reporting data quality check errors in compute-storage decoupled mode. [#39790](https://github.com/apache/doris/pull/39790) +- Fixed the issue where Routine Load did not check transactions before committing in compute-storage decoupled mode. [#39775](https://github.com/apache/doris/pull/39775) +- Fixed the issue where Routine Load did not check transactions before aborting in compute-storage decoupled mode. [#40463](https://github.com/apache/doris/pull/40463) +- Fixed the issue where cluster keys did not support certain data types. [#38966](https://github.com/apache/doris/pull/38966) +- Fixed the issue of transactions being repeatedly committed. [#39786](https://github.com/apache/doris/pull/39786) +- Fixed the issue of use after free with WAL when BE exits. [#33131](https://github.com/apache/doris/pull/33131) +- Fixed the issue where WAL playback did not skip completed import transactions in compute-storage decoupled mode. [#41262](https://github.com/apache/doris/pull/41262) +- Fixed the logic for selecting BE in group commit in compute-storage decoupled mode. 
[#39986](https://github.com/apache/doris/pull/39986) [#38644](https://github.com/apache/doris/pull/38644) +- Fixed the issue where BE might crash when group commit was enabled for insert into. [#39339](https://github.com/apache/doris/pull/39339) +- Fixed the issue where insert into with group commit enabled might get stuck. [#39391](https://github.com/apache/doris/pull/39391) +- Fixed the issue where not enabling the group commit option during import might result in a table not found error. [#39731](https://github.com/apache/doris/pull/39731) +- Fixed the issue of transaction submission timeouts due to too many tablets. [#40031](https://github.com/apache/doris/pull/40031) +- Fixed the issue of concurrent opens with Auto Partition. [#38605](https://github.com/apache/doris/pull/38605) +- Fixed the issue of import lock granularity being too large. [#40134](https://github.com/apache/doris/pull/40134) +- Fixed the issue of coredumps caused by zero-length varchars. [#40940](https://github.com/apache/doris/pull/40940) +- Fixed the issue of incorrect index Id values in log prints. [#38790](https://github.com/apache/doris/pull/38790) +- Fixed the issue of memtable shifting not closing BRPC streaming. [#40105](https://github.com/apache/doris/pull/40105) +- Fixed the issue of inaccurate bvar statistics during memtable shifting. [#39075](https://github.com/apache/doris/pull/39075) +- Fixed the issue of multi-replication fault tolerance during memtable shifting. [#38003](https://github.com/apache/doris/pull/38003) +- Fixed the issue of incorrect message length calculations for Routine Load with multiple tables in one stream. [#40367](https://github.com/apache/doris/pull/40367) +- Fixed the issue of inaccurate progress reporting for Broker Load. [#40325](https://github.com/apache/doris/pull/40325) +- Fixed the issue of inaccurate data scan volume reporting for Broker Load. 
[#40694](https://github.com/apache/doris/pull/40694) +- Fixed the issue of concurrency with Routine Load in compute-storage decoupled mode. [#39242](https://github.com/apache/doris/pull/39242) +- Fixed the issue of Routine Load jobs being canceled in compute-storage decoupled mode. [#39514](https://github.com/apache/doris/pull/39514) +- Fixed the issue of progress not being reset when deleting Kafka topics. [#38474](https://github.com/apache/doris/pull/38474) +- Fixed the issue of updating progress during transaction state transitions in Routine Load. [#39311](https://github.com/apache/doris/pull/39311) +- Fixed the issue of Routine Load switching from a paused state to a paused state. [#40728](https://github.com/apache/doris/pull/40728) +- Fixed the issue of Stream Load records being missed due to database deletion. [#39360](https://github.com/apache/doris/pull/39360) + +### Storage + +- Fixed the issue of missing storage policies. [#38700](https://github.com/apache/doris/pull/38700) +- Fixed the issue of errors during cross-version backup and recovery. [#38370](https://github.com/apache/doris/pull/38370) +- Fixed the NPE issue with ccr binlog. [#39909](https://github.com/apache/doris/pull/39909) +- Fixed potential issues with duplicate keys in mow. [#41309](https://github.com/apache/doris/pull/41309) [#39791](https://github.com/apache/doris/pull/39791) [#39958](https://github.com/apache/doris/pull/39958) [#38369](https://github.com/apache/doris/pull/38369) [#38331](https://github.com/apache/doris/pull/38331) +- Fixed the issue of not being able to write after backup and recovery in high-frequency write scenarios. [#40118](https://github.com/apache/doris/pull/40118) [#38321](https://github.com/apache/doris/pull/38321) +- Fixed the issue of data errors potentially triggered by deleting empty strings and schema changes. [#41064](https://github.com/apache/doris/pull/41064) +- Fixed the issue of incorrect statistics due to column updates. 
[#40880](https://github.com/apache/doris/pull/40880) +- Limited the size of tablet meta pb to prevent BE crashes due to oversized meta. [#39455](https://github.com/apache/doris/pull/39455) +- Fixed the potential column misalignment issue with the new optimizer in `begin; insert into values; commit`. [#39295](https://github.com/apache/doris/pull/39295) + +### Compute-Storage Decoupled + +- Fixed the issue where the tablet distribution might be inconsistent across multiple FEs in compute-storage decoupled mode. [#41458](https://github.com/apache/doris/pull/41458) +- Fixed the issue where TVF might not work in multi-computing group environments. [#39249](https://github.com/apache/doris/pull/39249) +- Fixed the issue where compaction used resources that had already been released when BE exited in compute-storage decoupled mode. [#39302](https://github.com/apache/doris/pull/39302) +- Fixed the issue where automatic start-stop might cause FE replay to get stuck. [#40027](https://github.com/apache/doris/pull/40027) +- Fixed the issue where the BE status and the stored status in meta-service were inconsistent. [#40799](https://github.com/apache/doris/pull/40799) +- Fixed the issue where the FE->meta-service connection pool could not automatically expire and reconnect. [#41202](https://github.com/apache/doris/pull/41202) [#40661](https://github.com/apache/doris/pull/40661) +- Fixed the issue where some tablets might repeatedly undergo unexpected balance processes during rebalance. [#39792](https://github.com/apache/doris/pull/39792) +- Fixed the issue where storage vault permissions were lost after FE restarted. [#40260](https://github.com/apache/doris/pull/40260) +- Fixed the issue where tablet row counts and other statistical information might be incomplete due to FDB scan range pagination. [#40494](https://github.com/apache/doris/pull/40494) +- Fixed the performance issue caused by a large number of aborted transactions associated with the same label. 
[#40606](https://github.com/apache/doris/pull/40606) +- Fixed the issue where `commit_txn` did not automatically re-enter, maintaining consistent behavior between compute-storage decoupled and integrated modes. [#39615](https://github.com/apache/doris/pull/39615) +- Fixed the issue where the number of projected columns increased when dropping columns. [#40187](https://github.com/apache/doris/pull/40187) +- Fixed the issue where delete statements did not correctly handle return values, causing data to still be visible after deletion. [#39428](https://github.com/apache/doris/pull/39428) +- Fixed the coredump issue caused by rowset metadata competition during file cache preheating. [#39361](https://github.com/apache/doris/pull/39361) +- Fixed the issue where the entire cache space would be used up when TTL cache enabled LRU eviction. [#39814](https://github.com/apache/doris/pull/39814) +- Fixed the issue where temporary files could not be recycled when importing commit rowset failed with HDFS storage backend. [#40215](https://github.com/apache/doris/pull/40215) + +### Lakehouse + +- Fixed some issues with predicate pushdown in JDBC Catalog. [#39064](https://github.com/apache/doris/pull/39064) +- Fixed the issue of not being able to read when `STRUCT` type columns are missing in Parquet format. [#38718](https://github.com/apache/doris/pull/38718) +- Fixed the issue of FileSystem leaks on the FE side in some cases. [#38610](https://github.com/apache/doris/pull/38610) +- Fixed the issue of metadata cache information being inconsistent when Hive/Iceberg tables write back in some cases. [#40729](https://github.com/apache/doris/pull/40729) +- Fixed the issue of unstable partition ID generation for external tables in some cases. [#39325](https://github.com/apache/doris/pull/39325) +- Fixed the issue of external table queries selecting BE nodes in the blacklist in some cases.
[#39451](https://github.com/apache/doris/pull/39451) +- Optimized the timeout time for batch retrieval of external table partition information to avoid long-term thread occupation. [#39346](https://github.com/apache/doris/pull/39346) +- Fixed the issue of memory leaks when querying Hudi tables in some cases. [#41256](https://github.com/apache/doris/pull/41256) +- Fixed the issue of connection pool connection leaks in JDBC Catalog in some cases. [#39582](https://github.com/apache/doris/pull/39582) +- Fixed the issue of BE memory leaks in JDBC Catalog in some cases. [#41041](https://github.com/apache/doris/pull/41041) +- Fixed the issue of not being able to query Hudi data on Alibaba Cloud OSS. [#41316](https://github.com/apache/doris/pull/41316) +- Fixed the issue of not being able to read empty partitions in MaxCompute. [#40046](https://github.com/apache/doris/pull/40046) +- Fixed the issue of poor performance when querying Oracle through JDBC Catalog. [#41513](https://github.com/apache/doris/pull/41513) +- Fixed the issue of BE crashes when querying deletion vector of Paimon tables after enabling file cache features. [#39877](https://github.com/apache/doris/pull/39877) +- Fixed the issue of not being able to access Paimon tables on HDFS clusters with HA enabled. [#39806](https://github.com/apache/doris/pull/39806) +- Temporarily disabled the page index filtering feature of Parquet to avoid potential issues. [#38691](https://github.com/apache/doris/pull/38691) +- Fixed the issue of not being able to read unsigned types in Parquet files. [#39926](https://github.com/apache/doris/pull/39926) +- Fixed the issue of potential infinite loops when reading Parquet files in some cases. [#39523](https://github.com/apache/doris/pull/39523) + +### Asynchronous Materialized Views + +- Fixed the issue where partition construction might select the wrong table to track partitions if both sides have the same column names. 
[#40810](https://github.com/apache/doris/pull/40810) +- Fixed the issue where transparent rewrite partition compensation might result in incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed the issue where transparent rewrite did not take effect on external tables. [#38909](https://github.com/apache/doris/pull/38909) +- Fixed the issue where nested materialized views might not refresh properly. [#40433](https://github.com/apache/doris/pull/40433) + +### Synchronous Materialized Views + +- Fixed the issue where creating synchronous materialized views on MOW tables might result in incorrect query results. [#39171](https://github.com/apache/doris/pull/39171) + +### Query Optimizer + +- Fixed the issue where existing synchronous materialized views might not be usable after upgrading. [#41283](https://github.com/apache/doris/pull/41283) +- Fixed the issue of not correctly handling milliseconds when comparing datetime literals. [#40121](https://github.com/apache/doris/pull/40121) +- Fixed the issue of potential errors in conditional function partition pruning. [#39298](https://github.com/apache/doris/pull/39298) +- Fixed the issue where MOW tables with synchronous materialized views could not perform delete operations. [#39578](https://github.com/apache/doris/pull/39578) +- Fixed the issue where the nullable of slots in JDBC external table query predicates might be incorrectly planned, causing query errors. [#41014](https://github.com/apache/doris/pull/41014) + +### Query Execution + +- Fixed the memory leak issue caused by the use of runtime filters. [#39155](https://github.com/apache/doris/pull/39155) +- Fixed the issue of excessive memory usage by window functions. [#39581](https://github.com/apache/doris/pull/39581) +- Fixed a series of function compatibility issues during rolling upgrades. 
[#41023](https://github.com/apache/doris/pull/41023) [#40438](https://github.com/apache/doris/pull/40438) [#39648](https://github.com/apache/doris/pull/39648) +- Fixed the issue of incorrect results with `encryption_function` when used with constants. [#40201](https://github.com/apache/doris/pull/40201) +- Fixed the issue of errors when importing single-table materialized views. [#39061](https://github.com/apache/doris/pull/39061) +- Fixed the issue of incorrect partition result calculations for window functions. [#39100](https://github.com/apache/doris/pull/39100) [#40761](https://github.com/apache/doris/pull/40761) +- Fixed the issue of incorrect calculations for topn when null values are present. [#39497](https://github.com/apache/doris/pull/39497) +- Fixed the issue of incorrect results with the `map_agg` function. [#39743](https://github.com/apache/doris/pull/39743) +- Fixed the issue of incorrect messages returned by cancel. [#38982](https://github.com/apache/doris/pull/38982) +- Fixed the issue of BE core dumps caused by encrypt and decrypt functions. [#40726](https://github.com/apache/doris/pull/40726) +- Fixed the issue of queries getting stuck due to too many scanners in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Supported time types in runtime filters. [#38258](https://github.com/apache/doris/pull/38258) +- Fixed the issue of incorrect results with window funnel functions. [#40960](https://github.com/apache/doris/pull/40960) + +### Semi-Structured Data Management + +- Fixed the issue of match function errors when no indexes were present. [#38989](https://github.com/apache/doris/pull/38989) +- Fixed the issue of crashes when ARRAY data types were used as parameters for array_min/array_max functions. [#39492](https://github.com/apache/doris/pull/39492) +- Fixed the issue of nullable with the `array_enumerate_uniq` function. 
[#38384](https://github.com/apache/doris/pull/38384) +- Fixed the issue of bloomfilter indexes not being updated when adding or deleting columns. [#38431](https://github.com/apache/doris/pull/38431) +- Fixed the issue of es-catalog parsing exceptions with array data. [#39104](https://github.com/apache/doris/pull/39104) +- Fixed the issue of improper predicate push-down in es-catalog. [#40111](https://github.com/apache/doris/pull/40111) +- Fixed the issue of exceptions caused by modifying input data with `map()` and `struct()` functions. [#39699](https://github.com/apache/doris/pull/39699) +- Fixed the issue of index compaction crashes in special cases. [#40294](https://github.com/apache/doris/pull/40294) +- Fixed the issue of ARRAY type inverted indexes missing nullbitmaps. [#38907](https://github.com/apache/doris/pull/38907) +- Fixed the issue of incorrect results with the `count()` function on inverted indexes. [#41152](https://github.com/apache/doris/pull/41152) +- Fixed the issue of incorrect results with the `explode_map` function when using aliases. [#39757](https://github.com/apache/doris/pull/39757) +- Fixed the issue of VARIANT type not being able to use row storage for exceptional JSON data. [#39394](https://github.com/apache/doris/pull/39394) +- Fixed the issue of memory leaks when returning ARRAY results with VARIANT type. [#41358](https://github.com/apache/doris/pull/41358) +- Fixed the issue of changing column names with VARIANT type. [#40320](https://github.com/apache/doris/pull/40320) +- Fixed the issue of potential precision loss when converting VARIANT type to DECIMAL type. [#39650](https://github.com/apache/doris/pull/39650) +- Fixed the issue of nullable handling with VARIANT type. [#39732](https://github.com/apache/doris/pull/39732) +- Fixed the issue of sparse column reading with VARIANT type. [#40295](https://github.com/apache/doris/pull/40295) + +### Other + +- Fixed the compatibility issue between new and old audit log plugins.
[#41401](https://github.com/apache/doris/pull/41401) +- Fixed the issue where users could see processes of others in certain cases. [#39747](https://github.com/apache/doris/pull/39747) +- Fixed the issue where users with permissions could not export. [#38365](https://github.com/apache/doris/pull/38365) +- Fixed the issue where create table like required create permissions for the existing table. [#37879](https://github.com/apache/doris/pull/37879) +- Fixed the issue where some features did not verify permissions. [#39726](https://github.com/apache/doris/pull/39726) +- Fixed the issue of not correctly closing connections when using SSL. [#38587](https://github.com/apache/doris/pull/38587) +- Fixed the issue where executing ALTER VIEW operations in some cases caused FE to fail to start. [#40872](https://github.com/apache/doris/pull/40872) \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md new file mode 100644 index 0000000000000..b15777212b400 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md @@ -0,0 +1,226 @@ +--- +{ + "title": "Release 3.0.3", + "language": "en" +} +--- + + + + +Dear community members, Apache Doris 3.0.3 was officially released on December 02, 2024. This version further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +- Prohibited column updates on MOW tables with synchronous materialized views. [#40190](https://github.com/apache/doris/pull/40190) +- Adjusted the default parameters of RoutineLoad to improve import efficiency. [#42968](https://github.com/apache/doris/pull/42968) +- When StreamLoad fails, the return value of LoadedRows is adjusted to 0.
[#41946](https://github.com/apache/doris/pull/41946) [#42291](https://github.com/apache/doris/pull/42291) +- Adjusted the default memory limit of Segment cache to 5%. [#42308](https://github.com/apache/doris/pull/42308) [#42436](https://github.com/apache/doris/pull/42436) + +## New Features + +- Introduced the session variable `enable_cooldown_replica_affinity` to control the affinity of cold and hot tiered replicas. [#42677](https://github.com/apache/doris/pull/42677) + +- Added `table$partition` syntax for querying partition information of Hive tables. [#40774](https://github.com/apache/doris/pull/40774) + + - [View Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/hive) + +- Supported creation of Hive tables in Text format. [#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) + + - [View Documentation](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + +### Asynchronous Materialized Views + +- Introduced new materialized view attribute `use_for_rewrite`. When `use_for_rewrite` is set to false, the materialized view does not participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Optimizer + +- Supported correlated non-aggregate subqueries. [#42236](https://github.com/apache/doris/pull/42236) + +### Query Execution + +- Added functions `ngram_search`, `normal_cdf`, `to_iso8601`, `from_iso8601_date`, `SESSION_USER()`, `last_query_id`. [#38226](https://github.com/apache/doris/pull/38226) [#40695](https://github.com/apache/doris/pull/40695) [#41075](https://github.com/apache/doris/pull/41075) [#41600](https://github.com/apache/doris/pull/41600) [#39575](https://github.com/apache/doris/pull/39575) [#40739](https://github.com/apache/doris/pull/40739) +- The `aes_encrypt` and `aes_decrypt` functions support GCM mode. 
[#40004](https://github.com/apache/doris/pull/40004) +- Profile outputs the changed session variable values. [#41016](https://github.com/apache/doris/pull/41016) [#41318](https://github.com/apache/doris/pull/41318) + +### Semi-structured Data Management + +- Added array functions `array_match_all` and `array_match_any`. [#40605](https://github.com/apache/doris/pull/40605) [#43514](https://github.com/apache/doris/pull/43514) +- The array function `array_agg` supports nesting ARRAY/MAP/STRUCT within ARRAY. [#42009](https://github.com/apache/doris/pull/42009) +- Added approximate aggregate statistical functions `approx_top_k` and `approx_top_sum`. [#44082](https://github.com/apache/doris/pull/44082) + +## Improvements + +### Storage + +- Supported `bitmap_empty` as the default value. [#40364](https://github.com/apache/doris/pull/40364) +- Introduced the session variable `insert_timeout` to control the timeout of DELETE statements. [#41063](https://github.com/apache/doris/pull/41063) +- Improved some error message prompts. [#41048](https://github.com/apache/doris/pull/41048) [#39631](https://github.com/apache/doris/pull/39631) +- Improved the priority scheduling of replica repair. [#41076](https://github.com/apache/doris/pull/41076) +- Enhanced the robustness of timezone handling when creating tables. [#41926](https://github.com/apache/doris/pull/41926) [#42389](https://github.com/apache/doris/pull/42389) +- Checked the validity of partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Supported Unicode-encoded column names in DELETE operations. [#39381](https://github.com/apache/doris/pull/39381) + +### Compute-Storage Decoupled + +- Supported ARM architecture deployment in storage and compute separation mode. 
[#42467](https://github.com/apache/doris/pull/42467) [#43377](https://github.com/apache/doris/pull/43377) +- Optimized the eviction strategy and lock competition of file cache, improving hit rate and high concurrency point query performance. [#42451](https://github.com/apache/doris/pull/42451) [#43201](https://github.com/apache/doris/pull/43201) [#41818](https://github.com/apache/doris/pull/41818) [#43401](https://github.com/apache/doris/pull/43401) +- S3 storage vault supported `use_path_style`, solving the problem of using custom domain names for object storage. [#43060](https://github.com/apache/doris/pull/43060) [#43343](https://github.com/apache/doris/pull/43343) [#43330](https://github.com/apache/doris/pull/43330) +- Optimized storage and compute separation configuration and deployment, preventing misoperations in different modes. [#43381](https://github.com/apache/doris/pull/43381) [#43522](https://github.com/apache/doris/pull/43522) [#43434](https://github.com/apache/doris/pull/43434) [#40764](https://github.com/apache/doris/pull/40764) [#43891](https://github.com/apache/doris/pull/43891) +- Optimized observability and provided an interface for deleting specified segment file cache. [#38489](https://github.com/apache/doris/pull/38489) [#42896](https://github.com/apache/doris/pull/42896) [#41037](https://github.com/apache/doris/pull/41037) [#43412](https://github.com/apache/doris/pull/43412) +- Optimized Meta-service operation and maintenance interface: RPC rate limiting and tablet metadata correction. [#42413](https://github.com/apache/doris/pull/42413) [#43884](https://github.com/apache/doris/pull/43884) [#41782](https://github.com/apache/doris/pull/41782) [#43460](https://github.com/apache/doris/pull/43460) + +### Lakehouse + +- Paimon Catalog supported Alibaba Cloud DLF and OSS-HDFS storage. 
[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) + + - View [Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/paimon) + +- Supported reading of Hive tables in OpenCSV format. [#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) +- Optimized the performance of accessing the `information_schema.columns` table in External Catalog. [#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) +- Used the new Max Compute open storage API to access Max Compute data sources. [#41614](https://github.com/apache/doris/pull/41614) +- Optimized the scheduling policy of the JNI part of Paimon tables, making scan tasks more balanced. [#43310](https://github.com/apache/doris/pull/43310) +- Optimized the read performance of small ORC files. [#42004](https://github.com/apache/doris/pull/42004) [#43467](https://github.com/apache/doris/pull/43467) +- Supported reading of parquet files in brotli compressed format. [#42177](https://github.com/apache/doris/pull/42177) +- Added `file_cache_statistics` table under the `information_schema` library to view metadata cache statistics. [#42160](https://github.com/apache/doris/pull/42160) + +### Query Optimizer + +- Optimization: When queries only differ in comments, the same SQL Cache can be reused. [#40049](https://github.com/apache/doris/pull/40049) +- Optimization: Improved the stability of statistical information when data is frequently updated. [#43865](https://github.com/apache/doris/pull/43865) [#39788](https://github.com/apache/doris/pull/39788) [#43009](https://github.com/apache/doris/pull/43009) [#40457](https://github.com/apache/doris/pull/40457) [#42409](https://github.com/apache/doris/pull/42409) [#41894](https://github.com/apache/doris/pull/41894) +- Optimization: Enhanced the stability of constant folding. 
[#42910](https://github.com/apache/doris/pull/42910) [#41164](https://github.com/apache/doris/pull/41164) [#39723](https://github.com/apache/doris/pull/39723) [#41394](https://github.com/apache/doris/pull/41394) [#42256](https://github.com/apache/doris/pull/42256) [#40441](https://github.com/apache/doris/pull/40441) +- Optimization: Column pruning can generate better execution plans. [#41719](https://github.com/apache/doris/pull/41719) [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Optimized the memory usage of the sort operator. [#39306](https://github.com/apache/doris/pull/39306) +- Optimized the performance of computations on ARM. [#38888](https://github.com/apache/doris/pull/38888) [#38759](https://github.com/apache/doris/pull/38759) +- Optimized the computational performance of a series of functions. [#40366](https://github.com/apache/doris/pull/40366) [#40821](https://github.com/apache/doris/pull/40821) [#40670](https://github.com/apache/doris/pull/40670) [#41206](https://github.com/apache/doris/pull/41206) [#40162](https://github.com/apache/doris/pull/40162) +- Used SSE instructions to optimize the performance of the `match_ipv6_subnet` function. [#38755](https://github.com/apache/doris/pull/38755) +- Supported automatic creation of new partitions during insert overwrite. [#38628](https://github.com/apache/doris/pull/38628) [#42645](https://github.com/apache/doris/pull/42645) +- Added the status of each PipelineTask in Profile. [#42981](https://github.com/apache/doris/pull/42981) +- IP type supported runtime filter. [#39985](https://github.com/apache/doris/pull/39985) + +### Semi-structured Data Management + +- Output the real SQL of prepared statements in audit logs. [#43321](https://github.com/apache/doris/pull/43321) +- The filebeat doris output plugin supports fault tolerance and progress reporting. [#36355](https://github.com/apache/doris/pull/36355) +- Optimized the performance of inverted index queries. 
[#41547](https://github.com/apache/doris/pull/41547) [#41585](https://github.com/apache/doris/pull/41585) [#41567](https://github.com/apache/doris/pull/41567) [#41577](https://github.com/apache/doris/pull/41577) [#42060](https://github.com/apache/doris/pull/42060) [#42372](https://github.com/apache/doris/pull/42372) +- The array function `array overlaps` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- The IP function `is_ip_address_in_range` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- Optimized the CAST performance of the VARIANT data type. [#41775](https://github.com/apache/doris/pull/41775) [#42438](https://github.com/apache/doris/pull/42438) [#43320](https://github.com/apache/doris/pull/43320) +- Optimized the CPU resource consumption of the Variant data type. [#42856](https://github.com/apache/doris/pull/42856) [#43062](https://github.com/apache/doris/pull/43062) [#43634](https://github.com/apache/doris/pull/43634) +- Optimized the metadata and execution memory resource consumption of the Variant data type. [#42448](https://github.com/apache/doris/pull/42448) [#43326](https://github.com/apache/doris/pull/43326) [#41482](https://github.com/apache/doris/pull/41482) [#43093](https://github.com/apache/doris/pull/43093) [#43567](https://github.com/apache/doris/pull/43567) [#43620](https://github.com/apache/doris/pull/43620) + +### Permissions + +- Added a new configuration item `ldap_group_filter` in LDAP for custom group filtering. [#43292](https://github.com/apache/doris/pull/43292) + +### Other + +- Supported displaying connection count information by user in FE monitoring items. [#39200](https://github.com/apache/doris/pull/39200) + +## Bug Fixes + +### Storage + +- Fixed the issue with using IPv6 hostnames. [#40074](https://github.com/apache/doris/pull/40074) +- Fixed the inaccurate display of broker/s3 load progress. 
[#43535](https://github.com/apache/doris/pull/43535) +- Fixed the issue where queries might hang from FE. [#41303](https://github.com/apache/doris/pull/41303) [#42382](https://github.com/apache/doris/pull/42382) +- Fixed the issue of duplicate auto-increment IDs under exceptional circumstances. [#43774](https://github.com/apache/doris/pull/43774) [#43983](https://github.com/apache/doris/pull/43983) +- Fixed occasional NPE issues with groupcommit. [#43635](https://github.com/apache/doris/pull/43635) +- Fixed the inaccurate calculation of auto bucket. [#41675](https://github.com/apache/doris/pull/41675) [#41835](https://github.com/apache/doris/pull/41835) +- Fixed the issue where FE might not correctly plan multi-table flows after restart. [#41677](https://github.com/apache/doris/pull/41677) [#42290](https://github.com/apache/doris/pull/42290) + +### Compute-Storage Decoupled + +- Fixed the issue that MOW primary key tables with large delete bitmaps might cause coredump. [#43088](https://github.com/apache/doris/pull/43088) [#43457](https://github.com/apache/doris/pull/43457) [#43479](https://github.com/apache/doris/pull/43479) [#43407](https://github.com/apache/doris/pull/43407) [#43297](https://github.com/apache/doris/pull/43297) [#43613](https://github.com/apache/doris/pull/43613) [#43615](https://github.com/apache/doris/pull/43615) [#43854](https://github.com/apache/doris/pull/43854) [#43968](https://github.com/apache/doris/pull/43968) [#44074](https://github.com/apache/doris/pull/44074) [#41793](https://github.com/apache/doris/pull/41793) [#42142](https://github.com/apache/doris/pull/42142) +- Fixed the issue that segment files, when being a multiple of 5MB, would fail to upload objects. [#43254](https://github.com/apache/doris/pull/43254) +- Fixed the issue that the default retry policy of aws sdk did not take effect. 
[#43575](https://github.com/apache/doris/pull/43575) [#43648](https://github.com/apache/doris/pull/43648) +- Fixed the issue that altering storage vault could continue execution even when the wrong type was specified. [#43489](https://github.com/apache/doris/pull/43489) [#43352](https://github.com/apache/doris/pull/43352) [#43495](https://github.com/apache/doris/pull/43495) +- Fixed the issue that tablet_id might be 0 during the delayed commit process of large transactions. [#42043](https://github.com/apache/doris/pull/42043) [#42905](https://github.com/apache/doris/pull/42905) +- Fixed the issue that constant folding RCP and FE forwarding SQL might not be executed in the expected computation group. [#43110](https://github.com/apache/doris/pull/43110) [#41819](https://github.com/apache/doris/pull/41819) [#41846](https://github.com/apache/doris/pull/41846) +- Fixed the issue that meta-service did not strictly check instance_id upon receiving RPC. [#43253](https://github.com/apache/doris/pull/43253) [#43832](https://github.com/apache/doris/pull/43832) +- Fixed the issue that FE follower information_schema version did not update in time. [#43496](https://github.com/apache/doris/pull/43496) +- Fixed the issue of atomicity in file cache rename and inaccurate metrics. [#42869](https://github.com/apache/doris/pull/42869) [#43504](https://github.com/apache/doris/pull/43504) [#43220](https://github.com/apache/doris/pull/43220) + +### Lakehouse + +- Prohibited implicit conversion predicates from being pushed down to JDBC data sources to avoid inconsistent query results. [#42102](https://github.com/apache/doris/pull/42102) +- Fixed some read issues with high-version Hive transactional tables. [#42226](https://github.com/apache/doris/pull/42226) +- Fixed the issue that the Export command might cause deadlocks. 
[#43083](https://github.com/apache/doris/pull/43083) [#43402](https://github.com/apache/doris/pull/43402) +- Fixed the issue of being unable to query Hive views created by Spark. [#43552](https://github.com/apache/doris/pull/43552) +- Fixed the issue that Hive partition paths containing special characters led to incorrect partition pruning. [#42906](https://github.com/apache/doris/pull/42906) +- Fixed the issue that Iceberg Catalog could not use AWS Glue. [#41084](https://github.com/apache/doris/pull/41084) + +### Asynchronous Materialized Views + +- Fixed the issue that asynchronous materialized views might not refresh after the base table is rebuilt. [#41762](https://github.com/apache/doris/pull/41762) + +### Query Optimizer + +- Fixed the issue that partition pruning results might be incorrect when using multi-column range partitioning. [#43332](https://github.com/apache/doris/pull/43332) +- Fixed the issue of incorrect calculation results in some limit offset scenarios. [#42576](https://github.com/apache/doris/pull/42576) + +### Query Execution + +- Fixed the issue that hash join with array types larger than 4G could cause BE Core. [#43861](https://github.com/apache/doris/pull/43861) +- Fixed the issue that is null predicate operations might yield incorrect results in some scenarios. [#43619](https://github.com/apache/doris/pull/43619) +- Fixed the issue that bitmap types might produce incorrect output results in hash join. [#43718](https://github.com/apache/doris/pull/43718) +- Fixed some issues where function results were calculated incorrectly. 
[#40710](https://github.com/apache/doris/pull/40710) [#39358](https://github.com/apache/doris/pull/39358) [#40929](https://github.com/apache/doris/pull/40929) [#40869](https://github.com/apache/doris/pull/40869) [#40285](https://github.com/apache/doris/pull/40285) [#39891](https://github.com/apache/doris/pull/39891) [#40530](https://github.com/apache/doris/pull/40530) [#41948](https://github.com/apache/doris/pull/41948) [#43588](https://github.com/apache/doris/pull/43588) +- Fixed some issues with JSON type parsing. [#39937](https://github.com/apache/doris/pull/39937) +- Fixed issues with varchar and char types in runtime filter operations. [#43758](https://github.com/apache/doris/pull/43758) [#43919](https://github.com/apache/doris/pull/43919) +- Fixed some issues with the use of decimal256 in scalar and aggregate functions. [#42136](https://github.com/apache/doris/pull/42136) [#42356](https://github.com/apache/doris/pull/42356) +- Fixed the issue that arrow flight reported `Reach limit of connections` errors upon connection. [#39127](https://github.com/apache/doris/pull/39127) +- Fixed the issue of incorrect memory usage statistics for BE in k8s environments. [#41123](https://github.com/apache/doris/pull/41123) + +### Semi-structured Data Management + +- Adjusted the default values of `segment_cache_fd_percentage` and `inverted_index_fd_number_limit_percent`. [#42224](https://github.com/apache/doris/pull/42224) +- logstash now supports group_commit. [#40450](https://github.com/apache/doris/pull/40450) +- Fixed the issue of coredump when building index. [#43246](https://github.com/apache/doris/pull/43246) [#43298](https://github.com/apache/doris/pull/43298) +- Fixed issues with variant index. [#43375](https://github.com/apache/doris/pull/43375) [#43773](https://github.com/apache/doris/pull/43773) +- Fixed potential fd and memory leaks under abnormal compaction circumstances. 
[#42374](https://github.com/apache/doris/pull/42374) +- Inverted index match null now correctly returns null instead of false. [#41786](https://github.com/apache/doris/pull/41786) +- Fixed the issue of coredump when ngram bloomfilter index bf_size is set to 65536. [#43645](https://github.com/apache/doris/pull/43645) +- Fixed the issue of potential coredump during complex data type JOINs. [#40398](https://github.com/apache/doris/pull/40398) +- Fixed the issue of coredump with TVF JSON data. [#43187](https://github.com/apache/doris/pull/43187) +- Fixed the precision issue of bloom filter calculations for dates and times. [#43612](https://github.com/apache/doris/pull/43612) +- Fixed the issue of coredump with IPv6 type storage. [#43251](https://github.com/apache/doris/pull/43251) +- Fixed the issue of coredump when using VARIANT type with light_schema_change disabled. [#40908](https://github.com/apache/doris/pull/40908) +- Improved cache performance for high-concurrency point queries. [#44077](https://github.com/apache/doris/pull/44077) +- Fixed the issue that bloom filter indexes were not synchronized when columns were deleted. [#43378](https://github.com/apache/doris/pull/43378) +- Fixed instability issues with es catalog under special circumstances such as mixed array and scalar data. [#40314](https://github.com/apache/doris/pull/40314) [#40385](https://github.com/apache/doris/pull/40385) [#43399](https://github.com/apache/doris/pull/43399) [#40614](https://github.com/apache/doris/pull/40614) +- Fixed coredump issues caused by abnormal regular pattern matching. [#43394](https://github.com/apache/doris/pull/43394) + +### Permissions + +- Fixed several issues where permissions were not properly restricted after authorization. 
[#43193](https://github.com/apache/doris/pull/43193) [#41723](https://github.com/apache/doris/pull/41723) [#42107](https://github.com/apache/doris/pull/42107) [#43306](https://github.com/apache/doris/pull/43306) +- Enhanced several permission checks. [#40688](https://github.com/apache/doris/pull/40688) [#40533](https://github.com/apache/doris/pull/40533) [#41791](https://github.com/apache/doris/pull/41791) [#42106](https://github.com/apache/doris/pull/42106) + +### Other + +- Supplemented missing audit log fields in audit log tables and files. [#43303](https://github.com/apache/doris/pull/43303) + + - [View Documentation](https://doris.apache.org/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we realized the full vectorization of the computing layer and storage layer, and officially enabled the vectorized execution engine as a stable function. All queries are executed by the vectorized execution engine by default, and the performance is 3-5 times higher than the previous version. It increases the ability to access the external tables of Apache Iceberg and supports federated query of data in Doris and Iceberg, and expands the analysis capabilities of Apache Doris on the data lake; on the basis of the original LZ4, the ZSTD compression algorithm is added , further improves the data compression rate; fixed many performance and stability problems in previous versions, greatly improving system stability. Downloading and using is recommended. 
+ +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users had to enable it manually when executing queries by setting the session variables `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, the vectorized execution engine is officially enabled as a stable feature. The session variable `enable_vectorized_engine` is set to true by default, so all queries run through the vectorized execution engine by default. + +### BE Binary File Renaming + +The BE binary file has been renamed from palo_be to doris_be. If you rely on process names for cluster management or other operations, update the relevant scripts accordingly. + +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization, and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while compatibility with the Segment V1 format is maintained. + +To keep the code structure maintainable and reduce the additional learning and development costs caused by redundant historical code, we have decided to stop supporting the Segment V1 storage format from the next version. This part of the code is expected to be deleted in Apache Doris 1.2. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website.
+ +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table, you can set `DISTRIBUTED BY random BUCKETS number` to use random distribution. During import, data is randomly written to a single tablet, which reduces data fanout during loading, lowers resource overhead, and improves system stability. + +### Support for creating Iceberg external tables [experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. Through Iceberg external tables, federated queries on data stored in local storage and Iceberg can be implemented, which saves tedious data loading work, simplifies the system architecture for data analysis, and enables more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying data, and supports automatic synchronization of all table schemas in the Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. For some scenarios that are sensitive to data storage costs, the default compression ratio cannot meet requirements. + +In version 1.1, users can set "compression"="zstd" in the table properties to specify the compression method as ZSTD when creating a table.
In the 25GB 110 million lines of text log test data, the highest compression rate is nearly 10 times, which is 53% higher than the original compression rate, and the speed of reading data from disk and decompressing it is increased by 30%. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +Implemented vectorization of all built-in functions + +The storage layer implements vectorization and supports dictionary optimization for low-cardinality string columns + +Optimized and resolved numerous performance and stability issues with the vectorization engine. + +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 SQLs in the SSB test data set, version 1.1 is better than version 0.15, and the overall performance is improved by about 3 times, which solves the problem of performance degradation in some scenarios in version 1.0; + +On all 22 SQLs in the TPC-H test data set, version 1.1 is better than version 0.15, the overall performance is improved by about 4.5 times, and the performance of some scenarios is improved by more than ten times; + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit generates a data version. In highly concurrent write scenarios, -235 errors are prone to occur because data versions accumulate faster than compaction can merge them, and query performance decreases accordingly. + +In version 1.1, we introduced QuickCompaction, which actively triggers compaction when the number of data versions increases. At the same time, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Through active triggering and passive scanning, the real-time problem of data merging is completely solved. + +In addition, for high-frequency cumulative compaction of small files, the scheduling and isolation of compaction tasks is implemented to prevent the heavyweight base compaction from affecting the merging of new data. + +Finally, the strategy for merging small files is optimized to use gradient merging. Each merge only involves files of the same order of magnitude in size, which prevents versions that differ greatly in size from being merged together and lets merging proceed level by level. This reduces the number of times a single file participates in merging and greatly reduces the CPU consumption of the system. + +When the data upstream maintains a write frequency of 100,000 rows per second (20 concurrent write tasks, 5000 rows per job, and checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable.
Compared with the -235 problem that frequently occurred during high concurrent writing in the previous version, the compaction merge efficiency has been improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The strategy has been optimized for small file Compaction. In the above scenario of high concurrent writing, CPU resource consumption is reduced by 25%; + +- Stable query time consumption: The overall orderliness of data is improved, and the fluctuation of query time consumption is greatly reduced. The query time consumption during high concurrent writing is the same as that of only querying, and the query performance is improved by 3-4 times compared with the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting arrow parameters, arrow's multi-threaded read capability is used to speed up Arrow's reading of each row_group, and it is modified to SPSC model to reduce the cost of waiting for the network through prefetching. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after the metadata checkpoint and retaining the function of historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that the data cannot be queried due to the missing data version.(Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that the resource isolation is invalid for the resource usage limit of loading tasks (Moderate) + +In 1.1, the broker load and routine load will use Backends with specified resource tags to do the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +it may cause data transmission errors. 
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default; please use the `/_stream_load` interface instead. +You can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format is no longer allowed to be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster. If it does, convert it to SegmentV2 with the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +Version 1.1 limits the maximum length of the String type to 1MB. Strings longer than this length can no longer be written. +At the same time, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Updated the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems during use, please feel free to contact us through the GitHub discussion forum or the dev mailing list at any time.
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature is available in the non-vectorized engine but was missing from the vectorized engine in 1.1, so it is added back in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +In 1.1, BE has no memtracker for the vectorized engine, so memory usage is uncontrolled and can cause OOM. In 1.1.1, a simple memtracker is added to BE; it controls memory usage and cancels the query when the memory limit is exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is compressed using bitshuffle, and decompressing it during queries takes a lot of time. In 1.1.1, Doris caches the decompressed bitshuffle-encoded data in the page cache to accelerate queries; we found that this can reduce latency by 30% for some queries in ssb-flat. + +## Bug Fix + +### Fix the problem that rolling upgrade from 1.0 could not be performed. (Serious) + +This issue was introduced in version 1.1 and may cause BE to core when BE is upgraded but FE is not. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some queries do not fall back to the non-vectorized engine, causing BE to core. + +Currently, the vectorized engine cannot handle all SQL queries, and some queries (such as left outer join) run on the non-vectorized engine. Some cases were not covered in 1.1, which caused BE to crash. + +### Compaction does not work correctly and causes -235 error. + +When one rowset has multiple segments in unique key compaction, segment rows are merged in generic_iterator but merged_rows is not increased. Compaction then fails in check_correctness, leaving the tablet with too many versions, which leads to the -235 load error. + +### Some segmentation fault cases during query.
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 170 issues or performance improvement since 1.1.1. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + +# Features + +### New MemTracker + +Introduced new MemTracker for both vectorized engine and non-vectorized engine which is more accurate. + +### Add API for showing current queries and kill query + +### Support read/write emoji of UTF16 via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC File scan performance about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when query Iceberg table. + +- Support query Hive data created by [Apache Tez](https://tez.apache.org/) + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add reuse block in non-vectorized engine and have 50% performance improvement in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve like or regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +It will have 40% performance gains in load or query. + +Currently it is a config, you can change it by set config `tc_enable_aggressive_memory_decommit`. 
+ +# Bug Fix + +### Some FE issues that cause FE failure or data corruption. + +- Add a reserved disk config to avoid too many reserved BDB-JE files. **(Serious)** In an HA environment, BDB-JE retains as many reserved files as possible and does not delete BDB-JE logs until the disk limit is approached. + +- Fix a fatal bug in BDB-JE that prevents an FE replica from starting correctly or corrupts data. **(Serious)** + +### FE will hang on waitFor_rpc during query and BE will hang in high concurrent scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong result. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause BE core or leave BE in an abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are worried about the impact of this feature on your business after upgrade, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* ``disable_storage_page_cache``: Whether to disable Storage Page Cache. version 1.1.2 (inclusive), the default is false, i.e., on. version 1.1.3 defaults to true, i.e., off. +* `chunk_reserved_bytes_limit`: Chunk allocator reserved memory size. 1.1.2 (and earlier), the default is 10% of the overall memory. 1.1.3 version default is 209715200 (200MB). + diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 60 issues or performance improvement since 1.1.3. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- SparkLoad support parquet and orc file.[#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load.[#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- If function return error result when there is nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates. 
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crash when call function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name is not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column when use MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader release memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix allow create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after subsitute using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump on case when have 1000 condition. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that last line of data lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 36 issues or performance improvement since 1.1.4. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. 
+ +# Behavior Changes + +When an alias has the same name as the original column, as in "select year(birthday) as birthday", and is used in a group by, order by, or having clause, Doris previously behaved differently from MySQL. In this release, Doris follows MySQL's behavior: group by and having clauses resolve to the original column first, while order by resolves to the alias first. Because this can be confusing, we recommend not using an alias that is the same as an original column name. + +# Features + +Add support of murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort result by tablename when calling show clause. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix coredump when there is an if constant expr in select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update high_priority_flush_thread_num_per_store default value to 6 to improve load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +Spark load throws an IndexOutOfBounds error when the partition column is not a duplicate key. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when having sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Use avg rowset to calculate batch size instead of total_bytes, since the latter costs a lot of CPU. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc.
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md new file mode 100644 index 0000000000000..2529ce7e58aa2 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md @@ -0,0 +1,563 @@ +--- +{ + "title": "Release 1.2.0", + "language": "en" +} +--- + + + + + +# Feature +## Highlight + +1. Full Vectorized Engine support, greatly improved performance + + In the standard ssb-100-flat benchmark, 1.2 is 2 times faster than 1.1; in the complex TPCH 100 benchmark, 1.2 is 3 times faster than 1.1. + +2. Merge-on-Write Unique Key + + Support Merge-On-Write on Unique Key Model. This mode marks the data that needs to be deleted or updated when the data is written, thereby avoiding the overhead of Merge-On-Read when querying, and greatly improving the reading efficiency on the updatable data model. + +3. Multi Catalog + + The multi-catalog feature provides Doris with the ability to quickly access external data sources. Users can connect to external data sources through the `CREATE CATALOG` command. Doris will automatically map the database and table information of external data sources. After that, users can access the data in these external data sources just like accessing ordinary tables. This avoids the complicated operation of manually establishing an external mapping for each table. + + Currently this feature supports the following data sources: + + 1. Hive Metastore: You can access data tables including Hive, Iceberg, and Hudi. It can also be connected to data sources compatible with Hive Metastore, such as Alibaba Cloud's DataLake Formation. Supports data access on both HDFS and object storage. + 2.
Elasticsearch: Access ES data sources. + 3. JDBC: Access MySQL through the JDBC protocol. + + Documentation: https://doris.apache.org//docs/dev/lakehouse/multi-catalog + + > Note: The corresponding permission level will also be changed automatically, see the "Upgrade Notice" section for details. + +4. Light Schema Change + +In the new version, adding or dropping columns on a table no longer requires changing the data files synchronously; only the metadata in FE needs to be updated, which makes the Schema Change operation complete in milliseconds. This function also enables DDL synchronization of upstream CDC data. For example, users can use Flink CDC to synchronize both DML and DDL from an upstream database to Doris. + +Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +When creating a table, set `"light_schema_change"="true"` in properties. + +5. JDBC external table + + Users can connect to external data sources through JDBC. Currently supported: + + - MySQL + - PostgreSQL + - Oracle + - SQL Server + - Clickhouse + + Documentation: [https://doris.apache.org/en/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/) + + > Note: The ODBC feature will be removed in a later version, please switch to JDBC. + +6. JAVA UDF + + Supports writing UDF/UDAF in Java, which is convenient for users to use custom functions in the Java ecosystem. At the same time, through technologies such as off-heap memory and Zero Copy, the efficiency of cross-language data access has been greatly improved. + + Documentation: https://doris.apache.org//docs/dev/ecosystem/udf/java-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +7.
Remote UDF + + Supports accessing remote user-defined function services through RPC, thus completely eliminating language restrictions for users to write UDFs. Users can use any programming language to implement custom functions to complete complex data analysis work. + + Documentation: https://doris.apache.org//docs/ecosystem/udf/remote-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +8. More data types support + + - Array type + + Array types are supported, including nested array types. In scenarios such as user portraits and tags, the Array type can better fit the business model. The new version also implements a large number of array-related functions to better support the use of this data type in real scenarios. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/ARRAY + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/array-functions/array_max + + - Jsonb type + + Support binary Json data type: Jsonb. This type provides a more compact json encoding format, and at the same time provides data access in the encoding format. Compared with json data stored in strings, it can improve performance by several times. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/JSONB + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/json-functions/jsonb_parse + + - Date V2 + + Scope of impact: + + 1. The user needs to specify datev2 and datetimev2 when creating the table, and the date and datetime of the original table will not be affected. + 2. When datev2 and datetimev2 are calculated with the original date and datetime (for example, in an equi-join), the original type will be cast into the new type for calculation. + 3.
The example is in the documentation + + Documentation: https://doris.apache.org/docs/1.2/sql-manual/sql-reference/Data-Types/DATEV2 + + +## More + +1. A new memory management framework + + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/memory-management/memory-tracker + +2. Table Valued Function + + Doris implements a set of Table Valued Function (TVF). TVF can be regarded as an ordinary table, which can appear in all places where "table" can appear in SQL. + + For example, we can use S3 TVF to implement data import on object storage: + + ``` + insert into tbl select * from s3("s3://bucket/file.*", "ak" = "xx", "sk" = "xxx") where c1 > 2; + ``` + + Or directly query data files on HDFS: + + ``` + insert into tbl select * from hdfs("hdfs://bucket/file.*") where c1 > 2; + ``` + + TVF can help users make full use of the rich expressiveness of SQL and flexibly process various data. + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/s3 + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/hdfs + +3. A more convenient way to create partitions + + Support for creating multiple partitions within a time range via the `FROM TO` command. + +4. Column renaming + + For tables with Light Schema Change enabled, column renaming is supported. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-RENAME + +5. Richer permission management + + - Support row-level permissions + + Row-level permissions can be created with the `CREATE ROW POLICY` command. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY + + - Support specifying password strength, expiration time, etc. + + - Support for locking accounts after multiple failed logins. 
+ + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Account-Management-Statements/ALTER-USER + +6. Import + + - CSV import supports csv files with header. + + Search for `csv_with_names` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD/ + + - Stream Load adds `hidden_columns`, which can explicitly specify the delete flag column and sequence column. + + Search for `hidden_columns` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD + + - Spark Load supports Parquet and ORC file import. + + - Support for cleaning completed imported Labels + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CLEAN-LABEL + + - Support batch cancellation of import jobs by status + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD + + - Added support for Alibaba Cloud oss, Tencent Cloud cos/chdfs and Huawei Cloud obs in broker load. + + Documentation: https://doris.apache.org//docs/dev/advanced/broker + + - Support access to hdfs through hive-site.xml file configuration. + + Documentation: https://doris.apache.org//docs/dev/admin-manual/config/config-dir + +7. Support viewing the contents of the catalog recycle bin through `SHOW CATALOG RECYCLE BIN` function. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Show-Statements/SHOW-CATALOG-RECYCLE-BIN + +8. Support `SELECT * EXCEPT` syntax. + + Documentation: https://doris.apache.org//docs/dev/data-table/basic-usage + +9. OUTFILE supports ORC format export. And supports multi-byte delimiters. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE + +10. Support to modify the number of Query Profiles that can be saved through configuration. 
+ + Document search FE configuration item: max_query_profile_num + +11. The DELETE statement supports IN predicate conditions. And it supports partition pruning. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Manipulation/DELETE + +12. The default value of the time column supports using `CURRENT_TIMESTAMP` + + Search for "CURRENT_TIMESTAMP" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +13. Add two system tables: backends, rowsets + + Documentation: + + https://doris.apache.org//docs/dev/admin-manual/system-table/backends + + https://doris.apache.org//docs/dev/admin-manual/system-table/rowsets + +14. Backup and restore + + - The Restore job supports the `reserve_replica` parameter, so that the number of replicas of the restored table is the same as that of the backup. + + - The Restore job supports `reserve_dynamic_partition_enable` parameter, so that the restored table keeps the dynamic partition enabled. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/RESTORE + + - Support backup and restore operations through the built-in libhdfs, no longer rely on broker. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY + +15. Support data balance between multiple disks on the same machine + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK + +16. Routine Load supports subscribing to Kerberos-authenticated Kafka services. 
+ + Search for kerberos in the documentation: https://doris.apache.org//docs/dev/data-operate/import/import-way/routine-load-manual + +17. New built-in-function + + Added the following built-in functions: + + - `cbrt` + - `sequence_match/sequence_count` + - `mask/mask_first_n/mask_last_n` + - `elt` + - `any/any_value` + - `group_bitmap_xor` + - `ntile` + - `nvl` + - `uuid` + - `initcap` + - `regexp_replace_one/regexp_extract_all` + - `multi_search_all_positions/multi_match_any` + - `domain/domain_without_www/protocol` + - `running_difference` + - `bitmap_hash64` + - `murmur_hash3_64` + - `to_monday` + - `not_null_or_empty` + - `window_funnel` + - `group_bit_and/group_bit_or/group_bit_xor` + - `outer combine` + - and all array functions + +# Upgrade Notice + +## Known Issues + +- Use JDK11 will cause BE crash, please use JDK8 instead. + +## Behavior Changed + +- Permission level changes + + Because the catalog level is introduced, the corresponding user permission level will also be changed automatically. The rules are as follows: + + - GlobalPrivs and ResourcePrivs remain unchanged + - Added CatalogPrivs level. + - The original DatabasePrivs level is added with the internal prefix (indicating the db in the internal catalog) + - Add the internal prefix to the original TablePrivs level (representing tbl in the internal catalog) + +- In GroupBy and Having clauses, match on column names in preference to aliases. (#14408) + +- Creating columns starting with `mv_` is no longer supported. `mv_` is a reserved keyword in materialized views (#14361) + +- Removed the default limit of 65535 rows added by the order by statement, and added the session variable `default_order_by_limit` to configure this limit. (#12478) + +- In the table generated by "Create Table As Select", all string columns use the string type uniformly, and no longer distinguish varchar/char/string (#14382) + +- In the audit log, remove the word `default_cluster` before the db and user names. 
(#13499) (#11408) + +- Add sql digest field in audit log (#8919) + +- The order by logic of the union clause has changed. In the new version, the order by clause will be executed after the union is executed, unless explicitly associated by parentheses. (#9745) + +- During the decommission operation, the tablet in the recycle bin will be ignored to ensure that the decommission can be completed. (#14028) + +- The returned result of Decimal will be displayed according to the precision declared in the original column, or according to the precision specified in the cast function. (#13437) + +- Changed column name length limit from 64 to 256 (#14671) + +- Changes to FE configuration items + + - The `enable_vectorized_load` parameter is enabled by default. (#11833) + + - Increased `create_table_timeout` value. The default timeout for table creation operations will be increased. (#13520) + + - Modify `stream_load_default_timeout_second` default value to 3 days. + + - Modify the default value of `alter_table_timeout_second` to one month. + + - Add the parameter `max_replica_count_when_schema_change` to limit the number of replicas involved in the alter job, the default is 100000. (#12850) + + - Add `disable_iceberg_hudi_table`. The iceberg and hudi external tables are disabled by default, and the multi catalog function is recommended. (#13932) + +- Changes to BE configuration items + + - Removed `disable_stream_load_2pc` parameter. Two-phase commit (2PC) stream load can be used directly. (#13520) + + - Modify `tablet_rowset_stale_sweep_time_sec` from 1800 seconds to 300 seconds. + + - Redesigned the names of compaction-related configuration items (#13495) + + - Revisited the parameters related to memory optimization (#13781) + +- Session variable changes + + - Modify the variable `enable_insert_strict` to true by default. This will cause some insert operations that could be executed before, but inserted illegal values, to no longer be executed.
(#11866) + + - Modified variable `enable_local_exchange` to default to true (#13292) + + - Default data transmission via lz4 compression, controlled by variable `fragment_transmission_compression_codec` (#11955) + + - Add `skip_storage_engine_merge` variable for debugging unique or agg model data (#11952) + + Documentation: https://doris.apache.org//docs/dev/advanced/variables + +- The BE startup script will check whether the value in `/proc/sys/vm/max_map_count` is greater than 2,000,000. Otherwise, the startup fails. (#11052) + +- Removed mini load interface (#10520) + +- FE Metadata Version + + FE Meta Version changed from 107 to 114, and cannot be rolled back after upgrading. + +## During Upgrade + +1. Upgrade preparation + + - Need to replace: lib, bin directory (start/stop scripts have been modified) + + - BE now also needs JAVA_HOME to be configured, because it supports JDBC Table and Java UDF. + + - The default JVM Xmx parameter in fe.conf is changed to 8GB. + +2. Possible errors during the upgrade process + + - The repeat function cannot be used and an error is reported: `vectorized repeat function cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13868) + + - schema change fails with error: `desc_tbl is not set. Maybe the FE version is not equal to the BE` (#13822) + + - Vectorized hash join cannot be used and an error is reported: `vectorized hash join cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13753) + + The above errors will return to normal after a full upgrade. + +## Performance Impact + +- By default, JeMalloc is used as the memory allocator of the new version BE, replacing TcMalloc (#13367) + +- The batch size in the tablet sink is modified to be at least 8K.
(#13912) + +- Disable chunk allocator by default (#13285) + +## Api change + +- BE's http api error return information changed from `{"status": "Fail", "msg": "xxx"}` to more specific ``{"status": "Not found", "msg": "Tablet not found. tablet_id=1202"}``(#9771) + +- In `SHOW CREATE TABLE`, the content of comment is changed from double quotes to single quotes (#10327) + +- Support ordinary users to obtain query profile through http command. (#14016) +Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/manager/query-profile-action + +- Optimized the way to specify the sequence column, you can directly specify the column name. (#13872) +Documentation: https://doris.apache.org//docs/dev/data-operate/update-delete/sequence-column-manual + +- Increase the space usage of remote storage in the results returned by `show backends` and `show tablets` (#11450) + +- Removed Num-Based Compaction related code (#13409) + +- Refactored BE's error code mechanism, some returned error messages will change (#8855) +other + +- Support Docker official image. + +- Support compiling Doris on MacOS(x86/M1) and ubuntu-22.04 + Documentation: https://doris.apache.org//docs/dev/install/source-install/compilation-mac/ + +- Support for image file verification. 
+ + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/metadata-operation/ + +- Script related + + - The stop scripts of FE and BE support exiting FE and BE gracefully via the `--grace` parameter (use kill -15 signal instead of kill -9) + + - FE start script supports checking the current FE version via `--version` (#11563) + + - Support getting the data and related table creation statement of a tablet through the `ADMIN COPY TABLET` command, for local problem debugging (#12176) + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET + +- Support obtaining the table creation statements related to a SQL statement through the http api, for local problem reproduction (#11979) + + Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/query-schema-action + +- Support disabling the compaction function of a table when creating it, for testing (#11743) + + Search for "disable_auto_compaction" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +# Big Thanks + +Thanks to ALL who contributed to this release!
(alphabetically) +``` +@924060929 +@a19920714liou +@adonis0147 +@Aiden-Dong +@aiwenmo +@AshinGau +@b19mud +@BePPPower +@BiteTheDDDDt +@bridgeDream +@ByteYue +@caiconghui +@CalvinKirs +@cambyzju +@caoliang-web +@carlvinhust2012 +@catpineapple +@ccoffline +@chenlinzhong +@chovy-3012 +@coderjiang +@cxzl25 +@dataalive +@dataroaring +@dependabot[bot] +@dinggege1024 +@DongLiang-0 +@Doris-Extras +@eldenmoon +@EmmyMiao87 +@englefly +@FreeOnePlus +@Gabriel39 +@gaodayue +@geniusjoe +@gj-zhang +@gnehil +@GoGoWen +@HappenLee +@hello-stephen +@Henry2SS +@hf200012 +@huyuanfeng2018 +@jacktengg +@jackwener +@jeffreys-cat +@Jibing-Li +@JNSimba +@Kikyou1997 +@Lchangliang +@LemonLiTree +@lexoning +@liaoxin01 +@lide-reed +@link3280 +@liutang123 +@liuyaolin +@LOVEGISER +@lsy3993 +@luozenglin +@luzhijing +@madongz +@morningman +@morningman-cmy +@morrySnow +@mrhhsg +@Myasuka +@myfjdthink +@nextdreamblue +@pan3793 +@pangzhili +@pengxiangyu +@platoneko +@qidaye +@qzsee +@SaintBacchus +@SeekingYang +@smallhibiscus +@sohardforaname +@song7788q +@spaces-X +@ssusieee +@stalary +@starocean999 +@SWJTU-ZhangLei +@TaoZex +@timelxy +@Wahno +@wangbo +@wangshuo128 +@wangyf0555 +@weizhengte +@weizuo93 +@wsjz +@wunan1210 +@xhmz +@xiaokang +@xiaokangguo +@xinyiZzz +@xy720 +@yangzhg +@Yankee24 +@yeyudefeng +@yiguolei +@yinzhijian +@yixiutt +@yuanyuan8983 +@zbtzbtzbt +@zenoyang +@zhangboya1 +@zhangstar333 +@zhannngchen +@ZHbamboo +@zhengshiJ +@zhenhb +@zhqu1148980644 +@zuochunwei +@zy-kkk +``` diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md new file mode 100644 index 0000000000000..d5adb31eb5256 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md @@ -0,0 +1,196 @@ +--- +{ + "title": "Release 1.2.1", + "language": "en" +} +--- + + + +# Improvement + +### Supports new type DecimalV3 + +DecimalV3, which supports higher precision and better performance, has the following advantages over 
past versions. + +- Larger representable range, the range of values are significantly expanded, and the valid number range [1,38]. + +- Higher performance, adaptive adjustment of the storage space occupied according to different precision. + +- More complete precision derivation support, for different expressions, different precision derivation rules are applied to the accuracy of the result. + +[DecimalV3](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Types/DECIMALV3/) + +### Support Iceberg V2 + +Support Iceberg V2 (only Position Delete is supported, Equality Delete will be supported in subsequent versions). + +Tables in Iceberg V2 format can be accessed through the Multi-Catalog feature. + +### Support OR condition to IN + +Support converting OR condition to IN condition, which can improve the execution efficiency in some scenarios.[#15437](https://github.com/apache/doris/pull/15437) [#12872](https://github.com/apache/doris/pull/12872) + +### Optimize the import and query performance of JSONB type + +Optimize the import and query performance of JSONB type. [#15219](https://github.com/apache/doris/pull/15219) [#15219](https://github.com/apache/doris/pull/15219) + +### Stream load supports quoted csv data + +Search trim_double_quotes in Document:[https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD) + +### Broker supports Tencent Cloud CHDFS and Baidu Cloud BOS, AFS + +Data on CHDFS, BOS, and AFS can be accessed through Broker. [#15297](https://github.com/apache/doris/pull/15297) [#15448](https://github.com/apache/doris/pull/15448) + +### New function + +Add function `substring_index`. [#15373](https://github.com/apache/doris/pull/15373) + +# Bug Fix + +- In some cases, after upgrading from version 1.1 to version 1.2, the user permission information will be lost. 
[#15144](https://github.com/apache/doris/pull/15144) + +- Fix the problem that the partition value is wrong when using datev2/datetimev2 type for partitioning. [#15094](https://github.com/apache/doris/pull/15094) + +- Bug fixes for a large number of released features. For a complete list see: [PR List](https://github.com/apache/doris/pulls?q=is%3Apr+label%3Adev%2F1.2.1-merged+is%3Aclosed) + +# Upgrade Notice + +### Known Issues + +- Do not use JDK11 as the runtime JDK of BE, it will cause BE Crash. +- The reading performance of the csv format in this version has declined, which will affect the import and reading efficiency of the csv format. We will fix it as soon as possible in the next three-digit version + +### Behavior Changed + +- The default value of the BE configuration item `high_priority_flush_thread_num_per_store` is changed from 1 to 6, to improve the write efficiency of Routine Load. (https://github.com/apache/doris/pull/14775) + +- The default value of the FE configuration item `enable_new_load_scan_node` is changed to true. Import tasks will be performed using the new File Scan Node. No impact on users.[#14808](https://github.com/apache/doris/pull/14808) + +- Delete the FE configuration item `enable_multi_catalog`. The Multi-Catalog function is enabled by default. + +- The vectorized execution engine is forced to be enabled by default.[#15213](https://github.com/apache/doris/pull/15213) + +The session variable enable_vectorized_engine will no longer take effect. Enabled by default. + +To make it valid again, set the FE configuration item `disable_enable_vectorized_engine` to false, and restart FE to make `enable_vectorized_engine` valid again. + + +# Big Thanks + +Thanks to ALL who contributed to this release! 
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@dataroaring + +@Doris-Extras + +@dutyu + +@eldenmoon + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@hf200012 + +@jacktengg + +@Jibing-Li + +@Kikyou1997 + +@liaoxin01 + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@spaces-X + +@starocean999 + +@wangshuo128 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xutaoustc + +@yangzhg + +@yiguolei + +@yixiutt + +@Yulei-Yang + +@yuxuan-luo + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zhengshengjun + + + + + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md new file mode 100644 index 0000000000000..08fd22571a03f --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md @@ -0,0 +1,254 @@ +--- +{ + "title": "Release 1.2.2", + "language": "en" +} +--- + + + +# New Features + +### Lakehouse + +- Support automatic synchronization of Hive metastore. + +- Support reading the Iceberg Snapshot, and viewing the Snapshot history. + +- JDBC Catalog supports PostgreSQL, Clickhouse, Oracle, SQLServer + +- JDBC Catalog supports Insert operation + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/) + +### Auto Bucket + + Set and scale the number of buckets for different partitions to keep the number of tablet in a relatively appropriate range. + +### New Functions + +Add the new function `width_bucket`. 
+ +Reference: [https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description) + +# Behavior Changes + +- Disable BE's page cache by default: `disable_storage_page_cache=true` + +The page cache is now off by default to optimize memory usage and reduce the risk of OOM. +However, this will increase the latency of some small queries. +If you are sensitive to query latency, or have high-concurrency small-query scenarios, you can configure *disable_storage_page_cache=false* to enable the page cache again. + +- Add new session variable `group_by_and_having_use_alias_first`, used to control whether the group and having clauses use alias. + +Reference: [https://doris.apache.org/docs/dev/advanced/variables](https://doris.apache.org/docs/dev/advanced/variables) + +# Improvement + +### Compaction + +- Support `Vertical Compaction` to optimize the compaction overhead and efficiency of wide tables. + +- Support `Segment Compaction`. Fix -238 and -235 issues with high frequency imports. + +### Lakehouse + +- Hive Catalog can be compatible with Hive version 1/2/3 + +- Hive Catalog can access JuiceFS based HDFS with Broker. + +- Iceberg Catalog supports Hive Metastore and Rest Catalog type. + +- ES Catalog supports _id column mapping. + +- Optimize Iceberg V2 read performance with large number of delete rows. + +- Support for reading Iceberg tables after Schema Evolution + +- Parquet Reader handles column name case correctly. + +### Other + +- Support for accessing Hadoop KMS-encrypted HDFS. + +- Support canceling an Export task in progress. + +- Optimize the performance of `explode_split` with 1x. + +- Optimize the read performance of nullable columns with 3x. + +- Optimize some problems of Memtracker, improve memory management accuracy, and optimize memory application. + + + +# Bug Fix + +- Fixed memory leak when loading data with Doris Flink Connector.
+ +- Fixed the possible thread scheduling problem of BE and reduced the `Fragment sent timeout` error caused by BE thread exhaustion. + +- Fixed various correctness and precision issues of column type datetimev2/decimalv3. + +- Fixed the data correctness issue with Unique Key Merge-on-Read table. + +- Fixed various known issues with the Light Schema Change feature. + +- Fixed various data correctness issues of bitmap type Runtime Filter. + +- Fixed the problem of poor reading performance of csv reader introduced in version 1.2.1. + +- Fixed the problem of BE OOM caused by Spark Load data download phase. + +- Fixed possible metadata compatibility issues when upgrading from version 1.1 to version 1.2. + +- Fixed the metadata problem when creating JDBC Catalog with Resource. + +- Fixed the problem of high CPU usage caused by load operation. + +- Fixed the problem of FE OOM caused by a large number of failed Broker Load jobs. + +- Fixed the problem of precision loss when loading floating-point types. + +- Fixed the problem of memory leak when using 2PC stream load + +# Other + +Add metrics to view the total rowset and segment numbers on BE + +- doris_be_all_rowsets_num and doris_be_all_segments_num + + +# Big Thanks + +Thanks to ALL who contributed to this release!
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@DarvenDuan + +@dataroaring + +@Doris-Extras + +@dutyu + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@htyoung + +@isHuangXin + +@JackDrogon + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@Lchangliang + +@LemonLiTree + +@liaoxin01 + +@liqing-coder + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@qzsee + +@spaces-X + +@stalary + + +@starocean999 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xy720 + +@yangzhg + +@yiguolei + +@yixiutt + +@Yukang-Lian + +@Yulei-Yang + +@zclllyybb + +@zddr + +@zhangstar333 + +@zhannngchen + +@zy-kkk + + + + + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md new file mode 100644 index 0000000000000..cd9226b15e14f --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md @@ -0,0 +1,109 @@ +--- +{ + "title": "Release 1.2.3", + "language": "en" +} +--- + + + +# Improvement + +### JDBC Catalog + +- Support connecting to Doris clusters through JDBC Catalog. + +Currently, Jdbc Catalog only support to use 5.x version of JDBC jar package to connect another Doris database. If you use 8.x version of JDBC jar package, the data type of column may not be matched. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris) + +- Support to synchronize only the specified database through the `only_specified_database` attribute. + +- Support synchronizing table names in the form of lowercase through `lower_case_table_names` to solve the problem of case sensitivity of table names. 
+ +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc) + +- Optimize the read performance of JDBC Catalog. + +### Elasticsearch Catalog + +- Support Array type mapping. + +- Support whether to push down the like expression through the `like_push_down` attribute to control the CPU overhead of the ES cluster. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es) + +### Hive Catalog + +- Support Hive table default partition `_HIVE_DEFAULT_PARTITION_`. + +- Hive Metastore metadata automatic synchronization supports notification event in compressed format. + +### Dynamic Partition Improvement + +- Dynamic partition supports specifying the `storage_medium` parameter to control the storage medium of the newly added partition. + +Reference: [https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition](https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition) + + +### Optimize BE's Threading Model + +- Optimize BE's threading model to avoid stability problems caused by frequent thread creation and destroy. + +# Bugfix + +- Fixed issues with Merge-On-Write Unique Key tables. + +- Fixed compaction related issues. + +- Fixed some delete statement issues causing data errors. + +- Fixed several query execution errors. + +- Fixed the problem of using JDBC catalog to cause BE crash on some operating system. + +- Fixed Multi-Catalog issues. + +- Fixed memory statistics and optimization issues. + +- Fixed decimalV3 and date/datetimev2 related issues. + +- Fixed load transaction stability issues. + +- Fixed light-weight schema change issues. + +- Fixed the issue of using `datetime` type for batch partition creation. + +- Fixed the problem that a large number of failed broker loads would cause the FE memory usage to be too high. 
+ +- Fixed the problem that stream load cannot be canceled after dropping the table. + +- Fixed querying `information_schema` timeout in some cases. + +- Fixed the problem of BE crash caused by concurrent data export using `select outfile`. + +- Fixed transactional insert operation memory leak. + +- Fixed several query/load profile issues, and added support for direct download of profiles through FE web ui. + +- Fixed the problem that the BE tablet GC thread caused the IO util to be too high. + +- Fixed the problem that the commit offset is inaccurate in Kafka routine load. + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md new file mode 100644 index 0000000000000..a959a323d06d1 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md @@ -0,0 +1,81 @@ +--- +{ + "title": "Release 1.2.4", + "language": "en" +} +--- + + + + +# Behavior Changed + +- For `DateV2`/`DatetimeV2` and `DecimalV3` type, in the results of `DESCRIBE` and `SHOW CREATE TABLE` statements, they will no longer be displayed as `DateV2`/`DatetimeV2` or `DecimalV3`, but directly displayed as `Date`/`Datetime` or `Decimal`. + + - This change is for compatibility with some BI tools. If you want to see the actual type of the column, you can check it with the `DESCRIBE ALL` statement. + +- When querying tables in the `information_schema` database, the meta information (database, table, column, etc.) in the external catalog is no longer returned by default. + + - This change avoids the problem that the `information_schema` database cannot be queried due to the connection problem of some external catalog, so as to solve the problem of using some BI tools with Doris. It can be controlled by the FE configuration `infodb_support_ext_catalog`, and the default value is `false`, that is, the meta information of external catalog will not be returned.
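+
+A minimal sketch of restoring the previous behavior, assuming the item is set in the FE configuration file (`fe.conf`) like other FE configuration items and that FE is restarted afterwards; whether it can also be changed at runtime is not stated here:
+
+```
+# fe.conf (assumed location): return external catalog metadata
+# from information_schema again. The default is false.
+infodb_support_ext_catalog = true
+```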
+ +# Improvement + +### JDBC Catalog + +- Supports connecting to Trino/Presto via JDBC Catalog + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino) + +- JDBC Catalog connects to Clickhouse data source and supports Array type mapping + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse) + +### Spark Load + +- Spark Load supports Resource Manager HA related configuration + + Refer to: https://github.com/apache/doris/pull/15000 + +## Bug Fixes + +- Fixed several connectivity issues with Hive Catalog. + +- Fixed ClassNotFound issues with Hudi Catalog. + +- Optimize the connection pool of JDBC Catalog to avoid too many connections. + +- Fix the problem that OOM will occur when importing data from another Doris cluster through JDBC Catalog. + +- Fixed several query and import planning issues. + +- Fixed several issues with Unique Key Merge-On-Write data model. + +- Fix several BDBJE issues and solve the problem of abnormal FE metadata in some cases. + +- Fix the problem that the `CREATE VIEW` statement does not support Table Valued Function. + +- Fixed several memory statistics issues. + +- Fixed several issues reading Parquet/ORC format. + +- Fixed several issues with DecimalV3. + +- Fixed several issues with SHOW QUERY/LOAD PROFILE. + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md new file mode 100644 index 0000000000000..55af863ba47d6 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md @@ -0,0 +1,199 @@ +--- +{ + "title": "Release 1.2.5", + "language": "en" +} +--- + + + +In version 1.2.5, the Doris team has completed nearly 210 fixes and performance improvements since the release of version 1.2.4.
At the same time, version 1.2.5 is also an iterative version of version 1.2.4, which has higher stability. It is recommended that all users upgrade to this version. + +# Behavior Changed + +- The `start_be.sh` script will check that the maximum number of file handles in the system must be greater than or equal to 65536, otherwise the startup will fail. + +- The BE configuration item `enable_quick_compaction` is set to true by default. The Quick Compaction is enabled by default. This feature is used to optimize the problem of small files in the case of large batch import. + +- After modifying the dynamic partition attribute of the table, it will no longer take effect immediately, but wait for the next task scheduling of the dynamic partition table to avoid some deadlock problems. + +# Improvement + +- Optimize the use of bthread and pthread to reduce the RPC blocking problem during the query process. + +- A button to download Profile is added to the Profile page of the FE web UI. + +- Added FE configuration `recover_with_skip_missing_version`, which is used to query to skip the problematic replica under certain failure conditions. + +- The row-level permission function supports external Catalog. + +- Hive Catalog supports automatic refreshing of kerberos tickets on the BE side without manual refreshing. + +- JDBC Catalog supports tables under the MySQL/ClickHouse system database (`information_schema`). + +# Bug Fixes + +- Fixed the problem of incorrect query results caused by low-cardinality column optimization + +- Fixed several authentication and compatibility issues accessing HDFS. + +- Fixed several issues with float/double and decimal types. + +- Fixed several issues with date/datetimev2 types. + +- Fixed several query execution and planning issues. + +- Fixed several issues with JDBC Catalog. + +- Fixed several query-related issues with Hive Catalog, and Hive Metastore metadata synchronization issues. 
+ +- Fix the problem that the result of `SHOW LOAD PROFILE` statement is incorrect. + +- Fixed several memory related issues. + +- Fixed several issues with `CREATE TABLE AS SELECT` functionality. + +- Fix the problem that the jsonb type causes BE to crash on CPU that do not support avx2. + +- Fixed several issues with dynamic partitions. + +- Fixed several issues with TOPN query optimization. + +- Fixed several issues with the Unique Key Merge-on-Write table model. + +# Big Thanks + +58 contributors participated in the improvement and release of 1.2.5, and thank them for their hard work and dedication: + +@adonis0147 + +@airborne12 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@caiconghui + +@CalvinKirs + +@cambyzju + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dujl + +@dutyu + +@fsilent + +@Gabriel39 + +@gitccl + +@gnehil + +@GoGoWen + +@gongzexin + +@HappenLee + +@herry2038 + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@LemonLiTree + +@liaoxin01 + +@LiBinfeng-01 + +@luwei16 + +@Moonm3n + +@morningman + +@mrhhsg + +@Mryange + +@nextdreamblue + +@nsnhuang + +@qidaye + +@Shoothzj + +@sohardforaname + +@stalary + +@starocean999 + +@SWJTU-ZhangLei + +@wsjz + +@xiaokang + +@xinyiZzz + +@yangzhg + +@yiguolei + +@yixiutt + +@yujun777 + +@Yulei-Yang + +@yuxuan-luo + +@zclllyybb + +@zddr + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zxealous + +@zy-kkk + +@zzzzzzzs diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md new file mode 100644 index 0000000000000..39146b35b15ac --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md @@ -0,0 +1,135 @@ +--- +{ + "title": "Release 1.2.6", + "language": "en" +} +--- + + + + +# Behavior Change + +- Add a BE configuration item `allow_invalid_decimalv2_literal` to control whether can import data that exceeding the decimal's precision, for compatibility with previous logic. 
+ +# Query + +- Fix several query planning issues. +- Support `sql_select_limit` session variable. +- Optimize query cold run performance. +- Fix expr context memory leak. +- Fix the issue that the `explode_split` function was executed incorrectly in some cases. + +## Multi Catalog + +- Fix the issue that synchronizing hive metadata caused FE replay edit log to fail. +- Fix `refresh catalog` operation causing FE OOM. +- Fix the issue that jdbc catalog cannot handle `0000-00-00` correctly. +- Fixed the issue that the kerberos ticket cannot be refreshed automatically. +- Optimize the partition pruning performance of hive. +- Fix the inconsistent behavior of trino and presto in jdbc catalog. +- Fix the issue that hdfs short-circuit read could not be used to improve query efficiency in some environments. +- Fix the issue that the iceberg table on CHDFS could not be read. + +# Storage + +- Fix the wrong calculation of delete bitmap in MOW table. +- Fix several BE memory issues. +- Fix snappy compression issue. +- Fix the issue that jemalloc may cause BE to crash in some cases. + +# Others + +- Fix several java udf related issues. +- Fix the issue that the `recover table` operation incorrectly triggered the creation of dynamic partitions. +- Fix timezone when importing orc files via broker load. +- Fix the issue that the newly added `PERCENT` keyword caused the replay metadata of the routine load job to fail. +- Fix the issue that the `truncate` operation failed to act on a non-partitioned table. +- Fix the issue that the mysql connection was lost due to the `show snapshot` operation. +- Optimize the lock logic to reduce the probability of lock timeout errors when creating tables. +- Add session variable `have_query_cache` to be compatible with some old mysql clients. +- Optimize the error message shown when a load error occurs.
+ +# Big Thanks + +Thanks to all who contributed to this release: + +@amorynan + +@BiteTheDDDDt + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dutyu + +@Gabriel39 + +@HHoflittlefish777 + +@htyoung + +@jacktengg + +@jeffreys-cat + +@kaijchen + +@kaka11chen + +@Kikyou1997 + +@KnightLiJunLong + +@liaoxin01 + +@LiBinfeng-01 + +@morningman + +@mrhhsg + +@sohardforaname + +@starocean999 + +@vinlee19 + +@wangbo + +@wsjz + +@xiaokang + +@xinyiZzz + +@yiguolei + +@yujun777 + +@Yulei-Yang + +@zhangstar333 + +@zy-kkk + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md new file mode 100644 index 0000000000000..cd47282f4688d --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md @@ -0,0 +1,46 @@ +--- +{ + "title": "Release 1.2.7", + "language": "en" +} +--- + + + +# Bug Fixes + +- Fixed some query issues. +- Fix some storage issues. +- Fix some decimal precision issues. +- Fix query error caused by invalid `sql_select_limit` session variable's value. +- Fix the problem that hdfs short-circuit read cannot be used. +- Fix the problem that Tencent Cloud cosn cannot be accessed. +- Fix several issues with hive catalog kerberos access. +- Fix the problem that stream load profile cannot be used. +- Fix Prometheus monitoring parameter format problem. +- Fix the table creation timeout issue when creating a large number of tablets. + +# New Features + +- Unique Key model supports array type as value column +- Added `have_query_cache` variable for compatibility with MySQL ecosystem.
+- Added `enable_strong_consistency_read` to support strong consistent read between sessions +- FE metrics supports user-level query counter + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md new file mode 100644 index 0000000000000..35cbb7a3cdcf1 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md @@ -0,0 +1,47 @@ +--- +{ + "title": "Release 1.2.8", + "language": "en" +} +--- + + + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Bug Fixes +- Fixed several issues with query execution. +- Fixed several issues with Spark Load. +- Fixed several issues with Parquet Reader. +- Fixed several issues with Orc Reader. +- Fixed Broker "FileSystem closed" problem. +- Fixed several issues with Broker Load. +- Fixed several issues with CTAS execution. +- Fixed several issues with backup and restore. +- Added "Catalog" column in audit log. +- Optimized the metadata cache of Iceberg Catalog. +- Fixed several issues with outfile/export feature. +- Fixed an issue with "replayEraseTable" edit log causing FE start to fail. +- Fixed some security issues. + + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md new file mode 100644 index 0000000000000..61ba6c5c60890 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md @@ -0,0 +1,236 @@ +--- +{ + "title": "Release 2.0.0", + "language": "en" +} +--- + + + + +We are more than excited to announce that, after six months of coding, testing, and fine-tuning, Apache Doris 2.0.0 is now production-ready. Special thanks to the 275 committers who altogether contributed over 4100 optimizations and fixes to the project. 
+ +This new version highlights: + +- 10 times faster data queries +- Enhanced log analytic and federated query capabilities +- More efficient data writing and updates +- Improved multi-tenant and resource isolation mechanisms +- Progresses in elastic scaling of resources and storage-compute separation +- Enterprise-facing features for higher usability + +> Download: https://doris.apache.org/download +> +> GitHub source code: https://github.com/apache/doris/releases/tag/2.0.0-rc04 + +## **A 10 Times Performance Increase** + +In SSB-Flat and TPC-H benchmarking, Apache Doris 2.0.0 delivered **over 10-time faster query performance** compared to an early version of Apache Doris. + +![](/images/release-note-2.0.0-1.png) + +This is realized by the introduction of a smarter query optimizer, inverted index, a parallel execution model, and a series of new functionalities to support high-concurrency point queries. + +### A smarter query optimizer + +The brand new query optimizer, Nereids, has a richer statistical base and adopts the Cascades framework. It is capable of self-tuning in most query scenarios and supports all 99 SQLs in TPC-DS, so users can expect high performance without any fine-tuning or SQL rewriting. + +TPC-H tests showed that Nereids, with no human intervention, outperformed the old query optimizer by a wide margin. Over 100 users have tried Apache Doris 2.0.0 in their production environment and the vast majority of them reported huge speedups in query execution. + +![](/images/release-note-2.0.0-2.png) + +**Doc**: https://doris.apache.org/docs/dev/query-acceleration/nereids/ + +Nereids is enabled by default in Apache Doris 2.0.0: `SET enable_nereids_planner=true`. Nereids collects statistical data by calling the Analyze command. + +### Inverted Index + +In Apache Doris 2.0.0, we introduced inverted index to better support fuzzy keyword search, equivalence queries, and range queries. 
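+
+As a rough illustration of the three query types listed above, the sketch below uses a hypothetical `access_log` table. The table, column, and index names are assumptions made for this example, and the index syntax should be checked against the inverted index documentation for your version:
+
+```sql
+-- Hypothetical table with inverted indexes on a text column and a numeric column.
+CREATE TABLE access_log (
+    ts DATETIME,
+    status INT,
+    message STRING,
+    INDEX idx_message (message) USING INVERTED PROPERTIES("parser" = "english"),
+    INDEX idx_status (status) USING INVERTED
+)
+DUPLICATE KEY(ts)
+DISTRIBUTED BY HASH(status) BUCKETS 10
+PROPERTIES ("replication_num" = "1");
+
+-- Keyword search: rows whose message contains any of the given words.
+SELECT count(*) FROM access_log WHERE message MATCH_ANY 'timeout refused';
+
+-- Equivalence and range queries on the indexed numeric column.
+SELECT count(*) FROM access_log WHERE status = 500;
+SELECT count(*) FROM access_log WHERE status >= 500 AND status < 600;
+```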
+ +A smartphone manufacturer tested Apache Doris 2.0.0 in their user behavior analysis scenarios. With inverted index enabled, v2.0.0 was able to finish the queries within milliseconds and maintain stable performance as the query concurrency level went up. In this case, it was 5 to 90 times faster than the older version. + +![](/images/release-note-2.0.0-3.png) + +### 20 times higher concurrency capability + +In scenarios like e-commerce order queries and express tracking, a huge number of end users search for a certain data record simultaneously. These are what we call high-concurrency point queries, which can bring huge pressure on the system. A traditional solution is to introduce Key-Value stores like Apache HBase for such queries, and Redis as a cache layer to ease the burden, but that means redundant storage and higher maintenance costs. + +For a column-oriented DBMS like Apache Doris, the I/O usage of point queries will be multiplied. We need neater execution. Thus, on the basis of columnar storage, we added row storage format and row cache to increase row reading efficiency, short-circuit plans to speed up data retrieval, and prepared statements to reduce frontend overheads. + +After these optimizations, Apache Doris 2.0 reached a concurrency level of **30,000 QPS per node** on YCSB on a 16 Core 64G cloud server with 4×1T hard drives, representing an improvement of **20 times** compared to its older version. This makes Apache Doris a good alternative to HBase in high-concurrency scenarios, so that users don't need to endure extra maintenance costs and redundant storage brought by complicated tech stacks. + +Read more: https://doris.apache.org/blog/High_concurrency + +### A self-adaptive parallel execution model + +Apache Doris 2.0 brought in a Pipeline execution model for higher efficiency and stability in hybrid analytic workloads. In this model, the execution of queries is driven by data.
The blocking operators in all query execution processes are split into pipelines. Whether a pipeline gets an execution thread depends on whether its relevant data is ready. This enables asynchronous blocking operations and more flexible system resource management. Also, this improves CPU efficiency as the system doesn't have to create and destroy threads that much. + +Doc: https://doris.apache.org/docs/dev/query-acceleration/pipeline-execution-engine/ + +**How to enable the Pipeline execution model** + +- The Pipeline execution engine is enabled by default in Apache Doris 2.0: `Set enable_pipeline_engine = true`. +- `parallel_pipeline_task_num` represents the number of pipeline tasks that are parallelly executed in SQL queries. The default value of it is `0`, which means Apache Doris will automatically set the concurrency level to half the number of CPUs in each backend node. Users can change this value as they need it. +- For those who are upgrading to Apache Doris 2.0 from an older version, it is recommended to set the value of `parallel_pipeline_task_num` to that of `parallel_fragment_exec_instance_num` in the old version. + +## A Unified Platform for Multiple Analytic Workloads + +Apache Doris has been pushing its boundaries. Starting as an OLAP engine for reporting, it is now a data warehouse capable of ETL/ELT and more. Version 2.0 is making advancements in its log analysis and data lakehousing capabilities. + +### A 10 times more cost-effective log analysis solution + +Apache Doris 2.0.0 provides native support for semi-structured data. In addition to JSON and Array, it now supports a complex data type: Map. Based on Light Schema Change, it also supports Schema Evolution, which means you can adjust the schema as your business changes. You can add or delete fields and indexes, and change the data types for fields. 
As we introduced inverted index and a high-performance text analysis algorithm into it, it can execute full-text search and dimensional analysis of logs more efficiently. With faster data writing and query speed and lower storage cost, it is 10 times more cost-effective than the common log analytic solution within the industry. + +![](/images/release-note-2.0.0-4.png) + +### Enhanced data lakehousing capabilities + +In Apache Doris 1.2, we introduced Multi-Catalog to allow for auto-mapping and auto-synchronization of data from heterogeneous sources. In version 2.0.0, we extended the list of supported data sources and optimized Doris based on users' needs in production environments. + +![](/images/release-note-2.0.0-5.png) + +Apache Doris 2.0.0 supports dozens of data sources including Hive, Hudi, Iceberg, Paimon, MaxCompute, Elasticsearch, Trino, ClickHouse, and almost all open lakehouse formats. It also supports snapshot queries on Hudi Copy-on-Write tables and read optimized queries on Hudi Merge-on-Read tables. It allows for authorization of Hive Catalog using Apache Ranger, so users can reuse their existing privilege control system. Besides, it supports extensible authorization plug-ins to enable user-defined authorization methods for any catalog. + +TPC-H benchmark tests showed that Apache Doris 2.0.0 is 3~5 times faster than Presto/Trino in queries on Hive tables. This is realized by all-around optimizations (in small file reading, flat table reading, local file cache, ORC/Parquet file reading, Compute Nodes, and information collection of external tables) finished in this development cycle and the distributed execution framework, vectorized execution engine, and query optimizer of Apache Doris. + +![](/images/release-note-2.0.0-6.png) + +All this gives Apache Doris 2.0.0 an edge in data lakehousing scenarios.
With Doris, you can do incremental or overall synchronization of multiple upstream data sources in one place, and expect much higher data query performance than other query engines. The processed data can be written back to the sources or provided for downstream systems. In this way, you can make Apache Doris your unified data analytic gateway. + +## Efficient Data Update + +Data update is important in real-time analysis, since users want to always have access to the latest data, and be able to update data flexibly, such as updating a row or just a few columns, batch updating or deleting specified data, or even overwriting a whole data partition. + +Efficient data updating has been another hill to climb in data analysis. Apache Hive only supports updates on the partition level, while Hudi and Iceberg do better in low-frequency batch updates instead of real-time updates due to their Merge-on-Read and Copy-on-Write implementations. + +As for data updating, Apache Doris 2.0.0 is capable of: + +- **Faster data writing**: In the pressure tests with an online payment platform, under 20 concurrent data writing tasks, Doris reached a writing throughput of 300,000 records per second and maintained stability throughout the over 10-hour continuous writing process. +- **Partial column update**: Older versions of Doris implement partial column update by `replace_if_not_null` in the Aggregate Key model. In 2.0.0, we enable partial column updates in the Unique Key model. That means you can directly write data from multiple source tables into a flat table, without having to concatenate them into one output stream using Flink before writing. This method avoids a complicated processing pipeline and the extra resource consumption. You can simply specify the columns you need to update.
+- **Conditional update and deletion**: In addition to the simple Update and Delete operations, we support complicated conditional update and delete operations on the basis of Merge-on-Write. + +## Faster, Stabler, and Smarter Data Writing + +### Higher speed in data writing + +As part of our continuing effort to strengthen the real-time analytic capability of Apache Doris, we have improved the end-to-end real-time data writing capability of version 2.0.0. Benchmark tests reported higher throughput in various writing methods: + +- Stream Load, TPC-H 144G lineitem table, 48-bucket Duplicate table, triple-replica writing: throughput increased by 100% +- Stream Load, TPC-H 144G lineitem table, 48-bucket Unique Key table, triple-replica writing: throughput increased by 200% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Duplicate table: throughput increased by 50% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Unique Key table: throughput increased by 150% + +### Greater stability in high-concurrency data writing + +The sources of system instability often include small file merging, write amplification, and the consequential disk I/O and CPU overheads. Hence, we introduced Vertical Compaction and Segment Compaction in version 2.0.0 to eliminate OOM errors in compaction and avoid the generation of too many segment files during data writing. After such improvements, Apache Doris can write data 50% faster while **using only 10% of the memory that it previously used**. + +Read more: https://doris.apache.org/blog/Compaction + +### Auto-synchronization of table schema + +The latest Flink-Doris-Connector allows users to synchronize an entire database (such as MySQL and Oracle) to Apache Doris in one simple step. According to our test results, one single synchronization task can support the real-time concurrent writing of thousands of tables.
Users no longer need to go through a complicated synchronization procedure because Apache Doris has automated the process. Changes in the upstream data schema will be automatically captured and dynamically updated to Apache Doris in a seamless manner. + +Read more: https://doris.apache.org/blog/FDC + +## A New Multi-Tenant Resource Isolation Solution + +The purpose of multi-tenant resource isolation is to avoid resource preemption under heavy loads. To that end, older versions of Apache Doris adopted a hard isolation plan based on Resource Groups: Backend nodes of the same Doris cluster would be tagged, and those with the same tag formed a Resource Group. As data was ingested into the database, different data replicas would be written into different Resource Groups, which were responsible for different workloads. For example, data reading and writing could be conducted on different data tablets to achieve read-write separation. Similarly, you could put online and offline workloads on different Resource Groups. + +![](/images/release-note-2.0.0-7.png) + +This is an effective solution, but in practice, some Resource Groups are often heavily occupied while others are idle. We wanted a more flexible way to reduce the share of idle resources. Thus, in 2.0.0, we introduced the Workload Group resource soft limit. + +![](/images/release-note-2.0.0-8.png) + +The idea is to divide workloads into groups to allow for flexible management of CPU and memory resources. Apache Doris associates a query with a Workload Group, and limits the percentage of CPU and memory that a single query can use on a backend node. The memory soft limit can be configured and enabled by the user.
+ +When there is a cluster resource shortage, the system will kill the query tasks that consume the most memory; when there are sufficient cluster resources, once a Workload Group uses more resources than expected, the idle cluster resources will be shared among all the Workload Groups to make full use of the system memory and ensure stable execution of queries. You can also prioritize the Workload Groups in terms of resource allocation. In other words, you can decide which tasks are assigned adequate resources and which are not. + +Meanwhile, we introduced Query Queue in 2.0.0. Upon Workload Group creation, you can set a maximum number of queries for a query queue. Queries beyond that limit will wait for execution in the queue. This reduces the system burden under heavy workloads. + +## Elastic Scaling and Storage-Compute Separation + +When it comes to computation and storage resources, what do users want? + +- **Elastic scaling of computation resources**: Scale up resources quickly at peak times to increase efficiency and scale down at off-peak times to reduce costs. +- **Lower storage costs**: Use low-cost storage media and separate storage from computation. +- **Separation of workloads**: Isolate the computation resources of different workloads to avoid preemption. +- **Unified management of data**: Simply manage catalogs and data in one place. + +Separating storage and computation is a way to realize elastic scaling of resources, but it demands more effort in maintaining storage stability, which determines the stability and continuity of OLAP services. To ensure storage stability, we introduced mechanisms including cache management, computation resource management, and garbage collection. + +In this respect, we divide our users into three groups after investigation: + +1. Users with no need for resource scaling +2. Users requiring resource scaling, low storage costs, and workload separation from Apache Doris +3.
Users who already have a stable large-scale storage system and thus require an advanced compute-storage-separated architecture for efficient resource scaling + +Apache Doris 2.0 provides two solutions to address the needs of the first two types of users. + +1. **Compute nodes**. We introduced stateless compute nodes in version 2.0. Unlike the mix nodes, the compute nodes do not save any data and are not involved in workload balancing of data tablets during cluster scaling. Thus, they are able to quickly join the cluster and share the computing pressure during peak times. In addition, in data lakehouse analysis, these nodes will be the first ones to execute queries on remote storage (HDFS/S3) so there will be no resource competition between internal tables and external tables. + 1. Doc: https://doris.apache.org/docs/dev/advanced/compute_node/ +2. **Hot-cold data separation**. Hot/cold data refers to data that is frequently/seldom accessed, respectively. Generally, it makes more sense to store cold data in low-cost storage. Older versions of Apache Doris supported lifecycle management of table partitions: As hot data cooled down, it would be moved from SSD to HDD. However, data was stored with multiple replicas on HDD, which was still a waste. Now, in Apache Doris 2.0, cold data can be stored in object storage, which is even cheaper and allows single-copy storage. That reduces the storage costs by 70% and cuts down the computation and network overheads that come with storage. + 1. Read more: https://doris.apache.org/blog/HCDS/ + +For a cleaner separation of computation and storage, the VeloDB team is going to contribute the Cloud Compute-Storage-Separation solution to the Apache Doris project. Its performance and stability have stood the test of hundreds of companies in their production environments. The merging of code will be finished by October this year, and all Apache Doris users will be able to get an early taste of it in September.
+ +## Enhanced Usability + +Apache Doris 2.0.0 also highlights some enterprise-facing functionalities. + +### Support for Kubernetes Deployment + +Older versions of Apache Doris communicated based on IP, so any host failure in a Kubernetes deployment that caused a Pod IP drift would lead to cluster unavailability. Now, version 2.0 supports FQDN. That means the failed Doris nodes can recover automatically without human intervention, which lays the foundation for Kubernetes deployment and elastic scaling. + +### Support for Cross-Cluster Replication (CCR) + +Apache Doris 2.0.0 supports cross-cluster replication (CCR). Data changes at the database/table level in the source cluster will be synchronized to the target cluster. You can choose to replicate the incremental data or the full data. + +It also supports synchronization of DDL, which means DDL statements executed by the source cluster can also be automatically replicated to the target cluster. + +It is simple to configure and use CCR in Doris. Leveraging this functionality, you can implement read-write separation and multi-datacenter replication. + +This feature provides higher data availability and makes read/write workload separation and cross-data-center replication more efficient.
+ +## Behavior Change + +- Use rolling upgrade from 1.2-LTS to 2.0.0, and restart upgrade from preview versions of 2.0 to 2.0.0; +- The new query optimizer (Nereids) is enabled by default: `enable_nereids_planner=true`; +- All non-vectorized code has been removed from the system, so the `enable_vectorized_engine` parameter no longer works; +- A new parameter `enable_single_replica_compaction` has been added; +- datev2, datetimev2, and decimalv3 are the default data types in table creation; datev1, datetimev1, and decimalv2 are not supported in table creation; +- decimalv3 is the default data type for JDBC and Iceberg Catalog; +- A new data type `AGG_STATE` has been added; +- The cluster column has been removed from backend tables; +- For better compatibility with BI tools, datev2 and datetimev2 are displayed as date and datetime in the output of `show create table`; +- max_openfiles and swaps checks are added to the backend startup script, so an inappropriate system configuration might cause the backend to fail at startup; +- Password-free login is not allowed when accessing frontend on localhost; +- If there is a Multi-Catalog in the system, by default, only data of the internal catalog will be displayed when querying information schema; +- A limit has been imposed on the depth of the expression tree. The default value is 200; +- The single quote in the return value of array string has been changed to double quote; +- The Doris processes are renamed to DorisFE and DorisBE. +- The behavior of the two-argument forms of the AES and SM4 functions has changed. For more information, see the [related function docs](../../sql-manual/sql-functions/encrypt-digest-functions/sm4-encrypt.md) + +## Embarking on the 2.0.0 Journey + +To make Apache Doris 2.0.0 production-ready, we invited hundreds of enterprise users to engage in the testing and optimized it for better performance, stability, and usability. In the next phase, we will continue responding to user needs with agile release planning.
We plan to launch 2.0.1 in late August and 2.0.2 in September, as we keep fixing bugs and adding new features. We also plan to release an early version of 2.1 in September to bring a few long-requested capabilities to you. For example, in Doris 2.1, the Variant data type will better serve the schema-free analytic needs of semi-structured data; the multi-table materialized views will be able to simplify the data scheduling and processing link while speeding up queries; more and neater data ingestion methods will be added and nested composite data types will be realized. + +If you have any questions or ideas when investigating, testing, and deploying Apache Doris, please find us on [Slack](https://t.co/ZxJuNJHXb2). Our developers will be happy to hear them and provide targeted support. + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md new file mode 100644 index 0000000000000..d8c19fb67525b --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md @@ -0,0 +1,224 @@ +--- +{ + "title": "Release 2.0.1", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, 383 improvements and bug fixes have been made in Doris 2.0.1. 
+ +## Behavior Changes + +- [https://github.com/apache/doris/pull/21302](https://github.com/apache/doris/pull/21302) + +## Improvements + +### functionality and stability of array and map datatypes +- [https://github.com/apache/doris/pull/22793](https://github.com/apache/doris/pull/22793) +- [https://github.com/apache/doris/pull/22927](https://github.com/apache/doris/pull/22927) +- https://github.com/apache/doris/pull/22738 +- https://github.com/apache/doris/pull/22347 +- https://github.com/apache/doris/pull/23250 +- https://github.com/apache/doris/pull/22300 + +### performance for inverted index query +- https://github.com/apache/doris/pull/22836 +- https://github.com/apache/doris/pull/23381 +- https://github.com/apache/doris/pull/23389 +- https://github.com/apache/doris/pull/22570 + +### performance for bitmap, like, scan, agg functions +- https://github.com/apache/doris/pull/23172 +- https://github.com/apache/doris/pull/23495 +- https://github.com/apache/doris/pull/23476 +- https://github.com/apache/doris/pull/23396 +- https://github.com/apache/doris/pull/23182 +- https://github.com/apache/doris/pull/22216 + +### functionality and stability of CCR +- https://github.com/apache/doris/pull/22447 +- https://github.com/apache/doris/pull/22559 +- https://github.com/apache/doris/pull/22173 +- https://github.com/apache/doris/pull/22678 + +### merge on write unique table + +- https://github.com/apache/doris/pull/22282 +- https://github.com/apache/doris/pull/22984 +- https://github.com/apache/doris/pull/21933 +- https://github.com/apache/doris/pull/22874 + +### optimizer table stats and analyze + +- https://github.com/apache/doris/pull/22658 +- https://github.com/apache/doris/pull/22211 +- https://github.com/apache/doris/pull/22775 +- https://github.com/apache/doris/pull/22896 +- https://github.com/apache/doris/pull/22788 +- https://github.com/apache/doris/pull/22882 +- + +### functionality and performance of multi catalog + +- https://github.com/apache/doris/pull/22949 
+- https://github.com/apache/doris/pull/22923 +- https://github.com/apache/doris/pull/22336 +- https://github.com/apache/doris/pull/22915 +- https://github.com/apache/doris/pull/23056 +- https://github.com/apache/doris/pull/23297 +- https://github.com/apache/doris/pull/23279 + + +## Important Bug fixes + +- https://github.com/apache/doris/pull/22673 +- https://github.com/apache/doris/pull/22656 +- https://github.com/apache/doris/pull/22892 +- https://github.com/apache/doris/pull/22959 +- https://github.com/apache/doris/pull/22902 +- https://github.com/apache/doris/pull/22976 +- https://github.com/apache/doris/pull/22734 +- https://github.com/apache/doris/pull/22840 +- https://github.com/apache/doris/pull/23008 +- https://github.com/apache/doris/pull/23003 +- https://github.com/apache/doris/pull/22966 +- https://github.com/apache/doris/pull/22965 +- https://github.com/apache/doris/pull/22784 +- https://github.com/apache/doris/pull/23049 +- https://github.com/apache/doris/pull/23084 +- https://github.com/apache/doris/pull/22947 +- https://github.com/apache/doris/pull/22919 +- https://github.com/apache/doris/pull/22979 +- https://github.com/apache/doris/pull/23096 +- https://github.com/apache/doris/pull/23113 +- https://github.com/apache/doris/pull/23062 +- https://github.com/apache/doris/pull/22918 +- https://github.com/apache/doris/pull/23026 +- https://github.com/apache/doris/pull/23175 +- https://github.com/apache/doris/pull/23167 +- https://github.com/apache/doris/pull/23015 +- https://github.com/apache/doris/pull/23165 +- https://github.com/apache/doris/pull/23264 +- https://github.com/apache/doris/pull/23246 +- https://github.com/apache/doris/pull/23198 +- https://github.com/apache/doris/pull/23221 +- https://github.com/apache/doris/pull/23277 +- https://github.com/apache/doris/pull/23249 +- https://github.com/apache/doris/pull/23272 +- https://github.com/apache/doris/pull/23383 +- https://github.com/apache/doris/pull/23372 +- 
https://github.com/apache/doris/pull/23399 +- https://github.com/apache/doris/pull/23295 +- https://github.com/apache/doris/pull/23446 +- https://github.com/apache/doris/pull/23406 +- https://github.com/apache/doris/pull/23387 +- https://github.com/apache/doris/pull/23421 +- https://github.com/apache/doris/pull/23456 +- https://github.com/apache/doris/pull/23361 +- https://github.com/apache/doris/pull/23402 +- https://github.com/apache/doris/pull/23369 +- https://github.com/apache/doris/pull/23245 +- https://github.com/apache/doris/pull/23532 +- https://github.com/apache/doris/pull/23529 +- https://github.com/apache/doris/pull/23601 + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.1-merged+is%3Aclosed) . + + +## Big Thanks + +Thanks all who contribute to this release: + +@adonis0147 +@airborne12 +@amorynan +@AshinGau +@BePPPower +@BiteTheDDDDt +@bobhan1 +@ByteYue +@caiconghui +@CalvinKirs +@csun5285 +@DarvenDuan +@deadlinefen +@DongLiang-0 +@Doris-Extras +@dutyu +@englefly +@freemandealer +@Gabriel39 +@GoGoWen +@HappenLee +@hello-stephen +@HHoflittlefish777 +@hubgeter +@hust-hhb +@JackDrogon +@jacktengg +@jackwener +@Jibing-Li +@kaijchen +@kaka11chen +@Kikyou1997 +@Lchangliang +@LemonLiTree +@liaoxin01 +@LiBinfeng-01 +@lsy3993 +@luozenglin +@morningman +@morrySnow +@mrhhsg +@Mryange +@mymeiyi +@shuke987 +@sohardforaname +@starocean999 +@TangSiyang2001 +@Tanya-W +@ucasfl +@vinlee19 +@wangbo +@wsjz +@wuwenchi +@xiaokang +@XieJiann +@xinyiZzz +@yujun777 +@Yukang-Lian +@Yulei-Yang +@zclllyybb +@zddr +@zenoyang +@zgxme +@zhangguoqiang666 +@zhangstar333 +@zhannngchen +@zhiqiang-hhhh +@zxealous +@zy-kkk +@zzzxl1993 +@zzzzzzzs + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md new file mode 100644 index 0000000000000..5d8592a0ee25c --- /dev/null +++ 
b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.10", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 83 improvements and bug fixes have been made in Doris 2.0.10 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## Improvement and Optimizations + +- This enhancement introduces the `read_only` and `super_read_only` variables to the database system, ensuring compatibility with MySQL's read-only modes. + +- When the check status is not IO_ERROR, the disk path should not be added to the broken list. This ensures that only disks with actual I/O errors are marked as broken. + +- When performing a Create Table As Select (CTAS) operation from an external table, convert the `VARCHAR` column to `STRING` type. + +- Support mapping Paimon column type "ROW" to Doris type "STRUCT" + +- Choose disk tolerate with little skew when creating tablet + +- Write editlog to `set replica drop` to avoid confusing status on follower FE + +- Make the schema change memory space adaptive to avoid memory over limit + +- Inverted index 'unicode' tokenizer supports configuration to exclude stop words + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.9...2.0.10) . 
+ +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @BePPPower, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @deardeng, @DongLiang-0, @eldenmoon, @felixwluo, @HappenLee, @hubgeter, @jackwener, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @luennng, @morningman, @morrySnow, @Mryange, @nextdreamblue, @qidaye, @starocean999, @suxiaogang223, @SWJTU-ZhangLei, @w41ter, @xiaokang, @xy720, @yujun777, @Yukang-Lian, @zhangstar333, @zxealous, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md new file mode 100644 index 0000000000000..1a2598b0d41a0 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md @@ -0,0 +1,60 @@ +--- +{ + "title": "Release 2.0.11", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 123 improvements and bug fixes have been made in Doris 2.0.11 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## 1 Behavior change + +Since the inverted index is now mature and stable, it can replace the old BITMAP INDEX. Therefore, any newly created `BITMAP INDEX` will automatically switch to an `INVERTED INDEX`, while existing `BITMAP INDEX` will remain unchanged. This entire switching process is transparent to the user, with no changes to writing or querying. Additionally, users can disable this automatic switch by setting the FE configuration `enable_create_bitmap_index_as_inverted_index` to false. 
[#35528](https://github.com/apache/doris/pull/35528) + + +## 2 Improvement and optimizations + +- Add Trino JDBC Catalog type mapping for JSON and TIME + +- FE exits when it fails to transfer to (non) master, to prevent an unknown state and excessive logging + +- Write audit log when dropping table stats. + +- Ignore min/max column stats if table is partially analyzed to avoid inefficient query plan + +- Support minus operation for sets, like `set1 - set2` + +- Improve performance of LIKE and REGEXP clauses with `concat(col, pattern_str)`, e.g. `col1 LIKE concat('%', col2, '%')` + +- Add query options for short circuit queries for upgrade compatibility + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.10...2.0.11) . + +## Credits + +Thanks to all who contributed to this release: + +@AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @Gabriel39, @GoGoWen, @HHoflittlefish777, @hubgeter, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaka11chen, @kobe6th, @LiBinfeng-01, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @nextdreamblue, @qidaye, @sjyango, @starocean999, @SWJTU-ZhangLei, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XieJiann, @xy720, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md new file mode 100644 index 0000000000000..0bc289c91a8ef --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md @@ -0,0 +1,58 @@ +--- +{ + "title": "Release 2.0.12", + "language": "en" +} +--- + + + +Thanks to our community developers and users for their contributions. Doris version 2.0.12 will bring 99 improvements and bug fixes.
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- No longer set the default table comment to the table type. Instead, set it to be empty by default, for example, change COMMENT 'OLAP' to COMMENT ' '. This new behavior is more friendly for BI software that relies on table comments. [#35855](https://github.com/apache/doris/pull/35855) + +- Change the type of the `@@autocommit` variable from `BOOLEAN` to `BIGINT` to prevent errors from certain MySQL clients (such as .NET MySQL.Data). [#33282](https://github.com/apache/doris/pull/33282) + + +## Improvements + +- Remove the `disable_nested_complex_type` parameter and allow the creation of nested `ARRAY`, `MAP`, and `STRUCT` types by default. [#36255](https://github.com/apache/doris/pull/36255) + +- The HMS catalog supports the `SHOW CREATE DATABASE` command. [#28145](https://github.com/apache/doris/pull/28145) + +- Add more inverted index metrics to the query profile. [#36545](https://github.com/apache/doris/pull/36545) + +- Cross-Cluster Replication (CCR) supports inverted indices. [#31743](https://github.com/apache/doris/pull/31743) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.11...2.0.12) , with the key features and improvements highlighted below. 
+ + + +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @amorynan, @BiteTheDDDDt, @cambyzju, @caoliang-web, @dataroaring, @eldenmoon, @feiniaofeiafei, @felixwluo, @gavinchou, @HappenLee, @hello-stephen, @jacktengg, @Jibing-Li, @Johnnyssc, @liaoxin01, @LiBinfeng-01, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @qidaye, @qzsee, @starocean999, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XuPengfei-1020, @xy720, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md new file mode 100644 index 0000000000000..1b6e54d948d7d --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md @@ -0,0 +1,61 @@ +--- +{ + "title": "Release 2.0.13", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 112 improvements and bug fixes have been made in Doris 2.0.13. + +[Quick Download](https://doris.apache.org/download/) + +## Behavior changes + +SQL input is treated as multiple statements only when the `CLIENT_MULTI_STATEMENTS` setting is enabled on the client side, enhancing compatibility with MySQL. [#36759](https://github.com/apache/doris/pull/36759) + +## New features + +- A new BE configuration `allow_zero_date` has been added, allowing dates with all zeros. When set to `false`, `0000-00-00` is parsed as `NULL`, and when set to `true`, it is parsed as `0000-01-01`. The default value is `false` to maintain consistency with previous behavior. [#34961](https://github.com/apache/doris/pull/34961) + +- `LogicalWindow` and `LogicalPartitionTopN` support multi-field predicate pushdown to improve performance.
[#36828](https://github.com/apache/doris/pull/36828) + +- The ES Catalog now maps ES `nested` or `object` types to Doris `JSON` types. [#37101](https://github.com/apache/doris/pull/37101) + +## Improvements + +- Queries with `LIMIT` end reading data earlier to reduce resource consumption and improve performance. [#36535](https://github.com/apache/doris/pull/36535) + +- Special JSON data with empty keys is now supported. [#36762](https://github.com/apache/doris/pull/36762) + +- Stability and usability of routine load have been improved, including load balancing, automatic recovery, exception handling, and more user-friendly error messages. [#36450](https://github.com/apache/doris/pull/36450) [#35376](https://github.com/apache/doris/pull/35376) [#35266](https://github.com/apache/doris/pull/35266) [ #33372](https://github.com/apache/doris/pull/33372) [#32282](https://github.com/apache/doris/pull/32282) [#32046](https://github.com/apache/doris/pull/32046) [#32021](https://github.com/apache/doris/pull/32021) [#31846](https://github.com/apache/doris/pull/31846) [#31273](https://github.com/apache/doris/pull/31273) + +- BE load balancing selection of hard disk strategy and speed optimization. [#36826](https://github.com/apache/doris/pull/36826) [#36795](https://github.com/apache/doris/pull/36795) [#36509](https://github.com/apache/doris/pull/36509) + +- Stability and usability of the JDBC catalog have been improved, including encryption, thread pool connection count configuration, and more user-friendly error messages. [#36940](https://github.com/apache/doris/pull/36940) [#36720](https://github.com/apache/doris/pull/36720) [#30880](https://github.com/apache/doris/pull/30880) [#35692](https://github.com/apache/doris/pull/35692) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.12...2.0.13) , with the key features and improvements highlighted below. 
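As a rough illustration of the `allow_zero_date` setting listed under New features, the sketch below mimics the documented handling of an all-zero date in plain Python. The function name `parse_zero_date` and the constant `ZERO_DATE` are hypothetical and are not part of Doris. The real logic lives inside the BE, and this sketch only mirrors the behavior stated in the release note, not the actual implementation.

```python
from typing import Optional

# The all-zero date literal mentioned in the release note.
ZERO_DATE = "0000-00-00"


def parse_zero_date(text: str, allow_zero_date: bool = False) -> Optional[str]:
    """Hypothetical helper: mirrors the documented `allow_zero_date` behavior.

    - allow_zero_date=False (the documented default): "0000-00-00" becomes NULL,
      represented here as None.
    - allow_zero_date=True: "0000-00-00" becomes "0000-01-01".
    Any other input is returned unchanged, since it is outside this sketch.
    """
    if text != ZERO_DATE:
        return text
    return "0000-01-01" if allow_zero_date else None


if __name__ == "__main__":
    print(parse_zero_date("0000-00-00"))
    print(parse_zero_date("0000-00-00", allow_zero_date=True))
```

Keeping `false` as the default in the sketch matches the note's statement that the default preserves the previous behavior, so only deployments that opt in would see `0000-01-01`.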
+ +## Credits + +Thanks to all who contributed to this release: + +@Gabriel39, @Jibing-Li, @Johnnyssc, @Lchangliang, @LiBinfeng-01, @SWJTU-ZhangLei, @Thearas, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @jacktengg, @kaijchen, @liutang123, @luwei16, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @sollhui, @starocean999, @w41ter, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md new file mode 100644 index 0000000000000..061c5cb7a1093 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.14", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 110 improvements and bug fixes have been made in Doris 2.0.14 version + + +## 1 New features + +- Adds a REST interface to retrieve the most recent query profile: `curl http://user:password@127.0.0.1:8030/api/profile/text` [#38268](https://github.com/apache/doris/pull/38268) + +## 2 Improvements + +- Optimizes the primary key point query performance for MOW tables with sequence columns [#38287](https://github.com/apache/doris/pull/38287) + +- Enhances the performance of inverted index queries with many conditions [#35346](https://github.com/apache/doris/pull/35346) + +- Automatically enables the `support_phrase` option when creating a tokenized inverted index to accelerate `match_phrase` phrase queries [#37949](https://github.com/apache/doris/pull/37949) + +- Supports simplified SQL hints, for example: `SELECT /*+ query_timeout(3000) */ * FROM t;` [#37720](https://github.com/apache/doris/pull/37720) + +- Automatically retries reading from object storage when encountering a `429` error to improve stability 
[#35396](https://github.com/apache/doris/pull/35396) + +- LEFT SEMI / ANTI JOIN terminates subsequent matching execution upon matching a qualifying data row to enhance performance. [#34703](https://github.com/apache/doris/pull/34703) + +- Prevents coredump when returning illegal data to MySQL results. [#28069](https://github.com/apache/doris/pull/28069) + +- Unifies the output of type names in lowercase to maintain compatibility with MySQL and be more friendly to BI tools. [#38521](https://github.com/apache/doris/pull/38521) + + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.13...2.0.14) , with the key features and improvements highlighted below. + +## Credits + +Thanks all who contribute to this release: + +@ByteYue, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Lchangliang, @LiBinfeng-01, @Mryange, @XieJiann, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @biohazard4321, @cambyzju, @csun5285, @eldenmoon, @englefly, @freemandealer, @hello-stephen, @hubgeter, @kaijchen, @liaoxin01, @luwei16, @morningman, @morrySnow, @mymeiyi, @qidaye, @sollhui, @starocean999, @w41ter, @wuwenchi, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md new file mode 100644 index 0000000000000..58237f7c3f097 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md @@ -0,0 +1,91 @@ +--- +{ + "title": "Release 2.0.15", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 157 improvements and bug fixes have been made in Doris 2.0.15 version + +- Quick Download: https://doris.apache.org/download + +- GitHub: https://github.com/apache/doris/releases/tag/2.0.15 + +## 1 Behavior Change + +NA + +## 2 New Features + +- Restore now supports deleting redundant tablets 
and partition options. [#39028](https://github.com/apache/doris/pull/39028) + +- Support JSON function `json_search`.[#40948](https://github.com/apache/doris/pull/40948) + +## 3 Improvement and Optimizations + +### Stability + +- Add a FE configuration `abort_txn_after_lost_heartbeat_time_second` for transaction abort time. [#28662](https://github.com/apache/doris/pull/28662) + +- Abort transactions after a BE loses heartbeat for over 1 minute instead of 5 seconds, to avoid overly sensitive transaction aborts. [#22781](https://github.com/apache/doris/pull/22781) + +- Delay scheduling EOF tasks of routine load to avoid an excessive number of small transactions. [#39975](https://github.com/apache/doris/pull/39975) + +- Prefer querying from online disk services to be more robust. [#39467](https://github.com/apache/doris/pull/39467) + +- Skip checking newly inserted rows in non-strict mode partial updates if the row's delete sign is marked. [#40322](https://github.com/apache/doris/pull/40322) + +- To prevent FE OOM, limit the number of tablets in backup tasks, with a default value of 300,000. [#39987](https://github.com/apache/doris/pull/39987) + +### Performance + +- Optimize slow column updates caused by concurrent column updates and compactions. [#38487](https://github.com/apache/doris/pull/38487) + +- When a NullLiteral exists in a filter condition, it can now be folded into False and further converted to an EmptySet to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Improve performance of `ORDER BY` permutation. [#38985](https://github.com/apache/doris/pull/38985) + +- Improve the performance of string processing in inverted indexes. [#37395](https://github.com/apache/doris/pull/37395) + +### Optimizer and Statistics + +- Added support for statements beginning with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Polish aggregate function signature matching. 
[#39352](https://github.com/apache/doris/pull/39352) + +- Drop column statistics and trigger auto analysis after schema change. [#39101](https://github.com/apache/doris/pull/39101) + +- Support dropping cached stats using `DROP CACHED STATS table_name`. [#39367](https://github.com/apache/doris/pull/39367) + +### Multi Catalog and Others + +- Optimize JDBC Catalog refresh to reduce the frequency of client creation. [#40261](https://github.com/apache/doris/pull/40261) + +- Fix thread leaks in JDBC Catalog under certain conditions. [#39423](https://github.com/apache/doris/pull/39423) + +- ARRAY MAP STRUCT types now support `REPLACE_IF_NOT_NULL`. [#38304](https://github.com/apache/doris/pull/38304) + +- Retry delete jobs for failures that are not `DELETE_INVALID_XXX`. [#37834](https://github.com/apache/doris/pull/37834) + +**Credits** + +@924060929, @BePPPower, @BiteTheDDDDt, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Johnnyssc, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Toms1999, @Vallishp, @Yukang-Lian, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @htyoung, @hubgeter, @justfortaste, @liaoxin01, @liugddx, @liutang123, @luwei16, @mongo360, @morrySnow, @qidaye, @smallx, @sollhui, @starocean999, @w41ter, @xiaokang, @xzj7019, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhannngchen, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md new file mode 100644 index 0000000000000..3f8e89cddf946 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md @@ -0,0 +1,157 @@ +--- +{ + "title": "Release 2.0.2", + "language": "en" +} +--- + + + +# Release 2.0.2 + +Thanks to our community users and developers, 489 improvements and bug fixes have been made in Doris 2.0.2. 
+ +## Behavior Changes + +- [Remove json -> operator convert to json_extract #24679](https://github.com/apache/doris/pull/24679) + + Remove json '->' operator since it is conflicted with lambda function syntax. It's a syntax sugar for function json_extract and can be replaced with the former. +- [Start the script to set metadata_failure_recovery #24308](https://github.com/apache/doris/pull/24308) + + Move metadata_failure_recovery from fe.conf to start_fe.sh argument to prevent being used unexpectedly. +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Optimize priority_ network matching logic for be #23795](https://github.com/apache/doris/pull/23795) +- [Fix cancel load failed because Job could not be cancelled… #17730](https://github.com/apache/doris/pull/17730) + + Allow cancel a retrying load job. + +## Improvements + +### Easier to use + +- [Support custom lib dir to save custom libs #23887](https://github.com/apache/doris/pull/23887) + + Add a custom_lib dir to allow users place custom lib files and custom_lib will not be replaced. +- [Optimize priority_ network matching logic #23784](https://github.com/apache/doris/pull/23784) + + Optimize priority_network logic to avoid error when this config is wrong or not configured. +- [Row policy support role #23022](https://github.com/apache/doris/pull/23022) + + Support role based auth for row policy. + +### New optimizer Nereids statistics collection improvement + +- [Disable file cache while running analysis tasks. #23663](https://github.com/apache/doris/pull/23663) +- [Show column stats even when error occurred. #23703](https://github.com/apache/doris/pull/23703) +- [Support basic jdbc external table stats collection. 
#23965](https://github.com/apache/doris/pull/23965) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) + +### Better support for JDBC, HDFS, Hive, MySQL, Max Compute, Multi-Catalog + +- [Support hadoop viewfs. #24168](https://github.com/apache/doris/pull/24168) +- [Avoid calling checksum when replaying creating jdbc catalog and fix ranger issue #22369](https://github.com/apache/doris/pull/22369) +- [Optimize the JDBC Catalog connection error message #23868](https://github.com/apache/doris/pull/23868) + + Improve property check and error message for JDBC catalog +- [Fix mc decimal type parse, fix wrong obj location #24242](https://github.com/apache/doris/pull/24242) + + Fix some issues for Max Compute catalog +- [Support sql cache for hms catalog #23391](https://github.com/apache/doris/pull/23391) + + SQL cache for Hive catalog +- [Merge hms partition events. #22869](https://github.com/apache/doris/pull/22869) + + Improve performance for Hive metadata sync +- [Add metadata_name_ids for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql #22702](https://github.com/apache/doris/pull/22702) + +### Performance for inverted index query + +- [Add bkd index query cache to improve perf #23952](https://github.com/apache/doris/pull/23952) +- [Improve performance for count on index other than match #24678](https://github.com/apache/doris/pull/24678) +- [Improve match performance without index #24751](https://github.com/apache/doris/pull/24751) +- [Optimize multiple terms conjunction query #23871](https://github.com/apache/doris/pull/23871) +Improve performance of MATCH_ALL +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +Improve performance of MATCH + +### Improve Array functions + +- [[Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Improve array union support multi 
params #24327](https://github.com/apache/doris/pull/24327) +- [Improve explode func with array nested complex type #24455](https://github.com/apache/doris/pull/24455) + +## Important Bug fixes + +- [The parameter positions of timestamp diff function to sql are reversed #23601](https://github.com/apache/doris/pull/23601) +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Fix query cache returns wrong result after deleting partitions. #23555](https://github.com/apache/doris/pull/23555) +- [Fix potential data loss when clone task's dst tablet is cooldown replica #17644](https://github.com/apache/doris/pull/17644) +- [Fix array map batch append data with right next_array_item_rowid #23779](https://github.com/apache/doris/pull/23779) +- [Fix or to in rule #23940](https://github.com/apache/doris/pull/23940) +- [Fix 'char' function's toSql implementation is wrong #23860](https://github.com/apache/doris/pull/23860) +- [Record wrong best plan properties #23973](https://github.com/apache/doris/pull/23973) +- [Make TVF's distribution spec always be RANDOM #24020](https://github.com/apache/doris/pull/24020) +- [External scan use STORAGE_ANY instead of ANY as distibution #24039](https://github.com/apache/doris/pull/24039) +- [Runtimefilter target is not SlotReference #23958](https://github.com/apache/doris/pull/23958) +- [mv in select materialized_view should disable show table #24104](https://github.com/apache/doris/pull/24104) +- [Fail over to remote file reader if local cache failed #24097](https://github.com/apache/doris/pull/24097) +- [Fix revoke role operation cause fe down #23852](https://github.com/apache/doris/pull/23852) +- [Handle status code correctly and add a new error code `ENTRY_NOT_FOUND` #24139](https://github.com/apache/doris/pull/24139) +- [Fix leaky abstraction and shield the status code `END_OF_FILE` from upper layers #24165](https://github.com/apache/doris/pull/24165) +- [Fix bug that Read 
garbled files caused be crash. #24164](https://github.com/apache/doris/pull/24164) +- [Fix be core when user sepcified empty `column_separator` using hdfs tvf #24369](https://github.com/apache/doris/pull/24369) +- [Fix need to restart BE after replacing the jar package in java-udf #24372](https://github.com/apache/doris/pull/24372) +- [Need to call 'set_version' in nested functions #24381](https://github.com/apache/doris/pull/24381) +- [windown_funnel compatibility issue with multi backends #24385](https://github.com/apache/doris/pull/24385) +- [correlated anti join shouldn't be translated to null aware anti join #24290](https://github.com/apache/doris/pull/24290) +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Fix analyze failed when there are thousands of partitions. #24521](https://github.com/apache/doris/pull/24521) +- [Do not use enum as the data type for JavaUdfDataType. #24460](https://github.com/apache/doris/pull/24460) +- [Fix multi window projection issue temporarily #24568](https://github.com/apache/doris/pull/24568) +- [Make metadata compatible with 2.0.3 #24610](https://github.com/apache/doris/pull/24610) +- [Select outfile column order is wrong #24595](https://github.com/apache/doris/pull/24595) +- [Incorrect result of semi/anti mark join #24616](https://github.com/apache/doris/pull/24616) +- [Fix broker read issue #24635](https://github.com/apache/doris/pull/24635) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) +- [Fixed bug when parsing multi-character delimiters. 
#24572](https://github.com/apache/doris/pull/24572) +- [Fix timezone parse when there is no tzfile #24578](https://github.com/apache/doris/pull/24578) +- [We need to issue an error when starting FE without setting the Java home environment #23943](https://github.com/apache/doris/pull/23943) +- [Enable_unique_key_partial_update should be forwarded to master #24697](https://github.com/apache/doris/pull/24697) +- [Fix paimon file catalog meta issue and replication num analysis issue #24681](https://github.com/apache/doris/pull/24681) +- [Add more log for ingest_binlog && Fix ingest_binlog not rewrite rowset_meta tablet_uid #24617](https://github.com/apache/doris/pull/24617) +- [Do not abort when a disk is broken #24692](https://github.com/apache/doris/pull/24692) +- [colocate join could not work well on full outer join #24700](https://github.com/apache/doris/pull/24700) +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +- [Optimize the reading efficiency of nullable (string) columns. #24698](https://github.com/apache/doris/pull/24698) +- [Fix segment cache core when output rowset is nullptr #24778](https://github.com/apache/doris/pull/24778) +- [Fix duplicate key in schema change #24782](https://github.com/apache/doris/pull/24782) +- [Make metadata compatible for future version after 2.0.2 #24800](https://github.com/apache/doris/pull/24800) +- [Fix map/array deserialize string with quote pair #24808](https://github.com/apache/doris/pull/24808) +- [Failed on arm platform, with clang compiler and pch on, close #24633 #24636](https://github.com/apache/doris/pull/24636) +- [Table column order is changed if add a column and do truncate #24981](https://github.com/apache/doris/pull/24981) +- [Make parser mode coarse grained by default #24949](https://github.com/apache/doris/pull/24949) + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.2-merged+is%3Aclosed) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +[@adonis0147](https://github.com/adonis0147) [@airborne12](https://github.com/airborne12) [@amorynan](https://github.com/amorynan) [@AshinGau](https://github.com/AshinGau) [@BePPPower](https://github.com/BePPPower) [@BiteTheDDDDt](https://github.com/BiteTheDDDDt) [@bobhan1](https://github.com/bobhan1) [@ByteYue](https://github.com/ByteYue) [@caiconghui](https://github.com/caiconghui) [@CalvinKirs](https://github.com/CalvinKirs) [@cambyzju](https://github.com/cambyzju) [@ChengDaqi2023](https://github.com/ChengDaqi2023) [@ChinaYiGuan](https://github.com/ChinaYiGuan) [@CodeCooker17](https://github.com/CodeCooker17) [@csun5285](https://github.com/csun5285) [@dataroaring](https://github.com/dataroaring) [@deadlinefen](https://github.com/deadlinefen) [@DongLiang-0](https://github.com/DongLiang-0) [@Doris-Extras](https://github.com/Doris-Extras) [@dutyu](https://github.com/dutyu) [@eldenmoon](https://github.com/eldenmoon) [@englefly](https://github.com/englefly) [@freemandealer](https://github.com/freemandealer) [@Gabriel39](https://github.com/Gabriel39) [@gnehil](https://github.com/gnehil) [@GoGoWen](https://github.com/GoGoWen) [@gohalo](https://github.com/gohalo) [@HappenLee](https://github.com/HappenLee) [@hello-stephen](https://github.com/hello-stephen) [@HHoflittlefish777](https://github.com/HHoflittlefish777) [@hubgeter](https://github.com/hubgeter) [@hust-hhb](https://github.com/hust-hhb) [@ixzc](https://github.com/ixzc) [@JackDrogon](https://github.com/JackDrogon) [@jacktengg](https://github.com/jacktengg) [@jackwener](https://github.com/jackwener) [@Jibing-Li](https://github.com/Jibing-Li) [@JNSimba](https://github.com/JNSimba) [@kaijchen](https://github.com/kaijchen) [@kaka11chen](https://github.com/kaka11chen) [@Kikyou1997](https://github.com/Kikyou1997) [@Lchangliang](https://github.com/Lchangliang) [@LemonLiTree](https://github.com/LemonLiTree) [@liaoxin01](https://github.com/liaoxin01) 
[@LiBinfeng-01](https://github.com/LiBinfeng-01) [@liugddx](https://github.com/liugddx) [@luwei16](https://github.com/luwei16) [@mongo360](https://github.com/mongo360) [@morningman](https://github.com/morningman) [@morrySnow](https://github.com/morrySnow) @mrhhsg @Mryange @mymeiyi @neuyilan @pingchunzhang @platoneko @qidaye @realize096 @RYH61 @shuke987 @sohardforaname @starocean999 @SWJTU-ZhangLei @TangSiyang2001 @Tech-Circle-48 @w41ter @wangbo @wsjz @wuwenchi @wyx123654 @xiaokang @XieJiann @xinyiZzz @XuJianxu @xutaoustc @xy720 @xyfsjq @xzj7019 @yiguolei @yujun777 @Yukang-Lian @Yulei-Yang @zclllyybb @zddr @zhangguoqiang666 @zhangstar333 @ZhangYu0123 @zhannngchen @zxealous @zy-kkk @zzzxl1993 @zzzzzzzs diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md new file mode 100644 index 0000000000000..a716d6d711fb0 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md @@ -0,0 +1,253 @@ +--- +{ + "title": "Release 2.0.3", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 1000 improvements and bug fixes have been made in Doris 2.0.3 version, including optimizer statistics, inverted index, complex datatypes, data lake, replica management. + + + +## 1 Behavior change + +- The output format of the complex data type array/map/struct has been changed to be consistent to the input format and JSON specification. The main changes from the previous version are that DATE/DATETIME and STRING/VARCHAR are enclosed in double quotes and null values inside ARRAY/MAP are displayed as `null` instead of `NULL`. + - https://github.com/apache/doris/pull/25946 +- SHOW_VIEW permission is supported. Users with SELECT or LOAD permission will no longer be able to execute the 'SHOW CREATE VIEW' statement and must be granted the SHOW_VIEW permission separately. 
+ - https://github.com/apache/doris/pull/25370 + + +## 2 New features + +### 2.1 Support collecting statistics for optimizer automatically + +Collecting statistics helps the optimizer understand the data distribution characteristics and choose a better plan to greatly improve query performance. It is officially supported starting from version 2.0.3 and is enabled all day by default. + +### 2.2 Support complex datatypes for more datalake sources +- Support complex datatypes for JAVA UDF, JDBC and Hudi MOR + - https://github.com/apache/doris/pull/24810 + - https://github.com/apache/doris/pull/26236 +- Support complex datatypes for Paimon + - https://github.com/apache/doris/pull/25364 +- Support Paimon version 0.5 + - https://github.com/apache/doris/pull/24985 + + +### 2.3 Add more builtin functions +- Support the BitmapAgg function in new optimizer + - https://github.com/apache/doris/pull/25508 +- Supports SHA series digest functions + - https://github.com/apache/doris/pull/24342 +- Support the BITMAP datatype in the aggregate functions min_by and max_by + - https://github.com/apache/doris/pull/25430 +- Add milliseconds/microseconds_add/sub/diff functions + - https://github.com/apache/doris/pull/24114 +- Add some json functions: json_insert, json_replace, json_set + - https://github.com/apache/doris/pull/24384 + + +## 3 Improvement and optimizations + +### 3.1 Performance optimizations + +- When the inverted index MATCH WHERE condition with a high filter rate is combined with the common WHERE condition with a low filter rate, the I/O of the index column is greatly reduced. +- Optimize the efficiency of random data access after the where filter. +- Optimizes the performance of the old get_json_xx function on JSON data types by 2~4x. +- Supports the configuration to reduce the priority of the data read thread, ensuring the CPU resources for real-time writing.
+- Adds `uuid_numeric` function that returns largeint, which is 20 times faster than `uuid` function that returns string. +- Optimized the performance of case when by 3x. +- Cut out unnecessary predicate calculations in storage engine execution. +- Accelerate count performance by pushing down count operator to storage tier. +- Optimizes the computation performance of the nullable type in `and`/`or` expressions. +- Supports rewriting the limit operator before `join` in more scenarios to improve query performance. +- Eliminate useless `order by` operators from inline view to improve query performance. +- Optimizes the accuracy of cardinality estimates and cost models in some cases. +- Optimized jdbc catalog predicate pushdown logic. +- Optimized the read efficiency of the file cache when it is enabled for the first time. +- Optimizes the hive table sql cache policy and uses the partition update time stored in HMS to improve the cache hit ratio. +- Optimize mow compaction efficiency. +- Optimized thread allocation logic for external table query to reduce memory usage +- Optimize memory usage for column reader. + + + +### 3.2 Distributed replica management improvements + +Distributed replica management improvements include skipping partition deletion, colocate group deletion, balance failure due to continuous write, and hot and cold separation table balance. + + +### 3.3 Security enhancement +- The audit log plug-in uses a token instead of a plaintext password to enhance security + - https://github.com/apache/doris/pull/26278 +- log4j configures security enhancement + - https://github.com/apache/doris/pull/24861 +- Sensitive user information is not displayed in logs + - https://github.com/apache/doris/pull/26912 + + +## 4 Bugfix and stability + +### 4.1 Complex datatypes +- Fix issues that fixed-length CHAR(n) was not truncated correctly in map/struct.
+ - https://github.com/apache/doris/pull/25725 +- Fix write failure for struct datatype nested for map/array + - https://github.com/apache/doris/pull/26973 +- Fix the issue that count distinct did not support array/map/struct + - https://github.com/apache/doris/pull/25483 +- Fix be crash in updating to 2.0.3 after the delete complex type appeared in query + - https://github.com/apache/doris/pull/26006 +- Fix be crash when JSON datatype is in WHERE clause. + - https://github.com/apache/doris/pull/27325 +- Fix be crash when ARRAY datatype is in OUTER JOIN clause. + - https://github.com/apache/doris/pull/25669 +- Fix reading incorrect result for DECIMAL datatype in ORC format. + - https://github.com/apache/doris/pull/26548 + - https://github.com/apache/doris/pull/25977 + - https://github.com/apache/doris/pull/26633 + +### 4.2 Inverted index +- Fix incorrect result for OR NOT combination in WHERE clause were incorrect when disable inverted index query. + - https://github.com/apache/doris/pull/26327 +- Fix be crash when write a empty with inverted index + - https://github.com/apache/doris/pull/25984 +- Fix be crash in index compaction when the output of compaction is empty. + - https://github.com/apache/doris/pull/25486 +- Fixed the problem of adding an inverted index to be crashed when no data is written to the newly added column. +- Fix be crash when BUILD INDEX after ADD COLUMN without new data written. + - https://github.com/apache/doris/pull/27276 +- Fix missing and leak problem of hardlink for inverted index file. 
+ - https://github.com/apache/doris/pull/26903 +- Fix index file corruption when disk is temporarily full + - https://github.com/apache/doris/pull/28191 +- Fix incorrect result due to optimization for skip reading index column + - https://github.com/apache/doris/pull/28104 + +### 4.3 Materialized View +- Fix the problem of BE crash caused by repeated expressions in the group by statement +- Fix be crash when there are duplicate expressions in `group by` statements. + - https://github.com/apache/doris/pull/27523 +- Disables the float/double type in the `group by` clause when a view is created. + - https://github.com/apache/doris/pull/25823 +- Improve the function of select query matching materialized view + - https://github.com/apache/doris/pull/24691 +- Fix an issue that materialized views could not be matched when a table alias was used + - https://github.com/apache/doris/pull/25321 +- Fix the problem using percentile_approx when creating materialized views + - https://github.com/apache/doris/pull/26528 + +### 4.4 Table sample +- Fix the problem that table sample query can not work on table with partitions. + - https://github.com/apache/doris/pull/25912 +- Fix the problem that table sample query can not work when a tablet is specified. + - https://github.com/apache/doris/pull/25378 + + +### 4.5 Unique with merge on write +- Fix null pointer exception in conditional update based on primary key + - https://github.com/apache/doris/pull/26881 +- Fix field name capitalization issues in partial update + - https://github.com/apache/doris/pull/27223 +- Fix duplicate keys occurring in mow during schema change repair. + - https://github.com/apache/doris/pull/25705 + + +### 4.6 Load and compaction +- Fix unknown slot descriptor error in routineload for running multiple tables + - https://github.com/apache/doris/pull/25762 +- Fix be crash due to concurrent memory access when calculating memory + - https://github.com/apache/doris/pull/27101 +- Fix be crash on duplicate cancel for load.
+ - https://github.com/apache/doris/pull/27111 +- Fix broker connection error during broker load + - https://github.com/apache/doris/pull/26050 +- Fix incorrect result of delete predicates in concurrent case of compaction and scan. + - https://github.com/apache/doris/pull/24638 +- Fix the problem that compaction task would print too many stacktrace logs + - https://github.com/apache/doris/pull/25597 + + +### 4.7 Data Lake compatibility +- Solve the problem that the iceberg table contains special characters that cause query failure + - https://github.com/apache/doris/pull/27108 +- Fix compatibility issues of different hive metastore versions + - https://github.com/apache/doris/pull/27327 +- Fix an error reading max compute partition table + - https://github.com/apache/doris/pull/24911 +- Fix the issue that backup to object storage failed + - https://github.com/apache/doris/pull/25496 + - https://github.com/apache/doris/pull/25803 + + +### 4.8 JDBC external table compatibility + +- Fix Oracle date type format error in jdbc catalog + - https://github.com/apache/doris/pull/25487 +- Fix MySQL 0000-00-00 date exception in jdbc catalog + - https://github.com/apache/doris/pull/26569 +- Fix an exception in reading data from Mariadb where the default value of the time type is current_timestamp + - https://github.com/apache/doris/pull/25016 +- Fix be crash when processing BITMAP datatype in jdbc catalog + - https://github.com/apache/doris/pull/25034 + - https://github.com/apache/doris/pull/26933 + + +### 4.9 SQL Planner and Optimizer + +- Fix partition prune error in some scenes + - https://github.com/apache/doris/pull/27047 + - https://github.com/apache/doris/pull/26873 + - https://github.com/apache/doris/pull/25769 + - https://github.com/apache/doris/pull/27636 + +- Fix incorrect sub-query processing in some scenarios + - https://github.com/apache/doris/pull/26034 + - https://github.com/apache/doris/pull/25492 + - https://github.com/apache/doris/pull/25955 + - 
https://github.com/apache/doris/pull/27177 + +- Fix some semantic parsing errors + - https://github.com/apache/doris/pull/24928 + - https://github.com/apache/doris/pull/25627 + +- Fix data loss during right outer/anti join + - https://github.com/apache/doris/pull/26529 + +- Fix incorrect pushing down of predicate past aggregation operators. + - https://github.com/apache/doris/pull/25525 + +- Fix incorrect result header in some cases + - https://github.com/apache/doris/pull/25372 + +- Fix incorrect plan when the nullsafeEquals expression (<=>) is used as the join condition + - https://github.com/apache/doris/pull/27127 + +- Fix incorrect column prune in set operation operator. + - https://github.com/apache/doris/pull/26884 + + +### Others + +- Fix BE crash when the order of columns in a table is changed and then upgraded to 2.0.3. + - https://github.com/apache/doris/pull/28205 + + +See the complete list of improvements and bug fixes on [github dev/2.0.3-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.3-merged+is%3Aclosed) . diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md new file mode 100644 index 0000000000000..e1dac58fbf69a --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md @@ -0,0 +1,67 @@ +--- +{ + "title": "Release 2.0.4", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 333 improvements and bug fixes have been made in Doris 2.0.4 version.
+ +**Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- More reasonable and accurate precision and scale inference for decimal data type + - [https://github.com/apache/doris/pull/28034](https://github.com/apache/doris/pull/28034) + +- Support drop policy for user or role + - [https://github.com/apache/doris/pull/29488](https://github.com/apache/doris/pull/29488) + +## New features + +- Support datev1, datetimev1 and decimalv2 datatypes in new optimizer Nereids. +- Support ODBC table for new optimizer Nereids. +- Add `lower_case` and `ignore_above` option for inverted index +- Support `match_regexp` and `match_phrase_prefix` optimization by inverted index +- Support paimon native reader in datalake +- Support audit-log for `insert into` SQL +- Support reading parquet file in lzo compressed format + +## Improvement and optimizations + +- Improve storage management including balance, migration, publish and others. +- Improve storage cooldown policy to save disk space. +- Performance optimization for substr with ascii string. +- Improve partition prune when date function is used. +- Improve auto analyze visibility and performance.
+ +See the complete list of improvements and bug fixes on github [dev/2.0.4-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.4-merged+is%3Aclosed) + + + +## Credits +Last but not least, this release would not have been possible without the following contributors: + +airborne12, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, bobhan1, ByteYue, caiconghui,CalvinKirs, cambyzju, caoliang-web, catpineapple, csun5285, dataroaring, deardeng, dutyu, eldenmoon, englefly, feifeifeimoon, fornaix, Gabriel39, gnehil, HappenLee, hello-stephen, HHoflittlefish777,hubgeter, hust-hhb, ixzc, jacktengg, jackwener, Jibing-Li, kaka11chen, KassieZ, LemonLiTree,liaoxin01, LiBinfeng-01, lihuigang, liugddx, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, Nitin-Kashyap, platoneko, py023, qidaye, shuke987, starocean999, SWJTU-ZhangLei, w41ter, wangbo, wsjz, wuwenchi, Xiaoccer, xiaokang, XieJiann, xingyingone, xinyiZzz, xuwei0912, xy720, xzj7019, yujun777, zclllyybb, zddr, zhangguoqiang666, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md new file mode 100644 index 0000000000000..20d6bd9302b2c --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md @@ -0,0 +1,73 @@ +--- +{ + "title": "Release 2.0.5", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 217 improvements and bug fixes have been made in Doris 2.0.5 version. 
+ +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- Change char function behaviour: `select char(0) = '\0'` returns true, as in MySQL + - https://github.com/apache/doris/pull/30034 +- Allow exporting empty data + - https://github.com/apache/doris/pull/30703 + +## New features +- Eliminate left outer join with `is null` condition +- Add `show-tablets-belong` stmt for analyzing a batch of tablet-ids +- InferPredicates support In, such as `a = b & a in [1, 2] -> b in [1, 2]` +- Optimize plan when column stats are unavailable +- Optimize plan using rollup column stats +- Support analyze materialized view +- Support ShowProcessStmt Show all FE connection + +## Improvement and optimizations +- Optimize query plan when column stats are unavailable +- Optimize query plan using rollup column stats +- Stop analyze quickly after user closes auto analyze +- Catch load column stats exception to avoid printing too much stack info to fe.out +- Select materialized view by specifying the view name in SQL +- Change auto analyze max table width default value to 100 +- Escape characters for columns in recovery predicate pushdown in JDBC Catalog +- Fix JDBC MYSQL Catalog `to_date` fun pushdown +- Optimize the close logic of JDBC client +- Optimize JDBC connection pool parameter settings +- Obtain hudi partition information through HMS's API +- Optimize routine load job error msg and memory +- Skip all backup/restore jobs if max allowed option is set to 0 + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.4-rc06...2.0.5-rc02).
+ + +## Credits +Thanks all who contribute to this release: + +airborne12, alexxing662, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, ByteYue, caiconghui, cambyzju, catpineapple, dataroaring, eldenmoon, Emor-nj, englefly, felixwluo, GoGoWen, HappenLee, hello-stephen, HHoflittlefish777, HowardQin, JackDrogon, jacktengg, jackwener, Jibing-Li, KassieZ, LemonLiTree, liaoxin01, liugddx, LuGuangming, morningman, morrySnow, mrhhsg, Mryange, mymeiyi, nextdreamblue, qidaye, ryanzryu, seawinde,starocean999, TangSiyang2001, vinlee19, w41ter, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xingyingone, xy720,xzj7019, yujun777, zclllyybb, zhangstar333, zhannngchen, zhiqiang-hhhh, zxealous, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md new file mode 100644 index 0000000000000..9591ed8d3fab8 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.6", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 114 improvements and bug fixes have been created by 51 contributors in Doris 2.0.6 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- N/A + +## New features +- Support match a function with alias in materialized-view +- Add a command to drop a tablet replica safely on backend +- Add row count cache for external table. 
+- Support analyze rollup to gather statistics for optimizer + +## Improvement and optimizations +- Improve tablet schema cache memory by using deterministic way to serialize protobuf +- Improve show column stats performance +- Support estimate row count for iceberg and paimon +- Support sqlserver timestamp type read for JDBC catalog + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.5-rc02...2.0.6). + + +## Credits +Thanks all who contribute to this release: + +924060929, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, deardeng, DongLiang-0, eldenmoon, englefly, feelshana, feiniaofeiafei, felixwluo, HappenLee, hust-hhb, iwanttobepowerful, ixzc, JackDrogon, Jibing-Li, KassieZ, larshelge, liaoxin01, LiBinfeng-01, liutang123, luennng, morningman, morrySnow, mrhhsg, qidaye, starocean999, TangSiyang2001, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xuwei0912, xy720, xzj7019, yiguolei, yujun777, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md new file mode 100644 index 0000000000000..10f226dbd63b4 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 2.0.7", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 80 improvements and bug fixes have been made in Doris 2.0.7 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +- `round` function defaults to rounding normally as MySQL, eg. round(5/2) return 3 instead of 2. 
+ + - https://github.com/apache/doris/pull/31583 + +- `round` datetime with scale from string literal as MySQL, eg. round '2023-10-12 14:31:49.666' to '2023-10-12 14:31:50' . + + - https://github.com/apache/doris/pull/27965 + + +## 2 New features +- Support make miss slot as null alias when converting outer join to anti join to speed up query + + - https://github.com/apache/doris/pull/31854 + +- Enable proxy protocol to support IP transparency for Nginx and HAProxy. + + - https://github.com/apache/doris/pull/32338 + + +## 3 Improvement and optimizations + +- Add DEFAULT_ENCRYPTION column in `information_schema` table and add `processlist` table for better compatibility for BI tools + +- Automatically test connectivity by default when creating a JDBC Catalog. + +- Enhance auto resume to keep routine load stable + +- Use lowercase by default for Chinese tokenizer in inverted index + +- Add error msg if exceeded maximum default value in repeat function + +- Skip hidden file and dir in Hive table + +- Reduce file meta cache size and disable cache for some cases to avoid OOM + +- Reduce jvm heap memory consumed by profiles of BrokerLoadJob + +- Remove sort which is under table sink to speed up query like `INSERT INTO t1 SELECT * FROM t2 ORDER BY k`. + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.6...2.0.7) . 
+ + +## 4 Credits + +Thanks all who contribute to this release: + +924060929,airborne12,amorynan,ByteYue,dataroaring,deardeng,feiniaofeiafei,felixwluo,freemandealer,gavinchou,hello-stephen,HHoflittlefish777,jacktengg,jackwener,jeffreys-cat,Jibing-Li,KassieZ,LiBinfeng-01,luwei16,morningman,mrhhsg,Mryange,nextdreamblue,platoneko,qidaye,rohitrs1983,seawinde,shuke987,starocean999,SWJTU-ZhangLei,w41ter,wsjz,wuwenchi,xiaokang,XieJiann,XuJianxu,yujun777,Yulei-Yang,zhangstar333,zhiqiang-hhhh,zy-kkk,zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md new file mode 100644 index 0000000000000..d881a80628b44 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md @@ -0,0 +1,76 @@ +--- +{ + "title": "Release 2.0.8", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 65 improvements and bug fixes have been made in Doris 2.0.8 version. + +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## 1 Behavior change + +The `ADMIN SHOW` statements cannot be executed with newer versions of the MySQL 8.x JDBC driver, so these statements are renamed by removing the `ADMIN` keyword. 
+ +- https://github.com/apache/doris/pull/29492 + +```sql +ADMIN SHOW CONFIG -> SHOW CONFIG +ADMIN SHOW REPLICA -> SHOW REPLICA +ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS +ADMIN SHOW TABLET -> SHOW TABLET +``` + + +## 2 New features + +N/A + + + +## 3 Improvement and optimizations + +- Make Inverted Index work with TopN opt in Nereids + +- Limit the max string length to 1024 while collecting column stats to control BE memory usage + +- JDBC Catalog close when JDBC client is not empty + +- Accept all Iceberg database and do not check the name format of database + +- Refresh external table's rowcount async to avoid cache miss and unstable query plan + +- Simplify the isSplitable method of hive external table to avoid too many hadoop metrics + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.7...2.0.8) . + +## 4 Credits + +Thanks all who contribute to this release: + +924060929, AcKing-Sam, amorynan, AshinGau, BePPPower, BiteTheDDDDt, ByteYue, cambyzju, dongsilun, eldenmoon, feiniaofeiafei, gnehil, Jibing-Li, liaoxin01, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, platoneko, starocean999, SWJTU-ZhangLei, wuwenchi, xiaokang, xinyiZzz, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhiqiang-hhhh, ziyanTOP, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md new file mode 100644 index 0000000000000..04048fc060461 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md @@ -0,0 +1,75 @@ +--- +{ + "title": "Release 2.0.9", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 68 improvements and bug fixes have been made in Doris 2.0.9 version. 
+ +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +N/A + +## 2 New features + +- Support predicates that appear on both key and value mv columns + +- Support mv with `bitmap_union(bitmap_from_array())` + +- Add a FE config to force replicate allocation for OLAP tables in the cluster + +- Support time zones in date literals in the new optimizer Nereids + +- Support slop in fulltext search `match_phrase` to specify word distance + +- Show index id in `SHOW PROC INDEXES` + +## 3 Improvement and optimizations + +- Add a second argument in `first_value` / `last_value` to ignore NULL values + +- The offset parameter in `LEAD`/ `LAG` functions can be 0 + +- Adjust priority of materialized view match rule + +- TopN opt reads only limit number of records for better performance + +- Add profile for delete_bitmap get_agg function + +- Refine the Meta cache to get better performance + +- Add FE config `autobucket_max_buckets` + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.8...2.0.9) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +adonis0147, airborne12, amorynan, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, csun5285, eldenmoon, englefly, feiniaofeiafei, HHoflittlefish777, htyoung, hust-hhb, jackwener, Jibing-Li, kaijchen, kylinmac, liaoxin01, luwei16, morningman, mrhhsg, qidaye, starocean999, SWJTU-ZhangLei, w41ter, xiaokang, xiedeyantu, xy720, zclllyybb, zhangstar333, zhannngchen, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md new file mode 100644 index 0000000000000..b0b88f715ee51 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md @@ -0,0 +1,159 @@ +--- +{ + "title": "Release 2.1.0", + "language": "en" +} +--- + + + +Dear community, we are pleased to share with you the official release of Apache Doris 2.1.0, now available for download and use as of March 8th. This latest version marks a significant milestone in our journey towards enhancing data analysis capabilities, particularly for handling massive and complex datasets. + +With Doris 2.1.0, our primary focus has been on optimizing analysis performance, and the results speak for themselves. We have achieved an impressive performance improvement of over 100% on the TPC-DS 1TB test dataset, making Apache Doris more capable of challenging real-world business scenarios. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Performance improvement + +### Smarter optimizer + +On the basis of V2.0, the query optimizer in Doris V2.1 comes with enhanced statistics-based inference and enumeration framework. 
We have upgraded the cost model and expanded the optimization rules to serve the needs of more use cases + +### Better heuristic optimization + +For data analytics at scale or data lake scenarios, Doris V2.1 provides better heuristic query plans. Meanwhile, the RuntimeFilter is more self-adaptive to enable higher performance even without statistical information. + +### Parallel adaptive scan + +Doris V2.1 has adopted parallel adaptive scan to optimize scan I/O and thus improve query performance. It can avoid the negative impact of unreasonable numbers of buckets. (This feature is currently available on the Duplicate Key model and Merge-on-Write Unique Key model.) + +### Local shuffle + +We have introduced Local Shuffle to prevent uneven data distribution. Benchmark tests show that Local Shuffle in combination with Parallel Adaptive Scan can guarantee fast query performance in spite of unreasonable bucket number settings upon table creation. + +### Faster INSERT INTO SELECT + +To further improve the performance of INSERT INTO SELECT, which is a frequent operation in ETL, we have moved forward the MemTable execution-wise to reduce data ingestion overheads. Tests show that this can double the data ingestion speed in most cases compared to V2.0. +Improved data lake analytics capabilities + +## Data lake analytic performance + +### TPC-DS Benchmark + +According to TPC-DS benchmark tests (1TB) of Doris V2.1 against Trino, + +- Without caching, the total execution time of Doris is 56% of that of Trino V435. (717s VS 1296s) +- Enabling file cache can further increase the overall performance of Doris by 2.2 times. (323s) + This is achieved by a series of optimizations in I/O, parquet/ORC file reading, predicate pushdown, caching, and scan task scheduling, etc. + +### SQL dialects compatibility + +To facilitate migration to Doris and increase its compatibility with other DBMS, we have enabled SQL dialect conversion in V2.1. 
([read more](../../lakehouse/sql-dialect)) For example, by setting `sql_dialect = "trino"` in Doris, you can use the Trino SQL dialect as you're used to, without modifying your current business logic, and Doris will execute the corresponding queries for you. Tests in user production environments show that Doris V2.1 is compatible with 99% of Trino SQL. + +### Arrow Flight SQL protocol + +As a column-oriented database compatible with the MySQL 8.0 protocol, Doris V2.1 now supports the Arrow Flight SQL protocol as well so users can have fast access to Doris data via Pandas/Numpy without data serialization and deserialization. For most common data types, the Arrow Flight protocol enables tens of times faster performance than the MySQL protocol. + +## Asynchronous materialized view + +V2.1 allows creating a materialized view based on multiple tables. This feature currently supports: + +- Transparent rewriting: supports transparent rewriting of common operators including Select, Where, Join, Group By, and Aggregation. +- Auto refresh: supports regular refresh, manual refresh, full refresh, incremental refresh, and partition-based refresh. +- Materialized view of external tables: supports materialized views based on external data tables such as those on Hive, Hudi, and Iceberg; supports synchronizing data from data lakes into Doris internal tables via materialized views. +- Direct query on materialized views: Materialized views can be regarded as the result set after ETL. In this sense, materialized views are data tables, so users can conduct queries on them directly. + +## Enhanced storage + +### Auto-increment column + +V2.1 supports auto-increment columns, which can ensure data uniqueness of each row. This lays the foundation for efficient dictionary encoding and query pagination. For example, for precise UV calculation and customer grouping, users often apply the bitmap type in Doris, the process of which entails dictionary encoding. 
With V2.1, users can first create a dictionary table using the auto-increment column, and then simply load user data into it. + +### Auto partition + +To further reduce the burden of operation and maintenance, V2.1 allows auto data partitioning. Upon data ingestion, it detects whether a partition exists for the data based on the partitioning column. If not, it automatically creates one and starts data ingestion. + +### High-concurrency real-time data ingestion + +For data writing, a back pressure mechanism is in place to avoid excessive data versions, so as to reduce resource consumption by data version merging. In addition, V2.1 supports group commit ([read more](../../data-operate/import/import-way/group-commit-manual)), which accumulates multiple writes and commits them as one. Benchmark tests on group commit with JDBC ingestion and the Stream Load method present great results. + +## Semi-structured data analysis + +### A new data type: Variant + +V2.1 supports a new data type named Variant. It can accommodate semi-structured data such as JSON as well as compound data types that contain integers, strings, booleans, etc. Users don't have to pre-define the exact data types for a Variant column in the table schema. The Variant type is handy when processing nested data structures. +You can include Variant columns and static columns with pre-defined data types in the same table. This will provide you with more flexibility in storage and queries. +Tests with ClickBench datasets prove that data in Variant columns takes up the same storage space as data in static columns, which is half of that in JSON format. In terms of query performance, the Variant type enables 8 times higher query speed than JSON in hot runs and even more in cold runs. + +### IP types + +Doris V2.1 provides native support for IPv4 and IPv6. It stores IP data in binary format, which cuts down storage space usage by 60% compared to IP strings in plain text. 
Along with these IP types, we have added over 20 functions for IP data processing. + +### More powerful functions for compound data types + +- explode_map: supports exploding rows into columns for the Map data type. +- Supports the STRUCT data type in the IN predicates + +## Workload Management + +### Hard isolation of resources + +On the basis of the Workload Group mechanism, which imposes a soft limit on the resources that a workload group can use, Doris 2.1 introduces a hard limit on CPU resource consumption for workload groups as a way to ensure higher stability in query performance. + +### TopSQL + +V2.1 allows users to check the most resource-consuming SQL queries in the runtime. This can be a big help when handling cluster load spike caused by unexpected large queries. + + +## Others + +### Decimal 256 + +For users in the financial sector or high-end manufacturing, V2.1 supports a high-precision data type: Decimal, which supports up to 76 significant digits (an experimental feature, please set enable_decimal256=true.) + +### Job scheduler + +V2.1 provides a good option for regular task scheduling: Doris Job Scheduler. It can trigger the pre-defined operations on schedule or at fixed intervals. The Doris Job Scheduler is accurate to the second. It provides consistency guarantee for data writing, high efficiency and flexibility, high-performance processing queues, retraceable scheduling records, and high availability of jobs. + +### Support Docker fast start to experience the new version + +Starting from version 2.1.0, we will provide a separate Docker Image to support the rapid creation of a 1FE, 1BE Docker container to experience the new version of Doris. The container will complete the initialization of FE and BE, BE registration and other steps by default. 
After the container is created, you can access and use the Doris cluster in about 1 minute. In this image version, the default `max_map_count`, `ulimit`, `Swap` and other hard limits are removed. It supports X64 (avx2) machines and ARM machines for deployment. The default open ports are 8000, 8030, 8040, 9030. If you need to experience the Broker component, you can add the environment variable `--env BROKER=true` at startup to start the Broker process synchronously. After startup, it will automatically complete the registration. The Broker name is `test`. + +Please note that this version is only suitable for quick experience and functional testing, not for production environment! + +## Behavior changed + +- The default data model is the Merge-on-Write Unique Key model. enable_unique_key_merge_on_write will be included as a default setting when a table is created in the Unique Key model. +- As inverted index has proven to be more performant than bitmap index, V2.1 stops supporting bitmap index. Existing bitmap indexes will remain effective but new creation is not allowed. We will remove bitmap index-related code in the future. +- cpu_resource_limit is no longer supported. It is to put a limit on the number of scanner threads on Doris BE. Since the workload group mechanism also supports such settings, the already configured cpu_resource_limit will be invalid. +- The default value of enable_segcompaction is true. This means Doris supports compaction of multiple segments in the same rowset. +- Audit log plug-in + - Since V2.1.0, Doris has a built-in audit log plug-in. Users can simply enable or disable it by setting the enable_audit_plugin parameter. + - If you have already installed your own audit log plug-in, you can either continue using it after upgrading to Doris V2.1, or uninstall it and use the one in Doris. Please note that the audit log table will be relocated after switching plug-in. 
+ - For more details, please see the [docs](../../admin-manual/audit-plugin). + + +## Credits +Thanks all who contribute to this release: + +467887319, 924060929, acnot, airborne12, AKIRA, alan_rodriguez, AlexYue, allenhooo, amory, amory, AshinGau, beat4ocean, BePPPower, bigben0204, bingquanzhao, BirdAmosBird, BiteTheDDDDt, bobhan1, caiconghui, camby, camby, CanGuan, caoliang-web, catpineapple, Centurybbx, chen, ChengDaqi2023, ChenyangSunChenyang, Chester, ChinaYiGuan, ChouGavinChou, chunping, colagy, CSTGluigi, czzmmc, daidai, dalong, dataroaring, DeadlineFen, DeadlineFen, deadlinefen, deardeng, didiaode18, DongLiang-0, dong-shuai, Doris-Extras, Dragonliu2018, DrogonJackDrogon, DuanXujianDuan, DuRipeng, dutyu, echo-dundun, ElvinWei, englefly, Euporia, feelshana, feifeifeimoon, feiniaofeiafei, felixwluo, figurant, flynn, fornaix, FreeOnePlus, Gabriel39, gitccl, gnehil, GoGoWen, gohalo, guardcrystal, hammer, HappenLee, HB, hechao, HelgeLarsHelge, herry2038, HeZhangJianHe, HHoflittlefish777, HonestManXin, hongkun-Shao, HowardQin, hqx871, httpshirley, htyoung, huanghaibin, HuJerryHu, HuZhiyuHu, Hyman-zhao, i78086, irenesrl, ixzc, jacktengg, jacktengg, jackwener, jayhua, Jeffrey, jiafeng.zhang, Jibing-Li, JingDas, julic20s, kaijchen, kaka11chen, KassieZ, kindred77, KirsCalvinKirs, KirsCalvinKirs, kkop, koarz, LemonLiTree, LHG41278, liaoxin01, LiBinfeng-01, LiChuangLi, LiDongyangLi, Lightman, lihangyu, lihuigang, LingAdonisLing, liugddx, LiuGuangdongLiu, LiuHongLiu, liuJiwenliu, LiuLijiaLiu, lsy3993, LuGuangmingLu, LuoMetaLuo, luozenglin, Luwei, Luzhijing, lxliyou001, Ma1oneZhang, mch_ucchi, Miaohongkai, morningman, morrySnow, Mryange, mymeiyi, nanfeng, nanfeng, Nitin-Kashyap, PaiVallishPai, Petrichor, plat1ko, py023, q763562998, qidaye, QiHouliangQi, ranxiang327, realize096, rohitrs1983, sdhzwc, seawinde, seuhezhiqiang, seuhezhiqiang, shee, shuke987, shysnow, songguangfan, Stalary, starocean999, SunChenyangSun, sunny, SWJTU-ZhangLei, TangSiyang2001, Tanya-W, taoxutao, 
Uniqueyou, vhwzIs, walter, walter, wangbo, Wanghuan, wangqt, wangtao, wangtianyi2004, wenluowen, whuxingying, wsjz, wudi, wudongliang, wuwenchihdu, wyx123654, xiangran0327, Xiaocc, XiaoChangmingXiao, xiaokang, XieJiann, Xinxing, xiongjx, xuefengze, xueweizhang, XueYuhai, XuJianxu, xuke-hat, xy, xy720, xyfsjq, xzj7019, yagagagaga, yangshijie, YangYAN, yiguolei, yiguolei, yimeng, YinShaowenYin, Yoko, yongjinhou, ytwp, yuanyuan8983, yujian, yujun777, Yukang-Lian, Yulei-Yang, yuxuan-luo, zclllyybb, ZenoYang, zfr95, zgxme, zhangdong, zhangguoqiang, zhangstar333, zhangstar333, zhangy5, ZhangYu0123, zhannngchen, ZhaoLongZhao, zhaoshuo, zhengyu, zhiqqqq, ZhongJinHacker, ZhuArmandoZhu, zlw5307, ZouXinyiZou, zxealous, zy-kkk, zzwwhh, zzzxl1993, zzzzzzzs diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md new file mode 100644 index 0000000000000..384bccdceb414 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md @@ -0,0 +1,251 @@ +--- +{ + "title": "Release 2.1.1", + "language": "en" +} +--- + + + +Dear community members, Apache Doris 2.1.1 has been officially released on April 3, 2024, with several enhancements and bug fixes based on 2.1.0, enabling smoother user experience. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## Behavior Changed + +1. Change float type output format to improve float type serialization performance. + +- https://github.com/apache/doris/pull/32049 + +2. Change system table value functions active_queries(), workload_groups() to system tables. + +- https://github.com/apache/doris/pull/32314 + +3. Disable show query/load profile stmt because there are not so many developers use it and the pipeline and pipelinex engine not support it. 
+ +- https://github.com/apache/doris/pull/32467 + +4. Upgrade the Arrow Flight version to 15.0.2 to fix some bugs; please use ADBC version 15.0.2 to access Doris. + +- https://github.com/apache/doris/pull/32827. + + + +## Upgrade Problem + +1. BE will core during a rolling upgrade from 2.0.x to 2.1.x + +- https://github.com/apache/doris/pull/32672 + +- https://github.com/apache/doris/pull/32444 + +- https://github.com/apache/doris/pull/32162 + +2. JDBC Catalog will have query errors during a rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32618 + + + +## New Feature + +1. Enable column auth by default. + +- https://github.com/apache/doris/pull/32659 + + +2. Get correct cores for pipeline and pipelinex engine when running within docker or k8s. + +- https://github.com/apache/doris/pull/32370 + +3. Support read parquet int96 type. + +- https://github.com/apache/doris/pull/32394 + +4. Enable proxy protocol to support IP transparency. Using this protocol, IP transparency for load balancing can be achieved, so that after load balancing, Doris can still obtain the client's real IP and implement permission control such as whitelisting. + +- https://github.com/apache/doris/pull/32338/files + +5. Add workload group queue related columns for active_queries system table. Users can use this system table to monitor the workload queue usage. + +- https://github.com/apache/doris/pull/32259 + +6. Add new system table backend_active_tasks to monitor the realtime query statistics on every BE. + +- https://github.com/apache/doris/pull/31945 + +7. Add ipv4 and ipv6 support for spark-doris connector. + +- https://github.com/apache/doris/pull/32240 + +8. Add inverted index support for CCR. + +- https://github.com/apache/doris/pull/32101 + +9. Support select experimental session variable. + +- https://github.com/apache/doris/pull/31837 + +10. Support materialized view with bitmap_union(bitmap_from_array()) case. + +- https://github.com/apache/doris/pull/31962 + +11. 
Support partition prune for `__HIVE_DEFAULT_PARTITION__`. + +- https://github.com/apache/doris/pull/31736 + +12. Support function in set variable statement. + +- https://github.com/apache/doris/pull/32492 + +13. Support arrow serialization for varint type. + +- https://github.com/apache/doris/pull/32809 + + + +## Optimization + +1. Auto resume routine load when BE restarts or during upgrade, and keep the routine load stable. + +- https://github.com/apache/doris/pull/32239 + +2. Routine Load: optimize the algorithm that allocates tasks to BEs for load balancing. + +- https://github.com/apache/doris/pull/32021 + +3. Spark Load: update spark version for spark load to resolve CVE problem. + +- https://github.com/apache/doris/pull/30368 + +4. Skip cooldown if the tablet is dropped. + +- https://github.com/apache/doris/pull/32079 + +5. Support using workload group to manage routine load. + +- https://github.com/apache/doris/pull/31671 + +6. [MTMV] Improve the performance of query rewriting by materialized view. + +- https://github.com/apache/doris/pull/31886 + +7. Reduce jvm heap memory consumed by profiles of BrokerLoadJob. + +- https://github.com/apache/doris/pull/31985 +8. Improve high-QPS query performance by speeding up PartitionPrunner. + +- https://github.com/apache/doris/pull/31970 + +9. Reduce duplicated memory consumption for column name and column path for schema cache. + +- https://github.com/apache/doris/pull/31141 + +10. Support more join types for query rewriting by materialized view such as INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, LEFT SEMI JOIN, RIGHT SEMI JOIN, LEFT ANTI JOIN, RIGHT ANTI JOIN + +- https://github.com/apache/doris/pull/32909 + + + +## Bugfix + + +1. Do not push down topn-filter through right/full outer join if the first orderkey is nulls first. + +- https://github.com/apache/doris/pull/32633 + +2. Fix memory leak in Java UDF + +- https://github.com/apache/doris/pull/32630 + +3. 
If some odbc tables use the same resource, and restore not all odbc tables, it will not retain the resource. +and check some conf for backup/restore + +- https://github.com/apache/doris/pull/31989 + +4. Fold constant will core for variant type. + +- https://github.com/apache/doris/pull/32265 + +5. Routine load will pause when transaction fail in some cases. + +- https://github.com/apache/doris/pull/32638 + +6. the result of left semi join with empty right side should be false instead of null. + +- https://github.com/apache/doris/pull/32477 + +7. Fix core when build inverted index for a new column with no data. + +- https://github.com/apache/doris/pull/32669 + +8. Fix be core caused by null-safe-equal join. + +- https://github.com/apache/doris/pull/32623 + +9. Partial update: fix data correctness risk when load delete sign data into a table with sequence col. + +- https://github.com/apache/doris/pull/32574 + +10. Select outfile: Fix the column type mapping in the orc/parquet file format. + +- https://github.com/apache/doris/pull/32281 + +11. Fix BE core during restore stage. + +- https://github.com/apache/doris/pull/32489 + +12. Use array_agg func after other agg func like count, sum, may make be core. + +- https://github.com/apache/doris/pull/32387 + +13. Variant type should always nullable or there will some bugs. + +- https://github.com/apache/doris/pull/32248 + +14. Fix the bug of handling empty blocks in schema change. + +- https://github.com/apache/doris/pull/32396 + +15. Fix BE will core when use json_length() in some cases. + +- https://github.com/apache/doris/pull/32145 + +16. Fix error when query iceberg table using date cast predicate + +- https://github.com/apache/doris/pull/32194 + +17. Fix some bugs when build inverted index for variant type. + +- https://github.com/apache/doris/pull/31992 + +18. Wrong result of two or more map_agg functions in query. + +- https://github.com/apache/doris/pull/31928 + +19. Fix wrong result of money_format function. 
+ +- https://github.com/apache/doris/pull/31883 + +20. Fix connection hang after too many connections. + +- https://github.com/apache/doris/pull/31594 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md new file mode 100644 index 0000000000000..6116bd9984632 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md @@ -0,0 +1,110 @@ +--- +{ + "title": "Release 2.1.2", + "language": "en" +} +--- + + + +## Behavior Changed + +1. Set the default value of the `data_consistence` property of EXPORT to partition to make export more stable during load. + +- https://github.com/apache/doris/pull/32830 + +2. Some MySQL connectors (e.g., dotnet MySQL.Data) rely on a variable's column type to make a connection. + + For example, `select @@autocommit` should return a column of type BIGINT, not BIT; otherwise the connector throws an error. So the column type of `@@autocommit` is changed to BIGINT. + + - https://github.com/apache/doris/pull/33282 + + +## Upgrade Problem + +1. Normal workload group is not created when upgrading from 2.0 or other old versions. + + - https://github.com/apache/doris/pull/33197 + +## New Feature + + +1. Add processlist table in information_schema database; users can use this table to query active connections. + + - https://github.com/apache/doris/pull/32511 + +2. Add a new table valued function `LOCAL` to allow accessing file systems such as shared storage. + + - https://github.com/apache/doris-website/pull/494 + + +## Optimization + +1. Skip some useless processes to make graceful stop quicker in K8s env. + + - https://github.com/apache/doris/pull/33212 + +2. Add rollup table name in profile to help find the mv selection problem. + + - https://github.com/apache/doris/pull/33137 + +3. 
Add test connection function to DB2 database to allow user check the connection when create DB2 Catalog. + + - https://github.com/apache/doris/pull/33335 + +4. Add DNS Cache for FQDN to accelerate the connect process among BEs in K8s env. + + - https://github.com/apache/doris/pull/32869 + +5. Refresh external table's rowcount async to make the query plan more stable. + + - https://github.com/apache/doris/pull/32997 + + +## Bugfix + + +1. Fix Iceberg Catalog of HMS and Hadoop do not support Iceberg properties like "io.manifest.cache-enabled" to enable manifest cache in Iceberg. + + - https://github.com/apache/doris/pull/33113 + +2. The offset params in `LEAD`/`LAG` function could use 0 as offset. + + - https://github.com/apache/doris/pull/33174 + +3. Fix some timeout issues with load. + + - https://github.com/apache/doris/pull/33077 + + - https://github.com/apache/doris/pull/33260 + +4. Fix core problem related with `ARRAY`/`MAP`/`STRUCT` compaction process. + + - https://github.com/apache/doris/pull/33130 + + - https://github.com/apache/doris/pull/33295 + +5. Fix runtime filter wait timeout. + + - https://github.com/apache/doris/pull/33369 + +6. Fix `unix_timestamp` core for string input in auto partition. + + - https://github.com/apache/doris/pull/32871 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md new file mode 100644 index 0000000000000..e88ec3e94fb6d --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md @@ -0,0 +1,191 @@ +--- +{ + "title": "Release 2.1.3", + "language": "en" +} +--- + + + +Apache Doris 2.1.3 was officially released on May 21, 2024. This version has updated several improvements, including writing data back to Hive, materialized view, permission management and bug fixes. It further enhances the performance and stability of the system. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + + +## Feature Enhancements + +**1. Support writing back data to Hive tables via Hive Catalog** + +Starting from version 2.1.3, Apache Doris supports DDL and DML operations on Hive. Users can directly create databases and tables in Hive through Apache Doris and write data to Hive tables by executing `INSERT INTO` statements. This feature allows users to perform complete data query and write operations on Hive through Apache Doris, further simplifying the integrated lakehouse architecture. + +Please refer to: [https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) + +**2. Support building new asynchronous materialized views on top of existing ones** + +Users can create new asynchronous materialized views on top of existing ones, directly reusing pre-computed intermediate results for data processing. This simplifies complex aggregation and computation operations, reducing resource consumption and maintenance costs while further accelerating query performance and improving data availability. [#32984](https://github.com/apache/doris/pull/32984) + +**3. Support rewriting through nested materialized views** + +A materialized view (MV) is a database object used to store query results. Apache Doris now supports rewriting through nested materialized views, which helps optimize query performance. [#33362](https://github.com/apache/doris/pull/33362) + +**4. New `SHOW VIEWS` statement** + +The `SHOW VIEWS` statement can be used to query views in the database, facilitating better management and understanding of view objects in the database. [#32358](https://github.com/apache/doris/pull/32358) + +**5. 
Workload Group supports binding to specific BE nodes** + +Workload Group can be bound to specific BE nodes, enabling more refined control over query execution to optimize resource usage and improve performance. [#32874](https://github.com/apache/doris/pull/32874) + +**6. Broker Load supports compressed JSON format** + +Broker Load now supports importing compressed JSON format data, significantly reducing bandwidth requirements for data transmission and accelerating data import performance. [#30809](https://github.com/apache/doris/pull/30809) + +**7. TRUNCATE Function can use columns as scale parameters** + +The TRUNCATE function can now accept columns as scale parameters, providing more flexibility when processing numerical data. [#32746](https://github.com/apache/doris/pull/32746) + +**8. Add new functions `uuid_to_int` and `int_to_uuid`** + +These two functions allow users to convert between UUID and integer, significantly helping in scenarios that require handling UUID data. [#33005](https://github.com/apache/doris/pull/33005) + +**9. Add `bypass_workload_group` session variable to bypass query queue** + +The `bypass_workload_group` session variable allows certain queries to bypass the Workload Group queue and execute directly, which is useful for handling critical queries that require quick responses. [#33101](https://github.com/apache/doris/pull/33101) + +**10. Add strcmp function** + +The strcmp function compares two strings and returns their comparison result, simplifying text data processing. [#33272](https://github.com/apache/doris/pull/33272) + +**11. Support HLL functions `hll_from_base64` and `hll_to_base64`** + +HyperLogLog (HLL) is an algorithm for cardinality estimation. These two functions allow users to decode HLL data from a Base64-encoded string or encode HLL data as a Base64 string, which is very useful for storing and transmitting HLL data. [#32089](https://github.com/apache/doris/pull/32089) + +## Optimization and Improvements + +**1. 
Replace SipHash with XXHash to improve shuffle performance** + +Both SipHash and XXHash are hashing functions, but XXHash may provide faster hashing speeds and better performance in certain scenarios. This optimization aims to improve performance during data shuffling by adopting XXHash. [#32919](https://github.com/apache/doris/pull/32919) + +**2. Asynchronous materialized views support NULL partition columns in OLAP tables** + +This enhancement allows asynchronous materialized views to support NULL partition columns in OLAP tables, enhancing data processing flexibility.[#32698](https://github.com/apache/doris/pull/32698) + +**3. Limit maximum string length to 1024 when collecting column statistics to control BE memory usage** + +Limiting the string length when collecting column statistics prevents excessive data from consuming too much BE memory, helping maintain system stability and performance. [#32470](https://github.com/apache/doris/pull/32470) + +**4. Support dynamic deletion of Bitmap cache to improve performance** + +Dynamically deleting no longer needed Bitmap Cache can free up memory and improve system performance. [#32991](https://github.com/apache/doris/pull/32991) + +**5. Reduce memory usage during ALTER operations** + +Reducing memory usage during ALTER operations improves the efficiency of system resource utilization. [#33474](https://github.com/apache/doris/pull/33474) + +**6. Support constant folding for complex types** + +Supports constant folding for Array/Map/Struct complex types.[#32867](https://github.com/apache/doris/pull/32867) + +**7. Add support for Variant type in Aggregate Key Model** + +The Variant data type can store multiple data types. This optimization allows aggregation operations on Variant type data, enhancing the flexibility of semi-structured data analysis. [#33493](https://github.com/apache/doris/pull/33493) + +**8. Support new inverted index format in CCR** [#33415](https://github.com/apache/doris/pull/33415) + +**9. 
Optimize rewriting performance for nested materialized views** [#34127](https://github.com/apache/doris/pull/34127) + +**10. Support decimal256 type in row-based storage format** + +Supporting the decimal256 type in row-based storage extends the system's ability to handle high-precision numerical data. [#34887](https://github.com/apache/doris/pull/34887) + +## Behavioral Changes + +**1. Authorization** + +- **Grant_priv permission changes**: `Grant_priv` can no longer be arbitrarily granted. When performing a `GRANT` operation, the user not only needs to have `Grant_priv` but also the permissions to be granted. For example, to grant `SELECT` permission on `table1`, the user needs both `GRANT` permission and `SELECT` permission on `table1`, enhancing security and consistency in permission management. [#32825](https://github.com/apache/doris/pull/32825) + +- **Workload group and resource usage_priv**: `Usage_priv` for Workload Group and Resource is no longer global but limited to Resource and Workload Group, making permission granting and usage more specific. [#32907](https://github.com/apache/doris/pull/32907) + +- **Authorization for operations**: Operations that were previously unauthorized now have corresponding authorizations for more detailed and comprehensive operational permission control. [#33347](https://github.com/apache/doris/pull/33347) + +**2. LOG directory configuration** + +The log directory configuration for FE and BE now uniformly uses the `LOG_DIR` environment variable. All other different types of logs will be stored with `LOG_DIR` as the root directory. To maintain compatibility between versions, the previous configuration item `sys_log_dir` can still be used. [#32933](https://github.com/apache/doris/pull/32933) + +**3. S3 Table Function (TVF)** + +Due to issues with correctly recognizing or processing S3 URLs in certain cases, the parsing logic for object storage paths has been refactored. 
For file paths in S3 table functions, the `force_parsing_by_standard_uri` parameter needs to be passed to ensure correct parsing. [#33858](https://github.com/apache/doris/pull/33858) + +## Upgrade Issues + +Since many users use certain keywords as column names or attribute values, the following keywords have been set as non-reserved, allowing users to use them as identifiers. [#34613](https://github.com/apache/doris/pull/34613) + +## Bug Fixes + +**1. Fix no data error when reading Hive tables on Tencent Cloud COSN** + +Resolved the no data error that could occur when reading Hive tables on Tencent Cloud COSN, enhancing compatibility with Tencent Cloud storage services. + +**2. Fix incorrect results returned by `milliseconds_diff` function** + +Fixed an issue where the `milliseconds_diff` function returned incorrect results in some cases, ensuring the accuracy of time difference calculations. [#32897](https://github.com/apache/doris/pull/32897) + +**3. User-defined variables should be forwarded to the Master node** + +Ensured that user-defined variables are correctly passed to the Master node for consistency and correct execution logic across the entire system. [#33013](https://github.com/apache/doris/pull/33013) + +**4. Fix Schema Change issues when adding complex type columns** + +Resolved Schema Change issues that could arise when adding complex type columns, ensuring the correctness of Schema Changes. [#31824](https://github.com/apache/doris/pull/31824) + +**5. Fix data loss issue in Routine Load when FE Master node changes** + +`Routine Load` is often used to subscribe to Kafka message queues. This fix addresses potential data loss issues that may occur during FE Master node changes. [#33678](https://github.com/apache/doris/pull/33678) + +**6. Fix Routine Load failure when Workload Group cannot be found** + +Resolved an issue where `Routine Load` would fail if the specified Workload Group could not be found. 
[#33596](https://github.com/apache/doris/pull/33596) + +**7. Support column string64 to avoid join failures when string size overflows uint32** + +In some cases, string sizes may exceed the uint32 limit. Supporting the `string64` type ensures correct execution of string JOIN operations. [#33850](https://github.com/apache/doris/pull/33850) + +**8. Allow Hadoop users to create Paimon Catalog** + +Permitted authorized Hadoop users to create Paimon Catalogs. [#33833](https://github.com/apache/doris/pull/33833) + +**9. Fix `function_ipxx_cidr` function issues with constant parameters** + +Resolved problems with the `function_ipxx_cidr` function when handling constant parameters, ensuring the correctness of function execution. [#33968](https://github.com/apache/doris/pull/33968) + +**10. Fix file download errors when restoring using HDFS** + +Resolved "failed to download" errors encountered during data restoration using HDFS, ensuring the accuracy and reliability of data recovery. [#33303](https://github.com/apache/doris/issues/33303) + +**11. Fix column permission issues related to hidden columns** + +In some cases, permission settings for hidden columns may be incorrect. This fix ensures the correctness and security of column permission settings. [#34849](https://github.com/apache/doris/pull/34849) + +**12. 
Fix issue where Arrow Flight cannot obtain the correct IP in K8s deployments** + +This fix resolves an issue where Arrow Flight cannot correctly obtain the IP address in Kubernetes deployment environments. [#34850](https://github.com/apache/doris/pull/34850) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md new file mode 100644 index 0000000000000..521694ffa60fa --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md @@ -0,0 +1,289 @@ +--- +{ + "title": "Release 2.1.4", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.4 was officially released on June 26, 2024.** In this update, we have optimized various functional experiences for data lakehouse scenarios, with a focus on resolving the abnormal memory usage issue in the previous version. Additionally, we have implemented several improvements and bug fixes to enhance stability. Welcome to download and use it. + + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + +## Behavior changes + +- Non-existent files will be ignored when querying external tables such as Hive. [#35319](https://github.com/apache/doris/pull/35319) + + The file list is obtained from the meta cache, and it may not be consistent with the actual file list. + + Ignoring non-existent files helps to avoid query errors. + +- By default, creating a Bitmap Index will no longer be automatically changed to an Inverted Index. [#35521](https://github.com/apache/doris/pull/35521) + + This behavior is controlled by the FE configuration item `enable_create_bitmap_index_as_inverted_index`, which defaults to false. + +- When starting FE and BE processes using `--console`, all logs will be output to the standard output and differentiated by prefixes indicating the log type. 
[#35679](https://github.com/apache/doris/pull/35679) + + For more information, please see the documentation: + + - [Log Management - FE Log](../admin-manual/log-management/fe-log.md) + + - [Log Management - BE Log](../admin-manual/log-management/be-log.md) + +- If no table comment is provided when creating a table, the default comment will be empty instead of using the table type as the default comment. [#36025](https://github.com/apache/doris/pull/36025) + +- The default precision of DECIMALV3 has been adjusted from (9, 0) to (38, 9) to maintain compatibility with the version in which this feature was initially released. [#36316](https://github.com/apache/doris/pull/36316) + +## New features + +### Query optimizer + +- Support FE flame graph tool + + For more information, see the [documentation](/community/developer-guide/fe-profiler.md) + +- Support `SELECT DISTINCT` to be used with aggregation. + +- Support single table query rewrite without `GROUP BY`. This is useful for complex filters or expressions. [#35242](https://github.com/apache/doris/pull/35242). + +- The new optimizer fully supports point query functionality [#36205](https://github.com/apache/doris/pull/36205). + +### Data Lakehouse + +- Support native reader of Apache Paimon deletion vector [#35241](https://github.com/apache/doris/pull/35241) + +- Support using Resource in Table Valued Functions [#35139](https://github.com/apache/doris/pull/35139) + +- Access controller with Hive Ranger plugin supports Data Mask + +### Asynchronous materialized views + +- Support refreshes triggered by internal table changes: if a materialized view uses an internal table and the data in that table changes, a refresh of the materialized view can be triggered. To enable this, specify `REFRESH ON COMMIT` when creating the materialized view. + +- Support transparent rewriting for single tables. For more information, see [Querying Async Materialized View](../query/view-materialized-view/query-async-materialized-view.md). 
+ +- Transparent rewriting supports aggregation roll-up for the agg_state and agg_union types: materialized views can be defined as agg_state or agg_union, and queries can use specific aggregation functions or agg_merge. For more information, see [AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md). + +### Others + +- Added function `replace_empty`. + + For more information, see [documentation](../sql-manual/sql-functions/string-functions/replace_empty). + +- Support `show storage policy using` statement. + + For more information, see [documentation](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md). + +- Support JVM metrics on the BE side. + + Set `enable_jvm_monitor=true` in `be.conf` to enable this feature. + +## Improvements + +- Supported creating inverted indexes for columns with Chinese names. [#36321](https://github.com/apache/doris/pull/36321) + +- Estimate memory consumed by segment cache more accurately so that unused memory can be released more quickly. [#35751](https://github.com/apache/doris/pull/35751) + +- Filter empty partitions before exporting tables to remote storage. [#35542](https://github.com/apache/doris/pull/35542) + +- Optimize routine load task allocation algorithm to balance the load among Backends. [#34778](https://github.com/apache/doris/pull/34778) + +- Provide hints when a related variable is not found during a set operation. [#35775](https://github.com/apache/doris/pull/35775) + +- Support placing Java UDF jar files in the FE's `custom_lib` directory for default loading. [#35984](https://github.com/apache/doris/pull/35984) + +- Add a timeout global variable `audit_plugin_load_timeout` for audit log load jobs. + +- Optimize the performance of transparent rewrite planning for asynchronous materialized views. + +- Optimize the `INSERT` operation so that the BE does not execute it when the source is empty. 
[#34418](https://github.com/apache/doris/pull/34418) + +- Support fetching file lists of Hive/Hudi tables in batches. In a scenario with 1.2 million files, the time taken to obtain the list of files has been reduced from 390 seconds to 46 seconds. [#35107](https://github.com/apache/doris/pull/35107) + +- Forbid dynamic partitioning when creating asynchronous materialized views. + +- Support detecting whether the partition data of external tables in Hive is synchronized with asynchronous materialized views. + +- Allow creating indexes on asynchronous materialized views. + +## Bug fixes + +### Query optimizer + +- Fixed the issue where SQL cache returned old results after truncating a partition. [#34698](https://github.com/apache/doris/pull/34698) + +- Fixed the issue where casting from JSON to other types did not correctly handle nullable attributes. [#34707](https://github.com/apache/doris/pull/34707) + +- Fixed occasional DATETIMEV2 literal simplification errors. [#35153](https://github.com/apache/doris/pull/35153) + +- Fixed the issue where `COUNT(*)` could not be used in window functions. [#35220](https://github.com/apache/doris/pull/35220) + +- Fixed the issue where nullable attributes could be incorrect when all `SELECT` statements under `UNION ALL` have no `FROM` clause. [#35074](https://github.com/apache/doris/pull/35074) + +- Fixed the issue where `bitmap in join` and subquery unnesting could not be used simultaneously. [#35435](https://github.com/apache/doris/pull/35435) + +- Fixed the performance issue where filter conditions could not be pushed down to the CTE producer in specific situations. [#35463](https://github.com/apache/doris/pull/35463) + +- Fixed the issue where aggregate combinators written in uppercase could not be found. [#35540](https://github.com/apache/doris/pull/35540) + +- Fixed the performance issue where window functions were not properly pruned by column pruning. 
[#35504](https://github.com/apache/doris/pull/35504) + +- Fixed the issue where queries might parse incorrectly leading to wrong results when multiple tables with the same name but in different databases appeared simultaneously in the query. [#35571](https://github.com/apache/doris/pull/35571) + +- Fixed the query error caused by generating runtime filters during schema table scans. [#35655](https://github.com/apache/doris/pull/35655) + +- Fixed the issue where nested correlated subqueries could not execute because the join condition was folded into a null literal. [#35811](https://github.com/apache/doris/pull/35811) + +- Fixed the occasional issue where decimal literals were set with incorrect precision during planning. [#36055](https://github.com/apache/doris/pull/36055) + +- Fixed the occasional issue where multiple layers of aggregation were merged incorrectly during planning. [#36145](https://github.com/apache/doris/pull/36145) + +- Fixed the occasional issue where the input-output mismatch error occurred after aggregate expansion planning. [#36207](https://github.com/apache/doris/pull/36207) + +- Fixed the occasional issue where `<=>` was incorrectly converted to `=`. [#36521](https://github.com/apache/doris/pull/36521) + +### Query execution + +- Fixed the issue where the query hangs if the limited rows are reached on the pipeline engine and memory is not released. [#35746](https://github.com/apache/doris/pull/35746) + +- Fixed the BE coredump when `enable_decimal256` is true but falls back to the old planner. [#35731](https://github.com/apache/doris/pull/35731) + +### Asynchronous materialized views + +- Fixed the issue in the asynchronous materialized view build where the store_row_column attribute specified was not being recognized by the core. + +- Fixed the problem in the asynchronous materialized view build where specifying the storage_medium was not taking effect. 
+ +- Resolved the error occurring in the asynchronous materialized view show partitions after the base table is deleted. + +- Fixed the issue where asynchronous materialized views caused backup and restore exceptions. [#35703](https://github.com/apache/doris/pull/35703) + +- Fixed the issue where partition rewrite could lead to incorrect results. [#35236](https://github.com/apache/doris/pull/35236) + +### Semi-structured + +- Fixed the core dump problem when a VARIANT with an empty key is used. [#35671](https://github.com/apache/doris/pull/35671) +- Bitmap and BloomFilter index should not perform light index changes. [#35225](https://github.com/apache/doris/pull/35225) + +### Primary key + +- Fixed the issue where an exception BE restart occurred in the case of partial column updates during import, which could result in duplicate keys. [#35678](https://github.com/apache/doris/pull/35678) + +- Fixed the issue where BE might core dump during clone operations when memory is tight. [#34702](https://github.com/apache/doris/pull/34702) + +### Data Lakehouse + +- Fixed the issue where a Hive table could not be created with a fully qualified name such as `ctl.db.tbl` [#34984](https://github.com/apache/doris/pull/34984) + +- Fixed the issue where the Hive metastore connection did not close when refreshing [#35426](https://github.com/apache/doris/pull/35426) + +- Fixed a potential meta replay issue when upgrading from 2.0.x to 2.1.x. [#35532](https://github.com/apache/doris/pull/35532) + +- Fixed the issue where the Table Valued Function could not read an empty snappy compressed file. 
[#34926](https://github.com/apache/doris/pull/34926) + +- Fixed the issue where Parquet files with invalid min-max column statistics could not be read [#35041](https://github.com/apache/doris/pull/35041) + +- Fixed the issue where pushdown predicates with null-aware functions could not be handled in the Parquet/ORC reader [#35335](https://github.com/apache/doris/pull/35335) + +- Fixed the issue with the order of partition columns when creating a Hive table [#35347](https://github.com/apache/doris/pull/35347) + +- Fixed the issue where writing to a Hive table on S3 failed when partition values contained spaces [#35645](https://github.com/apache/doris/pull/35645) + +- Fixed the issue of an incorrect scheme in the Aliyun OSS endpoint [#34907](https://github.com/apache/doris/pull/34907) + +- Fixed the issue where the Parquet format Hive table written by Doris could not be read by Hive [#34981](https://github.com/apache/doris/pull/34981) + +- Fixed the issue where ORC files could not be read after the schema change of a Hive table [#35583](https://github.com/apache/doris/pull/35583) + +- Fixed the issue where Paimon tables could not be read via JNI after the schema change of the Paimon table [#35309](https://github.com/apache/doris/pull/35309) + +- Fixed the issue of Row Groups being too small in written Parquet format files. 
[#36042](https://github.com/apache/doris/pull/36042) [#36143](https://github.com/apache/doris/pull/36143) + +- Fixed the issue where Paimon tables could not be read after schema changes [#36049](https://github.com/apache/doris/pull/36049) + +- Fixed the issue where Hive Parquet format tables could not be read after schema changes [#36182](https://github.com/apache/doris/pull/36182) + +- Fixed the FE OOM issue caused by Hadoop FS cache [#36403](https://github.com/apache/doris/pull/36403) + +- Fixed the issue where FE could not start after enabling the Hive Metastore Listener [#36533](https://github.com/apache/doris/pull/36533) + +- Fixed the issue of query performance degradation with a large number of files [#36431](https://github.com/apache/doris/pull/36431) + +- Fixed the timezone issue when reading the timestamp column type in Iceberg [#36435](https://github.com/apache/doris/pull/36435) + +- Fixed DATETIME conversion error and data path error on Iceberg Table. [#35708](https://github.com/apache/doris/pull/35708) + +- Support retaining the additional user-defined properties of Table Valued Functions and passing them to the S3 SDK. [#35515](https://github.com/apache/doris/pull/35515) + + +### Data import + +- Fixed the issue where `CANCEL LOAD` did not work [#35352](https://github.com/apache/doris/pull/35352) + +- Fixed the issue where a null pointer error in the Publish phase of load transactions prevented the load from completing [#35977](https://github.com/apache/doris/pull/35977) + +- Fixed the issue with bRPC serializing large data files when sent via HTTP [#36169](https://github.com/apache/doris/pull/36169) + +### Data management + +- Fixed the issue where the resource tag in ConnectionContext was not set after forwarding DDL or DML to master FE. 
[#35618](https://github.com/apache/doris/pull/35618) + +- Fixed the issue where the restored table name was incorrect when `lower_case_table_names` was enabled [#35508](https://github.com/apache/doris/pull/35508) + +- Fixed the issue where `admin clean trash` could not work [#35271](https://github.com/apache/doris/pull/35271) + +- Fixed the issue where a storage policy could not be deleted from a partition [#35874](https://github.com/apache/doris/pull/35874) + +- Fixed the issue of data loss when importing into a multi-replica automatic partition table [#36586](https://github.com/apache/doris/pull/36586) + +- Fixed the issue where the partition column of a table changed when querying or inserting into an automatic partition table using the old optimizer [#36514](https://github.com/apache/doris/pull/36514) + +### Memory management + +- Fixed the issue of frequent errors in the logs due to failure in obtaining Cgroup meminfo. [#35425](https://github.com/apache/doris/pull/35425) + +- Fixed the issue where the Segment cache size was uncontrolled when using BloomFilter, leading to abnormal process memory growth. [#34871](https://github.com/apache/doris/pull/34871) + +### Permissions + +- Fixed the issue where permission settings were ineffective after enabling case-insensitive table names. [#36557](https://github.com/apache/doris/pull/36557) + +- Fixed the issue where setting LDAP passwords through non-Master FE nodes did not take effect. [#36598](https://github.com/apache/doris/pull/36598) + +- Fixed the issue where authorization could not be checked for the `SELECT COUNT(*)` statement. [#35465](https://github.com/apache/doris/pull/35465) + +### Others + +- Fixed the issue where the client JDBC program could not close the connection if the MySQL connection was broken. [#36616](https://github.com/apache/doris/pull/36616) + +- Fixed MySQL protocol compatibility issue with the `SHOW PROCEDURE STATUS` statement. 
[#35350](https://github.com/apache/doris/pull/35350) + +- The `libevent` now forces Keepalive to solve the issue of connection leaks in certain situations. [#36088](https://github.com/apache/doris/pull/36088) + +## Credits + +Thanks to every one who contributes to this release. + +@airborne12, @amorynan, @AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @caiconghui, @CalvinKirs, @cambyzju, @catpineapple, @cjj2010, @csun5285, @DarvenDuan, @dataroaring, @deardeng, @Doris-Extras, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @Gabriel39, @gavinchou, @GoGoWen, @HappenLee, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @lide-reed, @luennng, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @nextdreamblue, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @TangSiyang2001, @Thearas, @Vallishp, @w41ter, @wangbo, @whutpencil, @wsjz, @wuwenchi, @xiaokang, @xiedeyantu, @XieJiann, @xinyiZzz, @XuPengfei-1020, @xy720, @xzj7019, @yiguolei, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zfr9527, @zgxme, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md new file mode 100644 index 0000000000000..7c1910eeae8c5 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md @@ -0,0 +1,395 @@ +--- +{ + "title": "Release 2.1.5", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.5 was officially released on July 24, 2024.** In this update, we have optimized various functional experiences for data lakehouse and high concurrency scenarios, functionalities of asynchronous materialized views. 
Additionally, we have implemented several improvements and bug fixes to enhance stability. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The default connection pool size for the JDBC Catalog has been increased from 10 to 30 to prevent connection exhaustion in high-concurrency scenarios. [#37023](https://github.com/apache/doris/pull/37023). + +- The system's reserved memory (low water mark) has been adjusted to `min(6.4GB, MemTotal * 5%)` to mitigate BE OOM issues. + +- When processing multiple statements in a single request, only the last statement's result is returned if the `CLIENT_MULTI_STATEMENTS` flag is not set. + +- Direct modifications to data in asynchronous materialized views are no longer permitted. [#37129](https://github.com/apache/doris/pull/37129) + +- A session variable `use_max_length_of_varchar_in_ctas` has been added to control the behavior of varchar and char type length generation during CTAS (Create Table As Select). The default value is true. When set to false, the derived varchar length is used instead of the maximum length. [#37284](https://github.com/apache/doris/pull/37284) + +- Statistics collection now estimates the number of rows in Hive tables based on file size by default. [#37694](https://github.com/apache/doris/pull/37694) + +- Transparent rewrite for asynchronous materialized views is now enabled by default. [#35897](https://github.com/apache/doris/pull/35897) + +- Transparent rewrite utilizes partitioned materialized views. If partitions fail, the base tables are unioned with the materialized view to ensure data correctness. [#35897](https://github.com/apache/doris/pull/35897) + +## New features + +### Lakehouse + +- The session variable `read_csv_empty_line_as_null` can be used to control whether empty lines are ignored when reading CSV format files. 
[#37153](https://github.com/apache/doris/pull/37153) + + By default, empty lines are ignored. When set to true, empty lines will be read as rows where all columns are null. + +- Compatibility with Presto's complex type output format can be enabled by setting `serde_dialect="presto"`. [#37253](https://github.com/apache/doris/pull/37253) + +### Multi-Table Materialized View + +- Supports non-deterministic functions in materialized view building. [#37651](https://github.com/apache/doris/pull/37651) + +- Atomically replaces definitions of asynchronous materialized views. [#37147](https://github.com/apache/doris/pull/37147) + +- Views creation statements can be viewed via `SHOW CREATE MATERIALIZED VIEW`. [#37125](https://github.com/apache/doris/pull/37125) + +- Transparent rewrites for multi-dimensional aggregation and non-aggregate queries. [#37436](https://github.com/apache/doris/pull/37436) [#37497](https://github.com/apache/doris/pull/37497) + +- Supports DISTINCT aggregations with key columns and partitioning for roll-ups. [#37651](https://github.com/apache/doris/pull/37651) + +- Support for partitioning materialized views to roll up partitions using `date_trunc` [#31812](https://github.com/apache/doris/pull/31812) [#35562](https://github.com/apache/doris/pull/35562) + +- Partitioned table-valued functions (TVFs) are supported. [#36479](https://github.com/apache/doris/pull/36479) + +### Semi-Structured Data Management + +- Tables using the VARIANT type now support partial column updates. [#34925](https://github.com/apache/doris/pull/34925) + +- PreparedStatement support is now enabled by default. [#36581](https://github.com/apache/doris/pull/36581) + +- The VARIANT type can be exported to CSV format. [#37857](https://github.com/apache/doris/pull/37857) + +- `explode_json_object` function transposes JSON Object rows into columns. 
[#36887](https://github.com/apache/doris/pull/36887) + +- The ES Catalog now maps ES NESTED or OBJECT types to the Doris JSON type.[#37101](https://github.com/apache/doris/pull/37101) + +- By default, support_phrase is enabled for inverted indexes with specified analyzers to improve the performance of match_phrase series queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Query Optimizer + +- Support for explaining `DELETE FROM` statements. [#37100](https://github.com/apache/doris/pull/37100) + +- Support for hint form of constant expression parameters [#37988](https://github.com/apache/doris/pull/37988) + +### Memory Management + +- Added an HTTP API to clear the cache. [#36599](https://github.com/apache/doris/pull/36599) + +### Permissions + +- Support for authorization of resources within Table-Valued Functions (TVFs) [#37132](https://github.com/apache/doris/pull/37132) + +## Improvements + +### Lakehouse + +- Upgraded Paimon to version 0.8.1 + +- Fixes ClassNotFoundException for org.apache.commons.lang.StringUtils when querying Paimon tables. [#37512](https://github.com/apache/doris/pull/37512) + +- Added support for Tencent Cloud LakeFS. [#36891](https://github.com/apache/doris/pull/36891) + +- Optimized the timeout duration when fetching file lists for external table queries. [#36842](https://github.com/apache/doris/pull/36842) + +- Configurable via the session variable `fetch_splits_max_wait_time_ms`. + +- Improved default connection logic for SQLServer JDBC Catalog. [#36971](https://github.com/apache/doris/pull/36971) + + By default, the connection encryption settings are not intervened. Only when `force_sqlserver_jdbc_encrypt_false` is set to true, encrypt=false is forcibly added to the JDBC URL to reduce authentication errors. This allows for more flexible control over encryption behavior, enabling it to be turned on or off as needed. + +- Added serde properties to the show create table statements for Hive tables. 
[#37096](https://github.com/apache/doris/pull/37096) + +- Changed the default cache time for Hive table lists on the FE from 1 day to 4 hours. + +- Data export (Export/Outfile) now supports specifying compression formats for Parquet and ORC. + + For more information, please refer to [docs](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type). + +- When creating a table using CTAS+TVF, partition columns in the TVF are automatically mapped to Varchar(65533) instead of String, allowing them to be used as partition columns for internal tables. [#37161](https://github.com/apache/doris/pull/37161) + +- Optimized the number of metadata accesses for Hive write operations. [#37127](https://github.com/apache/doris/pull/37127) + +- ES Catalog now supports mapping nested/object types to Doris's JSON type. [#37182](https://github.com/apache/doris/pull/37182) + +- Improved error messages when connecting to Oracle using older versions of the ojdbc driver. [#37634](https://github.com/apache/doris/pull/37634) + +- When Hudi tables return an empty set during Incremental Read, Doris now also returns an empty set instead of an error. [#37636](https://github.com/apache/doris/pull/37636) + +- Fixed an issue where inner-outer table join queries could lead to FE timeouts in some cases. [#37757](https://github.com/apache/doris/pull/37757) + +- Fixed an issue with FE metadata replay errors during upgrades from older versions to newer versions when the Hive metastore event listener is enabled. [#37757](https://github.com/apache/doris/pull/37757) + +### Multi-Table Materialized View + +- Automate key column selection for asynchronous materialized views. [#36601](https://github.com/apache/doris/pull/36601) + +- Support `date_trunc` in materialized view partition definitions. [#35562](https://github.com/apache/doris/pull/35562) + +- Enable transparent rewrites across nested materialized view aggregations.
[#37651](https://github.com/apache/doris/pull/37651) + +- Asynchronous materialized views remain available when schema changes do not affect the correctness of their data. [#37122](https://github.com/apache/doris/pull/37122) + +- Improve planning speed for transparent rewrites. [#37935](https://github.com/apache/doris/pull/37935) + +- When calculating the availability of asynchronous materialized views, the current refresh status is no longer taken into account. [#36617](https://github.com/apache/doris/pull/36617) + +### Semi-Structured Data Management + +- Optimize DESC performance for viewing VARIANT sub-columns through sampling. [#37217](https://github.com/apache/doris/pull/37217) + +- Support for special JSON data with empty keys in the JSON type. [#36762](https://github.com/apache/doris/pull/36762) + +### Inverted Index + +- Reduce latency by minimizing the invocation of inverted index exists to avoid delays in accessing object storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Optimize the overhead of the inverted index query process. [#35357](https://github.com/apache/doris/pull/35357) + +- Prevent inverted indices in materialized views. [#36869](https://github.com/apache/doris/pull/36869) + +### Query Optimizer + +- When both sides of a comparison expression are literals, the string literal will attempt to convert to the type of the other side. [#36921](https://github.com/apache/doris/pull/36921) + +- Refactored the sub-path pushdown functionality for the variant type, now better supporting complex pushdown scenarios. [#36923](https://github.com/apache/doris/pull/36923) + +- Optimized the logic for calculating the cost of materialized views, enabling more accurate selection of lower-cost materialized views. [#37098](https://github.com/apache/doris/pull/37098) + +- Improved the SQL cache planning speed when using user variables in SQL. 
[#37119](https://github.com/apache/doris/pull/37119) + +- Optimized the row estimation logic for NOT NULL expressions, resulting in better performance when NOT NULL is present in queries. [#37498](https://github.com/apache/doris/pull/37498) + +- Optimized the null rejection derivation logic for LIKE expressions. [#37864](https://github.com/apache/doris/pull/37864) + +- Improved error messages when querying a specific partition fails, making it clearer which table is causing the issue. [#37280](https://github.com/apache/doris/pull/37280) + +### Query Execution + +- Improved the performance of the bitmap_union operator up to 3 times in certain scenarios. + +- Enhanced the reading performance of Arrow Flight in ARM environments. + +- Optimized the execution performance of the explode, explode_map, and explode_json functions. + +### Data Loading + +- Support setting `max_filter_ratio` for `INSERT INTO ... FROM TABLE VALUE FUNCTION` + +## Bug fixes + +### Lakehouse + +- Fixed an issue that caused BE crashes in some cases when querying Parquet format [#37086](https://github.com/apache/doris/pull/37086) + +- Fixed an issue where BE printed excessive logs when querying Parquet format. [#37012](https://github.com/apache/doris/pull/37012) + +- Fixed an issue where the FE side created a large number of duplicate FileSystem objects in some cases. [#37142](https://github.com/apache/doris/pull/37142) + +- Fixed an issue where transaction information was not cleaned up after writing to Hive in some cases. [#37172](https://github.com/apache/doris/pull/37172) + +- Fixed a thread leak issue caused by Hive table write operations in some cases. [#37247](https://github.com/apache/doris/pull/37247) + +- Fixed an issue where Hive Text format row and column delimiters could not be correctly obtained in some cases. [#37188](https://github.com/apache/doris/pull/37188) + +- Fixed a concurrency issue when reading lz4 compressed blocks in some cases. 
[#37187](https://github.com/apache/doris/pull/37187) + +- Fixed an issue where `count(*)` on Iceberg tables returned incorrect results in some cases. [#37810](https://github.com/apache/doris/pull/37810) + +- Fixed an issue where creating a Paimon catalog based on MinIO caused FE metadata replay errors in some cases. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed an issue where using Ranger to create a catalog caused the client to hang in some cases. [#37551](https://github.com/apache/doris/pull/37551) + +### Multi-Table Materialized View + +- Fixed an issue where adding new partitions to the base table could lead to incorrect results after partition aggregation roll-up rewrites. [#37651](https://github.com/apache/doris/pull/37651) + +- Fixed an issue where the materialized view partition status was not set to out-of-sync after deleting associated base table partitions. [#36602](https://github.com/apache/doris/pull/36602) + +- Fixed an occasional deadlock issue during asynchronous materialized view builds. [#37133](https://github.com/apache/doris/pull/37133) + +- Fixed an occasional "nereids cost too much time" error when refreshing a large number of partitions in a single asynchronous materialized view refresh. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed an issue where an asynchronous materialized view could not be created if the final select list contained a null literal. [#37281](https://github.com/apache/doris/pull/37281) + +- Fixed an issue with single-table materialized views where, even though the aggregation materialized view was successfully rewritten, the CBO did not select it. [#35721](https://github.com/apache/doris/pull/35721) [#36058](https://github.com/apache/doris/pull/36058) + +- Fixed an issue where partition derivation failed when building a partitioned materialized view with both join inputs being aggregations. 
[#34781](https://github.com/apache/doris/pull/34781) + +### Semi-Structured Data Management + +- Fixed issues with VARIANT in special cases such as concurrency and abnormal data.[#37976](https://github.com/apache/doris/pull/37976) [#37839](https://github.com/apache/doris/pull/37839) [#37794](https://github.com/apache/doris/pull/37794) [#37674](https://github.com/apache/doris/pull/37674) [#36997](https://github.com/apache/doris/pull/36997) + +- Fixed coredump issues when using VARIANT in unsupported SQL. [#37640](https://github.com/apache/doris/pull/37640) + +- Fixed coredump issues related to MAP data type when upgrading from 1.x to 2.x or higher versions. [#36937](https://github.com/apache/doris/pull/36937) + +- Improved ES Catalog support for Array types. [#36936](https://github.com/apache/doris/pull/36936) + +### Inverted Index + +- Fixed an issue where DROP INDEX for Inverted Index v2 did not delete metadata. [#37646](https://github.com/apache/doris/pull/37646) + +- Fixed query accuracy issues when string length exceeded the "ignore above" threshold. [#37679](https://github.com/apache/doris/pull/37679) + +- Fixed issues with index size statistics. [#37232](https://github.com/apache/doris/pull/37232) [#37564](https://github.com/apache/doris/pull/37564) + +### Query Optimizer + +- Fixed an issue that prevented import operations from executing due to the use of reserved keywords. [#35938](https://github.com/apache/doris/pull/35938) + +- Fixed a type error where char(255) was incorrectly recorded as char(1) when creating a table. [#37671](https://github.com/apache/doris/pull/37671) + +- Fixed incorrect results when the join expression in a correlated subquery was a complex expression. [#37683](https://github.com/apache/doris/pull/37683) + +- Fixed a potential issue with incorrect bucket pruning for decimal types. 
[#38013](https://github.com/apache/doris/pull/38013) + +- Fixed incorrect aggregation operator results when pipeline local shuffle was enabled in certain scenarios. [#38016](https://github.com/apache/doris/pull/38016) + +- Fixed planning errors that could occur when equal expressions existed in aggregation operators. [#36622](https://github.com/apache/doris/pull/36622) + +- Fixed planning errors that could occur when lambda expressions were present in aggregation operators. [#37285](https://github.com/apache/doris/pull/37285) + +- Fixed an issue where a literal generated from a window function being optimized to a literal had the wrong type, preventing execution. [#37283](https://github.com/apache/doris/pull/37283) + +- Fixed an issue with the null attribute being incorrectly output by the aggregate function foreach combinator. [#37980](https://github.com/apache/doris/pull/37980) + +- Fixed an issue where the acos function could not be planned when its parameter was a literal out of range. [#37996](https://github.com/apache/doris/pull/37996) + +- Fixed planning errors when specifying partitions for a query on a synchronized materialized view. [#36982](https://github.com/apache/doris/pull/36982) + +- Fixed occasional Null Pointer Exceptions (NPEs) during planning. [#38024](https://github.com/apache/doris/pull/38024) + +### Query Execution + +- Fixed an error in delete where statements when using decimal data types as conditions. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed an issue where BE memory was not released after query execution ended. [#37792](https://github.com/apache/doris/pull/37792) [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed a problem where audit logs occupied too much FE memory under high QPS scenarios. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed BE core dumps when the sleep function received illegal input values. 
[#37681](https://github.com/apache/doris/pull/37681) + +- Fixed an error encountered during sync filter size execution. [#37103](https://github.com/apache/doris/pull/37103) + +- Fixed incorrect results when using time zones during execution. [#37062](https://github.com/apache/doris/pull/37062) + +- Fixed incorrect results when casting strings to integers. [#36788](https://github.com/apache/doris/pull/36788) + +- Fixed query errors when using the Arrow Flight protocol with pipelinex enabled. [#35804](https://github.com/apache/doris/pull/35804) + +- Fixed errors when casting strings to dates/datetimes. [#35637](https://github.com/apache/doris/pull/35637) + +- Fixed BE core dumps during large table join queries using <=>. [#36263](https://github.com/apache/doris/pull/36263) + +### Storage Management + +- Fixed the issue of invisible DELETE SIGN data encountered during column update and write operations. [#36755](https://github.com/apache/doris/pull/36755) + +- Optimized FE's memory usage during schema changes. [#36756](https://github.com/apache/doris/pull/36756) + +- Fixed the issue where BE would hang during restart due to transactions not being aborted [#36437](https://github.com/apache/doris/pull/36437) + +- Fixed occasional errors when changing from NOT NULL to NULL data types. [#36389](https://github.com/apache/doris/pull/36389) + +- Optimized replica repair scheduling when BE goes down. [#36897](https://github.com/apache/doris/pull/36897) + +- Supported round-robin disk selection for tablet creation on a single BE. [#36900](https://github.com/apache/doris/pull/36900) + +- Fixed query error -230 caused by slow publishing. [#36222](https://github.com/apache/doris/pull/36222) + +- Improved the speed of partition balancing. [#36976](https://github.com/apache/doris/pull/36976) + +- Controlled segment cache using the number of file descriptors (FDs) and memory to avoid FD exhaustion. 
[#37035](https://github.com/apache/doris/pull/37035) + +- Fixed potential replica loss caused by concurrent clone and alter operations. [#36858](https://github.com/apache/doris/pull/36858) + +- Fixed the issue of not being able to adjust column order. [#37226](https://github.com/apache/doris/pull/37226) + +- Prohibited certain schema change operations on auto-increment columns. [#37331](https://github.com/apache/doris/pull/37331) + +- Fixed inaccurate error reporting for DELETE operations. [#37374](https://github.com/apache/doris/pull/37374) + +- Adjusted the trash expiration time on the BE side to one day. [#37409](https://github.com/apache/doris/pull/37409) + +- Optimized compaction memory usage and scheduling. [#37491](https://github.com/apache/doris/pull/37491) + +- Checked for potential oversized backups causing FE restarts. [#37466](https://github.com/apache/doris/pull/37466) + +- Restored dynamic partition deletion policies and cross-partition behaviors to 2.1.3. [#37570](https://github.com/apache/doris/pull/37570) [#37506](https://github.com/apache/doris/pull/37506) + +- Fixed errors related to decimal types in DELETE predicates. [#37710](https://github.com/apache/doris/pull/37710) + +### Data Loading + +- Fixed data invisibility issues caused by race conditions in error handling during imports. [#36744](https://github.com/apache/doris/pull/36744) + +- Added support for `hll_from_base64` in Stream Load imports. [#36819](https://github.com/apache/doris/pull/36819) + +- Fixed potential FE OOM issues when importing very large numbers of tablets for a single table. [#36944](https://github.com/apache/doris/pull/36944) + +- Fixed possible auto-increment column duplication during FE master-slave switchovers. [#36961](https://github.com/apache/doris/pull/36961) + +- Fixed errors when using `INSERT INTO ... SELECT` with auto-increment columns. [#37029](https://github.com/apache/doris/pull/37029) + +- Reduced the number of data flush threads to optimize memory usage.
[#37092](https://github.com/apache/doris/pull/37092) + +- Improved automatic recovery and error messaging for routine load tasks. [#37371](https://github.com/apache/doris/pull/37371) + +- Increased the default batch size for routine load. [#37388](https://github.com/apache/doris/pull/37388) + +- Fixed routine load task stoppage due to Kafka EOF expiration. [#37983](https://github.com/apache/doris/pull/37983) + +- Fixed coredump issues in multi-table streaming. [#37370](https://github.com/apache/doris/pull/37370) + +- Fixed premature backpressure caused by inaccurate memory estimation in groupcommit. [#37379](https://github.com/apache/doris/pull/37379) + +- Optimized BE-side thread usage in groupcommit. [#37380](https://github.com/apache/doris/pull/37380) + +- Fixed the issue of no error URL when data was not partitioned. [#37401](https://github.com/apache/doris/pull/37401) + +- Fixed potential memory misoperations during imports. [#38021](https://github.com/apache/doris/pull/38021) + +### Merge on Write Unique Key + +- Reduced memory usage during compaction for primary key tables. [#36968](https://github.com/apache/doris/pull/36968) + +- Fixed potential duplicate data issues when primary key replica cloning fails. [#37229](https://github.com/apache/doris/pull/37229) + +### Permissions + +- Fixed the issue of missing authorization when a table-valued function references a resource. [#37132](https://github.com/apache/doris/pull/37132) + +- Fixed the issue where the SHOW ROLE statement did not include workload group permissions. [#36032](https://github.com/apache/doris/pull/36032) + +- Fixed the issue where executing two statements simultaneously when creating a row policy could cause FE to fail to restart. [#37342](https://github.com/apache/doris/pull/37342) + +- Fixed the issue where, in some cases, upgrading from an older version could result in FE metadata replay failures due to row policies. 
[#37342](https://github.com/apache/doris/pull/37342) + +### Others + +- Fixed the issue of compute nodes participating in internal table creation. [#37961](https://github.com/apache/doris/pull/37961) + +- Fixed the read lag issue when `enable_strong_read_consistency` is set to true. [#37641](https://github.com/apache/doris/pull/37641) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md new file mode 100644 index 0000000000000..c14d25b52573f --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md @@ -0,0 +1,524 @@ +--- +{ + "title": "Release 2.1.6", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.6 was officially released on September 10, 2024.** This version brings continuous upgrades and improvements to the Lakehouse, Async Materialized Views, and Semi-Structured Data Management. Additionally, several fixes have been implemented in areas such as the query optimizer, execution engine, storage management, permission management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- Removed the `delete_if_exists` option from create repository. [#38192](https://github.com/apache/doris/pull/38192) + +- Added the `enable_prepared_stmt_audit_log` session variable to control whether JDBC prepared statements record audit logs, with the default being no recording. [#38624](https://github.com/apache/doris/pull/38624) [#39009](https://github.com/apache/doris/pull/39009) + +- Implemented fd limit and memory constraints for segment cache. [#39689](https://github.com/apache/doris/pull/39689) + +- When the FE configuration item `sys_log_mode` is set to BRIEF, file location information is added to the logs. 
[#39571](https://github.com/apache/doris/pull/39571) + +- Changed the default value of the session variable `max_allowed_packet` to 16MB. [#38697](https://github.com/apache/doris/pull/38697) + +- When a single request contains multiple statements, semicolons must be used to separate them. [#38670](https://github.com/apache/doris/pull/38670) + +- Added support for statements to begin with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Aligned type formatting with MySQL in statements such as `show create table`. [#38012](https://github.com/apache/doris/pull/38012) + +- When the new optimizer planning times out, it no longer falls back to prevent the old optimizer from using longer planning times. [#39499](https://github.com/apache/doris/pull/39499) + +## New features + +### Lakehouse + +- Supported writeback for Iceberg tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/lakehouse/datalake-building/iceberg-build). + +- SQL interception rules now support external tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Added the system table `file_cache_statistics` to view BE data cache metrics. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics). + +### Async Materialized View + +- Supported transparent rewriting during inserts. [#38115](https://github.com/apache/doris/pull/38115) + +- Supported transparent rewriting when variant types exist in queries.[ #37929](https://github.com/apache/doris/pull/37929) + +### Semi-Structured Data Management + +- Supported casting ARRAY MAP to JSON type.[ #36548](https://github.com/apache/doris/pull/36548) + +- Supported the `json_keys` function.[ #36411](https://github.com/apache/doris/pull/36411) + +- Supported specifying the JSON path $. when importing JSON. 
[#38213](https://github.com/apache/doris/pull/38213) + +- ARRAY, MAP, STRUCT types now support `replace_if_not_null`[#38304](https://github.com/apache/doris/pull/38304) + +- ARRAY, MAP, STRUCT types now support adjusting column order.[#39210](https://github.com/apache/doris/pull/39210) + +- Added the `multi_match` function to match keywords across multiple fields, with support for inverted index acceleration. [#37722](https://github.com/apache/doris/pull/37722) + +### Query Optimizer + +- Filled in the original database name, table name, column name, and alias for returned columns in the MySQL protocol. [ #38126](https://github.com/apache/doris/pull/38126) + +- Supported the aggregation function `group_concat` with both order by and distinct simultaneously. [#38080](https://github.com/apache/doris/pull/38080) + +- SQL cache now supports reusing cached results for queries with different comments. [#40049](https://github.com/apache/doris/pull/40049) + +- In partition pruning, supported including `date_trunc` and date functions in filter conditions. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- Allowed using the database name where the table resides as a qualifier prefix for table aliases. [#38640](https://github.com/apache/doris/pull/38640) + +- Supported hint-style comments.[#39113](https://github.com/apache/doris/pull/39113) + +### Others + +- Added the system table `table_properties` for viewing table properties. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_properties). + +- Introduced deadlock and slow lock detection in FE. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/maint-monitor/frontend-lock-manager). + +## Improvements + +### Lakehouse + +- Reimplemented the external table metadata caching mechanism. 
+ + - For details, refer to the [documentation](https://doris.apache.org/docs/lakehouse/metacache). + +- Added the session variable `keep_carriage_return` with a default value of false. By default, reading Hive Text format tables treats both `\r\n` and `\n` as newline characters. [#38099](https://github.com/apache/doris/pull/38099) + +- Optimized memory statistics for Parquet/ORC file read/write operations.[#37257](https://github.com/apache/doris/pull/37257) + +- Supported pushing down IN/NOT IN predicates for Paimon tables. [#38390](https://github.com/apache/doris/pull/38390) + +- Enhanced the optimizer to support Time Travel syntax for Hudi tables. [#38591](https://github.com/apache/doris/pull/38591) + +- Optimized Kerberos authentication-related processes. [ #37301](https://github.com/apache/doris/pull/37301) + +- Enabled reading Hive tables after renaming column operations. [#38809](https://github.com/apache/doris/pull/38809) + +- Optimized the reading performance of partition columns for external tables. [#38810](https://github.com/apache/doris/pull/38810) + +- Improved the data shard merging strategy during external table query planning to avoid performance degradation caused by a large number of small shards.[#38964](https://github.com/apache/doris/pull/38964) + +- Added attributes such as location to `SHOW CREATE DATABASE/TABLE`. [#39644](https://github.com/apache/doris/pull/39644) + +- Supported complex types in MaxCompute Catalog. [#39822](https://github.com/apache/doris/pull/39822) + +- Optimized the file cache loading strategy by using asynchronous loading to avoid long BE startup times. [#39036](https://github.com/apache/doris/pull/39036) + +- Improved the file cache eviction strategy, such as evicting locks held for extended periods. [#39721](https://github.com/apache/doris/pull/39721) + +### Async Materialized View + +- Supported hourly, weekly, and quarterly partition roll-up construction. 
[#37678](https://github.com/apache/doris/pull/37678) + +- For materialized views based on Hive external tables, the metadata cache is now updated before refresh to ensure the latest data is obtained during each refresh. [#38212](https://github.com/apache/doris/pull/38212) + +- Improved the performance of transparent rewrite planning in storage-compute decoupled mode by batch fetching metadata. [#39301](https://github.com/apache/doris/pull/39301) + +- Enhanced the performance of transparent rewrite planning by prohibiting duplicate enumerations. [#39541](https://github.com/apache/doris/pull/39541) + +- Improved the performance of transparent rewrite for refreshing materialized views based on Hive external table partitions.[#38525](https://github.com/apache/doris/pull/38525) + +### Semi-Structured Data Management + +- Optimized memory allocation for TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- Enhanced the performance of string processing in inverted indexes.[#37395](https://github.com/apache/doris/pull/37395) + +- Optimized the performance of inverted indexes in MOW tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Supported specifying the row-store `page_size` during table creation to control compression effectiveness. [#37145](https://github.com/apache/doris/pull/37145) + +### Query Optimizer + +- Adjusted the row count estimation algorithm for mark joins, resulting in more accurate cardinality estimates for mark joins. [#38270](https://github.com/apache/doris/pull/38270) + +- Optimized the cost estimation algorithm for semi/anti joins, enabling more accurate selection of semi/anti join orders. [#37951](https://github.com/apache/doris/pull/37951) + +- Adjusted the filter estimation algorithm for cases where some columns have no statistical information, leading to more accurate cardinality estimates. 
[#39592](https://github.com/apache/doris/pull/39592) + +- Modified the instance calculation logic for set operation operators to prevent insufficient parallelism in extreme cases. [#39999](https://github.com/apache/doris/pull/39999) + +- Adjusted the usage strategy of bucket shuffle, achieving better performance when data is not sufficiently shuffled. [#36784](https://github.com/apache/doris/pull/36784) + +- Enabled early filtering of window function data, supporting multiple window functions in a single projection. [#38393](https://github.com/apache/doris/pull/38393) + +- When a `NullLiteral` exists in a filter condition, it can now be folded into false, further converted to an `EmptySet` to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Expanded the scope of predicate derivation, reducing data scanning in queries with specific patterns. [#37314](https://github.com/apache/doris/pull/37314) + +- Supported partial short-circuit evaluation logic in partition pruning to improve partition pruning performance, achieving over 100% improvement in specific scenarios. [#38191](https://github.com/apache/doris/pull/38191) + +- Enabled the computation of arbitrary scalar functions within user variables. [#39144](https://github.com/apache/doris/pull/39144) + +- Maintained error messages consistent with MySQL when alias conflicts exist in queries. [#38104](https://github.com/apache/doris/pull/38104) + +### Query Execution + +- Adapted AggState for compatibility from 2.1 to 3.x and fixed coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle when no joins are involved. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to an asynchronous approach to prevent blocking during internal table queries. 
[#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process when building hash tables in Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Reduced the lock holding time for MultiCast operations. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC's keepAliveTime and added a connection monitoring mechanism, reducing the probability of query failures due to RPC errors during query execution. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits are exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Improved the performance of `aes_encrypt`/`aes_decrypt` functions when handling constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the performance of `json_extract` functions when processing constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the performance of ParseURL functions when processing constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Backup Recovery / CCR + +- Restore now supports deleting redundant tablets and partition options. [#39363](https://github.com/apache/doris/pull/39363) + +- Checks storage connectivity when creating a repository. [#39538](https://github.com/apache/doris/pull/39538) + +- Enables binlog to support `DROP TABLE`, allowing CCR to incrementally synchronize `DROP TABLE` operations. [#38541](https://github.com/apache/doris/pull/38541) + +### Compaction + +- Fixes an issue where high-priority compaction tasks were not subject to task concurrency control limits. [#38189](https://github.com/apache/doris/pull/38189) + +- Automatically reduces compaction memory consumption based on data characteristics. [#37486](https://github.com/apache/doris/pull/37486) + +- Fixes an issue where the sequential data optimization strategy could lead to incorrect data in aggregate tables or MOR UNIQUE tables.
[#38299](https://github.com/apache/doris/pull/38299) + +- Optimizes the rowset selection strategy used by compaction during replica replenishment to avoid triggering -235 errors. [#39262](https://github.com/apache/doris/pull/39262) + +### MOW (Merge-On-Write) + +- Speeds up column updates that were slowed by concurrent column updates and compactions. [#38682](https://github.com/apache/doris/pull/38682) + +- Fixes an issue where segcompaction during bulk data imports could lead to incorrect MOW data. [#38992](https://github.com/apache/doris/pull/38992) [#39707](https://github.com/apache/doris/pull/39707) + +- Fixes data loss in column updates that may occur after BE restarts. [#39035](https://github.com/apache/doris/pull/39035) + +### Storage Management + +- Adds FE configuration to control whether queries under hot-cold tiering prefer local data replicas. [#38322](https://github.com/apache/doris/pull/38322) + +- Optimizes expired BE report messages to include newly created tablets. [#38839](https://github.com/apache/doris/pull/38839) [#39605](https://github.com/apache/doris/pull/39605) + +- Optimizes replica scheduling priority strategy to prioritize replicas with missing data. [#38884](https://github.com/apache/doris/pull/38884) + +- Prevents tablets with unfinished ALTER jobs from being balanced. [#39202](https://github.com/apache/doris/pull/39202) + +- Enables modifying the number of buckets for tables with list partitioning. [#39688](https://github.com/apache/doris/pull/39688) + +- Prefers querying from online disk services. [#39654](https://github.com/apache/doris/pull/39654) + +- Improves error messages for materialized view base tables that do not support deletion during synchronization. [#39857](https://github.com/apache/doris/pull/39857) + +- Improves error messages for single columns exceeding 4GB. 
[#39897](https://github.com/apache/doris/pull/39897) + +- Fixes an issue where aborted transactions were omitted when plan errors occurred during `INSERT` statements. [#38260](https://github.com/apache/doris/pull/38260) + +- Fixes exceptions during SSL connection closure. [#38677](https://github.com/apache/doris/pull/38677) + +- Fixes an issue where table locks were not held when aborting transactions using labels. [#38842](https://github.com/apache/doris/pull/38842) + +- Fixes `gson pretty` causing large image issues. [#39135](https://github.com/apache/doris/pull/39135) + +- Fixes an issue where the new optimizer did not check for bucket values of 0 in `CREATE TABLE` statements. [#38999](https://github.com/apache/doris/pull/38999) + +- Fixes errors when Chinese column names are included in `DELETE` condition predicates. [#39500](https://github.com/apache/doris/pull/39500) + +- Fixes frequent tablet balancing issues in partition balancing mode. [#39606](https://github.com/apache/doris/pull/39606) + +- Fixes an issue where partition storage policy attributes were lost. [#39677](https://github.com/apache/doris/pull/39677) + +- Fixes incorrect statistics when importing multiple tables within a transaction. [#39548](https://github.com/apache/doris/pull/39548) + +- Fixes errors when deleting random bucket tables. [#39830](https://github.com/apache/doris/pull/39830) + +- Fixes issues where FE fails to start due to non-existent UDFs. [#39868](https://github.com/apache/doris/pull/39868) + +- Fixes inconsistencies in the last failed version between FE master and slave. [#39947](https://github.com/apache/doris/pull/39947) + +- Fixes an issue where related tablets may still be in schema change state when schema change jobs are canceled. [#39327](https://github.com/apache/doris/pull/39327) + +- Fixes errors when modifying type and column order in a single statement schema change (SC). 
[#39107](https://github.com/apache/doris/pull/39107) + +### Data Loading + +- Improves error messages for -238 errors during imports. [#39182](https://github.com/apache/doris/pull/39182) + +- Allows importing to other partitions while restoring a partition. [#39915](https://github.com/apache/doris/pull/39915) + +- Optimizes the strategy for FE to select BEs during group commit. [#37830](https://github.com/apache/doris/pull/37830) [#39010](https://github.com/apache/doris/pull/39010) + +- Avoids printing stack traces for some common streamload error messages. [#38418](https://github.com/apache/doris/pull/38418) + +- Improves handling of issues where offline BEs may affect import errors. [#38256](https://github.com/apache/doris/pull/38256) + +### Permissions + +- Optimizes access performance after enabling the Ranger authentication plugin. [#38575](https://github.com/apache/doris/pull/38575) +- Optimizes permission strategies for Refresh Catalog/Database/Table operations, allowing users to perform these operations with only SHOW permissions. [#39008](https://github.com/apache/doris/pull/39008) + +## Bug fixes + +### Lakehouse + +- Fixes the issue where switching catalogs may result in an error of not finding the database. [#38114](https://github.com/apache/doris/pull/38114) + +- Addresses exceptions caused by attempting to read non-existent data on S3. [#38253](https://github.com/apache/doris/pull/38253) + +- Resolves the issue where specifying an abnormal path during export operations may lead to incorrect export locations. [#38602](https://github.com/apache/doris/pull/38602) + +- Fixes the timezone issue for time columns in Paimon tables. [#37716](https://github.com/apache/doris/pull/37716) + +- Temporarily disables the Parquet PageIndex feature to avoid certain erroneous behaviors. + +- Corrects the selection of Backend nodes in the blacklist during external table queries. 
[#38984](https://github.com/apache/doris/pull/38984) + +- Resolves errors caused by missing subcolumns in Parquet Struct column types.[#39192](https://github.com/apache/doris/pull/39192) + +- Addresses several issues with predicate pushdown in JDBC Catalog. [#39082](https://github.com/apache/doris/pull/39082) + +- Fixes issues where some historical Parquet formats led to incorrect query results. [#39375](https://github.com/apache/doris/pull/39375) + +- Improves compatibility with ojdbc6 drivers for Oracle JDBC Catalog. [#39408](https://github.com/apache/doris/pull/39408) + +- Resolves potential FE memory leaks caused by Refresh Catalog/Database/Table operations. [#39186](https://github.com/apache/doris/pull/39186) [#39871](https://github.com/apache/doris/pull/39871) + +- Fixes thread leaks in JDBC Catalog under certain conditions. [#39666](https://github.com/apache/doris/pull/39666) [#39582](https://github.com/apache/doris/pull/39582) + +- Addresses potential event processing failures after enabling Hive Metastore event subscription. [#39239](https://github.com/apache/doris/pull/39239) + +- Disables reading Hive Text format tables with custom escape characters and null formats to prevent data errors. [#39869](https://github.com/apache/doris/pull/39869) + +- Resolves issues accessing Iceberg tables created via the Iceberg API under certain conditions. [#39203](https://github.com/apache/doris/pull/39203) + +- Fixes the inability to read Paimon tables stored on HDFS clusters with high availability enabled. [#39876](https://github.com/apache/doris/pull/39876) + +- Addresses errors that may occur when reading Paimon table deletion vectors after enabling file caching. [#39875](https://github.com/apache/doris/pull/39875) + +- Resolves potential deadlocks when reading Parquet files under certain conditions. [#39945](https://github.com/apache/doris/pull/39945) + +### Async Materialized View + +- Fixes the inability to use `SHOW CREATE MATERIALIZED VIEW` on follower FEs. 
[#38794](https://github.com/apache/doris/pull/38794) + +- Unifies the object type of asynchronous materialized views in metadata as tables to enable proper display in data tools. [#38797](https://github.com/apache/doris/pull/38797) + +- Resolves the issue where nested asynchronous materialized views always perform full refreshes. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixes the issue where canceled tasks may show as running after restarting FEs. [#39424](https://github.com/apache/doris/pull/39424) + +- Addresses incorrect use of contexts, which may lead to unexpected failures of materialized view refresh tasks. [#39690](https://github.com/apache/doris/pull/39690) + +- Resolves issues that may cause varchar type write failures due to unreasonable lengths when creating asynchronous materialized views based on external tables. [#37668](https://github.com/apache/doris/pull/37668) + +- Fixes the potential invalidation of asynchronous materialized views based on external tables after FE restarts or catalog rebuilds. [#39355](https://github.com/apache/doris/pull/39355) + +- Prohibits the use of partition rollup for materialized views with list partitions to prevent the generation of incorrect data. [#38124](https://github.com/apache/doris/pull/38124) + +- Fixes incorrect results when literals exist in the select list during transparent rewriting for aggregation rollup. [#38958](https://github.com/apache/doris/pull/38958) + +- Addresses potential errors during transparent rewriting when queries contain filters like `a = a`. [#39629](https://github.com/apache/doris/pull/39629) + +- Fixes issues where transparent rewriting for direct external table queries fails. [#39041](https://github.com/apache/doris/pull/39041) + +### Semi-Structured Data Management + +- Removes support for prepared statements in the old optimizer. [#39465](https://github.com/apache/doris/pull/39465) + +- Fixes issues with JSON escape character handling. 
[#37251](https://github.com/apache/doris/pull/37251) + +- Resolves issues with duplicate processing of JSON fields. [#38490](https://github.com/apache/doris/pull/38490) + +- Fixes issues with some ARRAY and MAP functions. [#39307](https://github.com/apache/doris/pull/39307) [#39699](https://github.com/apache/doris/pull/39699) [#39757](https://github.com/apache/doris/pull/39757) + +- Resolves complex combinations of inverted index queries and LIKE queries. [#36687](https://github.com/apache/doris/pull/36687) + +### Query Optimizer + +- Fixed the potential partition pruning error issue when the 'OR' condition exists in partition filter conditions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the potential partition pruning error issue when complex expressions are involved. [#39298](https://github.com/apache/doris/pull/39298) + +- Fixed the issue where nullable in `agg_state` subtypes might be planned incorrectly, leading to execution errors. [#37489](https://github.com/apache/doris/pull/37489) + +- Fixed the issue where nullable in set operation operators might be planned incorrectly, leading to execution errors. [#39109](https://github.com/apache/doris/pull/39109) + +- Fixed the incorrect execution priority issue of intersect operator. [#39095](https://github.com/apache/doris/pull/39095) + +- Fixed the NPE issue that may occur when the maximum valid date literal exists in the query. [#39482](https://github.com/apache/doris/pull/39482) + +- Fixed the occasional planning error that results in an illegal slot error during execution. [#39640](https://github.com/apache/doris/pull/39640) + +- Fixed the issue where repeatedly referencing columns in cte may lead to missing data in some columns in the result. [#39850](https://github.com/apache/doris/pull/39850) + +- Fixed the occasional planning error issue when 'case when' exists in the query. 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where IP types cannot be implicitly converted to string types. [#39318](https://github.com/apache/doris/pull/39318) + +- Fixed the potential planning error issue when using multi-dimensional aggregation and the same column and its alias exist in the select list. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where boolean types might be handled incorrectly when using BE constant folding. [#39019](https://github.com/apache/doris/pull/39019) + +- Fixed the planning error issue caused by `default_cluster:` as a prefix for the database name in expressions. [#39114](https://github.com/apache/doris/pull/39114) + +- Fixed the potential deadlock issue caused by `insert into`. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the potential planning error issue caused by not holding table locks throughout the planning process. [#38950](https://github.com/apache/doris/pull/38950) + +- Fixed the issue where CHAR(0), VARCHAR(0) are not handled correctly when creating tables. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the issue where `show create table` may incorrectly display hidden columns. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the issue where columns with the same name as hidden columns are not prohibited when creating tables. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the occasional planning error issue when executing `insert into as select` with CTEs. [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where `insert into values` cannot automatically fill null default values. [#39122](https://github.com/apache/doris/pull/39122) + +- Fixed the NPE issue caused by declaring a CTE in a `DELETE` statement without referencing it. 
[#39379](https://github.com/apache/doris/pull/39379) + +- Fixed the issue where deleting from a randomly distributed aggregation model table fails. [#37985](https://github.com/apache/doris/pull/37985) + +### Query Execution + +- Fixed the issue where the pipeline execution engine gets stuck in multiple scenarios, causing queries not to end. [#38657](https://github.com/apache/doris/pull/38657) [#38206](https://github.com/apache/doris/pull/38206) [#38885](https://github.com/apache/doris/pull/38885) + +- Fixed the coredump issue caused by null and non-null columns in set difference calculations.[#38737](https://github.com/apache/doris/pull/38737) + +- Fixed the incorrect result issue of the `width_bucket` function. [#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error issue when a single row of data is large and the result set is also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the incorrect result issue of `stddev` with DecimalV2 type. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the issue where `insert overwrite auto partition` causes transaction rollback. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the incorrect result issue of the `convert_tz` function. [#37358](https://github.com/apache/doris/pull/37358) [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by the mod function with abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where executing the same expression in multiple threads may lead to incorrect Java UDF results. 
[#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by the incorrect return type of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the unstable result issue of the histogram function. [#38608](https://github.com/apache/doris/pull/38608) + +### Backup & Recovery / CCR + +- Fixed the issue where the data version after backup and recovery may be incorrect, leading to unreadability. [#38343](https://github.com/apache/doris/pull/38343) + +- Fixed the issue of using restore version across versions. [#38396](https://github.com/apache/doris/pull/38396) + +- Fixed the issue where the job is not canceled when backup fails. [#38993](https://github.com/apache/doris/pull/38993) + +- Fixed the NPE issue in CCR during the upgrade from 2.1.4 to 2.1.5, which caused the FE to fail to start. [#39910](https://github.com/apache/doris/pull/39910) + +- Fixed the issue where views and materialized views cannot be used after restoration. [#38072](https://github.com/apache/doris/pull/38072) [#39848](https://github.com/apache/doris/pull/39848) + +### Storage Management + +- Fixed possible memory leaks in routine load when loading multiple tables from a single stream. [#38824](https://github.com/apache/doris/pull/38824) + +- Fixed the issue where delimiters and escape characters in routine load were not effective. [#38825](https://github.com/apache/doris/pull/38825) + +- Fixed incorrect `SHOW ROUTINE LOAD` results when the routine load task name contained uppercase letters. [#38826](https://github.com/apache/doris/pull/38826) + +- Fixed the issue where the offset cache was not reset when changing the routine load topic. [#38474](https://github.com/apache/doris/pull/38474) + +- Fixed the potential exception triggered by `SHOW ROUTINE LOAD` under concurrent scenarios. [#39525](https://github.com/apache/doris/pull/39525) + +- Fixed the issue where routine load might import data repeatedly. 
[#39526](https://github.com/apache/doris/pull/39526) + +- Fixed the data error caused by `setNull` when enabling group commit via JDBC. [#38276](https://github.com/apache/doris/pull/38276) + +- Fixed the potential NPE issue when enabling group commit insert to a non-master FE. [#38345](https://github.com/apache/doris/pull/38345) + +- Fixed incorrect error handling during internal data writing in group commit. [#38997](https://github.com/apache/doris/pull/38997) + +- Fixed the coredump that might be triggered when the group commit execution plan failed. [#39396](https://github.com/apache/doris/pull/39396) + +- Fixed the issue where concurrent imports into auto partition tables might report non-existent tablets. [#38793](https://github.com/apache/doris/pull/38793) + +- Fixed potential load stream leakage issues. [#39039](https://github.com/apache/doris/pull/39039) + +- Fixed the issue where transactions were opened for `insert into select` with no data. [#39108](https://github.com/apache/doris/pull/39108) + +- Ignored the single-replica import configuration when using memtable prefetching. [#39154](https://github.com/apache/doris/pull/39154) + +- Fixed the issue where background imports of stream load records might be abnormally aborted upon encountering db deletion. [#39527](https://github.com/apache/doris/pull/39527) + +- Fixed inaccurate error messages when data errors occurred in strict mode. [#39587](https://github.com/apache/doris/pull/39587) + +- Fixed the issue where streamload did not return an error URL upon encountering erroneous data. [#38417](https://github.com/apache/doris/pull/38417) + +- Fixed the issue with the combined use of insert overwrite and auto partition. [#38442](https://github.com/apache/doris/pull/38442) + +- Fixed parsing errors when CSV encountered data where the line delimiter was enclosed by the enclosing character. 
[#38445](https://github.com/apache/doris/pull/38445) + +### Data Exporting + +- Fixed the issue where enabling the `delete_existing_files` property during export operations might result in duplicate deletion of exported data. [#39304](https://github.com/apache/doris/pull/39304) + +### Permissions + +- Fixed the incorrect requirement of ALTER TABLE permission when creating a materialized view. [#38011](https://github.com/apache/doris/pull/38011) + +- Fixed the issue where the db was explicitly displayed as empty when showing routine load. [#38365](https://github.com/apache/doris/pull/38365) + +- Fixed the incorrect requirement of CREATE permission on the original table when using CREATE TABLE LIKE. [#37879](https://github.com/apache/doris/pull/37879) + +- Fixed the issue where grant operations did not check if the object existed. [#39597](https://github.com/apache/doris/pull/39597) + +## Upgrade suggestions + +When upgrading Doris, please follow the principle of not skipping two minor versions and upgrade sequentially. + +For example, if you are upgrading from version 0.15.x to 2.0.x, it is recommended to first upgrade to the latest version of 1.1, then upgrade to the latest version of 1.2, and finally upgrade to the latest version of 2.0. + +For more upgrade information, see the documentation: [Cluster Upgrade](../../admin-manual/cluster-management/upgrade) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md new file mode 100644 index 0000000000000..414229276e6b0 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md @@ -0,0 +1,180 @@ +--- +{ + "title": "Release 2.1.7", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.7 was officially released on November 10, 2024.** This version brings continuous upgrades and improvements. 
Additionally, several fixes have been implemented in areas such as Lakehouse, Async Materialized Views, Semi-Structured Data Management, Query Optimizer, and Permission Management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The following global variables will be forcibly set to the following default values: + - enable_nereids_dml: true + - enable_nereids_dml_with_pipeline: true + - enable_nereids_planner: true + - enable_fallback_to_original_planner: true + - enable_pipeline_x_engine: true +- New columns have been added to the audit log. [#42262](https://github.com/apache/doris/pull/42262) + - For more information, please refer to [docs](https://doris.apache.org/docs/admin-manual/audit-plugin/) + +## New features + +### Async Materialized View + +- Added the `use_for_rewrite` property to asynchronous materialized views to control whether they participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Execution + +- The list of changed session variables is now output in the Profile. [#41016](https://github.com/apache/doris/pull/41016) +- Support for `trim_in`, `ltrim_in`, and `rtrim_in` functions has been added. [#42641](https://github.com/apache/doris/pull/42641) +- Support for several URL functions (top_level_domain, first_significant_subdomain, cut_to_first_significant_subdomain) has been added. [#42916](https://github.com/apache/doris/pull/42916) +- The `bit_set` function has been added. [#42099](https://github.com/apache/doris/pull/42099) +- The `count_substrings` function has been added. [#42055](https://github.com/apache/doris/pull/42055) +- The `translate` and `url_encode` functions have been added. 
[#41051](https://github.com/apache/doris/pull/41051) +- The `normal_cdf`, `to_iso8601`, and `from_iso8601_date` functions have been added. [#40695](https://github.com/apache/doris/pull/40695) + + +### Storage Management + +- The `information_schema.table_options` and `table_properties` system tables have been added, supporting the querying of attributes set during table creation. [#34384](https://github.com/apache/doris/pull/34384) +- Support for `bitmap_empty` as a default value has been implemented. [#40364](https://github.com/apache/doris/pull/40364) +- A new session variable `require_sequence_in_insert` has been introduced to control whether a sequence column must be provided when performing `INSERT INTO SELECT` writes to a unique key table. [#41655](https://github.com/apache/doris/pull/41655) + +### Others + +- Allow for generating flame graphs on the BE WebUI page.[#41044](https://github.com/apache/doris/pull/41044) + +## Improvements + +### Lakehouse + +- Support for writing data to Hive text format tables. [#40537](https://github.com/apache/doris/pull/40537) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build) +- Access MaxCompute data using MaxCompute Open Storage API. [#41610](https://github.com/apache/doris/pull/41610) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/database/max-compute) +- Support for Paimon DLF Catalog. 
[#41694](https://github.com/apache/doris/pull/41694) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/paimon) +- Added `table$partitions` syntax to directly query Hive partition information.[#41230](https://github.com/apache/doris/pull/41230) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive) +- Support for reading Parquet files in brotli compression format.[#42162](https://github.com/apache/doris/pull/42162) +- Support for reading DECIMAL 256 types in Parquet files. [#42241](https://github.com/apache/doris/pull/42241) +- Support for reading Hive tables in OpenCsvSerde format.[#42939](https://github.com/apache/doris/pull/42939) + +### Async Materialized View + +- Refined the granularity of lock holding during the build process for asynchronous materialized views. [#40402](https://github.com/apache/doris/pull/40402) [#41010](https://github.com/apache/doris/pull/41010). + +### Query optimizer + +- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. [#40457](https://github.com/apache/doris/pull/40457) +- Runtime filters can now be generated in more scenarios to improve query performance. [#40815](https://github.com/apache/doris/pull/40815) +- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. [#40820](https://github.com/apache/doris/pull/40820) +- Optimized the column pruning algorithm to enhance query performance. [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Supported parallel preparation to reduce the time consumed by short queries. [#40270](https://github.com/apache/doris/pull/40270) +- Corrected the names of some counters in the profile to match the audit logs. [#41993](https://github.com/apache/doris/pull/41993) +- Added new local shuffle rules to speed up certain queries. 
[#40637](https://github.com/apache/doris/pull/40637) + +### Storage Management + +- The `SHOW PARTITIONS` command now supports displaying the commit version. [#28274](https://github.com/apache/doris/pull/28274) +- Checked for unreasonable partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Optimized the scheduling logic when encountering EOF in Routine Load. [#40509](https://github.com/apache/doris/pull/40509) +- Made Routine Load aware of schema changes. [#40508](https://github.com/apache/doris/pull/40508) +- Improved the timeout logic for Routine Load tasks. [#41135](https://github.com/apache/doris/pull/41135) + +### Others + +- Allowed closing the built-in service port of BRPC via BE configuration. [#41047](https://github.com/apache/doris/pull/41047) +- Fixed issues with missing fields and duplicate records in audit logs. [#43015](https://github.com/apache/doris/pull/43015) + +## Bug fixes + +### Lakehouse + +- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. [#39840](https://github.com/apache/doris/pull/39840) +- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. [#40424](https://github.com/apache/doris/pull/40424) +- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. [#40923](https://github.com/apache/doris/pull/40923) +- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. [#41266](https://github.com/apache/doris/pull/41266) +- Fixed errors in reading Snappy compressed formats in certain scenarios. [#40862](https://github.com/apache/doris/pull/40862) +- Addressed potential FileSystem leaks on the FE side in certain scenarios. [#41108](https://github.com/apache/doris/pull/41108) +- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. 
[#41231](https://github.com/apache/doris/pull/41231) +- Fixed the inability to read tables in Paimon parquet format. [#41487](https://github.com/apache/doris/pull/41487) +- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. [#41407](https://github.com/apache/doris/pull/41407) +- Disabled predicate pushdown after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. [#42242](https://github.com/apache/doris/pull/42242) +- Fixed issues with case-sensitive access to table names in the External Catalog. [#42261](https://github.com/apache/doris/pull/42261) + +### Async Materialized View + +- Fixed the issue where user-specified start times were not effective. [#39573](https://github.com/apache/doris/pull/39573) +- Resolved the issue of nested materialized views not refreshing. [#40433](https://github.com/apache/doris/pull/40433) +- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. [#41762](https://github.com/apache/doris/pull/41762) +- Addressed issues where partition compensation rewrites could lead to incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed potential errors in rewrite results when `sql_select_limit` was set. [#40106](https://github.com/apache/doris/pull/40106) + +### Semi-Structured Data Management + +- Fixed the issue of index file handle leaks. [#41915](https://github.com/apache/doris/pull/41915) +- Addressed inaccuracies in the `count()` function of inverted indexes in special cases. [#41127](https://github.com/apache/doris/pull/41127) +- Fixed exceptions with variant when light schema change was not enabled. [#40908](https://github.com/apache/doris/pull/40908) +- Resolved memory leaks when variant returns arrays. 
[#41339](https://github.com/apache/doris/pull/41339) + +### Query optimizer + +- Corrected potential errors in nullable calculations for filter conditions during external table queries, which could lead to execution exceptions. [#41014](https://github.com/apache/doris/pull/41014) +- Fixed potential errors in optimizing range comparison expressions. [#41356](https://github.com/apache/doris/pull/41356) + +### Query Execution + +- Fixed the issue where the `match_regexp` function could not correctly handle empty strings. [#39503](https://github.com/apache/doris/pull/39503) +- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Fixed errors in the results of the `data_floor` function. [#41948](https://github.com/apache/doris/pull/41948) +- Addressed incorrect cancel messages in some scenarios. [#41798](https://github.com/apache/doris/pull/41798) +- Fixed issues with excessive warning logs printed by arrow flight. [#41770](https://github.com/apache/doris/pull/41770) +- Resolved issues where runtime filters failed to send in some scenarios. [#41698](https://github.com/apache/doris/pull/41698) +- Fixed problems where some system table queries could not end normally or became stuck. [#41592](https://github.com/apache/doris/pull/41592) +- Addressed incorrect results from window functions. [#40761](https://github.com/apache/doris/pull/40761) +- Fixed issues where the encrypt and decrypt functions caused BE core dumps. [#40726](https://github.com/apache/doris/pull/40726) +- Resolved errors in the results of the `conv` function. [#40530](https://github.com/apache/doris/pull/40530) + +### Storage Management + +- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. [#38003](https://github.com/apache/doris/pull/38003) +- Addressed inaccurate memory statistics during the Memtable flush phase during imports. 
[#39536](https://github.com/apache/doris/pull/39536) +- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. [#40477](https://github.com/apache/doris/pull/40477) +- Resolved inaccurate bvar statistics with Memtable migration. [#40985](https://github.com/apache/doris/pull/40985) +- Fixed inaccurate progress reporting for S3 loads. [#40987](https://github.com/apache/doris/pull/40987) + +### Permissions + +- Fixed permission issues related to show columns, show sync, and show data from db.table. [#39726](https://github.com/apache/doris/pull/39726) + +### Others + +- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. [#41400](https://github.com/apache/doris/pull/41400) diff --git a/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md index c15141832f1eb..b15777212b400 100644 --- a/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md +++ b/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md @@ -25,7 +25,7 @@ under the License. --> -Dear community members, the Apache Doris 3.0.2 version was officially released on December 02, 2024, this version further enhances the performance and stability of the system. +Dear community members, the Apache Doris 3.0.3 version was officially released on December 02, 2024, this version further enhances the performance and stability of the system. 
**Quick Download:** https://doris.apache.org/download/ diff --git a/versioned_sidebars/version-1.2-sidebars.json b/versioned_sidebars/version-1.2-sidebars.json index 294acbf4fed39..0726d49201142 100644 --- a/versioned_sidebars/version-1.2-sidebars.json +++ b/versioned_sidebars/version-1.2-sidebars.json @@ -35,7 +35,9 @@ { "type": "category", "label": "Doris Introduction", - "items": ["summary/basic-summary"] + "items": [ + "summary/basic-summary" + ] }, { "type": "category", @@ -151,17 +153,25 @@ { "type": "category", "label": "Alter Table", - "items": ["advanced/alter-table/schema-change", "advanced/alter-table/replace-table"] + "items": [ + "advanced/alter-table/schema-change", + "advanced/alter-table/replace-table" + ] }, { "type": "category", "label": "Doris Partition", - "items": ["advanced/partition/dynamic-partition", "advanced/partition/table-temp-partition"] + "items": [ + "advanced/partition/dynamic-partition", + "advanced/partition/table-temp-partition" + ] }, { "type": "category", "label": "Data Cache", - "items": ["advanced/cache/partition-cache"] + "items": [ + "advanced/cache/partition-cache" + ] }, "advanced/autobucket", "advanced/broker", @@ -204,7 +214,9 @@ { "type": "category", "label": "Slow Query Analysis", - "items": ["query-acceleration/slow-query-analysis/get-profile"] + "items": [ + "query-acceleration/slow-query-analysis/get-profile" + ] } ] }, @@ -1008,7 +1020,9 @@ { "type": "category", "label": "Operators", - "items": ["sql-manual/sql-reference/Operators/in"] + "items": [ + "sql-manual/sql-reference/Operators/in" + ] }, { "type": "category", @@ -1097,12 +1111,17 @@ { "type": "category", "label": "User Privilege and Ldap", - "items": ["admin-manual/privilege-ldap/user-privilege", "admin-manual/privilege-ldap/ldap"] + "items": [ + "admin-manual/privilege-ldap/user-privilege", + "admin-manual/privilege-ldap/ldap" + ] }, { "type": "category", "label": "System Table", - "items": ["admin-manual/system-table/rowsets"] + "items": [ + 
"admin-manual/system-table/rowsets" + ] }, "admin-manual/multi-tenant", { @@ -1190,7 +1209,11 @@ "type": "category", "label": "Benchmark", "collapsed": false, - "items": ["benchmark/ssb", "benchmark/tpch", "benchmark/tpcds"] + "items": [ + "benchmark/ssb", + "benchmark/tpch", + "benchmark/tpcds" + ] }, { "type": "category", @@ -1233,23 +1256,94 @@ "type": "category", "label": "FAQ", "collapsed": false, - "items": ["faq/install-faq", "faq/data-faq", "faq/sql-faq", "faq/lakehouse-faq", "faq/bi-faq"] + "items": [ + "faq/install-faq", + "faq/data-faq", + "faq/sql-faq", + "faq/lakehouse-faq", + "faq/bi-faq" + ] }, { "type": "category", "label": "Releases", "collapsed": false, "items": [ - "releasenotes/v1.2/release-1.2.8", - "releasenotes/v1.2/release-1.2.7", - "releasenotes/v1.2/release-1.2.6", - "releasenotes/v1.2/release-1.2.5", - "releasenotes/v1.2/release-1.2.4", - "releasenotes/v1.2/release-1.2.3", - "releasenotes/v1.2/release-1.2.2", - "releasenotes/v1.2/release-1.2.1", - "releasenotes/v1.2/release-1.2.0" + "releasenotes/all-release", + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + "releasenotes/v2.1/release-2.1.4", + "releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + "releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + 
"releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + "releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] -} +} \ No newline at end of file diff --git a/versioned_sidebars/version-2.0-sidebars.json b/versioned_sidebars/version-2.0-sidebars.json index 7a2feea7ca4ce..e284f52cd0f3f 100644 --- a/versioned_sidebars/version-2.0-sidebars.json +++ b/versioned_sidebars/version-2.0-sidebars.json @@ -77,7 +77,9 @@ { "type": "category", "label": "Database Connection", - "items": ["db-connect/database-connect"] + "items": [ + "db-connect/database-connect" + ] }, { "type": "category", @@ -200,12 +202,18 @@ { "type": "category", "label": "Quering Variables", - "items": ["query/query-variables/variables", "query/query-variables/sql-mode"] + "items": [ + "query/query-variables/variables", + "query/query-variables/sql-mode" + ] }, { "type": "category", "label": "Cost-Based Optimizer", - "items": ["query/nereids/nereids-new", "query/nereids/statistics"] + "items": [ + "query/nereids/nereids-new", + "query/nereids/statistics" + ] }, "query/pipeline-execution-engine", { @@ -239,7 +247,10 @@ { "type": "category", 
"label": "Distincting Counts", - "items": ["query/duplicate/orthogonal-bitmap-manual", "query/duplicate/using-hll"] + "items": [ + "query/duplicate/orthogonal-bitmap-manual", + "query/duplicate/using-hll" + ] }, "query/high-concurrent-point-query", "query/topn-query", @@ -255,7 +266,10 @@ { "type": "category", "label": "User Defined Functions", - "items": ["query/udf/java-user-defined-function", "query/udf/remote-user-defined-function"] + "items": [ + "query/udf/java-user-defined-function", + "query/udf/remote-user-defined-function" + ] } ] }, @@ -488,7 +502,11 @@ "type": "category", "label": "Benchmark", "collapsed": false, - "items": ["benchmark/ssb", "benchmark/tpch", "benchmark/tpcds"] + "items": [ + "benchmark/ssb", + "benchmark/tpch", + "benchmark/tpcds" + ] }, { "type": "category", @@ -515,7 +533,10 @@ { "type": "category", "label": "SQL Clients", - "items": ["ecosystem/bi/dbeaver", "ecosystem/bi/datagrip"] + "items": [ + "ecosystem/bi/dbeaver", + "ecosystem/bi/datagrip" + ] }, { "type": "category", @@ -540,7 +561,13 @@ "type": "category", "label": "FAQ", "collapsed": false, - "items": ["faq/install-faq", "faq/data-faq", "faq/sql-faq", "faq/lakehouse-faq", "faq/bi-faq"] + "items": [ + "faq/install-faq", + "faq/data-faq", + "faq/sql-faq", + "faq/lakehouse-faq", + "faq/bi-faq" + ] }, { "type": "category", @@ -609,7 +636,10 @@ { "type": "category", "label": "IP Data Type", - "items": ["sql-manual/sql-data-types/ip/IPV4", "sql-manual/sql-data-types/ip/IPV6"] + "items": [ + "sql-manual/sql-data-types/ip/IPV4", + "sql-manual/sql-data-types/ip/IPV6" + ] } ] }, @@ -1460,7 +1490,9 @@ { "type": "category", "label": "Operators", - "items": ["sql-manual/sql-reference/Operators/in"] + "items": [ + "sql-manual/sql-reference/Operators/in" + ] }, { "type": "category", @@ -1485,23 +1517,80 @@ "collapsed": false, "items": [ "releasenotes/all-release", - "releasenotes/v2.0/release-2.0.15", - "releasenotes/v2.0/release-2.0.14", - "releasenotes/v2.0/release-2.0.13", - 
"releasenotes/v2.0/release-2.0.12", - "releasenotes/v2.0/release-2.0.11", - "releasenotes/v2.0/release-2.0.10", - "releasenotes/v2.0/release-2.0.9", - "releasenotes/v2.0/release-2.0.8", - "releasenotes/v2.0/release-2.0.7", - "releasenotes/v2.0/release-2.0.6", - "releasenotes/v2.0/release-2.0.5", - "releasenotes/v2.0/release-2.0.4", - "releasenotes/v2.0/release-2.0.3", - "releasenotes/v2.0/release-2.0.2", - "releasenotes/v2.0/release-2.0.1", - "releasenotes/v2.0/release-2.0.0" + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + "releasenotes/v2.1/release-2.1.4", + "releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + "releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + "releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + 
"releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] -} +} \ No newline at end of file diff --git a/versioned_sidebars/version-2.1-sidebars.json b/versioned_sidebars/version-2.1-sidebars.json index 43cf80f9a6422..79e4add3f12db 100644 --- a/versioned_sidebars/version-2.1-sidebars.json +++ b/versioned_sidebars/version-2.1-sidebars.json @@ -1873,15 +1873,80 @@ "collapsed": false, "items": [ "releasenotes/all-release", - "releasenotes/v2.1/release-2.1.7", - "releasenotes/v2.1/release-2.1.6", - "releasenotes/v2.1/release-2.1.5", - "releasenotes/v2.1/release-2.1.4", - "releasenotes/v2.1/release-2.1.3", - "releasenotes/v2.1/release-2.1.2", - "releasenotes/v2.1/release-2.1.1", - "releasenotes/v2.1/release-2.1.0" + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + "releasenotes/v2.1/release-2.1.4", + "releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + 
"releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + "releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + "releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] -} +} \ No newline at end of file diff --git a/versioned_sidebars/version-3.0-sidebars.json b/versioned_sidebars/version-3.0-sidebars.json index 3b79dbdcf280c..2597bde724be2 100644 --- a/versioned_sidebars/version-3.0-sidebars.json +++ b/versioned_sidebars/version-3.0-sidebars.json @@ -1930,11 +1930,80 @@ "collapsed": false, "items": [ "releasenotes/all-release", - "releasenotes/v3.0/release-3.0.3", - "releasenotes/v3.0/release-3.0.2", - "releasenotes/v3.0/release-3.0.1", - "releasenotes/v3.0/release-3.0.0" + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + 
"releasenotes/v2.1/release-2.1.4", + "releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + "releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + "releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + "releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] -} +} \ No newline at end of file From 83837f7399f971121c74707012be13d63bcf8059 Mon Sep 17 00:00:00 2001 From: kassiez Date: Fri, 20 Dec 2024 16:46:50 +0800 Subject: [PATCH 2/5] fix-deadlink-1220 --- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../building-lakehouse/doris-iceberg.md | 164 ----- .../gettingStarted/what-is-apache-doris.md | 
2 +- .../async-materialized-view/faq.md | 2 +- .../functions-and-demands.md | 2 +- .../async-materialized-view/overview.md | 2 +- .../async-materialized-view/use-guide.md | 20 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 8 +- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 12 +- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- 
.../releasenotes/v3.0/release-3.0.3.md | 12 +- .../gettingStarted/demo-block/latest.tsx | 2 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../async-materialized-view/faq.md | 2 +- .../functions-and-demands.md | 2 +- .../async-materialized-view/overview.md | 2 +- .../async-materialized-view/use-guide.md | 16 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 3 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 4 +- .../releasenotes/v2.1/release-2.1.6.md | 13 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 12 +- .../gettingStarted/demo-block/demo-block.css | 44 +- .../gettingStarted/demo-block/latest.tsx | 12 +- .../gettingStarted/demo-block/page-hero-1.tsx | 4 +- .../async-materialized-view/faq.md | 2 +- .../functions-and-demands.md | 2 +- .../async-materialized-view/overview.md | 2 +- .../async-materialized-view/use-guide.md | 20 +- .../version-3.0/releasenotes/all-release.md | 6 +- .../releasenotes/v2.1/release-2.1.0.md | 26 +- .../releasenotes/v2.1/release-2.1.2.md | 2 +- .../releasenotes/v2.1/release-2.1.3.md | 2 +- .../releasenotes/v2.1/release-2.1.4.md | 20 +- .../releasenotes/v2.1/release-2.1.5.md | 6 +- .../releasenotes/v2.1/release-2.1.6.md | 12 +- .../releasenotes/v2.1/release-2.1.7.md | 16 +- .../releasenotes/v3.0/release-3.0.0.md | 24 +- .../releasenotes/v3.0/release-3.0.1.md | 4 +- .../releasenotes/v3.0/release-3.0.2.md | 2 +- .../releasenotes/v3.0/release-3.0.3.md | 12 +- sidebars.json | 9 +- .../version-1.2/releasenotes/all-release.md | 88 +++ .../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 
+++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v2.0/release-2.0.0.md | 236 +++++++ .../releasenotes/v2.0/release-2.0.1.md | 224 +++++++ .../releasenotes/v2.0/release-2.0.10.md | 59 ++ .../releasenotes/v2.0/release-2.0.11.md | 60 ++ .../releasenotes/v2.0/release-2.0.12.md | 58 ++ .../releasenotes/v2.0/release-2.0.13.md | 61 ++ .../releasenotes/v2.0/release-2.0.14.md | 59 ++ .../releasenotes/v2.0/release-2.0.15.md | 91 +++ .../releasenotes/v2.0/release-2.0.2.md | 157 +++++ .../releasenotes/v2.0/release-2.0.3.md | 253 ++++++++ .../releasenotes/v2.0/release-2.0.4.md | 67 ++ .../releasenotes/v2.0/release-2.0.5.md | 73 +++ .../releasenotes/v2.0/release-2.0.6.md | 59 ++ .../releasenotes/v2.0/release-2.0.7.md | 84 +++ .../releasenotes/v2.0/release-2.0.8.md | 76 +++ .../releasenotes/v2.0/release-2.0.9.md | 75 +++ .../releasenotes/v2.1/release-2.1.0.md | 159 +++++ .../releasenotes/v2.1/release-2.1.1.md | 251 ++++++++ .../releasenotes/v2.1/release-2.1.2.md | 110 ++++ .../releasenotes/v2.1/release-2.1.3.md | 191 ++++++ .../releasenotes/v2.1/release-2.1.4.md | 289 +++++++++ .../releasenotes/v2.1/release-2.1.5.md | 395 ++++++++++++ .../releasenotes/v2.1/release-2.1.6.md | 524 +++++++++++++++ .../releasenotes/v2.1/release-2.1.7.md | 180 ++++++ .../releasenotes/v3.0/release-3.0.0.md | 469 ++++++++++++++ .../releasenotes/v3.0/release-3.0.1.md | 604 ++++++++++++++++++ .../releasenotes/v3.0/release-3.0.2.md | 341 ++++++++++ .../releasenotes/v3.0/release-3.0.3.md | 226 +++++++ .../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 +++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v1.2/release-1.2.0.md | 563 ++++++++++++++++ .../releasenotes/v1.2/release-1.2.1.md | 196 ++++++ .../releasenotes/v1.2/release-1.2.2.md 
| 254 ++++++++ .../releasenotes/v1.2/release-1.2.3.md | 109 ++++ .../releasenotes/v1.2/release-1.2.4.md | 81 +++ .../releasenotes/v1.2/release-1.2.5.md | 199 ++++++ .../releasenotes/v1.2/release-1.2.6.md | 135 ++++ .../releasenotes/v1.2/release-1.2.7.md | 46 ++ .../releasenotes/v1.2/release-1.2.8.md | 47 ++ .../releasenotes/v2.1/release-2.1.0.md | 159 +++++ .../releasenotes/v2.1/release-2.1.1.md | 251 ++++++++ .../releasenotes/v2.1/release-2.1.2.md | 110 ++++ .../releasenotes/v2.1/release-2.1.3.md | 191 ++++++ .../releasenotes/v2.1/release-2.1.4.md | 289 +++++++++ .../releasenotes/v2.1/release-2.1.5.md | 395 ++++++++++++ .../releasenotes/v2.1/release-2.1.6.md | 524 +++++++++++++++ .../releasenotes/v2.1/release-2.1.7.md | 180 ++++++ .../releasenotes/v3.0/release-3.0.0.md | 469 ++++++++++++++ .../releasenotes/v3.0/release-3.0.1.md | 604 ++++++++++++++++++ .../releasenotes/v3.0/release-3.0.2.md | 341 ++++++++++ .../releasenotes/v3.0/release-3.0.3.md | 226 +++++++ .../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 +++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v1.2/release-1.2.0.md | 563 ++++++++++++++++ .../releasenotes/v1.2/release-1.2.1.md | 196 ++++++ .../releasenotes/v1.2/release-1.2.2.md | 254 ++++++++ .../releasenotes/v1.2/release-1.2.3.md | 109 ++++ .../releasenotes/v1.2/release-1.2.4.md | 81 +++ .../releasenotes/v1.2/release-1.2.5.md | 199 ++++++ .../releasenotes/v1.2/release-1.2.6.md | 135 ++++ .../releasenotes/v1.2/release-1.2.7.md | 46 ++ .../releasenotes/v1.2/release-1.2.8.md | 47 ++ .../releasenotes/v2.0/release-2.0.0.md | 236 +++++++ .../releasenotes/v2.0/release-2.0.1.md | 224 +++++++ .../releasenotes/v2.0/release-2.0.10.md | 59 ++ .../releasenotes/v2.0/release-2.0.11.md | 60 ++ .../releasenotes/v2.0/release-2.0.12.md | 58 ++ 
.../releasenotes/v2.0/release-2.0.13.md | 61 ++ .../releasenotes/v2.0/release-2.0.14.md | 59 ++ .../releasenotes/v2.0/release-2.0.15.md | 91 +++ .../releasenotes/v2.0/release-2.0.2.md | 157 +++++ .../releasenotes/v2.0/release-2.0.3.md | 253 ++++++++ .../releasenotes/v2.0/release-2.0.4.md | 67 ++ .../releasenotes/v2.0/release-2.0.5.md | 73 +++ .../releasenotes/v2.0/release-2.0.6.md | 59 ++ .../releasenotes/v2.0/release-2.0.7.md | 84 +++ .../releasenotes/v2.0/release-2.0.8.md | 76 +++ .../releasenotes/v2.0/release-2.0.9.md | 75 +++ .../releasenotes/v3.0/release-3.0.0.md | 469 ++++++++++++++ .../releasenotes/v3.0/release-3.0.1.md | 604 ++++++++++++++++++ .../releasenotes/v3.0/release-3.0.2.md | 341 ++++++++++ .../releasenotes/v3.0/release-3.0.3.md | 226 +++++++ .../releasenotes/v1.1/release-1.1.0.md | 379 +++++++++++ .../releasenotes/v1.1/release-1.1.1.md | 78 +++ .../releasenotes/v1.1/release-1.1.2.md | 84 +++ .../releasenotes/v1.1/release-1.1.3.md | 92 +++ .../releasenotes/v1.1/release-1.1.4.md | 72 +++ .../releasenotes/v1.1/release-1.1.5.md | 65 ++ .../releasenotes/v1.2/release-1.2.0.md | 563 ++++++++++++++++ .../releasenotes/v1.2/release-1.2.1.md | 196 ++++++ .../releasenotes/v1.2/release-1.2.2.md | 254 ++++++++ .../releasenotes/v1.2/release-1.2.3.md | 109 ++++ .../releasenotes/v1.2/release-1.2.4.md | 81 +++ .../releasenotes/v1.2/release-1.2.5.md | 199 ++++++ .../releasenotes/v1.2/release-1.2.6.md | 135 ++++ .../releasenotes/v1.2/release-1.2.7.md | 46 ++ .../releasenotes/v1.2/release-1.2.8.md | 47 ++ .../releasenotes/v2.0/release-2.0.0.md | 236 +++++++ .../releasenotes/v2.0/release-2.0.1.md | 224 +++++++ .../releasenotes/v2.0/release-2.0.10.md | 59 ++ .../releasenotes/v2.0/release-2.0.11.md | 60 ++ .../releasenotes/v2.0/release-2.0.12.md | 58 ++ .../releasenotes/v2.0/release-2.0.13.md | 61 ++ .../releasenotes/v2.0/release-2.0.14.md | 59 ++ .../releasenotes/v2.0/release-2.0.15.md | 91 +++ .../releasenotes/v2.0/release-2.0.2.md | 157 +++++ 
.../releasenotes/v2.0/release-2.0.3.md | 253 ++++++++ .../releasenotes/v2.0/release-2.0.4.md | 67 ++ .../releasenotes/v2.0/release-2.0.5.md | 73 +++ .../releasenotes/v2.0/release-2.0.6.md | 59 ++ .../releasenotes/v2.0/release-2.0.7.md | 84 +++ .../releasenotes/v2.0/release-2.0.8.md | 76 +++ .../releasenotes/v2.0/release-2.0.9.md | 75 +++ .../releasenotes/v2.1/release-2.1.0.md | 159 +++++ .../releasenotes/v2.1/release-2.1.1.md | 251 ++++++++ .../releasenotes/v2.1/release-2.1.2.md | 110 ++++ .../releasenotes/v2.1/release-2.1.3.md | 191 ++++++ .../releasenotes/v2.1/release-2.1.4.md | 289 +++++++++ .../releasenotes/v2.1/release-2.1.5.md | 395 ++++++++++++ .../releasenotes/v2.1/release-2.1.6.md | 524 +++++++++++++++ .../releasenotes/v2.1/release-2.1.7.md | 180 ++++++ .../releasenotes/v3.0/release-3.0.3.md | 2 +- versioned_sidebars/version-1.2-sidebars.json | 134 +++- versioned_sidebars/version-2.0-sidebars.json | 143 ++++- versioned_sidebars/version-2.1-sidebars.json | 81 ++- versioned_sidebars/version-3.0-sidebars.json | 77 ++- 226 files changed, 25236 insertions(+), 673 deletions(-) create mode 100644 versioned_docs/version-1.2/releasenotes/all-release.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md create mode 
100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md create mode 100644 versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md create mode 100644 versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md create mode 100644 
versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md create mode 100644 versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md create mode 100644 versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md create mode 100644 versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md create mode 100644 
versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md create mode 100644 versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md create mode 100644 versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md create mode 100644 versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md create mode 100644 
versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md create mode 100644 versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md create mode 100644 versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md create mode 100644 
versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md create mode 100644 versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md create mode 100644 
versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md create mode 100644 versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: 
#fafafa; */ border: 0.3px solid #dcdcdc; diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() {
*/} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md index 3653581432726..3cc43ab17e47e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/tutorials/building-lakehouse/doris-iceberg.md @@ -304,167 +304,3 @@ mysql> SELECT * FROM iceberg.nyc.taxis FOR TIME AS OF "2024-07-29 03:40:22"; +-----------+---------+---------------+-------------+--------------------+----------------------------+ 4 rows in set (0.05 sec) ``` - -### 07 与 PyIceberg 交互 - -加载 Iceberg 表: - -```python -from pyiceberg.catalog import load_catalog - -catalog = load_catalog( - "iceberg", - **{ - "warehouse" = "warehouse", - "uri" = "http://rest:8181", - "s3.access-key-id" = "admin", - "s3.secret-access-key" = "password", - "s3.endpoint" = "http://minio:9000" - }, -) -table = catalog.load_table("nyc.taxis") -``` - -读取为 Arrow Table: - -```python -print(table.scan().to_arrow()) - -pyarrow.Table -vendor_id: int64 -trip_id: int64 -trip_distance: float -fare_amount: double -store_and_fwd_flag: large_string -ts: timestamp[us] ----- -vendor_id: [[1],[1],[2],[2]] -trip_id: [[1000371],[1000374],[1000373],[1000372]] -trip_distance: [[1.8],[8.4],[0.9],[2.5]] -fare_amount: [[15.32],[42.13],[9.01],[22.15]] -store_and_fwd_flag: [["N"],["Y"],["N"],["N"]] -ts: [[2024-01-01 09:15:23.000000],[2024-01-03 07:12:33.000000],[2024-01-01 03:25:15.000000],[2024-01-02 12:10:11.000000]] -``` - -读取为 Pandas DataFrame: - -```python -print(table.scan().to_pandas()) - -vendor_id trip_id trip_distance fare_amount store_and_fwd_flag ts -0 1 1000371 1.8 15.32 N 2024-01-01 09:15:23 -1 1 1000374 8.4 42.13 Y 2024-01-03 07:12:33 -2 2 1000373 0.9 9.01 N 2024-01-01 03:25:15 -3 2 1000372 2.5 22.15 N 2024-01-02 12:10:11 -``` 
- -读取为 Polars DataFrame: - -```python -import polars as pl - -print(pl.scan_iceberg(table).collect()) - -shape: (4, 6) -┌───────────┬─────────┬───────────────┬─────────────┬────────────────────┬─────────────────────┐ -│ vendor_id ┆ trip_id ┆ trip_distance ┆ fare_amount ┆ store_and_fwd_flag ┆ ts │ -│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │ -│ i64 ┆ i64 ┆ f32 ┆ f64 ┆ str ┆ datetime[μs] │ -╞═══════════╪═════════╪═══════════════╪═════════════╪════════════════════╪═════════════════════╡ -│ 1 ┆ 1000371 ┆ 1.8 ┆ 15.32 ┆ N ┆ 2024-01-01 09:15:23 │ -│ 1 ┆ 1000374 ┆ 8.4 ┆ 42.13 ┆ Y ┆ 2024-01-03 07:12:33 │ -│ 2 ┆ 1000373 ┆ 0.9 ┆ 9.01 ┆ N ┆ 2024-01-01 03:25:15 │ -│ 2 ┆ 1000372 ┆ 2.5 ┆ 22.15 ┆ N ┆ 2024-01-02 12:10:11 │ -└───────────┴─────────┴───────────────┴─────────────┴────────────────────┴─────────────────────┘ -``` - -> 通过 pyiceberg 写入 iceberg 数据,请参阅[步骤](#通过-pyiceberg-写入数据) - -### 08 附录 - -#### 通过 PyIceberg 写入数据 - -加载 Iceberg 表: - -```python -from pyiceberg.catalog import load_catalog - -catalog = load_catalog( - "iceberg", - **{ - "warehouse" = "warehouse", - "uri" = "http://rest:8181", - "s3.access-key-id" = "admin", - "s3.secret-access-key" = "password", - "s3.endpoint" = "http://minio:9000" - }, -) -table = catalog.load_table("nyc.taxis") -``` - -Arrow Table 写入 Iceberg: - -```python -import pyarrow as pa - -df = pa.Table.from_pydict( - { - "vendor_id": pa.array([1, 2, 2, 1], pa.int64()), - "trip_id": pa.array([1000371, 1000372, 1000373, 1000374], pa.int64()), - "trip_distance": pa.array([1.8, 2.5, 0.9, 8.4], pa.float32()), - "fare_amount": pa.array([15.32, 22.15, 9.01, 42.13], pa.float64()), - "store_and_fwd_flag": pa.array(["N", "N", "N", "Y"], pa.string()), - "ts": pa.compute.strptime( - ["2024-01-01 9:15:23", "2024-01-02 12:10:11", "2024-01-01 3:25:15", "2024-01-03 7:12:33"], - "%Y-%m-%d %H:%M:%S", - "us", - ), - } -) -table.append(df) -``` - -Pandas DataFrame 写入 Iceberg: - -```python -import pyarrow as pa -import pandas as pd - -df = pd.DataFrame( - { - "vendor_id": 
pd.Series([1, 2, 2, 1]).astype("int64[pyarrow]"), - "trip_id": pd.Series([1000371, 1000372, 1000373, 1000374]).astype("int64[pyarrow]"), - "trip_distance": pd.Series([1.8, 2.5, 0.9, 8.4]).astype("float32[pyarrow]"), - "fare_amount": pd.Series([15.32, 22.15, 9.01, 42.13]).astype("float64[pyarrow]"), - "store_and_fwd_flag": pd.Series(["N", "N", "N", "Y"]).astype("string[pyarrow]"), - "ts": pd.Series(["2024-01-01 9:15:23", "2024-01-02 12:10:11", "2024-01-01 3:25:15", "2024-01-03 7:12:33"]).astype("timestamp[us][pyarrow]"), - } -) -table.append(pa.Table.from_pandas(df)) -``` - -Polars DataFrame 写入 Iceberg: - -```python -import polars as pl - -df = pl.DataFrame( - { - "vendor_id": [1, 2, 2, 1], - "trip_id": [1000371, 1000372, 1000373, 1000374], - "trip_distance": [1.8, 2.5, 0.9, 8.4], - "fare_amount": [15.32, 22.15, 9.01, 42.13], - "store_and_fwd_flag": ["N", "N", "N", "Y"], - "ts": ["2024-01-01 9:15:23", "2024-01-02 12:10:11", "2024-01-01 3:25:15", "2024-01-03 7:12:33"], - }, - { - "vendor_id": pl.Int64, - "trip_id": pl.Int64, - "trip_distance": pl.Float32, - "fare_amount": pl.Float64, - "store_and_fwd_flag": pl.String, - "ts": pl.String, - }, -).with_columns(pl.col("ts").str.strptime(pl.Datetime, "%Y-%m-%d %H:%M:%S")) -table.append(df.to_arrow()) -``` diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md index 468d60e1b104c..809ad7b7b6e26 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/gettingStarted/what-is-apache-doris.md @@ -104,6 +104,6 @@ Apache Doris 也支持**强一致的物化视图**,物化视图的更新和选 ![Doris 查询引擎是向量化](/images/apache-doris-query-engine-2.png) -**Apache Doris 采用了自适应查询执行(Adaptive Query Execution)技术,** 可以根据 Runtime Statistics 来动态调整执行计划,比如通过 Runtime Filter 技术能够在运行时生成 Filter 推到 Probe 侧,并且能够将 Filter 自动穿透到 Probe 侧最底层的 Scan 节点,从而大幅减少 
Probe 的数据量,加速 Join 性能。Apache Doris 的 Runtime Filter 支持 In/Min/Max/Bloom Filter。 +**Apache Doris 采用了自适应查询执行(Adaptive Query Execution)技术,**可以根据 Runtime Statistics 来动态调整执行计划,比如通过 Runtime Filter 技术能够在运行时生成 Filter 推到 Probe 侧,并且能够将 Filter 自动穿透到 Probe 侧最底层的 Scan 节点,从而大幅减少 Probe 的数据量,加速 Join 性能。Apache Doris 的 Runtime Filter 支持 In/Min/Max/Bloom Filter。 在**优化器**方面,Apache Doris 使用 CBO 和 RBO 结合的优化策略,RBO 支持常量折叠、子查询改写、谓词下推等,CBO 支持 Join Reorder。目前 CBO 还在持续优化中,主要集中在更加精准的统计信息收集和推导,更加精准的代价模型预估等方面。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md index 69a7f275c0ea0..72068e39c7cec 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/faq.md @@ -1,6 +1,6 @@ --- { - "title": "常见问题", + "title": "异步物化视图常见问题", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md index b0d4b9cabecfb..b470ff4a83deb 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md @@ -1,6 +1,6 @@ --- { - "title": "功能描述", + "title": "异步物化视图功能描述", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md index 830ad751e2bd0..a140b4a871859 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/overview.md @@ -1,6 +1,6 @@ --- { - "title": "原理介绍", + "title": "异步物化视图原理介绍", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md index 382b9ad525f9a..0382278609121 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/query-acceleration/materialized-view/async-materialized-view/use-guide.md @@ -1,6 +1,6 @@ --- { - "title": "使用与实践", + "title": "异步物化视图使用与实践", "language": "zh-CN" } --- @@ -26,9 +26,9 @@ under the License. ## 异步物化视图使用原则 -1. **时效性考虑:**异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 +1. **时效性考虑:** 异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 -2. **加速效果与一致性考虑:**在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 +2. **加速效果与一致性考虑:** 在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 3. **物化视图定义与构建成本考虑:** @@ -38,11 +38,11 @@ under the License. 需要注意: -1. **物化视图数量控制:**物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 +1. **物化视图数量控制:** 物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 -2. 
**定期检查物化视图使用状态:**如果未使用,应及时删除。 +2. **定期检查物化视图使用状态:** 如果未使用,应及时删除。 -3. **基表数据更新频率:**如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 +3. **基表数据更新频率:** 如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 ## 物化视图刷新方式选择原则 @@ -184,9 +184,9 @@ GROUP BY 通常物化视图会出现两种状态: -- **状态正常:**指的是当前物化视图是否可用于透明改写。 +- **状态正常:** 指的是当前物化视图是否可用于透明改写。 -- **不可用、状态不正常:**指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 +- **不可用、状态不正常:** 指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 ### 查看物化视图元数据 @@ -222,9 +222,9 @@ SyncWithBaseTables: 1 - 对于分区增量的物化视图,分区物化视图是否可用,是以分区粒度去看的。也就是说,即使物化视图的部分分区不可用,但只要查询的是有效分区,那么此物化视图依旧可用于透明改写。是否能透明改写,主要看查询所用分区的 `SyncWithBaseTables` 字段是否一致。如果 `SyncWithBaseTables` 是 1,此分区可用于透明改写;如果是 0,则不能用于透明改写。 -- **JobName:**物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 +- **JobName:** 物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 -- **State:**如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 +- **State:** 如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 - **SchemaChangeDetail:** 表示 SCHEMA_CHANGE 发生的原因。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. 
![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 @@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: 
https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note -参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 
内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 
支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. - **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. 
- **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. - 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. 
- 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. - 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. 
- 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. ### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 
文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..40919bb5e2054 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. :::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. 
:::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. 
- SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../../admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. - 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md index 99a49f6207103..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: #fafafa; */ border: 0.3px solid #dcdcdc; diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() {
*/} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 
@@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note 
-参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. 
Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. 
- **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. - **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. - 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. 
- 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. - 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. 
- 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. ### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 
文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. :::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. 
:::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. 
- SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../..//admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. - 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: #fafafa; */ border: 0.3px solid #dcdcdc; diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() {
*/} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 
@@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note 
-参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. 
Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. 
- **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. - **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. - 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. 
- 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. - 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. 
- 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. ### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 
文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. :::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. 
:::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. 
- SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../..//admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. - 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx index acaf64e6c44b3..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/latest.tsx @@ -38,7 +38,7 @@ export default function Latest() {
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
回放生成中
- +
版本发布
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md index c850659d3b047..905111aa22c7f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/faq.md @@ -1,6 +1,6 @@ --- { - "title": "常见问题", + "title": "异步物化视图常见问题", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md index 9c5b27d37c38d..aa80b6df129d2 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md @@ -1,6 +1,6 @@ --- { - "title": "功能描述", + "title": "异步物化视图功能描述", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md index 830ad751e2bd0..a140b4a871859 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/overview.md @@ -1,6 +1,6 @@ --- { - "title": "原理介绍", + "title": 
"异步物化视图原理介绍", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md index c9f6e28fe620e..49caa7caf98ed 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/query-acceleration/materialized-view/async-materialized-view/use-guide.md @@ -1,6 +1,6 @@ --- { - "title": "使用与实践", + "title": "异步物化视图使用与实践", "language": "zh-CN" } --- @@ -26,9 +26,9 @@ under the License. ## 异步物化视图使用原则 -1. **时效性考虑:**异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 +1. **时效性考虑:** 异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 -2. **加速效果与一致性考虑:**在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 +2. **加速效果与一致性考虑:** 在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 3. **物化视图定义与构建成本考虑:** @@ -38,11 +38,11 @@ under the License. 需要注意: -1. **物化视图数量控制:**物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 +1. **物化视图数量控制:** 物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 -2. **定期检查物化视图使用状态:**如果未使用,应及时删除。 +2. **定期检查物化视图使用状态:** 如果未使用,应及时删除。 -3. **基表数据更新频率:**如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 +3. 
**基表数据更新频率:** 如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 ## 物化视图刷新方式选择原则 @@ -222,9 +222,9 @@ SyncWithBaseTables: 1 - 对于分区增量的物化视图,分区物化视图是否可用,是以分区粒度去看的。也就是说,即使物化视图的部分分区不可用,但只要查询的是有效分区,那么此物化视图依旧可用于透明改写。是否能透明改写,主要看查询所用分区的 `SyncWithBaseTables` 字段是否一致。如果 `SyncWithBaseTables` 是 1,此分区可用于透明改写;如果是 0,则不能用于透明改写。 -- **JobName:**物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 +- **JobName:** 物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 -- **State:**如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 +- **State:** 如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 - **SchemaChangeDetail:** 表示 SCHEMA_CHANGE 发生的原因。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md index 434677f520819..6eee28debcab2 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 
该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 @@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 
A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note -参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/job/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..2eebec69e5e3b 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,8 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) + **2. 
支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. - **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. 
- **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md index b463d42968326..c41df17fce4ba 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.5.md @@ -131,7 +131,7 @@ under the License. 
- 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..738e6b4b0d326 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.6.md @@ -24,6 +24,7 @@ specific language governing permissions and limitations under the License. --> + 亲爱的社区小伙伴们,**Apache Doris 2.1.6 版本已于 2024 年 9 月 10 日正式发布。**2.1.6 版本在湖仓一体、异步物化视图、半结构化数据管理持续升级改进,同时在查询优化器、执行引擎、存储管理、数据导入与导出以及权限管理等方面完成了若干修复。欢迎大家下载使用。 - 官网下载:https://doris.apache.org/download @@ -56,15 +57,15 @@ under the License. 
- 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +109,10 @@ under the License. - 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +120,7 @@ under the License. 
- 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. - 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. 
### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. 
:::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. :::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 
性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. - SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../../admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. 
- 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css index 934e88ba28aaf..1257919249c60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/demo-block.css @@ -105,15 +105,6 @@ a:active { padding-right: 2rem } -.home-page-hero-right { - flex: 1; - flex-direction: row; - justify-content: center; - width: fit-content -} - - - .home-page-option-button { display: flex; margin-bottom: 0.5rem; @@ -209,11 +200,6 @@ a:active { justify-content: center; } -.home-page-hero-right { - align-items: center; - display: flex; - flex-direction: row; -} .home-page-hero-button { /* background-color: #fafafa; */ @@ -279,8 +265,18 @@ a:active { margin-top: 15px } +.home-page-hero-right a { + color: #4c576c +} - +.home-page-hero-right a:hover, +a:active { + /* color: #444fd9; */ + text-decoration: none; + transition-duration: .3s; + transition-timing-function: cubic-bezier(0, 0, .2, 1); + background-color: #fafafa +} .section-border { @@ -355,6 +351,24 @@ a:active { } +@media (max-width: 996px) { + .latest-button { + flex: 1 1 100%; + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + min-height: 170px; + height: auto !important; + } + + .home-page-hero-right { + flex-wrap: wrap !important + } + .latest-button-CN{ + margin-right: 0 !important; + max-width: calc(100vw - 2rem); + } +} + .latest-button-CN { /* background-color: #fafafa; */ border: 0.3px solid #dcdcdc; diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx index 3e1eb5090e0fb..7c92f75c3c137 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/latest.tsx @@ -33,12 +33,12 @@ export default function Latest() {
*/} -
Doris Summit Asia 2024|12 月 14 日 深圳
+
Doris Summit Asia 2024 圆满落幕
-
一年一度的 Apache Doris 峰会再次启航,Doris Summit Asia 2024 现已开启报名,将于 12 月 14 日在深圳正式举办。
-
立即报名
+
2024 年 12 月 14 日,由飞轮科技主办,腾讯云和阿里云联合主办的 Doris Summit Asia 2024 在深圳圆满落幕。演讲回放及资料会在 10 个工作日内逐步释出,可通过 Doris Summit 官网获取。
+
回放生成中
- +
版本发布
{/*
@@ -47,9 +47,9 @@ export default function Latest() {
*/} -
Apache Doris 3.0.2 正式发布
+
Apache Doris 3.0.3 正式发布
-
3.0.2 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
+
3.0.3 版本在存算分离、存储、湖仓一体、查询优化器以及执行引擎持续升级改进,欢迎大家下载使用。
查看详情
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx index 4b9826c5d4e23..6666f3f97ac60 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/gettingStarted/demo-block/page-hero-1.tsx @@ -35,9 +35,9 @@ export default function PageHero() {
如何基于 Apache Doris 构建开放、高性能低成本、统一的日志存储分析平台。
- +
-
资源管理
+
负载管理
{/*
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md index c850659d3b047..905111aa22c7f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/faq.md @@ -1,6 +1,6 @@ --- { - "title": "常见问题", + "title": "异步物化视图常见问题", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md index 0f7f67f0cdf3c..68da377d09ab3 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md @@ -1,6 +1,6 @@ --- { - "title": "功能描述", + "title": "异步物化视图功能描述", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md index 830ad751e2bd0..a140b4a871859 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/overview.md @@ -1,6 +1,6 @@ --- { - "title": "原理介绍", + "title": 
"异步物化视图原理介绍", "language": "zh-CN" } --- diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md index c9f6e28fe620e..c40439a15c37a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/query-acceleration/materialized-view/async-materialized-view/use-guide.md @@ -1,6 +1,6 @@ --- { - "title": "使用与实践", + "title": "异步物化视图使用与实践", "language": "zh-CN" } --- @@ -26,9 +26,9 @@ under the License. ## 异步物化视图使用原则 -1. **时效性考虑:**异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 +1. **时效性考虑:** 异步物化视图通常用于对数据时效性要求不高的场景,一般是 T+1 的数据。如果时效性要求高,应考虑使用同步物化视图。 -2. **加速效果与一致性考虑:**在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 +2. **加速效果与一致性考虑:** 在查询加速场景,创建物化视图时,DBA 应将常见查询 SQL 模式分组,尽量使组之间无重合。SQL 模式组划分越清晰,物化视图构建的质量越高。一个查询可能使用多个物化视图,同时一个物化视图也可能被多个查询使用。构建物化视图需要综合考虑命中物化视图的响应时间(加速效果)、构建成本、数据一致性要求等。 3. **物化视图定义与构建成本考虑:** @@ -38,11 +38,11 @@ under the License. 需要注意: -1. **物化视图数量控制:**物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 +1. **物化视图数量控制:** 物化视图并非越多越好。物化视图参与透明改写,且 CBO 代价模型选择需要时间。理论上,物化视图越多,透明改写的时间越长,且物化视图构建和刷新占用的资源越大。 -2. **定期检查物化视图使用状态:**如果未使用,应及时删除。 +2. **定期检查物化视图使用状态:** 如果未使用,应及时删除。 -3. **基表数据更新频率:**如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 +3. 
**基表数据更新频率:** 如果物化视图的基表数据频繁更新,可能不太适合使用物化视图,因为这会导致物化视图频繁失效,不能用于透明改写(可直查)。如果需要使用此类物化视图进行透明改写,需要允许查询的数据有一定的时效延迟,并可以设定`grace_period`。具体见`grace_period`的适用介绍。 ## 物化视图刷新方式选择原则 @@ -184,9 +184,9 @@ GROUP BY 通常物化视图会出现两种状态: -- **状态正常:**指的是当前物化视图是否可用于透明改写。 +- **状态正常:** 指的是当前物化视图是否可用于透明改写。 -- **不可用、状态不正常:**指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 +- **不可用、状态不正常:** 指的是物化视图不能用于透明改写的简称。尽管如此,该物化视图还是可以直查的。 ### 查看物化视图元数据 @@ -222,9 +222,9 @@ SyncWithBaseTables: 1 - 对于分区增量的物化视图,分区物化视图是否可用,是以分区粒度去看的。也就是说,即使物化视图的部分分区不可用,但只要查询的是有效分区,那么此物化视图依旧可用于透明改写。是否能透明改写,主要看查询所用分区的 `SyncWithBaseTables` 字段是否一致。如果 `SyncWithBaseTables` 是 1,此分区可用于透明改写;如果是 0,则不能用于透明改写。 -- **JobName:**物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 +- **JobName:** 物化视图构建 Job 的名称,每个物化视图有一个 Job,每次刷新会有一个新的 Task,Job 和 Task 是 1:n 的关系 -- **State:**如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 +- **State:** 如果变为 SCHEMA_CHANGE,代表基表的 Schema 发生了变化,此时物化视图将不能用来透明改写 (但是不影响直接查询物化视图),下次刷新任务如果执行成功,将恢复为 NORMAL。 - **SchemaChangeDetail:** 表示 SCHEMA_CHANGE 发生的原因。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md index 2cb8a1e320631..14e9ae2cc665c 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/all-release.md @@ -39,13 +39,13 @@ under the License.
-- [2024-12-02, Apache Doris 3.0.2 版本发布](../releasenotes/v3.0/release-3.0.2.md) +- [2024-12-02, Apache Doris 3.0.3 版本发布](../releasenotes/v3.0/release-3.0.3.md) -- [2024-11-10, Apache Doris 2.1.7 版本发布](../releasenotes/v2.1/release-2.1.7.md) +- [2024-11-10, Apache Doris 2.1.7 版本发布](../releasenotes/v2.1/release-2.1.7) - [2024-10-15, Apache Doris 3.0.2 版本发布](../releasenotes/v3.0/release-3.0.2.md) -- [2024-09-30, Apache Doris 2.0.15 版本发布](../releasenotes/v2.0/release-2.0.15.md) +- [2024-09-30, Apache Doris 2.0.15 版本发布](/releasenotes/v2.0/release-2.0.15.md) - [2024-09-10, Apache Doris 2.1.6 版本发布](../releasenotes/v2.1/release-2.1.6.md) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md index 434677f520819..d14aec8a307e5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.0.md @@ -104,7 +104,7 @@ under the License. ![Local Shuffle Clickbench and TPCH-100](/images/2.1-doris-clickbench-tpch.png) :::note 备注 -参考文档:[Pipeline X 执行引擎](https://doris.apache.org/zh-CN/docs/query-acceleration/pipeline-execution-engine) +参考文档:[Pipeline X 执行引擎](../../query-acceleration/pipeline-execution-engine) ::: ## ARM 架构深度适配,性能提升 230% @@ -141,9 +141,9 @@ under the License. 
该功能目前为实验性质功能,当前已经支持 ClickHouse、Presto、Trino、Hive、Spark。在此我们以 Trino 为例,部署完 SQL 转换服务后,在会话变量中设置 `set sql_dialect = trino` ,即可直接采取 Trino SQL 语法执行查询。在某些社区用户的实际线上业务 SQL 兼容性测试中,在全部 3w 多条查询语句中与 Trino SQL 兼容度高达 99% 以上。也欢迎所有用户在使用过程中向我们反馈不兼容的 Case,帮助 Apache Doris 更加完善。 :::note -- 演示 Demo: https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0 +- [演示 Demo](https://www.bilibili.com/video/BV1cS421A7kA/?spm_id_from=333.999.0.0) -- 参考文档:[SQL 方言兼容](https://doris.apache.org/zh-CN/docs/lakehouse/sql-dialect.md) +- 参考文档:[SQL 方言兼容](../../lakehouse/sql-dialect.md) ::: @@ -302,7 +302,7 @@ CREATE MATERIALIZED VIEW mv1 :::note - 演示 Demo: https://www.bilibili.com/video/BV1s2421T71z/?spm_id_from=333.999.0.0 -- 参考文档:[异步物化视图](https://doris.apache.org/zh-CN/docs/query-acceleration/materialized-view/async-materialized-view/overview) +- 参考文档:[异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/overview) ::: ## 存储能力增强 @@ -408,7 +408,7 @@ PROPERTIES ( :::note -参考文档:[数据划分](https://doris.apache.org/zh-CN/docs/table-design/data-partitioning/basic-concepts) +参考文档:[数据划分](../../table-design/data-partitioning/basic-concepts) ::: ### INSERT INTO SELECT 导入性能提升 100% @@ -470,7 +470,7 @@ MemTable 前移在 2.1 版本中默认开启,用户无需修改原有的导入 :::note - 演示 Demo:https://www.bilibili.com/video/BV1um411o7Ha/?spm_id_from=333.999.0.0 -- 参考文档和完整测试报告:[Group Commit](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/group-commit-manual) +- 参考文档和完整测试报告:[Group Commit](../../data-operate/import/import-way/group-commit-manual) ::: @@ -542,7 +542,7 @@ SELECT v["properties"]["title"] from ${table_name} :::note - 演示 Demo: https://www.bilibili.com/video/BV13u4m1g7ra/?spm_id_from=333.999.0.0 -- 参考文档:[VARIANT](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md) +- 参考文档:[VARIANT](../../sql-manual/sql-data-types/semi-structured/VARIANT.md) ::: @@ -557,7 +557,7 @@ SELECT v["properties"]["title"] from ${table_name} - INET_ATON:获取包含 IPv4 地址的字符串,格式为 
A.B.C.D(点分隔的十进制数字) :::note -参考文档:[IPV6](https://doris.apache.org/zh-CN/docs/sql-manual/sql-data-types/ip/IPV6) +参考文档:[IPV6](../../sql-manual/sql-data-types/ip/IPV6) ::: @@ -674,7 +674,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul - `MAP_AGG`:接收 expr1 作为键,expr2 作为对应的值,返回一个 MAP :::note -参考文档:[MAP_AGG](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/aggregate-functions/map-agg.md) +参考文档:[MAP_AGG](../../sql-manual/sql-functions/aggregate-functions/map-agg.md) ::: @@ -699,7 +699,7 @@ mysql> select struct(1,"2") not in (struct(1,3), struct(1,"2"), struct(1,1), nul :::note - 演示 Demo:https://www.bilibili.com/video/BV1Fz421X7XE/?spm_id_from=333.999.0.0 -- 参考文档:[Workload Group](https://doris.apache.org/zh-CN/docs/admin-manual/resource-admin/workload-group.md) +- 参考文档:[Workload Group](../../admin-manual/resource-admin/workload-group.md) ::: @@ -757,7 +757,7 @@ select QueryId,max(BePeakMemoryBytes) as be_peak_mem from active_queries() group 目前主要展示的负载类型包括 Select 和`Insert Into……Select`,预计在 2.1 版本之上的三位迭代版本中会支持 Stream Load 和 Broker Load 的资源用量展示。 :::note -参考文档:[ACTIVE_QUERIES](https://doris.apache.org/zh-CN/docs/sql-manual/sql-functions/table-functions/active_queries.md) +参考文档:[ACTIVE_QUERIES](../../sql-manual/sql-functions/table-functions/active_queries.md) ::: @@ -858,7 +858,7 @@ JOB e_daily :::caution 注意事项 -当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](https://doris.apache.org/zh-CN/docs/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) +当前 Job Scheduler 仅支持 Insert 内表,参考文档:[CREATE-JOB](../../sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-JOB.md) ::: @@ -878,7 +878,7 @@ JOB e_daily - 对于之前已经安装过审计日志插件的用户,升级后可以继续使用原有插件,也可以通过 uninstall 命令卸载原有插件后,使用新的插件。但注意,切换插件后,审计日志表也将切换到新的表中。 - - 具体可参阅:[审计日志插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin.md) + - 具体可参阅:[审计日志插件](../../admin-manual/audit-plugin.md) diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md index 96b7c849d341b..1517bf0b53fca 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.2.md @@ -40,7 +40,7 @@ under the License. - https://github.com/apache/doris/pull/33282 -3. Auto Partition 语法变化,详见 https://doris.apache.org/zh-CN/docs/table-design/data-partition#%E8%87%AA%E5%8A%A8%E5%88%86%E5%8C%BA +3. Auto Partition 语法变化,详见[文档](../../table-design/data-partitioning/auto-partitioning.md) - https://github.com/apache/doris/pull/32737 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md index dc33f0d6011fa..15056902e7534 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.3.md @@ -37,7 +37,7 @@ under the License. 从 2.1.3 版本开始,Apache Doris 支持对 Hive 的 DDL 和 DML 操作。用户可以直接通过 Apache Doris 在 Hive 中创建库表,通过执行`INSERT INTO`语句来向 Hive 表中写入数据。通过该功能,用户可以通过 Apache Doris 对 Hive 进行完整的数据查询和写入操作,进一步帮助用户简化湖仓一体架构。 -参考文档:[https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) +参考[文档](../../lakehouse/datalake-building/hive-build) **2. 
支持在异步物化视图之上构建新的异步物化视图** diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md index d8e3a2d8be538..722de717ea32a 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.4.md @@ -40,9 +40,9 @@ under the License. 关于更多信息,请参考文档: - - [BE 日志管理](../admin-manual/log-management/be-log.md) + - [BE 日志管理](../../admin-manual/log-management/be-log.md) - - [FE 日志管理](../admin-manual/log-management/fe-log.md) + - [FE 日志管理](../../admin-manual/log-management/fe-log.md) - 如果建表时没有填写表注释,默认注释为空,不再使用表类型作为默认表注释。 [#36025](https://github.com/apache/doris/pull/36025) @@ -54,7 +54,7 @@ under the License. - **支持 FE 火焰图工具**:在 FE 部署目录 `${DORIS_FE_HOME}/bin` 中会增加`profile_fe.sh` 脚本,可以利用 async-profiler 工具生成 FE 的火焰图,用以发现性能瓶颈点。 - 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](/community/developer-guide/fe-profiler.md) + 关于更多信息,请参考文档:[使用 FE Profiler 生成火焰图](https://doris.apache.org/zh-CN/community/developer-guide/fe-profiler) - **支持 SELECT DISTINCT 与聚合函数同时使用**:支持 `SELECT DISTINCT` 与聚合函数同时使用,在一个查询中同时去重和进行聚合操作,如 SUM、MIN/MAX 等。 @@ -66,15 +66,15 @@ under the License. 
- **支持 Paimon 的原生读取器来处理 Deletion Vector:** Deletion Vector 主要用于标记或追踪哪些数据已被删除或标记为删除,通常应用在需要保留历史数据的场景,基于本优化可以提升大量数据更新或删除时的处理效率。 [#35241](https://github.com/apache/doris/pull/35241) - 关于更多信息,请参考文档:[数据湖分析 - Paimon](../lakehouse/datalake-analytics/paimon.md) + 关于更多信息,请参考文档:[数据湖分析 - Paimon](../../lakehouse/datalake-analytics/paimon.md) - **支持在表值函数(TVF)中使用 Resource**:TVF 功能为 Apache Doris 提供了直接将对象存储或 HDFS 上的文件作为 Table 进行查询分析的能力。通过在 TVF 中引用 Resource,可以避免重复填写连接信息,提升使用体验。 [#35139](https://github.com/apache/doris/pull/35139) - 关于更多信息,请参考文档:[表函数 - HDFS](../sql-manual/sql-functions/table-functions/hdfs.md) + 关于更多信息,请参考文档:[表函数 - HDFS](../../sql-manual/sql-functions/table-functions/hdfs.md) - **支持通过 Ranger 插件实现数据脱敏**:开启 Ranger 鉴权功能后,支持使用 Ranger 中的 Data Mask 功能进行数据脱敏。 - 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](https://doris.apache.org/zh-CN/docs/admin-manual/auth/ranger/#%E5%AE%89%E8%A3%85-doris-ranger-%E6%8F%92%E4%BB%B6) + 关于更多信息,请参考文档:[基于 Apache Ranger 的鉴权管理](../../admin-manual/auth/ranger#资源和权限) ### 异步物化视图 @@ -82,21 +82,21 @@ under the License. 
- 支持单表透明改写。 - 关于更多信息,请参考文档:[查询异步物化视图](../query/view-materialized-view/query-async-materialized-view.md) + 关于更多信息,请参考文档:[查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) - 透明改写支持 agg_state, agg_union 类型的聚合上卷,物化视图可以定义为 agg_state 或者 agg_union,查询使用具体的聚合函数,或者使用 agg_merge - 关于更多信息,请参考文档:[AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md) + 关于更多信息,请参考文档:[AGG_STATE](../../sql-manual/sql-data-types/aggregate/AGG-STATE.md) ### 其他 - **新增 `replace_empty` 函数**:将字符串中的子字符串进行替换,当旧字符串为空时,会将新字符串插入到原有字符串的每个字符前以及最后。 - 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../sql-manual/sql-functions/string-functions/replace_empty.md) + 关于更多信息,请参考文档:[字符串函数 - REPLACE_EMPTY](../../sql-manual/sql-functions/string-functions/replace_empty.md) - 支持 `show storage policy using` 语句:支持查看所有或指定存储策略关联的表和分区。 - 关于更多信息,请参考文档:[SQL 语句 - SHOW](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) + 关于更多信息,请参考文档:[SQL 语句 - SHOW](../../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md) - **支持 BE 侧的 JVM 指标:** 通过在 `be.conf` 配置文件中设置`enable_jvm_monitor=true`,可以启用对 BE 节点 JVM 的监控和指标收集,有助于了解 BE JVM 的资源使用情况,以便进行故障排除和性能优化。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md index b463d42968326..d8b86761d77fc 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.5.md @@ -56,7 +56,7 @@ under the License. 
- 会话变量 `read_csv_empty_line_as_null` 用于控制在读取 CSV 格式文件时,是否忽略空行。默认情况下忽略空行,当设置为 true 时,空行将被读取为所有列均为 Null 的行。[#37153](https://github.com/apache/doris/pull/37153) - - 更多信息,请参考[文档](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../lakehouse/datalake-analytics/hive?_highlight=compress_type)。 - 新增兼容 Presto 的复杂类型输出格式。通过设置 `set serde_dialect="presto"`,可以控制复杂类型的输出格式 与 Presto 一致,用于平滑迁移 Presto 业务。[#37253](https://github.com/apache/doris/pull/37253) @@ -131,7 +131,7 @@ under the License. - 数据导出(Export/Outfile)支持指定 Parquet 和 ORC 的压缩格式。 - - 更多信息,请参考[文档](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type)。 + - 更多信息,请参考[文档](../../sql-manual/sql-statements/data-modification/load-and-export/EXPORT.md)。 - 当使用 CTAS+TVF 创建表时,TVF 中的分区列将被自动映射为 Varchar(65533)而非 String,以便该分区列能够作为内表的分区列使用。 [#37161](https://github.com/apache/doris/pull/37161) @@ -207,7 +207,7 @@ under the License. - 支持为 `INSERT INTO ... FROM TABLE VALUE FUNCTION` 语句设置 `max_filter_ratio` 参数。 - - 更多信息,请参考[文档](https://doris.apache.org/zh-CN/docs/data-operate/import/import-way/insert-into-manual/) + - 更多信息,请参考[文档](../../data-operate/import/import-way/insert-into-manual) ## Bug 修复 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md index 6261e4e0c6612..65853079ee177 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.6.md @@ -56,15 +56,15 @@ under the License. 
- 实现 Iceberg 表的写回功能。 - - 更多信息,请查看文档数据湖构建-[Iceberg](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/iceberg-build) + - 更多信息,请查看文档数据湖构建-[Iceberg](../../lakehouse/datalake-building/iceberg-build) - 增强 SQL 拦截规则,支持对外表的拦截处理。 - - 更多信息,请查看文档查询管理-[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多信息,请查看文档查询管理-[SQL 拦截](../../admin-manual/query-admin/sql-interception) - 新增系统表`file_cache_statistics`,用于查看 BE 节点的数据缓存性能指标。 - - 更多信息,请查看文档系统表-[file_cache_statistics](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics/) + - 更多信息,请查看文档系统表-[file_cache_statistics](../../admin-manual/system-tables/information_schema/file_cache_statistics) ### 异步物化视图 @@ -108,10 +108,10 @@ under the License. - 新增系统表`table_properties`,便于用户查看和管理表的各项属性。 - - 更多信息,请查看文档 [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - 更多信息,请查看文档 [table_properties](../../admin-manual/system-tables/information_schema/table_properties/) - 新增 FE 中死锁和慢锁检测功能。 - - 更多信息,请查看文档 [FE 锁管理](https://doris.apache.org/zh-CN/docs/admin-manual/maint-monitor/frontend-lock-manager/) + - 更多信息,请查看文档 [FE 锁管理](../../admin-manual/maint-monitor/frontend-lock-manager/) ## 改进提升 @@ -119,7 +119,7 @@ under the License. 
- 革新外表元数据缓存机制。 - - 更多信息,请查看文档 [元数据缓存](https://doris.apache.org/zh-CN/docs/lakehouse/metacache/)。 + - 更多信息,请查看文档 [元数据缓存](../../lakehouse/metacache)。 - 新增会话变量`keep_carriage_return`,默认关闭。读取 Hive Text 格式表时,默认将`\r\n`与`\n`均视为换行符。[#38099](https://github.com/apache/doris/pull/38099) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md index 2d85c595f497c..f5bfea1d272f5 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v2.1/release-2.1.7.md @@ -38,7 +38,7 @@ under the License. - enable_fallback_to_original_planner: true - enable_pipeline_x_engine: true - 审计日志增加了新的列 [#42262](https://github.com/apache/doris/pull/42262) - - 更多信息,请参考[管理指南](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多信息,请参考[管理指南](../../admin-manual/audit-plugin) ## 新功能 @@ -61,8 +61,8 @@ under the License. - 增加了 `information_schema.table_options` 和 `information_schema.``table_properties` 系统表,支持查询建表时设置的一些属性。[#34384](https://github.com/apache/doris/pull/34384) - 更多信息,请参考系统表: - - [table_options](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_options/) - - [table_properties](https://doris.apache.org/zh-CN/docs/admin-manual/system-tables/information_schema/table_properties/) + - [table_options](../../admin-manual/system-tables/information_schema/table_options) + - [table_properties](../../admin-manual/system-tables/information_schema/table_properties) - 支持 `bitmap_empty` 作为默认值。[#40364](https://github.com/apache/doris/pull/40364) - 增加了一个新的 Session 变量`require_sequence_in_insert` 来控制向 Unique Key 表进行`insert into select` 写入时,是否必须提供 Sequence 列。[#41655](https://github.com/apache/doris/pull/41655) @@ -75,16 +75,16 @@ under the License. 
### 湖仓一体 - 支持写入数据到 Hive Text 格式表。[#40537](https://github.com/apache/doris/pull/40537) - - 更多信息,请参考[使用 Hive 构建数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-building/hive-build/)文档 + - 更多信息,请参考[使用 Hive 构建数据湖](../../lakehouse/datalake-building/hive-build/)文档 - 使用 MaxCompute Open Storage API 访问 MaxCompute 数据。[#41610](https://github.com/apache/doris/pull/41610) - - 更多信息,请参考 [MaxCompute](https://doris.apache.org/zh-CN/docs/lakehouse/database/max-compute/) 文档 + - 更多信息,请参考 [MaxCompute](../../lakehouse/database/max-compute/) 文档 - 支持 Paimon DLF Catalog。[#41694](https://github.com/apache/doris/pull/41694) - - 更多信息,请参考 [Paimon Catalog](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/paimon/) 文档 + - 更多信息,请参考 [Paimon Catalog](../../lakehouse/datalake-analytics/paimon/) 文档 - 新增语法 `table$partitions` 语法支持直接查询 Hive 分区信息 [#41230](https://github.com/apache/doris/pull/41230) - - 更多信息,请参考[通过 Hive 分析数据湖](https://doris.apache.org/zh-CN/docs/lakehouse/datalake-analytics/hive/)文档 + - 更多信息,请参考[通过 Hive 分析数据湖](../../lakehouse/datalake-analytics/hive/)文档 - 支持 brotli 压缩格式的 Parquet 文件读取。[#42162](https://github.com/apache/doris/pull/42162) - 支持读取 Parquet 文件中的 DECIMAL 256 类型。[#42241](https://github.com/apache/doris/pull/42241) -- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939)https://github.com/apache/doris/pull/42939 +- 支持读取 OpenCsvSerde 格式的 Hive 表。[#42939](https://github.com/apache/doris/pull/42939) ### 异步物化视图 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md index 5065dfc1566b7..2e7cdee64215e 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.0.md @@ -151,7 +151,7 @@ under the License. 
:::info 备注 -参考文档:[存算分离](https://doris.apache.org/zh-CN/docs/3.0/compute-storage-decoupled/overview) +参考文档:[存算分离](../../compute-storage-decoupled/overview) ::: @@ -200,15 +200,15 @@ under the License. - [接入 Trino Connector](https://doris.apache.org/zh-CN/community/how-to-contribute/trino-connector-developer-guide) -- [TPC-H](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpch/) +- [TPC-H](../../lakehouse/datalake-analytics/tpch/) -- [TPC-DS](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/tpcds/) +- [TPC-DS](../../lakehouse/datalake-analytics/tpcds/) -- [Delta Lake](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/deltalake) +- [Delta Lake](../../lakehouse/datalake-analytics/deltalake) -- [Kudu](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/kudu) +- [Kudu](../../lakehouse/datalake-analytics/kudu) -- [BigQuery](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/bigquery) +- [BigQuery](../../lakehouse/datalake-analytics/bigquery) ::: ### 2-3 数据湖构建 @@ -219,7 +219,7 @@ under the License. :::info 备注 -参考文档:[数据湖构建](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build/) +参考文档:[数据湖构建](../../lakehouse/datalake-building/hive-build/) ::: @@ -277,7 +277,7 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 参考文档: -- [事务](https://doris.apache.org/zh-CN/docs/3.0/data-operate/transaction/) +- [事务](../../data-operate/transaction/) - 目前 CCR 暂未支持显示事务同步。 ::: @@ -329,9 +329,9 @@ Variant 数据类型在经过大规模生产打磨后,已具备充分的稳定 :::info 备注 参考文档: -- [异步物化视图概览](https://doris.apache.org/zh-CN/docs/query/view-materialized-view/async-materialized-view) +- [异步物化视图概览](../../query-acceleration/materialized-view/async-materialized-view/overview.md) -- [查询异步物化视图](https://doris.apache.org/zh-CN/docs/3.0/query/view-materialized-view/query-async-materialized-view/) +- [查询异步物化视图](../../query-acceleration/materialized-view/async-materialized-view/functions-and-demands.md) ::: ## 6. 
性能提升 @@ -400,7 +400,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, ``` :::info 备注 -参考文档: [Java UDF - UDTF](https://doris.apache.org/zh-CN/docs/query/udf/java-user-defined-function#udtf-1) +参考文档: [Java UDF - UDTF](../../query-data/udf/java-user-defined-function.md#java-udtf-实例介绍) ::: ### 7-2 生成列 @@ -415,7 +415,7 @@ Runtime Filter 是否能够准确生成对查询性能的影响至关重要, 参考文档: -[CREATE TABLE AND GENERATED COLUMN](https://doris.apache.org/zh-CN/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/) +[CREATE TABLE AND GENERATED COLUMN](../../sql-manual/sql-statements/table-and-view/table/CREATE-TABLE.md) ::: ## 8. 功能改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md index 6f79a76c5872c..dd3d7829f2783 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.1.md @@ -74,7 +74,7 @@ under the License. - SQL 拦截功能现在支持外部表 - - 更多内容,参考文档[SQL 拦截](https://doris.apache.org/zh-CN/docs/admin-manual/query-admin/sql-interception) + - 更多内容,参考文档[SQL 拦截](../..//admin-manual/query-admin/sql-interception) - Insert Overwrite 现在支持 Iceberg 表。[#37191](https://github.com/apache/doris/pull/37191) @@ -108,7 +108,7 @@ under the License. 
- 新增加了 FE 参数 `skip_audit_user_list`,在此配置项中的用户操作将不会被记录到审计日志中。[#38310](https://github.com/apache/doris/pull/38310) - - 更多内容,参考文档[审计插件](https://doris.apache.org/zh-CN/docs/admin-manual/audit-plugin/) + - 更多内容,参考文档[审计插件](../../admin-manual/audit-plugin/) ## 改进 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md index bd84408eec7f0..cd509e52023ff 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.2.md @@ -63,7 +63,7 @@ under the License. ### Lakehouse -- 新增 Lakesoul Catalog。[Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- 新增 Lakesoul Catalog。[Apache Doris Docs](../../lakehouse/datalake-analytics/lakesoul) - 新增系统表 `catalog_meta_cache_statistics`,用于查看 External Catalog 中各类元数据缓存的使用情况。[#40155](https://github.com/apache/doris/pull/40155) ### 查询优化器 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md index 2f72f702483e3..8a3ecbfa4f62f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/releasenotes/v3.0/release-3.0.3.md @@ -45,11 +45,11 @@ under the License. 
- 新增 `table$partition` 语法,用于查询 Hive 表的分区信息。[#40774](https://github.com/apache/doris/pull/40774) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/hive#查询-hive-分区) + - [查看文档](../../lakehouse/datalake-analytics/hive#查询-hive-分区) - 支持创建 Text 格式的 Hive 表。[#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + - [查看文档](../../lakehouse/datalake-building/hive-build#table) ### 异步物化视图 @@ -71,7 +71,7 @@ under the License. - 数组函数 `array_agg` 支持在 ARRAY 中嵌套 ARRAY/MAP/STRUCT。[#42009](https://github.com/apache/doris/pull/42009) - 新增近似聚合统计函数 `approx_top_k` 和 `approx_top_sum`。[#44082](https://github.com/apache/doris/pull/44082) -## 改进 +## 改进与优化 ### 存储 @@ -96,7 +96,7 @@ under the License. - Paimon Catalog 支持阿里云 DLF 和 OSS-HDFS 存储。[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) + - [查看文档](../../lakehouse/datalake-analytics/paimon#基于-aliyun-dlf-创建-catalog) - 支持读取 OpenCSV 格式的 Hive 表。[#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) - 优化了访问 External Catalog 中 `information_schema.columns` 表的性能。[#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) @@ -142,7 +142,7 @@ under the License. - FE 监控项中的连接数信息支持按用户分别显示。[#39200](https://github.com/apache/doris/pull/39200) -## 缺陷修复 +## 问题修复 ### 存储 @@ -224,4 +224,4 @@ under the License. 
- 补充了审计日志表和文件中缺失的审计日志字段。[#43303](https://github.com/apache/doris/pull/43303) - - [查看文档](https://doris.apache.org/zh-CN/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file + - [查看文档](../../admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/sidebars.json b/sidebars.json index f1fc0355d530a..cb00972b7fbbf 100644 --- a/sidebars.json +++ b/sidebars.json @@ -117,10 +117,11 @@ "type": "category", "label": "Data Models", "items": [ - "admin-manual/data-admin/ccr/overview", - "admin-manual/data-admin/ccr/quickstart", - "admin-manual/data-admin/ccr/feature", - "admin-manual/data-admin/ccr/manual" + "table-design/data-model/overview", + "table-design/data-model/duplicate", + "table-design/data-model/unique", + "table-design/data-model/aggregate", + "table-design/data-model/tips" ] }, "table-design/row-store", diff --git a/versioned_docs/version-1.2/releasenotes/all-release.md b/versioned_docs/version-1.2/releasenotes/all-release.md new file mode 100644 index 0000000000000..392e8e1a9562e --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/all-release.md @@ -0,0 +1,88 @@ +--- +{ + "title": "All Releases", + "language": "en" +} +--- + + + +This document presents a summary of Apache Doris versions released within one year, listed in reverse chronological order. + +:::tip Latest Release + +🎉 Version 3.0.3 released now. Check out the 🔗[Release Notes](../releasenotes/v3.0/release-3.0.3) here. Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. + +
+ +🎉 Version 2.1.7 released now. Check out the 🔗[Release Notes](../releasenotes/v2.1/release-2.1.7) here. The 2.1 version delivers exceptional performance with 100% higher out-of-the-box queries proven by TPC-DS 1TB tests, enhanced data lake analytics that are 4-6 times speedier than Trino and Spark, solid support for semi-structured data analysis with new Variant types and suite of analytical functions, asynchronous materialized views for query acceleration, optimized real-time writing at scale, and better workload management with stability and runtime SQL resource tracking. + +::: + + +
+ +- [2024-12-02, Apache Doris 3.0.3 is released](../releasenotes/v3.0/release-3.0.3.md) + +- [2024-11-10, Apache Doris 2.1.7 is released](../releasenotes/v2.1/release-2.1.7.md) + +- [2024-10-15, Apache Doris 3.0.2 is released](../releasenotes/v3.0/release-3.0.2.md) + +- [2024-09-30, Apache Doris 2.0.15 is released](../releasenotes/v2.0/release-2.0.15.md) + +- [2024-09-10, Apache Doris 2.1.6 is released](../releasenotes/v2.1/release-2.1.6.md) + +- [2024-08-23, Apache Doris 3.0.1 is released](../releasenotes/v3.0/release-3.0.1.md) + +- [2024-07-24, Apache Doris 2.1.5 is released](../releasenotes/v2.1/release-2.1.5.md) + +- [2024-07-17, Apache Doris 2.0.13 is released](../releasenotes/v2.0/release-2.0.13.md) + +- [2024-06-27, Apache Doris 2.0.12 is released](../releasenotes/v2.0/release-2.0.12.md) + +- [2024-06-26, Apache Doris 2.1.4 is released](../releasenotes/v2.1/release-2.1.4.md) + +- [2024-06-05, Apache Doris 2.0.11 is released](../releasenotes/v2.0/release-2.0.11.md) + +- [2024-05-21, Apache Doris 2.1.3 is released](../releasenotes/v2.1/release-2.1.3.md) + +- [2024-05-16, Apache Doris 2.0.10 is released](../releasenotes/v2.0/release-2.0.10.md) + +- [2024-04-23, Apache Doris 2.0.9 is released](../releasenotes/v2.0/release-2.0.9.md) + +- [2024-04-12, Apache Doris 2.1.2 is released](../releasenotes/v2.1/release-2.1.2.md) + +- [2024-04-09, Apache Doris 2.0.8 is released](../releasenotes/v2.0/release-2.0.8.md) + +- [2024-04-03, Apache Doris 2.1.1 is released](../releasenotes/v2.1/release-2.1.1.md) + +- [2024-03-26, Apache Doris 2.0.7 is released](../releasenotes/v2.0/release-2.0.7.md) + +- [2024-03-12, Apache Doris 2.1.0 is released](../releasenotes/v2.1/release-2.1.0.md) + +- [2024-03-11, Apache Doris 2.0.6 is released](../releasenotes/v2.0/release-2.0.6.md) + +- [2024-02-28, Apache Doris 2.0.5 is released](../releasenotes/v2.0/release-2.0.5.md) + +- [2024-01-26, Apache Doris 2.0.4 is released](../releasenotes/v2.0/release-2.0.4.md) + + + + +diff --git
a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we completed full vectorization of the computing and storage layers and officially enabled the vectorized execution engine as a stable feature. All queries are executed by the vectorized execution engine by default, and performance is 3-5 times higher than in the previous version. This release adds the ability to access Apache Iceberg external tables and supports federated queries across data in Doris and Iceberg, expanding the analysis capabilities of Apache Doris on the data lake. In addition to the original LZ4, the ZSTD compression algorithm is now available, further improving the data compression ratio. Many performance and stability problems from previous versions have been fixed, greatly improving system stability. We recommend downloading and using this version. + +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users needed to enable it manually when executing queries by setting the session variables `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, we officially enabled the vectorized execution engine as a stable feature. The session variable `enable_vectorized_engine` is set to true by default, so all queries are executed through the vectorized execution engine by default. + +### BE Binary File Renaming + +The BE binary file has been renamed from palo_be to doris_be. If you rely on the process name for cluster management or other operations, remember to update the relevant scripts.
+ +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we had implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while maintaining compatibility with the Segment V1 format. + +In order to ensure the maintainability of the code structure and reduce the additional learning and development costs caused by redundant historical codes, we have decided to no longer support the Segment v1 storage format from the next version. It is expected that this part of the code will be deleted in the Apache Doris 1.2 version. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website. + +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table you can set `DISTRIBUTED BY random BUCKETS number`to use random distribution, the data will be randomly written to a single tablet when importing to reduce the data fanout during the loading process. And reduce resource overhead and improve system stability. + +### Support for creating Iceberg external tables[experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. 
Through Iceberg external tables, federated queries on data stored in local storage and Iceberg can be implemented, which saves tedious data loading work, simplifies the system architecture for data analysis, and performs more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying data, and supports automatic synchronization of all table schemas in the Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. For some scenarios that are sensitive to data storage costs, the original data compression ratio requirements cannot be met. + +In version 1.1, users can set "compression"="zstd" in the table properties to specify the compression method as ZSTD when creating a table. In the 25GB 110 million lines of text log test data, the highest compression rate is nearly 10 times, which is 53% higher than the original compression rate, and the speed of reading data from disk and decompressing it is increased by 30%. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +Implemented vectorization of all built-in functions + +The storage layer implements vectorization and supports dictionary optimization for low-cardinality string columns + +Optimized and resolved numerous performance and stability issues with the vectorization engine. 
+ +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 SQLs in the SSB test data set, version 1.1 is better than version 0.15, and the overall performance is improved by about 3 times, which solves the problem of performance degradation in some scenarios in version 1.0; + +On all 22 SQLs in the TPC-H test data set, version 1.1 is better than version 0.15, the overall performance is improved by about 4.5 times, and the performance of some scenarios is improved by more than ten times; + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit will generate a data version. In high concurrent write scenarios, -235 errors are prone to occur due to too many data versions and untimely compaction, and query performance will also decrease accordingly. + +In version 1.1, we introduced QuickCompaction, which will actively trigger compaction when the data version increases. At the same time, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Through active triggering and passive scanning, the real-time problem of data merging is completely solved. + +At the same time, for high-frequency small file cumulative compaction, the scheduling and isolation of compaction tasks is implemented to prevent the heavyweight base compaction from affecting the merging of new data. + +Finally, for the merging of small files, the strategy of merging small files is optimized, and the method of gradient merging is adopted. Each time the files participating in the merging belong to the same data magnitude, it prevents versions with large differences in size from merging, and gradually merges hierarchically. , reducing the number of times a single file participates in merging, which can greatly save the CPU consumption of the system. + +When the data upstream maintains a write frequency of 10w per second (20 concurrent write tasks, 5000 rows per job, and checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable. 
Compared with the -235 problem that frequently occurred during high concurrent writing in the previous version, the compaction merge efficiency has been improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The strategy has been optimized for small file Compaction. In the above scenario of high concurrent writing, CPU resource consumption is reduced by 25%; + +- Stable query time consumption: The overall orderliness of data is improved, and the fluctuation of query time consumption is greatly reduced. The query time consumption during high concurrent writing is the same as that of only querying, and the query performance is improved by 3-4 times compared with the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting arrow parameters, arrow's multi-threaded read capability is used to speed up Arrow's reading of each row_group, and it is modified to SPSC model to reduce the cost of waiting for the network through prefetching. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after the metadata checkpoint and retaining the function of historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that the data cannot be queried due to the missing data version.(Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that the resource isolation is invalid for the resource usage limit of loading tasks (Moderate) + +In 1.1, the broker load and routine load will use Backends with specified resource tags to do the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +it may cause data transmission errors. 
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default; please use the `/_stream_load` interface instead. +Of course, you can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format is no longer allowed to be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster, and convert it to SegmentV2 through the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +In version 1.1, we limit the maximum length of the String type to 1MB. Strings longer than this can no longer be written. +At the same time, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Update the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems during use, please feel free to contact us through the GitHub discussion forum or the dev mailing list at any time. 
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature was available in the non-vectorized engine but was missing from the vectorized engine in 1.1, so we added it back in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +There was no memtracker in BE for the vectorized engine in 1.1, so memory usage was uncontrolled and could cause OOM. In 1.1.1, a simple memtracker is added to BE; it controls memory usage and cancels the query when the memory limit is exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is compressed using bitshuffle, and decompressing it during a query takes a lot of time. In 1.1.1, Doris caches the decompressed form of bitshuffle-encoded data to accelerate queries, and we found that it can reduce latency by 30% for some queries in ssb-flat. + +## Bug Fix + +### Fix the problem that rolling upgrade from 1.0 could not be performed. (Serious) + +This issue was introduced in version 1.1 and may cause BE to core dump when BE is upgraded but FE is not. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some queries do not fall back to the non-vectorized engine, causing BE to core dump. + +Currently, the vectorized engine cannot handle all SQL queries, and some queries (like left outer join) run on the non-vectorized engine instead. However, some cases were not covered in 1.1, which causes BE to crash. + +### Compaction does not work correctly and causes -235 Error. + +When one rowset has multiple segments in unique key compaction, segment rows are merged in generic_iterator but merged_rows is not increased. Compaction then fails in check_correctness, leaving the tablet with too many versions, which leads to the -235 load error. + +### Some segment fault cases during query. 
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 170 issues or performance improvement since 1.1.1. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + +# Features + +### New MemTracker + +Introduced new MemTracker for both vectorized engine and non-vectorized engine which is more accurate. + +### Add API for showing current queries and kill query + +### Support read/write emoji of UTF16 via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC File scan performance about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when query Iceberg table. + +- Support query Hive data created by [Apache Tez](https://tez.apache.org/) + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add reuse block in non-vectorized engine and have 50% performance improvement in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve like or regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +It will have 40% performance gains in load or query. + +Currently it is a config, you can change it by set config `tc_enable_aggressive_memory_decommit`. 
+ +# Bug Fix + +### Some issues about FE that will cause FE failure or data corrupt. + +- Add reserved disk config to avoid too many reserved BDB-JE files.**(Serious)** In an HA environment, BDB JE will retains as many reserved files. The BDB-je log doesn't delete until approaching a disk limit. + +- Fix fatal bug in BDB-JE which will cause FE replica could not start correctly or data corrupted.** (Serious)** + +### Fe will hang on waitFor_rpc during query and BE will hang in high concurrent scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong result. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause BE core or in abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index 
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are worried about the impact of this feature on your business after upgrade, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* ``disable_storage_page_cache``: Whether to disable Storage Page Cache. version 1.1.2 (inclusive), the default is false, i.e., on. version 1.1.3 defaults to true, i.e., off. +* `chunk_reserved_bytes_limit`: Chunk allocator reserved memory size. 1.1.2 (and earlier), the default is 10% of the overall memory. 1.1.3 version default is 209715200 (200MB). + diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 60 issues or performance improvement since 1.1.3. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- SparkLoad support parquet and orc file.[#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load.[#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- If function return error result when there is nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates. 
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crash when call function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name is not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column when use MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader release memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix allow create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after subsitute using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump on case when have 1000 condition. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that last line of data lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 36 issues or performance improvement since 1.1.4. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. 
+ +# Behavior Changes + +When an alias is the same as the original column name, as in "select year(birthday) as birthday", and it is used in a group by, order by, or having clause, Doris's behavior used to differ from MySQL's. In this release, Doris follows MySQL's behavior: the group by and having clauses resolve to the original column first, while order by resolves to the alias first. This can be confusing, so our simple advice is to avoid using an alias that is the same as the original column name. + +# Features + +Add support for murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort results by table name when calling the show clause. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix core dump when there is an if constant expr in the select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update the default value of high_priority_flush_thread_num_per_store to 6 to improve load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +When the partition column is not a duplicate key, spark load throws an IndexOutOfBounds error. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when the table has a sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Use avg rowset to calculate batch size instead of total_bytes, since the latter costs a lot of CPU. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc. 
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md new file mode 100644 index 0000000000000..61ba6c5c60890 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.0.md @@ -0,0 +1,236 @@ +--- +{ + "title": "Release 2.0.0", + "language": "en" +} +--- + + + + +We are more than excited to announce that, after six months of coding, testing, and fine-tuning, Apache Doris 2.0.0 is now production-ready. Special thanks to the 275 committers who altogether contributed over 4100 optimizations and fixes to the project. + +This new version highlights: + +- 10 times faster data queries +- Enhanced log analytic and federated query capabilities +- More efficient data writing and updates +- Improved multi-tenant and resource isolation mechanisms +- Progresses in elastic scaling of resources and storage-compute separation +- Enterprise-facing features for higher usability + +> Download: https://doris.apache.org/download +> +> GitHub source code: https://github.com/apache/doris/releases/tag/2.0.0-rc04 + +## **A 10 Times Performance Increase** + +In SSB-Flat and TPC-H benchmarking, Apache Doris 2.0.0 delivered **over 10-time faster query performance** compared to an early version of Apache Doris. + +![](/images/release-note-2.0.0-1.png) + +This is realized by the introduction of a smarter query optimizer, inverted index, a parallel execution model, and a series of new functionalities to support high-concurrency point queries. + +### A smarter query optimizer + +The brand new query optimizer, Nereids, has a richer statistical base and adopts the Cascades framework. 
It is capable of self-tuning in most query scenarios and supports all 99 SQLs in TPC-DS, so users can expect high performance without any fine-tuning or SQL rewriting. + +TPC-H tests showed that Nereids, with no human intervention, outperformed the old query optimizer by a wide margin. Over 100 users have tried Apache Doris 2.0.0 in their production environment and the vast majority of them reported huge speedups in query execution. + +![](/images/release-note-2.0.0-2.png) + +**Doc**: https://doris.apache.org/docs/dev/query-acceleration/nereids/ + +Nereids is enabled by default in Apache Doris 2.0.0: `SET enable_nereids_planner=true`. Nereids collects statistical data by calling the Analyze command. + +### Inverted Index + +In Apache Doris 2.0.0, we introduced inverted index to better support fuzzy keyword search, equivalence queries, and range queries. + +A smartphone manufacturer tested Apache Doris 2.0.0 in their user behavior analysis scenarios. With inverted index enabled, v2.0.0 was able to finish the queries within milliseconds and maintain stable performance as the query concurrency level went up. In this case, it is 5 to 90 times faster than its old version. + +![](/images/release-note-2.0.0-3.png) + +### 20 times higher concurrency capability + +In scenarios like e-commerce order queries and express tracking, a huge number of end data users search for a certain data record simultaneously. These are what we call high-concurrency point queries, which can bring huge pressure on the system. A traditional solution is to introduce Key-Value stores like Apache HBase for such queries, and Redis as a cache layer to ease the burden, but that means redundant storage and higher maintenance costs. + +For a column-oriented DBMS like Apache Doris, the I/O usage of point queries will be multiplied. We need neater execution. 
Thus, on the basis of columnar storage, we added row storage format and row cache to increase row reading efficiency, short-circuit plans to speed up data retrieval, and prepared statements to reduce frontend overheads. + +After these optimizations, Apache Doris 2.0 reached a concurrency level of **30,000 QPS per node** on YCSB on a 16 Core 64G cloud server with 4×1T hard drives, representing an improvement of **20 times** compared to its older version. This makes Apache Doris a good alternative to HBase in high-concurrency scenarios, so that users don't need to endure extra maintenance costs and redundant storage brought by complicated tech stacks. + +Read more: https://doris.apache.org/blog/High_concurrency + +### A self-adaptive parallel execution model + +Apache Doris 2.0 brought in a Pipeline execution model for higher efficiency and stability in hybrid analytic workloads. In this model, the execution of queries is driven by data. The blocking operators in all query execution processes are split into pipelines. Whether a pipeline gets an execution thread depends on whether its relevant data is ready. This enables asynchronous blocking operations and more flexible system resource management. It also improves CPU efficiency, as the system doesn't have to create and destroy threads as often. + +Doc: https://doris.apache.org/docs/dev/query-acceleration/pipeline-execution-engine/ + +**How to enable the Pipeline execution model** + +- The Pipeline execution engine is enabled by default in Apache Doris 2.0: `Set enable_pipeline_engine = true`. +- `parallel_pipeline_task_num` represents the number of pipeline tasks that are executed in parallel in SQL queries. Its default value is `0`, which means Apache Doris will automatically set the concurrency level to half the number of CPUs in each backend node. Users can change this value as needed. 
+- For those who are upgrading to Apache Doris 2.0 from an older version, it is recommended to set the value of `parallel_pipeline_task_num` to that of `parallel_fragment_exec_instance_num` in the old version. + +## A Unified Platform for Multiple Analytic Workloads + +Apache Doris has been pushing its boundaries. Starting as an OLAP engine for reporting, it is now a data warehouse capable of ETL/ELT and more. Version 2.0 is making advancements in its log analysis and data lakehousing capabilities. + +### A 10 times more cost-effective log analysis solution + +Apache Doris 2.0.0 provides native support for semi-structured data. In addition to JSON and Array, it now supports a complex data type: Map. Based on Light Schema Change, it also supports Schema Evolution, which means you can adjust the schema as your business changes. You can add or delete fields and indexes, and change the data types for fields. With the newly introduced inverted index and a high-performance text analysis algorithm, it can execute full-text search and dimensional analysis of logs more efficiently. With faster data writing and query speed and lower storage cost, it is 10 times more cost-effective than common log analytics solutions in the industry. + +![](/images/release-note-2.0.0-4.png) + +### Enhanced data lakehousing capabilities + +In Apache Doris 1.2, we introduced Multi-Catalog to allow for auto-mapping and auto-synchronization of data from heterogeneous sources. In version 2.0.0, we extended the list of supported data sources and optimized Doris based on users' needs in production environments. + +![](/images/release-note-2.0.0-5.png) + +Apache Doris 2.0.0 supports dozens of data sources including Hive, Hudi, Iceberg, Paimon, MaxCompute, Elasticsearch, Trino, ClickHouse, and almost all open lakehouse formats. It also supports snapshot queries on Hudi Copy-on-Write tables and read optimized queries on Hudi Merge-on-Read tables.
It allows for authorization of Hive Catalog using Apache Ranger, so users can reuse their existing privilege control system. Besides, it supports extensible authorization plug-ins to enable user-defined authorization methods for any catalog. + +TPC-H benchmark tests showed that Apache Doris 2.0.0 is 3 to 5 times faster than Presto/Trino in queries on Hive tables. This is realized by all-around optimizations (in small file reading, flat table reading, local file cache, ORC/Parquet file reading, Compute Nodes, and information collection of external tables) finished in this development cycle and the distributed execution framework, vectorized execution engine, and query optimizer of Apache Doris. + +![](/images/release-note-2.0.0-6.png) + +All this gives Apache Doris 2.0.0 an edge in data lakehousing scenarios. With Doris, you can do incremental or full synchronization of multiple upstream data sources in one place, and expect much higher data query performance than other query engines. The processed data can be written back to the sources or provided for downstream systems. In this way, you can make Apache Doris your unified data analytic gateway. + +## Efficient Data Update + +Data update is important in real-time analysis, since users want to always have access to the latest data and to update data flexibly, for example by updating a row or just a few columns, batch updating or deleting specified data, or even overwriting a whole data partition. + +Efficient data updating has been another hill to climb in data analysis. Apache Hive only supports updates on the partition level, while Hudi and Iceberg do better in low-frequency batch updates than in real-time updates due to their Merge-on-Read and Copy-on-Write implementations.
+ +As for data updating, Apache Doris 2.0.0 is capable of: + +- **Faster data writing**: In the pressure tests with an online payment platform, under 20 concurrent data writing tasks, Doris reached a writing throughput of 300,000 records per second and maintained stability throughout the over 10-hour continuous writing process. +- **Partial column update**: Older versions of Doris implemented partial column update by `replace_if_not_null` in the Aggregate Key model. In 2.0.0, we enable partial column updates in the Unique Key model. That means you can directly write data from multiple source tables into a flat table, without having to concatenate them into one output stream using Flink before writing. This method avoids a complicated processing pipeline and the extra resource consumption. You can simply specify the columns you need to update. +- **Conditional update and deletion**: In addition to the simple Update and Delete operations, we support complicated conditional update and delete operations on the basis of Merge-on-Write. + +## Faster, Stabler, and Smarter Data Writing + +### Higher speed in data writing + +As part of our continuing effort to strengthen the real-time analytic capability of Apache Doris, we have improved the end-to-end real-time data writing capability of version 2.0.0.
Benchmark tests reported higher throughput in various writing methods: + +- Stream Load, TPC-H 144G lineitem table, 48-bucket Duplicate table, triple-replica writing: throughput increased by 100% +- Stream Load, TPC-H 144G lineitem table, 48-bucket Unique Key table, triple-replica writing: throughput increased by 200% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Duplicate table: throughput increased by 50% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Unique Key table: throughput increased by 150% + +### Greater stability in high-concurrency data writing + +The sources of system instability often include small file merging, write amplification, and the resulting disk I/O and CPU overheads. Hence, we introduced Vertical Compaction and Segment Compaction in version 2.0.0 to eliminate OOM errors in compaction and avoid the generation of too many segment files during data writing. After such improvements, Apache Doris can write data 50% faster while **using only 10% of the memory that it previously used**. + +Read more: https://doris.apache.org/blog/Compaction + +### Auto-synchronization of table schema + +The latest Flink-Doris-Connector allows users to synchronize an entire database (such as MySQL and Oracle) to Apache Doris in one simple step. According to our test results, one single synchronization task can support the real-time concurrent writing of thousands of tables. Users no longer need to go through a complicated synchronization procedure because Apache Doris has automated the process. Changes in the upstream data schema will be automatically captured and dynamically updated to Apache Doris in a seamless manner. + +Read more: https://doris.apache.org/blog/FDC + +## A New Multi-Tenant Resource Isolation Solution + +The purpose of multi-tenant resource isolation is to avoid resource preemption in the case of heavy loads.
For that sake, older versions of Apache Doris adopted a hard isolation plan featured by Resource Group: Backend nodes of the same Doris cluster would be tagged, and those of the same tag formed a Resource Group. As data was ingested into the database, different data replicas would be written into different Resource Groups, which will be responsible for different workloads. For example, data reading and writing will be conducted on different data tablets, so as to realize read-write separation. Similarly, you can also put online and offline business on different Resource Groups. + +![](/images/release-note-2.0.0-7.png) + +This is an effective solution, but in practice, it happens that some Resource Groups are heavily occupied while others are idle. We want a more flexible way to reduce vacancy rate of resources. Thus, in 2.0.0, we introduce Workload Group resource soft limit. + +![](/images/release-note-2.0.0-8.png) + +The idea is to divide workloads into groups to allow for flexible management of CPU and memory resources. Apache Doris associates a query with a Workload Group, and limits the percentage of CPU and memory that a single query can use on a backend node. The memory soft limit can be configured and enabled by the user. + +When there is a cluster resource shortage, the system will kill the largest memory-consuming query tasks; when there are sufficient cluster resources, once a Workload Group uses more resources than expected, the idle cluster resources will be shared among all the Workload Groups to give full play to the system memory and ensure stable execution of queries. You can also prioritize the Workload Groups in terms of resource allocation. In other words, you can decide which tasks can be assigned with adequate resources and which not. + +Meanwhile, we introduced Query Queue in 2.0.0. Upon Workload Group creation, you can set a maximum query number for a query queue. Queries beyond that limit will wait for execution in the queue. 
This is to reduce system burden under heavy workloads. + +## Elastic Scaling and Storage-Compute Separation + +When it comes to computation and storage resources, what do users want? + +- **Elastic scaling of computation resources**: Scale up resources quickly in peak times to increase efficiency and scale down in valley times to reduce costs. +- **Lower storage costs**: Use low-cost storage media and separate storage from computation. +- **Separation of workloads**: Isolate the computation resources of different workloads to avoid preemption. +- **Unified management of data**: Simply manage catalogs and data in one place. + +To separate storage and computation is a way to realize elastic scaling of resources, but it demands more efforts in maintaining storage stability, which determines the stability and continuity of OLAP services. To ensure storage stability, we introduced mechanisms including cache management, computation resource management, and garbage collection. + + In this respect, we divide our users into three groups after investigation: + +1. Users with no need for resource scaling +2. Users requiring resource scaling, low storage costs, and workload separation from Apache Doris +3. Users who already have a stable large-scale storage system and thus require an advanced compute-storage-separated architecture for efficient resource scaling + +Apache Doris 2.0 provides two solutions to address the needs of the first two types of users. + +1. **Compute nodes**. We introduced stateless compute nodes in version 2.0. Unlike the mix nodes, the compute nodes do not save any data and are not involved in workload balancing of data tablets during cluster scaling. Thus, they are able to quickly join the cluster and share the computing pressure during peak times. 
In addition, in data lakehouse analysis, these nodes will be the first ones to execute queries on remote storage (HDFS/S3) so there will be no resource competition between internal tables and external tables. + 1. Doc: https://doris.apache.org/docs/dev/advanced/compute_node/ +2. **Hot-cold data separation**. Hot/cold data refers to data that is frequently/seldom accessed, respectively. Generally, it makes more sense to store cold data in low-cost storage. Older versions of Apache Doris supported lifecycle management of table partitions: As hot data cooled down, it would be moved from SSD to HDD. However, data was stored with multiple replicas on HDD, which was still a waste. Now, in Apache Doris 2.0, cold data can be stored in object storage, which is even cheaper and allows single-copy storage. That reduces the storage costs by 70% and cuts down the computation and network overheads that come with storage. + 1. Read more: https://doris.apache.org/blog/HCDS/ + +For a cleaner separation of computation and storage, the VeloDB team is going to contribute the Cloud Compute-Storage-Separation solution to the Apache Doris project. Its performance and stability have stood the test of hundreds of companies in their production environments. The merging of code will be finished by October this year, and all Apache Doris users will be able to get an early taste of it in September. + +## Enhanced Usability + +Apache Doris 2.0.0 also highlights some enterprise-facing functionalities. + +### Support for Kubernetes Deployment + +Older versions of Apache Doris communicated based on IP addresses, so any host failure in a Kubernetes deployment that caused a Pod IP drift would lead to cluster unavailability. Now, version 2.0 supports FQDN. That means failed Doris nodes can recover automatically without human intervention, which lays the foundation for Kubernetes deployment and elastic scaling.
+ +### Support for Cross-Cluster Replication (CCR) + +Apache Doris 2.0.0 supports cross-cluster replication (CCR). Data changes at the database/table level in the source cluster will be synchronized to the target cluster. You can choose to replicate the incremental data or the full data. + +It also supports synchronization of DDL, which means DDL statements executed by the source cluster can also be automatically replicated to the target cluster. + +It is simple to configure and use CCR in Doris. Leveraging this functionality, you can implement read-write separation and multi-datacenter replication. + +This feature allows for higher data availability, read/write workload separation, and more efficient cross-data-center replication. + +## Behavior Change + +- Use rolling upgrade from 1.2-LTS to 2.0.0, and restart upgrade from preview versions of 2.0 to 2.0.0; +- The new query optimizer (Nereids) is enabled by default: `enable_nereids_planner=true`; +- All non-vectorized code has been removed from the system, so the `enable_vectorized_engine` parameter no longer works; +- A new parameter `enable_single_replica_compaction` has been added; +- datev2, datetimev2, and decimalv3 are the default data types in table creation; datev1, datetimev1, and decimalv2 are not supported in table creation; +- decimalv3 is the default data type for JDBC and Iceberg Catalog; +- A new data type `AGG_STATE` has been added; +- The cluster column has been removed from backend tables; +- For better compatibility with BI tools, datev2 and datetimev2 are displayed as date and datetime in the output of `show create table`; +- max_openfiles and swaps checks are added to the backend startup script, so inappropriate system configuration might lead to backend failure; +- Password-free login is not allowed when accessing frontend on localhost; +- If there is a Multi-Catalog in the system, by default, only data of the internal catalog will be displayed when querying information schema; +- A limit has been
imposed on the depth of the expression tree. The default value is 200; +- The single quotes in the return value of array strings have been changed to double quotes; +- The Doris processes are renamed to DorisFE and DorisBE. +- The behavior of the AES and SM4 functions with two arguments has changed. For more information, see the [related function docs](../../sql-manual/sql-functions/encrypt-digest-functions/sm4-encrypt.md) + +## Embarking on the 2.0.0 Journey + +To make Apache Doris 2.0.0 production-ready, we invited hundreds of enterprise users to engage in the testing and optimized it for better performance, stability, and usability. In the next phase, we will continue responding to user needs with agile release planning. We plan to launch 2.0.1 in late August and 2.0.2 in September, as we keep fixing bugs and adding new features. We also plan to release an early version of 2.1 in September to bring a few long-requested capabilities to you. For example, in Doris 2.1, the Variant data type will better serve the schema-free analytic needs of semi-structured data; the multi-table materialized views will be able to simplify the data scheduling and processing pipeline while speeding up queries; more streamlined data ingestion methods will be added and nested composite data types will be realized. + +If you have any questions or ideas when investigating, testing, and deploying Apache Doris, please find us on [Slack](https://t.co/ZxJuNJHXb2). Our developers will be happy to hear them and provide targeted support. + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md new file mode 100644 index 0000000000000..d8c19fb67525b --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.1.md @@ -0,0 +1,224 @@ +--- +{ + "title": "Release 2.0.1", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, 383 improvements and bug fixes have been made in Doris 2.0.1.
+ +## Behavior Changes + +- [https://github.com/apache/doris/pull/21302](https://github.com/apache/doris/pull/21302) + +## Improvements + +### functionality and stability of array and map datatypes +- [https://github.com/apache/doris/pull/22793](https://github.com/apache/doris/pull/22793) +- [https://github.com/apache/doris/pull/22927](https://github.com/apache/doris/pull/22927) +- https://github.com/apache/doris/pull/22738 +- https://github.com/apache/doris/pull/22347 +- https://github.com/apache/doris/pull/23250 +- https://github.com/apache/doris/pull/22300 + +### performance for inverted index query +- https://github.com/apache/doris/pull/22836 +- https://github.com/apache/doris/pull/23381 +- https://github.com/apache/doris/pull/23389 +- https://github.com/apache/doris/pull/22570 + +### performance for bitmap, like, scan, agg functions +- https://github.com/apache/doris/pull/23172 +- https://github.com/apache/doris/pull/23495 +- https://github.com/apache/doris/pull/23476 +- https://github.com/apache/doris/pull/23396 +- https://github.com/apache/doris/pull/23182 +- https://github.com/apache/doris/pull/22216 + +### functionality and stability of CCR +- https://github.com/apache/doris/pull/22447 +- https://github.com/apache/doris/pull/22559 +- https://github.com/apache/doris/pull/22173 +- https://github.com/apache/doris/pull/22678 + +### merge on write unique table + +- https://github.com/apache/doris/pull/22282 +- https://github.com/apache/doris/pull/22984 +- https://github.com/apache/doris/pull/21933 +- https://github.com/apache/doris/pull/22874 + +### optimizer table stats and analyze + +- https://github.com/apache/doris/pull/22658 +- https://github.com/apache/doris/pull/22211 +- https://github.com/apache/doris/pull/22775 +- https://github.com/apache/doris/pull/22896 +- https://github.com/apache/doris/pull/22788 +- https://github.com/apache/doris/pull/22882 +- + +### functionality and performance of multi catalog + +- https://github.com/apache/doris/pull/22949 
+- https://github.com/apache/doris/pull/22923 +- https://github.com/apache/doris/pull/22336 +- https://github.com/apache/doris/pull/22915 +- https://github.com/apache/doris/pull/23056 +- https://github.com/apache/doris/pull/23297 +- https://github.com/apache/doris/pull/23279 + + +## Important Bug fixes + +- https://github.com/apache/doris/pull/22673 +- https://github.com/apache/doris/pull/22656 +- https://github.com/apache/doris/pull/22892 +- https://github.com/apache/doris/pull/22959 +- https://github.com/apache/doris/pull/22902 +- https://github.com/apache/doris/pull/22976 +- https://github.com/apache/doris/pull/22734 +- https://github.com/apache/doris/pull/22840 +- https://github.com/apache/doris/pull/23008 +- https://github.com/apache/doris/pull/23003 +- https://github.com/apache/doris/pull/22966 +- https://github.com/apache/doris/pull/22965 +- https://github.com/apache/doris/pull/22784 +- https://github.com/apache/doris/pull/23049 +- https://github.com/apache/doris/pull/23084 +- https://github.com/apache/doris/pull/22947 +- https://github.com/apache/doris/pull/22919 +- https://github.com/apache/doris/pull/22979 +- https://github.com/apache/doris/pull/23096 +- https://github.com/apache/doris/pull/23113 +- https://github.com/apache/doris/pull/23062 +- https://github.com/apache/doris/pull/22918 +- https://github.com/apache/doris/pull/23026 +- https://github.com/apache/doris/pull/23175 +- https://github.com/apache/doris/pull/23167 +- https://github.com/apache/doris/pull/23015 +- https://github.com/apache/doris/pull/23165 +- https://github.com/apache/doris/pull/23264 +- https://github.com/apache/doris/pull/23246 +- https://github.com/apache/doris/pull/23198 +- https://github.com/apache/doris/pull/23221 +- https://github.com/apache/doris/pull/23277 +- https://github.com/apache/doris/pull/23249 +- https://github.com/apache/doris/pull/23272 +- https://github.com/apache/doris/pull/23383 +- https://github.com/apache/doris/pull/23372 +- 
https://github.com/apache/doris/pull/23399 +- https://github.com/apache/doris/pull/23295 +- https://github.com/apache/doris/pull/23446 +- https://github.com/apache/doris/pull/23406 +- https://github.com/apache/doris/pull/23387 +- https://github.com/apache/doris/pull/23421 +- https://github.com/apache/doris/pull/23456 +- https://github.com/apache/doris/pull/23361 +- https://github.com/apache/doris/pull/23402 +- https://github.com/apache/doris/pull/23369 +- https://github.com/apache/doris/pull/23245 +- https://github.com/apache/doris/pull/23532 +- https://github.com/apache/doris/pull/23529 +- https://github.com/apache/doris/pull/23601 + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.1-merged+is%3Aclosed) . + + +## Big Thanks + +Thanks all who contribute to this release: + +@adonis0147 +@airborne12 +@amorynan +@AshinGau +@BePPPower +@BiteTheDDDDt +@bobhan1 +@ByteYue +@caiconghui +@CalvinKirs +@csun5285 +@DarvenDuan +@deadlinefen +@DongLiang-0 +@Doris-Extras +@dutyu +@englefly +@freemandealer +@Gabriel39 +@GoGoWen +@HappenLee +@hello-stephen +@HHoflittlefish777 +@hubgeter +@hust-hhb +@JackDrogon +@jacktengg +@jackwener +@Jibing-Li +@kaijchen +@kaka11chen +@Kikyou1997 +@Lchangliang +@LemonLiTree +@liaoxin01 +@LiBinfeng-01 +@lsy3993 +@luozenglin +@morningman +@morrySnow +@mrhhsg +@Mryange +@mymeiyi +@shuke987 +@sohardforaname +@starocean999 +@TangSiyang2001 +@Tanya-W +@ucasfl +@vinlee19 +@wangbo +@wsjz +@wuwenchi +@xiaokang +@XieJiann +@xinyiZzz +@yujun777 +@Yukang-Lian +@Yulei-Yang +@zclllyybb +@zddr +@zenoyang +@zgxme +@zhangguoqiang666 +@zhangstar333 +@zhannngchen +@zhiqiang-hhhh +@zxealous +@zy-kkk +@zzzxl1993 +@zzzzzzzs + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md new file mode 100644 index 0000000000000..5d8592a0ee25c --- /dev/null +++ 
b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.10.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.10", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 83 improvements and bug fixes have been made in Doris 2.0.10. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## Improvement and Optimizations + +- The `read_only` and `super_read_only` variables have been introduced for compatibility with MySQL's read-only modes. + +- When the check status is not IO_ERROR, the disk path is no longer added to the broken list. This ensures that only disks with actual I/O errors are marked as broken. + +- When performing a Create Table As Select (CTAS) operation from an external table, convert the `VARCHAR` column to `STRING` type. + +- Support mapping Paimon column type "ROW" to Doris type "STRUCT" + +- Tolerate a small skew when choosing a disk for tablet creation + +- Write an editlog entry for `set replica drop` to avoid a confusing status on follower FEs + +- Make the memory used by schema change adaptive to avoid exceeding the memory limit + +- Inverted index 'unicode' tokenizer supports configuration to exclude stop words + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.9...2.0.10).
+ +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @BePPPower, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @deardeng, @DongLiang-0, @eldenmoon, @felixwluo, @HappenLee, @hubgeter, @jackwener, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @luennng, @morningman, @morrySnow, @Mryange, @nextdreamblue, @qidaye, @starocean999, @suxiaogang223, @SWJTU-ZhangLei, @w41ter, @xiaokang, @xy720, @yujun777, @Yukang-Lian, @zhangstar333, @zxealous, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md new file mode 100644 index 0000000000000..1a2598b0d41a0 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.11.md @@ -0,0 +1,60 @@ +--- +{ + "title": "Release 2.0.11", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 123 improvements and bug fixes have been made in Doris 2.0.11 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## 1 Behavior change + +Since the inverted index is now mature and stable, it can replace the old BITMAP INDEX. Therefore, any newly created `BITMAP INDEX` will automatically switch to an `INVERTED INDEX`, while existing `BITMAP INDEX` will remain unchanged. This entire switching process is transparent to the user, with no changes to writing or querying. Additionally, users can disable this automatic switch by setting the FE configuration `enable_create_bitmap_index_as_inverted_index` to false. 
[#35528](https://github.com/apache/doris/pull/35528) + + +## 2 Improvement and optimizations + +- Add Trino JDBC Catalog type mapping for JSON and TIME + +- FE exits when it fails to transfer to (non) master, to prevent an unknown state and excessive logging + +- Write an audit log when dropping a stats table. + +- Ignore min/max column stats if a table is only partially analyzed, to avoid inefficient query plans + +- Support the minus operation for sets, such as `set1 - set2` + +- Improve performance of LIKE and REGEXP clauses with concat(col, pattern_str), e.g. `col1 LIKE concat('%', col2, '%')` + +- Add query options for short circuit queries for upgrade compatibility + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.10...2.0.11). + +## Credits + +Thanks to all who contributed to this release: + +@AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @Gabriel39, @GoGoWen, @HHoflittlefish777, @hubgeter, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaka11chen, @kobe6th, @LiBinfeng-01, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @nextdreamblue, @qidaye, @sjyango, @starocean999, @SWJTU-ZhangLei, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XieJiann, @xy720, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md new file mode 100644 index 0000000000000..0bc289c91a8ef --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.12.md @@ -0,0 +1,58 @@ +--- +{ + "title": "Release 2.0.12", + "language": "en" +} +--- + + + +Thanks to our community developers and users for their contributions. Doris version 2.0.12 brings 99 improvements and bug fixes.
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- No longer set the default table comment to the table type. Instead, set it to be empty by default, for example, change COMMENT 'OLAP' to COMMENT ' '. This new behavior is more friendly for BI software that relies on table comments. [#35855](https://github.com/apache/doris/pull/35855) + +- Change the type of the `@@autocommit` variable from `BOOLEAN` to `BIGINT` to prevent errors from certain MySQL clients (such as .NET MySQL.Data). [#33282](https://github.com/apache/doris/pull/33282) + + +## Improvements + +- Remove the `disable_nested_complex_type` parameter and allow the creation of nested `ARRAY`, `MAP`, and `STRUCT` types by default. [#36255](https://github.com/apache/doris/pull/36255) + +- The HMS catalog supports the `SHOW CREATE DATABASE` command. [#28145](https://github.com/apache/doris/pull/28145) + +- Add more inverted index metrics to the query profile. [#36545](https://github.com/apache/doris/pull/36545) + +- Cross-Cluster Replication (CCR) supports inverted indices. [#31743](https://github.com/apache/doris/pull/31743) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.11...2.0.12) , with the key features and improvements highlighted below. 
+ + + +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @amorynan, @BiteTheDDDDt, @cambyzju, @caoliang-web, @dataroaring, @eldenmoon, @feiniaofeiafei, @felixwluo, @gavinchou, @HappenLee, @hello-stephen, @jacktengg, @Jibing-Li, @Johnnyssc, @liaoxin01, @LiBinfeng-01, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @qidaye, @qzsee, @starocean999, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XuPengfei-1020, @xy720, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md new file mode 100644 index 0000000000000..1b6e54d948d7d --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.13.md @@ -0,0 +1,61 @@ +--- +{ + "title": "Release 2.0.13", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 112 improvements and bug fixes have been made in Doris 2.0.13. + +[Quick Download](https://doris.apache.org/download/) + +## Behavior changes + +SQL input is treated as multiple statements only when the `CLIENT_MULTI_STATEMENTS` setting is enabled on the client side, enhancing compatibility with MySQL. [#36759](https://github.com/apache/doris/pull/36759) + +## New features + +- A new BE configuration `allow_zero_date` has been added, allowing dates with all zeros. When set to `false`, `0000-00-00` is parsed as `NULL`, and when set to `true`, it is parsed as `0000-01-01`. The default value is `false` to maintain consistency with previous behavior. [#34961](https://github.com/apache/doris/pull/34961) + +- `LogicalWindow` and `LogicalPartitionTopN` support multi-field predicate pushdown to improve performance.
[#36828](https://github.com/apache/doris/pull/36828) + +- The ES Catalog now maps ES `nested` or `object` types to Doris `JSON` types. [#37101](https://github.com/apache/doris/pull/37101) + +## Improvements + +- Queries with `LIMIT` stop reading data earlier to reduce resource consumption and improve performance. [#36535](https://github.com/apache/doris/pull/36535) + +- Special JSON data with empty keys is now supported. [#36762](https://github.com/apache/doris/pull/36762) + +- Stability and usability of routine load have been improved, including load balancing, automatic recovery, exception handling, and more user-friendly error messages. [#36450](https://github.com/apache/doris/pull/36450) [#35376](https://github.com/apache/doris/pull/35376) [#35266](https://github.com/apache/doris/pull/35266) [#33372](https://github.com/apache/doris/pull/33372) [#32282](https://github.com/apache/doris/pull/32282) [#32046](https://github.com/apache/doris/pull/32046) [#32021](https://github.com/apache/doris/pull/32021) [#31846](https://github.com/apache/doris/pull/31846) [#31273](https://github.com/apache/doris/pull/31273) + +- The BE's disk selection strategy for load balancing and its speed have been optimized. [#36826](https://github.com/apache/doris/pull/36826) [#36795](https://github.com/apache/doris/pull/36795) [#36509](https://github.com/apache/doris/pull/36509) + +- Stability and usability of the JDBC catalog have been improved, including encryption, thread pool connection count configuration, and more user-friendly error messages. [#36940](https://github.com/apache/doris/pull/36940) [#36720](https://github.com/apache/doris/pull/36720) [#30880](https://github.com/apache/doris/pull/30880) [#35692](https://github.com/apache/doris/pull/35692) + +The key features and improvements are highlighted above; the full list is available on [GitHub](https://github.com/apache/doris/compare/2.0.12...2.0.13).
+ +## Credits + +Thanks to all who contributed to this release: + +@Gabriel39, @Jibing-Li, @Johnnyssc, @Lchangliang, @LiBinfeng-01, @SWJTU-ZhangLei, @Thearas, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @jacktengg, @kaijchen, @liutang123, @luwei16, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @sollhui, @starocean999, @w41ter, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md new file mode 100644 index 0000000000000..061c5cb7a1093 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.14.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.14", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 110 improvements and bug fixes have been made in Doris 2.0.14 version + + +## 1 New features + +- Adds a REST interface to retrieve the most recent query profile: `curl http://user:password@127.0.0.1:8030/api/profile/text` [#38268](https://github.com/apache/doris/pull/38268) + +## 2 Improvements + +- Optimizes the primary key point query performance for MOW tables with sequence columns [#38287](https://github.com/apache/doris/pull/38287) + +- Enhances the performance of inverted index queries with many conditions [#35346](https://github.com/apache/doris/pull/35346) + +- Automatically enables the `support_phrase` option when creating a tokenized inverted index to accelerate `match_phrase` phrase queries [#37949](https://github.com/apache/doris/pull/37949) + +- Supports simplified SQL hints, for example: `SELECT /*+ query_timeout(3000) */ * FROM t;` [#37720](https://github.com/apache/doris/pull/37720) + +- Automatically retries reading from object storage when encountering a `429` error to improve stability 
[#35396](https://github.com/apache/doris/pull/35396) + +- LEFT SEMI / ANTI JOIN stops further matching once a qualifying data row has been matched, to enhance performance. [#34703](https://github.com/apache/doris/pull/34703) + +- Prevents coredump when returning illegal data to MySQL results. [#28069](https://github.com/apache/doris/pull/28069) + +- Unifies the output of type names in lowercase to maintain compatibility with MySQL and be more friendly to BI tools. [#38521](https://github.com/apache/doris/pull/38521) + + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.13...2.0.14), with the key features and improvements highlighted above. + +## Credits + +Thanks to all who contributed to this release: + +@ByteYue, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Lchangliang, @LiBinfeng-01, @Mryange, @XieJiann, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @biohazard4321, @cambyzju, @csun5285, @eldenmoon, @englefly, @freemandealer, @hello-stephen, @hubgeter, @kaijchen, @liaoxin01, @luwei16, @morningman, @morrySnow, @mymeiyi, @qidaye, @sollhui, @starocean999, @w41ter, @wuwenchi, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md new file mode 100644 index 0000000000000..58237f7c3f097 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.15.md @@ -0,0 +1,91 @@ +--- +{ + "title": "Release 2.0.15", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 157 improvements and bug fixes have been made in Doris 2.0.15. + +- Quick Download: https://doris.apache.org/download + +- GitHub: https://github.com/apache/doris/releases/tag/2.0.15 + +## 1 Behavior Change + +NA + +## 2 New Features + +- Restore now supports deleting redundant tablets 
and partition options. [#39028](https://github.com/apache/doris/pull/39028) + +- Support JSON function `json_search`.[#40948](https://github.com/apache/doris/pull/40948) + +## 3 Improvement and Optimizations + +### Stability + +- Add a FE configuration `abort_txn_after_lost_heartbeat_time_second` for transaction abort time. [#28662](https://github.com/apache/doris/pull/28662) + +- Abort transactions after a BE loses heartbeat for over 1 minute instead of 5 seconds, to avoid overly sensitive transaction aborts. [#22781](https://github.com/apache/doris/pull/22781) + +- Delay scheduling EOF tasks of routine load to avoid an excessive number of small transactions. [#39975](https://github.com/apache/doris/pull/39975) + +- Prefer querying from online disk services to be more robust. [#39467](https://github.com/apache/doris/pull/39467) + +- Skip checking newly inserted rows in non-strict mode partial updates if the row's delete sign is marked. [#40322](https://github.com/apache/doris/pull/40322) + +- To prevent FE OOM, limit the number of tablets in backup tasks, with a default value of 300,000. [#39987](https://github.com/apache/doris/pull/39987) + +### Performance + +- Optimize slow column updates caused by concurrent column updates and compactions. [#38487](https://github.com/apache/doris/pull/38487) + +- When a NullLiteral exists in a filter condition, it can now be folded into False and further converted to an EmptySet to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Improve performance of `ORDER BY` permutation. [#38985](https://github.com/apache/doris/pull/38985) + +- Improve the performance of string processing in inverted indexes. [#37395](https://github.com/apache/doris/pull/37395) + +### Optimizer and Statistics + +- Added support for statements beginning with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Polish aggregate function signature matching. 
[#39352](https://github.com/apache/doris/pull/39352) + +- Drop column statistics and trigger auto analysis after schema change. [#39101](https://github.com/apache/doris/pull/39101) + +- Support dropping cached stats using `DROP CACHED STATS table_name`. [#39367](https://github.com/apache/doris/pull/39367) + +### Multi Catalog and Others + +- Optimize JDBC Catalog refresh to reduce the frequency of client creation. [#40261](https://github.com/apache/doris/pull/40261) + +- Fix thread leaks in JDBC Catalog under certain conditions. [#39423](https://github.com/apache/doris/pull/39423) + +- ARRAY MAP STRUCT types now support `REPLACE_IF_NOT_NULL`. [#38304](https://github.com/apache/doris/pull/38304) + +- Retry delete jobs for failures that are not `DELETE_INVALID_XXX`. [#37834](https://github.com/apache/doris/pull/37834) + +**Credits** + +@924060929, @BePPPower, @BiteTheDDDDt, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Johnnyssc, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Toms1999, @Vallishp, @Yukang-Lian, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @htyoung, @hubgeter, @justfortaste, @liaoxin01, @liugddx, @liutang123, @luwei16, @mongo360, @morrySnow, @qidaye, @smallx, @sollhui, @starocean999, @w41ter, @xiaokang, @xzj7019, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhannngchen, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md new file mode 100644 index 0000000000000..3f8e89cddf946 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.2.md @@ -0,0 +1,157 @@ +--- +{ + "title": "Release 2.0.2", + "language": "en" +} +--- + + + +# Release 2.0.2 + +Thanks to our community users and developers, 489 improvements and bug fixes have been made in Doris 2.0.2. 
+ +## Behavior Changes + +- [Remove json -> operator convert to json_extract #24679](https://github.com/apache/doris/pull/24679) + + Remove json '->' operator since it is conflicted with lambda function syntax. It's a syntax sugar for function json_extract and can be replaced with the former. +- [Start the script to set metadata_failure_recovery #24308](https://github.com/apache/doris/pull/24308) + + Move metadata_failure_recovery from fe.conf to start_fe.sh argument to prevent being used unexpectedly. +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Optimize priority_ network matching logic for be #23795](https://github.com/apache/doris/pull/23795) +- [Fix cancel load failed because Job could not be cancelled… #17730](https://github.com/apache/doris/pull/17730) + + Allow cancel a retrying load job. + +## Improvements + +### Easier to use + +- [Support custom lib dir to save custom libs #23887](https://github.com/apache/doris/pull/23887) + + Add a custom_lib dir to allow users place custom lib files and custom_lib will not be replaced. +- [Optimize priority_ network matching logic #23784](https://github.com/apache/doris/pull/23784) + + Optimize priority_network logic to avoid error when this config is wrong or not configured. +- [Row policy support role #23022](https://github.com/apache/doris/pull/23022) + + Support role based auth for row policy. + +### New optimizer Nereids statistics collection improvement + +- [Disable file cache while running analysis tasks. #23663](https://github.com/apache/doris/pull/23663) +- [Show column stats even when error occurred. #23703](https://github.com/apache/doris/pull/23703) +- [Support basic jdbc external table stats collection. 
#23965](https://github.com/apache/doris/pull/23965) +- [Skip unknown col stats check on __internal_schema and information_schema #24625](https://github.com/apache/doris/pull/24625) + +### Better support for JDBC, HDFS, Hive, MySQL, Max Compute, Multi-Catalog + +- [Support hadoop viewfs. #24168](https://github.com/apache/doris/pull/24168) +- [Avoid calling checksum when replaying creating jdbc catalog and fix ranger issue #22369](https://github.com/apache/doris/pull/22369) +- [Optimize the JDBC Catalog connection error message #23868](https://github.com/apache/doris/pull/23868) + + Improve property check and error message for JDBC catalog +- [Fix mc decimal type parse, fix wrong obj location #24242](https://github.com/apache/doris/pull/24242) + + Fix some issues for Max Compute catalog +- [Support sql cache for hms catalog #23391](https://github.com/apache/doris/pull/23391) + + SQL cache for Hive catalog +- [Merge hms partition events. #22869](https://github.com/apache/doris/pull/22869) + + Improve performance for Hive metadata sync +- [Add metadata_name_ids to quickly get catalogs, db, table and add profiling table to be compatible with MySQL #22702](https://github.com/apache/doris/pull/22702) + +### Performance for inverted index query + +- [Add bkd index query cache to improve perf #23952](https://github.com/apache/doris/pull/23952) +- [Improve performance for count on index other than match #24678](https://github.com/apache/doris/pull/24678) +- [Improve match performance without index #24751](https://github.com/apache/doris/pull/24751) +- [Optimize multiple terms conjunction query #23871](https://github.com/apache/doris/pull/23871) +Improve performance of MATCH_ALL +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +Improve performance of MATCH + +### Improve Array functions + +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Improve array union support multi 
params #24327](https://github.com/apache/doris/pull/24327) +- [Improve explode func with array nested complex type #24455](https://github.com/apache/doris/pull/24455) + +## Important Bug fixes + +- [The parameter positions of timestamp diff function to sql are reversed #23601](https://github.com/apache/doris/pull/23601) +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Fix query cache returns wrong result after deleting partitions. #23555](https://github.com/apache/doris/pull/23555) +- [Fix potential data loss when clone task's dst tablet is cooldown replica #17644](https://github.com/apache/doris/pull/17644) +- [Fix array map batch append data with right next_array_item_rowid #23779](https://github.com/apache/doris/pull/23779) +- [Fix or to in rule #23940](https://github.com/apache/doris/pull/23940) +- [Fix 'char' function's toSql implementation is wrong #23860](https://github.com/apache/doris/pull/23860) +- [Record wrong best plan properties #23973](https://github.com/apache/doris/pull/23973) +- [Make TVF's distribution spec always be RANDOM #24020](https://github.com/apache/doris/pull/24020) +- [External scan use STORAGE_ANY instead of ANY as distibution #24039](https://github.com/apache/doris/pull/24039) +- [Runtimefilter target is not SlotReference #23958](https://github.com/apache/doris/pull/23958) +- [mv in select materialized_view should disable show table #24104](https://github.com/apache/doris/pull/24104) +- [Fail over to remote file reader if local cache failed #24097](https://github.com/apache/doris/pull/24097) +- [Fix revoke role operation cause fe down #23852](https://github.com/apache/doris/pull/23852) +- [Handle status code correctly and add a new error code `ENTRY_NOT_FOUND` #24139](https://github.com/apache/doris/pull/24139) +- [Fix leaky abstraction and shield the status code `END_OF_FILE` from upper layers #24165](https://github.com/apache/doris/pull/24165) +- [Fix bug that Read 
garbled files caused be crash. #24164](https://github.com/apache/doris/pull/24164) +- [Fix be core when user sepcified empty `column_separator` using hdfs tvf #24369](https://github.com/apache/doris/pull/24369) +- [Fix need to restart BE after replacing the jar package in java-udf #24372](https://github.com/apache/doris/pull/24372) +- [Need to call 'set_version' in nested functions #24381](https://github.com/apache/doris/pull/24381) +- [windown_funnel compatibility issue with multi backends #24385](https://github.com/apache/doris/pull/24385) +- [correlated anti join shouldn't be translated to null aware anti join #24290](https://github.com/apache/doris/pull/24290) +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Fix analyze failed when there are thousands of partitions. #24521](https://github.com/apache/doris/pull/24521) +- [Do not use enum as the data type for JavaUdfDataType. #24460](https://github.com/apache/doris/pull/24460) +- [Fix multi window projection issue temporarily #24568](https://github.com/apache/doris/pull/24568) +- [Make metadata compatible with 2.0.3 #24610](https://github.com/apache/doris/pull/24610) +- [Select outfile column order is wrong #24595](https://github.com/apache/doris/pull/24595) +- [Incorrect result of semi/anti mark join #24616](https://github.com/apache/doris/pull/24616) +- [Fix broker read issue #24635](https://github.com/apache/doris/pull/24635) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) +- [Fixed bug when parsing multi-character delimiters. 
#24572](https://github.com/apache/doris/pull/24572) +- [Fix timezone parse when there is no tzfile #24578](https://github.com/apache/doris/pull/24578) +- [We need to issue an error when starting FE without setting the Java home environment #23943](https://github.com/apache/doris/pull/23943) +- [Enable_unique_key_partial_update should be forwarded to master #24697](https://github.com/apache/doris/pull/24697) +- [Fix paimon file catalog meta issue and replication num analysis issue #24681](https://github.com/apache/doris/pull/24681) +- [Add more log for ingest_binlog && Fix ingest_binlog not rewrite rowset_meta tablet_uid #24617](https://github.com/apache/doris/pull/24617) +- [Do not abort when a disk is broken #24692](https://github.com/apache/doris/pull/24692) +- [colocate join could not work well on full outer join #24700](https://github.com/apache/doris/pull/24700) +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +- [Optimize the reading efficiency of nullable (string) columns. #24698](https://github.com/apache/doris/pull/24698) +- [Fix segment cache core when output rowset is nullptr #24778](https://github.com/apache/doris/pull/24778) +- [Fix duplicate key in schema change #24782](https://github.com/apache/doris/pull/24782) +- [Make metadata compatible for future version after 2.0.2 #24800](https://github.com/apache/doris/pull/24800) +- [Fix map/array deserialize string with quote pair #24808](https://github.com/apache/doris/pull/24808) +- [Failed on arm platform, with clang compiler and pch on, close #24633 #24636](https://github.com/apache/doris/pull/24636) +- [Table column order is changed if add a column and do truncate #24981](https://github.com/apache/doris/pull/24981) +- [Make parser mode coarse grained by default #24949](https://github.com/apache/doris/pull/24949) + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.2-merged+is%3Aclosed) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +[@adonis0147](https://github.com/adonis0147) [@airborne12](https://github.com/airborne12) [@amorynan](https://github.com/amorynan) [@AshinGau](https://github.com/AshinGau) [@BePPPower](https://github.com/BePPPower) [@BiteTheDDDDt](https://github.com/BiteTheDDDDt) [@bobhan1](https://github.com/bobhan1) [@ByteYue](https://github.com/ByteYue) [@caiconghui](https://github.com/caiconghui) [@CalvinKirs](https://github.com/CalvinKirs) [@cambyzju](https://github.com/cambyzju) [@ChengDaqi2023](https://github.com/ChengDaqi2023) [@ChinaYiGuan](https://github.com/ChinaYiGuan) [@CodeCooker17](https://github.com/CodeCooker17) [@csun5285](https://github.com/csun5285) [@dataroaring](https://github.com/dataroaring) [@deadlinefen](https://github.com/deadlinefen) [@DongLiang-0](https://github.com/DongLiang-0) [@Doris-Extras](https://github.com/Doris-Extras) [@dutyu](https://github.com/dutyu) [@eldenmoon](https://github.com/eldenmoon) [@englefly](https://github.com/englefly) [@freemandealer](https://github.com/freemandealer) [@Gabriel39](https://github.com/Gabriel39) [@gnehil](https://github.com/gnehil) [@GoGoWen](https://github.com/GoGoWen) [@gohalo](https://github.com/gohalo) [@HappenLee](https://github.com/HappenLee) [@hello-stephen](https://github.com/hello-stephen) [@HHoflittlefish777](https://github.com/HHoflittlefish777) [@hubgeter](https://github.com/hubgeter) [@hust-hhb](https://github.com/hust-hhb) [@ixzc](https://github.com/ixzc) [@JackDrogon](https://github.com/JackDrogon) [@jacktengg](https://github.com/jacktengg) [@jackwener](https://github.com/jackwener) [@Jibing-Li](https://github.com/Jibing-Li) [@JNSimba](https://github.com/JNSimba) [@kaijchen](https://github.com/kaijchen) [@kaka11chen](https://github.com/kaka11chen) [@Kikyou1997](https://github.com/Kikyou1997) [@Lchangliang](https://github.com/Lchangliang) [@LemonLiTree](https://github.com/LemonLiTree) [@liaoxin01](https://github.com/liaoxin01) 
[@LiBinfeng-01](https://github.com/LiBinfeng-01) [@liugddx](https://github.com/liugddx) [@luwei16](https://github.com/luwei16) [@mongo360](https://github.com/mongo360) [@morningman](https://github.com/morningman) [@morrySnow](https://github.com/morrySnow) @mrhhsg @Mryange @mymeiyi @neuyilan @pingchunzhang @platoneko @qidaye @realize096 @RYH61 @shuke987 @sohardforaname @starocean999 @SWJTU-ZhangLei @TangSiyang2001 @Tech-Circle-48 @w41ter @wangbo @wsjz @wuwenchi @wyx123654 @xiaokang @XieJiann @xinyiZzz @XuJianxu @xutaoustc @xy720 @xyfsjq @xzj7019 @yiguolei @yujun777 @Yukang-Lian @Yulei-Yang @zclllyybb @zddr @zhangguoqiang666 @zhangstar333 @ZhangYu0123 @zhannngchen @zxealous @zy-kkk @zzzxl1993 @zzzzzzzs diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md new file mode 100644 index 0000000000000..a716d6d711fb0 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.3.md @@ -0,0 +1,253 @@ +--- +{ + "title": "Release 2.0.3", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 1000 improvements and bug fixes have been made in Doris 2.0.3 version, including optimizer statistics, inverted index, complex datatypes, data lake, replica management. + + + +## 1 Behavior change + +- The output format of the complex data type array/map/struct has been changed to be consistent to the input format and JSON specification. The main changes from the previous version are that DATE/DATETIME and STRING/VARCHAR are enclosed in double quotes and null values inside ARRAY/MAP are displayed as `null` instead of `NULL`. + - https://github.com/apache/doris/pull/25946 +- SHOW_VIEW permission is supported. Users with SELECT or LOAD permission will no longer be able to execute the 'SHOW CREATE VIEW' statement and must be granted the SHOW_VIEW permission separately. 
+ - https://github.com/apache/doris/pull/25370 + + +## 2 New features + +### 2.1 Support collecting statistics for optimizer automatically + +Collecting statistics helps the optimizer understand the data distribution characteristics and choose a better plan to greatly improve query performance. It is officially supported starting from version 2.0.3 and, by default, is enabled throughout the day. + +### 2.2 Support complex datatypes for more data lake sources +- Support complex datatypes for JAVA UDF, JDBC and Hudi MOR + - https://github.com/apache/doris/pull/24810 + - https://github.com/apache/doris/pull/26236 +- Support complex datatypes for Paimon + - https://github.com/apache/doris/pull/25364 +- Support Paimon version 0.5 + - https://github.com/apache/doris/pull/24985 + + +### 2.3 Add more builtin functions +- Support the BitmapAgg function in new optimizer + - https://github.com/apache/doris/pull/25508 +- Support SHA series digest functions + - https://github.com/apache/doris/pull/24342 +- Support the BITMAP datatype in the aggregate functions min_by and max_by + - https://github.com/apache/doris/pull/25430 +- Add milliseconds/microseconds_add/sub/diff functions + - https://github.com/apache/doris/pull/24114 +- Add some json functions: json_insert, json_replace, json_set + - https://github.com/apache/doris/pull/24384 + + +## 3 Improvement and optimizations + +### 3.1 Performance optimizations + +- When an inverted index MATCH WHERE condition with a high filter rate is combined with a common WHERE condition with a low filter rate, the I/O of the index column is greatly reduced. +- Optimize the efficiency of random data access after the WHERE filter. +- Optimize the performance of the old get_json_xx function on JSON data types by 2~4x. +- Support a configuration to reduce the priority of the data read thread, ensuring CPU resources for real-time writing. 
+- Add the `uuid_numeric` function, which returns largeint and is 20 times faster than the `uuid` function, which returns string. +- Optimize the performance of `case when` by 3x. +- Cut out unnecessary predicate calculations in storage engine execution. +- Accelerate count performance by pushing down the count operator to the storage tier. +- Optimize the computation performance of nullable types in `and`/`or` expressions. +- Support rewriting the limit operator before `join` in more scenarios to improve query performance. +- Eliminate useless `order by` operators from inline views to improve query performance. +- Optimize the accuracy of cardinality estimates and cost models in some cases. +- Optimize the jdbc catalog predicate pushdown logic. +- Optimize the read efficiency of the file cache when it is enabled for the first time. +- Optimize the hive table sql cache policy and use the partition update time stored in HMS to improve the cache hit ratio. +- Optimize mow compaction efficiency. +- Optimize thread allocation logic for external table queries to reduce memory usage. +- Optimize memory usage for column reader. + + + +### 3.2 Distributed replica management improvements + +Distributed replica management improvements include skipping partition deletion, colocate group deletion, balance failure due to continuous write, and hot and cold separation table balance. + + +### 3.3 Security enhancement +- The audit log plug-in uses a token instead of a plaintext password to enhance security + - https://github.com/apache/doris/pull/26278 +- Security enhancements for the log4j configuration + - https://github.com/apache/doris/pull/24861 +- Sensitive user information is not displayed in logs + - https://github.com/apache/doris/pull/26912 + + +## 4 Bugfix and stability + +### 4.1 Complex datatypes +- Fix issues that fixed-length CHAR(n) was not truncated correctly in map/struct. 
+ - https://github.com/apache/doris/pull/25725 +- Fix write failure for struct datatype nested in map/array + - https://github.com/apache/doris/pull/26973 +- Fix the issue that count distinct did not support array/map/struct + - https://github.com/apache/doris/pull/25483 +- Fix BE crash in updating to 2.0.3 after the delete complex type appeared in query + - https://github.com/apache/doris/pull/26006 +- Fix BE crash when JSON datatype is in WHERE clause. + - https://github.com/apache/doris/pull/27325 +- Fix BE crash when ARRAY datatype is in OUTER JOIN clause. + - https://github.com/apache/doris/pull/25669 +- Fix reading incorrect result for DECIMAL datatype in ORC format. + - https://github.com/apache/doris/pull/26548 + - https://github.com/apache/doris/pull/25977 + - https://github.com/apache/doris/pull/26633 + +### 4.2 Inverted index +- Fix incorrect results for OR NOT combinations in the WHERE clause when inverted index query is disabled. + - https://github.com/apache/doris/pull/26327 +- Fix be crash when write a empty with inverted index + - https://github.com/apache/doris/pull/25984 +- Fix BE crash in index compaction when the output of compaction is empty. + - https://github.com/apache/doris/pull/25486 +- Fix BE crash when adding an inverted index while no data has been written to the newly added column. +- Fix BE crash when BUILD INDEX after ADD COLUMN without new data written. + - https://github.com/apache/doris/pull/27276 +- Fix missing and leak problem of hardlink for inverted index file. 
+ - https://github.com/apache/doris/pull/26903 +- Fix index file corruption when the disk is temporarily full + - https://github.com/apache/doris/pull/28191 +- Fix incorrect result due to optimization for skip reading index column + - https://github.com/apache/doris/pull/28104 + +### 4.3 Materialized View +- Fix the problem of BE crash caused by repeated expressions in the group by statement +- Fix BE crash when there are duplicate expressions in `group by` statements. + - https://github.com/apache/doris/pull/27523 +- Disable the float/double type in the `group by` clause when a view is created. + - https://github.com/apache/doris/pull/25823 +- Improve the function of select query matching materialized view + - https://github.com/apache/doris/pull/24691 +- Fix an issue that materialized views could not be matched when a table alias was used + - https://github.com/apache/doris/pull/25321 +- Fix the problem using percentile_approx when creating materialized views + - https://github.com/apache/doris/pull/26528 + +### 4.4 Table sample +- Fix the problem that table sample queries do not work on tables with partitions. + - https://github.com/apache/doris/pull/25912 +- Fix the problem that table sample queries do not work when a tablet is specified. + - https://github.com/apache/doris/pull/25378 + + +### 4.5 Unique with merge on write +- Fix null pointer exception in conditional update based on primary key + - https://github.com/apache/doris/pull/26881 +- Fix field name capitalization issues in partial update + - https://github.com/apache/doris/pull/27223 +- Fix duplicate keys occurring in mow during schema change repair. + - https://github.com/apache/doris/pull/25705 + + +### 4.6 Load and compaction +- Fix unknown slot descriptor error in routine load for running multiple tables + - https://github.com/apache/doris/pull/25762 +- Fix BE crash due to concurrent memory access when calculating memory + - https://github.com/apache/doris/pull/27101 +- Fix BE crash on duplicate cancel for load. 
+ - https://github.com/apache/doris/pull/27111 +- Fix broker connection error during broker load + - https://github.com/apache/doris/pull/26050 +- Fix incorrect results from delete predicates when compaction and scan run concurrently. + - https://github.com/apache/doris/pull/24638 +- Fix the problem that compaction tasks would print too many stacktrace logs + - https://github.com/apache/doris/pull/25597 + + +### 4.7 Data Lake compatibility +- Fix query failures on iceberg tables that contain special characters + - https://github.com/apache/doris/pull/27108 +- Fix compatibility issues of different hive metastore versions + - https://github.com/apache/doris/pull/27327 +- Fix an error reading max compute partition table + - https://github.com/apache/doris/pull/24911 +- Fix the issue that backup to object storage failed + - https://github.com/apache/doris/pull/25496 + - https://github.com/apache/doris/pull/25803 + + +### 4.8 JDBC external table compatibility + +- Fix Oracle date type format error in jdbc catalog + - https://github.com/apache/doris/pull/25487 +- Fix MySQL 0000-00-00 date exception in jdbc catalog + - https://github.com/apache/doris/pull/26569 +- Fix an exception in reading data from MariaDB where the default value of the time type is current_timestamp + - https://github.com/apache/doris/pull/25016 +- Fix BE crash when processing BITMAP datatype in jdbc catalog + - https://github.com/apache/doris/pull/25034 + - https://github.com/apache/doris/pull/26933 + + +### 4.9 SQL Planner and Optimizer + +- Fix partition prune error in some scenarios + - https://github.com/apache/doris/pull/27047 + - https://github.com/apache/doris/pull/26873 + - https://github.com/apache/doris/pull/25769 + - https://github.com/apache/doris/pull/27636 + +- Fix incorrect sub-query processing in some scenarios + - https://github.com/apache/doris/pull/26034 + - https://github.com/apache/doris/pull/25492 + - https://github.com/apache/doris/pull/25955 + - 
https://github.com/apache/doris/pull/27177 + +- Fix some semantic parsing errors + - https://github.com/apache/doris/pull/24928 + - https://github.com/apache/doris/pull/25627 + +- Fix data loss during right outer/anti join + - https://github.com/apache/doris/pull/26529 + +- Fix incorrect pushing down of predicates past aggregation operators. + - https://github.com/apache/doris/pull/25525 + +- Fix incorrect result header in some cases + - https://github.com/apache/doris/pull/25372 + +- Fix incorrect plan when the nullsafeEquals expression (<=>) is used as the join condition + - https://github.com/apache/doris/pull/27127 + +- Fix incorrect column pruning in set operation operators. + - https://github.com/apache/doris/pull/26884 + + +### Others + +- Fix BE crash when the order of columns in a table is changed and then upgraded to 2.0.3. + - https://github.com/apache/doris/pull/28205 + + +See the complete list of improvements and bug fixes on [GitHub dev/2.0.3-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.3-merged+is%3Aclosed). diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md new file mode 100644 index 0000000000000..e1dac58fbf69a --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.4.md @@ -0,0 +1,67 @@ +--- +{ + "title": "Release 2.0.4", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 333 improvements and bug fixes have been made in Doris 2.0.4. 
+ +**Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- More reasonable and accurate precision and scale inference for decimal data type + - [https://github.com/apache/doris/pull/28034](https://github.com/apache/doris/pull/28034) + +- Support drop policy for user or role + - [https://github.com/apache/doris/pull/29488](https://github.com/apache/doris/pull/29488) + +## New features + +- Support datev1, datetimev1 and decimalv2 datatypes in new optimizer Nereids. +- Support ODBC table for new optimizer Nereids. +- Add `lower_case` and `ignore_above` options for inverted index +- Support `match_regexp` and `match_phrase_prefix` optimization by inverted index +- Support paimon native reader in datalake +- Support audit-log for `insert into` SQL +- Support reading parquet file in lzo compressed format + +## Improvement and optimizations + +- Improve storage management including balance, migration, publish and others. +- Improve storage cooldown policy to save disk space. +- Performance optimization for substr with ascii string. +- Improve partition prune when date function is used. +- Improve auto analyze visibility and performance. 
+ +See the complete list of improvements and bug fixes on github [dev/2.0.4-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.4-merged+is%3Aclosed) + + + +## Credits +Last but not least, this release would not have been possible without the following contributors: + +airborne12, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, bobhan1, ByteYue, caiconghui,CalvinKirs, cambyzju, caoliang-web, catpineapple, csun5285, dataroaring, deardeng, dutyu, eldenmoon, englefly, feifeifeimoon, fornaix, Gabriel39, gnehil, HappenLee, hello-stephen, HHoflittlefish777,hubgeter, hust-hhb, ixzc, jacktengg, jackwener, Jibing-Li, kaka11chen, KassieZ, LemonLiTree,liaoxin01, LiBinfeng-01, lihuigang, liugddx, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, Nitin-Kashyap, platoneko, py023, qidaye, shuke987, starocean999, SWJTU-ZhangLei, w41ter, wangbo, wsjz, wuwenchi, Xiaoccer, xiaokang, XieJiann, xingyingone, xinyiZzz, xuwei0912, xy720, xzj7019, yujun777, zclllyybb, zddr, zhangguoqiang666, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md new file mode 100644 index 0000000000000..20d6bd9302b2c --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.5.md @@ -0,0 +1,73 @@ +--- +{ + "title": "Release 2.0.5", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 217 improvements and bug fixes have been made in Doris 2.0.5 version. 
+ +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- Change char function behaviour: `select char(0) = '\0'` returns true, as in MySQL + - https://github.com/apache/doris/pull/30034 +- Allow exporting empty data + - https://github.com/apache/doris/pull/30703 + +## New features +- Eliminate left outer join with `is null` condition +- Add `show-tablets-belong` stmt for analyzing a batch of tablet-ids +- InferPredicates support In, such as `a = b & a in [1, 2] -> b in [1, 2]` +- Optimize plan when column stats are unavailable +- Optimize plan using rollup column stats +- Support analyze materialized view +- Support ShowProcessStmt Show all FE connection + +## Improvement and optimizations +- Optimize query plan when column stats are unavailable +- Optimize query plan using rollup column stats +- Stop analyze quickly after the user closes auto analyze +- Catch load column stats exception, avoid printing too much stack info to fe.out +- Select materialized view by specifying the view name in SQL +- Change auto analyze max table width default value to 100 +- Escape characters for columns in recovery predicate pushdown in JDBC Catalog +- Fix JDBC MYSQL Catalog `to_date` function pushdown +- Optimize the close logic of JDBC client +- Optimize JDBC connection pool parameter settings +- Obtain hudi partition information through HMS's API +- Optimize routine load job error msg and memory +- Skip all backup/restore jobs if max allowed option is set to 0 + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.4-rc06...2.0.5-rc02). 
+ + +## Credits +Thanks all who contribute to this release: + +airborne12, alexxing662, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, ByteYue, caiconghui, cambyzju, catpineapple, dataroaring, eldenmoon, Emor-nj, englefly, felixwluo, GoGoWen, HappenLee, hello-stephen, HHoflittlefish777, HowardQin, JackDrogon, jacktengg, jackwener, Jibing-Li, KassieZ, LemonLiTree, liaoxin01, liugddx, LuGuangming, morningman, morrySnow, mrhhsg, Mryange, mymeiyi, nextdreamblue, qidaye, ryanzryu, seawinde,starocean999, TangSiyang2001, vinlee19, w41ter, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xingyingone, xy720,xzj7019, yujun777, zclllyybb, zhangstar333, zhannngchen, zhiqiang-hhhh, zxealous, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md new file mode 100644 index 0000000000000..9591ed8d3fab8 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.6.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.6", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 114 improvements and bug fixes have been created by 51 contributors in Doris 2.0.6 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- N/A + +## New features +- Support match a function with alias in materialized-view +- Add a command to drop a tablet replica safely on backend +- Add row count cache for external table. 
+- Support analyze rollup to gather statistics for optimizer + +## Improvement and optimizations +- Improve tablet schema cache memory by using deterministic way to serialize protobuf +- Improve show column stats performance +- Support estimate row count for iceberg and paimon +- Support sqlserver timestamp type read for JDBC catalog + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.5-rc02...2.0.6). + + +## Credits +Thanks all who contribute to this release: + +924060929, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, deardeng, DongLiang-0, eldenmoon, englefly, feelshana, feiniaofeiafei, felixwluo, HappenLee, hust-hhb, iwanttobepowerful, ixzc, JackDrogon, Jibing-Li, KassieZ, larshelge, liaoxin01, LiBinfeng-01, liutang123, luennng, morningman, morrySnow, mrhhsg, qidaye, starocean999, TangSiyang2001, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xuwei0912, xy720, xzj7019, yiguolei, yujun777, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md new file mode 100644 index 0000000000000..10f226dbd63b4 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.7.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 2.0.7", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 80 improvements and bug fixes have been made in Doris 2.0.7 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +- `round` function defaults to rounding normally as MySQL, eg. round(5/2) return 3 instead of 2. 
+ + - https://github.com/apache/doris/pull/31583 + +- `round` datetime with scale from string literal as MySQL, eg. round '2023-10-12 14:31:49.666' to '2023-10-12 14:31:50' . + + - https://github.com/apache/doris/pull/27965 + + +## 2 New features +- Support make miss slot as null alias when converting outer join to anti join to speed up query + + - https://github.com/apache/doris/pull/31854 + +- Enable proxy protocol to support IP transparency for Nginx and HAProxy. + + - https://github.com/apache/doris/pull/32338 + + +## 3 Improvement and optimizations + +- Add DEFAULT_ENCRYPTION column in `information_schema` table and add `processlist` table for better compatibility for BI tools + +- Automatically test connectivity by default when creating a JDBC Catalog. + +- Enhance auto resume to keep routine load stable + +- Use lowercase by default for Chinese tokenizer in inverted index + +- Add error msg if exceeded maximum default value in repeat function + +- Skip hidden file and dir in Hive table + +- Reduce file meta cache size and disable cache for some cases to avoid OOM + +- Reduce jvm heap memory consumed by profiles of BrokerLoadJob + +- Remove sort which is under table sink to speed up query like `INSERT INTO t1 SELECT * FROM t2 ORDER BY k`. + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.6...2.0.7) . 
+ + +## 4 Credits + +Thanks all who contribute to this release: + +924060929,airborne12,amorynan,ByteYue,dataroaring,deardeng,feiniaofeiafei,felixwluo,freemandealer,gavinchou,hello-stephen,HHoflittlefish777,jacktengg,jackwener,jeffreys-cat,Jibing-Li,KassieZ,LiBinfeng-01,luwei16,morningman,mrhhsg,Mryange,nextdreamblue,platoneko,qidaye,rohitrs1983,seawinde,shuke987,starocean999,SWJTU-ZhangLei,w41ter,wsjz,wuwenchi,xiaokang,XieJiann,XuJianxu,yujun777,Yulei-Yang,zhangstar333,zhiqiang-hhhh,zy-kkk,zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md new file mode 100644 index 0000000000000..d881a80628b44 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.8.md @@ -0,0 +1,76 @@ +--- +{ + "title": "Release 2.0.8", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 65 improvements and bug fixes have been made in Doris 2.0.8 version. + +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## 1 Behavior change + +The `ADMIN SHOW` statement can not be executed with high version of MySQL 8.x jdbc driver. So rename these statement, remove the `ADMIN` keywords. 
+ +- https://github.com/apache/doris/pull/29492 + +```sql +ADMIN SHOW CONFIG -> SHOW CONFIG +ADMIN SHOW REPLICA -> SHOW REPLICA +ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS +ADMIN SHOW TABLET -> SHOW TABLET +``` + + +## 2 New features + +N/A + + + +## 3 Improvement and optimizations + +- Make Inverted Index work with TopN opt in Nereids + +- Limit the max string length to 1024 while collecting column stats to control BE memory usage + +- JDBC Catalog close when JDBC client is not empty + +- Accept all Iceberg database and do not check the name format of database + +- Refresh external table's rowcount async to avoid cache miss and unstable query plan + +- Simplify the isSplitable method of hive external table to avoid too many hadoop metrics + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.7...2.0.8) . + +## 4 Credits + +Thanks all who contribute to this release: + +924060929, AcKing-Sam, amorynan, AshinGau, BePPPower, BiteTheDDDDt, ByteYue, cambyzju, dongsilun, eldenmoon, feiniaofeiafei, gnehil, Jibing-Li, liaoxin01, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, platoneko, starocean999, SWJTU-ZhangLei, wuwenchi, xiaokang, xinyiZzz, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhiqiang-hhhh, ziyanTOP, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md new file mode 100644 index 0000000000000..04048fc060461 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.0/release-2.0.9.md @@ -0,0 +1,75 @@ +--- +{ + "title": "Release 2.0.9", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 68 improvements and bug fixes have been made in Doris 2.0.9 version. 
+ +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +NA + +## 2 New features + +- Support predicates that appear on both key and value mv columns + +- Support mv with `bitmap_union(bitmap_from_array())` + +- Add a FE config to force replicate allocation for OLAP tables in the cluster + +- Support timezone in date literals in new optimizer Nereids + +- Support slop in fulltext search `match_phrase` to specify word distance + +- Show index id in `SHOW PROC INDEXES` + +## 3 Improvement and optimizations + +- Add a secondary argument in `first_value` / `last_value` to ignore NULL values + +- Allow 0 as the offset param in `LEAD` / `LAG` functions + +- Adjust priority of materialized view match rule + +- TopN opt reads only limit number of records for better performance + +- Add profile for delete_bitmap get_agg function + +- Refine the Meta cache to get better performance + +- Add FE config `autobucket_max_buckets` + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.8...2.0.9) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +adonis0147, airborne12, amorynan, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, csun5285, eldenmoon, englefly, feiniaofeiafei, HHoflittlefish777, htyoung, hust-hhb, jackwener, Jibing-Li, kaijchen, kylinmac, liaoxin01, luwei16, morningman, mrhhsg, qidaye, starocean999, SWJTU-ZhangLei, w41ter, xiaokang, xiedeyantu, xy720, zclllyybb, zhangstar333, zhannngchen, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md new file mode 100644 index 0000000000000..b0b88f715ee51 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.0.md @@ -0,0 +1,159 @@ +--- +{ + "title": "Release 2.1.0", + "language": "en" +} +--- + + + +Dear community, we are pleased to share with you the official release of Apache Doris 2.1.0, now available for download and use as of March 8th. This latest version marks a significant milestone in our journey towards enhancing data analysis capabilities, particularly for handling massive and complex datasets. + +With Doris 2.1.0, our primary focus has been on optimizing analysis performance, and the results speak for themselves. We have achieved an impressive performance improvement of over 100% on the TPC-DS 1TB test dataset, making Apache Doris more capable of challenging real-world business scenarios. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Performance improvement + +### Smarter optimizer + +On the basis of V2.0, the query optimizer in Doris V2.1 comes with enhanced statistics-based inference and enumeration framework. 
We have upgraded the cost model and expanded the optimization rules to serve the needs of more use cases + +### Better heuristic optimization + +For data analytics at scale or data lake scenarios, Doris V2.1 provides better heuristic query plans. Meanwhile, the RuntimeFilter is more self-adaptive to enable higher performance even without statistical information. + +### Parallel adaptive scan + +Doris V2.1 has adopted parallel adaptive scan to optimize scan I/O and thus improve query performance. It can avoid the negative impact of unreasonable numbers of buckets. (This feature is currently available on the Duplicate Key model and Merge-on-Write Unique Key model.) + +### Local shuffle + +We have introduced Local Shuffle to prevent uneven data distribution. Benchmark tests show that Local Shuffle in combination with Parallel Adaptive Scan can guarantee fast query performance in spite of unreasonable bucket number settings upon table creation. + +### Faster INSERT INTO SELECT + +To further improve the performance of INSERT INTO SELECT, which is a frequent operation in ETL, we have moved forward the MemTable execution-wise to reduce data ingestion overheads. Tests show that this can double the data ingestion speed in most cases compared to V2.0. +Improved data lake analytics capabilities + +## Data lake analytic performance + +### TPC-DS Benchmark + +According to TPC-DS benchmark tests (1TB) of Doris V2.1 against Trino, + +- Without caching, the total execution time of Doris is 56% of that of Trino V435. (717s VS 1296s) +- Enabling file cache can further increase the overall performance of Doris by 2.2 times. (323s) + This is achieved by a series of optimizations in I/O, parquet/ORC file reading, predicate pushdown, caching, and scan task scheduling, etc. + +### SQL dialects compatibility + +To facilitate migration to Doris and increase its compatibility with other DBMS, we have enabled SQL dialect conversion in V2.1. 
([read more](../../lakehouse/sql-dialect)) For example, by setting sql_dialect = "trino" in Doris, you can use the Trino SQL dialect as you're used to, without modifying your current business logic, and Doris will execute the corresponding queries for you. Tests in user production environments show that Doris V2.1 is compatible with 99% of Trino SQL. + +### Arrow Flight SQL protocol + +As a column-oriented database compatible with MySQL 8.0 protocol, Doris V2.1 now supports the Arrow Flight SQL protocol as well so users can have fast access to Doris data via Pandas/NumPy without data serialization and deserialization. For most common data types, the Arrow Flight protocol enables tens of times faster performance than the MySQL protocol. + +## Asynchronous materialized view + +V2.1 allows creating a materialized view based on multiple tables. This feature currently supports: + +- Transparent rewriting: supports transparent rewriting of common operators including Select, Where, Join, Group By, and Aggregation. +- Auto refresh: supports regular refresh, manual refresh, full refresh, incremental refresh, and partition-based refresh. +- Materialized view of external tables: supports materialized views based on external data tables such as those on Hive, Hudi, and Iceberg; supports synchronizing data from data lakes into Doris internal tables via materialized views. +- Direct query on materialized views: Materialized views can be regarded as the result set after ETL. In this sense, materialized views are data tables, so users can conduct queries on them directly. + +## Enhanced storage + +### Auto-increment column + +V2.1 supports auto-increment columns, which can ensure data uniqueness of each row. This lays the foundation for efficient dictionary encoding and query pagination. For example, for precise UV calculation and customer grouping, users often apply the bitmap type in Doris, the process of which entails dictionary encoding. 
With V2.1, users can first create a dictionary table using the auto-increment column, and then simply load user data into it. + +### Auto partition + +To further relieve the burden of operation and maintenance, V2.1 allows auto data partitioning. Upon data ingestion, it detects whether a partition exists for the data based on the partitioning column. If not, it automatically creates one and starts data ingestion. + +### High-concurrency real-time data ingestion + +For data writing, a back pressure mechanism is in place to avoid excessive data versions, so as to reduce resource consumption by data version merging. In addition, V2.1 supports group commit ([read more](../../data-operate/import/import-way/group-commit-manual)), which means accumulating multiple writes and committing them as one. Benchmark tests on group commit with JDBC ingestion and the Stream Load method present great results. + +## Semi-structured data analysis + +### A new data type: Variant + +V2.1 supports a new data type named Variant. It can accommodate semi-structured data such as JSON as well as compound data types that contain integers, strings, booleans, etc. Users don't have to pre-define the exact data types for a Variant column in the table schema. The Variant type is handy when processing nested data structures. +You can include Variant columns and static columns with pre-defined data types in the same table. This will provide you with more flexibility in storage and queries. +Tests with ClickBench datasets prove that data in Variant columns takes up the same storage space as data in static columns, which is half of that in JSON format. In terms of query performance, the Variant type enables 8 times higher query speed than JSON in hot runs and even more in cold runs. + +### IP types + +Doris V2.1 provides native support for IPv4 and IPv6. It stores IP data in binary format, which cuts down storage space usage by 60% compared to IP strings in plain text. 
Along with these IP types, we have added over 20 functions for IP data processing. + +### More powerful functions for compound data types + +- explode_map: supports exploding rows into columns for the Map data type. +- Supports the STRUCT data type in the IN predicates + +## Workload Management + +### Hard isolation of resources + +On the basis of the Workload Group mechanism, which imposes a soft limit on the resources that a workload group can use, Doris 2.1 introduces a hard limit on CPU resource consumption for workload groups as a way to ensure higher stability in query performance. + +### TopSQL + +V2.1 allows users to check the most resource-consuming SQL queries in the runtime. This can be a big help when handling cluster load spike caused by unexpected large queries. + + +## Others + +### Decimal 256 + +For users in the financial sector or high-end manufacturing, V2.1 supports a high-precision data type: Decimal, which supports up to 76 significant digits (an experimental feature, please set enable_decimal256=true.) + +### Job scheduler + +V2.1 provides a good option for regular task scheduling: Doris Job Scheduler. It can trigger the pre-defined operations on schedule or at fixed intervals. The Doris Job Scheduler is accurate to the second. It provides consistency guarantee for data writing, high efficiency and flexibility, high-performance processing queues, retraceable scheduling records, and high availability of jobs. + +### Support Docker fast start to experience the new version + +Starting from version 2.1.0, we will provide a separate Docker Image to support the rapid creation of a 1FE, 1BE Docker container to experience the new version of Doris. The container will complete the initialization of FE and BE, BE registration and other steps by default. 
After the container is created, the Doris cluster can be accessed and used directly in about 1 minute. In this image version, the default `max_map_count`, `ulimit`, `Swap` and other hard limits are removed. It supports X64 (avx2) machines and ARM machines for deployment. The default open ports are 8000, 8030, 8040, 9030. If you need to experience the Broker component, you can add the environment variable `--env BROKER=true` at startup to start the Broker process synchronously. After startup, it will automatically complete the registration. The Broker name is `test`. + +Please note that this version is only suitable for quick experience and functional testing, not for production environments! + +## Behavior changed + +- The default data model is the Merge-on-Write Unique Key model. enable_unique_key_merge_on_write will be included as a default setting when a table is created in the Unique Key model. +- As inverted index has proven to be more performant than bitmap index, V2.1 stops supporting bitmap index. Existing bitmap indexes will remain effective but new creation is not allowed. We will remove bitmap index-related code in the future. +- cpu_resource_limit is no longer supported. It is to put a limit on the number of scanner threads on Doris BE. Since the workload group mechanism also supports such settings, the already configured cpu_resource_limit will be invalid. +- The default value of enable_segcompaction is true. This means Doris supports compaction of multiple segments in the same rowset. +- Audit log plug-in + - Since V2.1.0, Doris has a built-in audit log plug-in. Users can simply enable or disable it by setting the enable_audit_plugin parameter. + - If you have already installed your own audit log plug-in, you can either continue using it after upgrading to Doris V2.1, or uninstall it and use the one in Doris. Please note that the audit log table will be relocated after switching plug-in. 
+ - For more details, please see the [docs](../../admin-manual/audit-plugin). + + +## Credits +Thanks all who contribute to this release: + +467887319, 924060929, acnot, airborne12, AKIRA, alan_rodriguez, AlexYue, allenhooo, amory, amory, AshinGau, beat4ocean, BePPPower, bigben0204, bingquanzhao, BirdAmosBird, BiteTheDDDDt, bobhan1, caiconghui, camby, camby, CanGuan, caoliang-web, catpineapple, Centurybbx, chen, ChengDaqi2023, ChenyangSunChenyang, Chester, ChinaYiGuan, ChouGavinChou, chunping, colagy, CSTGluigi, czzmmc, daidai, dalong, dataroaring, DeadlineFen, DeadlineFen, deadlinefen, deardeng, didiaode18, DongLiang-0, dong-shuai, Doris-Extras, Dragonliu2018, DrogonJackDrogon, DuanXujianDuan, DuRipeng, dutyu, echo-dundun, ElvinWei, englefly, Euporia, feelshana, feifeifeimoon, feiniaofeiafei, felixwluo, figurant, flynn, fornaix, FreeOnePlus, Gabriel39, gitccl, gnehil, GoGoWen, gohalo, guardcrystal, hammer, HappenLee, HB, hechao, HelgeLarsHelge, herry2038, HeZhangJianHe, HHoflittlefish777, HonestManXin, hongkun-Shao, HowardQin, hqx871, httpshirley, htyoung, huanghaibin, HuJerryHu, HuZhiyuHu, Hyman-zhao, i78086, irenesrl, ixzc, jacktengg, jacktengg, jackwener, jayhua, Jeffrey, jiafeng.zhang, Jibing-Li, JingDas, julic20s, kaijchen, kaka11chen, KassieZ, kindred77, KirsCalvinKirs, KirsCalvinKirs, kkop, koarz, LemonLiTree, LHG41278, liaoxin01, LiBinfeng-01, LiChuangLi, LiDongyangLi, Lightman, lihangyu, lihuigang, LingAdonisLing, liugddx, LiuGuangdongLiu, LiuHongLiu, liuJiwenliu, LiuLijiaLiu, lsy3993, LuGuangmingLu, LuoMetaLuo, luozenglin, Luwei, Luzhijing, lxliyou001, Ma1oneZhang, mch_ucchi, Miaohongkai, morningman, morrySnow, Mryange, mymeiyi, nanfeng, nanfeng, Nitin-Kashyap, PaiVallishPai, Petrichor, plat1ko, py023, q763562998, qidaye, QiHouliangQi, ranxiang327, realize096, rohitrs1983, sdhzwc, seawinde, seuhezhiqiang, seuhezhiqiang, shee, shuke987, shysnow, songguangfan, Stalary, starocean999, SunChenyangSun, sunny, SWJTU-ZhangLei, TangSiyang2001, Tanya-W, taoxutao, 
Uniqueyou, vhwzIs, walter, walter, wangbo, Wanghuan, wangqt, wangtao, wangtianyi2004, wenluowen, whuxingying, wsjz, wudi, wudongliang, wuwenchihdu, wyx123654, xiangran0327, Xiaocc, XiaoChangmingXiao, xiaokang, XieJiann, Xinxing, xiongjx, xuefengze, xueweizhang, XueYuhai, XuJianxu, xuke-hat, xy, xy720, xyfsjq, xzj7019, yagagagaga, yangshijie, YangYAN, yiguolei, yiguolei, yimeng, YinShaowenYin, Yoko, yongjinhou, ytwp, yuanyuan8983, yujian, yujun777, Yukang-Lian, Yulei-Yang, yuxuan-luo, zclllyybb, ZenoYang, zfr95, zgxme, zhangdong, zhangguoqiang, zhangstar333, zhangstar333, zhangy5, ZhangYu0123, zhannngchen, ZhaoLongZhao, zhaoshuo, zhengyu, zhiqqqq, ZhongJinHacker, ZhuArmandoZhu, zlw5307, ZouXinyiZou, zxealous, zy-kkk, zzwwhh, zzzxl1993, zzzzzzzs diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md new file mode 100644 index 0000000000000..384bccdceb414 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.1.md @@ -0,0 +1,251 @@ +--- +{ + "title": "Release 2.1.1", + "language": "en" +} +--- + + + +Dear community members, Apache Doris 2.1.1 has been officially released on April 3, 2024, with several enhancements and bug fixes based on 2.1.0, enabling smoother user experience. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## Behavior Changed + +1. Change float type output format to improve float type serialization performance. + +- https://github.com/apache/doris/pull/32049 + +2. Change system table value functions active_queries(), workload_groups() to system tables. + +- https://github.com/apache/doris/pull/32314 + +3. Disable show query/load profile stmt because there are not so many developers use it and the pipeline and pipelinex engine not support it. 
+ +- https://github.com/apache/doris/pull/32467 + +4. Upgrade arrow flight version to 15.0.2 to fix some bugs, so please use ADBC version 15.0.2 to access Doris. + +- https://github.com/apache/doris/pull/32827. + + + +## Upgrade Problem + +1. BE will core during rolling upgrade from 2.0.x to 2.1.x + +- https://github.com/apache/doris/pull/32672 + +- https://github.com/apache/doris/pull/32444 + +- https://github.com/apache/doris/pull/32162 + +2. JDBC Catalog will have query errors during rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32618 + + + +## New Feature + +1. Enable column auth by default. + +- https://github.com/apache/doris/pull/32659 + + +2. Get correct cores for pipeline and pipelinex engine when running within docker or k8s. + +- https://github.com/apache/doris/pull/32370 + +3. Support reading parquet int96 type. + +- https://github.com/apache/doris/pull/32394 + +4. Enable proxy protocol to support IP transparency. Using this protocol, IP transparency for load balancing can be achieved, so that after load balancing, Doris can still obtain the client's real IP and implement permission control such as whitelisting. + +- https://github.com/apache/doris/pull/32338/files + +5. Add workload group queue related columns for active_queries system table. Users can use this system table to monitor the workload queue usage. + +- https://github.com/apache/doris/pull/32259 + +6. Add new system table backend_active_tasks to monitor the realtime query statistics on every BE. + +- https://github.com/apache/doris/pull/31945 + +7. Add ipv4 and ipv6 support for spark-doris connector. + +- https://github.com/apache/doris/pull/32240 + +8. Add inverted index support for CCR. + +- https://github.com/apache/doris/pull/32101 + +9. Support select experimental session variable. + +- https://github.com/apache/doris/pull/31837 + +10. Support materialized view with bitmap_union(bitmap_from_array()) case. + +- https://github.com/apache/doris/pull/31962 + +11. 
Support partition prune for `__HIVE_DEFAULT_PARTITION__`. + +- https://github.com/apache/doris/pull/31736 + +12. Support function in set variable statement. + +- https://github.com/apache/doris/pull/32492 + +13. Support arrow serialization for variant type. + +- https://github.com/apache/doris/pull/32809 + + + +## Optimization + +1. Auto resume routine load when BE restarts or during upgrade, and keep the routine load stable. + +- https://github.com/apache/doris/pull/32239 + +2. Routine Load: optimize the algorithm for allocating tasks to BEs for load balance. + +- https://github.com/apache/doris/pull/32021 + +3. Spark Load: update spark version for spark load to resolve cve problem. + +- https://github.com/apache/doris/pull/30368 + +4. Skip cooldown if the tablet is dropped. + +- https://github.com/apache/doris/pull/32079 + +5. Support using workload group to manage routine load. + +- https://github.com/apache/doris/pull/31671 + +6. [MTMV] Improve the performance for query rewriting by materialized view. + +- https://github.com/apache/doris/pull/31886 + +7. Reduce jvm heap memory consumed by profiles of BrokerLoadJob. + +- https://github.com/apache/doris/pull/31985 +8. Improve high-QPS query performance by speeding up PartitionPrunner. + +- https://github.com/apache/doris/pull/31970 + +9. Reduce duplicated memory consumption for column name and column path for schema cache. + +- https://github.com/apache/doris/pull/31141 + +10. Support more join types for query rewriting by materialized view such as INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, LEFT SEMI JOIN, RIGHT SEMI JOIN, LEFT ANTI JOIN, RIGHT ANTI JOIN + +- https://github.com/apache/doris/pull/32909 + + + +## Bugfix + + +1. Do not push down topn-filter through right/full outer join if the first orderkey is nulls first. + +- https://github.com/apache/doris/pull/32633 + +2. Fix memory leak in Java UDF + +- https://github.com/apache/doris/pull/32630 + +3. 
If some odbc tables use the same resource and not all odbc tables are restored, the resource will not be retained. +and check some conf for backup/restore + +- https://github.com/apache/doris/pull/31989 + +4. Fold constant will core for variant type. + +- https://github.com/apache/doris/pull/32265 + +5. Routine load will pause when transaction fails in some cases. + +- https://github.com/apache/doris/pull/32638 + +6. The result of left semi join with empty right side should be false instead of null. + +- https://github.com/apache/doris/pull/32477 + +7. Fix core when building inverted index for a new column with no data. + +- https://github.com/apache/doris/pull/32669 + +8. Fix BE core caused by null-safe-equal join. + +- https://github.com/apache/doris/pull/32623 + +9. Partial update: fix data correctness risk when loading delete sign data into a table with sequence col. + +- https://github.com/apache/doris/pull/32574 + +10. Select outfile: Fix the column type mapping in the orc/parquet file format. + +- https://github.com/apache/doris/pull/32281 + +11. Fix BE core during restore stage. + +- https://github.com/apache/doris/pull/32489 + +12. Using array_agg func after other agg funcs like count, sum may make BE core. + +- https://github.com/apache/doris/pull/32387 + +13. Variant type should always be nullable, or there will be some bugs. + +- https://github.com/apache/doris/pull/32248 + +14. Fix the bug of handling empty blocks in schema change. + +- https://github.com/apache/doris/pull/32396 + +15. Fix BE core when using json_length() in some cases. + +- https://github.com/apache/doris/pull/32145 + +16. Fix error when querying iceberg table using date cast predicate + +- https://github.com/apache/doris/pull/32194 + +17. Fix some bugs when building inverted index for variant type. + +- https://github.com/apache/doris/pull/31992 + +18. Wrong result of two or more map_agg functions in query. + +- https://github.com/apache/doris/pull/31928 + +19. Fix wrong result of money_format function. 
+ +- https://github.com/apache/doris/pull/31883 + +20. Fix connection hang after too many connections. + +- https://github.com/apache/doris/pull/31594 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md new file mode 100644 index 0000000000000..6116bd9984632 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.2.md @@ -0,0 +1,110 @@ +--- +{ + "title": "Release 2.1.2", + "language": "en" +} +--- + + + +## Behavior Changed + +1. Set the default value of the `data_consistence` property of EXPORT to partition to make export more stable during load. + +- https://github.com/apache/doris/pull/32830 + +2. Some MySQL connectors (e.g., dotnet MySQL.Data) rely on a variable's column type to make a connection. + + e.g., `select @@autocommit` should return column type BIGINT, not BIT, otherwise it will throw an error. So we change the column type of `@@autocommit` to BIGINT. + + - https://github.com/apache/doris/pull/33282 + + +## Upgrade Problem + +1. Normal workload group is not created when upgrading from 2.0 or other old versions. + + - https://github.com/apache/doris/pull/33197 + +## New Feature + + +1. Add processlist table in information_schema database. Users can use this table to query active connections. + + - https://github.com/apache/doris/pull/32511 + +2. Add a new table valued function `LOCAL` to allow access to file systems such as shared storage. + + - https://github.com/apache/doris-website/pull/494 + + +## Optimization + +1. Skip some useless processes to make graceful stop quicker in K8s env. + + - https://github.com/apache/doris/pull/33212 + +2. Add rollup table name in profile to help find the mv selection problem. + + - https://github.com/apache/doris/pull/33137 + +3. 
Add test connection function to DB2 database to allow user check the connection when create DB2 Catalog. + + - https://github.com/apache/doris/pull/33335 + +4. Add DNS Cache for FQDN to accelerate the connect process among BEs in K8s env. + + - https://github.com/apache/doris/pull/32869 + +5. Refresh external table's rowcount async to make the query plan more stable. + + - https://github.com/apache/doris/pull/32997 + + +## Bugfix + + +1. Fix Iceberg Catalog of HMS and Hadoop do not support Iceberg properties like "io.manifest.cache-enabled" to enable manifest cache in Iceberg. + + - https://github.com/apache/doris/pull/33113 + +2. The offset params in `LEAD`/`LAG` function could use 0 as offset. + + - https://github.com/apache/doris/pull/33174 + +3. Fix some timeout issues with load. + + - https://github.com/apache/doris/pull/33077 + + - https://github.com/apache/doris/pull/33260 + +4. Fix core problem related with `ARRAY`/`MAP`/`STRUCT` compaction process. + + - https://github.com/apache/doris/pull/33130 + + - https://github.com/apache/doris/pull/33295 + +5. Fix runtime filter wait timeout. + + - https://github.com/apache/doris/pull/33369 + +6. Fix `unix_timestamp` core for string input in auto partition. + + - https://github.com/apache/doris/pull/32871 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md new file mode 100644 index 0000000000000..e88ec3e94fb6d --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.3.md @@ -0,0 +1,191 @@ +--- +{ + "title": "Release 2.1.3", + "language": "en" +} +--- + + + +Apache Doris 2.1.3 was officially released on May 21, 2024. This version has updated several improvements, including writing data back to Hive, materialized view, permission management and bug fixes. It further enhances the performance and stability of the system. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + + +## Feature Enhancements + +**1. Support writing back data to hive tables via Hive Catalog** + +Starting from version 2.1.3, Apache Doris supports DDL and DML operations on Hive. Users can directly create libraries and tables in Hive through Apache Doris and write data to Hive tables by executing `INSERT INTO` statements. This feature allows users to perform complete data query and write operations on Hive through Apache Doris, further simplifying the integrated lakehouse architecture. + +Please refer: [https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) + +**2. Support building new asynchronous materialized views on top of existing ones** + +Users can create new asynchronous materialized views on top of existing ones, directly reusing pre-computed intermediate results for data processing. This simplifies complex aggregation and computation operations, reducing resource consumption and maintenance costs while further accelerating query performance and improving data availability. [#32984](https://github.com/apache/doris/pull/32984) + +**3. Support rewriting through nested materialized views** + +Materialized View (MV) is a database object used to store query results. Now, Apache Doris supports rewriting through nested materialized views, which helps optimize query performance. [#33362](https://github.com/apache/doris/pull/33362) + +**4. New `SHOW VIEWS` statement** + +The `SHOW VIEWS` statement can be used to query views in the database, facilitating better management and understanding of view objects in the database. [#32358](https://github.com/apache/doris/pull/32358) + +**5. 
Workload Group supports binding to specific BE nodes** + +Workload Group can be bound to specific BE nodes, enabling more refined control over query execution to optimize resource usage and improve performance. [#32874](https://github.com/apache/doris/pull/32874) + +**6. Broker Load supports compressed JSON format** + +Broker Load now supports importing compressed JSON format data, significantly reducing bandwidth requirements for data transmission and accelerating data import performance. [#30809](https://github.com/apache/doris/pull/30809) + +**7. TRUNCATE Function can use columns as scale parameters** + +The TRUNCATE function can now accept columns as scale parameters, providing more flexibility when processing numerical data. [#32746](https://github.com/apache/doris/pull/32746) + +**8. Add new functions `uuid_to_int` and `int_to_uuid`** + +These two functions allow users to convert between UUID and integer, significantly helping in scenarios that require handling UUID data. [#33005](https://github.com/apache/doris/pull/33005) + +**9. Add `bypass_workload_group` session variable to bypass query queue** + +The `bypass_workload_group` session variable allows certain queries to bypass the Workload Group queue and execute directly, which is useful for handling critical queries that require quick responses. [#33101](https://github.com/apache/doris/pull/33101) + +**10. Add strcmp function** + +The strcmp function compares two strings and returns their comparison result, simplifying text data processing. [#33272](https://github.com/apache/doris/pull/33272) + +**11. Support HLL functions `hll_from_base64` and `hll_to_base64`** + +HyperLogLog (HLL) is an algorithm for cardinality estimation. These two functions allow users to decode HLL data from a Base64-encoded string or encode HLL data as a Base64 string, which is very useful for storing and transmitting HLL data. [#32089](https://github.com/apache/doris/pull/32089) + +## Optimization and Improvements + +**1. 
Replace SipHash with XXHash to improve shuffle performance** + +Both SipHash and XXHash are hashing functions, but XXHash may provide faster hashing speeds and better performance in certain scenarios. This optimization aims to improve performance during data shuffling by adopting XXHash. [#32919](https://github.com/apache/doris/pull/32919) + +**2. Asynchronous materialized views support NULL partition columns in OLAP tables** + +This enhancement allows asynchronous materialized views to support NULL partition columns in OLAP tables, enhancing data processing flexibility.[#32698](https://github.com/apache/doris/pull/32698) + +**3. Limit maximum string length to 1024 when collecting column statistics to control BE memory usage** + +Limiting the string length when collecting column statistics prevents excessive data from consuming too much BE memory, helping maintain system stability and performance. [#32470](https://github.com/apache/doris/pull/32470) + +**4. Support dynamic deletion of Bitmap cache to improve performance** + +Dynamically deleting no longer needed Bitmap Cache can free up memory and improve system performance. [#32991](https://github.com/apache/doris/pull/32991) + +**5. Reduce memory usage during ALTER operations** + +Reducing memory usage during ALTER operations improves the efficiency of system resource utilization. [#33474](https://github.com/apache/doris/pull/33474) + +**6. Support constant folding for complex types** + +Supports constant folding for Array/Map/Struct complex types.[#32867](https://github.com/apache/doris/pull/32867) + +**7. Add support for Variant type in Aggregate Key Model** + +The Variant data type can store multiple data types. This optimization allows aggregation operations on Variant type data, enhancing the flexibility of semi-structured data analysis. [#33493](https://github.com/apache/doris/pull/33493) + +**8. Support new inverted index format in CCR** [#33415](https://github.com/apache/doris/pull/33415) + +**9. 
Optimize rewriting performance for nested materialized views** [#34127](https://github.com/apache/doris/pull/34127) + +**10. Support decimal256 type in row-based storage format** + +Supporting the decimal256 type in row-based storage extends the system's ability to handle high-precision numerical data. [#34887](https://github.com/apache/doris/pull/34887) + +## Behavioral Changes + +**1. Authorization** + +- **Grant_priv permission changes**: `Grant_priv` can no longer be arbitrarily granted. When performing a `GRANT` operation, the user not only needs to have `Grant_priv` but also the permissions to be granted. For example, to grant `SELECT` permission on `table1`, the user needs both `GRANT` permission and `SELECT` permission on `table1`, enhancing security and consistency in permission management. [#32825](https://github.com/apache/doris/pull/32825) + +- **Workload group and resource usage_priv**: `Usage_priv` for Workload Group and Resource is no longer global but limited to Resource and Workload Group, making permission granting and usage more specific. [#32907](https://github.com/apache/doris/pull/32907) + +- **Authorization for operations**: Operations that were previously unauthorized now have corresponding authorizations for more detailed and comprehensive operational permission control. [#33347](https://github.com/apache/doris/pull/33347) + +**2. LOG directory configuration** + +The log directory configuration for FE and BE now uniformly uses the `LOG_DIR` environment variable. All other different types of logs will be stored with `LOG_DIR` as the root directory. To maintain compatibility between versions, the previous configuration item `sys_log_dir` can still be used. [#32933](https://github.com/apache/doris/pull/32933) + +**3. S3 Table Function (TVF)** + +Due to issues with correctly recognizing or processing S3 URLs in certain cases, the parsing logic for object storage paths has been refactored. 
For file paths in S3 table functions, the `force_parsing_by_standard_uri` parameter needs to be passed to ensure correct parsing. [#33858](https://github.com/apache/doris/pull/33858) + +## Upgrade Issues + +Since many users use certain keywords as column names or attribute values, the following keywords have been set as non-reserved, allowing users to use them as identifiers. [#34613](https://github.com/apache/doris/pull/34613) + +## Bug Fixes + +**1. Fix no data error when reading Hive tables on Tencent Cloud COSN** + +Resolved the no data error that could occur when reading Hive tables on Tencent Cloud COSN, enhancing compatibility with Tencent Cloud storage services. + +**2. Fix incorrect results returned by `milliseconds_diff` function** + +Fixed an issue where the `milliseconds_diff` function returned incorrect results in some cases, ensuring the accuracy of time difference calculations. [#32897](https://github.com/apache/doris/pull/32897) + +**3. User-defined variables should be forwarded to the Master node** + +Ensured that user-defined variables are correctly passed to the Master node for consistency and correct execution logic across the entire system. [#33013](https://github.com/apache/doris/pull/33013) + +**4. Fix Schema Change issues when adding complex type columns** + +Resolved Schema Change issues that could arise when adding complex type columns, ensuring the correctness of Schema Changes. [#31824](https://github.com/apache/doris/pull/31824) + +**5. Fix data loss issue in Routine Load when FE Master node changes** + +`Routine Load` is often used to subscribe to Kafka message queues. This fix addresses potential data loss issues that may occur during FE Master node changes. [#33678](https://github.com/apache/doris/pull/33678) + +**6. Fix Routine Load failure when Workload Group cannot be found** + +Resolved an issue where `Routine Load` would fail if the specified Workload Group could not be found. 
[#33596](https://github.com/apache/doris/pull/33596) + +**7. Support column string64 to avoid join failures when string size overflows uint32** + +In some cases, string sizes may exceed the uint32 limit. Supporting the `string64` type ensures correct execution of string JOIN operations. [#33850](https://github.com/apache/doris/pull/33850) + +**8. Allow Hadoop users to create Paimon Catalog** + +Permitted authorized Hadoop users to create Paimon Catalogs. [#33833](https://github.com/apache/doris/pull/33833) + +**9. Fix `function_ipxx_cidr` function issues with constant parameters** + +Resolved problems with the `function_ipxx_cidr` function when handling constant parameters, ensuring the correctness of function execution. [#33968](https://github.com/apache/doris/pull/33968) + +**10. Fix file download errors when restoring using HDFS** + +Resolved "failed to download" errors encountered during data restoration using HDFS, ensuring the accuracy and reliability of data recovery. [#33303](https://github.com/apache/doris/issues/33303) + +**11. Fix column permission issues related to hidden columns** + +In some cases, permission settings for hidden columns may be incorrect. This fix ensures the correctness and security of column permission settings. [#34849](https://github.com/apache/doris/pull/34849) + +**12. 
Fix issue where Arrow Flight cannot obtain the correct IP in K8s deployments** + +This fix resolves an issue where Arrow Flight cannot correctly obtain the IP address in Kubernetes deployment environments. [#34850](https://github.com/apache/doris/pull/34850) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md new file mode 100644 index 0000000000000..521694ffa60fa --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.4.md @@ -0,0 +1,289 @@ +--- +{ + "title": "Release 2.1.4", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.4 was officially released on June 26, 2024.** In this update, we have optimized various functional experiences for data lakehouse scenarios, with a focus on resolving the abnormal memory usage issue in the previous version. Additionally, we have implemented several improvements and bug fixes to enhance the stability. Welcome to download and use it. + + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + +## Behavior changes + +- Non-existent files will be ignored when querying external tables such as Hive. [#35319](https://github.com/apache/doris/pull/35319) + + The file list is obtained from the meta cache, and it may not be consistent with the actual file list. + + Ignoring non-existent files helps to avoid query errors. + +- By default, creating a Bitmap Index will no longer be automatically changed to an Inverted Index. [#35521](https://github.com/apache/doris/pull/35521) + + This behavior is controlled by the FE configuration item `enable_create_bitmap_index_as_inverted_index`, which defaults to false. + +- When starting FE and BE processes using `--console`, all logs will be output to the standard output and differentiated by prefixes indicating the log type. 
[#35679](https://github.com/apache/doris/pull/35679) + + For more information, please see the documentation: + + - [Log Management - FE Log](../admin-manual/log-management/fe-log.md) + + - [Log Management - BE Log](../admin-manual/log-management/be-log.md) + +- If no table comment is provided when creating a table, the default comment will be empty instead of using the table type as the default comment. [#36025](https://github.com/apache/doris/pull/36025) + +- The default precision of DECIMALV3 has been adjusted from (9, 0) to (38, 9) to maintain compatibility with the version in which this feature was initially released. [#36316](https://github.com/apache/doris/pull/36316) + +## New features + +### Query optimizer + +- Support FE flame graph tool + + For more information, see the [documentation](/community/developer-guide/fe-profiler.md) + +- Support `SELECT DISTINCT` to be used with aggregation. + +- Support single table query rewrite without `GROUP BY`. This is useful for complex filters or expressions. [#35242](https://github.com/apache/doris/pull/35242). + +- The new optimizer fully supports point query functionality [#36205](https://github.com/apache/doris/pull/36205). + +### Data Lakehouse + +- Support native reader of Apache Paimon deletion vector [#35241](https://github.com/apache/doris/pull/35241) + +- Support using Resource in Table Valued Functions [#35139](https://github.com/apache/doris/pull/35139) + +- Access controller with Hive Ranger plugin supports Data Mask + +### Asynchronous materialized views + +- Build support for internal table triggered updates, where if a materialized view uses an internal table and the data in the internal table changes, it can trigger a refresh of the materialized view, specifying REFRESH ON COMMIT when creating the materialized view. + +- Support transparent rewriting for single tables. For more information, see [Querying Async Materialized View](../query/view-materialized-view/query-async-materialized-view.md). 
+ +- Transparent rewriting supports aggregation roll-up for agg_state, agg_union types; materialized views can be defined as agg_state or agg_union, queries can use specific aggregation functions, or use agg_merge. For more information, see [AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md). + +### Others + +- Added function `replace_empty`. + + For more information, see [documentation](../sql-manual/sql-functions/string-functions/replace_empty). + +- Support `show storage policy using` statement. + + For more information, see [documentation](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md). + +- Support JVM metrics on the BE side. + + Set `enable_jvm_monitor=true` in `be.conf` to enable this feature. + +## Improvements + +- Supported creating inverted indexes for columns with Chinese names. [#36321](https://github.com/apache/doris/pull/36321) + +- Estimate memory consumed by segment cache more accurately so that unused memory can be released more quickly. [#35751](https://github.com/apache/doris/pull/35751) + +- Filter empty partitions before exporting tables to remote storage. [#35542](https://github.com/apache/doris/pull/35542) + +- Optimize routine load task allocation algorithm to balance the load among Backends. [#34778](https://github.com/apache/doris/pull/34778) + +- Provide hints when a related variable is not found during a set operation. [#35775](https://github.com/apache/doris/pull/35775) + +- Support placing Java UDF jar files in the FE's `custom_lib` directory for default loading. [#35984](https://github.com/apache/doris/pull/35984) + +- Add a timeout global variable `audit_plugin_load_timeout` for audit log load jobs. + +- Optimize the performance of transparent rewrite planning for asynchronous materialized views. + +- Optimize the `INSERT` operation so that the BE does not execute it when the source is empty. 
[#34418](https://github.com/apache/doris/pull/34418) + +- Support fetching file lists of Hive/Hudi tables in batches. In a scenario with 1.2 million files, the time taken to obtain the list of files has been reduced from 390 seconds to 46 seconds. [#35107](https://github.com/apache/doris/pull/35107) + +- Forbid dynamic partitioning when creating asynchronous materialized views. + +- Support detecting whether the partition data of external tables in Hive is synchronized with asynchronous materialized views. + +- Allow creating indexes for asynchronous materialized views. + +## Bug fixes + +### Query optimizer + +- Fixed the issue where SQL cache returns old results after truncating a partition. [#34698](https://github.com/apache/doris/pull/34698) + +- Fixed the issue where casting from JSON to other types did not correctly handle nullable attributes. [#34707](https://github.com/apache/doris/pull/34707) + +- Fixed occasional DATETIMEV2 literal simplification errors. [#35153](https://github.com/apache/doris/pull/35153) + +- Fixed the issue where `COUNT(*)` could not be used in window functions. [#35220](https://github.com/apache/doris/pull/35220) + +- Fixed the issue where nullable attributes could be incorrect when all `SELECT` statements under `UNION ALL` have no `FROM` clause. [#35074](https://github.com/apache/doris/pull/35074) + +- Fixed the issue where `bitmap in join` and subquery unnesting could not be used simultaneously. [#35435](https://github.com/apache/doris/pull/35435) + +- Fixed the performance issue where filter conditions could not be pushed down to the CTE producer in specific situations. [#35463](https://github.com/apache/doris/pull/35463) + +- Fixed the issue where aggregate combinators written in uppercase could not be found. [#35540](https://github.com/apache/doris/pull/35540) + +- Fixed the performance issue where window functions were not properly pruned by column pruning. 
[#35504](https://github.com/apache/doris/pull/35504) + +- Fixed the issue where queries might parse incorrectly leading to wrong results when multiple tables with the same name but in different databases appeared simultaneously in the query. [#35571](https://github.com/apache/doris/pull/35571) + +- Fixed the query error caused by generating runtime filters during schema table scans. [#35655](https://github.com/apache/doris/pull/35655) + +- Fixed the issue where nested correlated subqueries could not execute because the join condition was folded into a null literal. [#35811](https://github.com/apache/doris/pull/35811) + +- Fixed the occasional issue where decimal literals were set with incorrect precision during planning. [#36055](https://github.com/apache/doris/pull/36055) + +- Fixed the occasional issue where multiple layers of aggregation were merged incorrectly during planning. [#36145](https://github.com/apache/doris/pull/36145) + +- Fixed the occasional issue where the input-output mismatch error occurred after aggregate expansion planning. [#36207](https://github.com/apache/doris/pull/36207) + +- Fixed the occasional issue where `<=>` was incorrectly converted to `=`. [#36521](https://github.com/apache/doris/pull/36521) + +### Query execution + +- Fixed the issue where the query hangs if the limited rows are reached on the pipeline engine and memory is not released. [#35746](https://github.com/apache/doris/pull/35746) + +- Fixed the BE coredump when `enable_decimal256` is true but falls back to the old planner. [#35731](https://github.com/apache/doris/pull/35731) + +### Asynchronous materialized views + +- Fixed the issue in the asynchronous materialized view build where the store_row_column attribute specified was not being recognized by the core. + +- Fixed the problem in the asynchronous materialized view build where specifying the storage_medium was not taking effect. 
+ +- Resolved the error occurring in the asynchronous materialized view show partitions after the base table is deleted. + +- Fixed the issue where asynchronous materialized views caused backup and restore exceptions. [#35703](https://github.com/apache/doris/pull/35703) + +- Fixed the issue where partition rewrite could lead to incorrect results. [#35236](https://github.com/apache/doris/pull/35236) + +### Semi-structured + +- Fixed the core dump problem when a VARIANT with an empty key is used. [#35671](https://github.com/apache/doris/pull/35671) +- Bitmap and BloomFilter index should not perform light index changes. [#35225](https://github.com/apache/doris/pull/35225) + +### Primary key + +- Fixed the issue where an exception BE restart occurred in the case of partial column updates during import, which could result in duplicate keys. [#35678](https://github.com/apache/doris/pull/35678) + +- Fixed the issue where BE might core dump during clone operations when memory is tight. [#34702](https://github.com/apache/doris/pull/34702) + +### Data Lakehouse + +- Fixed the issue where a Hive table could not be created with a fully qualified name such as `ctl.db.tbl` [#34984](https://github.com/apache/doris/pull/34984) + +- Fixed the issue where the Hive metastore connection did not close when refreshing [#35426](https://github.com/apache/doris/pull/35426) + +- Fixed a potential meta replay issue when upgrading from 2.0.x to 2.1.x. [#35532](https://github.com/apache/doris/pull/35532) + +- Fixed the issue where the Table Valued Function could not read an empty snappy compressed file. 
[#34926](https://github.com/apache/doris/pull/34926) + +- Fixed the issue where Parquet files with invalid min-max column statistics could not be read [#35041](https://github.com/apache/doris/pull/35041) + +- Fixed the issue where pushdown predicates with null-aware functions could not be handled in the Parquet/ORC reader [#35335](https://github.com/apache/doris/pull/35335) + +- Fixed the issue about the order of partition columns when creating a Hive table [#35347](https://github.com/apache/doris/pull/35347) + +- Fixed the issue where writing to a Hive table on S3 failed when partition values contained spaces [#35645](https://github.com/apache/doris/pull/35645) + +- Fixed the issue about incorrect scheme of Aliyun OSS endpoint [#34907](https://github.com/apache/doris/pull/34907) + +- Fixed the issue where the Parquet format Hive table written by Doris could not be read by Hive [#34981](https://github.com/apache/doris/pull/34981) + +- Fixed the issue where ORC files could not be read after the schema change of a Hive table [#35583](https://github.com/apache/doris/pull/35583) + +- Fixed the issue where Paimon tables could not be read via JNI after the schema change of the Paimon table [#35309](https://github.com/apache/doris/pull/35309) + +- Fixed the issue of too small Row Groups in Parquet format files written out. 
[#36042](https://github.com/apache/doris/pull/36042) [#36143](https://github.com/apache/doris/pull/36143) + +- Fixed the issue where Paimon tables could not be read after schema changes [#36049](https://github.com/apache/doris/pull/36049) + +- Fixed the issue where Hive Parquet format tables could not be read after schema changes [#36182](https://github.com/apache/doris/pull/36182) + +- Fixed the FE OOM issue caused by Hadoop FS cache [#36403](https://github.com/apache/doris/pull/36403) + +- Fixed the issue where FE could not start after enabling the Hive Metastore Listener [#36533](https://github.com/apache/doris/pull/36533) + +- Fixed the issue of query performance degradation with a large number of files [#36431](https://github.com/apache/doris/pull/36431) + +- Fixed the timezone issue when reading the timestamp column type in Iceberg [#36435](https://github.com/apache/doris/pull/36435) + +- Fixed DATETIME conversion error and data path error on Iceberg Table. [#35708](https://github.com/apache/doris/pull/35708) + +- Support retaining and passing the additional user-defined properties of Table Valued Functions to the S3 SDK. [#35515](https://github.com/apache/doris/pull/35515) + + +### Data import + +- Fixed the issue where `CANCEL LOAD` did not work [#35352](https://github.com/apache/doris/pull/35352) + +- Fixed the issue where a null pointer error in the Publish phase of load transactions prevented the load from completing [#35977](https://github.com/apache/doris/pull/35977) + +- Fixed the issue with bRPC serializing large data files when sent via HTTP [#36169](https://github.com/apache/doris/pull/36169) + +### Data management + +- Fixed the issue that the resource tag in ConnectionContext was not set after forwarding DDL or DML to master FE. 
[#35618](https://github.com/apache/doris/pull/35618) + +- Fixed the issue where the restored table name was incorrect when `lower_case_table_names` was enabled [#35508](https://github.com/apache/doris/pull/35508) + +- Fixed the issue where `admin clean trash` could not work [#35271](https://github.com/apache/doris/pull/35271) + +- Fixed the issue where a storage policy could not be deleted from a partition [#35874](https://github.com/apache/doris/pull/35874) + +- Fixed the issue of data loss when importing into a multi-replica automatic partition table [#36586](https://github.com/apache/doris/pull/36586) + +- Fixed the issue where the partition column of a table changed when querying or inserting into an automatic partition table using the old optimizer [#36514](https://github.com/apache/doris/pull/36514) + +### Memory management + +- Fixed the issue of frequent errors in the logs due to failure in obtaining Cgroup meminfo. [#35425](https://github.com/apache/doris/pull/35425) + +- Fixed the issue where the Segment cache size was uncontrolled when using BloomFilter, leading to abnormal process memory growth. [#34871](https://github.com/apache/doris/pull/34871) + +### Permissions + +- Fixed the issue where permission settings were ineffective after enabling case-insensitive table names. [#36557](https://github.com/apache/doris/pull/36557) + +- Fixed the issue where setting LDAP passwords through non-Master FE nodes did not take effect. [#36598](https://github.com/apache/doris/pull/36598) + +- Fixed the issue where authorization could not be checked for the `SELECT COUNT(*)` statement. [#35465](https://github.com/apache/doris/pull/35465) + +### Others + +- Fixed the issue where the client JDBC program could not close the connection if the MySQL connection was broken. [#36616](https://github.com/apache/doris/pull/36616) + +- Fixed MySQL protocol compatibility issue with the `SHOW PROCEDURE STATUS` statement. 
[#35350](https://github.com/apache/doris/pull/35350) + +- The `libevent` now forces Keepalive to solve the issue of connection leaks in certain situations. [#36088](https://github.com/apache/doris/pull/36088) + +## Credits + +Thanks to everyone who contributed to this release. + +@airborne12, @amorynan, @AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @caiconghui, @CalvinKirs, @cambyzju, @catpineapple, @cjj2010, @csun5285, @DarvenDuan, @dataroaring, @deardeng, @Doris-Extras, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @Gabriel39, @gavinchou, @GoGoWen, @HappenLee, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @lide-reed, @luennng, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @nextdreamblue, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @TangSiyang2001, @Thearas, @Vallishp, @w41ter, @wangbo, @whutpencil, @wsjz, @wuwenchi, @xiaokang, @xiedeyantu, @XieJiann, @xinyiZzz, @XuPengfei-1020, @xy720, @xzj7019, @yiguolei, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zfr9527, @zgxme, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md new file mode 100644 index 0000000000000..7c1910eeae8c5 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.5.md @@ -0,0 +1,395 @@ +--- +{ + "title": "Release 2.1.5", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.5 was officially released on July 24, 2024.** In this update, we have optimized various functional experiences for data lakehouse and high concurrency scenarios, as well as the functionality of asynchronous materialized views.
Additionally, we have implemented several improvements and bug fixes to enhance stability. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The default connection pool size for the JDBC Catalog has been increased from 10 to 30 to prevent connection exhaustion in high-concurrency scenarios. [#37023](https://github.com/apache/doris/pull/37023) + +- The system's reserved memory (low water mark) has been adjusted to `min(6.4GB, MemTotal * 5%)` to mitigate BE OOM issues. + +- When processing multiple statements in a single request, only the last statement's result is returned if the `CLIENT_MULTI_STATEMENTS` flag is not set. + +- Direct modifications to data in asynchronous materialized views are no longer permitted. [#37129](https://github.com/apache/doris/pull/37129) + +- A session variable `use_max_length_of_varchar_in_ctas` has been added to control the behavior of varchar and char type length generation during CTAS (Create Table As Select). The default value is true. When set to false, the derived varchar length is used instead of the maximum length. [#37284](https://github.com/apache/doris/pull/37284) + +- Statistics collection now defaults to enabling the functionality of estimating the number of rows in Hive tables based on file size. [#37694](https://github.com/apache/doris/pull/37694) + +- Transparent rewrite for asynchronous materialized views is now enabled by default. [#35897](https://github.com/apache/doris/pull/35897) + +- Transparent rewrite utilizes partitioned materialized views. If partitions fail, the base tables are unioned with the materialized view to ensure data correctness. [#35897](https://github.com/apache/doris/pull/35897) + +## New features + +### Lakehouse + +- The session variable `read_csv_empty_line_as_null` can be used to control whether empty lines are ignored when reading CSV format files.
[#37153](https://github.com/apache/doris/pull/37153) + + By default, empty lines are ignored. When set to true, empty lines will be read as rows where all columns are null. + +- Compatibility with Presto's complex type output format can be enabled by setting `serde_dialect="presto"`. [#37253](https://github.com/apache/doris/pull/37253) + +### Multi-Table Materialized View + +- Supports non-deterministic functions in materialized view building. [#37651](https://github.com/apache/doris/pull/37651) + +- Atomically replaces definitions of asynchronous materialized views. [#37147](https://github.com/apache/doris/pull/37147) + +- Views creation statements can be viewed via `SHOW CREATE MATERIALIZED VIEW`. [#37125](https://github.com/apache/doris/pull/37125) + +- Transparent rewrites for multi-dimensional aggregation and non-aggregate queries. [#37436](https://github.com/apache/doris/pull/37436) [#37497](https://github.com/apache/doris/pull/37497) + +- Supports DISTINCT aggregations with key columns and partitioning for roll-ups. [#37651](https://github.com/apache/doris/pull/37651) + +- Support for partitioning materialized views to roll up partitions using `date_trunc` [#31812](https://github.com/apache/doris/pull/31812) [#35562](https://github.com/apache/doris/pull/35562) + +- Partitioned table-valued functions (TVFs) are supported. [#36479](https://github.com/apache/doris/pull/36479) + +### Semi-Structured Data Management + +- Tables using the VARIANT type now support partial column updates. [#34925](https://github.com/apache/doris/pull/34925) + +- PreparedStatement support is now enabled by default. [#36581](https://github.com/apache/doris/pull/36581) + +- The VARIANT type can be exported to CSV format. [#37857](https://github.com/apache/doris/pull/37857) + +- `explode_json_object` function transposes JSON Object rows into columns. 
[#36887](https://github.com/apache/doris/pull/36887) + +- The ES Catalog now maps ES NESTED or OBJECT types to the Doris JSON type.[#37101](https://github.com/apache/doris/pull/37101) + +- By default, support_phrase is enabled for inverted indexes with specified analyzers to improve the performance of match_phrase series queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Query Optimizer + +- Support for explaining `DELETE FROM` statements. [#37100](https://github.com/apache/doris/pull/37100) + +- Support for hint form of constant expression parameters [#37988](https://github.com/apache/doris/pull/37988) + +### Memory Management + +- Added an HTTP API to clear the cache. [#36599](https://github.com/apache/doris/pull/36599) + +### Permissions + +- Support for authorization of resources within Table-Valued Functions (TVFs) [#37132](https://github.com/apache/doris/pull/37132) + +## Improvements + +### Lakehouse + +- Upgraded Paimon to version 0.8.1 + +- Fixes ClassNotFoundException for org.apache.commons.lang.StringUtils when querying Paimon tables. [#37512](https://github.com/apache/doris/pull/37512) + +- Added support for Tencent Cloud LakeFS. [#36891](https://github.com/apache/doris/pull/36891) + +- Optimized the timeout duration when fetching file lists for external table queries. [#36842](https://github.com/apache/doris/pull/36842) + +- Configurable via the session variable `fetch_splits_max_wait_time_ms`. + +- Improved default connection logic for SQLServer JDBC Catalog. [#36971](https://github.com/apache/doris/pull/36971) + + By default, the connection encryption settings are not intervened. Only when `force_sqlserver_jdbc_encrypt_false` is set to true, encrypt=false is forcibly added to the JDBC URL to reduce authentication errors. This allows for more flexible control over encryption behavior, enabling it to be turned on or off as needed. + +- Added serde properties to the show create table statements for Hive tables. 
[#37096](https://github.com/apache/doris/pull/37096) + +- Changed the default cache time for Hive table lists on the FE from 1 day to 4 hours + +- Data export (Export/Outfile) now supports specifying compression formats for Parquet and ORC + + For more information, please refer to [docs](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type). + +- When creating a table using CTAS+TVF, partition columns in the TVF are automatically mapped to Varchar(65533) instead of String, allowing them to be used as partition columns for internal tables [#37161](https://github.com/apache/doris/pull/37161) + +- Optimized the number of metadata accesses for Hive write operations [#37127](https://github.com/apache/doris/pull/37127) + +- ES Catalog now supports mapping nested/object types to Doris's JSON type. [#37182](https://github.com/apache/doris/pull/37182) + +- Improved error messages when connecting to Oracle using older versions of the ojdbc driver [#37634](https://github.com/apache/doris/pull/37634) + +- When Hudi tables return an empty set during Incremental Read, Doris now also returns an empty set instead of an error [#37636](https://github.com/apache/doris/pull/37636) + +- Fixed an issue where inner-outer table join queries could lead to FE timeouts in some cases [#37757](https://github.com/apache/doris/pull/37757) + +- Fixed an issue with FE metadata replay errors during upgrades from older versions to newer versions when the Hive metastore event listener is enabled. [#37757](https://github.com/apache/doris/pull/37757) + +### Multi-Table Materialized View + +- Automate key column selection for asynchronous materialized views. [#36601](https://github.com/apache/doris/pull/36601) + +- Support date_trunc in materialized view partition definitions. [#35562](https://github.com/apache/doris/pull/35562) + +- Enable transparent rewrites across nested materialized view aggregations.
[#37651](https://github.com/apache/doris/pull/37651) + +- Asynchronous materialized views remain available when schema changes do not affect the correctness of their data. [#37122](https://github.com/apache/doris/pull/37122) + +- Improve planning speed for transparent rewrites. [#37935](https://github.com/apache/doris/pull/37935) + +- When calculating the availability of asynchronous materialized views, the current refresh status is no longer taken into account. [#36617](https://github.com/apache/doris/pull/36617) + +### Semi-Structured Data Management + +- Optimize DESC performance for viewing VARIANT sub-columns through sampling. [#37217](https://github.com/apache/doris/pull/37217) + +- Support for special JSON data with empty keys in the JSON type. [#36762](https://github.com/apache/doris/pull/36762) + +### Inverted Index + +- Reduce latency by minimizing the invocation of inverted index exists to avoid delays in accessing object storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Optimize the overhead of the inverted index query process. [#35357](https://github.com/apache/doris/pull/35357) + +- Prevent inverted indices in materialized views. [#36869](https://github.com/apache/doris/pull/36869) + +### Query Optimizer + +- When both sides of a comparison expression are literals, the string literal will attempt to convert to the type of the other side. [#36921](https://github.com/apache/doris/pull/36921) + +- Refactored the sub-path pushdown functionality for the variant type, now better supporting complex pushdown scenarios. [#36923](https://github.com/apache/doris/pull/36923) + +- Optimized the logic for calculating the cost of materialized views, enabling more accurate selection of lower-cost materialized views. [#37098](https://github.com/apache/doris/pull/37098) + +- Improved the SQL cache planning speed when using user variables in SQL. 
[#37119](https://github.com/apache/doris/pull/37119) + +- Optimized the row estimation logic for NOT NULL expressions, resulting in better performance when NOT NULL is present in queries. [#37498](https://github.com/apache/doris/pull/37498) + +- Optimized the null rejection derivation logic for LIKE expressions. [#37864](https://github.com/apache/doris/pull/37864) + +- Improved error messages when querying a specific partition fails, making it clearer which table is causing the issue. [#37280](https://github.com/apache/doris/pull/37280) + +### Query Execution + +- Improved the performance of the bitmap_union operator up to 3 times in certain scenarios. + +- Enhanced the reading performance of Arrow Flight in ARM environments. + +- Optimized the execution performance of the explode, explode_map, and explode_json functions. + +### Data Loading + +- Support setting `max_filter_ratio` for `INSERT INTO ... FROM TABLE VALUE FUNCTION` + +## Bug fixes + +### Lakehouse + +- Fixed an issue that caused BE crashes in some cases when querying Parquet format [#37086](https://github.com/apache/doris/pull/37086) + +- Fixed an issue where BE printed excessive logs when querying Parquet format. [#37012](https://github.com/apache/doris/pull/37012) + +- Fixed an issue where the FE side created a large number of duplicate FileSystem objects in some cases. [#37142](https://github.com/apache/doris/pull/37142) + +- Fixed an issue where transaction information was not cleaned up after writing to Hive in some cases. [#37172](https://github.com/apache/doris/pull/37172) + +- Fixed a thread leak issue caused by Hive table write operations in some cases. [#37247](https://github.com/apache/doris/pull/37247) + +- Fixed an issue where Hive Text format row and column delimiters could not be correctly obtained in some cases. [#37188](https://github.com/apache/doris/pull/37188) + +- Fixed a concurrency issue when reading lz4 compressed blocks in some cases. 
[#37187](https://github.com/apache/doris/pull/37187) + +- Fixed an issue where `count(*)` on Iceberg tables returned incorrect results in some cases. [#37810](https://github.com/apache/doris/pull/37810) + +- Fixed an issue where creating a Paimon catalog based on MinIO caused FE metadata replay errors in some cases. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed an issue where using Ranger to create a catalog caused the client to hang in some cases. [#37551](https://github.com/apache/doris/pull/37551) + +### Multi-Table Materialized View + +- Fixed an issue where adding new partitions to the base table could lead to incorrect results after partition aggregation roll-up rewrites. [#37651](https://github.com/apache/doris/pull/37651) + +- Fixed an issue where the materialized view partition status was not set to out-of-sync after deleting associated base table partitions. [#36602](https://github.com/apache/doris/pull/36602) + +- Fixed an occasional deadlock issue during asynchronous materialized view builds. [#37133](https://github.com/apache/doris/pull/37133) + +- Fixed an occasional "nereids cost too much time" error when refreshing a large number of partitions in a single asynchronous materialized view refresh. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed an issue where an asynchronous materialized view could not be created if the final select list contained a null literal. [#37281](https://github.com/apache/doris/pull/37281) + +- Fixed an issue with single-table materialized views where, even though the aggregation materialized view was successfully rewritten, the CBO did not select it. [#35721](https://github.com/apache/doris/pull/35721) [#36058](https://github.com/apache/doris/pull/36058) + +- Fixed an issue where partition derivation failed when building a partitioned materialized view with both join inputs being aggregations. 
[#34781](https://github.com/apache/doris/pull/34781) + +### Semi-Structured Data Management + +- Fixed issues with VARIANT in special cases such as concurrency and abnormal data.[#37976](https://github.com/apache/doris/pull/37976) [#37839](https://github.com/apache/doris/pull/37839) [#37794](https://github.com/apache/doris/pull/37794) [#37674](https://github.com/apache/doris/pull/37674) [#36997](https://github.com/apache/doris/pull/36997) + +- Fixed coredump issues when using VARIANT in unsupported SQL. [#37640](https://github.com/apache/doris/pull/37640) + +- Fixed coredump issues related to MAP data type when upgrading from 1.x to 2.x or higher versions. [#36937](https://github.com/apache/doris/pull/36937) + +- Improved ES Catalog support for Array types. [#36936](https://github.com/apache/doris/pull/36936) + +### Inverted Index + +- Fixed an issue where DROP INDEX for Inverted Index v2 did not delete metadata. [#37646](https://github.com/apache/doris/pull/37646) + +- Fixed query accuracy issues when string length exceeded the "ignore above" threshold. [#37679](https://github.com/apache/doris/pull/37679) + +- Fixed issues with index size statistics. [#37232](https://github.com/apache/doris/pull/37232) [#37564](https://github.com/apache/doris/pull/37564) + +### Query Optimizer + +- Fixed an issue that prevented import operations from executing due to the use of reserved keywords. [#35938](https://github.com/apache/doris/pull/35938) + +- Fixed a type error where char(255) was incorrectly recorded as char(1) when creating a table. [#37671](https://github.com/apache/doris/pull/37671) + +- Fixed incorrect results when the join expression in a correlated subquery was a complex expression. [#37683](https://github.com/apache/doris/pull/37683) + +- Fixed a potential issue with incorrect bucket pruning for decimal types. 
[#38013](https://github.com/apache/doris/pull/38013) + +- Fixed incorrect aggregation operator results when pipeline local shuffle was enabled in certain scenarios. [#38016](https://github.com/apache/doris/pull/38016) + +- Fixed planning errors that could occur when equal expressions existed in aggregation operators. [#36622](https://github.com/apache/doris/pull/36622) + +- Fixed planning errors that could occur when lambda expressions were present in aggregation operators. [#37285](https://github.com/apache/doris/pull/37285) + +- Fixed an issue where a literal generated from a window function being optimized to a literal had the wrong type, preventing execution. [#37283](https://github.com/apache/doris/pull/37283) + +- Fixed an issue with the null attribute being incorrectly output by the aggregate function foreach combinator. [#37980](https://github.com/apache/doris/pull/37980) + +- Fixed an issue where the acos function could not be planned when its parameter was a literal out of range. [#37996](https://github.com/apache/doris/pull/37996) + +- Fixed planning errors when specifying partitions for a query on a synchronized materialized view. [#36982](https://github.com/apache/doris/pull/36982) + +- Fixed occasional Null Pointer Exceptions (NPEs) during planning. [#38024](https://github.com/apache/doris/pull/38024) + +### Query Execution + +- Fixed an error in delete where statements when using decimal data types as conditions. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed an issue where BE memory was not released after query execution ended. [#37792](https://github.com/apache/doris/pull/37792) [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed a problem where audit logs occupied too much FE memory under high QPS scenarios. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed BE core dumps when the sleep function received illegal input values. 
[#37681](https://github.com/apache/doris/pull/37681) + +- Fixed an error encountered during sync filter size execution. [#37103](https://github.com/apache/doris/pull/37103) + +- Fixed incorrect results when using time zones during execution. [#37062](https://github.com/apache/doris/pull/37062) + +- Fixed incorrect results when casting strings to integers. [#36788](https://github.com/apache/doris/pull/36788) + +- Fixed query errors when using the Arrow Flight protocol with pipelinex enabled. [#35804](https://github.com/apache/doris/pull/35804) + +- Fixed errors when casting strings to dates/datetimes. [#35637](https://github.com/apache/doris/pull/35637) + +- Fixed BE core dumps during large table join queries using <=>. [#36263](https://github.com/apache/doris/pull/36263) + +### Storage Management + +- Fixed the issue of invisible DELETE SIGN data encountered during column update and write operations. [#36755](https://github.com/apache/doris/pull/36755) + +- Optimized FE's memory usage during schema changes. [#36756](https://github.com/apache/doris/pull/36756) + +- Fixed the issue where BE would hang during restart due to transactions not being aborted [#36437](https://github.com/apache/doris/pull/36437) + +- Fixed occasional errors when changing from NOT NULL to NULL data types. [#36389](https://github.com/apache/doris/pull/36389) + +- Optimized replica repair scheduling when BE goes down. [#36897](https://github.com/apache/doris/pull/36897) + +- Supported round-robin disk selection for tablet creation on a single BE. [#36900](https://github.com/apache/doris/pull/36900) + +- Fixed query error -230 caused by slow publishing. [#36222](https://github.com/apache/doris/pull/36222) + +- Improved the speed of partition balancing. [#36976](https://github.com/apache/doris/pull/36976) + +- Controlled segment cache using the number of file descriptors (FDs) and memory to avoid FD exhaustion. 
[#37035](https://github.com/apache/doris/pull/37035) + +- Fixed potential replica loss caused by concurrent clone and alter operations [#36858](https://github.com/apache/doris/pull/36858) + +- Fixed the issue of not being able to adjust column order. [#37226](https://github.com/apache/doris/pull/37226) + +- Prohibited certain schema change operations on auto-increment columns. [#37331](https://github.com/apache/doris/pull/37331) + +- Fixed inaccurate error reporting for DELETE operations. [#37374](https://github.com/apache/doris/pull/37374) + +- Adjusted the trash expiration time on BE side to one day. [#37409](https://github.com/apache/doris/pull/37409) + +- Optimized compaction memory usage and scheduling. [#37491](https://github.com/apache/doris/pull/37491) + +- Checked for potential oversized backups causing FE restarts. [#37466](https://github.com/apache/doris/pull/37466) + +- Restored dynamic partition deletion policies and cross-partition behaviors to 2.1.3. [#37570](https://github.com/apache/doris/pull/37570) [#37506](https://github.com/apache/doris/pull/37506) + +- Fixed errors related to decimal types in DELETE predicates. [#37710](https://github.com/apache/doris/pull/37710) + +### Data Loading + +- Fixed data invisibility issues caused by race conditions in error handling during imports [#36744](https://github.com/apache/doris/pull/36744) + +- Added support for hll_from_base64 in streamload imports. [#36819](https://github.com/apache/doris/pull/36819) + +- Fixed potential FE OOM issues when importing very large numbers of tablets for a single table. [#36944](https://github.com/apache/doris/pull/36944) + +- Fixed possible auto-increment column duplication during FE master-slave switchovers. [#36961](https://github.com/apache/doris/pull/36961) + +- Fixed errors when inserting into select with auto-increment columns. [#37029](https://github.com/apache/doris/pull/37029) + +- Reduced the number of data flush threads to optimize memory usage.
[#37092](https://github.com/apache/doris/pull/37092) + +- Improved automatic recovery and error messaging for routine load tasks. [#37371](https://github.com/apache/doris/pull/37371) + +- Increased the default batch size for routine load. [#37388](https://github.com/apache/doris/pull/37388) + +- Fixed routine load task stoppage due to Kafka EOF expiration. [#37983](https://github.com/apache/doris/pull/37983) + +- Fixed coredump issues in multi-table streaming. [#37370](https://github.com/apache/doris/pull/37370) + +- Fixed premature backpressure caused by inaccurate memory estimation in groupcommit. [#37379](https://github.com/apache/doris/pull/37379) + +- Optimized BE-side thread usage in groupcommit. [#37380](https://github.com/apache/doris/pull/37380) + +- Fixed the issue of no error URL when data was not partitioned. [#37401](https://github.com/apache/doris/pull/37401) + +- Fixed potential memory misoperations during imports. [#38021](https://github.com/apache/doris/pull/38021) + +### Merge on Write Unique Key + +- Reduced memory usage during compaction for primary key tables. [#36968](https://github.com/apache/doris/pull/36968) + +- Fixed potential duplicate data issues when primary key replica cloning fails. [#37229](https://github.com/apache/doris/pull/37229) + +### Permissions + +- Fixed the issue of missing authorization when a table-valued function references a resource. [#37132](https://github.com/apache/doris/pull/37132) + +- Fixed the issue where the SHOW ROLE statement did not include workload group permissions. [#36032](https://github.com/apache/doris/pull/36032) + +- Fixed the issue where executing two statements simultaneously when creating a row policy could cause FE to fail to restart. [#37342](https://github.com/apache/doris/pull/37342) + +- Fixed the issue where, in some cases, upgrading from an older version could result in FE metadata replay failures due to row policies. 
[#37342](https://github.com/apache/doris/pull/37342) + +### Others + +- Fixed the issue of compute nodes participating in internal table creation. [#37961](https://github.com/apache/doris/pull/37961) + +- Fixed the read lag issue when `enable_strong_read_consistency` is set to true. [#37641](https://github.com/apache/doris/pull/37641) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md new file mode 100644 index 0000000000000..c14d25b52573f --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.6.md @@ -0,0 +1,524 @@ +--- +{ + "title": "Release 2.1.6", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.6 was officially released on September 10, 2024.** This version brings continuous upgrades and improvements to the Lakehouse, Async Materialized Views, and Semi-Structured Data Management. Additionally, several fixes have been implemented in areas such as the query optimizer, execution engine, storage management, permission management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- Removed the `delete_if_exists` option from create repository. [#38192](https://github.com/apache/doris/pull/38192) + +- Added the `enable_prepared_stmt_audit_log` session variable to control whether JDBC prepared statements record audit logs, with the default being no recording. [#38624](https://github.com/apache/doris/pull/38624) [#39009](https://github.com/apache/doris/pull/39009) + +- Implemented fd limit and memory constraints for segment cache. [#39689](https://github.com/apache/doris/pull/39689) + +- When the FE configuration item `sys_log_mode` is set to BRIEF, file location information is added to the logs. 
[#39571](https://github.com/apache/doris/pull/39571) + +- Changed the default value of the session variable `max_allowed_packet` to 16MB. [#38697](https://github.com/apache/doris/pull/38697) + +- When a single request contains multiple statements, semicolons must be used to separate them. [#38670](https://github.com/apache/doris/pull/38670) + +- Added support for statements to begin with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Aligned type formatting with MySQL in statements such as `show create table`. [#38012](https://github.com/apache/doris/pull/38012) + +- When the new optimizer planning times out, it no longer falls back to prevent the old optimizer from using longer planning times. [#39499](https://github.com/apache/doris/pull/39499) + +## New features + +### Lakehouse + +- Supported writeback for Iceberg tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/lakehouse/datalake-building/iceberg-build). + +- SQL interception rules now support external tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Added the system table `file_cache_statistics` to view BE data cache metrics. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics). + +### Async Materialized View + +- Supported transparent rewriting during inserts. [#38115](https://github.com/apache/doris/pull/38115) + +- Supported transparent rewriting when variant types exist in queries.[ #37929](https://github.com/apache/doris/pull/37929) + +### Semi-Structured Data Management + +- Supported casting ARRAY MAP to JSON type.[ #36548](https://github.com/apache/doris/pull/36548) + +- Supported the `json_keys` function.[ #36411](https://github.com/apache/doris/pull/36411) + +- Supported specifying the JSON path $. when importing JSON. 
[#38213](https://github.com/apache/doris/pull/38213) + +- ARRAY, MAP, STRUCT types now support `replace_if_not_null`[#38304](https://github.com/apache/doris/pull/38304) + +- ARRAY, MAP, STRUCT types now support adjusting column order.[#39210](https://github.com/apache/doris/pull/39210) + +- Added the `multi_match` function to match keywords across multiple fields, with support for inverted index acceleration. [#37722](https://github.com/apache/doris/pull/37722) + +### Query Optimizer + +- Filled in the original database name, table name, column name, and alias for returned columns in the MySQL protocol. [ #38126](https://github.com/apache/doris/pull/38126) + +- Supported the aggregation function `group_concat` with both order by and distinct simultaneously. [#38080](https://github.com/apache/doris/pull/38080) + +- SQL cache now supports reusing cached results for queries with different comments. [#40049](https://github.com/apache/doris/pull/40049) + +- In partition pruning, supported including `date_trunc` and date functions in filter conditions. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- Allowed using the database name where the table resides as a qualifier prefix for table aliases. [#38640](https://github.com/apache/doris/pull/38640) + +- Supported hint-style comments.[#39113](https://github.com/apache/doris/pull/39113) + +### Others + +- Added the system table `table_properties` for viewing table properties. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_properties). + +- Introduced deadlock and slow lock detection in FE. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/maint-monitor/frontend-lock-manager). + +## Improvements + +### Lakehouse + +- Reimplemented the external table metadata caching mechanism. 
+ + - For details, refer to the [documentation](https://doris.apache.org/docs/lakehouse/metacache). + +- Added the session variable `keep_carriage_return` with a default value of false. By default, reading Hive Text format tables treats both `\r\n` and `\n` as newline characters. [#38099](https://github.com/apache/doris/pull/38099) + +- Optimized memory statistics for Parquet/ORC file read/write operations.[#37257](https://github.com/apache/doris/pull/37257) + +- Supported pushing down IN/NOT IN predicates for Paimon tables. [#38390](https://github.com/apache/doris/pull/38390) + +- Enhanced the optimizer to support Time Travel syntax for Hudi tables. [#38591](https://github.com/apache/doris/pull/38591) + +- Optimized Kerberos authentication-related processes. [ #37301](https://github.com/apache/doris/pull/37301) + +- Enabled reading Hive tables after renaming column operations. [#38809](https://github.com/apache/doris/pull/38809) + +- Optimized the reading performance of partition columns for external tables. [#38810](https://github.com/apache/doris/pull/38810) + +- Improved the data shard merging strategy during external table query planning to avoid performance degradation caused by a large number of small shards.[#38964](https://github.com/apache/doris/pull/38964) + +- Added attributes such as location to `SHOW CREATE DATABASE/TABLE`. [#39644](https://github.com/apache/doris/pull/39644) + +- Supported complex types in MaxCompute Catalog. [#39822](https://github.com/apache/doris/pull/39822) + +- Optimized the file cache loading strategy by using asynchronous loading to avoid long BE startup times. [#39036](https://github.com/apache/doris/pull/39036) + +- Improved the file cache eviction strategy, such as evicting locks held for extended periods. [#39721](https://github.com/apache/doris/pull/39721) + +### Async Materialized View + +- Supported hourly, weekly, and quarterly partition roll-up construction. 
[#37678](https://github.com/apache/doris/pull/37678) + +- For materialized views based on Hive external tables, the metadata cache is now updated before refresh to ensure the latest data is obtained during each refresh. [#38212](https://github.com/apache/doris/pull/38212) + +- Improved the performance of transparent rewrite planning in storage-compute decoupled mode by batch fetching metadata. [#39301](https://github.com/apache/doris/pull/39301) + +- Enhanced the performance of transparent rewrite planning by prohibiting duplicate enumerations. [#39541](https://github.com/apache/doris/pull/39541) + +- Improved the performance of transparent rewrite for refreshing materialized views based on Hive external table partitions.[#38525](https://github.com/apache/doris/pull/38525) + +### Semi-Structured Data Management + +- Optimized memory allocation for TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- Enhanced the performance of string processing in inverted indexes.[#37395](https://github.com/apache/doris/pull/37395) + +- Optimized the performance of inverted indexes in MOW tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Supported specifying the row-store `page_size` during table creation to control compression effectiveness. [#37145](https://github.com/apache/doris/pull/37145) + +### Query Optimizer + +- Adjusted the row count estimation algorithm for mark joins, resulting in more accurate cardinality estimates for mark joins. [#38270](https://github.com/apache/doris/pull/38270) + +- Optimized the cost estimation algorithm for semi/anti joins, enabling more accurate selection of semi/anti join orders. [#37951](https://github.com/apache/doris/pull/37951) + +- Adjusted the filter estimation algorithm for cases where some columns have no statistical information, leading to more accurate cardinality estimates. 
[#39592](https://github.com/apache/doris/pull/39592) + +- Modified the instance calculation logic for set operation operators to prevent insufficient parallelism in extreme cases. [#39999](https://github.com/apache/doris/pull/39999) + +- Adjusted the usage strategy of bucket shuffle, achieving better performance when data is not sufficiently shuffled. [#36784](https://github.com/apache/doris/pull/36784) + +- Enabled early filtering of window function data, supporting multiple window functions in a single projection. [#38393](https://github.com/apache/doris/pull/38393) + +- When a `NullLiteral` exists in a filter condition, it can now be folded into false, further converted to an `EmptySet` to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Expanded the scope of predicate derivation, reducing data scanning in queries with specific patterns. [#37314](https://github.com/apache/doris/pull/37314) + +- Supported partial short-circuit evaluation logic in partition pruning to improve partition pruning performance, achieving over 100% improvement in specific scenarios. [#38191](https://github.com/apache/doris/pull/38191) + +- Enabled the computation of arbitrary scalar functions within user variables. [#39144](https://github.com/apache/doris/pull/39144) + +- Maintained error messages consistent with MySQL when alias conflicts exist in queries. [#38104](https://github.com/apache/doris/pull/38104) + +### Query Execution + +- Adapted AggState for compatibility from 2.1 to 3.x and fixed coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle when no joins are involved. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to an asynchronous approach to prevent blocking during internal table queries. 
[#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process when building hash tables in Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Reduced the lock holding time for MultiCast operations. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC's keepAliveTime and added a connection monitoring mechanism, reducing the probability of query failures due to RPC errors during query execution. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits are exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Improved the performance of `aes_encrypt`/`aes_decrypt` functions when handling constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the performance of `json_extract` functions when processing constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the performance of ParseURL functions when processing constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Backup Recovery / CCR + +- Restore now supports deleting redundant tablets and partition options. [#39363](https://github.com/apache/doris/pull/39363) + +- Checks storage connectivity when creating a repository. [#39538](https://github.com/apache/doris/pull/39538) + +- Enables binlog to support `DROP TABLE`, allowing CCR to incrementally synchronize `DROP TABLE` operations. [#38541](https://github.com/apache/doris/pull/38541) + +### Compaction + +- Resolves the issue where high-priority compaction tasks were not subject to task concurrency control limits. [#38189](https://github.com/apache/doris/pull/38189) + +- Automatically reduces compaction memory consumption based on data characteristics. [#37486](https://github.com/apache/doris/pull/37486) + +- Fixes an issue where the sequential data optimization strategy could lead to incorrect data in aggregate tables or MOR UNIQUE tables.
[ #38299](https://github.com/apache/doris/pull/38299) + +- Optimizes the rowset selection strategy during compaction during replica replenishment to avoid triggering -235 errors. [#39262](https://github.com/apache/doris/pull/39262) + +### MOW (Merge-On-Write) + +- Optimizes slow column updates caused by concurrent column updates and compactions. [#38682](https://github.com/apache/doris/pull/38682) + +- Fixes an issue where segcompaction during bulk data imports could lead to incorrect MOW data. [#38992](https://github.com/apache/doris/pull/38992) [#39707](https://github.com/apache/doris/pull/39707) + +- Fixes data loss in column updates that may occur after BE restarts. [#39035](https://github.com/apache/doris/pull/39035) + +### Storage Management + +- Adds FE configuration to control whether queries under hot-cold tiering prefer local data replicas. [#38322](https://github.com/apache/doris/pull/38322) + +- Optimizes expired BE report messages to include newly created tablets. [#38839](https://github.com/apache/doris/pull/38839) [#39605](https://github.com/apache/doris/pull/39605) + +- Optimizes replica scheduling priority strategy to prioritize replicas with missing data. [#38884](https://github.com/apache/doris/pull/38884) + +- Prevents tablets with unfinished ALTER jobs from being balanced. [#39202](https://github.com/apache/doris/pull/39202) + +- Enables modifying the number of buckets for tables with list partitioning. [#39688](https://github.com/apache/doris/pull/39688) + +- Prefers querying from online disk services. [#39654](https://github.com/apache/doris/pull/39654) + +- Improves error messages for materialized view base tables that do not support deletion during synchronization. [#39857](https://github.com/apache/doris/pull/39857) + +- Improves error messages for single columns exceeding 4GB. 
[#39897](https://github.com/apache/doris/pull/39897) + +- Fixes an issue where aborted transactions were omitted when plan errors occurred during `INSERT` statements.[#38260](https://github.com/apache/doris/pull/38260) + +- Fixes exceptions during SSL connection closure.[#38677](https://github.com/apache/doris/pull/38677) + +- Fixes an issue where table locks were not held when aborting transactions using labels. [#38842](https://github.com/apache/doris/pull/38842) + +- Fixes `gson pretty` causing large image issues. [#39135](https://github.com/apache/doris/pull/39135) + +- Fixes an issue where the new optimizer did not check for bucket values of 0 in `CREATE TABLE` statements.[#38999](https://github.com/apache/doris/pull/38999) + +- Fixes errors when Chinese column names are included in `DELETE` condition predicates. [#39500](https://github.com/apache/doris/pull/39500) + +- Fixes frequent tablet balancing issues in partition balancing mode. [#39606](https://github.com/apache/doris/pull/39606) + +- Fixes an issue where partition storage policy attributes were lost. [#39677](https://github.com/apache/doris/pull/39677) + +- Fixes incorrect statistics when importing multiple tables within a transaction. [#39548](https://github.com/apache/doris/pull/39548) + +- Fixes errors when deleting random bucket tables. [#39830](https://github.com/apache/doris/pull/39830) + +- Fixes issues where FE fails to start due to non-existent UDFs. [#39868](https://github.com/apache/doris/pull/39868) + +- Fixes inconsistencies in the last failed version between FE master and slave. [#39947](https://github.com/apache/doris/pull/39947) + +- Fixes an issue where related tablets may still be in schema change state when schema change jobs are canceled. [ #39327](https://github.com/apache/doris/pull/39327) + +- Fixes errors when modifying type and column order in a single statement schema change (SC). 
[#39107](https://github.com/apache/doris/pull/39107) + +### Data Loading + +- Improves error messages for -238 errors during imports. [#39182](https://github.com/apache/doris/pull/39182) + +- Allows importing to other partitions while restoring a partition. [#39915](https://github.com/apache/doris/pull/39915) + +- Optimizes the strategy for FE to select BEs during group commit. [#37830](https://github.com/apache/doris/pull/37830) [#39010](https://github.com/apache/doris/pull/39010) + +- Avoids printing stack traces for some common streamload error messages. [#38418](https://github.com/apache/doris/pull/38418) + +- Improves handling of issues where offline BEs may affect import errors. [#38256](https://github.com/apache/doris/pull/38256) + +### Permissions + +- Optimizes access performance after enabling the Ranger authentication plugin. [#38575](https://github.com/apache/doris/pull/38575) +- Optimizes permission strategies for Refresh Catalog/Database/Table operations, allowing users to perform these operations with only SHOW permissions. [#39008](https://github.com/apache/doris/pull/39008) + +## Bug fixes + +### Lakehouse + +- Fixes the issue where switching catalogs may result in an error of not finding the database. [#38114](https://github.com/apache/doris/pull/38114) + +- Addresses exceptions caused by attempting to read non-existent data on S3. [#38253](https://github.com/apache/doris/pull/38253) + +- Resolves the issue where specifying an abnormal path during export operations may lead to incorrect export locations. [#38602](https://github.com/apache/doris/pull/38602) + +- Fixes the timezone issue for time columns in Paimon tables. [#37716](https://github.com/apache/doris/pull/37716) + +- Temporarily disables the Parquet PageIndex feature to avoid certain erroneous behaviors. + +- Corrects the selection of Backend nodes in the blacklist during external table queries. 
[#38984](https://github.com/apache/doris/pull/38984) + +- Resolves errors caused by missing subcolumns in Parquet Struct column types.[#39192](https://github.com/apache/doris/pull/39192) + +- Addresses several issues with predicate pushdown in JDBC Catalog. [#39082](https://github.com/apache/doris/pull/39082) + +- Fixes issues where some historical Parquet formats led to incorrect query results. [#39375](https://github.com/apache/doris/pull/39375) + +- Improves compatibility with ojdbc6 drivers for Oracle JDBC Catalog. [#39408](https://github.com/apache/doris/pull/39408) + +- Resolves potential FE memory leaks caused by Refresh Catalog/Database/Table operations. [#39186](https://github.com/apache/doris/pull/39186) [#39871](https://github.com/apache/doris/pull/39871) + +- Fixes thread leaks in JDBC Catalog under certain conditions. [#39666](https://github.com/apache/doris/pull/39666) [#39582](https://github.com/apache/doris/pull/39582) + +- Addresses potential event processing failures after enabling Hive Metastore event subscription. [#39239](https://github.com/apache/doris/pull/39239) + +- Disables reading Hive Text format tables with custom escape characters and null formats to prevent data errors. [#39869](https://github.com/apache/doris/pull/39869) + +- Resolves issues accessing Iceberg tables created via the Iceberg API under certain conditions. [#39203](https://github.com/apache/doris/pull/39203) + +- Fixes the inability to read Paimon tables stored on HDFS clusters with high availability enabled. [#39876](https://github.com/apache/doris/pull/39876) + +- Addresses errors that may occur when reading Paimon table deletion vectors after enabling file caching. [#39875](https://github.com/apache/doris/pull/39875) + +- Resolves potential deadlocks when reading Parquet files under certain conditions. [#39945](https://github.com/apache/doris/pull/39945) + +### Async Materialized View + +- Fixes the inability to use `SHOW CREATE MATERIALIZED VIEW` on follower FEs. 
[#38794](https://github.com/apache/doris/pull/38794) + +- Unifies the object type of asynchronous materialized views in metadata as tables to enable proper display in data tools. [#38797](https://github.com/apache/doris/pull/38797) + +- Resolves the issue where nested asynchronous materialized views always perform full refreshes. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixes the issue where canceled tasks may show as running after restarting FEs. [ #39424](https://github.com/apache/doris/pull/39424) + +- Addresses incorrect use of contexts, which may lead to unexpected failures of materialized view refresh tasks. [#39690](https://github.com/apache/doris/pull/39690) + +- Resolves issues that may cause varchar type write failures due to unreasonable lengths when creating asynchronous materialized views based on external tables.[#37668](https://github.com/apache/doris/pull/37668) + +- Fixes the potential invalidation of asynchronous materialized views based on external tables after FE restarts or catalog rebuilds. [#39355](https://github.com/apache/doris/pull/39355) + +- Prohibits the use of partition rollup for materialized views with list partitions to prevent the generation of incorrect data. [#38124](https://github.com/apache/doris/pull/38124) + +- Fixes incorrect results when literals exist in the select list during transparent rewriting for aggregation rollup. [#38958](https://github.com/apache/doris/pull/38958) + +- Addresses potential errors during transparent rewriting when queries contain filters like `a = a`. [#39629](https://github.com/apache/doris/pull/39629) + +- Fixes issues where transparent rewriting for direct external table queries fails. [#39041](https://github.com/apache/doris/pull/39041) + +### Semi-Structured Data Management + +- Removes support for prepared statements in the old optimizer. [#39465](https://github.com/apache/doris/pull/39465) + +- Fixes issues with JSON escape character handling. 
[#37251](https://github.com/apache/doris/pull/37251) + +- Resolves issues with duplicate processing of JSON fields. [#38490](https://github.com/apache/doris/pull/38490) + +- Fixes issues with some ARRAY and MAP functions. [#39307](https://github.com/apache/doris/pull/39307) [#39699](https://github.com/apache/doris/pull/39699) [#39757](https://github.com/apache/doris/pull/39757) + +- Resolves complex combinations of inverted index queries and LIKE queries. [#36687](https://github.com/apache/doris/pull/36687) + +### Query Optimizer + +- Fixed the potential partition pruning error issue when the 'OR' condition exists in partition filter conditions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the potential partition pruning error issue when complex expressions are involved. [#39298](https://github.com/apache/doris/pull/39298) + +- Fixed the issue where nullable in `agg_state` subtypes might be planned incorrectly, leading to execution errors. [#37489](https://github.com/apache/doris/pull/37489) + +- Fixed the issue where nullable in set operation operators might be planned incorrectly, leading to execution errors. [#39109](https://github.com/apache/doris/pull/39109) + +- Fixed the incorrect execution priority issue of intersect operator. [#39095](https://github.com/apache/doris/pull/39095) + +- Fixed the NPE issue that may occur when the maximum valid date literal exists in the query. [#39482](https://github.com/apache/doris/pull/39482) + +- Fixed the occasional planning error that results in an illegal slot error during execution. [#39640](https://github.com/apache/doris/pull/39640) + +- Fixed the issue where repeatedly referencing columns in cte may lead to missing data in some columns in the result. [#39850](https://github.com/apache/doris/pull/39850) + +- Fixed the occasional planning error issue when 'case when' exists in the query. 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where IP types cannot be implicitly converted to string types. [#39318](https://github.com/apache/doris/pull/39318) + +- Fixed the potential planning error issue when using multi-dimensional aggregation while the same column and its alias both exist in the select list. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where boolean types might be handled incorrectly when using BE constant folding. [#39019](https://github.com/apache/doris/pull/39019) + +- Fixed the planning error issue caused by `default_cluster:` as a prefix for the database name in expressions. [#39114](https://github.com/apache/doris/pull/39114) + +- Fixed the potential deadlock issue caused by `insert into`. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the potential planning error issue caused by not holding table locks throughout the planning process. [#38950](https://github.com/apache/doris/pull/38950) + +- Fixed the issue where CHAR(0), VARCHAR(0) are not handled correctly when creating tables. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the issue where `show create table` may incorrectly display hidden columns. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the issue where columns with the same name as hidden columns are not prohibited when creating tables. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the occasional planning error issue when executing `insert into as select` with CTEs. [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where `insert into values` cannot automatically fill null default values. [#39122](https://github.com/apache/doris/pull/39122) + +- Fixed the NPE issue caused by declaring a CTE in a delete statement without using it.
[#39379](https://github.com/apache/doris/pull/39379) + +- Fixed the issue where deleting from a randomly distributed aggregation model table fails. [#37985](https://github.com/apache/doris/pull/37985) + +### Query Execution + +- Fixed the issue where the pipeline execution engine gets stuck in multiple scenarios, causing queries not to end. [#38657](https://github.com/apache/doris/pull/38657) [#38206](https://github.com/apache/doris/pull/38206) [#38885](https://github.com/apache/doris/pull/38885) + +- Fixed the coredump issue caused by null and non-null columns in set difference calculations.[#38737](https://github.com/apache/doris/pull/38737) + +- Fixed the incorrect result issue of the `width_bucket` function. [#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error issue when a single row of data is large and the result set is also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the incorrect result issue of `stddev` with DecimalV2 type. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the issue where `insert overwrite auto partition` causes transaction rollback. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the incorrect result issue of the `convert_tz` function. [#37358](https://github.com/apache/doris/pull/37358) [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by the mod function with abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where executing the same expression in multiple threads may lead to incorrect Java UDF results. 
[#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by the incorrect return type of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the unstable result issue of the histogram function. [#38608](https://github.com/apache/doris/pull/38608) + +### Backup & Recovery / CCR + +- Fixed the issue where the data version after backup and recovery may be incorrect, making the data unreadable. [#38343](https://github.com/apache/doris/pull/38343) + +- Fixed the issue of using restore version across versions. [#38396](https://github.com/apache/doris/pull/38396) + +- Fixed the issue where the job is not canceled when backup fails. [#38993](https://github.com/apache/doris/pull/38993) + +- Fixed the NPE issue in CCR during the upgrade from 2.1.4 to 2.1.5, which caused the FE to fail to start. [#39910](https://github.com/apache/doris/pull/39910) + +- Fixed the issue where views and materialized views cannot be used after restoration. [#38072](https://github.com/apache/doris/pull/38072) [#39848](https://github.com/apache/doris/pull/39848) + +### Storage Management + +- Fixed possible memory leaks in routine load when loading multiple tables from a single stream. [#38824](https://github.com/apache/doris/pull/38824) + +- Fixed the issue where delimiters and escape characters in routine load were not effective. [#38825](https://github.com/apache/doris/pull/38825) + +- Fixed incorrect `SHOW ROUTINE LOAD` results when the routine load task name contained uppercase letters. [#38826](https://github.com/apache/doris/pull/38826) + +- Fixed the issue where the offset cache was not reset when changing the routine load topic. [#38474](https://github.com/apache/doris/pull/38474) + +- Fixed the potential exception triggered by `SHOW ROUTINE LOAD` under concurrent scenarios. [#39525](https://github.com/apache/doris/pull/39525) + +- Fixed the issue where routine load might import data repeatedly.
[#39526](https://github.com/apache/doris/pull/39526) + +- Fixed the data error caused by `setNull` when enabling group commit via JDBC. [#38276](https://github.com/apache/doris/pull/38276) + +- Fixed the potential NPE issue when enabling group commit insert to a non-master FE. [#38345](https://github.com/apache/doris/pull/38345) + +- Fixed incorrect error handling during internal data writing in group commit. [#38997](https://github.com/apache/doris/pull/38997) + +- Fixed the coredump that might be triggered when the group commit execution plan failed. [#39396](https://github.com/apache/doris/pull/39396) + +- Fixed the issue where concurrent imports into auto partition tables might report non-existent tablets. [#38793](https://github.com/apache/doris/pull/38793) + +- Fixed potential load stream leakage issues. [#39039](https://github.com/apache/doris/pull/39039) + +- Fixed the issue where transactions were opened for `insert into select` with no data. [#39108](https://github.com/apache/doris/pull/39108) + +- Ignored the single-replica import configuration when using memtable prefetching. [#39154](https://github.com/apache/doris/pull/39154) + +- Fixed the issue where background imports of stream load records might be abnormally aborted upon encountering db deletion. [#39527](https://github.com/apache/doris/pull/39527) + +- Fixed inaccurate error messages when data errors occurred in strict mode. [#39587](https://github.com/apache/doris/pull/39587) + +- Fixed the issue where streamload did not return an error URL upon encountering erroneous data. [#38417](https://github.com/apache/doris/pull/38417) + +- Fixed the issue with the combined use of insert overwrite and auto partition. [#38442](https://github.com/apache/doris/pull/38442) + +- Fixed parsing errors when CSV encountered data where the line delimiter was enclosed by the enclosing character. 
[#38445](https://github.com/apache/doris/pull/38445) + +### Data Exporting + +- Fixed the issue where enabling the `delete_existing_files` property during export operations might result in duplicate deletion of exported data. [#39304](https://github.com/apache/doris/pull/39304) + +### Permissions + +- Fixed the incorrect requirement of ALTER TABLE permission when creating a materialized view. [#38011](https://github.com/apache/doris/pull/38011) + +- Fixed the issue where the db was explicitly displayed as empty when showing routine load. [#38365](https://github.com/apache/doris/pull/38365) + +- Fixed the incorrect requirement of CREATE permission on the original table when using CREATE TABLE LIKE. [#37879](https://github.com/apache/doris/pull/37879) + +- Fixed the issue where grant operations did not check if the object existed. [#39597](https://github.com/apache/doris/pull/39597) + +## Upgrade suggestions + +When upgrading Doris, please follow the principle of not skipping two minor versions and upgrade sequentially. + +For example, if you are upgrading from version 0.15.x to 2.0.x, it is recommended to first upgrade to the latest version of 1.1, then upgrade to the latest version of 1.2, and finally upgrade to the latest version of 2.0. + +For more upgrade information, see the documentation: [Cluster Upgrade](../../admin-manual/cluster-management/upgrade) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md new file mode 100644 index 0000000000000..414229276e6b0 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v2.1/release-2.1.7.md @@ -0,0 +1,180 @@ +--- +{ + "title": "Release 2.1.7", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.7 was officially released on November 10, 2024.** This version brings continuous upgrades and improvements.
Additionally, several fixes have been implemented in areas such as Lakehouse, Async Materialized View, Semi-Structured Data Management, Query Optimizer, and Permission Management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The following global variables will be forcibly set to the following default values: + - `enable_nereids_dml`: true + - `enable_nereids_dml_with_pipeline`: true + - `enable_nereids_planner`: true + - `enable_fallback_to_original_planner`: true + - `enable_pipeline_x_engine`: true +- New columns have been added to the audit log. [#42262](https://github.com/apache/doris/pull/42262) + - For more information, please refer to [docs](https://doris.apache.org/docs/admin-manual/audit-plugin/) + +## New features + +### Async Materialized View + +- Added the `use_for_rewrite` property to asynchronous materialized views to control whether a view participates in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Execution + +- The list of changed session variables is now output in the Profile. [#41016](https://github.com/apache/doris/pull/41016) +- Support for `trim_in`, `ltrim_in`, and `rtrim_in` functions has been added. [#42641](https://github.com/apache/doris/pull/42641) +- Support for several URL functions (`top_level_domain`, `first_significant_subdomain`, `cut_to_first_significant_subdomain`) has been added. [#42916](https://github.com/apache/doris/pull/42916) +- The `bit_set` function has been added. [#42916](https://github.com/apache/doris/pull/42099) +- The `count_substrings` function has been added. [#42055](https://github.com/apache/doris/pull/42055) +- The `translate` and `url_encode` functions have been added.
[#41051](https://github.com/apache/doris/pull/41051) +- The `normal_cdf`, `to_iso8601`, and `from_iso8601_date` functions have been added. [#40695](https://github.com/apache/doris/pull/40695) + + +### Storage Management + +- The `information_schema.table_options` and `table_properties` system tables have been added, supporting the querying of attributes set during table creation. [#34384](https://github.com/apache/doris/pull/34384) +- Support for `bitmap_empty` as a default value has been implemented. [#40364](https://github.com/apache/doris/pull/40364) +- A new session variable `require_sequence_in_insert` has been introduced to control whether a sequence column must be provided when performing `INSERT INTO SELECT` writes to a unique key table. [#41655](https://github.com/apache/doris/pull/41655) + +### Others + +- Allow for generating flame graphs on the BE WebUI page.[#41044](https://github.com/apache/doris/pull/41044) + +## Improvements + +### Lakehouse + +- Support for writing data to Hive text format tables. [#40537](https://github.com/apache/doris/pull/40537) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build) +- Access MaxCompute data using MaxCompute Open Storage API. [#41610](https://github.com/apache/doris/pull/41610) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/database/max-compute) +- Support for Paimon DLF Catalog. 
[#41694](https://github.com/apache/doris/pull/41694) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/paimon) +- Added `table$partitions` syntax to directly query Hive partition information.[#41230](https://github.com/apache/doris/pull/41230) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive) +- Support for reading Parquet files in brotli compression format.[#42162](https://github.com/apache/doris/pull/42162) +- Support for reading DECIMAL 256 types in Parquet files. [#42241](https://github.com/apache/doris/pull/42241) +- Support for reading Hive tables in OpenCsvSerde format.[#42939](https://github.com/apache/doris/pull/42939) + +### Async Materialized View + +- Refined the granularity of lock holding during the build process for asynchronous materialized views. [#40402](https://github.com/apache/doris/pull/40402) [#41010](https://github.com/apache/doris/pull/41010). + +### Query optimizer + +- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. [#40457](https://github.com/apache/doris/pull/40457) +- Runtime filters can now be generated in more scenarios to improve query performance. [#40815](https://github.com/apache/doris/pull/40815) +- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. [#40820](https://github.com/apache/doris/pull/40820) +- Optimized the column pruning algorithm to enhance query performance. [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Supported parallel preparation to reduce the time consumed by short queries. [#40270](https://github.com/apache/doris/pull/40270) +- Corrected the names of some counters in the profile to match the audit logs. [#41993](https://github.com/apache/doris/pull/41993) +- Added new local shuffle rules to speed up certain queries. 
[#40637](https://github.com/apache/doris/pull/40637) + +### Storage Management + +- The `SHOW PARTITIONS` command now supports displaying the commit version. [#28274](https://github.com/apache/doris/pull/28274) +- Checked for unreasonable partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Optimized the scheduling logic when encountering EOF in Routine Load. [#40509](https://github.com/apache/doris/pull/40509) +- Made Routine Load aware of schema changes. [#40508](https://github.com/apache/doris/pull/40508) +- Improved the timeout logic for Routine Load tasks. [#41135](https://github.com/apache/doris/pull/41135) + +### Others + +- Allowed closing the built-in service port of BRPC via BE configuration. [#41047](https://github.com/apache/doris/pull/41047) +- Fixed issues with missing fields and duplicate records in audit logs. [#43015](https://github.com/apache/doris/pull/43015) + +## Bug fixes + +### Lakehouse + +- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. [#39840](https://github.com/apache/doris/pull/39840) +- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. [#40424](https://github.com/apache/doris/pull/40424) +- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. [#40923](https://github.com/apache/doris/pull/40923) +- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. [#41266](https://github.com/apache/doris/pull/41266) +- Fixed errors in reading Snappy compressed formats in certain scenarios. [#40862](https://github.com/apache/doris/pull/40862) +- Addressed potential FileSystem leaks on the FE side in certain scenarios. [#41108](https://github.com/apache/doris/pull/41108) +- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. 
[#41231](https://github.com/apache/doris/pull/41231) +- Fixed the inability to read tables in Paimon parquet format. [#41487](https://github.com/apache/doris/pull/41487) +- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. [#41407](https://github.com/apache/doris/pull/41407) +- Disabled predicate pushing down after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. [#42242](https://github.com/apache/doris/pull/42242) +- Fixed issues with case-sensitive access to table names in the External Catalog. [#42261](https://github.com/apache/doris/pull/42261) + +### Async Materialized View + +- Fixed the issue where user-specified start times were not effective. [#39573](https://github.com/apache/doris/pull/39573) +- Resolved the issue of nested materialized views not refreshing. [#40433](https://github.com/apache/doris/pull/40433) +- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. [#41762](https://github.com/apache/doris/pull/41762) +- Addressed issues where partition compensation rewrites could lead to incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed potential errors in rewrite results when `sql_select_limit` was set. [#40106](https://github.com/apache/doris/pull/40106) + +### Semi-Structured Data Management + +- Fixed the issue of index file handle leaks. [#41915](https://github.com/apache/doris/pull/41915) +- Addressed inaccuracies in the `count()` function of inverted indexes in special cases. [#41127](https://github.com/apache/doris/pull/41127) +- Fixed exceptions with variant when light schema change was not enabled. [#40908](https://github.com/apache/doris/pull/40908) +- Resolved memory leaks when variant returns arrays. 
[#41339](https://github.com/apache/doris/pull/41339) + +### Query optimizer + +- Corrected potential errors in nullable calculations for filter conditions during external table queries, which led to execution exceptions. [#41014](https://github.com/apache/doris/pull/41014) +- Fixed potential errors in optimizing range comparison expressions. [#41356](https://github.com/apache/doris/pull/41356) + +### Query Execution + +- Fixed the issue where the `match_regexp` function could not correctly handle empty strings. [#39503](https://github.com/apache/doris/pull/39503) +- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Fixed errors in the results of the `date_floor` function. [#41948](https://github.com/apache/doris/pull/41948) +- Addressed incorrect cancel messages in some scenarios. [#41798](https://github.com/apache/doris/pull/41798) +- Fixed issues with excessive warning logs printed by arrow flight. [#41770](https://github.com/apache/doris/pull/41770) +- Resolved issues where runtime filters failed to send in some scenarios. [#41698](https://github.com/apache/doris/pull/41698) +- Fixed problems where some system table queries could not end normally or became stuck. [#41592](https://github.com/apache/doris/pull/41592) +- Addressed incorrect results from window functions. [#40761](https://github.com/apache/doris/pull/40761) +- Fixed issues where the encrypt and decrypt functions caused BE cores. [#40726](https://github.com/apache/doris/pull/40726) +- Resolved errors in the results of the conv function. [#40530](https://github.com/apache/doris/pull/40530) + +### Storage Management + +- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. [#38003](https://github.com/apache/doris/pull/38003) +- Addressed inaccurate memory statistics during the Memtable flush phase during imports. 
[#39536](https://github.com/apache/doris/pull/39536) +- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. [#40477](https://github.com/apache/doris/pull/40477) +- Resolved inaccurate bvar statistics with Memtable migration. [#40985](https://github.com/apache/doris/pull/40985) +- Fixed inaccurate progress reporting for S3 loads. [#40987](https://github.com/apache/doris/pull/40987) + +### Permissions + +- Fixed permission issues related to show columns, show sync, and show data from db.table. [#39726](https://github.com/apache/doris/pull/39726) + +### Others + +- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. [#41400](https://github.com/apache/doris/pull/41400) diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md new file mode 100644 index 0000000000000..baa62b37e1e75 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.0.md @@ -0,0 +1,469 @@ +--- +{ + "title": "Release 3.0.0", + "language": "en" +} +--- + + + + +We are excited to announce the release of Apache Doris 3.0! + +**Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. Additionally, users can take advantage of low-cost shared storage systems such as object storage or HDFS to significantly reduce storage costs.** + +Version 3.0 marks a milestone in the evolution of Apache Doris towards a unified data lake and data warehouse architecture. 
This version introduces the ability to write data back to data lakes, allowing users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Apache Doris. With capabilities such as asynchronous materialized views, Apache Doris can serve as a unified data processing engine for enterprises, helping users better manage data across lakes, warehouses, and databases. Also, Apache Doris 3.0 introduces the Trino Connector. It allows users to quickly connect or adapt to more data sources, and leverage the high-performance compute engine of Doris to deliver faster query results than Trino. + +Version 3.0 also enhances support for ETL batch processing scenarios, adding explicit transaction support for operations like `insert into select`, `delete` and `update`. The observability of query execution has also been improved. + +In terms of performance, we have improved the framework capabilities, infrastructure, and rules of the query optimizer in version 3.0. This provides optimized performance, which has been proven by blind testing in more complex and diverse business scenarios. + +The adaptive Runtime Filter computation method now accurately estimates filters based on data size during execution, delivering better performance under large data volumes and high loads. Additionally, asynchronous materialized view has been more stable and user-friendly in query acceleration and data modeling. + +**During the development of version 3.0, over 200 contributors submitted nearly 5,000 optimizations** and fixes to Apache Doris. Contributors from companies such as VeloDB, Baidu, Meituan, ByteDance, Tencent, Alibaba, Kwai, Huawei, and Tianyi Cloud actively collaborated with the community, contributing test cases from real-world use cases to help us improve Apache Doris. We extend our heartfelt thanks to all the contributors involved in the development, testing, and feedback process for this release. 
+ +- **GitHub**: https://github.com/apache/doris/releases + +- **Website**: https://doris.apache.org/download + +## 1. Compute-storage decoupled mode + +Since V3.0, Apache Doris supports the compute-storage decoupled mode. Users can choose between it and the compute-storage coupled mode during cluster deployment. + +In the compute-storage decoupled mode, the BE nodes no longer store the data, but instead, a shared storage layer (HDFS and object storage) is introduced as the shared data storage layer. The computing and storage resources can be scaled independently, bringing multiple benefits to users: + +- **Workload isolation**: Multiple compute clusters can share the same data, allowing users to isolate different business workloads or offline loads using separate compute clusters. + +- **Reduced storage costs**: The full dataset is stored in the more cost-effective and highly reliable shared storage, with only hot data cached locally. Compared to the compute-storage coupled mode with three data replicas, the storage cost can be reduced by up to 90%. + +- **Elastic computing resources**: Since no data is stored on the BE nodes, the computing resources can be scaled flexibly based on the load requirements. Users can scale in or out an individual compute cluster or increase/decrease the number of compute clusters. This also leads to cost savings. + +- **Improved system robustness**: By storing the data in shared storage, Doris no longer needs to handle the complex logic of multi-replica consistency, thus simplifying distributed storage complexity and improving the overall system robustness. + +- **Flexible data sharing and cloning**: The flexibility of the compute-storage decoupled mode extends beyond a single Doris cluster. Tables from one Doris cluster can be easily cloned to another Doris cluster, with just metadata replication. + +### 1-1. 
From coupled to decoupled + +In the compute-storage coupled mode, the Apache Doris architecture consists of two main process types: Frontend (FE) and Backend (BE). The FE is primarily responsible for user request access, query parsing and planning, metadata management, and node management. The BE is responsible for data storage and query plan execution. + +The BE nodes employ an MPP (Massively Parallel Processing) distributed computing architecture, leveraging a multi-replica consistency protocol to ensure high service availability and high data reliability. + +![From coupled to decoupled](/images/storage-compute-decoupled.PNG) + + +The maturation of emerging cloud computing infrastructure, including public clouds, private clouds, and Kubernetes-based container platforms, has driven the need for cloud-native capabilities. Increasingly, users are seeking deeper integration between Apache Doris and cloud computing infrastructure to provide more elasticity. + +**To address this need, the VeloDB team has designed and implemented a cloud-native version of Apache Doris that decouples compute and storage, known as VeloDB Cloud. After extensive production testing and refinement across hundreds of enterprises over a long time, this cloud-native solution has now been contributed to the Apache Doris community, manifesting as the Apache Doris 3.0 in the compute-storage decoupled mode.** + +In the compute-storage decoupled mode, the Apache Doris architecture consists of three layers: + +- **Meta data layer**: A new Meta Service module has been introduced to provide meta data services, such as processing database and table information, schemas, rowset meta, and transactions. The Meta Service is stateless and horizontally scalable. In V3.0, all of the BE's meta data and parts of the FE's meta data have been migrated to the Meta Service. We will finish the migration of the remains in future versions. 
+- **Computation layer**: The stateless BE nodes execute query plans and cache a portion of the data and tablet meta data locally to improve query performance. Multiple stateless BE nodes can be organized into a computing resource pool (i.e., compute cluster), and multiple compute clusters can share the same data and metadata service. The compute clusters can be elastically scaled by adding or removing nodes as needed. +- **Shared storage layer**: Data is persisted to the shared storage layer, which currently supports HDFS as well as various cloud-based object storage systems that are compatible with the S3 protocol, such as S3, OSS, GCS, Azure Blob, COS, BOS, and MinIO. + +![From coupled to decoupled-2](/images/storage-compute-decoupled-2.JPEG) + +### 1-2 Design highlight + +The design of the compute-storage decoupled mode of Apache Doris highlights the transformation of the FE's in-memory metadata model into a shared metadata service. This approach offers a globally consistent state view, allowing any node to directly submit writes without needing to go through the FE for publishing. During write operations, data is stored in shared storage, while metadata is managed by the metadata service. **This effectively controls the number of small files in shared storage. Meanwhile, the real-time write performance for individual tables is nearly on par with that in the compute-storage coupled mode. The system's overall write capacity is no longer limited by the processing power of a single FE node.** + +![Design highlight](/images/design-hightlight.PNG) + +Based on the globally consistent state view, for data garbage collection, we have adopted a design approach for data deletion that is easier to prove correct and more efficient. + +Specifically, data in the shared storage is incorporated into the globally consistent view offered by the shared meta data service. Whenever data is generated, we bind it to a separate, independent transaction. 
Similarly, for a meta data deletion operation, we also bind it to a separate, independent transaction. The purpose of this approach is to ensure that deletion and write operations cannot succeed together. The view records which data needs to be deleted, and the asynchronous deletion process can simply perform a forward deletion of the data based on the transaction records, without the need for reverse garbage collection. + +As the tablet-related meta data in the FE is gradually migrated to the shared meta data service, the scalability of the Doris cluster will no longer be constrained by the memory capacity of a single FE node. Building upon the shared meta data service and the forward data deletion technique, we can conveniently expand functionality such as data sharing and lightweight cloning. + +### 1-3 Comparison with alternative solutions + +Another design of decoupling compute and storage in the industry is to store the data and BE node meta data in a shared object storage or HDFS. However, this approach brings the following problems: + +- **Inability to support real-time writes**: During data writes, the data is mapped to tablets based on the partitioning and bucketing rules, generating segment files and rowset meta data. During the write process, a two-phase commit (Publish) is performed through the FE. When a BE node receives the Publish request, it then sets the rowset as visible. The Publish operation must not fail. If the rowset meta data is stored in the shared storage, the total small file data during the real-time write process would triple the size of the actual data files - one replica of data files, one for rowset meta data, and another for rowset meta data changes during Publish. The Publish operation is driven by a single FE node, so the write capacity of a single table or even the entire system is limited by the FE node's capabilities. 
+ + ![Comparison with alternative solutions](/images/comparison-with-alternative-solutions.png) + + We compared the real-time data write performance of Apache Doris 3.0 with the above-described solution. We simulated 500 concurrent tasks writing 10,000 data files with 500 rows each, and 50 concurrent tasks writing 250 data files with 20,000 rows each, using the same computational resources. + + **The results showed that at 50 concurrent tasks, the micro-batch write performance of Apache Doris was almost identical in the compute-storage coupled and decoupled modes, while the industry solution lagged behind Apache Doris by a factor of 100.** + + At 500 concurrent tasks, the performance of Apache Doris in the compute-storage decoupled mode showed slight degradation, but it still maintained an 11X advantage over the industry solution. To ensure a fair test, Apache Doris did not enable the Group Commit feature (which the industry solution lacks). Enabling Group Commit would further enhance real-time write performance. + + ![Comparison with alternative solutions](/images/real-time-write-performance..png) + + Additionally, the industry solution faces stability and cost issues in terms of real-time data ingestion: + + - Stability concerns: A large number of small files can put pressure on the shared storage, especially HDFS, and introduce stability risks. + + - High object storage request costs: Some public cloud object storage services charge 10 times more for Put and Delete operations than for Get operations. A large number of small files can lead to a significant increase in object storage request costs, which can even exceed the storage costs. + +- **Limited scalability**: Use cases of the compute-storage decoupled mode often handle larger data volumes. Since the FE (Frontend) meta data is entirely in-memory, when the number of tablets reaches a certain high level (e.g. 
tens of millions), the FE's memory pressure can become a bottleneck that limits the overall write throughput of the system. + +- **Potential data deletion logic issues**: In the compute-storage decoupled architecture, data is stored with one single replica. Therefore, the data deletion logic is critical for the system's reliability. The conventional approach of cross-system data deletion by comparing the differences can be challenging. During the write process, there is no way to completely prevent a deletion and a write from succeeding together, which can lead to data loss. Additionally, when the storage system experiences anomalies, the input used for difference calculation may be incorrect, which potentially leads to unintended data deletion. + +- **Data sharing and lightweight cloning**: The flexibility of the decoupled storage-compute architecture can enable future data sharing and lightweight data cloning, reducing the burden of enterprise data management. However, if each cluster has a separate FE, after cloning data across clusters, it becomes difficult to accurately determine which data is no longer referenced and can be safely deleted, as calculating cross-cluster references can easily lead to unintended data deletion. + +By evolving the FE's full in-memory meta data model into a shared meta data service, Apache Doris 3.0 avoids all the aforementioned issues. + +### 1-4 Query performance comparison + +In the compute-storage decoupled mode, data needs to be read from the remote shared storage system, so the main bottleneck becomes network bandwidth rather than disk I/O, as it is in the compute-storage coupled mode. + +To accelerate data access, Apache Doris has implemented a high-speed caching mechanism based on local disks, and provides two cache management policies: LRU (Least Recently Used) and TTL (Time-To-Live). The newly imported data is asynchronously written to the cache to accelerate the first-time access to the latest data. 
If the data required by a query is not in the cache, the system will read the data from the remote storage into memory and synchronously write it to the cache for subsequent queries. + +In use cases involving multiple compute clusters, Apache Doris provides a cache preheating function. When a new compute cluster is established, users can choose to preheat specific data (such as tables or partitions) to further improve query efficiency. + +In this context, we have conducted performance tests with different caching strategies in both the compute-storage coupled and decoupled modes, using the TPC-DS 1TB test dataset. The results are concluded as follows: + +- When the cache is fully hit (i.e., all the data required for the query is loaded into the cache), **the query performance of the compute-storage decoupled mode is on par with that of the compute-storage coupled mode**. + +- When the cache is partially hit (i.e., the cache is cleared before the test, and data is gradually loaded into the cache during the test, with performance continuously improving), the query performance of the compute-storage decoupled mode is about 10% lower than that of the compute-storage coupled mode. This test scenario is the most similar to the real-life use cases. + +- When the cache is completely missed (i.e., the cache is cleared before every SQL execution, simulating an extreme case), the performance loss is around 35%. **Even so, Apache Doris in the compute-storage decoupled mode delivers much higher performance than its alternative solutions.** + +![Query performance comparison](/images/query-performance-comparison.png) + +### 1-5 Write speed comparison + +In terms of write performance, we have simulated two test cases under the same computing resources: batch import and high-concurrency real-time import. 
The comparison of write performance between the compute-storage coupled mode and the compute-storage decoupled mode is as follows: + +- **Batch import**: When importing the 1TB TPC-H and 1TB TPC-DS test datasets, **the write performance of the compute-storage decoupled mode is 20.05% and 27.98% higher than that of the compute-storage coupled mode**, respectively, under the single-replica configuration. During batch import, the segment file size is generally in the range of tens to hundreds of MB. In the compute-storage decoupled mode, the segment files are split into smaller files and concurrently uploaded to the object storage, which can result in higher throughput compared to writing to local disks. In real-life deployments, the compute-storage coupled mode typically uses three replicas, which means the write speed advantage of the compute-storage decoupled mode will be even more pronounced. + +- **High-concurrency real-time import**: as described in the "Comparison with alternative solutions" section. + +![Write speed comparison](/images/write-speed-comparison.png) + +### 1-6 Tips for production environment + +- **Performance**: For real-time data analysis, users can achieve query performance comparable to the compute-storage coupled mode by specifying a TTL (Time-To-Live) for the cache and writing newly ingested data into the cache. To prevent query jitter, users can cache the data generated by background tasks such as compaction and schema changes based on how frequently the data is used. + +- **Workload isolation**: Users can achieve physical resource isolation for different business workloads using multiple compute clusters. For workload isolation within a single compute cluster, users can utilize the Workload Group mechanism to limit and isolate resources for different queries. + +### 1-7 Notes + +- Apache Doris 3.0 does not support the co-existence of the compute-storage coupled mode and the compute-storage decoupled mode. 
Users need to specify one of them during cluster deployment. + +- If users need the compute-storage coupled mode, follow the [documentation](https://doris.apache.org/docs/3.0/install/source-install/compilation-with-docker/) for its deployment and upgrade. We recommend using Doris Manager for quick deployment and cluster upgrades. However, the compute-storage decoupled mode does not yet support Doris Manager deployment and upgrade. We will continue iterating to improve support in future versions. + +- Currently Apache Doris does not support in-place upgrade from V2.1 to the compute-storage decoupled mode of V3.0. For this purpose, users need to perform data migration using tools like X2Doris after deploying the compute-storage decoupled clusters. In the future, we will support migration without service interruption through the CCR (Cross-Cluster Replication) capability. + +:::info +See doc: +https://doris.apache.org/docs/3.0/compute-storage-decoupled/overview/ +::: + +## 2. Data lakehouse + +Apache Doris is positioned as a real-time data warehouse, but it is much more than that. In previous versions, we have consistently pushed beyond the boundaries of traditional data warehouse capabilities, advancing towards a unified data lakehouse. Version 3.0 marks a milestone in this journey, with its capabilities in the lakehouse architecture becoming fully mature. We believe that a unified lakehouse is identified by **boundaryless data** and **lakehouse fusion**: + +**Boundaryless data: Apache Doris serves as a unified query processing engine, breaking down data barriers across different systems. 
It provides a consistent and ultra-fast analysis experience across all data sources, including data warehouses, data lakes, data streams, and local data files.** + +- **Lakehouse query acceleration**: Without the need to migrate data to Apache Doris, users can leverage Doris’ efficient query engine to directly query data stored in data lakes such as Iceberg, Hudi, Paimon, and offline data warehouses like Hive, thereby accelerating query analysis. + +- **Federated analysis**: By extending its catalog and storage plugins, Apache Doris enhances its federated analysis capabilities, allowing users to perform unified analysis across multiple heterogeneous data sources without physically centralizing the data in a single storage system. This enables external table queries and federated joins between internal and external tables, breaking down data silos and providing globally consistent data insights. + +- **Data lake construction**: Apache Doris introduces write-back functionality for Hive and Iceberg, allowing users to directly create Hive and Iceberg tables through Doris and write data into them. This allows users to write internal table data back to the offline lakehouse or process offline lakehouse data using Doris and save the results back into the lakehouse, simplifying and streamlining the data lake construction process. + +**Lakehouse fusion: As data lake architectures become increasingly complex, the costs of technology selection and maintenance rise for users. Achieving consistent fine-grained access control across multiple systems also becomes challenging, and real-time performance suffers. To address this, Apache Doris integrates core features of the data lake, transforming itself into a lightweight, efficient, native real-time lakehouse.** + +- **Real-time data updates**: Starting with version 1.2, Apache Doris enhanced the primary key model by introducing Merge-on-Write, supporting real-time updates. 
This feature allows high-frequency, real-time data updates based on primary key changes from upstream data sources. + +- **Data science and AI computation support**: From version 2.1, Apache Doris, using the efficient Arrow Flight protocol, increased the openness of its storage system and its support for various compute loads, enabling data science and AI computations. + +- **Enhancements for semi-structured and unstructured data**: Apache Doris has introduced support for data types like Array, Map, Struct, JSON, and Variant, with plans to support vector indexing in the future. + +- **Improved resource efficiency by decoupling storage and compute**: With version 3.0, Apache Doris supports a decoupled storage and compute mode, further improving resource efficiency and scalability. + +### 2-1 Faster queries in the data lakehouse + +TPC-H and TPC-DS benchmarks show that Apache Doris achieves average query performance that is 3 to 5 times faster than Trino/Presto. + +In V3.0, we have focused on optimizing query performance for production environments, including: + +- **More granular task splitting strategy**: By adjusting the consistent hashing algorithm and introducing a task sharding weighting mechanism, we ensure balanced query loads across all nodes. + +- **Scheduling optimizations for use cases with numerous partitions and files**: For cases with a large number of files (over 1 million), we have significantly reduced query latency (from 100 seconds to 10 seconds) and alleviated memory pressure on the Frontend (FE) by fetching file shards asynchronously and in batches. + +We will continue to enhance query acceleration performance in real-world business scenarios, improve the actual user experience, and build an industry-leading lakehouse query acceleration engine. + +### 2-2 Federated analysis: more data connectors + +Previous versions of Apache Doris supported connectors for over 10 mainstream data lakehouses, warehouses, and relational databases. 
In V3.0, we have introduced the Trino Connector compatibility framework, which expands the range of data sources that Apache Doris can connect to. With this framework, users can easily adapt their existing setups to access corresponding data sources using Doris and leverage its high-speed computing engine for data analysis. + +Currently, Doris has completed adaptations for Delta Lake, Kudu, BigQuery, Kafka, TPC-H, and TPC-DS. We also encourage contributions from developers to extend this list. + +:::info Note + +See doc: + +- Trino Connector: https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/ + +- TPC-H: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpch/ + +- TPC-DS: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpcds/ + +- Delta Lake: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/deltalake/ + +- Kudu: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/kudu/ + +- BigQuery: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/bigquery/ +::: + + +### 2-3 Data lake building + +In V3.0, we have introduced data writeback functionality for Hive and Iceberg. This allows users to create Hive and Iceberg tables directly through Doris and write data into these tables, and enables users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Doris. + +In future iterations, Apache Doris will further enhance support for data lake table formats and improve the openness of storage APIs. + +:::info Note +See doc: https://doris.apache.org/docs/3.0/lakehouse/datalake-building/hive-build/ +::: + +## 3. Upgraded semi-structured data analysis capabilities + +In versions 2.0 and 2.1, Apache Doris introduced some well-received features such as the inverted index, NGram Bloom Filter, and the Variant data type to support high-performance full-text search and multi-dimensional analysis. 
With them, the storage and processing of complex semi-structured data have been more flexible and efficient. + +In V3.0, we have further enhanced the capabilities in this scenario. + +After extensive testing in production environments, the Variant data type has gained sufficient stability and become the preferred choice for JSON data storage and analysis. In V3.0, we have made multiple optimizations to it: + +- Support for indexing of the Variant data type to accelerate queries, including inverted index, Bloom Filter index, and the built-in ZoneMap index. + +- Support for flexible partial column updates for Unique Key tables containing the Variant data type. + +- Support for the use of the Variant data type in the compute-storage decoupled mode, with optimizations of its metadata storage. + +- Support for exporting the Variant data type to formats such as Parquet and CSV. + +The inverted index, introduced since V2.0, has reached a high level of maturity after more than a year of refinement and is now running in production environments of hundreds of enterprises. In V3.0, we have made multiple optimizations to the inverted index: + +- After performance optimizations, including lock concurrency, Apache Doris outperforms Elasticsearch in key metrics such as query latency and concurrency in real-time reporting analysis. + +- Optimized index file in the compute-storage decoupled mode to reduce remote storage calls and decrease index query latency. + +- Support for the Array data type to accelerate the `array_contains` queries. + +- Enhanced the `match_phrase_*` functionality, including support for slop and phrase prefix matching `match_phrase_prefix`. + +## 4. Enhanced ETL capabilities + +### 4-1. Transaction improvements + +Data processing in data warehouses often involves multiple data changes that need to be handled as a single transaction. V3.0 provides explicit transaction support for `insert into select`, `delete`, and `update` operations. 
Example cases include: + +- **Transactional requirements**: For example, when updating data within a time range, the typical approach is to first delete the data in that time range, and then insert the new data. Because the data might already be in service, queries must see either the old data or the new data, never a partial state. This can be achieved by executing the `delete` and `insert into select` operations in a single transaction. + + ```sql + BEGIN; + DELETE FROM table WHERE date >= "2024-07-01" AND date <= "2024-07-31"; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +- **Simplified handling of failed tasks**: For example, when two `insert into select` operations are executed within a single transaction, if either operation fails, the transaction can be retried directly. + + ```sql + BEGIN WITH LABEL label_etl_1; + INSERT INTO table1 SELECT * FROM stage_table1; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +:::info Note +See doc: https://doris.apache.org/docs/3.0/data-operate/transaction/ +Currently, explicit transaction synchronization is not supported in Cross-Cluster Replication (CCR). +::: + +### 4-2. Improved observability + +- **Real-time profile retrieval**: In previous versions, some complex queries could be computationally expensive because of issues with the execution plan or the data, yet developers could only access the query profile for performance analysis after the query completed. This made it hard to identify execution issues promptly and keep the production environment stable. Now, with the ability to retrieve real-time profiles, V3.0 allows users to monitor query execution while the query is running. It also allows them to better monitor the progress of each ETL job. + +- **`backend_active_tasks` system table**: The `backend_active_tasks` system table provides real-time resource consumption information for each query on each BE node.
Users can analyze this system table using SQL to obtain the resource usage of each query, which helps identify large queries or abnormal workloads. + +## 5. Asynchronous materialized view + +In V3.0, asynchronous materialized view is faster and more stable. It is also more user-friendly for query acceleration and data modeling scenarios. We have restructured the logic for transparent rewrite and expanded its capabilities, making it 2X faster. + +### 5-1 Refresh + +- Support for incremental update of materialized views by partitions and partition roll-ups on materialized views to allow refreshes at different granularities. + +- Support for nested materialized views, which is useful in data modeling scenarios. + +- Support for index creation and sort key specification in asynchronous materialized views, which will improve query performance after the materialized view is hit. + +- Higher usability of materialized view DDL with support for atomically replacing materialized views, allowing modifications to the materialized view definition SQL while keeping the materialized view available. + +- Support for non-deterministic functions in materialized views to better serve daily materialized view creation. + +- Support for trigger-based materialized view refresh, which ensures data consistency in data modeling with nested materialized views. + +- Support for a broader range of SQL patterns for building partitioned materialized views, making the incremental update capability available to more use cases. + +### 5-2 Refresh stability + +- V3.0 supports specifying a Workload Group for building materialized views. This is to limit the resources used by the materialized view build process and ensure that sufficient resources remain available for ongoing queries. + +### 5-3 Transparent rewrite + +- Support for transparent rewrite of more Join types, including derived Joins. 
Even when there is a mismatch of Join types between the query and materialized view, transparent rewrite can still be performed by compensating with additional predicates, as long as the materialized view can provide all the data needed for the query. + +- Support for more aggregate functions for roll-up as well as rewrite of multi-dimensional aggregations like GROUPING SETS, ROLLUP, and CUBE; support rewriting queries with aggregations when the materialized view does not contain aggregations, simplifying Join operations and expression computation. + +- Support for transparent rewrite of nested materialized views, enabling higher performance for complex queries. + +- For partially invalid partitioned materialized views, V3.0 supports `Union All` the base tables for data completion, expanding the applicability of partitioned materialized views. + +### 5-4 Transparent rewrite performance + +- Continuous optimization has been done to improve the transparent rewrite performance, achieving 2X the speed compared to version 2.1.0. + +:::info Note + +See doc: + +https://doris.apache.org/docs/3.0/query/view-materialized-view/query-async-materialized-view + +https://doris.apache.org/docs/3.0/query/view-materialized-view/async-materialized-view/ + +::: + +## 6. Performance improvement + +### 6-1 Smarter optimizer + +In V3.0, the query optimizer has been enhanced in terms of framework capabilities, distributed plan support, optimizer infrastructure, and rule expansion. It provides better optimization capabilities for more complex and diverse business scenarios, with higher blind test performance for complex SQL: + +- **Improved plan enumeration capability**: The key structure Memo for plan enumeration has been restructured and normalized. This improves the efficiency of the Cascades framework in plan enumeration and the possibility of producing better plans. 
Additionally, it fixes incomplete column pruning during the Join Reorder process in older versions, which led to unnecessary overhead of the Join operator, thus improving the execution performance in the relevant scenarios. + +- **Improved distributed plan support**: The distributed query plan has been enhanced to allow aggregation, join, and window function operations to more intelligently identify the data characteristics of intermediate computation results, avoiding ineffective data redistribution operations. Meanwhile, we have optimized the execution under the multi-replica continuous execution mode, making it more data cache-friendly. + +- **Improved optimizer infrastructure**: V3 has fixed several issues in cost model and statistics information estimation. The fixes to the cost model are more adaptable to the evolution of the execution engine, making the execution plan more stable compared to previous versions. + +- **Enhanced Runtime Filter plan support**: On the basis of Join Runtime Filter, V3.0 has expanded the capability of the TopN Runtime Filter to achieve better performance in use cases that involve a TopN operator. + +- **Enriched optimization rule library**: Based on user feedback and internal testing results, we have introduced optimization rules such as Intersect Reorder to enrich the rule set of the optimizer. + +### 6-2 Self-adaptive Runtime Filter + +In previous versions, the generation of Runtime Filter relies on manual setting by users based on statistical information. However, inaccurate settings in certain cases could lead to performance instability. + +In V3.0, Doris implements a self-adaptive Runtime Filter calculation approach. It can estimate the Runtime Filter at runtime based on the data size with high accuracy, enabling better performance in use cases with large data volumes and high workloads. 
+ +### 6-3 Function performance optimization + +- V3.0 has improved the vectorized implementation of dozens of functions, enabling a performance improvement of over 50% for some commonly used functions. +- V3.0 has also made extensive optimizations to the aggregation of nullable data types, enabling a 30% performance improvement. + +### 6-4 Blind test performance improvement + +Our blind tests on V3.0 and V2.1 show that the new version is 7.3% and 6.2% faster in TPC-DS and TPC-H benchmark tests, respectively. + +![Blind test performance improvement](/images/blind-test-performance-improvement.png) + +## 7. New features + +### 7-1 Java UDTF + +Version 3.0 has added support for Java UDTFs. The key operations are as follows: + +- Implementing a UDTF: Similar to a UDF, a UDTF requires the user to implement an `evaluate` method. Note that the return value of a UDTF function must be of the `Array` data type. + + ```java + public class UDTFStringTest { + public ArrayList<String> evaluate(String value, String separator) { + if (value == null || separator == null) { + return null; + } else { + return new ArrayList<>(Arrays.asList(value.split(separator))); + } + } + } + ``` + +- Creating a UDTF: By default, two corresponding functions will be created: `java-utdf` and `java-utdf_outer`. The `_outer` suffix adds a single row of `NULL` data when the table function generates 0 rows of output. + + ```sql + CREATE TABLES FUNCTION java-utdf(string, string) RETURNS array<string> PROPERTIES ( + "file"="file:///pathTo/java-udaf.jar", + "symbol"="org.apache.doris.udf.demo.UDTFStringTest", + "always_nullable"="true", + "type"="JAVA_UDF" + ); + ``` + +:::info + +See doc: https://doris.apache.org/docs/3.0/query/udf/java-user-defined-function/#udtf-1 + +::: + +### 7-2 Generated column + +A generated column is a special column whose value is calculated from the values of other columns rather than directly inserted or updated by the user.
It supports pre-computing the results of expressions and storing them in the database, which is suitable for scenarios that require frequent queries or complex calculations. + +Results can be automatically calculated based on predefined expressions when data is imported or updated, and then stored persistently. In this way, during subsequent queries, the system can directly access these calculated results without performing complex calculations, thereby improving query performance. + +Generated columns are supported since V3.0. When creating a table, you can specify a column as generated column. A generated column automatically calculates values based on the defined expression when data is written. Generated columns allow for more complex expressions to be defined, but the value cannot be explicitly written or set. + +:::info + +See doc: https://doris.apache.org/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/ + +::: + +## 8. Functional improvements + +### 8-1. Materialized view + +We have refactored the selection logic for materialized views and migrated it from the rule-based optimizer (RBO) to the cost-based optimizer (CBO). This aligns the selection logic with that of asynchronous materialized views. This functionality is enabled by default. If any issues are encountered, you can revert to the RBO mode using `set global enable_sync_mv_cost_based_rewrite = false`. + +### 8-2. Routine Load + +In previous versions, the Routine Load functionality faced some usability challenges, such as uneven task scheduling across BE nodes, untimely task scheduling, complex configuration requirements (the need to change multiple FE and BE settings for optimization), insufficient overall stability (where restarts or upgrades could frequently pause Routine Load jobs, requiring manual user intervention to resume). 
+ +To address these issues, we have made extensive optimizations to the Routine Load feature: + +- **Resource scheduling**: We have improved the scheduling balance to make sure that tasks are more evenly distributed across BE nodes. Jobs that encounter unrecoverable errors will be promptly paused to avoid wasting resources on futile scheduling attempts. Additionally, we have improved the timeliness of the scheduling process, which has enhanced the import performance of Routine Load. + +- **Parameter configuration**: Users in most environments no longer need to modify FE and BE configurations for optimization. An automatic adjustment mechanism for the timeout parameter has been introduced to prevent tasks from constantly retrying when cluster pressure increases. + +- **Stability**: We have enhanced the robustness of Doris in various exceptional scenarios, such as FE failovers, BE rolling upgrades, and Kafka cluster anomalies, ensuring continuous stable operation. We have also optimized the Auto Resume mechanism, allowing Routine Load to automatically resume operation after faults are repaired, reducing the need for manual user intervention. + +## 9. Behavior changed + +- `cpu_resource_limit` will no longer be supported, and all types of resource isolation will be implemented through Workload Groups. + +- Please use JDK 17 for Apache Doris 3.0 and later versions. The recommended version is `jdk-17.0.10_linux-x64_bin.tar.gz`. + +## Try Apache Doris 3.0 now! + +Before the official release of version 3.0, the compute-storage decoupled mode of Apache Doris underwent nearly two years of extensive testing and optimization in the production environments of hundreds of enterprises. Contributors from many tech giants have collaborated with the community to provide a significant number of test cases based on their real-world business needs. This has rigorously validated the usability and stability of version 3.0.
+ +We highly recommend users with compute-storage decoupling needs to download version 3.0 and experience it firsthand. + +Going forward, we will accelerate our release iteration cycle to deliver a more stable version experience for all users. Feel free to join us in the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ) and engage directly with the core developers. + +## Credits + +Special thanks to the following contributors who participated in the development, testing, and provided feedback for this version: + +@133tosakarin、@390008457、@924060929、@AcKing-Sam、@AshinGau、@BePPPower、@BiteTheDDDDt、@ByteYue、@CSTGluigi、@CalvinKirs、@Ceng23333、@DarvenDuan、@DongLiang-0、@Doris-Extras、@Dragonliu2018、@Emor-nj、@FreeOnePlus、@Gabriel39、@GoGoWen、@HappenLee、@HowardQin、@Hyman-zhao、@INNOCENT-BOY、@JNSimba、@JackDrogon、@Jibing-Li、@KassieZ、@Lchangliang、@LemonLiTree、@LiBinfeng-01、@LompleZ、@M1saka2003、@Mryange、@Nitin-Kashyap、@On-Work-Song、@SWJTU-ZhangLei、@StarryVerse、@TangSiyang2001、@Tech-Circle-48、@Thearas、@Vallishp、@WinkerDu、@XieJiann、@XuJianxu、@XuPengfei-1020、@Yukang-Lian、@Yulei-Yang、@Z-SWEI、@ZhongJinHacker、@adonis0147、@airborne12、@allenhooo、@amorynan、@bingquanzhao、@biohazard4321、@bobhan1、@caiconghui、@cambyzju、@caoliang-web、@catpineapple、@cjj2010、@csun5285、@dataroaring、@deardeng、@dongsilun、@dutyu、@echo-hhj、@eldenmoon、@elvestar、@englefly、@feelshana、@feifeifeimoon、@feiniaofeiafei、@felixwluo、@freemandealer、@gavinchou、@ghkang98、@gnehil、@hechao-ustc、@hello-stephen、@httpshirley、@hubgeter、@hust-hhb、@iszhangpch、@iwanttobepowerful、@ixzc、@jacktengg、@jackwener、@jeffreys-cat、@kaijchen、@kaka11chen、@kindred77、@koarz、@kobe6th、@kylinmac、@larshelge、@liaoxin01、@lide-reed、@liugddx、@liujiwen-up、@liutang123、@lsy3993、@luwei16、@luzhijing、@lxliyou001、@mongo360、@morningman、@morrySnow、@mrhhsg、@my-vegetable-has-exploded、@mymeiyi、@nanfeng1999、@nextdreamblue、@pingchunzhang、@platoneko、@py023、@qidaye、@qzsee、@raboof、@rohitrs1983、@rotkang、@ryanzryu
、@seawinde、@shoothzj、@shuke987、@sjyango、@smallhibiscus、@sollhui、@sollhui、@spaces-X、@stalary、@starocean999、@superdiaodiao、@suxiaogang223、@taptao、@vhwzx、@vinlee19、@w41ter、@wangbo、@wangshuo128、@whutpencil、@wsjz、@wuwenchi、@wyxxxcat、@xiaokang、@xiedeyantu、@xiedeyantu、@xingyingone、@xinyiZzz、@xy720、@xzj7019、@yagagagaga、@yiguolei、@yongjinhou、@ytwp、@yuanyuan8983、@yujun777、@yuxuan-luo、@zclllyybb、@zddr、@zfr9527、@zgxme、@zhangbutao、@zhangstar333、@zhannngchen、@zhiqiang-hhhh、@ziyanTOP、@zxealous、@zy-kkk、@zzzxl1993、@zzzzzzzs \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md new file mode 100644 index 0000000000000..9b9007e4391aa --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.1.md @@ -0,0 +1,604 @@ +--- +{ + "title": "Release 3.0.1", + "language": "en" +} +--- + + + +Dear community members, the Apache Doris 3.0.1 version was officially released on August 23, 2024, featuring updates and improvements in compute-storage decoupling, lakehouse, semi-structured data analysis, asynchronous materialized views, and more. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior Changes + +### Query Optimizer + +- Added the variable `use_max_length_of_varchar_in_ctas` to control the length behavior of VARCHAR type when executing `CREATE TABLE AS SELECT` (CTAS) operations. [#37069](https://github.com/apache/doris/pull/37069) + + - This variable is set to true by default. + + - When set to true, if the VARCHAR type column originates from a table, the derived length is used; otherwise, the maximum length is used. + + - When set to false, the VARCHAR type will always use the derived length. + +- All data types will now be displayed in lowercase to maintain compatibility with MySQL format. 
[#38012](https://github.com/apache/doris/pull/38012) + +- Multiple query statements in the same query request must now be separated by semicolons. [#38670](https://github.com/apache/doris/pull/38670) + +### Query Execution + +- The default number of parallel tasks after shuffle operations in the cluster is set to 100, which will improve query stability and concurrent processing capability in large clusters. [#38196](https://github.com/apache/doris/pull/38196) + +### Storage + +- The default value of `trash_file_expire_time_sec` has been changed from 86400 seconds to 0 seconds, which means that if files are deleted by mistake and the FE trash is cleared, the data cannot be recovered. + +- The table attribute `enable_mow_delete_on_delete_predicate` (introduced in version 3.0.0) has been renamed to `enable_mow_light_delete`. + +- Explicit transactions are now prohibited from performing delete operations on tables with written data. + +- Heavy schema change operations are prohibited on tables with auto-increment fields. + + + +## New Features + +### Job Scheduling + +- Optimized the execution logic of internal scheduling jobs, decoupling the strong association between start time and immediate execution parameters. Now, tasks can be created with a specified start time or selected for immediate execution, without conflict, enhancing scheduling flexibility. [#36805](https://github.com/apache/doris/pull/36805) + +### Compute-Storage Decoupled + +- Supports dynamic modification of the upper limit for file cache usage. [#37484](https://github.com/apache/doris/pull/37484) + +- Recycler now supports object storage rate limiting and server-side rate limiting retry functionality. [#37663](https://github.com/apache/doris/pull/37663) [#37680](https://github.com/apache/doris/pull/37680) + +### Lakehouse + +- Added the session variable `serde_dialect` to set the output format for complex types. 
[#37039](https://github.com/apache/doris/pull/37039) + +- SQL interception now supports external tables. + + - For more information, refer to the documentation on [SQL Interception](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Insert overwrite now supports Iceberg tables. [#37191](https://github.com/apache/doris/pull/37191) + +### Asynchronous Materialized Views + +- Supports partition roll-up and build at the hourly level. [#37678](https://github.com/apache/doris/pull/37678) + +- Supports atomic replacement of asynchronous materialized view definition statements. [#36749](https://github.com/apache/doris/pull/36749) + +- Transparent rewriting now supports Insert statements. [#38115](https://github.com/apache/doris/pull/38115) + +- Transparent rewriting now supports the VARIANT type. [#37929](https://github.com/apache/doris/pull/37929) + +### Query Execution + +- The group concat function now supports DISTINCT and ORDER BY options. [#38744](https://github.com/apache/doris/pull/38744) + +### Semi-Structured Data Management + +- The ES Catalog now maps `nested` or `object` types in Elasticsearch to the JSON type in Doris. [#37101](https://github.com/apache/doris/pull/37101) + +- Added the `MULTI_MATCH` function, which supports matching keywords across multiple fields and can leverage inverted indexes to accelerate searches. [#37722](https://github.com/apache/doris/pull/37722) + +- Added the `explode_json_object` function, which can unfold objects in JSON data into multiple rows. [#36887](https://github.com/apache/doris/pull/36887) + +- Inverted indexes now support memtable advancement, requiring index construction only once during multi-replica writes, reducing CPU consumption and improving performance. 
[#35891](https://github.com/apache/doris/pull/35891) + +- Added `MATCH_PHRASE` support for positive slop, e.g., `msg MATCH_PHRASE 'a b 2+'` can match instances containing words a and b with a slop of no more than two, and a preceding b; regular slop without the final `+` does not guarantee this order. [#36356](https://github.com/apache/doris/pull/36356) + +### Other + +- Added the FE parameter `skip_audit_user_list`, where user operations specified in this configuration will not be recorded in the audit log. [#38310](https://github.com/apache/doris/pull/38310) + + - For more information, refer to the documentation on [Audit Plugin](https://doris.apache.org/docs/admin-manual/audit-plugin/). + + + +## Improvements + +### Storage + +- Reduced the likelihood of write failures caused by disk balancing within a single BE. [#38000](https://github.com/apache/doris/pull/38000) + +- Decreased memory consumption by the memtable limiter. [#37511](https://github.com/apache/doris/pull/37511) + +- Moved old partitions to the FE trash during partition replacement operations. [#36361](https://github.com/apache/doris/pull/36361) + +- Optimized memory consumption during compaction. [#37099](https://github.com/apache/doris/pull/37099) + +- Added a session variable to control audit logs for JDBC PreparedStatement, with default setting to not print. [#38419](https://github.com/apache/doris/pull/38419) + +- Optimized the logic for selecting BEs for group commits. [#35558](https://github.com/apache/doris/pull/35558) + +- Improved the performance of column updates. [#38487](https://github.com/apache/doris/pull/38487) + +- Optimized the use of `delete bitmap cache`. [#38761](https://github.com/apache/doris/pull/38761) + +- Added a configuration to control query affinity during hot and cold tiering. [#37492](https://github.com/apache/doris/pull/37492) + +### Compute-Storage Decoupled + +- Implemented automatic retries when encountering object storage server rate limiting. 
[#37199](https://github.com/apache/doris/pull/37199) + +- Adapted the number of threads for memtable flush in the compute-storage decoupled mode. [#38789](https://github.com/apache/doris/pull/38789) + +- Added Azure as a compile option to support compilation in environments without Azure support. + +- Optimized the observability of object storage access rate limiting. [#38294](https://github.com/apache/doris/pull/38294) + +- Allowed the file cache TTL queue to perform LRU eviction, enhancing TTL queue usability. [#37312](https://github.com/apache/doris/pull/37312) + +- Optimized the number of balance writeeditlog IO operations in the storage and compute separation mode. [#37787](https://github.com/apache/doris/pull/37787) + +- Improved table creation speed in the storage and compute separation mode by sending tablet creation requests in batches. [#36786](https://github.com/apache/doris/pull/36786) + +- Optimized read failures caused by potential inconsistencies in the local file cache through backoff retries. [#38645](https://github.com/apache/doris/pull/38645) + +### Lakehouse + +- Optimized memory statistics for Parquet/ORC format read and write operations. [#37234](https://github.com/apache/doris/pull/37234) + +- Trino Connector Catalog now supports predicate pushdown. [#37874](https://github.com/apache/doris/pull/37874) + +- Added a session variable `enable_count_push_down_for_external_table` to control whether to enable `count(*)` pushdown optimization for external tables. [#37046](https://github.com/apache/doris/pull/37046) + +- Optimized the read logic for Hudi snapshot reads, returning an empty set when the snapshot is empty, consistent with Spark behavior. [#37702](https://github.com/apache/doris/pull/37702) + +- Improved the read performance of partition columns for Hive tables. [#37377](https://github.com/apache/doris/pull/37377) + +### Asynchronous Materialized Views + +- Improved transparent rewrite plan speed by 20%. 
[#37197](https://github.com/apache/doris/pull/37197) + +- Eliminated roll-up during transparent rewrite if the group key satisfies data uniqueness for better nested matching. [#38387](https://github.com/apache/doris/pull/38387) + +- Transparent rewrite now performs better aggregation elimination to improve the matching success rate of nested materialized views. [#36888](https://github.com/apache/doris/pull/36888) + +### MySQL Compatibility + +- Now correctly populates the database name, table name, and original name in the MySQL protocol result columns. [#38126](https://github.com/apache/doris/pull/38126) + +- Supported the hint format `/*+ func(value) */`. [#37720](https://github.com/apache/doris/pull/37720) + +### Query Optimizer + +- Significantly improved the plan speed for complex queries. [#38317](https://github.com/apache/doris/pull/38317) + +- Adaptively chose whether to perform bucket shuffle based on the number of data buckets to avoid performance degradation in extreme cases. [#36784](https://github.com/apache/doris/pull/36784) + +- Optimized the cost estimation logic for SEMI / ANTI JOIN. [#37951](https://github.com/apache/doris/pull/37951) [#37060](https://github.com/apache/doris/pull/37060) + +- Supported pushing Limit down to the first stage of aggregation to improve performance. [#34853](https://github.com/apache/doris/pull/34853) + +- Partition pruning now supports filter conditions containing the `date_trunc` or `date` function. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- SQL cache now supports query scenarios that include user variables. [#37915](https://github.com/apache/doris/pull/37915) + +- Optimized error messages for invalid aggregation semantics. [#38122](https://github.com/apache/doris/pull/38122) + +### Query Execution + +- Adapted AggState compatibility from 2.1 to 3.x and fixed Coredump issues. 
[#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle without Join. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to be asynchronous to prevent stalling during such queries. [#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process during Hash table construction for Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Optimized the duration of lock holding for MultiCast. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC keepAliveTime and added link monitoring to reduce the probability of query failure due to RPC errors. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits were exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Optimized the processing performance of `aes_encrypt`/`decrypt` functions for constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the processing performance of the `json_extract` function for constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the processing performance of the `ParseUrl` function for constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Semi-Structured Data Management + +- Bitmap indexes now default to using inverted indexes, with `enable_create_bitmap_index_as_inverted_index` set to true by default. [#36692](https://github.com/apache/doris/pull/36692) + +- In the compute-storage decoupled mode, DESC can now view sub-columns of VARIANT type. [#38143](https://github.com/apache/doris/pull/38143) + +- Removed the step of checking file existence during inverted index queries to reduce access latency to remote storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Complex types ARRAY / MAP / STRUCT now support `replace_if_not_null` for AGG tables. 
[#38304](https://github.com/apache/doris/pull/38304) + +- Escape characters for JSON data are now supported. [#37176](https://github.com/apache/doris/pull/37176) [#37251](https://github.com/apache/doris/pull/37251) + +- Inverted index queries now behave consistently on MOW tables and DUP tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Optimized the performance of inverted index acceleration for IN queries. [#37395](https://github.com/apache/doris/pull/37395) + +- Reduced unnecessary memory allocation during TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- When creating an inverted index with tokenization, the `support_phrase` option is now automatically enabled to accelerate `match_phrase`-series phrase queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Other + +- The audit log can now record SQL types. [#37790](https://github.com/apache/doris/pull/37790) + +- Added support for `information_schema.processlist` to show all FEs. [#38701](https://github.com/apache/doris/pull/38701) + +- Cached Ranger's `datamask` and `rowpolicy` to speed up queries. [#37723](https://github.com/apache/doris/pull/37723) + +- Optimized metadata management in job manager to release locks immediately after modifying metadata, reducing lock holding time. [#38162](https://github.com/apache/doris/pull/38162) + + + +## Bug Fixes + +### Upgrade + +- Fix the issue where `mtmv load` fails during upgrade from version 2.1. [#38799](https://github.com/apache/doris/pull/38799) + +- Resolve the issue where `null_type` cannot be found during the upgrade to version 2.1. [#39373](https://github.com/apache/doris/pull/39373) + +- Address the compatibility issue with permission persistence during the upgrade from version 2.1 to 3.0. [#39288](https://github.com/apache/doris/pull/39288) + +### Load + +- Fix the issue where parsing fails when the newline character is surrounded by delimiters in CSV format parsing.
[#38347](https://github.com/apache/doris/pull/38347) +- Resolve potential exception issues when FE forwards group commit. [#38228](https://github.com/apache/doris/pull/38228) [#38265](https://github.com/apache/doris/pull/38265) + +- Group commit now supports the new optimizer. [#37002](https://github.com/apache/doris/pull/37002) + +- Fix the issue where group commit reports data errors when JDBC setNull is used. [#38262](https://github.com/apache/doris/pull/38262) + +- Optimize the retry logic for group commit when encountering `delete bitmap lock` errors. [#37600](https://github.com/apache/doris/pull/37600) + +- Resolve the issue where routine load cannot use CSV delimiters and escape characters. [#38402](https://github.com/apache/doris/pull/38402) + +- Fix the issue where routine load job names with mixed case cannot be displayed. [#38523](https://github.com/apache/doris/pull/38523) + +- Optimize the logic for actively recovering routine load during FE master-slave switching. [#37876](https://github.com/apache/doris/pull/37876) + +- Resolve the issue where routine load pauses when all data in Kafka is expired. [#37288](https://github.com/apache/doris/pull/37288) + +- Fix the issue where `show routine load` returns empty results. [#38199](https://github.com/apache/doris/pull/38199) + +- Resolve the memory leak issue during multi-table stream import in routine load. [#38255](https://github.com/apache/doris/pull/38255) + +- Fix the issue where stream load does not return the error URL. [#38325](https://github.com/apache/doris/pull/38325) + +- Resolve potential load channel leak issues. [#38031](https://github.com/apache/doris/pull/38031) [#37500](https://github.com/apache/doris/pull/37500) + +- Fix the issue where no error may be reported when importing fewer segments than expected. [#36753](https://github.com/apache/doris/pull/36753) + +- Resolve the load stream leak issue. 
[#38912](https://github.com/apache/doris/pull/38912) + +- Optimize the impact of offline nodes on import operations. [#38198](https://github.com/apache/doris/pull/38198) + +- Fix the issue where transactions do not end when inserting into empty data. [#38991](https://github.com/apache/doris/pull/38991) + +### Storage + +**01 Backup and Restoration** + +- Fix the issue where tables cannot be written after backup and restoration. [#37089](https://github.com/apache/doris/pull/37089) + +- Resolve the issue where view database names are incorrect after backup and restoration. [#37412](https://github.com/apache/doris/pull/37412) + +**02 Compaction** + +- Fix the issue where cumu compaction handles delete errors incorrectly during ordered data compression. [#38742](https://github.com/apache/doris/pull/38742) + +- Resolve the issue of duplicate keys in aggregate tables caused by sequential compression optimization. [#38224](https://github.com/apache/doris/pull/38224) + +- Fix the issue where compression operations cause coredump in large wide tables. [#37960](https://github.com/apache/doris/pull/37960) + +- Resolve the compression starvation issue caused by inaccurate concurrent statistics of compression tasks. [#37318](https://github.com/apache/doris/pull/37318) + +**03 MOW Unique Key** + +- Resolve the issue of inconsistent data between replicas caused by cumulative compression deletion of delete sign. [#37950](https://github.com/apache/doris/pull/37950) + +- MOW delete now uses partial column updates with the new optimizer. [#38751](https://github.com/apache/doris/pull/38751) + +- Fix the potential duplicate key issue in MOW tables under compute-storage decoupled. [#39018](https://github.com/apache/doris/pull/39018) + +- Resolve the issue where MOW unique and duplicate tables cannot modify column order. [#37067](https://github.com/apache/doris/pull/37067) + +- Fix the potential data correctness issue caused by segcompaction. 
[#37760](https://github.com/apache/doris/pull/37760) + +- Resolve the potential memory leak issue during column updates. [#37706](https://github.com/apache/doris/pull/37706) + +**04 Other** + +- Fix the small probability of exceptions in TOPN queries. [#39119](https://github.com/apache/doris/pull/39119) [#39199](https://github.com/apache/doris/pull/39199) + +- Resolve the issue where auto-increment IDs may duplicate during FE restart. [#37306](https://github.com/apache/doris/pull/37306) + +- Fix the potential queuing issue in the delete operation priority queue. [#37169](https://github.com/apache/doris/pull/37169) + +- Optimize the delete retry logic. [#37363](https://github.com/apache/doris/pull/37363) + +- Resolve the issue with `bucket = 0` in table creation statements under the new optimizer. [#38971](https://github.com/apache/doris/pull/38971) + +- Fix the issue where FE reports success incorrectly when image generation fails. [#37508](https://github.com/apache/doris/pull/37508) + +- Resolve the issue where using the wrong nodename during FE offline nodes may cause inconsistent FE members. [#37987](https://github.com/apache/doris/pull/37987) + +- Fix the issue where CCR partition addition may fail. [#37295](https://github.com/apache/doris/pull/37295) + +- Resolve the `int32` overflow issue in inverted index files. [#38891](https://github.com/apache/doris/pull/38891) + +- Fix the issue where TRUNCATE TABLE failure may cause BE to fail to go offline. [#37334](https://github.com/apache/doris/pull/37334) + +- Resolve the issue where publish cannot continue due to null pointers. [#37724](https://github.com/apache/doris/pull/37724) [#37531](https://github.com/apache/doris/pull/37531) + +- Fix the potential coredump issue when manually triggering disk migration. [#37712](https://github.com/apache/doris/pull/37712) + +### Compute-Storage Decoupled + +- Fixed the issue where `show create table` might display the `file_cache_ttl_seconds` attribute twice. 
[#38052](https://github.com/apache/doris/pull/38052) + +- Fixed the issue where segment Footer TTL was not set correctly after setting file cache TTL. [#37485](https://github.com/apache/doris/pull/37485) + +- Fixed the issue where file cache might cause coredump due to massive conversion of cache types. [#38518](https://github.com/apache/doris/pull/38518) + +- Fixed the potential file descriptor (fd) leak in file cache. [#38051](https://github.com/apache/doris/pull/38051) + +- Fixed the issue where schema change Job overwriting compaction Job prevented base tablet compaction from completing normally. [#38210](https://github.com/apache/doris/pull/38210) + +- Fixed the potential inaccuracy of base compaction score due to data race. [#38006](https://github.com/apache/doris/pull/38006) + +- Fixed the issue where error messages from imports might not be uploaded correctly to object storage. [#38359](https://github.com/apache/doris/pull/38359) + +- Fixed the inconsistency in return information between compute-storage decoupled mode and storage and compute integration mode for 2PC imports. [#38076](https://github.com/apache/doris/pull/38076) + +- Fix the issue where incorrect file size setting during file cache warm-up leads to coredump. [#38939](https://github.com/apache/doris/pull/38939) + +- Fixed the issue where partial column updates did not correctly dequeue delete operations. [#37151](https://github.com/apache/doris/pull/37151) + +- Fixed compatibility issues with permission persistence in compute-storage decoupled mode. [#38136](https://github.com/apache/doris/pull/38136) [#37708](https://github.com/apache/doris/pull/37708) + +- Fixed the issue where observer did not retry correctly when encountering a `-230` error. [#37625](https://github.com/apache/doris/pull/37625) + +- Fixed the issue where `show load` with conditions did not perform correct analysis. 
[#37656](https://github.com/apache/doris/pull/37656) + +- Fixed the issue where `show streamload` in compute-storage decoupled mode caused BE coredump. [#37903](https://github.com/apache/doris/pull/37903) + +- Fixed the issue where `copy into` did not correctly verify column names in strict mode. [#37650](https://github.com/apache/doris/pull/37650) + +- Fixed the issue where multi-stream imports into a single table lacked permissions. [#38878](https://github.com/apache/doris/pull/38878) + +- Fixed the potential overflow issue in `getVersionUpdateTimeMs`. [#38074](https://github.com/apache/doris/pull/38074) + +- Fixed the issue where FE azure blob list was not implemented correctly. [#37986](https://github.com/apache/doris/pull/37986) + +- Fixed the issue where inaccurate azure blob recycling time calculation prevented recycling. [#37535](https://github.com/apache/doris/pull/37535) + +- Fixed the issue where inverted index files were not deleted in compute-storage decoupled mode. [#38306](https://github.com/apache/doris/pull/38306) + +### Lakehouse + +- Fixed the issue with reading binary data from Oracle Catalog. [#37078](https://github.com/apache/doris/pull/37078) + +- Fixed the potential deadlock issue when acquiring external table metadata in multi-FE scenarios. [#37756](https://github.com/apache/doris/pull/37756) + +- Fixed the issue where JNI scanner failure caused BE nodes to crash. [#37697](https://github.com/apache/doris/pull/37697) + +- Fixed the issue with slow reading of date types from Trino Connector Catalog. [#37266](https://github.com/apache/doris/pull/37266) + +- Optimized kerberos authentication logic for Hive Catalog. [#37301](https://github.com/apache/doris/pull/37301) + +- Fixed the issue where region attributes might be parsed incorrectly when parsing MinIO properties. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed the issue where creating too many FileSystems by FE caused memory leaks. 
[#36954](https://github.com/apache/doris/pull/36954) + +- Fixed the issue with reading incorrect time zone information from Paimon. [#37716](https://github.com/apache/doris/pull/37716) + +- Fixed the potential thread leak issue caused by Hive write-back operations. [#36990](https://github.com/apache/doris/pull/36990) + +- Fixed the null pointer issue caused by enabling Hive metastore event synchronization. [#38421](https://github.com/apache/doris/pull/38421) + +- Fixed the issue where error messages were unclear or caused stalling when creating catalogs. [#37551](https://github.com/apache/doris/pull/37551) + +- Fixed the issue where reading Hive text format tables behaved differently from Hive. [#37638](https://github.com/apache/doris/pull/37638) + +- Fixed the logic error when switching between catalogs and databases. [#37828](https://github.com/apache/doris/pull/37828) + +### MySQL Compatibility + +- Fixed the issue where certain flags in the MySQL protocol were set incorrectly when SSL was enabled. [#38086](https://github.com/apache/doris/pull/38086) + +### Asynchronous Materialized Views + +- Fixed the issue where construction might fail when the base table had a very large number of partitions. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed the issue where nested materialized views incorrectly performed full table refreshes even when partition refreshes were possible. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixed the issue where partition refresh could not handle the simultaneous existence of valid and invalid dependencies when analyzing partition dependencies. [#38367](https://github.com/apache/doris/pull/38367) + +- Fixed the issue where the final result containing NULL type might cause asynchronous materialized views to fail. 
[#37019](https://github.com/apache/doris/pull/37019) + +- Fixed the planning error that might occur during transparent rewriting when both synchronous and asynchronous materialized views with the same name were present. [#37311](https://github.com/apache/doris/pull/37311) + +### Synchronous Materialized Views + +- The rewritten synchronous materialized views now can correctly perform partition pruning. [#38527](https://github.com/apache/doris/pull/38527) + +- When rewriting synchronous materialized views, those with unready data are no longer selected. [#38148](https://github.com/apache/doris/pull/38148) + +### Query Optimizer + +- Fixed the deadlock issue that might occur when queries and delete operations are performed simultaneously. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the issue where bucket pruning might incorrectly prune on decimal column buckets. [#37889](https://github.com/apache/doris/pull/37889) + +- Fixed the issue where planning might be incorrect when mark join participates in join reorder. [#39152](https://github.com/apache/doris/pull/39152) + +- Fixed the issue where the result is incorrect when the correlation condition of a correlated subquery is not a simple column. [#37644](https://github.com/apache/doris/pull/37644) + +- Fixed the issue where partition pruning cannot correctly handle or expressions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the planning error that might occur when optimizing the execution order of JOIN and AGG. [#37343](https://github.com/apache/doris/pull/37343) + +- Fixed the issue where `str_to_date` performs incorrect constant folding calculations on datev1 types. [#37360](https://github.com/apache/doris/pull/37360) + +- Fixed the issue where the ACOS function's constant folding returns non-NaN values. [#37932](https://github.com/apache/doris/pull/37932) + +- Fixed the occasional planning error: "The children format needs to be [WhenClause+, DefaultValue?]". 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where planning might be incorrect when the projection includes window functions and there is both the original column and its alias. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where planning might report an error when the aggregation parameter contains a lambda expression. [#37109](https://github.com/apache/doris/pull/37109) + +- Fixed the insert error that might occur in extreme cases: "MultiCastDataSink cannot be cast to DataStreamSink". [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where the new optimizer does not correctly handle `char(0)/varchar(0)` when creating a table. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the incorrect behavior of `char(255) toSql`. [#37340](https://github.com/apache/doris/pull/37340) + +- Fixed the issue where the nullable attribute within the `agg_state` type might lead to planning errors. [#37489](https://github.com/apache/doris/pull/37489) +- Fixed the issue where row count statistics are inaccurate during mark Join. [#38270](https://github.com/apache/doris/pull/38270) + +### Query Execution + +- Fixed issues where the Pipeline execution engine was stuck, causing queries to not end, in multiple scenarios. [#38657](https://github.com/apache/doris/pull/38657), [#38206](https://github.com/apache/doris/pull/38206), [#38885](https://github.com/apache/doris/pull/38885), [#38151](https://github.com/apache/doris/pull/38151), [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed the coredump issue caused by NULL and non-NULL columns during set difference calculations. [#38750](https://github.com/apache/doris/pull/38750) + +- Fixed the error when using the DECIMAL type with pure decimals in delete statements. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed the issue where the `width_bucket` function returned incorrect results. 
[#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error when a single row of data was very large and the result set was also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the coredump issue caused by incorrect release of rpc connections during single-replica imports. [#38087](https://github.com/apache/doris/pull/38087) + +- Fixed the coredump issue caused by processing NULL values with the `foreach` function. [#37349](https://github.com/apache/doris/pull/37349) + +- Fixed the issue where stddev returned incorrect results for DECIMALV2 types. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the slow performance of `bitmap union` calculations. [#37816](https://github.com/apache/doris/pull/37816) + +- Fixed the issue where RowsProduced for aggregation operators was not set in the profile. [#38271](https://github.com/apache/doris/pull/38271) + +- Fixed the overflow issue when calculating the number of buckets for the hash table under hash join. [#37193](https://github.com/apache/doris/pull/37193), [#37493](https://github.com/apache/doris/pull/37493) + +- Fixed the inaccurate recording of the `jemalloc cache memory tracker`. [#37464](https://github.com/apache/doris/pull/37464) + +- Added the `enable_stacktrace` configuration option, allowing users to control whether exception stacks are output in BE logs. [#37713](https://github.com/apache/doris/pull/37713) + +- Fixed the issue where Arrow Flight SQL did not work correctly when `enable_parallel_result_sink` was set to false. [#37779](https://github.com/apache/doris/pull/37779) + +- Fixed the incorrect use of colocate Join. [#37361](https://github.com/apache/doris/pull/37361), [#37729](https://github.com/apache/doris/pull/37729) + +- Fixed the calculation overflow issue of the `round` function on DECIMAL128 types. 
[#37733](https://github.com/apache/doris/pull/37733), [#38106](https://github.com/apache/doris/pull/38106) + +- Fixed the coredump issue when passing a const string to the `sleep` function. [#37681](https://github.com/apache/doris/pull/37681) + +- Increased the queue length for audit logs, solving the issue where audit logs could not be recorded normally under high concurrency scenarios with thousands of concurrent connections. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed the issue where creating a workload group caused too many threads, leading to BE coredump. [#38096](https://github.com/apache/doris/pull/38096) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the transaction rollback issue caused by `insert overwrite auto partition`. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the issue where the TimeUtils formatter did not use the correct time zone. [#37465](https://github.com/apache/doris/pull/37465) + +- Fixed the issue where results were incorrect under constant folding scenarios for week/yearweek. [#37376](https://github.com/apache/doris/pull/37376) + +- Fixed the issue where the `convert_tz` function returned incorrect results. [#37358](https://github.com/apache/doris/pull/37358), [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by `percentile_approx` during rolling upgrades. [#39321](https://github.com/apache/doris/pull/39321) + +- Fixed the coredump issue caused by the `mod` function when encountering abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where the hash table was not fully built when the broadcast join probe started running. 
[#37643](https://github.com/apache/doris/pull/37643) + +- Fixed the issue where executing the same expression in multithreaded environments might lead to incorrect results for Java UDFs. [#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by incorrect return types of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the issue where the `json_replace` function returned incorrect types. [#37014](https://github.com/apache/doris/pull/37014) + +- Fixed the issue where the nullable attribute setting was unreasonable for the `percentile` aggregation function. [#37330](https://github.com/apache/doris/pull/37330) + +- Fixed the issue where the results of the `histogram` function were unstable. [#38608](https://github.com/apache/doris/pull/38608) + +- Fixed the issue where task state was displayed incorrectly in the profile. [#38082](https://github.com/apache/doris/pull/38082) + +- Fixed the issue where some queries were incorrectly canceled when the system just started. [#37662](https://github.com/apache/doris/pull/37662) + +### Semi-Structured Data Management + +- Fix some issues with time series compression. [#39170](https://github.com/apache/doris/pull/39170) [#39176](https://github.com/apache/doris/pull/39176) + +- Fix the issue of incorrect index size statistics during compression. [#37232](https://github.com/apache/doris/pull/37232) + +- Fix the potential incorrect matching of ultra-long strings without tokenization in inverted indexes. [#37679](https://github.com/apache/doris/pull/37679) [#38218](https://github.com/apache/doris/pull/38218) + +- Fix the high memory usage issue of `array_range` and `array_with_const` functions when dealing with large data volumes. [#38284](https://github.com/apache/doris/pull/38284) [#37495](https://github.com/apache/doris/pull/37495) + +- Fix the potential coredump issue when selecting columns of ARRAY / MAP / STRUCT types. 
[#37936](https://github.com/apache/doris/pull/37936) + +- Fix the import failure issue caused by simdjson parsing errors when specifying jsonpath in Stream Load. [#38490](https://github.com/apache/doris/pull/38490) + +- Fix the exception handling issue when there are duplicate keys in JSON data. [#38146](https://github.com/apache/doris/pull/38146) + +- Fix the potential query error after DROP INDEX. [#37646](https://github.com/apache/doris/pull/37646) + +- Fix the error return issue in row merging checks during index compression. [#38732](https://github.com/apache/doris/pull/38732) + +- Inverted index v2 format now supports renaming columns. [#38079](https://github.com/apache/doris/pull/38079) + +- Fix the coredump issue when the `MATCH` function matches an empty string without an index. [#37947](https://github.com/apache/doris/pull/37947) + +- Fix the handling of NULL values in inverted indexes. [#37921](https://github.com/apache/doris/pull/37921) [#37842](https://github.com/apache/doris/pull/37842) [#38741](https://github.com/apache/doris/pull/38741) + +- Fix the incorrect `row_store_page_size` after FE restart. [#38240](https://github.com/apache/doris/pull/38240) + +### Other + +- Fix the timezone configuration issue. The default timezone is no longer fixed at UTC+8 and is now obtained from system configuration. [#37294](https://github.com/apache/doris/pull/37294) + +- Fix the class conflict issue when using ranger due to multiple JSR specification implementations. [#37575](https://github.com/apache/doris/pull/37575) + +- Fix the potential uninitialized field issue in some BE code. [#37403](https://github.com/apache/doris/pull/37403) + +- Fix the error in delete statements for random distributed tables. [#37985](https://github.com/apache/doris/pull/37985) + +- Fix the incorrect requirement for `alter_priv` permission on the base table when creating a synchronized materialized view. 
[#38011](https://github.com/apache/doris/pull/38011) + +- Fix the issue of not authenticating resources when used in TVF. [#36928](https://github.com/apache/doris/pull/36928) + + +## Credits + +Thanks all who contribute to this release: + +@133tosakarin, @924060929, @AshinGau, @Baymine, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @Ceng23333, @DarvenDuan, @FreeOnePlus, @Gabriel39, @HappenLee, @JNSimba, @Jibing-Li, @KassieZ, @Lchangliang, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Tech-Circle-48, @Vallishp, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @cjj2010, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @gavinchou, @ghkang98, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @kaijchen, @kaka11chen, @keanji-x, @liaoxin01, @liutang123, @luwei16, @luzhijing, @lxr599, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @w41ter, @wangbo, @wangshuo128, @whutpencil, @wsjz, @wuwenchi, @wyxxxcat, @xiaokang, @xiedeyantu, @xinyiZzz, @xy720, @xzj7019, @yagagagaga, @yiguolei, @yujun777, @z404289981, @zclllyybb, @zddr, @zfr9527, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zjj, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md new file mode 100644 index 0000000000000..0ab6a828ab95d --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.2.md @@ -0,0 +1,341 @@ +--- +{ + "title": "Release 3.0.2", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.2 version was officially released on October 15, 2024, featuring updates and improvements in compute-storage decoupling, data storage, lakehouse, query optimizer, query execution and more. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +### Storage + +- Limited the number of tablets in a single backup task to prevent FE memory overflow. [#40518](https://github.com/apache/doris/pull/40518) +- The `SHOW PARTITIONS` command now displays the `CommittedVersion` of partitions. [#28274](https://github.com/apache/doris/pull/28274) + +### Other + +- The default printing mode (asynchronous) of `fe.log` now includes file line number information. If performance issues are encountered due to line number output, please switch to BRIEF mode. [#39419](https://github.com/apache/doris/pull/39419) +- The default value of the session variable `ENABLE_PREPARED_STMT_AUDIT_LOG` has been changed from `true` to `false`, and the audit log of prepare statements will no longer be printed. [#38865](https://github.com/apache/doris/pull/38865) +- The default value of the session variable `max_allowed_packet` has been adjusted from 1MB to 16MB to align with MySQL 8.4. [#38697](https://github.com/apache/doris/pull/38697) +- The JVM of FE and BE defaults to using the UTF-8 character set. [#39521](https://github.com/apache/doris/pull/39521) + +## New Features + +### Storage + +- Backup and recovery now support clearing tables or partitions that are not in the backup. [#39028](https://github.com/apache/doris/pull/39028) + +### Compute-Storage Decoupled + +- Support for parallel recycling of expired data on multiple tablets. [#37630](https://github.com/apache/doris/pull/37630) +- Support for changing storage vaults through `ALTER` statements. [#38685](https://github.com/apache/doris/pull/38685) [#37606](https://github.com/apache/doris/pull/37606) +- Support for importing a large number of tablets (5000+) in a single transaction (experimental feature). 
[#38243](https://github.com/apache/doris/pull/38243) +- Support for automatically aborting pending transactions caused by reasons such as node restarts, solving the issue of pending transactions blocking decommission or schema change. [#37669](https://github.com/apache/doris/pull/37669) +- A new session variable `enable_segment_cache` has been added to control whether to use segment cache during queries (default is `true`). [#37141](https://github.com/apache/doris/pull/37141) +- Resolved the issue of not being able to import a large amount of data during schema changes in compute-storage decoupled mode. [#39558](https://github.com/apache/doris/pull/39558) +- Support for adding multiple follower roles of FE in compute-storage decoupled mode. [#38388](https://github.com/apache/doris/pull/38388) +- Support for using memory as file cache to accelerate queries in environments with no disks or low-performance HDDs. [#38811](https://github.com/apache/doris/pull/38811) + +### Lakehouse + +- New Lakesoul Catalog has been added. [Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- A new system table `catalog_meta_cache_statistics` has been added to view the usage of various metadata caches in external catalog. [#40155](https://github.com/apache/doris/pull/40155) + +### Query Optimizer + +- Support for `is [not] true/false` expressions. [#38623](https://github.com/apache/doris/pull/38623) + +### Query Execution + +- A new CRC32 function has been added. [#38204](https://github.com/apache/doris/pull/38204) +- New aggregate functions skew and kurt have been added. [#41277](https://github.com/apache/doris/pull/41277) +- Profiles are now persisted to the FE's disk to retain more profiles. [#33690](https://github.com/apache/doris/pull/33690) +- A new system table `workload_group_privileges` has been added to view permission information related to workload groups. 
[#38436](https://github.com/apache/doris/pull/38436) +- A new system table `workload_group_resource_usage` has been added to monitor resource statistics of workload groups. [#39177](https://github.com/apache/doris/pull/39177) +- Workload groups now support limiting reads of local IO and remote IO. [#39012](https://github.com/apache/doris/pull/39012) +- Workload groups now support cgroupv2 to limit CPU usage. [#39374](https://github.com/apache/doris/pull/39374) +- A new system table `information_schema.partitions` has been added to view some table creation attributes. [#40636](https://github.com/apache/doris/pull/40636) + +### Other + +- Support for using the `SHOW` statement to display BE's configuration information, such as `SHOW BACKEND CONFIG LIKE ${pattern}`. [#36525](https://github.com/apache/doris/pull/36525) + +## Improvements + +### Load + +- Improved the import efficiency of routine load when encountering frequent EOFs from Kafka. [#39975](https://github.com/apache/doris/pull/39975) +- The stream load result now includes the time taken to read HTTP data, `ReceiveDataTimeMs`, which can quickly determine slow stream load issues caused by network reasons. [#40735](https://github.com/apache/doris/pull/40735) +- Optimized the routine load timeout logic to avoid frequent timeouts during inverted index and mow writes. [#40818](https://github.com/apache/doris/pull/40818) + +### Storage + +- Support for batch addition of partitions. [#37114](https://github.com/apache/doris/pull/37114) + +### Compute-Storage Decoupled + +- Added the meta-service HTTP interface `/MetaService/http/show_meta_ranges` to facilitate the statistics of KV distribution in FDB. [#39208](https://github.com/apache/doris/pull/39208) +- The meta-service/recycler stop script ensures that the process fully exits before returning. 
[#40218](https://github.com/apache/doris/pull/40218) +- Support for using the session variable `version_comment` (Cloud Mode) to display the current deployment mode as compute-storage decoupled. [#38269](https://github.com/apache/doris/pull/38269) +- Fixed the detailed message returned when transaction submission fails. [#40584](https://github.com/apache/doris/pull/40584) +- Support for using one meta-service process to provide both metadata services and data recycling services. [#40223](https://github.com/apache/doris/pull/40223) +- Optimized the default configuration of file_cache to avoid potential issues when not set. [#41421](https://github.com/apache/doris/pull/41421) [#41507](https://github.com/apache/doris/pull/41507) +- Improved query performance by batch retrieving the version of multiple partitions. [#38949](https://github.com/apache/doris/pull/38949) +- Delayed the redistribution of tablets to avoid query performance issues caused by temporary network fluctuations. [#40371](https://github.com/apache/doris/pull/40371) +- Optimized the read-write lock logic in the balance. [#40633](https://github.com/apache/doris/pull/40633) +- Enhanced the robustness of file cache in handling TTL filenames during restarts/crashes. [#40226](https://github.com/apache/doris/pull/40226) +- Added the BE HTTP interface `/api/file_cache?op=hash` to facilitate the calculation of the hash file names of segment files on disk. [#40831](https://github.com/apache/doris/pull/40831) +- Optimized the unified naming to be compatible with using compute group to represent BE groups (original cloud cluster). [#40767](https://github.com/apache/doris/pull/40767) +- Optimized the waiting time for obtaining locks when calculating delete bitmaps in primary key tables. [#40341](https://github.com/apache/doris/pull/40341) +- When there are many delete bitmaps in primary key tables, optimized the high CPU consumption during queries by pre-merging multiple delete bitmaps. 
[#40204](https://github.com/apache/doris/pull/40204) +- Support for managing FE/BE nodes in compute-storage decoupled mode through SQL statements, hiding the logic of direct interaction with meta-service when deploying in compute-storage decoupled mode. [#40264](https://github.com/apache/doris/pull/40264) +- Added a script for rapid deployment of FDB. [#39803](https://github.com/apache/doris/pull/39803) +- Optimized the output of `SHOW CACHE HOTSPOT` to unify the column name style with other `SHOW` statements. [#41322](https://github.com/apache/doris/pull/41322) +- When using a storage vault as the storage backend, disallowed the use of `latest_fs()` to avoid binding different storage backends to the same table. [#40516](https://github.com/apache/doris/pull/40516) +- Optimized the timeout strategy for calculating delete bitmaps when importing mow tables. [#40562](https://github.com/apache/doris/pull/40562) [#40333](https://github.com/apache/doris/pull/40333) +- The enable_file_cache in be.conf is now enabled by default in compute-storage decoupled mode. [#41502](https://github.com/apache/doris/pull/41502) + +### Lakehouse + +- When reading tables in CSV format, support for the session `keep_carriage_return` setting to control the reading behavior of the `\r` symbol. [#39980](https://github.com/apache/doris/pull/39980) +- The default maximum memory of BE's JVM has been adjusted to 2GB (affecting only new deployments). [#41403](https://github.com/apache/doris/pull/41403) +- Hive Catalog has added `hive.recursive_directories_table` and `hive.ignore_absent_partitions` properties to specify whether to recursively traverse data directories and whether to ignore missing partitions. [#39494](https://github.com/apache/doris/pull/39494) +- Optimized the Catalog refresh logic to avoid generating a large number of connections during refresh. 
[#39205](https://github.com/apache/doris/pull/39205) +- `SHOW CREATE DATABASE` and `SHOW CREATE TABLE` for external data sources now display location information. [#39179](https://github.com/apache/doris/pull/39179) +- The new optimizer supports inserting data into JDBC external tables using the `INSERT INTO` statement. [#41511](https://github.com/apache/doris/pull/41511) +- MaxCompute Catalog now supports complex data types. [#39259](https://github.com/apache/doris/pull/39259) +- Optimized the logic for reading and merging data shards of external tables. [#38311](https://github.com/apache/doris/pull/38311) +- Optimized some refresh strategies for metadata caches of external tables. [#38506](https://github.com/apache/doris/pull/38506) +- Paimon tables now support pushing down `IN/NOT IN` predicates. [#38390](https://github.com/apache/doris/pull/38390) +- Compatible with tables created in Parquet format by Paimon version 0.9. [#41020](https://github.com/apache/doris/pull/41020) + +### Asynchronous Materialized Views + +- Building asynchronous materialized views now supports the use of both immediate and starttime. [#39573](https://github.com/apache/doris/pull/39573) +- Asynchronous materialized views based on external tables will refresh the metadata cache of the external tables before refreshing the materialized views, ensuring construction based on the latest external table data. [#38212](https://github.com/apache/doris/pull/38212) +- Partition incremental construction now supports rolling up according to weekly and quarterly granularities. [#39286](https://github.com/apache/doris/pull/39286) + +### Query Optimizer + +- The aggregate function `GROUP_CONCAT` now supports the use of both `DISTINCT` and `ORDER BY`. [#38080](https://github.com/apache/doris/pull/38080) +- Optimized the collection and use of statistical information, as well as the logic for estimating row counts and cost calculations, to generate more efficient and stable execution plans. 
+- Window function partition data pre-filtering now supports cases containing multiple window functions. [#38393](https://github.com/apache/doris/pull/38393) + +### Query Execution + +- Reduced query latency by running prepare pipeline tasks in parallel. [#40874](https://github.com/apache/doris/pull/40874) +- Display Catalog information in Profile. [#38283](https://github.com/apache/doris/pull/38283) +- Optimized the computational performance of `IN` filtering conditions. [#40917](https://github.com/apache/doris/pull/40917) +- Supported cgroupv2 in K8S to limit Doris's memory usage. [#39256](https://github.com/apache/doris/pull/39256) +- Optimized the performance of converting strings to datetime types. [#38385](https://github.com/apache/doris/pull/38385) +- When a `string` is a decimal number, support casting it to an `int`, which will be more compatible with certain behaviors of MySQL. [#38847](https://github.com/apache/doris/pull/38847) + +### Semi-Structured Data Management + +- Optimized the performance of inverted index matching. [#41122](https://github.com/apache/doris/pull/41122) +- Temporarily prohibited the creation of inverted indexes with tokenization on arrays. [#39062](https://github.com/apache/doris/pull/39062) +- `explode_json_array` now supports binary JSON types. [#37278](https://github.com/apache/doris/pull/37278) +- IP data types now support bloomfilter indexes. [#39253](https://github.com/apache/doris/pull/39253) +- IP data types now support row storage. [#39258](https://github.com/apache/doris/pull/39258) +- Nested data types such as ARRAY, MAP, and STRUCT now support schema changes. [#39210](https://github.com/apache/doris/pull/39210) +- When creating MTMV, automatically truncate KEYs encountered in VARIANT data types. [#39988](https://github.com/apache/doris/pull/39988) +- Lazy loading of inverted indexes during queries to improve performance. 
[#38979](https://github.com/apache/doris/pull/38979) +- `add inverted index file size for open file`. [#37482](https://github.com/apache/doris/pull/37482) +- Reduced access to object storage interfaces during compaction to improve performance. [#41079](https://github.com/apache/doris/pull/41079) +- Added three new query profile metrics related to inverted indexes. [#36696](https://github.com/apache/doris/pull/36696) +- Reduced cache overhead for non-PreparedStatement SQL to improve performance. [#40910](https://github.com/apache/doris/pull/40910) +- Pre-warming cache now supports inverted indexes. [#38986](https://github.com/apache/doris/pull/38986) +- Inverted indexes are now cached immediately after writing. [#39076](https://github.com/apache/doris/pull/39076) + +### Compatibility + +- Fixed the issue of Thrift ID incompatibility on the master with branch-2.1. [#41057](https://github.com/apache/doris/pull/41057) + +### Other + +- BE HTTP API now supports authentication; set config::enable_all_http_auth to true (default is false) when authentication is required. [#39577](https://github.com/apache/doris/pull/39577) +- Optimized the user permissions required for the REFRESH operation. Permissions have been relaxed from ALTER to SHOW. [#39008](https://github.com/apache/doris/pull/39008) +- Reduced the range of nextId when calling advanceNextId(). [#40160](https://github.com/apache/doris/pull/40160) +- Optimized the caching mechanism for Java UDFs. [#40404](https://github.com/apache/doris/pull/40404) + +## Bug Fixes + +### Load + +- Fixed the issue where `abortTransaction` did not handle return codes. [#41275](https://github.com/apache/doris/pull/41275) +- Fixed the issue where transactions failed to commit or abort in compute-storage decoupled mode without calling `afterCommit/afterAbort`. 
[#41267](https://github.com/apache/doris/pull/41267) +- Fixed the issue where Routine Load could not work properly when modifying consumer offsets in compute-storage decoupled mode. [#39159](https://github.com/apache/doris/pull/39159) +- Fixed the issue of repeatedly closing file handles when obtaining error log file paths. [#41320](https://github.com/apache/doris/pull/41320) +- Fixed the issue of incorrect job progress caching for Routine Load in compute-storage decoupled mode. [#39313](https://github.com/apache/doris/pull/39313) +- Fixed the issue where Routine Load could get stuck when failing to commit transactions in compute-storage decoupled mode. [#40539](https://github.com/apache/doris/pull/40539) +- Fixed the issue where Routine Load kept reporting data quality check errors in compute-storage decoupled mode. [#39790](https://github.com/apache/doris/pull/39790) +- Fixed the issue where Routine Load did not check transactions before committing in compute-storage decoupled mode. [#39775](https://github.com/apache/doris/pull/39775) +- Fixed the issue where Routine Load did not check transactions before aborting in compute-storage decoupled mode. [#40463](https://github.com/apache/doris/pull/40463) +- Fixed the issue where cluster keys did not support certain data types. [#38966](https://github.com/apache/doris/pull/38966) +- Fixed the issue of transactions being repeatedly committed. [#39786](https://github.com/apache/doris/pull/39786) +- Fixed the issue of use after free with WAL when BE exits. [#33131](https://github.com/apache/doris/pull/33131) +- Fixed the issue where WAL playback did not skip completed import transactions in compute-storage decoupled mode. [#41262](https://github.com/apache/doris/pull/41262) +- Fixed the logic for selecting BE in group commit in compute-storage decoupled mode. 
[#39986](https://github.com/apache/doris/pull/39986) [#38644](https://github.com/apache/doris/pull/38644) +- Fixed the issue where BE might crash when group commit was enabled for insert into. [#39339](https://github.com/apache/doris/pull/39339) +- Fixed the issue where insert into with group commit enabled might get stuck. [#39391](https://github.com/apache/doris/pull/39391) +- Fixed the issue where not enabling the group commit option during import might result in a table not found error. [#39731](https://github.com/apache/doris/pull/39731) +- Fixed the issue of transaction submission timeouts due to too many tablets. [#40031](https://github.com/apache/doris/pull/40031) +- Fixed the issue of concurrent opens with Auto Partition. [#38605](https://github.com/apache/doris/pull/38605) +- Fixed the issue of import lock granularity being too large. [#40134](https://github.com/apache/doris/pull/40134) +- Fixed the issue of coredumps caused by zero-length varchars. [#40940](https://github.com/apache/doris/pull/40940) +- Fixed the issue of incorrect index Id values in log prints. [#38790](https://github.com/apache/doris/pull/38790) +- Fixed the issue of memtable shifting not closing BRPC streaming. [#40105](https://github.com/apache/doris/pull/40105) +- Fixed the issue of inaccurate bvar statistics during memtable shifting. [#39075](https://github.com/apache/doris/pull/39075) +- Fixed the issue of multi-replication fault tolerance during memtable shifting. [#38003](https://github.com/apache/doris/pull/38003) +- Fixed the issue of incorrect message length calculations for Routine Load with multiple tables in one stream. [#40367](https://github.com/apache/doris/pull/40367) +- Fixed the issue of inaccurate progress reporting for Broker Load. [#40325](https://github.com/apache/doris/pull/40325) +- Fixed the issue of inaccurate data scan volume reporting for Broker Load. 
[#40694](https://github.com/apache/doris/pull/40694) +- Fixed the issue of concurrency with Routine Load in compute-storage decoupled mode. [#39242](https://github.com/apache/doris/pull/39242) +- Fixed the issue of Routine Load jobs being canceled in compute-storage decoupled mode. [#39514](https://github.com/apache/doris/pull/39514) +- Fixed the issue of progress not being reset when deleting Kafka topics. [#38474](https://github.com/apache/doris/pull/38474) +- Fixed the issue of updating progress during transaction state transitions in Routine Load. [#39311](https://github.com/apache/doris/pull/39311) +- Fixed the issue of Routine Load switching from a paused state to a paused state. [#40728](https://github.com/apache/doris/pull/40728) +- Fixed the issue of Stream Load records being missed due to database deletion. [#39360](https://github.com/apache/doris/pull/39360) + +### Storage + +- Fixed the issue of missing storage policies. [#38700](https://github.com/apache/doris/pull/38700) +- Fixed the issue of errors during cross-version backup and recovery. [#38370](https://github.com/apache/doris/pull/38370) +- Fixed the NPE issue with ccr binlog. [#39909](https://github.com/apache/doris/pull/39909) +- Fixed potential issues with duplicate keys in mow. [#41309](https://github.com/apache/doris/pull/41309) [#39791](https://github.com/apache/doris/pull/39791) [#39958](https://github.com/apache/doris/pull/39958) [#38369](https://github.com/apache/doris/pull/38369) [#38331](https://github.com/apache/doris/pull/38331) +- Fixed the issue of not being able to write after backup and recovery in high-frequency write scenarios. [#40118](https://github.com/apache/doris/pull/40118) [#38321](https://github.com/apache/doris/pull/38321) +- Fixed the issue of data errors potentially triggered by deleting empty strings and schema changes. [#41064](https://github.com/apache/doris/pull/41064) +- Fixed the issue of incorrect statistics due to column updates. 
[#40880](https://github.com/apache/doris/pull/40880) +- Limited the size of tablet meta pb to prevent BE crashes due to oversized meta. [#39455](https://github.com/apache/doris/pull/39455) +- Fixed the potential column misalignment issue with the new optimizer in `begin; insert into values; commit`. [#39295](https://github.com/apache/doris/pull/39295) + +### Compute-Storage Decoupled + +- Fixed the issue where the tablet distribution might be inconsistent across multiple FEs in compute-storage decoupled mode. [#41458](https://github.com/apache/doris/pull/41458) +- Fixed the issue where TVF might not work in multi-computing group environments. [#39249](https://github.com/apache/doris/pull/39249) +- Fixed the issue where compaction used resources that had already been released when BE exited in compute-storage decoupled mode. [#39302](https://github.com/apache/doris/pull/39302) +- Fixed the issue where automatic start-stop might cause FE replay to get stuck. [#40027](https://github.com/apache/doris/pull/40027) +- Fixed the issue where the BE status and the stored status in meta-service were inconsistent. [#40799](https://github.com/apache/doris/pull/40799) +- Fixed the issue where the FE->meta-service connection pool could not automatically expire and reconnect. [#41202](https://github.com/apache/doris/pull/41202) [#40661](https://github.com/apache/doris/pull/40661) +- Fixed the issue where some tablets might repeatedly undergo unexpected balance processes during rebalance. [#39792](https://github.com/apache/doris/pull/39792) +- Fixed the issue where storage vault permissions were lost after FE restarted. [#40260](https://github.com/apache/doris/pull/40260) +- Fixed the issue where tablet row counts and other statistical information might be incomplete due to FDB scan range pagination. [#40494](https://github.com/apache/doris/pull/40494) +- Fixed the performance issue caused by a large number of aborted transactions associated with the same label. 
[#40606](https://github.com/apache/doris/pull/40606) +- Fixed the issue where `commit_txn` did not automatically re-enter, maintaining consistent behavior between compute-storage decoupled and integrated modes. [#39615](https://github.com/apache/doris/pull/39615) +- Fixed the issue where the number of projected columns increased when dropping columns. [#40187](https://github.com/apache/doris/pull/40187) +- Fixed the issue where delete statements did not correctly handle return values, causing data to still be visible after deletion. [#39428](https://github.com/apache/doris/pull/39428) +- Fixed the coredump issue caused by rowset metadata competition during file cache preheating. [#39361](https://github.com/apache/doris/pull/39361) +- Fixed the issue where the entire cache space would be used up when TTL cache enabled LRU eviction. [#39814](https://github.com/apache/doris/pull/39814) +- Fixed the issue where temporary files could not be recycled when importing commit rowset failed with HDFS storage backend. [#40215](https://github.com/apache/doris/pull/40215) + +### Lakehouse + +- Fixed some issues with predicate pushdown in JDBC Catalog. [#39064](https://github.com/apache/doris/pull/39064) +- Fixed the issue of not being able to read when `STRUCT` type columns are missing in Parquet format. [#38718](https://github.com/apache/doris/pull/38718) +- Fixed the issue of FileSystem leaks on the FE side in some cases. [#38610](https://github.com/apache/doris/pull/38610) +- Fixed the issue of metadata cache information being inconsistent when Hive/Iceberg tables write back in some cases. [#40729](https://github.com/apache/doris/pull/40729) +- Fixed the issue of unstable partition ID generation for external tables in some cases. [#39325](https://github.com/apache/doris/pull/39325) +- Fixed the issue of external table queries selecting BE nodes in the blacklist in some cases. 
[#39451](https://github.com/apache/doris/pull/39451) +- Optimized the timeout time for batch retrieval of external table partition information to avoid long-term thread occupation. [#39346](https://github.com/apache/doris/pull/39346) +- Fixed the issue of memory leaks when querying Hudi tables in some cases. [#41256](https://github.com/apache/doris/pull/41256) +- Fixed the issue of connection pool connection leaks in JDBC Catalog in some cases. [#39582](https://github.com/apache/doris/pull/39582) +- Fixed the issue of BE memory leaks in JDBC Catalog in some cases. [#41041](https://github.com/apache/doris/pull/41041) +- Fixed the issue of not being able to query Hudi data on Alibaba Cloud OSS. [#41316](https://github.com/apache/doris/pull/41316) +- Fixed the issue of not being able to read empty partitions in MaxCompute. [#40046](https://github.com/apache/doris/pull/40046) +- Fixed the issue of poor performance when querying Oracle through JDBC Catalog. [#41513](https://github.com/apache/doris/pull/41513) +- Fixed the issue of BE crashes when querying deletion vector of Paimon tables after enabling file cache features. [#39877](https://github.com/apache/doris/pull/39877) +- Fixed the issue of not being able to access Paimon tables on HDFS clusters with HA enabled. [#39806](https://github.com/apache/doris/pull/39806) +- Temporarily disabled the page index filtering feature of Parquet to avoid potential issues. [#38691](https://github.com/apache/doris/pull/38691) +- Fixed the issue of not being able to read unsigned types in Parquet files. [#39926](https://github.com/apache/doris/pull/39926) +- Fixed the issue of potential infinite loops when reading Parquet files in some cases. [#39523](https://github.com/apache/doris/pull/39523) + +### Asynchronous Materialized Views + +- Fixed the issue where partition construction might select the wrong table to track partitions if both sides have the same column names. 
[#40810](https://github.com/apache/doris/pull/40810) +- Fixed the issue where transparent rewrite partition compensation might result in incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed the issue where transparent rewrite did not take effect on external tables. [#38909](https://github.com/apache/doris/pull/38909) +- Fixed the issue where nested materialized views might not refresh properly. [#40433](https://github.com/apache/doris/pull/40433) + +### Synchronous Materialized Views + +- Fixed the issue where creating synchronous materialized views on MOW tables might result in incorrect query results. [#39171](https://github.com/apache/doris/pull/39171) + +### Query Optimizer + +- Fixed the issue where existing synchronous materialized views might not be usable after upgrading. [#41283](https://github.com/apache/doris/pull/41283) +- Fixed the issue of not correctly handling milliseconds when comparing datetime literals. [#40121](https://github.com/apache/doris/pull/40121) +- Fixed the issue of potential errors in conditional function partition pruning. [#39298](https://github.com/apache/doris/pull/39298) +- Fixed the issue where MOW tables with synchronous materialized views could not perform delete operations. [#39578](https://github.com/apache/doris/pull/39578) +- Fixed the issue where the nullable of slots in JDBC external table query predicates might be incorrectly planned, causing query errors. [#41014](https://github.com/apache/doris/pull/41014) + +### Query Execution + +- Fixed the memory leak issue caused by the use of runtime filters. [#39155](https://github.com/apache/doris/pull/39155) +- Fixed the issue of excessive memory usage by window functions. [#39581](https://github.com/apache/doris/pull/39581) +- Fixed a series of function compatibility issues during rolling upgrades. 
[#41023](https://github.com/apache/doris/pull/41023) [#40438](https://github.com/apache/doris/pull/40438) [#39648](https://github.com/apache/doris/pull/39648) +- Fixed the issue of incorrect results with `encryption_function` when used with constants. [#40201](https://github.com/apache/doris/pull/40201) +- Fixed the issue of errors when importing single-table materialized views. [#39061](https://github.com/apache/doris/pull/39061) +- Fixed the issue of incorrect partition result calculations for window functions. [#39100](https://github.com/apache/doris/pull/39100) [#40761](https://github.com/apache/doris/pull/40761) +- Fixed the issue of incorrect calculations for topn when null values are present. [#39497](https://github.com/apache/doris/pull/39497) +- Fixed the issue of incorrect results with the `map_agg` function. [#39743](https://github.com/apache/doris/pull/39743) +- Fixed the issue of incorrect messages returned by cancel. [#38982](https://github.com/apache/doris/pull/38982) +- Fixed the issue of BE core dumps caused by encrypt and decrypt functions. [#40726](https://github.com/apache/doris/pull/40726) +- Fixed the issue of queries getting stuck due to too many scanners in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Supported time types in runtime filters. [#38258](https://github.com/apache/doris/pull/38258) +- Fixed the issue of incorrect results with window funnel functions. [#40960](https://github.com/apache/doris/pull/40960) + +### Semi-Structured Data Management + +- Fixed the issue of match function errors when no indexes were present. [#38989](https://github.com/apache/doris/pull/38989) +- Fixed the issue of crashes when ARRAY data types were used as parameters for array_min/array_max functions. [#39492](https://github.com/apache/doris/pull/39492) +- Fixed the issue of nullable with the `array_enumerate_uniq` function. 
[#38384](https://github.com/apache/doris/pull/38384) +- Fixed the issue of bloomfilter indexes not being updated when adding or deleting columns. [#38431](https://github.com/apache/doris/pull/38431) +- Fixed the issue of es-catalog parsing exceptions with array data. [#39104](https://github.com/apache/doris/pull/39104) +- Fixed the issue of improper predicate push-down in es-catalog. [#40111](https://github.com/apache/doris/pull/40111) +- Fixed the issue of exceptions caused by modifying input data with `map()` and `struct()` functions. [#39699](https://github.com/apache/doris/pull/39699) +- Fixed the issue of index compaction crashes in special cases. [#40294](https://github.com/apache/doris/pull/40294) +- Fixed the issue of ARRAY type inverted indexes missing nullbitmaps. [#38907](https://github.com/apache/doris/pull/38907) +- Fixed the issue of incorrect results with the `count()` function on inverted indexes. [#41152](https://github.com/apache/doris/pull/41152) +- Fixed the issue of incorrect results with the `explode_map` function when using aliases. [#39757](https://github.com/apache/doris/pull/39757) +- Fixed the issue of VARIANT type not being able to use row storage for exceptional JSON data. [#39394](https://github.com/apache/doris/pull/39394) +- Fixed the issue of memory leaks when returning ARRAY results with VARIANT type. [#41358](https://github.com/apache/doris/pull/41358) +- Fixed the issue of changing column names with VARIANT type. [#40320](https://github.com/apache/doris/pull/40320) +- Fixed the issue of potential precision loss when converting VARIANT type to DECIMAL type. [#39650](https://github.com/apache/doris/pull/39650) +- Fixed the issue of nullable handling with VARIANT type. [#39732](https://github.com/apache/doris/pull/39732) +- Fixed the issue of sparse column reading with VARIANT type. [#40295](https://github.com/apache/doris/pull/40295) + +### Other + +- Fixed the compatibility issue between new and old audit log plugins. 
[#41401](https://github.com/apache/doris/pull/41401) +- Fixed the issue where users could see processes of others in certain cases. [#39747](https://github.com/apache/doris/pull/39747) +- Fixed the issue where users with permissions could not export. [#38365](https://github.com/apache/doris/pull/38365) +- Fixed the issue where create table like required create permissions for the existing table. [#37879](https://github.com/apache/doris/pull/37879) +- Fixed the issue where some features did not verify permissions. [#39726](https://github.com/apache/doris/pull/39726) +- Fixed the issue of not correctly closing connections when using SSL. [#38587](https://github.com/apache/doris/pull/38587) +- Fixed the issue where executing ALTER VIEW operations in some cases caused FE to fail to start. [#40872](https://github.com/apache/doris/pull/40872) \ No newline at end of file diff --git a/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md new file mode 100644 index 0000000000000..b15777212b400 --- /dev/null +++ b/versioned_docs/version-1.2/releasenotes/v3.0/release-3.0.3.md @@ -0,0 +1,226 @@ +--- +{ + "title": "Release 3.0.3", + "language": "en" +} +--- + + + + +Dear community members, Apache Doris 3.0.3 was officially released on December 02, 2024. This version further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +- Prohibited column updates on MOW tables with synchronous materialized views. [#40190](https://github.com/apache/doris/pull/40190) +- Adjusted the default parameters of RoutineLoad to improve import efficiency. [#42968](https://github.com/apache/doris/pull/42968) +- When StreamLoad fails, the return value of LoadedRows is adjusted to 0. 
[#41946](https://github.com/apache/doris/pull/41946) [#42291](https://github.com/apache/doris/pull/42291) +- Adjusted the default memory limit of Segment cache to 5%. [#42308](https://github.com/apache/doris/pull/42308) [#42436](https://github.com/apache/doris/pull/42436) + +## New Features + +- Introduced the session variable `enable_cooldown_replica_affinity` to control the affinity of cold and hot tiered replicas. [#42677](https://github.com/apache/doris/pull/42677) + +- Added `table$partition` syntax for querying partition information of Hive tables. [#40774](https://github.com/apache/doris/pull/40774) + + - [View Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/hive) + +- Supported creation of Hive tables in Text format. [#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) + + - [View Documentation](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + +### Asynchronous Materialized Views + +- Introduced new materialized view attribute `use_for_rewrite`. When `use_for_rewrite` is set to false, the materialized view does not participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Optimizer + +- Supported correlated non-aggregate subqueries. [#42236](https://github.com/apache/doris/pull/42236) + +### Query Execution + +- Added functions `ngram_search`, `normal_cdf`, `to_iso8601`, `from_iso8601_date`, `SESSION_USER()`, `last_query_id`. [#38226](https://github.com/apache/doris/pull/38226) [#40695](https://github.com/apache/doris/pull/40695) [#41075](https://github.com/apache/doris/pull/41075) [#41600](https://github.com/apache/doris/pull/41600) [#39575](https://github.com/apache/doris/pull/39575) [#40739](https://github.com/apache/doris/pull/40739) +- The `aes_encrypt` and `aes_decrypt` functions support GCM mode. 
[#40004](https://github.com/apache/doris/pull/40004) +- Profile outputs the changed session variable values. [#41016](https://github.com/apache/doris/pull/41016) [#41318](https://github.com/apache/doris/pull/41318) + +### Semi-structured Data Management + +- Added array functions `array_match_all` and `array_match_any`. [#40605](https://github.com/apache/doris/pull/40605) [#43514](https://github.com/apache/doris/pull/43514) +- The array function `array_agg` supports nesting ARRAY/MAP/STRUCT within ARRAY. [#42009](https://github.com/apache/doris/pull/42009) +- Added approximate aggregate statistical functions `approx_top_k` and `approx_top_sum`. [#44082](https://github.com/apache/doris/pull/44082) + +## Improvements + +### Storage + +- Supported `bitmap_empty` as the default value. [#40364](https://github.com/apache/doris/pull/40364) +- Introduced the session variable `insert_timeout` to control the timeout of DELETE statements. [#41063](https://github.com/apache/doris/pull/41063) +- Improved some error message prompts. [#41048](https://github.com/apache/doris/pull/41048) [#39631](https://github.com/apache/doris/pull/39631) +- Improved the priority scheduling of replica repair. [#41076](https://github.com/apache/doris/pull/41076) +- Enhanced the robustness of timezone handling when creating tables. [#41926](https://github.com/apache/doris/pull/41926) [#42389](https://github.com/apache/doris/pull/42389) +- Checked the validity of partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Supported Unicode-encoded column names in DELETE operations. [#39381](https://github.com/apache/doris/pull/39381) + +### Compute-Storage Decoupled + +- Supported ARM architecture deployment in storage and compute separation mode. 
[#42467](https://github.com/apache/doris/pull/42467) [#43377](https://github.com/apache/doris/pull/43377) +- Optimized the eviction strategy and lock competition of file cache, improving hit rate and high concurrency point query performance. [#42451](https://github.com/apache/doris/pull/42451) [#43201](https://github.com/apache/doris/pull/43201) [#41818](https://github.com/apache/doris/pull/41818) [#43401](https://github.com/apache/doris/pull/43401) +- S3 storage vault supported `use_path_style`, solving the problem of using custom domain names for object storage. [#43060](https://github.com/apache/doris/pull/43060) [#43343](https://github.com/apache/doris/pull/43343) [#43330](https://github.com/apache/doris/pull/43330) +- Optimized storage and compute separation configuration and deployment, preventing misoperations in different modes. [#43381](https://github.com/apache/doris/pull/43381) [#43522](https://github.com/apache/doris/pull/43522) [#43434](https://github.com/apache/doris/pull/43434) [#40764](https://github.com/apache/doris/pull/40764) [#43891](https://github.com/apache/doris/pull/43891) +- Optimized observability and provided an interface for deleting specified segment file cache. [#38489](https://github.com/apache/doris/pull/38489) [#42896](https://github.com/apache/doris/pull/42896) [#41037](https://github.com/apache/doris/pull/41037) [#43412](https://github.com/apache/doris/pull/43412) +- Optimized Meta-service operation and maintenance interface: RPC rate limiting and tablet metadata correction. [#42413](https://github.com/apache/doris/pull/42413) [#43884](https://github.com/apache/doris/pull/43884) [#41782](https://github.com/apache/doris/pull/41782) [#43460](https://github.com/apache/doris/pull/43460) + +### Lakehouse + +- Paimon Catalog supported Alibaba Cloud DLF and OSS-HDFS storage. 
[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) + + - View [Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/paimon) + +- Supported reading of Hive tables in OpenCSV format. [#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) +- Optimized the performance of accessing the `information_schema.columns` table in External Catalog. [#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) +- Used the new Max Compute open storage API to access Max Compute data sources. [#41614](https://github.com/apache/doris/pull/41614) +- Optimized the scheduling policy of the JNI part of Paimon tables, making scan tasks more balanced. [#43310](https://github.com/apache/doris/pull/43310) +- Optimized the read performance of small ORC files. [#42004](https://github.com/apache/doris/pull/42004) [#43467](https://github.com/apache/doris/pull/43467) +- Supported reading of parquet files in brotli compressed format. [#42177](https://github.com/apache/doris/pull/42177) +- Added `file_cache_statistics` table under the `information_schema` library to view metadata cache statistics. [#42160](https://github.com/apache/doris/pull/42160) + +### Query Optimizer + +- Optimization: When queries only differ in comments, the same SQL Cache can be reused. [#40049](https://github.com/apache/doris/pull/40049) +- Optimization: Improved the stability of statistical information when data is frequently updated. [#43865](https://github.com/apache/doris/pull/43865) [#39788](https://github.com/apache/doris/pull/39788) [#43009](https://github.com/apache/doris/pull/43009) [#40457](https://github.com/apache/doris/pull/40457) [#42409](https://github.com/apache/doris/pull/42409) [#41894](https://github.com/apache/doris/pull/41894) +- Optimization: Enhanced the stability of constant folding. 
[#42910](https://github.com/apache/doris/pull/42910) [#41164](https://github.com/apache/doris/pull/41164) [#39723](https://github.com/apache/doris/pull/39723) [#41394](https://github.com/apache/doris/pull/41394) [#42256](https://github.com/apache/doris/pull/42256) [#40441](https://github.com/apache/doris/pull/40441) +- Optimization: Column pruning can generate better execution plans. [#41719](https://github.com/apache/doris/pull/41719) [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Optimized the memory usage of the sort operator. [#39306](https://github.com/apache/doris/pull/39306) +- Optimized the performance of computations on ARM. [#38888](https://github.com/apache/doris/pull/38888) [#38759](https://github.com/apache/doris/pull/38759) +- Optimized the computational performance of a series of functions. [#40366](https://github.com/apache/doris/pull/40366) [#40821](https://github.com/apache/doris/pull/40821) [#40670](https://github.com/apache/doris/pull/40670) [#41206](https://github.com/apache/doris/pull/41206) [#40162](https://github.com/apache/doris/pull/40162) +- Used SSE instructions to optimize the performance of the `match_ipv6_subnet` function. [#38755](https://github.com/apache/doris/pull/38755) +- Supported automatic creation of new partitions during insert overwrite. [#38628](https://github.com/apache/doris/pull/38628) [#42645](https://github.com/apache/doris/pull/42645) +- Added the status of each PipelineTask in Profile. [#42981](https://github.com/apache/doris/pull/42981) +- IP type supported runtime filter. [#39985](https://github.com/apache/doris/pull/39985) + +### Semi-structured Data Management + +- Output the real SQL of prepared statements in audit logs. [#43321](https://github.com/apache/doris/pull/43321) +- The filebeat doris output plugin supports fault tolerance and progress reporting. [#36355](https://github.com/apache/doris/pull/36355) +- Optimized the performance of inverted index queries. 
[#41547](https://github.com/apache/doris/pull/41547) [#41585](https://github.com/apache/doris/pull/41585) [#41567](https://github.com/apache/doris/pull/41567) [#41577](https://github.com/apache/doris/pull/41577) [#42060](https://github.com/apache/doris/pull/42060) [#42372](https://github.com/apache/doris/pull/42372) +- The array function `arrays_overlap` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- The IP function `is_ip_address_in_range` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- Optimized the CAST performance of the VARIANT data type. [#41775](https://github.com/apache/doris/pull/41775) [#42438](https://github.com/apache/doris/pull/42438) [#43320](https://github.com/apache/doris/pull/43320) +- Optimized the CPU resource consumption of the VARIANT data type. [#42856](https://github.com/apache/doris/pull/42856) [#43062](https://github.com/apache/doris/pull/43062) [#43634](https://github.com/apache/doris/pull/43634) +- Optimized the metadata and execution memory resource consumption of the VARIANT data type. [#42448](https://github.com/apache/doris/pull/42448) [#43326](https://github.com/apache/doris/pull/43326) [#41482](https://github.com/apache/doris/pull/41482) [#43093](https://github.com/apache/doris/pull/43093) [#43567](https://github.com/apache/doris/pull/43567) [#43620](https://github.com/apache/doris/pull/43620) + +### Permissions + +- Added a new configuration item `ldap_group_filter` in LDAP for custom group filtering. [#43292](https://github.com/apache/doris/pull/43292) + +### Other + +- Supported displaying connection count information by user in FE monitoring items. [#39200](https://github.com/apache/doris/pull/39200) + +## Bug Fixes + +### Storage + +- Fixed the issue with using IPv6 hostnames. [#40074](https://github.com/apache/doris/pull/40074) +- Fixed the inaccurate display of broker/s3 load progress.
[#43535](https://github.com/apache/doris/pull/43535) +- Fixed the issue where queries might hang from FE. [#41303](https://github.com/apache/doris/pull/41303) [#42382](https://github.com/apache/doris/pull/42382) +- Fixed the issue of duplicate auto-increment IDs under exceptional circumstances. [#43774](https://github.com/apache/doris/pull/43774) [#43983](https://github.com/apache/doris/pull/43983) +- Fixed occasional NPE issues with groupcommit. [#43635](https://github.com/apache/doris/pull/43635) +- Fixed the inaccurate calculation of auto bucket. [#41675](https://github.com/apache/doris/pull/41675) [#41835](https://github.com/apache/doris/pull/41835) +- Fixed the issue where FE might not correctly plan multi-table flows after restart. [#41677](https://github.com/apache/doris/pull/41677) [#42290](https://github.com/apache/doris/pull/42290) + +### Compute-Storage Decoupled + +- Fixed the issue that MOW primary key tables with large delete bitmaps might cause coredump. [#43088](https://github.com/apache/doris/pull/43088) [#43457](https://github.com/apache/doris/pull/43457) [#43479](https://github.com/apache/doris/pull/43479) [#43407](https://github.com/apache/doris/pull/43407) [#43297](https://github.com/apache/doris/pull/43297) [#43613](https://github.com/apache/doris/pull/43613) [#43615](https://github.com/apache/doris/pull/43615) [#43854](https://github.com/apache/doris/pull/43854) [#43968](https://github.com/apache/doris/pull/43968) [#44074](https://github.com/apache/doris/pull/44074) [#41793](https://github.com/apache/doris/pull/41793) [#42142](https://github.com/apache/doris/pull/42142) +- Fixed the issue that segment files, when being a multiple of 5MB, would fail to upload objects. [#43254](https://github.com/apache/doris/pull/43254) +- Fixed the issue that the default retry policy of aws sdk did not take effect. 
[#43575](https://github.com/apache/doris/pull/43575) [#43648](https://github.com/apache/doris/pull/43648) +- Fixed the issue that altering storage vault could continue execution even when the wrong type was specified. [#43489](https://github.com/apache/doris/pull/43489) [#43352](https://github.com/apache/doris/pull/43352) [#43495](https://github.com/apache/doris/pull/43495) +- Fixed the issue that tablet_id might be 0 during the delayed commit process of large transactions. [#42043](https://github.com/apache/doris/pull/42043) [#42905](https://github.com/apache/doris/pull/42905) +- Fixed the issue that constant folding RPC and FE forwarding SQL might not be executed in the expected computation group. [#43110](https://github.com/apache/doris/pull/43110) [#41819](https://github.com/apache/doris/pull/41819) [#41846](https://github.com/apache/doris/pull/41846) +- Fixed the issue that meta-service did not strictly check instance_id upon receiving RPC. [#43253](https://github.com/apache/doris/pull/43253) [#43832](https://github.com/apache/doris/pull/43832) +- Fixed the issue that FE follower information_schema version did not update in time. [#43496](https://github.com/apache/doris/pull/43496) +- Fixed the issue of atomicity in file cache rename and inaccurate metrics. [#42869](https://github.com/apache/doris/pull/42869) [#43504](https://github.com/apache/doris/pull/43504) [#43220](https://github.com/apache/doris/pull/43220) + +### Lakehouse + +- Prohibited implicit conversion predicates from being pushed down to JDBC data sources to avoid inconsistent query results. [#42102](https://github.com/apache/doris/pull/42102) +- Fixed some read issues with high-version Hive transactional tables. [#42226](https://github.com/apache/doris/pull/42226) +- Fixed the issue that the Export command might cause deadlocks.
[#43083](https://github.com/apache/doris/pull/43083) [#43402](https://github.com/apache/doris/pull/43402) +- Fixed the issue of being unable to query Hive views created by Spark. [#43552](https://github.com/apache/doris/pull/43552) +- Fixed the issue that Hive partition paths containing special characters led to incorrect partition pruning. [#42906](https://github.com/apache/doris/pull/42906) +- Fixed the issue that Iceberg Catalog could not use AWS Glue. [#41084](https://github.com/apache/doris/pull/41084) + +### Asynchronous Materialized Views + +- Fixed the issue that asynchronous materialized views might not refresh after the base table is rebuilt. [#41762](https://github.com/apache/doris/pull/41762) + +### Query Optimizer + +- Fixed the issue that partition pruning results might be incorrect when using multi-column range partitioning. [#43332](https://github.com/apache/doris/pull/43332) +- Fixed the issue of incorrect calculation results in some limit offset scenarios. [#42576](https://github.com/apache/doris/pull/42576) + +### Query Execution + +- Fixed the issue that hash join with array types larger than 4G could cause BE Core. [#43861](https://github.com/apache/doris/pull/43861) +- Fixed the issue that is null predicate operations might yield incorrect results in some scenarios. [#43619](https://github.com/apache/doris/pull/43619) +- Fixed the issue that bitmap types might produce incorrect output results in hash join. [#43718](https://github.com/apache/doris/pull/43718) +- Fixed some issues where function results were calculated incorrectly. 
[#40710](https://github.com/apache/doris/pull/40710) [#39358](https://github.com/apache/doris/pull/39358) [#40929](https://github.com/apache/doris/pull/40929) [#40869](https://github.com/apache/doris/pull/40869) [#40285](https://github.com/apache/doris/pull/40285) [#39891](https://github.com/apache/doris/pull/39891) [#40530](https://github.com/apache/doris/pull/40530) [#41948](https://github.com/apache/doris/pull/41948) [#43588](https://github.com/apache/doris/pull/43588) +- Fixed some issues with JSON type parsing. [#39937](https://github.com/apache/doris/pull/39937) +- Fixed issues with varchar and char types in runtime filter operations. [#43758](https://github.com/apache/doris/pull/43758) [#43919](https://github.com/apache/doris/pull/43919) +- Fixed some issues with the use of decimal256 in scalar and aggregate functions. [#42136](https://github.com/apache/doris/pull/42136) [#42356](https://github.com/apache/doris/pull/42356) +- Fixed the issue that arrow flight reported `Reach limit of connections` errors upon connection. [#39127](https://github.com/apache/doris/pull/39127) +- Fixed the issue of incorrect memory usage statistics for BE in k8s environments. [#41123](https://github.com/apache/doris/pull/41123) + +### Semi-structured Data Management + +- Adjusted the default values of `segment_cache_fd_percentage` and `inverted_index_fd_number_limit_percent`. [#42224](https://github.com/apache/doris/pull/42224) +- logstash now supports group_commit. [#40450](https://github.com/apache/doris/pull/40450) +- Fixed the issue of coredump when building index. [#43246](https://github.com/apache/doris/pull/43246) [#43298](https://github.com/apache/doris/pull/43298) +- Fixed issues with variant index. [#43375](https://github.com/apache/doris/pull/43375) [#43773](https://github.com/apache/doris/pull/43773) +- Fixed potential fd and memory leaks under abnormal compaction circumstances. 
[#42374](https://github.com/apache/doris/pull/42374) +- Inverted index match null now correctly returns null instead of false. [#41786](https://github.com/apache/doris/pull/41786) +- Fixed the issue of coredump when ngram bloomfilter index bf_size is set to 65536. [#43645](https://github.com/apache/doris/pull/43645) +- Fixed the issue of potential coredump during complex data type JOINs. [#40398](https://github.com/apache/doris/pull/40398) +- Fixed the issue of coredump with TVF JSON data. [#43187](https://github.com/apache/doris/pull/43187) +- Fixed the precision issue of bloom filter calculations for dates and times. [#43612](https://github.com/apache/doris/pull/43612) +- Fixed the issue of coredump with IPv6 type storage. [#43251](https://github.com/apache/doris/pull/43251) +- Fixed the issue of coredump when using VARIANT type with light_schema_change disabled. [#40908](https://github.com/apache/doris/pull/40908) +- Improved cache performance for high-concurrency point queries. [#44077](https://github.com/apache/doris/pull/44077) +- Fixed the issue that bloom filter indexes were not synchronized when columns were deleted. [#43378](https://github.com/apache/doris/pull/43378) +- Fixed instability issues with es catalog under special circumstances such as mixed array and scalar data. [#40314](https://github.com/apache/doris/pull/40314) [#40385](https://github.com/apache/doris/pull/40385) [#43399](https://github.com/apache/doris/pull/43399) [#40614](https://github.com/apache/doris/pull/40614) +- Fixed coredump issues caused by abnormal regular pattern matching. [#43394](https://github.com/apache/doris/pull/43394) + +### Permissions + +- Fixed several issues where permissions were not properly restricted after authorization. 
[#43193](https://github.com/apache/doris/pull/43193) [#41723](https://github.com/apache/doris/pull/41723) [#42107](https://github.com/apache/doris/pull/42107) [#43306](https://github.com/apache/doris/pull/43306) +- Enhanced several permission checks. [#40688](https://github.com/apache/doris/pull/40688) [#40533](https://github.com/apache/doris/pull/40533) [#41791](https://github.com/apache/doris/pull/41791) [#42106](https://github.com/apache/doris/pull/42106) + +### Other + +- Supplemented missing audit log fields in audit log tables and files. [#43303](https://github.com/apache/doris/pull/43303) + + - [View Documentation](https://doris.apache.org/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we completed the full vectorization of the computing and storage layers and officially enabled the vectorized execution engine as a stable feature. All queries are executed by the vectorized execution engine by default, and performance is 3-5 times higher than in the previous version. This release adds access to Apache Iceberg external tables and supports federated queries across data in Doris and Iceberg, expanding the analysis capabilities of Apache Doris on the data lake. In addition to the original LZ4, the ZSTD compression algorithm is now available, which further improves the data compression rate. Many performance and stability problems in previous versions have been fixed, greatly improving system stability. We recommend downloading and using this version.
+ +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users had to enable it manually when executing queries by setting the session variables `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, we officially enabled the vectorized execution engine as a stable feature. The session variable `enable_vectorized_engine` is set to true by default, and all queries are executed by the vectorized execution engine by default. + +### BE Binary File Renaming + +The BE binary file has been renamed from `palo_be` to `doris_be`. If you rely on process names for cluster management or other operations, remember to update the relevant scripts. + +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization, and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while compatibility with the Segment V1 format is maintained. + +To keep the code structure maintainable and reduce the additional learning and development costs caused by redundant historical code, we have decided to stop supporting the Segment V1 storage format from the next version. This part of the code is expected to be deleted in Apache Doris 1.2. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website.
+ +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table you can set `DISTRIBUTED BY random BUCKETS number` to use random distribution. The data is then randomly written to a single tablet during import, which reduces data fanout during loading, lowers resource overhead, and improves system stability. + +### Support for creating Iceberg external tables [experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. With Iceberg external tables, you can run federated queries across data in local storage and in Iceberg, which saves tedious data loading work, simplifies the system architecture for data analysis, and enables more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying their data, and supports automatic synchronization of all table schemas in an Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. In scenarios that are sensitive to data storage costs, the default compression ratio may not meet requirements. + +In version 1.1, users can set `"compression"="zstd"` in the table properties to specify the compression method as ZSTD when creating a table.
In a test on 25GB of text log data (110 million lines), the highest compression ratio reached nearly 10x, which is 53% higher than the original compression ratio, and reading data from disk and decompressing it was 30% faster. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +- Vectorization of all built-in functions. + +- Vectorization of the storage layer, with dictionary optimization for low-cardinality string columns. + +- Fixes for numerous performance and stability issues in the vectorized engine. + +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 SQL queries in the SSB test dataset, version 1.1 outperforms version 0.15, with overall performance improved by about 3 times. This resolves the performance degradation seen in some scenarios in version 1.0. + +On all 22 SQL queries in the TPC-H test dataset, version 1.1 outperforms version 0.15, with overall performance improved by about 4.5 times and performance in some scenarios improved by more than ten times. + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit generates a data version. In high-concurrency write scenarios, -235 errors are prone to occur because too many data versions accumulate and compaction does not keep up, and query performance decreases accordingly. + +In version 1.1, we introduced QuickCompaction, which actively triggers compaction when the number of data versions increases. In addition, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Together, active triggering and passive scanning solve the timeliness problem of data merging. + +For high-frequency cumulative compaction of small files, compaction tasks are now scheduled and isolated to prevent heavyweight base compaction from affecting the merging of new data. + +Finally, the strategy for merging small files is optimized by adopting gradient merging. The files participating in each merge belong to the same order of magnitude in size, which prevents versions with large size differences from being merged together. Files are merged gradually, level by level, reducing the number of times a single file participates in merging and greatly lowering the CPU consumption of the system. + +When the upstream maintains a write frequency of 100,000 rows per second (20 concurrent write tasks, 5000 rows per job, and a checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable.
Compared with the previous version, in which the -235 problem frequently occurred during high-concurrency writing, compaction merge efficiency has improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The strategy for small-file compaction has been optimized. In the high-concurrency write scenario above, CPU resource consumption is reduced by 25%. + +- Stable query time: The overall orderliness of data is improved, and fluctuation in query time is greatly reduced. Query time during high-concurrency writing is the same as when only querying, and query performance is improved by 3-4 times compared with the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting Arrow parameters, Arrow's multi-threaded read capability is used to speed up the reading of each row_group, and reading is changed to an SPSC model so that prefetching reduces the cost of waiting for the network. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after the metadata checkpoint and adding the ability to retain historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that data cannot be queried due to a missing data version. (Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that resource isolation is ineffective for the resource usage limit of loading tasks (Moderate) + +In 1.1, broker load and routine load use Backends with the specified resource tags to perform the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +data transmission errors could occur.
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default; please use the `/_stream_load` interface instead. +You can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format can no longer be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster. If it does, convert it to SegmentV2 with the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +In version 1.1, the maximum length of the String type is limited to 1MB. Strings longer than this can no longer be written. +In addition, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Updated the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems, feel free to contact us at any time through the GitHub discussion forum or the dev mailing list.
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature is available in the non-vectorized engine but was missing from the vectorized engine in 1.1, so it has been added back in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +In 1.1, BE had no memtracker for the vectorized engine, so memory usage was uncontrolled and could cause OOM. In 1.1.1, a simple memtracker is added to BE; it controls memory usage and cancels the query when the memory limit is exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is encoded with bitshuffle, and decompressing it during queries takes a lot of time. In 1.1.1, Doris caches the decompressed bitshuffle-encoded data in the page cache to accelerate queries, which we found can reduce latency by 30% for some queries in ssb-flat. + +## Bug Fix + +### Fix the problem that rolling upgrade from 1.0 could not be performed. (Serious) + +This issue was introduced in version 1.1 and may cause BE to core dump when BE is upgraded but FE is not. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some queries do not fall back to the non-vectorized engine, causing BE to core dump. + +Currently, the vectorized engine cannot handle all SQL queries, so some queries (such as left outer join) run on the non-vectorized engine. Some cases were not covered in 1.1, which caused BE to crash. + +### Compaction does not work correctly and causes -235 error. + +In unique key compaction, when one rowset has multiple segments, the segment rows are merged in generic_iterator but merged_rows is not increased. Compaction then fails in check_correctness, leaving the tablet with too many versions, which leads to the -235 load error. + +### Some segmentation fault cases during query.
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, the Doris team has fixed more than 170 issues and made performance improvements since 1.1.1. This is a bugfix release for 1.1, and all users are encouraged to upgrade to it. + +# Features + +### New MemTracker + +Introduced a new, more accurate MemTracker for both the vectorized and non-vectorized engines. + +### Add API for showing current queries and killing a query + +### Support reading/writing UTF-16 emoji via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC file scan performance by about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when querying Iceberg tables. + +- Support querying Hive data created by [Apache Tez](https://tez.apache.org/). + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add block reuse in the non-vectorized engine, improving performance by 50% in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve LIKE and regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +This yields performance gains of 40% in load or query. + +It is currently controlled by a configuration item; you can change it by setting `tc_enable_aggressive_memory_decommit`.
+ +# Bug Fix + +### Some issues about FE that will cause FE failure or data corruption. + +- Add reserved disk config to avoid too many reserved BDB-JE files. **(Serious)** In an HA environment, BDB-JE retains as many reserved files as possible. The BDB-JE log files are not deleted until a disk limit is approached. + +- Fix fatal bug in BDB-JE which causes an FE replica to fail to start correctly or data to be corrupted. **(Serious)** + +### FE will hang on waitFor_rpc during query and BE will hang in high-concurrency scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong result. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause BE core or leave it in an abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are worried about the impact of this change on your business after upgrade, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* `disable_storage_page_cache`: Whether to disable Storage Page Cache. In version 1.1.2 and earlier, the default is false, i.e., the cache is on. In version 1.1.3, the default is true, i.e., the cache is off. +* `chunk_reserved_bytes_limit`: Chunk allocator reserved memory size. In version 1.1.2 and earlier, the default is 10% of the overall memory. In version 1.1.3, the default is 209715200 (200MB). + diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 60 issues or performance improvement since 1.1.3. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- SparkLoad supports parquet and orc files. [#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load. [#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- The `if` function returns a wrong result when there is a nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates.
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crash when call function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name is not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column when use MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader release memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix allow create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after subsitute using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump on case when have 1000 condition. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that last line of data lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 36 issues or performance improvement since 1.1.4. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. 
+ +# Behavior Changes + +When an alias has the same name as the original column, as in "select year(birthday) as birthday", and is used in a group by, order by, or having clause, Doris's behavior used to differ from MySQL's. In this release, we make it follow MySQL's behavior: group by and having clauses use the original column first, and order by uses the alias first. This can be a little confusing, so the simple advice is to avoid using an alias that is the same as the original column name. + +# Features + +Add support of murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort result by tablename when calling show clause. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix coredump when there is an if constant expr in select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update high_priority_flush_thread_num_per_store default value to 6 to improve the load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +When the partition column is not a duplicate key, spark load throws an IndexOutOfBounds error. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when having sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Use avg rowset to calculate batch size instead of total_bytes, since the latter costs a lot of CPU. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc.
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md new file mode 100644 index 0000000000000..2529ce7e58aa2 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.0.md @@ -0,0 +1,563 @@ +--- +{ + "title": "Release 1.2.0", + "language": "en" +} +--- + + + + + +# Feature +## Highlight + +1. Full Vectorizied-Engine support, greatly improved performance + + In the standard ssb-100-flat benchmark, the performance of 1.2 is 2 times faster than that of 1.1; in complex TPCH 100 benchmark, the performance of 1.2 is 3 times faster than that of 1.1. + +2. Merge-on-Write Unique Key + + Support Merge-On-Write on Unique Key Model. This mode marks the data that needs to be deleted or updated when the data is written, thereby avoiding the overhead of Merge-On-Read when querying, and greatly improving the reading efficiency on the updateable data model. + +3. Multi Catalog + + The multi-catalog feature provides Doris with the ability to quickly access external data sources for access. Users can connect to external data sources through the `CREATE CATALOG` command. Doris will automatically map the library and table information of external data sources. After that, users can access the data in these external data sources just like accessing ordinary tables. It avoids the complicated operation that the user needs to manually establish external mapping for each table. + + Currently this feature supports the following data sources: + + 1. Hive Metastore: You can access data tables including Hive, Iceberg, and Hudi. It can also be connected to data sources compatible with Hive Metastore, such as Alibaba Cloud's DataLake Formation. Supports data access on both HDFS and object storage. + 2. 
Elasticsearch: Access ES data sources. + 3. JDBC: Access MySQL through the JDBC protocol. + + Documentation: https://doris.apache.org//docs/dev/lakehouse/multi-catalog + + > Note: The corresponding permission level will also be changed automatically, see the "Upgrade Notice" section for details. + +4. Light table structure changes + +In the new version, adding or dropping columns on a table no longer requires changing the data files synchronously; only the metadata in FE needs to be updated, which makes Schema Change a millisecond-level operation. This function also enables DDL synchronization of upstream CDC data. For example, users can use Flink CDC to synchronize both DML and DDL from an upstream database to Doris. + +Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +When creating a table, set `"light_schema_change"="true"` in properties. + +5. JDBC external table + + Users can connect to external data sources through JDBC. Currently supported: + + - MySQL + - PostgreSQL + - Oracle + - SQL Server + - ClickHouse + + Documentation: [https://doris.apache.org/en/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/) + + > Note: The ODBC feature will be removed in a later version, please try to switch to JDBC. + +6. JAVA UDF + + Supports writing UDF/UDAF in Java, which is convenient for users to use custom functions in the Java ecosystem. At the same time, through technologies such as off-heap memory and Zero Copy, the efficiency of cross-language data access has been greatly improved. + + Documentation: https://doris.apache.org//docs/dev/ecosystem/udf/java-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +7.
Remote UDF + + Supports accessing remote user-defined function services through RPC, thus completely eliminating language restrictions for users to write UDFs. Users can use any programming language to implement custom functions to complete complex data analysis work. + + Documentation: https://doris.apache.org//docs/ecosystem/udf/remote-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +8. More data types support + + - Array type + + Array types are supported, including nested array types. In scenarios such as user portraits and tags, the Array type can be used to better fit business needs. In the new version, we have also implemented a large number of array-related functions to better support the use of this data type in practice. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/ARRAY + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/array-functions/array_max + + - Jsonb type + + Support binary Json data type: Jsonb. This type provides a more compact json encoding format and allows data to be accessed directly in the encoded format. Compared with json data stored in strings, it can improve performance several times over. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/JSONB + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/json-functions/jsonb_parse + + - Date V2 + + Scope of impact: + + 1. The user needs to specify datev2 and datetimev2 when creating the table, and the date and datetime of the original table will not be affected. + 2. When datev2 and datetimev2 are calculated with the original date and datetime (for example, in an equi-join), the original type will be cast to the new type for calculation. + 3.
The example is in the documentation + + Documentation: https://doris.apache.org/docs/1.2/sql-manual/sql-reference/Data-Types/DATEV2 + + +## More + +1. A new memory management framework + + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/memory-management/memory-tracker + +2. Table Valued Function + + Doris implements a set of Table Valued Function (TVF). TVF can be regarded as an ordinary table, which can appear in all places where "table" can appear in SQL. + + For example, we can use S3 TVF to implement data import on object storage: + + ``` + insert into tbl select * from s3("s3://bucket/file.*", "ak" = "xx", "sk" = "xxx") where c1 > 2; + ``` + + Or directly query data files on HDFS: + + ``` + insert into tbl select * from hdfs("hdfs://bucket/file.*") where c1 > 2; + ``` + + TVF can help users make full use of the rich expressiveness of SQL and flexibly process various data. + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/s3 + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/hdfs + +3. A more convenient way to create partitions + + Support for creating multiple partitions within a time range via the `FROM TO` command. + +4. Column renaming + + For tables with Light Schema Change enabled, column renaming is supported. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-RENAME + +5. Richer permission management + + - Support row-level permissions + + Row-level permissions can be created with the `CREATE ROW POLICY` command. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY + + - Support specifying password strength, expiration time, etc. + + - Support for locking accounts after multiple failed logins. 
+ + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Account-Management-Statements/ALTER-USER + +6. Import + + - CSV import supports csv files with header. + + Search for `csv_with_names` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD/ + + - Stream Load adds `hidden_columns`, which can explicitly specify the delete flag column and sequence column. + + Search for `hidden_columns` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD + + - Spark Load supports Parquet and ORC file import. + + - Support for cleaning completed imported Labels + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CLEAN-LABEL + + - Support batch cancellation of import jobs by status + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD + + - Added support for Alibaba Cloud oss, Tencent Cloud cos/chdfs and Huawei Cloud obs in broker load. + + Documentation: https://doris.apache.org//docs/dev/advanced/broker + + - Support access to hdfs through hive-site.xml file configuration. + + Documentation: https://doris.apache.org//docs/dev/admin-manual/config/config-dir + +7. Support viewing the contents of the catalog recycle bin through `SHOW CATALOG RECYCLE BIN` function. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Show-Statements/SHOW-CATALOG-RECYCLE-BIN + +8. Support `SELECT * EXCEPT` syntax. + + Documentation: https://doris.apache.org//docs/dev/data-table/basic-usage + +9. OUTFILE supports ORC format export. And supports multi-byte delimiters. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE + +10. Support to modify the number of Query Profiles that can be saved through configuration. 
+ + Document search FE configuration item: max_query_profile_num + +11. The DELETE statement supports IN predicate conditions. And it supports partition pruning. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Manipulation/DELETE + +12. The default value of the time column supports using `CURRENT_TIMESTAMP` + + Search for "CURRENT_TIMESTAMP" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +13. Add two system tables: backends, rowsets + + Documentation: + + https://doris.apache.org//docs/dev/admin-manual/system-table/backends + + https://doris.apache.org//docs/dev/admin-manual/system-table/rowsets + +14. Backup and restore + + - The Restore job supports the `reserve_replica` parameter, so that the number of replicas of the restored table is the same as that of the backup. + + - The Restore job supports `reserve_dynamic_partition_enable` parameter, so that the restored table keeps the dynamic partition enabled. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/RESTORE + + - Support backup and restore operations through the built-in libhdfs, no longer rely on broker. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY + +15. Support data balance between multiple disks on the same machine + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK + +16. Routine Load supports subscribing to Kerberos-authenticated Kafka services. 
+ + Search for kerberos in the documentation: https://doris.apache.org//docs/dev/data-operate/import/import-way/routine-load-manual + +17. New built-in-function + + Added the following built-in functions: + + - `cbrt` + - `sequence_match/sequence_count` + - `mask/mask_first_n/mask_last_n` + - `elt` + - `any/any_value` + - `group_bitmap_xor` + - `ntile` + - `nvl` + - `uuid` + - `initcap` + - `regexp_replace_one/regexp_extract_all` + - `multi_search_all_positions/multi_match_any` + - `domain/domain_without_www/protocol` + - `running_difference` + - `bitmap_hash64` + - `murmur_hash3_64` + - `to_monday` + - `not_null_or_empty` + - `window_funnel` + - `group_bit_and/group_bit_or/group_bit_xor` + - `outer combine` + - and all array functions + +# Upgrade Notice + +## Known Issues + +- Use JDK11 will cause BE crash, please use JDK8 instead. + +## Behavior Changed + +- Permission level changes + + Because the catalog level is introduced, the corresponding user permission level will also be changed automatically. The rules are as follows: + + - GlobalPrivs and ResourcePrivs remain unchanged + - Added CatalogPrivs level. + - The original DatabasePrivs level is added with the internal prefix (indicating the db in the internal catalog) + - Add the internal prefix to the original TablePrivs level (representing tbl in the internal catalog) + +- In GroupBy and Having clauses, match on column names in preference to aliases. (#14408) + +- Creating columns starting with `mv_` is no longer supported. `mv_` is a reserved keyword in materialized views (#14361) + +- Removed the default limit of 65535 rows added by the order by statement, and added the session variable `default_order_by_limit` to configure this limit. (#12478) + +- In the table generated by "Create Table As Select", all string columns use the string type uniformly, and no longer distinguish varchar/char/string (#14382) + +- In the audit log, remove the word `default_cluster` before the db and user names. 
(#13499) (#11408) + +- Add sql digest field in audit log (#8919) + +- The union clause always changes the order by logic. In the new version, the order by clause will be executed after the union is executed, unless explicitly associated by parentheses. (#9745) + +- During the decommission operation, the tablet in the recycle bin will be ignored to ensure that the decomission can be completed. (#14028) + +- The returned result of Decimal will be displayed according to the precision declared in the original column, or according to the precision specified in the cast function. (#13437) + +- Changed column name length limit from 64 to 256 (#14671) + +- Changes to FE configuration items + + - The `enable_vectorized_load` parameter is enabled by default. (#11833) + + - Increased `create_table_timeout` value. The default timeout for table creation operations will be increased. (#13520) + + - Modify `stream_load_default_timeout_second` default value to 3 days. + + - Modify the default value of `alter_table_timeout_second` to one month. + + - Increase the parameter `max_replica_count_when_schema_change` to limit the number of replicas involved in the alter job, the default is 100000. (#12850) + + - Add `disable_iceberg_hudi_table`. The iceberg and hudi appearances are disabled by default, and the multi catalog function is recommended. (#13932) + +- Changes to BE configuration items + + - Removed `disable_stream_load_2pc` parameter. 2PC's stream load can be used directly. (#13520) + + - Modify `tablet_rowset_stale_sweep_time_sec` from 1800 seconds to 300 seconds. + + - Redesigned configuration item name about compaction (#13495) + + - Revisited parameter about memory optimization (#13781) + +- Session variable changes + + - Modify the variable `enable_insert_strict` to true by default. This will cause some insert operations that could be executed before, but inserted illegal values, to no longer be executed. 
(#11866) + + - Modified variable `enable_local_exchange` to default to true (#13292) + + - Default data transmission via lz4 compression, controlled by variable `fragment_transmission_compression_codec` (#11955) + + - Add `skip_storage_engine_merge` variable for debugging unique or agg model data (#11952) + + Documentation: https://doris.apache.org//docs/dev/advanced/variables + +- The BE startup script checks whether the value in `/proc/sys/vm/max_map_count` is greater than 2,000,000 (200W). Otherwise, the startup fails. (#11052) + +- Removed mini load interface (#10520) + +- FE Metadata Version + + FE Meta Version changed from 107 to 114, and cannot be rolled back after upgrading. + +## During Upgrade + +1. Upgrade preparation + + - Need to replace: lib, bin directory (start/stop scripts have been modified) + + - JAVA_HOME must also be configured for BE, which now supports JDBC Table and Java UDF. + + - The default JVM Xmx parameter in fe.conf is changed to 8GB. + +2. Possible errors during the upgrade process + + - The repeat function cannot be used and an error is reported: `vectorized repeat function cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13868) + + - Schema change fails with error: `desc_tbl is not set. Maybe the FE version is not equal to the BE` (#13822) + + - Vectorized hash join cannot be used and an error is reported: `vectorized hash join cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13753) + + The above errors will return to normal after a full upgrade. + +## Performance Impact + +- By default, JeMalloc is used as the memory allocator of the new version BE, replacing TcMalloc (#13367) + +- The batch size in the tablet sink is modified to be at least 8K.
(#13912) + +- Disable chunk allocator by default (#13285) + +## Api change + +- BE's http api error return information changed from `{"status": "Fail", "msg": "xxx"}` to more specific ``{"status": "Not found", "msg": "Tablet not found. tablet_id=1202"}``(#9771) + +- In `SHOW CREATE TABLE`, the content of comment is changed from double quotes to single quotes (#10327) + +- Support ordinary users to obtain query profile through http command. (#14016) +Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/manager/query-profile-action + +- Optimized the way to specify the sequence column, you can directly specify the column name. (#13872) +Documentation: https://doris.apache.org//docs/dev/data-operate/update-delete/sequence-column-manual + +- Increase the space usage of remote storage in the results returned by `show backends` and `show tablets` (#11450) + +- Removed Num-Based Compaction related code (#13409) + +- Refactored BE's error code mechanism, some returned error messages will change (#8855) +other + +- Support Docker official image. + +- Support compiling Doris on MacOS(x86/M1) and ubuntu-22.04 + Documentation: https://doris.apache.org//docs/dev/install/source-install/compilation-mac/ + +- Support for image file verification. 
+ + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/metadata-operation/ + +- Script related + + - The stop scripts of FE and BE support exiting FE and BE via the `--grace` parameter (use kill -15 signal instead of kill -9) + + - FE start script supports checking the current FE version via `--version` (#11563) + + - Support getting the data and related table creation statement of a tablet through the `ADMIN COPY TABLET` command, for local problem debugging (#12176) + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET + +- Support obtaining the table creation statements related to a SQL statement through the http api, for local problem reproduction (#11979) + + Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/query-schema-action + +- Support disabling compaction for a table when creating it, for testing (#11743) + + Search for "disable_auto_compaction" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +# Big Thanks + +Thanks to ALL who contributed to this release!
(alphabetically) +``` +@924060929 +@a19920714liou +@adonis0147 +@Aiden-Dong +@aiwenmo +@AshinGau +@b19mud +@BePPPower +@BiteTheDDDDt +@bridgeDream +@ByteYue +@caiconghui +@CalvinKirs +@cambyzju +@caoliang-web +@carlvinhust2012 +@catpineapple +@ccoffline +@chenlinzhong +@chovy-3012 +@coderjiang +@cxzl25 +@dataalive +@dataroaring +@dependabot[bot] +@dinggege1024 +@DongLiang-0 +@Doris-Extras +@eldenmoon +@EmmyMiao87 +@englefly +@FreeOnePlus +@Gabriel39 +@gaodayue +@geniusjoe +@gj-zhang +@gnehil +@GoGoWen +@HappenLee +@hello-stephen +@Henry2SS +@hf200012 +@huyuanfeng2018 +@jacktengg +@jackwener +@jeffreys-cat +@Jibing-Li +@JNSimba +@Kikyou1997 +@Lchangliang +@LemonLiTree +@lexoning +@liaoxin01 +@lide-reed +@link3280 +@liutang123 +@liuyaolin +@LOVEGISER +@lsy3993 +@luozenglin +@luzhijing +@madongz +@morningman +@morningman-cmy +@morrySnow +@mrhhsg +@Myasuka +@myfjdthink +@nextdreamblue +@pan3793 +@pangzhili +@pengxiangyu +@platoneko +@qidaye +@qzsee +@SaintBacchus +@SeekingYang +@smallhibiscus +@sohardforaname +@song7788q +@spaces-X +@ssusieee +@stalary +@starocean999 +@SWJTU-ZhangLei +@TaoZex +@timelxy +@Wahno +@wangbo +@wangshuo128 +@wangyf0555 +@weizhengte +@weizuo93 +@wsjz +@wunan1210 +@xhmz +@xiaokang +@xiaokangguo +@xinyiZzz +@xy720 +@yangzhg +@Yankee24 +@yeyudefeng +@yiguolei +@yinzhijian +@yixiutt +@yuanyuan8983 +@zbtzbtzbt +@zenoyang +@zhangboya1 +@zhangstar333 +@zhannngchen +@ZHbamboo +@zhengshiJ +@zhenhb +@zhqu1148980644 +@zuochunwei +@zy-kkk +``` diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md new file mode 100644 index 0000000000000..d5adb31eb5256 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.1.md @@ -0,0 +1,196 @@ +--- +{ + "title": "Release 1.2.1", + "language": "en" +} +--- + + + +# Improvement + +### Supports new type DecimalV3 + +DecimalV3, which supports higher precision and better performance, has the following advantages over 
past versions. + +- Larger representable range: the range of values is significantly expanded, and the valid precision range is [1,38]. + +- Higher performance: the storage space occupied is adjusted adaptively according to the precision. + +- More complete precision derivation support: different precision derivation rules are applied to the result of different expressions. + +[DecimalV3](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Types/DECIMALV3/) + +### Support Iceberg V2 + +Support Iceberg V2 (only Position Delete is supported; Equality Delete will be supported in subsequent versions). + +Tables in Iceberg V2 format can be accessed through the Multi-Catalog feature. + +### Support converting OR conditions to IN + +Support converting OR conditions to IN conditions, which can improve the execution efficiency in some scenarios. [#15437](https://github.com/apache/doris/pull/15437) [#12872](https://github.com/apache/doris/pull/12872) + +### Optimize the import and query performance of JSONB type + +Optimize the import and query performance of JSONB type. [#15219](https://github.com/apache/doris/pull/15219) + +### Stream load supports quoted csv data + +Search for `trim_double_quotes` in the documentation: [https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD) + +### Broker supports Tencent Cloud CHDFS and Baidu Cloud BOS, AFS + +Data on CHDFS, BOS, and AFS can be accessed through Broker. [#15297](https://github.com/apache/doris/pull/15297) [#15448](https://github.com/apache/doris/pull/15448) + +### New function + +Add function `substring_index`. [#15373](https://github.com/apache/doris/pull/15373) + +# Bug Fix + +- Fix the problem that, in some cases, the user permission information is lost after upgrading from version 1.1 to version 1.2. 
[#15144](https://github.com/apache/doris/pull/15144) + +- Fix the problem that the partition value is wrong when using datev2/datetimev2 type for partitioning. [#15094](https://github.com/apache/doris/pull/15094) + +- Bug fixes for a large number of released features. For a complete list see: [PR List](https://github.com/apache/doris/pulls?q=is%3Apr+label%3Adev%2F1.2.1-merged+is%3Aclosed) + +# Upgrade Notice + +### Known Issues + +- Do not use JDK11 as the runtime JDK of BE; it will cause BE to crash. +- The reading performance of the csv format in this version has declined, which will affect the import and reading efficiency of the csv format. We will fix it as soon as possible in the next three-digit version. + +### Behavior Changed + +- The default value of the BE configuration item `high_priority_flush_thread_num_per_store` is changed from 1 to 6, to improve the write efficiency of Routine Load. [#14775](https://github.com/apache/doris/pull/14775) + +- The default value of the FE configuration item `enable_new_load_scan_node` is changed to true. Import tasks will be performed using the new File Scan Node. No impact on users. [#14808](https://github.com/apache/doris/pull/14808) + +- Delete the FE configuration item `enable_multi_catalog`. The Multi-Catalog function is enabled by default. + +- The vectorized execution engine is forced to be enabled by default. [#15213](https://github.com/apache/doris/pull/15213) + +The session variable `enable_vectorized_engine` will no longer take effect; the vectorized engine is always enabled. + +To make the variable take effect again, set the FE configuration item `disable_enable_vectorized_engine` to false and restart FE. + + +# Big Thanks + +Thanks to ALL who contributed to this release! 
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@dataroaring + +@Doris-Extras + +@dutyu + +@eldenmoon + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@hf200012 + +@jacktengg + +@Jibing-Li + +@Kikyou1997 + +@liaoxin01 + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@spaces-X + +@starocean999 + +@wangshuo128 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xutaoustc + +@yangzhg + +@yiguolei + +@yixiutt + +@Yulei-Yang + +@yuxuan-luo + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zhengshengjun + + + + + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md new file mode 100644 index 0000000000000..08fd22571a03f --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.2.md @@ -0,0 +1,254 @@ +--- +{ + "title": "Release 1.2.2", + "language": "en" +} +--- + + + +# New Features + +### Lakehouse + +- Support automatic synchronization of Hive metastore. + +- Support reading the Iceberg Snapshot, and viewing the Snapshot history. + +- JDBC Catalog supports PostgreSQL, Clickhouse, Oracle, SQLServer + +- JDBC Catalog supports Insert operation + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/) + +### Auto Bucket + + Set and scale the number of buckets for different partitions to keep the number of tablets in a relatively appropriate range. + +### New Functions + +Add the new function `width_bucket`. 
+ +Reference: [https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description) + +# Behavior Changes + +- Disable BE's page cache by default: `disable_storage_page_cache=true` + +The page cache is turned off to optimize memory usage and reduce the risk of OOM. +However, this may increase the query latency of some small queries. +If you are sensitive to query latency, or have high-concurrency, small-query scenarios, you can configure `disable_storage_page_cache=false` to enable the page cache again. + +- Add a new session variable `group_by_and_having_use_alias_first`, used to control whether the GROUP BY and HAVING clauses use aliases first. + +Reference: [https://doris.apache.org/docs/dev/advanced/variables](https://doris.apache.org/docs/dev/advanced/variables) + +# Improvement + +### Compaction + +- Support `Vertical Compaction`, which optimizes the compaction overhead and efficiency of wide tables. + +- Support `Segment Compaction`. Fix -238 and -235 issues with high-frequency imports. + +### Lakehouse + +- Hive Catalog is compatible with Hive versions 1/2/3. + +- Hive Catalog can access JuiceFS based HDFS with Broker. + +- Iceberg Catalog supports the Hive Metastore and Rest Catalog types. + +- ES Catalog supports `_id` column mapping. + +- Optimize Iceberg V2 read performance with a large number of deleted rows. + +- Support reading Iceberg tables after Schema Evolution. + +- Parquet Reader handles column name case correctly. + +### Other + +- Support accessing Hadoop KMS-encrypted HDFS. + +- Support canceling an Export task in progress. + +- Improve the performance of `explode_split` by 1x. + +- Improve the read performance of nullable columns by 3x. + +- Fix some problems of Memtracker, improve memory management accuracy, and optimize memory allocation. + + + +# Bug Fix + +- Fixed memory leak when loading data with Doris Flink Connector. 
+ +- Fixed the possible thread scheduling problem of BE and reduced the `Fragment sent timeout` error caused by BE thread exhaustion. + +- Fixed various correctness and precision issues of column type datetimev2/decimalv3. + +- Fixed the data correctness issue with Unique Key Merge-on-Read tables. + +- Fixed various known issues with the Light Schema Change feature. + +- Fixed various data correctness issues of bitmap type Runtime Filter. + +- Fixed the problem of poor reading performance of csv reader introduced in version 1.2.1. + +- Fixed the problem of BE OOM caused by Spark Load data download phase. + +- Fixed possible metadata compatibility issues when upgrading from version 1.1 to version 1.2. + +- Fixed the metadata problem when creating JDBC Catalog with Resource. + +- Fixed the problem of high CPU usage caused by load operation. + +- Fixed the problem of FE OOM caused by a large number of failed Broker Load jobs. + +- Fixed the problem of precision loss when loading floating-point types. + +- Fixed the problem of memory leak when using 2PC stream load. + +# Other + +Add metrics to view the total number of rowsets and segments on BE: + +- `doris_be_all_rowsets_num` and `doris_be_all_segments_num` + + +# Big Thanks + +Thanks to ALL who contributed to this release! 
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@DarvenDuan + +@dataroaring + +@Doris-Extras + +@dutyu + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@htyoung + +@isHuangXin + +@JackDrogon + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@Lchangliang + +@LemonLiTree + +@liaoxin01 + +@liqing-coder + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@qzsee + +@spaces-X + +@stalary + + +@starocean999 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xy720 + +@yangzhg + +@yiguolei + +@yixiutt + +@Yukang-Lian + +@Yulei-Yang + +@zclllyybb + +@zddr + +@zhangstar333 + +@zhannngchen + +@zy-kkk + + + + + + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md new file mode 100644 index 0000000000000..cd9226b15e14f --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.3.md @@ -0,0 +1,109 @@ +--- +{ + "title": "Release 1.2.3", + "language": "en" +} +--- + + + +# Improvement + +### JDBC Catalog + +- Support connecting to Doris clusters through JDBC Catalog. + +Currently, JDBC Catalog only supports using the 5.x version of the JDBC jar package to connect to another Doris database. If you use the 8.x version of the JDBC jar package, the column data types may not match. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris) + +- Support synchronizing only the specified database through the `only_specified_database` attribute. + +- Support synchronizing table names in lowercase through `lower_case_table_names` to solve the problem of case sensitivity of table names. 
+ +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc) + +- Optimize the read performance of JDBC Catalog. + +### Elasticsearch Catalog + +- Support Array type mapping. + +- Support controlling whether the like expression is pushed down through the `like_push_down` attribute, to limit the CPU overhead of the ES cluster. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es) + +### Hive Catalog + +- Support Hive table default partition `_HIVE_DEFAULT_PARTITION_`. + +- Hive Metastore metadata automatic synchronization supports notification events in compressed format. + +### Dynamic Partition Improvement + +- Dynamic partition supports specifying the `storage_medium` parameter to control the storage medium of newly added partitions. + +Reference: [https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition](https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition) + + +### Optimize BE's Threading Model + +- Optimize BE's threading model to avoid stability problems caused by frequent thread creation and destruction. + +# Bugfix + +- Fixed issues with Merge-On-Write Unique Key tables. + +- Fixed compaction related issues. + +- Fixed some delete statement issues causing data errors. + +- Fixed several query execution errors. + +- Fixed the problem that using JDBC catalog caused BE to crash on some operating systems. + +- Fixed Multi-Catalog issues. + +- Fixed memory statistics and optimization issues. + +- Fixed decimalV3 and date/datetimev2 related issues. + +- Fixed load transaction stability issues. + +- Fixed light-weight schema change issues. + +- Fixed the issue of using `datetime` type for batch partition creation. + +- Fixed the problem that a large number of failed broker loads would cause the FE memory usage to be too high. 
+ +- Fixed the problem that stream load cannot be canceled after dropping the table. + +- Fixed timeouts when querying `information_schema` in some cases. + +- Fixed the problem of BE crash caused by concurrent data export using `select outfile`. + +- Fixed a memory leak in transactional insert operations. + +- Fixed several query/load profile issues, and added support for downloading profiles directly through the FE web UI. + +- Fixed the problem that the BE tablet GC thread caused the IO util to be too high. + +- Fixed the problem that the commit offset is inaccurate in Kafka routine load. + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md new file mode 100644 index 0000000000000..a959a323d06d1 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.4.md @@ -0,0 +1,81 @@ +--- +{ + "title": "Release 1.2.4", + "language": "en" +} +--- + + + + +# Behavior Changed + +- In the results of `DESCRIBE` and `SHOW CREATE TABLE` statements, the `DateV2`/`DatetimeV2` and `DecimalV3` types are no longer displayed as `DateV2`/`DatetimeV2` or `DecimalV3`, but directly as `Date`/`Datetime` or `Decimal`. + + - This change is for compatibility with some BI tools. If you want to see the actual type of the column, you can check it with the `DESCRIBE ALL` statement. + +- When querying tables in the `information_schema` database, the meta information (database, table, column, etc.) in the external catalog is no longer returned by default. + + - This change avoids the problem that the `information_schema` database cannot be queried due to connection problems with some external catalogs, which affected the use of some BI tools with Doris. It is controlled by the FE configuration `infodb_support_ext_catalog`, whose default value is `false`, meaning that the meta information of external catalogs is not returned. 
+ +# Improvement + +### JDBC Catalog + +- Supports connecting to Trino/Presto via JDBC Catalog + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino) + +- JDBC Catalog connects to Clickhouse data source and supports Array type mapping + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse) + +### Spark Load + +- Spark Load supports Resource Manager HA related configuration + + Refer to: https://github.com/apache/doris/pull/15000 + +## Bug Fixes + +- Fixed several connectivity issues with Hive Catalog. + +- Fixed ClassNotFound issues with Hudi Catalog. + +- Optimized the connection pool of JDBC Catalog to avoid too many connections. + +- Fixed the problem that OOM occurs when importing data from another Doris cluster through JDBC Catalog. + +- Fixed several query and import planning issues. + +- Fixed several issues with Unique Key Merge-On-Write data model. + +- Fixed several BDBJE issues and solved the problem of abnormal FE metadata in some cases. + +- Fixed the problem that the `CREATE VIEW` statement does not support Table Valued Function. + +- Fixed several memory statistics issues. + +- Fixed several issues reading Parquet/ORC format. + +- Fixed several issues with DecimalV3. + +- Fixed several issues with SHOW QUERY/LOAD PROFILE. + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md new file mode 100644 index 0000000000000..55af863ba47d6 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.5.md @@ -0,0 +1,199 @@ +--- +{ + "title": "Release 1.2.5", + "language": "en" +} +--- + + + +In version 1.2.5, the Doris team has made nearly 210 fixes and performance improvements since the release of version 1.2.4. 
At the same time, version 1.2.5 is an iteration on version 1.2.4 with higher stability. It is recommended that all users upgrade to this version. + +# Behavior Changed + +- The `start_be.sh` script now checks that the maximum number of file handles in the system is greater than or equal to 65536; otherwise the startup will fail. + +- The BE configuration item `enable_quick_compaction` is set to true by default, so Quick Compaction is enabled by default. This feature is used to mitigate the problem of small files in the case of large batch imports. + +- After the dynamic partition attributes of a table are modified, the change no longer takes effect immediately, but waits for the next task scheduling of the dynamic partition table, to avoid some deadlock problems. + +# Improvement + +- Optimize the use of bthread and pthread to reduce the RPC blocking problem during the query process. + +- A button to download Profile is added to the Profile page of the FE web UI. + +- Added FE configuration `recover_with_skip_missing_version`, which allows queries to skip the problematic replica under certain failure conditions. + +- The row-level permission function supports external Catalog. + +- Hive Catalog supports automatic refreshing of Kerberos tickets on the BE side without manual refreshing. + +- JDBC Catalog supports tables under the MySQL/ClickHouse system database (`information_schema`). + +# Bug Fixes + +- Fixed the problem of incorrect query results caused by low-cardinality column optimization. + +- Fixed several authentication and compatibility issues accessing HDFS. + +- Fixed several issues with float/double and decimal types. + +- Fixed several issues with date/datetimev2 types. + +- Fixed several query execution and planning issues. + +- Fixed several issues with JDBC Catalog. + +- Fixed several query-related issues with Hive Catalog, and Hive Metastore metadata synchronization issues. 
+ +- Fixed the problem that the result of the `SHOW LOAD PROFILE` statement is incorrect. + +- Fixed several memory related issues. + +- Fixed several issues with `CREATE TABLE AS SELECT` functionality. + +- Fixed the problem that the jsonb type causes BE to crash on CPUs that do not support avx2. + +- Fixed several issues with dynamic partitions. + +- Fixed several issues with TOPN query optimization. + +- Fixed several issues with the Unique Key Merge-on-Write table model. + +# Big Thanks + +58 contributors participated in the improvement and release of 1.2.5. We thank them for their hard work and dedication: + +@adonis0147 + +@airborne12 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@caiconghui + +@CalvinKirs + +@cambyzju + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dujl + +@dutyu + +@fsilent + +@Gabriel39 + +@gitccl + +@gnehil + +@GoGoWen + +@gongzexin + +@HappenLee + +@herry2038 + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@LemonLiTree + +@liaoxin01 + +@LiBinfeng-01 + +@luwei16 + +@Moonm3n + +@morningman + +@mrhhsg + +@Mryange + +@nextdreamblue + +@nsnhuang + +@qidaye + +@Shoothzj + +@sohardforaname + +@stalary + +@starocean999 + +@SWJTU-ZhangLei + +@wsjz + +@xiaokang + +@xinyiZzz + +@yangzhg + +@yiguolei + +@yixiutt + +@yujun777 + +@Yulei-Yang + +@yuxuan-luo + +@zclllyybb + +@zddr + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zxealous + +@zy-kkk + +@zzzzzzzs diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md new file mode 100644 index 0000000000000..39146b35b15ac --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.6.md @@ -0,0 +1,135 @@ +--- +{ + "title": "Release 1.2.6", + "language": "en" +} +--- + + + + +# Behavior Change + +- Add a BE configuration item `allow_invalid_decimalv2_literal` to control whether data exceeding the decimal's precision can be imported, for compatibility with previous logic. 
+ +# Query + +- Fix several query planning issues. +- Support `sql_select_limit` session variable. +- Optimize query cold run performance. +- Fix expr context memory leak. +- Fix the issue that the `explode_split` function was executed incorrectly in some cases. + +## Multi Catalog + +- Fix the issue that synchronizing hive metadata caused FE replay edit log to fail. +- Fix `refresh catalog` operation causing FE OOM. +- Fix the issue that jdbc catalog cannot handle `0000-00-00` correctly. +- Fix the issue that the kerberos ticket cannot be refreshed automatically. +- Optimize the partition pruning performance of hive. +- Fix the inconsistent behavior of trino and presto in jdbc catalog. +- Fix the issue that hdfs short-circuit read could not be used to improve query efficiency in some environments. +- Fix the issue that the iceberg table on CHDFS could not be read. + +# Storage + +- Fix the wrong calculation of delete bitmap in MOW table. +- Fix several BE memory issues. +- Fix snappy compression issue. +- Fix the issue that jemalloc may cause BE to crash in some cases. + +# Others + +- Fix several java udf related issues. +- Fix the issue that the `recover table` operation incorrectly triggered the creation of dynamic partitions. +- Fix timezone handling when importing orc files via broker load. +- Fix the issue that the newly added `PERCENT` keyword caused the metadata replay of the routine load job to fail. +- Fix the issue that the `truncate` operation failed to act on a non-partitioned table. +- Fix the issue that the mysql connection was lost due to the `show snapshot` operation. +- Optimize the lock logic to reduce the probability of lock timeout errors when creating tables. +- Add session variable `have_query_cache` to be compatible with some old mysql clients. +- Improve the error message shown when a load error occurs. 
+ +# Big Thanks + +Thanks to all who contributed to this release: + +@amorynan + +@BiteTheDDDDt + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dutyu + +@Gabriel39 + +@HHoflittlefish777 + +@htyoung + +@jacktengg + +@jeffreys-cat + +@kaijchen + +@kaka11chen + +@Kikyou1997 + +@KnightLiJunLong + +@liaoxin01 + +@LiBinfeng-01 + +@morningman + +@mrhhsg + +@sohardforaname + +@starocean999 + +@vinlee19 + +@wangbo + +@wsjz + +@xiaokang + +@xinyiZzz + +@yiguolei + +@yujun777 + +@Yulei-Yang + +@zhangstar333 + +@zy-kkk + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md new file mode 100644 index 0000000000000..cd47282f4688d --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.7.md @@ -0,0 +1,46 @@ +--- +{ + "title": "Release 1.2.7", + "language": "en" +} +--- + + + +# Bug Fixes + +- Fix some query issues. +- Fix some storage issues. +- Fix some decimal precision issues. +- Fix the query error caused by an invalid value of the `sql_select_limit` session variable. +- Fix the problem that hdfs short-circuit read cannot be used. +- Fix the problem that Tencent Cloud cosn cannot be accessed. +- Fix several issues with hive catalog kerberos access. +- Fix the problem that stream load profile cannot be used. +- Fix the Prometheus monitoring parameter format problem. +- Fix the table creation timeout issue when creating a large number of tablets. + +# New Features + +- Unique Key model supports array type as value column +- Added `have_query_cache` variable for compatibility with MySQL ecosystem. 
+- Added `enable_strong_consistency_read` to support strong consistent read between sessions +- FE metrics supports user-level query counter + diff --git a/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md new file mode 100644 index 0000000000000..35cbb7a3cdcf1 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v1.2/release-1.2.8.md @@ -0,0 +1,47 @@ +--- +{ + "title": "Release 1.2.8", + "language": "en" +} +--- + + + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Bug Fixes +- Fixed several issues with query execution. +- Fixed several issues with Spark Load. +- Fixed several issues with Parquet Reader. +- Fixed several issues with Orc Reader. +- Fixed Broker "FileSystem closed" problem. +- Fixed several issues with Broker Load. +- Fixed several issues with CTAS execution. +- Fixed several issues with backup and restore. +- Added "Catalog" column in audit log. +- Optimized the metadata cache of Iceberg Catalog. +- Fixed several issues with outfile/export feature. +- Fixed an issue with "replayEraseTable" edit log causing FE start to fail. +- Fixed some security issues. + + diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md new file mode 100644 index 0000000000000..b0b88f715ee51 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.0.md @@ -0,0 +1,159 @@ +--- +{ + "title": "Release 2.1.0", + "language": "en" +} +--- + + + +Dear community, we are pleased to share with you the official release of Apache Doris 2.1.0, now available for download and use as of March 8th. 
This latest version marks a significant milestone in our journey towards enhancing data analysis capabilities, particularly for handling massive and complex datasets. + +With Doris 2.1.0, our primary focus has been on optimizing analysis performance, and the results speak for themselves. We have achieved an impressive performance improvement of over 100% on the TPC-DS 1TB test dataset, making Apache Doris more capable of challenging real-world business scenarios. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Performance improvement + +### Smarter optimizer + +On the basis of V2.0, the query optimizer in Doris V2.1 comes with enhanced statistics-based inference and enumeration framework. We have upgraded the cost model and expanded the optimization rules to serve the needs of more use cases + +### Better heuristic optimization + +For data analytics at scale or data lake scenarios, Doris V2.1 provides better heuristic query plans. Meanwhile, the RuntimeFilter is more self-adaptive to enable higher performance even without statistical information. + +### Parallel adaptive scan + +Doris V2.1 has adopted parallel adaptive scan to optimize scan I/O and thus improve query performance. It can avoid the negative impact of unreasonable numbers of buckets. (This feature is currently available on the Duplicate Key model and Merge-on-Write Unique Key model.) + +### Local shuffle + +We have introduced Local Shuffle to prevent uneven data distribution. Benchmark tests show that Local Shuffle in combination with Parallel Adaptive Scan can guarantee fast query performance in spite of unreasonable bucket number settings upon table creation. 
+ +### Faster INSERT INTO SELECT + +To further improve the performance of INSERT INTO SELECT, which is a frequent operation in ETL, we have moved the MemTable forward in the execution flow to reduce data ingestion overheads. Tests show that this can double the data ingestion speed in most cases compared to V2.0. +Improved data lake analytics capabilities + +## Data lake analytic performance + +### TPC-DS Benchmark + +According to TPC-DS benchmark tests (1TB) of Doris V2.1 against Trino, + +- Without caching, the total execution time of Doris is 56% of that of Trino V435. (717s VS 1296s) +- Enabling file cache can further increase the overall performance of Doris by 2.2 times. (323s) + This is achieved by a series of optimizations in I/O, parquet/ORC file reading, predicate pushdown, caching, and scan task scheduling, etc. + +### SQL dialects compatibility + +To facilitate migration to Doris and increase its compatibility with other DBMS, we have enabled SQL dialect conversion in V2.1. ([read more](../../lakehouse/sql-dialect)) For example, by setting `sql_dialect = "trino"` in Doris, you can use the Trino SQL dialect as you're used to, without modifying your current business logic, and Doris will execute the corresponding queries for you. Tests in user production environments show that Doris V2.1 is compatible with 99% of Trino SQL. + +### Arrow Flight SQL protocol + +As a column-oriented database compatible with MySQL 8.0 protocol, Doris V2.1 now supports the Arrow Flight SQL protocol as well, so users can have fast access to Doris data via Pandas/Numpy without data serialization and deserialization. For most common data types, the Arrow Flight protocol enables tens of times faster performance than the MySQL protocol. + +## Asynchronous materialized view + +V2.1 allows creating a materialized view based on multiple tables. 
This feature currently supports: + +- Transparent rewriting: supports transparent rewriting of common operators including Select, Where, Join, Group By, and Aggregation. +- Auto refresh: supports regular refresh, manual refresh, full refresh, incremental refresh, and partition-based refresh. +- Materialized view of external tables: supports materialized views based on external data tables such as those on Hive, Hudi, and Iceberg; supports synchronizing data from data lakes into Doris internal tables via materialized views. +- Direct query on materialized views: Materialized views can be regarded as the result set after ETL. In this sense, materialized views are data tables, so users can conduct queries on them directly. + +## Enhanced storage + +### Auto-increment column + +V2.1 supports auto-increment columns, which can ensure data uniqueness of each row. This lays the foundation for efficient dictionary encoding and query pagination. For example, for precise UV calculation and customer grouping, users often apply the bitmap type in Doris, the process of which entails dictionary encoding. With V2.1, users can first create a dictionary table using the auto-increment column, and then simply load user data into it. + +### Auto partition + +To further reduce the operation and maintenance burden, V2.1 allows auto data partitioning. Upon data ingestion, it detects whether a partition exists for the data based on the partitioning column. If not, it automatically creates one and starts data ingestion. + +### High-concurrency real-time data ingestion + +For data writing, a back pressure mechanism is in place to avoid excessive data versions, so as to reduce resource consumption by data version merging. In addition, V2.1 supports group commit ([read more](../../data-operate/import/import-way/group-commit-manual)), which accumulates multiple writes and commits them as one. 
Benchmark tests on group commit with JDBC ingestion and the Stream Load method present great results. + +## Semi-structured data analysis + +### A new data type: Variant + +V2.1 supports a new data type named Variant. It can accommodate semi-structured data such as JSON as well as compound data types that contain integers, strings, booleans, etc. Users don't have to pre-define the exact data types for a Variant column in the table schema. The Variant type is handy when processing nested data structures. +You can include Variant columns and static columns with pre-defined data types in the same table. This will provide you with more flexibility in storage and queries. +Tests with ClickBench datasets prove that data in Variant columns takes up the same storage space as data in static columns, which is half of that in JSON format. In terms of query performance, the Variant type enables 8 times higher query speed than JSON in hot runs and even more in cold runs. + +### IP types + +Doris V2.1 provides native support for IPv4 and IPv6. It stores IP data in binary format, which cuts down storage space usage by 60% compared to IP strings in plain text. Along with these IP types, we have added over 20 functions for IP data processing. + +### More powerful functions for compound data types + +- `explode_map`: supports exploding rows into columns for the Map data type. +- Supports the STRUCT data type in IN predicates. + +## Workload Management + +### Hard isolation of resources + +On the basis of the Workload Group mechanism, which imposes a soft limit on the resources that a workload group can use, Doris 2.1 introduces a hard limit on CPU resource consumption for workload groups as a way to ensure higher stability in query performance. + +### TopSQL + +V2.1 allows users to check the most resource-consuming SQL queries at runtime. This can be a big help when handling cluster load spikes caused by unexpected large queries. 
+ + +## Others + +### Decimal 256 + +For users in the financial sector or high-end manufacturing, V2.1 supports a high-precision data type: Decimal, which supports up to 76 significant digits (an experimental feature; please set enable_decimal256=true). + +### Job scheduler + +V2.1 provides a good option for regular task scheduling: Doris Job Scheduler. It can trigger the pre-defined operations on schedule or at fixed intervals. The Doris Job Scheduler is accurate to the second. It provides consistency guarantee for data writing, high efficiency and flexibility, high-performance processing queues, retraceable scheduling records, and high availability of jobs. + +### Support Docker fast start to experience the new version + +Starting from version 2.1.0, we will provide a separate Docker Image to support the rapid creation of a 1FE, 1BE Docker container to experience the new version of Doris. The container will complete the initialization of FE and BE, BE registration and other steps by default. After the container is created, the Doris cluster can be accessed and used directly after about 1 minute. In this image version, the default `max_map_count`, `ulimit`, `Swap` and other hard limits are removed. It supports X64 (avx2) machines and ARM machines for deployment. The default open ports are 8000, 8030, 8040, 9030. If you need to experience the Broker component, you can add the environment variable `--env BROKER=true` at startup to start the Broker process synchronously. After startup, it will automatically complete the registration. The Broker name is `test`. + +Please note that this version is only suitable for quick experience and functional testing, not for production environments! + +## Behavior changed + +- The default data model is the Merge-on-Write Unique Key model. enable_unique_key_merge_on_write will be included as a default setting when a table is created in the Unique Key model. 
+- As inverted index has proven to be more performant than bitmap index, V2.1 stops supporting bitmap index. Existing bitmap indexes will remain effective but new creation is not allowed. We will remove bitmap index-related code in the future. +- cpu_resource_limit is no longer supported. It is to put a limit on the number of scanner threads on Doris BE. Since the workload group mechanism also supports such settings, the already configured cpu_resource_limit will be invalid. +- The default value of enable_segcompaction is true. This means Doris supports compaction of multiple segments in the same rowset. +- Audit log plug-in + - Since V2.1.0, Doris has a built-in audit log plug-in. Users can simply enable or disable it by setting the enable_audit_plugin parameter. + - If you have already installed your own audit log plug-in, you can either continue using it after upgrading to Doris V2.1, or uninstall it and use the one in Doris. Please note that the audit log table will be relocated after switching plug-in. + - For more details, please see the [docs](../../admin-manual/audit-plugin). 
+ + +## Credits +Thanks all who contribute to this release: + +467887319, 924060929, acnot, airborne12, AKIRA, alan_rodriguez, AlexYue, allenhooo, amory, amory, AshinGau, beat4ocean, BePPPower, bigben0204, bingquanzhao, BirdAmosBird, BiteTheDDDDt, bobhan1, caiconghui, camby, camby, CanGuan, caoliang-web, catpineapple, Centurybbx, chen, ChengDaqi2023, ChenyangSunChenyang, Chester, ChinaYiGuan, ChouGavinChou, chunping, colagy, CSTGluigi, czzmmc, daidai, dalong, dataroaring, DeadlineFen, DeadlineFen, deadlinefen, deardeng, didiaode18, DongLiang-0, dong-shuai, Doris-Extras, Dragonliu2018, DrogonJackDrogon, DuanXujianDuan, DuRipeng, dutyu, echo-dundun, ElvinWei, englefly, Euporia, feelshana, feifeifeimoon, feiniaofeiafei, felixwluo, figurant, flynn, fornaix, FreeOnePlus, Gabriel39, gitccl, gnehil, GoGoWen, gohalo, guardcrystal, hammer, HappenLee, HB, hechao, HelgeLarsHelge, herry2038, HeZhangJianHe, HHoflittlefish777, HonestManXin, hongkun-Shao, HowardQin, hqx871, httpshirley, htyoung, huanghaibin, HuJerryHu, HuZhiyuHu, Hyman-zhao, i78086, irenesrl, ixzc, jacktengg, jacktengg, jackwener, jayhua, Jeffrey, jiafeng.zhang, Jibing-Li, JingDas, julic20s, kaijchen, kaka11chen, KassieZ, kindred77, KirsCalvinKirs, KirsCalvinKirs, kkop, koarz, LemonLiTree, LHG41278, liaoxin01, LiBinfeng-01, LiChuangLi, LiDongyangLi, Lightman, lihangyu, lihuigang, LingAdonisLing, liugddx, LiuGuangdongLiu, LiuHongLiu, liuJiwenliu, LiuLijiaLiu, lsy3993, LuGuangmingLu, LuoMetaLuo, luozenglin, Luwei, Luzhijing, lxliyou001, Ma1oneZhang, mch_ucchi, Miaohongkai, morningman, morrySnow, Mryange, mymeiyi, nanfeng, nanfeng, Nitin-Kashyap, PaiVallishPai, Petrichor, plat1ko, py023, q763562998, qidaye, QiHouliangQi, ranxiang327, realize096, rohitrs1983, sdhzwc, seawinde, seuhezhiqiang, seuhezhiqiang, shee, shuke987, shysnow, songguangfan, Stalary, starocean999, SunChenyangSun, sunny, SWJTU-ZhangLei, TangSiyang2001, Tanya-W, taoxutao, Uniqueyou, vhwzIs, walter, walter, wangbo, Wanghuan, wangqt, wangtao, 
wangtianyi2004, wenluowen, whuxingying, wsjz, wudi, wudongliang, wuwenchihdu, wyx123654, xiangran0327, Xiaocc, XiaoChangmingXiao, xiaokang, XieJiann, Xinxing, xiongjx, xuefengze, xueweizhang, XueYuhai, XuJianxu, xuke-hat, xy, xy720, xyfsjq, xzj7019, yagagagaga, yangshijie, YangYAN, yiguolei, yiguolei, yimeng, YinShaowenYin, Yoko, yongjinhou, ytwp, yuanyuan8983, yujian, yujun777, Yukang-Lian, Yulei-Yang, yuxuan-luo, zclllyybb, ZenoYang, zfr95, zgxme, zhangdong, zhangguoqiang, zhangstar333, zhangstar333, zhangy5, ZhangYu0123, zhannngchen, ZhaoLongZhao, zhaoshuo, zhengyu, zhiqqqq, ZhongJinHacker, ZhuArmandoZhu, zlw5307, ZouXinyiZou, zxealous, zy-kkk, zzwwhh, zzzxl1993, zzzzzzzs diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md new file mode 100644 index 0000000000000..384bccdceb414 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.1.md @@ -0,0 +1,251 @@ +--- +{ + "title": "Release 2.1.1", + "language": "en" +} +--- + + + +Dear community members, Apache Doris 2.1.1 was officially released on April 3, 2024, with several enhancements and bug fixes based on 2.1.0, enabling a smoother user experience. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## Behavior Changed + +1. Change float type output format to improve float type serialization performance. + +- https://github.com/apache/doris/pull/32049 + +2. Change system table value functions active_queries(), workload_groups() to system tables. + +- https://github.com/apache/doris/pull/32314 + +3. Disable the show query/load profile statement because few developers use it and the pipeline and pipelinex engines do not support it. + +- https://github.com/apache/doris/pull/32467 + +4. 
Upgrade arrow flight version to 15.0.2 to fix some bugs; please use ADBC version 15.0.2 to access Doris. + +- https://github.com/apache/doris/pull/32827 + + + +## Upgrade Problem + +1. BE will core during rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32672 + +- https://github.com/apache/doris/pull/32444 + +- https://github.com/apache/doris/pull/32162 + +2. JDBC Catalog will have query errors during rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32618 + + + +## New Feature + +1. Enable column auth by default. + +- https://github.com/apache/doris/pull/32659 + + +2. Get correct cores for pipeline and pipelinex engine when running within docker or k8s. + +- https://github.com/apache/doris/pull/32370 + +3. Support reading the parquet int96 type. + +- https://github.com/apache/doris/pull/32394 + +4. Enable proxy protocol to support IP transparency. Using this protocol, IP transparency for load balancing can be achieved, so that after load balancing, Doris can still obtain the client's real IP and implement permission control such as whitelisting. + +- https://github.com/apache/doris/pull/32338/files + +5. Add workload group queue related columns for active_queries system table. Users can use this system table to monitor the workload queue usage. + +- https://github.com/apache/doris/pull/32259 + +6. Add new system table backend_active_tasks to monitor the real-time query statistics on every BE. + +- https://github.com/apache/doris/pull/31945 + +7. Add ipv4 and ipv6 support for spark-doris connector. + +- https://github.com/apache/doris/pull/32240 + +8. Add inverted index support for CCR. + +- https://github.com/apache/doris/pull/32101 + +9. Support select experimental session variable. + +- https://github.com/apache/doris/pull/31837 + +10. Support materialized view with bitmap_union(bitmap_from_array()) case. + +- https://github.com/apache/doris/pull/31962 + +11. Support partition prune for *HIVE_DEFAULT_PARTITION*. 
+ +- https://github.com/apache/doris/pull/31736 + +12. Support function in set variable statement. + +- https://github.com/apache/doris/pull/32492 + +13. Support arrow serialization for varint type. + +- https://github.com/apache/doris/pull/32809 + + + +## Optimization + +1. Auto resume routine load when BE restarts or during upgrade, and keep the routine load stable. + +- https://github.com/apache/doris/pull/32239 + +2. Routine Load: optimize the algorithm that allocates tasks to BEs for load balancing. + +- https://github.com/apache/doris/pull/32021 + +3. Spark Load: update spark version for spark load to resolve CVE problems. + +- https://github.com/apache/doris/pull/30368 + +4. Skip cooldown if the tablet is dropped. + +- https://github.com/apache/doris/pull/32079 + +5. Support using workload group to manage routine load. + +- https://github.com/apache/doris/pull/31671 + +6. [MTMV] Improve the performance of query rewriting by materialized view. + +- https://github.com/apache/doris/pull/31886 + +7. Reduce jvm heap memory consumed by profiles of BrokerLoadJob. + +- https://github.com/apache/doris/pull/31985 +8. Improve high-QPS query performance by speeding up PartitionPrunner. + +- https://github.com/apache/doris/pull/31970 + +9. Reduce duplicated memory consumption for column name and column path for schema cache. + +- https://github.com/apache/doris/pull/31141 + +10. Support more join types for query rewriting by materialized view such as INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, LEFT SEMI JOIN, RIGHT SEMI JOIN, LEFT ANTI JOIN, RIGHT ANTI JOIN + +- https://github.com/apache/doris/pull/32909 + + + +## Bugfix + + +1. Do not push down topn-filter through right/full outer join if the first orderkey is nulls first. + +- https://github.com/apache/doris/pull/32633 + +2. Fix memory leak in Java UDF + +- https://github.com/apache/doris/pull/32630 + +3. If some odbc tables use the same resource, and restore not all odbc tables, it will not retain the resource. 
+and check some conf for backup/restore + +- https://github.com/apache/doris/pull/31989 + +4. Fold constant will core for variant type. + +- https://github.com/apache/doris/pull/32265 + +5. Routine load will pause when a transaction fails in some cases. + +- https://github.com/apache/doris/pull/32638 + +6. The result of left semi join with an empty right side should be false instead of null. + +- https://github.com/apache/doris/pull/32477 + +7. Fix core when building an inverted index for a new column with no data. + +- https://github.com/apache/doris/pull/32669 + +8. Fix BE core caused by null-safe-equal join. + +- https://github.com/apache/doris/pull/32623 + +9. Partial update: fix data correctness risk when loading delete sign data into a table with a sequence column. + +- https://github.com/apache/doris/pull/32574 + +10. Select outfile: Fix the column type mapping in the orc/parquet file format. + +- https://github.com/apache/doris/pull/32281 + +11. Fix BE core during restore stage. + +- https://github.com/apache/doris/pull/32489 + +12. Using the array_agg function after other aggregate functions such as count or sum may cause BE core. + +- https://github.com/apache/doris/pull/32387 + +13. Variant type should always be nullable; otherwise there will be some bugs. + +- https://github.com/apache/doris/pull/32248 + +14. Fix the bug of handling empty blocks in schema change. + +- https://github.com/apache/doris/pull/32396 + +15. Fix BE core when using json_length() in some cases. + +- https://github.com/apache/doris/pull/32145 + +16. Fix error when querying an Iceberg table using a date cast predicate + +- https://github.com/apache/doris/pull/32194 + +17. Fix some bugs when building inverted indexes for the variant type. + +- https://github.com/apache/doris/pull/31992 + +18. Wrong result of two or more map_agg functions in a query. + +- https://github.com/apache/doris/pull/31928 + +19. Fix wrong result of money_format function. + +- https://github.com/apache/doris/pull/31883 + +20. Fix connection hang after too many connections. 
+ +- https://github.com/apache/doris/pull/31594 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md new file mode 100644 index 0000000000000..6116bd9984632 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.2.md @@ -0,0 +1,110 @@ +--- +{ + "title": "Release 2.1.2", + "language": "en" +} +--- + + + +## Behavior Changed + +1. Set the default value of the `data_consistence` property of EXPORT to partition to make export more stable during load. + +- https://github.com/apache/doris/pull/32830 + +2. Some MySQL connectors (e.g., dotnet MySQL.Data) rely on a variable's column type to make a connection. + + e.g., select `@@autocommit` should have column type BIGINT, not BIT; otherwise it will throw an error. So we change the column type of `@@autocommit` to BIGINT. + + - https://github.com/apache/doris/pull/33282 + + +## Upgrade Problem + +1. Normal workload group is not created when upgrading from 2.0 or other old versions. + + - https://github.com/apache/doris/pull/33197 + +## New Feature + + +1. Add processlist table in information_schema database; users can use this table to query active connections. + + - https://github.com/apache/doris/pull/32511 + +2. Add a new table valued function `LOCAL` to allow access to file systems such as shared storage. + + - https://github.com/apache/doris-website/pull/494 + + +## Optimization + +1. Skip some unnecessary processes to make graceful stop quicker in K8s env. + + - https://github.com/apache/doris/pull/33212 + +2. Add rollup table name in profile to help find the mv selection problem. + + - https://github.com/apache/doris/pull/33137 + +3. Add test connection function to DB2 database to allow users to check the connection when creating a DB2 Catalog. + + - https://github.com/apache/doris/pull/33335 + +4. 
Add DNS Cache for FQDN to accelerate the connect process among BEs in K8s env. + + - https://github.com/apache/doris/pull/32869 + +5. Refresh external table's rowcount async to make the query plan more stable. + + - https://github.com/apache/doris/pull/32997 + + +## Bugfix + + +1. Fix the issue where Iceberg Catalogs of HMS and Hadoop did not support Iceberg properties like "io.manifest.cache-enabled" to enable manifest cache in Iceberg. + + - https://github.com/apache/doris/pull/33113 + +2. The offset param in the `LEAD`/`LAG` functions can now be 0. + + - https://github.com/apache/doris/pull/33174 + +3. Fix some timeout issues with load. + + - https://github.com/apache/doris/pull/33077 + + - https://github.com/apache/doris/pull/33260 + +4. Fix core problem related to the `ARRAY`/`MAP`/`STRUCT` compaction process. + + - https://github.com/apache/doris/pull/33130 + + - https://github.com/apache/doris/pull/33295 + +5. Fix runtime filter wait timeout. + + - https://github.com/apache/doris/pull/33369 + +6. Fix `unix_timestamp` core for string input in auto partition. + + - https://github.com/apache/doris/pull/32871 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md new file mode 100644 index 0000000000000..e88ec3e94fb6d --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.3.md @@ -0,0 +1,191 @@ +--- +{ + "title": "Release 2.1.3", + "language": "en" +} +--- + + + +Apache Doris 2.1.3 was officially released on May 21, 2024. This version includes several improvements, covering writing data back to Hive, materialized views, and permission management, as well as bug fixes. It further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + + +## Feature Enhancements + +**1. 
Support writing back data to Hive tables via Hive Catalog** + +Starting from version 2.1.3, Apache Doris supports DDL and DML operations on Hive. Users can directly create databases and tables in Hive through Apache Doris and write data to Hive tables by executing `INSERT INTO` statements. This feature allows users to perform complete data query and write operations on Hive through Apache Doris, further simplifying the integrated lakehouse architecture. + +Please refer to: [https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) + +**2. Support building new asynchronous materialized views on top of existing ones** + +Users can create new asynchronous materialized views on top of existing ones, directly reusing pre-computed intermediate results for data processing. This simplifies complex aggregation and computation operations, reducing resource consumption and maintenance costs while further accelerating query performance and improving data availability. [#32984](https://github.com/apache/doris/pull/32984) + +**3. Support rewriting through nested materialized views** + +Materialized View (MV) is a database object used to store query results. Now, Apache Doris supports rewriting through nested materialized views, which helps optimize query performance. [#33362](https://github.com/apache/doris/pull/33362) + +**4. New `SHOW VIEWS` statement** + +The `SHOW VIEWS` statement can be used to query views in the database, facilitating better management and understanding of view objects in the database. [#32358](https://github.com/apache/doris/pull/32358) + +**5. Workload Group supports binding to specific BE nodes** + +Workload Group can be bound to specific BE nodes, enabling more refined control over query execution to optimize resource usage and improve performance. [#32874](https://github.com/apache/doris/pull/32874) + +**6. 
Broker Load supports compressed JSON format** + +Broker Load now supports importing compressed JSON format data, significantly reducing bandwidth requirements for data transmission and accelerating data import performance. [#30809](https://github.com/apache/doris/pull/30809) + +**7. TRUNCATE Function can use columns as scale parameters** + +The TRUNCATE function can now accept columns as scale parameters, providing more flexibility when processing numerical data. [#32746](https://github.com/apache/doris/pull/32746) + +**8. Add new functions `uuid_to_int` and `int_to_uuid`** + +These two functions allow users to convert between UUID and integer, significantly helping in scenarios that require handling UUID data. [#33005](https://github.com/apache/doris/pull/33005) + +**9. Add `bypass_workload_group` session variable to bypass query queue** + +The `bypass_workload_group` session variable allows certain queries to bypass the Workload Group queue and execute directly, which is useful for handling critical queries that require quick responses. [#33101](https://github.com/apache/doris/pull/33101) + +**10. Add strcmp function** + +The strcmp function compares two strings and returns their comparison result, simplifying text data processing. [#33272](https://github.com/apache/doris/pull/33272) + +**11. Support HLL functions `hll_from_base64` and `hll_to_base64`** + +HyperLogLog (HLL) is an algorithm for cardinality estimation. These two functions allow users to decode HLL data from a Base64-encoded string or encode HLL data as a Base64 string, which is very useful for storing and transmitting HLL data. [#32089](https://github.com/apache/doris/pull/32089) + +## Optimization and Improvements + +**1. Replace SipHash with XXHash to improve shuffle performance** + +Both SipHash and XXHash are hashing functions, but XXHash may provide faster hashing speeds and better performance in certain scenarios. 
This optimization aims to improve performance during data shuffling by adopting XXHash. [#32919](https://github.com/apache/doris/pull/32919) + +**2. Asynchronous materialized views support NULL partition columns in OLAP tables** + +This enhancement allows asynchronous materialized views to support NULL partition columns in OLAP tables, enhancing data processing flexibility.[#32698](https://github.com/apache/doris/pull/32698) + +**3. Limit maximum string length to 1024 when collecting column statistics to control BE memory usage** + +Limiting the string length when collecting column statistics prevents excessive data from consuming too much BE memory, helping maintain system stability and performance. [#32470](https://github.com/apache/doris/pull/32470) + +**4. Support dynamic deletion of Bitmap cache to improve performance** + +Dynamically deleting no longer needed Bitmap Cache can free up memory and improve system performance. [#32991](https://github.com/apache/doris/pull/32991) + +**5. Reduce memory usage during ALTER operations** + +Reducing memory usage during ALTER operations improves the efficiency of system resource utilization. [#33474](https://github.com/apache/doris/pull/33474) + +**6. Support constant folding for complex types** + +Supports constant folding for Array/Map/Struct complex types.[#32867](https://github.com/apache/doris/pull/32867) + +**7. Add support for Variant type in Aggregate Key Model** + +The Variant data type can store multiple data types. This optimization allows aggregation operations on Variant type data, enhancing the flexibility of semi-structured data analysis. [#33493](https://github.com/apache/doris/pull/33493) + +**8. Support new inverted index format in CCR** [#33415](https://github.com/apache/doris/pull/33415) + +**9. Optimize rewriting performance for nested materialized views** [#34127](https://github.com/apache/doris/pull/34127) + +**10. 
Support decimal256 type in row-based storage format** + +Supporting the decimal256 type in row-based storage extends the system's ability to handle high-precision numerical data. [#34887](https://github.com/apache/doris/pull/34887) + +## Behavioral Changes + +**1. Authorization** + +- **Grant_priv permission changes**: `Grant_priv` can no longer be arbitrarily granted. When performing a `GRANT` operation, the user not only needs to have `Grant_priv` but also the permissions to be granted. For example, to grant `SELECT` permission on `table1`, the user needs both `GRANT` permission and `SELECT` permission on `table1`, enhancing security and consistency in permission management. [#32825](https://github.com/apache/doris/pull/32825) + +- **Workload group and resource usage_priv**: `Usage_priv` for Workload Group and Resource is no longer global but limited to Resource and Workload Group, making permission granting and usage more specific. [#32907](https://github.com/apache/doris/pull/32907) + +- **Authorization for operations**: Operations that were previously unauthorized now have corresponding authorizations for more detailed and comprehensive operational permission control. [#33347](https://github.com/apache/doris/pull/33347) + +**2. LOG directory configuration** + +The log directory configuration for FE and BE now uniformly uses the `LOG_DIR` environment variable. All other different types of logs will be stored with `LOG_DIR` as the root directory. To maintain compatibility between versions, the previous configuration item `sys_log_dir` can still be used. [#32933](https://github.com/apache/doris/pull/32933) + +**3. S3 Table Function (TVF)** + +Due to issues with correctly recognizing or processing S3 URLs in certain cases, the parsing logic for object storage paths has been refactored. For file paths in S3 table functions, the `force_parsing_by_standard_uri` parameter needs to be passed to ensure correct parsing. 
[#33858](https://github.com/apache/doris/pull/33858) + +## Upgrade Issues + +Since many users use certain keywords as column names or attribute values, the following keywords have been set as non-reserved, allowing users to use them as identifiers. [#34613](https://github.com/apache/doris/pull/34613) + +## Bug Fixes + +**1. Fix no data error when reading Hive tables on Tencent Cloud COSN** + +Resolved the no data error that could occur when reading Hive tables on Tencent Cloud COSN, enhancing compatibility with Tencent Cloud storage services. + +**2. Fix incorrect results returned by `milliseconds_diff` function** + +Fixed an issue where the `milliseconds_diff` function returned incorrect results in some cases, ensuring the accuracy of time difference calculations. [#32897](https://github.com/apache/doris/pull/32897) + +**3. User-defined variables should be forwarded to the Master node** + +Ensured that user-defined variables are correctly passed to the Master node for consistency and correct execution logic across the entire system. [#33013](https://github.com/apache/doris/pull/33013) + +**4. Fix Schema Change issues when adding complex type columns** + +Resolved Schema Change issues that could arise when adding complex type columns, ensuring the correctness of Schema Changes. [#31824](https://github.com/apache/doris/pull/31824) + +**5. Fix data loss issue in Routine Load when FE Master node changes** + +`Routine Load` is often used to subscribe to Kafka message queues. This fix addresses potential data loss issues that may occur during FE Master node changes. [#33678](https://github.com/apache/doris/pull/33678) + +**6. Fix Routine Load failure when Workload Group cannot be found** + +Resolved an issue where `Routine Load` would fail if the specified Workload Group could not be found. [#33596](https://github.com/apache/doris/pull/33596) + +**7. 
Support column string64 to avoid join failures when string size overflows uint32** + +In some cases, string sizes may exceed the uint32 limit. Supporting the `string64` type ensures correct execution of string JOIN operations. [#33850](https://github.com/apache/doris/pull/33850) + +**8. Allow Hadoop users to create Paimon Catalog** + +Permitted authorized Hadoop users to create Paimon Catalogs. [#33833](https://github.com/apache/doris/pull/33833) + +**9. Fix `function_ipxx_cidr` function issues with constant parameters** + +Resolved problems with the `function_ipxx_cidr` function when handling constant parameters, ensuring the correctness of function execution. [#33968](https://github.com/apache/doris/pull/33968) + +**10. Fix file download errors when restoring using HDFS** + +Resolved "failed to download" errors encountered during data restoration using HDFS, ensuring the accuracy and reliability of data recovery. [#33303](https://github.com/apache/doris/issues/33303) + +**11. Fix column permission issues related to hidden columns** + +In some cases, permission settings for hidden columns may be incorrect. This fix ensures the correctness and security of column permission settings. [#34849](https://github.com/apache/doris/pull/34849) + +**12. 
Fix issue where Arrow Flight cannot obtain the correct IP in K8s deployments** + +This fix resolves an issue where Arrow Flight cannot correctly obtain the IP address in Kubernetes deployment environments. [#34850](https://github.com/apache/doris/pull/34850) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md new file mode 100644 index 0000000000000..521694ffa60fa --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.4.md @@ -0,0 +1,289 @@ +--- +{ + "title": "Release 2.1.4", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.4 was officially released on June 26, 2024.** In this update, we have optimized various functional experiences for data lakehouse scenarios, with a focus on resolving the abnormal memory usage issue in the previous version. Additionally, we have implemented several improvements and bug fixes to enhance stability. You are welcome to download and use it. + + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + +## Behavior changes + +- Non-existent files will be ignored when querying external tables such as Hive. [#35319](https://github.com/apache/doris/pull/35319) + + The file list is obtained from the meta cache, and it may not be consistent with the actual file list. + + Ignoring non-existent files helps to avoid query errors. + +- By default, creating a Bitmap Index will no longer be automatically changed to an Inverted Index. [#35521](https://github.com/apache/doris/pull/35521) + + This behavior is controlled by the FE configuration item `enable_create_bitmap_index_as_inverted_index`, which defaults to false. + +- When starting FE and BE processes using `--console`, all logs will be output to the standard output and differentiated by prefixes indicating the log type. 
[#35679](https://github.com/apache/doris/pull/35679) + + For more information, please see the documentation: + + - [Log Management - FE Log](../admin-manual/log-management/fe-log.md) + + - [Log Management - BE Log](../admin-manual/log-management/be-log.md) + +- If no table comment is provided when creating a table, the default comment will be empty instead of using the table type as the default comment. [#36025](https://github.com/apache/doris/pull/36025) + +- The default precision of DECIMALV3 has been adjusted from (9, 0) to (38, 9) to maintain compatibility with the version in which this feature was initially released. [#36316](https://github.com/apache/doris/pull/36316) + +## New features + +### Query optimizer + +- Support FE flame graph tool + + For more information, see the [documentation](/community/developer-guide/fe-profiler.md) + +- Support `SELECT DISTINCT` to be used with aggregation. + +- Support single table query rewrite without `GROUP BY`. This is useful for complex filters or expressions. [#35242](https://github.com/apache/doris/pull/35242). + +- The new optimizer fully supports point query functionality [#36205](https://github.com/apache/doris/pull/36205). + +### Data Lakehouse + +- Support native reader of Apache Paimon deletion vector [#35241](https://github.com/apache/doris/pull/35241) + +- Support using Resource in Table Valued Functions [#35139](https://github.com/apache/doris/pull/35139) + +- Access controller with Hive Ranger plugin supports Data Mask + +### Asynchronous materialized views + +- Support refreshes triggered by internal table updates: if a materialized view uses an internal table and the data in that table changes, a refresh of the materialized view can be triggered. To enable this, specify REFRESH ON COMMIT when creating the materialized view. + +- Support transparent rewriting for single tables. For more information, see [Querying Async Materialized View](../query/view-materialized-view/query-async-materialized-view.md). 
+ +- Transparent rewriting supports aggregation roll-up for agg_state and agg_union types; materialized views can be defined as agg_state or agg_union, and queries can use specific aggregation functions or use agg_merge. For more information, see [AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md). + +### Others + +- Added function `replace_empty`. + + For more information, see [documentation](../sql-manual/sql-functions/string-functions/replace_empty). + +- Support `show storage policy using` statement. + + For more information, see [documentation](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md). + +- Support JVM metrics on the BE side. + + Set `enable_jvm_monitor=true` in `be.conf` to enable this feature. + +## Improvements + +- Supported creating inverted indexes for columns with Chinese names. [#36321](https://github.com/apache/doris/pull/36321) + +- Estimate memory consumed by segment cache more accurately so that unused memory can be released more quickly. [#35751](https://github.com/apache/doris/pull/35751) + +- Filter empty partitions before exporting tables to remote storage. [#35542](https://github.com/apache/doris/pull/35542) + +- Optimize routine load task allocation algorithm to balance the load among Backends. [#34778](https://github.com/apache/doris/pull/34778) + +- Provide hints when a related variable is not found during a set operation. [#35775](https://github.com/apache/doris/pull/35775) + +- Support placing Java UDF jar files in the FE's `custom_lib` directory for default loading. [#35984](https://github.com/apache/doris/pull/35984) + +- Add a timeout global variable `audit_plugin_load_timeout` for audit log load jobs. + +- Optimize the performance of transparent rewrite planning for asynchronous materialized views. + +- Optimize the `INSERT` operation so that the BE does not execute it when the source is empty. 
[#34418](https://github.com/apache/doris/pull/34418) + +- Support fetching file lists of Hive/Hudi tables in batches. In a scenario with 1.2 million files, the time taken to obtain the list of files has been reduced from 390 seconds to 46 seconds. [#35107](https://github.com/apache/doris/pull/35107) + +- Forbid dynamic partitioning when creating asynchronous materialized views. + +- Support detecting whether the partition data of external tables in Hive is synchronized with asynchronous materialized views. + +- Allow creating indexes for asynchronous materialized views. + +## Bug fixes + +### Query optimizer + +- Fixed the issue where SQL cache returns old results after truncating a partition. [#34698](https://github.com/apache/doris/pull/34698) + +- Fixed the issue where casting from JSON to other types did not correctly handle nullable attributes. [#34707](https://github.com/apache/doris/pull/34707) + +- Fixed occasional DATETIMEV2 literal simplification errors. [#35153](https://github.com/apache/doris/pull/35153) + +- Fixed the issue where `COUNT(*)` could not be used in window functions. [#35220](https://github.com/apache/doris/pull/35220) + +- Fixed the issue where nullable attributes could be incorrect when all `SELECT` statements under `UNION ALL` have no `FROM` clause. [#35074](https://github.com/apache/doris/pull/35074) + +- Fixed the issue where `bitmap in join` and subquery unnesting could not be used simultaneously. [#35435](https://github.com/apache/doris/pull/35435) + +- Fixed the performance issue where filter conditions could not be pushed down to the CTE producer in specific situations. [#35463](https://github.com/apache/doris/pull/35463) + +- Fixed the issue where aggregate combinators written in uppercase could not be found. [#35540](https://github.com/apache/doris/pull/35540) + +- Fixed the performance issue where window functions were not properly pruned by column pruning. 
[#35504](https://github.com/apache/doris/pull/35504) + +- Fixed the issue where queries might parse incorrectly leading to wrong results when multiple tables with the same name but in different databases appeared simultaneously in the query. [#35571](https://github.com/apache/doris/pull/35571) + +- Fixed the query error caused by generating runtime filters during schema table scans. [#35655](https://github.com/apache/doris/pull/35655) + +- Fixed the issue where nested correlated subqueries could not execute because the join condition was folded into a null literal. [#35811](https://github.com/apache/doris/pull/35811) + +- Fixed the occasional issue where decimal literals were set with incorrect precision during planning. [#36055](https://github.com/apache/doris/pull/36055) + +- Fixed the occasional issue where multiple layers of aggregation were merged incorrectly during planning. [#36145](https://github.com/apache/doris/pull/36145) + +- Fixed the occasional issue where the input-output mismatch error occurred after aggregate expansion planning. [#36207](https://github.com/apache/doris/pull/36207) + +- Fixed the occasional issue where `<=>` was incorrectly converted to `=`. [#36521](https://github.com/apache/doris/pull/36521) + +### Query execution + +- Fixed the issue where the query hangs if the limited rows are reached on the pipeline engine and memory is not released. [#35746](https://github.com/apache/doris/pull/35746) + +- Fixed the BE coredump when `enable_decimal256` is true but falls back to the old planner. [#35731](https://github.com/apache/doris/pull/35731) + +### Asynchronous materialized views + +- Fixed the issue in the asynchronous materialized view build where the store_row_column attribute specified was not being recognized by the core. + +- Fixed the problem in the asynchronous materialized view build where specifying the storage_medium was not taking effect. 
+ +- Resolved the error occurring in the asynchronous materialized view show partitions after the base table is deleted. + +- Fixed the issue where asynchronous materialized views caused backup and restore exceptions. [#35703](https://github.com/apache/doris/pull/35703) + +- Fixed the issue where partition rewrite could lead to incorrect results. [#35236](https://github.com/apache/doris/pull/35236) + +### Semi-structured + +- Fixed the core dump problem when a VARIANT with an empty key is used. [#35671](https://github.com/apache/doris/pull/35671) +- Bitmap and BloomFilter index should not perform light index changes. [#35225](https://github.com/apache/doris/pull/35225) + +### Primary key + +- Fixed the issue where an exception BE restart occurred in the case of partial column updates during import, which could result in duplicate keys. [#35678](https://github.com/apache/doris/pull/35678) + +- Fixed the issue where BE might core dump during clone operations when memory is tight. [#34702](https://github.com/apache/doris/pull/34702) + +### Data Lakehouse + +- Fixed the issue where a Hive table could not be created with a fully qualified name such as `ctl.db.tbl` [#34984](https://github.com/apache/doris/pull/34984) + +- Fixed the issue where the Hive metastore connection did not close when refreshing [#35426](https://github.com/apache/doris/pull/35426) + +- Fixed a potential meta replay issue when upgrading from 2.0.x to 2.1.x. [#35532](https://github.com/apache/doris/pull/35532) + +- Fixed the issue where the Table Valued Function could not read an empty snappy compressed file. 
[#34926](https://github.com/apache/doris/pull/34926) + +- Fixed the issue where unable to read Parquet files with invalid min-max column statistics [#35041](https://github.com/apache/doris/pull/35041) + +- Fixed the issue where unable to handle pushdown predicates with null-aware functions in the Parquet/ORC reader [#35335](https://github.com/apache/doris/pull/35335) + +- Fixed the issue about the order of partition columns when creating a Hive table [#35347](https://github.com/apache/doris/pull/35347) + +- Fixed the issue where writing to a Hive table on S3 failed when partition values contained spaces [#35645](https://github.com/apache/doris/pull/35645) + +- Fixed the issue about incorrect scheme of Aliyun OSS endpoint [#34907](https://github.com/apache/doris/pull/34907) + +- Fixed the issue where the Parquet format Hive table written by Doris could not be read by Hive [#34981](https://github.com/apache/doris/pull/34981) + +- Fixed the issue where unable to read ORC files after the schema change of a Hive table [#35583](https://github.com/apache/doris/pull/35583) + +- Fixed the issue where unable to read Paimon tables via JNI after the schema change of the Paimon table [#35309](https://github.com/apache/doris/pull/35309) + +- Fixed the issue of too small Row Groups in Parquet format files written out. 
[#36042](https://github.com/apache/doris/pull/36042) [#36143](https://github.com/apache/doris/pull/36143) + +- Fixed the issue where Paimon tables could not be read after schema changes [#36049](https://github.com/apache/doris/pull/36049) + +- Fixed the issue where Hive Parquet format tables could not be read after schema changes [#36182](https://github.com/apache/doris/pull/36182) + +- Fixed the FE OOM issue caused by Hadoop FS cache [#36403](https://github.com/apache/doris/pull/36403) + +- Fixed the issue where FE could not start after enabling the Hive Metastore Listener [#36533](https://github.com/apache/doris/pull/36533) + +- Fixed the issue of query performance degradation with a large number of files [#36431](https://github.com/apache/doris/pull/36431) + +- Fixed the timezone issue when reading the timestamp column type in Iceberg [#36435](https://github.com/apache/doris/pull/36435) + +- Fixed DATETIME conversion error and data path error on Iceberg Table. [#35708](https://github.com/apache/doris/pull/35708) + +- Support retaining and passing additional user-defined properties of Table Valued Functions to the S3 SDK. [#35515](https://github.com/apache/doris/pull/35515) + + +### Data import + +- Fixed the issue where `CANCEL LOAD` did not work [#35352](https://github.com/apache/doris/pull/35352) + +- Fixed the issue where a null pointer error in the Publish phase of load transactions prevented the load from completing [#35977](https://github.com/apache/doris/pull/35977) + +- Fixed the issue with bRPC serializing large data files when sent via HTTP [#36169](https://github.com/apache/doris/pull/36169) + +### Data management + +- Fixed the issue where the resource tag in ConnectionContext was not set after forwarding DDL or DML to master FE. 
[#35618](https://github.com/apache/doris/pull/35618) + +- Fixed the issue where the restored table name was incorrect when `lower_case_table_names` was enabled [#35508](https://github.com/apache/doris/pull/35508) + +- Fixed the issue where `admin clean trash` could not work [#35271](https://github.com/apache/doris/pull/35271) + +- Fixed the issue where a storage policy could not be deleted from a partition [#35874](https://github.com/apache/doris/pull/35874) + +- Fixed the issue of data loss when importing into a multi-replica automatic partition table [#36586](https://github.com/apache/doris/pull/36586) + +- Fixed the issue where the partition column of a table changed when querying or inserting into an automatic partition table using the old optimizer [#36514](https://github.com/apache/doris/pull/36514) + +### Memory management + +- Fixed the issue of frequent errors in the logs due to failure in obtaining Cgroup meminfo. [#35425](https://github.com/apache/doris/pull/35425) + +- Fixed the issue where the Segment cache size was uncontrolled when using BloomFilter, leading to abnormal process memory growth. [#34871](https://github.com/apache/doris/pull/34871) + +### Permissions + +- Fixed the issue where permission settings were ineffective after enabling case-insensitive table names. [#36557](https://github.com/apache/doris/pull/36557) + +- Fixed the issue where setting LDAP passwords through non-Master FE nodes did not take effect. [#36598](https://github.com/apache/doris/pull/36598) + +- Fixed the issue where authorization could not be checked for the `SELECT COUNT(*)` statement. [#35465](https://github.com/apache/doris/pull/35465) + +### Others + +- Fixed the issue where the client JDBC program could not close the connection if the MySQL connection was broken. [#36616](https://github.com/apache/doris/pull/36616) + +- Fixed MySQL protocol compatibility issue with the `SHOW PROCEDURE STATUS` statement. 
[#35350](https://github.com/apache/doris/pull/35350) + +- The `libevent` now forces Keepalive to solve the issue of connection leaks in certain situations. [#36088](https://github.com/apache/doris/pull/36088) + +## Credits + +Thanks to every one who contributes to this release. + +@airborne12, @amorynan, @AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @caiconghui, @CalvinKirs, @cambyzju, @catpineapple, @cjj2010, @csun5285, @DarvenDuan, @dataroaring, @deardeng, @Doris-Extras, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @Gabriel39, @gavinchou, @GoGoWen, @HappenLee, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @lide-reed, @luennng, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @nextdreamblue, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @TangSiyang2001, @Thearas, @Vallishp, @w41ter, @wangbo, @whutpencil, @wsjz, @wuwenchi, @xiaokang, @xiedeyantu, @XieJiann, @xinyiZzz, @XuPengfei-1020, @xy720, @xzj7019, @yiguolei, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zfr9527, @zgxme, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md new file mode 100644 index 0000000000000..7c1910eeae8c5 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.5.md @@ -0,0 +1,395 @@ +--- +{ + "title": "Release 2.1.5", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.5 was officially released on July 24, 2024.** In this update, we have optimized various functional experiences for data lakehouse and high concurrency scenarios, functionalities of asynchronous materialized views. 
Additionally, we have implemented several improvements and bug fixes to enhance stability. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The default connection pool size for the JDBC Catalog has been increased from 10 to 30 to prevent connection exhaustion in high-concurrency scenarios. [#37023](https://github.com/apache/doris/pull/37023). + +- The system's reserved memory (low water mark) has been adjusted to `min(6.4GB, MemTotal * 5%)` to mitigate BE OOM issues. + +- When processing multiple statements in a single request, only the last statement's result is returned if the `CLIENT_MULTI_STATEMENTS` flag is not set. + +- Direct modifications to data in asynchronous materialized views are no longer permitted. [#37129](https://github.com/apache/doris/pull/37129) + +- A session variable `use_max_length_of_varchar_in_ctas` has been added to control the behavior of varchar and char type length generation during CTAS (Create Table As Select). The default value is true. When set to false, the derived varchar length is used instead of the maximum length. [#37284](https://github.com/apache/doris/pull/37284) + +- Statistics collection now defaults to enabling the functionality of estimating the number of rows in Hive tables based on file size. [#37694](https://github.com/apache/doris/pull/37694) + +- Transparent rewrite for asynchronous materialized views is now enabled by default. [#35897](https://github.com/apache/doris/pull/35897) + +- Transparent rewrite utilizes partitioned materialized views. If partitions fail, the base tables are unioned with the materialized view to ensure data correctness. [#35897](https://github.com/apache/doris/pull/35897) + +## New features + +### Lakehouse + +- The session variable `read_csv_empty_line_as_null` can be used to control whether empty lines are ignored when reading CSV format files. 
[#37153](https://github.com/apache/doris/pull/37153) + + By default, empty lines are ignored. When set to true, empty lines will be read as rows where all columns are null. + +- Compatibility with Presto's complex type output format can be enabled by setting `serde_dialect="presto"`. [#37253](https://github.com/apache/doris/pull/37253) + +### Multi-Table Materialized View + +- Supports non-deterministic functions in materialized view building. [#37651](https://github.com/apache/doris/pull/37651) + +- Atomically replaces definitions of asynchronous materialized views. [#37147](https://github.com/apache/doris/pull/37147) + +- Views creation statements can be viewed via `SHOW CREATE MATERIALIZED VIEW`. [#37125](https://github.com/apache/doris/pull/37125) + +- Transparent rewrites for multi-dimensional aggregation and non-aggregate queries. [#37436](https://github.com/apache/doris/pull/37436) [#37497](https://github.com/apache/doris/pull/37497) + +- Supports DISTINCT aggregations with key columns and partitioning for roll-ups. [#37651](https://github.com/apache/doris/pull/37651) + +- Support for partitioning materialized views to roll up partitions using `date_trunc` [#31812](https://github.com/apache/doris/pull/31812) [#35562](https://github.com/apache/doris/pull/35562) + +- Partitioned table-valued functions (TVFs) are supported. [#36479](https://github.com/apache/doris/pull/36479) + +### Semi-Structured Data Management + +- Tables using the VARIANT type now support partial column updates. [#34925](https://github.com/apache/doris/pull/34925) + +- PreparedStatement support is now enabled by default. [#36581](https://github.com/apache/doris/pull/36581) + +- The VARIANT type can be exported to CSV format. [#37857](https://github.com/apache/doris/pull/37857) + +- `explode_json_object` function transposes JSON Object rows into columns. 
[#36887](https://github.com/apache/doris/pull/36887) + +- The ES Catalog now maps ES NESTED or OBJECT types to the Doris JSON type.[#37101](https://github.com/apache/doris/pull/37101) + +- By default, support_phrase is enabled for inverted indexes with specified analyzers to improve the performance of match_phrase series queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Query Optimizer + +- Support for explaining `DELETE FROM` statements. [#37100](https://github.com/apache/doris/pull/37100) + +- Support for hint form of constant expression parameters [#37988](https://github.com/apache/doris/pull/37988) + +### Memory Management + +- Added an HTTP API to clear the cache. [#36599](https://github.com/apache/doris/pull/36599) + +### Permissions + +- Support for authorization of resources within Table-Valued Functions (TVFs) [#37132](https://github.com/apache/doris/pull/37132) + +## Improvements + +### Lakehouse + +- Upgraded Paimon to version 0.8.1 + +- Fixes ClassNotFoundException for org.apache.commons.lang.StringUtils when querying Paimon tables. [#37512](https://github.com/apache/doris/pull/37512) + +- Added support for Tencent Cloud LakeFS. [#36891](https://github.com/apache/doris/pull/36891) + +- Optimized the timeout duration when fetching file lists for external table queries. [#36842](https://github.com/apache/doris/pull/36842) + +- Configurable via the session variable `fetch_splits_max_wait_time_ms`. + +- Improved default connection logic for SQLServer JDBC Catalog. [#36971](https://github.com/apache/doris/pull/36971) + + By default, the connection encryption settings are not intervened. Only when `force_sqlserver_jdbc_encrypt_false` is set to true, encrypt=false is forcibly added to the JDBC URL to reduce authentication errors. This allows for more flexible control over encryption behavior, enabling it to be turned on or off as needed. + +- Added serde properties to the show create table statements for Hive tables. 
[#37096](https://github.com/apache/doris/pull/37096) + +- Changed the default cache time for Hive table lists on the FE from 1 day to 4 hours + +- Data export (Export/Outfile) now supports specifying compression formats for Parquet and ORC + + For more information, please refer to [docs](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type). + +- When creating a table using CTAS+TVF, partition columns in the TVF are automatically mapped to Varchar(65533) instead of String, allowing them to be used as partition columns for internal tables [#37161](https://github.com/apache/doris/pull/37161) + +- Optimized the number of metadata accesses for Hive write operations [#37127](https://github.com/apache/doris/pull/37127) + +- ES Catalog now supports mapping nested/object types to Doris's Json type. [#37182](https://github.com/apache/doris/pull/37182) + +- Improved error messages when connecting to Oracle using older versions of the ojdbc driver [#37634](https://github.com/apache/doris/pull/37634) + +- When Hudi tables return an empty set during Incremental Read, Doris now also returns an empty set instead of error [#37636](https://github.com/apache/doris/pull/37636) + +- Fixed an issue where inner-outer table join queries could lead to FE timeouts in some cases [#37757](https://github.com/apache/doris/pull/37757) + +- Fixed an issue with FE metadata replay errors during upgrades from older versions to newer versions when the Hive metastore event listener is enabled. [#37757](https://github.com/apache/doris/pull/37757) + +### Multi-Table Materialized View + +- Automate key column selection for asynchronous materialized views. [#36601](https://github.com/apache/doris/pull/36601) + +- Support date_trunc in materialized view partition definitions.. [#35562](https://github.com/apache/doris/pull/35562) + +- Enable transparent rewrites across nested materialized view aggregations. 
[#37651](https://github.com/apache/doris/pull/37651) + +- Asynchronous materialized views remain available when schema changes do not affect the correctness of their data. [#37122](https://github.com/apache/doris/pull/37122) + +- Improve planning speed for transparent rewrites. [#37935](https://github.com/apache/doris/pull/37935) + +- When calculating the availability of asynchronous materialized views, the current refresh status is no longer taken into account. [#36617](https://github.com/apache/doris/pull/36617) + +### Semi-Structured Data Management + +- Optimize DESC performance for viewing VARIANT sub-columns through sampling. [#37217](https://github.com/apache/doris/pull/37217) + +- Support for special JSON data with empty keys in the JSON type. [#36762](https://github.com/apache/doris/pull/36762) + +### Inverted Index + +- Reduce latency by minimizing the invocation of inverted index exists to avoid delays in accessing object storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Optimize the overhead of the inverted index query process. [#35357](https://github.com/apache/doris/pull/35357) + +- Prevent inverted indices in materialized views. [#36869](https://github.com/apache/doris/pull/36869) + +### Query Optimizer + +- When both sides of a comparison expression are literals, the string literal will attempt to convert to the type of the other side. [#36921](https://github.com/apache/doris/pull/36921) + +- Refactored the sub-path pushdown functionality for the variant type, now better supporting complex pushdown scenarios. [#36923](https://github.com/apache/doris/pull/36923) + +- Optimized the logic for calculating the cost of materialized views, enabling more accurate selection of lower-cost materialized views. [#37098](https://github.com/apache/doris/pull/37098) + +- Improved the SQL cache planning speed when using user variables in SQL. 
[#37119](https://github.com/apache/doris/pull/37119) + +- Optimized the row estimation logic for NOT NULL expressions, resulting in better performance when NOT NULL is present in queries. [#37498](https://github.com/apache/doris/pull/37498) + +- Optimized the null rejection derivation logic for LIKE expressions. [#37864](https://github.com/apache/doris/pull/37864) + +- Improved error messages when querying a specific partition fails, making it clearer which table is causing the issue. [#37280](https://github.com/apache/doris/pull/37280) + +### Query Execution + +- Improved the performance of the bitmap_union operator up to 3 times in certain scenarios. + +- Enhanced the reading performance of Arrow Flight in ARM environments. + +- Optimized the execution performance of the explode, explode_map, and explode_json functions. + +### Data Loading + +- Support setting `max_filter_ratio` for `INSERT INTO ... FROM TABLE VALUE FUNCTION` + +## Bug fixes + +### Lakehouse + +- Fixed an issue that caused BE crashes in some cases when querying Parquet format [#37086](https://github.com/apache/doris/pull/37086) + +- Fixed an issue where BE printed excessive logs when querying Parquet format. [#37012](https://github.com/apache/doris/pull/37012) + +- Fixed an issue where the FE side created a large number of duplicate FileSystem objects in some cases. [#37142](https://github.com/apache/doris/pull/37142) + +- Fixed an issue where transaction information was not cleaned up after writing to Hive in some cases. [#37172](https://github.com/apache/doris/pull/37172) + +- Fixed a thread leak issue caused by Hive table write operations in some cases. [#37247](https://github.com/apache/doris/pull/37247) + +- Fixed an issue where Hive Text format row and column delimiters could not be correctly obtained in some cases. [#37188](https://github.com/apache/doris/pull/37188) + +- Fixed a concurrency issue when reading lz4 compressed blocks in some cases. 
[#37187](https://github.com/apache/doris/pull/37187) + +- Fixed an issue where `count(*)` on Iceberg tables returned incorrect results in some cases. [#37810](https://github.com/apache/doris/pull/37810) + +- Fixed an issue where creating a Paimon catalog based on MinIO caused FE metadata replay errors in some cases. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed an issue where using Ranger to create a catalog caused the client to hang in some cases. [#37551](https://github.com/apache/doris/pull/37551) + +### Multi-Table Materialized View + +- Fixed an issue where adding new partitions to the base table could lead to incorrect results after partition aggregation roll-up rewrites. [#37651](https://github.com/apache/doris/pull/37651) + +- Fixed an issue where the materialized view partition status was not set to out-of-sync after deleting associated base table partitions. [#36602](https://github.com/apache/doris/pull/36602) + +- Fixed an occasional deadlock issue during asynchronous materialized view builds. [#37133](https://github.com/apache/doris/pull/37133) + +- Fixed an occasional "nereids cost too much time" error when refreshing a large number of partitions in a single asynchronous materialized view refresh. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed an issue where an asynchronous materialized view could not be created if the final select list contained a null literal. [#37281](https://github.com/apache/doris/pull/37281) + +- Fixed an issue with single-table materialized views where, even though the aggregation materialized view was successfully rewritten, the CBO did not select it. [#35721](https://github.com/apache/doris/pull/35721) [#36058](https://github.com/apache/doris/pull/36058) + +- Fixed an issue where partition derivation failed when building a partitioned materialized view with both join inputs being aggregations. 
[#34781](https://github.com/apache/doris/pull/34781) + +### Semi-Structured Data Management + +- Fixed issues with VARIANT in special cases such as concurrency and abnormal data.[#37976](https://github.com/apache/doris/pull/37976) [#37839](https://github.com/apache/doris/pull/37839) [#37794](https://github.com/apache/doris/pull/37794) [#37674](https://github.com/apache/doris/pull/37674) [#36997](https://github.com/apache/doris/pull/36997) + +- Fixed coredump issues when using VARIANT in unsupported SQL. [#37640](https://github.com/apache/doris/pull/37640) + +- Fixed coredump issues related to MAP data type when upgrading from 1.x to 2.x or higher versions. [#36937](https://github.com/apache/doris/pull/36937) + +- Improved ES Catalog support for Array types. [#36936](https://github.com/apache/doris/pull/36936) + +### Inverted Index + +- Fixed an issue where DROP INDEX for Inverted Index v2 did not delete metadata. [#37646](https://github.com/apache/doris/pull/37646) + +- Fixed query accuracy issues when string length exceeded the "ignore above" threshold. [#37679](https://github.com/apache/doris/pull/37679) + +- Fixed issues with index size statistics. [#37232](https://github.com/apache/doris/pull/37232) [#37564](https://github.com/apache/doris/pull/37564) + +### Query Optimizer + +- Fixed an issue that prevented import operations from executing due to the use of reserved keywords. [#35938](https://github.com/apache/doris/pull/35938) + +- Fixed a type error where char(255) was incorrectly recorded as char(1) when creating a table. [#37671](https://github.com/apache/doris/pull/37671) + +- Fixed incorrect results when the join expression in a correlated subquery was a complex expression. [#37683](https://github.com/apache/doris/pull/37683) + +- Fixed a potential issue with incorrect bucket pruning for decimal types. 
[#38013](https://github.com/apache/doris/pull/38013) + +- Fixed incorrect aggregation operator results when pipeline local shuffle was enabled in certain scenarios. [#38016](https://github.com/apache/doris/pull/38016) + +- Fixed planning errors that could occur when equal expressions existed in aggregation operators. [#36622](https://github.com/apache/doris/pull/36622) + +- Fixed planning errors that could occur when lambda expressions were present in aggregation operators. [#37285](https://github.com/apache/doris/pull/37285) + +- Fixed an issue where a literal generated from a window function being optimized to a literal had the wrong type, preventing execution. [#37283](https://github.com/apache/doris/pull/37283) + +- Fixed an issue with the null attribute being incorrectly output by the aggregate function foreach combinator. [#37980](https://github.com/apache/doris/pull/37980) + +- Fixed an issue where the acos function could not be planned when its parameter was a literal out of range. [#37996](https://github.com/apache/doris/pull/37996) + +- Fixed planning errors when specifying partitions for a query on a synchronized materialized view. [#36982](https://github.com/apache/doris/pull/36982) + +- Fixed occasional Null Pointer Exceptions (NPEs) during planning. [#38024](https://github.com/apache/doris/pull/38024) + +### Query Execution + +- Fixed an error in delete where statements when using decimal data types as conditions. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed an issue where BE memory was not released after query execution ended. [#37792](https://github.com/apache/doris/pull/37792) [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed a problem where audit logs occupied too much FE memory under high QPS scenarios. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed BE core dumps when the sleep function received illegal input values. 
[#37681](https://github.com/apache/doris/pull/37681) + +- Fixed an error encountered during sync filter size execution. [#37103](https://github.com/apache/doris/pull/37103) + +- Fixed incorrect results when using time zones during execution. [#37062](https://github.com/apache/doris/pull/37062) + +- Fixed incorrect results when casting strings to integers. [#36788](https://github.com/apache/doris/pull/36788) + +- Fixed query errors when using the Arrow Flight protocol with pipelinex enabled. [#35804](https://github.com/apache/doris/pull/35804) + +- Fixed errors when casting strings to dates/datetimes. [#35637](https://github.com/apache/doris/pull/35637) + +- Fixed BE core dumps during large table join queries using <=>. [#36263](https://github.com/apache/doris/pull/36263) + +### Storage Management + +- Fixed the issue of invisible DELETE SIGN data encountered during column update and write operations. [#36755](https://github.com/apache/doris/pull/36755) + +- Optimized FE's memory usage during schema changes. [#36756](https://github.com/apache/doris/pull/36756) + +- Fixed the issue where BE would hang during restart due to transactions not being aborted [#36437](https://github.com/apache/doris/pull/36437) + +- Fixed occasional errors when changing from NOT NULL to NULL data types. [#36389](https://github.com/apache/doris/pull/36389) + +- Optimized replica repair scheduling when BE goes down. [#36897](https://github.com/apache/doris/pull/36897) + +- Supported round-robin disk selection for tablet creation on a single BE. [#36900](https://github.com/apache/doris/pull/36900) + +- Fixed query error -230 caused by slow publishing. [#36222](https://github.com/apache/doris/pull/36222) + +- Improved the speed of partition balancing. [#36976](https://github.com/apache/doris/pull/36976) + +- Controlled segment cache using the number of file descriptors (FDs) and memory to avoid FD exhaustion. 
[#37035](https://github.com/apache/doris/pull/37035) + +- Fixed potential replica loss caused by concurrent clone and alter operations [#36858](https://github.com/apache/doris/pull/36858) + +- Fixed the issue of not being able to adjust column order. [#37226](https://github.com/apache/doris/pull/37226) + +- Prohibited certain schema change operations on auto-increment columns. [#37331](https://github.com/apache/doris/pull/37331) + +- Fixed inaccurate error reporting for DELETE operations. [#37374](https://github.com/apache/doris/pull/37374) + +- Adjusted the trash expiration time on BE side to one day. [#37409](https://github.com/apache/doris/pull/37409) + +- Optimized compaction memory usage and scheduling. [#37491](https://github.com/apache/doris/pull/37491) + +- Checked for potential oversized backups causing FE restarts. [#37466](https://github.com/apache/doris/pull/37466) + +- Restored dynamic partition deletion policies and cross-partition behaviors to 2.1.3. [#37570](https://github.com/apache/doris/pull/37570) [#37506](https://github.com/apache/doris/pull/37506) + +- Fixed errors related to decimal types in DELETE predicates. [#37710](https://github.com/apache/doris/pull/37710) + +### Data Loading + +- Fixed data invisibility issues caused by race conditions in error handling during imports [#36744](https://github.com/apache/doris/pull/36744) + +- Added support for hll_from_base64 in streamload imports. [#36819](https://github.com/apache/doris/pull/36819) + +- Fixed potential FE OOM issues when importing very large numbers of tablets for a single table. [#36944](https://github.com/apache/doris/pull/36944) + +- Fixed possible auto-increment column duplication during FE master-slave switchovers. [#36961](https://github.com/apache/doris/pull/36961) + +- Fixed errors when inserting into select with auto-increment columns. [#37029](https://github.com/apache/doris/pull/37029) + +- Reduced the number of data flush threads to optimize memory usage. 
[#37092](https://github.com/apache/doris/pull/37092) + +- Improved automatic recovery and error messaging for routine load tasks. [#37371](https://github.com/apache/doris/pull/37371) + +- Increased the default batch size for routine load. [#37388](https://github.com/apache/doris/pull/37388) + +- Fixed routine load task stoppage due to Kafka EOF expiration. [#37983](https://github.com/apache/doris/pull/37983) + +- Fixed coredump issues in multi-table streaming. [#37370](https://github.com/apache/doris/pull/37370) + +- Fixed premature backpressure caused by inaccurate memory estimation in groupcommit. [#37379](https://github.com/apache/doris/pull/37379) + +- Optimized BE-side thread usage in groupcommit. [#37380](https://github.com/apache/doris/pull/37380) + +- Fixed the issue of no error URL when data was not partitioned. [#37401](https://github.com/apache/doris/pull/37401) + +- Fixed potential memory misoperations during imports. [#38021](https://github.com/apache/doris/pull/38021) + +### Merge on Write Unique Key + +- Reduced memory usage during compaction for primary key tables. [#36968](https://github.com/apache/doris/pull/36968) + +- Fixed potential duplicate data issues when primary key replica cloning fails. [#37229](https://github.com/apache/doris/pull/37229) + +### Permissions + +- Fixed the issue of missing authorization when a table-valued function references a resource. [#37132](https://github.com/apache/doris/pull/37132) + +- Fixed the issue where the SHOW ROLE statement did not include workload group permissions. [#36032](https://github.com/apache/doris/pull/36032) + +- Fixed the issue where executing two statements simultaneously when creating a row policy could cause FE to fail to restart. [#37342](https://github.com/apache/doris/pull/37342) + +- Fixed the issue where, in some cases, upgrading from an older version could result in FE metadata replay failures due to row policies. 
[#37342](https://github.com/apache/doris/pull/37342) + +### Others + +- Fixed the issue of compute nodes participating in internal table creation. [#37961](https://github.com/apache/doris/pull/37961) + +- Fixed the read lag issue when `enable_strong_read_consistency` is set to true. [#37641](https://github.com/apache/doris/pull/37641) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md new file mode 100644 index 0000000000000..c14d25b52573f --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.6.md @@ -0,0 +1,524 @@ +--- +{ + "title": "Release 2.1.6", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.6 was officially released on September 10, 2024.** This version brings continuous upgrades and improvements to the Lakehouse, Async Materialized Views, and Semi-Structured Data Management. Additionally, several fixes have been implemented in areas such as the query optimizer, execution engine, storage management, permission management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- Removed the `delete_if_exists` option from create repository. [#38192](https://github.com/apache/doris/pull/38192) + +- Added the `enable_prepared_stmt_audit_log` session variable to control whether JDBC prepared statements record audit logs, with the default being no recording. [#38624](https://github.com/apache/doris/pull/38624) [#39009](https://github.com/apache/doris/pull/39009) + +- Implemented fd limit and memory constraints for segment cache. [#39689](https://github.com/apache/doris/pull/39689) + +- When the FE configuration item `sys_log_mode` is set to BRIEF, file location information is added to the logs. 
[#39571](https://github.com/apache/doris/pull/39571) + +- Changed the default value of the session variable `max_allowed_packet` to 16MB. [#38697](https://github.com/apache/doris/pull/38697) + +- When a single request contains multiple statements, semicolons must be used to separate them. [#38670](https://github.com/apache/doris/pull/38670) + +- Added support for statements to begin with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Aligned type formatting with MySQL in statements such as `show create table`. [#38012](https://github.com/apache/doris/pull/38012) + +- When the new optimizer planning times out, it no longer falls back to prevent the old optimizer from using longer planning times. [#39499](https://github.com/apache/doris/pull/39499) + +## New features + +### Lakehouse + +- Supported writeback for Iceberg tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/lakehouse/datalake-building/iceberg-build). + +- SQL interception rules now support external tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Added the system table `file_cache_statistics` to view BE data cache metrics. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics). + +### Async Materialized View + +- Supported transparent rewriting during inserts. [#38115](https://github.com/apache/doris/pull/38115) + +- Supported transparent rewriting when variant types exist in queries.[ #37929](https://github.com/apache/doris/pull/37929) + +### Semi-Structured Data Management + +- Supported casting ARRAY MAP to JSON type.[ #36548](https://github.com/apache/doris/pull/36548) + +- Supported the `json_keys` function.[ #36411](https://github.com/apache/doris/pull/36411) + +- Supported specifying the JSON path $. when importing JSON. 
[#38213](https://github.com/apache/doris/pull/38213) + +- ARRAY, MAP, STRUCT types now support `replace_if_not_null`[#38304](https://github.com/apache/doris/pull/38304) + +- ARRAY, MAP, STRUCT types now support adjusting column order.[#39210](https://github.com/apache/doris/pull/39210) + +- Added the `multi_match` function to match keywords across multiple fields, with support for inverted index acceleration. [#37722](https://github.com/apache/doris/pull/37722) + +### Query Optimizer + +- Filled in the original database name, table name, column name, and alias for returned columns in the MySQL protocol. [ #38126](https://github.com/apache/doris/pull/38126) + +- Supported the aggregation function `group_concat` with both order by and distinct simultaneously. [#38080](https://github.com/apache/doris/pull/38080) + +- SQL cache now supports reusing cached results for queries with different comments. [#40049](https://github.com/apache/doris/pull/40049) + +- In partition pruning, supported including `date_trunc` and date functions in filter conditions. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- Allowed using the database name where the table resides as a qualifier prefix for table aliases. [#38640](https://github.com/apache/doris/pull/38640) + +- Supported hint-style comments.[#39113](https://github.com/apache/doris/pull/39113) + +### Others + +- Added the system table `table_properties` for viewing table properties. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_properties). + +- Introduced deadlock and slow lock detection in FE. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/maint-monitor/frontend-lock-manager). + +## Improvements + +### Lakehouse + +- Reimplemented the external table metadata caching mechanism. 
+ + - For details, refer to the [documentation](https://doris.apache.org/docs/lakehouse/metacache). + +- Added the session variable `keep_carriage_return` with a default value of false. By default, reading Hive Text format tables treats both `\r\n` and `\n` as newline characters. [#38099](https://github.com/apache/doris/pull/38099) + +- Optimized memory statistics for Parquet/ORC file read/write operations.[#37257](https://github.com/apache/doris/pull/37257) + +- Supported pushing down IN/NOT IN predicates for Paimon tables. [#38390](https://github.com/apache/doris/pull/38390) + +- Enhanced the optimizer to support Time Travel syntax for Hudi tables. [#38591](https://github.com/apache/doris/pull/38591) + +- Optimized Kerberos authentication-related processes. [ #37301](https://github.com/apache/doris/pull/37301) + +- Enabled reading Hive tables after renaming column operations. [#38809](https://github.com/apache/doris/pull/38809) + +- Optimized the reading performance of partition columns for external tables. [#38810](https://github.com/apache/doris/pull/38810) + +- Improved the data shard merging strategy during external table query planning to avoid performance degradation caused by a large number of small shards.[#38964](https://github.com/apache/doris/pull/38964) + +- Added attributes such as location to `SHOW CREATE DATABASE/TABLE`. [#39644](https://github.com/apache/doris/pull/39644) + +- Supported complex types in MaxCompute Catalog. [#39822](https://github.com/apache/doris/pull/39822) + +- Optimized the file cache loading strategy by using asynchronous loading to avoid long BE startup times. [#39036](https://github.com/apache/doris/pull/39036) + +- Improved the file cache eviction strategy, such as evicting locks held for extended periods. [#39721](https://github.com/apache/doris/pull/39721) + +### Async Materialized View + +- Supported hourly, weekly, and quarterly partition roll-up construction. 
[#37678](https://github.com/apache/doris/pull/37678) + +- For materialized views based on Hive external tables, the metadata cache is now updated before refresh to ensure the latest data is obtained during each refresh. [#38212](https://github.com/apache/doris/pull/38212) + +- Improved the performance of transparent rewrite planning in storage-compute decoupled mode by batch fetching metadata. [#39301](https://github.com/apache/doris/pull/39301) + +- Enhanced the performance of transparent rewrite planning by prohibiting duplicate enumerations. [#39541](https://github.com/apache/doris/pull/39541) + +- Improved the performance of transparent rewrite for refreshing materialized views based on Hive external table partitions.[#38525](https://github.com/apache/doris/pull/38525) + +### Semi-Structured Data Management + +- Optimized memory allocation for TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- Enhanced the performance of string processing in inverted indexes.[#37395](https://github.com/apache/doris/pull/37395) + +- Optimized the performance of inverted indexes in MOW tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Supported specifying the row-store `page_size` during table creation to control compression effectiveness. [#37145](https://github.com/apache/doris/pull/37145) + +### Query Optimizer + +- Adjusted the row count estimation algorithm for mark joins, resulting in more accurate cardinality estimates for mark joins. [#38270](https://github.com/apache/doris/pull/38270) + +- Optimized the cost estimation algorithm for semi/anti joins, enabling more accurate selection of semi/anti join orders. [#37951](https://github.com/apache/doris/pull/37951) + +- Adjusted the filter estimation algorithm for cases where some columns have no statistical information, leading to more accurate cardinality estimates. 
[#39592](https://github.com/apache/doris/pull/39592) + +- Modified the instance calculation logic for set operation operators to prevent insufficient parallelism in extreme cases. [#39999](https://github.com/apache/doris/pull/39999) + +- Adjusted the usage strategy of bucket shuffle, achieving better performance when data is not sufficiently shuffled. [#36784](https://github.com/apache/doris/pull/36784) + +- Enabled early filtering of window function data, supporting multiple window functions in a single projection. [#38393](https://github.com/apache/doris/pull/38393) + +- When a `NullLiteral` exists in a filter condition, it can now be folded into false, further converted to an `EmptySet` to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Expanded the scope of predicate derivation, reducing data scanning in queries with specific patterns. [#37314](https://github.com/apache/doris/pull/37314) + +- Supported partial short-circuit evaluation logic in partition pruning to improve partition pruning performance, achieving over 100% improvement in specific scenarios. [#38191](https://github.com/apache/doris/pull/38191) + +- Enabled the computation of arbitrary scalar functions within user variables. [#39144](https://github.com/apache/doris/pull/39144) + +- Maintained error messages consistent with MySQL when alias conflicts exist in queries. [#38104](https://github.com/apache/doris/pull/38104) + +### Query Execution + +- Adapted AggState for compatibility from 2.1 to 3.x and fixed coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle when no joins are involved. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to an asynchronous approach to prevent blocking during internal table queries. 
[#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process when building hash tables in Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Reduced the lock holding time for MultiCast operations. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC's keepAliveTime and added a connection monitoring mechanism, reducing the probability of query failures due to RPC errors during query execution. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits are exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Improved the performance of `aes_encrypt`/`decrypt` functions when handling constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the performance of `json_extract` functions when processing constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the performance of ParseURL functions when processing constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Backup Recovery / CCR + +- Restore now supports deleting redundant tablets and partition options. [#39363](https://github.com/apache/doris/pull/39363) + +- Checks storage connectivity when creating a repository. [#39538](https://github.com/apache/doris/pull/39538) + +- Enables binlog to support `DROP TABLE`, allowing CCR to incrementally synchronize `DROP TABLE` operations. [#38541](https://github.com/apache/doris/pull/38541) + +### Compaction + +- Addresses the issue where high-priority compaction tasks were not subject to task concurrency control limits. [#38189](https://github.com/apache/doris/pull/38189) + +- Automatically reduces compaction memory consumption based on data characteristics. [#37486](https://github.com/apache/doris/pull/37486) + +- Fixes an issue where the sequential data optimization strategy could lead to incorrect data in aggregate tables or MOR UNIQUE tables. 
[ #38299](https://github.com/apache/doris/pull/38299) + +- Optimizes the rowset selection strategy during compaction during replica replenishment to avoid triggering -235 errors. [#39262](https://github.com/apache/doris/pull/39262) + +### MOW (Merge-On-Write) + +- Optimizes slow column updates caused by concurrent column updates and compactions. [#38682](https://github.com/apache/doris/pull/38682) + +- Fixes an issue where segcompaction during bulk data imports could lead to incorrect MOW data. [#38992](https://github.com/apache/doris/pull/38992) [#39707](https://github.com/apache/doris/pull/39707) + +- Fixes data loss in column updates that may occur after BE restarts. [#39035](https://github.com/apache/doris/pull/39035) + +### Storage Management + +- Adds FE configuration to control whether queries under hot-cold tiering prefer local data replicas. [#38322](https://github.com/apache/doris/pull/38322) + +- Optimizes expired BE report messages to include newly created tablets. [#38839](https://github.com/apache/doris/pull/38839) [#39605](https://github.com/apache/doris/pull/39605) + +- Optimizes replica scheduling priority strategy to prioritize replicas with missing data. [#38884](https://github.com/apache/doris/pull/38884) + +- Prevents tablets with unfinished ALTER jobs from being balanced. [#39202](https://github.com/apache/doris/pull/39202) + +- Enables modifying the number of buckets for tables with list partitioning. [#39688](https://github.com/apache/doris/pull/39688) + +- Prefers querying from online disk services. [#39654](https://github.com/apache/doris/pull/39654) + +- Improves error messages for materialized view base tables that do not support deletion during synchronization. [#39857](https://github.com/apache/doris/pull/39857) + +- Improves error messages for single columns exceeding 4GB. 
[#39897](https://github.com/apache/doris/pull/39897) + +- Fixes an issue where aborted transactions were omitted when plan errors occurred during `INSERT` statements.[#38260](https://github.com/apache/doris/pull/38260) + +- Fixes exceptions during SSL connection closure.[#38677](https://github.com/apache/doris/pull/38677) + +- Fixes an issue where table locks were not held when aborting transactions using labels. [#38842](https://github.com/apache/doris/pull/38842) + +- Fixes `gson pretty` causing large image issues. [#39135](https://github.com/apache/doris/pull/39135) + +- Fixes an issue where the new optimizer did not check for bucket values of 0 in `CREATE TABLE` statements.[#38999](https://github.com/apache/doris/pull/38999) + +- Fixes errors when Chinese column names are included in `DELETE` condition predicates. [#39500](https://github.com/apache/doris/pull/39500) + +- Fixes frequent tablet balancing issues in partition balancing mode. [#39606](https://github.com/apache/doris/pull/39606) + +- Fixes an issue where partition storage policy attributes were lost. [#39677](https://github.com/apache/doris/pull/39677) + +- Fixes incorrect statistics when importing multiple tables within a transaction. [#39548](https://github.com/apache/doris/pull/39548) + +- Fixes errors when deleting random bucket tables. [#39830](https://github.com/apache/doris/pull/39830) + +- Fixes issues where FE fails to start due to non-existent UDFs. [#39868](https://github.com/apache/doris/pull/39868) + +- Fixes inconsistencies in the last failed version between FE master and slave. [#39947](https://github.com/apache/doris/pull/39947) + +- Fixes an issue where related tablets may still be in schema change state when schema change jobs are canceled. [ #39327](https://github.com/apache/doris/pull/39327) + +- Fixes errors when modifying type and column order in a single statement schema change (SC). 
[#39107](https://github.com/apache/doris/pull/39107) + +### Data Loading + +- Improves error messages for -238 errors during imports. [#39182](https://github.com/apache/doris/pull/39182) + +- Allows importing to other partitions while restoring a partition. [#39915](https://github.com/apache/doris/pull/39915) + +- Optimizes the strategy for FE to select BEs during group commit. [#37830](https://github.com/apache/doris/pull/37830) [#39010](https://github.com/apache/doris/pull/39010) + +- Avoids printing stack traces for some common streamload error messages. [#38418](https://github.com/apache/doris/pull/38418) + +- Improves handling of issues where offline BEs may affect import errors. [#38256](https://github.com/apache/doris/pull/38256) + +### Permissions + +- Optimizes access performance after enabling the Ranger authentication plugin. [#38575](https://github.com/apache/doris/pull/38575) +- Optimizes permission strategies for Refresh Catalog/Database/Table operations, allowing users to perform these operations with only SHOW permissions. [#39008](https://github.com/apache/doris/pull/39008) + +## Bug fixes + +### Lakehouse + +- Fixes the issue where switching catalogs may result in an error of not finding the database. [#38114](https://github.com/apache/doris/pull/38114) + +- Addresses exceptions caused by attempting to read non-existent data on S3. [#38253](https://github.com/apache/doris/pull/38253) + +- Resolves the issue where specifying an abnormal path during export operations may lead to incorrect export locations. [#38602](https://github.com/apache/doris/pull/38602) + +- Fixes the timezone issue for time columns in Paimon tables. [#37716](https://github.com/apache/doris/pull/37716) + +- Temporarily disables the Parquet PageIndex feature to avoid certain erroneous behaviors. + +- Corrects the selection of Backend nodes in the blacklist during external table queries. 
[#38984](https://github.com/apache/doris/pull/38984) + +- Resolves errors caused by missing subcolumns in Parquet Struct column types.[#39192](https://github.com/apache/doris/pull/39192) + +- Addresses several issues with predicate pushdown in JDBC Catalog. [#39082](https://github.com/apache/doris/pull/39082) + +- Fixes issues where some historical Parquet formats led to incorrect query results. [#39375](https://github.com/apache/doris/pull/39375) + +- Improves compatibility with ojdbc6 drivers for Oracle JDBC Catalog. [#39408](https://github.com/apache/doris/pull/39408) + +- Resolves potential FE memory leaks caused by Refresh Catalog/Database/Table operations. [#39186](https://github.com/apache/doris/pull/39186) [#39871](https://github.com/apache/doris/pull/39871) + +- Fixes thread leaks in JDBC Catalog under certain conditions. [#39666](https://github.com/apache/doris/pull/39666) [#39582](https://github.com/apache/doris/pull/39582) + +- Addresses potential event processing failures after enabling Hive Metastore event subscription. [#39239](https://github.com/apache/doris/pull/39239) + +- Disables reading Hive Text format tables with custom escape characters and null formats to prevent data errors. [#39869](https://github.com/apache/doris/pull/39869) + +- Resolves issues accessing Iceberg tables created via the Iceberg API under certain conditions. [#39203](https://github.com/apache/doris/pull/39203) + +- Fixes the inability to read Paimon tables stored on HDFS clusters with high availability enabled. [#39876](https://github.com/apache/doris/pull/39876) + +- Addresses errors that may occur when reading Paimon table deletion vectors after enabling file caching. [#39875](https://github.com/apache/doris/pull/39875) + +- Resolves potential deadlocks when reading Parquet files under certain conditions. [#39945](https://github.com/apache/doris/pull/39945) + +### Async Materialized View + +- Fixes the inability to use `SHOW CREATE MATERIALIZED VIEW` on follower FEs. 
[#38794](https://github.com/apache/doris/pull/38794) + +- Unifies the object type of asynchronous materialized views in metadata as tables to enable proper display in data tools. [#38797](https://github.com/apache/doris/pull/38797) + +- Resolves the issue where nested asynchronous materialized views always perform full refreshes. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixes the issue where canceled tasks may show as running after restarting FEs. [ #39424](https://github.com/apache/doris/pull/39424) + +- Addresses incorrect use of contexts, which may lead to unexpected failures of materialized view refresh tasks. [#39690](https://github.com/apache/doris/pull/39690) + +- Resolves issues that may cause varchar type write failures due to unreasonable lengths when creating asynchronous materialized views based on external tables.[#37668](https://github.com/apache/doris/pull/37668) + +- Fixes the potential invalidation of asynchronous materialized views based on external tables after FE restarts or catalog rebuilds. [#39355](https://github.com/apache/doris/pull/39355) + +- Prohibits the use of partition rollup for materialized views with list partitions to prevent the generation of incorrect data. [#38124](https://github.com/apache/doris/pull/38124) + +- Fixes incorrect results when literals exist in the select list during transparent rewriting for aggregation rollup. [#38958](https://github.com/apache/doris/pull/38958) + +- Addresses potential errors during transparent rewriting when queries contain filters like `a = a`. [#39629](https://github.com/apache/doris/pull/39629) + +- Fixes issues where transparent rewriting for direct external table queries fails. [#39041](https://github.com/apache/doris/pull/39041) + +### Semi-Structured Data Management + +- Removes support for prepared statements in the old optimizer. [#39465](https://github.com/apache/doris/pull/39465) + +- Fixes issues with JSON escape character handling. 
[#37251](https://github.com/apache/doris/pull/37251) + +- Resolves issues with duplicate processing of JSON fields. [#38490](https://github.com/apache/doris/pull/38490) + +- Fixes issues with some ARRAY and MAP functions. [#39307](https://github.com/apache/doris/pull/39307) [#39699](https://github.com/apache/doris/pull/39699) [#39757](https://github.com/apache/doris/pull/39757) + +- Resolves complex combinations of inverted index queries and LIKE queries. [#36687](https://github.com/apache/doris/pull/36687) + +### Query Optimizer + +- Fixed the potential partition pruning error issue when the 'OR' condition exists in partition filter conditions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the potential partition pruning error issue when complex expressions are involved. [#39298](https://github.com/apache/doris/pull/39298) + +- Fixed the issue where nullable in `agg_state` subtypes might be planned incorrectly, leading to execution errors. [#37489](https://github.com/apache/doris/pull/37489) + +- Fixed the issue where nullable in set operation operators might be planned incorrectly, leading to execution errors. [#39109](https://github.com/apache/doris/pull/39109) + +- Fixed the incorrect execution priority issue of intersect operator. [#39095](https://github.com/apache/doris/pull/39095) + +- Fixed the NPE issue that may occur when the maximum valid date literal exists in the query. [#39482](https://github.com/apache/doris/pull/39482) + +- Fixed the occasional planning error that results in an illegal slot error during execution. [#39640](https://github.com/apache/doris/pull/39640) + +- Fixed the issue where repeatedly referencing columns in cte may lead to missing data in some columns in the result. [#39850](https://github.com/apache/doris/pull/39850) + +- Fixed the occasional planning error issue when 'case when' exists in the query. 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where IP types cannot be implicitly converted to string types. [#39318](https://github.com/apache/doris/pull/39318) + +- Fixed the potential planning error issue when using multi-dimensional aggregation and the same column and its alias exist in the select list. [ #38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where boolean types might be handled incorrectly when using BE constant folding. [#39019](https://github.com/apache/doris/pull/39019) + +- Fixed the planning error issue caused by `default_cluster`: as a prefix for the database name in expressions. [#39114](https://github.com/apache/doris/pull/39114) + +- Fixed the potential deadlock issue caused by` insert into`. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the potential planning error issue caused by not holding table locks throughout the planning process. [#38950](https://github.com/apache/doris/pull/38950) + +- Fixed the issue where CHAR(0), VARCHAR(0) are not handled correctly when creating tables. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the issue where `show create table` may incorrectly display hidden columns. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the issue where columns with the same name as hidden columns are not prohibited when creating tables. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the occasional planning error issue when executing `insert into as select` with CTEs. [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where `insert into values` cannot automatically fill null default values. **[[fix](Nereids) fix insert into table with null literal default value #39122](https://github.com/apache/doris/pull/39122)** + +- Fixed the NPE issue caused by using cte in delete without using it. 
[#39379](https://github.com/apache/doris/pull/39379) + +- Fixed the issue where deleting from a randomly distributed aggregation model table fails. [#37985](https://github.com/apache/doris/pull/37985) + +### Query Execution + +- Fixed the issue where the pipeline execution engine gets stuck in multiple scenarios, causing queries not to end. [#38657](https://github.com/apache/doris/pull/38657) [#38206](https://github.com/apache/doris/pull/38206) [#38885](https://github.com/apache/doris/pull/38885) + +- Fixed the coredump issue caused by null and non-null columns in set difference calculations.[#38737](https://github.com/apache/doris/pull/38737) + +- Fixed the incorrect result issue of the `width_bucket` function. [#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error issue when a single row of data is large and the result set is also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the incorrect result issue of `stddev` with DecimalV2 type. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the issue where `insert overwrite auto partition` causes transaction rollback. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the incorrect result issue of the `convert_tz` function. [#37358](https://github.com/apache/doris/pull/37358) [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by the mod function with abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where executing the same expression in multiple threads may lead to incorrect Java UDF results. 
[#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by the incorrect return type of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the unstable result issue of the histogram function. [#38608](https://github.com/apache/doris/pull/38608) + +### Backup & Recovery / CCR + +- Fixed the issue where the data version after backup and recovery may be incorrect, leading to unreadability. [#38343](https://github.com/apache/doris/pull/38343) + +- Fixed the issue of using restore version across versions. [#38396](https://github.com/apache/doris/pull/38396) + +- Fixed the issue where the job is not canceled when backup fails. [#38993](https://github.com/apache/doris/pull/38993) + +- Fixed the NPE issue in ccr during the upgrade from 2.1.4 to 2.1.5, causing the FE to fail to start. [#39910](https://github.com/apache/doris/pull/39910) + +- Fixed the issue where views and materialized views cannot be used after restoration. [#38072](https://github.com/apache/doris/pull/38072) [#39848](https://github.com/apache/doris/pull/39848) + +### Storage Management + +- Fixed possible memory leaks in routine load when loading multiple tables from a single stream. [#38824](https://github.com/apache/doris/pull/38824) + +- Fixed the issue where delimiters and escape characters in routine load were not effective. [#38825](https://github.com/apache/doris/pull/38825) + +- Fixed incorrect `show routine load` results when the routine load task name contained uppercase letters. [#38826](https://github.com/apache/doris/pull/38826) + +- Fixed the issue where the offset cache was not reset when changing the routine load topic. [#38474](https://github.com/apache/doris/pull/38474) + +- Fixed the potential exception triggered by `show routine load` under concurrent scenarios. [#39525](https://github.com/apache/doris/pull/39525) + +- Fixed the issue where routine load might import data repeatedly. 
[#39526](https://github.com/apache/doris/pull/39526) + +- Fixed the data error caused by `setNull` when enabling group commit via JDBC. [#38276](https://github.com/apache/doris/pull/38276) + +- Fixed the potential NPE issue when enabling group commit insert to a non-master FE. [#38345](https://github.com/apache/doris/pull/38345) + +- Fixed incorrect error handling during internal data writing in group commit. [#38997](https://github.com/apache/doris/pull/38997) + +- Fixed the coredump that might be triggered when the group commit execution plan failed. [#39396](https://github.com/apache/doris/pull/39396) + +- Fixed the issue where concurrent imports into auto partition tables might report non-existent tablets. [#38793](https://github.com/apache/doris/pull/38793) + +- Fixed potential load stream leakage issues. [#39039](https://github.com/apache/doris/pull/39039) + +- Fixed the issue where transactions were opened for `insert into select` with no data. [#39108](https://github.com/apache/doris/pull/39108) + +- Ignored the single-replica import configuration when using memtable prefetching. [#39154](https://github.com/apache/doris/pull/39154) + +- Fixed the issue where background imports of stream load records might be abnormally aborted upon encountering db deletion. [#39527](https://github.com/apache/doris/pull/39527) + +- Fixed inaccurate error messages when data errors occurred in strict mode. [#39587](https://github.com/apache/doris/pull/39587) + +- Fixed the issue where streamload did not return an error URL upon encountering erroneous data. [#38417](https://github.com/apache/doris/pull/38417) + +- Fixed the issue with the combined use of insert overwrite and auto partition. [#38442](https://github.com/apache/doris/pull/38442) + +- Fixed parsing errors when CSV encountered data where the line delimiter was enclosed by the enclosing character. 
[#38445](https://github.com/apache/doris/pull/38445) + +### Data Exporting + +- Fixed the issue where enabling the `delete_existing_files` property during export operations might result in duplicate deletion of exported data. [#39304](https://github.com/apache/doris/pull/39304) + +### Permissions + +- Fixed the incorrect requirement of ALTER TABLE permission when creating a materialized view. [#38011](https://github.com/apache/doris/pull/38011) + +- Fixed the issue where the db was explicitly displayed as empty when showing routine load. [#38365](https://github.com/apache/doris/pull/38365) + +- Fixed the incorrect requirement of CREATE permission on the original table when using CREATE TABLE LIKE. [#37879](https://github.com/apache/doris/pull/37879) + +- Fixed the issue where grant operations did not check if the object existed. [#39597](https://github.com/apache/doris/pull/39597) + +## Upgrade suggestions + +When upgrading Doris, please follow the principle of not skipping two minor versions and upgrade sequentially. + +For example, if you are upgrading from version 0.15.x to 2.0.x, it is recommended to first upgrade to the latest version of 1.1, then upgrade to the latest version of 1.2, and finally upgrade to the latest version of 2.0. + +For more upgrade information, see the documentation: [Cluster Upgrade](../../admin-manual/cluster-management/upgrade) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md new file mode 100644 index 0000000000000..414229276e6b0 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v2.1/release-2.1.7.md @@ -0,0 +1,180 @@ +--- +{ + "title": "Release 2.1.7", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.7 was officially released on November 10, 2024.** This version brings continuous upgrades and improvements.
Additionally, several fixes have been implemented in areas such as Lakehouse, Async Materialized Views, Semi-Structured Data Management, Query Optimizer, and Permission Management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The following global variables will be forcibly set to the following default values: + - enable_nereids_dml: true + - enable_nereids_dml_with_pipeline: true + - enable_nereids_planner: true + - enable_fallback_to_original_planner: true + - enable_pipeline_x_engine: true +- New columns have been added to the audit log. [#42262](https://github.com/apache/doris/pull/42262) + - For more information, please refer to [docs](https://doris.apache.org/docs/admin-manual/audit-plugin/) + +## New features + +### Async Materialized View + +- Asynchronous materialized views now support a `use_for_rewrite` property that controls whether they participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Execution + +- The list of changed session variables is now output in the Profile. [#41016](https://github.com/apache/doris/pull/41016) +- Support for `trim_in`, `ltrim_in`, and `rtrim_in` functions has been added. [#42641](https://github.com/apache/doris/pull/42641) +- Support for several URL functions (`top_level_domain`, `first_significant_subdomain`, `cut_to_first_significant_subdomain`) has been added. [#42916](https://github.com/apache/doris/pull/42916) +- The `bit_set` function has been added. [#42099](https://github.com/apache/doris/pull/42099) +- The `count_substrings` function has been added. [#42055](https://github.com/apache/doris/pull/42055) +- The `translate` and `url_encode` functions have been added.
[#41051](https://github.com/apache/doris/pull/41051) +- The `normal_cdf`, `to_iso8601`, and `from_iso8601_date` functions have been added. [#40695](https://github.com/apache/doris/pull/40695) + + +### Storage Management + +- The `information_schema.table_options` and `table_properties` system tables have been added, supporting the querying of attributes set during table creation. [#34384](https://github.com/apache/doris/pull/34384) +- Support for `bitmap_empty` as a default value has been implemented. [#40364](https://github.com/apache/doris/pull/40364) +- A new session variable `require_sequence_in_insert` has been introduced to control whether a sequence column must be provided when performing `INSERT INTO SELECT` writes to a unique key table. [#41655](https://github.com/apache/doris/pull/41655) + +### Others + +- Allow for generating flame graphs on the BE WebUI page.[#41044](https://github.com/apache/doris/pull/41044) + +## Improvements + +### Lakehouse + +- Support for writing data to Hive text format tables. [#40537](https://github.com/apache/doris/pull/40537) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build) +- Access MaxCompute data using MaxCompute Open Storage API. [#41610](https://github.com/apache/doris/pull/41610) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/database/max-compute) +- Support for Paimon DLF Catalog. 
[#41694](https://github.com/apache/doris/pull/41694) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/paimon) +- Added `table$partitions` syntax to directly query Hive partition information.[#41230](https://github.com/apache/doris/pull/41230) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive) +- Support for reading Parquet files in brotli compression format.[#42162](https://github.com/apache/doris/pull/42162) +- Support for reading DECIMAL 256 types in Parquet files. [#42241](https://github.com/apache/doris/pull/42241) +- Support for reading Hive tables in OpenCsvSerde format.[#42939](https://github.com/apache/doris/pull/42939) + +### Async Materialized View + +- Refined the granularity of lock holding during the build process for asynchronous materialized views. [#40402](https://github.com/apache/doris/pull/40402) [#41010](https://github.com/apache/doris/pull/41010). + +### Query optimizer + +- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. [#40457](https://github.com/apache/doris/pull/40457) +- Runtime filters can now be generated in more scenarios to improve query performance. [#40815](https://github.com/apache/doris/pull/40815) +- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. [#40820](https://github.com/apache/doris/pull/40820) +- Optimized the column pruning algorithm to enhance query performance. [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Supported parallel preparation to reduce the time consumed by short queries. [#40270](https://github.com/apache/doris/pull/40270) +- Corrected the names of some counters in the profile to match the audit logs. [#41993](https://github.com/apache/doris/pull/41993) +- Added new local shuffle rules to speed up certain queries. 
[#40637](https://github.com/apache/doris/pull/40637) + +### Storage Management + +- The `SHOW PARTITIONS` command now supports displaying the commit version. [#28274](https://github.com/apache/doris/pull/28274) +- Checked for unreasonable partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Optimized the scheduling logic when encountering EOF in Routine Load. [#40509](https://github.com/apache/doris/pull/40509) +- Made Routine Load aware of schema changes. [#40508](https://github.com/apache/doris/pull/40508) +- Improved the timeout logic for Routine Load tasks. [#41135](https://github.com/apache/doris/pull/41135) + +### Others + +- Allowed closing the built-in service port of BRPC via BE configuration. [#41047](https://github.com/apache/doris/pull/41047) +- Fixed issues with missing fields and duplicate records in audit logs. [#43015](https://github.com/apache/doris/pull/43015) + +## Bug fixes + +### Lakehouse + +- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. [#39840](https://github.com/apache/doris/pull/39840) +- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. [#40424](https://github.com/apache/doris/pull/40424) +- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. [#40923](https://github.com/apache/doris/pull/40923) +- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. [#41266](https://github.com/apache/doris/pull/41266) +- Fixed errors in reading Snappy compressed formats in certain scenarios. [#40862](https://github.com/apache/doris/pull/40862) +- Addressed potential FileSystem leaks on the FE side in certain scenarios. [#41108](https://github.com/apache/doris/pull/41108) +- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. 
[#41231](https://github.com/apache/doris/pull/41231) +- Fixed the inability to read tables in Paimon parquet format. [#41487](https://github.com/apache/doris/pull/41487) +- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. [#41407](https://github.com/apache/doris/pull/41407) +- Disabled predicate pushing down after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. [#42242](https://github.com/apache/doris/pull/42242) +- Fixed issues with case-sensitive access to table names in the External Catalog. [#42261](https://github.com/apache/doris/pull/42261) + +### Async Materialized View + +- Fixed the issue where user-specified start times were not effective. [#39573](https://github.com/apache/doris/pull/39573) +- Resolved the issue of nested materialized views not refreshing. [#40433](https://github.com/apache/doris/pull/40433) +- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. [#41762](https://github.com/apache/doris/pull/41762) +- Addressed issues where partition compensation rewrites could lead to incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed potential errors in rewrite results when `sql_select_limit` was set. [#40106](https://github.com/apache/doris/pull/40106) + +### Semi-Structured Data Management + +- Fixed the issue of index file handle leaks. [#41915](https://github.com/apache/doris/pull/41915) +- Addressed inaccuracies in the `count()` function of inverted indexes in special cases. [#41127](https://github.com/apache/doris/pull/41127) +- Fixed exceptions with variant when light schema change was not enabled. [#40908](https://github.com/apache/doris/pull/40908) +- Resolved memory leaks when variant returns arrays.
[#41339](https://github.com/apache/doris/pull/41339) + +### Query optimizer + +- Corrected potential errors in nullable calculations for filter conditions during external table queries, which led to execution exceptions. [#41014](https://github.com/apache/doris/pull/41014) +- Fixed potential errors in optimizing range comparison expressions. [#41356](https://github.com/apache/doris/pull/41356) + +### Query Execution + +- Fixed the issue where the `match_regexp` function could not correctly handle empty strings. [#39503](https://github.com/apache/doris/pull/39503) +- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Fixed errors in the results of the `date_floor` function. [#41948](https://github.com/apache/doris/pull/41948) +- Addressed incorrect cancel messages in some scenarios. [#41798](https://github.com/apache/doris/pull/41798) +- Fixed issues with excessive warning logs printed by arrow flight. [#41770](https://github.com/apache/doris/pull/41770) +- Resolved issues where runtime filters failed to send in some scenarios. [#41698](https://github.com/apache/doris/pull/41698) +- Fixed problems where some system table queries could not end normally or became stuck. [#41592](https://github.com/apache/doris/pull/41592) +- Addressed incorrect results from window functions. [#40761](https://github.com/apache/doris/pull/40761) +- Fixed issues where the encrypt and decrypt functions caused BE cores. [#40726](https://github.com/apache/doris/pull/40726) +- Resolved errors in the results of the `conv` function. [#40530](https://github.com/apache/doris/pull/40530) + +### Storage Management + +- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. [#38003](https://github.com/apache/doris/pull/38003) +- Addressed inaccurate memory statistics during the Memtable flush phase during imports.
[#39536](https://github.com/apache/doris/pull/39536) +- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. [#40477](https://github.com/apache/doris/pull/40477) +- Resolved inaccurate bvar statistics with Memtable migration. [#40985](https://github.com/apache/doris/pull/40985) +- Fixed inaccurate progress reporting for S3 loads. [#40987](https://github.com/apache/doris/pull/40987) + +### Permissions + +- Fixed permission issues related to show columns, show sync, and show data from db.table. [#39726](https://github.com/apache/doris/pull/39726) + +### Others + +- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. [#41400](https://github.com/apache/doris/pull/41400) diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md new file mode 100644 index 0000000000000..baa62b37e1e75 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.0.md @@ -0,0 +1,469 @@ +--- +{ + "title": "Release 3.0.0", + "language": "en" +} +--- + + + + +We are excited to announce the release of Apache Doris 3.0! + +**Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. Additionally, users can take advantage of low-cost shared storage systems such as object storage or HDFS to significantly reduce storage costs.** + +Version 3.0 marks a milestone in the evolution of Apache Doris towards a unified data lake and data warehouse architecture. 
This version introduces the ability to write data back to data lakes, allowing users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Apache Doris. With capabilities such as asynchronous materialized views, Apache Doris can serve as a unified data processing engine for enterprises, helping users better manage data across lakes, warehouses, and databases. Also, Apache Doris 3.0 introduces the Trino Connector. It allows users to quickly connect or adapt to more data sources, and leverage the high-performance compute engine of Doris to deliver faster query results than Trino. + +Version 3.0 also enhances support for ETL batch processing scenarios, adding explicit transaction support for operations like `insert into select`, `delete` and `update`. The observability of query execution has also been improved. + +In terms of performance, we have improved the framework capabilities, infrastructure, and rules of the query optimizer in version 3.0. This provides optimized performance, which has been proven by blind testing in more complex and diverse business scenarios. + +The adaptive Runtime Filter computation method now accurately estimates filters based on data size during execution, delivering better performance under large data volumes and high loads. Additionally, asynchronous materialized view has been more stable and user-friendly in query acceleration and data modeling. + +**During the development of version 3.0, over 200 contributors submitted nearly 5,000 optimizations** and fixes to Apache Doris. Contributors from companies such as VeloDB, Baidu, Meituan, ByteDance, Tencent, Alibaba, Kwai, Huawei, and Tianyi Cloud actively collaborated with the community, contributing test cases from real-world use cases to help us improve Apache Doris. We extend our heartfelt thanks to all the contributors involved in the development, testing, and feedback process for this release. 
+ +- **GitHub**: https://github.com/apache/doris/releases + +- **Website**: https://doris.apache.org/download + +## 1. Compute-storage decoupled mode + +Since V3.0, Apache Doris supports the compute-storage decoupled mode. Users can choose between it and the compute-storage coupled mode during cluster deployment. + +In the compute-storage decoupled mode, the BE nodes no longer store the data, but instead, a shared storage layer (HDFS and object storage) is introduced as the shared data storage layer. The computing and storage resources can be scaled independently, bringing multiple benefits to users: + +- **Workload isolation**: Multiple compute clusters can share the same data, allowing users to isolate different business workloads or offline loads using separate compute clusters. + +- **Reduced storage costs**: The full dataset is stored in the more cost-effective and highly reliable shared storage, with only hot data cached locally. Compared to the compute-storage coupled mode with three data replicas, the storage cost can be reduced by up to 90%. + +- **Elastic computing resources**: Since no data is stored on the BE nodes, the computing resources can be scaled flexibly based on the load requirements. Users can scale in or out an individual compute cluster or increase/decrease the number of compute clusters. This also leads to cost savings. + +- **Improved system robustness**: By storing the data in shared storage, Doris no longer needs to handle the complex logic of multi-replica consistency, thus simplifying distributed storage complexity and improving the overall system robustness. + +- **Flexible data sharing and cloning**: The flexibility of the compute-storage decoupled mode extends beyond a single Doris cluster. Tables from one Doris cluster can be easily cloned to another Doris cluster, with just metadata replication. + +### 1-1. 
From coupled to decoupled + +In the compute-storage coupled mode, the Apache Doris architecture consists of two main process types: Frontend (FE) and Backend (BE). The FE is primarily responsible for user request access, query parsing and planning, metadata management, and node management. The BE is responsible for data storage and query plan execution. + +The BE nodes employ an MPP (Massively Parallel Processing) distributed computing architecture, leveraging a multi-replica consistency protocol to ensure high service availability and high data reliability. + +![From coupled to decoupled](/images/storage-compute-decoupled.PNG) + + +The maturation of emerging cloud computing infrastructure, including public clouds, private clouds, and Kubernetes-based container platforms, has driven the need for cloud-native capabilities. Increasingly, users are seeking deeper integration between Apache Doris and cloud computing infrastructure to provide more elasticity. + +**To address this need, the VeloDB team has designed and implemented a cloud-native version of Apache Doris that decouples compute and storage, known as VeloDB Cloud. After extensive production testing and refinement across hundreds of enterprises over a long time, this cloud-native solution has now been contributed to the Apache Doris community as the compute-storage decoupled mode of Apache Doris 3.0.** + +In the compute-storage decoupled mode, the Apache Doris architecture consists of three layers: + +- **Meta data layer**: A new Meta Service module has been introduced to provide meta data services, such as processing database and table information, schemas, rowset meta, and transactions. The Meta Service is stateless and horizontally scalable. In V3.0, all of the BE's meta data and parts of the FE's meta data have been migrated to the Meta Service. We will migrate the remaining meta data in future versions.
+- **Computation layer**: The stateless BE nodes execute query plans and cache a portion of the data and tablet meta data locally to improve query performance. Multiple stateless BE nodes can be organized into a computing resource pool (i.e., compute cluster), and multiple compute clusters can share the same data and metadata service. The compute clusters can be elastically scaled by adding or removing nodes as needed. +- **Shared storage layer**: Data is persisted to the shared storage layer, which currently supports HDFS as well as various cloud-based object storage systems that are compatible with the S3 protocol, such as S3, OSS, GCS, Azure Blob, COS, BOS, and MinIO. + +![From coupled to decoupled-2](/images/storage-compute-decoupled-2.JPEG) + +### 1-2 Design highlight + +The design of the compute-storage decoupled mode of Apache Doris highlights the transformation of the FE's in-memory metadata model into a shared metadata service. This approach offers a globally consistent state view, allowing any node to directly submit writes without needing to go through the FE for publishing. During write operations, data is stored in shared storage, while metadata is managed by the metadata service. **This effectively controls the number of small files in shared storage. Meanwhile, the real-time write performance for individual tables is nearly on par with that in the compute-storage coupled mode. The system's overall write capacity is no longer limited by the processing power of a single FE node.** + +![Design highlight](/images/design-hightlight.PNG) + +Based on the globally consistent state view, for data garbage collection, we have adopted a design approach for data deletion that is easier to prove correct and more efficient. + +Specifically, data in the shared storage is incorporated into the globally consistent view offered by the shared meta data service. Whenever data is generated, we bind it to a separate, independent transaction. 
Similarly, for a meta data deletion operation, we also bind it to a separate, independent transaction. The purpose of this approach is to ensure that deletion and write operations cannot succeed together. The view records which data needs to be deleted, and the asynchronous deletion process can simply perform a forward deletion of the data based on the transaction records, without the need for reverse garbage collection. + +As the tablet-related meta data in the FE is gradually migrated to the shared meta data service, the scalability of the Doris cluster will no longer be constrained by the memory capacity of a single FE node. Building upon the shared meta data service and the forward data deletion technique, we can conveniently expand functionality such as data sharing and lightweight cloning. + +### 1-3 Comparison with alternative solutions + +Another design of decoupling compute and storage in the industry is to store the data and BE node meta data in a shared object storage or HDFS. However, this approach brings the following problems: + +- **Inability to support real-time writes**: During data writes, the data is mapped to tablets based on the partitioning and bucketing rules, generating segment files and rowset meta data. During the write process, a two-phase commit (Publish) is performed through the FE. When a BE node receives the Publish request, it then sets the rowset as visible. The Publish operation must not fail. If the rowset meta data is stored in the shared storage, the total small file data during the real-time write process would triple the size of the actual data files - one replica of data files, one for rowset meta data, and another for rowset meta data changes during Publish. The Publish operation is driven by a single FE node, so the write capacity of a single table or even the entire system is limited by the FE node's capabilities. 
+ + ![Comparison with alternative solutions](/images/comparison-with-alternative-solutions.png) + + We compared the real-time data write performance of Apache Doris 3.0 with the above-described solution. We simulated 500 concurrent tasks writing 10,000 data files with 500 rows each, and 50 concurrent tasks writing 250 data files with 20,000 rows each, using the same computational resources. + + **The results showed that at 50 concurrent tasks, the micro-batch write performances of Apache Doris in both compute-storage coupled and decoupled modes were almost identical, while the industry solution lagged behind Apache Doris by a factor of 100.** + + At 500 concurrent tasks, the performance of Apache Doris in the compute-storage decoupled mode showed slight degradation, but it still maintained an 11X advantage over the industry solution. To ensure a fair test, Apache Doris did not enable the Group Commit feature (which the industry solution lacks). Enabling Group Commit would further enhance real-time write performance. + + ![Comparison with alternative solutions](/images/real-time-write-performance..png) + + Additionally, the industry solution also faces stability and cost issues in terms of real-time data ingestion: + + - Stability concerns: A large number of small files can put pressure on the shared storage, especially HDFS, and introduce stability risks. + + - High object storage request costs: Some public cloud object storage services charge 10 times more for Put and Delete operations compared to Get operations. A large number of small files can lead to a significant increase in object storage request costs, which can even exceed the storage costs. + +- **Limited scalability**: Use cases of the compute-storage decoupled mode often handle larger data volumes. Since the FE (Frontend) meta data is kept entirely in memory, when the number of tablets reaches a certain high level (e.g.
tens of millions), the FE's memory pressure can become a bottleneck that limits the overall write throughput of the system. + +- **Potential data deletion logic issues**: In the compute-storage decoupled architecture, data is stored with one single replica. Therefore, the data deletion logic is critical for the system's reliability. The conventional approach of cross-system data deletion by comparing the differences can be challenging. During the write process, there is no way to completely prevent a deletion and a write from both succeeding, which can lead to data loss. Additionally, when the storage system experiences anomalies, the input used for difference calculation may be incorrect, which potentially leads to unintended data deletion. + +- **Data sharing and lightweight cloning**: The flexibility of the decoupled storage-compute architecture can enable future data sharing and lightweight data cloning, reducing the burden of enterprise data management. However, if each cluster has a separate FE, after cloning data across clusters, it becomes difficult to accurately determine which data is no longer referenced and can be safely deleted, as calculating cross-cluster references can easily lead to unintended data deletion. + +By evolving the FE's full in-memory meta data model into a shared meta data service, Apache Doris 3.0 avoids all the aforementioned issues. + +### 1-4 Query performance comparison + +In the compute-storage decoupled mode, data needs to be read from the remote shared storage system, so the main bottleneck becomes network bandwidth rather than disk I/O, as in the compute-storage coupled mode. + +To accelerate data access, Apache Doris has implemented a high-speed caching mechanism based on local disks, and provides two cache management policies: LRU (Least Recently Used) and TTL (Time-To-Live). The newly imported data is asynchronously written to the cache to accelerate the first-time access to the latest data.
If the data required by a query is not in the cache, the system will read the data from the remote storage into memory and synchronously write it to the cache for subsequent queries. + +In use cases involving multiple compute clusters, Apache Doris provides a cache preheating function. When a new compute cluster is established, users can choose to preheat specific data (such as tables or partitions) to further improve query efficiency. + +In this context, we have conducted performance tests with different caching strategies in both the compute-storage coupled and decoupled modes, using the TPC-DS 1TB test dataset. The results are concluded as follows: + +- When the cache is fully hit (i.e., all the data required for the query is loaded into the cache), **the query performance of the compute-storage decoupled mode is on par with that of the compute-storage coupled mode**. + +- When the cache is partially hit (i.e., the cache is cleared before the test, and data is gradually loaded into the cache during the test, with performance continuously improving), the query performance of the compute-storage decoupled mode is about 10% lower than that of the compute-storage coupled mode. This test scenario is the most similar to the real-life use cases. + +- When the cache is completely missed (i.e., the cache is cleared before every SQL execution, simulating an extreme case), the performance loss is around 35%. **Even so, Apache Doris in the compute-storage decoupled mode delivers much higher performance than its alternative solutions.** + +![Query performance comparison](/images/query-performance-comparison.png) + +### 1-5 Write speed comparison + +In terms of write performance, we have simulated two test cases under the same computing resources: batch import and high-concurrency real-time import. 
The comparison of write performance between the compute-storage coupled mode and the compute-storage decoupled mode is as follows: + +- **Batch import**: When importing the 1TB TPC-H and 1TB TPC-DS test datasets, **the write performance of the compute-storage decoupled mode is 20.05% and 27.98% higher than the compute-storage coupled mode**, respectively, under the single-replica configuration. During batch import, the segment file size is generally in the range of tens to hundreds of MB. In the compute-storage decoupled mode, the segment files are split into smaller files and concurrently uploaded to the object storage, which can result in higher throughput compared to writing to local disks. In real-life deployments, the compute-storage coupled mode typically uses three replicas, which means the write speed advantage of the compute-storage decoupled mode will be even more pronounced. + +- **High-concurrency real-time import**: as described in the "Comparison with alternative solutions" section. + +![Write speed comparison](/images/write-speed-comparison.png) + +### 1-6 Tips for production environment + +- **Performance**: For real-time data analysis, users can achieve query performance comparable to the compute-storage coupled mode by specifying a TTL (Time-To-Live) for the cache and writing newly ingested data into the cache. To prevent query jitter, users can cache the data generated by background tasks such as compaction and schema changes based on how frequently the data is used. + +- **Workload isolation**: Users can achieve physical resource isolation for different businesses using multiple compute clusters. For workload isolation within a single compute cluster, users can utilize the Workload Group mechanism to limit and isolate resources for different queries. + +### 1-7 Notes + +- Apache Doris 3.0 does not support the co-existence of the compute-storage coupled mode and the compute-storage decoupled mode.
Users need to specify one of them during cluster deployment. + +- If users need the compute-storage coupled mode, follow the [documentation](https://doris.apache.org/docs/3.0/install/source-install/compilation-with-docker/) for its deployment and upgrade. We recommend using Doris Manager for quick deployment and cluster upgrades. However, the compute-storage decoupled mode does not yet support Doris Manager deployment and upgrade. We will continue iterating toward better support in future versions. + +- Currently Apache Doris does not support in-place upgrade from V2.1 to the compute-storage decoupled mode of V3.0. For this purpose, users need to perform data migration using tools like X2Doris after deploying the compute-storage decoupled clusters. In the future, we will support migration without service interruption through the CCR (Cross-Cluster Replication) capability. + +:::info +See doc: +https://doris.apache.org/docs/3.0/compute-storage-decoupled/overview/ +::: + +## 2. Data lakehouse + +Apache Doris is positioned as a real-time data warehouse, but it is much more than that. In previous versions, we have consistently pushed beyond the boundaries of traditional data warehouse capabilities, advancing towards a unified data lakehouse. Version 3.0 marks a milestone in this journey, with its capabilities in the lakehouse architecture becoming fully mature. We believe that a unified lakehouse is identified by **boundaryless data** and **lakehouse fusion**: + +**Boundaryless data: Apache Doris serves as a unified query processing engine, breaking down data barriers across different systems. 
It provides a consistent and ultra-fast analysis experience across all data sources, including data warehouses, data lakes, data streams, and local data files.** + +- **Lakehouse query acceleration**: Without the need to migrate data to Apache Doris, users can leverage Doris’ efficient query engine to directly query data stored in data lakes such as Iceberg, Hudi, Paimon, and offline data warehouses like Hive, thereby accelerating query analysis. + +- **Federated analysis**: By extending its catalog and storage plugins, Apache Doris enhances its federated analysis capabilities, allowing users to perform unified analysis across multiple heterogeneous data sources without physically centralizing the data in a single storage system. This enables external table queries and federated joins between internal and external tables, breaking down data silos and providing globally consistent data insights. + +- **Data lake construction**: Apache Doris introduces write-back functionality for Hive and Iceberg, allowing users to directly create Hive and Iceberg tables through Doris and write data into them. This allows users to write internal table data back to the offline lakehouse or process offline lakehouse data using Doris and save the results back into the lakehouse, simplifying and streamlining the data lake construction process. + +**Lakehouse fusion: As data lake architectures become increasingly complex, the costs of technology selection and maintenance rise for users. Achieving consistent fine-grained access control across multiple systems also becomes challenging, and real-time performance suffers. To address this, Apache Doris integrates core features of the data lake, transforming itself into a lightweight, efficient, native real-time lakehouse.** + +- **Real-time data updates**: Starting with version 1.2, Apache Doris enhanced the primary key model by introducing Merge-on-Write, supporting real-time updates. 
This feature allows high-frequency, real-time data updates based on primary key changes from upstream data sources. + +- **Data science and AI computation support**: From version 2.1, Apache Doris, using the efficient Arrow Flight protocol, increased the openness of its storage system and its support for various compute loads, enabling data science and AI computations. + +- **Enhancements for semi-structured and unstructured data**: Apache Doris has introduced support for data types like Array, Map, Struct, JSON, and Variant, with plans to support vector indexing in the future. + +- **Improved resource efficiency by decoupling storage and compute**: With version 3.0, Apache Doris supports a decoupled storage and compute mode, further improving resource efficiency and scalability. + +### 2-1 Faster queries in the data lakehouse + +TPC-H and TPC-DS benchmarks show that Apache Doris achieves average query performance that is 3 to 5 times faster than Trino/Presto. + +In V3.0, we have focused on optimizing query performance for production environments, including: + +- **More granular task splitting strategy**: By adjusting the consistent hashing algorithm and introducing a task sharding weighting mechanism, we ensure balanced query loads across all nodes. + +- **Scheduling optimizations for use cases with numerous partitions and files**: For cases with a large number of files (over 1 million), we have greatly reduced query latency (from 100 seconds to 10 seconds) and alleviated memory pressure on the Frontend (FE) by fetching file shards asynchronously and in batches. + +We will continue to enhance query acceleration performance in real-world business scenarios, improve the actual user experience, and build an industry-leading lakehouse query acceleration engine. + +### 2-2 Federated analysis: more data connectors + +Previous versions of Apache Doris support connectors for over 10 mainstream data lakehouses, warehouses, and relational databases. 
In V3.0, we have introduced the Trino Connector compatibility framework, which expands the range of data sources that Apache Doris can connect to. With this framework, users can easily adapt their existing setups to access corresponding data sources using Doris and leverage its high-speed computing engine for data analysis. + +Currently, Doris has completed adaptations for Delta Lake, Kudu, BigQuery, Kafka, TPCH, and TPCDS. We also encourage contributions from developers to extend this list. + +:::info Note + +See doc: + +- Trino Connector: https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/ + +- TPC-H: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpch/ + +- TPC-DS: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpcds/ + +- Delta Lake: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/deltalake/ + +- Kudu: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/kudu/ + +- BigQuery: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/bigquery/ +::: + + +### 2-3 Data lake building + +In V3.0, we have introduced data writeback functionality for Hive and Iceberg. This allows users to create Hive and Iceberg tables directly through Doris and write data into these tables, and enables users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Doris. + +In future iterations, Apache Doris will further enhance support for data lake table formats and improve the openness of storage APIs. + +:::info Note +See doc: https://doris.apache.org/docs/3.0/lakehouse/datalake-building/hive-build/ +::: + +## 3. Upgraded semi-structured data analysis capabilities + +In versions 2.0 and 2.1, Apache Doris introduced some well-embraced features such as inverted index, NGram Bloom Filter, and Variant data type to support high-performance full-text search and multi-dimensional analysis. 
With them, the storage and processing of complex semi-structured data have been more flexible and efficient. + +In V3.0, we have further enhanced the capabilities in this scenario. + +After extensive testing in production environments, the Variant data type has gained sufficient stability and become the preferred choice for JSON data storage and analysis. In V3.0, we have made multiple optimizations to it: + +- Support for indexing of the Variant data type to accelerate queries, including inverted index, Bloom Filter index, and the built-in ZoneMap index. + +- Support for flexible partial column updates for Unique Key tables containing the Variant data type. + +- Support for the use of the Variant data type in the compute-storage decoupled mode, with optimizations of its metadata storage. + +- Support for exporting the Variant data type to formats such as Parquet and CSV. + +The inverted index, introduced since V2.0, has reached a high level of maturity after more than a year of refinement and is now running in production environments of hundreds of enterprises. In V3.0, we have made multiple optimizations to the inverted index: + +- After performance optimizations, including lock concurrency, Apache Doris outperforms Elasticsearch in key metrics such as query latency and concurrency in real-time reporting analysis. + +- Optimized index file in the compute-storage decoupled mode to reduce remote storage calls and decrease index query latency. + +- Support for the Array data type to accelerate the `array_contains` queries. + +- Enhanced the `match_phrase_*` functionality, including support for slop and phrase prefix matching `match_phrase_prefix`. + +## 4. Enhanced ETL capabilities + +### 4-1. Transaction improvements + +Data processing in data warehouses often involves multiple data changes that need to be handled as a single transaction. V3.0 provides explicit transaction support for `insert into select`, `delete`, and `update` operations. 
Example cases include: + +- **Transactional requirements**: For example, when updating data within a time range, the typical approach is to first delete the data in that time range, and then insert the new data. Because the data might already be in service, queries must see either the old data or the new data. This can be achieved by executing the `delete` and `insert into select` operations in a single transaction. + + ```sql + BEGIN; + DELETE FROM table WHERE date >= "2024-07-01" AND date <= "2024-07-31"; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +- **Simplified handling of failed tasks**: For example, when two `insert into select` operations are executed within a single transaction, if any of the operations fails, the transaction can be retried directly. + + ```sql + BEGIN WITH LABEL label_etl_1; + INSERT INTO table1 SELECT * FROM stage_table1; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +:::info Note +See doc: https://doris.apache.org/docs/3.0/data-operate/transaction/ +Currently, explicit transaction synchronization is not supported in Cross-Cluster Replication (CCR). +::: + +### 4-2. Improved observability + +- **Real-time profile retrieval**: In previous versions, developers could access the query profile for performance analysis only after a query had completed, even when a complex query was consuming heavy computing resources because of issues with the execution plan or the data. This made it hard to promptly identify issues in query execution and guarantee the stability of the production environment. Now, with the ability to retrieve real-time profiles, V3.0 allows users to monitor query execution while the query is running. It also allows them to better monitor the progress of each ETL job. + +- **`backend_active_tasks` system table**: The `backend_active_tasks` system table provides real-time resource consumption information for each query on each BE node. 
Users can analyze this system table using SQL to obtain the resource usage of each query, which helps identify large queries or abnormal workloads. + +## 5. Asynchronous materialized view + +In V3.0, asynchronous materialized view is faster and more stable. It is also more user-friendly for query acceleration and data modeling scenarios. We have restructured the logic for transparent rewrite and expanded its capabilities, making it 2X faster. + +### 5-1 Refresh + +- Support for incremental update of materialized views by partitions and partition roll-ups on materialized views to allow refreshes at different granularities. + +- Support for nested materialized views, which is useful in data modeling scenarios. + +- Support for index creation and sort key specification in asynchronous materialized views, which will improve query performance after the materialized view is hit. + +- Higher usability of materialized view DDL with support for atomically replacing materialized views, allowing modifications to the materialized view definition SQL while keeping the materialized view available. + +- Support for non-deterministic functions in materialized views to better serve daily materialized view creation. + +- Support for trigger-based materialized view refresh, which ensures data consistency in data modeling with nested materialized views. + +- Support for a broader range of SQL patterns for building partitioned materialized views, making the incremental update capability available to more use cases. + +### 5-2 Refresh stability + +- V3.0 supports specifying a Workload Group for building materialized views. This is to limit the resources used by the materialized view build process and ensure that sufficient resources remain available for ongoing queries. + +### 5-3 Transparent rewrite + +- Support for transparent rewrite of more Join types, including derived Joins. 
Even when there is a mismatch of Join types between the query and materialized view, transparent rewrite can still be performed by compensating with additional predicates, as long as the materialized view can provide all the data needed for the query. + +- Support for more aggregate functions for roll-up as well as rewrite of multi-dimensional aggregations like GROUPING SETS, ROLLUP, and CUBE; support rewriting queries with aggregations when the materialized view does not contain aggregations, simplifying Join operations and expression computation. + +- Support for transparent rewrite of nested materialized views, enabling higher performance for complex queries. + +- For partially invalid partitioned materialized views, V3.0 supports `Union All` the base tables for data completion, expanding the applicability of partitioned materialized views. + +### 5-4 Transparent rewrite performance + +- Continuous optimization has been done to improve the transparent rewrite performance, achieving 2X the speed compared to version 2.1.0. + +:::info Note + +See doc: + +https://doris.apache.org/docs/3.0/query/view-materialized-view/query-async-materialized-view + +https://doris.apache.org/docs/3.0/query/view-materialized-view/async-materialized-view/ + +::: + +## 6. Performance improvement + +### 6-1 Smarter optimizer + +In V3.0, the query optimizer has been enhanced in terms of framework capabilities, distributed plan support, optimizer infrastructure, and rule expansion. It provides better optimization capabilities for more complex and diverse business scenarios, with higher blind test performance for complex SQL: + +- **Improved plan enumeration capability**: The key structure Memo for plan enumeration has been restructured and normalized. This improves the efficiency of the Cascades framework in plan enumeration and the possibility of producing better plans. 
Additionally, it fixes incomplete column pruning during the Join Reorder process in older versions, which led to unnecessary overhead of the Join operator, thus improving the execution performance in the relevant scenarios. + +- **Improved distributed plan support**: The distributed query plan has been enhanced to allow aggregation, join, and window function operations to more intelligently identify the data characteristics of intermediate computation results, avoiding ineffective data redistribution operations. Meanwhile, we have optimized the execution under the multi-replica continuous execution mode, making it more data cache-friendly. + +- **Improved optimizer infrastructure**: V3.0 has fixed several issues in the cost model and in statistics estimation. The fixed cost model adapts better to the evolution of the execution engine, making the execution plan more stable compared to previous versions. + +- **Enhanced Runtime Filter plan support**: On the basis of Join Runtime Filter, V3.0 has expanded the capability of the TopN Runtime Filter to achieve better performance in use cases that involve a TopN operator. + +- **Enriched optimization rule library**: Based on user feedback and internal testing results, we have introduced optimization rules such as Intersect Reorder to enrich the rule set of the optimizer. + +### 6-2 Self-adaptive Runtime Filter + +In previous versions, the generation of Runtime Filter relied on manual setting by users based on statistical information. However, inaccurate settings in certain cases could lead to performance instability. + +In V3.0, Doris implements a self-adaptive Runtime Filter calculation approach. It can estimate the Runtime Filter at runtime based on the data size with high accuracy, enabling better performance in use cases with large data volumes and high workloads. 
+ +### 6-3 Function performance optimization + +- V3.0 has improved the vectorized implementation of dozens of functions, enabling a performance improvement of over 50% for some commonly used functions. +- V3.0 has also made extensive optimizations to the aggregation of nullable data types, enabling a 30% performance improvement. + +### 6-4 Blind test performance improvement + +Our blind tests on V3.0 and V2.1 show that the new version is 7.3% and 6.2% faster in TPC-DS and TPC-H benchmark tests, respectively. + +![Blind test performance improvement](/images/blind-test-performance-improvement.png) + +## 7. New features + +### 7-1 Java UDTF + +Version 3.0 has added support for Java UDTFs. The key operations are as follows: + +- Implementing a UDTF: Similar to a UDF, a UDTF requires the user to implement an `evaluate` method. Note that the return value of a UDTF function must be of the `Array` data type. + + ```java + public class UDTFStringTest { + public ArrayList<String> evaluate(String value, String separator) { + if (value == null || separator == null) { + return null; + } else { + return new ArrayList<>(Arrays.asList(value.split(separator))); + } + } + } + ``` + +- Creating a UDTF: By default, two corresponding functions will be created - `java-utdf` and `java-utdf_outer`. The `_outer` variant adds a single row of `NULL` data when the table function generates 0 rows of output. + + ```sql + CREATE TABLES FUNCTION java-utdf(string, string) RETURNS array<string> PROPERTIES ( + "file"="file:///pathTo/java-udaf.jar", + "symbol"="org.apache.doris.udf.demo.UDTFStringTest", + "always_nullable"="true", + "type"="JAVA_UDF" + ); + ``` + +:::info + +See doc: https://doris.apache.org/docs/3.0/query/udf/java-user-defined-function/#udtf-1 + +::: + +### 7-2 Generated column + +A generated column is a special column whose value is calculated from the values of other columns rather than directly inserted or updated by the user. 
It supports pre-computing the results of expressions and storing them in the database, which is suitable for scenarios that require frequent queries or complex calculations. + +Results can be automatically calculated based on predefined expressions when data is imported or updated, and then stored persistently. In this way, during subsequent queries, the system can directly access these calculated results without performing complex calculations, thereby improving query performance. + +Generated columns are supported since V3.0. When creating a table, you can specify a column as a generated column. A generated column automatically calculates values based on the defined expression when data is written. Generated columns allow for more complex expressions to be defined, but the value cannot be explicitly written or set. + +:::info + +See doc: https://doris.apache.org/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/ + +::: + +## 8. Functional improvements + +### 8-1. Materialized view + +We have refactored the selection logic for materialized views and migrated it from the rule-based optimizer (RBO) to the cost-based optimizer (CBO). This aligns the selection logic with that of asynchronous materialized views. This functionality is enabled by default. If any issues are encountered, you can revert to the RBO mode using `set global enable_sync_mv_cost_based_rewrite = false`. + +### 8-2. Routine Load + +In previous versions, the Routine Load functionality faced some usability challenges, such as uneven task scheduling across BE nodes, untimely task scheduling, complex configuration requirements (the need to change multiple FE and BE settings for optimization), and insufficient overall stability (where restarts or upgrades could frequently pause Routine Load jobs, requiring manual user intervention to resume). 
+ +To address these issues, we have made extensive optimizations to the Routine Load feature: + +- **Resource scheduling**: We have improved the scheduling balance to make sure that tasks are more evenly distributed across BE nodes. Jobs that encounter unrecoverable errors will be promptly paused to avoid wasting resources on futile scheduling attempts. Additionally, we have improved the timeliness of the scheduling process, which has enhanced the import performance of Routine Load. + +- **Parameter configuration**: Users in most environments no longer need to modify FE and BE configurations for optimization. An automatic adjustment mechanism for the timeout parameter has been introduced to prevent tasks from constantly retrying when cluster pressure increases. + +- **Stability**: We have enhanced the robustness of Doris in various exceptional scenarios, such as FE failovers, BE rolling upgrades, and Kafka cluster anomalies, ensuring continuous stable operation. We have also optimized the Auto Resume mechanism, allowing Routine Load to automatically resume operation after faults are repaired, reducing the need for manual user intervention. + +## 9. Behavior changed + +- `cpu_resource_limit` will no longer be supported, and all types of resource isolation will be implemented through Workload Groups. + +- Please use JDK 17 for Apache Doris 3.0 and later versions. The recommended version is `jdk-17.0.10_linux-x64_bin.tar.gz`. + +## Try Apache Doris 3.0 now! + +Before the official release of version 3.0, the compute-storage decoupled mode of Apache Doris has undergone nearly two years of extensive testing and optimization in the production environments of hundreds of enterprises. Contributors from many tech giants have collaborated with the community to provide a significant number of test cases based on their real-world business needs. This has rigorously validated the usability and stability of version 3.0. 
+ +We highly recommend users with compute-storage decoupling needs to download version 3.0 and experience it firsthand. + +Going forward, we will accelerate our release iteration cycle to deliver a more stable version experience for all users. Feel free to join us in the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ) and engage directly with the core developers. + +## Credits + +Special thanks to the following contributors who participated in the development, testing, and provided feedback for this version: + +@133tosakarin、@390008457、@924060929、@AcKing-Sam、@AshinGau、@BePPPower、@BiteTheDDDDt、@ByteYue、@CSTGluigi、@CalvinKirs、@Ceng23333、@DarvenDuan、@DongLiang-0、@Doris-Extras、@Dragonliu2018、@Emor-nj、@FreeOnePlus、@Gabriel39、@GoGoWen、@HappenLee、@HowardQin、@Hyman-zhao、@INNOCENT-BOY、@JNSimba、@JackDrogon、@Jibing-Li、@KassieZ、@Lchangliang、@LemonLiTree、@LiBinfeng-01、@LompleZ、@M1saka2003、@Mryange、@Nitin-Kashyap、@On-Work-Song、@SWJTU-ZhangLei、@StarryVerse、@TangSiyang2001、@Tech-Circle-48、@Thearas、@Vallishp、@WinkerDu、@XieJiann、@XuJianxu、@XuPengfei-1020、@Yukang-Lian、@Yulei-Yang、@Z-SWEI、@ZhongJinHacker、@adonis0147、@airborne12、@allenhooo、@amorynan、@bingquanzhao、@biohazard4321、@bobhan1、@caiconghui、@cambyzju、@caoliang-web、@catpineapple、@cjj2010、@csun5285、@dataroaring、@deardeng、@dongsilun、@dutyu、@echo-hhj、@eldenmoon、@elvestar、@englefly、@feelshana、@feifeifeimoon、@feiniaofeiafei、@felixwluo、@freemandealer、@gavinchou、@ghkang98、@gnehil、@hechao-ustc、@hello-stephen、@httpshirley、@hubgeter、@hust-hhb、@iszhangpch、@iwanttobepowerful、@ixzc、@jacktengg、@jackwener、@jeffreys-cat、@kaijchen、@kaka11chen、@kindred77、@koarz、@kobe6th、@kylinmac、@larshelge、@liaoxin01、@lide-reed、@liugddx、@liujiwen-up、@liutang123、@lsy3993、@luwei16、@luzhijing、@lxliyou001、@mongo360、@morningman、@morrySnow、@mrhhsg、@my-vegetable-has-exploded、@mymeiyi、@nanfeng1999、@nextdreamblue、@pingchunzhang、@platoneko、@py023、@qidaye、@qzsee、@raboof、@rohitrs1983、@rotkang、@ryanzryu
、@seawinde、@shoothzj、@shuke987、@sjyango、@smallhibiscus、@sollhui、@spaces-X、@stalary、@starocean999、@superdiaodiao、@suxiaogang223、@taptao、@vhwzx、@vinlee19、@w41ter、@wangbo、@wangshuo128、@whutpencil、@wsjz、@wuwenchi、@wyxxxcat、@xiaokang、@xiedeyantu、@xingyingone、@xinyiZzz、@xy720、@xzj7019、@yagagagaga、@yiguolei、@yongjinhou、@ytwp、@yuanyuan8983、@yujun777、@yuxuan-luo、@zclllyybb、@zddr、@zfr9527、@zgxme、@zhangbutao、@zhangstar333、@zhannngchen、@zhiqiang-hhhh、@ziyanTOP、@zxealous、@zy-kkk、@zzzxl1993、@zzzzzzzs \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md new file mode 100644 index 0000000000000..9b9007e4391aa --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.1.md @@ -0,0 +1,604 @@ +--- +{ + "title": "Release 3.0.1", + "language": "en" +} +--- + + + +Dear community members, the Apache Doris 3.0.1 version was officially released on August 23, 2024, featuring updates and improvements in compute-storage decoupling, lakehouse, semi-structured data analysis, asynchronous materialized views, and more. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior Changes + +### Query Optimizer + +- Added the variable `use_max_length_of_varchar_in_ctas` to control the length behavior of VARCHAR type when executing `CREATE TABLE AS SELECT` (CTAS) operations. [#37069](https://github.com/apache/doris/pull/37069) + + - This variable is set to true by default. + + - When set to true, if the VARCHAR type column originates from a table, the derived length is used; otherwise, the maximum length is used. + + - When set to false, the VARCHAR type will always use the derived length. + +- All data types will now be displayed in lowercase to maintain compatibility with MySQL format. 
[#38012](https://github.com/apache/doris/pull/38012) + +- Multiple query statements in the same query request must now be separated by semicolons. [#38670](https://github.com/apache/doris/pull/38670) + +### Query Execution + +- The default number of parallel tasks after shuffle operations in the cluster is set to 100, which will improve query stability and concurrent processing capability in large clusters. [#38196](https://github.com/apache/doris/pull/38196) + +### Storage + +- The default value of `trash_file_expire_time_sec` has been changed from 86400 seconds to 0 seconds, which means that if files are deleted by mistake and the FE trash is cleared, the data cannot be recovered. + +- The table attribute `enable_mow_delete_on_delete_predicate` (introduced in version 3.0.0) has been renamed to `enable_mow_light_delete`. + +- Explicit transactions are now prohibited from performing delete operations on tables with written data. + +- Heavy schema change operations are prohibited on tables with auto-increment fields. + + + +## New Features + +### Job Scheduling + +- Optimized the execution logic of internal scheduling jobs, decoupling the strong association between start time and immediate execution parameters. Now, tasks can be created with a specified start time or selected for immediate execution, without conflict, enhancing scheduling flexibility. [#36805](https://github.com/apache/doris/pull/36805) + +### Compute-Storage Decoupled + +- Supports dynamic modification of the upper limit for file cache usage. [#37484](https://github.com/apache/doris/pull/37484) + +- Recycler now supports object storage rate limiting and server-side rate limiting retry functionality. [#37663](https://github.com/apache/doris/pull/37663) [#37680](https://github.com/apache/doris/pull/37680) + +### Lakehouse + +- Added the session variable `serde_dialect` to set the output format for complex types. 
[#37039](https://github.com/apache/doris/pull/37039) + +- SQL interception now supports external tables. + + - For more information, refer to the documentation on [SQL Interception](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Insert overwrite now supports Iceberg tables. [#37191](https://github.com/apache/doris/pull/37191) + +### Asynchronous Materialized Views + +- Supports partition roll-up and build at the hourly level. [#37678](https://github.com/apache/doris/pull/37678) + +- Supports atomic replacement of asynchronous materialized view definition statements. [#36749](https://github.com/apache/doris/pull/36749) + +- Transparent rewriting now supports Insert statements. [#38115](https://github.com/apache/doris/pull/38115) + +- Transparent rewriting now supports the VARIANT type. [#37929](https://github.com/apache/doris/pull/37929) + +### Query Execution + +- The group concat function now supports DISTINCT and ORDER BY options. [#38744](https://github.com/apache/doris/pull/38744) + +### Semi-Structured Data Management + +- The ES Catalog now maps `nested` or `object` types in Elasticsearch to the JSON type in Doris. [#37101](https://github.com/apache/doris/pull/37101) + +- Added the `MULTI_MATCH` function, which supports matching keywords across multiple fields and can leverage inverted indexes to accelerate searches. [#37722](https://github.com/apache/doris/pull/37722) + +- Added the `explode_json_object` function, which can unfold objects in JSON data into multiple rows. [#36887](https://github.com/apache/doris/pull/36887) + +- Inverted indexes now support memtable advancement, requiring index construction only once during multi-replica writes, reducing CPU consumption and improving performance. 
[#35891](https://github.com/apache/doris/pull/35891) + +- Added `MATCH_PHRASE` support for positive slop, e.g., `msg MATCH_PHRASE 'a b 2+'` can match instances containing the words a and b with a slop of no more than two and with a appearing before b; a regular slop without the trailing `+` does not guarantee this order. [#36356](https://github.com/apache/doris/pull/36356) + +### Other + +- Added the FE parameter `skip_audit_user_list`; operations by users listed in this configuration will not be recorded in the audit log. [#38310](https://github.com/apache/doris/pull/38310) + + - For more information, refer to the documentation on [Audit Plugin](https://doris.apache.org/docs/admin-manual/audit-plugin/). + + + +## Improvements + +### Storage + +- Reduced the likelihood of write failures caused by disk balancing within a single BE. [#38000](https://github.com/apache/doris/pull/38000) + +- Decreased memory consumption by the memtable limiter. [#37511](https://github.com/apache/doris/pull/37511) + +- Moved old partitions to the FE trash during partition replacement operations. [#36361](https://github.com/apache/doris/pull/36361) + +- Optimized memory consumption during compaction. [#37099](https://github.com/apache/doris/pull/37099) + +- Added a session variable to control audit logs for JDBC PreparedStatement; these logs are not printed by default. [#38419](https://github.com/apache/doris/pull/38419) + +- Optimized the logic for selecting BEs for group commits. [#35558](https://github.com/apache/doris/pull/35558) + +- Improved the performance of column updates. [#38487](https://github.com/apache/doris/pull/38487) + +- Optimized the use of `delete bitmap cache`. [#38761](https://github.com/apache/doris/pull/38761) + +- Added a configuration to control query affinity during hot and cold tiering. [#37492](https://github.com/apache/doris/pull/37492) + +### Compute-Storage Decoupled + +- Implemented automatic retries when encountering object storage server rate limiting. 
[#37199](https://github.com/apache/doris/pull/37199) + +- Adapted the number of threads for memtable flush in the compute-storage decoupled mode. [#38789](https://github.com/apache/doris/pull/38789) + +- Added Azure as a compile option to support compilation in environments without Azure support. + +- Optimized the observability of object storage access rate limiting. [#38294](https://github.com/apache/doris/pull/38294) + +- Allowed the file cache TTL queue to perform LRU eviction, enhancing TTL queue usability. [#37312](https://github.com/apache/doris/pull/37312) + +- Optimized the number of balance writeeditlog IO operations in the storage and compute separation mode. [#37787](https://github.com/apache/doris/pull/37787) + +- Improved table creation speed in the storage and compute separation mode by sending tablet creation requests in batches. [#36786](https://github.com/apache/doris/pull/36786) + +- Optimized read failures caused by potential inconsistencies in the local file cache through backoff retries. [#38645](https://github.com/apache/doris/pull/38645) + +### Lakehouse + +- Optimized memory statistics for Parquet/ORC format read and write operations. [#37234](https://github.com/apache/doris/pull/37234) + +- Trino Connector Catalog now supports predicate pushdown. [#37874](https://github.com/apache/doris/pull/37874) + +- Added a session variable `enable_count_push_down_for_external_table` to control whether to enable `count(*)` pushdown optimization for external tables. [#37046](https://github.com/apache/doris/pull/37046) + +- Optimized the read logic for Hudi snapshot reads, returning an empty set when the snapshot is empty, consistent with Spark behavior. [#37702](https://github.com/apache/doris/pull/37702) + +- Improved the read performance of partition columns for Hive tables. [#37377](https://github.com/apache/doris/pull/37377) + +### Asynchronous Materialized Views + +- Improved transparent rewrite plan speed by 20%. 
[#37197](https://github.com/apache/doris/pull/37197) + +- Eliminated roll-up during transparent rewrite if the group key satisfies data uniqueness for better nested matching. [#38387](https://github.com/apache/doris/pull/38387) + +- Transparent rewrite now performs better aggregation elimination to improve the matching success rate of nested materialized views. [#36888](https://github.com/apache/doris/pull/36888) + +### MySQL Compatibility + +- Now correctly populates the database name, table name, and original name in the MySQL protocol result columns. [#38126](https://github.com/apache/doris/pull/38126) + +- Supported the hint format `/*+ func(value) */`. [#37720](https://github.com/apache/doris/pull/37720) + +### Query Optimizer + +- Significantly improved the plan speed for complex queries. [#38317](https://github.com/apache/doris/pull/38317) + +- Adaptively chose whether to perform bucket shuffle based on the number of data buckets to avoid performance degradation in extreme cases. [#36784](https://github.com/apache/doris/pull/36784) + +- Optimized the cost estimation logic for SEMI / ANTI JOIN. [#37951](https://github.com/apache/doris/pull/37951) [#37060](https://github.com/apache/doris/pull/37060) + +- Supported pushing Limit down to the first stage of aggregation to improve performance. [#34853](https://github.com/apache/doris/pull/34853) + +- Partition pruning now supports filter conditions containing the `date_trunc` or `date` function. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- SQL cache now supports query scenarios that include user variables. [#37915](https://github.com/apache/doris/pull/37915) + +- Optimized error messages for invalid aggregation semantics. [#38122](https://github.com/apache/doris/pull/38122) + +### Query Execution + +- Adapted AggState compatibility from 2.1 to 3.x and fixed Coredump issues. 
[#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle without Join. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to be asynchronous to prevent stalling during such queries. [#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process during Hash table construction for Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Optimized the duration of lock holding for MultiCast. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC keepAliveTime and added link monitoring to reduce the probability of query failure due to RPC errors. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits were exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Optimized the processing performance of `aes_encrypt`/`decrypt` functions for constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the processing performance of the `json_extract` function for constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the processing performance of the `ParseUrl` function for constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Semi-Structured Data Management + +- Bitmap indexes now default to using inverted indexes, with `enable_create_bitmap_index_as_inverted_index` set to true by default. [#36692](https://github.com/apache/doris/pull/36692) + +- In the compute-storage decoupled mode, DESC can now view sub-columns of VARIANT type. [#38143](https://github.com/apache/doris/pull/38143) + +- Removed the step of checking file existence during inverted index queries to reduce access latency to remote storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Complex types ARRAY / MAP / STRUCT now support `replace_if_not_null` for AGG tables. 
[#38304](https://github.com/apache/doris/pull/38304) + +- Escape characters for JSON data are now supported. [#37176](https://github.com/apache/doris/pull/37176) [#37251](https://github.com/apache/doris/pull/37251) + +- Inverted index queries now behave consistently on MOW tables and DUP tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Optimized the performance of inverted index acceleration for IN queries. [#37395](https://github.com/apache/doris/pull/37395) + +- Reduced unnecessary memory allocation during TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- When creating an inverted index with tokenization, the `support_phrase` option is now automatically enabled to accelerate `match_phrase` series phrase queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Other + +- The audit log can now record SQL types. [#37790](https://github.com/apache/doris/pull/37790) + +- Added support for `information_schema.processlist` to show all FE. [#38701](https://github.com/apache/doris/pull/38701) + +- Cached ranger's `datamask` and `rowpolicy` to improve query efficiency. [#37723](https://github.com/apache/doris/pull/37723) + +- Optimized metadata management in job manager to release locks immediately after modifying metadata, reducing lock holding time. [#38162](https://github.com/apache/doris/pull/38162) + + + +## Bug Fixes + +### Upgrade + +- Fix the issue where `mtmv load` fails during upgrade from version 2.1. [#38799](https://github.com/apache/doris/pull/38799) + +- Resolve the issue where `null_type` cannot be found during the upgrade to version 2.1. [#39373](https://github.com/apache/doris/pull/39373) + +- Address the compatibility issue with permission persistence during the upgrade from version 2.1 to 3.0. [#39288](https://github.com/apache/doris/pull/39288) + +### Load + +- Fix the issue where parsing fails when the newline character is surrounded by delimiters in CSV format parsing. 
[#38347](https://github.com/apache/doris/pull/38347) +- Resolve potential exception issues when FE forwards group commit. [#38228](https://github.com/apache/doris/pull/38228) [#38265](https://github.com/apache/doris/pull/38265) + +- Group commit now supports the new optimizer. [#37002](https://github.com/apache/doris/pull/37002) + +- Fix the issue where group commit reports data errors when JDBC setNull is used. [#38262](https://github.com/apache/doris/pull/38262) + +- Optimize the retry logic for group commit when encountering `delete bitmap lock` errors. [#37600](https://github.com/apache/doris/pull/37600) + +- Resolve the issue where routine load cannot use CSV delimiters and escape characters. [#38402](https://github.com/apache/doris/pull/38402) + +- Fix the issue where routine load job names with mixed case cannot be displayed. [#38523](https://github.com/apache/doris/pull/38523) + +- Optimize the logic for actively recovering routine load during FE master-slave switching. [#37876](https://github.com/apache/doris/pull/37876) + +- Resolve the issue where routine load pauses when all data in Kafka is expired. [#37288](https://github.com/apache/doris/pull/37288) + +- Fix the issue where `show routine load` returns empty results. [#38199](https://github.com/apache/doris/pull/38199) + +- Resolve the memory leak issue during multi-table stream import in routine load. [#38255](https://github.com/apache/doris/pull/38255) + +- Fix the issue where stream load does not return the error URL. [#38325](https://github.com/apache/doris/pull/38325) + +- Resolve potential load channel leak issues. [#38031](https://github.com/apache/doris/pull/38031) [#37500](https://github.com/apache/doris/pull/37500) + +- Fix the issue where no error may be reported when importing fewer segments than expected. [#36753](https://github.com/apache/doris/pull/36753) + +- Resolve the load stream leak issue. 
[#38912](https://github.com/apache/doris/pull/38912) + +- Optimize the impact of offline nodes on import operations. [#38198](https://github.com/apache/doris/pull/38198) + +- Fix the issue where transactions do not end when inserting into empty data. [#38991](https://github.com/apache/doris/pull/38991) + +### Storage + +**01 Backup and Restoration** + +- Fix the issue where tables cannot be written after backup and restoration. [#37089](https://github.com/apache/doris/pull/37089) + +- Resolve the issue where view database names are incorrect after backup and restoration. [#37412](https://github.com/apache/doris/pull/37412) + +**02 Compaction** + +- Fix the issue where cumu compaction handles delete errors incorrectly during ordered data compression. [#38742](https://github.com/apache/doris/pull/38742) + +- Resolve the issue of duplicate keys in aggregate tables caused by sequential compression optimization. [#38224](https://github.com/apache/doris/pull/38224) + +- Fix the issue where compression operations cause coredump in large wide tables. [#37960](https://github.com/apache/doris/pull/37960) + +- Resolve the compression starvation issue caused by inaccurate concurrent statistics of compression tasks. [#37318](https://github.com/apache/doris/pull/37318) + +**03 MOW Unique Key** + +- Resolve the issue of inconsistent data between replicas caused by cumulative compression deletion of delete sign. [#37950](https://github.com/apache/doris/pull/37950) + +- MOW delete now uses partial column updates with the new optimizer. [#38751](https://github.com/apache/doris/pull/38751) + +- Fix the potential duplicate key issue in MOW tables under compute-storage decoupled. [#39018](https://github.com/apache/doris/pull/39018) + +- Resolve the issue where MOW unique and duplicate tables cannot modify column order. [#37067](https://github.com/apache/doris/pull/37067) + +- Fix the potential data correctness issue caused by segcompaction. 
[#37760](https://github.com/apache/doris/pull/37760) + +- Resolve the potential memory leak issue during column updates. [#37706](https://github.com/apache/doris/pull/37706) + +**04 Other** + +- Fix the small probability of exceptions in TOPN queries. [#39119](https://github.com/apache/doris/pull/39119) [#39199](https://github.com/apache/doris/pull/39199) + +- Resolve the issue where auto-increment IDs may duplicate during FE restart. [#37306](https://github.com/apache/doris/pull/37306) + +- Fix the potential queuing issue in the delete operation priority queue. [#37169](https://github.com/apache/doris/pull/37169) + +- Optimize the delete retry logic. [#37363](https://github.com/apache/doris/pull/37363) + +- Resolve the issue with `bucket = 0` in table creation statements under the new optimizer. [#38971](https://github.com/apache/doris/pull/38971) + +- Fix the issue where FE reports success incorrectly when image generation fails. [#37508](https://github.com/apache/doris/pull/37508) + +- Resolve the issue where using the wrong nodename during FE offline nodes may cause inconsistent FE members. [#37987](https://github.com/apache/doris/pull/37987) + +- Fix the issue where CCR partition addition may fail. [#37295](https://github.com/apache/doris/pull/37295) + +- Resolve the `int32` overflow issue in inverted index files. [#38891](https://github.com/apache/doris/pull/38891) + +- Fix the issue where TRUNCATE TABLE failure may cause BE to fail to go offline. [#37334](https://github.com/apache/doris/pull/37334) + +- Resolve the issue where publish cannot continue due to null pointers. [#37724](https://github.com/apache/doris/pull/37724) [#37531](https://github.com/apache/doris/pull/37531) + +- Fix the potential coredump issue when manually triggering disk migration. [#37712](https://github.com/apache/doris/pull/37712) + +### Compute-Storage Decoupled + +- Fixed the issue where `show create table` might display the `file_cache_ttl_seconds` attribute twice. 
[#38052](https://github.com/apache/doris/pull/38052) + +- Fixed the issue where segment Footer TTL was not set correctly after setting file cache TTL. [#37485](https://github.com/apache/doris/pull/37485) + +- Fixed the issue where file cache might cause coredump due to massive conversion of cache types. [#38518](https://github.com/apache/doris/pull/38518) + +- Fixed the potential file descriptor (fd) leak in file cache. [#38051](https://github.com/apache/doris/pull/38051) + +- Fixed the issue where schema change Job overwriting compaction Job prevented base tablet compaction from completing normally. [#38210](https://github.com/apache/doris/pull/38210) + +- Fixed the potential inaccuracy of base compaction score due to data race. [#38006](https://github.com/apache/doris/pull/38006) + +- Fixed the issue where error messages from imports might not be uploaded correctly to object storage. [#38359](https://github.com/apache/doris/pull/38359) + +- Fixed the inconsistency in return information between compute-storage decoupled mode and storage and compute integration mode for 2PC imports. [#38076](https://github.com/apache/doris/pull/38076) + +- Fix the issue where incorrect file size setting during file cache warm-up leads to coredump. [#38939](https://github.com/apache/doris/pull/38939) + +- Fixed the issue where partial column updates did not correctly dequeue delete operations. [#37151](https://github.com/apache/doris/pull/37151) + +- Fixed compatibility issues with permission persistence in compute-storage decoupled mode. [#38136](https://github.com/apache/doris/pull/38136) [#37708](https://github.com/apache/doris/pull/37708) + +- Fixed the issue where observer did not retry correctly when encountering a `-230` error. [#37625](https://github.com/apache/doris/pull/37625) + +- Fixed the issue where `show load` with conditions did not perform correct analysis. 
[#37656](https://github.com/apache/doris/pull/37656) + +- Fixed the issue where `show streamload` in compute-storage decoupled mode caused BE coredump. [#37903](https://github.com/apache/doris/pull/37903) + +- Fixed the issue where `copy into` did not correctly verify column names in strict mode. [#37650](https://github.com/apache/doris/pull/37650) + +- Fixed the issue where multi-stream imports into a single table lacked permissions. [#38878](https://github.com/apache/doris/pull/38878) + +- Fixed the potential overflow issue in `getVersionUpdateTimeMs`. [#38074](https://github.com/apache/doris/pull/38074) + +- Fixed the issue where FE azure blob list was not implemented correctly. [#37986](https://github.com/apache/doris/pull/37986) + +- Fixed the issue where inaccurate azure blob recycling time calculation prevented recycling. [#37535](https://github.com/apache/doris/pull/37535) + +- Fixed the issue where inverted index files were not deleted in compute-storage decoupled mode. [#38306](https://github.com/apache/doris/pull/38306) + +### Lakehouse + +- Fixed the issue with reading binary data from Oracle Catalog. [#37078](https://github.com/apache/doris/pull/37078) + +- Fixed the potential deadlock issue when acquiring external table metadata in multi-FE scenarios. [#37756](https://github.com/apache/doris/pull/37756) + +- Fixed the issue where JNI scanner failure caused BE nodes to crash. [#37697](https://github.com/apache/doris/pull/37697) + +- Fixed the issue with slow reading of date types from Trino Connector Catalog. [#37266](https://github.com/apache/doris/pull/37266) + +- Optimized kerberos authentication logic for Hive Catalog. [#37301](https://github.com/apache/doris/pull/37301) + +- Fixed the issue where region attributes might be parsed incorrectly when parsing MinIO properties. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed the issue where creating too many FileSystems by FE caused memory leaks. 
[#36954](https://github.com/apache/doris/pull/36954) + +- Fixed the issue with reading incorrect time zone information from Paimon. [#37716](https://github.com/apache/doris/pull/37716) + +- Fixed the potential thread leak issue caused by Hive write-back operations. [#36990](https://github.com/apache/doris/pull/36990) + +- Fixed the null pointer issue caused by enabling Hive metastore event synchronization. [#38421](https://github.com/apache/doris/pull/38421) + +- Fixed the issue where error messages were unclear or caused stalling when creating catalogs. [#37551](https://github.com/apache/doris/pull/37551) + +- Fixed the issue where reading Hive text format tables behaved differently from Hive. [#37638](https://github.com/apache/doris/pull/37638) + +- Fixed the logic error when switching between catalogs and databases. [#37828](https://github.com/apache/doris/pull/37828) + +### MySQL Compatibility + +- Fixed the issue where certain flags in the MySQL protocol were set incorrectly when SSL was enabled. [#38086](https://github.com/apache/doris/pull/38086) + +### Asynchronous Materialized Views + +- Fixed the issue where construction might fail when the base table had a very large number of partitions. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed the issue where nested materialized views incorrectly performed full table refreshes even when partition refreshes were possible. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixed the issue where partition refresh could not handle the simultaneous existence of valid and invalid dependencies when analyzing partition dependencies. [#38367](https://github.com/apache/doris/pull/38367) + +- Fixed the issue where the final result containing NULL type might cause asynchronous materialized views to fail. 
[#37019](https://github.com/apache/doris/pull/37019) + +- Fixed the planning error that might occur during transparent rewriting when both synchronous and asynchronous materialized views with the same name were present. [#37311](https://github.com/apache/doris/pull/37311) + +### Synchronous Materialized Views + +- The rewritten synchronous materialized views now can correctly perform partition pruning. [#38527](https://github.com/apache/doris/pull/38527) + +- When rewriting synchronous materialized views, those with unready data are no longer selected. [#38148](https://github.com/apache/doris/pull/38148) + +### Query Optimizer + +- Fixed the deadlock issue that might occur when queries and delete operations are performed simultaneously. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the issue where bucket pruning might incorrectly prune on decimal column buckets. [#37889](https://github.com/apache/doris/pull/37889) + +- Fixed the issue where planning might be incorrect when mark join participates in join reorder. [#39152](https://github.com/apache/doris/pull/39152) + +- Fixed the issue where the result is incorrect when the correlation condition of a correlated subquery is not a simple column. [#37644](https://github.com/apache/doris/pull/37644) + +- Fixed the issue where partition pruning cannot correctly handle or expressions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the planning error that might occur when optimizing the execution order of JOIN and AGG. [#37343](https://github.com/apache/doris/pull/37343) + +- Fixed the issue where `str_to_date` performs incorrect constant folding calculations on datev1 types. [#37360](https://github.com/apache/doris/pull/37360) + +- Fixed the issue where the ACOS function's constant folding returns non-NaN values. [#37932](https://github.com/apache/doris/pull/37932) + +- Fixed the occasional planning error: "The children format needs to be [WhenClause+, DefaultValue?]". 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where planning might be incorrect when the projection includes window functions and there is both the original column and its alias. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where planning might report an error when the aggregation parameter contains a lambda expression. [#37109](https://github.com/apache/doris/pull/37109) + +- Fixed the insert error that might occur in extreme cases: "MultiCastDataSink cannot be cast to DataStreamSink". [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where the new optimizer does not correctly handle `char(0)/varchar(0)` when creating a table. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the incorrect behavior of `char(255) toSql`. [#37340](https://github.com/apache/doris/pull/37340) + +- Fixed the issue where the nullable attribute within the `agg_state` type might lead to planning errors. [#37489](https://github.com/apache/doris/pull/37489) +- Fixed the issue where row count statistics are inaccurate during mark Join. [#38270](https://github.com/apache/doris/pull/38270) + +### Query Execution + +- Fixed issues where the Pipeline execution engine was stuck, causing queries to not end, in multiple scenarios. [#38657](https://github.com/apache/doris/pull/38657), [#38206](https://github.com/apache/doris/pull/38206), [#38885](https://github.com/apache/doris/pull/38885), [#38151](https://github.com/apache/doris/pull/38151), [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed the coredump issue caused by NULL and non-NULL columns during set difference calculations. [#38750](https://github.com/apache/doris/pull/38750) + +- Fixed the error when using the DECIMAL type with pure decimals in delete statements. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed the issue where the `width_bucket` function returned incorrect results. 
[#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error when a single row of data was very large and the result set was also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the coredump issue caused by incorrect release of rpc connections during single-replica imports. [#38087](https://github.com/apache/doris/pull/38087) + +- Fixed the coredump issue caused by processing NULL values with the `foreach` function. [#37349](https://github.com/apache/doris/pull/37349) + +- Fixed the issue where stddev returned incorrect results for DECIMALV2 types. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the slow performance of `bitmap union` calculations. [#37816](https://github.com/apache/doris/pull/37816) + +- Fixed the issue where RowsProduced for aggregation operators was not set in the profile. [#38271](https://github.com/apache/doris/pull/38271) + +- Fixed the overflow issue when calculating the number of buckets for the hash table under hash join. [#37193](https://github.com/apache/doris/pull/37193), [#37493](https://github.com/apache/doris/pull/37493) + +- Fixed the inaccurate recording of the `jemalloc cache memory tracker`. [#37464](https://github.com/apache/doris/pull/37464) + +- Added the `enable_stacktrace` configuration option, allowing users to control whether exception stacks are output in BE logs. [#37713](https://github.com/apache/doris/pull/37713) + +- Fixed the issue where Arrow Flight SQL did not work correctly when `enable_parallel_result_sink` was set to false. [#37779](https://github.com/apache/doris/pull/37779) + +- Fixed the incorrect use of colocate Join. [#37361](https://github.com/apache/doris/pull/37361), [#37729](https://github.com/apache/doris/pull/37729) + +- Fixed the calculation overflow issue of the `round` function on DECIMAL128 types. 
[#37733](https://github.com/apache/doris/pull/37733), [#38106](https://github.com/apache/doris/pull/38106) + +- Fixed the coredump issue when passing a const string to the `sleep` function. [#37681](https://github.com/apache/doris/pull/37681) + +- Increased the queue length for audit logs, solving the issue where audit logs could not be recorded normally under high concurrency scenarios with thousands of concurrent connections. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed the issue where creating a workload group caused too many threads, leading to BE coredump. [#38096](https://github.com/apache/doris/pull/38096) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the transaction rollback issue caused by `insert overwrite auto partition`. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the issue where the TimeUtils formatter did not use the correct time zone. [#37465](https://github.com/apache/doris/pull/37465) + +- Fixed the issue where results were incorrect under constant folding scenarios for week/yearweek. [#37376](https://github.com/apache/doris/pull/37376) + +- Fixed the issue where the `convert_tz` function returned incorrect results. [#37358](https://github.com/apache/doris/pull/37358), [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by `percentile_approx` during rolling upgrades. [#39321](https://github.com/apache/doris/pull/39321) + +- Fixed the coredump issue caused by the `mod` function when encountering abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where the hash table was not fully built when the broadcast join probe started running. 
[#37643](https://github.com/apache/doris/pull/37643) + +- Fixed the issue where executing the same expression in multithreaded environments might lead to incorrect results for Java UDFs. [#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by incorrect return types of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the issue where the `json_replace` function returned incorrect types. [#37014](https://github.com/apache/doris/pull/37014) + +- Fixed the issue where the nullable attribute setting was unreasonable for the `percentile` aggregation function. [#37330](https://github.com/apache/doris/pull/37330) + +- Fixed the issue where the results of the `histogram` function were unstable. [#38608](https://github.com/apache/doris/pull/38608) + +- Fixed the issue where task state was displayed incorrectly in the profile. [#38082](https://github.com/apache/doris/pull/38082) + +- Fixed the issue where some queries were incorrectly canceled when the system had just started. [#37662](https://github.com/apache/doris/pull/37662) + +### Semi-Structured Data Management + +- Fix some issues with time series compression. [#39170](https://github.com/apache/doris/pull/39170) [#39176](https://github.com/apache/doris/pull/39176) + +- Fix the issue of incorrect index size statistics during compression. [#37232](https://github.com/apache/doris/pull/37232) + +- Fix the potential incorrect matching of ultra-long strings without tokenization in inverted indexes. [#37679](https://github.com/apache/doris/pull/37679) [#38218](https://github.com/apache/doris/pull/38218) + +- Fix the high memory usage issue of `array_range` and `array_with_const` functions when dealing with large data volumes. [#38284](https://github.com/apache/doris/pull/38284) [#37495](https://github.com/apache/doris/pull/37495) + +- Fix the potential coredump issue when selecting columns of ARRAY / MAP / STRUCT types. 
[#37936](https://github.com/apache/doris/pull/37936) + +- Fix the import failure issue caused by simdjson parsing errors when specifying jsonpath in Stream Load. [#38490](https://github.com/apache/doris/pull/38490) + +- Fix the exception handling issue when there are duplicate keys in JSON data. [#38146](https://github.com/apache/doris/pull/38146) + +- Fix the potential query error after DROP INDEX. [#37646](https://github.com/apache/doris/pull/37646) + +- Fix the error return issue in row merging checks during index compression. [#38732](https://github.com/apache/doris/pull/38732) + +- Inverted index v2 format now supports renaming columns. [#38079](https://github.com/apache/doris/pull/38079) + +- Fix the coredump issue when the `MATCH` function matches an empty string without an index. [#37947](https://github.com/apache/doris/pull/37947) + +- Fix the handling of NULL values in inverted indexes. [#37921](https://github.com/apache/doris/pull/37921) [#37842](https://github.com/apache/doris/pull/37842) [#38741](https://github.com/apache/doris/pull/38741) + +- Fix the incorrect `row_store_page_size` after FE restart. [#38240](https://github.com/apache/doris/pull/38240) + +### Other + +- Fix the timezone configuration issue. The default timezone is no longer fixed at UTC+8 and is now obtained from system configuration. [#37294](https://github.com/apache/doris/pull/37294) + +- Fix the class conflict issue when using ranger due to multiple JSR specification implementations. [#37575](https://github.com/apache/doris/pull/37575) + +- Fix the potential uninitialized field issue in some BE code. [#37403](https://github.com/apache/doris/pull/37403) + +- Fix the error in delete statements for random distributed tables. [#37985](https://github.com/apache/doris/pull/37985) + +- Fix the incorrect requirement for `alter_priv` permission on the base table when creating a synchronized materialized view. 
[#38011](https://github.com/apache/doris/pull/38011) + +- Fix the issue of not authenticating resources when used in TVF. [#36928](https://github.com/apache/doris/pull/36928) + + +## Credits + +Thanks to all who contributed to this release: + +@133tosakarin, @924060929, @AshinGau, @Baymine, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @Ceng23333, @DarvenDuan, @FreeOnePlus, @Gabriel39, @HappenLee, @JNSimba, @Jibing-Li, @KassieZ, @Lchangliang, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Tech-Circle-48, @Vallishp, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @cjj2010, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @gavinchou, @ghkang98, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @kaijchen, @kaka11chen, @keanji-x, @liaoxin01, @liutang123, @luwei16, @luzhijing, @lxr599, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @w41ter, @wangbo, @wangshuo128, @whutpencil, @wsjz, @wuwenchi, @wyxxxcat, @xiaokang, @xiedeyantu, @xinyiZzz, @xy720, @xzj7019, @yagagagaga, @yiguolei, @yujun777, @z404289981, @zclllyybb, @zddr, @zfr9527, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zjj, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md new file mode 100644 index 0000000000000..0ab6a828ab95d --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.2.md @@ -0,0 +1,341 @@ +--- +{ + "title": "Release 3.0.2", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.2 version was officially released on October 15, 2024, featuring updates and improvements in compute-storage decoupling, data storage, lakehouse, query optimizer, query execution and more. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +### Storage + +- Limited the number of tablets in a single backup task to prevent FE memory overflow. [#40518](https://github.com/apache/doris/pull/40518) +- The `SHOW PARTITIONS` command now displays the `CommittedVersion` of partitions. [#28274](https://github.com/apache/doris/pull/28274) + +### Other + +- The default printing mode (asynchronous) of `fe.log` now includes file line number information. If performance issues are encountered due to line number output, please switch to BRIEF mode. [#39419](https://github.com/apache/doris/pull/39419) +- The default value of the session variable `ENABLE_PREPARED_STMT_AUDIT_LOG` has been changed from `true` to `false`, and the audit log of prepare statements will no longer be printed. [#38865](https://github.com/apache/doris/pull/38865) +- The default value of the session variable `max_allowed_packet` has been adjusted from 1MB to 16MB to align with MySQL 8.4. [#38697](https://github.com/apache/doris/pull/38697) +- The JVM of FE and BE defaults to using the UTF-8 character set. [#39521](https://github.com/apache/doris/pull/39521) + +## New Features + +### Storage + +- Backup and recovery now support clearing tables or partitions that are not in the backup. [#39028](https://github.com/apache/doris/pull/39028) + +### Compute-Storage Decoupled + +- Support for parallel recycling of expired data on multiple tablets. [#37630](https://github.com/apache/doris/pull/37630) +- Support for changing storage vaults through `ALTER` statements. [#38685](https://github.com/apache/doris/pull/38685) [#37606](https://github.com/apache/doris/pull/37606) +- Support for importing a large number of tablets (5000+) in a single transaction (experimental feature). 
[#38243](https://github.com/apache/doris/pull/38243) +- Support for automatically aborting pending transactions caused by reasons such as node restarts, solving the issue of pending transactions blocking decommission or schema change. [#37669](https://github.com/apache/doris/pull/37669) +- A new session variable `enable_segment_cache` has been added to control whether to use segment cache during queries (default is `true`). [#37141](https://github.com/apache/doris/pull/37141) +- Resolved the issue of not being able to import a large amount of data during schema changes in compute-storage decoupled mode. [#39558](https://github.com/apache/doris/pull/39558) +- Support for adding multiple follower roles of FE in compute-storage decoupled mode. [#38388](https://github.com/apache/doris/pull/38388) +- Support for using memory as file cache to accelerate queries in environments with no disks or low-performance HDDs. [#38811](https://github.com/apache/doris/pull/38811) + +### Lakehouse + +- New Lakesoul Catalog has been added. [Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- A new system table `catalog_meta_cache_statistics` has been added to view the usage of various metadata caches in external catalog. [#40155](https://github.com/apache/doris/pull/40155) + +### Query Optimizer + +- Support for `is [not] true/false` expressions. [#38623](https://github.com/apache/doris/pull/38623) + +### Query Execution + +- A new CRC32 function has been added. [#38204](https://github.com/apache/doris/pull/38204) +- New aggregate functions skew and kurt have been added. [#41277](https://github.com/apache/doris/pull/41277) +- Profiles are now persisted to the FE's disk to retain more profiles. [#33690](https://github.com/apache/doris/pull/33690) +- A new system table `workload_group_privileges` has been added to view permission information related to workload groups. 
[#38436](https://github.com/apache/doris/pull/38436) +- A new system table `workload_group_resource_usage` has been added to monitor resource statistics of workload groups. [#39177](https://github.com/apache/doris/pull/39177) +- Workload groups now support limiting reads of local IO and remote IO. [#39012](https://github.com/apache/doris/pull/39012) +- Workload groups now support cgroupv2 to limit CPU usage. [#39374](https://github.com/apache/doris/pull/39374) +- A new system table `information_schema.partitions` has been added to view some table creation attributes. [#40636](https://github.com/apache/doris/pull/40636) + +### Other + +- Support for using the `SHOW` statement to display BE's configuration information, such as `SHOW BACKEND CONFIG LIKE ${pattern}`. [#36525](https://github.com/apache/doris/pull/36525) + +## Improvements + +### Load + +- Improved the import efficiency of routine load when encountering frequent EOFs from Kafka. [#39975](https://github.com/apache/doris/pull/39975) +- The stream load result now includes the time taken to read HTTP data, `ReceiveDataTimeMs`, which can quickly determine slow stream load issues caused by network reasons. [#40735](https://github.com/apache/doris/pull/40735) +- Optimized the routine load timeout logic to avoid frequent timeouts during inverted index and mow writes. [#40818](https://github.com/apache/doris/pull/40818) + +### Storage + +- Support for batch addition of partitions. [#37114](https://github.com/apache/doris/pull/37114) + +### Compute-Storage Decoupled + +- Added the meta-service HTTP interface `/MetaService/http/show_meta_ranges` to facilitate the statistics of KV distribution in FDB. [#39208](https://github.com/apache/doris/pull/39208) +- The meta-service/recycler stop script ensures that the process fully exits before returning. 
[#40218](https://github.com/apache/doris/pull/40218) +- Support for using the session variable `version_comment` (Cloud Mode) to display the current deployment mode as compute-storage decoupled. [#38269](https://github.com/apache/doris/pull/38269) +- Fixed the detailed message returned when transaction submission fails. [#40584](https://github.com/apache/doris/pull/40584) +- Support for using one meta-service process to provide both metadata services and data recycling services. [#40223](https://github.com/apache/doris/pull/40223) +- Optimized the default configuration of file_cache to avoid potential issues when not set. [#41421](https://github.com/apache/doris/pull/41421) [#41507](https://github.com/apache/doris/pull/41507) +- Improved query performance by batch retrieving the version of multiple partitions. [#38949](https://github.com/apache/doris/pull/38949) +- Delayed the redistribution of tablets to avoid query performance issues caused by temporary network fluctuations. [#40371](https://github.com/apache/doris/pull/40371) +- Optimized the read-write lock logic in the balance. [#40633](https://github.com/apache/doris/pull/40633) +- Enhanced the robustness of file cache in handling TTL filenames during restarts/crashes. [#40226](https://github.com/apache/doris/pull/40226) +- Added the BE HTTP interface `/api/file_cache?op=hash` to facilitate the calculation of the hash file names of segment files on disk. [#40831](https://github.com/apache/doris/pull/40831) +- Optimized the unified naming to be compatible with using compute group to represent BE groups (original cloud cluster). [#40767](https://github.com/apache/doris/pull/40767) +- Optimized the waiting time for obtaining locks when calculating delete bitmaps in primary key tables. [#40341](https://github.com/apache/doris/pull/40341) +- When there are many delete bitmaps in primary key tables, optimized the high CPU consumption during queries by pre-merging multiple delete bitmaps. 
[#40204](https://github.com/apache/doris/pull/40204) +- Support for managing FE/BE nodes in compute-storage decoupled mode through SQL statements, hiding the logic of direct interaction with meta-service when deploying in compute-storage decoupled mode. [#40264](https://github.com/apache/doris/pull/40264) +- Added a script for rapid deployment of FDB. [#39803](https://github.com/apache/doris/pull/39803) +- Optimized the output of `SHOW CACHE HOTSPOT` to unify the column name style with other `SHOW` statements. [#41322](https://github.com/apache/doris/pull/41322) +- When using a storage vault as the storage backend, disallowed the use of `latest_fs()` to avoid binding different storage backends to the same table. [#40516](https://github.com/apache/doris/pull/40516) +- Optimized the timeout strategy for calculating delete bitmaps when importing mow tables. [#40562](https://github.com/apache/doris/pull/40562) [#40333](https://github.com/apache/doris/pull/40333) +- The enable_file_cache in be.conf is now enabled by default in compute-storage decoupled mode. [#41502](https://github.com/apache/doris/pull/41502) + +### Lakehouse + +- When reading tables in CSV format, support for the session `keep_carriage_return` setting to control the reading behavior of the `\r` symbol. [#39980](https://github.com/apache/doris/pull/39980) +- The default maximum memory of BE's JVM has been adjusted to 2GB (affecting only new deployments). [#41403](https://github.com/apache/doris/pull/41403) +- Hive Catalog has added `hive.recursive_directories_table` and `hive.ignore_absent_partitions` properties to specify whether to recursively traverse data directories and whether to ignore missing partitions. [#39494](https://github.com/apache/doris/pull/39494) +- Optimized the Catalog refresh logic to avoid generating a large number of connections during refresh. 
[#39205](https://github.com/apache/doris/pull/39205) +- `SHOW CREATE DATABASE` and `SHOW CREATE TABLE` for external data sources now display location information. [#39179](https://github.com/apache/doris/pull/39179) +- The new optimizer supports inserting data into JDBC external tables using the `INSERT INTO` statement. [#41511](https://github.com/apache/doris/pull/41511) +- MaxCompute Catalog now supports complex data types. [#39259](https://github.com/apache/doris/pull/39259) +- Optimized the logic for reading and merging data shards of external tables. [#38311](https://github.com/apache/doris/pull/38311) +- Optimized some refresh strategies for metadata caches of external tables. [#38506](https://github.com/apache/doris/pull/38506) +- Paimon tables now support pushing down `IN/NOT IN` predicates. [#38390](https://github.com/apache/doris/pull/38390) +- Compatible with tables created in Parquet format by Paimon version 0.9. [#41020](https://github.com/apache/doris/pull/41020) + +### Asynchronous Materialized Views + +- Building asynchronous materialized views now supports the use of both immediate and starttime. [#39573](https://github.com/apache/doris/pull/39573) +- Asynchronous materialized views based on external tables will refresh the metadata cache of the external tables before refreshing the materialized views, ensuring construction based on the latest external table data. [#38212](https://github.com/apache/doris/pull/38212) +- Partition incremental construction now supports rolling up according to weekly and quarterly granularities. [#39286](https://github.com/apache/doris/pull/39286) + +### Query Optimizer + +- The aggregate function `GROUP_CONCAT` now supports the use of both `DISTINCT` and `ORDER BY`. [#38080](https://github.com/apache/doris/pull/38080) +- Optimized the collection and use of statistical information, as well as the logic for estimating row counts and cost calculations, to generate more efficient and stable execution plans. 
+- Window function partition data pre-filtering now supports cases containing multiple window functions. [#38393](https://github.com/apache/doris/pull/38393) + +### Query Execution + +- Reduced query latency by running prepare pipeline tasks in parallel. [#40874](https://github.com/apache/doris/pull/40874) +- Display Catalog information in Profile. [#38283](https://github.com/apache/doris/pull/38283) +- Optimized the computational performance of `IN` filtering conditions. [#40917](https://github.com/apache/doris/pull/40917) +- Supported cgroupv2 in K8S to limit Doris's memory usage. [#39256](https://github.com/apache/doris/pull/39256) +- Optimized the performance of converting strings to datetime types. [#38385](https://github.com/apache/doris/pull/38385) +- When a `string` is a decimal number, support casting it to an `int`, which will be more compatible with certain behaviors of MySQL. [#38847](https://github.com/apache/doris/pull/38847) + +### Semi-Structured Data Management + +- Optimized the performance of inverted index matching. [#41122](https://github.com/apache/doris/pull/41122) +- Temporarily prohibited the creation of inverted indexes with tokenization on arrays. [#39062](https://github.com/apache/doris/pull/39062) +- `explode_json_array` now supports binary JSON types. [#37278](https://github.com/apache/doris/pull/37278) +- IP data types now support bloomfilter indexes. [#39253](https://github.com/apache/doris/pull/39253) +- IP data types now support row storage. [#39258](https://github.com/apache/doris/pull/39258) +- Nested data types such as ARRAY, MAP, and STRUCT now support schema changes. [#39210](https://github.com/apache/doris/pull/39210) +- When creating MTMV, automatically truncate KEYs encountered in VARIANT data types. [#39988](https://github.com/apache/doris/pull/39988) +- Lazy loading of inverted indexes during queries to improve performance. 
[#38979](https://github.com/apache/doris/pull/38979) +- `add inverted index file size for open file`. [#37482](https://github.com/apache/doris/pull/37482) +- Reduced access to object storage interfaces during compaction to improve performance. [#41079](https://github.com/apache/doris/pull/41079) +- Added three new query profile metrics related to inverted indexes. [#36696](https://github.com/apache/doris/pull/36696) +- Reduced cache overhead for non-PreparedStatement SQL to improve performance. [#40910](https://github.com/apache/doris/pull/40910) +- Pre-warming cache now supports inverted indexes. [#38986](https://github.com/apache/doris/pull/38986) +- Inverted indexes are now cached immediately after writing. [#39076](https://github.com/apache/doris/pull/39076) + +### Compatibility + +- Fixed the issue of Thrift ID incompatibility on the master with branch-2.1. [#41057](https://github.com/apache/doris/pull/41057) + +### Other + +- BE HTTP API now supports authentication; set config::enable_all_http_auth to true (default is false) when authentication is required. [#39577](https://github.com/apache/doris/pull/39577) +- Optimized the user permissions required for the REFRESH operation. Permissions have been relaxed from ALTER to SHOW. [#39008](https://github.com/apache/doris/pull/39008) +- Reduced the range of nextId when calling advanceNextId(). [#40160](https://github.com/apache/doris/pull/40160) +- Optimized the caching mechanism for Java UDFs. [#40404](https://github.com/apache/doris/pull/40404) + +## Bug Fixes + +### Load + +- Fixed the issue where `abortTransaction` did not handle return codes. [#41275](https://github.com/apache/doris/pull/41275) +- Fixed the issue where transactions failed to commit or abort in compute-storage decoupled mode without calling `afterCommit/afterAbort`. 
[#41267](https://github.com/apache/doris/pull/41267) +- Fixed the issue where Routine Load could not work properly when modifying consumer offsets in compute-storage decoupled mode. [#39159](https://github.com/apache/doris/pull/39159) +- Fixed the issue of repeatedly closing file handles when obtaining error log file paths. [#41320](https://github.com/apache/doris/pull/41320) +- Fixed the issue of incorrect job progress caching for Routine Load in compute-storage decoupled mode. [#39313](https://github.com/apache/doris/pull/39313) +- Fixed the issue where Routine Load could get stuck when failing to commit transactions in compute-storage decoupled mode. [#40539](https://github.com/apache/doris/pull/40539) +- Fixed the issue where Routine Load kept reporting data quality check errors in compute-storage decoupled mode. [#39790](https://github.com/apache/doris/pull/39790) +- Fixed the issue where Routine Load did not check transactions before committing in compute-storage decoupled mode. [#39775](https://github.com/apache/doris/pull/39775) +- Fixed the issue where Routine Load did not check transactions before aborting in compute-storage decoupled mode. [#40463](https://github.com/apache/doris/pull/40463) +- Fixed the issue where cluster keys did not support certain data types. [#38966](https://github.com/apache/doris/pull/38966) +- Fixed the issue of transactions being repeatedly committed. [#39786](https://github.com/apache/doris/pull/39786) +- Fixed the issue of use after free with WAL when BE exits. [#33131](https://github.com/apache/doris/pull/33131) +- Fixed the issue where WAL playback did not skip completed import transactions in compute-storage decoupled mode. [#41262](https://github.com/apache/doris/pull/41262) +- Fixed the logic for selecting BE in group commit in compute-storage decoupled mode. 
[#39986](https://github.com/apache/doris/pull/39986) [#38644](https://github.com/apache/doris/pull/38644) +- Fixed the issue where BE might crash when group commit was enabled for insert into. [#39339](https://github.com/apache/doris/pull/39339) +- Fixed the issue where insert into with group commit enabled might get stuck. [#39391](https://github.com/apache/doris/pull/39391) +- Fixed the issue where not enabling the group commit option during import might result in a table not found error. [#39731](https://github.com/apache/doris/pull/39731) +- Fixed the issue of transaction submission timeouts due to too many tablets. [#40031](https://github.com/apache/doris/pull/40031) +- Fixed the issue of concurrent opens with Auto Partition. [#38605](https://github.com/apache/doris/pull/38605) +- Fixed the issue of import lock granularity being too large. [#40134](https://github.com/apache/doris/pull/40134) +- Fixed the issue of coredumps caused by zero-length varchars. [#40940](https://github.com/apache/doris/pull/40940) +- Fixed the issue of incorrect index Id values in log prints. [#38790](https://github.com/apache/doris/pull/38790) +- Fixed the issue of memtable shifting not closing BRPC streaming. [#40105](https://github.com/apache/doris/pull/40105) +- Fixed the issue of inaccurate bvar statistics during memtable shifting. [#39075](https://github.com/apache/doris/pull/39075) +- Fixed the issue of multi-replication fault tolerance during memtable shifting. [#38003](https://github.com/apache/doris/pull/38003) +- Fixed the issue of incorrect message length calculations for Routine Load with multiple tables in one stream. [#40367](https://github.com/apache/doris/pull/40367) +- Fixed the issue of inaccurate progress reporting for Broker Load. [#40325](https://github.com/apache/doris/pull/40325) +- Fixed the issue of inaccurate data scan volume reporting for Broker Load. 
[#40694](https://github.com/apache/doris/pull/40694) +- Fixed the issue of concurrency with Routine Load in compute-storage decoupled mode. [#39242](https://github.com/apache/doris/pull/39242) +- Fixed the issue of Routine Load jobs being canceled in compute-storage decoupled mode. [#39514](https://github.com/apache/doris/pull/39514) +- Fixed the issue of progress not being reset when deleting Kafka topics. [#38474](https://github.com/apache/doris/pull/38474) +- Fixed the issue of updating progress during transaction state transitions in Routine Load. [#39311](https://github.com/apache/doris/pull/39311) +- Fixed the issue of Routine Load switching from a paused state to a paused state. [#40728](https://github.com/apache/doris/pull/40728) +- Fixed the issue of Stream Load records being missed due to database deletion. [#39360](https://github.com/apache/doris/pull/39360) + +### Storage + +- Fixed the issue of missing storage policies. [#38700](https://github.com/apache/doris/pull/38700) +- Fixed the issue of errors during cross-version backup and recovery. [#38370](https://github.com/apache/doris/pull/38370) +- Fixed the NPE issue with ccr binlog. [#39909](https://github.com/apache/doris/pull/39909) +- Fixed potential issues with duplicate keys in mow. [#41309](https://github.com/apache/doris/pull/41309) [#39791](https://github.com/apache/doris/pull/39791) [#39958](https://github.com/apache/doris/pull/39958) [#38369](https://github.com/apache/doris/pull/38369) [#38331](https://github.com/apache/doris/pull/38331) +- Fixed the issue of not being able to write after backup and recovery in high-frequency write scenarios. [#40118](https://github.com/apache/doris/pull/40118) [#38321](https://github.com/apache/doris/pull/38321) +- Fixed the issue of data errors potentially triggered by deleting empty strings and schema changes. [#41064](https://github.com/apache/doris/pull/41064) +- Fixed the issue of incorrect statistics due to column updates. 
[#40880](https://github.com/apache/doris/pull/40880) +- Limited the size of tablet meta pb to prevent BE crashes due to oversized meta. [#39455](https://github.com/apache/doris/pull/39455) +- Fixed the potential column misalignment issue with the new optimizer in `begin; insert into values; commit`. [#39295](https://github.com/apache/doris/pull/39295) + +### Compute-Storage Decoupled + +- Fixed the issue where the tablet distribution might be inconsistent across multiple FEs in compute-storage decoupled mode. [#41458](https://github.com/apache/doris/pull/41458) +- Fixed the issue where TVF might not work in multi-computing group environments. [#39249](https://github.com/apache/doris/pull/39249) +- Fixed the issue where compaction used resources that had already been released when BE exited in compute-storage decoupled mode. [#39302](https://github.com/apache/doris/pull/39302) +- Fixed the issue where automatic start-stop might cause FE replay to get stuck. [#40027](https://github.com/apache/doris/pull/40027) +- Fixed the issue where the BE status and the stored status in meta-service were inconsistent. [#40799](https://github.com/apache/doris/pull/40799) +- Fixed the issue where the FE->meta-service connection pool could not automatically expire and reconnect. [#41202](https://github.com/apache/doris/pull/41202) [#40661](https://github.com/apache/doris/pull/40661) +- Fixed the issue where some tablets might repeatedly undergo unexpected balance processes during rebalance. [#39792](https://github.com/apache/doris/pull/39792) +- Fixed the issue where storage vault permissions were lost after FE restarted. [#40260](https://github.com/apache/doris/pull/40260) +- Fixed the issue where tablet row counts and other statistical information might be incomplete due to FDB scan range pagination. [#40494](https://github.com/apache/doris/pull/40494) +- Fixed the performance issue caused by a large number of aborted transactions associated with the same label. 
[#40606](https://github.com/apache/doris/pull/40606) +- Fixed the issue where `commit_txn` did not automatically re-enter, maintaining consistent behavior between compute-storage decoupled and integrated modes. [#39615](https://github.com/apache/doris/pull/39615) +- Fixed the issue where the number of projected columns increased when dropping columns. [#40187](https://github.com/apache/doris/pull/40187) +- Fixed the issue where delete statements did not correctly handle return values, causing data to still be visible after deletion. [#39428](https://github.com/apache/doris/pull/39428) +- Fixed the coredump issue caused by rowset metadata competition during file cache preheating. [#39361](https://github.com/apache/doris/pull/39361) +- Fixed the issue where the entire cache space would be used up when TTL cache enabled LRU eviction. [#39814](https://github.com/apache/doris/pull/39814) +- Fixed the issue where temporary files could not be recycled when importing commit rowset failed with HDFS storage backend. [#40215](https://github.com/apache/doris/pull/40215) + +### Lakehouse + +- Fixed some issues with predicate pushdown in JDBC Catalog. [#39064](https://github.com/apache/doris/pull/39064) +- Fixed the issue of not being able to read when `STRUCT` type columns are missing in Parquet format. [#38718](https://github.com/apache/doris/pull/38718) +- Fixed the issue of FileSystem leaks on the FE side in some cases. [#38610](https://github.com/apache/doris/pull/38610) +- Fixed the issue of metadata cache information being inconsistent when Hive/Iceberg tables write back in some cases. [#40729](https://github.com/apache/doris/pull/40729) +- Fixed the issue of unstable partition ID generation for external tables in some cases. [#39325](https://github.com/apache/doris/pull/39325) +- Fixed the issue of external table queries selecting BE nodes in the blacklist in some cases. 
[#39451](https://github.com/apache/doris/pull/39451) +- Optimized the timeout time for batch retrieval of external table partition information to avoid long-term thread occupation. [#39346](https://github.com/apache/doris/pull/39346) +- Fixed the issue of memory leaks when querying Hudi tables in some cases. [#41256](https://github.com/apache/doris/pull/41256) +- Fixed the issue of connection pool connection leaks in JDBC Catalog in some cases. [#39582](https://github.com/apache/doris/pull/39582) +- Fixed the issue of BE memory leaks in JDBC Catalog in some cases. [#41041](https://github.com/apache/doris/pull/41041) +- Fixed the issue of not being able to query Hudi data on Alibaba Cloud OSS. [#41316](https://github.com/apache/doris/pull/41316) +- Fixed the issue of not being able to read empty partitions in MaxCompute. [#40046](https://github.com/apache/doris/pull/40046) +- Fixed the issue of poor performance when querying Oracle through JDBC Catalog. [#41513](https://github.com/apache/doris/pull/41513) +- Fixed the issue of BE crashes when querying deletion vector of Paimon tables after enabling file cache features. [#39877](https://github.com/apache/doris/pull/39877) +- Fixed the issue of not being able to access Paimon tables on HDFS clusters with HA enabled. [#39806](https://github.com/apache/doris/pull/39806) +- Temporarily disabled the page index filtering feature of Parquet to avoid potential issues. [#38691](https://github.com/apache/doris/pull/38691) +- Fixed the issue of not being able to read unsigned types in Parquet files. [#39926](https://github.com/apache/doris/pull/39926) +- Fixed the issue of potential infinite loops when reading Parquet files in some cases. [#39523](https://github.com/apache/doris/pull/39523) + +### Asynchronous Materialized Views + +- Fixed the issue where partition construction might select the wrong table to track partitions if both sides have the same column names. 
[#40810](https://github.com/apache/doris/pull/40810) +- Fixed the issue where transparent rewrite partition compensation might result in incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed the issue where transparent rewrite did not take effect on external tables. [#38909](https://github.com/apache/doris/pull/38909) +- Fixed the issue where nested materialized views might not refresh properly. [#40433](https://github.com/apache/doris/pull/40433) + +### Synchronous Materialized Views + +- Fixed the issue where creating synchronous materialized views on MOW tables might result in incorrect query results. [#39171](https://github.com/apache/doris/pull/39171) + +### Query Optimizer + +- Fixed the issue where existing synchronous materialized views might not be usable after upgrading. [#41283](https://github.com/apache/doris/pull/41283) +- Fixed the issue of not correctly handling milliseconds when comparing datetime literals. [#40121](https://github.com/apache/doris/pull/40121) +- Fixed the issue of potential errors in conditional function partition pruning. [#39298](https://github.com/apache/doris/pull/39298) +- Fixed the issue where MOW tables with synchronous materialized views could not perform delete operations. [#39578](https://github.com/apache/doris/pull/39578) +- Fixed the issue where the nullable of slots in JDBC external table query predicates might be incorrectly planned, causing query errors. [#41014](https://github.com/apache/doris/pull/41014) + +### Query Execution + +- Fixed the memory leak issue caused by the use of runtime filters. [#39155](https://github.com/apache/doris/pull/39155) +- Fixed the issue of excessive memory usage by window functions. [#39581](https://github.com/apache/doris/pull/39581) +- Fixed a series of function compatibility issues during rolling upgrades. 
[#41023](https://github.com/apache/doris/pull/41023) [#40438](https://github.com/apache/doris/pull/40438) [#39648](https://github.com/apache/doris/pull/39648) +- Fixed the issue of incorrect results with `encryption_function` when used with constants. [#40201](https://github.com/apache/doris/pull/40201) +- Fixed the issue of errors when importing single-table materialized views. [#39061](https://github.com/apache/doris/pull/39061) +- Fixed the issue of incorrect partition result calculations for window functions. [#39100](https://github.com/apache/doris/pull/39100) [#40761](https://github.com/apache/doris/pull/40761) +- Fixed the issue of incorrect calculations for topn when null values are present. [#39497](https://github.com/apache/doris/pull/39497) +- Fixed the issue of incorrect results with the `map_agg` function. [#39743](https://github.com/apache/doris/pull/39743) +- Fixed the issue of incorrect messages returned by cancel. [#38982](https://github.com/apache/doris/pull/38982) +- Fixed the issue of BE core dumps caused by encrypt and decrypt functions. [#40726](https://github.com/apache/doris/pull/40726) +- Fixed the issue of queries getting stuck due to too many scanners in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Supported time types in runtime filters. [#38258](https://github.com/apache/doris/pull/38258) +- Fixed the issue of incorrect results with window funnel functions. [#40960](https://github.com/apache/doris/pull/40960) + +### Semi-Structured Data Management + +- Fixed the issue of match function errors when no indexes were present. [#38989](https://github.com/apache/doris/pull/38989) +- Fixed the issue of crashes when ARRAY data types were used as parameters for array_min/array_max functions. [#39492](https://github.com/apache/doris/pull/39492) +- Fixed the issue of nullable with the `array_enumerate_uniq` function. 
[#38384](https://github.com/apache/doris/pull/38384) +- Fixed the issue of bloomfilter indexes not being updated when adding or deleting columns. [#38431](https://github.com/apache/doris/pull/38431) +- Fixed the issue of es-catalog parsing exceptions with array data. [#39104](https://github.com/apache/doris/pull/39104) +- Fixed the issue of improper predicate push-down in es-catalog. [#40111](https://github.com/apache/doris/pull/40111) +- Fixed the issue of exceptions caused by modifying input data with `map()` and `struct()` functions. [#39699](https://github.com/apache/doris/pull/39699) +- Fixed the issue of index compaction crashes in special cases. [#40294](https://github.com/apache/doris/pull/40294) +- Fixed the issue of ARRAY type inverted indexes missing nullbitmaps. [#38907](https://github.com/apache/doris/pull/38907) +- Fixed the issue of incorrect results with the `count()` function on inverted indexes. [#41152](https://github.com/apache/doris/pull/41152) +- Fixed the issue of incorrect results with the `explode_map` function when using aliases. [#39757](https://github.com/apache/doris/pull/39757) +- Fixed the issue of VARIANT type not being able to use row storage for exceptional JSON data. [#39394](https://github.com/apache/doris/pull/39394) +- Fixed the issue of memory leaks when returning ARRAY results with VARIANT type. [#41358](https://github.com/apache/doris/pull/41358) +- Fixed the issue of changing column names with VARIANT type. [#40320](https://github.com/apache/doris/pull/40320) +- Fixed the issue of potential precision loss when converting VARIANT type to DECIMAL type. [#39650](https://github.com/apache/doris/pull/39650) +- Fixed the issue of nullable handling with VARIANT type. [#39732](https://github.com/apache/doris/pull/39732) +- Fixed the issue of sparse column reading with VARIANT type. [#40295](https://github.com/apache/doris/pull/40295) + +### Other + +- Fixed the compatibility issue between new and old audit log plugins. 
[#41401](https://github.com/apache/doris/pull/41401) +- Fixed the issue where users could see processes of others in certain cases. [#39747](https://github.com/apache/doris/pull/39747) +- Fixed the issue where users with permissions could not export. [#38365](https://github.com/apache/doris/pull/38365) +- Fixed the issue where create table like required create permissions for the existing table. [#37879](https://github.com/apache/doris/pull/37879) +- Fixed the issue where some features did not verify permissions. [#39726](https://github.com/apache/doris/pull/39726) +- Fixed the issue of not correctly closing connections when using SSL. [#38587](https://github.com/apache/doris/pull/38587) +- Fixed the issue where executing ALTER VIEW operations in some cases caused FE to fail to start. [#40872](https://github.com/apache/doris/pull/40872) \ No newline at end of file diff --git a/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md new file mode 100644 index 0000000000000..b15777212b400 --- /dev/null +++ b/versioned_docs/version-2.0/releasenotes/v3.0/release-3.0.3.md @@ -0,0 +1,226 @@ +--- +{ + "title": "Release 3.0.3", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.3 version was officially released on December 02, 2024, this version further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +- Prohibited column updates on MOW tables with synchronous materialized views. [#40190](https://github.com/apache/doris/pull/40190) +- Adjusted the default parameters of RoutineLoad to improve import efficiency. [#42968](https://github.com/apache/doris/pull/42968) +- When StreamLoad fails, the return value of LoadedRows is adjusted to 0. 
[#41946](https://github.com/apache/doris/pull/41946) [#42291](https://github.com/apache/doris/pull/42291) +- Adjusted the default memory limit of Segment cache to 5%. [#42308](https://github.com/apache/doris/pull/42308) [#42436](https://github.com/apache/doris/pull/42436) + +## New Features + +- Introduced the session variable `enable_cooldown_replica_affinity` to control the affinity of cold and hot tiered replicas. [#42677](https://github.com/apache/doris/pull/42677) + +- Added `table$partition` syntax for querying partition information of Hive tables. [#40774](https://github.com/apache/doris/pull/40774) + + - [View Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/hive) + +- Supported creation of Hive tables in Text format. [#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) + + - [View Documentation](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + +### Asynchronous Materialized Views + +- Introduced new materialized view attribute `use_for_rewrite`. When `use_for_rewrite` is set to false, the materialized view does not participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Optimizer + +- Supported correlated non-aggregate subqueries. [#42236](https://github.com/apache/doris/pull/42236) + +### Query Execution + +- Added functions `ngram_search`, `normal_cdf`, `to_iso8601`, `from_iso8601_date`, `SESSION_USER()`, `last_query_id`. [#38226](https://github.com/apache/doris/pull/38226) [#40695](https://github.com/apache/doris/pull/40695) [#41075](https://github.com/apache/doris/pull/41075) [#41600](https://github.com/apache/doris/pull/41600) [#39575](https://github.com/apache/doris/pull/39575) [#40739](https://github.com/apache/doris/pull/40739) +- The `aes_encrypt` and `aes_decrypt` functions support GCM mode. 
[#40004](https://github.com/apache/doris/pull/40004) +- Profile outputs the changed session variable values. [#41016](https://github.com/apache/doris/pull/41016) [#41318](https://github.com/apache/doris/pull/41318) + +### Semi-structured Data Management + +- Added array functions `array_match_all` and `array_match_any`. [#40605](https://github.com/apache/doris/pull/40605) [#43514](https://github.com/apache/doris/pull/43514) +- The array function `array_agg` supports nesting ARRAY/MAP/STRUCT within ARRAY. [#42009](https://github.com/apache/doris/pull/42009) +- Added approximate aggregate statistical functions `approx_top_k` and `approx_top_sum`. [#44082](https://github.com/apache/doris/pull/44082) + +## Improvements + +### Storage + +- Supported `bitmap_empty` as the default value. [#40364](https://github.com/apache/doris/pull/40364) +- Introduced the session variable `insert_timeout` to control the timeout of DELETE statements. [#41063](https://github.com/apache/doris/pull/41063) +- Improved some error message prompts. [#41048](https://github.com/apache/doris/pull/41048) [#39631](https://github.com/apache/doris/pull/39631) +- Improved the priority scheduling of replica repair. [#41076](https://github.com/apache/doris/pull/41076) +- Enhanced the robustness of timezone handling when creating tables. [#41926](https://github.com/apache/doris/pull/41926) [#42389](https://github.com/apache/doris/pull/42389) +- Checked the validity of partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Supported Unicode-encoded column names in DELETE operations. [#39381](https://github.com/apache/doris/pull/39381) + +### Compute-Storage Decoupled + +- Supported ARM architecture deployment in storage and compute separation mode. 
[#42467](https://github.com/apache/doris/pull/42467) [#43377](https://github.com/apache/doris/pull/43377) +- Optimized the eviction strategy and lock competition of file cache, improving hit rate and high concurrency point query performance. [#42451](https://github.com/apache/doris/pull/42451) [#43201](https://github.com/apache/doris/pull/43201) [#41818](https://github.com/apache/doris/pull/41818) [#43401](https://github.com/apache/doris/pull/43401) +- S3 storage vault supported `use_path_style`, solving the problem of using custom domain names for object storage. [#43060](https://github.com/apache/doris/pull/43060) [#43343](https://github.com/apache/doris/pull/43343) [#43330](https://github.com/apache/doris/pull/43330) +- Optimized storage and compute separation configuration and deployment, preventing misoperations in different modes. [#43381](https://github.com/apache/doris/pull/43381) [#43522](https://github.com/apache/doris/pull/43522) [#43434](https://github.com/apache/doris/pull/43434) [#40764](https://github.com/apache/doris/pull/40764) [#43891](https://github.com/apache/doris/pull/43891) +- Optimized observability and provided an interface for deleting specified segment file cache. [#38489](https://github.com/apache/doris/pull/38489) [#42896](https://github.com/apache/doris/pull/42896) [#41037](https://github.com/apache/doris/pull/41037) [#43412](https://github.com/apache/doris/pull/43412) +- Optimized Meta-service operation and maintenance interface: RPC rate limiting and tablet metadata correction. [#42413](https://github.com/apache/doris/pull/42413) [#43884](https://github.com/apache/doris/pull/43884) [#41782](https://github.com/apache/doris/pull/41782) [#43460](https://github.com/apache/doris/pull/43460) + +### Lakehouse + +- Paimon Catalog supported Alibaba Cloud DLF and OSS-HDFS storage. 
[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) + + - View [Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/paimon) + +- Supported reading of Hive tables in OpenCSV format. [#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) +- Optimized the performance of accessing the `information_schema.columns` table in External Catalog. [#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) +- Used the new Max Compute open storage API to access Max Compute data sources. [#41614](https://github.com/apache/doris/pull/41614) +- Optimized the scheduling policy of the JNI part of Paimon tables, making scan tasks more balanced. [#43310](https://github.com/apache/doris/pull/43310) +- Optimized the read performance of small ORC files. [#42004](https://github.com/apache/doris/pull/42004) [#43467](https://github.com/apache/doris/pull/43467) +- Supported reading of parquet files in brotli compressed format. [#42177](https://github.com/apache/doris/pull/42177) +- Added `file_cache_statistics` table under the `information_schema` library to view metadata cache statistics. [#42160](https://github.com/apache/doris/pull/42160) + +### Query Optimizer + +- Optimization: When queries only differ in comments, the same SQL Cache can be reused. [#40049](https://github.com/apache/doris/pull/40049) +- Optimization: Improved the stability of statistical information when data is frequently updated. [#43865](https://github.com/apache/doris/pull/43865) [#39788](https://github.com/apache/doris/pull/39788) [#43009](https://github.com/apache/doris/pull/43009) [#40457](https://github.com/apache/doris/pull/40457) [#42409](https://github.com/apache/doris/pull/42409) [#41894](https://github.com/apache/doris/pull/41894) +- Optimization: Enhanced the stability of constant folding. 
[#42910](https://github.com/apache/doris/pull/42910) [#41164](https://github.com/apache/doris/pull/41164) [#39723](https://github.com/apache/doris/pull/39723) [#41394](https://github.com/apache/doris/pull/41394) [#42256](https://github.com/apache/doris/pull/42256) [#40441](https://github.com/apache/doris/pull/40441) +- Optimization: Column pruning can generate better execution plans. [#41719](https://github.com/apache/doris/pull/41719) [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Optimized the memory usage of the sort operator. [#39306](https://github.com/apache/doris/pull/39306) +- Optimized the performance of computations on ARM. [#38888](https://github.com/apache/doris/pull/38888) [#38759](https://github.com/apache/doris/pull/38759) +- Optimized the computational performance of a series of functions. [#40366](https://github.com/apache/doris/pull/40366) [#40821](https://github.com/apache/doris/pull/40821) [#40670](https://github.com/apache/doris/pull/40670) [#41206](https://github.com/apache/doris/pull/41206) [#40162](https://github.com/apache/doris/pull/40162) +- Used SSE instructions to optimize the performance of the `match_ipv6_subnet` function. [#38755](https://github.com/apache/doris/pull/38755) +- Supported automatic creation of new partitions during insert overwrite. [#38628](https://github.com/apache/doris/pull/38628) [#42645](https://github.com/apache/doris/pull/42645) +- Added the status of each PipelineTask in Profile. [#42981](https://github.com/apache/doris/pull/42981) +- IP type supported runtime filter. [#39985](https://github.com/apache/doris/pull/39985) + +### Semi-structured Data Management + +- Output the real SQL of prepared statements in audit logs. [#43321](https://github.com/apache/doris/pull/43321) +- The filebeat doris output plugin supports fault tolerance and progress reporting. [#36355](https://github.com/apache/doris/pull/36355) +- Optimized the performance of inverted index queries. 
[#41547](https://github.com/apache/doris/pull/41547) [#41585](https://github.com/apache/doris/pull/41585) [#41567](https://github.com/apache/doris/pull/41567) [#41577](https://github.com/apache/doris/pull/41577) [#42060](https://github.com/apache/doris/pull/42060) [#42372](https://github.com/apache/doris/pull/42372) +- The array function `array overlaps` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- The IP function `is_ip_address_in_range` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- Optimized the CAST performance of the VARIANT data type. [#41775](https://github.com/apache/doris/pull/41775) [#42438](https://github.com/apache/doris/pull/42438) [#43320](https://github.com/apache/doris/pull/43320) +- Optimized the CPU resource consumption of the Variant data type. [#42856](https://github.com/apache/doris/pull/42856) [#43062](https://github.com/apache/doris/pull/43062) [#43634](https://github.com/apache/doris/pull/43634) +- Optimized the metadata and execution memory resource consumption of the Variant data type. [#42448](https://github.com/apache/doris/pull/42448) [#43326](https://github.com/apache/doris/pull/43326) [#41482](https://github.com/apache/doris/pull/41482) [#43093](https://github.com/apache/doris/pull/43093) [#43567](https://github.com/apache/doris/pull/43567) [#43620](https://github.com/apache/doris/pull/43620) + +### Permissions + +- Added a new configuration item `ldap_group_filter` in LDAP for custom group filtering. [#43292](https://github.com/apache/doris/pull/43292) + +### Other + +- Supported displaying connection count information by user in FE monitoring items. [#39200](https://github.com/apache/doris/pull/39200) + +## Bug Fixes + +### Storage + +- Fixed the issue with using IPv6 hostnames. [#40074](https://github.com/apache/doris/pull/40074) +- Fixed the inaccurate display of broker/s3 load progress. 
[#43535](https://github.com/apache/doris/pull/43535) +- Fixed the issue where queries might hang from FE. [#41303](https://github.com/apache/doris/pull/41303) [#42382](https://github.com/apache/doris/pull/42382) +- Fixed the issue of duplicate auto-increment IDs under exceptional circumstances. [#43774](https://github.com/apache/doris/pull/43774) [#43983](https://github.com/apache/doris/pull/43983) +- Fixed occasional NPE issues with groupcommit. [#43635](https://github.com/apache/doris/pull/43635) +- Fixed the inaccurate calculation of auto bucket. [#41675](https://github.com/apache/doris/pull/41675) [#41835](https://github.com/apache/doris/pull/41835) +- Fixed the issue where FE might not correctly plan multi-table flows after restart. [#41677](https://github.com/apache/doris/pull/41677) [#42290](https://github.com/apache/doris/pull/42290) + +### Compute-Storage Decoupled + +- Fixed the issue that MOW primary key tables with large delete bitmaps might cause coredump. [#43088](https://github.com/apache/doris/pull/43088) [#43457](https://github.com/apache/doris/pull/43457) [#43479](https://github.com/apache/doris/pull/43479) [#43407](https://github.com/apache/doris/pull/43407) [#43297](https://github.com/apache/doris/pull/43297) [#43613](https://github.com/apache/doris/pull/43613) [#43615](https://github.com/apache/doris/pull/43615) [#43854](https://github.com/apache/doris/pull/43854) [#43968](https://github.com/apache/doris/pull/43968) [#44074](https://github.com/apache/doris/pull/44074) [#41793](https://github.com/apache/doris/pull/41793) [#42142](https://github.com/apache/doris/pull/42142) +- Fixed the issue that segment files, when being a multiple of 5MB, would fail to upload objects. [#43254](https://github.com/apache/doris/pull/43254) +- Fixed the issue that the default retry policy of aws sdk did not take effect. 
[#43575](https://github.com/apache/doris/pull/43575) [#43648](https://github.com/apache/doris/pull/43648) +- Fixed the issue that altering storage vault could continue execution even when the wrong type was specified. [#43489](https://github.com/apache/doris/pull/43489) [#43352](https://github.com/apache/doris/pull/43352) [#43495](https://github.com/apache/doris/pull/43495) +- Fixed the issue that tablet_id might be 0 during the delayed commit process of large transactions. [#42043](https://github.com/apache/doris/pull/42043) [#42905](https://github.com/apache/doris/pull/42905) +- Fixed the issue that constant folding RPC and FE forwarding SQL might not be executed in the expected computation group. [#43110](https://github.com/apache/doris/pull/43110) [#41819](https://github.com/apache/doris/pull/41819) [#41846](https://github.com/apache/doris/pull/41846) +- Fixed the issue that meta-service did not strictly check instance_id upon receiving RPC. [#43253](https://github.com/apache/doris/pull/43253) [#43832](https://github.com/apache/doris/pull/43832) +- Fixed the issue that FE follower information_schema version did not update in time. [#43496](https://github.com/apache/doris/pull/43496) +- Fixed the issue of atomicity in file cache rename and inaccurate metrics. [#42869](https://github.com/apache/doris/pull/42869) [#43504](https://github.com/apache/doris/pull/43504) [#43220](https://github.com/apache/doris/pull/43220) + +### Lakehouse + +- Prohibited implicit conversion predicates from being pushed down to JDBC data sources to avoid inconsistent query results. [#42102](https://github.com/apache/doris/pull/42102) +- Fixed some read issues with high-version Hive transactional tables. [#42226](https://github.com/apache/doris/pull/42226) +- Fixed the issue that the Export command might cause deadlocks. 
[#43083](https://github.com/apache/doris/pull/43083) [#43402](https://github.com/apache/doris/pull/43402) +- Fixed the issue of being unable to query Hive views created by Spark. [#43552](https://github.com/apache/doris/pull/43552) +- Fixed the issue that Hive partition paths containing special characters led to incorrect partition pruning. [#42906](https://github.com/apache/doris/pull/42906) +- Fixed the issue that Iceberg Catalog could not use AWS Glue. [#41084](https://github.com/apache/doris/pull/41084) + +### Asynchronous Materialized Views + +- Fixed the issue that asynchronous materialized views might not refresh after the base table is rebuilt. [#41762](https://github.com/apache/doris/pull/41762) + +### Query Optimizer + +- Fixed the issue that partition pruning results might be incorrect when using multi-column range partitioning. [#43332](https://github.com/apache/doris/pull/43332) +- Fixed the issue of incorrect calculation results in some limit offset scenarios. [#42576](https://github.com/apache/doris/pull/42576) + +### Query Execution + +- Fixed the issue that hash join with array types larger than 4G could cause BE Core. [#43861](https://github.com/apache/doris/pull/43861) +- Fixed the issue that is null predicate operations might yield incorrect results in some scenarios. [#43619](https://github.com/apache/doris/pull/43619) +- Fixed the issue that bitmap types might produce incorrect output results in hash join. [#43718](https://github.com/apache/doris/pull/43718) +- Fixed some issues where function results were calculated incorrectly. 
[#40710](https://github.com/apache/doris/pull/40710) [#39358](https://github.com/apache/doris/pull/39358) [#40929](https://github.com/apache/doris/pull/40929) [#40869](https://github.com/apache/doris/pull/40869) [#40285](https://github.com/apache/doris/pull/40285) [#39891](https://github.com/apache/doris/pull/39891) [#40530](https://github.com/apache/doris/pull/40530) [#41948](https://github.com/apache/doris/pull/41948) [#43588](https://github.com/apache/doris/pull/43588) +- Fixed some issues with JSON type parsing. [#39937](https://github.com/apache/doris/pull/39937) +- Fixed issues with varchar and char types in runtime filter operations. [#43758](https://github.com/apache/doris/pull/43758) [#43919](https://github.com/apache/doris/pull/43919) +- Fixed some issues with the use of decimal256 in scalar and aggregate functions. [#42136](https://github.com/apache/doris/pull/42136) [#42356](https://github.com/apache/doris/pull/42356) +- Fixed the issue that arrow flight reported `Reach limit of connections` errors upon connection. [#39127](https://github.com/apache/doris/pull/39127) +- Fixed the issue of incorrect memory usage statistics for BE in k8s environments. [#41123](https://github.com/apache/doris/pull/41123) + +### Semi-structured Data Management + +- Adjusted the default values of `segment_cache_fd_percentage` and `inverted_index_fd_number_limit_percent`. [#42224](https://github.com/apache/doris/pull/42224) +- logstash now supports group_commit. [#40450](https://github.com/apache/doris/pull/40450) +- Fixed the issue of coredump when building index. [#43246](https://github.com/apache/doris/pull/43246) [#43298](https://github.com/apache/doris/pull/43298) +- Fixed issues with variant index. [#43375](https://github.com/apache/doris/pull/43375) [#43773](https://github.com/apache/doris/pull/43773) +- Fixed potential fd and memory leaks under abnormal compaction circumstances. 
[#42374](https://github.com/apache/doris/pull/42374) +- Inverted index match null now correctly returns null instead of false. [#41786](https://github.com/apache/doris/pull/41786) +- Fixed the issue of coredump when ngram bloomfilter index bf_size is set to 65536. [#43645](https://github.com/apache/doris/pull/43645) +- Fixed the issue of potential coredump during complex data type JOINs. [#40398](https://github.com/apache/doris/pull/40398) +- Fixed the issue of coredump with TVF JSON data. [#43187](https://github.com/apache/doris/pull/43187) +- Fixed the precision issue of bloom filter calculations for dates and times. [#43612](https://github.com/apache/doris/pull/43612) +- Fixed the issue of coredump with IPv6 type storage. [#43251](https://github.com/apache/doris/pull/43251) +- Fixed the issue of coredump when using VARIANT type with light_schema_change disabled. [#40908](https://github.com/apache/doris/pull/40908) +- Improved cache performance for high-concurrency point queries. [#44077](https://github.com/apache/doris/pull/44077) +- Fixed the issue that bloom filter indexes were not synchronized when columns were deleted. [#43378](https://github.com/apache/doris/pull/43378) +- Fixed instability issues with es catalog under special circumstances such as mixed array and scalar data. [#40314](https://github.com/apache/doris/pull/40314) [#40385](https://github.com/apache/doris/pull/40385) [#43399](https://github.com/apache/doris/pull/43399) [#40614](https://github.com/apache/doris/pull/40614) +- Fixed coredump issues caused by abnormal regular pattern matching. [#43394](https://github.com/apache/doris/pull/43394) + +### Permissions + +- Fixed several issues where permissions were not properly restricted after authorization. 
[#43193](https://github.com/apache/doris/pull/43193) [#41723](https://github.com/apache/doris/pull/41723) [#42107](https://github.com/apache/doris/pull/42107) [#43306](https://github.com/apache/doris/pull/43306) +- Enhanced several permission checks. [#40688](https://github.com/apache/doris/pull/40688) [#40533](https://github.com/apache/doris/pull/40533) [#41791](https://github.com/apache/doris/pull/41791) [#42106](https://github.com/apache/doris/pull/42106) + +### Other + +- Supplemented missing audit log fields in audit log tables and files. [#43303](https://github.com/apache/doris/pull/43303) + + - [View Documentation](https://doris.apache.org/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we realized the full vectorization of the computing layer and storage layer, and officially enabled the vectorized execution engine as a stable function. All queries are executed by the vectorized execution engine by default, and the performance is 3-5 times higher than the previous version. It increases the ability to access the external tables of Apache Iceberg and supports federated query of data in Doris and Iceberg, and expands the analysis capabilities of Apache Doris on the data lake; on the basis of the original LZ4, the ZSTD compression algorithm is added , further improves the data compression rate; fixed many performance and stability problems in previous versions, greatly improving system stability. Downloading and using is recommended. 
+ +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users had to enable it manually when executing queries by setting the session variables `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, we officially enabled the vectorized execution engine as a stable feature. The session variable `enable_vectorized_engine` is set to true by default, so all queries are executed by the vectorized execution engine by default. + +### BE Binary File Renaming + +The BE binary file has been renamed from palo_be to doris_be. If you rely on process names for cluster management and other operations, remember to update the relevant scripts. + +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while compatibility with the Segment V1 format is maintained. + +To keep the code structure maintainable and reduce the additional learning and development costs caused by redundant historical code, we have decided to stop supporting the Segment V1 storage format from the next version. This part of the code is expected to be deleted in Apache Doris 1.2. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website. 
+ +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table you can set `DISTRIBUTED BY random BUCKETS number` to use random distribution. During import, the data is randomly written to a single tablet, which reduces data fanout during loading, lowers resource overhead, and improves system stability. + +### Support for creating Iceberg external tables [experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. Through Iceberg external tables, federated queries on data stored in local storage and Iceberg can be implemented, which saves tedious data loading work, simplifies the system architecture for data analysis, and enables more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying data, and supports automatic synchronization of all table schemas in the Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. For some scenarios that are sensitive to data storage costs, the original data compression ratio requirements cannot be met. + +In version 1.1, users can set "compression"="zstd" in the table properties to specify the compression method as ZSTD when creating a table. 
In the 25GB 110 million lines of text log test data, the highest compression rate is nearly 10 times, which is 53% higher than the original compression rate, and the speed of reading data from disk and decompressing it is increased by 30%. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +Implemented vectorization of all built-in functions + +The storage layer implements vectorization and supports dictionary optimization for low-cardinality string columns + +Optimized and resolved numerous performance and stability issues with the vectorization engine. + +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 SQLs in the SSB test data set, version 1.1 is better than version 0.15, and the overall performance is improved by about 3 times, which solves the problem of performance degradation in some scenarios in version 1.0; + +On all 22 SQLs in the TPC-H test data set, version 1.1 is better than version 0.15, the overall performance is improved by about 4.5 times, and the performance of some scenarios is improved by more than ten times; + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit will generate a data version. In high concurrent write scenarios, -235 errors are prone to occur due to too many data versions and untimely compaction, and query performance will also decrease accordingly. + +In version 1.1, we introduced QuickCompaction, which will actively trigger compaction when the data version increases. At the same time, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Through active triggering and passive scanning, the real-time problem of data merging is completely solved. + +At the same time, for high-frequency small file cumulative compaction, the scheduling and isolation of compaction tasks is implemented to prevent the heavyweight base compaction from affecting the merging of new data. + +Finally, for the merging of small files, the strategy of merging small files is optimized, and the method of gradient merging is adopted. Each time the files participating in the merging belong to the same data magnitude, it prevents versions with large differences in size from merging, and gradually merges hierarchically. , reducing the number of times a single file participates in merging, which can greatly save the CPU consumption of the system. + +When the data upstream maintains a write frequency of 10w per second (20 concurrent write tasks, 5000 rows per job, and checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable. 
Compared with the -235 problem that frequently occurred during high concurrent writing in the previous version, the compaction merge efficiency has been improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The strategy has been optimized for small file Compaction. In the above scenario of high concurrent writing, CPU resource consumption is reduced by 25%; + +- Stable query time consumption: The overall orderliness of data is improved, and the fluctuation of query time consumption is greatly reduced. The query time consumption during high concurrent writing is the same as that of only querying, and the query performance is improved by 3-4 times compared with the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting arrow parameters, arrow's multi-threaded read capability is used to speed up Arrow's reading of each row_group, and it is modified to SPSC model to reduce the cost of waiting for the network through prefetching. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after the metadata checkpoint and retaining the function of historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that the data cannot be queried due to the missing data version.(Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that the resource isolation is invalid for the resource usage limit of loading tasks (Moderate) + +In 1.1, the broker load and routine load will use Backends with specified resource tags to do the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +it may cause data transmission errors. 
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default; please use the `/_stream_load` interface instead. +You can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format is no longer allowed to be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster, and convert it to SegmentV2 through the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +In version 1.1, the maximum length of the String type is limited to 1MB. Strings longer than this can no longer be written. +At the same time, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Update the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems during use, please feel free to contact us through the GitHub discussion forum or the dev mailing list at any time.
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature is available in the non-vectorized engine but was missing from the vectorized engine in 1.1, so it is added back in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +There is no memtracker in BE for the vectorized engine in 1.1, so memory usage is uncontrolled and can cause OOM. In 1.1.1, a simple memtracker is added to BE; it controls memory usage and cancels the query when the memory limit is exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is compressed using bitshuffle, and decompressing it during a query takes a lot of time. In 1.1.1, Doris caches the decompressed bitshuffle-encoded data in the page cache to accelerate queries, and we find it can reduce latency by 30% for some queries in ssb-flat. + +## Bug Fix + +### Fix the problem that rolling upgrade from 1.0 could not be done. (Serious) + +This issue was introduced in version 1.1 and may cause a BE core dump when BE is upgraded but FE is not. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some queries do not fall back to the non-vectorized engine and cause a BE core dump. + +Currently, the vectorized engine cannot handle all SQL queries, and some queries (like left outer join) run on the non-vectorized engine instead. Some cases were not covered in 1.1, which causes BE to crash. + +### Compaction does not work correctly and causes -235 error. + +When one rowset has multiple segments in a unique key compaction, the segments' rows are merged in generic_iterator but merged_rows is not increased. Compaction then fails in check_correctness, leaving the tablet with too many versions, which leads to the -235 load error. + +### Some segmentation fault cases during query.
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 170 issues or performance improvement since 1.1.1. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + +# Features + +### New MemTracker + +Introduced new MemTracker for both vectorized engine and non-vectorized engine which is more accurate. + +### Add API for showing current queries and kill query + +### Support read/write emoji of UTF16 via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC File scan performance about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when query Iceberg table. + +- Support query Hive data created by [Apache Tez](https://tez.apache.org/) + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add reuse block in non-vectorized engine and have 50% performance improvement in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve like or regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +It will have 40% performance gains in load or query. + +Currently it is a config, you can change it by set config `tc_enable_aggressive_memory_decommit`. 
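As a minimal sketch of the setting above: assuming `tc_enable_aggressive_memory_decommit` is a boolean BE configuration item that is set in be.conf like other BE options (the file and the value syntax are assumptions, they are not stated in this release note), the previous aggressive-decommit behavior could be restored with:

```
tc_enable_aggressive_memory_decommit=true
```

Leaving the item unset keeps the new default described above, where aggressive decommit is disabled.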
+ +# Bug Fix + +### Some issues about FE that will cause FE failure or data corruption. + +- Add reserved disk config to avoid too many reserved BDB-JE files. **(Serious)** In an HA environment, BDB-JE retains as many reserved files as possible. The BDB-JE log files are not deleted until a disk limit is approached. + +- Fix a fatal bug in BDB-JE that prevents an FE replica from starting correctly or corrupts data. **(Serious)** + +### FE will hang on waitFor_rpc during query and BE will hang in highly concurrent scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong result. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause a BE core dump or leave BE in an abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are worried about the impact of this change on your business after upgrading, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* `disable_storage_page_cache`: Whether to disable Storage Page Cache. In version 1.1.2 and earlier, the default is false, i.e., on. In version 1.1.3, the default is true, i.e., off. +* `chunk_reserved_bytes_limit`: Chunk allocator reserved memory size. In version 1.1.2 and earlier, the default is 10% of the overall memory. In version 1.1.3, the default is 209715200 (200MB). + diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 60 issues or made performance improvements since 1.1.3. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- SparkLoad supports parquet and orc files. [#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load. [#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- The if function returns a wrong result when there is a nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates.
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crashes when calling function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name are not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column when using MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader releases memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix the issue that creating an MV using to_bitmap() on negative value columns is allowed when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after substitution using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump when a case when expression has 1000 conditions. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that the last line of data is lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 36 issues or made performance improvements since 1.1.4. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release.
+ +# Behavior Changes + +When an alias is the same as the original column name, as in "select year(birthday) as birthday", and is used in a group by, order by, or having clause, Doris previously behaved differently from MySQL. In this release, Doris follows MySQL's behavior: group by and having clauses resolve the name to the original column first, while order by resolves it to the alias first. Because this can be confusing, we advise against using an alias that is the same as the original column name. + +# Features + +Add support of murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort result by tablename when calling show clause. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix coredump when there is an if constant expr in select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update high_priority_flush_thread_num_per_store default value to 6, which improves the load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +When the partition column is not a duplicate key, spark load throws an IndexOutOfBounds error. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when having sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Use avg rowset to calculate batch size instead of total_bytes, since the latter costs a lot of CPU. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc.
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md new file mode 100644 index 0000000000000..2529ce7e58aa2 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.0.md @@ -0,0 +1,563 @@ +--- +{ + "title": "Release 1.2.0", + "language": "en" +} +--- + + + + + +# Feature +## Highlight + +1. Full Vectorized-Engine support, greatly improved performance + + In the standard ssb-100-flat benchmark, the performance of 1.2 is 2 times faster than that of 1.1; in the complex TPCH 100 benchmark, the performance of 1.2 is 3 times faster than that of 1.1. + +2. Merge-on-Write Unique Key + + Support Merge-On-Write on Unique Key Model. This mode marks the data that needs to be deleted or updated when the data is written, thereby avoiding the overhead of Merge-On-Read when querying, and greatly improving the reading efficiency on the updateable data model. + +3. Multi Catalog + + The multi-catalog feature provides Doris with the ability to quickly connect to and access external data sources. Users can connect to external data sources through the `CREATE CATALOG` command. Doris will automatically map the database and table information of external data sources. After that, users can access the data in these external data sources just like accessing ordinary tables. This avoids the complicated operation of manually establishing an external mapping for each table. + + Currently this feature supports the following data sources: + + 1. Hive Metastore: You can access data tables including Hive, Iceberg, and Hudi. It can also be connected to data sources compatible with Hive Metastore, such as Alibaba Cloud's DataLake Formation. Supports data access on both HDFS and object storage. + 2.
Elasticsearch: Access ES data sources. + 3. JDBC: Access MySQL through the JDBC protocol. + + Documentation: https://doris.apache.org//docs/dev/lakehouse/multi-catalog + + > Note: The corresponding permission level will also be changed automatically, see the "Upgrade Notice" section for details. + +4. Light table structure changes + +In the new version, adding or dropping columns on a table no longer requires changing the data files synchronously; only the metadata in FE needs to be updated, which makes Schema Change a millisecond-level operation. This makes it possible to synchronize DDL from upstream CDC data. For example, users can use Flink CDC to synchronize both DML and DDL from an upstream database to Doris. + +Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +When creating a table, set `"light_schema_change"="true"` in properties. + +5. JDBC external table + + Users can connect to external data sources through JDBC. Currently supported: + + - MySQL + - PostgreSQL + - Oracle + - SQL Server + - Clickhouse + + Documentation: [https://doris.apache.org/en/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/) + + > Note: The ODBC feature will be removed in a later version, please try to switch to JDBC. + +6. JAVA UDF + + Supports writing UDF/UDAF in Java, which is convenient for users to use custom functions in the Java ecosystem. At the same time, through technologies such as off-heap memory and Zero Copy, the efficiency of cross-language data access has been greatly improved. + + Documentation: https://doris.apache.org//docs/dev/ecosystem/udf/java-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +7.
Remote UDF + + Supports accessing remote user-defined function services through RPC, thus completely eliminating language restrictions for users to write UDFs. Users can use any programming language to implement custom functions to complete complex data analysis work. + + Documentation: https://doris.apache.org//docs/ecosystem/udf/remote-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +8. More data types support + + - Array type + + Array types are supported, including nested array types. In scenarios such as user profiles and tags, the Array type can better fit business needs. At the same time, in the new version, we have also implemented a large number of array-related functions to better support the use of this data type in actual scenarios. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/ARRAY + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/array-functions/array_max + + - Jsonb type + + Support binary Json data type: Jsonb. This type provides a more compact json encoding format, and at the same time provides data access in the encoding format. Compared with json data stored in strings, performance can be improved several times over. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/JSONB + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/json-functions/jsonb_parse + + - Date V2 + + Scope of impact: + + 1. The user needs to specify datev2 and datetimev2 when creating the table, and the date and datetime of the original table will not be affected. + 2. When datev2 and datetimev2 are calculated with the original date and datetime (for example, in an equi-join), the original type will be cast into the new type for calculation. + 3.
The example is in the documentation + + Documentation: https://doris.apache.org/docs/1.2/sql-manual/sql-reference/Data-Types/DATEV2 + + +## More + +1. A new memory management framework + + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/memory-management/memory-tracker + +2. Table Valued Function + + Doris implements a set of Table Valued Function (TVF). TVF can be regarded as an ordinary table, which can appear in all places where "table" can appear in SQL. + + For example, we can use S3 TVF to implement data import on object storage: + + ``` + insert into tbl select * from s3("s3://bucket/file.*", "ak" = "xx", "sk" = "xxx") where c1 > 2; + ``` + + Or directly query data files on HDFS: + + ``` + insert into tbl select * from hdfs("hdfs://bucket/file.*") where c1 > 2; + ``` + + TVF can help users make full use of the rich expressiveness of SQL and flexibly process various data. + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/s3 + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/hdfs + +3. A more convenient way to create partitions + + Support for creating multiple partitions within a time range via the `FROM TO` command. + +4. Column renaming + + For tables with Light Schema Change enabled, column renaming is supported. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-RENAME + +5. Richer permission management + + - Support row-level permissions + + Row-level permissions can be created with the `CREATE ROW POLICY` command. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY + + - Support specifying password strength, expiration time, etc. + + - Support for locking accounts after multiple failed logins. 
+ + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Account-Management-Statements/ALTER-USER + +6. Import + + - CSV import supports csv files with header. + + Search for `csv_with_names` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD/ + + - Stream Load adds `hidden_columns`, which can explicitly specify the delete flag column and sequence column. + + Search for `hidden_columns` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD + + - Spark Load supports Parquet and ORC file import. + + - Support for cleaning completed imported Labels + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CLEAN-LABEL + + - Support batch cancellation of import jobs by status + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD + + - Added support for Alibaba Cloud oss, Tencent Cloud cos/chdfs and Huawei Cloud obs in broker load. + + Documentation: https://doris.apache.org//docs/dev/advanced/broker + + - Support access to hdfs through hive-site.xml file configuration. + + Documentation: https://doris.apache.org//docs/dev/admin-manual/config/config-dir + +7. Support viewing the contents of the catalog recycle bin through `SHOW CATALOG RECYCLE BIN` function. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Show-Statements/SHOW-CATALOG-RECYCLE-BIN + +8. Support `SELECT * EXCEPT` syntax. + + Documentation: https://doris.apache.org//docs/dev/data-table/basic-usage + +9. OUTFILE supports ORC format export. And supports multi-byte delimiters. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE + +10. Support to modify the number of Query Profiles that can be saved through configuration. 
+ + Document search FE configuration item: max_query_profile_num + +11. The DELETE statement supports IN predicate conditions. And it supports partition pruning. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Manipulation/DELETE + +12. The default value of the time column supports using `CURRENT_TIMESTAMP` + + Search for "CURRENT_TIMESTAMP" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +13. Add two system tables: backends, rowsets + + Documentation: + + https://doris.apache.org//docs/dev/admin-manual/system-table/backends + + https://doris.apache.org//docs/dev/admin-manual/system-table/rowsets + +14. Backup and restore + + - The Restore job supports the `reserve_replica` parameter, so that the number of replicas of the restored table is the same as that of the backup. + + - The Restore job supports `reserve_dynamic_partition_enable` parameter, so that the restored table keeps the dynamic partition enabled. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/RESTORE + + - Support backup and restore operations through the built-in libhdfs, no longer rely on broker. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY + +15. Support data balance between multiple disks on the same machine + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK + +16. Routine Load supports subscribing to Kerberos-authenticated Kafka services. 
+ + Search for kerberos in the documentation: https://doris.apache.org//docs/dev/data-operate/import/import-way/routine-load-manual + +17. New built-in-function + + Added the following built-in functions: + + - `cbrt` + - `sequence_match/sequence_count` + - `mask/mask_first_n/mask_last_n` + - `elt` + - `any/any_value` + - `group_bitmap_xor` + - `ntile` + - `nvl` + - `uuid` + - `initcap` + - `regexp_replace_one/regexp_extract_all` + - `multi_search_all_positions/multi_match_any` + - `domain/domain_without_www/protocol` + - `running_difference` + - `bitmap_hash64` + - `murmur_hash3_64` + - `to_monday` + - `not_null_or_empty` + - `window_funnel` + - `group_bit_and/group_bit_or/group_bit_xor` + - `outer combine` + - and all array functions + +# Upgrade Notice + +## Known Issues + +- Use JDK11 will cause BE crash, please use JDK8 instead. + +## Behavior Changed + +- Permission level changes + + Because the catalog level is introduced, the corresponding user permission level will also be changed automatically. The rules are as follows: + + - GlobalPrivs and ResourcePrivs remain unchanged + - Added CatalogPrivs level. + - The original DatabasePrivs level is added with the internal prefix (indicating the db in the internal catalog) + - Add the internal prefix to the original TablePrivs level (representing tbl in the internal catalog) + +- In GroupBy and Having clauses, match on column names in preference to aliases. (#14408) + +- Creating columns starting with `mv_` is no longer supported. `mv_` is a reserved keyword in materialized views (#14361) + +- Removed the default limit of 65535 rows added by the order by statement, and added the session variable `default_order_by_limit` to configure this limit. (#12478) + +- In the table generated by "Create Table As Select", all string columns use the string type uniformly, and no longer distinguish varchar/char/string (#14382) + +- In the audit log, remove the word `default_cluster` before the db and user names. 
(#13499) (#11408) + +- Add sql digest field in audit log (#8919) + +- The order by logic of the union clause has changed. In the new version, the order by clause is executed after the union is executed, unless explicitly associated by parentheses. (#9745) + +- During the decommission operation, tablets in the recycle bin are ignored to ensure that the decommission can be completed. (#14028) + +- The returned result of Decimal will be displayed according to the precision declared in the original column, or according to the precision specified in the cast function. (#13437) + +- Changed column name length limit from 64 to 256 (#14671) + +- Changes to FE configuration items + + - The `enable_vectorized_load` parameter is enabled by default. (#11833) + + - Increased `create_table_timeout` value. The default timeout for table creation operations will be increased. (#13520) + + - Modify `stream_load_default_timeout_second` default value to 3 days. + + - Modify the default value of `alter_table_timeout_second` to one month. + + - Add the parameter `max_replica_count_when_schema_change` to limit the number of replicas involved in the alter job; the default is 100000. (#12850) + + - Add `disable_iceberg_hudi_table`. The iceberg and hudi external tables are disabled by default, and the multi catalog function is recommended. (#13932) + +- Changes to BE configuration items + + - Removed `disable_stream_load_2pc` parameter. 2PC stream load can be used directly. (#13520) + + - Modify `tablet_rowset_stale_sweep_time_sec` from 1800 seconds to 300 seconds. + + - Redesigned the names of compaction-related configuration items (#13495) + + - Revisited memory optimization parameters (#13781) + +- Session variable changes + + - Modify the variable `enable_insert_strict` to true by default. As a result, some insert operations that previously succeeded while inserting illegal values will no longer be executed.
(#11866) + + - Modified variable `enable_local_exchange` to default to true (#13292) + + - Data is transmitted with lz4 compression by default, controlled by variable `fragment_transmission_compression_codec` (#11955) + + - Add `skip_storage_engine_merge` variable for debugging unique or agg model data (#11952) + + Documentation: https://doris.apache.org//docs/dev/advanced/variables + +- The BE startup script checks `/proc/sys/vm/max_map_count` and fails to start unless the value is greater than 2,000,000. (#11052) + +- Removed mini load interface (#10520) + +- FE Metadata Version + + FE Meta Version changed from 107 to 114, and cannot be rolled back after upgrading. + +## During Upgrade + +1. Upgrade preparation + + - Need to replace: lib, bin directory (start/stop scripts have been modified) + + - BE also needs JAVA_HOME to be configured, because it now supports JDBC Table and Java UDF. + + - The default JVM Xmx parameter in fe.conf is changed to 8GB. + +2. Possible errors during the upgrade process + + - The repeat function cannot be used and an error is reported: `vectorized repeat function cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13868) + + - Schema change fails with error: `desc_tbl is not set. Maybe the FE version is not equal to the BE` (#13822) + + - Vectorized hash join cannot be used and an error is reported: `vectorized hash join cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13753) + + The above errors will return to normal after a full upgrade. + +## Performance Impact + +- By default, JeMalloc is used as the memory allocator of the new version BE, replacing TcMalloc (#13367) + +- The batch size in the tablet sink is modified to be at least 8K.
(#13912) + +- Disable chunk allocator by default (#13285) + +## API Change + +- BE's http api error return information changed from `{"status": "Fail", "msg": "xxx"}` to the more specific `{"status": "Not found", "msg": "Tablet not found. tablet_id=1202"}` (#9771) + +- In `SHOW CREATE TABLE`, the content of comment is changed from double quotes to single quotes (#10327) + +- Support ordinary users to obtain query profile through http command. (#14016) +Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/manager/query-profile-action + +- Optimized the way to specify the sequence column; you can directly specify the column name. (#13872) +Documentation: https://doris.apache.org//docs/dev/data-operate/update-delete/sequence-column-manual + +- Added the space usage of remote storage to the results returned by `show backends` and `show tablets` (#11450) + +- Removed Num-Based Compaction related code (#13409) + +- Refactored BE's error code mechanism; some returned error messages will change (#8855) +## Other + +- Support Docker official image. + +- Support compiling Doris on MacOS(x86/M1) and ubuntu-22.04 + Documentation: https://doris.apache.org//docs/dev/install/source-install/compilation-mac/ + +- Support for image file verification.
+ + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/metadata-operation/ + +- Script related + + - The stop scripts of FE and BE support exiting FE and BE via the `--grace` parameter (use kill -15 signal instead of kill -9) + + - FE start script supports checking the current FE version via `--version` (#11563) + + - Support getting the data and related table creation statement of a tablet through the `ADMIN COPY TABLET` command, for local problem debugging (#12176) + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET + +- Support obtaining the table creation statements related to a SQL statement through the http api, for local problem reproduction (#11979) + + Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/query-schema-action + +- Support disabling compaction for a table when creating it, for testing (#11743) + + Search for "disable_auto_compaction" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +# Big Thanks + +Thanks to ALL who contributed to this release!
(alphabetically) +``` +@924060929 +@a19920714liou +@adonis0147 +@Aiden-Dong +@aiwenmo +@AshinGau +@b19mud +@BePPPower +@BiteTheDDDDt +@bridgeDream +@ByteYue +@caiconghui +@CalvinKirs +@cambyzju +@caoliang-web +@carlvinhust2012 +@catpineapple +@ccoffline +@chenlinzhong +@chovy-3012 +@coderjiang +@cxzl25 +@dataalive +@dataroaring +@dependabot[bot] +@dinggege1024 +@DongLiang-0 +@Doris-Extras +@eldenmoon +@EmmyMiao87 +@englefly +@FreeOnePlus +@Gabriel39 +@gaodayue +@geniusjoe +@gj-zhang +@gnehil +@GoGoWen +@HappenLee +@hello-stephen +@Henry2SS +@hf200012 +@huyuanfeng2018 +@jacktengg +@jackwener +@jeffreys-cat +@Jibing-Li +@JNSimba +@Kikyou1997 +@Lchangliang +@LemonLiTree +@lexoning +@liaoxin01 +@lide-reed +@link3280 +@liutang123 +@liuyaolin +@LOVEGISER +@lsy3993 +@luozenglin +@luzhijing +@madongz +@morningman +@morningman-cmy +@morrySnow +@mrhhsg +@Myasuka +@myfjdthink +@nextdreamblue +@pan3793 +@pangzhili +@pengxiangyu +@platoneko +@qidaye +@qzsee +@SaintBacchus +@SeekingYang +@smallhibiscus +@sohardforaname +@song7788q +@spaces-X +@ssusieee +@stalary +@starocean999 +@SWJTU-ZhangLei +@TaoZex +@timelxy +@Wahno +@wangbo +@wangshuo128 +@wangyf0555 +@weizhengte +@weizuo93 +@wsjz +@wunan1210 +@xhmz +@xiaokang +@xiaokangguo +@xinyiZzz +@xy720 +@yangzhg +@Yankee24 +@yeyudefeng +@yiguolei +@yinzhijian +@yixiutt +@yuanyuan8983 +@zbtzbtzbt +@zenoyang +@zhangboya1 +@zhangstar333 +@zhannngchen +@ZHbamboo +@zhengshiJ +@zhenhb +@zhqu1148980644 +@zuochunwei +@zy-kkk +``` diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md new file mode 100644 index 0000000000000..d5adb31eb5256 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.1.md @@ -0,0 +1,196 @@ +--- +{ + "title": "Release 1.2.1", + "language": "en" +} +--- + + + +# Improvement + +### Supports new type DecimalV3 + +DecimalV3, which supports higher precision and better performance, has the following advantages over 
past versions. + +- Larger representable range, the range of values are significantly expanded, and the valid number range [1,38]. + +- Higher performance, adaptive adjustment of the storage space occupied according to different precision. + +- More complete precision derivation support, for different expressions, different precision derivation rules are applied to the accuracy of the result. + +[DecimalV3](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Types/DECIMALV3/) + +### Support Iceberg V2 + +Support Iceberg V2 (only Position Delete is supported, Equality Delete will be supported in subsequent versions). + +Tables in Iceberg V2 format can be accessed through the Multi-Catalog feature. + +### Support OR condition to IN + +Support converting OR condition to IN condition, which can improve the execution efficiency in some scenarios.[#15437](https://github.com/apache/doris/pull/15437) [#12872](https://github.com/apache/doris/pull/12872) + +### Optimize the import and query performance of JSONB type + +Optimize the import and query performance of JSONB type. [#15219](https://github.com/apache/doris/pull/15219) [#15219](https://github.com/apache/doris/pull/15219) + +### Stream load supports quoted csv data + +Search trim_double_quotes in Document:[https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD) + +### Broker supports Tencent Cloud CHDFS and Baidu Cloud BOS, AFS + +Data on CHDFS, BOS, and AFS can be accessed through Broker. [#15297](https://github.com/apache/doris/pull/15297) [#15448](https://github.com/apache/doris/pull/15448) + +### New function + +Add function `substring_index`. [#15373](https://github.com/apache/doris/pull/15373) + +# Bug Fix + +- In some cases, after upgrading from version 1.1 to version 1.2, the user permission information will be lost. 
[#15144](https://github.com/apache/doris/pull/15144) + +- Fix the problem that the partition value is wrong when using datev2/datetimev2 type for partitioning. [#15094](https://github.com/apache/doris/pull/15094) + +- Bug fixes for a large number of released features. For a complete list see: [PR List](https://github.com/apache/doris/pulls?q=is%3Apr+label%3Adev%2F1.2.1-merged+is%3Aclosed) + +# Upgrade Notice + +### Known Issues + +- Do not use JDK11 as the runtime JDK of BE, it will cause BE Crash. +- The reading performance of the csv format in this version has declined, which will affect the import and reading efficiency of the csv format. We will fix it as soon as possible in the next three-digit version + +### Behavior Changed + +- The default value of the BE configuration item `high_priority_flush_thread_num_per_store` is changed from 1 to 6, to improve the write efficiency of Routine Load. (https://github.com/apache/doris/pull/14775) + +- The default value of the FE configuration item `enable_new_load_scan_node` is changed to true. Import tasks will be performed using the new File Scan Node. No impact on users.[#14808](https://github.com/apache/doris/pull/14808) + +- Delete the FE configuration item `enable_multi_catalog`. The Multi-Catalog function is enabled by default. + +- The vectorized execution engine is forced to be enabled by default.[#15213](https://github.com/apache/doris/pull/15213) + +The session variable enable_vectorized_engine will no longer take effect. Enabled by default. + +To make it valid again, set the FE configuration item `disable_enable_vectorized_engine` to false, and restart FE to make `enable_vectorized_engine` valid again. + + +# Big Thanks + +Thanks to ALL who contributed to this release! 
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@dataroaring + +@Doris-Extras + +@dutyu + +@eldenmoon + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@hf200012 + +@jacktengg + +@Jibing-Li + +@Kikyou1997 + +@liaoxin01 + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@spaces-X + +@starocean999 + +@wangshuo128 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xutaoustc + +@yangzhg + +@yiguolei + +@yixiutt + +@Yulei-Yang + +@yuxuan-luo + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zhengshengjun + + + + + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md new file mode 100644 index 0000000000000..08fd22571a03f --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.2.md @@ -0,0 +1,254 @@ +--- +{ + "title": "Release 1.2.2", + "language": "en" +} +--- + + + +# New Features + +### Lakehouse + +- Support automatic synchronization of Hive metastore. + +- Support reading the Iceberg Snapshot, and viewing the Snapshot history. + +- JDBC Catalog supports PostgreSQL, Clickhouse, Oracle, SQLServer + +- JDBC Catalog supports Insert operation + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/) + +### Auto Bucket + + Set and scale the number of buckets for different partitions to keep the number of tablet in a relatively appropriate range. + +### New Functions + +Add the new function `width_bucket`. 
+ +Reference: [https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description) + +# Behavior Changes + +- Disable BE's page cache by default: `disable_storage_page_cache=true` + +The page cache is turned off to optimize memory usage and reduce the risk of OOM. +However, this increases the query latency of some small queries. +If you are sensitive to query latency, or have high concurrency and small query scenarios, you can configure `disable_storage_page_cache=false` to enable page cache again. + +- Add new session variable `group_by_and_having_use_alias_first`, used to control whether the group by and having clauses use aliases first. + +Reference: [https://doris.apache.org/docs/dev/advanced/variables](https://doris.apache.org/docs/dev/advanced/variables) + +# Improvement + +### Compaction + +- Support `Vertical Compaction`, which optimizes the compaction overhead and efficiency of wide tables. + +- Support `Segment Compaction`. Fix -238 and -235 issues with high frequency imports. + +### Lakehouse + +- Hive Catalog is compatible with Hive version 1/2/3. + +- Hive Catalog can access JuiceFS based HDFS with Broker. + +- Iceberg Catalog supports Hive Metastore and Rest Catalog types. + +- ES Catalog supports _id column mapping. + +- Optimize Iceberg V2 read performance with a large number of delete rows. + +- Support for reading Iceberg tables after Schema Evolution. + +- Parquet Reader handles column name case correctly. + +### Other + +- Support for accessing Hadoop KMS-encrypted HDFS. + +- Support canceling an Export task in progress. + +- Optimize the performance of `explode_split` by 1x. + +- Optimize the read performance of nullable columns by 3x. + +- Fix some problems of Memtracker, improve memory management accuracy, and optimize memory allocation. + + + +# Bug Fix + +- Fixed memory leak when loading data with Doris Flink Connector.
+ +- Fixed the possible thread scheduling problem of BE and reduced the `Fragment sent timeout` error caused by BE thread exhaustion. + +- Fixed various correctness and precision issues of column type datetimev2/decimalv3. + +- Fixed the data correctness issue with Unique Key Merge-on-Read table. + +- Fixed various known issues with the Light Schema Change feature. + +- Fixed various data correctness issues of bitmap type Runtime Filter. + +- Fixed the problem of poor reading performance of csv reader introduced in version 1.2.1. + +- Fixed the problem of BE OOM caused by Spark Load data download phase. + +- Fixed possible metadata compatibility issues when upgrading from version 1.1 to version 1.2. + +- Fixed the metadata problem when creating JDBC Catalog with Resource. + +- Fixed the problem of high CPU usage caused by load operation. + +- Fixed the problem of FE OOM caused by a large number of failed Broker Load jobs. + +- Fixed the problem of precision loss when loading floating-point types. + +- Fixed the problem of memory leak when using 2PC stream load. + +# Other + +Add metrics to view the total rowset and segment numbers on BE + +- `doris_be_all_rowsets_num` and `doris_be_all_segments_num` + + +# Big Thanks + +Thanks to ALL who contributed to this release!
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@DarvenDuan + +@dataroaring + +@Doris-Extras + +@dutyu + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@htyoung + +@isHuangXin + +@JackDrogon + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@Lchangliang + +@LemonLiTree + +@liaoxin01 + +@liqing-coder + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@qzsee + +@spaces-X + +@stalary + + +@starocean999 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xy720 + +@yangzhg + +@yiguolei + +@yixiutt + +@Yukang-Lian + +@Yulei-Yang + +@zclllyybb + +@zddr + +@zhangstar333 + +@zhannngchen + +@zy-kkk + + + + + + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md new file mode 100644 index 0000000000000..cd9226b15e14f --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.3.md @@ -0,0 +1,109 @@ +--- +{ + "title": "Release 1.2.3", + "language": "en" +} +--- + + + +# Improvement + +### JDBC Catalog + +- Support connecting to Doris clusters through JDBC Catalog. + +Currently, Jdbc Catalog only support to use 5.x version of JDBC jar package to connect another Doris database. If you use 8.x version of JDBC jar package, the data type of column may not be matched. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris) + +- Support to synchronize only the specified database through the `only_specified_database` attribute. + +- Support synchronizing table names in the form of lowercase through `lower_case_table_names` to solve the problem of case sensitivity of table names. 
+ +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc) + +- Optimize the read performance of JDBC Catalog. + +### Elasticsearch Catalog + +- Support Array type mapping. + +- Support whether to push down the like expression through the `like_push_down` attribute to control the CPU overhead of the ES cluster. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es) + +### Hive Catalog + +- Support Hive table default partition `_HIVE_DEFAULT_PARTITION_`. + +- Hive Metastore metadata automatic synchronization supports notification event in compressed format. + +### Dynamic Partition Improvement + +- Dynamic partition supports specifying the `storage_medium` parameter to control the storage medium of the newly added partition. + +Reference: [https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition](https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition) + + +### Optimize BE's Threading Model + +- Optimize BE's threading model to avoid stability problems caused by frequent thread creation and destroy. + +# Bugfix + +- Fixed issues with Merge-On-Write Unique Key tables. + +- Fixed compaction related issues. + +- Fixed some delete statement issues causing data errors. + +- Fixed several query execution errors. + +- Fixed the problem of using JDBC catalog to cause BE crash on some operating system. + +- Fixed Multi-Catalog issues. + +- Fixed memory statistics and optimization issues. + +- Fixed decimalV3 and date/datetimev2 related issues. + +- Fixed load transaction stability issues. + +- Fixed light-weight schema change issues. + +- Fixed the issue of using `datetime` type for batch partition creation. + +- Fixed the problem that a large number of failed broker loads would cause the FE memory usage to be too high. 
+ +- Fixed the problem that stream load cannot be canceled after dropping the table. + +- Fixed querying `information_schema` timeout in some cases. + +- Fixed the problem of BE crash caused by concurrent data export using `select outfile`. + +- Fixed transactional insert operation memory leak. + +- Fixed several query/load profile issues, and added support for downloading profiles directly through the FE web UI. + +- Fixed the problem that the BE tablet GC thread caused the IO util to be too high. + +- Fixed the problem that the commit offset is inaccurate in Kafka routine load. + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md new file mode 100644 index 0000000000000..a959a323d06d1 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.4.md @@ -0,0 +1,81 @@ +--- +{ + "title": "Release 1.2.4", + "language": "en" +} +--- + + + + +# Behavior Changed + +- In the results of `DESCRIBE` and `SHOW CREATE TABLE` statements, columns of `DateV2`/`DatetimeV2` and `DecimalV3` type are no longer displayed as `DateV2`/`DatetimeV2` or `DecimalV3`, but directly displayed as `Date`/`Datetime` or `Decimal`. + + - This change is for compatibility with some BI tools. If you want to see the actual type of the column, you can check it with the `DESCRIBE ALL` statement. + +- When querying tables in the `information_schema` database, the meta information (database, table, column, etc.) in the external catalog is no longer returned by default. + + - This change avoids the problem that the `information_schema` database cannot be queried due to the connection problem of some external catalog, so as to solve the problem of using some BI tools with Doris. It can be controlled by the FE configuration `infodb_support_ext_catalog`, and the default value is `false`, that is, the meta information of external catalog will not be returned.
+ +# Improvement + +### JDBC Catalog + +- Supports connecting to Trino/Presto via JDBC Catalog + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino) + +- JDBC Catalog connects to Clickhouse data source and supports Array type mapping + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse) + +### Spark Load + +- Spark Load supports Resource Manager HA related configuration + + Refer to: https://github.com/apache/doris/pull/15000 + +## Bug Fixes + +- Fixed several connectivity issues with Hive Catalog. + +- Fixed ClassNotFound issues with Hudi Catalog. + +- Optimize the connection pool of JDBC Catalog to avoid too many connections. + +- Fix the problem that OOM will occur when importing data from another Doris cluster through JDBC Catalog. + +- Fixed several query and import planning issues. + +- Fixed several issues with Unique Key Merge-On-Write data model. + +- Fix several BDBJE issues and solve the problem of abnormal FE metadata in some cases. + +- Fix the problem that the `CREATE VIEW` statement does not support Table Valued Function. + +- Fixed several memory statistics issues. + +- Fixed several issues reading Parquet/ORC format. + +- Fixed several issues with DecimalV3. + +- Fixed several issues with SHOW QUERY/LOAD PROFILE. + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md new file mode 100644 index 0000000000000..55af863ba47d6 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.5.md @@ -0,0 +1,199 @@ +--- +{ + "title": "Release 1.2.5", + "language": "en" +} +--- + + + +In version 1.2.5, the Doris team has delivered nearly 210 fixes and performance improvements since the release of version 1.2.4.
At the same time, version 1.2.5 is also an iterative version of version 1.2.4, which has higher stability. It is recommended that all users upgrade to this version. + +# Behavior Changed + +- The `start_be.sh` script will check that the maximum number of file handles in the system must be greater than or equal to 65536, otherwise the startup will fail. + +- The BE configuration item `enable_quick_compaction` is set to true by default. The Quick Compaction is enabled by default. This feature is used to optimize the problem of small files in the case of large batch import. + +- After modifying the dynamic partition attribute of the table, it will no longer take effect immediately, but wait for the next task scheduling of the dynamic partition table to avoid some deadlock problems. + +# Improvement + +- Optimize the use of bthread and pthread to reduce the RPC blocking problem during the query process. + +- A button to download Profile is added to the Profile page of the FE web UI. + +- Added FE configuration `recover_with_skip_missing_version`, which is used to query to skip the problematic replica under certain failure conditions. + +- The row-level permission function supports external Catalog. + +- Hive Catalog supports automatic refreshing of kerberos tickets on the BE side without manual refreshing. + +- JDBC Catalog supports tables under the MySQL/ClickHouse system database (`information_schema`). + +# Bug Fixes + +- Fixed the problem of incorrect query results caused by low-cardinality column optimization + +- Fixed several authentication and compatibility issues accessing HDFS. + +- Fixed several issues with float/double and decimal types. + +- Fixed several issues with date/datetimev2 types. + +- Fixed several query execution and planning issues. + +- Fixed several issues with JDBC Catalog. + +- Fixed several query-related issues with Hive Catalog, and Hive Metastore metadata synchronization issues. 
+ +- Fix the problem that the result of `SHOW LOAD PROFILE` statement is incorrect. + +- Fixed several memory related issues. + +- Fixed several issues with `CREATE TABLE AS SELECT` functionality. + +- Fix the problem that the jsonb type causes BE to crash on CPU that do not support avx2. + +- Fixed several issues with dynamic partitions. + +- Fixed several issues with TOPN query optimization. + +- Fixed several issues with the Unique Key Merge-on-Write table model. + +# Big Thanks + +58 contributors participated in the improvement and release of 1.2.5, and thank them for their hard work and dedication: + +@adonis0147 + +@airborne12 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@caiconghui + +@CalvinKirs + +@cambyzju + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dujl + +@dutyu + +@fsilent + +@Gabriel39 + +@gitccl + +@gnehil + +@GoGoWen + +@gongzexin + +@HappenLee + +@herry2038 + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@LemonLiTree + +@liaoxin01 + +@LiBinfeng-01 + +@luwei16 + +@Moonm3n + +@morningman + +@mrhhsg + +@Mryange + +@nextdreamblue + +@nsnhuang + +@qidaye + +@Shoothzj + +@sohardforaname + +@stalary + +@starocean999 + +@SWJTU-ZhangLei + +@wsjz + +@xiaokang + +@xinyiZzz + +@yangzhg + +@yiguolei + +@yixiutt + +@yujun777 + +@Yulei-Yang + +@yuxuan-luo + +@zclllyybb + +@zddr + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zxealous + +@zy-kkk + +@zzzzzzzs diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md new file mode 100644 index 0000000000000..39146b35b15ac --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.6.md @@ -0,0 +1,135 @@ +--- +{ + "title": "Release 1.2.6", + "language": "en" +} +--- + + + + +# Behavior Change + +- Add a BE configuration item `allow_invalid_decimalv2_literal` to control whether can import data that exceeding the decimal's precision, for compatibility with previous logic. 
+ +# Query + +- Fix several query planning issues. +- Support `sql_select_limit` session variable. +- Optimize query cold run performance. +- Fix expr context memory leak. +- Fix the issue that the `explode_split` function was executed incorrectly in some cases. + +# Multi Catalog + +- Fix the issue that synchronizing hive metadata caused FE replay edit log to fail. +- Fix `refresh catalog` operation causing FE OOM. +- Fix the issue that jdbc catalog cannot handle `0000-00-00` correctly. +- Fix the issue that the kerberos ticket cannot be refreshed automatically. +- Optimize the partition pruning performance of hive. +- Fix the inconsistent behavior of trino and presto in jdbc catalog. +- Fix the issue that hdfs short-circuit read could not be used to improve query efficiency in some environments. +- Fix the issue that the iceberg table on CHDFS could not be read. + +# Storage + +- Fix the wrong calculation of delete bitmap in MOW table. +- Fix several BE memory issues. +- Fix snappy compression issue. +- Fix the issue that jemalloc may cause BE to crash in some cases. + +# Others + +- Fix several java udf related issues. +- Fix the issue that the `recover table` operation incorrectly triggered the creation of dynamic partitions. +- Fix timezone when importing orc files via broker load. +- Fix the issue that the newly added `PERCENT` keyword caused the replay metadata of the routine load job to fail. +- Fix the issue that the `truncate` operation failed to act on a non-partitioned table. +- Fix the issue that the mysql connection was lost due to the `show snapshot` operation. +- Optimize the lock logic to reduce the probability of lock timeout errors when creating tables. +- Add session variable `have_query_cache` to be compatible with some old mysql clients. +- Optimize the error message shown when a load fails.
+ +# Big Thanks + +Thanks to all who contributed to this release: + +@amorynan + +@BiteTheDDDDt + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dutyu + +@Gabriel39 + +@HHoflittlefish777 + +@htyoung + +@jacktengg + +@jeffreys-cat + +@kaijchen + +@kaka11chen + +@Kikyou1997 + +@KnightLiJunLong + +@liaoxin01 + +@LiBinfeng-01 + +@morningman + +@mrhhsg + +@sohardforaname + +@starocean999 + +@vinlee19 + +@wangbo + +@wsjz + +@xiaokang + +@xinyiZzz + +@yiguolei + +@yujun777 + +@Yulei-Yang + +@zhangstar333 + +@zy-kkk + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md new file mode 100644 index 0000000000000..cd47282f4688d --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.7.md @@ -0,0 +1,46 @@ +--- +{ + "title": "Release 1.2.7", + "language": "en" +} +--- + + + +# Bug Fixes + +- Fix some query issues. +- Fix some storage issues. +- Fix some decimal precision issues. +- Fix query error caused by an invalid value of the `sql_select_limit` session variable. +- Fix the problem that hdfs short-circuit read cannot be used. +- Fix the problem that Tencent Cloud cosn cannot be accessed. +- Fix several issues with hive catalog kerberos access. +- Fix the problem that stream load profile cannot be used. +- Fix the Prometheus monitoring parameter format problem. +- Fix the table creation timeout issue when creating a large number of tablets. + +# New Features + +- Unique Key model supports array type as value column +- Added `have_query_cache` variable for compatibility with MySQL ecosystem.
+- Added `enable_strong_consistency_read` to support strong consistent read between sessions +- FE metrics supports user-level query counter + diff --git a/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md new file mode 100644 index 0000000000000..35cbb7a3cdcf1 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v1.2/release-1.2.8.md @@ -0,0 +1,47 @@ +--- +{ + "title": "Release 1.2.8", + "language": "en" +} +--- + + + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Bug Fixes +- Fixed several issues with query execution. +- Fixed several issues with Spark Load. +- Fixed several issues with Parquet Reader. +- Fixed several issues with Orc Reader. +- Fixed Broker "FileSystem closed" problem. +- Fixed several issues with Broker Load. +- Fixed several issues with CTAS execution. +- Fixed several issues with backup and restore. +- Added "Catalog" column in audit log. +- Optimized the metadata cache of Iceberg Catalog. +- Fixed several issues with outfile/export feature. +- Fixed an issue with "replayEraseTable" edit log causing FE start to fail. +- Fixed some security issues. + + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md new file mode 100644 index 0000000000000..61ba6c5c60890 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.0.md @@ -0,0 +1,236 @@ +--- +{ + "title": "Release 2.0.0", + "language": "en" +} +--- + + + + +We are more than excited to announce that, after six months of coding, testing, and fine-tuning, Apache Doris 2.0.0 is now production-ready. Special thanks to the 275 committers who altogether contributed over 4100 optimizations and fixes to the project. 
+ +This new version highlights: + +- 10 times faster data queries +- Enhanced log analytic and federated query capabilities +- More efficient data writing and updates +- Improved multi-tenant and resource isolation mechanisms +- Progresses in elastic scaling of resources and storage-compute separation +- Enterprise-facing features for higher usability + +> Download: https://doris.apache.org/download +> +> GitHub source code: https://github.com/apache/doris/releases/tag/2.0.0-rc04 + +## **A 10 Times Performance Increase** + +In SSB-Flat and TPC-H benchmarking, Apache Doris 2.0.0 delivered **over 10-time faster query performance** compared to an early version of Apache Doris. + +![](/images/release-note-2.0.0-1.png) + +This is realized by the introduction of a smarter query optimizer, inverted index, a parallel execution model, and a series of new functionalities to support high-concurrency point queries. + +### A smarter query optimizer + +The brand new query optimizer, Nereids, has a richer statistical base and adopts the Cascades framework. It is capable of self-tuning in most query scenarios and supports all 99 SQLs in TPC-DS, so users can expect high performance without any fine-tuning or SQL rewriting. + +TPC-H tests showed that Nereids, with no human intervention, outperformed the old query optimizer by a wide margin. Over 100 users have tried Apache Doris 2.0.0 in their production environment and the vast majority of them reported huge speedups in query execution. + +![](/images/release-note-2.0.0-2.png) + +**Doc**: https://doris.apache.org/docs/dev/query-acceleration/nereids/ + +Nereids is enabled by default in Apache Doris 2.0.0: `SET enable_nereids_planner=true`. Nereids collects statistical data by calling the Analyze command. + +### Inverted Index + +In Apache Doris 2.0.0, we introduced inverted index to better support fuzzy keyword search, equivalence queries, and range queries. 
+ +A smartphone manufacturer tested Apache Doris 2.0.0 in their user behavior analysis scenarios. With inverted index enabled, v2.0.0 was able to finish the queries within milliseconds and maintain stable performance as the query concurrency level went up. In this case, it is 5 to 90 times faster than its old version. + +![](/images/release-note-2.0.0-3.png) + +### 20 times higher concurrency capability + +In scenarios like e-commerce order queries and express tracking, a huge number of end data users search for a certain data record simultaneously. These are what we call high-concurrency point queries, which can bring huge pressure on the system. A traditional solution is to introduce Key-Value stores like Apache HBase for such queries, and Redis as a cache layer to ease the burden, but that means redundant storage and higher maintenance costs. + +For a column-oriented DBMS like Apache Doris, the I/O usage of point queries will be multiplied. We need neater execution. Thus, on the basis of columnar storage, we added row storage format and row cache to increase row reading efficiency, short-circuit plans to speed up data retrieval, and prepared statements to reduce frontend overheads. + +After these optimizations, Apache Doris 2.0 reached a concurrency level of **30,000 QPS per node** on YCSB on a 16 Core 64G cloud server with 4×1T hard drives, representing an improvement of **20 times** compared to its older version. This makes Apache Doris a good alternative to HBase in high-concurrency scenarios, so that users don't need to endure extra maintenance costs and redundant storage brought by complicated tech stacks. + +Read more: https://doris.apache.org/blog/High_concurrency + +### A self-adaptive parallel execution model + +Apache 2.0 brought in a Pipeline execution model for higher efficiency and stability in hybrid analytic workloads. In this model, the execution of queries is driven by data. 
The blocking operators in all query execution processes are split into pipelines. Whether a pipeline gets an execution thread depends on whether its relevant data is ready. This enables asynchronous blocking operations and more flexible system resource management. Also, this improves CPU efficiency as the system doesn't have to create and destroy threads that much. + +Doc: https://doris.apache.org/docs/dev/query-acceleration/pipeline-execution-engine/ + +**How to enable the Pipeline execution model** + +- The Pipeline execution engine is enabled by default in Apache Doris 2.0: `Set enable_pipeline_engine = true`. +- `parallel_pipeline_task_num` represents the number of pipeline tasks that are parallelly executed in SQL queries. The default value of it is `0`, which means Apache Doris will automatically set the concurrency level to half the number of CPUs in each backend node. Users can change this value as they need it. +- For those who are upgrading to Apache Doris 2.0 from an older version, it is recommended to set the value of `parallel_pipeline_task_num` to that of `parallel_fragment_exec_instance_num` in the old version. + +## A Unified Platform for Multiple Analytic Workloads + +Apache Doris has been pushing its boundaries. Starting as an OLAP engine for reporting, it is now a data warehouse capable of ETL/ELT and more. Version 2.0 is making advancements in its log analysis and data lakehousing capabilities. + +### A 10 times more cost-effective log analysis solution + +Apache Doris 2.0.0 provides native support for semi-structured data. In addition to JSON and Array, it now supports a complex data type: Map. Based on Light Schema Change, it also supports Schema Evolution, which means you can adjust the schema as your business changes. You can add or delete fields and indexes, and change the data types for fields. 
As we introduced inverted index and a high-performance text analysis algorithm into it, it can execute full-text search and dimensional analysis of logs more efficiently. With faster data writing and query speed and lower storage cost, it is 10 times more cost-effective than the common log analytic solution within the industry. + +![](/images/release-note-2.0.0-4.png) + +### Enhanced data lakehousing capabilities + +In Apache Doris 1.2, we introduced Multi-Catalog to allow for auto-mapping and auto-synchronization of data from heterogeneous sources. In version 2.0.0, we extended the list of supported data sources and optimized Doris based on users' needs in production environments. + +![](/images/release-note-2.0.0-5.png) + +Apache Doris 2.0.0 supports dozens of data sources including Hive, Hudi, Iceberg, Paimon, MaxCompute, Elasticsearch, Trino, ClickHouse, and almost all open lakehouse formats. It also supports snapshot queries on Hudi Copy-on-Write tables and read optimized queries on Hudi Merge-on-Read tables. It allows for authorization of Hive Catalog using Apache Ranger, so users can reuse their existing privilege control system. Besides, it supports extensible authorization plug-ins to enable user-defined authorization methods for any catalog. + +TPC-H benchmark tests showed that Apache Doris 2.0.0 is 3~5 times faster than Presto/Trino in queries on Hive tables. This is realized by all-around optimizations (in small file reading, flat table reading, local file cache, ORC/Parquet file reading, Compute Nodes, and information collection of external tables) finished in this development cycle and the distributed execution framework, vectorized execution engine, and query optimizer of Apache Doris. + +![](/images/release-note-2.0.0-6.png) + +All this gives Apache Doris 2.0.0 an edge in data lakehousing scenarios. 
With Doris, you can do incremental or overall synchronization of multiple upstream data sources in one place, and expect much higher data query performance than other query engines. The processed data can be written back to the sources or provided for downstream systems. In this way, you can make Apache Doris your unified data analytic gateway. + +## Efficient Data Update + +Data update is important in real-time analysis, since users want to always be accessible to the latest data, and be able to update data flexibly, such as updating a row or just a few columns, batching updating or deleting their specified data, or even overwriting a whole data partition. + +Efficient data updating has been another hill to climb in data analysis. Apache Hive only supports updates on the partition level, while Hudi and Iceberg do better in low-frequency batch updates instead of real-time updates due to their Merge-on-Read and Copy-on-Write implementations. + +As for data updating, Apache Doris 2.0.0 is capable of: + +- **Faster data writing**: In the pressure tests with an online payment platform, under 20 concurrent data writing tasks, Doris reached a writing throughput of 300,000 records per second and maintained stability throughout the over 10-hour continuous writing process. +- **Partial column update**: Older versions of Doris implements partial column update by `replace_if_not_null` in the Aggregate Key model. In 2.0.0, we enable partial column updates in the Unique Key model. That means you can directly write data from multiple source tables into a flat table, without having to concatenate them into one output stream using Flink before writing. This method avoids a complicated processing pipeline and the extra resource consumption. You can simply specify the columns you need to update. 
+- **Conditional update and deletion**: In addition to the simple Update and Delete operations, we realize complicated conditional updates and deletes operations on the basis of Merge-on-Write. + +## Faster, Stabler, and Smarter Data Writing + +### Higher speed in data writing + +As part of our continuing effort to strengthen the real-time analytic capability of Apache Doris, we have improved the end-to-end real-time data writing capability of version 2.0.0. Benchmark tests reported higher throughput in various writing methods: + +- Stream Load, TPC-H 144G lineitem table, 48-bucket Duplicate table, triple-replica writing: throughput increased by 100% +- Stream Load, TPC-H 144G lineitem table, 48-bucket Unique Key table, triple-replica writing: throughput increased by 200% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Duplicate table: throughput increased by 50% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Unique Key table: throughput increased by 150% + +### Greater stability in high-concurrency data writing + +The sources of system instability often includes small file merging, write amplification, and the consequential disk I/O and CPU overheads. Hence, we introduced Vertical Compaction and Segment Compaction in version 2.0.0 to eliminate OOM errors in compaction and avoid the generation of too many segment files during data writing. After such improvements, Apache Doris can write data 50% faster while **using only 10% of the memory that it previously used**. + +Read more: https://doris.apache.org/blog/Compaction + +### Auto-synchronization of table schema + +The latest Flink-Doris-Connector allows users to synchronize an entire database (such as MySQL and Oracle) to Apache Doris by one simple step. According to our test results, one single synchronization task can support the real-time concurrent writing of thousands of tables. 
Users no longer need to go through a complicated synchronization procedure because Apache Doris has automated the process. Changes in the upstream data schema will be automatically captured and dynamically updated to Apache Doris in a seamless manner. + +Read more: https://doris.apache.org/blog/FDC + +## A New Multi-Tenant Resource Isolation Solution + +The purpose of multi-tenant resource isolation is to avoid resource preemption in the case of heavy loads. For that sake, older versions of Apache Doris adopted a hard isolation plan featured by Resource Group: Backend nodes of the same Doris cluster would be tagged, and those of the same tag formed a Resource Group. As data was ingested into the database, different data replicas would be written into different Resource Groups, which will be responsible for different workloads. For example, data reading and writing will be conducted on different data tablets, so as to realize read-write separation. Similarly, you can also put online and offline business on different Resource Groups. + +![](/images/release-note-2.0.0-7.png) + +This is an effective solution, but in practice, it happens that some Resource Groups are heavily occupied while others are idle. We want a more flexible way to reduce vacancy rate of resources. Thus, in 2.0.0, we introduce Workload Group resource soft limit. + +![](/images/release-note-2.0.0-8.png) + +The idea is to divide workloads into groups to allow for flexible management of CPU and memory resources. Apache Doris associates a query with a Workload Group, and limits the percentage of CPU and memory that a single query can use on a backend node. The memory soft limit can be configured and enabled by the user. 
+ +When there is a cluster resource shortage, the system will kill the largest memory-consuming query tasks; when there are sufficient cluster resources, once a Workload Group uses more resources than expected, the idle cluster resources will be shared among all the Workload Groups to give full play to the system memory and ensure stable execution of queries. You can also prioritize the Workload Groups in terms of resource allocation. In other words, you can decide which tasks can be assigned with adequate resources and which not. + +Meanwhile, we introduced Query Queue in 2.0.0. Upon Workload Group creation, you can set a maximum query number for a query queue. Queries beyond that limit will wait for execution in the queue. This is to reduce system burden under heavy workloads. + +## Elastic Scaling and Storage-Compute Separation + +When it comes to computation and storage resources, what do users want? + +- **Elastic scaling of computation resources**: Scale up resources quickly in peak times to increase efficiency and scale down in valley times to reduce costs. +- **Lower storage costs**: Use low-cost storage media and separate storage from computation. +- **Separation of workloads**: Isolate the computation resources of different workloads to avoid preemption. +- **Unified management of data**: Simply manage catalogs and data in one place. + +To separate storage and computation is a way to realize elastic scaling of resources, but it demands more efforts in maintaining storage stability, which determines the stability and continuity of OLAP services. To ensure storage stability, we introduced mechanisms including cache management, computation resource management, and garbage collection. + + In this respect, we divide our users into three groups after investigation: + +1. Users with no need for resource scaling +2. Users requiring resource scaling, low storage costs, and workload separation from Apache Doris +3. 
Users who already have a stable large-scale storage system and thus require an advanced compute-storage-separated architecture for efficient resource scaling + +Apache Doris 2.0 provides two solutions to address the needs of the first two types of users. + +1. **Compute nodes**. We introduced stateless compute nodes in version 2.0. Unlike the mix nodes, the compute nodes do not save any data and are not involved in workload balancing of data tablets during cluster scaling. Thus, they are able to quickly join the cluster and share the computing pressure during peak times. In addition, in data lakehouse analysis, these nodes will be the first ones to execute queries on remote storage (HDFS/S3) so there will be no resource competition between internal tables and external tables. + 1. Doc: https://doris.apache.org/docs/dev/advanced/compute_node/ +2. **Hot-cold data separation**. Hot/cold data refers to data that is frequently/seldom accessed, respectively. Generally, it makes more sense to store cold data in low-cost storage. Older versions of Apache Doris supported lifecycle management of table partitions: As hot data cooled down, it would be moved from SSD to HDD. However, data was stored with multiple replicas on HDD, which was still a waste. Now, in Apache Doris 2.0, cold data can be stored in object storage, which is even cheaper and allows single-copy storage. That reduces the storage costs by 70% and cuts down the computation and network overheads that come with storage. + 1. Read more: https://doris.apache.org/blog/HCDS/ + +For neater separation of computation and storage, the VeloDB team is going to contribute the Cloud Compute-Storage-Separation solution to the Apache Doris project. Its performance and stability have stood the test of hundreds of companies in their production environments. The merging of code will be finished by October this year, and all Apache Doris users will be able to get an early taste of it in September. 
+ +## Enhanced Usability + +Apache Doris 2.0.0 also highlights some enterprise-facing functionalities. + +### Support for Kubernetes Deployment + +Older versions of Apache Doris communicate based on IP, so any host failure in Kubernetes deployment that causes a POD IP drift will lead to cluster unavailability. Now, version 2.0 supports FQDN. That means the failed Doris nodes can recover automatically without human intervention, which lays the foundation for Kubernetes deployment and elastic scaling. + +### Support for Cross-Cluster Replication (CCR) + +Apache Doris 2.0.0 supports cross-cluster replication (CCR). Data changes at the database/table level in the source cluster will be synchronized to the target cluster. You can choose to replicate the incremental data or the overall data. + +It also supports synchronization of DDL, which means DDL statements executed by the source cluster can also be automatically replicated to the target cluster. + +It is simple to configure and use CCR in Doris. Leveraging this functionality, you can implement read-write separation and multi-datacenter replication. + +This feature enables higher data availability, read/write workload separation, and more efficient cross-data-center replication. 
+ +## Behavior Change + +- Use rolling upgrade from 1.2-LTS to 2.0.0, and restart upgrade from preview versions of 2.0 to 2.0.0; +- The new query optimizer (Nereids) is enabled by default: `enable_nereids_planner=true`; +- All non-vectorized code has been removed from the system, so the `enable_vectorized_engine` parameter no longer works; +- A new parameter `enable_single_replica_compaction` has been added; +- datev2, datetimev2, and decimalv3 are the default data types in table creation; datev1, datetimev1, and decimalv2 are not supported in table creation; +- decimalv3 is the default data type for JDBC and Iceberg Catalog; +- A new data type `AGG_STATE` has been added; +- The cluster column has been removed from backend tables; +- For better compatibility with BI tools, datev2 and datetimev2 are displayed as date and datetime when `show create table`; +- max_openfiles and swaps checks are added to the backend startup script so inappropriate system configuration might lead to backend failure; +- Password-free login is not allowed when accessing frontend on localhost; +- If there is a Multi-Catalog in the system, by default, only data of the internal catalog will be displayed when querying information schema; +- A limit has been imposed on the depth of the expression tree. The default value is 200; +- The single quote in the return value of array string has been changed to double quote; +- The Doris processes are renamed to DorisFE and DorisBE. +- The behavior of the two-argument AES and SM4 functions has changed. See more information in the [related function docs](../../sql-manual/sql-functions/encrypt-digest-functions/sm4-encrypt.md) + +## Embarking on the 2.0.0 Journey + +To make Apache Doris 2.0.0 production-ready, we invited hundreds of enterprise users to engage in the testing and optimized it for better performance, stability, and usability. In the next phase, we will continue responding to user needs with agile release planning. 
We plan to launch 2.0.1 in late August and 2.0.2 in September, as we keep fixing bugs and adding new features. We also plan to release an early version of 2.1 in September to bring a few long-requested capabilities to you. For example, in Doris 2.1, the Variant data type will better serve the schema-free analytic needs of semi-structured data; the multi-table materialized views will be able to simplify the data scheduling and processing link while speeding up queries; more and neater data ingestion methods will be added and nested composite data types will be realized. + +If you have any questions or ideas when investigating, testing, and deploying Apache Doris, please find us on [Slack](https://t.co/ZxJuNJHXb2). Our developers will be happy to hear them and provide targeted support. + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md new file mode 100644 index 0000000000000..d8c19fb67525b --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.1.md @@ -0,0 +1,224 @@ +--- +{ + "title": "Release 2.0.1", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, 383 improvements and bug fixes have been made in Doris 2.0.1. 
+ +## Behavior Changes + +- [https://github.com/apache/doris/pull/21302](https://github.com/apache/doris/pull/21302) + +## Improvements + +### functionality and stability of array and map datatypes +- [https://github.com/apache/doris/pull/22793](https://github.com/apache/doris/pull/22793) +- [https://github.com/apache/doris/pull/22927](https://github.com/apache/doris/pull/22927) +- https://github.com/apache/doris/pull/22738 +- https://github.com/apache/doris/pull/22347 +- https://github.com/apache/doris/pull/23250 +- https://github.com/apache/doris/pull/22300 + +### performance for inverted index query +- https://github.com/apache/doris/pull/22836 +- https://github.com/apache/doris/pull/23381 +- https://github.com/apache/doris/pull/23389 +- https://github.com/apache/doris/pull/22570 + +### performance for bitmap, like, scan, agg functions +- https://github.com/apache/doris/pull/23172 +- https://github.com/apache/doris/pull/23495 +- https://github.com/apache/doris/pull/23476 +- https://github.com/apache/doris/pull/23396 +- https://github.com/apache/doris/pull/23182 +- https://github.com/apache/doris/pull/22216 + +### functionality and stability of CCR +- https://github.com/apache/doris/pull/22447 +- https://github.com/apache/doris/pull/22559 +- https://github.com/apache/doris/pull/22173 +- https://github.com/apache/doris/pull/22678 + +### merge on write unique table + +- https://github.com/apache/doris/pull/22282 +- https://github.com/apache/doris/pull/22984 +- https://github.com/apache/doris/pull/21933 +- https://github.com/apache/doris/pull/22874 + +### optimizer table stats and analyze + +- https://github.com/apache/doris/pull/22658 +- https://github.com/apache/doris/pull/22211 +- https://github.com/apache/doris/pull/22775 +- https://github.com/apache/doris/pull/22896 +- https://github.com/apache/doris/pull/22788 +- https://github.com/apache/doris/pull/22882 +- + +### functionality and performance of multi catalog + +- https://github.com/apache/doris/pull/22949 
+- https://github.com/apache/doris/pull/22923 +- https://github.com/apache/doris/pull/22336 +- https://github.com/apache/doris/pull/22915 +- https://github.com/apache/doris/pull/23056 +- https://github.com/apache/doris/pull/23297 +- https://github.com/apache/doris/pull/23279 + + +## Important Bug fixes + +- https://github.com/apache/doris/pull/22673 +- https://github.com/apache/doris/pull/22656 +- https://github.com/apache/doris/pull/22892 +- https://github.com/apache/doris/pull/22959 +- https://github.com/apache/doris/pull/22902 +- https://github.com/apache/doris/pull/22976 +- https://github.com/apache/doris/pull/22734 +- https://github.com/apache/doris/pull/22840 +- https://github.com/apache/doris/pull/23008 +- https://github.com/apache/doris/pull/23003 +- https://github.com/apache/doris/pull/22966 +- https://github.com/apache/doris/pull/22965 +- https://github.com/apache/doris/pull/22784 +- https://github.com/apache/doris/pull/23049 +- https://github.com/apache/doris/pull/23084 +- https://github.com/apache/doris/pull/22947 +- https://github.com/apache/doris/pull/22919 +- https://github.com/apache/doris/pull/22979 +- https://github.com/apache/doris/pull/23096 +- https://github.com/apache/doris/pull/23113 +- https://github.com/apache/doris/pull/23062 +- https://github.com/apache/doris/pull/22918 +- https://github.com/apache/doris/pull/23026 +- https://github.com/apache/doris/pull/23175 +- https://github.com/apache/doris/pull/23167 +- https://github.com/apache/doris/pull/23015 +- https://github.com/apache/doris/pull/23165 +- https://github.com/apache/doris/pull/23264 +- https://github.com/apache/doris/pull/23246 +- https://github.com/apache/doris/pull/23198 +- https://github.com/apache/doris/pull/23221 +- https://github.com/apache/doris/pull/23277 +- https://github.com/apache/doris/pull/23249 +- https://github.com/apache/doris/pull/23272 +- https://github.com/apache/doris/pull/23383 +- https://github.com/apache/doris/pull/23372 +- 
https://github.com/apache/doris/pull/23399 +- https://github.com/apache/doris/pull/23295 +- https://github.com/apache/doris/pull/23446 +- https://github.com/apache/doris/pull/23406 +- https://github.com/apache/doris/pull/23387 +- https://github.com/apache/doris/pull/23421 +- https://github.com/apache/doris/pull/23456 +- https://github.com/apache/doris/pull/23361 +- https://github.com/apache/doris/pull/23402 +- https://github.com/apache/doris/pull/23369 +- https://github.com/apache/doris/pull/23245 +- https://github.com/apache/doris/pull/23532 +- https://github.com/apache/doris/pull/23529 +- https://github.com/apache/doris/pull/23601 + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.1-merged+is%3Aclosed) . + + +## Big Thanks + +Thanks all who contribute to this release: + +@adonis0147 +@airborne12 +@amorynan +@AshinGau +@BePPPower +@BiteTheDDDDt +@bobhan1 +@ByteYue +@caiconghui +@CalvinKirs +@csun5285 +@DarvenDuan +@deadlinefen +@DongLiang-0 +@Doris-Extras +@dutyu +@englefly +@freemandealer +@Gabriel39 +@GoGoWen +@HappenLee +@hello-stephen +@HHoflittlefish777 +@hubgeter +@hust-hhb +@JackDrogon +@jacktengg +@jackwener +@Jibing-Li +@kaijchen +@kaka11chen +@Kikyou1997 +@Lchangliang +@LemonLiTree +@liaoxin01 +@LiBinfeng-01 +@lsy3993 +@luozenglin +@morningman +@morrySnow +@mrhhsg +@Mryange +@mymeiyi +@shuke987 +@sohardforaname +@starocean999 +@TangSiyang2001 +@Tanya-W +@ucasfl +@vinlee19 +@wangbo +@wsjz +@wuwenchi +@xiaokang +@XieJiann +@xinyiZzz +@yujun777 +@Yukang-Lian +@Yulei-Yang +@zclllyybb +@zddr +@zenoyang +@zgxme +@zhangguoqiang666 +@zhangstar333 +@zhannngchen +@zhiqiang-hhhh +@zxealous +@zy-kkk +@zzzxl1993 +@zzzzzzzs + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md new file mode 100644 index 0000000000000..5d8592a0ee25c --- /dev/null +++ 
b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.10.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.10", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 83 improvements and bug fixes have been made in Doris 2.0.10 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## Improvement and Optimizations + +- This enhancement introduces the `read_only` and `super_read_only` variables to the database system, ensuring compatibility with MySQL's read-only modes. + +- When the check status is not IO_ERROR, the disk path should not be added to the broken list. This ensures that only disks with actual I/O errors are marked as broken. + +- When performing a Create Table As Select (CTAS) operation from an external table, convert the `VARCHAR` column to `STRING` type. + +- Support mapping Paimon column type "ROW" to Doris type "STRUCT" + +- Choose disk tolerate with little skew when creating tablet + +- Write editlog to `set replica drop` to avoid confusing status on follower FE + +- Make the schema change memory space adaptive to avoid memory over limit + +- Inverted index 'unicode' tokenizer supports configuration to exclude stop words + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.9...2.0.10) . 
+ +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @BePPPower, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @deardeng, @DongLiang-0, @eldenmoon, @felixwluo, @HappenLee, @hubgeter, @jackwener, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @luennng, @morningman, @morrySnow, @Mryange, @nextdreamblue, @qidaye, @starocean999, @suxiaogang223, @SWJTU-ZhangLei, @w41ter, @xiaokang, @xy720, @yujun777, @Yukang-Lian, @zhangstar333, @zxealous, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md new file mode 100644 index 0000000000000..1a2598b0d41a0 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.11.md @@ -0,0 +1,60 @@ +--- +{ + "title": "Release 2.0.11", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 123 improvements and bug fixes have been made in Doris 2.0.11 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## 1 Behavior change + +Since the inverted index is now mature and stable, it can replace the old BITMAP INDEX. Therefore, any newly created `BITMAP INDEX` will automatically switch to an `INVERTED INDEX`, while existing `BITMAP INDEX` will remain unchanged. This entire switching process is transparent to the user, with no changes to writing or querying. Additionally, users can disable this automatic switch by setting the FE configuration `enable_create_bitmap_index_as_inverted_index` to false. 
[#35528](https://github.com/apache/doris/pull/35528) + + +## 2 Improvement and optimizations + +- Add Trino JDBC Catalog type mapping for JSON and TIME + +- FE exits when it fails to transfer to (non) master to prevent unknown state and too many logs + +- Write audit log while doing drop stats table. + +- Ignore min/max column stats if table is partially analyzed to avoid inefficient query plan + +- Support minus operation for set like `set1 - set2` + +- Improve performance of LIKE and REGEXP clause with concat (col, pattern_str), e.g. `col1 LIKE concat('%', col2, '%')` + +- Add query options for short circuit queries for upgrade compatibility + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.10...2.0.11) . + +## Credits + +Thanks all who contribute to this release: + +@AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @Gabriel39, @GoGoWen, @HHoflittlefish777, @hubgeter, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaka11chen, @kobe6th, @LiBinfeng-01, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @nextdreamblue, @qidaye, @sjyango, @starocean999, @SWJTU-ZhangLei, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XieJiann, @xy720, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md new file mode 100644 index 0000000000000..0bc289c91a8ef --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.12.md @@ -0,0 +1,58 @@ +--- +{ + "title": "Release 2.0.12", + "language": "en" +} +--- + + + +Thanks to our community developers and users for their contributions. Doris version 2.0.12 will bring 99 improvements and bug fixes. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- No longer set the default table comment to the table type. Instead, set it to be empty by default, for example, change COMMENT 'OLAP' to COMMENT ' '. This new behavior is more friendly for BI software that relies on table comments. [#35855](https://github.com/apache/doris/pull/35855) + +- Change the type of the `@@autocommit` variable from `BOOLEAN` to `BIGINT` to prevent errors from certain MySQL clients (such as .NET MySQL.Data). [#33282](https://github.com/apache/doris/pull/33282) + + +## Improvements + +- Remove the `disable_nested_complex_type` parameter and allow the creation of nested `ARRAY`, `MAP`, and `STRUCT` types by default. [#36255](https://github.com/apache/doris/pull/36255) + +- The HMS catalog supports the `SHOW CREATE DATABASE` command. [#28145](https://github.com/apache/doris/pull/28145) + +- Add more inverted index metrics to the query profile. [#36545](https://github.com/apache/doris/pull/36545) + +- Cross-Cluster Replication (CCR) supports inverted indices. [#31743](https://github.com/apache/doris/pull/31743) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.11...2.0.12) , with the key features and improvements highlighted below. 
+ + + +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @amorynan, @BiteTheDDDDt, @cambyzju, @caoliang-web, @dataroaring, @eldenmoon, @feiniaofeiafei, @felixwluo, @gavinchou, @HappenLee, @hello-stephen, @jacktengg, @Jibing-Li, @Johnnyssc, @liaoxin01, @LiBinfeng-01, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @qidaye, @qzsee, @starocean999, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XuPengfei-1020, @xy720, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md new file mode 100644 index 0000000000000..1b6e54d948d7d --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.13.md @@ -0,0 +1,61 @@ +--- +{ + "title": "Release 2.0.13", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 112 improvements and bug fixes have been made in Doris 2.0.13. + +[Quick Download](https://doris.apache.org/download/) + +## Behavior changes + +SQL input is treated as multiple statements only when the `CLIENT_MULTI_STATEMENTS` setting is enabled on the client side, enhancing compatibility with MySQL. [#36759](https://github.com/apache/doris/pull/36759) + +## New features + +- A new BE configuration `allow_zero_date` has been added, allowing dates with all zeros. When set to `false`, `0000-00-00` is parsed as `NULL`, and when set to `true`, it is parsed as `0000-01-01`. The default value is `false` to maintain consistency with previous behavior. [#34961](https://github.com/apache/doris/pull/34961) + +- `LogicalWindow` and `LogicalPartitionTopN` support multi-field predicate pushdown to improve performance. 
[#36828](https://github.com/apache/doris/pull/36828) + +- The ES Catalog now maps ES `nested` or `object` types to Doris `JSON` types. [#37101](https://github.com/apache/doris/pull/37101) + +## Improvements + +- Queries with `LIMIT` end reading data earlier to reduce resource consumption and improve performance. [#36535](https://github.com/apache/doris/pull/36535) + +- Special JSON data with empty keys is now supported. [#36762](https://github.com/apache/doris/pull/36762) + +- Stability and usability of routine load have been improved, including load balancing, automatic recovery, exception handling, and more user-friendly error messages. [#36450](https://github.com/apache/doris/pull/36450) [#35376](https://github.com/apache/doris/pull/35376) [#35266](https://github.com/apache/doris/pull/35266) [ #33372](https://github.com/apache/doris/pull/33372) [#32282](https://github.com/apache/doris/pull/32282) [#32046](https://github.com/apache/doris/pull/32046) [#32021](https://github.com/apache/doris/pull/32021) [#31846](https://github.com/apache/doris/pull/31846) [#31273](https://github.com/apache/doris/pull/31273) + +- BE load balancing selection of hard disk strategy and speed optimization. [#36826](https://github.com/apache/doris/pull/36826) [#36795](https://github.com/apache/doris/pull/36795) [#36509](https://github.com/apache/doris/pull/36509) + +- Stability and usability of the JDBC catalog have been improved, including encryption, thread pool connection count configuration, and more user-friendly error messages. [#36940](https://github.com/apache/doris/pull/36940) [#36720](https://github.com/apache/doris/pull/36720) [#30880](https://github.com/apache/doris/pull/30880) [#35692](https://github.com/apache/doris/pull/35692) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.12...2.0.13) , with the key features and improvements highlighted below. 
+ +## Credits + +Thanks to all who contributed to this release: + +@Gabriel39, @Jibing-Li, @Johnnyssc, @Lchangliang, @LiBinfeng-01, @SWJTU-ZhangLei, @Thearas, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @jacktengg, @kaijchen, @liutang123, @luwei16, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @sollhui, @starocean999, @w41ter, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md new file mode 100644 index 0000000000000..061c5cb7a1093 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.14.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.14", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 110 improvements and bug fixes have been made in Doris 2.0.14 version + + +## 1 New features + +- Adds a REST interface to retrieve the most recent query profile: `curl http://user:password@127.0.0.1:8030/api/profile/text` [#38268](https://github.com/apache/doris/pull/38268) + +## 2 Improvements + +- Optimizes the primary key point query performance for MOW tables with sequence columns [#38287](https://github.com/apache/doris/pull/38287) + +- Enhances the performance of inverted index queries with many conditions [#35346](https://github.com/apache/doris/pull/35346) + +- Automatically enables the `support_phrase` option when creating a tokenized inverted index to accelerate `match_phrase` phrase queries [#37949](https://github.com/apache/doris/pull/37949) + +- Supports simplified SQL hints, for example: `SELECT /*+ query_timeout(3000) */ * FROM t;` [#37720](https://github.com/apache/doris/pull/37720) + +- Automatically retries reading from object storage when encountering a `429` error to improve stability 
[#35396](https://github.com/apache/doris/pull/35396) + +- LEFT SEMI / ANTI JOIN terminates subsequent matching execution upon matching a qualifying data row to enhance performance. [#34703](https://github.com/apache/doris/pull/34703) + +- Prevents coredump when returning illegal data to MySQL results. [#28069](https://github.com/apache/doris/pull/28069) + +- Unifies the output of type names in lowercase to maintain compatibility with MySQL and be more friendly to BI tools. [#38521](https://github.com/apache/doris/pull/38521) + + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.13...2.0.14) , with the key features and improvements highlighted below. + +## Credits + +Thanks all who contribute to this release: + +@ByteYue, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Lchangliang, @LiBinfeng-01, @Mryange, @XieJiann, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @biohazard4321, @cambyzju, @csun5285, @eldenmoon, @englefly, @freemandealer, @hello-stephen, @hubgeter, @kaijchen, @liaoxin01, @luwei16, @morningman, @morrySnow, @mymeiyi, @qidaye, @sollhui, @starocean999, @w41ter, @wuwenchi, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md new file mode 100644 index 0000000000000..58237f7c3f097 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.15.md @@ -0,0 +1,91 @@ +--- +{ + "title": "Release 2.0.15", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 157 improvements and bug fixes have been made in Doris 2.0.15 version + +- Quick Download: https://doris.apache.org/download + +- GitHub: https://github.com/apache/doris/releases/tag/2.0.15 + +## 1 Behavior Change + +NA + +## 2 New Features + +- Restore now supports deleting redundant tablets 
and partition options. [#39028](https://github.com/apache/doris/pull/39028) + +- Support JSON function `json_search`.[#40948](https://github.com/apache/doris/pull/40948) + +## 3 Improvement and Optimizations + +### Stability + +- Add a FE configuration `abort_txn_after_lost_heartbeat_time_second` for transaction abort time. [#28662](https://github.com/apache/doris/pull/28662) + +- Abort transactions after a BE loses heartbeat for over 1 minute instead of 5 seconds, to avoid overly sensitive transaction aborts. [#22781](https://github.com/apache/doris/pull/22781) + +- Delay scheduling EOF tasks of routine load to avoid an excessive number of small transactions. [#39975](https://github.com/apache/doris/pull/39975) + +- Prefer querying from online disk services to be more robust. [#39467](https://github.com/apache/doris/pull/39467) + +- Skip checking newly inserted rows in non-strict mode partial updates if the row's delete sign is marked. [#40322](https://github.com/apache/doris/pull/40322) + +- To prevent FE OOM, limit the number of tablets in backup tasks, with a default value of 300,000. [#39987](https://github.com/apache/doris/pull/39987) + +### Performance + +- Optimize slow column updates caused by concurrent column updates and compactions. [#38487](https://github.com/apache/doris/pull/38487) + +- When a NullLiteral exists in a filter condition, it can now be folded into False and further converted to an EmptySet to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Improve performance of `ORDER BY` permutation. [#38985](https://github.com/apache/doris/pull/38985) + +- Improve the performance of string processing in inverted indexes. [#37395](https://github.com/apache/doris/pull/37395) + +### Optimizer and Statistics + +- Added support for statements beginning with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Polish aggregate function signature matching. 
[#39352](https://github.com/apache/doris/pull/39352) + +- Drop column statistics and trigger auto analysis after schema change. [#39101](https://github.com/apache/doris/pull/39101) + +- Support dropping cached stats using `DROP CACHED STATS table_name`. [#39367](https://github.com/apache/doris/pull/39367) + +### Multi Catalog and Others + +- Optimize JDBC Catalog refresh to reduce the frequency of client creation. [#40261](https://github.com/apache/doris/pull/40261) + +- Fix thread leaks in JDBC Catalog under certain conditions. [#39423](https://github.com/apache/doris/pull/39423) + +- ARRAY MAP STRUCT types now support `REPLACE_IF_NOT_NULL`. [#38304](https://github.com/apache/doris/pull/38304) + +- Retry delete jobs for failures that are not `DELETE_INVALID_XXX`. [#37834](https://github.com/apache/doris/pull/37834) + +**Credits** + +@924060929, @BePPPower, @BiteTheDDDDt, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Johnnyssc, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Toms1999, @Vallishp, @Yukang-Lian, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @htyoung, @hubgeter, @justfortaste, @liaoxin01, @liugddx, @liutang123, @luwei16, @mongo360, @morrySnow, @qidaye, @smallx, @sollhui, @starocean999, @w41ter, @xiaokang, @xzj7019, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhannngchen, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md new file mode 100644 index 0000000000000..3f8e89cddf946 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.2.md @@ -0,0 +1,157 @@ +--- +{ + "title": "Release 2.0.2", + "language": "en" +} +--- + + + +# Release 2.0.2 + +Thanks to our community users and developers, 489 improvements and bug fixes have been made in Doris 2.0.2. 
+ +## Behavior Changes + +- [Remove json -> operator convert to json_extract #24679](https://github.com/apache/doris/pull/24679) + + Remove json '->' operator since it is conflicted with lambda function syntax. It's a syntax sugar for function json_extract and can be replaced with the former. +- [Start the script to set metadata_failure_recovery #24308](https://github.com/apache/doris/pull/24308) + + Move metadata_failure_recovery from fe.conf to start_fe.sh argument to prevent being used unexpectedly. +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Optimize priority_ network matching logic for be #23795](https://github.com/apache/doris/pull/23795) +- [Fix cancel load failed because Job could not be cancelled… #17730](https://github.com/apache/doris/pull/17730) + + Allow cancel a retrying load job. + +## Improvements + +### Easier to use + +- [Support custom lib dir to save custom libs #23887](https://github.com/apache/doris/pull/23887) + + Add a custom_lib dir to allow users place custom lib files and custom_lib will not be replaced. +- [Optimize priority_ network matching logic #23784](https://github.com/apache/doris/pull/23784) + + Optimize priority_network logic to avoid error when this config is wrong or not configured. +- [Row policy support role #23022](https://github.com/apache/doris/pull/23022) + + Support role based auth for row policy. + +### New optimizer Nereids statistics collection improvement + +- [Disable file cache while running analysis tasks. #23663](https://github.com/apache/doris/pull/23663) +- [Show column stats even when error occurred. #23703](https://github.com/apache/doris/pull/23703) +- [Support basic jdbc external table stats collection. 
#23965](https://github.com/apache/doris/pull/23965) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) + +### Better support for JDBC, HDFS, Hive, MySQL, Max Compute, Multi-Catalog + +- [Support hadoop viewfs. #24168](https://github.com/apache/doris/pull/24168) +- [Avoid calling checksum when replaying creating jdbc catalog and fix ranger issue #22369](https://github.com/apache/doris/pull/22369) +- [Optimize the JDBC Catalog connection error message #23868](https://github.com/apache/doris/pull/23868) + + Improve property check and error message for JDBC catalog +- [Fix mc decimal type parse, fix wrong obj location #24242](https://github.com/apache/doris/pull/24242) + + Fix some issues for Max Compute catalog +- [Support sql cache for hms catalog #23391](https://github.com/apache/doris/pull/23391) + + SQL cache for Hive catalog +- [Merge hms partition events. #22869](https://github.com/apache/doris/pull/22869) + + Improve performance for Hive metadata sync +- [Add metadata_name_ids for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql #22702](https://github.com/apache/doris/pull/22702) + +### Performance for inverted index query + +- [Add bkd index query cache to improve perf #23952](https://github.com/apache/doris/pull/23952) +- [Improve performance for count on index other than match #24678](https://github.com/apache/doris/pull/24678) +- [Improve match performance without index #24751](https://github.com/apache/doris/pull/24751) +- [Optimize multiple terms conjunction query #23871](https://github.com/apache/doris/pull/23871) +Improve performance of MATCH_ALL +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +Improve performance of MATCH + +### Improve Array functions + +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Improve array union support multi
params #24327](https://github.com/apache/doris/pull/24327) +- [Improve explode func with array nested complex type #24455](https://github.com/apache/doris/pull/24455) + +## Important Bug fixes + +- [The parameter positions of timestamp diff function to sql are reversed #23601](https://github.com/apache/doris/pull/23601) +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Fix query cache returns wrong result after deleting partitions. #23555](https://github.com/apache/doris/pull/23555) +- [Fix potential data loss when clone task's dst tablet is cooldown replica #17644](https://github.com/apache/doris/pull/17644) +- [Fix array map batch append data with right next_array_item_rowid #23779](https://github.com/apache/doris/pull/23779) +- [Fix or to in rule #23940](https://github.com/apache/doris/pull/23940) +- [Fix 'char' function's toSql implementation is wrong #23860](https://github.com/apache/doris/pull/23860) +- [Record wrong best plan properties #23973](https://github.com/apache/doris/pull/23973) +- [Make TVF's distribution spec always be RANDOM #24020](https://github.com/apache/doris/pull/24020) +- [External scan use STORAGE_ANY instead of ANY as distibution #24039](https://github.com/apache/doris/pull/24039) +- [Runtimefilter target is not SlotReference #23958](https://github.com/apache/doris/pull/23958) +- [mv in select materialized_view should disable show table #24104](https://github.com/apache/doris/pull/24104) +- [Fail over to remote file reader if local cache failed #24097](https://github.com/apache/doris/pull/24097) +- [Fix revoke role operation cause fe down #23852](https://github.com/apache/doris/pull/23852) +- [Handle status code correctly and add a new error code `ENTRY_NOT_FOUND` #24139](https://github.com/apache/doris/pull/24139) +- [Fix leaky abstraction and shield the status code `END_OF_FILE` from upper layers #24165](https://github.com/apache/doris/pull/24165) +- [Fix bug that Read 
garbled files caused be crash. #24164](https://github.com/apache/doris/pull/24164) +- [Fix be core when user sepcified empty `column_separator` using hdfs tvf #24369](https://github.com/apache/doris/pull/24369) +- [Fix need to restart BE after replacing the jar package in java-udf #24372](https://github.com/apache/doris/pull/24372) +- [Need to call 'set_version' in nested functions #24381](https://github.com/apache/doris/pull/24381) +- [windown_funnel compatibility issue with multi backends #24385](https://github.com/apache/doris/pull/24385) +- [correlated anti join shouldn't be translated to null aware anti join #24290](https://github.com/apache/doris/pull/24290) +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Fix analyze failed when there are thousands of partitions. #24521](https://github.com/apache/doris/pull/24521) +- [Do not use enum as the data type for JavaUdfDataType. #24460](https://github.com/apache/doris/pull/24460) +- [Fix multi window projection issue temporarily #24568](https://github.com/apache/doris/pull/24568) +- [Make metadata compatible with 2.0.3 #24610](https://github.com/apache/doris/pull/24610) +- [Select outfile column order is wrong #24595](https://github.com/apache/doris/pull/24595) +- [Incorrect result of semi/anti mark join #24616](https://github.com/apache/doris/pull/24616) +- [Fix broker read issue #24635](https://github.com/apache/doris/pull/24635) +- [Skip unknown col stats check on __internal_scheam and information_schema #24625](https://github.com/apache/doris/pull/24625) +- [Fixed bug when parsing multi-character delimiters. 
#24572](https://github.com/apache/doris/pull/24572) +- [Fix timezone parse when there is no tzfile #24578](https://github.com/apache/doris/pull/24578) +- [We need to issue an error when starting FE without setting the Java home environment #23943](https://github.com/apache/doris/pull/23943) +- [Enable_unique_key_partial_update should be forwarded to master #24697](https://github.com/apache/doris/pull/24697) +- [Fix paimon file catalog meta issue and replication num analysis issue #24681](https://github.com/apache/doris/pull/24681) +- [Add more log for ingest_binlog && Fix ingest_binlog not rewrite rowset_meta tablet_uid #24617](https://github.com/apache/doris/pull/24617) +- [Do not abort when a disk is broken #24692](https://github.com/apache/doris/pull/24692) +- [colocate join could not work well on full outer join #24700](https://github.com/apache/doris/pull/24700) +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +- [Optimize the reading efficiency of nullable (string) columns. #24698](https://github.com/apache/doris/pull/24698) +- [Fix segment cache core when output rowset is nullptr #24778](https://github.com/apache/doris/pull/24778) +- [Fix duplicate key in schema change #24782](https://github.com/apache/doris/pull/24782) +- [Make metadata compatible for future version after 2.0.2 #24800](https://github.com/apache/doris/pull/24800) +- [Fix map/array deserialize string with quote pair #24808](https://github.com/apache/doris/pull/24808) +- [Failed on arm platform, with clang compiler and pch on, close #24633 #24636](https://github.com/apache/doris/pull/24636) +- [Table column order is changed if add a column and do truncate #24981](https://github.com/apache/doris/pull/24981) +- [Make parser mode coarse grained by default #24949](https://github.com/apache/doris/pull/24949) + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.2-merged+is%3Aclosed) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +[@adonis0147](https://github.com/adonis0147) [@airborne12](https://github.com/airborne12) [@amorynan](https://github.com/amorynan) [@AshinGau](https://github.com/AshinGau) [@BePPPower](https://github.com/BePPPower) [@BiteTheDDDDt](https://github.com/BiteTheDDDDt) [@bobhan1](https://github.com/bobhan1) [@ByteYue](https://github.com/ByteYue) [@caiconghui](https://github.com/caiconghui) [@CalvinKirs](https://github.com/CalvinKirs) [@cambyzju](https://github.com/cambyzju) [@ChengDaqi2023](https://github.com/ChengDaqi2023) [@ChinaYiGuan](https://github.com/ChinaYiGuan) [@CodeCooker17](https://github.com/CodeCooker17) [@csun5285](https://github.com/csun5285) [@dataroaring](https://github.com/dataroaring) [@deadlinefen](https://github.com/deadlinefen) [@DongLiang-0](https://github.com/DongLiang-0) [@Doris-Extras](https://github.com/Doris-Extras) [@dutyu](https://github.com/dutyu) [@eldenmoon](https://github.com/eldenmoon) [@englefly](https://github.com/englefly) [@freemandealer](https://github.com/freemandealer) [@Gabriel39](https://github.com/Gabriel39) [@gnehil](https://github.com/gnehil) [@GoGoWen](https://github.com/GoGoWen) [@gohalo](https://github.com/gohalo) [@HappenLee](https://github.com/HappenLee) [@hello-stephen](https://github.com/hello-stephen) [@HHoflittlefish777](https://github.com/HHoflittlefish777) [@hubgeter](https://github.com/hubgeter) [@hust-hhb](https://github.com/hust-hhb) [@ixzc](https://github.com/ixzc) [@JackDrogon](https://github.com/JackDrogon) [@jacktengg](https://github.com/jacktengg) [@jackwener](https://github.com/jackwener) [@Jibing-Li](https://github.com/Jibing-Li) [@JNSimba](https://github.com/JNSimba) [@kaijchen](https://github.com/kaijchen) [@kaka11chen](https://github.com/kaka11chen) [@Kikyou1997](https://github.com/Kikyou1997) [@Lchangliang](https://github.com/Lchangliang) [@LemonLiTree](https://github.com/LemonLiTree) [@liaoxin01](https://github.com/liaoxin01) 
[@LiBinfeng-01](https://github.com/LiBinfeng-01) [@liugddx](https://github.com/liugddx) [@luwei16](https://github.com/luwei16) [@mongo360](https://github.com/mongo360) [@morningman](https://github.com/morningman) [@morrySnow](https://github.com/morrySnow) @mrhhsg @Mryange @mymeiyi @neuyilan @pingchunzhang @platoneko @qidaye @realize096 @RYH61 @shuke987 @sohardforaname @starocean999 @SWJTU-ZhangLei @TangSiyang2001 @Tech-Circle-48 @w41ter @wangbo @wsjz @wuwenchi @wyx123654 @xiaokang @XieJiann @xinyiZzz @XuJianxu @xutaoustc @xy720 @xyfsjq @xzj7019 @yiguolei @yujun777 @Yukang-Lian @Yulei-Yang @zclllyybb @zddr @zhangguoqiang666 @zhangstar333 @ZhangYu0123 @zhannngchen @zxealous @zy-kkk @zzzxl1993 @zzzzzzzs diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md new file mode 100644 index 0000000000000..a716d6d711fb0 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.3.md @@ -0,0 +1,253 @@ +--- +{ + "title": "Release 2.0.3", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 1000 improvements and bug fixes have been made in Doris 2.0.3 version, including optimizer statistics, inverted index, complex datatypes, data lake, replica management. + + + +## 1 Behavior change + +- The output format of the complex data type array/map/struct has been changed to be consistent to the input format and JSON specification. The main changes from the previous version are that DATE/DATETIME and STRING/VARCHAR are enclosed in double quotes and null values inside ARRAY/MAP are displayed as `null` instead of `NULL`. + - https://github.com/apache/doris/pull/25946 +- SHOW_VIEW permission is supported. Users with SELECT or LOAD permission will no longer be able to execute the 'SHOW CREATE VIEW' statement and must be granted the SHOW_VIEW permission separately. 
+ - https://github.com/apache/doris/pull/25370 + + +## 2 New features + +### 2.1 Support collecting statistics for optimizer automatically + +Collecting statistics helps the optimizer understand the data distribution characteristics and choose a better plan to greatly improve query performance. It is officially supported starting from version 2.0.3 and is enabled all day by default. + +### 2.2 Support complex datatypes for more data lake sources +- Support complex datatypes for JAVA UDF, JDBC and Hudi MOR + - https://github.com/apache/doris/pull/24810 + - https://github.com/apache/doris/pull/26236 +- Support complex datatypes for Paimon + - https://github.com/apache/doris/pull/25364 +- Support Paimon version 0.5 + - https://github.com/apache/doris/pull/24985 + + +### 2.3 Add more builtin functions +- Support the BitmapAgg function in the new optimizer + - https://github.com/apache/doris/pull/25508 +- Support SHA series digest functions + - https://github.com/apache/doris/pull/24342 +- Support the BITMAP datatype in the aggregate functions min_by and max_by + - https://github.com/apache/doris/pull/25430 +- Add milliseconds/microseconds_add/sub/diff functions + - https://github.com/apache/doris/pull/24114 +- Add some JSON functions: json_insert, json_replace, json_set + - https://github.com/apache/doris/pull/24384 + + +## 3 Improvement and optimizations + +### 3.1 Performance optimizations + +- When an inverted index MATCH WHERE condition with a high filter rate is combined with a common WHERE condition with a low filter rate, the I/O of the index column is greatly reduced. +- Optimize the efficiency of random data access after the WHERE filter. +- Optimize the performance of the old get_json_xx function on JSON data types by 2~4x. +- Support a configuration that reduces the priority of the data read thread, preserving CPU resources for real-time writing. 
+- Add the `uuid_numeric` function, which returns LARGEINT and is 20 times faster than the `uuid` function, which returns a string. +- Optimize the performance of CASE WHEN by 3x. +- Cut out unnecessary predicate calculations in storage engine execution. +- Accelerate count performance by pushing down the count operator to the storage tier. +- Optimize the computation performance of nullable types in `AND` and `OR` expressions. +- Support rewriting the limit operator before `join` in more scenarios to improve query performance. +- Eliminate useless `order by` operators from inline views to improve query performance. +- Improve the accuracy of cardinality estimates and cost models in some cases. +- Optimize JDBC catalog predicate pushdown logic. +- Optimize the read efficiency of the file cache when it is enabled for the first time. +- Optimize the Hive table SQL cache policy and use the partition update time stored in HMS to improve the cache hit ratio. +- Optimize MOW compaction efficiency. +- Optimize thread allocation logic for external table queries to reduce memory usage +- Optimize memory usage for column reader. + + + +### 3.2 Distributed replica management improvements + +Distributed replica management improvements include skipping partition deletion, colocate group deletion, balance failure due to continuous write, and hot and cold separation table balance. + + +### 3.3 Security enhancement +- The audit log plug-in uses a token instead of a plaintext password to enhance security + - https://github.com/apache/doris/pull/26278 +- log4j configures security enhancement + - https://github.com/apache/doris/pull/24861 +- Sensitive user information is not displayed in logs + - https://github.com/apache/doris/pull/26912 + + +## 4 Bugfix and stability + +### 4.1 Complex datatypes +- Fix issues that fixed-length CHAR(n) was not truncated correctly in map/struct. 
+ - https://github.com/apache/doris/pull/25725 +- Fix write failure for struct datatype nested in map/array + - https://github.com/apache/doris/pull/26973 +- Fix the issue that count distinct did not support array/map/struct + - https://github.com/apache/doris/pull/25483 +- Fix BE crash in updating to 2.0.3 after the delete complex type appeared in query + - https://github.com/apache/doris/pull/26006 +- Fix BE crash when JSON datatype is in WHERE clause. + - https://github.com/apache/doris/pull/27325 +- Fix BE crash when ARRAY datatype is in OUTER JOIN clause. + - https://github.com/apache/doris/pull/25669 +- Fix reading incorrect result for DECIMAL datatype in ORC format. + - https://github.com/apache/doris/pull/26548 + - https://github.com/apache/doris/pull/25977 + - https://github.com/apache/doris/pull/26633 + +### 4.2 Inverted index +- Fix incorrect results for OR NOT combinations in the WHERE clause when inverted index queries are disabled. + - https://github.com/apache/doris/pull/26327 +- Fix BE crash when write a empty with inverted index + - https://github.com/apache/doris/pull/25984 +- Fix BE crash in index compaction when the output of compaction is empty. + - https://github.com/apache/doris/pull/25486 +- Fix BE crash when adding an inverted index to a newly added column that has no data written to it. +- Fix BE crash when BUILD INDEX after ADD COLUMN without new data written. + - https://github.com/apache/doris/pull/27276 +- Fix missing and leak problem of hardlink for inverted index file. 
+ - https://github.com/apache/doris/pull/26903 +- Fix index file corruption when the disk is temporarily full + - https://github.com/apache/doris/pull/28191 +- Fix incorrect result caused by the optimization that skips reading the index column + - https://github.com/apache/doris/pull/28104 + +### 4.3 Materialized View +- Fix the problem of BE crash caused by repeated expressions in the group by statement +- Fix BE crash when there are duplicate expressions in `group by` statements. + - https://github.com/apache/doris/pull/27523 +- Disable the float/double type in the `group by` clause when a view is created. + - https://github.com/apache/doris/pull/25823 +- Improve the matching of select queries to materialized views + - https://github.com/apache/doris/pull/24691 +- Fix an issue that materialized views could not be matched when a table alias was used + - https://github.com/apache/doris/pull/25321 +- Fix a problem with using percentile_approx when creating materialized views + - https://github.com/apache/doris/pull/26528 + +### 4.4 Table sample +- Fix the problem that table sample queries do not work on partitioned tables. + - https://github.com/apache/doris/pull/25912 +- Fix the problem that table sample queries do not work when a tablet is specified. + - https://github.com/apache/doris/pull/25378 + + +### 4.5 Unique with merge on write +- Fix null pointer exception in conditional update based on primary key + - https://github.com/apache/doris/pull/26881 +- Fix field name capitalization issues in partial update + - https://github.com/apache/doris/pull/27223 +- Fix duplicate keys occurring in MOW tables during schema change repair. + - https://github.com/apache/doris/pull/25705 + + +### 4.6 Load and compaction +- Fix unknown slot descriptor error in routine load for running multiple tables + - https://github.com/apache/doris/pull/25762 +- Fix BE crash due to concurrent memory access when calculating memory + - https://github.com/apache/doris/pull/27101 +- Fix BE crash on duplicate cancel for load. 
+ - https://github.com/apache/doris/pull/27111 +- Fix broker connection error during broker load + - https://github.com/apache/doris/pull/26050 +- Fix incorrect results caused by delete predicates when compaction and scan run concurrently. + - https://github.com/apache/doris/pull/24638 +- Fix the problem that compaction tasks would print too many stacktrace logs + - https://github.com/apache/doris/pull/25597 + + +### 4.7 Data Lake compatibility +- Solve the problem that the iceberg table contains special characters that cause query failure + - https://github.com/apache/doris/pull/27108 +- Fix compatibility issues of different hive metastore versions + - https://github.com/apache/doris/pull/27327 +- Fix an error reading max compute partition table + - https://github.com/apache/doris/pull/24911 +- Fix the issue that backup to object storage failed + - https://github.com/apache/doris/pull/25496 + - https://github.com/apache/doris/pull/25803 + + +### 4.8 JDBC external table compatibility + +- Fix Oracle date type format error in jdbc catalog + - https://github.com/apache/doris/pull/25487 +- Fix MySQL 0000-00-00 date exception in jdbc catalog + - https://github.com/apache/doris/pull/26569 +- Fix an exception in reading data from MariaDB where the default value of the time type is current_timestamp + - https://github.com/apache/doris/pull/25016 +- Fix BE crash when processing BITMAP datatype in jdbc catalog + - https://github.com/apache/doris/pull/25034 + - https://github.com/apache/doris/pull/26933 + + +### 4.9 SQL Planner and Optimizer + +- Fix partition prune error in some scenes + - https://github.com/apache/doris/pull/27047 + - https://github.com/apache/doris/pull/26873 + - https://github.com/apache/doris/pull/25769 + - https://github.com/apache/doris/pull/27636 + +- Fix incorrect sub-query processing in some scenarios + - https://github.com/apache/doris/pull/26034 + - https://github.com/apache/doris/pull/25492 + - https://github.com/apache/doris/pull/25955 + - 
https://github.com/apache/doris/pull/27177 + +- Fix some semantic parsing errors + - https://github.com/apache/doris/pull/24928 + - https://github.com/apache/doris/pull/25627 + +- Fix data loss during right outer/anti join + - https://github.com/apache/doris/pull/26529 + +- Fix incorrect pushdown of predicates past aggregation operators. + - https://github.com/apache/doris/pull/25525 + +- Fix incorrect result header in some cases + - https://github.com/apache/doris/pull/25372 + +- Fix incorrect plan when the nullsafeEquals expression (<=>) is used as the join condition + - https://github.com/apache/doris/pull/27127 + +- Fix column pruning in set operation operators. + - https://github.com/apache/doris/pull/26884 + + +### Others + +- Fix BE crash when the order of columns in a table is changed and then upgraded to 2.0.3. + - https://github.com/apache/doris/pull/28205 + + +See the complete list of improvements and bug fixes on [github dev/2.0.3-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.3-merged+is%3Aclosed). diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md new file mode 100644 index 0000000000000..e1dac58fbf69a --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.4.md @@ -0,0 +1,67 @@ +--- +{ + "title": "Release 2.0.4", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 333 improvements and bug fixes have been made in Doris 2.0.4 version.
+ +**Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- More reasonable and accurate precision and scale inference for decimal data type + - [https://github.com/apache/doris/pull/28034](https://github.com/apache/doris/pull/28034) + +- Support drop policy for user or role + - [https://github.com/apache/doris/pull/29488](https://github.com/apache/doris/pull/29488) + +## New features + +- Support datev1, datetimev1 and decimalv2 datatypes in new optimizer Nereids. +- Support ODBC table for new optimizer Nereids. +- Add `lower_case` and `ignore_above` options for inverted index +- Support `match_regexp` and `match_phrase_prefix` optimization by inverted index +- Support paimon native reader in datalake +- Support audit-log for `insert into` SQL +- Support reading parquet file in lzo compressed format + +## Improvement and optimizations + +- Improve storage management including balance, migration, publish and others. +- Improve storage cooldown policy to save disk space. +- Performance optimization for substr with ascii string. +- Improve partition prune when date function is used. +- Improve auto analyze visibility and performance.
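The inverted index options and match operators listed under the 2.0.4 new features can be sketched roughly as follows. This is a minimal illustration rather than a reference: the table `logs_demo`, its columns, the index names and the property values are all hypothetical, and the comments describe the assumed effect of each option.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE IF NOT EXISTS logs_demo (
    ts      DATETIME,
    level   VARCHAR(64),
    message STRING,
    -- Untokenized index: assumed to skip values longer than 256 characters.
    INDEX idx_level (level) USING INVERTED PROPERTIES ("ignore_above" = "256"),
    -- Tokenized index: tokens are assumed to be lowercased before indexing.
    INDEX idx_message (message) USING INVERTED
        PROPERTIES ("parser" = "english", "lower_case" = "true")
)
DUPLICATE KEY(ts)
DISTRIBUTED BY HASH(ts) BUCKETS 1
PROPERTIES ("replication_num" = "1");

-- Prefix match on the last term of a phrase.
SELECT count(*) FROM logs_demo WHERE message MATCH_PHRASE_PREFIX 'connection tim';

-- Regular-expression match against indexed terms.
SELECT count(*) FROM logs_demo WHERE message MATCH_REGEXP '^time';
```

Keeping `ignore_above` on an untokenized column and `lower_case` on a tokenized one is a deliberate choice in this sketch, since the two options address different kinds of index.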
+ +See the complete list of improvements and bug fixes on github [dev/2.0.4-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.4-merged+is%3Aclosed) + + + +## Credits +Last but not least, this release would not have been possible without the following contributors: + +airborne12, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, bobhan1, ByteYue, caiconghui,CalvinKirs, cambyzju, caoliang-web, catpineapple, csun5285, dataroaring, deardeng, dutyu, eldenmoon, englefly, feifeifeimoon, fornaix, Gabriel39, gnehil, HappenLee, hello-stephen, HHoflittlefish777,hubgeter, hust-hhb, ixzc, jacktengg, jackwener, Jibing-Li, kaka11chen, KassieZ, LemonLiTree,liaoxin01, LiBinfeng-01, lihuigang, liugddx, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, Nitin-Kashyap, platoneko, py023, qidaye, shuke987, starocean999, SWJTU-ZhangLei, w41ter, wangbo, wsjz, wuwenchi, Xiaoccer, xiaokang, XieJiann, xingyingone, xinyiZzz, xuwei0912, xy720, xzj7019, yujun777, zclllyybb, zddr, zhangguoqiang666, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md new file mode 100644 index 0000000000000..20d6bd9302b2c --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.5.md @@ -0,0 +1,73 @@ +--- +{ + "title": "Release 2.0.5", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 217 improvements and bug fixes have been made in Doris 2.0.5 version. 
+ +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- Change char function behaviour: `select char(0) = '\0'` returns true as in MySQL + - https://github.com/apache/doris/pull/30034 +- Allow exporting empty data + - https://github.com/apache/doris/pull/30703 + +## New features +- Eliminate left outer join with `is null` condition +- Add `show-tablets-belong` stmt for analyzing a batch of tablet-ids +- InferPredicates support In, such as `a = b & a in [1, 2] -> b in [1, 2]` +- Optimize plan when column stats are unavailable +- Optimize plan using rollup column stats +- Support analyze materialized view +- Support ShowProcessStmt Show all FE connection + +## Improvement and optimizations +- Optimize query plan when column stats are unavailable +- Optimize query plan using rollup column stats +- Stop analyze quickly after user close auto analyze +- Catch load column stats exception, avoid printing too much stack info to fe.out +- Select materialized view by specifying the view name in SQL +- Change auto analyze max table width default value to 100 +- Escape characters for columns in recovery predicate pushdown in JDBC Catalog +- Fix JDBC MYSQL Catalog `to_date` fun pushdown +- Optimize the close logic of JDBC client +- Optimize JDBC connection pool parameter settings +- Obtain hudi partition information through HMS's API +- Optimize routine load job error msg and memory +- Skip all backup/restore jobs if max allowed option is set to 0 + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.4-rc06...2.0.5-rc02).
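Two of the 2.0.5 items above lend themselves to a short illustration: the `char(0)` behavior change and the `In` support in predicate inference. The tables `t1(a)` and `t2(b)` below are hypothetical; the sketch only restates what the release note describes.

```sql
-- Behavior change: per the release note, this comparison returns true, as in MySQL.
SELECT char(0) = '\0';

-- Predicate inference with IN, on hypothetical tables t1(a) and t2(b).
-- Because t1.a = t2.b and t1.a IN (1, 2), the optimizer can derive
-- t2.b IN (1, 2) and filter t2 before the join.
SELECT *
FROM t1
JOIN t2 ON t1.a = t2.b
WHERE t1.a IN (1, 2);

-- Inspecting the plan is one way to check whether the derived predicate
-- appears on the scan of t2.
EXPLAIN
SELECT *
FROM t1
JOIN t2 ON t1.a = t2.b
WHERE t1.a IN (1, 2);
```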
+ + +## Credits +Thanks all who contribute to this release: + +airborne12, alexxing662, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, ByteYue, caiconghui, cambyzju, catpineapple, dataroaring, eldenmoon, Emor-nj, englefly, felixwluo, GoGoWen, HappenLee, hello-stephen, HHoflittlefish777, HowardQin, JackDrogon, jacktengg, jackwener, Jibing-Li, KassieZ, LemonLiTree, liaoxin01, liugddx, LuGuangming, morningman, morrySnow, mrhhsg, Mryange, mymeiyi, nextdreamblue, qidaye, ryanzryu, seawinde,starocean999, TangSiyang2001, vinlee19, w41ter, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xingyingone, xy720,xzj7019, yujun777, zclllyybb, zhangstar333, zhannngchen, zhiqiang-hhhh, zxealous, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md new file mode 100644 index 0000000000000..9591ed8d3fab8 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.6.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.6", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 114 improvements and bug fixes have been created by 51 contributors in Doris 2.0.6 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- N/A + +## New features +- Support match a function with alias in materialized-view +- Add a command to drop a tablet replica safely on backend +- Add row count cache for external table. 
+- Support analyze rollup to gather statistics for optimizer + +## Improvement and optimizations +- Improve tablet schema cache memory by using deterministic way to serialize protobuf +- Improve show column stats performance +- Support estimate row count for iceberg and paimon +- Support sqlserver timestamp type read for JDBC catalog + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.5-rc02...2.0.6). + + +## Credits +Thanks all who contribute to this release: + +924060929, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, deardeng, DongLiang-0, eldenmoon, englefly, feelshana, feiniaofeiafei, felixwluo, HappenLee, hust-hhb, iwanttobepowerful, ixzc, JackDrogon, Jibing-Li, KassieZ, larshelge, liaoxin01, LiBinfeng-01, liutang123, luennng, morningman, morrySnow, mrhhsg, qidaye, starocean999, TangSiyang2001, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xuwei0912, xy720, xzj7019, yiguolei, yujun777, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md new file mode 100644 index 0000000000000..10f226dbd63b4 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.7.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 2.0.7", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 80 improvements and bug fixes have been made in Doris 2.0.7 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +- `round` function defaults to rounding normally as MySQL, eg. round(5/2) return 3 instead of 2. 
+ + - https://github.com/apache/doris/pull/31583 + +- `round` datetime with scale from string literal as MySQL, eg. round '2023-10-12 14:31:49.666' to '2023-10-12 14:31:50' . + + - https://github.com/apache/doris/pull/27965 + + +## 2 New features +- Support make miss slot as null alias when converting outer join to anti join to speed up query + + - https://github.com/apache/doris/pull/31854 + +- Enable proxy protocol to support IP transparency for Nginx and HAProxy. + + - https://github.com/apache/doris/pull/32338 + + +## 3 Improvement and optimizations + +- Add DEFAULT_ENCRYPTION column in `information_schema` table and add `processlist` table for better compatibility for BI tools + +- Automatically test connectivity by default when creating a JDBC Catalog. + +- Enhance auto resume to keep routine load stable + +- Use lowercase by default for Chinese tokenizer in inverted index + +- Add error msg if exceeded maximum default value in repeat function + +- Skip hidden file and dir in Hive table + +- Reduce file meta cache size and disable cache for some cases to avoid OOM + +- Reduce jvm heap memory consumed by profiles of BrokerLoadJob + +- Remove sort which is under table sink to speed up query like `INSERT INTO t1 SELECT * FROM t2 ORDER BY k`. + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.6...2.0.7) . 
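A few of the 2.0.7 changes above can be tried directly from a SQL client. The statements below are a small sketch; `t1`, `t2` and the column `k` are the placeholder names already used in the release note, and the comments restate the documented behavior rather than verified output.

```sql
-- Behavior change: per the release note, this returns 3 instead of 2.
SELECT round(5/2);

-- The processlist table added to information_schema for BI tool compatibility.
SELECT * FROM information_schema.processlist;

-- The ORDER BY under the table sink is removed by the optimizer,
-- since row order does not matter for the inserted data.
INSERT INTO t1 SELECT * FROM t2 ORDER BY k;
```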
+ + +## 4 Credits + +Thanks all who contribute to this release: + +924060929,airborne12,amorynan,ByteYue,dataroaring,deardeng,feiniaofeiafei,felixwluo,freemandealer,gavinchou,hello-stephen,HHoflittlefish777,jacktengg,jackwener,jeffreys-cat,Jibing-Li,KassieZ,LiBinfeng-01,luwei16,morningman,mrhhsg,Mryange,nextdreamblue,platoneko,qidaye,rohitrs1983,seawinde,shuke987,starocean999,SWJTU-ZhangLei,w41ter,wsjz,wuwenchi,xiaokang,XieJiann,XuJianxu,yujun777,Yulei-Yang,zhangstar333,zhiqiang-hhhh,zy-kkk,zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md new file mode 100644 index 0000000000000..d881a80628b44 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.8.md @@ -0,0 +1,76 @@ +--- +{ + "title": "Release 2.0.8", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 65 improvements and bug fixes have been made in Doris 2.0.8 version. + +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## 1 Behavior change + +The `ADMIN SHOW` statements cannot be executed with newer versions of the MySQL 8.x JDBC driver, so these statements have been renamed by removing the `ADMIN` keyword.
+ +- https://github.com/apache/doris/pull/29492 + +```sql +ADMIN SHOW CONFIG -> SHOW CONFIG +ADMIN SHOW REPLICA -> SHOW REPLICA +ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS +ADMIN SHOW TABLET -> SHOW TABLET +``` + + +## 2 New features + +N/A + + + +## 3 Improvement and optimizations + +- Make Inverted Index work with TopN opt in Nereids + +- Limit the max string length to 1024 while collecting column stats to control BE memory usage + +- JDBC Catalog close when JDBC client is not empty + +- Accept all Iceberg database and do not check the name format of database + +- Refresh external table's rowcount async to avoid cache miss and unstable query plan + +- Simplify the isSplitable method of hive external table to avoid too many hadoop metrics + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.7...2.0.8) . + +## 4 Credits + +Thanks all who contribute to this release: + +924060929, AcKing-Sam, amorynan, AshinGau, BePPPower, BiteTheDDDDt, ByteYue, cambyzju, dongsilun, eldenmoon, feiniaofeiafei, gnehil, Jibing-Li, liaoxin01, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, platoneko, starocean999, SWJTU-ZhangLei, wuwenchi, xiaokang, xinyiZzz, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhiqiang-hhhh, ziyanTOP, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md new file mode 100644 index 0000000000000..04048fc060461 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v2.0/release-2.0.9.md @@ -0,0 +1,75 @@ +--- +{ + "title": "Release 2.0.9", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 68 improvements and bug fixes have been made in Doris 2.0.9 version. 
+ +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +NA + +## 2 New features + +- Support predicates that appear on both key and value mv columns + +- Support mv with `bitmap_union(bitmap_from_array())` + +- Add a FE config to force replicate allocation for OLAP tables in the cluster + +- Support time zones in date literals in the new optimizer Nereids + +- Support slop in fulltext search `match_phrase` to specify word distance + +- Show index id in `SHOW PROC INDEXES` + +## 3 Improvement and optimizations + +- Add a secondary argument in `first_value` / `last_value` to ignore NULL values + +- Allow the offset parameter of the `LEAD` / `LAG` functions to be 0 + +- Adjust priority of materialized view match rule + +- TopN opt reads only limit number of records for better performance + +- Add profile for delete_bitmap get_agg function + +- Refine the Meta cache to get better performance + +- Add FE config `autobucket_max_buckets` + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.8...2.0.9) .
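The window function and full-text search changes in 2.0.9 could look roughly like this in practice. Everything named here is hypothetical (the `events` and `articles` tables and their columns), and two details are assumptions rather than facts taken from the release note: that the second argument of `first_value` is a boolean flag, and that slop is written as `~N` at the end of the phrase.

```sql
-- Hypothetical table: events(user_id BIGINT, ts DATETIME, score INT).
SELECT
    user_id,
    ts,
    -- Assumed form: a second argument of true skips NULL values.
    first_value(score, true) OVER (PARTITION BY user_id ORDER BY ts) AS first_non_null_score,
    -- An offset of 0 refers to the current row itself.
    lag(score, 0, 0) OVER (PARTITION BY user_id ORDER BY ts) AS current_score
FROM events;

-- Hypothetical table: articles(content STRING) with an inverted index on content.
-- Assumed syntax: match the two words when they are at most 2 positions apart.
SELECT count(*) FROM articles WHERE content MATCH_PHRASE 'quick fox ~2';
```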
+ +## Big Thanks + +Thanks all who contribute to this release: + +adonis0147, airborne12, amorynan, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, csun5285, eldenmoon, englefly, feiniaofeiafei, HHoflittlefish777, htyoung, hust-hhb, jackwener, Jibing-Li, kaijchen, kylinmac, liaoxin01, luwei16, morningman, mrhhsg, qidaye, starocean999, SWJTU-ZhangLei, w41ter, xiaokang, xiedeyantu, xy720, zclllyybb, zhangstar333, zhannngchen, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md new file mode 100644 index 0000000000000..baa62b37e1e75 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.0.md @@ -0,0 +1,469 @@ +--- +{ + "title": "Release 3.0.0", + "language": "en" +} +--- + + + + +We are excited to announce the release of Apache Doris 3.0! + +**Starting from version 3.X, Apache Doris supports a compute-storage decoupled mode in addition to the compute-storage coupled mode for cluster deployment. With the cloud-native architecture that decouples the computation and storage layers, users can achieve physical isolation between query loads across multiple compute clusters, as well as isolation between read and write loads. Additionally, users can take advantage of low-cost shared storage systems such as object storage or HDFS to significantly reduce storage costs.** + +Version 3.0 marks a milestone in the evolution of Apache Doris towards a unified data lake and data warehouse architecture. This version introduces the ability to write data back to data lakes, allowing users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Apache Doris. With capabilities such as asynchronous materialized views, Apache Doris can serve as a unified data processing engine for enterprises, helping users better manage data across lakes, warehouses, and databases. 
Also, Apache Doris 3.0 introduces the Trino Connector. It allows users to quickly connect or adapt to more data sources, and leverage the high-performance compute engine of Doris to deliver faster query results than Trino. + +Version 3.0 also enhances support for ETL batch processing scenarios, adding explicit transaction support for operations like `insert into select`, `delete` and `update`. The observability of query execution has also been improved. + +In terms of performance, we have improved the framework capabilities, infrastructure, and rules of the query optimizer in version 3.0. This provides optimized performance, which has been proven by blind testing in more complex and diverse business scenarios. + +The adaptive Runtime Filter computation method now accurately estimates filters based on data size during execution, delivering better performance under large data volumes and high loads. Additionally, asynchronous materialized view has been more stable and user-friendly in query acceleration and data modeling. + +**During the development of version 3.0, over 200 contributors submitted nearly 5,000 optimizations** and fixes to Apache Doris. Contributors from companies such as VeloDB, Baidu, Meituan, ByteDance, Tencent, Alibaba, Kwai, Huawei, and Tianyi Cloud actively collaborated with the community, contributing test cases from real-world use cases to help us improve Apache Doris. We extend our heartfelt thanks to all the contributors involved in the development, testing, and feedback process for this release. + +- **GitHub**: https://github.com/apache/doris/releases + +- **Website**: https://doris.apache.org/download + +## 1. Compute-storage decoupled mode + +Since V3.0, Apache Doris supports the compute-storage decoupled mode. Users can choose between it and the compute-storage coupled mode during cluster deployment. 
+ +In the compute-storage decoupled mode, the BE nodes no longer store the data, but instead, a shared storage layer (HDFS and object storage) is introduced as the shared data storage layer. The computing and storage resources can be scaled independently, bringing multiple benefits to users: + +- **Workload isolation**: Multiple compute clusters can share the same data, allowing users to isolate different business workloads or offline loads using separate compute clusters. + +- **Reduced storage costs**: The full dataset is stored in the more cost-effective and highly reliable shared storage, with only hot data cached locally. Compared to the compute-storage coupled mode with three data replicas, the storage cost can be reduced by up to 90%. + +- **Elastic computing resources**: Since no data is stored on the BE nodes, the computing resources can be scaled flexibly based on the load requirements. Users can scale in or out an individual compute cluster or increase/decrease the number of compute clusters. This also leads to cost savings. + +- **Improved system robustness**: By storing the data in shared storage, Doris no longer needs to handle the complex logic of multi-replica consistency, thus simplifying distributed storage complexity and improving the overall system robustness. + +- **Flexible data sharing and cloning**: The flexibility of the compute-storage decoupled mode extends beyond a single Doris cluster. Tables from one Doris cluster can be easily cloned to another Doris cluster, with just metadata replication. + +### 1-1. From coupled to decoupled + +In the compute-storage coupled mode, the Apache Doris architecture consists of two main process types: Frontend (FE) and Backend (BE). The FE is primarily responsible for user request access, query parsing and planning, metadata management, and node management. The BE is responsible for data storage and query plan execution. 
+ +The BE nodes employ an MPP (Massively Parallel Processing) distributed computing architecture, leveraging a multi-replica consistency protocol to ensure high service availability and high data reliability. + +![From coupled to decoupled](/images/storage-compute-decoupled.PNG) + + +The maturation of emerging cloud computing infrastructure, including public clouds, private clouds, and Kubernetes-based container platforms, has driven the need for cloud-native capabilities. Increasingly, users are seeking deeper integration between Apache Doris and cloud computing infrastructure to provide more elasticity. + +**To address this need, the VeloDB team has designed and implemented a cloud-native version of Apache Doris that decouples compute and storage, known as VeloDB Cloud. After extensive production testing and refinement across hundreds of enterprises over a long time, this cloud-native solution has now been contributed to the Apache Doris community, manifesting as the Apache Doris 3.0 in the compute-storage decoupled mode.** + +In the compute-storage decoupled mode, the Apache Doris architecture consists of three layers: + +- **Meta data layer**: A new Meta Service module has been introduced to provide meta data services, such as processing database and table information, schemas, rowset meta, and transactions. The Meta Service is stateless and horizontally scalable. In V3.0, all of the BE's meta data and parts of the FE's meta data have been migrated to the Meta Service. We will finish the migration of the remains in future versions. +- **Computation layer**: The stateless BE nodes execute query plans and cache a portion of the data and tablet meta data locally to improve query performance. Multiple stateless BE nodes can be organized into a computing resource pool (i.e., compute cluster), and multiple compute clusters can share the same data and metadata service. The compute clusters can be elastically scaled by adding or removing nodes as needed. 
+- **Shared storage layer**: Data is persisted to the shared storage layer, which currently supports HDFS as well as various cloud-based object storage systems that are compatible with the S3 protocol, such as S3, OSS, GCS, Azure Blob, COS, BOS, and MinIO. + +![From coupled to decoupled-2](/images/storage-compute-decoupled-2.JPEG) + +### 1-2 Design highlight + +The design of the compute-storage decoupled mode of Apache Doris highlights the transformation of the FE's in-memory metadata model into a shared metadata service. This approach offers a globally consistent state view, allowing any node to directly submit writes without needing to go through the FE for publishing. During write operations, data is stored in shared storage, while metadata is managed by the metadata service. **This effectively controls the number of small files in shared storage. Meanwhile, the real-time write performance for individual tables is nearly on par with that in the compute-storage coupled mode. The system's overall write capacity is no longer limited by the processing power of a single FE node.** + +![Design highlight](/images/design-hightlight.PNG) + +Based on the globally consistent state view, for data garbage collection, we have adopted a design approach for data deletion that is easier to prove correct and more efficient. + +Specifically, data in the shared storage is incorporated into the globally consistent view offered by the shared meta data service. Whenever data is generated, we bind it to a separate, independent transaction. Similarly, for a meta data deletion operation, we also bind it to a separate, independent transaction. The purpose of this approach is to ensure that deletion and write operations cannot succeed together. The view records which data needs to be deleted, and the asynchronous deletion process can simply perform a forward deletion of the data based on the transaction records, without the need for reverse garbage collection. 
+ +As the tablet-related meta data in the FE is gradually migrated to the shared meta data service, the scalability of the Doris cluster will no longer be constrained by the memory capacity of a single FE node. Building upon the shared meta data service and the forward data deletion technique, we can conveniently expand functionality such as data sharing and lightweight cloning. + +### 1-3 Comparison with alternative solutions + +Another design of decoupling compute and storage in the industry is to store the data and BE node meta data in a shared object storage or HDFS. However, this approach brings the following problems: + +- **Inability to support real-time writes**: During data writes, the data is mapped to tablets based on the partitioning and bucketing rules, generating segment files and rowset meta data. During the write process, a two-phase commit (Publish) is performed through the FE. When a BE node receives the Publish request, it then sets the rowset as visible. The Publish operation must not fail. If the rowset meta data is stored in the shared storage, the total small file data during the real-time write process would triple the size of the actual data files - one replica of data files, one for rowset meta data, and another for rowset meta data changes during Publish. The Publish operation is driven by a single FE node, so the write capacity of a single table or even the entire system is limited by the FE node's capabilities. + + ![Comparison with alternative solutions](/images/comparison-with-alternative-solutions.png) + + We compared the real-time data write performance of Apache Doris 3.0 with the above-described solution. We simulated 500 concurrent tasks writing 10,000 data files with 500 rows each, and 50 concurrent tasks writing 250 data files with 20,000 rows each, using the same computational resources. 
+ + **The results showed that at 50 concurrent tasks, the micro-batch write performances of Apache Doris in both compute-storage coupled and decoupled modes were almost identical, while the industry solution lagged behind Apache Doris by a factor of 100.** + + At 500 concurrent tasks, the performance of Apache Doris in the compute-storage decoupled mode showed slight degradation, but it still maintained an 11X advantage over the industry solution. To ensure a fair test, Apache Doris did not enable the Group Commit feature (which the industry solution lacks). Enabling Group Commit would further enhance real-time write performance. + + ![Comparison with alternative solutions](/images/real-time-write-performance..png) + + Additionally, the industry solution also faces stability and cost issues in terms of real-time data ingestion: + + - Stability concerns: A large number of small files can put pressure on the shared storage, especially HDFS, and introduce stability risks. + + - High object storage request costs: Some public cloud object storage services charge 10 times more for Put and Delete operations compared to Get operations. A large number of small files can lead to a significant increase in object storage request costs, which can even exceed the storage costs. + +- **Limited scalability**: Use cases of the compute-storage decoupled mode often handle larger data volumes. Since the FE (Frontend) meta data is held entirely in memory, once the number of tablets reaches a certain level (e.g. tens of millions), the FE's memory pressure can become a bottleneck that limits the overall write throughput of the system. + +- **Potential data deletion logic issues**: In the compute-storage decoupled architecture, data is stored with one single replica. Therefore, the data deletion logic is critical for the system's reliability. The conventional approach of cross-system data deletion by comparing the differences can be challenging.
During the write process, there is no way to completely prevent a deletion and a write from both succeeding, which can lead to data loss. Additionally, when the storage system experiences anomalies, the input used for difference calculation may be incorrect, which potentially leads to unintended data deletion. + +- **Data sharing and lightweight cloning**: The flexibility of the decoupled storage-compute architecture can enable future data sharing and lightweight data cloning, reducing the burden of enterprise data management. However, if each cluster has a separate FE, after cloning data across clusters, it becomes difficult to accurately determine which data is no longer referenced and can be safely deleted, as calculating cross-cluster references can easily lead to unintended data deletion. + +By evolving the FE's full in-memory meta data model into a shared meta data service, Apache Doris 3.0 avoids all the aforementioned issues. + +### 1-4 Query performance comparison + +In the compute-storage decoupled mode, data needs to be read from the remote shared storage system, so the main bottleneck becomes network bandwidth rather than disk I/O as in the compute-storage coupled mode. + +To accelerate data access, Apache Doris has implemented a high-speed caching mechanism based on local disks, and provides two cache management policies: LRU (Least Recently Used) and TTL (Time-To-Live). The newly imported data is asynchronously written to the cache to accelerate the first-time access to the latest data. If the data required by a query is not in the cache, the system will read the data from the remote storage into memory and synchronously write it to the cache for subsequent queries. + +In use cases involving multiple compute clusters, Apache Doris provides a cache preheating function. When a new compute cluster is established, users can choose to preheat specific data (such as tables or partitions) to further improve query efficiency.
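As a rough sketch of how the TTL cache policy and cache preheating described above might be used from SQL: the table `orders_demo`, the cluster name `cluster_b` and the partition name `p202412` are hypothetical, and the property name `file_cache_ttl_seconds` and the `WARM UP CLUSTER` statement are written from memory of the Doris 3.0 documentation, so both should be checked against the documentation for the version in use.

```sql
-- Hypothetical table in compute-storage decoupled mode.
-- Assumed property: keep newly written data in the local file cache for one day.
CREATE TABLE IF NOT EXISTS orders_demo (
    order_id   BIGINT,
    created_at DATETIME,
    amount     DECIMAL(12, 2)
)
DUPLICATE KEY(order_id)
DISTRIBUTED BY HASH(order_id) BUCKETS 8
PROPERTIES ("file_cache_ttl_seconds" = "86400");

-- Assumed statement: preheat the cache of a newly created compute cluster
-- with the data of one table before routing queries to it.
WARM UP CLUSTER cluster_b WITH TABLE orders_demo;

-- Assumed variant: preheat a single (hypothetical) partition only.
WARM UP CLUSTER cluster_b WITH TABLE orders_demo PARTITION p202412;
```

Data without a TTL setting would fall under the LRU policy, so a TTL is mainly useful for recent data that must stay cached regardless of access frequency.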
+ +In this context, we have conducted performance tests with different caching strategies in both the compute-storage coupled and decoupled modes, using the TPC-DS 1TB test dataset. The results are concluded as follows: + +- When the cache is fully hit (i.e., all the data required for the query is loaded into the cache), **the query performance of the compute-storage decoupled mode is on par with that of the compute-storage coupled mode**. + +- When the cache is partially hit (i.e., the cache is cleared before the test, and data is gradually loaded into the cache during the test, with performance continuously improving), the query performance of the compute-storage decoupled mode is about 10% lower than that of the compute-storage coupled mode. This test scenario is the most similar to the real-life use cases. + +- When the cache is completely missed (i.e., the cache is cleared before every SQL execution, simulating an extreme case), the performance loss is around 35%. **Even so, Apache Doris in the compute-storage decoupled mode delivers much higher performance than its alternative solutions.** + +![Query performance comparison](/images/query-performance-comparison.png) + +### 1-5 Write speed comparison + +In terms of write performance, we have simulated two test cases under the same computing resources: batch import and high-concurrency real-time import. The comparison of write performance between the compute-storage coupled mode and the compute-storage decoupled mode is as follows: + +- **Batch import**: When importing the 1TB TPC-H and 1TB TPC-DS test datasets, **the write performance of the compute-storage decoupled mode is 20.05% and 27.98% higher than the compute-storage coupled mode**, respectively, under the single-replica configuration. During batch import, the segment file size is generally in the range of tens to hundreds of MB. 
In the compute-storage decoupled mode, the segment files are split into smaller files and concurrently uploaded to the object storage, which can result in higher throughput compared to writing to local disks. In real-life deployments, the compute-storage coupled mode typically uses three replicas, which means the write speed advantage of the compute-storage decoupled mode will be even more pronounced. + +- **High-concurrency real-time import**: as described in the "Comparison with alternative solutions" section. + +![Write speed comparison](/images/write-speed-comparison.png) + +### 1-6 Tips for production environment + +- **Performance**: For real-time data analysis, users can achieve query performance comparable to the compute-storage coupled mode by specifying a TTL (Time-To-Live) for the cache and writing newly ingested data into the cache. To prevent query jitter, users can cache the data generated by background tasks such as compaction and schema changes based on how frequently the data is used. + +- **Workload isolation**: Users can achieve physical resource isolation for different businesses using multiple compute clusters. For workload isolation within a single compute cluster, users can utilize the Workload Group mechanism to limit and isolate resources for different queries. + +### 1-7 Notes + +- Apache Doris 3.0 does not support the co-existence of the compute-storage coupled mode and the compute-storage decoupled mode. Users need to specify one of them during cluster deployment. + +- If users need the compute-storage coupled mode, follow the [documentation](https://doris.apache.org/docs/3.0/install/source-install/compilation-with-docker/) for its deployment and upgrade. We recommend using Doris Manager for quick deployment and cluster upgrades. However, the compute-storage decoupled mode does not yet support Doris Manager deployment and upgrade. We will keep iterating to improve support in future versions.
+ +- Currently, Apache Doris does not support in-place upgrade from V2.1 to the compute-storage decoupled mode of V3.0. For this purpose, users need to perform data migration using tools like X2Doris after deploying the compute-storage decoupled clusters. In the future, we will support migration without service interruption through the CCR (Cross-Cluster Replication) capability. + +:::info +See doc: +https://doris.apache.org/docs/3.0/compute-storage-decoupled/overview/ +::: + +## 2. Data lakehouse + +Apache Doris is positioned as a real-time data warehouse, but it is much more than that. In previous versions, we have consistently pushed beyond the boundaries of traditional data warehouse capabilities, advancing towards a unified data lakehouse. Version 3.0 marks a milestone in this journey, with its capabilities in the lakehouse architecture becoming fully mature. We believe that a unified lakehouse is identified by **boundaryless data** and **lakehouse fusion**: + +**Boundaryless data: Apache Doris serves as a unified query processing engine, breaking down data barriers across different systems. It provides a consistent and ultra-fast analysis experience across all data sources, including data warehouses, data lakes, data streams, and local data files.** + +- **Lakehouse query acceleration**: Without the need to migrate data to Apache Doris, users can leverage Doris’ efficient query engine to directly query data stored in data lakes such as Iceberg, Hudi, Paimon, and offline data warehouses like Hive, thereby accelerating query analysis. + +- **Federated analysis**: By extending its catalog and storage plugins, Apache Doris enhances its federated analysis capabilities, allowing users to perform unified analysis across multiple heterogeneous data sources without physically centralizing the data in a single storage system.
This enables external table queries and federated joins between internal and external tables, breaking down data silos and providing globally consistent data insights. + +- **Data lake construction**: Apache Doris introduces write-back functionality for Hive and Iceberg, allowing users to directly create Hive and Iceberg tables through Doris and write data into them. This allows users to write internal table data back to the offline lakehouse or process offline lakehouse data using Doris and save the results back into the lakehouse, simplifying and streamlining the data lake construction process. + +**Lakehouse fusion: As data lake architectures become increasingly complex, the costs of technology selection and maintenance rise for users. Achieving consistent fine-grained access control across multiple systems also becomes challenging, and real-time performance suffers. To address this, Apache Doris integrates core features of the data lake, transforming itself into a lightweight, efficient, native real-time lakehouse.** + +- **Real-time data updates**: Starting with version 1.2, Apache Doris enhanced the primary key model by introducing Merge-on-Write, supporting real-time updates. This feature allows high-frequency, real-time data updates based on primary key changes from upstream data sources. + +- **Data science and AI computation support**: From version 2.1, Apache Doris has used the efficient Arrow Flight protocol to increase the openness of its storage system and its support for various compute workloads, enabling data science and AI computations. + +- **Enhancements for semi-structured and unstructured data**: Apache Doris has introduced support for data types like Array, Map, Struct, JSON, and Variant, with plans to support vector indexing in the future.
+ +- **Improved resource efficiency by decoupling storage and compute**: With version 3.0, Apache Doris supports a decoupled storage and compute mode, further improving resource efficiency and scalability. + +### 2-1 Faster queries in the data lakehouse + +TPC-H and TPC-DS benchmarks show that Apache Doris achieves average query performance that is 3 to 5 times faster than Trino/Presto. + +In V3.0, we have focused on optimizing query performance for production environments, including: + +- **More granular task splitting strategy**: By adjusting the consistent hashing algorithm and introducing a task sharding weighting mechanism, we ensure balanced query loads across all nodes. + +- **Scheduling optimizations for use cases with numerous partitions and files**: For cases with a large number of files (over 1 million), we have substantially reduced query latency (from 100 seconds to 10 seconds) and alleviated memory pressure on the Frontend (FE) by fetching file shards asynchronously and in batches. + +We will continue to enhance query acceleration performance in real-world business scenarios, improve the actual user experience, and build an industry-leading lakehouse query acceleration engine. + +### 2-2 Federated analysis: more data connectors + +Previous versions of Apache Doris supported connectors for over 10 mainstream data lakehouses, warehouses, and relational databases. In V3.0, we have introduced the Trino Connector compatibility framework, which expands the range of data sources that Apache Doris can connect to. With this framework, users can easily adapt their existing setups to access corresponding data sources using Doris and leverage its high-speed computing engine for data analysis. + +Currently, Doris has completed adaptations for Delta Lake, Kudu, BigQuery, Kafka, TPCH, and TPCDS. We also encourage contributions from developers to extend this list.
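+
+As a rough sketch of how a Trino-based connector is wired in, a catalog backed by the TPCH connector might be declared as shown below. This assumes the corresponding Trino connector plugin has already been installed on the FE and BE nodes. The catalog name `tpch_catalog` is hypothetical, and the `trino.`-prefixed property name follows the pattern used in the Trino Connector documentation, so it should be verified there before use.
+
+```sql
+-- Hypothetical catalog name; assumes the Trino TPCH connector plugin is installed.
+CREATE CATALOG tpch_catalog PROPERTIES (
+    "type" = "trino-connector",
+    "trino.connector.name" = "tpch"
+);
+
+-- Once created, the catalog can be browsed like any other external catalog.
+SWITCH tpch_catalog;
+SHOW DATABASES;
+```
+
+Connector-specific options, if any, would be passed as additional properties in the same statement.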
+ +:::info Note + +See doc: + +- Trino Connector: https://doris.apache.org/community/how-to-contribute/trino-connector-developer-guide/ + +- TPC-H: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpch/ + +- TPC-DS: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/tpcds/ + +- Delta Lake: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/deltalake/ + +- Kudu: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/kudu/ + +- BigQuery: https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/bigquery/ +::: + + +### 2-3 Data lake building + +In V3.0, we have introduced data writeback functionality for Hive and Iceberg. This allows users to create Hive and Iceberg tables directly through Doris and write data into these tables, and enables users to perform data analysis, sharing, processing, and storage operations across multiple data sources within Doris. + +In future iterations, Apache Doris will further enhance support for data lake table formats and improve the openness of storage APIs. + +:::info Note +See doc: https://doris.apache.org/docs/3.0/lakehouse/datalake-building/hive-build/ +::: + +## 3. Upgraded semi-structured data analysis capabilities + +In versions 2.0 and 2.1, Apache Doris introduced some well-embraced features such as inverted index, NGram Bloom Filter, and Variant data type to support high-performance full-text search and multi-dimensional analysis. With them, the storage and processing of complex semi-structured data have been more flexible and efficient. + +In V3.0, we have further enhanced the capabilities in this scenario. + +After extensive testing in production environments, the Variant data type has gained sufficient stability and become the preferred choice for JSON data storage and analysis. 
In V3.0, we have made multiple optimizations to it: + +- Support for indexing of the Variant data type to accelerate queries, including inverted index, Bloom Filter index, and the built-in ZoneMap index. + +- Support for flexible partial column updates for Unique Key tables containing the Variant data type. + +- Support for the use of the Variant data type in the compute-storage decoupled mode, with optimizations of its metadata storage. + +- Support for exporting the Variant data type to formats such as Parquet and CSV. + +The inverted index, introduced since V2.0, has reached a high level of maturity after more than a year of refinement and is now running in production environments of hundreds of enterprises. In V3.0, we have made multiple optimizations to the inverted index: + +- After performance optimizations, including lock concurrency, Apache Doris outperforms Elasticsearch in key metrics such as query latency and concurrency in real-time reporting analysis. + +- Optimized index file in the compute-storage decoupled mode to reduce remote storage calls and decrease index query latency. + +- Support for the Array data type to accelerate the `array_contains` queries. + +- Enhanced the `match_phrase_*` functionality, including support for slop and phrase prefix matching `match_phrase_prefix`. + +## 4. Enhanced ETL capabilities + +### 4-1. Transaction improvements + +Data processing in data warehouses often involves multiple data changes that need to be handled as a single transaction. V3.0 provides explicit transaction support for `insert into select`, `delete`, and `update` operations. Example cases include: + +- **Transactional requirements**: For example, when updating data within a time range, the typical approach is to first delete the data in that time range, and then insert the new data. Considering that the data might already be in service, there is a need to ensure that queries visit either the old data or the new data. 
This can be achieved by executing the `delete` and `insert into select` operations in a transaction. + + ```sql + -- Replace one month of data atomically: queries see either the old or the new data + BEGIN; + DELETE FROM table WHERE date >= "2024-07-01" AND date <= "2024-07-31"; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +- **Simplified handling of failed tasks**: For example, when two `insert into select` operations are executed within a single transaction, if either operation fails, it can be retried directly. + + ```sql + -- Both inserts commit together under one transaction label + BEGIN WITH LABEL label_etl_1; + INSERT INTO table1 SELECT * FROM stage_table1; + INSERT INTO table SELECT * FROM stage_table; + COMMIT; + ``` + +:::info Note +See doc: https://doris.apache.org/docs/3.0/data-operate/transaction/ +Currently, explicit transaction synchronization is not supported in Cross-Cluster Replication (CCR). +::: + +### 4-2. Improved observability + +- **Real-time profile retrieval**: Some complex queries have high computational requirements because of issues with the execution plan or the data. In previous versions, developers could only access the query profile for performance analysis after the query completed, which made it hard to promptly identify issues in query execution and guarantee the stability of the production environment. Now, with the ability to retrieve real-time profiles, V3.0 allows users to monitor query execution as the query is running. It also allows them to better monitor the progress of each ETL job. + +- **`backend_active_tasks` system table**: The `backend_active_tasks` system table provides real-time resource consumption information for each query on each BE node. Users can analyze this system table using SQL to obtain the resource usage of each query, which helps identify large queries or abnormal workloads. + +## 5. Asynchronous materialized view + +In V3.0, asynchronous materialized view is faster and more stable. It is also more user-friendly for query acceleration and data modeling scenarios.
We have restructured the logic for transparent rewrite and expanded its capabilities, making it 2X faster. + +### 5-1 Refresh + +- Support for incremental update of materialized views by partitions and partition roll-ups on materialized views to allow refreshes at different granularities. + +- Support for nested materialized views, which is useful in data modeling scenarios. + +- Support for index creation and sort key specification in asynchronous materialized views, which will improve query performance after the materialized view is hit. + +- Higher usability of materialized view DDL with support for atomically replacing materialized views, allowing modifications to the materialized view definition SQL while keeping the materialized view available. + +- Support for non-deterministic functions in materialized views to better serve daily materialized view creation. + +- Support for trigger-based materialized view refresh, which ensures data consistency in data modeling with nested materialized views. + +- Support for a broader range of SQL patterns for building partitioned materialized views, making the incremental update capability available to more use cases. + +### 5-2 Refresh stability + +- V3.0 supports specifying a Workload Group for building materialized views. This is to limit the resources used by the materialized view build process and ensure that sufficient resources remain available for ongoing queries. + +### 5-3 Transparent rewrite + +- Support for transparent rewrite of more Join types, including derived Joins. Even when there is a mismatch of Join types between the query and materialized view, transparent rewrite can still be performed by compensating with additional predicates, as long as the materialized view can provide all the data needed for the query. 
+ +- Support for more aggregate functions for roll-up as well as rewrite of multi-dimensional aggregations like GROUPING SETS, ROLLUP, and CUBE; support rewriting queries with aggregations when the materialized view does not contain aggregations, simplifying Join operations and expression computation. + +- Support for transparent rewrite of nested materialized views, enabling higher performance for complex queries. + +- For partially invalid partitioned materialized views, V3.0 supports `Union All` the base tables for data completion, expanding the applicability of partitioned materialized views. + +### 5-4 Transparent rewrite performance + +- Continuous optimization has been done to improve the transparent rewrite performance, achieving 2X the speed compared to version 2.1.0. + +:::info Note + +See doc: + +https://doris.apache.org/docs/3.0/query/view-materialized-view/query-async-materialized-view + +https://doris.apache.org/docs/3.0/query/view-materialized-view/async-materialized-view/ + +::: + +## 6. Performance improvement + +### 6-1 Smarter optimizer + +In V3.0, the query optimizer has been enhanced in terms of framework capabilities, distributed plan support, optimizer infrastructure, and rule expansion. It provides better optimization capabilities for more complex and diverse business scenarios, with higher blind test performance for complex SQL: + +- **Improved plan enumeration capability**: The key structure Memo for plan enumeration has been restructured and normalized. This improves the efficiency of the Cascades framework in plan enumeration and the possibility of producing better plans. Additionally, it fixes incomplete column pruning during the Join Reorder process in older versions, which led to unnecessary overhead of the Join operator, thus improving the execution performance in the relevant scenarios. 
+ +- **Improved distributed plan support**: The distributed query plan has been enhanced to allow aggregation, join, and window function operations to more intelligently identify the data characteristics of intermediate computation results, avoiding ineffective data redistribution operations. Meanwhile, we have optimized the execution under the multi-replica continuous execution mode, making it more data cache-friendly. + +- **Improved optimizer infrastructure**: V3 has fixed several issues in cost model and statistics information estimation. The fixes to the cost model are more adaptable to the evolution of the execution engine, making the execution plan more stable compared to previous versions. + +- **Enhanced Runtime Filter plan support**: On the basis of Join Runtime Filter, V3.0 has expanded the capability of the TopN Runtime Filter to achieve better performance in use cases that involve a TopN operator. + +- **Enriched optimization rule library**: Based on user feedback and internal testing results, we have introduced optimization rules such as Intersect Reorder to enrich the rule set of the optimizer. + +### 6-2 Self-adaptive Runtime Filter + +In previous versions, the generation of Runtime Filter relies on manual setting by users based on statistical information. However, inaccurate settings in certain cases could lead to performance instability. + +In V3.0, Doris implements a self-adaptive Runtime Filter calculation approach. It can estimate the Runtime Filter at runtime based on the data size with high accuracy, enabling better performance in use cases with large data volumes and high workloads. + +### 6-3 Function performance optimization + +- V3.0 has improved the vectorized implementation of dozens of functions, enabling a performance improvement of over 50% for some commonly used functions. +- V3.0 has also made extensive optimizations to the aggregation of nullable data types, enabling a 30% performance improvement. 
+ +### 6-4 Blind test performance improvement + +Our blind tests on V3.0 and V2.1 show that the new version is 7.3% and 6.2% faster in TPC-DS and TPC-H benchmark tests, respectively. + +![Blind test performance improvement](/images/blind-test-performance-improvement.png) + +## 7. New features + +### 7-1 Java UDTF + +Version 3.0 has added support for Java UDTFs. The key operations are as follows: + +- Implementing a UDTF: Similar to a UDF, a UDTF requires the user to implement an `evaluate` method. Note that the return value of a UDTF function must be of the `Array` data type. + + ```java + import java.util.ArrayList; + import java.util.Arrays; + + public class UDTFStringTest { + public ArrayList<String> evaluate(String value, String separator) { + if (value == null || separator == null) { + return null; + } else { + return new ArrayList<>(Arrays.asList(value.split(separator))); + } + } + } + ``` + +- Creating a UDTF: By default, two corresponding functions will be created - `java-utdf` and `java-utdf_outer`. The `_outer` suffix adds a single row of `NULL` data when the table function generates 0 rows of output. + + ```sql + CREATE TABLES FUNCTION java-utdf(string, string) RETURNS array<string> PROPERTIES ( + "file"="file:///pathTo/java-udaf.jar", + "symbol"="org.apache.doris.udf.demo.UDTFStringTest", + "always_nullable"="true", + "type"="JAVA_UDF" + ); + ``` + +:::info + +See doc: https://doris.apache.org/docs/3.0/query/udf/java-user-defined-function/#udtf-1 + +::: + +### 7-2 Generated column + +A generated column is a special column whose value is calculated from the values of other columns rather than directly inserted or updated by the user. It supports pre-computing the results of expressions and storing them in the database, which is suitable for scenarios that require frequent queries or complex calculations. + +Results can be automatically calculated based on predefined expressions when data is imported or updated, and then stored persistently.
In this way, during subsequent queries, the system can directly access these calculated results without performing complex calculations, thereby improving query performance. + +Generated columns are supported since V3.0. When creating a table, you can specify a column as generated column. A generated column automatically calculates values based on the defined expression when data is written. Generated columns allow for more complex expressions to be defined, but the value cannot be explicitly written or set. + +:::info + +See doc: https://doris.apache.org/docs/3.0/sql-manual/sql-statements/Data-Definition-Statements/Create/CREATE-TABLE-AND-GENERATED-COLUMN/ + +::: + +## 8. Functional improvements + +### 8-1. Materialized view + +We have refactored the selection logic for materialized views and migrated it from the rule-based optimizer (RBO) to the cost-based optimizer (CBO). This aligns the selection logic with that of asynchronous materialized views. This functionality is enabled by default. If any issues are encountered, you can revert to the RBO mode using `set global enable_sync_mv_cost_based_rewrite = false`. + +### 8-2. Routine Load + +In previous versions, the Routine Load functionality faced some usability challenges, such as uneven task scheduling across BE nodes, untimely task scheduling, complex configuration requirements (the need to change multiple FE and BE settings for optimization), insufficient overall stability (where restarts or upgrades could frequently pause Routine Load jobs, requiring manual user intervention to resume). + +To address these issues, we have made extensive optimizations to the Routine Load feature: + +- **Resource scheduling**: We have improved the scheduling balance to make sure that tasks are more evenly distributed across BE nodes. Jobs that encounter unrepairable errors will be promptly paused to avoid wasting resources on futile scheduling attempts. 
Additionally, we have improved the timeliness of the scheduling process, which has enhanced the import performance of Routine Load. + +- **Parameter configuration**: Users in most environments no longer need to modify FE and BE configurations for optimization. An automatic adjustment mechanism for timeout parameters has been introduced to prevent tasks from constantly retrying when cluster pressure increases. + +- **Stability**: We have enhanced the robustness of Doris in various exceptional scenarios, such as FE failovers, BE rolling upgrades, and Kafka cluster anomalies, ensuring continuous stable operation. We have also optimized the Auto Resume mechanism, allowing Routine Load to automatically resume operation after faults are repaired, reducing the need for manual user intervention. + +## 9. Behavior changed + +- `cpu_resource_limit` will no longer be supported, and all types of resource isolation will be implemented through Workload Groups. + +- Please use JDK 17 for Apache Doris 3.0 and later versions. The recommended version is `jdk-17.0.10_linux-x64_bin.tar.gz`. + +## Try Apache Doris 3.0 now! + +Before the official release of version 3.0, the compute-storage decoupled mode of Apache Doris has undergone nearly two years of extensive testing and optimization in the production environments of hundreds of enterprises. Contributors from many tech giants have collaborated with the community to provide a significant number of test cases based on their real-world business needs. This has rigorously validated the usability and stability of version 3.0. + +We highly recommend that users with compute-storage decoupling needs download version 3.0 and experience it firsthand. + +Going forward, we will accelerate our release iteration cycle to deliver a more stable version experience for all users.
Feel free to join us in the [Apache Doris community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ) and engage directly with the core developers. + +## Credits + +Special thanks to the following contributors who participated in the development, testing, and provided feedback for this version: + +@133tosakarin、@390008457、@924060929、@AcKing-Sam、@AshinGau、@BePPPower、@BiteTheDDDDt、@ByteYue、@CSTGluigi、@CalvinKirs、@Ceng23333、@DarvenDuan、@DongLiang-0、@Doris-Extras、@Dragonliu2018、@Emor-nj、@FreeOnePlus、@Gabriel39、@GoGoWen、@HappenLee、@HowardQin、@Hyman-zhao、@INNOCENT-BOY、@JNSimba、@JackDrogon、@Jibing-Li、@KassieZ、@Lchangliang、@LemonLiTree、@LiBinfeng-01、@LompleZ、@M1saka2003、@Mryange、@Nitin-Kashyap、@On-Work-Song、@SWJTU-ZhangLei、@StarryVerse、@TangSiyang2001、@Tech-Circle-48、@Thearas、@Vallishp、@WinkerDu、@XieJiann、@XuJianxu、@XuPengfei-1020、@Yukang-Lian、@Yulei-Yang、@Z-SWEI、@ZhongJinHacker、@adonis0147、@airborne12、@allenhooo、@amorynan、@bingquanzhao、@biohazard4321、@bobhan1、@caiconghui、@cambyzju、@caoliang-web、@catpineapple、@cjj2010、@csun5285、@dataroaring、@deardeng、@dongsilun、@dutyu、@echo-hhj、@eldenmoon、@elvestar、@englefly、@feelshana、@feifeifeimoon、@feiniaofeiafei、@felixwluo、@freemandealer、@gavinchou、@ghkang98、@gnehil、@hechao-ustc、@hello-stephen、@httpshirley、@hubgeter、@hust-hhb、@iszhangpch、@iwanttobepowerful、@ixzc、@jacktengg、@jackwener、@jeffreys-cat、@kaijchen、@kaka11chen、@kindred77、@koarz、@kobe6th、@kylinmac、@larshelge、@liaoxin01、@lide-reed、@liugddx、@liujiwen-up、@liutang123、@lsy3993、@luwei16、@luzhijing、@lxliyou001、@mongo360、@morningman、@morrySnow、@mrhhsg、@my-vegetable-has-exploded、@mymeiyi、@nanfeng1999、@nextdreamblue、@pingchunzhang、@platoneko、@py023、@qidaye、@qzsee、@raboof、@rohitrs1983、@rotkang、@ryanzryu、@seawinde、@shoothzj、@shuke987、@sjyango、@smallhibiscus、@sollhui、@sollhui、@spaces-X、@stalary、@starocean999、@superdiaodiao、@suxiaogang223、@taptao、@vhwzx、@vinlee19、@w41ter、@wangbo、@wangshuo128、@whutpencil、@wsjz、@wuwenchi、@wyxxxcat、@xiaokang、@xiede
yantu、@xiedeyantu、@xingyingone、@xinyiZzz、@xy720、@xzj7019、@yagagagaga、@yiguolei、@yongjinhou、@ytwp、@yuanyuan8983、@yujun777、@yuxuan-luo、@zclllyybb、@zddr、@zfr9527、@zgxme、@zhangbutao、@zhangstar333、@zhannngchen、@zhiqiang-hhhh、@ziyanTOP、@zxealous、@zy-kkk、@zzzxl1993、@zzzzzzzs \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md new file mode 100644 index 0000000000000..9b9007e4391aa --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.1.md @@ -0,0 +1,604 @@ +--- +{ + "title": "Release 3.0.1", + "language": "en" +} +--- + + + +Dear community members, the Apache Doris 3.0.1 version was officially released on August 23, 2024, featuring updates and improvements in compute-storage decoupling, lakehouse, semi-structured data analysis, asynchronous materialized views, and more. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior Changes + +### Query Optimizer + +- Added the variable `use_max_length_of_varchar_in_ctas` to control the length behavior of VARCHAR type when executing `CREATE TABLE AS SELECT` (CTAS) operations. [#37069](https://github.com/apache/doris/pull/37069) + + - This variable is set to true by default. + + - When set to true, if the VARCHAR type column originates from a table, the derived length is used; otherwise, the maximum length is used. + + - When set to false, the VARCHAR type will always use the derived length. + +- All data types will now be displayed in lowercase to maintain compatibility with MySQL format. [#38012](https://github.com/apache/doris/pull/38012) + +- Multiple query statements in the same query request must now be separated by semicolons. 
[#38670](https://github.com/apache/doris/pull/38670) + +### Query Execution + +- The default number of parallel tasks after shuffle operations in the cluster is set to 100, which will improve query stability and concurrent processing capability in large clusters. [#38196](https://github.com/apache/doris/pull/38196) + +### Storage + +- The default value of `trash_file_expire_time_sec` has been changed from 86400 seconds to 0 seconds, which means that if files are deleted by mistake and the FE trash is cleared, the data cannot be recovered. + +- The table attribute `enable_mow_delete_on_delete_predicate` (introduced in version 3.0.0) has been renamed to `enable_mow_light_delete`. + +- Explicit transactions are now prohibited from performing delete operations on tables with written data. + +- Heavy schema change operations are prohibited on tables with auto-increment fields. + + + +## New Features + +### Job Scheduling + +- Optimized the execution logic of internal scheduling jobs, decoupling the strong association between start time and immediate execution parameters. Now, tasks can be created with a specified start time or selected for immediate execution, without conflict, enhancing scheduling flexibility. [#36805](https://github.com/apache/doris/pull/36805) + +### Compute-Storage Decoupled + +- Supports dynamic modification of the upper limit for file cache usage. [#37484](https://github.com/apache/doris/pull/37484) + +- Recycler now supports object storage rate limiting and server-side rate limiting retry functionality. [#37663](https://github.com/apache/doris/pull/37663) [#37680](https://github.com/apache/doris/pull/37680) + +### Lakehouse + +- Added the session variable `serde_dialect` to set the output format for complex types. [#37039](https://github.com/apache/doris/pull/37039) + +- SQL interception now supports external tables. 
+ + - For more information, refer to the documentation on [SQL Interception](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Insert overwrite now supports Iceberg tables. [#37191](https://github.com/apache/doris/pull/37191) + +### Asynchronous Materialized Views + +- Supports partition roll-up and build at the hourly level. [#37678](https://github.com/apache/doris/pull/37678) + +- Supports atomic replacement of asynchronous materialized view definition statements. [#36749](https://github.com/apache/doris/pull/36749) + +- Transparent rewriting now supports Insert statements. [#38115](https://github.com/apache/doris/pull/38115) + +- Transparent rewriting now supports the VARIANT type. [#37929](https://github.com/apache/doris/pull/37929) + +### Query Execution + +- The group concat function now supports DISTINCT and ORDER BY options. [#38744](https://github.com/apache/doris/pull/38744) + +### Semi-Structured Data Management + +- The ES Catalog now maps `nested` or `object` types in Elasticsearch to the JSON type in Doris. [#37101](https://github.com/apache/doris/pull/37101) + +- Added the `MULTI_MATCH` function, which supports matching keywords across multiple fields and can leverage inverted indexes to accelerate searches. [#37722](https://github.com/apache/doris/pull/37722) + +- Added the `explode_json_object` function, which can unfold objects in JSON data into multiple rows. [#36887](https://github.com/apache/doris/pull/36887) + +- Inverted indexes now support memtable advancement, requiring index construction only once during multi-replica writes, reducing CPU consumption and improving performance. [#35891](https://github.com/apache/doris/pull/35891) + +- Added `MATCH_PHRASE` support for positive slop, e.g., `msg MATCH_PHRASE 'a b 2+'` can match instances containing words a and b with a slop of no more than two, and a preceding b; regular slop without the final `+` does not guarantee this order. 
[#36356](https://github.com/apache/doris/pull/36356) + +### Other + +- Added the FE parameter `skip_audit_user_list`, where user operations specified in this configuration will not be recorded in the audit log. [#38310](https://github.com/apache/doris/pull/38310) + + - For more information, refer to the documentation on [Audit Plugin](https://doris.apache.org/docs/admin-manual/audit-plugin/). + + + +## Improvements + +### Storage + +- Reduced the likelihood of write failures caused by disk balancing within a single BE. [#38000](https://github.com/apache/doris/pull/38000) + +- Decreased memory consumption by the memtable limiter. [#37511](https://github.com/apache/doris/pull/37511) + +- Moved old partitions to the FE trash during partition replacement operations. [#36361](https://github.com/apache/doris/pull/36361) + +- Optimized memory consumption during compaction. [#37099](https://github.com/apache/doris/pull/37099) + +- Added a session variable to control audit logs for JDBC PreparedStatement, with default setting to not print. [#38419](https://github.com/apache/doris/pull/38419) + +- Optimized the logic for selecting BEs for group commits. [#35558](https://github.com/apache/doris/pull/35558) + +- Improved the performance of column updates. [#38487](https://github.com/apache/doris/pull/38487) + +- Optimized the use of `delete bitmap cache`. [#38761](https://github.com/apache/doris/pull/38761) + +- Added a configuration to control query affinity during hot and cold tiering. [#37492](https://github.com/apache/doris/pull/37492) + +### Compute-Storage Decoupled + +- Implemented automatic retries when encountering object storage server rate limiting. [#37199](https://github.com/apache/doris/pull/37199) + +- Adapted the number of threads for memtable flush in the compute-storage decoupled mode. [#38789](https://github.com/apache/doris/pull/38789) + +- Added Azure as a compile option to support compilation in environments without Azure support. 
+ +- Optimized the observability of object storage access rate limiting. [#38294](https://github.com/apache/doris/pull/38294) + +- Allowed the file cache TTL queue to perform LRU eviction, enhancing TTL queue usability. [#37312](https://github.com/apache/doris/pull/37312) + +- Optimized the number of balance writeeditlog IO operations in the storage and compute separation mode. [#37787](https://github.com/apache/doris/pull/37787) + +- Improved table creation speed in the storage and compute separation mode by sending tablet creation requests in batches. [#36786](https://github.com/apache/doris/pull/36786) + +- Optimized read failures caused by potential inconsistencies in the local file cache through backoff retries. [#38645](https://github.com/apache/doris/pull/38645) + +### Lakehouse + +- Optimized memory statistics for Parquet/ORC format read and write operations. [#37234](https://github.com/apache/doris/pull/37234) + +- Trino Connector Catalog now supports predicate pushdown. [#37874](https://github.com/apache/doris/pull/37874) + +- Added a session variable `enable_count_push_down_for_external_table` to control whether to enable `count(*)` pushdown optimization for external tables. [#37046](https://github.com/apache/doris/pull/37046) + +- Optimized the read logic for Hudi snapshot reads, returning an empty set when the snapshot is empty, consistent with Spark behavior. [#37702](https://github.com/apache/doris/pull/37702) + +- Improved the read performance of partition columns for Hive tables. [#37377](https://github.com/apache/doris/pull/37377) + +### Asynchronous Materialized Views + +- Improved transparent rewrite plan speed by 20%. [#37197](https://github.com/apache/doris/pull/37197) + +- Eliminated roll-up during transparent rewrite if the group key satisfies data uniqueness for better nested matching. 
[#38387](https://github.com/apache/doris/pull/38387) + +- Transparent rewrite now performs better aggregation elimination to improve the matching success rate of nested materialized views. [#36888](https://github.com/apache/doris/pull/36888) + +### MySQL Compatibility + +- Now correctly populates the database name, table name, and original name in the MySQL protocol result columns. [#38126](https://github.com/apache/doris/pull/38126) + +- Supported the hint format `/*+ func(value) */`. [#37720](https://github.com/apache/doris/pull/37720) + +### Query Optimizer + +- Significantly improved the plan speed for complex queries. [#38317](https://github.com/apache/doris/pull/38317) + +- Adaptively chose whether to perform bucket shuffle based on the number of data buckets to avoid performance degradation in extreme cases. [#36784](https://github.com/apache/doris/pull/36784) + +- Optimized the cost estimation logic for SEMI / ANTI JOIN. [#37951](https://github.com/apache/doris/pull/37951) [#37060](https://github.com/apache/doris/pull/37060) + +- Supported pushing Limit down to the first stage of aggregation to improve performance. [#34853](https://github.com/apache/doris/pull/34853) + +- Partition pruning now supports filter conditions containing the `date_trunc` or `date` function. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- SQL cache now supports query scenarios that include user variables. [#37915](https://github.com/apache/doris/pull/37915) + +- Optimized error messages for invalid aggregation semantics. [#38122](https://github.com/apache/doris/pull/38122) + +### Query Execution + +- Adapted AggState compatibility from 2.1 to 3.x and fixed Coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle without Join. 
[#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to be asynchronous to prevent stalling during such queries. [#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process during Hash table construction for Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Optimized the duration of lock holding for MultiCast. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC keepAliveTime and added link monitoring to reduce the probability of query failure due to RPC errors. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits were exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Optimized the processing performance of `aes_encrypt`/`decrypt` functions for constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the processing performance of the `json_extract` function for constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the processing performance of the `ParseUrl` function for constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Semi-Structured Data Management + +- Bitmap indexes now default to using inverted indexes, with `enable_create_bitmap_index_as_inverted_index` set to true by default. [#36692](https://github.com/apache/doris/pull/36692) + +- In the compute-storage decoupled mode, DESC can now view sub-columns of VARIANT type. [#38143](https://github.com/apache/doris/pull/38143) + +- Removed the step of checking file existence during inverted index queries to reduce access latency to remote storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Complex types ARRAY / MAP / STRUCT now support `replace_if_not_null` for AGG tables. [#38304](https://github.com/apache/doris/pull/38304) + +- Escape characters for JSON data are now supported. 
[#37176](https://github.com/apache/doris/pull/37176) [#37251](https://github.com/apache/doris/pull/37251) + +- Inverted index queries now behave consistently on MOW tables and DUP tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Optimized the performance of inverted index acceleration for IN queries. [#37395](https://github.com/apache/doris/pull/37395) + +- Reduced unnecessary memory allocation during TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- When creating an inverted index with tokenization, the `support_phrase` option is now automatically enabled to accelerate `match_phrase` series phrase queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Other + +- Audit log now can record SQL types. [#37790](https://github.com/apache/doris/pull/37790) + +- Added support for `information_schema.processlist` to show all FE. [#38701](https://github.com/apache/doris/pull/38701) + +- Cached ranger's `datamask` and `rowpolicy` to accelerate query efficiency. [#37723](https://github.com/apache/doris/pull/37723) + +- Optimized metadata management in job manager to release locks immediately after modifying metadata, reducing lock holding time. [#38162](https://github.com/apache/doris/pull/38162) + + + +## Bug Fixes + +### Upgrade + +- Fix the issue where `mtmv load` fails during upgrade from version 2.1. [#38799](https://github.com/apache/doris/pull/38799) + +- Resolve the issue where `null_type` cannot be found during the upgrade to version 2.1. [#39373](https://github.com/apache/doris/pull/39373) + +- Address the compatibility issue with permission persistence during the upgrade from version 2.1 to 3.0. [#39288](https://github.com/apache/doris/pull/39288) + +### Load + +- Fix the issue where parsing fails when the newline character is surrounded by delimiters in CSV format parsing.
[#38347](https://github.com/apache/doris/pull/38347) +- Resolve potential exception issues when FE forwards group commit. [#38228](https://github.com/apache/doris/pull/38228) [#38265](https://github.com/apache/doris/pull/38265) + +- Group commit now supports the new optimizer. [#37002](https://github.com/apache/doris/pull/37002) + +- Fix the issue where group commit reports data errors when JDBC setNull is used. [#38262](https://github.com/apache/doris/pull/38262) + +- Optimize the retry logic for group commit when encountering `delete bitmap lock` errors. [#37600](https://github.com/apache/doris/pull/37600) + +- Resolve the issue where routine load cannot use CSV delimiters and escape characters. [#38402](https://github.com/apache/doris/pull/38402) + +- Fix the issue where routine load job names with mixed case cannot be displayed. [#38523](https://github.com/apache/doris/pull/38523) + +- Optimize the logic for actively recovering routine load during FE master-slave switching. [#37876](https://github.com/apache/doris/pull/37876) + +- Resolve the issue where routine load pauses when all data in Kafka is expired. [#37288](https://github.com/apache/doris/pull/37288) + +- Fix the issue where `show routine load` returns empty results. [#38199](https://github.com/apache/doris/pull/38199) + +- Resolve the memory leak issue during multi-table stream import in routine load. [#38255](https://github.com/apache/doris/pull/38255) + +- Fix the issue where stream load does not return the error URL. [#38325](https://github.com/apache/doris/pull/38325) + +- Resolve potential load channel leak issues. [#38031](https://github.com/apache/doris/pull/38031) [#37500](https://github.com/apache/doris/pull/37500) + +- Fix the issue where no error may be reported when importing fewer segments than expected. [#36753](https://github.com/apache/doris/pull/36753) + +- Resolve the load stream leak issue. 
[#38912](https://github.com/apache/doris/pull/38912) + +- Optimize the impact of offline nodes on import operations. [#38198](https://github.com/apache/doris/pull/38198) + +- Fix the issue where transactions do not end when inserting into empty data. [#38991](https://github.com/apache/doris/pull/38991) + +### Storage + +**01 Backup and Restoration** + +- Fix the issue where tables cannot be written after backup and restoration. [#37089](https://github.com/apache/doris/pull/37089) + +- Resolve the issue where view database names are incorrect after backup and restoration. [#37412](https://github.com/apache/doris/pull/37412) + +**02 Compaction** + +- Fix the issue where cumu compaction handles delete errors incorrectly during ordered data compression. [#38742](https://github.com/apache/doris/pull/38742) + +- Resolve the issue of duplicate keys in aggregate tables caused by sequential compression optimization. [#38224](https://github.com/apache/doris/pull/38224) + +- Fix the issue where compression operations cause coredump in large wide tables. [#37960](https://github.com/apache/doris/pull/37960) + +- Resolve the compression starvation issue caused by inaccurate concurrent statistics of compression tasks. [#37318](https://github.com/apache/doris/pull/37318) + +**03 MOW Unique Key** + +- Resolve the issue of inconsistent data between replicas caused by cumulative compression deletion of delete sign. [#37950](https://github.com/apache/doris/pull/37950) + +- MOW delete now uses partial column updates with the new optimizer. [#38751](https://github.com/apache/doris/pull/38751) + +- Fix the potential duplicate key issue in MOW tables under compute-storage decoupled. [#39018](https://github.com/apache/doris/pull/39018) + +- Resolve the issue where MOW unique and duplicate tables cannot modify column order. [#37067](https://github.com/apache/doris/pull/37067) + +- Fix the potential data correctness issue caused by segcompaction. 
[#37760](https://github.com/apache/doris/pull/37760) + +- Resolve the potential memory leak issue during column updates. [#37706](https://github.com/apache/doris/pull/37706) + +**04 Other** + +- Fix the small probability of exceptions in TOPN queries. [#39119](https://github.com/apache/doris/pull/39119) [#39199](https://github.com/apache/doris/pull/39199) + +- Resolve the issue where auto-increment IDs may duplicate during FE restart. [#37306](https://github.com/apache/doris/pull/37306) + +- Fix the potential queuing issue in the delete operation priority queue. [#37169](https://github.com/apache/doris/pull/37169) + +- Optimize the delete retry logic. [#37363](https://github.com/apache/doris/pull/37363) + +- Resolve the issue with `bucket = 0` in table creation statements under the new optimizer. [#38971](https://github.com/apache/doris/pull/38971) + +- Fix the issue where FE reports success incorrectly when image generation fails. [#37508](https://github.com/apache/doris/pull/37508) + +- Resolve the issue where using the wrong nodename during FE offline nodes may cause inconsistent FE members. [#37987](https://github.com/apache/doris/pull/37987) + +- Fix the issue where CCR partition addition may fail. [#37295](https://github.com/apache/doris/pull/37295) + +- Resolve the `int32` overflow issue in inverted index files. [#38891](https://github.com/apache/doris/pull/38891) + +- Fix the issue where TRUNCATE TABLE failure may cause BE to fail to go offline. [#37334](https://github.com/apache/doris/pull/37334) + +- Resolve the issue where publish cannot continue due to null pointers. [#37724](https://github.com/apache/doris/pull/37724) [#37531](https://github.com/apache/doris/pull/37531) + +- Fix the potential coredump issue when manually triggering disk migration. [#37712](https://github.com/apache/doris/pull/37712) + +### Compute-Storage Decoupled + +- Fixed the issue where `show create table` might display the `file_cache_ttl_seconds` attribute twice. 
[#38052](https://github.com/apache/doris/pull/38052) + +- Fixed the issue where segment Footer TTL was not set correctly after setting file cache TTL. [#37485](https://github.com/apache/doris/pull/37485) + +- Fixed the issue where file cache might cause coredump due to massive conversion of cache types. [#38518](https://github.com/apache/doris/pull/38518) + +- Fixed the potential file descriptor (fd) leak in file cache. [#38051](https://github.com/apache/doris/pull/38051) + +- Fixed the issue where schema change Job overwriting compaction Job prevented base tablet compaction from completing normally. [#38210](https://github.com/apache/doris/pull/38210) + +- Fixed the potential inaccuracy of base compaction score due to data race. [#38006](https://github.com/apache/doris/pull/38006) + +- Fixed the issue where error messages from imports might not be uploaded correctly to object storage. [#38359](https://github.com/apache/doris/pull/38359) + +- Fixed the inconsistency in return information between compute-storage decoupled mode and storage and compute integration mode for 2PC imports. [#38076](https://github.com/apache/doris/pull/38076) + +- Fix the issue where incorrect file size setting during file cache warm-up leads to coredump. [#38939](https://github.com/apache/doris/pull/38939) + +- Fixed the issue where partial column updates did not correctly dequeue delete operations. [#37151](https://github.com/apache/doris/pull/37151) + +- Fixed compatibility issues with permission persistence in compute-storage decoupled mode. [#38136](https://github.com/apache/doris/pull/38136) [#37708](https://github.com/apache/doris/pull/37708) + +- Fixed the issue where observer did not retry correctly when encountering a `-230` error. [#37625](https://github.com/apache/doris/pull/37625) + +- Fixed the issue where `show load` with conditions did not perform correct analysis. 
[#37656](https://github.com/apache/doris/pull/37656) + +- Fixed the issue where `show streamload` in compute-storage decoupled mode caused BE coredump. [#37903](https://github.com/apache/doris/pull/37903) + +- Fixed the issue where `copy into` did not correctly verify column names in strict mode. [#37650](https://github.com/apache/doris/pull/37650) + +- Fixed the issue where multi-stream imports into a single table lacked permissions. [#38878](https://github.com/apache/doris/pull/38878) + +- Fixed the potential overflow issue in `getVersionUpdateTimeMs`. [#38074](https://github.com/apache/doris/pull/38074) + +- Fixed the issue where FE azure blob list was not implemented correctly. [#37986](https://github.com/apache/doris/pull/37986) + +- Fixed the issue where inaccurate azure blob recycling time calculation prevented recycling. [#37535](https://github.com/apache/doris/pull/37535) + +- Fixed the issue where inverted index files were not deleted in compute-storage decoupled mode. [#38306](https://github.com/apache/doris/pull/38306) + +### Lakehouse + +- Fixed the issue with reading binary data from Oracle Catalog. [#37078](https://github.com/apache/doris/pull/37078) + +- Fixed the potential deadlock issue when acquiring external table metadata in multi-FE scenarios. [#37756](https://github.com/apache/doris/pull/37756) + +- Fixed the issue where JNI scanner failure caused BE nodes to crash. [#37697](https://github.com/apache/doris/pull/37697) + +- Fixed the issue with slow reading of date types from Trino Connector Catalog. [#37266](https://github.com/apache/doris/pull/37266) + +- Optimized kerberos authentication logic for Hive Catalog. [#37301](https://github.com/apache/doris/pull/37301) + +- Fixed the issue where region attributes might be parsed incorrectly when parsing MinIO properties. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed the issue where creating too many FileSystems by FE caused memory leaks. 
[#36954](https://github.com/apache/doris/pull/36954) + +- Fixed the issue with reading incorrect time zone information from Paimon. [#37716](https://github.com/apache/doris/pull/37716) + +- Fixed the potential thread leak issue caused by Hive write-back operations. [#36990](https://github.com/apache/doris/pull/36990) + +- Fixed the null pointer issue caused by enabling Hive metastore event synchronization. [#38421](https://github.com/apache/doris/pull/38421) + +- Fixed the issue where error messages were unclear or caused stalling when creating catalogs. [#37551](https://github.com/apache/doris/pull/37551) + +- Fixed the issue where reading Hive text format tables behaved differently from Hive. [#37638](https://github.com/apache/doris/pull/37638) + +- Fixed the logic error when switching between catalogs and databases. [#37828](https://github.com/apache/doris/pull/37828) + +### MySQL Compatibility + +- Fixed the issue where certain flags in the MySQL protocol were set incorrectly when SSL was enabled. [#38086](https://github.com/apache/doris/pull/38086) + +### Asynchronous Materialized Views + +- Fixed the issue where construction might fail when the base table had a very large number of partitions. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed the issue where nested materialized views incorrectly performed full table refreshes even when partition refreshes were possible. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixed the issue where partition refresh could not handle the simultaneous existence of valid and invalid dependencies when analyzing partition dependencies. [#38367](https://github.com/apache/doris/pull/38367) + +- Fixed the issue where the final result containing NULL type might cause asynchronous materialized views to fail. 
[#37019](https://github.com/apache/doris/pull/37019) + +- Fixed the planning error that might occur during transparent rewriting when both synchronous and asynchronous materialized views with the same name were present. [#37311](https://github.com/apache/doris/pull/37311) + +### Synchronous Materialized Views + +- The rewritten synchronous materialized views now can correctly perform partition pruning. [#38527](https://github.com/apache/doris/pull/38527) + +- When rewriting synchronous materialized views, those with unready data are no longer selected. [#38148](https://github.com/apache/doris/pull/38148) + +### Query Optimizer + +- Fixed the deadlock issue that might occur when queries and delete operations are performed simultaneously. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the issue where bucket pruning might incorrectly prune on decimal column buckets. [#37889](https://github.com/apache/doris/pull/37889) + +- Fixed the issue where planning might be incorrect when mark join participates in join reorder. [#39152](https://github.com/apache/doris/pull/39152) + +- Fixed the issue where the result is incorrect when the correlation condition of a correlated subquery is not a simple column. [#37644](https://github.com/apache/doris/pull/37644) + +- Fixed the issue where partition pruning cannot correctly handle or expressions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the planning error that might occur when optimizing the execution order of JOIN and AGG. [#37343](https://github.com/apache/doris/pull/37343) + +- Fixed the issue where `str_to_date` performs incorrect constant folding calculations on datev1 types. [#37360](https://github.com/apache/doris/pull/37360) + +- Fixed the issue where the ACOS function's constant folding returns non-NaN values. [#37932](https://github.com/apache/doris/pull/37932) + +- Fixed the occasional planning error: "The children format needs to be [WhenClause+, DefaultValue?]". 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where planning might be incorrect when the projection includes window functions and there is both the original column and its alias. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where planning might report an error when the aggregation parameter contains a lambda expression. [#37109](https://github.com/apache/doris/pull/37109) + +- Fixed the insert error that might occur in extreme cases: "MultiCastDataSink cannot be cast to DataStreamSink". [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where the new optimizer does not correctly handle `char(0)/varchar(0)` when creating a table. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the incorrect behavior of `char(255) toSql`. [#37340](https://github.com/apache/doris/pull/37340) + +- Fixed the issue where the nullable attribute within the `agg_state` type might lead to planning errors. [#37489](https://github.com/apache/doris/pull/37489) +- Fixed the issue where row count statistics are inaccurate during mark Join. [#38270](https://github.com/apache/doris/pull/38270) + +### Query Execution + +- Fixed issues where the Pipeline execution engine was stuck, causing queries to not end, in multiple scenarios. [#38657](https://github.com/apache/doris/pull/38657), [#38206](https://github.com/apache/doris/pull/38206), [#38885](https://github.com/apache/doris/pull/38885), [#38151](https://github.com/apache/doris/pull/38151), [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed the coredump issue caused by NULL and non-NULL columns during set difference calculations. [#38750](https://github.com/apache/doris/pull/38750) + +- Fixed the error when using the DECIMAL type with pure decimals in delete statements. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed the issue where the `width_bucket` function returned incorrect results. 
[#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error when a single row of data was very large and the result set was also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the coredump issue caused by incorrect release of rpc connections during single-replica imports. [#38087](https://github.com/apache/doris/pull/38087) + +- Fixed the coredump issue caused by processing NULL values with the `foreach` function. [#37349](https://github.com/apache/doris/pull/37349) + +- Fixed the issue where stddev returned incorrect results for DECIMALV2 types. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the slow performance of `bitmap union` calculations. [#37816](https://github.com/apache/doris/pull/37816) + +- Fixed the issue where RowsProduced for aggregation operators was not set in the profile. [#38271](https://github.com/apache/doris/pull/38271) + +- Fixed the overflow issue when calculating the number of buckets for the hash table under hash join. [#37193](https://github.com/apache/doris/pull/37193), [#37493](https://github.com/apache/doris/pull/37493) + +- Fixed the inaccurate recording of the `jemalloc cache memory tracker`. [#37464](https://github.com/apache/doris/pull/37464) + +- Added the `enable_stacktrace` configuration option, allowing users to control whether exception stacks are output in BE logs. [#37713](https://github.com/apache/doris/pull/37713) + +- Fixed the issue where Arrow Flight SQL did not work correctly when `enable_parallel_result_sink` was set to false. [#37779](https://github.com/apache/doris/pull/37779) + +- Fixed the incorrect use of colocate Join. [#37361](https://github.com/apache/doris/pull/37361), [#37729](https://github.com/apache/doris/pull/37729) + +- Fixed the calculation overflow issue of the `round` function on DECIMAL128 types. 
[#37733](https://github.com/apache/doris/pull/37733), [#38106](https://github.com/apache/doris/pull/38106) + +- Fixed the coredump issue when passing a const string to the `sleep` function. [#37681](https://github.com/apache/doris/pull/37681) + +- Increased the queue length for audit logs, solving the issue where audit logs could not be recorded normally under high concurrency scenarios with thousands of concurrent connections. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed the issue where creating a workload group caused too many threads, leading to BE coredump. [#38096](https://github.com/apache/doris/pull/38096) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the transaction rollback issue caused by `insert overwrite auto partition`. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the issue where the TimeUtils formatter did not use the correct time zone. [#37465](https://github.com/apache/doris/pull/37465) + +- Fixed the issue where results were incorrect under constant folding scenarios for week/yearweek. [#37376](https://github.com/apache/doris/pull/37376) + +- Fixed the issue where the `convert_tz` function returned incorrect results. [#37358](https://github.com/apache/doris/pull/37358), [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by `percentile_approx` during rolling upgrades. [#39321](https://github.com/apache/doris/pull/39321) + +- Fixed the coredump issue caused by the `mod` function when encountering abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where the hash table was not fully built when the broadcast join probe started running. 
[#37643](https://github.com/apache/doris/pull/37643) + +- Fixed the issue where executing the same expression in multithreaded environments might lead to incorrect results for Java UDFs. [#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by incorrect return types of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the issue where the `json_replace` function returned incorrect types. [#37014](https://github.com/apache/doris/pull/37014) + +- Fixed the issue where the nullable attribute setting was unreasonable for the `percentile` aggregation function. [#37330](https://github.com/apache/doris/pull/37330) + +- Fixed the issue where the results of the `histogram` function were unstable. [#38608](https://github.com/apache/doris/pull/38608) + +- Fixed the issue where task state was displayed incorrectly in the profile. [#38082](https://github.com/apache/doris/pull/38082) + +- Fixed the issue where some queries were incorrectly canceled when the system just started. [#37662](https://github.com/apache/doris/pull/37662) + +### Semi-Structured Data Management + +- Fix some issues with time series compression. [#39170](https://github.com/apache/doris/pull/39170) [#39176](https://github.com/apache/doris/pull/39176) + +- Fix the issue of incorrect index size statistics during compression. [#37232](https://github.com/apache/doris/pull/37232) + +- Fix the potential incorrect matching of ultra-long strings without tokenization in inverted indexes. [#37679](https://github.com/apache/doris/pull/37679) [#38218](https://github.com/apache/doris/pull/38218) + +- Fix the high memory usage issue of `array_range` and `array_with_const` functions when dealing with large data volumes. [#38284](https://github.com/apache/doris/pull/38284) [#37495](https://github.com/apache/doris/pull/37495) + +- Fix the potential coredump issue when selecting columns of ARRAY / MAP / STRUCT types.
[#37936](https://github.com/apache/doris/pull/37936) + +- Fix the import failure issue caused by simdjson parsing errors when specifying jsonpath in Stream Load. [#38490](https://github.com/apache/doris/pull/38490) + +- Fix the exception handling issue when there are duplicate keys in JSON data. [#38146](https://github.com/apache/doris/pull/38146) + +- Fix the potential query error after DROP INDEX. [#37646](https://github.com/apache/doris/pull/37646) + +- Fix the error return issue in row merging checks during index compression. [#38732](https://github.com/apache/doris/pull/38732) + +- Inverted index v2 format now supports renaming columns. [#38079](https://github.com/apache/doris/pull/38079) + +- Fix the coredump issue when the `MATCH` function matches an empty string without an index. [#37947](https://github.com/apache/doris/pull/37947) + +- Fix the handling of NULL values in inverted indexes. [#37921](https://github.com/apache/doris/pull/37921) [#37842](https://github.com/apache/doris/pull/37842) [#38741](https://github.com/apache/doris/pull/38741) + +- Fix the incorrect `row_store_page_size` after FE restart. [#38240](https://github.com/apache/doris/pull/38240) + +### Other + +- Fix the timezone configuration issue. The default timezone is no longer fixed at UTC+8 and is now obtained from system configuration. [#37294](https://github.com/apache/doris/pull/37294) + +- Fix the class conflict issue when using ranger due to multiple JSR specification implementations. [#37575](https://github.com/apache/doris/pull/37575) + +- Fix the potential uninitialized field issue in some BE code. [#37403](https://github.com/apache/doris/pull/37403) + +- Fix the error in delete statements for random distributed tables. [#37985](https://github.com/apache/doris/pull/37985) + +- Fix the incorrect requirement for `alter_priv` permission on the base table when creating a synchronized materialized view. 
[#38011](https://github.com/apache/doris/pull/38011) + +- Fix the issue of not authenticating resources when used in TVF. [#36928](https://github.com/apache/doris/pull/36928) + + +## Credits + +Thanks all who contribute to this release: + +@133tosakarin, @924060929, @AshinGau, @Baymine, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @Ceng23333, @DarvenDuan, @FreeOnePlus, @Gabriel39, @HappenLee, @JNSimba, @Jibing-Li, @KassieZ, @Lchangliang, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Tech-Circle-48, @Vallishp, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @cjj2010, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @gavinchou, @ghkang98, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @kaijchen, @kaka11chen, @keanji-x, @liaoxin01, @liutang123, @luwei16, @luzhijing, @lxr599, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @w41ter, @wangbo, @wangshuo128, @whutpencil, @wsjz, @wuwenchi, @wyxxxcat, @xiaokang, @xiedeyantu, @xinyiZzz, @xy720, @xzj7019, @yagagagaga, @yiguolei, @yujun777, @z404289981, @zclllyybb, @zddr, @zfr9527, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zjj, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md new file mode 100644 index 0000000000000..0ab6a828ab95d --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.2.md @@ -0,0 +1,341 @@ +--- +{ + "title": "Release 3.0.2", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.2 version was officially released on October 15, 2024, featuring updates and improvements in compute-storage decoupling, data storage, lakehouse, query optimizer, query execution and more. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +### Storage + +- Limited the number of tablets in a single backup task to prevent FE memory overflow. [#40518](https://github.com/apache/doris/pull/40518) +- The `SHOW PARTITIONS` command now displays the `CommittedVersion` of partitions. [#28274](https://github.com/apache/doris/pull/28274) + +### Other + +- The default printing mode (asynchronous) of `fe.log` now includes file line number information. If performance issues are encountered due to line number output, please switch to BRIEF mode. [#39419](https://github.com/apache/doris/pull/39419) +- The default value of the session variable `ENABLE_PREPARED_STMT_AUDIT_LOG` has been changed from `true` to `false`, and the audit log of prepare statements will no longer be printed. [#38865](https://github.com/apache/doris/pull/38865) +- The default value of the session variable `max_allowed_packet` has been adjusted from 1MB to 16MB to align with MySQL 8.4. [#38697](https://github.com/apache/doris/pull/38697) +- The JVM of FE and BE defaults to using the UTF-8 character set. [#39521](https://github.com/apache/doris/pull/39521) + +## New Features + +### Storage + +- Backup and recovery now support clearing tables or partitions that are not in the backup. [#39028](https://github.com/apache/doris/pull/39028) + +### Compute-Storage Decoupled + +- Support for parallel recycling of expired data on multiple tablets. [#37630](https://github.com/apache/doris/pull/37630) +- Support for changing storage vaults through `ALTER` statements. [#38685](https://github.com/apache/doris/pull/38685) [#37606](https://github.com/apache/doris/pull/37606) +- Support for importing a large number of tablets (5000+) in a single transaction (experimental feature). 
[#38243](https://github.com/apache/doris/pull/38243) +- Support for automatically aborting pending transactions caused by reasons such as node restarts, solving the issue of pending transactions blocking decommission or schema change. [#37669](https://github.com/apache/doris/pull/37669) +- A new session variable `enable_segment_cache` has been added to control whether to use segment cache during queries (default is `true`). [#37141](https://github.com/apache/doris/pull/37141) +- Resolved the issue of not being able to import a large amount of data during schema changes in compute-storage decoupled mode. [#39558](https://github.com/apache/doris/pull/39558) +- Support for adding multiple follower roles of FE in compute-storage decoupled mode. [#38388](https://github.com/apache/doris/pull/38388) +- Support for using memory as file cache to accelerate queries in environments with no disks or low-performance HDDs. [#38811](https://github.com/apache/doris/pull/38811) + +### Lakehouse + +- New Lakesoul Catalog has been added. [Apache Doris Docs](https://doris.apache.org/zh-CN/docs/dev/lakehouse/datalake-analytics/lakesoul) +- A new system table `catalog_meta_cache_statistics` has been added to view the usage of various metadata caches in external catalog. [#40155](https://github.com/apache/doris/pull/40155) + +### Query Optimizer + +- Support for `is [not] true/false` expressions. [#38623](https://github.com/apache/doris/pull/38623) + +### Query Execution + +- A new CRC32 function has been added. [#38204](https://github.com/apache/doris/pull/38204) +- New aggregate functions skew and kurt have been added. [#41277](https://github.com/apache/doris/pull/41277) +- Profiles are now persisted to the FE's disk to retain more profiles. [#33690](https://github.com/apache/doris/pull/33690) +- A new system table `workload_group_privileges` has been added to view permission information related to workload groups. 
[#38436](https://github.com/apache/doris/pull/38436) +- A new system table `workload_group_resource_usage` has been added to monitor resource statistics of workload groups. [#39177](https://github.com/apache/doris/pull/39177) +- Workload groups now support limiting reads of local IO and remote IO. [#39012](https://github.com/apache/doris/pull/39012) +- Workload groups now support cgroupv2 to limit CPU usage. [#39374](https://github.com/apache/doris/pull/39374) +- A new system table `information_schema.partitions` has been added to view some table creation attributes. [#40636](https://github.com/apache/doris/pull/40636) + +### Other + +- Support for using the `SHOW` statement to display BE's configuration information, such as `SHOW BACKEND CONFIG LIKE ${pattern}`. [#36525](https://github.com/apache/doris/pull/36525) + +## Improvements + +### Load + +- Improved the import efficiency of routine load when encountering frequent EOFs from Kafka. [#39975](https://github.com/apache/doris/pull/39975) +- The stream load result now includes the time taken to read HTTP data, `ReceiveDataTimeMs`, which can quickly determine slow stream load issues caused by network reasons. [#40735](https://github.com/apache/doris/pull/40735) +- Optimized the routine load timeout logic to avoid frequent timeouts during inverted index and mow writes. [#40818](https://github.com/apache/doris/pull/40818) + +### Storage + +- Support for batch addition of partitions. [#37114](https://github.com/apache/doris/pull/37114) + +### Compute-Storage Decoupled + +- Added the meta-service HTTP interface `/MetaService/http/show_meta_ranges` to facilitate the statistics of KV distribution in FDB. [#39208](https://github.com/apache/doris/pull/39208) +- The meta-service/recycler stop script ensures that the process fully exits before returning. 
[#40218](https://github.com/apache/doris/pull/40218) +- Support for using the session variable `version_comment` (Cloud Mode) to display the current deployment mode as compute-storage decoupled. [#38269](https://github.com/apache/doris/pull/38269) +- Fixed the detailed message returned when transaction submission fails. [#40584](https://github.com/apache/doris/pull/40584) +- Support for using one meta-service process to provide both metadata services and data recycling services. [#40223](https://github.com/apache/doris/pull/40223) +- Optimized the default configuration of file_cache to avoid potential issues when not set. [#41421](https://github.com/apache/doris/pull/41421) [#41507](https://github.com/apache/doris/pull/41507) +- Improved query performance by batch retrieving the version of multiple partitions. [#38949](https://github.com/apache/doris/pull/38949) +- Delayed the redistribution of tablets to avoid query performance issues caused by temporary network fluctuations. [#40371](https://github.com/apache/doris/pull/40371) +- Optimized the read-write lock logic in the balance. [#40633](https://github.com/apache/doris/pull/40633) +- Enhanced the robustness of file cache in handling TTL filenames during restarts/crashes. [#40226](https://github.com/apache/doris/pull/40226) +- Added the BE HTTP interface `/api/file_cache?op=hash` to facilitate the calculation of the hash file names of segment files on disk. [#40831](https://github.com/apache/doris/pull/40831) +- Optimized the unified naming to be compatible with using compute group to represent BE groups (original cloud cluster). [#40767](https://github.com/apache/doris/pull/40767) +- Optimized the waiting time for obtaining locks when calculating delete bitmaps in primary key tables. [#40341](https://github.com/apache/doris/pull/40341) +- When there are many delete bitmaps in primary key tables, optimized the high CPU consumption during queries by pre-merging multiple delete bitmaps. 
[#40204](https://github.com/apache/doris/pull/40204) +- Support for managing FE/BE nodes in compute-storage decoupled mode through SQL statements, hiding the logic of direct interaction with meta-service when deploying in compute-storage decoupled mode. [#40264](https://github.com/apache/doris/pull/40264) +- Added a script for rapid deployment of FDB. [#39803](https://github.com/apache/doris/pull/39803) +- Optimized the output of `SHOW CACHE HOTSPOT` to unify the column name style with other `SHOW` statements. [#41322](https://github.com/apache/doris/pull/41322) +- When using a storage vault as the storage backend, disallowed the use of `latest_fs()` to avoid binding different storage backends to the same table. [#40516](https://github.com/apache/doris/pull/40516) +- Optimized the timeout strategy for calculating delete bitmaps when importing mow tables. [#40562](https://github.com/apache/doris/pull/40562) [#40333](https://github.com/apache/doris/pull/40333) +- The enable_file_cache in be.conf is now enabled by default in compute-storage decoupled mode. [#41502](https://github.com/apache/doris/pull/41502) + +### Lakehouse + +- When reading tables in CSV format, support for the session `keep_carriage_return` setting to control the reading behavior of the `\r` symbol. [#39980](https://github.com/apache/doris/pull/39980) +- The default maximum memory of BE's JVM has been adjusted to 2GB (affecting only new deployments). [#41403](https://github.com/apache/doris/pull/41403) +- Hive Catalog has added `hive.recursive_directories_table` and `hive.ignore_absent_partitions` properties to specify whether to recursively traverse data directories and whether to ignore missing partitions. [#39494](https://github.com/apache/doris/pull/39494) +- Optimized the Catalog refresh logic to avoid generating a large number of connections during refresh. 
[#39205](https://github.com/apache/doris/pull/39205) +- `SHOW CREATE DATABASE` and `SHOW CREATE TABLE` for external data sources now display location information. [#39179](https://github.com/apache/doris/pull/39179) +- The new optimizer supports inserting data into JDBC external tables using the `INSERT INTO` statement. [#41511](https://github.com/apache/doris/pull/41511) +- MaxCompute Catalog now supports complex data types. [#39259](https://github.com/apache/doris/pull/39259) +- Optimized the logic for reading and merging data shards of external tables. [#38311](https://github.com/apache/doris/pull/38311) +- Optimized some refresh strategies for metadata caches of external tables. [#38506](https://github.com/apache/doris/pull/38506) +- Paimon tables now support pushing down `IN/NOT IN` predicates. [#38390](https://github.com/apache/doris/pull/38390) +- Compatible with tables created in Parquet format by Paimon version 0.9. [#41020](https://github.com/apache/doris/pull/41020) + +### Asynchronous Materialized Views + +- Building asynchronous materialized views now supports the use of both immediate and starttime. [#39573](https://github.com/apache/doris/pull/39573) +- Asynchronous materialized views based on external tables will refresh the metadata cache of the external tables before refreshing the materialized views, ensuring construction based on the latest external table data. [#38212](https://github.com/apache/doris/pull/38212) +- Partition incremental construction now supports rolling up according to weekly and quarterly granularities. [#39286](https://github.com/apache/doris/pull/39286) + +### Query Optimizer + +- The aggregate function `GROUP_CONCAT` now supports the use of both `DISTINCT` and `ORDER BY`. [#38080](https://github.com/apache/doris/pull/38080) +- Optimized the collection and use of statistical information, as well as the logic for estimating row counts and cost calculations, to generate more efficient and stable execution plans. 
+- Window function partition data pre-filtering now supports cases containing multiple window functions. [#38393](https://github.com/apache/doris/pull/38393) + +### Query Execution + +- Reduced query latency by running prepare pipeline tasks in parallel. [#40874](https://github.com/apache/doris/pull/40874) +- Display Catalog information in Profile. [#38283](https://github.com/apache/doris/pull/38283) +- Optimized the computational performance of `IN` filtering conditions. [#40917](https://github.com/apache/doris/pull/40917) +- Supported cgroupv2 in K8S to limit Doris's memory usage. [#39256](https://github.com/apache/doris/pull/39256) +- Optimized the performance of converting strings to datetime types. [#38385](https://github.com/apache/doris/pull/38385) +- When a `string` is a decimal number, support casting it to an `int`, which will be more compatible with certain behaviors of MySQL. [#38847](https://github.com/apache/doris/pull/38847) + +### Semi-Structured Data Management + +- Optimized the performance of inverted index matching. [#41122](https://github.com/apache/doris/pull/41122) +- Temporarily prohibited the creation of inverted indexes with tokenization on arrays. [#39062](https://github.com/apache/doris/pull/39062) +- `explode_json_array` now supports binary JSON types. [#37278](https://github.com/apache/doris/pull/37278) +- IP data types now support bloomfilter indexes. [#39253](https://github.com/apache/doris/pull/39253) +- IP data types now support row storage. [#39258](https://github.com/apache/doris/pull/39258) +- Nested data types such as ARRAY, MAP, and STRUCT now support schema changes. [#39210](https://github.com/apache/doris/pull/39210) +- When creating MTMV, automatically truncate KEYs encountered in VARIANT data types. [#39988](https://github.com/apache/doris/pull/39988) +- Lazy loading of inverted indexes during queries to improve performance. 
[#38979](https://github.com/apache/doris/pull/38979) +- `add inverted index file size for open file`. [#37482](https://github.com/apache/doris/pull/37482) +- Reduced access to object storage interfaces during compaction to improve performance. [#41079](https://github.com/apache/doris/pull/41079) +- Added three new query profile metrics related to inverted indexes. [#36696](https://github.com/apache/doris/pull/36696) +- Reduced cache overhead for non-PreparedStatement SQL to improve performance. [#40910](https://github.com/apache/doris/pull/40910) +- Pre-warming cache now supports inverted indexes. [#38986](https://github.com/apache/doris/pull/38986) +- Inverted indexes are now cached immediately after writing. [#39076](https://github.com/apache/doris/pull/39076) + +### Compatibility + +- Fixed the issue of Thrift ID incompatibility on the master with branch-2.1. [#41057](https://github.com/apache/doris/pull/41057) + +### Other + +- BE HTTP API now supports authentication; set config::enable_all_http_auth to true (default is false) when authentication is required. [#39577](https://github.com/apache/doris/pull/39577) +- Optimized the user permissions required for the REFRESH operation. Permissions have been relaxed from ALTER to SHOW. [#39008](https://github.com/apache/doris/pull/39008) +- Reduced the range of nextId when calling advanceNextId(). [#40160](https://github.com/apache/doris/pull/40160) +- Optimized the caching mechanism for Java UDFs. [#40404](https://github.com/apache/doris/pull/40404) + +## Bug Fixes + +### Load + +- Fixed the issue where `abortTransaction` did not handle return codes. [#41275](https://github.com/apache/doris/pull/41275) +- Fixed the issue where transactions failed to commit or abort in compute-storage decoupled mode without calling `afterCommit/afterAbort`. 
[#41267](https://github.com/apache/doris/pull/41267) +- Fixed the issue where Routine Load could not work properly when modifying consumer offsets in compute-storage decoupled mode. [#39159](https://github.com/apache/doris/pull/39159) +- Fixed the issue of repeatedly closing file handles when obtaining error log file paths. [#41320](https://github.com/apache/doris/pull/41320) +- Fixed the issue of incorrect job progress caching for Routine Load in compute-storage decoupled mode. [#39313](https://github.com/apache/doris/pull/39313) +- Fixed the issue where Routine Load could get stuck when failing to commit transactions in compute-storage decoupled mode. [#40539](https://github.com/apache/doris/pull/40539) +- Fixed the issue where Routine Load kept reporting data quality check errors in compute-storage decoupled mode. [#39790](https://github.com/apache/doris/pull/39790) +- Fixed the issue where Routine Load did not check transactions before committing in compute-storage decoupled mode. [#39775](https://github.com/apache/doris/pull/39775) +- Fixed the issue where Routine Load did not check transactions before aborting in compute-storage decoupled mode. [#40463](https://github.com/apache/doris/pull/40463) +- Fixed the issue where cluster keys did not support certain data types. [#38966](https://github.com/apache/doris/pull/38966) +- Fixed the issue of transactions being repeatedly committed. [#39786](https://github.com/apache/doris/pull/39786) +- Fixed the issue of use after free with WAL when BE exits. [#33131](https://github.com/apache/doris/pull/33131) +- Fixed the issue where WAL playback did not skip completed import transactions in compute-storage decoupled mode. [#41262](https://github.com/apache/doris/pull/41262) +- Fixed the logic for selecting BE in group commit in compute-storage decoupled mode. 
[#39986](https://github.com/apache/doris/pull/39986) [#38644](https://github.com/apache/doris/pull/38644) +- Fixed the issue where BE might crash when group commit was enabled for insert into. [#39339](https://github.com/apache/doris/pull/39339) +- Fixed the issue where insert into with group commit enabled might get stuck. [#39391](https://github.com/apache/doris/pull/39391) +- Fixed the issue where not enabling the group commit option during import might result in a table not found error. [#39731](https://github.com/apache/doris/pull/39731) +- Fixed the issue of transaction submission timeouts due to too many tablets. [#40031](https://github.com/apache/doris/pull/40031) +- Fixed the issue of concurrent opens with Auto Partition. [#38605](https://github.com/apache/doris/pull/38605) +- Fixed the issue of import lock granularity being too large. [#40134](https://github.com/apache/doris/pull/40134) +- Fixed the issue of coredumps caused by zero-length varchars. [#40940](https://github.com/apache/doris/pull/40940) +- Fixed the issue of incorrect index Id values in log prints. [#38790](https://github.com/apache/doris/pull/38790) +- Fixed the issue of memtable shifting not closing BRPC streaming. [#40105](https://github.com/apache/doris/pull/40105) +- Fixed the issue of inaccurate bvar statistics during memtable shifting. [#39075](https://github.com/apache/doris/pull/39075) +- Fixed the issue of multi-replication fault tolerance during memtable shifting. [#38003](https://github.com/apache/doris/pull/38003) +- Fixed the issue of incorrect message length calculations for Routine Load with multiple tables in one stream. [#40367](https://github.com/apache/doris/pull/40367) +- Fixed the issue of inaccurate progress reporting for Broker Load. [#40325](https://github.com/apache/doris/pull/40325) +- Fixed the issue of inaccurate data scan volume reporting for Broker Load. 
[#40694](https://github.com/apache/doris/pull/40694) +- Fixed the issue of concurrency with Routine Load in compute-storage decoupled mode. [#39242](https://github.com/apache/doris/pull/39242) +- Fixed the issue of Routine Load jobs being canceled in compute-storage decoupled mode. [#39514](https://github.com/apache/doris/pull/39514) +- Fixed the issue of progress not being reset when deleting Kafka topics. [#38474](https://github.com/apache/doris/pull/38474) +- Fixed the issue of updating progress during transaction state transitions in Routine Load. [#39311](https://github.com/apache/doris/pull/39311) +- Fixed the issue of Routine Load switching from a paused state to a paused state. [#40728](https://github.com/apache/doris/pull/40728) +- Fixed the issue of Stream Load records being missed due to database deletion. [#39360](https://github.com/apache/doris/pull/39360) + +### Storage + +- Fixed the issue of missing storage policies. [#38700](https://github.com/apache/doris/pull/38700) +- Fixed the issue of errors during cross-version backup and recovery. [#38370](https://github.com/apache/doris/pull/38370) +- Fixed the NPE issue with ccr binlog. [#39909](https://github.com/apache/doris/pull/39909) +- Fixed potential issues with duplicate keys in mow. [#41309](https://github.com/apache/doris/pull/41309) [#39791](https://github.com/apache/doris/pull/39791) [#39958](https://github.com/apache/doris/pull/39958) [#38369](https://github.com/apache/doris/pull/38369) [#38331](https://github.com/apache/doris/pull/38331) +- Fixed the issue of not being able to write after backup and recovery in high-frequency write scenarios. [#40118](https://github.com/apache/doris/pull/40118) [#38321](https://github.com/apache/doris/pull/38321) +- Fixed the issue of data errors potentially triggered by deleting empty strings and schema changes. [#41064](https://github.com/apache/doris/pull/41064) +- Fixed the issue of incorrect statistics due to column updates. 
[#40880](https://github.com/apache/doris/pull/40880) +- Limited the size of tablet meta pb to prevent BE crashes due to oversized meta. [#39455](https://github.com/apache/doris/pull/39455) +- Fixed the potential column misalignment issue with the new optimizer in `begin; insert into values; commit`. [#39295](https://github.com/apache/doris/pull/39295) + +### Compute-Storage Decoupled + +- Fixed the issue where the tablet distribution might be inconsistent across multiple FEs in compute-storage decoupled mode. [#41458](https://github.com/apache/doris/pull/41458) +- Fixed the issue where TVF might not work in multi-computing group environments. [#39249](https://github.com/apache/doris/pull/39249) +- Fixed the issue where compaction used resources that had already been released when BE exited in compute-storage decoupled mode. [#39302](https://github.com/apache/doris/pull/39302) +- Fixed the issue where automatic start-stop might cause FE replay to get stuck. [#40027](https://github.com/apache/doris/pull/40027) +- Fixed the issue where the BE status and the stored status in meta-service were inconsistent. [#40799](https://github.com/apache/doris/pull/40799) +- Fixed the issue where the FE->meta-service connection pool could not automatically expire and reconnect. [#41202](https://github.com/apache/doris/pull/41202) [#40661](https://github.com/apache/doris/pull/40661) +- Fixed the issue where some tablets might repeatedly undergo unexpected balance processes during rebalance. [#39792](https://github.com/apache/doris/pull/39792) +- Fixed the issue where storage vault permissions were lost after FE restarted. [#40260](https://github.com/apache/doris/pull/40260) +- Fixed the issue where tablet row counts and other statistical information might be incomplete due to FDB scan range pagination. [#40494](https://github.com/apache/doris/pull/40494) +- Fixed the performance issue caused by a large number of aborted transactions associated with the same label. 
[#40606](https://github.com/apache/doris/pull/40606) +- Fixed the issue where `commit_txn` did not automatically re-enter, maintaining consistent behavior between compute-storage decoupled and integrated modes. [#39615](https://github.com/apache/doris/pull/39615) +- Fixed the issue where the number of projected columns increased when dropping columns. [#40187](https://github.com/apache/doris/pull/40187) +- Fixed the issue where delete statements did not correctly handle return values, causing data to still be visible after deletion. [#39428](https://github.com/apache/doris/pull/39428) +- Fixed the coredump issue caused by rowset metadata competition during file cache preheating. [#39361](https://github.com/apache/doris/pull/39361) +- Fixed the issue where the entire cache space would be used up when TTL cache enabled LRU eviction. [#39814](https://github.com/apache/doris/pull/39814) +- Fixed the issue where temporary files could not be recycled when importing commit rowset failed with HDFS storage backend. [#40215](https://github.com/apache/doris/pull/40215) + +### Lakehouse + +- Fixed some issues with predicate pushdown in JDBC Catalog. [#39064](https://github.com/apache/doris/pull/39064) +- Fixed the issue of not being able to read when `STRUCT` type columns are missing in Parquet format. [#38718](https://github.com/apache/doris/pull/38718) +- Fixed the issue of FileSystem leaks on the FE side in some cases. [#38610](https://github.com/apache/doris/pull/38610) +- Fixed the issue of metadata cache information being inconsistent when Hive/Iceberg tables write back in some cases. [#40729](https://github.com/apache/doris/pull/40729) +- Fixed the issue of unstable partition ID generation for external tables in some cases. [#39325](https://github.com/apache/doris/pull/39325) +- Fixed the issue of external table queries selecting BE nodes in the blacklist in some cases. 
[#39451](https://github.com/apache/doris/pull/39451) +- Optimized the timeout time for batch retrieval of external table partition information to avoid long-term thread occupation. [#39346](https://github.com/apache/doris/pull/39346) +- Fixed the issue of memory leaks when querying Hudi tables in some cases. [#41256](https://github.com/apache/doris/pull/41256) +- Fixed the issue of connection pool connection leaks in JDBC Catalog in some cases. [#39582](https://github.com/apache/doris/pull/39582) +- Fixed the issue of BE memory leaks in JDBC Catalog in some cases. [#41041](https://github.com/apache/doris/pull/41041) +- Fixed the issue of not being able to query Hudi data on Alibaba Cloud OSS. [#41316](https://github.com/apache/doris/pull/41316) +- Fixed the issue of not being able to read empty partitions in MaxCompute. [#40046](https://github.com/apache/doris/pull/40046) +- Fixed the issue of poor performance when querying Oracle through JDBC Catalog. [#41513](https://github.com/apache/doris/pull/41513) +- Fixed the issue of BE crashes when querying deletion vector of Paimon tables after enabling file cache features. [#39877](https://github.com/apache/doris/pull/39877) +- Fixed the issue of not being able to access Paimon tables on HDFS clusters with HA enabled. [#39806](https://github.com/apache/doris/pull/39806) +- Temporarily disabled the page index filtering feature of Parquet to avoid potential issues. [#38691](https://github.com/apache/doris/pull/38691) +- Fixed the issue of not being able to read unsigned types in Parquet files. [#39926](https://github.com/apache/doris/pull/39926) +- Fixed the issue of potential infinite loops when reading Parquet files in some cases. [#39523](https://github.com/apache/doris/pull/39523) + +### Asynchronous Materialized Views + +- Fixed the issue where partition construction might select the wrong table to track partitions if both sides have the same column names. 
[#40810](https://github.com/apache/doris/pull/40810) +- Fixed the issue where transparent rewrite partition compensation might result in incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed the issue where transparent rewrite did not take effect on external tables. [#38909](https://github.com/apache/doris/pull/38909) +- Fixed the issue where nested materialized views might not refresh properly. [#40433](https://github.com/apache/doris/pull/40433) + +### Synchronous Materialized Views + +- Fixed the issue where creating synchronous materialized views on MOW tables might result in incorrect query results. [#39171](https://github.com/apache/doris/pull/39171) + +### Query Optimizer + +- Fixed the issue where existing synchronous materialized views might not be usable after upgrading. [#41283](https://github.com/apache/doris/pull/41283) +- Fixed the issue of not correctly handling milliseconds when comparing datetime literals. [#40121](https://github.com/apache/doris/pull/40121) +- Fixed the issue of potential errors in conditional function partition pruning. [#39298](https://github.com/apache/doris/pull/39298) +- Fixed the issue where MOW tables with synchronous materialized views could not perform delete operations. [#39578](https://github.com/apache/doris/pull/39578) +- Fixed the issue where the nullable of slots in JDBC external table query predicates might be incorrectly planned, causing query errors. [#41014](https://github.com/apache/doris/pull/41014) + +### Query Execution + +- Fixed the memory leak issue caused by the use of runtime filters. [#39155](https://github.com/apache/doris/pull/39155) +- Fixed the issue of excessive memory usage by window functions. [#39581](https://github.com/apache/doris/pull/39581) +- Fixed a series of function compatibility issues during rolling upgrades. 
[#41023](https://github.com/apache/doris/pull/41023) [#40438](https://github.com/apache/doris/pull/40438) [#39648](https://github.com/apache/doris/pull/39648) +- Fixed the issue of incorrect results with `encryption_function` when used with constants. [#40201](https://github.com/apache/doris/pull/40201) +- Fixed the issue of errors when importing single-table materialized views. [#39061](https://github.com/apache/doris/pull/39061) +- Fixed the issue of incorrect partition result calculations for window functions. [#39100](https://github.com/apache/doris/pull/39100) [#40761](https://github.com/apache/doris/pull/40761) +- Fixed the issue of incorrect calculations for topn when null values are present. [#39497](https://github.com/apache/doris/pull/39497) +- Fixed the issue of incorrect results with the `map_agg` function. [#39743](https://github.com/apache/doris/pull/39743) +- Fixed the issue of incorrect messages returned by cancel. [#38982](https://github.com/apache/doris/pull/38982) +- Fixed the issue of BE core dumps caused by encrypt and decrypt functions. [#40726](https://github.com/apache/doris/pull/40726) +- Fixed the issue of queries getting stuck due to too many scanners in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Supported time types in runtime filters. [#38258](https://github.com/apache/doris/pull/38258) +- Fixed the issue of incorrect results with window funnel functions. [#40960](https://github.com/apache/doris/pull/40960) + +### Semi-Structured Data Management + +- Fixed the issue of match function errors when no indexes were present. [#38989](https://github.com/apache/doris/pull/38989) +- Fixed the issue of crashes when ARRAY data types were used as parameters for array_min/array_max functions. [#39492](https://github.com/apache/doris/pull/39492) +- Fixed the issue of nullable with the `array_enumerate_uniq` function. 
[#38384](https://github.com/apache/doris/pull/38384) +- Fixed the issue of bloomfilter indexes not being updated when adding or deleting columns. [#38431](https://github.com/apache/doris/pull/38431) +- Fixed the issue of es-catalog parsing exceptions with array data. [#39104](https://github.com/apache/doris/pull/39104) +- Fixed the issue of improper predicate push-down in es-catalog. [#40111](https://github.com/apache/doris/pull/40111) +- Fixed the issue of exceptions caused by modifying input data with `map()` and `struct()` functions. [#39699](https://github.com/apache/doris/pull/39699) +- Fixed the issue of index compaction crashes in special cases. [#40294](https://github.com/apache/doris/pull/40294) +- Fixed the issue of ARRAY type inverted indexes missing nullbitmaps. [#38907](https://github.com/apache/doris/pull/38907) +- Fixed the issue of incorrect results with the `count()` function on inverted indexes. [#41152](https://github.com/apache/doris/pull/41152) +- Fixed the issue of incorrect results with the `explode_map` function when using aliases. [#39757](https://github.com/apache/doris/pull/39757) +- Fixed the issue of VARIANT type not being able to use row storage for exceptional JSON data. [#39394](https://github.com/apache/doris/pull/39394) +- Fixed the issue of memory leaks when returning ARRAY results with VARIANT type. [#41358](https://github.com/apache/doris/pull/41358) +- Fixed the issue of changing column names with VARIANT type. [#40320](https://github.com/apache/doris/pull/40320) +- Fixed the issue of potential precision loss when converting VARIANT type to DECIMAL type. [#39650](https://github.com/apache/doris/pull/39650) +- Fixed the issue of nullable handling with VARIANT type. [#39732](https://github.com/apache/doris/pull/39732) +- Fixed the issue of sparse column reading with VARIANT type. [#40295](https://github.com/apache/doris/pull/40295) + +### Other + +- Fixed the compatibility issue between new and old audit log plugins. 
[#41401](https://github.com/apache/doris/pull/41401) +- Fixed the issue where users could see processes of others in certain cases. [#39747](https://github.com/apache/doris/pull/39747) +- Fixed the issue where users with permissions could not export. [#38365](https://github.com/apache/doris/pull/38365) +- Fixed the issue where create table like required create permissions for the existing table. [#37879](https://github.com/apache/doris/pull/37879) +- Fixed the issue where some features did not verify permissions. [#39726](https://github.com/apache/doris/pull/39726) +- Fixed the issue of not correctly closing connections when using SSL. [#38587](https://github.com/apache/doris/pull/38587) +- Fixed the issue where executing ALTER VIEW operations in some cases caused FE to fail to start. [#40872](https://github.com/apache/doris/pull/40872) \ No newline at end of file diff --git a/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md new file mode 100644 index 0000000000000..b15777212b400 --- /dev/null +++ b/versioned_docs/version-2.1/releasenotes/v3.0/release-3.0.3.md @@ -0,0 +1,226 @@ +--- +{ + "title": "Release 3.0.3", + "language": "en" +} +--- + + + + +Dear community members, the Apache Doris 3.0.3 version was officially released on December 02, 2024, this version further enhances the performance and stability of the system. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavioral Changes + +- Prohibited column updates on MOW tables with synchronous materialized views. [#40190](https://github.com/apache/doris/pull/40190) +- Adjusted the default parameters of RoutineLoad to improve import efficiency. [#42968](https://github.com/apache/doris/pull/42968) +- When StreamLoad fails, the return value of LoadedRows is adjusted to 0. 
[#41946](https://github.com/apache/doris/pull/41946) [#42291](https://github.com/apache/doris/pull/42291) +- Adjusted the default memory limit of Segment cache to 5%. [#42308](https://github.com/apache/doris/pull/42308) [#42436](https://github.com/apache/doris/pull/42436) + +## New Features + +- Introduced the session variable `enable_cooldown_replica_affinity` to control the affinity of cold and hot tiered replicas. [#42677](https://github.com/apache/doris/pull/42677) + +- Added `table$partition` syntax for querying partition information of Hive tables. [#40774](https://github.com/apache/doris/pull/40774) + + - [View Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/hive) + +- Supported creation of Hive tables in Text format. [#41860](https://github.com/apache/doris/pull/41860) [#42175](https://github.com/apache/doris/pull/42175) + + - [View Documentation](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/datalake-building/hive-build#table) + +### Asynchronous Materialized Views + +- Introduced new materialized view attribute `use_for_rewrite`. When `use_for_rewrite` is set to false, the materialized view does not participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Optimizer + +- Supported correlated non-aggregate subqueries. [#42236](https://github.com/apache/doris/pull/42236) + +### Query Execution + +- Added functions `ngram_search`, `normal_cdf`, `to_iso8601`, `from_iso8601_date`, `SESSION_USER()`, `last_query_id`. [#38226](https://github.com/apache/doris/pull/38226) [#40695](https://github.com/apache/doris/pull/40695) [#41075](https://github.com/apache/doris/pull/41075) [#41600](https://github.com/apache/doris/pull/41600) [#39575](https://github.com/apache/doris/pull/39575) [#40739](https://github.com/apache/doris/pull/40739) +- The `aes_encrypt` and `aes_decrypt` functions support GCM mode. 
[#40004](https://github.com/apache/doris/pull/40004) +- Profile outputs the changed session variable values. [#41016](https://github.com/apache/doris/pull/41016) [#41318](https://github.com/apache/doris/pull/41318) + +### Semi-structured Data Management + +- Added array functions `array_match_all` and `array_match_any`. [#40605](https://github.com/apache/doris/pull/40605) [#43514](https://github.com/apache/doris/pull/43514) +- The array function `array_agg` supports nesting ARRAY/MAP/STRUCT within ARRAY. [#42009](https://github.com/apache/doris/pull/42009) +- Added approximate aggregate statistical functions `approx_top_k` and `approx_top_sum`. [#44082](https://github.com/apache/doris/pull/44082) + +## Improvements + +### Storage + +- Supported `bitmap_empty` as the default value. [#40364](https://github.com/apache/doris/pull/40364) +- Introduced the session variable `insert_timeout` to control the timeout of DELETE statements. [#41063](https://github.com/apache/doris/pull/41063) +- Improved some error message prompts. [#41048](https://github.com/apache/doris/pull/41048) [#39631](https://github.com/apache/doris/pull/39631) +- Improved the priority scheduling of replica repair. [#41076](https://github.com/apache/doris/pull/41076) +- Enhanced the robustness of timezone handling when creating tables. [#41926](https://github.com/apache/doris/pull/41926) [#42389](https://github.com/apache/doris/pull/42389) +- Checked the validity of partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Supported Unicode-encoded column names in DELETE operations. [#39381](https://github.com/apache/doris/pull/39381) + +### Compute-Storage Decoupled + +- Supported ARM architecture deployment in storage and compute separation mode. 
[#42467](https://github.com/apache/doris/pull/42467) [#43377](https://github.com/apache/doris/pull/43377) +- Optimized the eviction strategy and lock competition of file cache, improving hit rate and high concurrency point query performance. [#42451](https://github.com/apache/doris/pull/42451) [#43201](https://github.com/apache/doris/pull/43201) [#41818](https://github.com/apache/doris/pull/41818) [#43401](https://github.com/apache/doris/pull/43401) +- S3 storage vault supported `use_path_style`, solving the problem of using custom domain names for object storage. [#43060](https://github.com/apache/doris/pull/43060) [#43343](https://github.com/apache/doris/pull/43343) [#43330](https://github.com/apache/doris/pull/43330) +- Optimized storage and compute separation configuration and deployment, preventing misoperations in different modes. [#43381](https://github.com/apache/doris/pull/43381) [#43522](https://github.com/apache/doris/pull/43522) [#43434](https://github.com/apache/doris/pull/43434) [#40764](https://github.com/apache/doris/pull/40764) [#43891](https://github.com/apache/doris/pull/43891) +- Optimized observability and provided an interface for deleting specified segment file cache. [#38489](https://github.com/apache/doris/pull/38489) [#42896](https://github.com/apache/doris/pull/42896) [#41037](https://github.com/apache/doris/pull/41037) [#43412](https://github.com/apache/doris/pull/43412) +- Optimized Meta-service operation and maintenance interface: RPC rate limiting and tablet metadata correction. [#42413](https://github.com/apache/doris/pull/42413) [#43884](https://github.com/apache/doris/pull/43884) [#41782](https://github.com/apache/doris/pull/41782) [#43460](https://github.com/apache/doris/pull/43460) + +### Lakehouse + +- Paimon Catalog supported Alibaba Cloud DLF and OSS-HDFS storage. 
[#41247](https://github.com/apache/doris/pull/41247) [#42585](https://github.com/apache/doris/pull/42585) + + - View [Documentation](https://doris.apache.org/docs/3.0/lakehouse/datalake-analytics/paimon) + +- Supported reading of Hive tables in OpenCSV format. [#42257](https://github.com/apache/doris/pull/42257) [#42942](https://github.com/apache/doris/pull/42942) +- Optimized the performance of accessing the `information_schema.columns` table in External Catalog. [#41659](https://github.com/apache/doris/pull/41659) [#41962](https://github.com/apache/doris/pull/41962) +- Used the new Max Compute open storage API to access Max Compute data sources. [#41614](https://github.com/apache/doris/pull/41614) +- Optimized the scheduling policy of the JNI part of Paimon tables, making scan tasks more balanced. [#43310](https://github.com/apache/doris/pull/43310) +- Optimized the read performance of small ORC files. [#42004](https://github.com/apache/doris/pull/42004) [#43467](https://github.com/apache/doris/pull/43467) +- Supported reading of parquet files in brotli compressed format. [#42177](https://github.com/apache/doris/pull/42177) +- Added `file_cache_statistics` table under the `information_schema` library to view metadata cache statistics. [#42160](https://github.com/apache/doris/pull/42160) + +### Query Optimizer + +- Optimization: When queries only differ in comments, the same SQL Cache can be reused. [#40049](https://github.com/apache/doris/pull/40049) +- Optimization: Improved the stability of statistical information when data is frequently updated. [#43865](https://github.com/apache/doris/pull/43865) [#39788](https://github.com/apache/doris/pull/39788) [#43009](https://github.com/apache/doris/pull/43009) [#40457](https://github.com/apache/doris/pull/40457) [#42409](https://github.com/apache/doris/pull/42409) [#41894](https://github.com/apache/doris/pull/41894) +- Optimization: Enhanced the stability of constant folding. 
[#42910](https://github.com/apache/doris/pull/42910) [#41164](https://github.com/apache/doris/pull/41164) [#39723](https://github.com/apache/doris/pull/39723) [#41394](https://github.com/apache/doris/pull/41394) [#42256](https://github.com/apache/doris/pull/42256) [#40441](https://github.com/apache/doris/pull/40441) +- Optimization: Column pruning can generate better execution plans. [#41719](https://github.com/apache/doris/pull/41719) [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Optimized the memory usage of the sort operator. [#39306](https://github.com/apache/doris/pull/39306) +- Optimized the performance of computations on ARM. [#38888](https://github.com/apache/doris/pull/38888) [#38759](https://github.com/apache/doris/pull/38759) +- Optimized the computational performance of a series of functions. [#40366](https://github.com/apache/doris/pull/40366) [#40821](https://github.com/apache/doris/pull/40821) [#40670](https://github.com/apache/doris/pull/40670) [#41206](https://github.com/apache/doris/pull/41206) [#40162](https://github.com/apache/doris/pull/40162) +- Used SSE instructions to optimize the performance of the `match_ipv6_subnet` function. [#38755](https://github.com/apache/doris/pull/38755) +- Supported automatic creation of new partitions during insert overwrite. [#38628](https://github.com/apache/doris/pull/38628) [#42645](https://github.com/apache/doris/pull/42645) +- Added the status of each PipelineTask in Profile. [#42981](https://github.com/apache/doris/pull/42981) +- IP type supported runtime filter. [#39985](https://github.com/apache/doris/pull/39985) + +### Semi-structured Data Management + +- Output the real SQL of prepared statements in audit logs. [#43321](https://github.com/apache/doris/pull/43321) +- The filebeat doris output plugin supports fault tolerance and progress reporting. [#36355](https://github.com/apache/doris/pull/36355) +- Optimized the performance of inverted index queries. 
[#41547](https://github.com/apache/doris/pull/41547) [#41585](https://github.com/apache/doris/pull/41585) [#41567](https://github.com/apache/doris/pull/41567) [#41577](https://github.com/apache/doris/pull/41577) [#42060](https://github.com/apache/doris/pull/42060) [#42372](https://github.com/apache/doris/pull/42372) +- The array function `array overlaps` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- The IP function `is_ip_address_in_range` supports acceleration using inverted indexes. [#41571](https://github.com/apache/doris/pull/41571) +- Optimized the CAST performance of the VARIANT data type. [#41775](https://github.com/apache/doris/pull/41775) [#42438](https://github.com/apache/doris/pull/42438) [#43320](https://github.com/apache/doris/pull/43320) +- Optimized the CPU resource consumption of the Variant data type. [#42856](https://github.com/apache/doris/pull/42856) [#43062](https://github.com/apache/doris/pull/43062) [#43634](https://github.com/apache/doris/pull/43634) +- Optimized the metadata and execution memory resource consumption of the Variant data type. [#42448](https://github.com/apache/doris/pull/42448) [#43326](https://github.com/apache/doris/pull/43326) [#41482](https://github.com/apache/doris/pull/41482) [#43093](https://github.com/apache/doris/pull/43093) [#43567](https://github.com/apache/doris/pull/43567) [#43620](https://github.com/apache/doris/pull/43620) + +### Permissions + +- Added a new configuration item `ldap_group_filter` in LDAP for custom group filtering. [#43292](https://github.com/apache/doris/pull/43292) + +### Other + +- Supported displaying connection count information by user in FE monitoring items. [#39200](https://github.com/apache/doris/pull/39200) + +## Bug Fixes + +### Storage + +- Fixed the issue with using IPv6 hostnames. [#40074](https://github.com/apache/doris/pull/40074) +- Fixed the inaccurate display of broker/s3 load progress. 
[#43535](https://github.com/apache/doris/pull/43535) +- Fixed the issue where queries might hang from FE. [#41303](https://github.com/apache/doris/pull/41303) [#42382](https://github.com/apache/doris/pull/42382) +- Fixed the issue of duplicate auto-increment IDs under exceptional circumstances. [#43774](https://github.com/apache/doris/pull/43774) [#43983](https://github.com/apache/doris/pull/43983) +- Fixed occasional NPE issues with groupcommit. [#43635](https://github.com/apache/doris/pull/43635) +- Fixed the inaccurate calculation of auto bucket. [#41675](https://github.com/apache/doris/pull/41675) [#41835](https://github.com/apache/doris/pull/41835) +- Fixed the issue where FE might not correctly plan multi-table flows after restart. [#41677](https://github.com/apache/doris/pull/41677) [#42290](https://github.com/apache/doris/pull/42290) + +### Compute-Storage Decoupled + +- Fixed the issue that MOW primary key tables with large delete bitmaps might cause coredump. [#43088](https://github.com/apache/doris/pull/43088) [#43457](https://github.com/apache/doris/pull/43457) [#43479](https://github.com/apache/doris/pull/43479) [#43407](https://github.com/apache/doris/pull/43407) [#43297](https://github.com/apache/doris/pull/43297) [#43613](https://github.com/apache/doris/pull/43613) [#43615](https://github.com/apache/doris/pull/43615) [#43854](https://github.com/apache/doris/pull/43854) [#43968](https://github.com/apache/doris/pull/43968) [#44074](https://github.com/apache/doris/pull/44074) [#41793](https://github.com/apache/doris/pull/41793) [#42142](https://github.com/apache/doris/pull/42142) +- Fixed the issue that segment files, when being a multiple of 5MB, would fail to upload objects. [#43254](https://github.com/apache/doris/pull/43254) +- Fixed the issue that the default retry policy of aws sdk did not take effect. 
[#43575](https://github.com/apache/doris/pull/43575) [#43648](https://github.com/apache/doris/pull/43648) +- Fixed the issue that altering storage vault could continue execution even when the wrong type was specified. [#43489](https://github.com/apache/doris/pull/43489) [#43352](https://github.com/apache/doris/pull/43352) [#43495](https://github.com/apache/doris/pull/43495) +- Fixed the issue that tablet_id might be 0 during the delayed commit process of large transactions. [#42043](https://github.com/apache/doris/pull/42043) [#42905](https://github.com/apache/doris/pull/42905) +- Fixed the issue that constant folding RPC and FE forwarding SQL might not be executed in the expected computation group. [#43110](https://github.com/apache/doris/pull/43110) [#41819](https://github.com/apache/doris/pull/41819) [#41846](https://github.com/apache/doris/pull/41846) +- Fixed the issue that meta-service did not strictly check instance_id upon receiving RPC. [#43253](https://github.com/apache/doris/pull/43253) [#43832](https://github.com/apache/doris/pull/43832) +- Fixed the issue that FE follower information_schema version did not update in time. [#43496](https://github.com/apache/doris/pull/43496) +- Fixed the issue of atomicity in file cache rename and inaccurate metrics. [#42869](https://github.com/apache/doris/pull/42869) [#43504](https://github.com/apache/doris/pull/43504) [#43220](https://github.com/apache/doris/pull/43220) + +### Lakehouse + +- Prohibited implicit conversion predicates from being pushed down to JDBC data sources to avoid inconsistent query results. [#42102](https://github.com/apache/doris/pull/42102) +- Fixed some read issues with high-version Hive transactional tables. [#42226](https://github.com/apache/doris/pull/42226) +- Fixed the issue that the Export command might cause deadlocks. 
[#43083](https://github.com/apache/doris/pull/43083) [#43402](https://github.com/apache/doris/pull/43402) +- Fixed the issue of being unable to query Hive views created by Spark. [#43552](https://github.com/apache/doris/pull/43552) +- Fixed the issue that Hive partition paths containing special characters led to incorrect partition pruning. [#42906](https://github.com/apache/doris/pull/42906) +- Fixed the issue that Iceberg Catalog could not use AWS Glue. [#41084](https://github.com/apache/doris/pull/41084) + +### Asynchronous Materialized Views + +- Fixed the issue that asynchronous materialized views might not refresh after the base table is rebuilt. [#41762](https://github.com/apache/doris/pull/41762) + +### Query Optimizer + +- Fixed the issue that partition pruning results might be incorrect when using multi-column range partitioning. [#43332](https://github.com/apache/doris/pull/43332) +- Fixed the issue of incorrect calculation results in some limit offset scenarios. [#42576](https://github.com/apache/doris/pull/42576) + +### Query Execution + +- Fixed the issue that hash join with array types larger than 4G could cause BE Core. [#43861](https://github.com/apache/doris/pull/43861) +- Fixed the issue that is null predicate operations might yield incorrect results in some scenarios. [#43619](https://github.com/apache/doris/pull/43619) +- Fixed the issue that bitmap types might produce incorrect output results in hash join. [#43718](https://github.com/apache/doris/pull/43718) +- Fixed some issues where function results were calculated incorrectly. 
[#40710](https://github.com/apache/doris/pull/40710) [#39358](https://github.com/apache/doris/pull/39358) [#40929](https://github.com/apache/doris/pull/40929) [#40869](https://github.com/apache/doris/pull/40869) [#40285](https://github.com/apache/doris/pull/40285) [#39891](https://github.com/apache/doris/pull/39891) [#40530](https://github.com/apache/doris/pull/40530) [#41948](https://github.com/apache/doris/pull/41948) [#43588](https://github.com/apache/doris/pull/43588) +- Fixed some issues with JSON type parsing. [#39937](https://github.com/apache/doris/pull/39937) +- Fixed issues with varchar and char types in runtime filter operations. [#43758](https://github.com/apache/doris/pull/43758) [#43919](https://github.com/apache/doris/pull/43919) +- Fixed some issues with the use of decimal256 in scalar and aggregate functions. [#42136](https://github.com/apache/doris/pull/42136) [#42356](https://github.com/apache/doris/pull/42356) +- Fixed the issue that arrow flight reported `Reach limit of connections` errors upon connection. [#39127](https://github.com/apache/doris/pull/39127) +- Fixed the issue of incorrect memory usage statistics for BE in k8s environments. [#41123](https://github.com/apache/doris/pull/41123) + +### Semi-structured Data Management + +- Adjusted the default values of `segment_cache_fd_percentage` and `inverted_index_fd_number_limit_percent`. [#42224](https://github.com/apache/doris/pull/42224) +- logstash now supports group_commit. [#40450](https://github.com/apache/doris/pull/40450) +- Fixed the issue of coredump when building index. [#43246](https://github.com/apache/doris/pull/43246) [#43298](https://github.com/apache/doris/pull/43298) +- Fixed issues with variant index. [#43375](https://github.com/apache/doris/pull/43375) [#43773](https://github.com/apache/doris/pull/43773) +- Fixed potential fd and memory leaks under abnormal compaction circumstances. 
[#42374](https://github.com/apache/doris/pull/42374) +- Inverted index match null now correctly returns null instead of false. [#41786](https://github.com/apache/doris/pull/41786) +- Fixed the issue of coredump when ngram bloomfilter index bf_size is set to 65536. [#43645](https://github.com/apache/doris/pull/43645) +- Fixed the issue of potential coredump during complex data type JOINs. [#40398](https://github.com/apache/doris/pull/40398) +- Fixed the issue of coredump with TVF JSON data. [#43187](https://github.com/apache/doris/pull/43187) +- Fixed the precision issue of bloom filter calculations for dates and times. [#43612](https://github.com/apache/doris/pull/43612) +- Fixed the issue of coredump with IPv6 type storage. [#43251](https://github.com/apache/doris/pull/43251) +- Fixed the issue of coredump when using VARIANT type with light_schema_change disabled. [#40908](https://github.com/apache/doris/pull/40908) +- Improved cache performance for high-concurrency point queries. [#44077](https://github.com/apache/doris/pull/44077) +- Fixed the issue that bloom filter indexes were not synchronized when columns were deleted. [#43378](https://github.com/apache/doris/pull/43378) +- Fixed instability issues with es catalog under special circumstances such as mixed array and scalar data. [#40314](https://github.com/apache/doris/pull/40314) [#40385](https://github.com/apache/doris/pull/40385) [#43399](https://github.com/apache/doris/pull/43399) [#40614](https://github.com/apache/doris/pull/40614) +- Fixed coredump issues caused by abnormal regular pattern matching. [#43394](https://github.com/apache/doris/pull/43394) + +### Permissions + +- Fixed several issues where permissions were not properly restricted after authorization. 
[#43193](https://github.com/apache/doris/pull/43193) [#41723](https://github.com/apache/doris/pull/41723) [#42107](https://github.com/apache/doris/pull/42107) [#43306](https://github.com/apache/doris/pull/43306) +- Enhanced several permission checks. [#40688](https://github.com/apache/doris/pull/40688) [#40533](https://github.com/apache/doris/pull/40533) [#41791](https://github.com/apache/doris/pull/41791) [#42106](https://github.com/apache/doris/pull/42106) + +### Other + +- Supplemented missing audit log fields in audit log tables and files. [#43303](https://github.com/apache/doris/pull/43303) + + - [View Documentation](https://doris.apache.org/docs/3.0/admin-manual/system-tables/internal_schema/audit_log) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md new file mode 100644 index 0000000000000..dd94da6816294 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.0.md @@ -0,0 +1,379 @@ +--- +{ + "title": "Release 1.1.0", + "language": "en" +} +--- + + + +In version 1.1, we realized the full vectorization of the computing layer and storage layer, and officially enabled the vectorized execution engine as a stable function. All queries are executed by the vectorized execution engine by default, and the performance is 3-5 times higher than the previous version. It increases the ability to access the external tables of Apache Iceberg and supports federated query of data in Doris and Iceberg, and expands the analysis capabilities of Apache Doris on the data lake; on the basis of the original LZ4, the ZSTD compression algorithm is added , further improves the data compression rate; fixed many performance and stability problems in previous versions, greatly improving system stability. Downloading and using is recommended. 
+ +## Upgrade Notes + +### The vectorized execution engine is enabled by default + +In version 1.0, we introduced the vectorized execution engine as an experimental feature, and users had to enable it manually when executing queries by setting the session variables `set batch_size = 4096` and `set enable_vectorized_engine = true`. + +In version 1.1, we officially enabled the vectorized execution engine as a stable feature. The session variable `enable_vectorized_engine` is set to true by default, so all queries are executed through the vectorized execution engine by default. + +### BE Binary File Renaming + +The BE binary file has been renamed from palo_be to doris_be. If your scripts rely on the process name for cluster management or other operations, update them accordingly. + +### Segment storage format upgrade + +The storage format of earlier versions of Apache Doris was Segment V1. In version 0.12, we implemented Segment V2 as a new storage format, which introduced Bitmap indexes, memory tables, page cache, dictionary compression, delayed materialization and many other features. Starting from version 0.13, the default storage format for newly created tables is Segment V2, while compatibility with the Segment V1 format is maintained. + +To keep the code structure maintainable and reduce the additional learning and development costs caused by redundant historical code, we have decided to no longer support the Segment V1 storage format from the next version. This part of the code is expected to be deleted in Apache Doris 1.2. + +### Normal Upgrade + +For normal upgrade operations, you can perform rolling upgrades according to the cluster upgrade documentation on the official website. 
+ +[https://doris.apache.org//docs/admin-manual/cluster-management/upgrade](https://doris.apache.org//docs/admin-manual/cluster-management/upgrade) + +## Features + +### Support random distribution of data [experimental] + +In some scenarios (such as log data analysis), users may not be able to find a suitable bucket key to avoid data skew, so the system needs to provide additional distribution methods to solve the problem. + +Therefore, when creating a table you can set `DISTRIBUTED BY random BUCKETS number` to use random distribution. During import, the data is randomly written to a single tablet, which reduces data fanout during the loading process, lowers resource overhead, and improves system stability. + +### Support for creating Iceberg external tables [experimental] + +Iceberg external tables provide Apache Doris with direct access to data stored in Iceberg. Through Iceberg external tables, federated queries on data stored in local storage and Iceberg can be implemented, which saves tedious data loading work, simplifies the system architecture for data analysis, and enables more complex analysis operations. + +In version 1.1, Apache Doris supports creating Iceberg external tables and querying data, and supports automatic synchronization of all table schemas in the Iceberg database through the REFRESH command. + +### Added ZSTD compression algorithm + +At present, the data compression method in Apache Doris is uniformly specified by the system, and the default is LZ4. In scenarios that are sensitive to data storage costs, the LZ4 compression ratio may not meet requirements. + +In version 1.1, users can set "compression"="zstd" in the table properties to specify the compression method as ZSTD when creating a table. 
In the 25GB 110 million lines of text log test data, the highest compression rate is nearly 10 times, which is 53% higher than the original compression rate, and the speed of reading data from disk and decompressing it is increased by 30%. + +## Improvements + +### More comprehensive vectorization support + +In version 1.1, we implemented full vectorization of the compute and storage layers, including: + +Implemented vectorization of all built-in functions + +The storage layer implements vectorization and supports dictionary optimization for low-cardinality string columns + +Optimized and resolved numerous performance and stability issues with the vectorization engine. + +We tested the performance of Apache Doris version 1.1 and version 0.15 on the SSB and TPC-H standard test datasets: + +On all 13 SQLs in the SSB test data set, version 1.1 is better than version 0.15, and the overall performance is improved by about 3 times, which solves the problem of performance degradation in some scenarios in version 1.0; + +On all 22 SQLs in the TPC-H test data set, version 1.1 is better than version 0.15, the overall performance is improved by about 4.5 times, and the performance of some scenarios is improved by more than ten times; + +![](/images/release-note-1.1.0-SSB.png) + +

SSB Benchmark

+ +![](/images/release-note-1.1.0-TPC-H.png) + + +

TPC-H Benchmark

+ +**Performance test report** + +[https://doris.apache.org//docs/benchmark/ssb](https://doris.apache.org//docs/benchmark/ssb) + +[https://doris.apache.org//docs/benchmark/tpch](https://doris.apache.org//docs/benchmark/tpch) + +### Compaction logic optimization and real-time guarantee + +In Apache Doris, each commit generates a data version. In highly concurrent write scenarios, -235 errors are prone to occur because too many data versions accumulate and compaction does not keep up, and query performance decreases accordingly. + +In version 1.1, we introduced QuickCompaction, which actively triggers compaction when the number of data versions increases. At the same time, by improving the ability to scan fragment metadata, it can quickly find fragments with too many data versions and trigger compaction. Through active triggering and passive scanning, the timeliness problem of data merging is solved. + +In addition, for high-frequency cumulative compaction of small files, scheduling and isolation of compaction tasks are implemented to prevent heavyweight base compaction from affecting the merging of new data. + +Finally, the strategy for merging small files is optimized, and gradient merging is adopted. Each time, the files participating in a merge belong to the same order of magnitude in size, which prevents versions with large differences in size from being merged together. Files are merged gradually, level by level, reducing the number of times a single file participates in merging, which can greatly reduce the CPU consumption of the system. + +When the upstream data maintains a write rate of 100,000 rows per second (20 concurrent write tasks, 5000 rows per job, and a checkpoint interval of 1s), version 1.1 behaves as follows: + +- Quick data consolidation: Tablet version remains below 50 and compaction score is stable. 
Compared with the previous version, in which -235 errors occurred frequently during highly concurrent writes, compaction efficiency has improved by more than 10 times. + +- Significantly reduced CPU resource consumption: The compaction strategy for small files has been optimized. In the highly concurrent write scenario above, CPU resource consumption is reduced by 25%. + +- Stable query latency: The overall orderliness of data is improved, and fluctuation in query latency is greatly reduced. Query latency during highly concurrent writes is the same as with query-only workloads, and query performance is 3 to 4 times better than in the previous version. + +### Read efficiency optimization for Parquet and ORC files + +By adjusting Arrow parameters, Arrow's multi-threaded read capability is used to speed up the reading of each row_group, and the reader is changed to an SPSC model so that prefetching reduces the cost of waiting on the network. After optimization, the performance of Parquet file import is improved by 4 to 5 times. + +### Safer metadata Checkpoint + +By double-checking the image files generated after a metadata checkpoint and by retaining historical image files, the problem of metadata corruption caused by image file errors is solved. + +## Bugfix + +### Fix the problem that the data cannot be queried due to the missing data version. (Serious) + +This issue was introduced in version 1.0 and may result in the loss of data versions for multiple replicas. + +### Fix the problem that the resource isolation is invalid for the resource usage limit of loading tasks (Moderate) + +In 1.1, broker load and routine load use Backends with the specified resource tags to do the load. + +### Use HTTP BRPC to transfer network data packets over 2GB (Moderate) + +In the previous version, when the data transmitted between Backends through BRPC exceeded 2GB, +it could cause data transmission errors. 
+ +## Others + +### Disabling Mini Load + +The `/_load` interface is disabled by default; please use the `/_stream_load` interface instead. +You can re-enable it by turning off the FE configuration item `disable_mini_load`. + +The Mini Load interface will be completely removed in version 1.2. + +### Completely disable the SegmentV1 storage format + +Data in SegmentV1 format can no longer be created. Existing data can continue to be accessed normally. +You can use the `ADMIN SHOW TABLET STORAGE FORMAT` statement to check whether data in SegmentV1 format +still exists in the cluster, and convert it to SegmentV2 through the data conversion command. + +Access to SegmentV1 data will no longer be supported in version 1.2. + +### Limit the maximum length of String type + +In previous versions, String types were allowed a maximum length of 2GB. +In version 1.1, the maximum length of the String type is limited to 1MB. Strings longer than this can no longer be written. +At the same time, using the String type as a partitioning or bucketing column of a table is no longer supported. + +String data that has already been written can still be accessed normally. + +### Fix fastjson related vulnerabilities + +Updated the Canal version to fix the fastjson security vulnerability. + +### Added `ADMIN DIAGNOSE TABLET` command + +Used to quickly diagnose problems with the specified tablet. + +## Download to Use + +### Download Link + +[https://doris.apache.org/download](https://doris.apache.org/download) + +### Feedback + +If you encounter any problems during use, please feel free to contact us through the GitHub discussion forum or the dev mailing list at any time. 
+ +GitHub Forum: [https://github.com/apache/doris/discussions](https://github.com/apache/doris/discussions) + +Mailing list: [dev@doris.apache.org](dev@doris.apache.org) + +## Thanks + +Thanks to everyone who has contributed to this release: + +``` + +@adonis0147 + +@airborne12 + +@amosbird + +@aopangzi + +@arthuryangcs + +@awakeljw + +@BePPPower + +@BiteTheDDDDt + +@bridgeDream + +@caiconghui + +@cambyzju + +@ccoffline + +@chenlinzhong + +@daikon12 + +@DarvenDuan + +@dataalive + +@dataroaring + +@deardeng + +@Doris-Extras + +@emerkfu + +@EmmyMiao87 + +@englefly + +@Gabriel39 + +@GoGoWen + +@gtchaos + +@HappenLee + +@hello-stephen + +@Henry2SS + +@hewei-nju + +@hf200012 + +@jacktengg + +@jackwener + +@Jibing-Li + +@JNSimba + +@kangshisen + +@Kikyou1997 + +@kylinmac + +@Lchangliang + +@leo65535 + +@liaoxin01 + +@liutang123 + +@lovingfeel + +@luozenglin + +@luwei16 + +@luzhijing + +@mklzl + +@morningman + +@morrySnow + +@nextdreamblue + +@Nivane + +@pengxiangyu + +@qidaye + +@qzsee + +@SaintBacchus + +@SleepyBear96 + +@smallhibiscus + +@spaces-X + +@stalary + +@starocean999 + +@steadyBoy + +@SWJTU-ZhangLei + +@Tanya-W + +@tarepanda1024 + +@tianhui5 + +@Userwhite + +@wangbo + +@wangyf0555 + +@weizuo93 + +@whutpencil + +@wsjz + +@wunan1210 + +@xiaokang + +@xinyiZzz + +@xlwh + +@xy720 + +@yangzhg + +@Yankee24 + +@yiguolei + +@yinzhijian + +@yixiutt + +@zbtzbtzbt + +@zenoyang + +@zhangstar333 + +@zhangyifan27 + +@zhannngchen + +@zhengshengjun + +@zhengshiJ + +@zingdle + +@zuochunwei + +@zy-kkk +``` diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md new file mode 100644 index 0000000000000..73a6c2d976999 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.1.md @@ -0,0 +1,78 @@ +--- +{ + "title": "Release 1.1.1", + "language": "en" +} +--- + + + +## Features + +### Support ODBC Sink in Vectorized Engine. 
+ +This feature is available in the non-vectorized engine but was missing from the vectorized engine in 1.1, so it is added back in 1.1.1. + +### Simple Memtracker for Vectorized Engine. + +In 1.1 there is no memtracker in BE for the vectorized engine, so memory usage is uncontrolled and can cause OOM. In 1.1.1, a simple memtracker is added to BE; it controls memory usage and cancels the query when the memory limit is exceeded. + +## Improvements + +### Cache decompressed data in page cache. + +Some data is encoded with bitshuffle, and decompressing it during a query takes a lot of time. In 1.1.1, Doris keeps the decompressed bitshuffle-encoded data in the page cache to accelerate queries, and we find it can reduce latency by 30% for some queries in ssb-flat. + +## Bug Fix + +### Fix the problem that rolling upgrade from 1.0 could not be done. (Serious) + +This issue was introduced in version 1.1 and may cause BE to core dump when BE is upgraded but FE is not. + +If you encounter this problem, you can try to fix it with [#10833](https://github.com/apache/doris/pull/10833). + +### Fix the problem that some queries do not fall back to the non-vectorized engine, causing BE to core. + +Currently, the vectorized engine cannot handle all SQL queries, and some queries (such as left outer join) fall back to the non-vectorized engine. Some cases were not covered in 1.1, which causes BE to crash. + +### Compaction does not work correctly and causes -235 error. + +When a rowset has multiple segments in a unique key compaction, the segment rows are merged in generic_iterator but merged_rows is not increased. Compaction then fails in check_correctness, leaving the tablet with too many versions, which leads to the -235 load error. + +### Some segmentation fault cases during query. 
+ +[#10961](https://github.com/apache/doris/pull/10961) +[#10954](https://github.com/apache/doris/pull/10954) +[#10962](https://github.com/apache/doris/pull/10962) + +# Thanks + +Thanks to everyone who has contributed to this release: + +``` +@jacktengg +@mrhhsg +@xinyiZzz +@yixiutt +@starocean999 +@morrySnow +@morningman +@HappenLee +``` \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md new file mode 100644 index 0000000000000..223b65fda064c --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.2.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 1.1.2", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 170 issues or performance improvement since 1.1.1. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + +# Features + +### New MemTracker + +Introduced new MemTracker for both vectorized engine and non-vectorized engine which is more accurate. + +### Add API for showing current queries and kill query + +### Support read/write emoji of UTF16 via ODBC Table + +# Improvements + +### Data Lake related improvements + +- Improved HDFS ORC File scan performance about 300%. [#11501](https://github.com/apache/doris/pull/11501) + +- Support HDFS HA mode when query Iceberg table. + +- Support query Hive data created by [Apache Tez](https://tez.apache.org/) + +- Add Ali OSS as Hive external support. + +### Add support for string and text type in Spark Load + + +### Add reuse block in non-vectorized engine and have 50% performance improvement in some cases. [#11392](https://github.com/apache/doris/pull/11392) + +### Improve like or regex performance + +### Disable tcmalloc's aggressive_memory_decommit + +It will have 40% performance gains in load or query. + +Currently it is a config, you can change it by set config `tc_enable_aggressive_memory_decommit`. 
+ +# Bug Fix + +### Some issues about FE that will cause FE failure or data corruption. + +- Add reserved disk config to avoid too many reserved BDB-JE files. **(Serious)** In an HA environment, BDB-JE retains as many reserved files as it can. The BDB-JE log files are not deleted until the disk limit is approached. + +- Fix fatal bug in BDB-JE which causes an FE replica to fail to start correctly or data to be corrupted. **(Serious)** + +### FE will hang on waitFor_rpc during query and BE will hang in highly concurrent scenarios. + +[#12459](https://github.com/apache/doris/pull/12459) [#12458](https://github.com/apache/doris/pull/12458) [#12392](https://github.com/apache/doris/pull/12392) + +### A fatal issue in vectorized storage engine which will cause wrong results. **(Serious)** + +[#11754](https://github.com/apache/doris/pull/11754) [#11694](https://github.com/apache/doris/pull/11694) + +### Lots of planner related issues that will cause BE to core or enter an abnormal state. + +[#12080](https://github.com/apache/doris/pull/12080) [#12075](https://github.com/apache/doris/pull/12075) [#12040](https://github.com/apache/doris/pull/12040) [#12003](https://github.com/apache/doris/pull/12003) [#12007](https://github.com/apache/doris/pull/12007) [#11971](https://github.com/apache/doris/pull/11971) [#11933](https://github.com/apache/doris/pull/11933) [#11861](https://github.com/apache/doris/pull/11861) [#11859](https://github.com/apache/doris/pull/11859) [#11855](https://github.com/apache/doris/pull/11855) [#11837](https://github.com/apache/doris/pull/11837) [#11834](https://github.com/apache/doris/pull/11834) [#11821](https://github.com/apache/doris/pull/11821) [#11782](https://github.com/apache/doris/pull/11782) [#11723](https://github.com/apache/doris/pull/11723) [#11569](https://github.com/apache/doris/pull/11569) + diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md new file mode 100644 index 
0000000000000..cfa7151097de3 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.3.md @@ -0,0 +1,92 @@ +--- +{ + "title": "Release 1.1.3", + "language": "en" +} +--- + + + + +In this release, Doris Team has fixed more than 80 issues or performance improvement since 1.1.2. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support escape identifiers for sqlserver and postgresql in ODBC table. + +- Could use Parquet as output file format. + +# Improvements + +- Optimize flush policy to avoid small segments. [#12706](https://github.com/apache/doris/pull/12706) [#12716](https://github.com/apache/doris/pull/12716) + +- Refactor runtime filter to reduce the prepare time. [#13127](https://github.com/apache/doris/pull/13127) + +- Lots of memory control related issues during query or load process. [#12682](https://github.com/apache/doris/pull/12682) [#12688](https://github.com/apache/doris/pull/12688) [#12708](https://github.com/apache/doris/pull/12708) [#12776](https://github.com/apache/doris/pull/12776) [#12782](https://github.com/apache/doris/pull/12782) [#12791](https://github.com/apache/doris/pull/12791) [#12794](https://github.com/apache/doris/pull/12794) [#12820](https://github.com/apache/doris/pull/12820) [#12932](https://github.com/apache/doris/pull/12932) [#12954](https://github.com/apache/doris/pull/12954) [#12951](https://github.com/apache/doris/pull/12951) + +# BugFix + +- Core dump on compaction with largeint. [#10094](https://github.com/apache/doris/pull/10094) + +- Grouping sets cause be core or return wrong results. [#12313](https://github.com/apache/doris/pull/12313) + +- PREAGGREGATION flag in orthogonal_bitmap_union_count operator is wrong. [#12581](https://github.com/apache/doris/pull/12581) + +- Level1Iterator should release iterators in heap and it may cause memory leak. 
[#12592](https://github.com/apache/doris/pull/12592) + +- Fix decommission failure with 2 BEs and existing colocation table. [#12644](https://github.com/apache/doris/pull/12644) + +- BE may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large. [#12658](https://github.com/apache/doris/pull/12658) + +- BE may OOM during load when error code -238 occurs. [#12666](https://github.com/apache/doris/pull/12666) + +- Fix wrong child expression of lead function. [#12587](https://github.com/apache/doris/pull/12587) + +- Fix intersect query failed in row storage code. [#12712](https://github.com/apache/doris/pull/12712) + +- Fix wrong result produced by curdate()/current_date() function. [#12720](https://github.com/apache/doris/pull/12720) + +- Fix lateral view explode_split with temp table bug. [#13643](https://github.com/apache/doris/pull/13643) + +- Bucket shuffle join plan is wrong in two same table. [#12930](https://github.com/apache/doris/pull/12930) + +- Fix bug that tablet version may be wrong when doing alter and load. [#13070](https://github.com/apache/doris/pull/13070) + +- BE core when load data using broker with md5sum()/sm3sum(). [#13009](https://github.com/apache/doris/pull/13009) + +# Upgrade Notes + +PageCache and ChunkAllocator are disabled by default to reduce memory usage and can be re-enabled by modifying the configuration items `disable_storage_page_cache` and `chunk_reserved_bytes_limit`. + +Storage Page Cache and Chunk Allocator cache user data chunks and memory preallocation, respectively. + +These two functions take up a certain percentage of memory and are not freed. This part of memory cannot be flexibly allocated, which may lead to insufficient memory for other tasks in some scenarios, affecting system stability and availability. Therefore, we disabled these two features by default in version 1.1.3. + +However, in some latency-sensitive reporting scenarios, turning off this feature may lead to increased query latency. 
If you are concerned about the impact of this change on your business after upgrading, you can add the following parameters to be.conf to keep the same behavior as the previous version. + +``` +disable_storage_page_cache=false +chunk_reserved_bytes_limit=10% +``` + +* `disable_storage_page_cache`: Whether to disable Storage Page Cache. In version 1.1.2 and earlier, the default is false, i.e., on. In version 1.1.3, the default is true, i.e., off. +* `chunk_reserved_bytes_limit`: Chunk allocator reserved memory size. In version 1.1.2 and earlier, the default is 10% of the overall memory. In version 1.1.3, the default is 209715200 (200MB). + diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md new file mode 100644 index 0000000000000..4710463f4bcde --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.4.md @@ -0,0 +1,72 @@ +--- +{ + "title": "Release 1.1.4", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 60 issues and made performance improvements since 1.1.3. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. + + +# Features + +- Support obs broker load for Huawei Cloud. [#13523](https://github.com/apache/doris/pull/13523) + +- Spark Load supports Parquet and ORC files. [#13438](https://github.com/apache/doris/pull/13438) + +# Improvements + +- Do not acquire mutex in metric hook since it will affect query performance during heavy load. [#10941](https://github.com/apache/doris/pull/10941) + + +# BugFix + +- The where condition does not take effect when spark load loads the file. [#13804](https://github.com/apache/doris/pull/13804) + +- The if function returns a wrong result when there is a nullable column in vectorized mode. [#13779](https://github.com/apache/doris/pull/13779) + +- Fix incorrect result when using anti join with other join predicates. 
[#13743](https://github.com/apache/doris/pull/13743) + +- BE crash when call function concat(ifnull). [#13693](https://github.com/apache/doris/pull/13693) + +- Fix planner bug when there is a function in group by clause. [#13613](https://github.com/apache/doris/pull/13613) + +- Table name and column name is not recognized correctly in lateral view clause. [#13600](https://github.com/apache/doris/pull/13600) + +- Unknown column when use MV and table alias. [#13605](https://github.com/apache/doris/pull/13605) + +- JSONReader release memory of both value and parse allocator. [#13513](https://github.com/apache/doris/pull/13513) + +- Fix allow create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true. [#13448](https://github.com/apache/doris/pull/13448) + +- Microsecond in function from_date_format_str is lost. [#13446](https://github.com/apache/doris/pull/13446) + +- Sort exprs nullability property may not be right after subsitute using child's smap info. [#13328](https://github.com/apache/doris/pull/13328) + +- Fix core dump on case when have 1000 condition. [#13315](https://github.com/apache/doris/pull/13315) + +- Fix bug that last line of data lost for stream load. [#13066](https://github.com/apache/doris/pull/13066) + +- Restore table or partition with the same replication num as before the backup. [#11942](https://github.com/apache/doris/pull/11942) + + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md new file mode 100644 index 0000000000000..ee0482b3ba487 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.1/release-1.1.5.md @@ -0,0 +1,65 @@ +--- +{ + "title": "Release 1.1.5", + "language": "en" +} +--- + + + +In this release, Doris Team has fixed about 36 issues or performance improvement since 1.1.4. This release is a bugfix release on 1.1 and all users are encouraged to upgrade to this release. 
+ +# Behavior Changes + +When an alias is the same as the original column name, as in "select year(birthday) as birthday", and it is used in a group by, order by, or having clause, Doris previously behaved differently from MySQL. In this release, Doris follows MySQL's behavior: group by and having clauses use the original column first, and order by uses the alias first. Because this can be confusing, we recommend not using an alias that is the same as the original column name. + +# Features + +Add support of murmur_hash3_64. [#14636](https://github.com/apache/doris/pull/14636) + +# Improvements + +Add timezone cache for convert_tz to improve performance. [#14616](https://github.com/apache/doris/pull/14616) + +Sort results by table name when calling show clauses. [#14492](https://github.com/apache/doris/pull/14492) + +# Bug Fix + +Fix core dump when there is an if constant expr in the select clause. [#14858](https://github.com/apache/doris/pull/14858) + +ColumnVector::insert_date_column may crash. [#14839](https://github.com/apache/doris/pull/14839) + +Update the default value of high_priority_flush_thread_num_per_store to 6, which improves load performance. [#14775](https://github.com/apache/doris/pull/14775) + +Fix quick compaction core. [#14731](https://github.com/apache/doris/pull/14731) + +When the partition column is not a duplicate key, spark load throws an IndexOutOfBounds error. [#14661](https://github.com/apache/doris/pull/14661) + +Fix a memory leak problem in VCollectorIterator. [#14549](https://github.com/apache/doris/pull/14549) + +Fix create table like when having sequence column. [#14511](https://github.com/apache/doris/pull/14511) + +Use avg rowset to calculate batch size instead of total_bytes, since the latter costs a lot of CPU. [#14273](https://github.com/apache/doris/pull/14273) + +Fix right outer join core with conjunct. [#14821](https://github.com/apache/doris/pull/14821) + +Optimize policy of tcmalloc gc. 
[#14777](https://github.com/apache/doris/pull/14777) [#14738](https://github.com/apache/doris/pull/14738) [#14374](https://github.com/apache/doris/pull/14374) + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md new file mode 100644 index 0000000000000..2529ce7e58aa2 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.0.md @@ -0,0 +1,563 @@ +--- +{ + "title": "Release 1.2.0", + "language": "en" +} +--- + + + + + +# Feature +## Highlight + +1. Full Vectorizied-Engine support, greatly improved performance + + In the standard ssb-100-flat benchmark, the performance of 1.2 is 2 times faster than that of 1.1; in complex TPCH 100 benchmark, the performance of 1.2 is 3 times faster than that of 1.1. + +2. Merge-on-Write Unique Key + + Support Merge-On-Write on Unique Key Model. This mode marks the data that needs to be deleted or updated when the data is written, thereby avoiding the overhead of Merge-On-Read when querying, and greatly improving the reading efficiency on the updateable data model. + +3. Multi Catalog + + The multi-catalog feature provides Doris with the ability to quickly access external data sources for access. Users can connect to external data sources through the `CREATE CATALOG` command. Doris will automatically map the library and table information of external data sources. After that, users can access the data in these external data sources just like accessing ordinary tables. It avoids the complicated operation that the user needs to manually establish external mapping for each table. + + Currently this feature supports the following data sources: + + 1. Hive Metastore: You can access data tables including Hive, Iceberg, and Hudi. It can also be connected to data sources compatible with Hive Metastore, such as Alibaba Cloud's DataLake Formation. Supports data access on both HDFS and object storage. + 2. 
Elasticsearch: Access ES data sources. + 3. JDBC: Access MySQL through the JDBC protocol. + + Documentation: https://doris.apache.org//docs/dev/lakehouse/multi-catalog) + + > Note: The corresponding permission level will also be changed automatically, see the "Upgrade Notes" section for details. + +4. Light table structure changes + +In the new version, it is no longer necessary to change the data file synchronously for the operation of adding and subtracting columns to the data table, and only need to update the metadata in FE, thus realizing the millisecond-level Schema Change operation. Through this function, the DDL synchronization capability of upstream CDC data can be realized. For example, users can use Flink CDC to realize DML and DDL synchronization from upstream database to Doris. + +Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +When creating a table, set `"light_schema_change"="true"` in properties. + +5. JDBC facade + + Users can connect to external data sources through JDBC. Currently supported: + + - MySQL + - PostgreSQL + - Oracle + - SQL Server + - Clickhouse + + Documentation: [https://doris.apache.org/en/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/) + + > Note: The ODBC feature will be removed in a later version, please try to switch to the JDBC. + +6. JAVA UDF + + Supports writing UDF/UDAF in Java, which is convenient for users to use custom functions in the Java ecosystem. At the same time, through technologies such as off-heap memory and Zero Copy, the efficiency of cross-language data access has been greatly improved. + + Document: https://doris.apache.org//docs/dev/ecosystem/udf/java-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +7. 
Remote UDF + + Supports accessing remote user-defined function services through RPC, thus completely eliminating language restrictions for users to write UDFs. Users can use any programming language to implement custom functions to complete complex data analysis work. + + Documentation: https://doris.apache.org//docs/ecosystem/udf/remote-user-defined-function + + Example: https://github.com/apache/doris/tree/master/samples/doris-demo + +8. More data types support + + - Array type + + Array types are supported. It also supports nested array types. In some scenarios such as user portraits and tags, the Array type can be used to better adapt to business scenarios. At the same time, in the new version, we have also implemented a large number of data-related functions to better support the application of data types in actual scenarios. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/ARRAY + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/array-functions/array_max + + - Jsonb type + + Support binary Json data type: Jsonb. This type provides a more compact json encoding format, and at the same time provides data access in the encoding format. Compared with json data stored in strings, it is several times newer and can be improved. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Types/JSONB + + Related functions: https://doris.apache.org//docs/dev/sql-manual/sql-functions/json-functions/jsonb_parse + + - Date V2 + + Sphere of influence: + + 1. The user needs to specify datev2 and datetimev2 when creating the table, and the date and datetime of the original table will not be affected. + 2. When datev2 and datetimev2 are calculated with the original date and datetime (for example, equivalent connection), the original type will be cast into a new type for calculation + 3. 
The example is in the documentation + + Documentation: https://doris.apache.org/docs/1.2/sql-manual/sql-reference/Data-Types/DATEV2 + + +## More + +1. A new memory management framework + + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/memory-management/memory-tracker + +2. Table Valued Function + + Doris implements a set of Table Valued Function (TVF). TVF can be regarded as an ordinary table, which can appear in all places where "table" can appear in SQL. + + For example, we can use S3 TVF to implement data import on object storage: + + ``` + insert into tbl select * from s3("s3://bucket/file.*", "ak" = "xx", "sk" = "xxx") where c1 > 2; + ``` + + Or directly query data files on HDFS: + + ``` + insert into tbl select * from hdfs("hdfs://bucket/file.*") where c1 > 2; + ``` + + TVF can help users make full use of the rich expressiveness of SQL and flexibly process various data. + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/s3 + + https://doris.apache.org//docs/dev/sql-manual/sql-functions/table-functions/hdfs + +3. A more convenient way to create partitions + + Support for creating multiple partitions within a time range via the `FROM TO` command. + +4. Column renaming + + For tables with Light Schema Change enabled, column renaming is supported. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-RENAME + +5. Richer permission management + + - Support row-level permissions + + Row-level permissions can be created with the `CREATE ROW POLICY` command. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-POLICY + + - Support specifying password strength, expiration time, etc. + + - Support for locking accounts after multiple failed logins. 
+ + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Account-Management-Statements/ALTER-USER + +6. Import + + - CSV import supports csv files with header. + + Search for `csv_with_names` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD/ + + - Stream Load adds `hidden_columns`, which can explicitly specify the delete flag column and sequence column. + + Search for `hidden_columns` in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD + + - Spark Load supports Parquet and ORC file import. + + - Support for cleaning completed imported Labels + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CLEAN-LABEL + + - Support batch cancellation of import jobs by status + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD + + - Added support for Alibaba Cloud oss, Tencent Cloud cos/chdfs and Huawei Cloud obs in broker load. + + Documentation: https://doris.apache.org//docs/dev/advanced/broker + + - Support access to hdfs through hive-site.xml file configuration. + + Documentation: https://doris.apache.org//docs/dev/admin-manual/config/config-dir + +7. Support viewing the contents of the catalog recycle bin through `SHOW CATALOG RECYCLE BIN` function. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Show-Statements/SHOW-CATALOG-RECYCLE-BIN + +8. Support `SELECT * EXCEPT` syntax. + + Documentation: https://doris.apache.org//docs/dev/data-table/basic-usage + +9. OUTFILE supports ORC format export. And supports multi-byte delimiters. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE + +10. Support to modify the number of Query Profiles that can be saved through configuration. 
+ + Document search FE configuration item: max_query_profile_num + +11. The DELETE statement supports IN predicate conditions. And it supports partition pruning. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Manipulation/DELETE + +12. The default value of the time column supports using `CURRENT_TIMESTAMP` + + Search for "CURRENT_TIMESTAMP" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +13. Add two system tables: backends, rowsets + + Documentation: + + https://doris.apache.org//docs/dev/admin-manual/system-table/backends + + https://doris.apache.org//docs/dev/admin-manual/system-table/rowsets + +14. Backup and restore + + - The Restore job supports the `reserve_replica` parameter, so that the number of replicas of the restored table is the same as that of the backup. + + - The Restore job supports `reserve_dynamic_partition_enable` parameter, so that the restored table keeps the dynamic partition enabled. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/RESTORE + + - Support backup and restore operations through the built-in libhdfs, no longer rely on broker. + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Backup-and-Restore/CREATE-REPOSITORY + +15. Support data balance between multiple disks on the same machine + + Documentation: + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK + + https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK + +16. Routine Load supports subscribing to Kerberos-authenticated Kafka services. 
+ + Search for kerberos in the documentation: https://doris.apache.org//docs/dev/data-operate/import/import-way/routine-load-manual + +17. New built-in functions + + Added the following built-in functions: + + - `cbrt` + - `sequence_match/sequence_count` + - `mask/mask_first_n/mask_last_n` + - `elt` + - `any/any_value` + - `group_bitmap_xor` + - `ntile` + - `nvl` + - `uuid` + - `initcap` + - `regexp_replace_one/regexp_extract_all` + - `multi_search_all_positions/multi_match_any` + - `domain/domain_without_www/protocol` + - `running_difference` + - `bitmap_hash64` + - `murmur_hash3_64` + - `to_monday` + - `not_null_or_empty` + - `window_funnel` + - `group_bit_and/group_bit_or/group_bit_xor` + - `outer combine` + - and all array functions + +# Upgrade Notice + +## Known Issues + +- Using JDK11 will cause BE to crash. Please use JDK8 instead. + +## Behavior Changed + +- Permission level changes + + Because the catalog level is introduced, the corresponding user permission levels are also changed automatically. The rules are as follows: + + - GlobalPrivs and ResourcePrivs remain unchanged + - Added CatalogPrivs level. + - The internal prefix is added to the original DatabasePrivs level (indicating the db in the internal catalog) + - The internal prefix is added to the original TablePrivs level (representing tbl in the internal catalog) + +- In GroupBy and Having clauses, match on column names in preference to aliases. (#14408) + +- Creating columns starting with `mv_` is no longer supported. `mv_` is a reserved keyword in materialized views (#14361) + +- Removed the default limit of 65535 rows added by the order by statement, and added the session variable `default_order_by_limit` to configure this limit. (#12478) + +- In the table generated by "Create Table As Select", all string columns use the string type uniformly, and no longer distinguish varchar/char/string (#14382) + +- In the audit log, remove the word `default_cluster` before the db and user names.
(#13499) (#11408) + +- Add sql digest field in audit log (#8919) + +- The order by behavior of the union clause has changed. In the new version, the order by clause will be executed after the union is executed, unless explicitly associated by parentheses. (#9745) + +- During the decommission operation, the tablet in the recycle bin will be ignored to ensure that the decommission can be completed. (#14028) + +- The returned result of Decimal will be displayed according to the precision declared in the original column, or according to the precision specified in the cast function. (#13437) + +- Changed column name length limit from 64 to 256 (#14671) + +- Changes to FE configuration items + + - The `enable_vectorized_load` parameter is enabled by default. (#11833) + + - Increased `create_table_timeout` value. The default timeout for table creation operations will be increased. (#13520) + + - Modify `stream_load_default_timeout_second` default value to 3 days. + + - Modify the default value of `alter_table_timeout_second` to one month. + + - Add the parameter `max_replica_count_when_schema_change` to limit the number of replicas involved in the alter job, the default is 100000. (#12850) + + - Add `disable_iceberg_hudi_table`. The iceberg and hudi external tables are disabled by default, and the multi catalog function is recommended. (#13932) + +- Changes to BE configuration items + + - Removed `disable_stream_load_2pc` parameter. 2PC stream load can be used directly. (#13520) + + - Modify `tablet_rowset_stale_sweep_time_sec` from 1800 seconds to 300 seconds. + + - Redesigned configuration item name about compaction (#13495) + + - Revisited parameter about memory optimization (#13781) + +- Session variable changes + + - Modify the variable `enable_insert_strict` to true by default. This will cause some insert operations that could be executed before, but inserted illegal values, to no longer be executed.
(#11866) + + - Modified variable `enable_local_exchange` to default to true (#13292) + + - Default data transmission via lz4 compression, controlled by variable `fragment_transmission_compression_codec` (#11955) + + - Add `skip_storage_engine_merge` variable for debugging unique or agg model data (#11952) + + Documentation: https://doris.apache.org//docs/dev/advanced/variables + +- The BE startup script checks whether the value in `/proc/sys/vm/max_map_count` is greater than 2,000,000. Otherwise, the startup fails. (#11052) + +- Removed mini load interface (#10520) + +- FE Metadata Version + + FE Meta Version changed from 107 to 114, and cannot be rolled back after upgrading. + +## During Upgrade + +1. Upgrade preparation + + - Need to replace: lib, bin directory (start/stop scripts have been modified) + + - JAVA_HOME must also be configured for BE, which now supports JDBC Table and Java UDF. + + - The default JVM Xmx parameter in fe.conf is changed to 8GB. + +2. Possible errors during the upgrade process + + - The repeat function cannot be used and an error is reported: `vectorized repeat function cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13868) + + - schema change fails with error: `desc_tbl is not set. Maybe the FE version is not equal to the BE` (#13822) + + - Vectorized hash join cannot be used and an error is reported: `vectorized hash join cannot be executed`. You can turn off the vectorized execution engine before upgrading. (#13753) + + The above errors will return to normal after a full upgrade. + +## Performance Impact + +- By default, JeMalloc is used as the memory allocator of the new version BE, replacing TcMalloc (#13367) + +- The batch size in the tablet sink is modified to be at least 8K.
(#13912) + +- Disable chunk allocator by default (#13285) + +## API Change + +- BE's http api error return information changed from `{"status": "Fail", "msg": "xxx"}` to more specific ``{"status": "Not found", "msg": "Tablet not found. tablet_id=1202"}``(#9771) + +- In `SHOW CREATE TABLE`, the content of comment is changed from double quotes to single quotes (#10327) + +- Support ordinary users to obtain query profile through http command. (#14016) +Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/manager/query-profile-action + +- Optimized the way to specify the sequence column, you can directly specify the column name. (#13872) +Documentation: https://doris.apache.org//docs/dev/data-operate/update-delete/sequence-column-manual + +- Added the space usage of remote storage to the results returned by `show backends` and `show tablets` (#11450) + +- Removed Num-Based Compaction related code (#13409) + +- Refactored BE's error code mechanism, some returned error messages will change (#8855) +## Other + +- Support Docker official image. + +- Support compiling Doris on MacOS(x86/M1) and ubuntu-22.04 + Documentation: https://doris.apache.org//docs/dev/install/source-install/compilation-mac/ + +- Support for image file verification.
+ + Documentation: https://doris.apache.org//docs/dev/admin-manual/maint-monitor/metadata-operation/ + +- Script related + + - The stop scripts of FE and BE support exiting FE and BE via the `--grace` parameter (use kill -15 signal instead of kill -9) + + - FE start script supports checking the current FE version via `--version` (#11563) + + - Support getting the data and related table creation statement of a tablet through the `ADMIN COPY TABLET` command, for local problem debugging (#12176) + + Documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET + +- Support obtaining the table creation statements related to a SQL statement through the http api, for local problem reproduction (#11979) + + Documentation: https://doris.apache.org//docs/dev/admin-manual/http-actions/fe/query-schema-action + +- Support disabling compaction for a table when creating it, for testing (#11743) + + Search for "disable_auto_compaction" in the documentation: https://doris.apache.org//docs/dev/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE + +# Big Thanks + +Thanks to ALL who contributed to this release!
(alphabetically) +``` +@924060929 +@a19920714liou +@adonis0147 +@Aiden-Dong +@aiwenmo +@AshinGau +@b19mud +@BePPPower +@BiteTheDDDDt +@bridgeDream +@ByteYue +@caiconghui +@CalvinKirs +@cambyzju +@caoliang-web +@carlvinhust2012 +@catpineapple +@ccoffline +@chenlinzhong +@chovy-3012 +@coderjiang +@cxzl25 +@dataalive +@dataroaring +@dependabot[bot] +@dinggege1024 +@DongLiang-0 +@Doris-Extras +@eldenmoon +@EmmyMiao87 +@englefly +@FreeOnePlus +@Gabriel39 +@gaodayue +@geniusjoe +@gj-zhang +@gnehil +@GoGoWen +@HappenLee +@hello-stephen +@Henry2SS +@hf200012 +@huyuanfeng2018 +@jacktengg +@jackwener +@jeffreys-cat +@Jibing-Li +@JNSimba +@Kikyou1997 +@Lchangliang +@LemonLiTree +@lexoning +@liaoxin01 +@lide-reed +@link3280 +@liutang123 +@liuyaolin +@LOVEGISER +@lsy3993 +@luozenglin +@luzhijing +@madongz +@morningman +@morningman-cmy +@morrySnow +@mrhhsg +@Myasuka +@myfjdthink +@nextdreamblue +@pan3793 +@pangzhili +@pengxiangyu +@platoneko +@qidaye +@qzsee +@SaintBacchus +@SeekingYang +@smallhibiscus +@sohardforaname +@song7788q +@spaces-X +@ssusieee +@stalary +@starocean999 +@SWJTU-ZhangLei +@TaoZex +@timelxy +@Wahno +@wangbo +@wangshuo128 +@wangyf0555 +@weizhengte +@weizuo93 +@wsjz +@wunan1210 +@xhmz +@xiaokang +@xiaokangguo +@xinyiZzz +@xy720 +@yangzhg +@Yankee24 +@yeyudefeng +@yiguolei +@yinzhijian +@yixiutt +@yuanyuan8983 +@zbtzbtzbt +@zenoyang +@zhangboya1 +@zhangstar333 +@zhannngchen +@ZHbamboo +@zhengshiJ +@zhenhb +@zhqu1148980644 +@zuochunwei +@zy-kkk +``` diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md new file mode 100644 index 0000000000000..d5adb31eb5256 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.1.md @@ -0,0 +1,196 @@ +--- +{ + "title": "Release 1.2.1", + "language": "en" +} +--- + + + +# Improvement + +### Supports new type DecimalV3 + +DecimalV3, which supports higher precision and better performance, has the following advantages over 
past versions. + +- Larger representable range, the range of values are significantly expanded, and the valid number range [1,38]. + +- Higher performance, adaptive adjustment of the storage space occupied according to different precision. + +- More complete precision derivation support, for different expressions, different precision derivation rules are applied to the accuracy of the result. + +[DecimalV3](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Types/DECIMALV3/) + +### Support Iceberg V2 + +Support Iceberg V2 (only Position Delete is supported, Equality Delete will be supported in subsequent versions). + +Tables in Iceberg V2 format can be accessed through the Multi-Catalog feature. + +### Support OR condition to IN + +Support converting OR condition to IN condition, which can improve the execution efficiency in some scenarios.[#15437](https://github.com/apache/doris/pull/15437) [#12872](https://github.com/apache/doris/pull/12872) + +### Optimize the import and query performance of JSONB type + +Optimize the import and query performance of JSONB type. [#15219](https://github.com/apache/doris/pull/15219) [#15219](https://github.com/apache/doris/pull/15219) + +### Stream load supports quoted csv data + +Search trim_double_quotes in Document:[https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD](https://doris.apache.org/docs/dev/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD) + +### Broker supports Tencent Cloud CHDFS and Baidu Cloud BOS, AFS + +Data on CHDFS, BOS, and AFS can be accessed through Broker. [#15297](https://github.com/apache/doris/pull/15297) [#15448](https://github.com/apache/doris/pull/15448) + +### New function + +Add function `substring_index`. [#15373](https://github.com/apache/doris/pull/15373) + +# Bug Fix + +- In some cases, after upgrading from version 1.1 to version 1.2, the user permission information will be lost. 
[#15144](https://github.com/apache/doris/pull/15144) + +- Fix the problem that the partition value is wrong when using datev2/datetimev2 type for partitioning. [#15094](https://github.com/apache/doris/pull/15094) + +- Bug fixes for a large number of released features. For a complete list see: [PR List](https://github.com/apache/doris/pulls?q=is%3Apr+label%3Adev%2F1.2.1-merged+is%3Aclosed) + +# Upgrade Notice + +### Known Issues + +- Do not use JDK11 as the runtime JDK of BE, it will cause BE Crash. +- The reading performance of the csv format in this version has declined, which will affect the import and reading efficiency of the csv format. We will fix it as soon as possible in the next three-digit version + +### Behavior Changed + +- The default value of the BE configuration item `high_priority_flush_thread_num_per_store` is changed from 1 to 6, to improve the write efficiency of Routine Load. (https://github.com/apache/doris/pull/14775) + +- The default value of the FE configuration item `enable_new_load_scan_node` is changed to true. Import tasks will be performed using the new File Scan Node. No impact on users.[#14808](https://github.com/apache/doris/pull/14808) + +- Delete the FE configuration item `enable_multi_catalog`. The Multi-Catalog function is enabled by default. + +- The vectorized execution engine is forced to be enabled by default.[#15213](https://github.com/apache/doris/pull/15213) + +The session variable enable_vectorized_engine will no longer take effect. Enabled by default. + +To make it valid again, set the FE configuration item `disable_enable_vectorized_engine` to false, and restart FE to make `enable_vectorized_engine` valid again. + + +# Big Thanks + +Thanks to ALL who contributed to this release! 
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@dataroaring + +@Doris-Extras + +@dutyu + +@eldenmoon + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@hf200012 + +@jacktengg + +@Jibing-Li + +@Kikyou1997 + +@liaoxin01 + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@spaces-X + +@starocean999 + +@wangshuo128 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xutaoustc + +@yangzhg + +@yiguolei + +@yixiutt + +@Yulei-Yang + +@yuxuan-luo + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zhengshengjun + + + + + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md new file mode 100644 index 0000000000000..08fd22571a03f --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.2.md @@ -0,0 +1,254 @@ +--- +{ + "title": "Release 1.2.2", + "language": "en" +} +--- + + + +# New Features + +### Lakehouse + +- Support automatic synchronization of Hive metastore. + +- Support reading the Iceberg Snapshot, and viewing the Snapshot history. + +- JDBC Catalog supports PostgreSQL, Clickhouse, Oracle, SQLServer + +- JDBC Catalog supports Insert operation + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/) + +### Auto Bucket + + Set and scale the number of buckets for different partitions to keep the number of tablet in a relatively appropriate range. + +### New Functions + +Add the new function `width_bucket`. 
+ +Reference: [https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/width-bucket/#description) + +# Behavior Changes + +- Disable BE's page cache by default: `disable_storage_page_cache=true` + +The page cache is turned off to optimize memory usage and reduce the risk of OOM. +However, this will increase the query latency of some small queries. +If you are sensitive to query latency, or have high-concurrency small-query scenarios, you can set `disable_storage_page_cache=false` to enable the page cache again. + +- Add new session variable `group_by_and_having_use_alias_first`, used to control whether the group by and having clauses use aliases first. + +Reference: [https://doris.apache.org/docs/dev/advanced/variables](https://doris.apache.org/docs/dev/advanced/variables) + +# Improvement + +### Compaction + +- Support `Vertical Compaction`, which optimizes the compaction overhead and efficiency of wide tables. + +- Support `Segment Compaction`, which fixes the -238 and -235 errors seen with high-frequency imports. + +### Lakehouse + +- Hive Catalog is compatible with Hive versions 1/2/3. + +- Hive Catalog can access JuiceFS based HDFS with Broker. + +- Iceberg Catalog supports Hive Metastore and Rest Catalog types. + +- ES Catalog supports _id column mapping. + +- Optimize Iceberg V2 read performance with a large number of deleted rows. + +- Support for reading Iceberg tables after Schema Evolution + +- Parquet Reader handles column name case correctly. + +### Other + +- Support for accessing Hadoop KMS-encrypted HDFS. + +- Support canceling an Export task in progress. + +- Improve the performance of `explode_split` by 1x. + +- Improve the read performance of nullable columns by 3x. + +- Fix some problems of Memtracker, improve memory management accuracy, and optimize memory allocation. + + + +# Bug Fix + +- Fixed memory leak when loading data with Doris Flink Connector.
+ +- Fixed the possible thread scheduling problem of BE and reduced the `Fragment sent timeout` error caused by BE thread exhaustion. + +- Fixed various correctness and precision issues of column type datetimev2/decimalv3. + +- Fixed the data correctness issue with Unique Key Merge-on-Read table. + +- Fixed various known issues with the Light Schema Change feature. + +- Fixed various data correctness issues of bitmap type Runtime Filter. + +- Fixed the problem of poor reading performance of csv reader introduced in version 1.2.1. + +- Fixed the problem of BE OOM caused by Spark Load data download phase. + +- Fixed possible metadata compatibility issues when upgrading from version 1.1 to version 1.2. + +- Fixed the metadata problem when creating JDBC Catalog with Resource. + +- Fixed the problem of high CPU usage caused by load operation. + +- Fixed the problem of FE OOM caused by a large number of failed Broker Load jobs. + +- Fixed the problem of precision loss when loading floating-point types. + +- Fixed the problem of memory leak when using 2PC stream load. + +# Other + +Add metrics to view the total rowset and segment numbers on BE + +- `doris_be_all_rowsets_num` and `doris_be_all_segments_num` + + +# Big Thanks + +Thanks to ALL who contributed to this release!
+ + +@adonis0147 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@ByteYue + +@caiconghui + +@cambyzju + +@chenlinzhong + +@DarvenDuan + +@dataroaring + +@Doris-Extras + +@dutyu + +@englefly + +@freemandealer + +@Gabriel39 + +@HappenLee + +@Henry2SS + +@htyoung + +@isHuangXin + +@JackDrogon + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@Lchangliang + +@LemonLiTree + +@liaoxin01 + +@liqing-coder + +@luozenglin + +@morningman + +@morrySnow + +@mrhhsg + +@nextdreamblue + +@qidaye + +@qzsee + +@spaces-X + +@stalary + + +@starocean999 + +@weizuo93 + +@wsjz + +@xiaokang + +@xinyiZzz + +@xy720 + +@yangzhg + +@yiguolei + +@yixiutt + +@Yukang-Lian + +@Yulei-Yang + +@zclllyybb + +@zddr + +@zhangstar333 + +@zhannngchen + +@zy-kkk + + + + + + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md new file mode 100644 index 0000000000000..cd9226b15e14f --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.3.md @@ -0,0 +1,109 @@ +--- +{ + "title": "Release 1.2.3", + "language": "en" +} +--- + + + +# Improvement + +### JDBC Catalog + +- Support connecting to Doris clusters through JDBC Catalog. + +Currently, Jdbc Catalog only support to use 5.x version of JDBC jar package to connect another Doris database. If you use 8.x version of JDBC jar package, the data type of column may not be matched. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc/#doris) + +- Support to synchronize only the specified database through the `only_specified_database` attribute. + +- Support synchronizing table names in the form of lowercase through `lower_case_table_names` to solve the problem of case sensitivity of table names. 
+ +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc) + +- Optimize the read performance of JDBC Catalog. + +### Elasticsearch Catalog + +- Support Array type mapping. + +- Support controlling whether the like expression is pushed down through the `like_push_down` attribute, to limit the CPU overhead of the ES cluster. + +Reference: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/es) + +### Hive Catalog + +- Support Hive table default partition `_HIVE_DEFAULT_PARTITION_`. + +- Hive Metastore metadata automatic synchronization supports notification events in compressed format. + +### Dynamic Partition Improvement + +- Dynamic partition supports specifying the `storage_medium` parameter to control the storage medium of the newly added partition. + +Reference: [https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition](https://doris.apache.org/docs/dev/advanced/partition/dynamic-partition) + + +### Optimize BE's Threading Model + +- Optimize BE's threading model to avoid stability problems caused by frequent thread creation and destruction. + +# Bug Fixes + +- Fixed issues with Merge-On-Write Unique Key tables. + +- Fixed compaction related issues. + +- Fixed some delete statement issues causing data errors. + +- Fixed several query execution errors. + +- Fixed the problem that using JDBC catalog caused BE to crash on some operating systems. + +- Fixed Multi-Catalog issues. + +- Fixed memory statistics and optimization issues. + +- Fixed decimalV3 and date/datetimev2 related issues. + +- Fixed load transaction stability issues. + +- Fixed light-weight schema change issues. + +- Fixed the issue of using `datetime` type for batch partition creation. + +- Fixed the problem that a large number of failed broker loads would cause the FE memory usage to be too high.
+ +- Fixed the problem that stream load cannot be canceled after dropping the table. + +- Fixed querying `information_schema` timeout in some cases. + +- Fixed the problem of BE crash caused by concurrent data export using `select outfile`. + +- Fixed transactional insert operation memory leak. + +- Fixed several query/load profile issues, and added support for downloading profiles directly through the FE web UI. + +- Fixed the problem that the BE tablet GC thread caused the IO util to be too high. + +- Fixed the problem that the commit offset is inaccurate in Kafka routine load. + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md new file mode 100644 index 0000000000000..a959a323d06d1 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.4.md @@ -0,0 +1,81 @@ +--- +{ + "title": "Release 1.2.4", + "language": "en" +} +--- + + + + +# Behavior Changed + +- For `DateV2`/`DatetimeV2` and `DecimalV3` type, in the results of `DESCRIBE` and `SHOW CREATE TABLE` statements, they will no longer be displayed as `DateV2`/`DatetimeV2` or `DecimalV3`, but directly displayed as `Date`/`Datetime` or `Decimal`. + + - This change is for compatibility with some BI tools. If you want to see the actual type of the column, you can check it with the `DESCRIBE ALL` statement. + +- When querying tables in the `information_schema` database, the meta information (database, table, column, etc.) in the external catalog is no longer returned by default. + + - This change avoids the problem that the `information_schema` database cannot be queried due to connection problems with some external catalogs, which in turn broke some BI tools used with Doris. It can be controlled by the FE configuration `infodb_support_ext_catalog`, and the default value is `false`, that is, the meta information of external catalogs will not be returned.
+ +# Improvement + +### JDBC Catalog + +- Supports connecting to Trino/Presto via JDBC Catalog + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#trino) + +- JDBC Catalog connects to ClickHouse data source and supports Array type mapping + + Refer to: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse](https://doris.apache.org/docs/dev/lakehouse/multi-catalog/jdbc#clickhouse) + +### Spark Load + +- Spark Load supports Resource Manager HA related configuration + + Refer to: https://github.com/apache/doris/pull/15000 + +# Bug Fixes + +- Fixed several connectivity issues with Hive Catalog. + +- Fixed ClassNotFound issues with Hudi Catalog. + +- Optimized the connection pool of JDBC Catalog to avoid too many connections. + +- Fixed the OOM that occurs when importing data from another Doris cluster through JDBC Catalog. + +- Fixed several query and import planning issues. + +- Fixed several issues with Unique Key Merge-On-Write data model. + +- Fixed several BDBJE issues and solved the problem of abnormal FE metadata in some cases. + +- Fixed the problem that the `CREATE VIEW` statement does not support Table Valued Function. + +- Fixed several memory statistics issues. + +- Fixed several issues reading Parquet/ORC format. + +- Fixed several issues with DecimalV3. + +- Fixed several issues with SHOW QUERY/LOAD PROFILE. + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md new file mode 100644 index 0000000000000..55af863ba47d6 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.5.md @@ -0,0 +1,199 @@ +--- +{ + "title": "Release 1.2.5", + "language": "en" +} +--- + + + +In version 1.2.5, the Doris team has made nearly 210 fixes and performance improvements since the release of version 1.2.4.
At the same time, version 1.2.5 is also an iterative version of version 1.2.4, which has higher stability. It is recommended that all users upgrade to this version. + +# Behavior Changed + +- The `start_be.sh` script will check that the maximum number of file handles in the system must be greater than or equal to 65536, otherwise the startup will fail. + +- The BE configuration item `enable_quick_compaction` is set to true by default. The Quick Compaction is enabled by default. This feature is used to optimize the problem of small files in the case of large batch import. + +- After modifying the dynamic partition attribute of the table, it will no longer take effect immediately, but wait for the next task scheduling of the dynamic partition table to avoid some deadlock problems. + +# Improvement + +- Optimize the use of bthread and pthread to reduce the RPC blocking problem during the query process. + +- A button to download Profile is added to the Profile page of the FE web UI. + +- Added FE configuration `recover_with_skip_missing_version`, which is used to query to skip the problematic replica under certain failure conditions. + +- The row-level permission function supports external Catalog. + +- Hive Catalog supports automatic refreshing of kerberos tickets on the BE side without manual refreshing. + +- JDBC Catalog supports tables under the MySQL/ClickHouse system database (`information_schema`). + +# Bug Fixes + +- Fixed the problem of incorrect query results caused by low-cardinality column optimization + +- Fixed several authentication and compatibility issues accessing HDFS. + +- Fixed several issues with float/double and decimal types. + +- Fixed several issues with date/datetimev2 types. + +- Fixed several query execution and planning issues. + +- Fixed several issues with JDBC Catalog. + +- Fixed several query-related issues with Hive Catalog, and Hive Metastore metadata synchronization issues. 
+ +- Fix the problem that the result of `SHOW LOAD PROFILE` statement is incorrect. + +- Fixed several memory related issues. + +- Fixed several issues with `CREATE TABLE AS SELECT` functionality. + +- Fix the problem that the jsonb type causes BE to crash on CPU that do not support avx2. + +- Fixed several issues with dynamic partitions. + +- Fixed several issues with TOPN query optimization. + +- Fixed several issues with the Unique Key Merge-on-Write table model. + +# Big Thanks + +58 contributors participated in the improvement and release of 1.2.5, and thank them for their hard work and dedication: + +@adonis0147 + +@airborne12 + +@AshinGau + +@BePPPower + +@BiteTheDDDDt + +@caiconghui + +@CalvinKirs + +@cambyzju + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dujl + +@dutyu + +@fsilent + +@Gabriel39 + +@gitccl + +@gnehil + +@GoGoWen + +@gongzexin + +@HappenLee + +@herry2038 + +@jacktengg + +@Jibing-Li + +@kaka11chen + +@Kikyou1997 + +@LemonLiTree + +@liaoxin01 + +@LiBinfeng-01 + +@luwei16 + +@Moonm3n + +@morningman + +@mrhhsg + +@Mryange + +@nextdreamblue + +@nsnhuang + +@qidaye + +@Shoothzj + +@sohardforaname + +@stalary + +@starocean999 + +@SWJTU-ZhangLei + +@wsjz + +@xiaokang + +@xinyiZzz + +@yangzhg + +@yiguolei + +@yixiutt + +@yujun777 + +@Yulei-Yang + +@yuxuan-luo + +@zclllyybb + +@zddr + +@zenoyang + +@zhangstar333 + +@zhannngchen + +@zxealous + +@zy-kkk + +@zzzzzzzs diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md new file mode 100644 index 0000000000000..39146b35b15ac --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.6.md @@ -0,0 +1,135 @@ +--- +{ + "title": "Release 1.2.6", + "language": "en" +} +--- + + + + +# Behavior Change + +- Add a BE configuration item `allow_invalid_decimalv2_literal` to control whether can import data that exceeding the decimal's precision, for compatibility with previous logic. 
+ +# Query + +- Fix several query planning issues. +- Support the `sql_select_limit` session variable. +- Optimize query cold run performance. +- Fix an expr context memory leak. +- Fix the issue that the `explode_split` function was executed incorrectly in some cases. + +## Multi Catalog + +- Fix the issue that synchronizing Hive metadata caused FE replay of the edit log to fail. +- Fix the `refresh catalog` operation causing FE OOM. +- Fix the issue that JDBC catalog cannot handle `0000-00-00` correctly. +- Fix the issue that the Kerberos ticket cannot be refreshed automatically. +- Optimize the partition pruning performance of Hive. +- Fix the inconsistent behavior of Trino and Presto in JDBC catalog. +- Fix the issue that HDFS short-circuit read could not be used to improve query efficiency in some environments. +- Fix the issue that the Iceberg table on CHDFS could not be read. + +# Storage + +- Fix the wrong calculation of delete bitmap in MOW table. +- Fix several BE memory issues. +- Fix a Snappy compression issue. +- Fix the issue that jemalloc may cause BE to crash in some cases. + +# Others + +- Fix several Java UDF related issues. +- Fix the issue that the `recover table` operation incorrectly triggered the creation of dynamic partitions. +- Fix the timezone when importing ORC files via broker load. +- Fix the issue that the newly added `PERCENT` keyword caused the metadata replay of the routine load job to fail. +- Fix the issue that the `truncate` operation failed to act on a non-partitioned table. +- Fix the issue that the MySQL connection was lost due to the `show snapshot` operation. +- Optimize the lock logic to reduce the probability of lock timeout errors when creating tables. +- Add session variable `have_query_cache` to be compatible with some old MySQL clients. +- Optimize the error message shown when a load error is encountered.
+ +# Big Thanks + +Thanks to all who contributed to this release: + +@amorynan + +@BiteTheDDDDt + +@caoliang-web + +@dataroaring + +@Doris-Extras + +@dutyu + +@Gabriel39 + +@HHoflittlefish777 + +@htyoung + +@jacktengg + +@jeffreys-cat + +@kaijchen + +@kaka11chen + +@Kikyou1997 + +@KnightLiJunLong + +@liaoxin01 + +@LiBinfeng-01 + +@morningman + +@mrhhsg + +@sohardforaname + +@starocean999 + +@vinlee19 + +@wangbo + +@wsjz + +@xiaokang + +@xinyiZzz + +@yiguolei + +@yujun777 + +@Yulei-Yang + +@zhangstar333 + +@zy-kkk + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md new file mode 100644 index 0000000000000..cd47282f4688d --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.7.md @@ -0,0 +1,46 @@ +--- +{ + "title": "Release 1.2.7", + "language": "en" +} +--- + + + +# Bug Fixes + +- Fix some query issues. +- Fix some storage issues. +- Fix some decimal precision issues. +- Fix the query error caused by an invalid value of the `sql_select_limit` session variable. +- Fix the problem that HDFS short-circuit read cannot be used. +- Fix the problem that Tencent Cloud cosn cannot be accessed. +- Fix several issues with Hive catalog Kerberos access. +- Fix the problem that stream load profile cannot be used. +- Fix the Prometheus monitoring parameter format problem. +- Fix the table creation timeout issue when creating a large number of tablets. + +# New Features + +- Unique Key model supports array type as value column +- Added `have_query_cache` variable for compatibility with the MySQL ecosystem.
+- Added `enable_strong_consistency_read` to support strong consistent read between sessions +- FE metrics supports user-level query counter + diff --git a/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md new file mode 100644 index 0000000000000..35cbb7a3cdcf1 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v1.2/release-1.2.8.md @@ -0,0 +1,47 @@ +--- +{ + "title": "Release 1.2.8", + "language": "en" +} +--- + + + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Bug Fixes +- Fixed several issues with query execution. +- Fixed several issues with Spark Load. +- Fixed several issues with Parquet Reader. +- Fixed several issues with Orc Reader. +- Fixed Broker "FileSystem closed" problem. +- Fixed several issues with Broker Load. +- Fixed several issues with CTAS execution. +- Fixed several issues with backup and restore. +- Added "Catalog" column in audit log. +- Optimized the metadata cache of Iceberg Catalog. +- Fixed several issues with outfile/export feature. +- Fixed an issue with "replayEraseTable" edit log causing FE start to fail. +- Fixed some security issues. + + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md new file mode 100644 index 0000000000000..61ba6c5c60890 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.0.md @@ -0,0 +1,236 @@ +--- +{ + "title": "Release 2.0.0", + "language": "en" +} +--- + + + + +We are more than excited to announce that, after six months of coding, testing, and fine-tuning, Apache Doris 2.0.0 is now production-ready. Special thanks to the 275 committers who altogether contributed over 4100 optimizations and fixes to the project. 
+ +This new version highlights: + +- 10 times faster data queries +- Enhanced log analytics and federated query capabilities +- More efficient data writing and updates +- Improved multi-tenant and resource isolation mechanisms +- Progress in elastic scaling of resources and storage-compute separation +- Enterprise-facing features for higher usability + +> Download: https://doris.apache.org/download +> +> GitHub source code: https://github.com/apache/doris/releases/tag/2.0.0-rc04 + +## **A 10 Times Performance Increase** + +In SSB-Flat and TPC-H benchmarking, Apache Doris 2.0.0 delivered **over 10 times faster query performance** compared to an early version of Apache Doris. + +![](/images/release-note-2.0.0-1.png) + +This is realized by the introduction of a smarter query optimizer, inverted index, a parallel execution model, and a series of new functionalities to support high-concurrency point queries. + +### A smarter query optimizer + +The brand new query optimizer, Nereids, has a richer statistical base and adopts the Cascades framework. It is capable of self-tuning in most query scenarios and supports all 99 queries in TPC-DS, so users can expect high performance without any fine-tuning or SQL rewriting. + +TPC-H tests showed that Nereids, with no human intervention, outperformed the old query optimizer by a wide margin. Over 100 users have tried Apache Doris 2.0.0 in their production environments and the vast majority of them reported huge speedups in query execution. + +![](/images/release-note-2.0.0-2.png) + +**Doc**: https://doris.apache.org/docs/dev/query-acceleration/nereids/ + +Nereids is enabled by default in Apache Doris 2.0.0: `SET enable_nereids_planner=true`. Nereids collects statistical data by calling the Analyze command. + +### Inverted Index + +In Apache Doris 2.0.0, we introduced inverted index to better support fuzzy keyword search, equivalence queries, and range queries.
+ +A smartphone manufacturer tested Apache Doris 2.0.0 in their user behavior analysis scenarios. With inverted index enabled, v2.0.0 was able to finish the queries within milliseconds and maintain stable performance as the query concurrency level went up. In this case, it was 5 to 90 times faster than the older version. + +![](/images/release-note-2.0.0-3.png) + +### 20 times higher concurrency capability + +In scenarios like e-commerce order queries and express tracking, a huge number of end users search for a certain data record simultaneously. These are what we call high-concurrency point queries, which can put huge pressure on the system. A traditional solution is to introduce Key-Value stores like Apache HBase for such queries, and Redis as a cache layer to ease the burden, but that means redundant storage and higher maintenance costs. + +For a column-oriented DBMS like Apache Doris, the I/O usage of point queries will be multiplied. We need neater execution. Thus, on the basis of columnar storage, we added row storage format and row cache to increase row reading efficiency, short-circuit plans to speed up data retrieval, and prepared statements to reduce frontend overheads. + +After these optimizations, Apache Doris 2.0 reached a concurrency level of **30,000 QPS per node** on YCSB on a 16 Core 64G cloud server with 4×1T hard drives, representing an improvement of **20 times** compared to its older version. This makes Apache Doris a good alternative to HBase in high-concurrency scenarios, so that users don't need to endure extra maintenance costs and redundant storage brought by complicated tech stacks. + +Read more: https://doris.apache.org/blog/High_concurrency + +### A self-adaptive parallel execution model + +Apache Doris 2.0 brought in a Pipeline execution model for higher efficiency and stability in hybrid analytic workloads. In this model, the execution of queries is driven by data.
The blocking operators in all query execution processes are split into pipelines. Whether a pipeline gets an execution thread depends on whether its relevant data is ready. This enables asynchronous blocking operations and more flexible system resource management. Also, this improves CPU efficiency as the system doesn't have to create and destroy threads as often. + +Doc: https://doris.apache.org/docs/dev/query-acceleration/pipeline-execution-engine/ + +**How to enable the Pipeline execution model** + +- The Pipeline execution engine is enabled by default in Apache Doris 2.0: `Set enable_pipeline_engine = true`. +- `parallel_pipeline_task_num` represents the number of pipeline tasks that are executed in parallel in SQL queries. Its default value is `0`, which means Apache Doris will automatically set the concurrency level to half the number of CPUs in each backend node. Users can change this value as needed. +- For those who are upgrading to Apache Doris 2.0 from an older version, it is recommended to set the value of `parallel_pipeline_task_num` to that of `parallel_fragment_exec_instance_num` in the old version. + +## A Unified Platform for Multiple Analytic Workloads + +Apache Doris has been pushing its boundaries. Starting as an OLAP engine for reporting, it is now a data warehouse capable of ETL/ELT and more. Version 2.0 is making advancements in its log analysis and data lakehousing capabilities. + +### A 10 times more cost-effective log analysis solution + +Apache Doris 2.0.0 provides native support for semi-structured data. In addition to JSON and Array, it now supports a complex data type: Map. Based on Light Schema Change, it also supports Schema Evolution, which means you can adjust the schema as your business changes. You can add or delete fields and indexes, and change the data types for fields.
With inverted index and a high-performance text analysis algorithm, Doris can execute full-text search and dimensional analysis of logs more efficiently. With faster data writing and query speed and lower storage cost, it is 10 times more cost-effective than common log analytics solutions in the industry. + +![](/images/release-note-2.0.0-4.png) + +### Enhanced data lakehousing capabilities + +In Apache Doris 1.2, we introduced Multi-Catalog to allow for auto-mapping and auto-synchronization of data from heterogeneous sources. In version 2.0.0, we extended the list of data sources supported and optimized Doris based on users' needs in production environments. + +![](/images/release-note-2.0.0-5.png) + +Apache Doris 2.0.0 supports dozens of data sources including Hive, Hudi, Iceberg, Paimon, MaxCompute, Elasticsearch, Trino, ClickHouse, and almost all open lakehouse formats. It also supports snapshot queries on Hudi Copy-on-Write tables and read optimized queries on Hudi Merge-on-Read tables. It allows for authorization of Hive Catalog using Apache Ranger, so users can reuse their existing privilege control system. Besides, it supports extensible authorization plug-ins to enable user-defined authorization methods for any catalog. + +TPC-H benchmark tests showed that Apache Doris 2.0.0 is 3~5 times faster than Presto/Trino in queries on Hive tables. This is realized by all-around optimizations (in small file reading, flat table reading, local file cache, ORC/Parquet file reading, Compute Nodes, and information collection of external tables) finished in this development cycle and the distributed execution framework, vectorized execution engine, and query optimizer of Apache Doris. + +![](/images/release-note-2.0.0-6.png) + +All this gives Apache Doris 2.0.0 an edge in data lakehousing scenarios.
With Doris, you can do incremental or full synchronization of multiple upstream data sources in one place, and expect much higher data query performance than other query engines. The processed data can be written back to the sources or provided for downstream systems. In this way, you can make Apache Doris your unified data analytic gateway. + +## Efficient Data Update + +Data update is important in real-time analysis, since users want to always have access to the latest data, and be able to update data flexibly, such as updating a row or just a few columns, batch updating or deleting their specified data, or even overwriting a whole data partition. + +Efficient data updating has been another hill to climb in data analysis. Apache Hive only supports updates on the partition level, while Hudi and Iceberg do better in low-frequency batch updates instead of real-time updates due to their Merge-on-Read and Copy-on-Write implementations. + +As for data updating, Apache Doris 2.0.0 is capable of: + +- **Faster data writing**: In the pressure tests with an online payment platform, under 20 concurrent data writing tasks, Doris reached a writing throughput of 300,000 records per second and maintained stability throughout the over 10-hour continuous writing process. +- **Partial column update**: Older versions of Doris implement partial column update by `replace_if_not_null` in the Aggregate Key model. In 2.0.0, we enable partial column updates in the Unique Key model. That means you can directly write data from multiple source tables into a flat table, without having to concatenate them into one output stream using Flink before writing. This method avoids a complicated processing pipeline and the extra resource consumption. You can simply specify the columns you need to update.
+- **Conditional update and deletion**: In addition to the simple Update and Delete operations, we support complicated conditional update and delete operations on the basis of Merge-on-Write. + +## Faster, Stabler, and Smarter Data Writing + +### Higher speed in data writing + +As part of our continuing effort to strengthen the real-time analytic capability of Apache Doris, we have improved the end-to-end real-time data writing capability of version 2.0.0. Benchmark tests reported higher throughput in various writing methods: + +- Stream Load, TPC-H 144G lineitem table, 48-bucket Duplicate table, triple-replica writing: throughput increased by 100% +- Stream Load, TPC-H 144G lineitem table, 48-bucket Unique Key table, triple-replica writing: throughput increased by 200% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Duplicate table: throughput increased by 50% +- Insert Into Select, TPC-H 144G lineitem table, 48-bucket Unique Key table: throughput increased by 150% + +### Greater stability in high-concurrency data writing + +Sources of system instability often include small file merging, write amplification, and the consequent disk I/O and CPU overheads. Hence, we introduced Vertical Compaction and Segment Compaction in version 2.0.0 to eliminate OOM errors in compaction and avoid the generation of too many segment files during data writing. After such improvements, Apache Doris can write data 50% faster while **using only 10% of the memory that it previously used**. + +Read more: https://doris.apache.org/blog/Compaction + +### Auto-synchronization of table schema + +The latest Flink-Doris-Connector allows users to synchronize an entire database (such as MySQL and Oracle) to Apache Doris in one simple step. According to our test results, one single synchronization task can support the real-time concurrent writing of thousands of tables.
Users no longer need to go through a complicated synchronization procedure because Apache Doris has automated the process. Changes in the upstream data schema will be automatically captured and dynamically updated to Apache Doris in a seamless manner. + +Read more: https://doris.apache.org/blog/FDC + +## A New Multi-Tenant Resource Isolation Solution + +The purpose of multi-tenant resource isolation is to avoid resource preemption in the case of heavy loads. For that sake, older versions of Apache Doris adopted a hard isolation plan featured by Resource Group: Backend nodes of the same Doris cluster would be tagged, and those of the same tag formed a Resource Group. As data was ingested into the database, different data replicas would be written into different Resource Groups, which will be responsible for different workloads. For example, data reading and writing will be conducted on different data tablets, so as to realize read-write separation. Similarly, you can also put online and offline business on different Resource Groups. + +![](/images/release-note-2.0.0-7.png) + +This is an effective solution, but in practice, it happens that some Resource Groups are heavily occupied while others are idle. We want a more flexible way to reduce vacancy rate of resources. Thus, in 2.0.0, we introduce Workload Group resource soft limit. + +![](/images/release-note-2.0.0-8.png) + +The idea is to divide workloads into groups to allow for flexible management of CPU and memory resources. Apache Doris associates a query with a Workload Group, and limits the percentage of CPU and memory that a single query can use on a backend node. The memory soft limit can be configured and enabled by the user. 
+ +When there is a cluster resource shortage, the system will kill the largest memory-consuming query tasks; when there are sufficient cluster resources, once a Workload Group uses more resources than expected, the idle cluster resources will be shared among all the Workload Groups to give full play to the system memory and ensure stable execution of queries. You can also prioritize the Workload Groups in terms of resource allocation. In other words, you can decide which tasks can be assigned with adequate resources and which not. + +Meanwhile, we introduced Query Queue in 2.0.0. Upon Workload Group creation, you can set a maximum query number for a query queue. Queries beyond that limit will wait for execution in the queue. This is to reduce system burden under heavy workloads. + +## Elastic Scaling and Storage-Compute Separation + +When it comes to computation and storage resources, what do users want? + +- **Elastic scaling of computation resources**: Scale up resources quickly in peak times to increase efficiency and scale down in valley times to reduce costs. +- **Lower storage costs**: Use low-cost storage media and separate storage from computation. +- **Separation of workloads**: Isolate the computation resources of different workloads to avoid preemption. +- **Unified management of data**: Simply manage catalogs and data in one place. + +To separate storage and computation is a way to realize elastic scaling of resources, but it demands more efforts in maintaining storage stability, which determines the stability and continuity of OLAP services. To ensure storage stability, we introduced mechanisms including cache management, computation resource management, and garbage collection. + + In this respect, we divide our users into three groups after investigation: + +1. Users with no need for resource scaling +2. Users requiring resource scaling, low storage costs, and workload separation from Apache Doris +3. 
Users who already have a stable large-scale storage system and thus require an advanced compute-storage-separated architecture for efficient resource scaling + +Apache Doris 2.0 provides two solutions to address the needs of the first two types of users. + +1. **Compute nodes**. We introduced stateless compute nodes in version 2.0. Unlike the mix nodes, the compute nodes do not save any data and are not involved in workload balancing of data tablets during cluster scaling. Thus, they are able to quickly join the cluster and share the computing pressure during peak times. In addition, in data lakehouse analysis, these nodes will be the first ones to execute queries on remote storage (HDFS/S3) so there will be no resource competition between internal tables and external tables. + 1. Doc: https://doris.apache.org/docs/dev/advanced/compute_node/ +2. **Hot-cold data separation**. Hot/cold data refers to data that is frequently/seldom accessed, respectively. Generally, it makes more sense to store cold data in low-cost storage. Older versions of Apache Doris support lifecycle management of table partitions: As hot data cooled down, it would be moved from SSD to HDD. However, data was stored with multiple replicas on HDD, which was still a waste. Now, in Apache Doris 2.0, cold data can be stored in object storage, which is even cheaper and allows single-copy storage. That reduces the storage costs by 70% and cuts down the computation and network overheads that come with storage. + 1. Read more: https://doris.apache.org/blog/HCDS/ + +For neater separation of computation and storage, the VeloDB team is going to contribute the Cloud Compute-Storage-Separation solution to the Apache Doris project. Its performance and stability have stood the test of hundreds of companies in their production environments. The merging of code will be finished by October this year, and all Apache Doris users will be able to get an early taste of it in September.
+ +## Enhanced Usability + +Apache Doris 2.0.0 also highlights some enterprise-facing functionalities. + +### Support for Kubernetes Deployment + +Older versions of Apache Doris communicate based on IP, so any host failure in Kubernetes deployment that causes a Pod IP drift will lead to cluster unavailability. Now, version 2.0 supports FQDN. That means the failed Doris nodes can recover automatically without human intervention, which lays the foundation for Kubernetes deployment and elastic scaling. + +### Support for Cross-Cluster Replication (CCR) + +Apache Doris 2.0.0 supports cross-cluster replication (CCR). Data changes at the database/table level in the source cluster will be synchronized to the target cluster. You can choose to replicate the incremental data or the full data. + +It also supports synchronization of DDL, which means DDL statements executed by the source cluster can also be automatically replicated to the target cluster. + +It is simple to configure and use CCR in Doris. Leveraging this functionality, you can implement read-write separation and multi-datacenter replication. + +This feature allows for higher data availability, read/write workload separation, and more efficient cross-data-center replication.
+ +## Behavior Change + +- Use rolling upgrade from 1.2-LTS to 2.0.0, and restart upgrade from preview versions of 2.0 to 2.0.0; +- The new query optimizer (Nereids) is enabled by default: `enable_nereids_planner=true`; +- All non-vectorized code has been removed from the system, so the `enable_vectorized_engine` parameter no longer works; +- A new parameter `enable_single_replica_compaction` has been added; +- datev2, datetimev2, and decimalv3 are the default data types in table creation; datev1, datetimev1, and decimalv2 are not supported in table creation; +- decimalv3 is the default data type for JDBC and Iceberg Catalog; +- A new data type `AGG_STATE` has been added; +- The cluster column has been removed from backend tables; +- For better compatibility with BI tools, datev2 and datetimev2 are displayed as date and datetime in the output of `show create table`; +- max_openfiles and swaps checks are added to the backend startup script, so inappropriate system configuration might lead to backend startup failure; +- Password-free login is not allowed when accessing frontend on localhost; +- If there is a Multi-Catalog in the system, by default, only data of the internal catalog will be displayed when querying information schema; +- A limit has been imposed on the depth of the expression tree. The default value is 200; +- The single quote in the return value of array string has been changed to double quote; +- The Doris processes are renamed to DorisFE and DorisBE. +- The behavior of the two-argument AES and SM4 functions has changed. See more information in the [related function docs](../../sql-manual/sql-functions/encrypt-digest-functions/sm4-encrypt.md) + +## Embarking on the 2.0.0 Journey + +To make Apache Doris 2.0.0 production-ready, we invited hundreds of enterprise users to engage in the testing and optimized it for better performance, stability, and usability. In the next phase, we will continue responding to user needs with agile release planning.
We plan to launch 2.0.1 in late August and 2.0.2 in September, as we keep fixing bugs and adding new features. We also plan to release an early version of 2.1 in September to bring a few long-requested capabilities to you. For example, in Doris 2.1, the Variant data type will better serve the schema-free analytic needs of semi-structured data; the multi-table materialized views will be able to simplify the data scheduling and processing link while speeding up queries; more and neater data ingestion methods will be added and nested composite data types will be realized. + +If you have any questions or ideas when investigating, testing, and deploying Apache Doris, please find us on [Slack](https://t.co/ZxJuNJHXb2). Our developers will be happy to hear them and provide targeted support. + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md new file mode 100644 index 0000000000000..d8c19fb67525b --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.1.md @@ -0,0 +1,224 @@ +--- +{ + "title": "Release 2.0.1", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, 383 improvements and bug fixes have been made in Doris 2.0.1. 
+ +## Behavior Changes + +- [https://github.com/apache/doris/pull/21302](https://github.com/apache/doris/pull/21302) + +## Improvements + +### functionality and stability of array and map datatypes +- [https://github.com/apache/doris/pull/22793](https://github.com/apache/doris/pull/22793) +- [https://github.com/apache/doris/pull/22927](https://github.com/apache/doris/pull/22927) +- https://github.com/apache/doris/pull/22738 +- https://github.com/apache/doris/pull/22347 +- https://github.com/apache/doris/pull/23250 +- https://github.com/apache/doris/pull/22300 + +### performance for inverted index query +- https://github.com/apache/doris/pull/22836 +- https://github.com/apache/doris/pull/23381 +- https://github.com/apache/doris/pull/23389 +- https://github.com/apache/doris/pull/22570 + +### performance for bitmap, like, scan, agg functions +- https://github.com/apache/doris/pull/23172 +- https://github.com/apache/doris/pull/23495 +- https://github.com/apache/doris/pull/23476 +- https://github.com/apache/doris/pull/23396 +- https://github.com/apache/doris/pull/23182 +- https://github.com/apache/doris/pull/22216 + +### functionality and stability of CCR +- https://github.com/apache/doris/pull/22447 +- https://github.com/apache/doris/pull/22559 +- https://github.com/apache/doris/pull/22173 +- https://github.com/apache/doris/pull/22678 + +### merge on write unique table + +- https://github.com/apache/doris/pull/22282 +- https://github.com/apache/doris/pull/22984 +- https://github.com/apache/doris/pull/21933 +- https://github.com/apache/doris/pull/22874 + +### optimizer table stats and analyze + +- https://github.com/apache/doris/pull/22658 +- https://github.com/apache/doris/pull/22211 +- https://github.com/apache/doris/pull/22775 +- https://github.com/apache/doris/pull/22896 +- https://github.com/apache/doris/pull/22788 +- https://github.com/apache/doris/pull/22882 +- + +### functionality and performance of multi catalog + +- https://github.com/apache/doris/pull/22949 
+- https://github.com/apache/doris/pull/22923 +- https://github.com/apache/doris/pull/22336 +- https://github.com/apache/doris/pull/22915 +- https://github.com/apache/doris/pull/23056 +- https://github.com/apache/doris/pull/23297 +- https://github.com/apache/doris/pull/23279 + + +## Important Bug fixes + +- https://github.com/apache/doris/pull/22673 +- https://github.com/apache/doris/pull/22656 +- https://github.com/apache/doris/pull/22892 +- https://github.com/apache/doris/pull/22959 +- https://github.com/apache/doris/pull/22902 +- https://github.com/apache/doris/pull/22976 +- https://github.com/apache/doris/pull/22734 +- https://github.com/apache/doris/pull/22840 +- https://github.com/apache/doris/pull/23008 +- https://github.com/apache/doris/pull/23003 +- https://github.com/apache/doris/pull/22966 +- https://github.com/apache/doris/pull/22965 +- https://github.com/apache/doris/pull/22784 +- https://github.com/apache/doris/pull/23049 +- https://github.com/apache/doris/pull/23084 +- https://github.com/apache/doris/pull/22947 +- https://github.com/apache/doris/pull/22919 +- https://github.com/apache/doris/pull/22979 +- https://github.com/apache/doris/pull/23096 +- https://github.com/apache/doris/pull/23113 +- https://github.com/apache/doris/pull/23062 +- https://github.com/apache/doris/pull/22918 +- https://github.com/apache/doris/pull/23026 +- https://github.com/apache/doris/pull/23175 +- https://github.com/apache/doris/pull/23167 +- https://github.com/apache/doris/pull/23015 +- https://github.com/apache/doris/pull/23165 +- https://github.com/apache/doris/pull/23264 +- https://github.com/apache/doris/pull/23246 +- https://github.com/apache/doris/pull/23198 +- https://github.com/apache/doris/pull/23221 +- https://github.com/apache/doris/pull/23277 +- https://github.com/apache/doris/pull/23249 +- https://github.com/apache/doris/pull/23272 +- https://github.com/apache/doris/pull/23383 +- https://github.com/apache/doris/pull/23372 +- 
https://github.com/apache/doris/pull/23399 +- https://github.com/apache/doris/pull/23295 +- https://github.com/apache/doris/pull/23446 +- https://github.com/apache/doris/pull/23406 +- https://github.com/apache/doris/pull/23387 +- https://github.com/apache/doris/pull/23421 +- https://github.com/apache/doris/pull/23456 +- https://github.com/apache/doris/pull/23361 +- https://github.com/apache/doris/pull/23402 +- https://github.com/apache/doris/pull/23369 +- https://github.com/apache/doris/pull/23245 +- https://github.com/apache/doris/pull/23532 +- https://github.com/apache/doris/pull/23529 +- https://github.com/apache/doris/pull/23601 + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.1-merged+is%3Aclosed) . + + +## Big Thanks + +Thanks all who contribute to this release: + +@adonis0147 +@airborne12 +@amorynan +@AshinGau +@BePPPower +@BiteTheDDDDt +@bobhan1 +@ByteYue +@caiconghui +@CalvinKirs +@csun5285 +@DarvenDuan +@deadlinefen +@DongLiang-0 +@Doris-Extras +@dutyu +@englefly +@freemandealer +@Gabriel39 +@GoGoWen +@HappenLee +@hello-stephen +@HHoflittlefish777 +@hubgeter +@hust-hhb +@JackDrogon +@jacktengg +@jackwener +@Jibing-Li +@kaijchen +@kaka11chen +@Kikyou1997 +@Lchangliang +@LemonLiTree +@liaoxin01 +@LiBinfeng-01 +@lsy3993 +@luozenglin +@morningman +@morrySnow +@mrhhsg +@Mryange +@mymeiyi +@shuke987 +@sohardforaname +@starocean999 +@TangSiyang2001 +@Tanya-W +@ucasfl +@vinlee19 +@wangbo +@wsjz +@wuwenchi +@xiaokang +@XieJiann +@xinyiZzz +@yujun777 +@Yukang-Lian +@Yulei-Yang +@zclllyybb +@zddr +@zenoyang +@zgxme +@zhangguoqiang666 +@zhangstar333 +@zhannngchen +@zhiqiang-hhhh +@zxealous +@zy-kkk +@zzzxl1993 +@zzzzzzzs + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md new file mode 100644 index 0000000000000..5d8592a0ee25c --- /dev/null +++ 
b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.10.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.10", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 83 improvements and bug fixes have been made in Doris 2.0.10 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## Improvement and Optimizations + +- This enhancement introduces the `read_only` and `super_read_only` variables to the database system, ensuring compatibility with MySQL's read-only modes. + +- When the check status is not IO_ERROR, the disk path should not be added to the broken list. This ensures that only disks with actual I/O errors are marked as broken. + +- When performing a Create Table As Select (CTAS) operation from an external table, convert the `VARCHAR` column to `STRING` type. + +- Support mapping Paimon column type "ROW" to Doris type "STRUCT" + +- Choose disk tolerate with little skew when creating tablet + +- Write editlog to `set replica drop` to avoid confusing status on follower FE + +- Make the schema change memory space adaptive to avoid memory over limit + +- Inverted index 'unicode' tokenizer supports configuration to exclude stop words + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.9...2.0.10) . 
+ +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @BePPPower, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @deardeng, @DongLiang-0, @eldenmoon, @felixwluo, @HappenLee, @hubgeter, @jackwener, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @luennng, @morningman, @morrySnow, @Mryange, @nextdreamblue, @qidaye, @starocean999, @suxiaogang223, @SWJTU-ZhangLei, @w41ter, @xiaokang, @xy720, @yujun777, @Yukang-Lian, @zhangstar333, @zxealous, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md new file mode 100644 index 0000000000000..1a2598b0d41a0 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.11.md @@ -0,0 +1,60 @@ +--- +{ + "title": "Release 2.0.11", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 123 improvements and bug fixes have been made in Doris 2.0.11 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## 1 Behavior change + +Since the inverted index is now mature and stable, it can replace the old BITMAP INDEX. Therefore, any newly created `BITMAP INDEX` will automatically switch to an `INVERTED INDEX`, while existing `BITMAP INDEX` will remain unchanged. This entire switching process is transparent to the user, with no changes to writing or querying. Additionally, users can disable this automatic switch by setting the FE configuration `enable_create_bitmap_index_as_inverted_index` to false. 
[#35528](https://github.com/apache/doris/pull/35528) + + +## 2 Improvement and optimizations + +- Add Trino JDBC Catalog type mapping for JSON and TIME + +- FE exits when it fails to transfer to (non) master to prevent unknown state and too many logs + +- Write audit log while doing drop stats table. + +- Ignore min/max column stats if table is partially analyzed to avoid inefficient query plan + +- Support minus operation for set like `set1 - set2` + +- Improve performance of LIKE and REGEXP clauses with concat(col, pattern_str), e.g. `col1 LIKE concat('%', col2, '%')` + +- Add query options for short circuit queries for upgrade compatibility + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.10...2.0.11) . + +## Credits + +Thanks to all who contributed to this release: + +@AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @CalvinKirs, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @Gabriel39, @GoGoWen, @HHoflittlefish777, @hubgeter, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaka11chen, @kobe6th, @LiBinfeng-01, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @nextdreamblue, @qidaye, @sjyango, @starocean999, @SWJTU-ZhangLei, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XieJiann, @xy720, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md new file mode 100644 index 0000000000000..0bc289c91a8ef --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.12.md @@ -0,0 +1,58 @@ +--- +{ + "title": "Release 2.0.12", + "language": "en" +} +--- + + + +Thanks to our community developers and users for their contributions. Doris version 2.0.12 will bring 99 improvements and bug fixes. 
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- No longer set the default table comment to the table type. Instead, set it to be empty by default, for example, change COMMENT 'OLAP' to COMMENT ' '. This new behavior is more friendly for BI software that relies on table comments. [#35855](https://github.com/apache/doris/pull/35855) + +- Change the type of the `@@autocommit` variable from `BOOLEAN` to `BIGINT` to prevent errors from certain MySQL clients (such as .NET MySQL.Data). [#33282](https://github.com/apache/doris/pull/33282) + + +## Improvements + +- Remove the `disable_nested_complex_type` parameter and allow the creation of nested `ARRAY`, `MAP`, and `STRUCT` types by default. [#36255](https://github.com/apache/doris/pull/36255) + +- The HMS catalog supports the `SHOW CREATE DATABASE` command. [#28145](https://github.com/apache/doris/pull/28145) + +- Add more inverted index metrics to the query profile. [#36545](https://github.com/apache/doris/pull/36545) + +- Cross-Cluster Replication (CCR) supports inverted indices. [#31743](https://github.com/apache/doris/pull/31743) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.11...2.0.12) , with the key features and improvements highlighted below. 
+ + + +## Credits + +Thanks to all who contributed to this release: + +@airborne12, @amorynan, @BiteTheDDDDt, @cambyzju, @caoliang-web, @dataroaring, @eldenmoon, @feiniaofeiafei, @felixwluo, @gavinchou, @HappenLee, @hello-stephen, @jacktengg, @Jibing-Li, @Johnnyssc, @liaoxin01, @LiBinfeng-01, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @qidaye, @qzsee, @starocean999, @w41ter, @wangbo, @wsjz, @wuwenchi, @xiaokang, @XuPengfei-1020, @xy720, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md new file mode 100644 index 0000000000000..1b6e54d948d7d --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.13.md @@ -0,0 +1,61 @@ +--- +{ + "title": "Release 2.0.13", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 112 improvements and bug fixes have been made in Doris 2.0.13 version + +[Quick Download](https://doris.apache.org/download/) + +## Behavior changes + +SQL input is treated as multiple statements only when the `CLIENT_MULTI_STATEMENTS` setting is enabled on the client side, enhancing compatibility with MySQL. [#36759](https://github.com/apache/doris/pull/36759) + +## New features + +- A new BE configuration `allow_zero_date` has been added, allowing dates with all zeros. When set to `false`, `0000-00-00` is parsed as `NULL`, and when set to `true`, it is parsed as `0000-01-01`. The default value is `false` to maintain consistency with previous behavior. [#34961](https://github.com/apache/doris/pull/34961) + +- `LogicalWindow` and `LogicalPartitionTopN` support multi-field predicate pushdown to improve performance. 
[#36828](https://github.com/apache/doris/pull/36828) + +- The ES Catalog now maps ES `nested` or `object` types to Doris `JSON` types. [#37101](https://github.com/apache/doris/pull/37101) + +## Improvements + +- Queries with `LIMIT` end reading data earlier to reduce resource consumption and improve performance. [#36535](https://github.com/apache/doris/pull/36535) + +- Special JSON data with empty keys is now supported. [#36762](https://github.com/apache/doris/pull/36762) + +- Stability and usability of routine load have been improved, including load balancing, automatic recovery, exception handling, and more user-friendly error messages. [#36450](https://github.com/apache/doris/pull/36450) [#35376](https://github.com/apache/doris/pull/35376) [#35266](https://github.com/apache/doris/pull/35266) [ #33372](https://github.com/apache/doris/pull/33372) [#32282](https://github.com/apache/doris/pull/32282) [#32046](https://github.com/apache/doris/pull/32046) [#32021](https://github.com/apache/doris/pull/32021) [#31846](https://github.com/apache/doris/pull/31846) [#31273](https://github.com/apache/doris/pull/31273) + +- BE load balancing selection of hard disk strategy and speed optimization. [#36826](https://github.com/apache/doris/pull/36826) [#36795](https://github.com/apache/doris/pull/36795) [#36509](https://github.com/apache/doris/pull/36509) + +- Stability and usability of the JDBC catalog have been improved, including encryption, thread pool connection count configuration, and more user-friendly error messages. [#36940](https://github.com/apache/doris/pull/36940) [#36720](https://github.com/apache/doris/pull/36720) [#30880](https://github.com/apache/doris/pull/30880) [#35692](https://github.com/apache/doris/pull/35692) + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.12...2.0.13) , with the key features and improvements highlighted below. 
+ +## Credits + +Thanks to all who contributed to this release: + +@Gabriel39, @Jibing-Li, @Johnnyssc, @Lchangliang, @LiBinfeng-01, @SWJTU-ZhangLei, @Thearas, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @deardeng, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @jacktengg, @kaijchen, @liutang123, @luwei16, @morningman, @morrySnow, @mrhhsg, @mymeiyi, @platoneko, @qidaye, @sollhui, @starocean999, @w41ter, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md new file mode 100644 index 0000000000000..061c5cb7a1093 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.14.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.14", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 110 improvements and bug fixes have been made in Doris 2.0.14 version + + +## 1 New features + +- Adds a REST interface to retrieve the most recent query profile: `curl http://user:password@127.0.0.1:8030/api/profile/text` [#38268](https://github.com/apache/doris/pull/38268) + +## 2 Improvements + +- Optimizes the primary key point query performance for MOW tables with sequence columns [#38287](https://github.com/apache/doris/pull/38287) + +- Enhances the performance of inverted index queries with many conditions [#35346](https://github.com/apache/doris/pull/35346) + +- Automatically enables the `support_phrase` option when creating a tokenized inverted index to accelerate `match_phrase` phrase queries [#37949](https://github.com/apache/doris/pull/37949) + +- Supports simplified SQL hints, for example: `SELECT /*+ query_timeout(3000) */ * FROM t;` [#37720](https://github.com/apache/doris/pull/37720) + +- Automatically retries reading from object storage when encountering a `429` error to improve stability 
[#35396](https://github.com/apache/doris/pull/35396) + +- LEFT SEMI / ANTI JOIN terminates subsequent matching execution upon matching a qualifying data row to enhance performance. [#34703](https://github.com/apache/doris/pull/34703) + +- Prevents coredump when returning illegal data to MySQL results. [#28069](https://github.com/apache/doris/pull/28069) + +- Unifies the output of type names in lowercase to maintain compatibility with MySQL and be more friendly to BI tools. [#38521](https://github.com/apache/doris/pull/38521) + + +You can access the full list through the GitHub [link](https://github.com/apache/doris/compare/2.0.13...2.0.14) , with the key features and improvements highlighted below. + +## Credits + +Thanks all who contribute to this release: + +@ByteYue, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Lchangliang, @LiBinfeng-01, @Mryange, @XieJiann, @Yukang-Lian, @Yulei-Yang, @airborne12, @amorynan, @biohazard4321, @cambyzju, @csun5285, @eldenmoon, @englefly, @freemandealer, @hello-stephen, @hubgeter, @kaijchen, @liaoxin01, @luwei16, @morningman, @morrySnow, @mymeiyi, @qidaye, @sollhui, @starocean999, @w41ter, @wuwenchi, @xiaokang, @xy720, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md new file mode 100644 index 0000000000000..58237f7c3f097 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.15.md @@ -0,0 +1,91 @@ +--- +{ + "title": "Release 2.0.15", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 157 improvements and bug fixes have been made in Doris 2.0.15 version + +- Quick Download: https://doris.apache.org/download + +- GitHub: https://github.com/apache/doris/releases/tag/2.0.15 + +## 1 Behavior Change + +NA + +## 2 New Features + +- Restore now supports deleting redundant tablets 
and partition options. [#39028](https://github.com/apache/doris/pull/39028) + +- Support JSON function `json_search`.[#40948](https://github.com/apache/doris/pull/40948) + +## 3 Improvement and Optimizations + +### Stability + +- Add a FE configuration `abort_txn_after_lost_heartbeat_time_second` for transaction abort time. [#28662](https://github.com/apache/doris/pull/28662) + +- Abort transactions after a BE loses heartbeat for over 1 minute instead of 5 seconds, to avoid overly sensitive transaction aborts. [#22781](https://github.com/apache/doris/pull/22781) + +- Delay scheduling EOF tasks of routine load to avoid an excessive number of small transactions. [#39975](https://github.com/apache/doris/pull/39975) + +- Prefer querying from online disk services to be more robust. [#39467](https://github.com/apache/doris/pull/39467) + +- Skip checking newly inserted rows in non-strict mode partial updates if the row's delete sign is marked. [#40322](https://github.com/apache/doris/pull/40322) + +- To prevent FE OOM, limit the number of tablets in backup tasks, with a default value of 300,000. [#39987](https://github.com/apache/doris/pull/39987) + +### Performance + +- Optimize slow column updates caused by concurrent column updates and compactions. [#38487](https://github.com/apache/doris/pull/38487) + +- When a NullLiteral exists in a filter condition, it can now be folded into False and further converted to an EmptySet to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Improve performance of `ORDER BY` permutation. [#38985](https://github.com/apache/doris/pull/38985) + +- Improve the performance of string processing in inverted indexes. [#37395](https://github.com/apache/doris/pull/37395) + +### Optimizer and Statistics + +- Added support for statements beginning with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Polish aggregate function signature matching. 
[#39352](https://github.com/apache/doris/pull/39352) + +- Drop column statistics and trigger auto analysis after schema change. [#39101](https://github.com/apache/doris/pull/39101) + +- Support dropping cached stats using `DROP CACHED STATS table_name`. [#39367](https://github.com/apache/doris/pull/39367) + +### Multi Catalog and Others + +- Optimize JDBC Catalog refresh to reduce the frequency of client creation. [#40261](https://github.com/apache/doris/pull/40261) + +- Fix thread leaks in JDBC Catalog under certain conditions. [#39423](https://github.com/apache/doris/pull/39423) + +- ARRAY MAP STRUCT types now support `REPLACE_IF_NOT_NULL`. [#38304](https://github.com/apache/doris/pull/38304) + +- Retry delete jobs for failures that are not `DELETE_INVALID_XXX`. [#37834](https://github.com/apache/doris/pull/37834) + +**Credits** + +@924060929, @BePPPower, @BiteTheDDDDt, @CalvinKirs, @GoGoWen, @HappenLee, @Jibing-Li, @Johnnyssc, @LiBinfeng-01, @Mryange, @SWJTU-ZhangLei, @TangSiyang2001, @Toms1999, @Vallishp, @Yukang-Lian, @airborne12, @amorynan, @bobhan1, @cambyzju, @csun5285, @dataroaring, @eldenmoon, @englefly, @feiniaofeiafei, @hello-stephen, @htyoung, @hubgeter, @justfortaste, @liaoxin01, @liugddx, @liutang123, @luwei16, @mongo360, @morrySnow, @qidaye, @smallx, @sollhui, @starocean999, @w41ter, @xiaokang, @xzj7019, @yujun777, @zclllyybb, @zddr, @zhangstar333, @zhannngchen, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md new file mode 100644 index 0000000000000..3f8e89cddf946 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.2.md @@ -0,0 +1,157 @@ +--- +{ + "title": "Release 2.0.2", + "language": "en" +} +--- + + + +# Release 2.0.2 + +Thanks to our community users and developers, 489 improvements and bug fixes have been made in Doris 2.0.2. 
+ +## Behavior Changes + +- [Remove json -> operator convert to json_extract #24679](https://github.com/apache/doris/pull/24679) + + Remove the json '->' operator since it conflicts with lambda function syntax. It is syntactic sugar for the json_extract function, which can be used instead. +- [Start the script to set metadata_failure_recovery #24308](https://github.com/apache/doris/pull/24308) + + Move metadata_failure_recovery from fe.conf to a start_fe.sh argument to prevent it from being used unexpectedly. +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Optimize priority_network matching logic for be #23795](https://github.com/apache/doris/pull/23795) +- [Fix cancel load failed because Job could not be cancelled… #17730](https://github.com/apache/doris/pull/17730) + + Allow cancelling a retrying load job. + +## Improvements + +### Easier to use + +- [Support custom lib dir to save custom libs #23887](https://github.com/apache/doris/pull/23887) + + Add a custom_lib dir to allow users to place custom lib files and custom_lib will not be replaced. +- [Optimize priority_network matching logic #23784](https://github.com/apache/doris/pull/23784) + + Optimize priority_network logic to avoid errors when this config is wrong or not configured. +- [Row policy support role #23022](https://github.com/apache/doris/pull/23022) + + Support role based auth for row policy. + +### New optimizer Nereids statistics collection improvement + +- [Disable file cache while running analysis tasks. #23663](https://github.com/apache/doris/pull/23663) +- [Show column stats even when error occurred. #23703](https://github.com/apache/doris/pull/23703) +- [Support basic jdbc external table stats collection. 
#23965](https://github.com/apache/doris/pull/23965) +- [Skip unknown col stats check on __internal_schema and information_schema #24625](https://github.com/apache/doris/pull/24625) + +### Better support for JDBC, HDFS, Hive, MySQL, Max Compute, Multi-Catalog + +- [Support hadoop viewfs. #24168](https://github.com/apache/doris/pull/24168) +- [Avoid calling checksum when replaying creating jdbc catalog and fix ranger issue #22369](https://github.com/apache/doris/pull/22369) +- [Optimize the JDBC Catalog connection error message #23868](https://github.com/apache/doris/pull/23868) + + Improve property check and error message for JDBC catalog +- [Fix mc decimal type parse, fix wrong obj location #24242](https://github.com/apache/doris/pull/24242) + + Fix some issues for Max Compute catalog +- [Support sql cache for hms catalog #23391](https://github.com/apache/doris/pull/23391) + + SQL cache for Hive catalog +- [Merge hms partition events. #22869](https://github.com/apache/doris/pull/22869) + + Improve performance for Hive metadata sync +- [Add metadata_name_ids to quickly get catalogs, db, table and add profiling table for compatibility with MySQL #22702](https://github.com/apache/doris/pull/22702) + +### Performance for inverted index query + +- [Add bkd index query cache to improve perf #23952](https://github.com/apache/doris/pull/23952) +- [Improve performance for count on index other than match #24678](https://github.com/apache/doris/pull/24678) +- [Improve match performance without index #24751](https://github.com/apache/doris/pull/24751) +- [Optimize multiple terms conjunction query #23871](https://github.com/apache/doris/pull/23871) +Improve performance of MATCH_ALL +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +Improve performance of MATCH + +### Improve Array functions + +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Improve array union support multi 
params #24327](https://github.com/apache/doris/pull/24327) +- [Improve explode func with array nested complex type #24455](https://github.com/apache/doris/pull/24455) + +## Important Bug fixes + +- [The parameter positions of timestamp diff function to sql are reversed #23601](https://github.com/apache/doris/pull/23601) +- [Fix old optimizer with some array literal functions #23630](https://github.com/apache/doris/pull/23630) +- [Fix query cache returns wrong result after deleting partitions. #23555](https://github.com/apache/doris/pull/23555) +- [Fix potential data loss when clone task's dst tablet is cooldown replica #17644](https://github.com/apache/doris/pull/17644) +- [Fix array map batch append data with right next_array_item_rowid #23779](https://github.com/apache/doris/pull/23779) +- [Fix or to in rule #23940](https://github.com/apache/doris/pull/23940) +- [Fix 'char' function's toSql implementation is wrong #23860](https://github.com/apache/doris/pull/23860) +- [Record wrong best plan properties #23973](https://github.com/apache/doris/pull/23973) +- [Make TVF's distribution spec always be RANDOM #24020](https://github.com/apache/doris/pull/24020) +- [External scan use STORAGE_ANY instead of ANY as distribution #24039](https://github.com/apache/doris/pull/24039) +- [Runtimefilter target is not SlotReference #23958](https://github.com/apache/doris/pull/23958) +- [mv in select materialized_view should disable show table #24104](https://github.com/apache/doris/pull/24104) +- [Fail over to remote file reader if local cache failed #24097](https://github.com/apache/doris/pull/24097) +- [Fix revoke role operation cause fe down #23852](https://github.com/apache/doris/pull/23852) +- [Handle status code correctly and add a new error code `ENTRY_NOT_FOUND` #24139](https://github.com/apache/doris/pull/24139) +- [Fix leaky abstraction and shield the status code `END_OF_FILE` from upper layers #24165](https://github.com/apache/doris/pull/24165) +- [Fix bug that Read 
garbled files caused be crash. #24164](https://github.com/apache/doris/pull/24164) +- [Fix be core when user specified empty `column_separator` using hdfs tvf #24369](https://github.com/apache/doris/pull/24369) +- [Fix need to restart BE after replacing the jar package in java-udf #24372](https://github.com/apache/doris/pull/24372) +- [Need to call 'set_version' in nested functions #24381](https://github.com/apache/doris/pull/24381) +- [window_funnel compatibility issue with multi backends #24385](https://github.com/apache/doris/pull/24385) +- [correlated anti join shouldn't be translated to null aware anti join #24290](https://github.com/apache/doris/pull/24290) +- [Change ordinary type null value is \N,complex type null value is null #24207](https://github.com/apache/doris/pull/24207) +- [Fix analyze failed when there are thousands of partitions. #24521](https://github.com/apache/doris/pull/24521) +- [Do not use enum as the data type for JavaUdfDataType. #24460](https://github.com/apache/doris/pull/24460) +- [Fix multi window projection issue temporarily #24568](https://github.com/apache/doris/pull/24568) +- [Make metadata compatible with 2.0.3 #24610](https://github.com/apache/doris/pull/24610) +- [Select outfile column order is wrong #24595](https://github.com/apache/doris/pull/24595) +- [Incorrect result of semi/anti mark join #24616](https://github.com/apache/doris/pull/24616) +- [Fix broker read issue #24635](https://github.com/apache/doris/pull/24635) +- [Skip unknown col stats check on __internal_schema and information_schema #24625](https://github.com/apache/doris/pull/24625) +- [Fixed bug when parsing multi-character delimiters. 
#24572](https://github.com/apache/doris/pull/24572) +- [Fix timezone parse when there is no tzfile #24578](https://github.com/apache/doris/pull/24578) +- [We need to issue an error when starting FE without setting the Java home environment #23943](https://github.com/apache/doris/pull/23943) +- [Enable_unique_key_partial_update should be forwarded to master #24697](https://github.com/apache/doris/pull/24697) +- [Fix paimon file catalog meta issue and replication num analysis issue #24681](https://github.com/apache/doris/pull/24681) +- [Add more log for ingest_binlog && Fix ingest_binlog not rewrite rowset_meta tablet_uid #24617](https://github.com/apache/doris/pull/24617) +- [Do not abort when a disk is broken #24692](https://github.com/apache/doris/pull/24692) +- [colocate join could not work well on full outer join #24700](https://github.com/apache/doris/pull/24700) +- [Optimize unnecessary conversions #24389](https://github.com/apache/doris/pull/24389) +- [Optimize the reading efficiency of nullable (string) columns. #24698](https://github.com/apache/doris/pull/24698) +- [Fix segment cache core when output rowset is nullptr #24778](https://github.com/apache/doris/pull/24778) +- [Fix duplicate key in schema change #24782](https://github.com/apache/doris/pull/24782) +- [Make metadata compatible for future version after 2.0.2 #24800](https://github.com/apache/doris/pull/24800) +- [Fix map/array deserialize string with quote pair #24808](https://github.com/apache/doris/pull/24808) +- [Failed on arm platform, with clang compiler and pch on, close #24633 #24636](https://github.com/apache/doris/pull/24636) +- [Table column order is changed if add a column and do truncate #24981](https://github.com/apache/doris/pull/24981) +- [Make parser mode coarse grained by default #24949](https://github.com/apache/doris/pull/24949) + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.2-merged+is%3Aclosed) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +[@adonis0147](https://github.com/adonis0147) [@airborne12](https://github.com/airborne12) [@amorynan](https://github.com/amorynan) [@AshinGau](https://github.com/AshinGau) [@BePPPower](https://github.com/BePPPower) [@BiteTheDDDDt](https://github.com/BiteTheDDDDt) [@bobhan1](https://github.com/bobhan1) [@ByteYue](https://github.com/ByteYue) [@caiconghui](https://github.com/caiconghui) [@CalvinKirs](https://github.com/CalvinKirs) [@cambyzju](https://github.com/cambyzju) [@ChengDaqi2023](https://github.com/ChengDaqi2023) [@ChinaYiGuan](https://github.com/ChinaYiGuan) [@CodeCooker17](https://github.com/CodeCooker17) [@csun5285](https://github.com/csun5285) [@dataroaring](https://github.com/dataroaring) [@deadlinefen](https://github.com/deadlinefen) [@DongLiang-0](https://github.com/DongLiang-0) [@Doris-Extras](https://github.com/Doris-Extras) [@dutyu](https://github.com/dutyu) [@eldenmoon](https://github.com/eldenmoon) [@englefly](https://github.com/englefly) [@freemandealer](https://github.com/freemandealer) [@Gabriel39](https://github.com/Gabriel39) [@gnehil](https://github.com/gnehil) [@GoGoWen](https://github.com/GoGoWen) [@gohalo](https://github.com/gohalo) [@HappenLee](https://github.com/HappenLee) [@hello-stephen](https://github.com/hello-stephen) [@HHoflittlefish777](https://github.com/HHoflittlefish777) [@hubgeter](https://github.com/hubgeter) [@hust-hhb](https://github.com/hust-hhb) [@ixzc](https://github.com/ixzc) [@JackDrogon](https://github.com/JackDrogon) [@jacktengg](https://github.com/jacktengg) [@jackwener](https://github.com/jackwener) [@Jibing-Li](https://github.com/Jibing-Li) [@JNSimba](https://github.com/JNSimba) [@kaijchen](https://github.com/kaijchen) [@kaka11chen](https://github.com/kaka11chen) [@Kikyou1997](https://github.com/Kikyou1997) [@Lchangliang](https://github.com/Lchangliang) [@LemonLiTree](https://github.com/LemonLiTree) [@liaoxin01](https://github.com/liaoxin01) 
[@LiBinfeng-01](https://github.com/LiBinfeng-01) [@liugddx](https://github.com/liugddx) [@luwei16](https://github.com/luwei16) [@mongo360](https://github.com/mongo360) [@morningman](https://github.com/morningman) [@morrySnow](https://github.com/morrySnow) @mrhhsg @Mryange @mymeiyi @neuyilan @pingchunzhang @platoneko @qidaye @realize096 @RYH61 @shuke987 @sohardforaname @starocean999 @SWJTU-ZhangLei @TangSiyang2001 @Tech-Circle-48 @w41ter @wangbo @wsjz @wuwenchi @wyx123654 @xiaokang @XieJiann @xinyiZzz @XuJianxu @xutaoustc @xy720 @xyfsjq @xzj7019 @yiguolei @yujun777 @Yukang-Lian @Yulei-Yang @zclllyybb @zddr @zhangguoqiang666 @zhangstar333 @ZhangYu0123 @zhannngchen @zxealous @zy-kkk @zzzxl1993 @zzzzzzzs diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md new file mode 100644 index 0000000000000..a716d6d711fb0 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.3.md @@ -0,0 +1,253 @@ +--- +{ + "title": "Release 2.0.3", + "language": "en" +} +--- + + + +Thanks to our community users and developers, about 1000 improvements and bug fixes have been made in Doris 2.0.3 version, including optimizer statistics, inverted index, complex datatypes, data lake, replica management. + + + +## 1 Behavior change + +- The output format of the complex data type array/map/struct has been changed to be consistent to the input format and JSON specification. The main changes from the previous version are that DATE/DATETIME and STRING/VARCHAR are enclosed in double quotes and null values inside ARRAY/MAP are displayed as `null` instead of `NULL`. + - https://github.com/apache/doris/pull/25946 +- SHOW_VIEW permission is supported. Users with SELECT or LOAD permission will no longer be able to execute the 'SHOW CREATE VIEW' statement and must be granted the SHOW_VIEW permission separately. 
+ - https://github.com/apache/doris/pull/25370 + + +## 2 New features + +### 2.1 Support collecting statistics for optimizer automatically + +Collecting statistics helps the optimizer understand the data distribution characteristics and choose a better plan to greatly improve query performance. It is officially supported starting from version 2.0.3 and is enabled all day by default. + +### 2.2 Support complex datatypes for more datalake source +- Support complex datatypes for JAVA UDF, JDBC and Hudi MOR + - https://github.com/apache/doris/pull/24810 + - https://github.com/apache/doris/pull/26236 +- Support complex datatypes for Paimon + - https://github.com/apache/doris/pull/25364 +- Support Paimon version 0.5 + - https://github.com/apache/doris/pull/24985 + + +### 2.3 Add more builtin functions +- Support the BitmapAgg function in new optimizer + - https://github.com/apache/doris/pull/25508 +- Supports SHA series digest functions + - https://github.com/apache/doris/pull/24342 +- Support the BITMAP datatype in the aggregate functions min_by and max_by + - https://github.com/apache/doris/pull/25430 +- Add milliseconds/microseconds_add/sub/diff functions + - https://github.com/apache/doris/pull/24114 +- Add some json functions: json_insert, json_replace, json_set + - https://github.com/apache/doris/pull/24384 + + +## 3 Improvement and optimizations + +### 3.1 Performance optimizations + +- When the inverted index MATCH WHERE condition with a high filter rate is combined with the common WHERE condition with a low filter rate, the I/O of the index column is greatly reduced. +- Optimize the efficiency of random data access after the where filter. +- Optimizes the performance of the old get_json_xx function on JSON data types by 2~4x. +- Supports the configuration to reduce the priority of the data read thread, ensuring the CPU resources for real-time writing. 
+- Adds `uuid_numeric` function that returns largeint, which is 20 times faster than `uuid` function that returns string. +- Optimized the performance of case when by 3x. +- Cut out unnecessary predicate calculations in storage engine execution. +- Accelerate count performance by pushing down count operator to storage tier. +- Optimizes the computation performance of the nullable type in AND/OR expressions. +- Supports rewriting the limit operator before `join` in more scenarios to improve query performance. +- Eliminate useless `order by` operators from inline view to improve query performance. +- Optimizes the accuracy of cardinality estimates and cost models in some cases. +- Optimized jdbc catalog predicate pushdown logic. +- Optimized the read efficiency of the file cache when it is enabled for the first time. +- Optimizes the hive table sql cache policy and uses the partition update time stored in HMS to improve the cache hit ratio. +- Optimize mow compaction efficiency. +- Optimized thread allocation logic for external table query to reduce memory usage +- Optimize memory usage for column reader. + + + +### 3.2 Distributed replica management improvements + +Distributed replica management improvements include skipping partition deletion, colocate group deletion, balance failure due to continuous write, and hot and cold separation table balance. + + +### 3.3 Security enhancement +- The audit log plug-in uses a token instead of a plaintext password to enhance security + - https://github.com/apache/doris/pull/26278 +- log4j configures security enhancement + - https://github.com/apache/doris/pull/24861 +- Sensitive user information is not displayed in logs + - https://github.com/apache/doris/pull/26912 + + +## 4 Bugfix and stability + +### 4.1 Complex datatypes +- Fix issues that fixed-length CHAR(n) was not truncated correctly in map/struct. 
+ - https://github.com/apache/doris/pull/25725 +- Fix write failure for struct datatype nested for map/array + - https://github.com/apache/doris/pull/26973 +- Fix the issue that count distinct did not support array/map/struct + - https://github.com/apache/doris/pull/25483 +- Fix be crash in updating to 2.0.3 after the delete complex type appeared in query + - https://github.com/apache/doris/pull/26006 +- Fix be crash when JSON datatype is in WHERE clause. + - https://github.com/apache/doris/pull/27325 +- Fix be crash when ARRAY datatype is in OUTER JOIN clause. + - https://github.com/apache/doris/pull/25669 +- Fix reading incorrect result for DECIMAL datatype in ORC format. + - https://github.com/apache/doris/pull/26548 + - https://github.com/apache/doris/pull/25977 + - https://github.com/apache/doris/pull/26633 + +### 4.2 Inverted index +- Fix incorrect results for OR NOT combinations in the WHERE clause when inverted index query is disabled. + - https://github.com/apache/doris/pull/26327 +- Fix be crash when write a empty with inverted index + - https://github.com/apache/doris/pull/25984 +- Fix be crash in index compaction when the output of compaction is empty. + - https://github.com/apache/doris/pull/25486 +- Fix be crash when adding an inverted index while no data has been written to the newly added column. +- Fix be crash when BUILD INDEX after ADD COLUMN without new data written. + - https://github.com/apache/doris/pull/27276 +- Fix missing and leak problem of hardlink for inverted index file. 
+ - https://github.com/apache/doris/pull/26903 +- Fix index file corruption when the disk is temporarily full + - https://github.com/apache/doris/pull/28191 +- Fix incorrect result due to optimization for skip reading index column + - https://github.com/apache/doris/pull/28104 + +### 4.3 Materialized View +- Fix the problem of BE crash caused by repeated expressions in the group by statement +- Fix be crash when there are duplicate expressions in `group by` statements. + - https://github.com/apache/doris/pull/27523 +- Disables the float/double type in the `group by` clause when a view is created. + - https://github.com/apache/doris/pull/25823 +- Improve the function of select query matching materialized view + - https://github.com/apache/doris/pull/24691 +- Fix an issue that materialized views could not be matched when a table alias was used + - https://github.com/apache/doris/pull/25321 +- Fix the problem using percentile_approx when creating materialized views + - https://github.com/apache/doris/pull/26528 + +### 4.4 Table sample +- Fix the problem that table sample query does not work on tables with partitions. + - https://github.com/apache/doris/pull/25912 +- Fix the problem that table sample query does not work when a tablet is specified. + - https://github.com/apache/doris/pull/25378 + + +### 4.5 Unique with merge on write +- Fix null pointer exception in conditional update based on primary key + - https://github.com/apache/doris/pull/26881 +- Fix field name capitalization issues in partial update + - https://github.com/apache/doris/pull/27223 +- Fix duplicate keys occurring in mow during schema change repair. + - https://github.com/apache/doris/pull/25705 + + +### 4.6 Load and compaction +- Fix unknown slot descriptor error in routineload for running multiple tables + - https://github.com/apache/doris/pull/25762 +- Fix be crash due to concurrent memory access when calculating memory + - https://github.com/apache/doris/pull/27101 +- Fix be crash on duplicate cancel for load. 
+ - https://github.com/apache/doris/pull/27111 +- Fix broker connection error during broker load + - https://github.com/apache/doris/pull/26050 +- Fix incorrect results from delete predicates when compaction and scan run concurrently. + - https://github.com/apache/doris/pull/24638 +- Fix the problem that compaction task would print too many stacktrace logs + - https://github.com/apache/doris/pull/25597 + + +### 4.7 Data Lake compatibility +- Fix query failure when the iceberg table contains special characters + - https://github.com/apache/doris/pull/27108 +- Fix compatibility issues of different hive metastore versions + - https://github.com/apache/doris/pull/27327 +- Fix an error reading max compute partition table + - https://github.com/apache/doris/pull/24911 +- Fix the issue that backup to object storage failed + - https://github.com/apache/doris/pull/25496 + - https://github.com/apache/doris/pull/25803 + + +### 4.8 JDBC external table compatibility + +- Fix Oracle date type format error in jdbc catalog + - https://github.com/apache/doris/pull/25487 +- Fix MySQL 0000-00-00 date exception in jdbc catalog + - https://github.com/apache/doris/pull/26569 +- Fix an exception in reading data from Mariadb where the default value of the time type is current_timestamp + - https://github.com/apache/doris/pull/25016 +- Fix be crash when processing BITMAP datatype in jdbc catalog + - https://github.com/apache/doris/pull/25034 + - https://github.com/apache/doris/pull/26933 + + +### 4.9 SQL Planner and Optimizer + +- Fix partition prune error in some scenarios + - https://github.com/apache/doris/pull/27047 + - https://github.com/apache/doris/pull/26873 + - https://github.com/apache/doris/pull/25769 + - https://github.com/apache/doris/pull/27636 + +- Fix incorrect sub-query processing in some scenarios + - https://github.com/apache/doris/pull/26034 + - https://github.com/apache/doris/pull/25492 + - https://github.com/apache/doris/pull/25955 + - 
https://github.com/apache/doris/pull/27177 + +- Fix some semantic parsing errors + - https://github.com/apache/doris/pull/24928 + - https://github.com/apache/doris/pull/25627 + +- Fix data loss during right outer/anti join + - https://github.com/apache/doris/pull/26529 + +- Fix incorrect pushdown of predicates past aggregation operators. + - https://github.com/apache/doris/pull/25525 + +- Fix incorrect result header in some cases + - https://github.com/apache/doris/pull/25372 + +- Fix incorrect plan when the nullsafeEquals expression (<=>) is used as the join condition + - https://github.com/apache/doris/pull/27127 + +- Fix incorrect column pruning in set operation operators. + - https://github.com/apache/doris/pull/26884 + + +### Others + +- Fix BE crash when the order of columns in a table is changed and then upgraded to 2.0.3. + - https://github.com/apache/doris/pull/28205 + + +See the complete list of improvements and bug fixes on [github dev/2.0.3-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.3-merged+is%3Aclosed) . diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md new file mode 100644 index 0000000000000..e1dac58fbf69a --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.4.md @@ -0,0 +1,67 @@ +--- +{ + "title": "Release 2.0.4", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 333 improvements and bug fixes have been made in Doris 2.0.4 version. 
+ +**Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- More reasonable and accurate precision and scale inference for decimal data type + - [https://github.com/apache/doris/pull/28034](https://github.com/apache/doris/pull/28034) + +- Support drop policy for user or role + - [https://github.com/apache/doris/pull/29488](https://github.com/apache/doris/pull/29488) + +## New features + +- Support datev1, datetimev1 and decimalv2 datatypes in new optimizer Nereids. +- Support ODBC table for new optimizer Nereids. +- Add `lower_case` and `ignore_above` option for inverted index +- Support `match_regexp` and `match_phrase_prefix` optimization by inverted index +- Support paimon native reader in datalake +- Support audit-log for `insert into` SQL +- Support reading parquet file in lzo compressed format + +## Improvement and optimizations + +- Improve storage management including balance, migration, publish and others. +- Improve storage cooldown policy to save disk space. +- Performance optimization for substr with ascii string. +- Improve partition prune when date function is used. +- Improve auto analyze visibility and performance. 
+ +See the complete list of improvements and bug fixes on github [dev/2.0.4-merged](https://github.com/apache/doris/issues?q=label%3Adev%2F2.0.4-merged+is%3Aclosed) + + + +## Credits +Last but not least, this release would not have been possible without the following contributors: + +airborne12, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, bobhan1, ByteYue, caiconghui,CalvinKirs, cambyzju, caoliang-web, catpineapple, csun5285, dataroaring, deardeng, dutyu, eldenmoon, englefly, feifeifeimoon, fornaix, Gabriel39, gnehil, HappenLee, hello-stephen, HHoflittlefish777,hubgeter, hust-hhb, ixzc, jacktengg, jackwener, Jibing-Li, kaka11chen, KassieZ, LemonLiTree,liaoxin01, LiBinfeng-01, lihuigang, liugddx, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, Nitin-Kashyap, platoneko, py023, qidaye, shuke987, starocean999, SWJTU-ZhangLei, w41ter, wangbo, wsjz, wuwenchi, Xiaoccer, xiaokang, XieJiann, xingyingone, xinyiZzz, xuwei0912, xy720, xzj7019, yujun777, zclllyybb, zddr, zhangguoqiang666, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md new file mode 100644 index 0000000000000..20d6bd9302b2c --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.5.md @@ -0,0 +1,73 @@ +--- +{ + "title": "Release 2.0.5", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 217 improvements and bug fixes have been made in Doris 2.0.5 version. 
+ +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- Change char function behaviour: `select char(0) = '\0'` returns true, as in MySQL + - https://github.com/apache/doris/pull/30034 +- Allow exporting empty data + - https://github.com/apache/doris/pull/30703 + +## New features +- Eliminate left outer join with `is null` condition +- Add `show-tablets-belong` stmt for analyzing a batch of tablet-ids +- InferPredicates support In, such as `a = b & a in [1, 2] -> b in [1, 2]` +- Optimize plan when column stats are unavailable +- Optimize plan using rollup column stats +- Support analyze materialized view +- Support ShowProcessStmt Show all FE connection + +## Improvement and optimizations +- Optimize query plan when column stats are unavailable +- Optimize query plan using rollup column stats +- Stop analyze quickly after user close auto analyze +- Catch load column stats exception to avoid printing too much stack info to fe.out +- Select materialized view by specifying the view name in SQL +- Change auto analyze max table width default value to 100 +- Escape characters for columns in recovery predicate pushdown in JDBC Catalog +- Fix JDBC MYSQL Catalog `to_date` function pushdown +- Optimize the close logic of JDBC client +- Optimize JDBC connection pool parameter settings +- Obtain hudi partition information through HMS's API +- Optimize routine load job error msg and memory +- Skip all backup/restore jobs if max allowed option is set to 0 + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.4-rc06...2.0.5-rc02). 
+ + +## Credits +Thanks all who contribute to this release: + +airborne12, alexxing662, amorynan, AshinGau, BePPPower, bingquanzhao, BiteTheDDDDt, ByteYue, caiconghui, cambyzju, catpineapple, dataroaring, eldenmoon, Emor-nj, englefly, felixwluo, GoGoWen, HappenLee, hello-stephen, HHoflittlefish777, HowardQin, JackDrogon, jacktengg, jackwener, Jibing-Li, KassieZ, LemonLiTree, liaoxin01, liugddx, LuGuangming, morningman, morrySnow, mrhhsg, Mryange, mymeiyi, nextdreamblue, qidaye, ryanzryu, seawinde,starocean999, TangSiyang2001, vinlee19, w41ter, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xingyingone, xy720,xzj7019, yujun777, zclllyybb, zhangstar333, zhannngchen, zhiqiang-hhhh, zxealous, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md new file mode 100644 index 0000000000000..9591ed8d3fab8 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.6.md @@ -0,0 +1,59 @@ +--- +{ + "title": "Release 2.0.6", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 114 improvements and bug fixes have been created by 51 contributors in Doris 2.0.6 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Behavior change +- N/A + +## New features +- Support match a function with alias in materialized-view +- Add a command to drop a tablet replica safely on backend +- Add row count cache for external table. 
+- Support analyze rollup to gather statistics for optimizer + +## Improvement and optimizations +- Improve tablet schema cache memory by using deterministic way to serialize protobuf +- Improve show column stats performance +- Support estimate row count for iceberg and paimon +- Support sqlserver timestamp type read for JDBC catalog + + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.5-rc02...2.0.6). + + +## Credits +Thanks all who contribute to this release: + +924060929, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, deardeng, DongLiang-0, eldenmoon, englefly, feelshana, feiniaofeiafei, felixwluo, HappenLee, hust-hhb, iwanttobepowerful, ixzc, JackDrogon, Jibing-Li, KassieZ, larshelge, liaoxin01, LiBinfeng-01, liutang123, luennng, morningman, morrySnow, mrhhsg, qidaye, starocean999, TangSiyang2001, wangbo, wsjz, wuwenchi, xiaokang, XieJiann, xuwei0912, xy720, xzj7019, yiguolei, yujun777, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhannngchen, zhiqiang-hhhh, zy-kkk, zzzxl1993 + diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md new file mode 100644 index 0000000000000..10f226dbd63b4 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.7.md @@ -0,0 +1,84 @@ +--- +{ + "title": "Release 2.0.7", + "language": "en" +} +--- + + + + + +Thanks to our community users and developers, about 80 improvements and bug fixes have been made in Doris 2.0.7 version. + +**Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +**GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +- `round` function defaults to rounding normally as MySQL, eg. round(5/2) return 3 instead of 2. 
+ + - https://github.com/apache/doris/pull/31583 + +- `round` datetime with scale from string literal as MySQL, eg. round '2023-10-12 14:31:49.666' to '2023-10-12 14:31:50' . + + - https://github.com/apache/doris/pull/27965 + + +## 2 New features +- Support make miss slot as null alias when converting outer join to anti join to speed up query + + - https://github.com/apache/doris/pull/31854 + +- Enable proxy protocol to support IP transparency for Nginx and HAProxy. + + - https://github.com/apache/doris/pull/32338 + + +## 3 Improvement and optimizations + +- Add DEFAULT_ENCRYPTION column in `information_schema` table and add `processlist` table for better compatibility for BI tools + +- Automatically test connectivity by default when creating a JDBC Catalog. + +- Enhance auto resume to keep routine load stable + +- Use lowercase by default for Chinese tokenizer in inverted index + +- Add error msg if exceeded maximum default value in repeat function + +- Skip hidden file and dir in Hive table + +- Reduce file meta cache size and disable cache for some cases to avoid OOM + +- Reduce jvm heap memory consumed by profiles of BrokerLoadJob + +- Remove sort which is under table sink to speed up query like `INSERT INTO t1 SELECT * FROM t2 ORDER BY k`. + +See the complete list of improvements and bug fixes on [github](https://github.com/apache/doris/compare/2.0.6...2.0.7) . 
+ + +## 4 Credits + +Thanks all who contribute to this release: + +924060929,airborne12,amorynan,ByteYue,dataroaring,deardeng,feiniaofeiafei,felixwluo,freemandealer,gavinchou,hello-stephen,HHoflittlefish777,jacktengg,jackwener,jeffreys-cat,Jibing-Li,KassieZ,LiBinfeng-01,luwei16,morningman,mrhhsg,Mryange,nextdreamblue,platoneko,qidaye,rohitrs1983,seawinde,shuke987,starocean999,SWJTU-ZhangLei,w41ter,wsjz,wuwenchi,xiaokang,XieJiann,XuJianxu,yujun777,Yulei-Yang,zhangstar333,zhiqiang-hhhh,zy-kkk,zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md new file mode 100644 index 0000000000000..d881a80628b44 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.8.md @@ -0,0 +1,76 @@ +--- +{ + "title": "Release 2.0.8", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 65 improvements and bug fixes have been made in Doris 2.0.8 version. + +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + + +## 1 Behavior change + +The `ADMIN SHOW` statements cannot be executed with newer versions of the MySQL 8.x JDBC driver, so these statements have been renamed to remove the `ADMIN` keyword. 
+ +- https://github.com/apache/doris/pull/29492 + +```sql +ADMIN SHOW CONFIG -> SHOW CONFIG +ADMIN SHOW REPLICA -> SHOW REPLICA +ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS +ADMIN SHOW TABLET -> SHOW TABLET +``` + + +## 2 New features + +N/A + + + +## 3 Improvement and optimizations + +- Make Inverted Index work with TopN opt in Nereids + +- Limit the max string length to 1024 while collecting column stats to control BE memory usage + +- JDBC Catalog close when JDBC client is not empty + +- Accept all Iceberg database and do not check the name format of database + +- Refresh external table's rowcount async to avoid cache miss and unstable query plan + +- Simplify the isSplitable method of hive external table to avoid too many hadoop metrics + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.7...2.0.8) . + +## 4 Credits + +Thanks all who contribute to this release: + +924060929, AcKing-Sam, amorynan, AshinGau, BePPPower, BiteTheDDDDt, ByteYue, cambyzju, dongsilun, eldenmoon, feiniaofeiafei, gnehil, Jibing-Li, liaoxin01, luwei16, morningman, morrySnow, mrhhsg, Mryange, nextdreamblue, platoneko, starocean999, SWJTU-ZhangLei, wuwenchi, xiaokang, xinyiZzz, Yukang-Lian, Yulei-Yang, zclllyybb, zddr, zhangstar333, zhiqiang-hhhh, ziyanTOP, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md new file mode 100644 index 0000000000000..04048fc060461 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.0/release-2.0.9.md @@ -0,0 +1,75 @@ +--- +{ + "title": "Release 2.0.9", + "language": "en" +} +--- + + + + +Thanks to our community users and developers, about 68 improvements and bug fixes have been made in Doris 2.0.9 version. 
+ +- **Quick Download** : [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub** : [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## 1 Behavior change + +NA + +## 2 New features + +- Support predicates that appear on both key and value mv columns + +- Support mv with `bitmap_union(bitmap_from_array())` + +- Add a FE config to force replicate allocation for OLAP tables in the cluster + +- Support timezone in date literals in new optimizer Nereids + +- Support slop in fulltext search `match_phrase` to specify word distance + +- Show index id in `SHOW PROC INDEXES` + +## 3 Improvement and optimizations + +- Add a secondary argument in `first_value` / `last_value` to ignore NULL values + +- Allow 0 as the offset parameter in `LEAD`/ `LAG` functions + +- Adjust priority of materialized view match rule + +- TopN opt reads only limit number of records for better performance + +- Add profile for delete_bitmap get_agg function + +- Refine the Meta cache to get better performance + +- Add FE config `autobucket_max_buckets` + +See the complete list of improvements and bug fixes on [GitHub](https://github.com/apache/doris/compare/2.0.8...2.0.9) . 
+ +## Big Thanks + +Thanks all who contribute to this release: + +adonis0147, airborne12, amorynan, AshinGau, BePPPower, BiteTheDDDDt, CalvinKirs, cambyzju, csun5285, eldenmoon, englefly, feiniaofeiafei, HHoflittlefish777, htyoung, hust-hhb, jackwener, Jibing-Li, kaijchen, kylinmac, liaoxin01, luwei16, morningman, mrhhsg, qidaye, starocean999, SWJTU-ZhangLei, w41ter, xiaokang, xiedeyantu, xy720, zclllyybb, zhangstar333, zhannngchen, zy-kkk, zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md new file mode 100644 index 0000000000000..b0b88f715ee51 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.0.md @@ -0,0 +1,159 @@ +--- +{ + "title": "Release 2.1.0", + "language": "en" +} +--- + + + +Dear community, we are pleased to share with you the official release of Apache Doris 2.1.0, now available for download and use as of March 8th. This latest version marks a significant milestone in our journey towards enhancing data analysis capabilities, particularly for handling massive and complex datasets. + +With Doris 2.1.0, our primary focus has been on optimizing analysis performance, and the results speak for themselves. We have achieved an impressive performance improvement of over 100% on the TPC-DS 1TB test dataset, making Apache Doris more capable of challenging real-world business scenarios. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + + +## Performance improvement + +### Smarter optimizer + +On the basis of V2.0, the query optimizer in Doris V2.1 comes with enhanced statistics-based inference and enumeration framework. 
We have upgraded the cost model and expanded the optimization rules to serve the needs of more use cases + +### Better heuristic optimization + +For data analytics at scale or data lake scenarios, Doris V2.1 provides better heuristic query plans. Meanwhile, the RuntimeFilter is more self-adaptive to enable higher performance even without statistical information. + +### Parallel adaptive scan + +Doris V2.1 has adopted parallel adaptive scan to optimize scan I/O and thus improve query performance. It can avoid the negative impact of unreasonable numbers of buckets. (This feature is currently available on the Duplicate Key model and Merge-on-Write Unique Key model.) + +### Local shuffle + +We have introduced Local Shuffle to prevent uneven data distribution. Benchmark tests show that Local Shuffle in combination with Parallel Adaptive Scan can guarantee fast query performance in spite of unreasonable bucket number settings upon table creation. + +### Faster INSERT INTO SELECT + +To further improve the performance of INSERT INTO SELECT, which is a frequent operation in ETL, we have moved forward the MemTable execution-wise to reduce data ingestion overheads. Tests show that this can double the data ingestion speed in most cases compared to V2.0. +Improved data lake analytics capabilities + +## Data lake analytic performance + +### TPC-DS Benchmark + +According to TPC-DS benchmark tests (1TB) of Doris V2.1 against Trino, + +- Without caching, the total execution time of Doris is 56% of that of Trino V435. (717s VS 1296s) +- Enabling file cache can further increase the overall performance of Doris by 2.2 times. (323s) + This is achieved by a series of optimizations in I/O, parquet/ORC file reading, predicate pushdown, caching, and scan task scheduling, etc. + +### SQL dialects compatibility + +To facilitate migration to Doris and increase its compatibility with other DBMS, we have enabled SQL dialect conversion in V2.1. 
([read more](../../lakehouse/sql-dialect)) For example, by setting `sql_dialect = "trino"` in Doris, you can use the Trino SQL dialect as you're used to, without modifying your current business logic, and Doris will execute the corresponding queries for you. Tests in user production environments show that Doris V2.1 is compatible with 99% of Trino SQL. + +### Arrow Flight SQL protocol + +As a column-oriented database compatible with MySQL 8.0 protocol, Doris V2.1 now supports the Arrow Flight SQL protocol as well so users can have fast access to Doris data via Pandas/Numpy without data serialization and deserialization. For most common data types, the Arrow Flight protocol enables tens of times faster performance than the MySQL protocol. + +## Asynchronous materialized view + +V2.1 allows creating a materialized view based on multiple tables. This feature currently supports: + +- Transparent rewriting: supports transparent rewriting of common operators including Select, Where, Join, Group By, and Aggregation. +- Auto refresh: supports regular refresh, manual refresh, full refresh, incremental refresh, and partition-based refresh. +- Materialized view of external tables: supports materialized views based on external data tables such as those on Hive, Hudi, and Iceberg; supports synchronizing data from data lakes into Doris internal tables via materialized views. +- Direct query on materialized views: Materialized views can be regarded as the result set after ETL. In this sense, materialized views are data tables, so users can conduct queries on them directly. + +## Enhanced storage + +### Auto-increment column + +V2.1 supports auto-increment columns, which can ensure data uniqueness of each row. This lays the foundation for efficient dictionary encoding and query pagination. For example, for precise UV calculation and customer grouping, users often apply the bitmap type in Doris, the process of which entails dictionary encoding. 
With V2.1, users can first create a dictionary table using the auto-increment column, and then simply load user data into it. + +### Auto partition + +To further reduce the burden on operation and maintenance, V2.1 allows auto data partitioning. Upon data ingestion, it detects whether a partition exists for the data based on the partitioning column. If not, it automatically creates one and starts data ingestion. + +### High-concurrency real-time data ingestion + +For data writing, a back pressure mechanism is in place to avoid excessive data versions, so as to reduce resource consumption by data version merging. In addition, V2.1 supports group commit ([read more](../../data-operate/import/import-way/group-commit-manual)), which accumulates multiple writes and commits them as one. Benchmark tests on group commit with JDBC ingestion and the Stream Load method present great results. + +## Semi-structured data analysis + +### A new data type: Variant + +V2.1 supports a new data type named Variant. It can accommodate semi-structured data such as JSON as well as compound data types that contain integers, strings, booleans, etc. Users don't have to pre-define the exact data types for a Variant column in the table schema. The Variant type is handy when processing nested data structures. +You can include Variant columns and static columns with pre-defined data types in the same table. This will provide you with more flexibility in storage and queries. +Tests with ClickBench datasets prove that data in Variant columns takes up the same storage space as data in static columns, which is half of that in JSON format. In terms of query performance, the Variant type enables 8 times higher query speed than JSON in hot runs and even more in cold runs. + +### IP types + +Doris V2.1 provides native support for IPv4 and IPv6. It stores IP data in binary format, which cuts down storage space usage by 60% compared to IP strings in plain text. 
Along with these IP types, we have added over 20 functions for IP data processing. + +### More powerful functions for compound data types + +- explode_map: supports exploding rows into columns for the Map data type. +- Supports the STRUCT data type in the IN predicates + +## Workload Management + +### Hard isolation of resources + +On the basis of the Workload Group mechanism, which imposes a soft limit on the resources that a workload group can use, Doris 2.1 introduces a hard limit on CPU resource consumption for workload groups as a way to ensure higher stability in query performance. + +### TopSQL + +V2.1 allows users to check the most resource-consuming SQL queries in the runtime. This can be a big help when handling cluster load spike caused by unexpected large queries. + + +## Others + +### Decimal 256 + +For users in the financial sector or high-end manufacturing, V2.1 supports a high-precision data type: Decimal, which supports up to 76 significant digits (an experimental feature, please set enable_decimal256=true.) + +### Job scheduler + +V2.1 provides a good option for regular task scheduling: Doris Job Scheduler. It can trigger the pre-defined operations on schedule or at fixed intervals. The Doris Job Scheduler is accurate to the second. It provides consistency guarantee for data writing, high efficiency and flexibility, high-performance processing queues, retraceable scheduling records, and high availability of jobs. + +### Support Docker fast start to experience the new version + +Starting from version 2.1.0, we will provide a separate Docker Image to support the rapid creation of a 1FE, 1BE Docker container to experience the new version of Doris. The container will complete the initialization of FE and BE, BE registration and other steps by default. 
After the container is created, you can access and use the Doris cluster in about 1 minute. In this image version, the default `max_map_count`, `ulimit`, `Swap` and other hard limits are removed. It supports X64 (avx2) machines and ARM machines for deployment. The default open ports are 8000, 8030, 8040, 9030. If you want to try the Broker component, you can add the environment variable `--env BROKER=true` at startup to start the Broker process synchronously. After startup, it will automatically complete the registration. The Broker name is `test`. + +Please note that this version is only suitable for quick experience and functional testing, not for production environments! + +## Behavior changed + +- The default data model is the Merge-on-Write Unique Key model. enable_unique_key_merge_on_write will be included as a default setting when a table is created in the Unique Key model. +- As inverted index has proven to be more performant than bitmap index, V2.1 stops supporting bitmap index. Existing bitmap indexes will remain effective but new creation is not allowed. We will remove bitmap index-related code in the future. +- cpu_resource_limit is no longer supported. It was used to limit the number of scanner threads on Doris BE. Since the workload group mechanism also supports such settings, the already configured cpu_resource_limit will be invalid. +- The default value of enable_segcompaction is true. This means Doris supports compaction of multiple segments in the same rowset. +- Audit log plug-in + - Since V2.1.0, Doris has a built-in audit log plug-in. Users can simply enable or disable it by setting the enable_audit_plugin parameter. + - If you have already installed your own audit log plug-in, you can either continue using it after upgrading to Doris V2.1, or uninstall it and use the one in Doris. Please note that the audit log table will be relocated after switching plug-ins.
+ - For more details, please see the [docs](../../admin-manual/audit-plugin). + + +## Credits +Thanks all who contribute to this release: + +467887319, 924060929, acnot, airborne12, AKIRA, alan_rodriguez, AlexYue, allenhooo, amory, amory, AshinGau, beat4ocean, BePPPower, bigben0204, bingquanzhao, BirdAmosBird, BiteTheDDDDt, bobhan1, caiconghui, camby, camby, CanGuan, caoliang-web, catpineapple, Centurybbx, chen, ChengDaqi2023, ChenyangSunChenyang, Chester, ChinaYiGuan, ChouGavinChou, chunping, colagy, CSTGluigi, czzmmc, daidai, dalong, dataroaring, DeadlineFen, DeadlineFen, deadlinefen, deardeng, didiaode18, DongLiang-0, dong-shuai, Doris-Extras, Dragonliu2018, DrogonJackDrogon, DuanXujianDuan, DuRipeng, dutyu, echo-dundun, ElvinWei, englefly, Euporia, feelshana, feifeifeimoon, feiniaofeiafei, felixwluo, figurant, flynn, fornaix, FreeOnePlus, Gabriel39, gitccl, gnehil, GoGoWen, gohalo, guardcrystal, hammer, HappenLee, HB, hechao, HelgeLarsHelge, herry2038, HeZhangJianHe, HHoflittlefish777, HonestManXin, hongkun-Shao, HowardQin, hqx871, httpshirley, htyoung, huanghaibin, HuJerryHu, HuZhiyuHu, Hyman-zhao, i78086, irenesrl, ixzc, jacktengg, jacktengg, jackwener, jayhua, Jeffrey, jiafeng.zhang, Jibing-Li, JingDas, julic20s, kaijchen, kaka11chen, KassieZ, kindred77, KirsCalvinKirs, KirsCalvinKirs, kkop, koarz, LemonLiTree, LHG41278, liaoxin01, LiBinfeng-01, LiChuangLi, LiDongyangLi, Lightman, lihangyu, lihuigang, LingAdonisLing, liugddx, LiuGuangdongLiu, LiuHongLiu, liuJiwenliu, LiuLijiaLiu, lsy3993, LuGuangmingLu, LuoMetaLuo, luozenglin, Luwei, Luzhijing, lxliyou001, Ma1oneZhang, mch_ucchi, Miaohongkai, morningman, morrySnow, Mryange, mymeiyi, nanfeng, nanfeng, Nitin-Kashyap, PaiVallishPai, Petrichor, plat1ko, py023, q763562998, qidaye, QiHouliangQi, ranxiang327, realize096, rohitrs1983, sdhzwc, seawinde, seuhezhiqiang, seuhezhiqiang, shee, shuke987, shysnow, songguangfan, Stalary, starocean999, SunChenyangSun, sunny, SWJTU-ZhangLei, TangSiyang2001, Tanya-W, taoxutao, 
Uniqueyou, vhwzIs, walter, walter, wangbo, Wanghuan, wangqt, wangtao, wangtianyi2004, wenluowen, whuxingying, wsjz, wudi, wudongliang, wuwenchihdu, wyx123654, xiangran0327, Xiaocc, XiaoChangmingXiao, xiaokang, XieJiann, Xinxing, xiongjx, xuefengze, xueweizhang, XueYuhai, XuJianxu, xuke-hat, xy, xy720, xyfsjq, xzj7019, yagagagaga, yangshijie, YangYAN, yiguolei, yiguolei, yimeng, YinShaowenYin, Yoko, yongjinhou, ytwp, yuanyuan8983, yujian, yujun777, Yukang-Lian, Yulei-Yang, yuxuan-luo, zclllyybb, ZenoYang, zfr95, zgxme, zhangdong, zhangguoqiang, zhangstar333, zhangstar333, zhangy5, ZhangYu0123, zhannngchen, ZhaoLongZhao, zhaoshuo, zhengyu, zhiqqqq, ZhongJinHacker, ZhuArmandoZhu, zlw5307, ZouXinyiZou, zxealous, zy-kkk, zzwwhh, zzzxl1993, zzzzzzzs diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md new file mode 100644 index 0000000000000..384bccdceb414 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.1.md @@ -0,0 +1,251 @@ +--- +{ + "title": "Release 2.1.1", + "language": "en" +} +--- + + + +Dear community members, Apache Doris 2.1.1 was officially released on April 3, 2024, with several enhancements and bug fixes based on 2.1.0, enabling a smoother user experience. + +- **Quick Download:** [https://doris.apache.org/download/](https://doris.apache.org/download/) + +- **GitHub:** [https://github.com/apache/doris/releases](https://github.com/apache/doris/releases) + +## Behavior Changed + +1. Change the float type output format to improve float type serialization performance. + +- https://github.com/apache/doris/pull/32049 + +2. Change the table-valued functions active_queries() and workload_groups() to system tables. + +- https://github.com/apache/doris/pull/32314 + +3. Disable the show query/load profile statement because few developers use it and the pipeline and pipelinex engines do not support it.
+ +- https://github.com/apache/doris/pull/32467 + +4. Upgrade the Arrow Flight version to 15.0.2 to fix some bugs. Please use ADBC version 15.0.2 to access Doris. + +- https://github.com/apache/doris/pull/32827 + + + +## Upgrade Problem + +1. BE will core during a rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32672 + +- https://github.com/apache/doris/pull/32444 + +- https://github.com/apache/doris/pull/32162 + +2. JDBC Catalog will have query errors during a rolling upgrade from 2.0.x to 2.1.x. + +- https://github.com/apache/doris/pull/32618 + + + +## New Feature + +1. Enable column auth by default. + +- https://github.com/apache/doris/pull/32659 + + +2. Get the correct number of cores for the pipeline and pipelinex engines when running within Docker or K8s. + +- https://github.com/apache/doris/pull/32370 + +3. Support reading the Parquet int96 type. + +- https://github.com/apache/doris/pull/32394 + +4. Enable proxy protocol to support IP transparency. With this protocol, Doris can still obtain the client's real IP behind a load balancer and implement permission control such as whitelisting. + +- https://github.com/apache/doris/pull/32338/files + +5. Add workload group queue related columns to the active_queries system table. Users can use this system table to monitor workload queue usage. + +- https://github.com/apache/doris/pull/32259 + +6. Add a new system table backend_active_tasks to monitor real-time query statistics on every BE. + +- https://github.com/apache/doris/pull/31945 + +7. Add ipv4 and ipv6 support for spark-doris connector. + +- https://github.com/apache/doris/pull/32240 + +8. Add inverted index support for CCR. + +- https://github.com/apache/doris/pull/32101 + +9. Support select experimental session variable. + +- https://github.com/apache/doris/pull/31837 + +10. Support materialized view with bitmap_union(bitmap_from_array()) case. + +- https://github.com/apache/doris/pull/31962 + +11.
Support partition pruning for *HIVE_DEFAULT_PARTITION*. + +- https://github.com/apache/doris/pull/31736 + +12. Support functions in the set variable statement. + +- https://github.com/apache/doris/pull/32492 + +13. Support arrow serialization for varint type. + +- https://github.com/apache/doris/pull/32809 + + + +## Optimization + +1. Automatically resume routine load when a BE restarts or during an upgrade, and keep the routine load stable. + +- https://github.com/apache/doris/pull/32239 + +2. Routine Load: optimize the algorithm that allocates tasks to BEs for load balancing. + +- https://github.com/apache/doris/pull/32021 + +3. Spark Load: update the Spark version used by Spark Load to resolve a CVE problem. + +- https://github.com/apache/doris/pull/30368 + +4. Skip cooldown if the tablet is dropped. + +- https://github.com/apache/doris/pull/32079 + +5. Support using workload group to manage routine load. + +- https://github.com/apache/doris/pull/31671 + +6. [MTMV] Improve the performance of query rewriting by materialized view. + +- https://github.com/apache/doris/pull/31886 + +7. Reduce JVM heap memory consumed by profiles of BrokerLoadJob. + +- https://github.com/apache/doris/pull/31985 +8. Improve high-QPS query performance by speeding up PartitionPrunner. + +- https://github.com/apache/doris/pull/31970 + +9. Reduce duplicated memory consumption for column name and column path for schema cache. + +- https://github.com/apache/doris/pull/31141 + +10. Support more join types for query rewriting by materialized view, such as INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, LEFT SEMI JOIN, RIGHT SEMI JOIN, LEFT ANTI JOIN, RIGHT ANTI JOIN. + +- https://github.com/apache/doris/pull/32909 + + + +## Bugfix + + +1. Do not push down topn-filter through right/full outer join if the first orderkey is nulls first. + +- https://github.com/apache/doris/pull/32633 + +2. Fix memory leak in Java UDF. + +- https://github.com/apache/doris/pull/32630 + +3.
Fix the issue where the resource is not retained when several ODBC tables use the same resource and only some of them are restored. +Also check some configurations for backup/restore. + +- https://github.com/apache/doris/pull/31989 + +4. Fold constant will core for variant type. + +- https://github.com/apache/doris/pull/32265 + +5. Routine load will pause when a transaction fails in some cases. + +- https://github.com/apache/doris/pull/32638 + +6. The result of left semi join with empty right side should be false instead of null. + +- https://github.com/apache/doris/pull/32477 + +7. Fix core when building an inverted index for a new column with no data. + +- https://github.com/apache/doris/pull/32669 + +8. Fix BE core caused by null-safe-equal join. + +- https://github.com/apache/doris/pull/32623 + +9. Partial update: fix data correctness risk when loading delete sign data into a table with a sequence column. + +- https://github.com/apache/doris/pull/32574 + +10. Select outfile: Fix the column type mapping in the orc/parquet file format. + +- https://github.com/apache/doris/pull/32281 + +11. Fix BE core during restore stage. + +- https://github.com/apache/doris/pull/32489 + +12. Using array_agg after other aggregate functions such as count or sum may make BE core. + +- https://github.com/apache/doris/pull/32387 + +13. Variant type should always be nullable; otherwise there will be some bugs. + +- https://github.com/apache/doris/pull/32248 + +14. Fix the bug of handling empty blocks in schema change. + +- https://github.com/apache/doris/pull/32396 + +15. Fix BE core when using json_length() in some cases. + +- https://github.com/apache/doris/pull/32145 + +16. Fix error when querying an Iceberg table using a date cast predicate. + +- https://github.com/apache/doris/pull/32194 + +17. Fix some bugs when building inverted indexes for variant type. + +- https://github.com/apache/doris/pull/31992 + +18. Wrong result of two or more map_agg functions in query. + +- https://github.com/apache/doris/pull/31928 + +19. Fix wrong result of money_format function.
+ +- https://github.com/apache/doris/pull/31883 + +20. Fix connection hang after too many connections. + +- https://github.com/apache/doris/pull/31594 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md new file mode 100644 index 0000000000000..6116bd9984632 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.2.md @@ -0,0 +1,110 @@ +--- +{ + "title": "Release 2.1.2", + "language": "en" +} +--- + + + +## Behavior Changed + +1. Set the default value of the `data_consistence` property of EXPORT to partition to make export more stable during load. + +- https://github.com/apache/doris/pull/32830 + +2. Some MySQL connectors (e.g., dotnet MySQL.Data) rely on a variable's column type to make a connection. + + For example, `select @@autocommit` should return column type BIGINT, not BIT; otherwise the connector throws an error. So the column type of `@@autocommit` is changed to BIGINT. + + - https://github.com/apache/doris/pull/33282 + + +## Upgrade Problem + +1. The normal workload group is not created when upgrading from 2.0 or other old versions. + + - https://github.com/apache/doris/pull/33197 + +## New Feature + + +1. Add a processlist table to the information_schema database. Users can use this table to query active connections. + + - https://github.com/apache/doris/pull/32511 + +2. Add a new table-valued function `LOCAL` to allow access to file systems such as shared storage. + + - https://github.com/apache/doris-website/pull/494 + + +## Optimization + +1. Skip some unnecessary steps to make graceful stop quicker in K8s environments. + + - https://github.com/apache/doris/pull/33212 + +2. Add the rollup table name to the profile to help diagnose MV selection problems. + + - https://github.com/apache/doris/pull/33137 + +3.
Add a connection test function for DB2 so that users can check the connection when creating a DB2 Catalog. + + - https://github.com/apache/doris/pull/33335 + +4. Add a DNS cache for FQDN to accelerate connections among BEs in K8s environments. + + - https://github.com/apache/doris/pull/32869 + +5. Refresh external tables' row counts asynchronously to make the query plan more stable. + + - https://github.com/apache/doris/pull/32997 + + +## Bugfix + + +1. Fix the issue where Iceberg Catalogs of the HMS and Hadoop types do not support Iceberg properties like "io.manifest.cache-enabled", which enables the manifest cache in Iceberg. + + - https://github.com/apache/doris/pull/33113 + +2. The offset parameter of the `LEAD`/`LAG` functions can now be 0. + + - https://github.com/apache/doris/pull/33174 + +3. Fix some timeout issues with load. + + - https://github.com/apache/doris/pull/33077 + + - https://github.com/apache/doris/pull/33260 + +4. Fix core problems related to the `ARRAY`/`MAP`/`STRUCT` compaction process. + + - https://github.com/apache/doris/pull/33130 + + - https://github.com/apache/doris/pull/33295 + +5. Fix runtime filter wait timeout. + + - https://github.com/apache/doris/pull/33369 + +6. Fix `unix_timestamp` core for string input in auto partition. + + - https://github.com/apache/doris/pull/32871 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md new file mode 100644 index 0000000000000..e88ec3e94fb6d --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.3.md @@ -0,0 +1,191 @@ +--- +{ + "title": "Release 2.1.3", + "language": "en" +} +--- + + + +Apache Doris 2.1.3 was officially released on May 21, 2024. This version brings several improvements, including writing data back to Hive, materialized views, permission management, and bug fixes. It further enhances the performance and stability of the system.
+ +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + + +## Feature Enhancements + +**1. Support writing back data to hive tables via Hive Catalog** + +Starting from version 2.1.3, Apache Doris supports DDL and DML operations on Hive. Users can directly create libraries and tables in Hive through Apache Doris and write data to Hive tables by executing `INSERT INTO` statements. This feature allows users to perform complete data query and write operations on Hive through Apache Doris, further simplifying the integrated lakehouse architecture. + +Please refer: [https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build/) + +**2. Support building new asynchronous materialized views on top of existing ones** + +Users can create new asynchronous materialized views on top of existing ones, directly reusing pre-computed intermediate results for data processing. This simplifies complex aggregation and computation operations, reducing resource consumption and maintenance costs while further accelerating query performance and improving data availability. [#32984](https://github.com/apache/doris/pull/32984) + +**3. Support rewriting through nested materialized views** + +Materialized View (MV) is a database object used to store query results. Now, Apache Doris supports rewriting through nested materialized views, which helps optimize query performance. [#33362](https://github.com/apache/doris/pull/33362) + +**4. New `SHOW VIEWS` statement** + +The `SHOW VIEWS` statement can be used to query views in the database, facilitating better management and understanding of view objects in the database. [#32358](https://github.com/apache/doris/pull/32358) + +**5. 
Workload Group supports binding to specific BE nodes** + +Workload Group can be bound to specific BE nodes, enabling more refined control over query execution to optimize resource usage and improve performance. [#32874](https://github.com/apache/doris/pull/32874) + +**6. Broker Load supports compressed JSON format** + +Broker Load now supports importing compressed JSON format data, significantly reducing bandwidth requirements for data transmission and accelerating data import performance. [#30809](https://github.com/apache/doris/pull/30809) + +**7. TRUNCATE Function can use columns as scale parameters** + +The TRUNCATE function can now accept columns as scale parameters, providing more flexibility when processing numerical data. [#32746](https://github.com/apache/doris/pull/32746) + +**8. Add new functions `uuid_to_int` and `int_to_uuid`** + +These two functions allow users to convert between UUID and integer, significantly helping in scenarios that require handling UUID data. [#33005](https://github.com/apache/doris/pull/33005) + +**9. Add `bypass_workload_group` session variable to bypass query queue** + +The `bypass_workload_group` session variable allows certain queries to bypass the Workload Group queue and execute directly, which is useful for handling critical queries that require quick responses. [#33101](https://github.com/apache/doris/pull/33101) + +**10. Add strcmp function** + +The strcmp function compares two strings and returns their comparison result, simplifying text data processing. [#33272](https://github.com/apache/doris/pull/33272) + +**11. Support HLL functions `hll_from_base64` and `hll_to_base64`** + +HyperLogLog (HLL) is an algorithm for cardinality estimation. These two functions allow users to decode HLL data from a Base64-encoded string or encode HLL data as a Base64 string, which is very useful for storing and transmitting HLL data. [#32089](https://github.com/apache/doris/pull/32089) + +## Optimization and Improvements + +**1. 
Replace SipHash with XXHash to improve shuffle performance** + +Both SipHash and XXHash are hashing functions, but XXHash may provide faster hashing speeds and better performance in certain scenarios. This optimization aims to improve performance during data shuffling by adopting XXHash. [#32919](https://github.com/apache/doris/pull/32919) + +**2. Asynchronous materialized views support NULL partition columns in OLAP tables** + +This enhancement allows asynchronous materialized views to support NULL partition columns in OLAP tables, enhancing data processing flexibility.[#32698](https://github.com/apache/doris/pull/32698) + +**3. Limit maximum string length to 1024 when collecting column statistics to control BE memory usage** + +Limiting the string length when collecting column statistics prevents excessive data from consuming too much BE memory, helping maintain system stability and performance. [#32470](https://github.com/apache/doris/pull/32470) + +**4. Support dynamic deletion of Bitmap cache to improve performance** + +Dynamically deleting no longer needed Bitmap Cache can free up memory and improve system performance. [#32991](https://github.com/apache/doris/pull/32991) + +**5. Reduce memory usage during ALTER operations** + +Reducing memory usage during ALTER operations improves the efficiency of system resource utilization. [#33474](https://github.com/apache/doris/pull/33474) + +**6. Support constant folding for complex types** + +Supports constant folding for Array/Map/Struct complex types.[#32867](https://github.com/apache/doris/pull/32867) + +**7. Add support for Variant type in Aggregate Key Model** + +The Variant data type can store multiple data types. This optimization allows aggregation operations on Variant type data, enhancing the flexibility of semi-structured data analysis. [#33493](https://github.com/apache/doris/pull/33493) + +**8. Support new inverted index format in CCR** [#33415](https://github.com/apache/doris/pull/33415) + +**9. 
Optimize rewriting performance for nested materialized views** [#34127](https://github.com/apache/doris/pull/34127) + +**10. Support decimal256 type in row-based storage format** + +Supporting the decimal256 type in row-based storage extends the system's ability to handle high-precision numerical data. [#34887](https://github.com/apache/doris/pull/34887) + +## Behavioral Changes + +**1. Authorization** + +- **Grant_priv permission changes**: `Grant_priv` can no longer be arbitrarily granted. When performing a `GRANT` operation, the user not only needs to have `Grant_priv` but also the permissions to be granted. For example, to grant `SELECT` permission on `table1`, the user needs both `GRANT` permission and `SELECT` permission on `table1`, enhancing security and consistency in permission management. [#32825](https://github.com/apache/doris/pull/32825) + +- **Workload group and resource usage_priv**: `Usage_priv` for Workload Group and Resource is no longer global but limited to Resource and Workload Group, making permission granting and usage more specific. [#32907](https://github.com/apache/doris/pull/32907) + +- **Authorization for operations**: Operations that were previously unauthorized now have corresponding authorizations for more detailed and comprehensive operational permission control. [#33347](https://github.com/apache/doris/pull/33347) + +**2. LOG directory configuration** + +The log directory configuration for FE and BE now uniformly uses the `LOG_DIR` environment variable. All other different types of logs will be stored with `LOG_DIR` as the root directory. To maintain compatibility between versions, the previous configuration item `sys_log_dir` can still be used. [#32933](https://github.com/apache/doris/pull/32933) + +**3. S3 Table Function (TVF)** + +Due to issues with correctly recognizing or processing S3 URLs in certain cases, the parsing logic for object storage paths has been refactored. 
For file paths in S3 table functions, the `force_parsing_by_standard_uri` parameter needs to be passed to ensure correct parsing. [#33858](https://github.com/apache/doris/pull/33858) + +## Upgrade Issues + +Since many users use certain keywords as column names or attribute values, the following keywords have been set as non-reserved, allowing users to use them as identifiers. [#34613](https://github.com/apache/doris/pull/34613) + +## Bug Fixes + +**1. Fix no data error when reading Hive tables on Tencent Cloud COSN** + +Resolved the no data error that could occur when reading Hive tables on Tencent Cloud COSN, enhancing compatibility with Tencent Cloud storage services. + +**2. Fix incorrect results returned by `milliseconds_diff` function** + +Fixed an issue where the `milliseconds_diff` function returned incorrect results in some cases, ensuring the accuracy of time difference calculations. [#32897](https://github.com/apache/doris/pull/32897) + +**3. User-defined variables should be forwarded to the Master node** + +Ensured that user-defined variables are correctly passed to the Master node for consistency and correct execution logic across the entire system. [#33013](https://github.com/apache/doris/pull/33013) + +**4. Fix Schema Change issues when adding complex type columns** + +Resolved Schema Change issues that could arise when adding complex type columns, ensuring the correctness of Schema Changes. [#31824](https://github.com/apache/doris/pull/31824) + +**5. Fix data loss issue in Routine Load when FE Master node changes** + +`Routine Load` is often used to subscribe to Kafka message queues. This fix addresses potential data loss issues that may occur during FE Master node changes. [#33678](https://github.com/apache/doris/pull/33678) + +**6. Fix Routine Load failure when Workload Group cannot be found** + +Resolved an issue where `Routine Load` would fail if the specified Workload Group could not be found.
[#33596](https://github.com/apache/doris/pull/33596) + +**7. Support column string64 to avoid join failures when string size overflows uint32** + +In some cases, string sizes may exceed the uint32 limit. Supporting the `string64` type ensures correct execution of string JOIN operations. [#33850](https://github.com/apache/doris/pull/33850) + +**8. Allow Hadoop users to create Paimon Catalog** + +Permitted authorized Hadoop users to create Paimon Catalogs. [#33833](https://github.com/apache/doris/pull/33833) + +**9. Fix `function_ipxx_cidr` function issues with constant parameters** + +Resolved problems with the `function_ipxx_cidr` function when handling constant parameters, ensuring the correctness of function execution. [#33968](https://github.com/apache/doris/pull/33968) + +**10. Fix file download errors when restoring using HDFS** + +Resolved "failed to download" errors encountered during data restoration using HDFS, ensuring the accuracy and reliability of data recovery. [#33303](https://github.com/apache/doris/issues/33303) + +**11. Fix column permission issues related to hidden columns** + +In some cases, permission settings for hidden columns may be incorrect. This fix ensures the correctness and security of column permission settings. [#34849](https://github.com/apache/doris/pull/34849) + +**12.
Fix issue where Arrow Flight cannot obtain the correct IP in K8s deployments** + +This fix resolves an issue where Arrow Flight cannot correctly obtain the IP address in Kubernetes deployment environments. [#34850](https://github.com/apache/doris/pull/34850) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md new file mode 100644 index 0000000000000..521694ffa60fa --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.4.md @@ -0,0 +1,289 @@ +--- +{ + "title": "Release 2.1.4", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.4 was officially released on June 26, 2024.** In this update, we have optimized various functional experiences for data lakehouse scenarios, with a focus on resolving the abnormal memory usage issue in the previous version. Additionally, we have implemented several improvements and bug fixes to enhance stability. You are welcome to download and use it. + + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + + +## Behavior changes + +- Non-existent files will be ignored when querying external tables such as Hive. [#35319](https://github.com/apache/doris/pull/35319) + + The file list is obtained from the meta cache, and it may not be consistent with the actual file list. + + Ignoring non-existent files helps to avoid query errors. + +- By default, creating a Bitmap Index will no longer be automatically changed to an Inverted Index. [#35521](https://github.com/apache/doris/pull/35521) + + This behavior is controlled by the FE configuration item `enable_create_bitmap_index_as_inverted_index`, which defaults to false. + +- When starting FE and BE processes using `--console`, all logs will be output to the standard output and differentiated by prefixes indicating the log type.
[#35679](https://github.com/apache/doris/pull/35679) + + For more information, please see the documentation: + + - [Log Management - FE Log](../admin-manual/log-management/fe-log.md) + + - [Log Management - BE Log](../admin-manual/log-management/be-log.md) + +- If no table comment is provided when creating a table, the default comment will be empty instead of using the table type as the default comment. [#36025](https://github.com/apache/doris/pull/36025) + +- The default precision of DECIMALV3 has been adjusted from (9, 0) to (38, 9) to maintain compatibility with the version in which this feature was initially released. [#36316](https://github.com/apache/doris/pull/36316) + +## New features + +### Query optimizer + +- Support FE flame graph tool + + For more information, see the [documentation](/community/developer-guide/fe-profiler.md) + +- Support `SELECT DISTINCT` to be used with aggregation. + +- Support single table query rewrite without `GROUP BY`. This is useful for complex filters or expressions. [#35242](https://github.com/apache/doris/pull/35242). + +- The new optimizer fully supports point query functionality [#36205](https://github.com/apache/doris/pull/36205). + +### Data Lakehouse + +- Support native reader of Apache Paimon deletion vector [#35241](https://github.com/apache/doris/pull/35241) + +- Support using Resource in Table Valued Functions [#35139](https://github.com/apache/doris/pull/35139) + +- Access controller with Hive Ranger plugin supports Data Mask + +### Asynchronous materialized views + +- Support refreshes triggered by internal table updates: if a materialized view uses an internal table and the data in that table changes, a refresh of the materialized view can be triggered. To enable this, specify REFRESH ON COMMIT when creating the materialized view. + +- Support transparent rewriting for single tables. For more information, see [Querying Async Materialized View](../query/view-materialized-view/query-async-materialized-view.md).
+ +- Transparent rewriting supports aggregation roll-up for the agg_state and agg_union types. Materialized views can be defined as agg_state or agg_union, and queries can use specific aggregation functions or agg_merge. For more information, see [AGG_STATE](../sql-manual/sql-types/Data-Types/AGG_STATE.md). + +### Others + +- Added function `replace_empty`. + + For more information, see [documentation](../sql-manual/sql-functions/string-functions/replace_empty). + +- Support `show storage policy using` statement. + + For more information, see [documentation](../sql-manual/sql-statements/Show-Statements/SHOW-STORAGE-POLICY-USING.md). + +- Support JVM metrics on the BE side. + + Set `enable_jvm_monitor=true` in `be.conf` to enable this feature. + +## Improvements + +- Supported creating inverted indexes for columns with Chinese names. [#36321](https://github.com/apache/doris/pull/36321) + +- Estimate memory consumed by segment cache more accurately so that unused memory can be released more quickly. [#35751](https://github.com/apache/doris/pull/35751) + +- Filter empty partitions before exporting tables to remote storage. [#35542](https://github.com/apache/doris/pull/35542) + +- Optimize routine load task allocation algorithm to balance the load among Backends. [#34778](https://github.com/apache/doris/pull/34778) + +- Provide hints when a related variable is not found during a set operation. [#35775](https://github.com/apache/doris/pull/35775) + +- Support placing Java UDF jar files in the FE's `custom_lib` directory for default loading. [#35984](https://github.com/apache/doris/pull/35984) + +- Add a timeout global variable `audit_plugin_load_timeout` for audit log load jobs. + +- Optimize the performance of transparent rewrite planning for asynchronous materialized views. + +- Optimize the `INSERT` operation so that the BE does not execute it when the source is empty.
[#34418](https://github.com/apache/doris/pull/34418) + +- Support fetching file lists of Hive/Hudi tables in batches. In a scenario with 1.2 million files, the time taken to obtain the list of files has been reduced from 390 seconds to 46 seconds. [#35107](https://github.com/apache/doris/pull/35107) + +- Forbid dynamic partitioning when creating asynchronous materialized views. + +- Support detecting whether the partition data of Hive external tables is synchronized with asynchronous materialized views. + +- Allow creating indexes on asynchronous materialized views. + +## Bug fixes + +### Query optimizer + +- Fixed the issue where SQL cache returns old results after truncating a partition. [#34698](https://github.com/apache/doris/pull/34698) + +- Fixed the issue where casting from JSON to other types did not correctly handle nullable attributes. [#34707](https://github.com/apache/doris/pull/34707) + +- Fixed occasional DATETIMEV2 literal simplification errors. [#35153](https://github.com/apache/doris/pull/35153) + +- Fixed the issue where `COUNT(*)` could not be used in window functions. [#35220](https://github.com/apache/doris/pull/35220) + +- Fixed the issue where nullable attributes could be incorrect when all `SELECT` statements under `UNION ALL` have no `FROM` clause. [#35074](https://github.com/apache/doris/pull/35074) + +- Fixed the issue where `bitmap in join` and subquery unnesting could not be used simultaneously. [#35435](https://github.com/apache/doris/pull/35435) + +- Fixed the performance issue where filter conditions could not be pushed down to the CTE producer in specific situations. [#35463](https://github.com/apache/doris/pull/35463) + +- Fixed the issue where aggregate combinators written in uppercase could not be found. [#35540](https://github.com/apache/doris/pull/35540) + +- Fixed the performance issue where window functions were not properly pruned by column pruning.
[#35504](https://github.com/apache/doris/pull/35504) + +- Fixed the issue where queries might parse incorrectly leading to wrong results when multiple tables with the same name but in different databases appeared simultaneously in the query. [#35571](https://github.com/apache/doris/pull/35571) + +- Fixed the query error caused by generating runtime filters during schema table scans. [#35655](https://github.com/apache/doris/pull/35655) + +- Fixed the issue where nested correlated subqueries could not execute because the join condition was folded into a null literal. [#35811](https://github.com/apache/doris/pull/35811) + +- Fixed the occasional issue where decimal literals were set with incorrect precision during planning. [#36055](https://github.com/apache/doris/pull/36055) + +- Fixed the occasional issue where multiple layers of aggregation were merged incorrectly during planning. [#36145](https://github.com/apache/doris/pull/36145) + +- Fixed the occasional issue where the input-output mismatch error occurred after aggregate expansion planning. [#36207](https://github.com/apache/doris/pull/36207) + +- Fixed the occasional issue where `<=>` was incorrectly converted to `=`. [#36521](https://github.com/apache/doris/pull/36521) + +### Query execution + +- Fixed the issue where the query hangs if the limited rows are reached on the pipeline engine and memory is not released. [#35746](https://github.com/apache/doris/pull/35746) + +- Fixed the BE coredump when `enable_decimal256` is true but falls back to the old planner. [#35731](https://github.com/apache/doris/pull/35731) + +### Asynchronous materialized views + +- Fixed the issue in the asynchronous materialized view build where the store_row_column attribute specified was not being recognized by the core. + +- Fixed the problem in the asynchronous materialized view build where specifying the storage_medium was not taking effect. 
+ +- Resolved the error occurring in the asynchronous materialized view show partitions after the base table is deleted. + +- Fixed the issue where asynchronous materialized views caused backup and restore exceptions. [#35703](https://github.com/apache/doris/pull/35703) + +- Fixed the issue where partition rewrite could lead to incorrect results. [#35236](https://github.com/apache/doris/pull/35236) + +### Semi-structured + +- Fixed the core dump problem when a VARIANT with an empty key is used. [#35671](https://github.com/apache/doris/pull/35671) +- Bitmap and BloomFilter index should not perform light index changes. [#35225](https://github.com/apache/doris/pull/35225) + +### Primary key + +- Fixed the issue where an exception BE restart occurred in the case of partial column updates during import, which could result in duplicate keys. [#35678](https://github.com/apache/doris/pull/35678) + +- Fixed the issue where BE might core dump during clone operations when memory is tight. [#34702](https://github.com/apache/doris/pull/34702) + +### Data Lakehouse + +- Fixed the issue where a Hive table could not be created with a fully qualified name such as `ctl.db.tbl` [#34984](https://github.com/apache/doris/pull/34984) + +- Fixed the issue where the Hive metastore connection did not close when refreshing [#35426](https://github.com/apache/doris/pull/35426) + +- Fixed a potential meta replay issue when upgrading from 2.0.x to 2.1.x. [#35532](https://github.com/apache/doris/pull/35532) + +- Fixed the issue where the Table Valued Function could not read an empty snappy compressed file. 
[#34926](https://github.com/apache/doris/pull/34926) + +- Fixed the issue where unable to read Parquet files with invalid min-max column statistics [#35041](https://github.com/apache/doris/pull/35041) + +- Fixed the issue where unable to handle pushdown predicates with null-aware functions in the Parquet/ORC reader [#35335](https://github.com/apache/doris/pull/35335) + +- Fixed the issue about the order of partition columns when creating a Hive table [#35347](https://github.com/apache/doris/pull/35347) + +- Fixed the issue where writing to a Hive table on S3 failed when partition values contained spaces [#35645](https://github.com/apache/doris/pull/35645) + +- Fixed the issue about incorrect scheme of Aliyun OSS endpoint [#34907](https://github.com/apache/doris/pull/34907) + +- Fixed the issue where the Parquet format Hive table written by Doris could not be read by Hive [#34981](https://github.com/apache/doris/pull/34981) + +- Fixed the issue where unable to read ORC files after the schema change of a Hive table [#35583](https://github.com/apache/doris/pull/35583) + +- Fixed the issue where unable to read Paimon tables via JNI after the schema change of the Paimon table [#35309](https://github.com/apache/doris/pull/35309) + +- Fixed the issue of too small Row Groups in Parquet format files written out. 
[#36042](https://github.com/apache/doris/pull/36042) [#36143](https://github.com/apache/doris/pull/36143) + +- Fixed the issue where Paimon tables could not be read after schema changes [#36049](https://github.com/apache/doris/pull/36049) + +- Fixed the issue where Hive Parquet format tables could not be read after schema changes [#36182](https://github.com/apache/doris/pull/36182) + +- Fixed the FE OOM issue caused by Hadoop FS cache [#36403](https://github.com/apache/doris/pull/36403) + +- Fixed the issue where FE could not start after enabling the Hive Metastore Listener [#36533](https://github.com/apache/doris/pull/36533) + +- Fixed the issue of query performance degradation with a large number of files [#36431](https://github.com/apache/doris/pull/36431) + +- Fixed the timezone issue when reading the timestamp column type in Iceberg [#36435](https://github.com/apache/doris/pull/36435) + +- Fixed DATETIME conversion error and data path error on Iceberg Table. [#35708](https://github.com/apache/doris/pull/35708) + +- Support retaining and passing the additional user-defined properties of Table Valued Functions to the S3 SDK. [#35515](https://github.com/apache/doris/pull/35515) + + +### Data import + +- Fixed the issue where `CANCEL LOAD` did not work [#35352](https://github.com/apache/doris/pull/35352) + +- Fixed the issue where a null pointer error in the Publish phase of load transactions prevented the load from completing [#35977](https://github.com/apache/doris/pull/35977) + +- Fixed the issue with bRPC serializing large data files when sent via HTTP [#36169](https://github.com/apache/doris/pull/36169) + +### Data management + +- Fixed the issue where the resource tag in ConnectionContext was not set after forwarding DDL or DML to master FE.
[#35618](https://github.com/apache/doris/pull/35618) + +- Fixed the issue where the restored table name was incorrect when `lower_case_table_names` was enabled [#35508](https://github.com/apache/doris/pull/35508) + +- Fixed the issue where `admin clean trash` could not work [#35271](https://github.com/apache/doris/pull/35271) + +- Fixed the issue where a storage policy could not be deleted from a partition [#35874](https://github.com/apache/doris/pull/35874) + +- Fixed the issue of data loss when importing into a multi-replica automatic partition table [#36586](https://github.com/apache/doris/pull/36586) + +- Fixed the issue where the partition column of a table changed when querying or inserting into an automatic partition table using the old optimizer [#36514](https://github.com/apache/doris/pull/36514) + +### Memory management + +- Fixed the issue of frequent errors in the logs due to failure in obtaining Cgroup meminfo. [#35425](https://github.com/apache/doris/pull/35425) + +- Fixed the issue where the Segment cache size was uncontrolled when using BloomFilter, leading to abnormal process memory growth. [#34871](https://github.com/apache/doris/pull/34871) + +### Permissions + +- Fixed the issue where permission settings were ineffective after enabling case-insensitive table names. [#36557](https://github.com/apache/doris/pull/36557) + +- Fixed the issue where setting LDAP passwords through non-Master FE nodes did not take effect. [#36598](https://github.com/apache/doris/pull/36598) + +- Fixed the issue where authorization could not be checked for the `SELECT COUNT(*)` statement. [#35465](https://github.com/apache/doris/pull/35465) + +### Others + +- Fixed the issue where the client JDBC program could not close the connection if the MySQL connection was broken. [#36616](https://github.com/apache/doris/pull/36616) + +- Fixed MySQL protocol compatibility issue with the `SHOW PROCEDURE STATUS` statement. 
[#35350](https://github.com/apache/doris/pull/35350) + +- The `libevent` now forces Keepalive to solve the issue of connection leaks in certain situations. [#36088](https://github.com/apache/doris/pull/36088) + +## Credits + +Thanks to every one who contributes to this release. + +@airborne12, @amorynan, @AshinGau, @BePPPower, @BiteTheDDDDt, @ByteYue, @caiconghui, @CalvinKirs, @cambyzju, @catpineapple, @cjj2010, @csun5285, @DarvenDuan, @dataroaring, @deardeng, @Doris-Extras, @eldenmoon, @englefly, @feiniaofeiafei, @felixwluo, @freemandealer, @Gabriel39, @gavinchou, @GoGoWen, @HappenLee, @hello-stephen, @hubgeter, @hust-hhb, @jacktengg, @jackwener, @jeffreys-cat, @Jibing-Li, @kaijchen, @kaka11chen, @Lchangliang, @liaoxin01, @LiBinfeng-01, @lide-reed, @luennng, @luwei16, @mongo360, @morningman, @morrySnow, @mrhhsg, @Mryange, @mymeiyi, @nextdreamblue, @platoneko, @qidaye, @qzsee, @seawinde, @shuke987, @sollhui, @starocean999, @suxiaogang223, @TangSiyang2001, @Thearas, @Vallishp, @w41ter, @wangbo, @whutpencil, @wsjz, @wuwenchi, @xiaokang, @xiedeyantu, @XieJiann, @xinyiZzz, @XuPengfei-1020, @xy720, @xzj7019, @yiguolei, @yongjinhou, @yujun777, @Yukang-Lian, @Yulei-Yang, @zclllyybb, @zddr, @zfr9527, @zgxme, @zhangbutao, @zhangstar333, @zhannngchen, @zhiqiang-hhhh, @zy-kkk, @zzzxl1993 \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md new file mode 100644 index 0000000000000..7c1910eeae8c5 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.5.md @@ -0,0 +1,395 @@ +--- +{ + "title": "Release 2.1.5", + "language": "en" +} +--- + + + +**Apache Doris version 2.1.5 was officially released on July 24, 2024.** In this update, we have optimized various functional experiences for data lakehouse and high concurrency scenarios, functionalities of asynchronous materialized views. 
Additionally, we have implemented several improvements and bug fixes to enhance stability. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The default connection pool size for the JDBC Catalog has been increased from 10 to 30 to prevent connection exhaustion in high-concurrency scenarios. [#37023](https://github.com/apache/doris/pull/37023). + +- The system's reserved memory (low water mark) has been adjusted to `min(6.4GB, MemTotal * 5%)` to mitigate BE OOM issues. + +- When processing multiple statements in a single request, only the last statement's result is returned if the `CLIENT_MULTI_STATEMENTS` flag is not set. + +- Direct modifications to data in asynchronous materialized views are no longer permitted. [#37129](https://github.com/apache/doris/pull/37129) + +- A session variable `use_max_length_of_varchar_in_ctas` has been added to control the behavior of varchar and char type length generation during CTAS (Create Table As Select). The default value is true. When set to false, the derived varchar length is used instead of the maximum length. [#37284](https://github.com/apache/doris/pull/37284) + +- Statistics collection now estimates the number of rows in Hive tables based on file size by default. [#37694](https://github.com/apache/doris/pull/37694) + +- Transparent rewrite for asynchronous materialized views is now enabled by default. [#35897](https://github.com/apache/doris/pull/35897) + +- Transparent rewrite utilizes partitioned materialized views. If partitions fail, the base tables are unioned with the materialized view to ensure data correctness. [#35897](https://github.com/apache/doris/pull/35897) + +## New features + +### Lakehouse + +- The session variable `read_csv_empty_line_as_null` can be used to control whether empty lines are ignored when reading CSV format files.
[#37153](https://github.com/apache/doris/pull/37153) + + By default, empty lines are ignored. When set to true, empty lines will be read as rows where all columns are null. + +- Compatibility with Presto's complex type output format can be enabled by setting `serde_dialect="presto"`. [#37253](https://github.com/apache/doris/pull/37253) + +### Multi-Table Materialized View + +- Supports non-deterministic functions in materialized view building. [#37651](https://github.com/apache/doris/pull/37651) + +- Atomically replaces definitions of asynchronous materialized views. [#37147](https://github.com/apache/doris/pull/37147) + +- Views creation statements can be viewed via `SHOW CREATE MATERIALIZED VIEW`. [#37125](https://github.com/apache/doris/pull/37125) + +- Transparent rewrites for multi-dimensional aggregation and non-aggregate queries. [#37436](https://github.com/apache/doris/pull/37436) [#37497](https://github.com/apache/doris/pull/37497) + +- Supports DISTINCT aggregations with key columns and partitioning for roll-ups. [#37651](https://github.com/apache/doris/pull/37651) + +- Support for partitioning materialized views to roll up partitions using `date_trunc` [#31812](https://github.com/apache/doris/pull/31812) [#35562](https://github.com/apache/doris/pull/35562) + +- Partitioned table-valued functions (TVFs) are supported. [#36479](https://github.com/apache/doris/pull/36479) + +### Semi-Structured Data Management + +- Tables using the VARIANT type now support partial column updates. [#34925](https://github.com/apache/doris/pull/34925) + +- PreparedStatement support is now enabled by default. [#36581](https://github.com/apache/doris/pull/36581) + +- The VARIANT type can be exported to CSV format. [#37857](https://github.com/apache/doris/pull/37857) + +- `explode_json_object` function transposes JSON Object rows into columns. 
[#36887](https://github.com/apache/doris/pull/36887) + +- The ES Catalog now maps ES NESTED or OBJECT types to the Doris JSON type.[#37101](https://github.com/apache/doris/pull/37101) + +- By default, support_phrase is enabled for inverted indexes with specified analyzers to improve the performance of match_phrase series queries. [#37949](https://github.com/apache/doris/pull/37949) + +### Query Optimizer + +- Support for explaining `DELETE FROM` statements. [#37100](https://github.com/apache/doris/pull/37100) + +- Support for hint form of constant expression parameters [#37988](https://github.com/apache/doris/pull/37988) + +### Memory Management + +- Added an HTTP API to clear the cache. [#36599](https://github.com/apache/doris/pull/36599) + +### Permissions + +- Support for authorization of resources within Table-Valued Functions (TVFs) [#37132](https://github.com/apache/doris/pull/37132) + +## Improvements + +### Lakehouse + +- Upgraded Paimon to version 0.8.1 + +- Fixes ClassNotFoundException for org.apache.commons.lang.StringUtils when querying Paimon tables. [#37512](https://github.com/apache/doris/pull/37512) + +- Added support for Tencent Cloud LakeFS. [#36891](https://github.com/apache/doris/pull/36891) + +- Optimized the timeout duration when fetching file lists for external table queries. [#36842](https://github.com/apache/doris/pull/36842) + +- Configurable via the session variable `fetch_splits_max_wait_time_ms`. + +- Improved default connection logic for SQLServer JDBC Catalog. [#36971](https://github.com/apache/doris/pull/36971) + + By default, the connection encryption settings are not intervened. Only when `force_sqlserver_jdbc_encrypt_false` is set to true, encrypt=false is forcibly added to the JDBC URL to reduce authentication errors. This allows for more flexible control over encryption behavior, enabling it to be turned on or off as needed. + +- Added serde properties to the show create table statements for Hive tables. 
[#37096](https://github.com/apache/doris/pull/37096) + +- Changed the default cache time for Hive table lists on the FE from 1 day to 4 hours + +- Data export (Export/Outfile) now supports specifying compression formats for Parquet and ORC + + For more information, please refer to [docs](https://doris.apache.org/docs/sql-manual/sql-statements/Data-Manipulation-Statements/Manipulation/EXPORT/?_highlight=compress_type). + +- When creating a table using CTAS+TVF, partition columns in the TVF are automatically mapped to Varchar(65533) instead of String, allowing them to be used as partition columns for internal tables [#37161](https://github.com/apache/doris/pull/37161) + +- Optimized the number of metadata accesses for Hive write operations [#37127](https://github.com/apache/doris/pull/37127) + +- ES Catalog now supports mapping nested/object types to Doris's Json type. [#37182](https://github.com/apache/doris/pull/37182) + +- Improved error messages when connecting to Oracle using older versions of the ojdbc driver [#37634](https://github.com/apache/doris/pull/37634) + +- When Hudi tables return an empty set during Incremental Read, Doris now also returns an empty set instead of error [#37636](https://github.com/apache/doris/pull/37636) + +- Fixed an issue where inner-outer table join queries could lead to FE timeouts in some cases [#37757](https://github.com/apache/doris/pull/37757) + +- Fixed an issue with FE metadata replay errors during upgrades from older versions to newer versions when the Hive metastore event listener is enabled. [#37757](https://github.com/apache/doris/pull/37757) + +### Multi-Table Materialized View + +- Automate key column selection for asynchronous materialized views. [#36601](https://github.com/apache/doris/pull/36601) + +- Support date_trunc in materialized view partition definitions.. [#35562](https://github.com/apache/doris/pull/35562) + +- Enable transparent rewrites across nested materialized view aggregations. 
[#37651](https://github.com/apache/doris/pull/37651) + +- Asynchronous materialized views remain available when schema changes do not affect the correctness of their data. [#37122](https://github.com/apache/doris/pull/37122) + +- Improve planning speed for transparent rewrites. [#37935](https://github.com/apache/doris/pull/37935) + +- When calculating the availability of asynchronous materialized views, the current refresh status is no longer taken into account. [#36617](https://github.com/apache/doris/pull/36617) + +### Semi-Structured Data Management + +- Optimize DESC performance for viewing VARIANT sub-columns through sampling. [#37217](https://github.com/apache/doris/pull/37217) + +- Support for special JSON data with empty keys in the JSON type. [#36762](https://github.com/apache/doris/pull/36762) + +### Inverted Index + +- Reduce latency by minimizing the invocation of inverted index exists to avoid delays in accessing object storage. [#36945](https://github.com/apache/doris/pull/36945) + +- Optimize the overhead of the inverted index query process. [#35357](https://github.com/apache/doris/pull/35357) + +- Prevent inverted indices in materialized views. [#36869](https://github.com/apache/doris/pull/36869) + +### Query Optimizer + +- When both sides of a comparison expression are literals, the string literal will attempt to convert to the type of the other side. [#36921](https://github.com/apache/doris/pull/36921) + +- Refactored the sub-path pushdown functionality for the variant type, now better supporting complex pushdown scenarios. [#36923](https://github.com/apache/doris/pull/36923) + +- Optimized the logic for calculating the cost of materialized views, enabling more accurate selection of lower-cost materialized views. [#37098](https://github.com/apache/doris/pull/37098) + +- Improved the SQL cache planning speed when using user variables in SQL. 
[#37119](https://github.com/apache/doris/pull/37119) + +- Optimized the row estimation logic for NOT NULL expressions, resulting in better performance when NOT NULL is present in queries. [#37498](https://github.com/apache/doris/pull/37498) + +- Optimized the null rejection derivation logic for LIKE expressions. [#37864](https://github.com/apache/doris/pull/37864) + +- Improved error messages when querying a specific partition fails, making it clearer which table is causing the issue. [#37280](https://github.com/apache/doris/pull/37280) + +### Query Execution + +- Improved the performance of the bitmap_union operator up to 3 times in certain scenarios. + +- Enhanced the reading performance of Arrow Flight in ARM environments. + +- Optimized the execution performance of the explode, explode_map, and explode_json functions. + +### Data Loading + +- Support setting `max_filter_ratio` for `INSERT INTO ... FROM TABLE VALUE FUNCTION` + +## Bug fixes + +### Lakehouse + +- Fixed an issue that caused BE crashes in some cases when querying Parquet format [#37086](https://github.com/apache/doris/pull/37086) + +- Fixed an issue where BE printed excessive logs when querying Parquet format. [#37012](https://github.com/apache/doris/pull/37012) + +- Fixed an issue where the FE side created a large number of duplicate FileSystem objects in some cases. [#37142](https://github.com/apache/doris/pull/37142) + +- Fixed an issue where transaction information was not cleaned up after writing to Hive in some cases. [#37172](https://github.com/apache/doris/pull/37172) + +- Fixed a thread leak issue caused by Hive table write operations in some cases. [#37247](https://github.com/apache/doris/pull/37247) + +- Fixed an issue where Hive Text format row and column delimiters could not be correctly obtained in some cases. [#37188](https://github.com/apache/doris/pull/37188) + +- Fixed a concurrency issue when reading lz4 compressed blocks in some cases. 
[#37187](https://github.com/apache/doris/pull/37187) + +- Fixed an issue where `count(*)` on Iceberg tables returned incorrect results in some cases. [#37810](https://github.com/apache/doris/pull/37810) + +- Fixed an issue where creating a Paimon catalog based on MinIO caused FE metadata replay errors in some cases. [#37249](https://github.com/apache/doris/pull/37249) + +- Fixed an issue where using Ranger to create a catalog caused the client to hang in some cases. [#37551](https://github.com/apache/doris/pull/37551) + +### Multi-Table Materialized View + +- Fixed an issue where adding new partitions to the base table could lead to incorrect results after partition aggregation roll-up rewrites. [#37651](https://github.com/apache/doris/pull/37651) + +- Fixed an issue where the materialized view partition status was not set to out-of-sync after deleting associated base table partitions. [#36602](https://github.com/apache/doris/pull/36602) + +- Fixed an occasional deadlock issue during asynchronous materialized view builds. [#37133](https://github.com/apache/doris/pull/37133) + +- Fixed an occasional "nereids cost too much time" error when refreshing a large number of partitions in a single asynchronous materialized view refresh. [#37589](https://github.com/apache/doris/pull/37589) + +- Fixed an issue where an asynchronous materialized view could not be created if the final select list contained a null literal. [#37281](https://github.com/apache/doris/pull/37281) + +- Fixed an issue with single-table materialized views where, even though the aggregation materialized view was successfully rewritten, the CBO did not select it. [#35721](https://github.com/apache/doris/pull/35721) [#36058](https://github.com/apache/doris/pull/36058) + +- Fixed an issue where partition derivation failed when building a partitioned materialized view with both join inputs being aggregations. 
[#34781](https://github.com/apache/doris/pull/34781) + +### Semi-Structured Data Management + +- Fixed issues with VARIANT in special cases such as concurrency and abnormal data.[#37976](https://github.com/apache/doris/pull/37976) [#37839](https://github.com/apache/doris/pull/37839) [#37794](https://github.com/apache/doris/pull/37794) [#37674](https://github.com/apache/doris/pull/37674) [#36997](https://github.com/apache/doris/pull/36997) + +- Fixed coredump issues when using VARIANT in unsupported SQL. [#37640](https://github.com/apache/doris/pull/37640) + +- Fixed coredump issues related to MAP data type when upgrading from 1.x to 2.x or higher versions. [#36937](https://github.com/apache/doris/pull/36937) + +- Improved ES Catalog support for Array types. [#36936](https://github.com/apache/doris/pull/36936) + +### Inverted Index + +- Fixed an issue where DROP INDEX for Inverted Index v2 did not delete metadata. [#37646](https://github.com/apache/doris/pull/37646) + +- Fixed query accuracy issues when string length exceeded the "ignore above" threshold. [#37679](https://github.com/apache/doris/pull/37679) + +- Fixed issues with index size statistics. [#37232](https://github.com/apache/doris/pull/37232) [#37564](https://github.com/apache/doris/pull/37564) + +### Query Optimizer + +- Fixed an issue that prevented import operations from executing due to the use of reserved keywords. [#35938](https://github.com/apache/doris/pull/35938) + +- Fixed a type error where char(255) was incorrectly recorded as char(1) when creating a table. [#37671](https://github.com/apache/doris/pull/37671) + +- Fixed incorrect results when the join expression in a correlated subquery was a complex expression. [#37683](https://github.com/apache/doris/pull/37683) + +- Fixed a potential issue with incorrect bucket pruning for decimal types. 
[#38013](https://github.com/apache/doris/pull/38013) + +- Fixed incorrect aggregation operator results when pipeline local shuffle was enabled in certain scenarios. [#38016](https://github.com/apache/doris/pull/38016) + +- Fixed planning errors that could occur when equal expressions existed in aggregation operators. [#36622](https://github.com/apache/doris/pull/36622) + +- Fixed planning errors that could occur when lambda expressions were present in aggregation operators. [#37285](https://github.com/apache/doris/pull/37285) + +- Fixed an issue where a literal generated from a window function being optimized to a literal had the wrong type, preventing execution. [#37283](https://github.com/apache/doris/pull/37283) + +- Fixed an issue with the null attribute being incorrectly output by the aggregate function foreach combinator. [#37980](https://github.com/apache/doris/pull/37980) + +- Fixed an issue where the acos function could not be planned when its parameter was a literal out of range. [#37996](https://github.com/apache/doris/pull/37996) + +- Fixed planning errors when specifying partitions for a query on a synchronized materialized view. [#36982](https://github.com/apache/doris/pull/36982) + +- Fixed occasional Null Pointer Exceptions (NPEs) during planning. [#38024](https://github.com/apache/doris/pull/38024) + +### Query Execution + +- Fixed an error in delete where statements when using decimal data types as conditions. [#37801](https://github.com/apache/doris/pull/37801) + +- Fixed an issue where BE memory was not released after query execution ended. [#37792](https://github.com/apache/doris/pull/37792) [#37297](https://github.com/apache/doris/pull/37297) + +- Fixed a problem where audit logs occupied too much FE memory under high QPS scenarios. [#37786](https://github.com/apache/doris/pull/37786) + +- Fixed BE core dumps when the sleep function received illegal input values. 
[#37681](https://github.com/apache/doris/pull/37681) + +- Fixed an error encountered during sync filter size execution. [#37103](https://github.com/apache/doris/pull/37103) + +- Fixed incorrect results when using time zones during execution. [#37062](https://github.com/apache/doris/pull/37062) + +- Fixed incorrect results when casting strings to integers. [#36788](https://github.com/apache/doris/pull/36788) + +- Fixed query errors when using the Arrow Flight protocol with pipelinex enabled. [#35804](https://github.com/apache/doris/pull/35804) + +- Fixed errors when casting strings to dates/datetimes. [#35637](https://github.com/apache/doris/pull/35637) + +- Fixed BE core dumps during large table join queries using <=>. [#36263](https://github.com/apache/doris/pull/36263) + +### Storage Management + +- Fixed the issue of invisible DELETE SIGN data encountered during column update and write operations. [#36755](https://github.com/apache/doris/pull/36755) + +- Optimized FE's memory usage during schema changes. [#36756](https://github.com/apache/doris/pull/36756) + +- Fixed the issue where BE would hang during restart due to transactions not being aborted [#36437](https://github.com/apache/doris/pull/36437) + +- Fixed occasional errors when changing from NOT NULL to NULL data types. [#36389](https://github.com/apache/doris/pull/36389) + +- Optimized replica repair scheduling when BE goes down. [#36897](https://github.com/apache/doris/pull/36897) + +- Supported round-robin disk selection for tablet creation on a single BE. [#36900](https://github.com/apache/doris/pull/36900) + +- Fixed query error -230 caused by slow publishing. [#36222](https://github.com/apache/doris/pull/36222) + +- Improved the speed of partition balancing. [#36976](https://github.com/apache/doris/pull/36976) + +- Controlled segment cache using the number of file descriptors (FDs) and memory to avoid FD exhaustion. 
[#37035](https://github.com/apache/doris/pull/37035) + +- Fixed potential replica loss caused by concurrent clone and alter operations [#36858](https://github.com/apache/doris/pull/36858) + +- Fixed the issue of not being able to adjust column order. [#37226](https://github.com/apache/doris/pull/37226) + +- Prohibited certain schema change operations on auto-increment columns. [#37331](https://github.com/apache/doris/pull/37331) + +- Fixed inaccurate error reporting for DELETE operations. [#37374](https://github.com/apache/doris/pull/37374) + +- Adjusted the trash expiration time on BE side to one day. [#37409](https://github.com/apache/doris/pull/37409) + +- Optimized compaction memory usage and scheduling. [#37491](https://github.com/apache/doris/pull/37491) + +- Checked for potential oversized backups causing FE restarts. [#37466](https://github.com/apache/doris/pull/37466) + +- Restored dynamic partition deletion policies and cross-partition behaviors to 2.1.3. [#37570](https://github.com/apache/doris/pull/37570) [#37506](https://github.com/apache/doris/pull/37506) + +- Fixed errors related to decimal types in DELETE predicates. [#37710](https://github.com/apache/doris/pull/37710) + +### Data Loading + +- Fixed data invisibility issues caused by race conditions in error handling during imports [#36744](https://github.com/apache/doris/pull/36744) + +- Added support for hll_from_base64 in streamload imports. [#36819](https://github.com/apache/doris/pull/36819) + +- Fixed potential FE OOM issues when importing very large numbers of tablets for a single table. [#36944](https://github.com/apache/doris/pull/36944) + +- Fixed possible auto-increment column duplication during FE master-slave switchovers. [#36961](https://github.com/apache/doris/pull/36961) + +- Fixed errors when inserting into select with auto-increment columns. [#37029](https://github.com/apache/doris/pull/37029) + +- Reduced the number of data flush threads to optimize memory usage.
[#37092](https://github.com/apache/doris/pull/37092) + +- Improved automatic recovery and error messaging for routine load tasks. [#37371](https://github.com/apache/doris/pull/37371) + +- Increased the default batch size for routine load. [#37388](https://github.com/apache/doris/pull/37388) + +- Fixed routine load task stoppage due to Kafka EOF expiration. [#37983](https://github.com/apache/doris/pull/37983) + +- Fixed coredump issues in multi-table streaming. [#37370](https://github.com/apache/doris/pull/37370) + +- Fixed premature backpressure caused by inaccurate memory estimation in groupcommit. [#37379](https://github.com/apache/doris/pull/37379) + +- Optimized BE-side thread usage in groupcommit. [#37380](https://github.com/apache/doris/pull/37380) + +- Fixed the issue of no error URL when data was not partitioned. [#37401](https://github.com/apache/doris/pull/37401) + +- Fixed potential memory misoperations during imports. [#38021](https://github.com/apache/doris/pull/38021) + +### Merge on Write Unique Key + +- Reduced memory usage during compaction for primary key tables. [#36968](https://github.com/apache/doris/pull/36968) + +- Fixed potential duplicate data issues when primary key replica cloning fails. [#37229](https://github.com/apache/doris/pull/37229) + +### Permissions + +- Fixed the issue of missing authorization when a table-valued function references a resource. [#37132](https://github.com/apache/doris/pull/37132) + +- Fixed the issue where the SHOW ROLE statement did not include workload group permissions. [#36032](https://github.com/apache/doris/pull/36032) + +- Fixed the issue where executing two statements simultaneously when creating a row policy could cause FE to fail to restart. [#37342](https://github.com/apache/doris/pull/37342) + +- Fixed the issue where, in some cases, upgrading from an older version could result in FE metadata replay failures due to row policies. 
[#37342](https://github.com/apache/doris/pull/37342) + +### Others + +- Fixed the issue of compute nodes participating in internal table creation. [#37961](https://github.com/apache/doris/pull/37961) + +- Fixed the read lag issue when `enable_strong_read_consistency` is set to true. [#37641](https://github.com/apache/doris/pull/37641) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md new file mode 100644 index 0000000000000..c14d25b52573f --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.6.md @@ -0,0 +1,524 @@ +--- +{ + "title": "Release 2.1.6", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.6 was officially released on September 10, 2024.** This version brings continuous upgrades and improvements to the Lakehouse, Async Materialized Views, and Semi-Structured Data Management. Additionally, several fixes have been implemented in areas such as the query optimizer, execution engine, storage management, permission management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- Removed the `delete_if_exists` option from create repository. [#38192](https://github.com/apache/doris/pull/38192) + +- Added the `enable_prepared_stmt_audit_log` session variable to control whether JDBC prepared statements record audit logs, with the default being no recording. [#38624](https://github.com/apache/doris/pull/38624) [#39009](https://github.com/apache/doris/pull/39009) + +- Implemented fd limit and memory constraints for segment cache. [#39689](https://github.com/apache/doris/pull/39689) + +- When the FE configuration item `sys_log_mode` is set to BRIEF, file location information is added to the logs. 
[#39571](https://github.com/apache/doris/pull/39571) + +- Changed the default value of the session variable `max_allowed_packet` to 16MB. [#38697](https://github.com/apache/doris/pull/38697) + +- When a single request contains multiple statements, semicolons must be used to separate them. [#38670](https://github.com/apache/doris/pull/38670) + +- Added support for statements to begin with a semicolon. [#39399](https://github.com/apache/doris/pull/39399) + +- Aligned type formatting with MySQL in statements such as `show create table`. [#38012](https://github.com/apache/doris/pull/38012) + +- When the new optimizer planning times out, it no longer falls back to prevent the old optimizer from using longer planning times. [#39499](https://github.com/apache/doris/pull/39499) + +## New features + +### Lakehouse + +- Supported writeback for Iceberg tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/lakehouse/datalake-building/iceberg-build). + +- SQL interception rules now support external tables. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/query-admin/sql-interception). + +- Added the system table `file_cache_statistics` to view BE data cache metrics. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/file_cache_statistics). + +### Async Materialized View + +- Supported transparent rewriting during inserts. [#38115](https://github.com/apache/doris/pull/38115) + +- Supported transparent rewriting when variant types exist in queries.[ #37929](https://github.com/apache/doris/pull/37929) + +### Semi-Structured Data Management + +- Supported casting ARRAY MAP to JSON type.[ #36548](https://github.com/apache/doris/pull/36548) + +- Supported the `json_keys` function.[ #36411](https://github.com/apache/doris/pull/36411) + +- Supported specifying the JSON path $. when importing JSON. 
[#38213](https://github.com/apache/doris/pull/38213) + +- ARRAY, MAP, STRUCT types now support `replace_if_not_null`[#38304](https://github.com/apache/doris/pull/38304) + +- ARRAY, MAP, STRUCT types now support adjusting column order.[#39210](https://github.com/apache/doris/pull/39210) + +- Added the `multi_match` function to match keywords across multiple fields, with support for inverted index acceleration. [#37722](https://github.com/apache/doris/pull/37722) + +### Query Optimizer + +- Filled in the original database name, table name, column name, and alias for returned columns in the MySQL protocol. [ #38126](https://github.com/apache/doris/pull/38126) + +- Supported the aggregation function `group_concat` with both order by and distinct simultaneously. [#38080](https://github.com/apache/doris/pull/38080) + +- SQL cache now supports reusing cached results for queries with different comments. [#40049](https://github.com/apache/doris/pull/40049) + +- In partition pruning, supported including `date_trunc` and date functions in filter conditions. [#38025](https://github.com/apache/doris/pull/38025) [#38743](https://github.com/apache/doris/pull/38743) + +- Allowed using the database name where the table resides as a qualifier prefix for table aliases. [#38640](https://github.com/apache/doris/pull/38640) + +- Supported hint-style comments.[#39113](https://github.com/apache/doris/pull/39113) + +### Others + +- Added the system table `table_properties` for viewing table properties. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/system-tables/information_schema/table_properties). + +- Introduced deadlock and slow lock detection in FE. + + - For more information, please refer to the [documentation](https://doris.apache.org/docs/admin-manual/maint-monitor/frontend-lock-manager). + +## Improvements + +### Lakehouse + +- Reimplemented the external table metadata caching mechanism. 
+ + - For details, refer to the [documentation](https://doris.apache.org/docs/lakehouse/metacache). + +- Added the session variable `keep_carriage_return` with a default value of false. By default, reading Hive Text format tables treats both `\r\n` and `\n` as newline characters. [#38099](https://github.com/apache/doris/pull/38099) + +- Optimized memory statistics for Parquet/ORC file read/write operations.[#37257](https://github.com/apache/doris/pull/37257) + +- Supported pushing down IN/NOT IN predicates for Paimon tables. [#38390](https://github.com/apache/doris/pull/38390) + +- Enhanced the optimizer to support Time Travel syntax for Hudi tables. [#38591](https://github.com/apache/doris/pull/38591) + +- Optimized Kerberos authentication-related processes. [ #37301](https://github.com/apache/doris/pull/37301) + +- Enabled reading Hive tables after renaming column operations. [#38809](https://github.com/apache/doris/pull/38809) + +- Optimized the reading performance of partition columns for external tables. [#38810](https://github.com/apache/doris/pull/38810) + +- Improved the data shard merging strategy during external table query planning to avoid performance degradation caused by a large number of small shards.[#38964](https://github.com/apache/doris/pull/38964) + +- Added attributes such as location to `SHOW CREATE DATABASE/TABLE`. [#39644](https://github.com/apache/doris/pull/39644) + +- Supported complex types in MaxCompute Catalog. [#39822](https://github.com/apache/doris/pull/39822) + +- Optimized the file cache loading strategy by using asynchronous loading to avoid long BE startup times. [#39036](https://github.com/apache/doris/pull/39036) + +- Improved the file cache eviction strategy, such as evicting locks held for extended periods. [#39721](https://github.com/apache/doris/pull/39721) + +### Async Materialized View + +- Supported hourly, weekly, and quarterly partition roll-up construction. 
[#37678](https://github.com/apache/doris/pull/37678) + +- For materialized views based on Hive external tables, the metadata cache is now updated before refresh to ensure the latest data is obtained during each refresh. [#38212](https://github.com/apache/doris/pull/38212) + +- Improved the performance of transparent rewrite planning in storage-compute decoupled mode by batch fetching metadata. [#39301](https://github.com/apache/doris/pull/39301) + +- Enhanced the performance of transparent rewrite planning by prohibiting duplicate enumerations. [#39541](https://github.com/apache/doris/pull/39541) + +- Improved the performance of transparent rewrite for refreshing materialized views based on Hive external table partitions.[#38525](https://github.com/apache/doris/pull/38525) + +### Semi-Structured Data Management + +- Optimized memory allocation for TOPN queries to improve performance. [#37429](https://github.com/apache/doris/pull/37429) + +- Enhanced the performance of string processing in inverted indexes.[#37395](https://github.com/apache/doris/pull/37395) + +- Optimized the performance of inverted indexes in MOW tables. [#37428](https://github.com/apache/doris/pull/37428) + +- Supported specifying the row-store `page_size` during table creation to control compression effectiveness. [#37145](https://github.com/apache/doris/pull/37145) + +### Query Optimizer + +- Adjusted the row count estimation algorithm for mark joins, resulting in more accurate cardinality estimates for mark joins. [#38270](https://github.com/apache/doris/pull/38270) + +- Optimized the cost estimation algorithm for semi/anti joins, enabling more accurate selection of semi/anti join orders. [#37951](https://github.com/apache/doris/pull/37951) + +- Adjusted the filter estimation algorithm for cases where some columns have no statistical information, leading to more accurate cardinality estimates. 
[#39592](https://github.com/apache/doris/pull/39592) + +- Modified the instance calculation logic for set operation operators to prevent insufficient parallelism in extreme cases. [#39999](https://github.com/apache/doris/pull/39999) + +- Adjusted the usage strategy of bucket shuffle, achieving better performance when data is not sufficiently shuffled. [#36784](https://github.com/apache/doris/pull/36784) + +- Enabled early filtering of window function data, supporting multiple window functions in a single projection. [#38393](https://github.com/apache/doris/pull/38393) + +- When a `NullLiteral` exists in a filter condition, it can now be folded into false, further converted to an `EmptySet` to reduce unnecessary data scanning and computation. [#38135](https://github.com/apache/doris/pull/38135) + +- Expanded the scope of predicate derivation, reducing data scanning in queries with specific patterns. [#37314](https://github.com/apache/doris/pull/37314) + +- Supported partial short-circuit evaluation logic in partition pruning to improve partition pruning performance, achieving over 100% improvement in specific scenarios. [#38191](https://github.com/apache/doris/pull/38191) + +- Enabled the computation of arbitrary scalar functions within user variables. [#39144](https://github.com/apache/doris/pull/39144) + +- Maintained error messages consistent with MySQL when alias conflicts exist in queries. [#38104](https://github.com/apache/doris/pull/38104) + +### Query Execution + +- Adapted AggState for compatibility from 2.1 to 3.x and fixed coredump issues. [#37104](https://github.com/apache/doris/pull/37104) + +- Refactored the strategy selection for local shuffle when no joins are involved. [#37282](https://github.com/apache/doris/pull/37282) + +- Modified the scanner for internal table queries to an asynchronous approach to prevent blocking during internal table queries. 
[#38403](https://github.com/apache/doris/pull/38403) + +- Optimized the block merge process when building hash tables in Join operators. [#37471](https://github.com/apache/doris/pull/37471) + +- Reduced the lock holding time for MultiCast operations. [#37462](https://github.com/apache/doris/pull/37462) + +- Optimized gRPC's keepAliveTime and added a connection monitoring mechanism, reducing the probability of query failures due to RPC errors during query execution. [#37304](https://github.com/apache/doris/pull/37304) + +- Cleaned up all dirty pages in jemalloc when memory limits are exceeded. [#37164](https://github.com/apache/doris/pull/37164) + +- Improved the performance of `aes_encrypt`/`decrypt` functions when handling constant types. [#37194](https://github.com/apache/doris/pull/37194) + +- Optimized the performance of `json_extract` functions when processing constant data. [#36927](https://github.com/apache/doris/pull/36927) + +- Optimized the performance of ParseURL functions when processing constant data. [#36882](https://github.com/apache/doris/pull/36882) + +### Backup Recovery / CCR + +- Restore now supports deleting redundant tablets and partition options. [#39363](https://github.com/apache/doris/pull/39363) + +- Check storage connectivity when creating a repository. [#39538](https://github.com/apache/doris/pull/39538) + +- Enables binlog to support `DROP TABLE`, allowing CCR to incrementally synchronize `DROP TABLE` operations. [#38541](https://github.com/apache/doris/pull/38541) + +### Compaction + +- Improves the issue where high-priority compaction tasks were not subject to task concurrency control limits. [#38189](https://github.com/apache/doris/pull/38189) + +- Automatically reduces compaction memory consumption based on data characteristics. [#37486](https://github.com/apache/doris/pull/37486) + +- Fixes an issue where the sequential data optimization strategy could lead to incorrect data in aggregate tables or MOR UNIQUE tables.
[ #38299](https://github.com/apache/doris/pull/38299) + +- Optimizes the rowset selection strategy during compaction during replica replenishment to avoid triggering -235 errors. [#39262](https://github.com/apache/doris/pull/39262) + +### MOW (Merge-On-Write) + +- Optimizes slow column updates caused by concurrent column updates and compactions. [#38682](https://github.com/apache/doris/pull/38682) + +- Fixes an issue where segcompaction during bulk data imports could lead to incorrect MOW data. [#38992](https://github.com/apache/doris/pull/38992) [#39707](https://github.com/apache/doris/pull/39707) + +- Fixes data loss in column updates that may occur after BE restarts. [#39035](https://github.com/apache/doris/pull/39035) + +### Storage Management + +- Adds FE configuration to control whether queries under hot-cold tiering prefer local data replicas. [#38322](https://github.com/apache/doris/pull/38322) + +- Optimizes expired BE report messages to include newly created tablets. [#38839](https://github.com/apache/doris/pull/38839) [#39605](https://github.com/apache/doris/pull/39605) + +- Optimizes replica scheduling priority strategy to prioritize replicas with missing data. [#38884](https://github.com/apache/doris/pull/38884) + +- Prevents tablets with unfinished ALTER jobs from being balanced. [#39202](https://github.com/apache/doris/pull/39202) + +- Enables modifying the number of buckets for tables with list partitioning. [#39688](https://github.com/apache/doris/pull/39688) + +- Prefers querying from online disk services. [#39654](https://github.com/apache/doris/pull/39654) + +- Improves error messages for materialized view base tables that do not support deletion during synchronization. [#39857](https://github.com/apache/doris/pull/39857) + +- Improves error messages for single columns exceeding 4GB. 
[#39897](https://github.com/apache/doris/pull/39897) + +- Fixes an issue where aborted transactions were omitted when plan errors occurred during `INSERT` statements.[#38260](https://github.com/apache/doris/pull/38260) + +- Fixes exceptions during SSL connection closure.[#38677](https://github.com/apache/doris/pull/38677) + +- Fixes an issue where table locks were not held when aborting transactions using labels. [#38842](https://github.com/apache/doris/pull/38842) + +- Fixes `gson pretty` causing large image issues. [#39135](https://github.com/apache/doris/pull/39135) + +- Fixes an issue where the new optimizer did not check for bucket values of 0 in `CREATE TABLE` statements.[#38999](https://github.com/apache/doris/pull/38999) + +- Fixes errors when Chinese column names are included in `DELETE` condition predicates. [#39500](https://github.com/apache/doris/pull/39500) + +- Fixes frequent tablet balancing issues in partition balancing mode. [#39606](https://github.com/apache/doris/pull/39606) + +- Fixes an issue where partition storage policy attributes were lost. [#39677](https://github.com/apache/doris/pull/39677) + +- Fixes incorrect statistics when importing multiple tables within a transaction. [#39548](https://github.com/apache/doris/pull/39548) + +- Fixes errors when deleting random bucket tables. [#39830](https://github.com/apache/doris/pull/39830) + +- Fixes issues where FE fails to start due to non-existent UDFs. [#39868](https://github.com/apache/doris/pull/39868) + +- Fixes inconsistencies in the last failed version between FE master and slave. [#39947](https://github.com/apache/doris/pull/39947) + +- Fixes an issue where related tablets may still be in schema change state when schema change jobs are canceled. [ #39327](https://github.com/apache/doris/pull/39327) + +- Fixes errors when modifying type and column order in a single statement schema change (SC). 
[#39107](https://github.com/apache/doris/pull/39107) + +### Data Loading + +- Improves error messages for -238 errors during imports. [#39182](https://github.com/apache/doris/pull/39182) + +- Allows importing to other partitions while restoring a partition. [#39915](https://github.com/apache/doris/pull/39915) + +- Optimizes the strategy for FE to select BEs during group commit. [#37830](https://github.com/apache/doris/pull/37830) [#39010](https://github.com/apache/doris/pull/39010) + +- Avoids printing stack traces for some common streamload error messages. [#38418](https://github.com/apache/doris/pull/38418) + +- Improves handling of issues where offline BEs may affect import errors. [#38256](https://github.com/apache/doris/pull/38256) + +### Permissions + +- Optimizes access performance after enabling the Ranger authentication plugin. [#38575](https://github.com/apache/doris/pull/38575) +- Optimizes permission strategies for Refresh Catalog/Database/Table operations, allowing users to perform these operations with only SHOW permissions. [#39008](https://github.com/apache/doris/pull/39008) + +## Bug fixes + +### Lakehouse + +- Fixes the issue where switching catalogs may result in an error of not finding the database. [#38114](https://github.com/apache/doris/pull/38114) + +- Addresses exceptions caused by attempting to read non-existent data on S3. [#38253](https://github.com/apache/doris/pull/38253) + +- Resolves the issue where specifying an abnormal path during export operations may lead to incorrect export locations. [#38602](https://github.com/apache/doris/pull/38602) + +- Fixes the timezone issue for time columns in Paimon tables. [#37716](https://github.com/apache/doris/pull/37716) + +- Temporarily disables the Parquet PageIndex feature to avoid certain erroneous behaviors. + +- Corrects the selection of Backend nodes in the blacklist during external table queries. 
[#38984](https://github.com/apache/doris/pull/38984) + +- Resolves errors caused by missing subcolumns in Parquet Struct column types.[#39192](https://github.com/apache/doris/pull/39192) + +- Addresses several issues with predicate pushdown in JDBC Catalog. [#39082](https://github.com/apache/doris/pull/39082) + +- Fixes issues where some historical Parquet formats led to incorrect query results. [#39375](https://github.com/apache/doris/pull/39375) + +- Improves compatibility with ojdbc6 drivers for Oracle JDBC Catalog. [#39408](https://github.com/apache/doris/pull/39408) + +- Resolves potential FE memory leaks caused by Refresh Catalog/Database/Table operations. [#39186](https://github.com/apache/doris/pull/39186) [#39871](https://github.com/apache/doris/pull/39871) + +- Fixes thread leaks in JDBC Catalog under certain conditions. [#39666](https://github.com/apache/doris/pull/39666) [#39582](https://github.com/apache/doris/pull/39582) + +- Addresses potential event processing failures after enabling Hive Metastore event subscription. [#39239](https://github.com/apache/doris/pull/39239) + +- Disables reading Hive Text format tables with custom escape characters and null formats to prevent data errors. [#39869](https://github.com/apache/doris/pull/39869) + +- Resolves issues accessing Iceberg tables created via the Iceberg API under certain conditions. [#39203](https://github.com/apache/doris/pull/39203) + +- Fixes the inability to read Paimon tables stored on HDFS clusters with high availability enabled. [#39876](https://github.com/apache/doris/pull/39876) + +- Addresses errors that may occur when reading Paimon table deletion vectors after enabling file caching. [#39875](https://github.com/apache/doris/pull/39875) + +- Resolves potential deadlocks when reading Parquet files under certain conditions. [#39945](https://github.com/apache/doris/pull/39945) + +### Async Materialized View + +- Fixes the inability to use `SHOW CREATE MATERIALIZED VIEW` on follower FEs. 
[#38794](https://github.com/apache/doris/pull/38794) + +- Unifies the object type of asynchronous materialized views in metadata as tables to enable proper display in data tools. [#38797](https://github.com/apache/doris/pull/38797) + +- Resolves the issue where nested asynchronous materialized views always perform full refreshes. [#38698](https://github.com/apache/doris/pull/38698) + +- Fixes the issue where canceled tasks may show as running after restarting FEs. [ #39424](https://github.com/apache/doris/pull/39424) + +- Addresses incorrect use of contexts, which may lead to unexpected failures of materialized view refresh tasks. [#39690](https://github.com/apache/doris/pull/39690) + +- Resolves issues that may cause varchar type write failures due to unreasonable lengths when creating asynchronous materialized views based on external tables.[#37668](https://github.com/apache/doris/pull/37668) + +- Fixes the potential invalidation of asynchronous materialized views based on external tables after FE restarts or catalog rebuilds. [#39355](https://github.com/apache/doris/pull/39355) + +- Prohibits the use of partition rollup for materialized views with list partitions to prevent the generation of incorrect data. [#38124](https://github.com/apache/doris/pull/38124) + +- Fixes incorrect results when literals exist in the select list during transparent rewriting for aggregation rollup. [#38958](https://github.com/apache/doris/pull/38958) + +- Addresses potential errors during transparent rewriting when queries contain filters like `a = a`. [#39629](https://github.com/apache/doris/pull/39629) + +- Fixes issues where transparent rewriting for direct external table queries fails. [#39041](https://github.com/apache/doris/pull/39041) + +### Semi-Structured Data Management + +- Removes support for prepared statements in the old optimizer. [#39465](https://github.com/apache/doris/pull/39465) + +- Fixes issues with JSON escape character handling. 
[#37251](https://github.com/apache/doris/pull/37251) + +- Resolves issues with duplicate processing of JSON fields. [#38490](https://github.com/apache/doris/pull/38490) + +- Fixes issues with some ARRAY and MAP functions. [#39307](https://github.com/apache/doris/pull/39307) [#39699](https://github.com/apache/doris/pull/39699) [#39757](https://github.com/apache/doris/pull/39757) + +- Resolves complex combinations of inverted index queries and LIKE queries. [#36687](https://github.com/apache/doris/pull/36687) + +### Query Optimizer + +- Fixed the potential partition pruning error issue when the 'OR' condition exists in partition filter conditions. [#38897](https://github.com/apache/doris/pull/38897) + +- Fixed the potential partition pruning error issue when complex expressions are involved. [#39298](https://github.com/apache/doris/pull/39298) + +- Fixed the issue where nullable in `agg_state` subtypes might be planned incorrectly, leading to execution errors. [#37489](https://github.com/apache/doris/pull/37489) + +- Fixed the issue where nullable in set operation operators might be planned incorrectly, leading to execution errors. [#39109](https://github.com/apache/doris/pull/39109) + +- Fixed the incorrect execution priority issue of intersect operator. [#39095](https://github.com/apache/doris/pull/39095) + +- Fixed the NPE issue that may occur when the maximum valid date literal exists in the query. [#39482](https://github.com/apache/doris/pull/39482) + +- Fixed the occasional planning error that results in an illegal slot error during execution. [#39640](https://github.com/apache/doris/pull/39640) + +- Fixed the issue where repeatedly referencing columns in cte may lead to missing data in some columns in the result. [#39850](https://github.com/apache/doris/pull/39850) + +- Fixed the occasional planning error issue when 'case when' exists in the query. 
[#38491](https://github.com/apache/doris/pull/38491) + +- Fixed the issue where IP types cannot be implicitly converted to string types. [#39318](https://github.com/apache/doris/pull/39318) + +- Fixed the potential planning error issue when using multi-dimensional aggregation and the same column and its alias exist in the select list. [#38166](https://github.com/apache/doris/pull/38166) + +- Fixed the issue where boolean types might be handled incorrectly when using BE constant folding. [#39019](https://github.com/apache/doris/pull/39019) + +- Fixed the planning error issue caused by `default_cluster:` as a prefix for the database name in expressions. [#39114](https://github.com/apache/doris/pull/39114) + +- Fixed the potential deadlock issue caused by `insert into`. [#38660](https://github.com/apache/doris/pull/38660) + +- Fixed the potential planning error issue caused by not holding table locks throughout the planning process. [#38950](https://github.com/apache/doris/pull/38950) + +- Fixed the issue where CHAR(0), VARCHAR(0) are not handled correctly when creating tables. [#38427](https://github.com/apache/doris/pull/38427) + +- Fixed the issue where `show create table` may incorrectly display hidden columns. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the issue where columns with the same name as hidden columns are not prohibited when creating tables. [#38796](https://github.com/apache/doris/pull/38796) + +- Fixed the occasional planning error issue when executing `insert into as select` with CTEs. [#38526](https://github.com/apache/doris/pull/38526) + +- Fixed the issue where `insert into values` cannot automatically fill null default values. [#39122](https://github.com/apache/doris/pull/39122) + +- Fixed the NPE issue caused by declaring a CTE in a delete statement without referencing it.
[#39379](https://github.com/apache/doris/pull/39379) + +- Fixed the issue where deleting from a randomly distributed aggregation model table fails. [#37985](https://github.com/apache/doris/pull/37985) + +### Query Execution + +- Fixed the issue where the pipeline execution engine gets stuck in multiple scenarios, causing queries not to end. [#38657](https://github.com/apache/doris/pull/38657) [#38206](https://github.com/apache/doris/pull/38206) [#38885](https://github.com/apache/doris/pull/38885) + +- Fixed the coredump issue caused by null and non-null columns in set difference calculations.[#38737](https://github.com/apache/doris/pull/38737) + +- Fixed the incorrect result issue of the `width_bucket` function. [#37892](https://github.com/apache/doris/pull/37892) + +- Fixed the query error issue when a single row of data is large and the result set is also large (exceeding 2GB). [#37990](https://github.com/apache/doris/pull/37990) + +- Fixed the incorrect result issue of `stddev` with DecimalV2 type. [#38731](https://github.com/apache/doris/pull/38731) + +- Fixed the coredump issue caused by the `MULTI_MATCH_ANY` function. [#37959](https://github.com/apache/doris/pull/37959) + +- Fixed the issue where `insert overwrite auto partition` causes transaction rollback. [#38103](https://github.com/apache/doris/pull/38103) + +- Fixed the incorrect result issue of the `convert_tz` function. [#37358](https://github.com/apache/doris/pull/37358) [#38764](https://github.com/apache/doris/pull/38764) + +- Fixed the coredump issue when using the `collect_set` function with window functions. [#38234](https://github.com/apache/doris/pull/38234) + +- Fixed the coredump issue caused by the mod function with abnormal input. [#37999](https://github.com/apache/doris/pull/37999) + +- Fixed the issue where executing the same expression in multiple threads may lead to incorrect Java UDF results. 
[#38612](https://github.com/apache/doris/pull/38612) + +- Fixed the overflow issue caused by the incorrect return type of the `conv` function. [#38001](https://github.com/apache/doris/pull/38001) + +- Fixed the unstable result issue of the histogram function. [#38608](https://github.com/apache/doris/pull/38608) + +### Backup & Recovery / CCR + +- Fixed the issue where the data version after backup and recovery may be incorrect, leading to unreadability. [#38343](https://github.com/apache/doris/pull/38343) + +- Fixed the issue of using restore version across versions. [#38396](https://github.com/apache/doris/pull/38396) + +- Fixed the issue where the job is not canceled when backup fails. [#38993](https://github.com/apache/doris/pull/38993) + +- Fixed the NPE issue in ccr during the upgrade from 2.1.4 to 2.1.5, causing the FE to fail to start. [#39910](https://github.com/apache/doris/pull/39910) + +- Fixed the issue where views and materialized views cannot be used after restoration. [#38072](https://github.com/apache/doris/pull/38072) [#39848](https://github.com/apache/doris/pull/39848) + +### Storage Management + +- Fixed possible memory leaks in routine load when loading multiple tables from a single stream. [#38824](https://github.com/apache/doris/pull/38824) + +- Fixed the issue where delimiters and escape characters in routine load were not effective. [#38825](https://github.com/apache/doris/pull/38825) + +- Fixed incorrect `show routine load` results when the routine load task name contained uppercase letters. [#38826](https://github.com/apache/doris/pull/38826) + +- Fixed the issue where the offset cache was not reset when changing the routine load topic. [#38474](https://github.com/apache/doris/pull/38474) + +- Fixed the potential exception triggered by `show routine load` under concurrent scenarios. [#39525](https://github.com/apache/doris/pull/39525) + +- Fixed the issue where routine load might import data repeatedly.
[#39526](https://github.com/apache/doris/pull/39526) + +- Fixed the data error caused by `setNull` when enabling group commit via JDBC. [#38276](https://github.com/apache/doris/pull/38276) + +- Fixed the potential NPE issue when enabling group commit insert to a non-master FE. [#38345](https://github.com/apache/doris/pull/38345) + +- Fixed incorrect error handling during internal data writing in group commit. [#38997](https://github.com/apache/doris/pull/38997) + +- Fixed the coredump that might be triggered when the group commit execution plan failed. [#39396](https://github.com/apache/doris/pull/39396) + +- Fixed the issue where concurrent imports into auto partition tables might report non-existent tablets. [#38793](https://github.com/apache/doris/pull/38793) + +- Fixed potential load stream leakage issues. [#39039](https://github.com/apache/doris/pull/39039) + +- Fixed the issue where transactions were opened for `insert into select` with no data. [#39108](https://github.com/apache/doris/pull/39108) + +- Ignored the single-replica import configuration when using memtable prefetching. [#39154](https://github.com/apache/doris/pull/39154) + +- Fixed the issue where background imports of stream load records might be abnormally aborted upon encountering db deletion. [#39527](https://github.com/apache/doris/pull/39527) + +- Fixed inaccurate error messages when data errors occurred in strict mode. [#39587](https://github.com/apache/doris/pull/39587) + +- Fixed the issue where streamload did not return an error URL upon encountering erroneous data. [#38417](https://github.com/apache/doris/pull/38417) + +- Fixed the issue with the combined use of insert overwrite and auto partition. [#38442](https://github.com/apache/doris/pull/38442) + +- Fixed parsing errors when CSV encountered data where the line delimiter was enclosed by the enclosing character. 
[#38445](https://github.com/apache/doris/pull/38445) + +### Data Exporting + +- Fixed the issue where enabling the delete_existing_files property during export operations might result in duplicate deletion of exported data. [#39304](https://github.com/apache/doris/pull/39304) + +### Permissions + +- Fixed the incorrect requirement of ALTER TABLE permission when creating a materialized view. [#38011](https://github.com/apache/doris/pull/38011) + +- Fixed the issue where the db was explicitly displayed as empty when showing routine load. [#38365](https://github.com/apache/doris/pull/38365) + +- Fixed the incorrect requirement of CREATE permission on the original table when using CREATE TABLE LIKE. [#37879](https://github.com/apache/doris/pull/37879) + +- Fixed the issue where grant operations did not check if the object existed. [#39597](https://github.com/apache/doris/pull/39597) + +## Upgrade suggestions + +When upgrading Doris, please follow the principle of not skipping two minor versions and upgrade sequentially. + +For example, if you are upgrading from version 0.15.x to 2.0.x, it is recommended to first upgrade to the latest version of 1.1, then upgrade to the latest version of 1.2, and finally upgrade to the latest version of 2.0. + +For more upgrade information, see the documentation: [Cluster Upgrade](../../admin-manual/cluster-management/upgrade) \ No newline at end of file diff --git a/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md new file mode 100644 index 0000000000000..414229276e6b0 --- /dev/null +++ b/versioned_docs/version-3.0/releasenotes/v2.1/release-2.1.7.md @@ -0,0 +1,180 @@ +--- +{ + "title": "Release 2.1.7", + "language": "en" +} +--- + + + +Dear community, **Apache Doris version 2.1.7 was officially released on November 10, 2024.** This version brings continuous upgrades and improvements. 
Additionally, several fixes have been implemented in areas such as Lakehouse, Async Materialized Views, Semi-Structured Data Management, Query Optimizer, and Permission Management. + +**Quick Download:** https://doris.apache.org/download/ + +**GitHub Release:** https://github.com/apache/doris/releases + +## Behavior changes + +- The following global variables will be forcibly set to the following default values: + - enable_nereids_dml: true + - enable_nereids_dml_with_pipeline: true + - enable_nereids_planner: true + - enable_fallback_to_original_planner: true + - enable_pipeline_x_engine: true +- New columns have been added to the audit log. [#42262](https://github.com/apache/doris/pull/42262) + - For more information, please refer to [docs](https://doris.apache.org/docs/admin-manual/audit-plugin/) + +## New features + +### Async Materialized View + +- Asynchronous materialized views now have a `use_for_rewrite` property that controls whether they participate in transparent rewriting. [#40332](https://github.com/apache/doris/pull/40332) + +### Query Execution + +- The list of changed session variables is now output in the Profile. [#41016](https://github.com/apache/doris/pull/41016) +- Support for `trim_in`, `ltrim_in`, and `rtrim_in` functions has been added. [#42641](https://github.com/apache/doris/pull/42641) +- Support for several URL functions (top_level_domain, first_significant_subdomain, cut_to_first_significant_subdomain) has been added. [#42916](https://github.com/apache/doris/pull/42916) +- The `bit_set` function has been added. [#42916](https://github.com/apache/doris/pull/42099) +- The `count_substrings` function has been added. [#42055](https://github.com/apache/doris/pull/42055) +- The `translate` and `url_encode` functions have been added. 
[#41051](https://github.com/apache/doris/pull/41051) +- The `normal_cdf`, `to_iso8601`, and `from_iso8601_date` functions have been added. [#40695](https://github.com/apache/doris/pull/40695) + + +### Storage Management + +- The `information_schema.table_options` and `table_properties` system tables have been added, which support querying the attributes set during table creation. [#34384](https://github.com/apache/doris/pull/34384) +- Support for `bitmap_empty` as a default value has been implemented. [#40364](https://github.com/apache/doris/pull/40364) +- A new session variable `require_sequence_in_insert` has been introduced to control whether a sequence column must be provided when performing `INSERT INTO SELECT` writes to a unique key table. [#41655](https://github.com/apache/doris/pull/41655) + +### Others + +- Flame graphs can now be generated from the BE WebUI page. [#41044](https://github.com/apache/doris/pull/41044) + +## Improvements + +### Lakehouse + +- Support for writing data to Hive text format tables. [#40537](https://github.com/apache/doris/pull/40537) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-building/hive-build) +- Access MaxCompute data using MaxCompute Open Storage API. [#41610](https://github.com/apache/doris/pull/41610) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/database/max-compute) +- Support for Paimon DLF Catalog. 
[#41694](https://github.com/apache/doris/pull/41694) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/paimon) +- Added `table$partitions` syntax to directly query Hive partition information. [#41230](https://github.com/apache/doris/pull/41230) + - For more information, please refer to [docs](https://doris.apache.org/docs/lakehouse/datalake-analytics/hive) +- Support for reading Parquet files in Brotli compression format. [#42162](https://github.com/apache/doris/pull/42162) +- Support for reading DECIMAL 256 types in Parquet files. [#42241](https://github.com/apache/doris/pull/42241) +- Support for reading Hive tables in OpenCsvSerde format. [#42939](https://github.com/apache/doris/pull/42939) + +### Async Materialized View + +- Refined the granularity of lock holding during the build process for asynchronous materialized views. [#40402](https://github.com/apache/doris/pull/40402) [#41010](https://github.com/apache/doris/pull/41010) + +### Query optimizer + +- Improved the accuracy of statistic information collection and usage in extreme cases to enhance planning stability. [#40457](https://github.com/apache/doris/pull/40457) +- Runtime filters can now be generated in more scenarios to improve query performance. [#40815](https://github.com/apache/doris/pull/40815) +- Enhanced constant folding capabilities for numerical, date, and string functions to boost query performance. [#40820](https://github.com/apache/doris/pull/40820) +- Optimized the column pruning algorithm to enhance query performance. [#41548](https://github.com/apache/doris/pull/41548) + +### Query Execution + +- Supported parallel preparation to reduce the time consumed by short queries. [#40270](https://github.com/apache/doris/pull/40270) +- Corrected the names of some counters in the profile to match the audit logs. [#41993](https://github.com/apache/doris/pull/41993) +- Added new local shuffle rules to speed up certain queries. 
[#40637](https://github.com/apache/doris/pull/40637) + +### Storage Management + +- The `SHOW PARTITIONS` command now supports displaying the commit version. [#28274](https://github.com/apache/doris/pull/28274) +- Checked for unreasonable partition expressions when creating tables. [#40158](https://github.com/apache/doris/pull/40158) +- Optimized the scheduling logic when encountering EOF in Routine Load. [#40509](https://github.com/apache/doris/pull/40509) +- Made Routine Load aware of schema changes. [#40508](https://github.com/apache/doris/pull/40508) +- Improved the timeout logic for Routine Load tasks. [#41135](https://github.com/apache/doris/pull/41135) + +### Others + +- Allowed closing the built-in service port of BRPC via BE configuration. [#41047](https://github.com/apache/doris/pull/41047) +- Fixed issues with missing fields and duplicate records in audit logs. [#43015](https://github.com/apache/doris/pull/43015) + +## Bug fixes + +### Lakehouse + +- Fixed the inconsistency in the behavior of INSERT OVERWRITE with Hive. [#39840](https://github.com/apache/doris/pull/39840) +- Cleaned up temporarily created folders to address the issue of too many empty folders on HDFS. [#40424](https://github.com/apache/doris/pull/40424) +- Resolved memory leaks in FE caused by using the JDBC Catalog in some cases. [#40923](https://github.com/apache/doris/pull/40923) +- Resolved memory leaks in BE caused by using the JDBC Catalog in some cases. [#41266](https://github.com/apache/doris/pull/41266) +- Fixed errors in reading Snappy compressed formats in certain scenarios. [#40862](https://github.com/apache/doris/pull/40862) +- Addressed potential FileSystem leaks on the FE side in certain scenarios. [#41108](https://github.com/apache/doris/pull/41108) +- Resolved issues where using EXPLAIN VERBOSE to view external table execution plans could cause null pointer exceptions in some cases. 
[#41231](https://github.com/apache/doris/pull/41231) +- Fixed the inability to read tables in Paimon parquet format. [#41487](https://github.com/apache/doris/pull/41487) +- Addressed performance issues introduced by compatibility changes in the JDBC Oracle Catalog. [#41407](https://github.com/apache/doris/pull/41407) +- Disabled predicate pushing down after implicit conversion to resolve incorrect query results in some cases with JDBC Catalog. [#42242](https://github.com/apache/doris/pull/42242) +- Fixed issues with case-sensitive access to table names in the External Catalog. [#42261](https://github.com/apache/doris/pull/42261) + +### Async Materialized View + +- Fixed the issue where user-specified start times were not effective. [#39573](https://github.com/apache/doris/pull/39573) +- Resolved the issue of nested materialized views not refreshing. [#40433](https://github.com/apache/doris/pull/40433) +- Fixed the issue where materialized views might not refresh after the base table was deleted and recreated. [#41762](https://github.com/apache/doris/pull/41762) +- Addressed issues where partition compensation rewrites could lead to incorrect results. [#40803](https://github.com/apache/doris/pull/40803) +- Fixed potential errors in rewrite results when `sql_select_limit` was set. [#40106](https://github.com/apache/doris/pull/40106) + +### Semi-Structured Data Management + +- Fixed the issue of index file handle leaks. [#41915](https://github.com/apache/doris/pull/41915) +- Addressed inaccuracies in the `count()` function of inverted indexes in special cases. [#41127](https://github.com/apache/doris/pull/41127) +- Fixed exceptions with variant when light schema change was not enabled. [#40908](https://github.com/apache/doris/pull/40908) +- Resolved memory leaks when variant returns arrays. 
[#41339](https://github.com/apache/doris/pull/41339) + +### Query optimizer + +- Corrected potential errors in nullable calculations for filter conditions during external table queries, which could lead to execution exceptions. [#41014](https://github.com/apache/doris/pull/41014) +- Fixed potential errors in optimizing range comparison expressions. [#41356](https://github.com/apache/doris/pull/41356) + +### Query Execution + +- Fixed the issue where the `match_regexp` function could not correctly handle empty strings. [#39503](https://github.com/apache/doris/pull/39503) +- Resolved issues where the scanner thread pool could become stuck in high-concurrency scenarios. [#40495](https://github.com/apache/doris/pull/40495) +- Fixed errors in the results of the `data_floor` function. [#41948](https://github.com/apache/doris/pull/41948) +- Addressed incorrect cancel messages in some scenarios. [#41798](https://github.com/apache/doris/pull/41798) +- Fixed issues with excessive warning logs printed by arrow flight. [#41770](https://github.com/apache/doris/pull/41770) +- Resolved issues where runtime filters failed to send in some scenarios. [#41698](https://github.com/apache/doris/pull/41698) +- Fixed problems where some system table queries could not end normally or became stuck. [#41592](https://github.com/apache/doris/pull/41592) +- Addressed incorrect results from window functions. [#40761](https://github.com/apache/doris/pull/40761) +- Fixed issues where the encrypt and decrypt functions caused BE cores. [#40726](https://github.com/apache/doris/pull/40726) +- Resolved errors in the results of the conv function. [#40530](https://github.com/apache/doris/pull/40530) + +### Storage Management + +- Fixed import failures when Memtable migration was used in multi-replica scenarios with machine crashes. [#38003](https://github.com/apache/doris/pull/38003) +- Addressed inaccurate memory statistics during the Memtable flush phase during imports. 
[#39536](https://github.com/apache/doris/pull/39536) +- Fixed fault tolerance issues with Memtable migration in multi-replica scenarios. [#40477](https://github.com/apache/doris/pull/40477) +- Resolved inaccurate bvar statistics with Memtable migration. [#40985](https://github.com/apache/doris/pull/40985) +- Fixed inaccurate progress reporting for S3 loads. [#40987](https://github.com/apache/doris/pull/40987) + +### Permissions + +- Fixed permission issues related to show columns, show sync, and show data from db.table. [#39726](https://github.com/apache/doris/pull/39726) + +### Others + +- Fixed the issue where the audit log plugin for version 2.0 could not be used in version 2.1. [#41400](https://github.com/apache/doris/pull/41400) diff --git a/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md b/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md index c15141832f1eb..b15777212b400 100644 --- a/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md +++ b/versioned_docs/version-3.0/releasenotes/v3.0/release-3.0.3.md @@ -25,7 +25,7 @@ under the License. --> -Dear community members, the Apache Doris 3.0.2 version was officially released on December 02, 2024, this version further enhances the performance and stability of the system. +Dear community members, the Apache Doris 3.0.3 version was officially released on December 02, 2024, this version further enhances the performance and stability of the system. 
**Quick Download:** https://doris.apache.org/download/ diff --git a/versioned_sidebars/version-1.2-sidebars.json b/versioned_sidebars/version-1.2-sidebars.json index 294acbf4fed39..0726d49201142 100644 --- a/versioned_sidebars/version-1.2-sidebars.json +++ b/versioned_sidebars/version-1.2-sidebars.json @@ -35,7 +35,9 @@ { "type": "category", "label": "Doris Introduction", - "items": ["summary/basic-summary"] + "items": [ + "summary/basic-summary" + ] }, { "type": "category", @@ -151,17 +153,25 @@ { "type": "category", "label": "Alter Table", - "items": ["advanced/alter-table/schema-change", "advanced/alter-table/replace-table"] + "items": [ + "advanced/alter-table/schema-change", + "advanced/alter-table/replace-table" + ] }, { "type": "category", "label": "Doris Partition", - "items": ["advanced/partition/dynamic-partition", "advanced/partition/table-temp-partition"] + "items": [ + "advanced/partition/dynamic-partition", + "advanced/partition/table-temp-partition" + ] }, { "type": "category", "label": "Data Cache", - "items": ["advanced/cache/partition-cache"] + "items": [ + "advanced/cache/partition-cache" + ] }, "advanced/autobucket", "advanced/broker", @@ -204,7 +214,9 @@ { "type": "category", "label": "Slow Query Analysis", - "items": ["query-acceleration/slow-query-analysis/get-profile"] + "items": [ + "query-acceleration/slow-query-analysis/get-profile" + ] } ] }, @@ -1008,7 +1020,9 @@ { "type": "category", "label": "Operators", - "items": ["sql-manual/sql-reference/Operators/in"] + "items": [ + "sql-manual/sql-reference/Operators/in" + ] }, { "type": "category", @@ -1097,12 +1111,17 @@ { "type": "category", "label": "User Privilege and Ldap", - "items": ["admin-manual/privilege-ldap/user-privilege", "admin-manual/privilege-ldap/ldap"] + "items": [ + "admin-manual/privilege-ldap/user-privilege", + "admin-manual/privilege-ldap/ldap" + ] }, { "type": "category", "label": "System Table", - "items": ["admin-manual/system-table/rowsets"] + "items": [ + 
"admin-manual/system-table/rowsets" + ] }, "admin-manual/multi-tenant", { @@ -1190,7 +1209,11 @@ "type": "category", "label": "Benchmark", "collapsed": false, - "items": ["benchmark/ssb", "benchmark/tpch", "benchmark/tpcds"] + "items": [ + "benchmark/ssb", + "benchmark/tpch", + "benchmark/tpcds" + ] }, { "type": "category", @@ -1233,23 +1256,94 @@ "type": "category", "label": "FAQ", "collapsed": false, - "items": ["faq/install-faq", "faq/data-faq", "faq/sql-faq", "faq/lakehouse-faq", "faq/bi-faq"] + "items": [ + "faq/install-faq", + "faq/data-faq", + "faq/sql-faq", + "faq/lakehouse-faq", + "faq/bi-faq" + ] }, { "type": "category", "label": "Releases", "collapsed": false, "items": [ - "releasenotes/v1.2/release-1.2.8", - "releasenotes/v1.2/release-1.2.7", - "releasenotes/v1.2/release-1.2.6", - "releasenotes/v1.2/release-1.2.5", - "releasenotes/v1.2/release-1.2.4", - "releasenotes/v1.2/release-1.2.3", - "releasenotes/v1.2/release-1.2.2", - "releasenotes/v1.2/release-1.2.1", - "releasenotes/v1.2/release-1.2.0" + "releasenotes/all-release", + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + "releasenotes/v2.1/release-2.1.4", + "releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + "releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + 
"releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + "releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] -} +} \ No newline at end of file diff --git a/versioned_sidebars/version-2.0-sidebars.json b/versioned_sidebars/version-2.0-sidebars.json index 7a2feea7ca4ce..e284f52cd0f3f 100644 --- a/versioned_sidebars/version-2.0-sidebars.json +++ b/versioned_sidebars/version-2.0-sidebars.json @@ -77,7 +77,9 @@ { "type": "category", "label": "Database Connection", - "items": ["db-connect/database-connect"] + "items": [ + "db-connect/database-connect" + ] }, { "type": "category", @@ -200,12 +202,18 @@ { "type": "category", "label": "Quering Variables", - "items": ["query/query-variables/variables", "query/query-variables/sql-mode"] + "items": [ + "query/query-variables/variables", + "query/query-variables/sql-mode" + ] }, { "type": "category", "label": "Cost-Based Optimizer", - "items": ["query/nereids/nereids-new", "query/nereids/statistics"] + "items": [ + "query/nereids/nereids-new", + "query/nereids/statistics" + ] }, "query/pipeline-execution-engine", { @@ -239,7 +247,10 @@ { "type": "category", 
"label": "Distincting Counts", - "items": ["query/duplicate/orthogonal-bitmap-manual", "query/duplicate/using-hll"] + "items": [ + "query/duplicate/orthogonal-bitmap-manual", + "query/duplicate/using-hll" + ] }, "query/high-concurrent-point-query", "query/topn-query", @@ -255,7 +266,10 @@ { "type": "category", "label": "User Defined Functions", - "items": ["query/udf/java-user-defined-function", "query/udf/remote-user-defined-function"] + "items": [ + "query/udf/java-user-defined-function", + "query/udf/remote-user-defined-function" + ] } ] }, @@ -488,7 +502,11 @@ "type": "category", "label": "Benchmark", "collapsed": false, - "items": ["benchmark/ssb", "benchmark/tpch", "benchmark/tpcds"] + "items": [ + "benchmark/ssb", + "benchmark/tpch", + "benchmark/tpcds" + ] }, { "type": "category", @@ -515,7 +533,10 @@ { "type": "category", "label": "SQL Clients", - "items": ["ecosystem/bi/dbeaver", "ecosystem/bi/datagrip"] + "items": [ + "ecosystem/bi/dbeaver", + "ecosystem/bi/datagrip" + ] }, { "type": "category", @@ -540,7 +561,13 @@ "type": "category", "label": "FAQ", "collapsed": false, - "items": ["faq/install-faq", "faq/data-faq", "faq/sql-faq", "faq/lakehouse-faq", "faq/bi-faq"] + "items": [ + "faq/install-faq", + "faq/data-faq", + "faq/sql-faq", + "faq/lakehouse-faq", + "faq/bi-faq" + ] }, { "type": "category", @@ -609,7 +636,10 @@ { "type": "category", "label": "IP Data Type", - "items": ["sql-manual/sql-data-types/ip/IPV4", "sql-manual/sql-data-types/ip/IPV6"] + "items": [ + "sql-manual/sql-data-types/ip/IPV4", + "sql-manual/sql-data-types/ip/IPV6" + ] } ] }, @@ -1460,7 +1490,9 @@ { "type": "category", "label": "Operators", - "items": ["sql-manual/sql-reference/Operators/in"] + "items": [ + "sql-manual/sql-reference/Operators/in" + ] }, { "type": "category", @@ -1485,23 +1517,80 @@ "collapsed": false, "items": [ "releasenotes/all-release", - "releasenotes/v2.0/release-2.0.15", - "releasenotes/v2.0/release-2.0.14", - "releasenotes/v2.0/release-2.0.13", - 
"releasenotes/v2.0/release-2.0.12", - "releasenotes/v2.0/release-2.0.11", - "releasenotes/v2.0/release-2.0.10", - "releasenotes/v2.0/release-2.0.9", - "releasenotes/v2.0/release-2.0.8", - "releasenotes/v2.0/release-2.0.7", - "releasenotes/v2.0/release-2.0.6", - "releasenotes/v2.0/release-2.0.5", - "releasenotes/v2.0/release-2.0.4", - "releasenotes/v2.0/release-2.0.3", - "releasenotes/v2.0/release-2.0.2", - "releasenotes/v2.0/release-2.0.1", - "releasenotes/v2.0/release-2.0.0" + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + "releasenotes/v2.1/release-2.1.4", + "releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + "releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + "releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + 
"releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] -} +} \ No newline at end of file diff --git a/versioned_sidebars/version-2.1-sidebars.json b/versioned_sidebars/version-2.1-sidebars.json index c6668ab0bc841..7205212518e89 100644 --- a/versioned_sidebars/version-2.1-sidebars.json +++ b/versioned_sidebars/version-2.1-sidebars.json @@ -1865,14 +1865,79 @@ "collapsed": false, "items": [ "releasenotes/all-release", - "releasenotes/v2.1/release-2.1.7", - "releasenotes/v2.1/release-2.1.6", - "releasenotes/v2.1/release-2.1.5", - "releasenotes/v2.1/release-2.1.4", - "releasenotes/v2.1/release-2.1.3", - "releasenotes/v2.1/release-2.1.2", - "releasenotes/v2.1/release-2.1.1", - "releasenotes/v2.1/release-2.1.0" + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + "releasenotes/v2.1/release-2.1.4", + "releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + 
"releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + "releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + "releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] diff --git a/versioned_sidebars/version-3.0-sidebars.json b/versioned_sidebars/version-3.0-sidebars.json index a442674e10363..eee9f415b1ae9 100644 --- a/versioned_sidebars/version-3.0-sidebars.json +++ b/versioned_sidebars/version-3.0-sidebars.json @@ -1922,10 +1922,79 @@ "collapsed": false, "items": [ "releasenotes/all-release", - "releasenotes/v3.0/release-3.0.3", - "releasenotes/v3.0/release-3.0.2", - "releasenotes/v3.0/release-3.0.1", - "releasenotes/v3.0/release-3.0.0" + { + "type": "category", + "label": "v3.0", + "items": [ + "releasenotes/v3.0/release-3.0.3", + "releasenotes/v3.0/release-3.0.2", + "releasenotes/v3.0/release-3.0.1", + "releasenotes/v3.0/release-3.0.0" + ] + }, + { + "type": "category", + "label": "v2.1", + "items": [ + "releasenotes/v2.1/release-2.1.7", + "releasenotes/v2.1/release-2.1.6", + "releasenotes/v2.1/release-2.1.5", + "releasenotes/v2.1/release-2.1.4", + 
"releasenotes/v2.1/release-2.1.3", + "releasenotes/v2.1/release-2.1.2", + "releasenotes/v2.1/release-2.1.1", + "releasenotes/v2.1/release-2.1.0" + ] + }, + { + "type": "category", + "label": "v2.0", + "items": [ + "releasenotes/v2.0/release-2.0.15", + "releasenotes/v2.0/release-2.0.14", + "releasenotes/v2.0/release-2.0.13", + "releasenotes/v2.0/release-2.0.12", + "releasenotes/v2.0/release-2.0.11", + "releasenotes/v2.0/release-2.0.10", + "releasenotes/v2.0/release-2.0.9", + "releasenotes/v2.0/release-2.0.8", + "releasenotes/v2.0/release-2.0.7", + "releasenotes/v2.0/release-2.0.6", + "releasenotes/v2.0/release-2.0.5", + "releasenotes/v2.0/release-2.0.4", + "releasenotes/v2.0/release-2.0.3", + "releasenotes/v2.0/release-2.0.2", + "releasenotes/v2.0/release-2.0.1", + "releasenotes/v2.0/release-2.0.0" + ] + }, + { + "type": "category", + "label": "v1.2", + "items": [ + "releasenotes/v1.2/release-1.2.8", + "releasenotes/v1.2/release-1.2.7", + "releasenotes/v1.2/release-1.2.6", + "releasenotes/v1.2/release-1.2.5", + "releasenotes/v1.2/release-1.2.4", + "releasenotes/v1.2/release-1.2.3", + "releasenotes/v1.2/release-1.2.2", + "releasenotes/v1.2/release-1.2.1", + "releasenotes/v1.2/release-1.2.0" + ] + }, + { + "type": "category", + "label": "v1.1", + "items": [ + "releasenotes/v1.1/release-1.1.5", + "releasenotes/v1.1/release-1.1.4", + "releasenotes/v1.1/release-1.1.3", + "releasenotes/v1.1/release-1.1.2", + "releasenotes/v1.1/release-1.1.1", + "releasenotes/v1.1/release-1.1.0" + ] + } ] } ] From 6b51961bbaabe850aebaa789f4b2db5cb4361720 Mon Sep 17 00:00:00 2001 From: kassiez Date: Thu, 9 Jan 2025 14:57:45 +0800 Subject: [PATCH 3/5] 1 --- .../Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md | 6 +++--- .../{ => lakehouse}/lakehouse-best-practices/doris-hudi.md | 0 .../lakehouse-best-practices/doris-iceberg.md | 0 .../lakehouse-best-practices/doris-lakesoul.md | 0 .../lakehouse-best-practices/doris-paimon.md | 0 .../SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md | 6 +----- 
 .../SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md | 5 +----
 versioned_sidebars/version-2.1-sidebars.json | 2 +-
 versioned_sidebars/version-3.0-sidebars.json | 2 +-
 9 files changed, 7 insertions(+), 14 deletions(-)
 rename i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/{ => lakehouse}/lakehouse-best-practices/doris-hudi.md (100%)
 rename i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/{ => lakehouse}/lakehouse-best-practices/doris-iceberg.md (100%)
 rename i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/{ => lakehouse}/lakehouse-best-practices/doris-lakesoul.md (100%)
 rename i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/{ => lakehouse}/lakehouse-best-practices/doris-paimon.md (100%)
 rename i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/{table => materialized-view}/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md (91%)
 rename i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/{table => materialized-view}/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md (92%)
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
index 5ceda114bfcc1..905eed753717a 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
@@ -32,7 +32,7 @@ SHOW ALTER TABLE MATERIALIZED VIEW
 ## 描述
-该命令用于查看通过 [CREATE-MATERIALIZED-VIEW](../../sql-reference/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) 语句提交的创建物化视图作业的执行情况。
+该命令用于查看通过 [CREATE-MATERIALIZED-VIEW](../../sql-statements/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) 语句提交的创建物化视图作业的执行情况。
 > 该语句等同于
`SHOW ALTER TABLE ROLLUP`;
@@ -72,7 +72,7 @@ RollupIndexName: r1
 1 row in set (0.00 sec)
 ```
-- `JobId`:作业唯一ID。
+- `JobId`:作业唯一 ID。
 - `TableName`:基表名称
@@ -90,7 +90,7 @@ RollupIndexName: r1
 - WAITING_TXN:
- 在正式开始产生物化视图数据前,会等待当前这个表上的正在运行的导入事务完成。而 `TransactionId` 字段就是当前正在等待的事务ID。当这个ID之前的导入都完成后,就会实际开始作业。
+ 在正式开始产生物化视图数据前,会等待当前这个表上的正在运行的导入事务完成。而 `TransactionId` 字段就是当前正在等待的事务 ID。当这个 ID 之前的导入都完成后,就会实际开始作业。
 - RUNNING:作业运行中。
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-hudi.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-hudi.md
similarity index 100%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-hudi.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-hudi.md
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-iceberg.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-iceberg.md
similarity index 100%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-iceberg.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-iceberg.md
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-lakesoul.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-lakesoul.md
similarity index 100%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-lakesoul.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-lakesoul.md
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-paimon.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-paimon.md
similarity index 100%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse-best-practices/doris-paimon.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/lakehouse-best-practices/doris-paimon.md
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
similarity index 91%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
index fc5d0e6a6f195..a5fff7fb72097 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
@@ -27,7 +27,7 @@ under the License.
 ## 描述
-该命令用于查看通过 [CREATE-MATERIALIZED-VIEW](../../sql-reference/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) 语句提交的创建物化视图作业的执行情况。
+该命令用于查看通过 [CREATE-MATERIALIZED-VIEW](../../../sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md) 语句提交的创建物化视图作业的执行情况。
 > 该语句等同于 `SHOW ALTER TABLE ROLLUP`;
@@ -107,9 +107,5 @@ RollupIndexName: r1
 SHOW ALTER TABLE MATERIALIZED VIEW FROM example_db;
 ```
-## 关键词
- SHOW, ALTER, TABLE, MATERIALIZED, VIEW
-
-## 最佳实践
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
similarity index 92%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
index 7abf8773934e6..70696266aeccb 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
@@ -28,7 +28,7 @@ under the License.
 ## 描述
-该命令用于查看通过 [CREATE-MATERIALIZED-VIEW](../../sql-reference/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) 语句提交的创建物化视图作业的执行情况。
+该命令用于查看通过 [CREATE-MATERIALIZED-VIEW](../../../sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md) 语句提交的创建物化视图作业的执行情况。
 > 该语句等同于 `SHOW ALTER TABLE ROLLUP`;
@@ -108,9 +108,6 @@ RollupIndexName: r1
 SHOW ALTER TABLE MATERIALIZED VIEW FROM example_db;
 ```
-## 关键词
-
- SHOW, ALTER, TABLE, MATERIALIZED, VIEW
diff --git a/versioned_sidebars/version-2.1-sidebars.json b/versioned_sidebars/version-2.1-sidebars.json
index 23f15f27c1d04..415db9dba7a89 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -1603,7 +1603,6 @@
 "sql-manual/sql-statements/table-and-view/table/ALTER-TABLE-COMMENT",
 "sql-manual/sql-statements/table-and-view/table/CANCEL-ALTER-TABLE",
 "sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE",
- "sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/table/TRUNCATE-TABLE",
 "sql-manual/sql-statements/table-and-view/table/DROP-TABLE",
 "sql-manual/sql-statements/table-and-view/table/SHOW-CREATE-TABLE",
@@ -1652,6 +1651,7 @@
 "sql-manual/sql-statements/table-and-view/materialized-view/DROP-ASYNC-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/REFRESH-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/SHOW-CREATE-MATERIALIZED-VIEW",
+ "sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/PAUSE-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/RESUME-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/CANCEL-MATERIALIZED-VIEW-TASK"
diff --git a/versioned_sidebars/version-3.0-sidebars.json b/versioned_sidebars/version-3.0-sidebars.json
index
47a8ef4bf0388..a7a63b568a34a 100644
--- a/versioned_sidebars/version-3.0-sidebars.json
+++ b/versioned_sidebars/version-3.0-sidebars.json
@@ -1658,7 +1658,6 @@
 "sql-manual/sql-statements/table-and-view/table/ALTER-TABLE-AND-GENERATED-COLUMN",
 "sql-manual/sql-statements/table-and-view/table/CANCEL-ALTER-TABLE",
 "sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE",
- "sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/table/TRUNCATE-TABLE",
 "sql-manual/sql-statements/table-and-view/table/DROP-TABLE",
 "sql-manual/sql-statements/table-and-view/table/SHOW-CREATE-TABLE",
@@ -1707,6 +1706,7 @@
 "sql-manual/sql-statements/table-and-view/materialized-view/DROP-ASYNC-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/REFRESH-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/SHOW-CREATE-MATERIALIZED-VIEW",
+ "sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW",
 "sql-manual/sql-statements/table-and-view/materialized-view/PAUSE-MATERIALIZED-VIEW-JOB",
 "sql-manual/sql-statements/table-and-view/materialized-view/RESUME-MATERIALIZED-VIEW-JOB",
 "sql-manual/sql-statements/table-and-view/materialized-view/CANCEL-MATERIALIZED-VIEW-TASK"
From be43be9cb2daf65ef67f15ce023d9347140e1edd Mon Sep 17 00:00:00 2001
From: kassiez
Date: Thu, 9 Jan 2025 15:01:22 +0800
Subject: [PATCH 4/5] 2
---
 .../Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md | 2 +-
 .../SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md | 2 +-
 .../SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)
 rename versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/{table => materialized-view}/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md (97%)
 rename versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/{table => materialized-view}/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md (97%)
diff --git
a/docs/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md b/docs/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
index ac8ecd3cf0530..24498cda880df 100644
--- a/docs/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
+++ b/docs/sql-manual/sql-statements/Show-Statements/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
@@ -32,7 +32,7 @@ SHOW ALTER TABLE MATERIALIZED VIEW
 ### Description
-This command is used to view the execution of the Create Materialized View job submitted through the [CREATE-MATERIALIZED-VIEW](../../sql-reference/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) statement.
+This command is used to view the execution of the Create Materialized View job submitted through the [CREATE-MATERIALIZED-VIEW](../../sql-statements/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) statement.
 > This statement is equivalent to `SHOW ALTER TABLE ROLLUP`;
diff --git a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
similarity index 97%
rename from versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
rename to versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
index 76f207a020015..7faaccdbab074 100644
--- a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
+++ b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
@@ -27,7 +27,7 @@ under the License.
 ## Description
-This command is used to view the execution of the Create Materialized View job submitted through the [CREATE-MATERIALIZED-VIEW](../../sql-reference/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) statement.
+This command is used to view the execution of the Create Materialized View job submitted through the [CREATE-MATERIALIZED-VIEW](../../table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md) statement.
 > This statement is equivalent to `SHOW ALTER TABLE ROLLUP`;
diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
similarity index 97%
rename from versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
rename to versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
index 6e960837477c8..606bca984c6ca 100644
--- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
+++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/SHOW-ALTER-TABLE-MATERIALIZED-VIEW.md
@@ -27,7 +27,7 @@ under the License.
 ## Description
-This command is used to view the execution of the Create Materialized View job submitted through the [CREATE-MATERIALIZED-VIEW](../../sql-reference/Data-Definition-Statements/Create/CREATE-MATERIALIZED-VIEW.md) statement.
+This command is used to view the execution of the Create Materialized View job submitted through the [CREATE-MATERIALIZED-VIEW](../../table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md) statement.
 > This statement is equivalent to `SHOW ALTER TABLE ROLLUP`;
From bb070e2ad2a2cc3479e621e9ba4fbc7a5fa285b4 Mon Sep 17 00:00:00 2001
From: kassiez
Date: Thu, 9 Jan 2025 15:03:03 +0800
Subject: [PATCH 5/5] deadlink
---
 .../materialized-view/CREATE-MATERIALIZED-VIEW.md | 2 +-
 .../materialized-view/CREATE-MATERIALIZED-VIEW.md | 2 +-
 .../materialized-view/CREATE-MATERIALIZED-VIEW.md | 2 +-
 .../materialized-view/CREATE-MATERIALIZED-VIEW.md | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
index be70a0856a7e0..f35b282b6d3a8 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
@@ -28,7 +28,7 @@ under the License.
 该语句用于创建物化视图。
-该操作为异步操作,提交成功后,需通过 [SHOW ALTER TABLE MATERIALIZED VIEW](../table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW) 查看作业进度。在显示 FINISHED 后既可通过 `desc [table_name] all` 命令来查看物化视图的 schema 了。
+该操作为异步操作,提交成功后,需通过 [SHOW ALTER TABLE MATERIALIZED VIEW](./SHOW-ALTER-TABLE-MATERIALIZED-VIEW) 查看作业进度。在显示 FINISHED 后既可通过 `desc [table_name] all` 命令来查看物化视图的 schema 了。
 语法:
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
index e254c41bb0cf2..03eeb0c23a278 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
@@ -31,7 +31,7 @@ under the License.
 该语句用于创建物化视图。
-该操作为异步操作,提交成功后,需通过 [SHOW ALTER TABLE MATERIALIZED VIEW](../table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW) 查看作业进度。在显示 FINISHED 后既可通过 `desc [table_name] all` 命令来查看物化视图的 schema 了。
+该操作为异步操作,提交成功后,需通过 [SHOW ALTER TABLE MATERIALIZED VIEW](./SHOW-ALTER-TABLE-MATERIALIZED-VIEW) 查看作业进度。在显示 FINISHED 后既可通过 `desc [table_name] all` 命令来查看物化视图的 schema 了。
 语法:
diff --git a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
index e6e0c13059e35..fb335f0939bed 100644
--- a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
+++ b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
@@ -29,7 +29,7 @@ under the License.
 该语句用于创建物化视图。
-该操作为异步操作,提交成功后,需通过 [SHOW ALTER TABLE MATERIALIZED VIEW](../table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW) 查看作业进度。在显示 FINISHED 后既可通过 `desc [table_name] all` 命令来查看物化视图的 schema 了。
+该操作为异步操作,提交成功后,需通过 [SHOW ALTER TABLE MATERIALIZED VIEW](./SHOW-ALTER-TABLE-MATERIALIZED-VIEW) 查看作业进度。在显示 FINISHED 后既可通过 `desc [table_name] all` 命令来查看物化视图的 schema 了。
 语法:
diff --git a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
index 6db9459e6b289..954cde79a2323 100644
--- a/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
+++ b/versioned_docs/version-3.0/sql-manual/sql-statements/table-and-view/materialized-view/CREATE-MATERIALIZED-VIEW.md
@@ -29,7 +29,7 @@ under the License.
 This statement is used to create a materialized view.
-This operation is an asynchronous operation. After the submission is successful, you need to view the job progress through [SHOW ALTER TABLE MATERIALIZED VIEW](../table/SHOW-ALTER-TABLE-MATERIALIZED-VIEW). After displaying FINISHED, you can use the `desc [table_name] all` command to view the schema of the materialized view.
+This operation is an asynchronous operation. After the submission is successful, you need to view the job progress through [SHOW ALTER TABLE MATERIALIZED VIEW](./SHOW-ALTER-TABLE-MATERIALIZED-VIEW). After displaying FINISHED, you can use the `desc [table_name] all` command to view the schema of the materialized view.
 grammar: