- It was possible for incremental models using the delete+insert strategy to fail if ClickHouse "lightweight deletes" were not enabled, or if the required setting `allow_nondeterministic_mutations` was not enabled and the user did not have permission to apply it. This condition is now detected on startup, and an exception is thrown if `use_lw_deletes` is configured in the profile. Otherwise, a warning is logged that incremental models will be slower (because such models are downgraded to the `legacy` incremental strategy). This should prevent the confusing behavior in ClickHouse#197 by throwing an early exception for an unsupported configuration.
- Minimal compatibility with dbt 1.7.x. The date_spine macro and additional automated tests have not been implemented, but are planned for a future patch release.
- dbt 1.7 introduces a (complex) optimization mechanism for retrieving the dbt catalog that is overkill for ClickHouse (which has no separate schema/database level), so this release includes some internal catalog changes to simplify that process.
- The dbt `on_schema_change` configuration value for incremental models was effectively being ignored. This has been fixed with a very limited implementation. Closes ClickHouse#199. Because of the way that ORDER BY/SORT BY/PARTITION BY/PRIMARY KEY work in ClickHouse, plus the complexities of correctly transforming ClickHouse data types, `sync_all_columns` is not currently supported (although an implementation that works for non-key columns is theoretically possible, such an enhancement is not currently planned). Accordingly, only the `ignore`, `fail`, and `append_new_columns` values are supported for `on_schema_change`. It is also not currently supported for Distributed tables. Note that actually appending new columns requires a fallback to the `legacy` incremental strategy, which is quite inefficient, so while theoretically possible, using `append_new_columns` is not recommended except for very small data volumes (see the sketch below).
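As an illustration only (the model name, source, and columns below are hypothetical, not from the project docs), a minimal incremental model sketch using `on_schema_change`:

```sql
-- models/events_incremental.sql (hypothetical model and source names)
-- on_schema_change='append_new_columns' appends new columns on schema change;
-- per the note above, this forces the slower "legacy" incremental strategy
{{
    config(
        materialized='incremental',
        engine='MergeTree()',
        order_by=['event_id'],
        unique_key='event_id',
        on_schema_change='append_new_columns'
    )
}}

select event_id, event_type, created_at
from {{ source('raw', 'events') }}
```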
- Identifier quoting was disabled for tables/databases etc. This would cause failures for schemas or tables using reserved words or containing special characters. This has been fixed and some macros have been updated to correctly handle such identifiers. Note that there still may be untested edge cases where nonstandard identifiers cause issues, so they are still not recommended. Closes ClickHouse#144. Thanks to Alexandru Pisarenco for the report and initial PR!
- The new `allow_automatic_deduplication` setting was not being correctly propagated to the adapter, so setting it to `True` did not have the intended effect. In addition, this setting is now ignored for older ClickHouse versions that do not support `CREATE TABLE AS SELECT ... EMPTY`, since the automatic deduplication window is required to allow correct inserts in Replicated tables on those older versions. Fixes ClickHouse#216.
- Compatible with dbt 1.6.x. Note that the new dbt `clone` feature is not supported, as ClickHouse has no native "lightweight" clone functionality, and copying tables without actual data transfer is not possible in ClickHouse (barring file manipulation outside ClickHouse itself).
- A new ClickHouse-specific Materialized View materialization, contributed by Rory Sawyer. This creates a ClickHouse Materialized View using the `TO` form, with the name `<model_name>_mv` and the associated target table `<model_name>`. It's highly recommended to fully understand how ClickHouse materialized views work before using this materialization (see the sketch below).
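A minimal sketch of such a model, assuming the materialization is selected with `materialized='materialized_view'` and using hypothetical model/source names (consult the README for the authoritative options):

```sql
-- models/daily_totals.sql (hypothetical): creates target table `daily_totals`
-- plus the ClickHouse materialized view `daily_totals_mv` (TO form)
{{
    config(
        materialized='materialized_view',
        engine='MergeTree()',
        order_by=['day']
    )
}}

select
    toDate(created_at) as day,
    count() as total
from {{ source('raw', 'events') }}
group by day
```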
- The `ON CLUSTER` clause was in the incorrect place for legacy incremental materializations. This has been fixed. Thanks to Steven Reitsma for the fix!
- The `ON CLUSTER` DDL for dropping tables did not include a `SYNC` modifier, which might be the cause of some "table already exists" errors. The `SYNC` modifier has been added to the `on_cluster` macro when dropping relations.
- Fixed a bug where using table settings such as `allow_nullable_key` would break "legacy" incremental materializations. Closes ClickHouse#209. Also see the new model `config` property `insert_settings` described below.
- Fixed an issue where incremental materializations would incorrectly exclude duplicated inserted elements due to "automatic" ClickHouse deduplication on replicated tables. Closes ClickHouse#213. The fix consists of always sending a `replicated_deduplication_window=0` table setting when creating the incremental relations. This behavior can be overridden by setting the new profile parameter `allow_automatic_deduplication` to `True`, although for general dbt operations this is probably not necessary and not recommended. Finally, thanks to [Andy](https://github.com/andy-miracl) for the report and debugging help!
- Added a new profile property `allow_automatic_deduplication`, which defaults to `False`. ClickHouse Replicated deduplication is now disabled for incremental inserts, but this property can be set to `True` if for some reason the default ClickHouse behavior for inserted blocks is desired.
- Added a new model `config` property `query_settings` for any ClickHouse settings that should be sent with the `INSERT INTO` or `DELETE FROM` queries used with materializations. Note this is distinct from the existing property `settings`, which is used for ClickHouse "table" settings in DDL statements like `CREATE TABLE ... AS` (see the sketch below).
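A minimal sketch (hypothetical model, source, and setting values, assuming both properties accept a settings map) showing the distinction between the two properties:

```sql
-- models/events_incremental.sql (hypothetical)
-- `settings` become table-level SETTINGS in the CREATE TABLE ... AS DDL,
-- while `query_settings` are sent with the INSERT INTO / DELETE FROM queries
{{
    config(
        materialized='incremental',
        engine='MergeTree()',
        order_by=['event_id'],
        settings={'allow_nullable_key': 1},
        query_settings={'max_insert_threads': 4}
    )
}}

select * from {{ source('raw', 'events') }}
```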
- Fix table materialization for compatibility with SQLFluff. Thanks to Kristof Szaloki for the PR!
- Compatible with dbt 1.5.x
- Contract support (using exact column data types)
- Fix the s3 macro when the bucket includes an `https://` prefix. Closes ClickHouse#192.
- Lots of work on Distributed table materializations. Big thanks to gfunc for the additional PR and Zhenbang for code review and suggestions. See the README for details on how to use the new functionality.
- dbt would fail if a cluster name contained a dash. This has been fixed. Thanks to [Andy](https://github.com/the4thamigo-uk) for the PR.
- Fixed issues with experimental Distributed table materializations. Closes ClickHouse#179. Thanks to Zhenbang for the report and for contributing to the fix with gfunc.
- Fixed an exception in "legacy" incremental materializations that are not distributed.
- Lightweight deletes could fail in environments where the HTTP session was not preserved (such as clusters behind a non-sticky load balancer). This has been fixed by sending the required settings with every request instead of relying on a SET statement. A similar approach has been used to persist the 'insert_distributed_sync' setting for Distributed table materializations.
- Adds additional experimental support for Distributed table engine models and incremental materialization. See the README for details. Thanks to gladkikhtutu for the contribution!
- Fixed two logging/exception handling issues that would cause an exception on startup or when handling some exceptions from the ClickHouse server. Partially addresses ClickHouse#169.
- Fixed an issue with the `on_cluster` macro that would break the exchange tables step of incremental materializations with an active cluster. Thanks to Andrew Davis for the PR. Closes ClickHouse#167
- Use the correct return value for `execute`. The incorrect value would cause an exception when running hooks. Thanks to Sergey Reshetnikov for the PR. Closes ClickHouse#161
- Added macros for creating distributed tables. See the `distributed_table.sql` include file. Thanks to gladkikhtutu for the contribution.
- Create initial dbt database (if not found) on the defined cluster on first run, instead of just the execution node. Thanks to Jens Hoevenaars for the PR
- Fix the SYSTEM SYNC REPLICA statement when exchanging tables ON CLUSTER for incremental materializations. Thanks to Schum for the PR. Closes ClickHouse#157.
- Reduce the number of SQL calls for Modify Comment operations. Thanks to Konstantin Ilchenko.
- Add "on cluster" to several additional macros to better support distributed tables. Thanks to Saurabh Bikram
- Add the unique invocation id to the temporary "new data" used in `delete+insert` incremental materializations to allow parallel transformations on the same table. In general, parallel transformations are risky, so this approach should only be used when transformations are explicitly limited to non-overlapping data ranges. Closes ClickHouse#150
- Support dbt 1.4.1 ClickHouse#135
- Adds support for Python 3.11
- Adds additional dbt 1.4.0 tests
- Adds support for `incremental_predicates`. This only applies to the `delete+insert` incremental strategy (see the sketch below). Note that incremental predicates that depend on "non-deterministic" data (such as a subquery using a table that is accepting inserts) could lead to unexpected results for ReplicatedMergeTree tables, depending on the timing of the incremental materialization.
- Replaces deprecated Exception classes
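A minimal sketch of configuring `incremental_predicates` with the `delete+insert` strategy; the model, source, columns, and predicate below are hypothetical:

```sql
-- models/events_incremental.sql (hypothetical column and predicate)
-- incremental_predicates adds extra filters on the existing data handled by
-- the delete+insert strategy; avoid non-deterministic predicates (see above)
{{
    config(
        materialized='incremental',
        incremental_strategy='delete+insert',
        engine='MergeTree()',
        order_by=['event_id'],
        unique_key='event_id',
        incremental_predicates=['created_at >= toDate(now()) - 7']
    )
}}

select * from {{ source('raw', 'events') }}
```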
- Setting the `use_lw_deletes` profile value to True will now attempt to enable the `allow_experimental_lightweight_delete` setting for the dbt session (if the user has such permissions on the ClickHouse server). See ClickHouse#133
- Composite unique keys specified as a list would not work with incremental materializations. This has been fixed.
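A minimal sketch of a composite unique key specified as a list (model, source, and column names are hypothetical):

```sql
-- models/orders_incremental.sql (hypothetical)
-- composite unique key supplied as a list of columns
{{
    config(
        materialized='incremental',
        engine='MergeTree()',
        order_by=['customer_id', 'order_id'],
        unique_key=['customer_id', 'order_id']
    )
}}

select * from {{ source('raw', 'orders') }}
```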
Release 1.3.3, 2023-01-18
- The documentation has been updated to reflect that dbt-clickhouse does support ephemeral models, and ephemeral model tests do pass. However, due to a ClickHouse limitation, CTEs will not work directly with INSERT statements so table models will fail if they include ephemeral models in the SELECT. View models and other SQL statements using ephemeral models should work correctly.
- Client connections would incorrectly reuse a previous `session_id` when initializing, causing session locked errors. This has been fixed. This closes ClickHouse#127. Multiple threads should now work correctly in dbt projects, and dbt-clickhouse automated tests now use `threads=4` (soon to be the dbt default).
Release 1.3.2, 2022-12-23
- Added experimental support for the `delete+insert` incremental strategy. In most cases this strategy should be significantly faster than the existing "legacy" custom strategy (which is still the default). To enable `delete+insert` incremental materialization, the flag `allow_experimental_lightweight_delete` must be enabled on your ClickHouse server. This flag is NOT currently considered production ready, so use at your own risk. To use this strategy as the "default" incremental strategy, the new configuration value `use_lw_deletes` must be set to `True`. Otherwise, to use it for a particular model, set the model's `incremental_strategy` to `delete+insert`. Important caveats about this strategy:
  - This strategy directly uses lightweight deletes on the target model/table. It does not create a temporary or intermediate table. Accordingly, if there is an issue during that transformation, the materialization cannot be rolled back and the model may contain inconsistent data.
  - If the incremental materialization includes schema changes, or lightweight deletes are not available, dbt-clickhouse will fall back to the much slower "legacy" strategy.
- Allow `append` as an incremental strategy (see the sketch below). This is the same as using the custom model configuration value `inserts_only`.
- New s3source macro. This simplifies the process of using the ClickHouse s3 table function in queries. See the tests for example usage.
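A minimal sketch of selecting the `append` strategy for a model (model, source, and column names are hypothetical):

```sql
-- models/page_views_append.sql (hypothetical)
-- `append` simply inserts the new rows; equivalent to the older
-- inserts_only=True model configuration
{{
    config(
        materialized='incremental',
        incremental_strategy='append',
        engine='MergeTree()',
        order_by=['view_id']
    )
}}

select * from {{ source('raw', 'page_views') }}
```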
- The ON CLUSTER clause has been added to additional DDL statements, including incremental model processing. Closes ClickHouse#117 and should close ClickHouse#95 for Replicated tables that use the `{uuid}` macro in the path to avoid name conflicts. Thanks to Saurabh Bikram
- The `apply` and `revoke` grants macros now correctly work with roles as well as users. Again thanks to Saurabh Bikram
- A compound unique_key (such as `key1, key2`) now works correctly with incremental models
Release 1.3.1, 2022-11-17
- Improve error message when atomic "EXCHANGE TABLES" operation is not supported
Release 1.3.0, 2022-10-30
- Support dbt 1.3.0 ClickHouse#105
- Adds additional dbt 1.3.0 core tests
- Adds array utility macros ported from dbt-utils
- Does NOT add support for Python models (not implemented)
- Does NOT utilize default/standard incremental materialization macros (standard strategies do not work in ClickHouse)
- Require exact match for relations. ClickHouse databases and tables are all case-sensitive, so all searches are now case-sensitive. Closes ClickHouse#100 and ClickHouse#110
Release 1.2.1, 2022-09-19
- Support dbt 1.2.1 ClickHouse#79
- Grants support on models
- Support the cross database dbt-core macros (migrated from dbt-utils)
- Add docs tests
- Validate Python 3.10 support
- Implement retry logic for small set of "retryable" connection errors
- Don't close connection on release, to improve model performance
- Allow specifying database engine for schema creation
- Extend support for ClickHouse Cloud and Replicated databases
- Validate that atomic EXCHANGE TABLES is supported on the target ClickHouse server ClickHouse#94
Release 1.1.8, 2022-09-02
- Support use of atomic EXCHANGE TABLES when supported by ClickHouse
Release 1.1.7, 2022-07-11
- Support ClickHouse Arrays with Nullable value types
- Fix brackets regex in columns
Release 1.1.6, 2022-07-04
- Add support for CREATE AS SELECT in Replicated Databases
Release 1.1.5, 2022-06-27
- Ensure database exists before connection
- Safely handle dropping working database
Release 1.1.4, 2022-06-27
- Allow selection of either native protocol driver (clickhouse-driver) or http driver (clickhouse-connect)
Release 1.1.2, 2022-06-22
Release 1.1.0.2, 2022-06-17
- Support for inserting SETTINGS section to CREATE AS and INSERT INTO queries through model configuration
- Support adding custom session settings to the connection through profile credentials.
- Allow using custom databases/schemas for connection
- Support Boolean fields in CSV seeds
- Fixed prehooks and empty fields in CSV seeds
Release 1.1.0.1, 2022-06-09
- Fixed prehooks and empty fields in CSV seeds
Release 1.1.0, 2022-06-02
- 1.1.0 dbt support
- Snapshots timestamps
- Replace temporary in-memory tables with MergeTree tables, removing memory limitations on incremental model and snapshot creation
- Moved to the clickhouse-connect driver, an HTTP driver officially supported by ClickHouse Inc.
Release 1.0.4, 2022-04-02
- 1.0.4 dbt support
- New logger
Release 1.0.1, 2022-02-09
- Support 1.0.1 dbt
- Skip the order columns if the engine is Distributed ClickHouse#14
- Fix missing optional "as" ClickHouse#32
- Fix cluster name quoting ClickHouse#31
Release 1.0.0, 2022-01-02
- dbt 1.0.0 support
Release 0.21.1, 2022-01-01
- dbt 0.21.1 support
- Extended settings for clickhouse-driver ClickHouse#27
- Fix types in CSV seed ClickHouse#24
Release 0.21.0, 2021-11-18
- Support 0.21.0 dbt
- Fix FixedString column ClickHouse#20
- Default behavior for a quoted ClickHouse#21
- Fix string expand ClickHouse#22
Release 0.20.2, 2021-10-16
- dbt 0.20.1 support
- Rewrite the logic of incremental materializations ClickHouse#12
- Fix dbt tests with ClickHouse#18 (thx @artamoshin)
- Fix relationships test ClickHouse#19
Release 0.20.1, 2021-08-15
- dbt 0.20.1 support
Release 0.20.0, 2021-08-14
- dbt 0.20.0 support
Release 0.19.1.1, 2021-08-13
- Add verify and secure to connection configuration
- Fix the delete expression ClickHouse#12
Release 0.19.1, 2021-05-07
- Add support for the `ON CLUSTER` clause in the main cases
- Engine now requires brackets `()`
- Fix a missing sample profile
Release 0.19.0.2, 2021-04-03
- Fix name of partition
Release 0.19.0.1, 2021-03-30
- Fix version of clickhouse-driver in setup.py