Releases: dlt-hub/dlt
0.2.6a1
Core library
- Feat/pipeline drop command by @steinitzu in #285
- collectors and progress bars by @rudolfix in #302
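A minimal sketch of enabling progress reporting when creating a pipeline, assuming the `progress` argument accepts a collector name such as `"log"` (the pipeline name, dataset and sample data are illustrative):

```python
import dlt

# attach a collector so progress is reported while extract/normalize/load runs
pipeline = dlt.pipeline(
    pipeline_name="progress_demo",
    destination="duckdb",
    dataset_name="demo_data",
    progress="log",
)
pipeline.run([{"id": 1}, {"id": 2}], table_name="items")
```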
Customizations
- Feat/new `add_limit` method for resources by @z3z1ma in #298
- Same method added to sources. Overall, you can now quickly sample large sources to e.g. create example data sets or test your transformations without the need to load everything (see the sketch below).
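A minimal sketch of sampling with `add_limit`; the resource definition and destination are illustrative assumptions, only `add_limit` itself comes from this release:

```python
import dlt

@dlt.resource
def big_events():
    # stand-in for a very large source
    for i in range(1_000_000):
        yield {"id": i, "value": i * 2}

pipeline = dlt.pipeline(pipeline_name="sampling_demo", destination="duckdb", dataset_name="samples")
# take only the first 10 items yielded by the resource instead of loading everything
pipeline.run(big_events().add_limit(10))
```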
Docs
- explains how to set logging level and format by @rudolfix in #297
- ga4 internal dashboard demo blog post by @TyDunn in #299
- Added google_analytics docs by @AmanGuptAnalytics in #305
- Update README, add contributor's guide by @burnash in #311
- progress bars docs by @rudolfix in #312
New Contributors
- @z3z1ma made their first contribution in #298
- @ashish-weblianz made their first contribution in #306
Full Changelog: 0.2.6a0...0.2.6a1
0.2.6a0
New package name and pip install command
💡 We changed the package name to dlt!
`pip install dlt`
Core library
- PyPI package name: migrate to `dlt` by @burnash in #264
- adds anonymous id to telemetry by @rudolfix in #284
- makes the duckdb database follow the current working directory by @rudolfix in #291
- you can disable unique checks in incremental loading by passing an empty tuple as `primary_key` to `dlt.sources.incremental` (see the sketch below)
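A minimal sketch of disabling the unique checks; the resource, cursor field and values are illustrative assumptions:

```python
import dlt

@dlt.resource
def events(
    # passing an empty tuple as primary_key turns off the unique/deduplication checks
    updated_at=dlt.sources.incremental("updated_at", initial_value="2023-01-01T00:00:00Z", primary_key=()),
):
    yield {"id": 1, "updated_at": "2023-06-01T00:00:00Z"}
```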
Helpers
A first of a series of Airflow helpers and features: store your secrets.toml in an Airflow Variable and have your credentials injected automatically. The same code works locally and in an Airflow DAG.
Building blocks
When building pipelines you can now use specs that wrap Google credentials. We support service account credentials and OAuth2 credentials, detect default credentials, provide authorization methods, etc. Info on the credentials below will soon be added to our docs and some example pipelines.
from dlt.common.configuration.specs import GcpClientCredentials, GcpClientCredentialsWithDefault, GcpOAuthCredentials, GcpOAuthCredentialsWithDefault
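A minimal sketch of filling a service account spec explicitly; the attribute names are assumptions mirroring the service account JSON keys, and in practice dlt would inject these values from secrets.toml or environment variables:

```python
from dlt.common.configuration.specs import GcpClientCredentials

# hypothetical values; normally provided via secrets.toml rather than hard-coded
credentials = GcpClientCredentials()
credentials.project_id = "my-project"
credentials.client_email = "loader@my-project.iam.gserviceaccount.com"
credentials.private_key = "-----BEGIN PRIVATE KEY-----\n..."
```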
Docs
- update doc with new alert capability by @adrianbr in #275
- updating the documentation for the section 'Transforming the data' by @rahuljo in #277
- first version of "understanding the tables" content by @TyDunn in #258
- Rename PyPI package to `dlt` in the docs by @burnash in #282
- pushing the new colab demo by @rahuljo in #288
- updates explore/transform the data in Python by @rudolfix in #289
- update a typo in create-a-pipeline.md by @Anggi-Permana-Harianja in #290
New Contributors
- @Anggi-Permana-Harianja made their first contribution in #290
- @redicane made their first contribution in #254
Full Changelog: 0.2.0a32...0.2.6a0
0.2.0a32
What's Changed in Docs
- moving to new docs structure by @TyDunn in #245
- adds Algolia DocSearch to the dlt docs 🚀 by @TyDunn in #248
- Zendesk pipeline docs by @AmanGuptAnalytics in #222
- Added Hubspot setup guide by @AmanGuptAnalytics in #250
- moving "create a pipeline" to use weatherapi and duckdb by @TyDunn in #255
- first version of "exploring the data" docs page by @TyDunn in #257
- adds schema general usage and schema adjusting walkthrough to docs by @rudolfix in #243
- filling in deploying section by @TyDunn in #262
- Examples for customisations by @adrianbr in #247
What's Changed
- Typed pipeline state by @steinitzu in #239
- allows `incremental` to be passed to the `resource.apply_hints()` method (see the sketch below)
- adds a `state` property to sources and resources to get the actual value of source and resource scoped state
- Fix failing tests for Redshift and PostgreSQL by @burnash in #270
- add resource name to table schema by @steinitzu in #265
- resets the resource scoped state when doing a replace on a resource
- you can add `Incremental` as a transform step, instead of injecting it
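A minimal sketch of passing an incremental to `apply_hints`; the resource and cursor column are illustrative assumptions:

```python
import dlt

@dlt.resource(table_name="issues")
def issues():
    yield {"id": 1, "updated_at": "2023-05-01T00:00:00Z"}

resource = issues()
# attach incremental loading to an already declared resource instead of injecting it as an argument
resource.apply_hints(incremental=dlt.sources.incremental("updated_at"))
```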
Full Changelog: 0.2.0a30...0.2.0a32
0.2.0a30
What's Changed
This release includes two important features:
- `merge` write disposition: load data incrementally by merging with merge keys and/or deduplicate/upsert with primary keys
- incremental loading with last value and `dlt` state available when declaring resources
We consider those features still in alpha. Try them out and report bugs! Preliminary documentation is here: https://dlthub.com/docs/customization/incremental-loading
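A minimal sketch combining both features; the cursor field, initial value, sample row and destination are illustrative assumptions:

```python
import dlt

# `merge` deduplicates/upserts rows on the primary key;
# the incremental cursor keeps the last `updated_at` value in dlt state
@dlt.resource(write_disposition="merge", primary_key="id")
def users(updated_at=dlt.sources.incremental("updated_at", initial_value="2023-01-01T00:00:00Z")):
    yield {"id": 1, "updated_at": "2023-06-02T10:00:00Z", "name": "alice"}

pipeline = dlt.pipeline(pipeline_name="merge_demo", destination="duckdb", dataset_name="crm")
pipeline.run(users())
```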
This release also includes improved support for resources that use dynamic hints to dispatch data to several database tables, plus other bug fixes.
What's Changed in docs
- Strapi setup guide by @TyDunn in #212
- add "edit this page" button on all docs pages by @TyDunn in #226
- adding alerting content from workshop by @TyDunn in #233
- adding monitoring content from workshop by @TyDunn in #229
- adding the chess pipeline documentation by @rahuljo in #237
- adds deduplication of staging dataset during merge by @rudolfix in #240
New Contributors
Full Changelog: 0.2.0a29...0.2.0a30
0.2.0a29
What's Changed
- Allow changing `write_disposition` in the resource without dropping the dataset by @burnash in #205
- Add a suffix to the default dataset name by @burnash in #207
- improves and adds several `dlt pipeline` commands: `info`, `trace`, `load-package`, `failed-jobs` and `sync` (https://dlthub.com/docs/command-line-interface#dlt-pipeline)
- extends `LoadInfo` to include the schema changes applied to the destination and a list of loaded package infos (https://dlthub.com/docs/running-in-production/running#inspect-save-and-alert-on-schema-changes)
- extends load info with `raise_on_failed_jobs` and `has_failed_jobs` to make handling failed jobs easier (see the sketch after this list)
- `LoadInfo` and `pipeline.last_trace` can be directly loaded into the destination to store more metadata on each load (https://dlthub.com/docs/running-in-production/running#inspect-and-save-the-load-info-and-trace)
- adds a retry strategy for `tenacity` to retry the `load` pipeline step (or any other, per request) (https://dlthub.com/docs/running-in-production/running#handle-exceptions-failed-jobs-and-retry-the-pipeline)
- `raise_on_failed_jobs` config option aborts the load package on the first failed job (https://dlthub.com/docs/running-in-production/running#failed-jobs)
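A minimal sketch of inspecting failed jobs and persisting the load info; the pipeline setup, sample data and target table name are illustrative assumptions:

```python
import dlt

pipeline = dlt.pipeline(pipeline_name="prod_demo", destination="duckdb", dataset_name="demo_data")
load_info = pipeline.run([{"id": 1}], table_name="items")

# alert on or abort after failed jobs
if load_info.has_failed_jobs:
    print(load_info)
load_info.raise_on_failed_jobs()

# store the load info itself in the destination for monitoring
pipeline.run([load_info], table_name="_load_info")
```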
What's Changed in docs
- Fix typos and wording in docs/concepts/state by @burnash in #200
- Fix a broken link in README.md by @burnash in #203
- replacing team@ with community@ by @TyDunn in #211
- GitHub and Google Sheets setup guides by @AmanGuptAnalytics in #195
- "run a pipeline" troubleshooting & walkthrough https://dlthub.com/docs/walkthroughs/run-a-pipeline
- "run a pipeline in production": https://dlthub.com/docs/running-in-production/running
- `dlt pipeline` command: https://dlthub.com/docs/command-line-interface#dlt-pipeline
Full Changelog: 0.2.0a28...0.2.0a29
0.2.0a28
What's Changed
- transform functions (i.e. maps, filters and generators) may be added to resources (see the sketch below): https://dlthub.com/docs/concepts/resource#filter-transform-and-pivot-data
- resources can be added to instantiated sources, e.g. to enrich data: https://dlthub.com/docs/concepts/resource#feeding-data-from-one-resource-into-another and https://dlthub.com/docs/concepts/source#add-more-resources-to-existing-source
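A minimal sketch of attaching transform steps to a resource; the resource and transform logic are illustrative, and the method names `add_map`/`add_filter` are assumed from the linked resource docs:

```python
import dlt

@dlt.resource
def users():
    yield from [{"id": 1, "name": "alice"}, {"id": 2, "name": None}]

resource = users()
# drop rows without a name, then uppercase the remaining names
resource.add_filter(lambda item: item["name"] is not None)
resource.add_map(lambda item: {**item, "name": item["name"].upper()})

print(list(resource))
```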
Docs
- improve explanations by @adrianbr in #181
- docs: init from other sources by @adrianbr in #182
- operation duck blog post by @TyDunn in #185
- added setup_guide_pipedrive to pipelines folder by @AmanGuptAnalytics in #183
- Docs orchestrators by @adrianbr in #166
- fix typo: add a space after gen(10) by @burnash in #196
- adds transform docs by @rudolfix in #192
New Contributors
- @AmanGuptAnalytics made their first contribution in #183
- @burnash made their first contribution in #196
Full Changelog: 0.2.0a26...0.2.0a28
0.2.0a26
What's Changed
- adds anonymous telemetry https://dlthub.com/docs/reference/telemetry
- adds pipeline and exception tracing with sentry https://dlthub.com/docs/reference/tracing
0.2.0a25
What's Changed
- getting started: bigquery --> duckdb by @TyDunn in #146
- dbt docs by @adrianbr in #164
- incremental pipeline docs by @adrianbr in #165
- dlt init working with pipelines repo by @rudolfix in #168
New `dlt init` command
With this release you can use `dlt init` to add existing pipelines to your project. Please see the updated documentation.
The pipelines currently come from the pipelines contrib repo. You can use any other repo or a local folder that follows the same structure (`dlt init ... --location <url>`).
0.2.0a23
0.2.0a22
`dlt` library changes
- DATE type is supported on all destinations
- table and column names will be shortened to fit into a particular destination
- `duckdb` database will be created in the current working directory by default #148
- fixes connection problem on `duckdb` 0.7.1
- allows configuring naming conventions or adopting naming conventions preferred by a destination
- `streamlit` app generated by `dlt pipeline ... show` does not display deprecation warnings