Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable retry support for Microbatch models #10751

Merged
merged 31 commits into from
Sep 26, 2024
Merged

Conversation

QMalcolm
Copy link
Contributor

@QMalcolm QMalcolm commented Sep 20, 2024

Resolves #10624
Resolves #10715

Problem

Screenshot 2024-09-25 at 16 53 52

dbt was taking an all or nothing approach to microbatch models when running dbt retry. We want the command to instead be smarter, only retrying batches that failed for a given microbatch model

Solution

Screenshot 2024-09-25 at 16 53 04
  • add a new status type PartialSuccess
  • add batch_results to the RunResult object
  • propagate the failed batches to the corresponding model during the retry task
  • skip models downstream from a PartialSuccess microbatch model

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

@QMalcolm QMalcolm added Skip Changelog Skips GHA to check for changelog file artifact_minor_upgrade To bypass the CI check by confirming that the change is not breaking labels Sep 20, 2024
@cla-bot cla-bot bot added the cla:yes label Sep 20, 2024
Copy link

codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 98.86364% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.03%. Comparing base (ac66f91) to head (415f695).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10751      +/-   ##
==========================================
+ Coverage   89.00%   89.03%   +0.02%     
==========================================
  Files         181      182       +1     
  Lines       23126    23195      +69     
==========================================
+ Hits        20583    20651      +68     
- Misses       2543     2544       +1     
Flag Coverage Δ
integration 86.23% <98.86%> (+0.03%) ⬆️
unit 62.19% <59.09%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
Unit Tests 62.19% <59.09%> (+<0.01%) ⬆️
Integration Tests 86.23% <98.86%> (+0.03%) ⬆️

…d circular imports

In our next commit we're gonna modify `dbt/contracts/graph/nodes.py` to import the
`BatchType` as part of our work to implement dbt retry for microbatch model nodes.
Unfortunately, the import in `nodes.py` creates a circular dependency because
`dbt/artifacts/schemas/results.py` imports from `nodes.py` and `dbt/artifacts/schemas/run/v5/run.py`
imports from that `results.py`. Thus the new import creates a circular import. Now this
_shouldn't_ be necessary as nothing in artifacts should import from the rest of dbt-core.
However, we do. We should fix this, but this is also out of scope for this segement of work.
core/dbt/task/run.py Outdated Show resolved Hide resolved
core/dbt/task/run.py Outdated Show resolved Hide resolved
@QMalcolm
Copy link
Contributor Author

I have 2-ish tests to add, and then we should be good to go!

QMalcolm and others added 6 commits September 25, 2024 12:58
…-refresh behavior

This is necessary because of retry. Say on the initial run the microbatch model
succeeds on 97% of it's batches. Then on retry it does the last 3%. If the retry
of the microbatch model executes in full refresh mode it _might_ blow away the
97% of work that has been done. This edge case seems to be adapter specific.
…ialSuccess

In the previous commit we made it so that retries of microbatch models wouldn't
run in full refresh mode when the microbatch model to retry has batches already
specified from the prior run. This is only problematic when the run being retried
was a full refresh AND all the batches for a given microbatch model failed. In
that case WE DO want to do a full refresh for the given microbatch model. To better
outline the problem, consider the following:

* a microbatch model had a begin of `2020-01-01` and has been running this way for awhile
* the begin config has changed to `2024-01-01` and  dbt run --full-refresh gets run
* every batch for an microbatch model fails
* on dbt retry the the relation is said to exist, and the now out of range data (2020-01-01 through 2023-12-31) is never purged

To avoid this, all we have to do is ONLY pass the batch information for partially successful microbatch
models. Note: microbatch models only have a partially successful status IFF they have both
successful and failed batches.
@QMalcolm QMalcolm changed the title [WIP] Microbatch Retry Enable retry support for Microbatch models Sep 25, 2024
@QMalcolm QMalcolm removed the Skip Changelog Skips GHA to check for changelog file label Sep 25, 2024
@QMalcolm
Copy link
Contributor Author

I still need to open a schemas.getdbt.com PR for these changes

@QMalcolm QMalcolm marked this pull request as ready for review September 25, 2024 21:51
@QMalcolm QMalcolm requested review from a team as code owners September 25, 2024 21:51
@QMalcolm QMalcolm requested review from vadim82 and removed request for a team September 25, 2024 21:51
@@ -1293,9 +1293,12 @@ def code(self) -> str:
return "Q012"

def message(self) -> str:
if self.status == "error":
if self.status == "error": # or 'PARTIAL SUCCESS' in self.status:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment still relevant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not, I'll get it fixed

Comment on lines +268 to +270
"begin": {
"default": null
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this change was already made in https://github.com/dbt-labs/schemas.getdbt.com, so I'm unsure why this PR is doing it for dbt-core but 🤷

@QMalcolm
Copy link
Contributor Author

The updates to schemas.getdbt.com can be found here dbt-labs/schemas.getdbt.com#61

@QMalcolm QMalcolm merged commit 1fd4d2e into main Sep 26, 2024
65 of 66 checks passed
@QMalcolm QMalcolm deleted the qmalcolm--microbatch-retry branch September 26, 2024 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
artifact_minor_upgrade To bypass the CI check by confirming that the change is not breaking cla:yes
Projects
None yet
2 participants