When retrying microbatch models, propagate prior successful state #10802

Open: 4 commits to merge into base: main

@QMalcolm QMalcolm commented Sep 30, 2024

Resolves #10800

Problem

If dbt retry was invoked twice for a microbatch model, and every retried batch failed during the first dbt retry, then the second dbt retry would rerun all of the batches from the initial dbt run being retried, rather than skipping the batches that had already succeeded.

Prior bad behavior

Solution

Across multiple dbt retry invocations, keep passing the prior successful batch information forward so that each retry knows which batches have already succeeded.

New good behavior
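
The intended behavior can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Batch`, this `BatchResults` dataclass, `carry_forward`, and `batches_to_run` are hypothetical names, and the rule "rerun every batch not already known to have succeeded" is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-ins for illustration; not dbt-core's actual types.
Batch = str  # e.g. the batch's start date


@dataclass
class BatchResults:
    successful: List[Batch] = field(default_factory=list)
    failed: List[Batch] = field(default_factory=list)


def carry_forward(prior: BatchResults, latest: BatchResults) -> BatchResults:
    """Merge a retry's results with the results it was retrying."""
    # Anything that succeeded earlier and was not rerun stays successful.
    successful = list(prior.successful)
    successful += [b for b in latest.successful if b not in successful]
    return BatchResults(successful=successful, failed=list(latest.failed))


def batches_to_run(all_batches: List[Batch], known: BatchResults) -> List[Batch]:
    """Assumed rule: rerun every batch not already known to have succeeded."""
    return [b for b in all_batches if b not in known.successful]


all_batches = ["2024-09-28", "2024-09-29", "2024-09-30"]
initial_run = BatchResults(successful=["2024-09-28"], failed=["2024-09-29", "2024-09-30"])
first_retry = BatchResults(successful=[], failed=["2024-09-29", "2024-09-30"])

# Without propagation, the second retry sees no prior successes.
without_fix = batches_to_run(all_batches, first_retry)
# With propagation, the first batch is still known to have succeeded.
with_fix = batches_to_run(all_batches, carry_forward(initial_run, first_retry))
```

In this sketch, `without_fix` contains all three batches, while `with_fix` contains only the two batches that failed.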

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

@cla-bot cla-bot bot added the cla:yes label Sep 30, 2024

codecov bot commented Sep 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.22%. Comparing base (a86e2b4) to head (21e7919).

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10802      +/-   ##
==========================================
+ Coverage   89.17%   89.22%   +0.04%     
==========================================
  Files         183      183              
  Lines       23382    23388       +6     
==========================================
+ Hits        20850    20867      +17     
+ Misses       2532     2521      -11     
| Flag | Coverage Δ |
| --- | --- |
| integration | 86.52% <100.00%> (+0.12%) ⬆️ |
| unit | 62.17% <45.45%> (-0.01%) ⬇️ |

| Components | Coverage Δ |
| --- | --- |
| Unit Tests | 62.17% <45.45%> (-0.01%) ⬇️ |
| Integration Tests | 86.52% <100.00%> (+0.12%) ⬆️ |

@QMalcolm QMalcolm force-pushed the qmalcolm--fix-microbatch-multiple-retries-edge-case branch from 9502273 to 0100a58 Compare September 30, 2024 20:44
@dbt-labs dbt-labs deleted a comment from github-actions bot Sep 30, 2024
@QMalcolm QMalcolm marked this pull request as ready for review September 30, 2024 20:53
@QMalcolm QMalcolm requested a review from a team as a code owner September 30, 2024 20:53

@ChenyuLInx ChenyuLInx left a comment

One question and a nit but the change looks good to me!

@@ -454,7 +454,7 @@ def resource_class(cls) -> Type[HookNodeResource]:

@dataclass
class ModelNode(ModelResource, CompiledNode):
    batches: Optional[List[BatchType]] = None
    batch_info: Optional[BatchResults] = None

Contributor

Will this cause a loading error for a previous manifest?

Contributor Author


This has no effect on the written manifest, which is intentional: these fields do not exist on the artifact class for the object, so they are not written out.
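
As a rough illustration of that point (a sketch with simplified, hypothetical classes; it does not reproduce dbt-core's real `ModelResource`, `ModelNode`, or serialization code), a field that exists only on the runtime node class is dropped when the node is converted to its artifact class before writing:

```python
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class ModelResource:
    """Hypothetical artifact class: only these fields are written out."""
    name: str


@dataclass
class ModelNode(ModelResource):
    """Hypothetical runtime class: adds state that is never serialized."""
    batch_info: Optional[List[str]] = None

    def to_resource(self) -> ModelResource:
        # Copy only the fields that the artifact class declares.
        kwargs = {f.name: getattr(self, f.name) for f in fields(ModelResource)}
        return ModelResource(**kwargs)


node = ModelNode(name="my_model", batch_info=["2024-09-29"])
written = asdict(node.to_resource())
```

Here `written` has no `batch_info` key, so under these assumptions an older manifest without the field loads the same way as a new one.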

@@ -481,8 +489,8 @@ def _execute_microbatch_materialization(
            start = microbatch_builder.build_start_time(end)
            batches = microbatch_builder.build_batches(start, end)
        else:

Contributor


Can we add a comment here mentioning this only happens during retry?

Successfully merging this pull request may close these issues.

Ensure prior batch successes are propagated in RunResults during multiple dbt retry scenarios
2 participants