source-impact-native: fix Actions backfill cursor and other misc. fixes #2186

Merged: 5 commits, Dec 6, 2024


Alex-Bair
Contributor

@Alex-Bair Alex-Bair commented Dec 5, 2024

Description:

The scope of this PR includes:

  • Updating snapshot file names.
  • Increasing the interval between sweeps to 5 minutes. The connector frequently hits Impact's 1000 requests/hour rate limit, and increasing the interval from 0s to 5m should help with that.
  • Updating the documentation URL to be connector-specific.
  • Renaming stop_date to start_date in the config.
  • Changing the cursor used when backfilling Actions.
    • Previously, the "LockingDate" field was used during Actions backfills. Every action has a "LockingDate" that falls on a fixed day of the month after the action is created. For example, all actions created in October have a "LockingDate" of November 28th. This effectively batches actions into month-sized chunks, and using "LockingDate" as the cursor field meant backfills could only fetch completed month-sized chunks. As a result, the connector missed actions whose "LockingDate" was after the cutoff date (i.e. actions in in-progress month-sized chunks) during backfills.
    • Instead, Actions now uses the EventDate field during backfills (via the "ActionDateStart" and "ActionDateEnd" query params). This field has much finer granularity than "LockingDate", and backfills now capture the data that was previously missed.
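
To make the failure mode above concrete, here is a minimal Python sketch. It assumes the "LockingDate" rule is exactly the one in the example (the 28th of the month after the action's event date). The `locking_date` helper, the `Id` values, and the record shape are hypothetical and are not taken from the connector's code.

```python
from datetime import date, datetime, timezone


def locking_date(event: datetime) -> date:
    """Assumed rule: the 28th of the month after the event (hypothetical helper)."""
    if event.month == 12:
        return date(event.year + 1, 1, 28)
    return date(event.year, event.month + 1, 28)


# Hypothetical actions; only the field names come from the PR description.
actions = [
    {"Id": "a1", "EventDate": datetime(2024, 10, 3, 9, 0, tzinfo=timezone.utc)},
    {"Id": "a2", "EventDate": datetime(2024, 11, 20, 15, 30, tzinfo=timezone.utc)},
    {"Id": "a3", "EventDate": datetime(2024, 12, 1, 8, 0, tzinfo=timezone.utc)},
]
for action in actions:
    action["LockingDate"] = locking_date(action["EventDate"])

# The backfill is supposed to cover everything up to this cutoff.
cutoff = datetime(2024, 12, 5, tzinfo=timezone.utc)

# Old behavior: filter on "LockingDate". Actions from in-progress months are dropped.
by_locking_date = [a["Id"] for a in actions if a["LockingDate"] <= cutoff.date()]

# New behavior: filter on "EventDate". Every action before the cutoff is kept.
by_event_date = [a["Id"] for a in actions if a["EventDate"] <= cutoff]

print(by_locking_date)  # ['a1']
print(by_event_date)    # ['a1', 'a2', 'a3']
```

In this sketch the November and December actions already exist at the cutoff, but their "LockingDate" values (December 28th and January 28th) are still in the future, so a "LockingDate" cursor skips them.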

Workflow steps:

(How does one use this feature, and how has it changed)

Documentation links affected:

Documentation does not exist for the source-impact-native connector, so it should be created.

Notes for reviewers:

Tested on a local stack. Confirmed that:

  • Actions backfills now get all data between the config's start date and the present. Data is no longer missed during a backfill if its "LockingDate" is after the cutoff date.

After this change is merged:

Snapshot changes are expected due to the stop_date to start_date rename, updating the documentation URL, and increasing the interval to PT5M.



… filter during backfills

Previously, the connector missed large batches of `Actions` records
during backfills because it filtered on the "LockingDate" of each
action. The "LockingDate" appears to be a fixed date in the month after
an action is created (e.g. all actions made in October have a
"LockingDate" of November 28th). Backfills therefore effectively fetched
actions in month-sized batches, so if a batch didn't start at exactly
the right time, the backfill missed data.

To fix this, the connector now uses "ActionDateStart" and
"ActionDateEnd" query params for `Actions` backfills. This filters
results based on their `EventDate`, which gives us a much finer
granularity & captures the records we were previously missing.
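
A rough sketch of how a backfill might walk a date range with these two query params. Only the param names "ActionDateStart" and "ActionDateEnd" come from this PR. The helper names, the one-day window size, the timestamp format, and the treatment of window boundaries are assumptions for illustration and may not match the connector or the Impact API.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def backfill_windows(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[tuple[datetime, datetime]]:
    """Yield contiguous (window_start, window_end) pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def action_params(window_start: datetime, window_end: datetime) -> dict[str, str]:
    """Build query params for one window (timestamp format is an assumption)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "ActionDateStart": window_start.strftime(fmt),
        "ActionDateEnd": window_end.strftime(fmt),
    }


# Hypothetical range: from the config's start date up to a cutoff.
start = datetime(2024, 12, 1, tzinfo=timezone.utc)
cutoff = datetime(2024, 12, 3, 12, 0, tzinfo=timezone.utc)

windows = list(backfill_windows(start, cutoff, timedelta(days=1)))
for window_start, window_end in windows:
    print(action_params(window_start, window_end))
```

Because each window starts where the previous one ended, no `EventDate` range is skipped, and the final window is clipped to the cutoff instead of waiting for a full chunk to complete.
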
@Alex-Bair Alex-Bair marked this pull request as ready for review December 6, 2024 14:14
@Alex-Bair Alex-Bair requested a review from a team December 6, 2024 14:14
Contributor

@jonwihl jonwihl left a comment

LGTM

@jonwihl jonwihl merged commit e227729 into main Dec 6, 2024
71 of 79 checks passed
@Alex-Bair Alex-Bair deleted the bair/source-impact-actions-cursors branch December 6, 2024 22:02