source-twilio: improve MessageMedia incremental sync speed #2182
Description:
MessageMedia was previously checking every message between the config's start date and the present for new media, then filtering out any media created before the last seen cursor value. That made incremental syncs take an extremely long time without any apparent progress; the stream could be searching through the past few years of messages when it usually only needs to search through the past few minutes.
This change makes the MessageMedia stream only check messages created since the most recent cursor value, falling back to the config's start date if no cursor value is present. This significantly speeds up the connector during incremental syncs.
This change also increases the date window size used when fetching a message's media from 1 year to 100 years. This reduces the number of API requests needed when backfilling media records over a year old; instead of requesting a single year of media at a time, the connector essentially requests all of a message's media in one request. For example, instead of making two requests spanning NOV2023-NOV2024 and NOV2024-DEC2024, a single request is made for NOV2023-DEC2024.
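To illustrate the two changes described above, here is a minimal sketch. It is not the connector's actual code: `messages_lower_bound` and `date_windows` are hypothetical helper names, and a "year" is approximated as 365 days for simplicity.

```python
from datetime import datetime, timedelta
from typing import Iterator, Optional, Tuple


def messages_lower_bound(cursor_value: Optional[datetime], start_date: datetime) -> datetime:
    """Earliest message creation time to scan for new media.

    Hypothetical helper: use the most recent cursor value when one exists,
    otherwise fall back to the config's start date.
    """
    return cursor_value if cursor_value is not None else start_date


def date_windows(
    start: datetime, end: datetime, window_days: int
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering start..end.

    Hypothetical helper: each pair stands for one API request in a
    sliding date window strategy.
    """
    window = timedelta(days=window_days)
    window_start = start
    while window_start < end:
        window_end = min(window_start + window, end)
        yield window_start, window_end
        window_start = window_end


# A message created in NOV2023 whose media is fetched in DEC2024.
message_created = datetime(2023, 11, 1)
now = datetime(2024, 12, 15)

one_year = list(date_windows(message_created, now, window_days=365))
hundred_years = list(date_windows(message_created, now, window_days=365 * 100))
```

With a one-year window the example range is split into two requests; with a 100-year window it collapses into a single request spanning the whole range.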
It would make more sense not to use a sliding date window strategy for fetching a single message's media, but rewriting the MessageMedia stream in a backwards-compatible way is a large effort I'd like to avoid, especially when small, targeted changes address the current issue.
Workflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)
Notes for reviewers:
Tested on a local stack. Confirmed that for the MessageMedia stream: [...] Messages. [...] Messages.