fixes CRDB watch implementation accumulating past updates #2100

Closed
wants to merge 2 commits into from

Conversation

vroldanbet
Contributor

When the timestamp to start emitting updates from is far in the past, CRDB will not emit checkpoints.

As a consequence, every update that is in the past w.r.t. the moment the changefeed was created accumulates in memory and, with a large enough backlog, OOMs the process.

The proposed solution is to compute one checkpoint from the real-time stream of updates and use it as the high-watermark for the backlog of changes from the past. All of the updates must be emitted before a checkpoint can be emitted, so downstream callers have to handle that accordingly.
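
To illustrate the idea only, here is a minimal sketch of the high-watermark approach. It is not the code in this PR: `Change`, `Event`, and `drainBacklog` are hypothetical names, timestamps are simplified to integers, and the backlog is modeled as a slice where the real changefeed would be a stream.

```go
package main

import "fmt"

// Change is a hypothetical stand-in for one update read from the changefeed.
type Change struct {
	Timestamp int64 // commit time, simplified to an integer for this sketch
	Payload   string
}

// Event is what the watch hands to downstream callers: either updates or a checkpoint.
type Event struct {
	Updates      []Change
	IsCheckpoint bool
	Checkpoint   int64
}

// drainBacklog emits each backlog change at or below highWatermark as soon as it
// is seen instead of holding it in memory, then emits a single checkpoint at
// highWatermark. Changes newer than the watermark are returned to the caller.
func drainBacklog(backlog []Change, highWatermark int64, emit func(Event)) []Change {
	var remaining []Change
	for _, c := range backlog {
		if c.Timestamp <= highWatermark {
			emit(Event{Updates: []Change{c}})
			continue
		}
		remaining = append(remaining, c)
	}
	// All updates up to the watermark are out, so the checkpoint is now safe to emit.
	emit(Event{IsCheckpoint: true, Checkpoint: highWatermark})
	return remaining
}

func main() {
	backlog := []Change{{1, "a"}, {2, "b"}, {7, "c"}}
	// Assume 5 is the checkpoint observed on the real-time stream.
	rest := drainBacklog(backlog, 5, func(e Event) { fmt.Printf("%+v\n", e) })
	fmt.Println("still pending:", len(rest))
}
```

In this sketch downstream callers see the updates first and the checkpoint last, which is the ordering constraint described above.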

@github-actions github-actions bot added the area/datastore Affects the storage system label Oct 24, 2024
@vroldanbet vroldanbet force-pushed the crdb-checkpoint-fix branch 4 times, most recently from 123abc2 to 191971b Compare October 24, 2024 18:45
@vroldanbet
Contributor Author

superseded by #2120

@vroldanbet vroldanbet closed this Nov 7, 2024
@vroldanbet vroldanbet deleted the crdb-checkpoint-fix branch November 7, 2024 09:59
@github-actions github-actions bot locked and limited conversation to collaborators Nov 7, 2024