Ghostferry currently ignores statement-based replication events, which may cause data corruption if binlog_format and binlog_row_image are temporarily set to the wrong value #183

Open
shuhaowu opened this issue May 12, 2020 · 0 comments

shuhaowu commented May 12, 2020

binlog_format and binlog_row_image can be set per session, which could cause certain data-mutating events to be dropped by Ghostferry, as it currently doesn't process statement-based replication events. An example of a program that alters binlog_format on its own session is pt-heartbeat.

This is a "user-error" type of problem. That said, it may cause data corruption that is invisible to the user, which is not acceptable. At a minimum, we need to update the documentation to describe this scenario near where we explain why binlog_format needs to be set to ROW. If users are running with the InlineVerifier, there's a chance that the verifier will catch these events, but I don't think it is guaranteed.

Ideally, Ghostferry should recognize that this is wrong and error out if it observes such an event. However, we may need to exclude some SBR entries, such as those from pt-heartbeat, or tell users to ignore those schemas/tables altogether. These problems could be tricky to resolve. Still, throwing an error on any of these events would be a cautious first step.
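
To make the "error out" idea concrete, here is a rough, language-agnostic sketch written in Python rather than Ghostferry's own code. It relies on the fact that, with binlog_format=ROW, data changes arrive as row events, while a session running with STATEMENT logs them as query events carrying the SQL text. Everything named here (QueryEvent, StatementBasedEventError, is_statement_based_dml, check_query_event, the ignored_schemas parameter) is hypothetical and not part of Ghostferry or any binlog library, and the regex is only a heuristic, not a SQL parser. It does not address binlog_row_image at all.

```python
import re
from dataclasses import dataclass

# Heuristic: a query event whose SQL starts with one of these keywords
# (optionally preceded by /* ... */ comments) is treated as statement-based DML.
_DML_PATTERN = re.compile(
    r"^\s*(?:/\*.*?\*/\s*)*(INSERT|UPDATE|DELETE|REPLACE)\b",
    re.IGNORECASE | re.DOTALL,
)


@dataclass(frozen=True)
class QueryEvent:
    """Hypothetical stand-in for a binlog query event: schema plus SQL text."""
    schema: str
    query: str


class StatementBasedEventError(Exception):
    """Raised when a data-mutating statement shows up as a query event."""


def is_statement_based_dml(event: QueryEvent) -> bool:
    # BEGIN, COMMIT and DDL are expected as query events even under ROW
    # format, so only DML keywords are flagged.
    return _DML_PATTERN.match(event.query) is not None


def check_query_event(event: QueryEvent, ignored_schemas=frozenset()) -> None:
    # ignored_schemas is an assumed escape hatch for things like a
    # pt-heartbeat schema that the user explicitly chose to exclude.
    if event.schema in ignored_schemas:
        return
    if is_statement_based_dml(event):
        raise StatementBasedEventError(
            f"statement-based DML observed in schema {event.schema!r}: "
            f"{event.query[:80]!r}"
        )
```

A caller would run check_query_event on every query event it reads and abort the run on StatementBasedEventError, passing something like ignored_schemas={"meta"} if the heartbeat table lives in a schema that is deliberately excluded. Matching on the schema alone is probably too coarse for real use; a per-table exclusion would need actual SQL parsing.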
