JSON encode on write rather than read #257

Closed
robacourt wants to merge 2 commits from rob/json-encode-on-write

@robacourt (Contributor) commented Aug 6, 2024

This PR performs JSON encoding on write rather than on read to reduce the memory footprint. For example, the memory used by an initial sync of a 35MB/200k-row table drops from 50MB to 25MB on the first initial sync (which includes encoding and writing to storage) and to 6MB on subsequent initial syncs (which only read from storage, with no encoding).
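A minimal sketch of the idea, written in Python purely for illustration (the actual change lives in the Elixir storage modules, and the class and method names below are hypothetical). With encode-on-read, every reader pays the encoding cost again; with encode-on-write, the cost is paid once and reads just hand back the stored strings.

```python
import json
from typing import Iterator


class EncodeOnReadStorage:
    """Hypothetical store that keeps raw rows and encodes them on every read."""

    def __init__(self) -> None:
        self._rows: list[dict] = []

    def append(self, row: dict) -> None:
        self._rows.append(row)

    def read(self) -> Iterator[str]:
        # Encoding happens here, once per row per read.
        for row in self._rows:
            yield json.dumps(row)


class EncodeOnWriteStorage:
    """Hypothetical store that encodes once, when the row is written."""

    def __init__(self) -> None:
        self._encoded: list[str] = []

    def append(self, row: dict) -> None:
        # Encoding happens here, once per row in total.
        self._encoded.append(json.dumps(row))

    def read(self) -> Iterator[str]:
        # No encoding on the read path: stored strings are returned as-is.
        yield from self._encoded
```

Both stores return the same JSON strings to the reader; the difference is only where the encoding work, and the temporary memory it needs, is spent.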

This change also allows some simplifications, so I have included them as refactoring in this PR:

  • Logic concerning the structure of Log Items and how to create them has been consolidated into a new module, LogItems.
  • The prepared_change intermediate state has been removed. The transformation Changes > list(prepared_change) > LogItems is now simply Changes > LogItems, which makes the code more readable (prepared_change was a 5-tuple rather than a nicely labeled map) and easier to reason about (since there's one less data structure to worry about).
  • The snapshot row intermediate state has fewer references. Ideally I'd like to encapsulate it in a single module, but for now it's mainly Shapes.Querying that creates it and LogItems.from_snapshot_row/4 that reads it.
  • Some logic duplicated in InMemoryStorage and CubDbStorage has been consolidated.
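The readability point about a 5-tuple versus a labeled map can be sketched as follows. This is Python for illustration only, and every field name and tuple slot here is invented for the example; it is not the real shape of prepared_change or of a Log Item.

```python
from typing import Any

# A hypothetical incoming change, used by both functions below.
Change = dict[str, Any]


def to_positional(change: Change) -> tuple:
    # Positional form (slots are invented): the reader has to remember
    # that slot 0 is the offset, slot 1 the key, and so on.
    return (
        change["offset"],
        change["key"],
        change["action"],
        change["record"],
        change.get("is_snapshot", False),
    )


def to_log_item(change: Change) -> dict[str, Any]:
    # Labeled form built directly from the change, with no intermediate
    # tuple: each field is named at the point of use.
    return {
        "offset": change["offset"],
        "key": change["key"],
        "headers": {"action": change["action"]},
        "value": change["record"],
    }
```

Reading `item["headers"]["action"]` is self-describing, whereas `prepared[2]` only makes sense next to the code that built the tuple.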

I've kept the refactoring in a separate commit to the functional change to aid with reviewing this PR.

@robacourt robacourt force-pushed the rob/json-encode-on-write branch from 9522e83 to 15189e4 Compare August 6, 2024 19:53
@robacourt robacourt marked this pull request as draft August 6, 2024 19:55
@KyleAMathews (Collaborator) commented

Reminder to move this PR to electric — I'm going to close all the PRs tomorrow 🙏

@robacourt robacourt force-pushed the rob/json-encode-on-write branch from 15189e4 to 46d9554 Compare August 7, 2024 07:43
@robacourt robacourt closed this Aug 7, 2024