JSON encode on write rather than read #257
Closed
JSON encoding now happens on write rather than on read, to reduce the memory footprint. For example, the memory use of an initial sync of a 35MB/200k-row table drops from 50MB to 25MB on the first initial sync (which includes encoding and writing to storage) and to 6MB on subsequent initial syncs (which only read from storage, with no encoding).
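The difference between the two strategies can be sketched as follows. This is a minimal illustration in Python, not the code from this PR: the function names and the in-memory list standing in for storage are hypothetical, and it makes no claim about the memory figures above.

```python
import json

def write_encoded(rows, storage):
    """Encode on write: each row is serialized once, when it is stored."""
    for row in rows:
        storage.append(json.dumps(row))

def read_encoded(storage):
    """Reads are a pass-through, because storage already holds JSON strings."""
    yield from storage

def write_raw(rows, storage):
    """Encode on read (old approach): rows are stored unencoded."""
    storage.extend(rows)

def read_raw(storage):
    """Every read has to serialize every row again."""
    for row in storage:
        yield json.dumps(row)

# Hypothetical sample data; both strategies must produce identical output.
rows = [{"id": i, "name": f"row-{i}"} for i in range(3)]
encoded_store, raw_store = [], []
write_encoded(rows, encoded_store)
write_raw(rows, raw_store)
assert list(read_encoded(encoded_store)) == list(read_raw(raw_store))
```

With encode-on-write, the serialization cost is paid once per row, and later reads only stream the stored strings.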
This change also allows some simplifications, which I have included as refactoring in this PR:
- The `LogItems` `prepared_change` intermediate state has been removed. The transformation `Changes > list(prepared_change) > LogItems` is now simply `Changes > LogItems`, which makes the code more readable (`prepared_change` was a 5-tuple rather than a nicely labeled map) and easier to reason about (since there's one less data structure to worry about).
- [...] `Shapes.Querying` that creates it and `LogItems.from_snapshot_row/4` that reads it.

I've kept the refactoring in a separate commit from the functional change to aid with reviewing this PR.
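The readability point about a positional 5-tuple versus a labeled map can be illustrated with a small Python sketch. Everything here is hypothetical: the field names, the `Change` record, and `to_log_item` are made up for illustration and do not reflect the actual shape of `prepared_change` or `LogItems`.

```python
from dataclasses import dataclass

# Positional 5-tuple: the reader must remember what each slot means.
prepared = (0, "users/1", "insert", {"id": 1}, None)
operation_from_tuple = prepared[2]  # index 2 is opaque without documentation

# Hypothetical change record with labeled fields.
@dataclass
class Change:
    key: str
    operation: str
    value: dict

def to_log_item(offset: int, change: Change) -> dict:
    """Build a labeled log item directly from a change, with no tuple in between."""
    return {
        "offset": offset,
        "key": change.key,
        "operation": change.operation,
        "value": change.value,
    }

changes = [
    Change("users/1", "insert", {"id": 1}),
    Change("users/1", "update", {"id": 1, "name": "a"}),
]
# One step from changes to log items, instead of changes -> tuples -> log items.
log_items = [to_log_item(i, c) for i, c in enumerate(changes)]
```

Dropping the intermediate tuple means there is one conversion step to read and one fewer data shape to keep in mind.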