-
Notifications
You must be signed in to change notification settings - Fork 479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
colblk: special case zero bitmaps #3869
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 8 files at r1, all commit messages.
Reviewable status: 2 of 8 files reviewed, 3 unresolved discussions (waiting on @RaduBerinde)
sstable/colblk/bitmap.go
line 264 at r1 (raw file):
} else { b.unset(i) }
did these changes move the microbenchmarks at all?
sstable/colblk/bitmap.go
line 310 at r1 (raw file):
// First byte will be the encoding type. offset++ if b.IsZero() {
Size
is a bit performance sensitive during data block building, because we need to call it after every KV to determine whether we should finish the block. does the IsZero
call have a perf impact? Should we consider maintaining a flag indicating whether any bits are set?
When I was iterating on block building perf, I avoided ever calling Unset
instead relying on all bits being implicitly unset. We could make that part of the contract, remove unset
and make IsZero() == (len(b.words)==0)
sstable/colblk/data_block.go
line 488 at r1 (raw file):
// Invert the prefix-same bitmap before writing it out, because we want it // to represent when the prefix changes. w.prefixSame.Invert(w.rows)
hrm, the Size()
calls we make along the way will no longer be accurate since the size of the pre-inversion bitmap is not the same as the post-inversion bitmap. we could add support for 'all ones' to make it symmetric, or we could explicitly account for it in DataBlockWriter.Size()
04617ec
to
4e48f22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 10 files reviewed, 3 unresolved discussions (waiting on @jbowens)
sstable/colblk/data_block.go
line 488 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
hrm, the
Size()
calls we make along the way will no longer be accurate since the size of the pre-inversion bitmap is not the same as the post-inversion bitmap. we could add support for 'all ones' to make it symmetric, or we could explicitly account for it inDataBlockWriter.Size()
Makes sense. I switched to only allowing setting bits and added an InvertedSize()
. Added some more checks in various tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 10 files reviewed, 4 unresolved discussions (waiting on @RaduBerinde)
sstable/colblk/bitmap.go
line 308 at r2 (raw file):
// // Note that Invert can affect the Size of the bitmap. Use InvertedSize() if you // intend to invert the buitmap before finishing.
nit: "bitmap"
Some bitmaps are going to be all-zeros in common cases, for example the `isValueExternal` bitmap. It is worth special-casing these to speed up the decoding. We also improve `Bitmap.At` a bit using a shift and a mask to get the offset to the respective word.
4e48f22
to
01e2ffb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: 0 of 10 files reviewed, 3 unresolved discussions (waiting on @jbowens)
Some bitmaps are going to be all-zeros in common cases, for example
the
isValueExternal
bitmap. It is worth special-casing these tospeed up the decoding.
We also improve
Bitmap.At
a bit using a shift and a mask to get theoffset to the respective word.