Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colblk: special case zero bitmaps #3869

Merged
merged 1 commit into from
Aug 22, 2024

Conversation

RaduBerinde
Copy link
Member

Some bitmaps are going to be all-zeros in common cases, for example
the isValueExternal bitmap. It is worth special-casing these to
speed up the decoding.

We also improve Bitmap.At a bit using a shift and a mask to get the
offset to the respective word.

@RaduBerinde RaduBerinde requested a review from jbowens August 20, 2024 23:12
@RaduBerinde RaduBerinde requested a review from a team as a code owner August 20, 2024 23:12
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 8 files at r1, all commit messages.
Reviewable status: 2 of 8 files reviewed, 3 unresolved discussions (waiting on @RaduBerinde)


sstable/colblk/bitmap.go line 264 at r1 (raw file):

	} else {
		b.unset(i)
	}

did these changes move the microbenchmarks at all?


sstable/colblk/bitmap.go line 310 at r1 (raw file):

	// First byte will be the encoding type.
	offset++
	if b.IsZero() {

Size is a bit performance sensitive during data block building, because we need to call it after every KV to determine whether we should finish the block. does the IsZero call have a perf impact? Should we consider maintaining a flag indicating whether any bits are set?

When I was iterating on block building perf, I avoided ever calling Unset instead relying on all bits being implicitly unset. We could make that part of the contract, remove unset and make IsZero() == (len(b.words)==0)


sstable/colblk/data_block.go line 488 at r1 (raw file):

	// Invert the prefix-same bitmap before writing it out, because we want it
	// to represent when the prefix changes.
	w.prefixSame.Invert(w.rows)

hrm, the Size() calls we make along the way will no longer be accurate since the size of the pre-inversion bitmap is not the same as the post-inversion bitmap. we could add support for 'all ones' to make it symmetric, or we could explicitly account for it in DataBlockWriter.Size()

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 10 files reviewed, 3 unresolved discussions (waiting on @jbowens)


sstable/colblk/data_block.go line 488 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

hrm, the Size() calls we make along the way will no longer be accurate since the size of the pre-inversion bitmap is not the same as the post-inversion bitmap. we could add support for 'all ones' to make it symmetric, or we could explicitly account for it in DataBlockWriter.Size()

Makes sense. I switched to only allowing setting bits and added an InvertedSize(). Added some more checks in various tests.

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 10 files reviewed, 4 unresolved discussions (waiting on @RaduBerinde)


sstable/colblk/bitmap.go line 308 at r2 (raw file):

//
// Note that Invert can affect the Size of the bitmap. Use InvertedSize() if you
// intend to invert the buitmap before finishing.

nit: "bitmap"

Some bitmaps are going to be all-zeros in common cases, for example
the `isValueExternal` bitmap. It is worth special-casing these to
speed up the decoding.

We also improve `Bitmap.At` a bit using a shift and a mask to get the
offset to the respective word.
Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: 0 of 10 files reviewed, 3 unresolved discussions (waiting on @jbowens)

@RaduBerinde RaduBerinde merged commit 55fcc6e into cockroachdb:master Aug 22, 2024
11 checks passed
@RaduBerinde RaduBerinde deleted the null-bitmap branch August 22, 2024 00:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants