Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bit::sort and bit::stable_sort #40

Open
nmcclatchey opened this issue May 14, 2022 · 1 comment
Open

Add bit::sort and bit::stable_sort #40

nmcclatchey opened this issue May 14, 2022 · 1 comment

Comments

@nmcclatchey
Copy link

nmcclatchey commented May 14, 2022

Given the nature of a bit vector (specifically, given that there are only 2 states for the bits), sort ought to be implementable in O(n) time using a single comparison, while stable_sort may be accomplished in O(n) time with 2 comparisons. Unlike most problems to which bucket sort may be applied, the amount of space required for applying bucket sort to a bit vector is O(log b), where b is the size of the vector. Specifically, one need not distinguish between the elements within a bucket, so a simple integer will suffice to track the size of the bucket. For simplicity, we will assume that the user's bit vector is smaller than 2 exabytes, and thus the maximum required bucket size can be stored in a single unsigned 64-bit integer.

Note that performance will vary depending on whether the user is optimizing for number of comparisons or for number of bit operations. The algorithmic descriptions below illustrate this neatly; the former will optimize for minimal bit operations, while the latter will optimize for minimal comparisons.

We begin with a rough algorithmic description for stable_sort:

  1. Count all set bits (this number will be referred to by the name "num_true")
  2. If number of set bits is either 0 or the size of the bit vector, the range is uniform (and thus sorted). Exit.
  3. Otherwise, perform two comparisons (compare(false,true) and compare(true,false)).
  4. If both comparisons are true, then the user has violated the contract of std::stable_sort by providing a comparator that does not implement a strict weak ordering. You are now authorized to deploy nasal demons (see "undefined behavior"). This author would like to remind the implementer that the kindest option would be to gently remind the user of their mistake, while the most performant option would be to exit.
  5. If neither comparison is true, then all bits are incomparable, and should remain in their current order. Exit.
  6. Otherwise, if compare(false,true) evaluated to true, then unset the first size() - num_true bits, and set the final num_true bits.
  7. If compare(false,true) evaluated to false, then unset the num_true bits, and set the final size() - num_true bits.

Note that this algorithm is O(n) in the worst case.

Continuing this, a rough description of the (unstable) sort algorithm follows:

  1. Count all set bits (this number will be referred to by the name "num_true")
  2. If number of set bits is either 0 or the size of the bit vector, the range is uniform (and thus sorted). Exit.
  3. Otherwise, perform the comparison compare(false,true).
  4. If compare(false,true) evaluated to true, then unset the first size() - num_true bits, and set the final num_true bits.
  5. If compare(false,true) evaluated to false, then unset the num_true bits, and set the final size() - num_true bits.

This algorithm is O(n) in both best and worst case. Unlike the stable_sort proposed earlier, this algorithm will perform the smallest possible number of comparisons (0 or 1) required for an unstable sort.

I estimate the efficiency of the above algorithms to greatly exceed user expectations. However, I estimate their usefulness to be, at best, trivial.

@nmcclatchey
Copy link
Author

Note that some efficiency is lost to ensure that elements not found within the range are not compared. Otherwise, the stable_sort algorithm would be:

  1. Perform two comparisons (compare(false,true) and compare(true,false)).
  2. If both comparisons are true, then the user has violated the contract of std::stable_sort by providing a comparator that does not implement a strict weak ordering. You are now authorized to deploy nasal demons (see "undefined behavior").
  3. If neither comparison is true, then all bits are incomparable, and should remain in their current order. Exit.
  4. Otherwise, count all set bits (this number will be referred to by the name "num_true")
  5. If number of set bits is either 0 or the size of the bit vector, the range is uniform (and thus sorted). Exit.
  6. Otherwise, if compare(false,true) evaluated to true, then unset the first size() - num_true bits, and set the final num_true bits.
  7. If compare(false,true) evaluated to false, then unset the num_true bits, and set the final size() - num_true bits.

@bkille bkille added this to the STL algorithms milestone May 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants