You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@nothingmuch has suggested the following improvements in the pull request #1:
The Filter implementations rely on Bloom filters to skip checking negatives. However, the filters are arguably sized inappropriately, specifying a false positive rate for the size of the set, not its corresponding sumset, which may be substantially larger. This is particularly problematic with Bloom filters, as opposed to say cuckoo or quotient filters which are both resizable, and their FPR does not asymptotically approach 100% when close to capacity.
An LFU cache would handle positive matches, avoiding expensive computations (e.g., is_subset_sum).
If adjusting the bit vector-based powerset enumeration to use a grey code instead of 2's complement (i.e., normal integer incrementing), the sums could be adjusted by subtracting or adding one element at a time, allowing the previous sum value to be reused more easily. This might make an LRU cache a better fit than LFU for positive cases.
Filter.contains is invoked repeatedly on a fixed value (left set of SumFilteredPartitionIterator).
The trait object for Filter seems unnecessary, preventing monomorphisation and any optimizations that would enable, and adding indirection in fairly hot code paths.
The text was updated successfully, but these errors were encountered:
FWIW the reason I didn't open an issue is (and i need to sleep on this i have zombie brain after long flight) that i think a better approach would be to refactor the iterators so that they have a simpler structure first, then think about performance improvements
Suggestions for Improvement
@nothingmuch has suggested the following improvements in the pull request #1:
The
Filter
implementations rely on Bloom filters to skip checking negatives. However, the filters are arguably sized inappropriately, specifying a false positive rate for the size of the set, not its corresponding sumset, which may be substantially larger. This is particularly problematic with Bloom filters, as opposed to say cuckoo or quotient filters which are both resizable, and their FPR does not asymptotically approach 100% when close to capacity.An LFU cache would handle positive matches, avoiding expensive computations (e.g.,
is_subset_sum
).If adjusting the bit vector-based powerset enumeration to use a grey code instead of 2's complement (i.e., normal integer incrementing), the sums could be adjusted by subtracting or adding one element at a time, allowing the previous sum value to be reused more easily. This might make an LRU cache a better fit than LFU for positive cases.
Filter.contains
is invoked repeatedly on a fixed value (left set ofSumFilteredPartitionIterator
).The trait object for
Filter
seems unnecessary, preventing monomorphisation and any optimizations that would enable, and adding indirection in fairly hot code paths.The text was updated successfully, but these errors were encountered: