0.11.0 Mar 15, 2018: KLL quantiles sketch, tuple sketch API change and more
AlexanderSaydakov released this 16 Mar 02:20
- New KLL sketch, KllFloatsSketch:
  - This is a new quantiles sketch with better accuracy per stored bit than the original quantiles DoublesSketch. If you select a value of K for the KLL sketch so that it matches the accuracy of the DoublesSketch, the K will be larger, but the space required will be much smaller. This sketch is specifically tuned for the smallest possible space usage (near the theoretical optimum) and uses floats rather than doubles. On update this new KLL sketch is a little faster than the original DoublesSketch, but it may be slower on merge. Also, this KLL sketch currently does not have a generic version (as the DoublesSketch does), nor does it provide off-heap capability like the DoublesSketch. Refer to the javadocs for a link to the KLL theoretical paper.
- Tuple:
  - generic sketch API change
    - removed the convention of requiring static methods with a certain signature; these methods are now based on a more visible API
    - added SummaryDeserializer
    - the need to serialize factories has been removed
    - removed getSummaries() method; use the iterator instead
- Theta:
  - added new SingleItemSketch: a fast way to create sketches with a single input item
- Original quantiles sketch enhancements:
  - added getRank(), which is faster than getCDF() with one split point
  - an empty sketch returns null from getQuantiles(), getPMF() and getCDF()
  - an empty sketch returns NaN from getQuantile(), getMinValue() and getMaxValue()
  - Kolmogorov-Smirnov statistic between two quantiles sketches
  - fixed sorting using comparator in generic ItemsSketch
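
For readers unfamiliar with the Kolmogorov-Smirnov statistic mentioned above, here is a minimal sketch of what it measures: the largest absolute gap between the cumulative distributions of two data sets. This is not the library's implementation. It works on exact, in-memory samples rather than on sketches, and the class and method names (KsExample, ksStatistic) are hypothetical, chosen only for illustration. It also assumes both inputs are non-empty.

```java
import java.util.Arrays;

/** Exact (non-sketch) illustration of the two-sample Kolmogorov-Smirnov statistic. */
final class KsExample {

    /**
     * Returns the largest absolute difference between the empirical CDFs
     * of the two samples. Both arrays are assumed to be non-empty.
     */
    static double ksStatistic(double[] a, double[] b) {
        // Work on sorted copies so the caller's arrays are left untouched.
        double[] x = a.clone();
        double[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);

        int i = 0;
        int j = 0;
        double maxGap = 0.0;
        // Walk both samples in value order. After consuming every item equal
        // to v, i / x.length and j / y.length are the two CDFs evaluated at v.
        while (i < x.length && j < y.length) {
            double v = Math.min(x[i], y[j]);
            while (i < x.length && x[i] == v) {
                i++;
            }
            while (j < y.length && y[j] == v) {
                j++;
            }
            double gap = Math.abs((double) i / x.length - (double) j / y.length);
            maxGap = Math.max(maxGap, gap);
        }
        // Once one sample is exhausted its CDF is 1 and the other CDF can only
        // move toward 1, so the gap cannot grow any further.
        return maxGap;
    }

    public static void main(String[] args) {
        double[] first = {1, 2, 3, 4};
        double[] second = {3, 4, 5, 6};
        System.out.println(ksStatistic(first, second)); // prints 0.5
    }
}
```

A value of 0 means the two empirical distributions coincide, and a value of 1 means they do not overlap at all. With sketches the same comparison would be made on approximate CDFs, so the result carries the sketches' rank error.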