0.6.0 Jun 29, 2016: Generic Quantiles

@leerho leerho released this 29 Jun 19:39

Major Additions

Generic Quantiles Sketch

Any object that can be compared using a supplied comparator (or one of the default native comparators) can be processed using the new generic version of Quantiles.
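To illustrate what "compared using a supplied comparator" means in practice, here is a minimal, self-contained sketch. It computes an exact quantile by sorting, which is not the library's streaming sketch algorithm and does not use the library's API. The class name ComparatorQuantileDemo and the method quantile are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustration only: an exact, sort-based quantile, NOT the library's sketch.
// It shows how the same items yield different quantiles under different comparators.
final class ComparatorQuantileDemo {

  // Returns the item at the given normalized rank (0.0 to 1.0) under the supplied ordering.
  static <T> T quantile(List<T> items, Comparator<? super T> comparator, double rank) {
    if (items.isEmpty()) {
      throw new IllegalArgumentException("items must not be empty");
    }
    if (rank < 0.0 || rank > 1.0) {
      throw new IllegalArgumentException("rank must be in [0, 1]: " + rank);
    }
    List<T> sorted = new ArrayList<>(items); // copy so the caller's list is untouched
    Collections.sort(sorted, comparator);
    // Clamp so that rank == 1.0 maps to the last item.
    int index = (int) Math.min(sorted.size() - 1, Math.floor(rank * sorted.size()));
    return sorted.get(index);
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("pear", "fig", "banana", "apricot", "apple");

    // A default "native" ordering: lexicographic.
    Comparator<String> natural = Comparator.naturalOrder();
    // A user-supplied ordering: by string length.
    Comparator<String> byLength = Comparator.comparingInt(String::length);

    System.out.println(quantile(words, natural, 0.5));  // prints "banana"
    System.out.println(quantile(words, byLength, 0.5)); // prints "apple"
  }
}
```

The only requirement placed on the item type is that the comparator defines a consistent ordering; nothing else about the items needs to be known.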

Code Improvements in Frequent Items and Quantiles Sketches

Class name consistency

The original class names were long and redundant with the names of their packages:

  • FrequentItemsSketch in the frequencies package
  • ItemsQuantilesSketch in the quantiles package

These classes have been renamed to remove the redundancy and to be more consistent with the class names in the Theta and Tuple packages. This is a one-time change that requires users to update their code to the new names. For example:

  • quantiles package:
    • ItemsQuantilesSketch -> ItemsSketch
    • DoublesQuantilesSketch -> DoublesSketch
    • DoublesQuantilesSketchBuilder -> DoublesSketchBuilder
  • frequencies package
    • FrequentItemsSketch -> ItemsSketch
    • FrequentLongsSketch -> LongsSketch

Similar changes are reflected in the names of the test classes.

Binary Storage Improvements for quantiles/DoublesSketch

The new storage structure is about half the size, on average, so sketches will be smaller to store and faster to merge, serialize, and deserialize. This is also a one-time change, and the new version cannot read the old binary format. Hopefully we have caught this early enough that users don't have many sketches stored in the previous format.

Restructuring

The frequencies/ArrayOf<Type>SerDe classes are useful for the generic versions of both FrequentItems and Quantiles sketches and so have been promoted to the sketches package.
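A generic sketch cannot know how to turn its items into bytes, so it needs a helper that does this for a given item type. The following is a minimal sketch of that idea, assuming a simple length-prefixed UTF-8 layout. The interface ItemArraySerDe, the class StringArraySerDe, and both method signatures are hypothetical; the library's actual ArrayOf<Type>SerDe classes may have different signatures and a different byte layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical interface for illustration; not the library's actual SerDe API.
interface ItemArraySerDe<T> {
  byte[] serialize(T[] items);
  T[] deserialize(byte[] bytes, int numItems);
}

// Assumed layout: each string is a 4-byte length followed by its UTF-8 bytes.
final class StringArraySerDe implements ItemArraySerDe<String> {

  @Override
  public byte[] serialize(String[] items) {
    byte[][] encoded = new byte[items.length][];
    int total = 0;
    for (int i = 0; i < items.length; i++) {
      encoded[i] = items[i].getBytes(StandardCharsets.UTF_8);
      total += Integer.BYTES + encoded[i].length;
    }
    ByteBuffer buf = ByteBuffer.allocate(total);
    for (byte[] e : encoded) {
      buf.putInt(e.length); // length prefix
      buf.put(e);           // payload
    }
    return buf.array();
  }

  @Override
  public String[] deserialize(byte[] bytes, int numItems) {
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    String[] items = new String[numItems];
    for (int i = 0; i < numItems; i++) {
      byte[] raw = new byte[buf.getInt()];
      buf.get(raw);
      items[i] = new String(raw, StandardCharsets.UTF_8);
    }
    return items;
  }
}
```

Because the helper depends only on the item type and not on the kind of sketch, the same implementation can serve both a frequent-items sketch and a quantiles sketch, which is the motivation for moving these classes to a shared package.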

Javadocs and code formatting

This version includes a number of improvements to the javadocs and to the code formatting to make both easier to read. This is an ongoing process.

External Contributions. Thank you!

George Kankava of DevFactory suggested creating dedicated RuntimeException classes for the library. This has been implemented, so systems that integrate the library can now catch all exceptions thrown by the library classes as SketchesException. George also found a few binary OR operations that should have been logical ORs. These have been fixed.
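The benefit of a common exception base class can be sketched as follows. Only the name SketchesException comes from these notes; it is redefined locally here as a stand-in so the example is self-contained. The subclass names DemoArgumentException and DemoStateException, and the helper ExceptionDemo, are hypothetical and are not claimed to match the library's classes.

```java
// Local stand-in for the library's base exception, for illustration only.
class SketchesException extends RuntimeException {
  SketchesException(String message) {
    super(message);
  }
}

// Hypothetical subclasses; the library's actual subclass names may differ.
class DemoArgumentException extends SketchesException {
  DemoArgumentException(String message) {
    super(message);
  }
}

class DemoStateException extends SketchesException {
  DemoStateException(String message) {
    super(message);
  }
}

final class ExceptionDemo {

  // Hypothetical validation, standing in for a library call that may throw.
  static void checkK(int k) {
    if (k < 1) {
      throw new DemoArgumentException("k must be positive: " + k);
    }
  }

  // A single catch of the base type handles every library-specific failure,
  // while unrelated RuntimeExceptions still propagate to the caller.
  static String describe(Runnable action) {
    try {
      action.run();
      return "ok";
    } catch (SketchesException e) {
      return "library error: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(describe(() -> checkK(4))); // prints "ok"
    System.out.println(describe(() -> checkK(0))); // prints "library error: k must be positive: 0"
  }
}
```

Catching the base type keeps library failures separate from other runtime errors, such as a NullPointerException in the caller's own code.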