GeoWave Indexing and Querying

Index

In the simplest case, if the entire Hilbert index is viewed as a tree, each "tier" can be identified with a level of that tree. The tiering is configurable, however (a single tier, tiers that are unrelated to one another, etc.), so that interpretation does not always hold.

The "bin" is a function of space and time. In the most basic case, that of BasicDimensionDefinition, it simply clamps the data to a predefined range (encoded within a descendant of NumericDimensionDefinition, e.g. BasicDimensionDefinition).

The number of tiers and the precision of the dimensions within those tiers are configurable.
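As a rough illustration of the two ideas above, the following sketch clamps a value to a dimension's range and then maps it to a cell at a given number of bits of precision, with one precision per tier. The class and method names (`BinTierSketch`, `clamp`, `cellAtTier`) are hypothetical and are not part of the GeoWave API; the actual binning and tiering logic in GeoWave may differ.

```java
final class BinTierSketch {
    // Clamp a value into [min, max]; stands in for the most basic "bin" behaviour.
    static double clamp(double value, double min, double max) {
        return Math.max(min, Math.min(max, value));
    }

    // Cell index of a value along one dimension at a tier with `bits` bits of
    // precision, i.e. 2^bits equal-width cells across [min, max].
    static long cellAtTier(double value, double min, double max, int bits) {
        long cells = 1L << bits;
        double normalized = (clamp(value, min, max) - min) / (max - min);
        long cell = (long) Math.floor(normalized * cells);
        return Math.min(cell, cells - 1); // the upper bound falls in the last cell
    }

    public static void main(String[] args) {
        // The same longitude lands in progressively finer cells as precision grows.
        for (int bits = 0; bits <= 3; bits++) {
            System.out.println("bits=" + bits + " cell=" + cellAtTier(90.0, -180.0, 180.0, bits));
        }
    }
}
```

With longitude 90 on a [-180, 180] range, the cell index is 0, 1, 3, and 6 at 0, 1, 2, and 3 bits of precision respectively, which mirrors descending one level of the tree per tier.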

Insert

The following chain of events sketches how row IDs are generated in the case of a TieredSFCIndexStrategy.

  1. TieredSFCIndexStrategy.getInsertionIds
  2. TieredSFCIndexStrategy.internalGetInsertionIds
  3. TieredSFCIndexStrategy.getRowIds
  4. TieredSFCIndexStrategy.getRowIdsAtTier
  5. TieredSFCIndexStrategy.decomposeRangesForEntry

In most cases, an entry is expected to intersect only a single key (or possibly a small constant number of keys) in some appropriate tier of the GeoWave index.
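To make the notion of an "appropriate tier" concrete, here is a deliberately simplified sketch that picks the finest tier at which an entry's bounding box falls inside a single grid cell. It assumes a longitude/latitude domain and one grid per tier with 2^bits cells per dimension. `TierSelectionSketch` and its methods are hypothetical names, and this is not the logic of TieredSFCIndexStrategy itself.

```java
final class TierSelectionSketch {
    // Cell index of v along one dimension with 2^bits equal-width cells over [min, max].
    static long cell(double v, double min, double max, int bits) {
        long cells = 1L << bits;
        double clamped = Math.max(min, Math.min(max, v));
        long c = (long) Math.floor((clamped - min) / (max - min) * cells);
        return Math.min(c, cells - 1);
    }

    // Finest tier (largest bits value) at which the box [minX, maxX] x [minY, maxY]
    // lies entirely within one cell. Tier 0 is a single cell, so it always matches.
    static int finestSingleCellTier(double minX, double maxX,
                                    double minY, double maxY, int maxBits) {
        for (int bits = maxBits; bits > 0; bits--) {
            boolean sameX = cell(minX, -180, 180, bits) == cell(maxX, -180, 180, bits);
            boolean sameY = cell(minY, -90, 90, bits) == cell(maxY, -90, 90, bits);
            if (sameX && sameY) {
                return bits;
            }
        }
        return 0;
    }
}
```

For example, `finestSingleCellTier(10, 20, 10, 20, 8)` returns 3, while a small box straddling the origin, `finestSingleCellTier(-1, 1, -1, 1, 8)`, falls all the way back to tier 0. Cases like the latter are one reason an implementation might prefer to allow a small constant number of keys per entry at a finer tier rather than insist on exactly one.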

Query Planning

Primary Index

This section assumes a purely geometric query of the dataStore.query(...) style on an Accumulo-backed instance of GeoWave. It appears that the HBase backend works in a substantially similar way.

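// Placeholders (???) stand in for objects that are configured elsewhere.
val adapter: RasterDataAdapter = ???
val customIndex: CustomIdIndex = ???
val dataStore: AccumuloDataStore = ???
val geom: Geometry = ???
val queryOptions = new QueryOptions(adapter, customIndex)
val query = new IndexOnlySpatialQuery(geom)
// A default spatial index; note that it is not referenced by the query below.
val index = (new SpatialDimensionalityTypeProvider.SpatialIndexBuilder).createIndex()

// Run the geometry-constrained query against the data store.
dataStore.query(queryOptions, query)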

Restricting attention to query planning, the code above leads to the following sequence of calls:

  1. AccumuloDataStore.query, which is actually BaseDataStore.query
  2. BaseDataStore.queryConstraints, which is actually AccumuloDataStore.queryConstraints
  3. AccumuloConstraintsQuery.query, which is actually AccumuloFilteredIndexQuery.query
  4. AccumuloFilteredIndexQuery.getScanner which is actually AccumuloQuery.getScanner
  5. AccumuloQuery.getRanges which is actually AccumuloFilteredIndexQuery.getRanges which is actually AccumuloConstraintsQuery.getRanges
    • This is where the base object referred to above is used
  6. ConstraintsQuery.getRanges
  7. DataStoreUtils.constraintsToByteArrayRanges
  8. TieredSFCIndexStrategy.getQueryRanges
  9. TieredSFCIndexStrategy.getQueryRanges
  10. SpaceFillingCurve.decomposeRange which is actually HilbertSFC.decomposeRange which is actually HilbertSFCOperations.decomposeRange which is actually PrimitiveHilbertSFCOperations.decomposeRange
  11. com.google.uzaygezen.core.Query.getFilteredIndexRanges
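The final steps of the chain above turn a spatial constraint into a set of contiguous ranges on the Hilbert curve. The sketch below illustrates that idea by brute force on a small 2D grid: it enumerates every cell in the query window, computes each cell's Hilbert index with the textbook iterative algorithm, and merges consecutive indices into ranges. `HilbertRangeSketch` is a hypothetical name. The uzaygezen library uses a compact Hilbert curve and a far more efficient decomposition, and its curve orientation may differ from the one used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class HilbertRangeSketch {
    // Hilbert index of cell (x, y) on an n-by-n grid; n must be a power of two.
    static long hilbertIndex(int n, int x, int y) {
        long d = 0;
        for (int s = n / 2; s > 0; s /= 2) {
            int rx = (x & s) > 0 ? 1 : 0;
            int ry = (y & s) > 0 ? 1 : 0;
            d += (long) s * s * ((3 * rx) ^ ry);
            // Rotate/flip so the next level sees the quadrant in canonical orientation.
            if (ry == 0) {
                if (rx == 1) {
                    x = n - 1 - x;
                    y = n - 1 - y;
                }
                int t = x;
                x = y;
                y = t;
            }
        }
        return d;
    }

    // Inclusive [start, end] index ranges covering the window [minX, maxX] x [minY, maxY].
    static List<long[]> decompose(int n, int minX, int minY, int maxX, int maxY) {
        TreeSet<Long> indices = new TreeSet<>();
        for (int x = minX; x <= maxX; x++) {
            for (int y = minY; y <= maxY; y++) {
                indices.add(hilbertIndex(n, x, y));
            }
        }
        List<long[]> ranges = new ArrayList<>();
        for (long idx : indices) {
            long[] last = ranges.isEmpty() ? null : ranges.get(ranges.size() - 1);
            if (last != null && last[1] == idx - 1) {
                last[1] = idx; // extend the current run
            } else {
                ranges.add(new long[] {idx, idx}); // start a new run
            }
        }
        return ranges;
    }
}
```

On a 4x4 grid, the window covering cells (1,1) to (2,2) decomposes into three ranges, [2,2], [7,8], and [13,13], whereas the aligned window (0,0) to (1,1) collapses into the single range [0,3]. Each resulting range would correspond to one row-ID range handed to the scanner.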

Secondary Index

Secondary indices (numerical, temporal, textual, user-defined) are also available in GeoWave. If one wishes to perform a query making use of a secondary index, it must be used explicitly. The following code

final CloseableIterator<ByteArrayId> matches = secondaryIndexQueryManager.query(
                (BasicQuery) query,
                secondaryIndex,
                index,
                new String[0]);

produces the following chain of events

A collection of ranges covering the responsive records is generated using the secondary index, and those ranges are then scanned linearly for the desired records.
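The two-step behaviour described above can be sketched with plain sorted maps. In this illustration, which makes no claim about GeoWave's actual data layout, the secondary index maps an attribute value to primary row keys, the matching keys are coalesced into ranges, and each range is then scanned linearly in the primary table. `SecondaryIndexSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeSet;

final class SecondaryIndexSketch {
    // Look up primary row keys for attribute values in [lo, hi] and coalesce
    // consecutive keys into inclusive [start, end] ranges.
    static List<long[]> rangesFor(NavigableMap<Integer, List<Long>> secondary, int lo, int hi) {
        TreeSet<Long> keys = new TreeSet<>();
        for (List<Long> ids : secondary.subMap(lo, true, hi, true).values()) {
            keys.addAll(ids);
        }
        List<long[]> ranges = new ArrayList<>();
        for (long key : keys) {
            long[] last = ranges.isEmpty() ? null : ranges.get(ranges.size() - 1);
            if (last != null && last[1] == key - 1) {
                last[1] = key;
            } else {
                ranges.add(new long[] {key, key});
            }
        }
        return ranges;
    }

    // Scan each range of the primary table in key order and collect the records.
    static List<String> scan(NavigableMap<Long, String> primary, List<long[]> ranges) {
        List<String> out = new ArrayList<>();
        for (long[] r : ranges) {
            out.addAll(primary.subMap(r[0], true, r[1], true).values());
        }
        return out;
    }
}
```

Note that nothing in this sketch estimates whether the secondary lookup is cheaper than a full scan; the ranges are simply produced and read.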

No cost-based optimization is done.