GeoWave Feature List

Input

Ingest from the CLI
- Ingest from filesystem -> GeoWave
  - Pointer: LocalToGeowaveCommand.java
  - Notes: Tested, believed to work.
- Ingest from filesystem -> HDFS -> GeoWave
  - Pointer: LocalToMapReduceToGeowaveCommand.java
  - Notes: Not tested, status unknown.
- Stage from filesystem -> HDFS
  - Pointer: LocalToHdfsCommand.java
  - Notes: Not tested, status unknown.
- Stage from filesystem -> Kafka
  - Pointer: LocalToKafkaCommand.java
  - Notes: Not tested, status unknown.
- Ingest from Kafka -> GeoWave
  - Pointer: KafkaToGeowaveCommand.java
  - Notes: Not tested, status unknown.
- Ingest from HDFS -> GeoWave
  - Pointer: MapReduceToGeowaveCommand.java
  - Notes: Not tested, status unknown.
- Pointer: GeoWaveMain.java
- Notes: Requires a plugin for each input file format, and the plugin jars must be on the classpath. The source code for those formats can be found in the extensions/formats directory.
Ingest Using the API
- Bulk
  - Pointer: AccumuloKeyValuePairGenerator.java
  - Notes: For data already stored on HDFS, given an appropriate InputFormat, it is possible to create a class derived from org.apache.hadoop.mapreduce.Mapper that can be used to bulk-insert the data into Accumulo (with appropriate GeoWave keys and values).
- Piecemeal
  - Pointer: IndexWriter.java
  - Notes: Given an appropriate DataStore, Adapter, and Index, it is possible to produce an IndexWriter that can be used to write items one-by-one into the DataStore, using the Adapter, and according to the Index.
File Formats Supported
- avro
- gdelt
- geolife
- geotools-raster (GeoTools-supported raster data)
- geotools-vector (GeoTools-supported vector data)
- gpx
- stanag4676
- tdrive
- Via Extensions:
  - Landsat 8
  - OpenStreetMap

Backends

HBase
Accumulo

Integrations

mrgeo (reading)
GeoTrellis (prospective) (reading and writing)
Via C++ bindings
- PDAL (reading and writing)
- mapnik (reading)

Secondary Indices

Numerical
- Pointer: NumericSecondaryIndexConfiguration.java
Temporal
- Pointer: TemporalSecondaryIndexConfiguration.java
Textual
- Pointer: TextSecondaryIndexConfiguration.java
User Defined
- Pointer: SimpleFeatureUserDataConfiguration.java
- Examples:
  - TimeDescriptionConfiguration allows indexing over intervals
  - VisibilityConfiguration.java
  - StatsConfigurationCollection.java
No Cost-Based Optimization

Processing

k-means
- CLI
  - Pointer: KmeansParallelCommand.java
- Map-Reduce
  - Pointer: KMeansMapReduce.java
  - Notes: Means are supported; medians and centers do not appear to be.
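
To make the "means" variant concrete, here is a minimal sketch of a single k-means iteration: assign each point to its nearest centroid, then replace each centroid with the mean of its assigned points. The class and method names (`KMeansStepSketch`, `nearest`, `step`) are hypothetical and are not taken from KMeansMapReduce.java; the sketch assumes squared Euclidean distance and in-memory arrays rather than a Map-Reduce job.

```java
/** Hypothetical illustration of one k-means iteration; not GeoWave code. */
final class KMeansStepSketch {

    /** Index of the centroid closest to p by squared Euclidean distance. */
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double d = 0.0;
            for (int i = 0; i < p.length; i++) {
                double diff = p[i] - centroids[c][i];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** Assigns each point to its nearest centroid, then returns the per-cluster means. */
    static double[][] step(double[][] points, double[][] centroids) {
        int k = centroids.length;
        int dim = centroids[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        for (double[] p : points) {
            int c = nearest(p, centroids);
            counts[c]++;
            for (int i = 0; i < dim; i++) {
                sums[c][i] += p[i];
            }
        }
        double[][] next = new double[k][];
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                // Empty cluster: keep the previous centroid (an assumption of this sketch).
                next[c] = centroids[c].clone();
            } else {
                next[c] = new double[dim];
                for (int i = 0; i < dim; i++) {
                    next[c][i] = sums[c][i] / counts[c];
                }
            }
        }
        return next;
    }
}
```

Calling `step` repeatedly until the centroids stop moving gives the usual iteration. A median-based variant would replace the averaging loop with a per-dimension median.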
Jump Method (k-discovery)
- CLI
  - Pointer: KmeansJumpCommand.java
- Map-Reduce
  - Pointer: KMeansDistortionMapReduce.java
  - Notes: Uses the approach given in Catherine A. Sugar; Gareth M. James (2003). "Finding the number of clusters in a data set: An information theoretic approach". Journal of the American Statistical Association 98 (January): 750–763 with the common covariance matrix set to the identity matrix.
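
The selection rule from the cited paper can be sketched as follows. With the covariance matrix set to the identity, the distortion for a given k reduces to the average squared Euclidean distance from each point to its nearest centroid, divided by the dimension p. Each distortion is raised to the power -Y (the paper suggests Y = p/2), and the k with the largest increase ("jump") over k-1 is chosen. The class and method names below are hypothetical and are not taken from KMeansDistortionMapReduce.java; the sketch assumes all distortions are strictly positive.

```java
/** Hypothetical illustration of the jump method; not GeoWave code. */
final class JumpMethodSketch {

    /** Average squared Euclidean distance to the nearest centroid, per dimension. */
    static double distortion(double[][] points, double[][] centroids) {
        int dim = points[0].length;
        double total = 0.0;
        for (double[] p : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] c : centroids) {
                double d = 0.0;
                for (int i = 0; i < dim; i++) {
                    double diff = p[i] - c[i];
                    d += diff * diff;
                }
                best = Math.min(best, d);
            }
            total += best;
        }
        return total / (points.length * (double) dim);
    }

    /**
     * Picks k from distortions[0..n-1], where distortions[i] belongs to k = i + 1.
     * The transformed distortion for k = 0 is defined as 0.
     */
    static int chooseK(double[] distortions, double y) {
        double previous = 0.0;
        double bestJump = Double.NEGATIVE_INFINITY;
        int bestK = 0;
        for (int k = 1; k <= distortions.length; k++) {
            double transformed = Math.pow(distortions[k - 1], -y);
            double jump = transformed - previous;
            if (jump > bestJump) {
                bestJump = jump;
                bestK = k;
            }
            previous = transformed;
        }
        return bestK;
    }
}
```

For example, distortions of 10.0, 8.0, 1.0, and 0.9 for k = 1 through 4 with Y = 1 give jumps of roughly 0.1, 0.025, 0.875, and 0.111, so k = 3 is selected.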
Sampling
- Map-Reduce
  - Pointer: KSamplerMapReduce.java
  - Notes: Chooses k random features from either the overall collection of features or from one or more groups of features.
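
One common way to choose k random items from a collection of unknown size in a single pass is reservoir sampling. The sketch below uses that technique purely as an illustration; it is not a description of how KSamplerMapReduce.java works, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical single-pass uniform sampler (reservoir sampling); not GeoWave code. */
final class ReservoirSampleSketch {

    /** Returns up to k items chosen uniformly at random from items. */
    static <T> List<T> sample(Iterable<T> items, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        for (T item : items) {
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k items.
                reservoir.add(item);
            } else {
                // Keep the new item with probability k / seen by replacing a random slot.
                long j = (long) (rng.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }
}
```

Sampling within groups could be approximated by keeping one reservoir per group key.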
Kernel Density Estimation
- CLI
  - Pointer: KdeCommand.java
- Map-Reduce
  - Pointer: AccumuloKDEReducer.java
  - Notes: General background can be found on the Wikipedia page. This implementation uses Gaussian basis functions.
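
As a rough illustration of kernel density estimation with Gaussian basis functions, the sketch below evaluates a 2D density at a point and over a regular grid. It assumes an isotropic Gaussian kernel with a single bandwidth `h`; the bandwidth choice, the grid layout, and all names here are assumptions of the sketch and are not taken from the GeoWave implementation.

```java
/** Hypothetical 2D Gaussian kernel density sketch; not GeoWave code. */
final class GaussianKdeSketch {

    /** Density estimate at (x, y) using an isotropic Gaussian kernel of bandwidth h. */
    static double density(double x, double y, double[][] points, double h) {
        double sum = 0.0;
        for (double[] p : points) {
            double dx = x - p[0];
            double dy = y - p[1];
            sum += Math.exp(-(dx * dx + dy * dy) / (2.0 * h * h));
        }
        // Normalize so that the estimate integrates to 1 over the plane.
        return sum / (points.length * 2.0 * Math.PI * h * h);
    }

    /** Evaluates the density at the center of each cell of a rows x cols grid. */
    static double[][] rasterize(double[][] points, double h,
                                double minX, double minY, double cellSize,
                                int cols, int rows) {
        double[][] grid = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double x = minX + (c + 0.5) * cellSize;
                double y = minY + (r + 0.5) * cellSize;
                grid[r][c] = density(x, y, points, h);
            }
        }
        return grid;
    }
}
```

The grid returned by `rasterize` corresponds loosely to the raster a KDE job would produce, with one density value per cell.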
Nearest Neighbors
- CLI
  - Pointer: NearestNeighborCommand.java
- Map-Reduce
  - Pointer: NNMapReduce.java
  - Notes: Appears to use partitioned direct search.
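
A partitioned direct search can be sketched as follows: points are bucketed into square cells whose side equals the search distance, and each point is then compared directly only against points in its own cell and the eight adjacent cells. This is an illustrative reading of the note above, not a description of NNMapReduce.java; the class and method names are hypothetical, and the sketch assumes 2D points and a strictly positive `maxDist`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical grid-partitioned neighbor search; not GeoWave code. */
final class PartitionedNeighborSketch {

    /** Grid cell index of coordinate v for cells of the given side length. */
    static long cellOf(double v, double size) {
        return (long) Math.floor(v / size);
    }

    /** Returns index pairs (i < j) of points whose Euclidean distance is at most maxDist. */
    static List<int[]> neighborPairs(double[][] pts, double maxDist) {
        // Partition step: bucket point indices by grid cell.
        Map<String, List<Integer>> cells = new HashMap<>();
        for (int i = 0; i < pts.length; i++) {
            String key = cellOf(pts[i][0], maxDist) + ":" + cellOf(pts[i][1], maxDist);
            cells.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        // Direct search step: compare each point with candidates in the 3 x 3 cell block.
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < pts.length; i++) {
            long cx = cellOf(pts[i][0], maxDist);
            long cy = cellOf(pts[i][1], maxDist);
            for (long dx = -1; dx <= 1; dx++) {
                for (long dy = -1; dy <= 1; dy++) {
                    List<Integer> bucket = cells.get((cx + dx) + ":" + (cy + dy));
                    if (bucket == null) {
                        continue;
                    }
                    for (int j : bucket) {
                        double ex = pts[i][0] - pts[j][0];
                        double ey = pts[i][1] - pts[j][1];
                        if (j > i && Math.sqrt(ex * ex + ey * ey) <= maxDist) {
                            pairs.add(new int[] {i, j});
                        }
                    }
                }
            }
        }
        return pairs;
    }
}
```

Because two points within `maxDist` of each other differ by at most one cell index in each dimension, checking the 3 x 3 block of cells is enough to find every qualifying pair.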
Clustering
- Map-Reduce
  - Pointer: [GroupAssignmentMapReduce.java](https://github.com/ngageoint/geowave/blob/7f1194ede7d8efd358f9f26d23dd3fc954be9ca2/analytics/mapreduce/src/main/java/mil/nga/giat/geowave/analytic/mapreduce/kmeans/KSamplerMapReduce.java)
Convex Hulls of Clusters
- Map-Reduce
  - Pointer: ConvexHullMapReduce.java
DBSCAN
- Map-Reduce
  - Pointer: DBScanMapReduce.java
  - Notes: See the Wikipedia page for general background. This implementation deviates from the standard approach. See this code comment for details.
Spark Support
- Pointer: analytics/spark/src/main/scala/mil/nga/giat/geowave/analytics/spark
- Pointer: AnalyticRecipes.scala
- Notes: The analytic recipes file provides tidbits useful for addressing clustering-related questions with GeoWave and Spark.

Output

GeoServer Plugin
- Pointer: GeoserverServiceImpl.java
- Notes: This can be used to view GeoWave layers in GeoServer. Using an SLD such as this one allows large datasets to be shown interactively by subsampling at the pixel level.
Query
- RDD
  - Pointer: GeoWaveInputFormat.java
  - Notes: One can use the GeoWaveInputFormat class to perform a query that returns an RDD of key-value pairs. The procedure is to construct GeoWave Query and QueryOptions objects, insert them into an org.apache.hadoop.conf.Configuration object using static methods on GeoWaveInputFormat, and then pass that configuration object into a call to the newAPIHadoopRDD method.
- Iterator
  - Pointer: DataStore.java
  - Notes: It is possible to perform a query which returns an iterator of values. Given a DataStore, a Query, and a QueryOptions, one uses the query method on the DataStore class.
