lyliyu/TimeSeriesAnalysis


TimeSeriesAnalysis

This repository contains a collection of time series analytics implementations, including anomaly detection and prediction.

Time Series Anomaly Detection

This is a Spark implementation based on Twitter's paper: https://www.usenix.org/system/files/conference/hotcloud14/hotcloud14-vallis.pdf The basic idea is to use robust statistical metrics (Median Absolute Deviation) to discover outliers in long-term time series data. So far, the STL trend decomposition is not implemented the same way as in the paper.
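To make the robust-statistics idea concrete, here is a minimal sketch of MAD-based outlier scoring on plain Scala collections. It is not the code in this repository and not the full ESD-based test from the paper: the object name `MadSketch`, the threshold of 3.0, and the 1.4826 scaling constant (which makes MAD comparable to a standard deviation for normally distributed data) are all assumptions chosen for illustration.

```scala
object MadSketch {
  // Median of a non-empty sequence (mean of the two middle values for even sizes).
  def median(xs: Seq[Double]): Double = {
    require(xs.nonEmpty, "median of an empty sequence is undefined")
    val sorted = xs.sorted
    val n = sorted.length
    if (n % 2 == 1) sorted(n / 2)
    else (sorted(n / 2 - 1) + sorted(n / 2)) / 2.0
  }

  // Median Absolute Deviation: median of |x - median(xs)|.
  def mad(xs: Seq[Double]): Double = {
    val m = median(xs)
    median(xs.map(x => math.abs(x - m)))
  }

  // Indices of points whose robust z-score |x - median| / (1.4826 * MAD)
  // exceeds the threshold. The threshold value is an illustrative assumption.
  def outlierIndices(xs: Seq[Double], threshold: Double = 3.0): Seq[Int] = {
    val m = median(xs)
    val scaledMad = 1.4826 * mad(xs)
    if (scaledMad == 0.0) Seq.empty // no spread, so no point can be scored
    else xs.zipWithIndex.collect {
      case (x, i) if math.abs(x - m) / scaledMad > threshold => i
    }
  }
}
```

Because the median and MAD are barely affected by a few extreme values, a single spike does not inflate the spread estimate the way it would with a mean and standard deviation, which is why the spike itself still stands out.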

The input file contains time series data in the format of:

timestamp1 value1

timestamp2 value2

timestamp3 value3

...
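As a rough illustration of reading this format, the sketch below parses one line into a timestamp and a numeric value. The `InputSketch` object and `Point` case class are hypothetical names, not taken from this repository. Accepting either whitespace or a comma as the separator is an assumption, made because the layout above shows a space while the example data file has a .csv extension. The timestamp is kept as a string because its format is not specified here.

```scala
import scala.util.Try

object InputSketch {
  // Hypothetical record type: the timestamp stays a String since its format is unspecified.
  final case class Point(timestamp: String, value: Double)

  // Parse "timestamp value" (or "timestamp,value"); return None for malformed lines.
  def parseLine(line: String): Option[Point] =
    line.trim.split("[,\\s]+") match {
      case Array(ts, v) if ts.nonEmpty =>
        Try(v.toDouble).toOption.map(value => Point(ts, value))
      case _ => None
    }
}
```

Returning an `Option` lets a caller drop malformed lines with `flatMap` instead of failing the whole job on one bad record.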

The result is a list of detected outliers, stored in the MongoDB collection TimeSeriesAppAnomaly. Storing the outliers there makes it possible to correlate them with other data as a next step.

To run the application, first assemble a fat jar file: sbt clean assembly.

Then install and start a local mongod: http://docs.mongodb.org/getting-started/shell/installation/

Submit your Spark job: /bin/spark-submit --class "com.lukelab.timeseries.TimeSeriesAnalysis" --master local[2] target/scala-2.11/timeseriesanalysis.jar --dataFile data/tsdata.csv --tenantId abc --stlEnabled true

Arguments:

--dataFile: path to the time series input file

--tenantId: tenant id for multi-tenancy support

--dbUrl: URL for MongoDB, e.g. mongodb://localhost:27017/timeseries

--stlEnabled: whether to take seasonality into account
