Gabriel Iuhasz edited this page Sep 6, 2016 · 9 revisions

Welcome to the DICE-Anomaly-Detection-Tool (ADT) wiki!

Anomaly detection is an important component of performance analysis for data-intensive applications. We define an anomaly as an observation that does not conform to an expected pattern. Most tools or solutions, such as Sematext and Datadog, are geared towards production environments. In contrast, the DICE Anomaly Detection Tool (ADT) is designed to be used during the development phases of big data applications.

## Overall architecture and workflow

The ADT is made up of a series of interconnected components that are controlled from a simple command line interface. This interface is meant to be used only for the initial version of the tool. Future versions will feature a more user-friendly interface. In total there are 8 components that make up ADT. First we have the dmon-connector component, which is used to connect to DMon. It is able to query the monitoring platform and also send it new data. This data can be detected anomalies or learned models. For each of these types of data, dmon-connector creates a different index inside DMon. For anomalies it creates an index of the form anomaly-UTC, where UTC stands for Unix time. This mirrors how the monitoring platform deals with metrics and their indices, meaning that the index is rotated every 24 hours.
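The index naming and daily rotation can be illustrated with a minimal sketch. The wiki does not specify which Unix timestamp ends up in the index name, so this example assumes it is the start of the 24-hour UTC window that contains the anomaly. The function name `anomaly_index_name` is hypothetical and not part of dmon-connector.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 24 * 60 * 60


def anomaly_index_name(unix_time: int, prefix: str = "anomaly") -> str:
    """Return the index name for the 24-hour window containing unix_time.

    Assumption: the suffix is the Unix time at which the window starts.
    """
    window_start = unix_time - (unix_time % SECONDS_PER_DAY)
    return f"{prefix}-{window_start}"


# Two timestamps inside the same 24-hour window share an index,
# while a timestamp in the next window rotates to a new index.
morning = int(datetime(2016, 9, 6, 8, 0, tzinfo=timezone.utc).timestamp())
evening = int(datetime(2016, 9, 6, 20, 0, tzinfo=timezone.utc).timestamp())
next_day = int(datetime(2016, 9, 7, 0, 5, tzinfo=timezone.utc).timestamp())

same_index = anomaly_index_name(morning) == anomaly_index_name(evening)
rotated = anomaly_index_name(evening) != anomaly_index_name(next_day)
```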

After the monitoring platform is queried, the resulting dataset can be in JSON, CSV or RDF/XML. However, in some situations additional formatting is required. This is done by the data formatter component. It is able to normalize the data, filter different features from the dataset or even window the data. The type of formatting the dataset may or may not need is highly dependent on the anomaly detection method used.
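The three formatting operations mentioned above (filtering, normalization and windowing) might look roughly like the following sketch. It is not the data formatter's actual code: the function names, the choice of min-max scaling and the sample metric names are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Row = Dict[str, float]


def filter_features(rows: Sequence[Row], keep: Sequence[str]) -> List[Row]:
    """Keep only the named features in every row."""
    return [{name: row[name] for name in keep} for row in rows]


def min_max_normalize(rows: Sequence[Row]) -> List[Row]:
    """Scale every feature to the range [0, 1]; constant features map to 0."""
    if not rows:
        return []
    names = list(rows[0].keys())
    low = {n: min(r[n] for r in rows) for n in names}
    high = {n: max(r[n] for r in rows) for n in names}
    return [
        {n: (r[n] - low[n]) / (high[n] - low[n]) if high[n] > low[n] else 0.0
         for n in names}
        for r in rows
    ]


def sliding_windows(rows: Sequence[Row], size: int, step: int = 1) -> List[List[Row]]:
    """Split the rows into overlapping windows of a fixed size."""
    return [list(rows[i:i + size]) for i in range(0, len(rows) - size + 1, step)]


# Hypothetical monitoring samples.
samples = [
    {"cpu": 10.0, "mem": 200.0, "disk": 5.0},
    {"cpu": 30.0, "mem": 400.0, "disk": 5.0},
    {"cpu": 20.0, "mem": 300.0, "disk": 5.0},
]
filtered = filter_features(samples, ["cpu", "mem"])
normalized = min_max_normalize(filtered)
windows = sliding_windows(normalized, size=2)
```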

The feature selection component is used to reduce the dimensionality of the dataset. Not all features of a dataset may be needed to train a predictive model for anomaly detection. In some situations it is therefore important to have a mechanism that allows the selection of only the features that have a significant impact on the performance of the anomaly detection methods. Currently only two types of feature selection are supported: Principal Component Analysis (from Weka) and Wrapper Methods.
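To illustrate the idea behind wrapper methods, the sketch below performs greedy forward selection: it repeatedly adds the feature that improves a score the most and stops when no feature helps. This is a generic illustration, not ADT's implementation. In a real wrapper the score would come from training and validating a model on the candidate subset; here `toy_score` and its numbers are made up so the example stays self-contained.

```python
from typing import Callable, List, Sequence


def forward_select(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Greedily add the feature that raises the score most, until none does."""
    selected: List[str] = []
    remaining = list(features)
    best_score = score(selected)
    while remaining:
        # Evaluate every candidate subset that adds one more feature.
        candidate_score, candidate = max(
            (score(selected + [name]), name) for name in remaining
        )
        if candidate_score - best_score <= min_gain:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected


# Toy stand-in for "train a model on this subset and measure its quality":
# each feature contributes a fixed usefulness, minus a cost per feature.
USEFULNESS = {"cpu": 0.5, "mem": 0.3, "net": 0.01}


def toy_score(subset: List[str]) -> float:
    return sum(USEFULNESS[name] for name in subset) - 0.05 * len(subset)


chosen = forward_select(["cpu", "mem", "net"], toy_score)
```

With these toy values, `net` is left out because its contribution is smaller than the per-feature cost.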

ADT Architecture

The next two components are used for training and then validating predictive models for anomaly detection. For training, a user must first select the type of method desired. The dataset is then split up into training and validation subsets and later used for cross validation. The ratio of validation to training size can be set during this phase. Parameters related to each method can also be set in this component.

Validation is handled by a specialized component which minimizes the risk of overfitting the model and ensures that out-of-sample performance is adequate. It does this by using cross validation and comparing the performance of the current model with past ones.
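The split and cross validation steps can be sketched as follows. This is an illustrative example rather than ADT's code: the function names are hypothetical, and `validation_ratio` is assumed here to be the fraction of the whole dataset held out for validation.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_validation_split(
    data: Sequence[T], validation_ratio: float, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the data and hold out a fraction of it for validation."""
    if not 0.0 < validation_ratio < 1.0:
        raise ValueError("validation_ratio must be between 0 and 1")
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_ratio))
    return shuffled[n_validation:], shuffled[:n_validation]


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, validation_indices) pairs for k-fold cross validation."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of samples")
    folds = [list(range(i, n, k)) for i in range(k)]
    pairs = []
    for i, validation in enumerate(folds):
        # Every fold except the current one is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((train, validation))
    return pairs


train, validation = train_validation_split(list(range(10)), validation_ratio=0.2)
folds = k_fold_indices(n=10, k=5)
```

Each sample appears in exactly one validation fold, so averaging a model's score over the folds gives an estimate of its out-of-sample performance.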

Once validation is complete, the model exporter component transforms the current model into a serialized, loadable form. We will use the PMML format wherever possible in order to ensure compatibility with as many machine learning frameworks as possible. This will also make the use of ADT in a production-like environment much easier.

The resulting model can be fed into DMon. In fact, the core services from DMon (specifically Elasticsearch) have the role of a serving layer in a lambda architecture. Both detected anomalies and trained models are stored in DMon and can be queried directly from the monitoring platform. In essence, this means that other tools from the DICE toolchain need to know only the DMon endpoint in order to see what anomalies have been detected.

Workflow

Furthermore, the training and validation scenarios are in fact the batch layer while unsupervised methods and/or loaded predictive models are the speed layer. Both these scenarios can be accomplished by ADT.

The last component is the anomaly detection engine. It is responsible for detecting anomalies. It is important to note that while it is able to detect anomalies, it is unable to communicate them to the serving layer (i.e. DMon) on its own. It uses the dmon-connector component to accomplish this. The anomaly detection engine is also able to handle unsupervised learning methods.
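The division of labour between the engine and the connector can be sketched as below. Everything here is hypothetical: the class names, the `send_anomalies` method and the in-memory connector are stand-ins, and the simple z-score rule is only a placeholder for whichever unsupervised method is selected, not a method taken from ADT.

```python
from statistics import mean, pstdev
from typing import Dict, List, Protocol, Sequence


class Connector(Protocol):
    """Anything able to deliver anomalies to the serving layer."""

    def send_anomalies(self, anomalies: List[Dict]) -> None: ...


class InMemoryConnector:
    """Stand-in for dmon-connector that just records what it is given."""

    def __init__(self) -> None:
        self.sent: List[Dict] = []

    def send_anomalies(self, anomalies: List[Dict]) -> None:
        self.sent.extend(anomalies)


class DetectionEngine:
    """Detects anomalies and delegates their delivery to a connector."""

    def __init__(self, connector: Connector, threshold: float = 3.0) -> None:
        self.connector = connector
        self.threshold = threshold

    def run(self, values: Sequence[float]) -> List[Dict]:
        mu, sigma = mean(values), pstdev(values)
        anomalies: List[Dict] = []
        if sigma > 0:
            # Flag values that lie more than `threshold` standard deviations away.
            anomalies = [
                {"index": i, "value": v}
                for i, v in enumerate(values)
                if abs(v - mu) / sigma > self.threshold
            ]
        if anomalies:
            # The engine never talks to the serving layer directly.
            self.connector.send_anomalies(anomalies)
        return anomalies


connector = InMemoryConnector()
engine = DetectionEngine(connector)
detected = engine.run([10.0] * 20 + [100.0])
```

Keeping delivery behind a small interface means the detection logic can be exercised without a running DMon instance.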

The anomaly detection engine will also contain and control the Regression Based Anomaly Detection technique developed by Imperial. It is one of the selectable anomaly detection methods.

Acknowledgement

This project has received funding from the European Union’s [Horizon 2020] research and innovation programme under grant agreement No. 644869.

European Union