gravitationalwave01/anomalyDetection

PROBLEM STATEMENT:
	Please see the url https://github.com/InsightDataScience/anomaly_detection for the full problem statement.


DEPENDENCIES:

	JSON parsing uses the nlohmann/json library, which is available online:
		https://github.com/nlohmann/json
	Only a single header file is needed, and it is already included in the ./src directory. You do not need to download or install anything for the JSON parser.

	
	My code also uses the Eigen library. Please follow the instructions here to set up Eigen: https://eigen.tuxfamily.org/dox/GettingStarted.html
	Alternatively, you may simply clone the repository https://bitbucket.org/eigen/eigen/ and pass the directory into which you cloned it to g++ as an include path: -I/path/to/Eigen
	Note that no installation is necessary, because Eigen is header-only. Downloading the header files and adding the Eigen directory to the include path is sufficient.

	I assume that you have a g++ compiler that supports the -std=c++11 flag.


COMPILATION INSTRUCTIONS:

	To run this code you need to compile the source file ./src/process_log.cpp

	Please enter the directory ./src and use the compilation command
		g++ -std=c++11 -I/path/to/Eigen process_log.cpp -o process_log

	Afterwards, please run the script run.sh in the repository root directory.
	Note that in run.sh you must specify four paths: the location of the executable, the locations of the batch.json and stream.json input files, and the location of the output file flagged_purchases.json.
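	As an illustration of how the three data-file paths set in run.sh could reach the executable, here is a minimal sketch. It is not taken from process_log.cpp: the names Paths and parsePaths are hypothetical, and the argument order (batch file, stream file, output file) is an assumption.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical container for the three data-file paths named in run.sh.
struct Paths {
    std::string batch;    // e.g. batch.json
    std::string stream;   // e.g. stream.json
    std::string flagged;  // e.g. flagged_purchases.json
};

// Assumed argument order: <executable> <batch> <stream> <flagged output>.
// Throws std::invalid_argument if the number of arguments is wrong.
Paths parsePaths(int argc, const char* const argv[]) {
    if (argc != 4) {
        throw std::invalid_argument(
            "usage: process_log <batch.json> <stream.json> <flagged_purchases.json>");
    }
    return Paths{argv[1], argv[2], argv[3]};
}
```

	Under these assumptions, main(int argc, char* argv[]) would call parsePaths(argc, argv) once at startup and stop with the usage message if a path is missing.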

	

ASSUMPTIONS MADE:

	1) I assume that each customer only knows a small number of other customers (i.e. the adjacency matrix is sparse). The code will still work without this assumption, but it will be very slow.
	
 	2) I assume that all input data is pre-sorted in chronological order. If the input is not in chronological order, the code will report an error.
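	To illustrate why the sparsity in assumption 1 matters, the sketch below stores only the connections that actually exist, so memory grows with the number of connections rather than with the square of the number of customers. It uses standard-library containers instead of Eigen and is not the representation used in process_log.cpp; the class SparseNetwork and its method names are hypothetical.

```cpp
#include <cstddef>
#include <unordered_map>
#include <unordered_set>

// Hypothetical sparse customer network: each customer id maps to the set of
// ids it is directly connected to. Absent pairs take up no storage.
class SparseNetwork {
public:
    // Record a mutual connection between customers a and b.
    void befriend(int a, int b) {
        if (a == b) return;  // ignore self-links
        neighbors_[a].insert(b);
        neighbors_[b].insert(a);
    }

    // Remove a connection; does nothing if it was never recorded.
    void unfriend(int a, int b) {
        auto ita = neighbors_.find(a);
        if (ita != neighbors_.end()) ita->second.erase(b);
        auto itb = neighbors_.find(b);
        if (itb != neighbors_.end()) itb->second.erase(a);
    }

    // True if a and b are directly connected.
    bool areFriends(int a, int b) const {
        auto it = neighbors_.find(a);
        return it != neighbors_.end() && it->second.count(b) > 0;
    }

    // Number of stored directed entries (two per connection).
    std::size_t storedEntries() const {
        std::size_t total = 0;
        for (const auto& kv : neighbors_) total += kv.second.size();
        return total;
    }

private:
    std::unordered_map<int, std::unordered_set<int>> neighbors_;
};
```

	A dense adjacency matrix for N customers needs N*N entries regardless of how many connections exist, which is where the slowdown mentioned in assumption 1 would come from.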


ERROR LOG:

	The directory ./log_output contains a file called error_log. If the calculation fails, the error messages are written to this log.


Solution to Insight Data Science coding challenge
