The final version of the insider threat detection system. It has the following features:
- Extracts more compact and discriminative features.
- Proposes a graph-based detection algorithm to improve performance.

It requires the following:
- Apache Spark
- pip install wrapt
- pip install pgmpy

Prepare the dataset as follows:
- Download CERT data r6.2.tar.bz2
- Download answers.tar.bz2
- Extract both r6.2.tar.bz2 and answers.tar.bz2, and place the extracted answers under the r6.2 folder.
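
The extraction step above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the function name `prepare_cert_data`, its parameters, and the `answers` subfolder name are illustrative assumptions. It also assumes that neither archive wraps its contents in a top-level folder of its own, so adjust the target paths if yours do.

```python
import tarfile
from pathlib import Path


def prepare_cert_data(data_archive, answers_archive, data_dir):
    """Extract the dataset into data_dir and the answers into data_dir/answers.

    Hypothetical helper: the names and the 'answers' subfolder are assumptions.
    """
    data_dir = Path(data_dir)
    answers_dir = data_dir / "answers"
    # Create both target folders up front; existing folders are left alone.
    answers_dir.mkdir(parents=True, exist_ok=True)

    # "r:bz2" opens a bzip2-compressed tar archive for reading.
    with tarfile.open(data_archive, "r:bz2") as tar:
        tar.extractall(data_dir)
    with tarfile.open(answers_archive, "r:bz2") as tar:
        tar.extractall(answers_dir)
    return data_dir


# Example call (paths are placeholders):
# prepare_cert_data("r6.2.tar.bz2", "answers.tar.bz2", "data/r6.2")
```

The returned path is the folder that config.io.data_dir would then point to.
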

Set the following before running:
- SPARK_MASTER: address of the Spark master.
- config.io.data_dir: root of the extracted r6.2 data.

Run it with:
- bash run.sh

The run produces the following outputs:
- cache: all necessary intermediate results.
- result: scores of baseline systems.
- CR scores: printed to the terminal with color highlighting.

The metric used for evaluation is cumulative recall (CR), with bucket size 25.
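
As a rough illustration of the metric, the sketch below assumes that CR at k is the sum of recall values measured at 25, 50, ..., k top-ranked items. That reading agrees with the perfect scores quoted in the tables (400 / 25 = 16 and 1000 / 25 = 40), but the function `cumulative_recall` and its signature are hypothetical, and how the repository aggregates scores over days or users is not specified here.

```python
def cumulative_recall(ranked_labels, max_budget, bucket_size=25):
    """Sum recall at budgets bucket_size, 2 * bucket_size, ..., max_budget.

    ranked_labels: 0/1 ground-truth labels, ordered from the most to the
    least anomalous item according to the detector's scores.
    """
    total_positives = sum(ranked_labels)
    if total_positives == 0:
        raise ValueError("ranked_labels must contain at least one positive")

    score = 0.0
    for budget in range(bucket_size, max_budget + 1, bucket_size):
        # Recall at this budget: fraction of all positives in the top `budget`.
        found = sum(ranked_labels[:budget])
        score += found / total_positives
    return score


# Two true positives, ranked 1st and 30th out of 400 items.
labels = [0] * 400
labels[0] = 1
labels[29] = 1
print(cumulative_recall(labels, 400))  # 15.5
```

In this example only half of the positives fall in the first bucket, so that bucket contributes 0.5 and the remaining 15 buckets contribute 1.0 each. A ranking that places every positive within the first 25 items reaches the perfect score of 16.
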
Table 1. CR at 400 (400 / 25 = 16 buckets, so the perfect score is 16)
Algorithms | PCA | SVM | ISO-Forest | DNN |
---|---|---|---|---|
No GTM | 13.64 | 10.36 | 8.10 | 13.91 |
GTM Enabled | 15.00 | 12.00 | 11.27 | 15.54 |
Table 2. CR at 1000 (1000 / 25 = 40 buckets, so the perfect score is 40)
Algorithms | PCA | SVM | ISO-Forest | DNN |
---|---|---|---|---|
No GTM | 37.18 | 34.36 | 32.10 | 36.45 |
GTM Enabled | 39.00 | 35.73 | 35.27 | 39.54 |