- This project was created as a test assignment for a Machine Learning Engineer position at "Secret".
It shows an example of how different HTTP requests can be classified.
In short, the main idea is to extract features from HTTP requests, transform them with an encoder, and predict the class of each request with a simple decision-tree-based classifier.
- Extract features from raw data
- Compress the features to a representation with shape (10,)
- Classify the request based on the compressed data.
- Send the results back in JSON format, e.g. [{"EVENT_ID":"Ax001", "LABEL_PRED": 1}]
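The response format from the last step can be sketched in Python as follows. The helper name `format_predictions` is hypothetical and not taken from the project; only the `EVENT_ID` and `LABEL_PRED` keys come from the example above.

```python
import json


def format_predictions(event_ids, labels):
    """Pair each event ID with its predicted label and serialize to JSON.

    Hypothetical helper: the project's real response code may differ.
    """
    if len(event_ids) != len(labels):
        raise ValueError("event_ids and labels must have the same length")
    records = [
        {"EVENT_ID": event_id, "LABEL_PRED": int(label)}
        for event_id, label in zip(event_ids, labels)
    ]
    return json.dumps(records)


print(format_predictions(["Ax001"], [1]))
# prints: [{"EVENT_ID": "Ax001", "LABEL_PRED": 1}]
```

Casting each label with `int()` keeps the output serializable even if the classifier returns NumPy integer types.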
Raw data extraction is handled by a custom class in the utils directory. The features are compressed with a custom MLPAutoEncoder, which reduces each feature vector from 42 values to 10. Initial labels are assigned with the OPTICS model from scikit-learn. A CatBoostClassifier is then trained on all of the labeled data.
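The "cluster first, then train a classifier on the cluster labels" idea can be illustrated with a minimal sketch. It rests on several assumptions: the data is synthetic, the autoencoder step is skipped (the random vectors stand in for already compressed 10-value features), `min_samples=5` is an arbitrary choice, and scikit-learn's `DecisionTreeClassifier` is used as a lightweight stand-in for CatBoostClassifier. None of this is the project's actual training code.

```python
import numpy as np
from sklearn.cluster import OPTICS
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for compressed features: three blobs of 40 vectors, 10 values each.
centers = np.array([[0.0] * 10, [5.0] * 10, [-5.0] * 10])
X = np.vstack([c + 0.3 * rng.standard_normal((40, 10)) for c in centers])

# Step 1: unsupervised labeling. OPTICS marks samples it treats as noise with -1.
labels = OPTICS(min_samples=5).fit_predict(X)

# Step 2: supervised training on the cluster labels (stand-in for CatBoostClassifier).
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print("cluster labels found:", sorted(set(labels.tolist())))
print("held-out accuracy:", accuracy)
```

Training a classifier on top of the cluster labels lets new requests be labeled with a plain `predict` call, since OPTICS itself has no method for assigning unseen samples to clusters.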
There are 50 classes of HTTP requests, labeled from -1 to 48 (in scikit-learn's OPTICS, the label -1 marks samples treated as noise).
The CatBoostClassifier reaches over 93% accuracy across the 50 classes.
The custom MLPAutoEncoder reaches a mean squared error (MSE) below ~0.01 when encoding and decoding the features after 100 epochs.
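For reference, the reconstruction MSE of a single feature vector can be computed as in the sketch below. The function name and the sample values are illustrative assumptions; how the project averages the error (per vector, per batch, or per epoch) is not stated here.

```python
def reconstruction_mse(original, reconstructed):
    """Mean squared error between a feature vector and its decoded version."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    if not original:
        raise ValueError("vectors must not be empty")
    squared_errors = [(a - b) ** 2 for a, b in zip(original, reconstructed)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only, not taken from the project's data.
features = [0.0, 1.0, 0.5, 0.25]
decoded = [0.1, 0.9, 0.5, 0.25]
mse = reconstruction_mse(features, decoded)
print(mse < 0.01)
```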
OPTICS appears to assign the labels quite well.
For example, class 27 corresponds to bot HTTP requests:
Another example is class 19, which appears to contain dangerous HTTP requests.
a) Go to the app directory.
cd ./app
b) Run main.py, where all configs are defined, with python3.
python3 main.py
To test the FastAPI app on the classification task, run the following command in your terminal:
curl -X 'POST' 'http://127.0.0.1:8127/predict' -H 'accept: application/json' -H 'Content-Type: application/json' -d '[
{
"CLIENT_IP": "188.138.92.55",
"EVENT_ID": "AVdhXFgVq1Ppo9zF5Fxu",
"CLIENT_USERAGENT": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0",
"REQUEST_SIZE": 166,
"RESPONSE_CODE": 404,
"MATCHED_VARIABLE_SRC": "REQUEST_URI",
"MATCHED_VARIABLE_NAME": "url",
"MATCHED_VARIABLE_VALUE": "//tmp/20160925122692indo.php.vob"
}
]'
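The same request can also be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_predict_request` and `predict` are hypothetical, while the URL, headers, and payload mirror the curl command above.

```python
import json
import urllib.request


def build_predict_request(events, base_url="http://127.0.0.1:8127"):
    """Build (but do not send) a POST request for the /predict endpoint."""
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def predict(events, base_url="http://127.0.0.1:8127", timeout=10):
    """Send the events to the running app and return the decoded JSON reply."""
    request = build_predict_request(events, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


events = [
    {
        "CLIENT_IP": "188.138.92.55",
        "EVENT_ID": "AVdhXFgVq1Ppo9zF5Fxu",
        "CLIENT_USERAGENT": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0",
        "REQUEST_SIZE": 166,
        "RESPONSE_CODE": 404,
        "MATCHED_VARIABLE_SRC": "REQUEST_URI",
        "MATCHED_VARIABLE_NAME": "url",
        "MATCHED_VARIABLE_VALUE": "//tmp/20160925122692indo.php.vob",
    }
]
request = build_predict_request(events)
# With the app running, uncomment to get the predictions:
# print(predict(events))
```

Building the request separately from sending it makes the payload easy to inspect without a running server.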
To check that the app is available locally:
curl -X 'GET' 'http://127.0.0.1:8127/'
a) Create application image.
sudo docker build -t classification_http:baseline .
b) Run the image
sudo docker run -d --name test -p 8127:80 classification_http:baseline
To test the FastAPI app running in the Docker container, use the following curl example:
curl -X 'POST' 'http://0.0.0.0:8127/predict' -H 'accept: application/json' -H 'Content-Type: application/json' -d '[
{
"CLIENT_IP": "188.138.92.55",
"EVENT_ID": "AVdhXFgVq1Ppo9zF5Fxu",
"CLIENT_USERAGENT": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0",
"REQUEST_SIZE": 166,
"RESPONSE_CODE": 404,
"MATCHED_VARIABLE_SRC": "REQUEST_URI",
"MATCHED_VARIABLE_NAME": "url",
"MATCHED_VARIABLE_VALUE": "//tmp/20160925122692indo.php.vob"
},
{
"CLIENT_IP": "127.0.0.1",
"EVENT_ID": "CustomEVENTID001",
"CLIENT_USERAGENT": "select user_name from table",
"REQUEST_SIZE": 2000,
"RESPONSE_CODE": 200,
"MATCHED_VARIABLE_SRC": "REQUEST_GET_ARGS",
"MATCHED_VARIABLE_NAME": "REQUEST_GET_ARGS._",
"MATCHED_VARIABLE_VALUE": "Aleksandr Samofalov"
}
]'
To check that the app is available inside the container:
curl -X 'GET' 'http://0.0.0.0:8127/'
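Because the container is started in detached mode, the app may need a moment before it answers. The availability check above can be repeated automatically with a small polling helper. This is a sketch under assumptions: `wait_until_available` and its `fetch` parameter are hypothetical names, and it assumes the root endpoint replies with HTTP 200 once the app is ready.

```python
import time
import urllib.request


def wait_until_available(url, attempts=10, delay=1.0, fetch=None):
    """Poll `url` until it answers with HTTP 200 or the attempts run out.

    `fetch` is an optional callable returning a status code for a URL;
    by default a plain GET via urllib is used.
    """
    if fetch is None:
        def fetch(target):
            with urllib.request.urlopen(target, timeout=5) as response:
                return response.status

    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # Covers refused connections, timeouts, and urllib's URLError/HTTPError.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Usage with the container from step b) running:
# wait_until_available("http://0.0.0.0:8127/")
```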