Note

Because the dataset is too large for this repository, the parsed data is hosted on Google Drive: https://drive.google.com/drive/folders/1jI5SQdT7wZE64E9baQhCMWzTsudfIMoc?usp=sharing
The parsed data is in the /pased_data folder. An early version of the test code and data is also available.

DeepSyslog

Implementation of DeepSyslog.
framework

Requirement

python 3.7
pytorch >= 1.1.0

raw data

You can download the raw data from https://zenodo.org/record/3227177

log parsing and preprocessing

logparser toolkit
We use Spell to parse the HDFS dataset and Drain to parse the BGL dataset.
The raw logs are separated into text data and parameters. Composite words in the text are then split into separate parts, and stop words are removed. Some non-numeric parameters are replaced with labels such as "ip address", "exception", and "port".
The parameters of each log are also saved.
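
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the code used in this repository: the stop-word list, the regular expressions, and the function names `split_composite` and `preprocess` are all assumptions made for the example.

```python
import re

# Hypothetical stop-word list and placeholder patterns, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "for", "is", "on", "at"}
PLACEHOLDERS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), " ip address "),
    (re.compile(r":\d{1,5}\b"), " port "),
    (re.compile(r"\b\w*Exception\b"), " exception "),
]


def split_composite(token):
    """Split camelCase and snake_case tokens into lowercase parts."""
    parts = []
    for piece in re.split(r"[_\W]+", token):
        # Runs of capitals (acronyms) or one capitalized/lowercase word.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", piece))
    return [p.lower() for p in parts]


def preprocess(message):
    """Replace parameters with labels, split words, drop stop words."""
    for pattern, label in PLACEHOLDERS:
        message = pattern.sub(label, message)
    words = []
    for token in message.split():
        words.extend(split_composite(token))
    return [w for w in words if w not in STOP_WORDS]


print(preprocess("Receiving block blk_123 src: /10.250.19.102:54106 in PacketResponder"))
```

Purely numeric pieces such as the block id are dropped by this sketch; in the actual pipeline the parameters are extracted by the log parser and saved separately.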

word embedding

embedding/w2v model.py
Download the pre-trained fastText word vectors from https://fasttext.cc/docs/en/crawl-vectors.html
Load the pre-trained model, convert it to a gensim fastText model, and then continue training it on the training data.
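
As a rough sketch of this step (not the script from this repository), gensim's `load_facebook_model` can load a Facebook-format fastText binary, after which the vocabulary is updated and training continues on the log corpus. The helper names `tokenize_corpus` and `continue_training` and the default number of epochs are assumptions for illustration.

```python
def tokenize_corpus(lines):
    """Turn preprocessed log lines into lists of lowercase tokens."""
    return [line.lower().split() for line in lines if line.strip()]


def continue_training(pretrained_path, sentences, epochs=5):
    """Load Facebook-format fastText vectors and keep training on `sentences`."""
    # Imported lazily so tokenize_corpus works without gensim installed.
    from gensim.models.fasttext import load_facebook_model

    # The full binary model (.bin) is required; .vec files cannot be trained further.
    model = load_facebook_model(pretrained_path)
    # Add tokens from the logs that the pre-trained vocabulary lacks.
    model.build_vocab(sentences, update=True)
    model.train(sentences, total_examples=len(sentences), epochs=epochs)
    return model
```

A call might look like `continue_training("cc.en.300.bin", tokenize_corpus(lines))`, where the file name is an assumed example of a downloaded English model and `lines` holds the preprocessed log messages.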

sentence embedding

embedding/sif by fse.py
Implemented with fse (https://github.com/oborchers/Fast_Sentence_Embeddings): load the trained word embeddings and use the SIF model of fse to generate the sentence embeddings.
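
To make the SIF idea concrete, here is a small pure-Python sketch of its weighting step: each word vector is scaled by a / (a + p(w)), where p(w) is the word's relative frequency in the corpus, and the scaled vectors are averaged per sentence. This is not how fse is called and it leaves out the common-component removal that SIF applies afterwards; the function names and the value of `a` are assumptions for illustration.

```python
from collections import Counter


def sif_weights(sentences, a=1e-3):
    """Return the SIF weight a / (a + p(w)) for every word in the corpus."""
    counts = Counter(word for sentence in sentences for word in sentence)
    total = sum(counts.values())
    return {word: a / (a + count / total) for word, count in counts.items()}


def sif_average(sentence, vectors, weights):
    """Weighted average of the word vectors of one sentence."""
    known = [w for w in sentence if w in vectors and w in weights]
    if not known:
        return None  # no word of this sentence has a vector
    dim = len(vectors[known[0]])
    summed = [0.0] * dim
    for word in known:
        for i, value in enumerate(vectors[word]):
            summed[i] += weights[word] * value
    return [value / len(known) for value in summed]


corpus = [["block", "received"], ["block", "deleted"]]
toy_vectors = {"block": [1.0, 0.0], "received": [0.0, 1.0], "deleted": [1.0, 1.0]}
weights = sif_weights(corpus)
print(sif_average(corpus[0], toy_vectors, weights))
```

Frequent words such as "block" receive a smaller weight than rare ones, so they contribute less to the sentence vector.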

model

Train and test the model in s_p_all.py.

results

Dataset	Precision	Recall	F1
HDFS	0.97	0.99	0.98
BGL	0.98	0.97	0.975
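
As a quick consistency check, and assuming F1 here is the usual harmonic mean of precision and recall, the reported F1 values can be recomputed from the other two columns. The function name `f1_score` is just for this sketch.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {"HDFS": (0.97, 0.99, 0.98), "BGL": (0.98, 0.97, 0.975)}
for name, (p, r, f1) in reported.items():
    print(name, round(f1_score(p, r), 3), f1)
```

Both recomputed values agree with the table to three decimal places.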
