
Predictive maintenance #39

Closed
amineebenamor opened this issue Apr 23, 2019 · 6 comments

@amineebenamor

Hello,
First, thank you for your implementations; they have helped me a lot!
I want to build a predictive maintenance pipeline based on logs, for research purposes.
For the predictive maintenance part I'm implementing DeepLog (the only solution I found; I don't think there are others, but let me know if you know of any). However, it assumes that the input training set contains only logs from normal executions.
To make the pipeline applicable to any logs, I don't want to rely on labelled datasets. Besides, logs can be huge, and in many cases it's hard to tell whether an entry is anomalous without domain expertise (Spark logs, for example).
The LSTM in DeepLog needs normal execution logs; otherwise, it would learn anomalies as normal behavior.
So I thought of using one of the unsupervised methods (Log Clustering, PCA, or Invariant Mining, for example) as a preprocessing step, to keep only normal execution logs for DeepLog's training set (rough sketch at the end of this comment).
I would run this whole pipeline on datasets from loghub, which are not labelled.
What do you think of this approach? Is there anything better that can be done? Any advice?
Thank you for your answer!
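To make this concrete, here is a rough, self-contained sketch of the filtering step I have in mind. The PCA residual-subspace detector and the 95th-percentile threshold are just illustrative stand-ins, and X is a hypothetical session-by-event-type count matrix (e.g. from a parser such as Drain or Spell plus session windowing):

```python
import numpy as np

def filter_normal_sessions(X, n_components=3, percentile=95):
    """Keep sessions with a small PCA residual (SPE), i.e. 'probably normal'."""
    Xc = X - X.mean(axis=0)                     # center the count matrix
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                     # top-k principal directions
    residual = Xc - (Xc @ P) @ P.T              # remove the "normal" subspace
    spe = np.square(residual).sum(axis=1)       # squared prediction error per session
    threshold = np.percentile(spe, percentile)  # illustrative threshold choice
    return X[spe <= threshold]                  # candidate normal training set

# Toy data standing in for real event counts
X = np.random.poisson(2.0, size=(1000, 30)).astype(float)
X_normal = filter_normal_sessions(X)
print(X_normal.shape)
```

The sessions kept here would then become DeepLog's training set.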

@ShilinHe
Member

Yes, DeepLog does have this limitation: normal execution logs are required for training. Your idea is a good one, but it really depends on the accuracy of those unsupervised methods. Invariant Mining performs well, so I would try that first. As an alternative, you could use the labeled data but remove all anomalous logs; in that case, your experimental setting would be the same as DeepLog's. BTW, we welcome pull requests from the community.
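If it helps, a first try with Invariant Mining in this repo might look roughly like the demo scripts; the loader arguments and return shape below are assumptions to double-check against the demos:

```python
# Sketch following this repo's demo scripts; exact signatures are assumptions.
from loglizer import dataloader, preprocessing
from loglizer.models import InvariantsMiner

struct_log = 'HDFS_100k.log_structured.csv'  # hypothetical parsed log file

(x_train, _), (x_test, _) = dataloader.load_HDFS(
    struct_log, window='session', train_ratio=0.5, split_type='sequential')

feature_extractor = preprocessing.FeatureExtractor()
x_train = feature_extractor.fit_transform(x_train)  # event count matrix
x_test = feature_extractor.transform(x_test)

model = InvariantsMiner()
model.fit(x_train)              # mines linear invariants; no labels needed
y_pred = model.predict(x_test)  # 1 = flagged anomalous; drop before DeepLog
```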

@Athiq

Athiq commented May 6, 2019

Hi @ShilinHe,
wuyifan18/DeepLog#1
Is this something you are working on?

The idea is: take any random log file, use Spell to extract the templates, and then the question is how to convert those templates into numbers so they can be fed into an LSTM (say) for forecasting. Any suggestions are much appreciated.

@ShilinHe
Member

ShilinHe commented May 6, 2019

Actually, no... we do not release code anywhere else.
If you have parsed the templates out, you can simply name them 0, 1, 2, ...; those indices are the numbers.
That said, in an LSTM I think the input is usually one-hot vectors, which then go through an embedding layer. I am not sure whether DeepLog works in this manner.
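A toy sketch of both options (the template strings and dimensions are made up):

```python
import torch
import torch.nn.functional as F

# Hypothetical templates produced by Spell (or any other parser)
templates = ['Receiving block <*>', 'PacketResponder <*> terminating',
             'Verification succeeded for <*>']
key_index = {t: i for i, t in enumerate(templates)}  # name templates 0, 1, 2, ...

# A parsed log sequence becomes a sequence of integer keys ...
parsed = [templates[0], templates[1], templates[0], templates[2]]
sequence = torch.tensor([key_index[t] for t in parsed])

# ... fed to the LSTM either as one-hot vectors
one_hot = F.one_hot(sequence, num_classes=len(templates)).float()

# ... or through a learned embedding layer instead
embed = torch.nn.Embedding(num_embeddings=len(templates), embedding_dim=8)
dense = embed(sequence)
print(one_hot.shape, dense.shape)  # (4, 3) and (4, 8)
```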

@amineebenamor
Author

> Actually, no... we do not release code anywhere else.
> If you have parsed the templates out, you can simply name them 0, 1, 2, ...; those indices are the numbers.
> That said, in an LSTM I think the input is usually one-hot vectors, which then go through an embedding layer. I am not sure whether DeepLog works in this manner.

Yes: the input layer of DeepLog's LSTM encodes the n possible log keys from K as one-hot vectors, as specified in the DeepLog paper.
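For reference, a minimal PyTorch sketch of that input scheme; the window size, hidden size, and top-g cutoff below are illustrative, not the paper's exact settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NextKeyLSTM(nn.Module):
    """DeepLog-style model: one-hot log keys in, distribution over next key out."""
    def __init__(self, num_keys, hidden_size=64, num_layers=2):
        super().__init__()
        self.num_keys = num_keys
        self.lstm = nn.LSTM(num_keys, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_keys)

    def forward(self, keys):                        # keys: (batch, window) int keys
        x = F.one_hot(keys, self.num_keys).float()  # one-hot over the n keys in K
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])               # logits for the next log key

model = NextKeyLSTM(num_keys=10)
window = torch.randint(0, 10, (32, 8))  # toy batch of key windows
logits = model(window)
# At detection time, flag a window if the true next key is not among the
# top-g predicted candidates, per the DeepLog paper (g is a tunable cutoff).
top_g = logits.topk(3, dim=-1).indices
```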

@Athiq

Athiq commented May 7, 2019

@amineebenamor did you try this with any other dataset? Parsing with Spell or another parser and feeding the result to DeepLog?

@amineebenamor
Author

> @amineebenamor did you try this with any other dataset? Parsing with Spell or another parser and feeding the result to DeepLog?

I'm working on it when I have time, but I haven't done anything yet. You can also try it yourself and see what you get :)
