Reproducing from the HDFS logs including parsing and encoding #46

c1505 · 2020-08-02T20:02:24Z

The data in this repo is already encoded. I looked through the other issues to understand how to reproduce the results from the original HDFS dataset, but I still can't work out what to do.

I understand that the data needs to be parsed and encoded, and that Drain is a recommended tool for parsing. Beyond that, it isn't clear whether Drain is the tool that was actually used, or which part or parts of the parsed data to use. The conclusion of the paper says: "DeepLog learns and encodes entire log message including timestamp, log key, and parameter values." I'm unsure whether this implementation does the same.


wuyifan18 · 2020-08-03T06:08:26Z

This repo only implements the log key anomaly detection model.

c1505 · 2020-08-03T17:13:17Z

Thanks @wuyifan18. How did you go about tokenizing the log keys and creating the numerical representation from them?

wuyifan18 · 2020-08-04T01:10:05Z

@c1505 Just encode log keys from 0 to the number of log keys.
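A minimal sketch of what such an encoding could look like, assuming the parser has already assigned each log line a template ID. The IDs `E5`, `E22`, and `E11` and the helper names are hypothetical and only illustrate the idea; this is not necessarily how the repo's data was produced.

```python
def build_key_index(event_ids):
    """Map each distinct log key to an integer, in order of first appearance."""
    key_to_index = {}
    for event_id in event_ids:
        if event_id not in key_to_index:
            key_to_index[event_id] = len(key_to_index)
    return key_to_index


def encode(event_ids, key_to_index):
    """Replace every log key with its integer index."""
    return [key_to_index[event_id] for event_id in event_ids]


# Hypothetical template IDs as a log parser might emit them.
parsed_keys = ["E5", "E22", "E5", "E11", "E22"]
key_to_index = build_key_index(parsed_keys)
encoded = encode(parsed_keys, key_to_index)
print(key_to_index)
print(encoded)
```

The mapping should be built once on the training data and reused for test data, so that the same log key always gets the same integer.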

Nothing-bit · 2020-11-29T07:17:33Z

Could you share the original labeled logs used in this code?

shoaib-intro · 2022-04-04T11:21:14Z

@wuyifan18 I have the same question. Could you please share the encoding technique? If you don't feel comfortable doing so, please share some articles!

OutOfBoundCats · 2022-04-04T17:16:10Z

@shoaib-intro can you please check #41 (comment)? It may help, although I am not so sure.

Nothing-bit · 2022-04-07T11:13:10Z

> @wuyifan18 I have the same question. Could you please share the encoding technique? If you don't feel comfortable doing so, please share some articles!

The encoding technique uses loghub and logparser: the first provides the original log files, and the second provides the log template generator. Both can be found on GitHub.
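Once the raw logs are parsed into templates, the lines still have to be grouped into sessions before they can be encoded. A rough sketch for HDFS, where sessions are usually keyed by block ID, is shown here. It assumes each parsed row exposes the message text under `Content` and the template ID under `EventId`; those column names, the sample rows, and the `group_by_block` helper are assumptions for illustration and should be checked against the parser's actual output.

```python
import re
from collections import defaultdict

# HDFS block IDs look like blk_<number>, where the number may be negative.
BLOCK_ID = re.compile(r"blk_-?\d+")


def group_by_block(rows):
    """Collect template IDs per block ID, preserving log order."""
    sessions = defaultdict(list)
    for row in rows:
        # A line may mention the same block several times; count it once.
        for block_id in set(BLOCK_ID.findall(row["Content"])):
            sessions[block_id].append(row["EventId"])
    return dict(sessions)


# Hypothetical parsed rows; real ones would come from the parser's CSV output.
rows = [
    {"Content": "Receiving block blk_-100 src: /10.0.0.1", "EventId": "E5"},
    {"Content": "allocateBlock: /tmp/job.jar blk_-100", "EventId": "E22"},
    {"Content": "Receiving block blk_200 src: /10.0.0.2", "EventId": "E5"},
]
sessions = group_by_block(rows)
print(sessions)
```

Each session's list of template IDs can then be mapped to integers and written out as one sequence per line.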

shoaib-intro · 2022-04-07T11:44:13Z

> @shoaib-intro can you please check #41 (comment)? It may help, although I am not so sure.

Yes, I have gone through it, thanks. The problem is that a block ID is not always available for application logs. In that case I combined log keys by Component, which is unique in my case. Some components have a sequence length of **213k**, and there I hit an index out of bounds error, `IndexError: Target -1 is out of bounds.`, on the line `loss = criterion(output, label.to(device))`. Any idea what causes that?
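For very long per-component sequences, one common approach is to cut the sequence into fixed-size windows, each paired with the log key that follows it as the prediction target. The sketch below assumes the keys are already integer-encoded; the `sliding_windows` helper and the window size of 3 are illustrative choices, not values taken from this repo.

```python
def sliding_windows(sequence, window_size):
    """Split one long key sequence into (window, next key) training pairs."""
    inputs, labels = [], []
    for start in range(len(sequence) - window_size):
        inputs.append(sequence[start:start + window_size])
        labels.append(sequence[start + window_size])
    return inputs, labels


# Hypothetical encoded log keys for a single component.
component_sequence = [0, 1, 2, 1, 0, 3]
inputs, labels = sliding_windows(component_sequence, window_size=3)
print(inputs)
print(labels)
```

Every label produced this way is a value taken directly from the sequence, so any invalid key in the data ends up as an invalid training target.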

OutOfBoundCats · 2022-04-07T13:03:51Z

@shoaib-intro
I am sorry, but I really don't have any idea about that.

shoaib-intro · 2022-08-03T09:29:32Z


The `IndexError: Target -1 is out of bounds.` error happened because my training data contained negative numbers. I removed them and the issue was resolved.
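A quick check like the following can catch such values before training. PyTorch's `CrossEntropyLoss` expects class-index targets in the range 0 to the number of classes minus one, so any key outside that range will fail. The `find_invalid_keys` helper, the sample sequence, and the class count of 3 are hypothetical.

```python
def find_invalid_keys(sequence, num_classes):
    """Return (position, key) pairs for keys outside [0, num_classes)."""
    return [
        (position, key)
        for position, key in enumerate(sequence)
        if not 0 <= key < num_classes
    ]


# Hypothetical encoded sequence with one negative and one too-large key.
sequence = [0, 2, -1, 1, 5]
num_classes = 3
invalid = find_invalid_keys(sequence, num_classes)
cleaned = [key for key in sequence if 0 <= key < num_classes]
print(invalid)
print(cleaned)
```

Dropping the offending keys changes the sequence, so it may be worth finding out why they appear (for example, unparsed lines or an off-by-one in the encoding) rather than only filtering them.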
