Procedure for DeepLog #18

Open
danielhanbitlee opened this issue May 23, 2019 · 2 comments


@danielhanbitlee

Hi,

I want to make sure I have the procedure for implementing DeepLog right. Here's what I'm thinking. Given train log data and test log data, do the following:

  1. Run Spell on the train and test log data to get log keys and features.
  2. Group the Spell outputs into sessions or blocks, for both train and test data.
  3. Write only the log keys to a file, where each row represents one session or block of log entries. Do this for both train and test data.
  4. Train DeepLog on the train sessions that do not contain errors.
  5. Run DeepLog on the test data to make predictions.

Can someone confirm whether this thinking is correct?
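
For reference, steps 3 and 4 could be sketched roughly as follows. This is only an illustration under assumptions: the helper names (`build_key_map`, `write_sessions`, `sliding_windows`), the integer key numbering, and the space-separated one-session-per-row layout are hypothetical and not taken from this repository. The sliding-window pairs reflect the general idea of predicting the next log key from a fixed-length history.

```python
import io


def build_key_map(sessions):
    """Assign each distinct event ID an integer log key, in first-seen order."""
    key_map = {}
    for events in sessions.values():
        for event_id in events:
            key_map.setdefault(event_id, len(key_map) + 1)
    return key_map


def write_sessions(sessions, key_map, out):
    """Write one session per row as space-separated integer log keys."""
    for events in sessions.values():
        out.write(" ".join(str(key_map[e]) for e in events) + "\n")


def sliding_windows(keys, window_size):
    """Yield (history, next_key) pairs for next-key prediction."""
    for i in range(len(keys) - window_size):
        yield keys[i:i + window_size], keys[i + window_size]


# Toy sessions with made-up event IDs (not real Spell output).
sessions = {"block_a": ["E1", "E2", "E1", "E3"], "block_b": ["E1", "E3"]}
key_map = build_key_map(sessions)

buf = io.StringIO()  # stands in for a real output file
write_sessions(sessions, key_map, buf)
lines = buf.getvalue().splitlines()

pairs = list(sliding_windows([1, 2, 1, 3], window_size=2))
```

Sessions shorter than or equal to the window size yield no pairs in this sketch; how to handle them (padding, skipping) is a separate choice.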

@amineebenamor

Almost correct.
You should split the data into train and test sets only after step 4. Before step 4, every step has to be run on the whole dataset.
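
One possible reading of this, as a rough sketch: parse and sessionize the full dataset first, then split by session, keeping only normal sessions for training. The function name `split_sessions`, the `labels` mapping (session ID to an anomaly flag), and the 80% training fraction are assumptions for illustration, not something specified in this thread.

```python
import random


def split_sessions(session_ids, labels, train_fraction=0.8, seed=0):
    """Split session IDs so that training contains only normal sessions.

    labels maps session ID -> True if the session is anomalous (assumed input).
    """
    normal = sorted(s for s in session_ids if not labels[s])
    rng = random.Random(seed)
    rng.shuffle(normal)
    n_train = int(len(normal) * train_fraction)
    train_ids = normal[:n_train]
    train_set = set(train_ids)
    # Everything not used for training (normal or anomalous) goes to test.
    test_ids = [s for s in session_ids if s not in train_set]
    return train_ids, test_ids


# Toy labels: five normal sessions and two anomalous ones.
labels = {"s1": False, "s2": False, "s3": False, "s4": False,
          "s5": False, "s6": True, "s7": True}
train_ids, test_ids = split_sessions(list(labels), labels)
```

Splitting by session rather than by raw log line keeps every entry of a block on the same side of the split.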

@zhangch-fnst

zhangch-fnst commented Aug 15, 2019

@danielhanbitlee @amineebenamor Hi, about step 2: can you tell me how to [Sort the outputs from spell into different sessions or blocks]?
For example, suppose we have the following data:

| EventId | EventTemplate | ParameterList |
| --- | --- | --- |
| 6af214fd | Receiving block <> src <> <> dest <> 50010 | ['blk_-1608999687919862906', '/10.250.19.102:54106', '/10.250.19.102'] |
| 26ae4ce0 | BLOCK* NameSystem.allocateBlock <*> | ['mnt/hadoop/mapred/system/job_200811092030_0001/job.jar. blk_-1608999687919862906'] |
| 6af214fd | Receiving block <> src <> <> dest <> 50010 | ['blk_-1608999687919862906', '/10.250.10.6:40524', '/10.250.10.6'] |
| 6af214fd | Receiving block <> src <> <> dest <> 50010 | ['blk_7503483334202473044', '/10.251.215.16:55695', '/10.251.215.16'] |
| -- | -- | -- |
| 6af214fd | Receiving block <> src <> <> dest <> 50010 | ['blk_7503483334202473044', '/10.250.19.102:34232', '/10.250.19.102'] |
| 6af214fd | Receiving block <> src <> <> dest <> 50010 | ['blk_-1608999687919862906', '/10.250.14.224:42420', '/10.250.14.224'] |
| dc2c74b7 | PacketResponder <> for block <> terminating | ['1', 'blk_-1608999687919862906'] |
| dc2c74b7 | PacketResponder <> for block <> terminating | ['2', 'blk_-1608999687919862906'] |

After sorting by blocks, what will the data look like? Thank you.
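
One way to approach this, as a minimal sketch rather than a confirmed answer from the maintainers: pull the block ID out of each row's ParameterList and append the row's EventId to that block's sequence, preserving log order. The regular expression `blk_-?\d+`, the function name `group_by_block`, and the decision to skip the `-- -- --` row as a separator are assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumed block ID pattern: "blk_" followed by an optionally negative integer.
BLOCK_ID = re.compile(r"blk_-?\d+")

# (EventId, ParameterList) pairs copied from the example above, in log order.
rows = [
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.19.102:54106", "/10.250.19.102"]),
    ("26ae4ce0", ["mnt/hadoop/mapred/system/job_200811092030_0001/job.jar. blk_-1608999687919862906"]),
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.10.6:40524", "/10.250.10.6"]),
    ("6af214fd", ["blk_7503483334202473044", "/10.251.215.16:55695", "/10.251.215.16"]),
    ("6af214fd", ["blk_7503483334202473044", "/10.250.19.102:34232", "/10.250.19.102"]),
    ("6af214fd", ["blk_-1608999687919862906", "/10.250.14.224:42420", "/10.250.14.224"]),
    ("dc2c74b7", ["1", "blk_-1608999687919862906"]),
    ("dc2c74b7", ["2", "blk_-1608999687919862906"]),
]


def group_by_block(rows):
    """Map each block ID to the ordered list of EventIds that mention it."""
    sessions = defaultdict(list)
    for event_id, params in rows:
        block_ids = set()
        for param in params:
            block_ids.update(BLOCK_ID.findall(param))
        for block_id in sorted(block_ids):
            sessions[block_id].append(event_id)
    return dict(sessions)


sessions = group_by_block(rows)
for block_id, events in sessions.items():
    print(block_id, " ".join(events))
```

Under these assumptions, the eight rows collapse into two sessions: one for `blk_-1608999687919862906` with six events and one for `blk_7503483334202473044` with two.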
