Feature Request: Add ability to easily subset/filter log files #162

Open
parmsam-pfizer opened this issue Feb 14, 2023 · 8 comments · Fixed by #164 or #166
Labels
enhancement New feature or request release 0.3.0

Comments

@parmsam-pfizer
Collaborator

Feature Idea

The text log file seems to be formatted in a pretty standard way. With some enhancements, maybe users could parse this text file and grab the info they need (a list of package versions, for example). Or a feature could be added to write the log to an object that is easier to subset/filter (like JSON, or RDS with a nested list).
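To make the nested-list idea concrete, here is a minimal base R sketch. The element names (`metadata`, `packages`, `warnings`) and every value in it are placeholders made up for illustration; they are not what logrx writes today.

```r
# Hypothetical structure: one list element per log section.
# All names and values below are illustrative placeholders.
parsed_log <- list(
  metadata = list(file = "example.R", user = "someone"),
  packages = data.frame(
    package = c("dplyr", "stringr"),
    version = c("1.1.0", "1.5.0")
  ),
  warnings = character()
)

# Round-trip through an RDS file, as a stand-in for a saved log object
rds_path <- tempfile(fileext = ".rds")
saveRDS(parsed_log, rds_path)
restored <- readRDS(rds_path)

# Subsetting is then ordinary list / data frame indexing
stringr_row <- restored$packages[restored$packages$package == "stringr", ]
stringr_row$version
```

With an object like this, "list of package versions" becomes a one-line subset instead of a text-parsing job.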

Relevant Input

No response

Relevant Output

No response

Reproducible Example/Pseudo Code

No response

@kodesiba
Collaborator

There is the option to keep the environment object we use after execution, which can be accessed in scripting, but more flexibility is definitely worth having. You are correct, the log is pretty predictable and could be parsed. It was something we'd talked about but never implemented.

@nicholas-masel
Collaborator

I saw an upcoming presentation for PHUSE US Connect, SS04: Post-mortem Logs in R, that is parsing logrx/timber.

I can reach out to my colleague as well to help in getting requirements and see if she has some code to contribute as a starting point.

@bms63
Collaborator

bms63 commented Feb 14, 2023

I forgot about this presentation. Thanks for the reminder.

@tkakinyi

Hi, I do have some code that I am writing for PHUSE that may be a good starting point for this enhancement. At the moment it can parse based on strings entered by the developer; I'm hoping to get it to the point where it accepts strings entered by the user. One challenge: unlike SAS logs, which have "warning" or "error" in the respective line, the logrx logs are organized in sections with section headers. Nonetheless, they can still be parsed. I'm still playing around with the code and happy to share it for ideas.
PS: Does anyone have some "dirty" logs I can use to develop?

@bms63
Collaborator

bms63 commented Feb 15, 2023

Hey @tkakinyi, we don't have any dirty logs. For our unit tests, we just create them temporarily and then remove them.

If you make some scripts and logs with some "dirtiness", perhaps we can store them in a dev folder on this repo for reproducibility?

@parmsam-pfizer
Collaborator Author

Here's some code to split the log by section headers (on a file named example-logrx.log). It might be worth adding a dash sequence, similar to what appears under the Session Information output, for subsections (under Errors and Warning and Message, Output, and Results, for example). That would make it easier to nest them.

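library(stringr)

# Read the log and split it into sections keyed by their headers
log_txt <- readLines("example-logrx.log")
divider <- strrep("-", 80)  # header lines sit between two divider lines
sect_headers <- character()
in_header <- FALSE
sect_info <- list()
for (line in log_txt) {
  if (line == divider) {
    # A divider toggles between header lines and section body lines
    in_header <- !in_header
  } else if (in_header) {
    sect_headers <- c(sect_headers, line)
    # Start an empty entry so every header has a matching element,
    # even when the section has no body lines
    sect_info[[length(sect_headers)]] <- character()
  } else if (length(sect_headers) > 0) {
    # Body line: append to the current section
    # (lines before the first header are skipped)
    cur_pos <- length(sect_headers)
    sect_info[[cur_pos]] <- c(sect_info[[cur_pos]], line)
  }
}
# Strip the leading "-   " and trailing "   -" padding from each header
sect_headers <- str_remove_all(sect_headers, "-?\\s{3,}-?")
names(sect_info) <- sect_headers
sect_info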
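Once the log is a named list, subsetting and filtering is plain list indexing. The sketch below wraps the same idea in a function and runs it on a tiny in-memory log. Note that `split_log_sections()` is a hypothetical helper, not part of logrx, and the sample log lines are invented for illustration; it assumes headers sit between two lines of exactly 80 dashes, as in the loop above.

```r
# Hypothetical helper (not part of logrx): split log lines into a named list.
# Assumes each header line sits between two lines of exactly 80 dashes.
split_log_sections <- function(lines, divider = strrep("-", 80)) {
  is_div <- lines == divider
  # An odd count of dividers so far means the line is inside a header block
  in_header <- cumsum(is_div) %% 2 == 1 & !is_div
  headers <- trimws(gsub("^-|-$", "", lines[in_header]))
  # Section id of each line = number of header lines seen so far
  sect_id <- cumsum(in_header)
  body <- !is_div & !in_header & sect_id > 0
  # Using explicit factor levels keeps sections that have no body lines
  sections <- split(
    lines[body],
    factor(sect_id[body], levels = seq_along(headers))
  )
  names(sections) <- headers
  sections
}

# Invented sample log, for illustration only
div <- strrep("-", 80)
example_log <- c(
  div, "-                  Session Information                  -", div,
  "R version placeholder",
  div, "-                  Errors and Warnings                  -", div,
  "Errors:",
  "Warnings:",
  "NAs introduced by coercion"
)

sections <- split_log_sections(example_log)

# Subset one section by name
sections[["Errors and Warnings"]]

# Filter every section for lines mentioning a string
hits <- lapply(sections, grep, pattern = "coercion", value = TRUE)
```

The vectorised version avoids growing vectors inside a loop, but the loop above is easier to adapt if subsections get their own dash sequence later.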

@bms63
Collaborator

bms63 commented Feb 17, 2023

So is this going to lead us to a Post-Mortem Logs Vignette? :)

@tkakinyi

tkakinyi commented Feb 17, 2023

Hi all,
Check out what is, in my opinion, a good starting point at https://github.com/tkakinyi/phuse2023/tree/main
I have also included 3 "dirty" logs from logrx: rloud was created twice so I could have files of different sizes, and admiral_Adae is from admiral; I just messed with the file a bit to generate messages. Running these with logrx should give log files with some errors, warnings and messages. For errors, though, only one can be generated, since as far as I understand R stops execution when it encounters an error.
To test the SAS functionality I used internal code, and these are more ubiquitous, so I did not include them. It's a pretty large function, so it could possibly be "chunked" out.
Known issues for further development:

  1. The add_r_sxtn can only be used when parsing an individual {logrx} file, as in example 5.
  2. The source code is currently one large function, which may be costly in system run time.
  3. The argument select_file can only accommodate one file at a time.

I did not include them in your repo as I do not know how it is set up. So far this is just an in-script function, developed in v 4.2.2, to be sourced.
[edit] - any immediate feedback is welcome
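For anyone who wants to make their own "dirty" script, here is a minimal base R sketch. The script contents are made up for illustration. It also shows the point about errors: the message after `stop()` is never reached, so a single run yields at most one error. The condition handling here is plain base R, used only to show what gets signalled; it is not how logrx captures anything.

```r
# Write a small, deliberately "dirty" script to a temp file (contents invented)
dirty_script <- tempfile(fileext = ".R")
writeLines(c(
  'message("note: starting derivation")',
  'warning("value out of expected range")',
  'x <- as.numeric("abc")  # triggers a coercion warning',
  'stop("deliberate failure")',
  'message("never reached")'
), dirty_script)

# Collect what the script signals when it is sourced
seen <- list(messages = character(), warnings = character(), error = NULL)
tryCatch(
  withCallingHandlers(
    source(dirty_script, local = new.env()),
    message = function(m) {
      seen$messages <<- c(seen$messages, conditionMessage(m))
      invokeRestart("muffleMessage")
    },
    warning = function(w) {
      seen$warnings <<- c(seen$warnings, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  ),
  # The first error ends execution of the script
  error = function(e) seen$error <<- conditionMessage(e)
)

seen
```

Running the same file through logrx instead (for example with `axecute()`) should then give a log that contains messages, warnings and one error to parse against.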

@bms63 bms63 linked a pull request Jun 26, 2023 that will close this issue
13 tasks
@bms63 bms63 linked a pull request Sep 22, 2023 that will close this issue
13 tasks
Projects
Status: Done
5 participants