-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rethink and centralize how we deal with logging, warning and error handling in our packages #47
Comments
pacta.portfolio.analysis
and workflow.transition.monitor
This is a rather meta issue that probably belongs somewhere else. @jdhoffa? |
See also thread in RMI-PACTA/workflow.transition.monitor#252 |
Relates to: RMI-PACTA/workflow.data.preparation#17 |
Putting my thoughts on the "how" of logging here, since this seems to be the most cohesive thread at the moment. Standard utilitiesFirst: there should not be any notable difference between running something interactively and running it via a script. The function should be the function, and any behavior changes should be brought about by arguments. This includes the need to make things like the This means that we need to write our functions to be largely ignorant of their contexts, except what is defined by arguments, or shared namespace (functional programming 🎉 ). Therefore, logging should always go to the one place that we know exists:
# add appender so that reprex captures the log output. Also visible in the stdout dropdown below
logger::log_appender(logger::appender_stdout, index = 2)
foo <- function(bar) {
# some error condition
if (!is.numeric(bar)) {
logger::log_error("Argument `bar` must be numeric, not {class(bar)}.")
stop("Argument `bar` must be numeric.")
}
# some warning condition
if (!is.integer(bar)) {
logger::log_warn(
"Argument `bar` is recommended to be integer, not {class(bar)}."
)
warning("Argument `bar` should be integer")
}
return(bar)
}
foo("bar")
#> ERROR [2024-03-13 17:23:41] Argument `bar` must be numeric, not character.
#> Error in foo("bar"): Argument `bar` must be numeric.
foo(1.5)
#> WARN [2024-03-13 17:23:41] Argument `bar` is recommended to be integer, not numeric.
#> Warning in foo(1.5): Argument `bar` should be integer
#> [1] 1.5
foo(1L)
#> [1] 1 Created on 2024-03-13 with reprex v2.0.2 Standard output and standard errorERROR [2024-03-13 17:23:41] Argument `bar` must be numeric, not character.
WARN [2024-03-13 17:23:41] Argument `bar` is recommended to be integer, not numeric. Session infosessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.2 (2023-10-31)
#> os macOS Sonoma 14.2
#> system aarch64, darwin23.0.0
#> ui unknown
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Belgrade
#> date 2024-03-13
#> pandoc 3.1.7 @ /opt/homebrew/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cli 3.6.2 2023-12-11 [1] CRAN (R 4.3.2)
#> digest 0.6.35 2024-03-11 [1] CRAN (R 4.3.2)
#> evaluate 0.23 2023-11-01 [1] CRAN (R 4.3.2)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.2)
#> fs 1.6.3 2023-07-20 [1] CRAN (R 4.3.2)
#> glue 1.7.0 2024-01-09 [1] CRAN (R 4.3.2)
#> htmltools 0.5.7 2023-11-03 [1] CRAN (R 4.3.2)
#> knitr 1.45 2023-10-30 [1] CRAN (R 4.3.2)
#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.1)
#> logger 0.3.0 2024-03-05 [1] CRAN (R 4.3.2)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.2)
#> purrr 1.0.2 2023-08-10 [1] CRAN (R 4.3.2)
#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.3.1)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.3.1)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.3.1)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.3.1)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.3.1)
#> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.3.2)
#> rmarkdown 2.25 2023-09-18 [1] CRAN (R 4.3.1)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.1)
#> styler 1.10.2 2023-08-29 [1] CRAN (R 4.3.1)
#> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.3.2)
#> withr 3.0.0 2024-01-16 [1] CRAN (R 4.3.2)
#> xfun 0.41 2023-11-01 [1] CRAN (R 4.3.2)
#> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.3.2)
#>
#> [1] /opt/homebrew/lib/R/4.3/site-library
#> [2] /opt/homebrew/Cellar/r/4.3.2/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── Multiple output streamsAwesome. so that covers logging to
Thankfully again, # add appender so that reprex captures the log output
logger::log_appender(logger::appender_stdout, index = 2)
# add file outputs to capture logs
logger::log_appender(logger::appender_file("foo.log"), index = 3)
# and because this is a demo, add one that captures in different level and format
logger::log_appender(logger::appender_file("bar.json"), index = 4)
logger::log_threshold("TRACE", index = 4)
logger::log_layout(logger::layout_json(), index = 4)
foo <- function(bar) {
# Add a trace log
logger::log_trace("Entering foo()")
logger::log_trace("Argument `bar` is {bar}")
# some error condition
if (!is.numeric(bar)) {
logger::log_error("Argument `bar` must be numeric, not {class(bar)}.")
stop("Argument `bar` must be numeric.")
}
# some warning condition
if (!is.integer(bar)) {
logger::log_warn(
"Argument `bar` is recommended to be integer, not {class(bar)}."
)
warning("Argument `bar` should be integer")
}
return(bar)
}
foo("some string")
#> ERROR [2024-03-13 18:43:49] Argument `bar` must be numeric, not character.
#> Error in foo("some string"): Argument `bar` must be numeric.
foo(1.5)
#> WARN [2024-03-13 18:43:49] Argument `bar` is recommended to be integer, not numeric.
#> Warning in foo(1.5): Argument `bar` should be integer
#> [1] 1.5
foo(1L)
#> [1] 1
readLines("foo.log")
#> [1] "ERROR [2024-03-13 18:43:49] Argument `bar` must be numeric, not character."
#> [2] "WARN [2024-03-13 18:43:49] Argument `bar` is recommended to be integer, not numeric."
readLines("bar.json")
#> [1] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Entering foo()\"}"
#> [2] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is some string\"}"
#> [3] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"ERROR\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` must be numeric, not character.\"}"
#> [4] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Entering foo()\"}"
#> [5] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is 1.5\"}"
#> [6] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"WARN\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is recommended to be integer, not numeric.\"}"
#> [7] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Entering foo()\"}"
#> [8] "{\"time\":\"2024-03-13 18:43:49\",\"level\":\"TRACE\",\"ns\":\"global\",\"ans\":\"global\",\"topenv\":\"R_GlobalEnv\",\"fn\":\"foo\",\"node\":\"MACCM424GV711\",\"arch\":\"arm64\",\"os_name\":\"Darwin\",\"os_release\":\"23.2.0\",\"os_version\":\"Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112\",\"pid\":8395,\"user\":\"aaxthelm\",\"msg\":\"Argument `bar` is 1\"}" Created on 2024-03-13 with reprex v2.0.2 Standard output and standard errorERROR [2024-03-13 18:43:49] Argument `bar` must be numeric, not character.
WARN [2024-03-13 18:43:49] Argument `bar` is recommended to be integer, not numeric. Session infosessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.2 (2023-10-31)
#> os macOS Sonoma 14.2
#> system aarch64, darwin23.0.0
#> ui unknown
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Belgrade
#> date 2024-03-13
#> pandoc 3.1.7 @ /opt/homebrew/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cli 3.6.2 2023-12-11 [1] CRAN (R 4.3.2)
#> digest 0.6.35 2024-03-11 [1] CRAN (R 4.3.2)
#> evaluate 0.23 2023-11-01 [1] CRAN (R 4.3.2)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.2)
#> fs 1.6.3 2023-07-20 [1] CRAN (R 4.3.2)
#> glue 1.7.0 2024-01-09 [1] CRAN (R 4.3.2)
#> htmltools 0.5.7 2023-11-03 [1] CRAN (R 4.3.2)
#> jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.3.2)
#> knitr 1.45 2023-10-30 [1] CRAN (R 4.3.2)
#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.1)
#> logger 0.3.0 2024-03-05 [1] CRAN (R 4.3.2)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.2)
#> purrr 1.0.2 2023-08-10 [1] CRAN (R 4.3.2)
#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.3.1)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.3.1)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.3.1)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.3.1)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.3.1)
#> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.3.2)
#> rmarkdown 2.25 2023-09-18 [1] CRAN (R 4.3.1)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.1)
#> styler 1.10.2 2023-08-29 [1] CRAN (R 4.3.1)
#> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.3.2)
#> withr 3.0.0 2024-01-16 [1] CRAN (R 4.3.2)
#> xfun 0.41 2023-11-01 [1] CRAN (R 4.3.2)
#> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.3.2)
#>
#> [1] /opt/homebrew/lib/R/4.3/site-library
#> [2] /opt/homebrew/Cellar/r/4.3.2/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── Note here that we're not capturing the messages from But in any case, this gives us what I think we're looking for: a function implemented once, and the logging behavior controlled external to the function, but consistent. By default, if a user runs this function interactively, they'll get the useful messages on # run_pacta.R
output_dir <- sys.getEnv("OUTPUT_DIR")
# include logs along with outputs
logger::log_appender(logger::appender_file(file.path(output_dir, "log.txt"), index = 2)
#write traces to separate dir, so they can be removed/cleaned regularly
logger::log_appender(logger::appender_file(file.path(traces_dir, "log.txt"), index = 3)
logger::log_threshold("TRACE", index = 3)
main(write_to = output_dir) |
@AlexAxthelm this sounds very cool!! NIT: In general, it sounds like @cjyetman doesn't want this discussion to happen on this particular repo that he maintains, so maybe we move this discussion to https://github.com/RMI-PACTA/workflow.pacta ? |
moved. |
@cjyetman @jdhoffa picking up this thread again, I think what we have in Basically, we import I think that adding some |
Happy to close it! |
Context
Many people use R (and the functions that we design for PACTA) interactively. R is developed with this in mind, and tidyverse facilitates this very well.
However, we also have several Docker images and workflows that combine functions and data in particular ways to produce outputs. Two of the largest are: https://github.com/RMI-PACTA/workflow.data.preparation and https://github.com/RMI-PACTA/workflow.transition.monitor
In the case of workflows, it is often useful to run through many functions and decide how to handle errors from each of them (and where to log them) on a case by case. For the transition monitor, we may want to write the logs to an
error_messages.txt
file. For data preparation, we may want to write the logs to amanifest.json
, or some other log output. (We may also want to align the behaviours of these two, and handle logging in a natural and consistent way for both? Relates to RMI-PACTA/pacta.portfolio.audit#2)Benefits of separating the error handling from the function
When writing unit tests, we often just want to know "Is the data being processed in the right way?", ie.
We don't necessarily want to care about the broader behavior of "should it fail if input data came from a step before that had Equity data but no Bonds data, or how do we handle data prep input that is missing some particular scenario that is crucial to one particular metric.
We want to be able to test the precise functionality of the function in the context of the output it produces in isolation.
How functions get combined downstream, and how they should behave when we are trying to generate a report is a separate (but equally important) concept.
Contrary to that, we have a logging function in this package: https://github.com/RMI-PACTA/pacta.portfolio.analysis/blob/main/R/write_log.R
Which is called in various contexts (and often many times) to log different aspects within different functions: https://github.com/RMI-PACTA/pacta.portfolio.analysis/blob/2d322adf82103cab16cc04ffc5e89421b300a431/R/get_input_files.R#L28-L34
This is a bit of an awkward implementation since it makes it difficult to run anything interactively (you would need to have a local logging path). You also need to decide WHICH messages to log (wouldn't it be easier to just automatically log ALL error and warning messages, with appropriate tags?). We are combining the "WHAT" and the "HOW" and it makes it difficult to test.
Extracting the logging and handling functionality would make it much easier to unit_test the actual purpose of functions in
pacta.portfolio.analysis
, while leaving error handling and logging to separate functions, that we can (for example) wrap function calls in when running it in a server or docker context. See a toy example of this functionality in the reprex below:Reprex
Created on 2023-04-13 with reprex v2.0.2
FYI. @cjyetman @AlexAxthelm
The text was updated successfully, but these errors were encountered: