Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: rename zenoh::try_init_log_from_env to zenoh::init_logging #1165

Open
wants to merge 7 commits into
base: dev/1.0.0
Choose a base branch
from

Conversation

wyfo
Copy link
Contributor

@wyfo wyfo commented Jun 18, 2024

@milyin If I remember correctly, we already talked about doing this renaming. Here is the PR.

@Mallets

@Mallets
Copy link
Member

Mallets commented Jun 18, 2024

I disagree on this PR. try_init_lof_from_env is more explicit on the fact the logging could not be initialized. Moreover, this kind of change requires updating all the plugins, backends, and bindings.

@wyfo
Copy link
Contributor Author

wyfo commented Jun 18, 2024

I disagree on this PR. try_init_lof_from_env is more explicit on the fact the logging could not be initialized.

That's right, but it brings me two things in mind:

  • The current try_init_log_from_env is "bugged", as it doesn't handled log events, so logging from crates like mio are ignored. (Also, it will fails silently if an other global subscriber has been registered first). I will fix it.
  • I don't think we should expose directly RUST_LOG to the user (especially to the bindings users). I think instead of the current zenoh::try_init_log_from_env, we should provide a proper zenoh::init_logging which would use an environment variable named ZENOH_LOG, which would contains only the level. It would for example allow us to properly parse the level, and raise an error in case of typo, what try_init_log_from_env doesn't do. (Of course, RUST_LOG would be used instead when provided, allowing easier debugging without having to change the code). Using ZENOH_LOG instead of RUST_LOG would also be more friendly to the binding users, and would avoid polluting the logs with dependencies like mio.

EDIT: if the user still want to a finer control on the logging than zenoh::init_logging/ZENOH_LOG provides, he should use tracing_subscriber directly anyway.

@wyfo
Copy link
Contributor Author

wyfo commented Jun 18, 2024

If you are ok with this idea, I can open a PR right now. Exposing a ZENOH_LOG is something I wanted to do for the Python binding, where I didn't dare expose RUST_LOG (as I'm not using the environment filter anyway).

@wyfo
Copy link
Contributor Author

wyfo commented Jun 18, 2024

@gabrik, as you touched to the logging, do you have an opinion on the proposition above?

@gabrik
Copy link
Contributor

gabrik commented Jun 18, 2024

I do not have a strong opinion on it, the fact that fails silently is not a real issue, as it means that the subscriber has been already initialized in the process.

Regarding the ZENOH_LOG vs RUST_LOG I think both should be available, like we as you said for the average user ZENOH_LOG=debug is easier than RUST_LOG=zenoh=debug

Also with ZENOH_LOG we may decide to expose different levels for different parts of the code (like transport debug and zenoh info, etc...) so I'm in favor of it.

@Mallets
Copy link
Member

Mallets commented Jun 18, 2024

A logging rework should be done throughout the whole Zenoh ecosystem. I'll try to enumerate what I believe current issues are:

  1. Logging level is too coarse. Moving from debug to trace is a big jump in terms of logging and it's not easy to parse/digest given the amount of information being generated.
  2. I agree that specifying RUST_LOG looks weird when working with non-Rust code. However, RUST_LOG is commonly understood by any applications written in any language (with the sole exception of zenoh-pico). This should be kept true also for the current zenoh-python.
  3. We may want to allow bindings to interact with the logger rather than moving the logging in the bindings. Some configuration logging could be applied across different languages and deduplicating code and configuration management should be the goal.

@wyfo
Copy link
Contributor Author

wyfo commented Jun 18, 2024

I think both should be available, like we as you said for the average user ZENOH_LOG=debug is easier than RUST_LOG=zenoh=debug

Exactly, even if RUST_LOG=debug also "works" (but would embed dependency debug logs, so it don't really works)

Also with ZENOH_LOG we may decide to expose different levels for different parts of the code (like transport debug and zenoh info, etc...) so I'm in favor of it.

It's already the case with RUST_LOG, you can write RUST_LOG=zenoh=trace,zenoh::net::runtime::orchestrator=debug. Though we could provide an alternative syntax specific to zenoh. But the main goal is as you said to write ZENOH_LOG=debug. If the user wants a more advanced and precise filtering, I honestly think he should use tracing-subscriber directly, as he may also want to modify the formatting for example.

In my opinion, zenoh::init_logging should be a quick and dumb logging initializer, with a fixed formatting, and a configurable level with a reasonable default (for example WARN as it's the case for Python default logging configuration). tracing_subscriber is still there for advanced user.

wyfo added a commit to ZettaScaleLabs/zenoh-python that referenced this pull request Jun 18, 2024
@wyfo
Copy link
Contributor Author

wyfo commented Jun 18, 2024

Logging level is too coarse. Moving from debug to trace is a big jump in terms of logging and it's not easy to parse/digest given the amount of information being generated.

Tracing doesn't offer additional levels, but RUST_LOG is way more powerful than what we currently use, see https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/struct.EnvFilter.html#directives You can refine the target, for example zenoh::net::runtime::orchestrator, and even use span/event fields to do the filtering!

However, RUST_LOG is commonly understood by any applications written in any language (with the sole exception of zenoh-pico).

I'm not sure to follow you, could you elaborate on this point? I did a quick search on Google ("RUST_LOG Python", "RUST_LOG javascript", etc.) and I didn't found any application (zenoh-python excepted) in an other language than Rust using RUST_LOG.
EDIT: ok, I misunderstood you the first time I read, you meant "any zenoh application". I agree on the convenience of having a consistent way of doing across bindings. I think ZENOH_LOG is more suited than RUST_LOG though. But we should still keep RUST_LOG available for bindings "power users", and to make migration easier.

We may want to allow bindings to interact with the logger rather than moving the logging in the bindings. Some configuration logging could be applied across different languages and deduplicating code and configuration management should be the goal.

I agree, and that's completely the point of zenoh_util::init_log_with_callback and the current logging implementation in zenoh-python: zenoh.init_logging (in zenoh-python) redirects tracing events to the standard Python logging system, making it completely transparent for the user. Other bindings may/should also use zenoh_util::init_log_with_callback the same way.

That's also based on this zenoh-python experience that I made the proposal for a zenoh::init_logging.
When I proposed to remove zenoh::try_init_log_from_env in #1061, you told me

Many users do NOT care about logging in their system. They may not even be logging at all strictly speaking. They need a way to print at screen what's happening to debug the system. As of today, this can be easily done by setting RUST_LOG=debug as environment variable if try_init_log_from_env is called without the need to care about anything else.

I was ok with that, but now I think we should go further:

  • provide a default logging configuration, either warning (like Python does) or error (like tracing does)
  • use a more explicit environment variable like ZENOH_LOG instead of RUST_LOG, especially for bindings. It would allow convenient features like typo detection and custom configurations (backed by a corresponding RUST_LOG expression); it could also be used with zenoh-pico!
  • panic if a global subscriber is already set, to avoid surprises
  • use a friendlier name, like init_logging (there is no more "try" aspect if we provide a default config and typo detection)
  • for bindings, try to integrate with a/the native logging system, as zenoh-python provided before feat: improve logging zenoh-python#235.

Mallets pushed a commit to eclipse-zenoh/zenoh-python that referenced this pull request Jun 18, 2024
* feat: improve logging

* fix: add missing header

* fix: replace `init_logging` with `try_init_log_from_env`

`init_logging` code is kept unused waiting for
eclipse-zenoh/zenoh#1165
@wyfo
Copy link
Contributor Author

wyfo commented Jun 18, 2024

In addition to my previous remarks, I think we can add the following issue:

  1. we are not using structured logging the way we could/should:
  • we are only logging a raw message (because of log legacy), while tracing provides event fields in order to structure log informations (need to study performance impact, but I expect it to be completely negligeable)
  • we are not using tracing spans to structure and factorize logging events (here again, performance impact has to be studied, but I don't expect it to be big, especially with a proper use of span levels)
  • some types, at least ZBuf have a unsuitable Debug implementation

@milyin
Copy link
Contributor

milyin commented Jun 19, 2024

I think we have two separate problems here

  • how to improve our logging to make it convenient and uniform for all bindings
  • how log initialization API should look like

I think that from the API point of view the function try_init_log_from_env doesn't look good.

  • it doesn't give any information which environment variable will be used
  • the name hints that something may go wrong, but no way to resolve this is given
  • the name gives too much details which user may be don't want to care about (try, from env)

I think we can add init_logging() function which resolves to builder. This will allow us to provide additional options without breaking API.
Setters may provide the same approach which Option type uses, with or_ in the name.

Examples:

init_logging().from_env().wait().; // init from RUST_LOG
init_logging().from_string("Debug").wait().; // init from string only
init_logging().from_string("Debug").from_env().wait().; // init from string then override from env, if env present
init_logging().from_env().or_from_string("Debug").wait(); // init from env and if not present, init from string

@wyfo
Copy link
Contributor Author

wyfo commented Jun 20, 2024

I think we can add init_logging() function which resolves to builder. This will allow us to provide additional options without breaking API.

There is no point using our current builder pattern for something not being related to asynchronous. We could use a common builder pattern, so replacing wait() with build()/init(), however, that contradicts the main point of exposing init_logging, summed up by @Mallets:

Many users do NOT care about logging in their system. They may not even be logging at all strictly speaking. They need a way to print at screen what's happening to debug the system.

That' why zenoh::init_logging interface should be kept the simplest possible. Advanced users who want advanced customization can and shall use tracing-subscriber directly.

Tests are now using `test-log` library.
See detailed `zenoh_util::init_logging` documentation.
@wyfo
Copy link
Contributor Author

wyfo commented Jun 20, 2024

Anyway, I've pushed a POC of my proposal, with a very detailed documentation, and some experimentation results about memory leaks.
I let you look at the full changelog, but I will copy here the documentation of zenoh::init_logging to make it easier to follow:

/// Initialize zenoh logging using the value of `ZENOH_LOG` environment variable.
///
/// `ZENOH_LOG` is parsed use [`LogLevel::from_str`], possible values are `"debug"`/`"INFO"`/etc.
/// If `ZENOH_LOG` is not provided, [`LogLevel::default`], i.e. `warn` is used.
///
/// See [`init_logging_with_level`] if you prefer setting the level directly in your code.
/// This function is roughly a wrapper around
/// ```ignore
/// let level = std::env::var("ZENOH_LOG")
///     .map(|var| var.parse().unwrap())
///     .unwrap_or_default();
/// init_logging_with_level(level);
/// ```
///
/// Logs are printed on stdout and are formatted like the following:
/// ```text
/// 2024-06-19T09:46:18.808602Z  INFO main ThreadId(01) zenoh::net::runtime: Using ZID: 1a615ea88fe1dc531a9d8701775d5bee
/// 2024-06-19T09:46:18.814577Z  INFO main ThreadId(01) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/[fe80::1]:58977
/// ```
///
/// # Advanced use
///
/// zenoh logging uses `tracing` crates internally; this function is just a convenient wrapper
/// around `tracing-subscriber`. If you want to control the formatting, or have a finer grain on
/// log filtering, we advise using `tracing-subscriber` directly.
///
/// However, to make migration and on the fly debugging easier, [`RUST_LOG`][1] environment variable
/// can still be used, and will override `ZENOH_LOG` configuration.
///
/// # Panics
///
/// This function may panic if the following operations fail:
/// - parsing `ZENOH_LOG`/`RUST_LOG` (see [Advanced use](#advanced-use)) environment variable
/// - register the global tracing subscriber, because another one has already been registered
///
/// These errors mostly being the result of unintended use, fast failure is assumed to be more
/// suitable than unexpected behavior, especially as logging should be initialized at program start.
///
/// # Use in tests
///
/// This function should **not** be used in tests, as it would panic as soon as there is more
/// than one test executed in the same unit, because only the first one to execute would be able to
/// register the global tracing subscriber.
///
/// Moreover, `tracing` and Rust logging in general requires special care about testing because of
/// libtest output capturing; see
/// [`SubscriberBuilder::with_test_writer`](tracing_subscriber::fmt::SubscriberBuilder::with_test_writer).
///
/// That's why we advise you to use a dedicated library like [`test-log`][3]
/// (with `"tracing"` feature enabled).
///
/// # Memory leak
///
/// [`tracing`] use a static subscriber, which is not deallocated prior to process exiting.
/// Tools such as `valgrind` will then report memory leaks in *still reachable* category.
///
/// However, when `RUST_LOG` is provided (see [Advanced use](#advanced-use)),
/// [`tracing_subscriber::EnvFilter`] is then used, and causes also *still reachable* blocks,
/// but also *possibly lost* blocks, which are [known false-positives][2].
/// Those "leaks" can be "suppressed" from `valgrind` report using the following suppression:
/// ```text
/// {
///    zenoh_init_logging
///    Memcheck:Leak
///    ...
///    fun:*zenoh*init_logging*
/// }
/// ```
///
/// [1]: https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/struct.EnvFilter.html#directives
/// [2]: https://github.com/rust-lang/regex/issues/1205
/// [3]: https://crates.io/crates/test-log

By the way, as written in the documentation, init_logging must not be used in test, as regular tracing subscribers shouldn't be used in tests. Instead, I've added test-log crate to zenoh workspace.

commons/zenoh-util/src/log.rs Outdated Show resolved Hide resolved
/// # Panics
///
/// This function may panic if the following operations fail:
/// - parsing `RUST_LOG` (see [Advanced use](#advanced-use)) environment variable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parsing RUST_LOG / ZENOH_LOG

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, ZENOH_LOG is not parsed in init_logging_with_level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get this, sorry. But why? Isn't it just adds more complexity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about it in our last meeting: to keep things the simplest possible, I've dissociated initialization from variable and initialization from code.
It's still possible to override the provided level with RUST_LOG environment variable for debug purpose.

I don't know what would be the use case of using init_logging_from_level together with ZENOH_LOG. The only thing I see is when you want to increase the logging level to debug something (without having to recompile/redeploy), but in that case, RUST_LOG seems more appropriate to me, and it's already supported.

@milyin
Copy link
Contributor

milyin commented Jun 20, 2024

The approach looks good for me. The naming init_logging I think is definitely better than try_init_log_from_env. Usage of test-log crate seems to be right thing too

@wyfo
Copy link
Contributor Author

wyfo commented Jun 20, 2024

I've also tried to specify a convention for the bindings behavior, that I've written in init_logging_with_callback documentation:

/// Initialize zenoh logging using the provided callbacks.
///
/// This function is mainly meant to be used in zenoh bindings, to provide a bridge between Rust
/// `tracing` implementation and a native logging implementation.
///
/// To be consistent with zenoh API, bindings should parse `ZENOH_LOG` environment variable
/// to set the log level (unless it is set directly in code).
/// However, if `RUST_LOG` environment variable is provided, the callback will be skipped, and
/// [`init_logging`] called instead.

I admit I'm not sure about that RUST_LOG overriding decision. The reason behind is to allow easier debugging for the zenoh team, and we could ask users to use also use it when we ask for user logs in GitHub issue template for example.

@wyfo wyfo requested a review from milyin June 20, 2024 14:31
pub use zenoh_util::{init_logging, init_logging_with_level, InvalidLogLevel, LogLevel};
#[cfg(feature = "internal")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • better to use #[zenoh_macros::internal]
  • the function implementations should be under "internal" too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to use #[zenoh_macros::internal]

Indeed

the function implementations should be under "internal" too

Do you mean moving zenoh::logging::try_init_logging to zenoh::internal::try_init_logging?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I mean the implementation of internal/unstable functions are usually under "internal"/"unstable" cfg too, not only reexports.
It would be nice to check such cases automatically - generate build error if there is some public item inside which is not publicly reexported

Copy link
Contributor Author

@wyfo wyfo Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, I will add the feature to zenoh-util.

@wyfo
Copy link
Contributor Author

wyfo commented Jun 20, 2024

This rework however doesn't cover the first point of @Mallets:

Logging level is too coarse. Moving from debug to trace is a big jump in terms of logging and it's not easy to parse/digest given the amount of information being generated.

It could be solved however with additional variants to LogLevel – it's #[not_exhaustive] on purpose – which would implement custom filtering, for example based on event target, or specific span, or specific event field, etc. Ideally, the filtering rule should entirely be expressible in RUST_LOG syntax, in order to be compatible with RUST_LOG override.

But that's not the object of the current rework., this one should just be extendable in a backward compatible way. It should be part of the next rework regarding structured logging application.

@wyfo
Copy link
Contributor Author

wyfo commented Jun 20, 2024

The to-decide list brought by the rework is:

  • naming: ZENOH_LOG/init_logging/init_logging_with_level (and LogLevel)
  • should init_logging_with_level allow ZENOH_LOG overriding, see above. I think it's better to not allow it
  • bindings behavior convention, as specified above. Is my RUST_LOG override a good idea or not?
  • because zenoh_util is no more used for logging in tests, should all the logging code be moved in zenoh crate in the end? I think we should, as logging items are now only used through zenoh:: in all the workspace.

@wyfo
Copy link
Contributor Author

wyfo commented Jun 20, 2024

@Mallets I would to ask for your feedback on the rework proposal when you will have the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants