Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wrap timed streams and iterators in tracing::Spans; follow Opentelemetry conventions #3313

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

zvkemp
Copy link

@zvkemp zvkemp commented Jun 26, 2024

Does your PR solve an issue?

When exporting to instrumentation collectors, timed spans are generally more desirable than discrete events annotated with timing data. This PR:

  • wraps the existing logger and pool::acquire tracing events in tracing spans. Common metadata is moved to the span; generally the tracing subscriber can (and does by default in fmt::subscriber) write span-level metadata when events are logged, so the current output should be the same. The span level is set to the max level of the event log configuration, where appropriate.
  • Adds an associated NAME_LOWERCASE constant to Database, which follows the semantic conventions defined by Opentelemetry (https://opentelemetry.io/docs/specs/semconv/attributes-registry/db/). This value is attached to the span as db.system.

@zvkemp zvkemp force-pushed the logging-conventions branch from d4738ae to 6f96abe Compare July 17, 2024 13:14
// Human-friendly - includes units (usually ms). Also kept for backward compatibility
?elapsed,
// Search friendly - numeric
elapsed_secs = elapsed.as_secs_f64(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing elapsed and elapsed_secs is a breaking change to the instrumentation and reporting that others have built on top of existing sqlx logging.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good ole' Hyrum's Law https://www.hyrumslaw.com/

Strictly speaking, we never made the exact format of log messages part of our API contract, but I'd also be pretty annoyed if this changed out from under me.

It'd also make me a hypocrite, given how the query macros came about.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was hoping this would be rolled into the breaking-changes-allowed 8.x release, as it's a little superfluous to track timings in addition to the tracing span's own timing, but I certainly see how it's useful. Restored and rebased.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

A standards based approach would certainly be better in the long run.

I have my own logging breaking change pr open - but its much smaller in scope and at @abonander's suggestion we tagged a few users who were interested to see if that break will affect them, as an alternative to maintaining two outputs.

The break that was in this PR (removing the elapses) would affect people.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was hoping this would be rolled into the breaking-changes-allowed 8.x release

Yeah, to be perfectly honest, this PR required more thinking than I wanted to do while I was trying to wrap up 0.8.0.

At a certain point, I have to stop myself from looking at new PRs, or else I'll sit there forever trying to get every one of them merged before I kick out a release.

If we're happy calling this a breaking change, we can plan to land this for 0.9.0.

Copy link
Author

@zvkemp zvkemp Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a certain point, I have to stop myself from looking at new PRs, or else I'll sit there forever trying to get every one of them merged before I kick out a release.

Completely understand; hope you didn't take my comment as a criticism. Currently re-working this to not be a breaking change; the tracing span should be opt-in if it lands in 0.8, and whether or not that would change in 0.9 is an open question.

@zvkemp zvkemp force-pushed the logging-conventions branch 2 times, most recently from e632167 to a5a0a89 Compare September 6, 2024 14:58
// Human-friendly - includes units (usually ms). Also kept for backward compatibility
?elapsed,
// Search friendly - numeric
elapsed_secs = elapsed.as_secs_f64(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation looks like it might trip up the format linting, and in any case for git history it would be nice to avoid reordering the fields.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation addressed (sorry, the formatter is still not good at catching things in macro invocations). Can you clarify the issue with field ordering? The shared fields were moved into the span, but in the same order

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the fields need to also be kept in the existing logging after being added to the new span.

Moving them from the individual log statement to the span changes the json structure that loggers like tracing-subscriber with the json feature will output - with that crate the items move from .fields to .span - which is good for standardization, but breaks existing consumers. We need both to avoid a breaking change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, there doesn't currently seem to be a way to configure the json formatter to flatten the span fields into the event output (though maybe there should be?).

Would it be acceptable to gate some of this behavior behind a feature flag? It sounds like the span isn't really necessary for the slow query logger, and the event isn't really necessary for opentelemetry tracing. It would also be nice for those two things to work well together with a minimum of duplication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The slow query logger can already be configured to never trigger.

For example, if span tracking was added, but no existing logging was diminished or removed, a crate user could set slow_statements_level to LevelFilter::Off and rely exclusively on metrics reported on the span for their monitoring and alerting.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that's true. I'm reusing this config to control span visibility as well, so I'll rework how that happens.

@zvkemp zvkemp force-pushed the logging-conventions branch from a5a0a89 to 83a4f71 Compare September 6, 2024 15:45
@zvkemp zvkemp force-pushed the logging-conventions branch from 83a4f71 to 21967bd Compare September 6, 2024 15:47
@@ -122,37 +157,35 @@ impl<'q> QueryLogger<'q> {
String::new()
};

if was_slow {
self.span.record("summary", summary);
self.span.record("db.statement", sql);
Copy link
Contributor

@iamjpotts iamjpotts Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something else to keep in mind, and may need adjustment here, is that logging the query isn't free.

Its not uncommon for select statements to exceed 25KB or 50KB or more, and 100% logging of those will have a materially negative impact on both the running application outputting the logs, and the log observer attempting to capture them.

This is why the current implementation only logs the query when the query is slow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm taking that into account. The revised implementation will allow the span to be leveled individually, and the query will only be recorded if any of the events or spans are within the configured level.

@zvkemp
Copy link
Author

zvkemp commented Sep 6, 2024

I've pushed a new commit to refactor these changes; notably it allows setting a span level independently from the log level (defaults to Off, which creates a no-op span). This complicates the level checking a bit but should avoid doing unnecessary work or recording large fields when they aren't explicitly required by the configuration. Note that I haven't had a chance to validate these changes yet, but should be able to test it out soon.

@iamjpotts
Copy link
Contributor

Great.

I'm interested in local validation also.

I think eventually my first iteration of the logging could be pared back in a 0.9 version of the crate. Spans are better conceptually and for standards, but we need to consider some of the conditional logging scenarios, and for now, some backwards compatibility.

I may get to it this weekend. Cheers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants