-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add row count and affected rows to Trilogy spans #1290
base: main
Are you sure you want to change the base?
feat: Add row count and affected rows to Trilogy spans #1290
Conversation
|
@@ -164,6 +164,12 @@ | |||
_(span.attributes[OpenTelemetry::SemanticConventions::Trace::DB_SYSTEM]).must_equal 'mysql' | |||
_(span.attributes[OpenTelemetry::SemanticConventions::Trace::DB_STATEMENT]).must_equal 'DESELECT ?' | |||
end | |||
|
|||
it 'includes row count' do | |||
client.query('SELECT 1') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only see SELECT
s in this test file, so I'm not sure what the best way to test affected_rows
is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to run an insert and subsequent update command to assert the behaviour here, no?
super | ||
result = super | ||
|
||
span.set_attribute('db.response.returned_rows', result.count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
db.response.returned_rows
was recently added in open-telemetry/semantic-conventions@63094c6 but appears to be experimental. Is this (and .affected_rows
) a reasonable key to use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
affected_rows
seems to be the accepted language for this concept, so I don't see any problem there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a few challenges here:
- I am trying to avoid having mixed semantic conventions here between pre-1.0 and 1.x+. This would be adding new attributes which would be a mix and likely confusing to our users.
- We have in the past rejected adding metrics as attributes for other PRs but it seems like now metrics are being adding as span attributes. Does that mean we should reverse our decisions going forward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have in the past rejected adding metrics as attributes for other PRs but it seems like now metrics are being adding as span attributes. Does that mean we should reverse our decisions going forward?
Feeling a bit out of the loop on this one. Has there been some push back against reporting numeric details on spans?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We have had folks submit PRs to add things like internal rails attributes for db times and rendering times.
I have said no to these kinds of attributes in the past because these are either derived span values or more appropriate to emit as metrics. I can look up specific instances of PRs.
Now that semconv is setting a precedent I am second guessing that position.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to avoid having mixed semantic conventions here between pre-1.0 and 1.x+. This would be adding new attributes which would be a mix and likely confusing to our users.
I wonder how we move forward then. I hoped there would be less resistant to this change as it is additive rather than modifying an existing attribute.
Just a quick scan of the db.
prefixed attributes in the sem conv main branch.
Experimental attributes:
- ✅
db.system
- ❌
db.collection.name
(this probably maps todb.name
) - ❌
db.operation.name
(we havedb.operation
) - ❌
db.response.status_code
- ❌
db.operation.batch.size
- ❌
db.query.summary
- ❌
db.query.text
( we havedb.statement
) - ❌
db.response.returned_rows
- Nothing in the current experimental versioning maps to the
db.user
attribute we have - Nothing maps to
affected_rows
I see three options:
- I could come up with something to make the attribute versioning configurable, that feels a bit toily but maybe useful going forward
- We just add these new attributes because one is unspecced, and one is experimental (but we're not breaking anything that exists)
- We (shopify) look at bringing this instrumentation internal, because at the end of the day the instrumentation needs to solve problems for users and I don't really want to allow the slow moving beast that is spec compliance to prevent that.
I have said no to these kinds of attributes in the past because these are either derived span values or more appropriate to emit as metrics. I can look up specific instances of PRs.
I see what you're saying, I think in this instance it's a bit different as we want to associate these counts returned with actual queries, the high cardinality nature of this stuff doesn't mesh well as metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these attributes seem really helpful for users and I'm excited by this PR. Thanks for opening it, @skipkayhil! I'd like to find a way to add these attributes to the instrumentation, while still supporting the semantic conventions.
I am trying to avoid having mixed semantic conventions here between pre-1.0 and 1.x+. This would be adding new attributes which would be a mix and likely confusing to our users.
I wonder how we move forward then. I hoped there would be less resistant to this change as it is additive rather than modifying an existing attribute.
@arielvalentin @robertlaurin - this sounds like a great opportunity to bring in OTEL_SEMCONV_STABILITY_OPT_IN
. There's a description for this environment variable at the top of the Database Semantic Conventions.
With this environment variable, we can have a clear toggle to add new conventions without needing to interfere with users who are happy with the existing conventions.
Database semantic conventions as so close to going stable. We eventually need to remove the old conventions and adopt the new ones. I think we should start that process now, and introduce db.response.returned_rows
as part of this move.
@skipkayhil, would you be interested in helping the Trilogy instrumentation support OTEL_SEMCONV_STABILITY_OPT_IN
? We don't have any database instrumentation prototypes yet. I'd be happy to help work on something with you if you're interested.
affected_rows
seems to be the accepted language for this concept, so I don't see any problem there.
I think this sounds like a great name for this concept, but I'm a little hesitant to add an attribute that isn't part of the semantic conventions yet.
What do you think about pausing on that attribute and opening an issue to propose adding db.response.affected_rows
to the semantic conventions? Here's the form to propose a new convention. Also happy to help shepherd that conversation if you'd like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skipkayhil, would you be interested in helping the Trilogy instrumentation support
OTEL_SEMCONV_STABILITY_OPT_IN
? We don't have any database instrumentation prototypes yet. I'd be happy to help work on something with you if you're interested.
I see that open-telemetry/opentelemetry-ruby#1728 is open in the main repo. My first thought was to do something simpler like having different patch classes or metaprogramming the patches depending on the ENV being set. Happy to take a stab at a PR but would probably like a point in the right direction first 😅
What do you think about pausing on that attribute and opening an issue to propose adding
db.response.affected_rows
to the semantic conventions? Here's the form to propose a new convention. Also happy to help shepherd that conversation if you'd like.
Sounds good to me, should I open an issue using the form?
span.set_attribute('db.response.returned_rows', result.count) | ||
span.set_attribute('db.response.affected_rows', result.affected_rows) unless result.affected_rows.nil? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're adding more than one attribute you want to use the add_attributes
interface for bulk setting attributes, otherwise you hit a mutex N many times, versus just the 1 time.
span.set_attribute('db.response.returned_rows', result.count) | |
span.set_attribute('db.response.affected_rows', result.affected_rows) unless result.affected_rows.nil? | |
attributes = { 'db.response.returned_rows' => result.count } | |
attributes['db.response.affected_rows'] = result.affected_rows unless result.affected_rows.nil? | |
span.add_attributes(attributes) |
Identifying queries that return/update many rows is very useful for improving database performance. Queries that return many rows can cause applications to use excess memory (or in some cases like Vitess) could be blocked altogether. Queries that update many rows can cause replication lag and generally put more pressure on a database. This commit adds both of these values (row_count and affected_rows) to spans created from Trilogy#query to help identify these problematic queries. Trilogy::Result#count will always return a value (0 for mutations), but #affected_rows only returns values for mutations (nil for SELECTs).
be86418
to
6bfc5d3
Compare
Identifying queries that return/update many rows is very useful for improving database performance. Queries that return many rows can cause applications to use excess memory (or in some cases like Vitess) could be blocked altogether. Queries that update many rows can cause replication lag and generally put more pressure on a database.
This commit adds both of these values (row_count and affected_rows) to spans created from Trilogy#query to help identify these problematic queries. Trilogy::Result#count will always return a value (0 for mutations), but #affected_rows only returns values for mutations (nil for SELECTs).