Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: obfuscation for mysql2, dalli and postgresql as default option for db_statement #682

Merged

Conversation

xuan-cao-swi
Copy link
Contributor

Description

Based on opentelemetry-specification and discussion, change the db.statement default sql to sanitized sql

Test

Tested with basic test suite (e.g. rake test)

@xuan-cao-swi xuan-cao-swi changed the title bfuscation for mysql2, dalli and postgresql Obfuscation for mysql2, dalli and postgresql as default option for db_statement Oct 4, 2023
Copy link
Contributor

@ericmustin ericmustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obfuscation adds a significant performance penalty.

We should either omit or include by default. The spec says to omit now if I understand correctly?

@arielvalentin
Copy link
Collaborator

Obfuscation adds a significant performance penalty.

We should either omit or include by default. The spec says to omit now if I understand correctly?

@ericmustin The spec says nothing about the defaults client-side sanitization, AFAIK, other than:

[2]: Should be collected by default only if there is sanitization that excludes sensitive information.

I interpret that to mean you should only collect it if the SQL is sanitized. If I extrapolate a bit, then it should never record any unsanitized SQL at all.

There are some open issues about db statement sanitization in general:

W/R/T db.operation it says:

[3]: When setting this to an SQL keyword, it is not recommended to attempt any client-side parsing of db.statement just to get this property, but it should be set if the operation name is provided by the library being instrumented. If the SQL statement has an ambiguous operation, or performs more than one operation, this value may be omitted.

Given this information I for one prefer to sanitize by default and omit if the driver does not have a sanitizer implemented.

@ericmustin
Copy link
Contributor

The appropriate place to do sanitization is in the collector, or more generally out of process. If your interpretation of the spec is correct(and I think it is), then that would effectively exclude the possibility of doing sql sanitization in the collector because the spec would dictate it can never be exported without being sanitized. This seems flawed to me.

Additionally we know real examples of the "obfuscate" config option introducing a performance regression in applications at large enterprise organizations which use ruby. Setting it as the default would seem like a poor choice.

I would prefer omit as the default if we're left with a choice btwn omit and "conditionally obfuscate depending on the particulars of driver options", as I think users will find the lack of standard confusing.

@arielvalentin
Copy link
Collaborator

The appropriate place to do sanitization is in the collector, or more generally out of process. If your interpretation of the spec is correct(and I think it is), then that would effectively exclude the possibility of doing sql sanitization in the collector because the spec would dictate it can never be exported without being sanitized. This seems flawed to me.

Additionally we know real examples of the "obfuscate" config option introducing a performance regression in applications at large enterprise organizations which use ruby. Setting it as the default would seem like a poor choice.

I would prefer omit as the default if we're left with a choice btwn omit and "conditionally obfuscate depending on the particulars of driver options", as I think users will find the lack of standard confusing.

Would it make the most sense to discussion in the spec issues?

I think the challenge here is there are no OOTB SQL sanitizers in contrib. If there were, then I would totally be onboard with leaving include/raw as the default, but since there isn't I still advocate that we obfucate/sanitize by default.

@cheempz
Copy link
Contributor

cheempz commented Oct 4, 2023

Regarding

The appropriate place to do sanitization is in the collector, or more generally out of process.

that disagrees with a recent separate discussion, which is the case with our product--telemetry is exported directly to our platform and doesn't go though a collector.

OTel JS and Java obfuscate/sanitize by default, so there's a bit of precedence despite the much needed spec clarification/configuration naming being discussed :)

@ericmustin
Copy link
Contributor

Indeed, I disagree with the spec discussion. If I'm not
Mistaken, Both github and Shopify have introduced regressions into production by enabling obfuscation in app ootb. Perhaps benchmarking would be useful here, but im dubious of defaulting to obfuscation. The chief concern of observability instrumentation should be minimizing the observer effect as much as possible, and I feel we're disregarding that for what's effectively product/vendor developer experience preferences

@arielvalentin
Copy link
Collaborator

For context, at GH we've accepted the trade off so we obfuscate by default. It's slow but we mitigate it with the character limit setting.

@ericmustin
Copy link
Contributor

just circling back, if others feel obfuscate is the appropriate default, then I won't stand in the way here.

Copy link
Collaborator

@arielvalentin arielvalentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this all of the data store drivers? Or do the others already have obfuscate by default?

@xuan-cao-swi
Copy link
Contributor Author

Is this all of the data store drivers? Or do the others already have obfuscate by default?

AFAIK, I think these three gems are the last to set obfuscate to default

Copy link
Contributor

@ericmustin ericmustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it goes

@arielvalentin
Copy link
Collaborator

@xuan-cao-swi just one more favor to ask before I merge.

Please update the commit messages to use conventional commits:

https://github.com/open-telemetry/opentelemetry-ruby-contrib/blob/main/CONTRIBUTING.md#use-conventional-commit-messages

This is something that I think requires a Minor release, so please using the BREAKING CHANGE format in the message, e.g.

feat!: Set db.statement option to obfuscate by default

@xuan-cao-swi xuan-cao-swi force-pushed the obfuscation_by_default branch from 7386368 to 364e270 Compare October 11, 2023 04:15
@xuan-cao-swi xuan-cao-swi changed the title Obfuscation for mysql2, dalli and postgresql as default option for db_statement feat!: obfuscation for mysql2, dalli and postgresql as default option for db_statement Oct 11, 2023
@xuan-cao-swi
Copy link
Contributor Author

@xuan-cao-swi just one more favor to ask before I merge.

Please update the commit messages to use conventional commits:

https://github.com/open-telemetry/opentelemetry-ruby-contrib/blob/main/CONTRIBUTING.md#use-conventional-commit-messages

This is something that I think requires a Minor release, so please using the BREAKING CHANGE format in the message, e.g.

feat!: Set db.statement option to obfuscate by default

Changed!

@arielvalentin arielvalentin merged commit 20e1cd0 into open-telemetry:main Oct 12, 2023
45 of 46 checks passed
Hareramrai pushed a commit to tools-aoeur/opentelemetry-ruby-contrib that referenced this pull request Nov 2, 2023
* chore: use stable toys version

* chore: use stable version of toys release code

* release: Release 2 gems (open-telemetry#675)

* fix: Fix "uses cases" typo in `CONTRIBUTING.md` (open-telemetry#679)

* chore(deps-dev): update webmock requirement from ~> 3.18.1 to ~> 3.19.1 in /resources/azure (open-telemetry#653)

* chore(deps-dev): update webmock requirement from ~> 3.18.1 to ~> 3.19.1 in /resources/google_cloud_platform (open-telemetry#652)

* fix: Remove dependence on activesupport (open-telemetry#687)

Instrumentations must not use transitive dependencies used by the library.

In this case, relying on the gruf library to load specific ActiveSupport extensions leaves the instrumentation vulnerable to bugs.

For this reason I have changed the code to use Enumerable methods instead of ActiveSupport extensions.

See open-telemetry#686

* fix!: Drop DelayedJob ActiveRecord in Tests (open-telemetry#685)

* fix: Add Rails 7.1 compatability (open-telemetry#684)

* chore: Add tests for Rails 7.1

* fix: Rails 7.1 incompatabilities

* feat!: obfuscation for mysql2, dalli and postgresql as default option for db_statement (open-telemetry#682)

* feat!: fuscation for mysql2, dalli and pg

* feat!: update readme

* feat!: set db.statement option to obfuscate by default for mysql2, pg and dalli

Co-authored-by: Ariel Valentin <arielvalentin@users.noreply.github.com>

* ci: Update test reset code for GraphQL-Ruby 2.1.3 (open-telemetry#691)

Fixes open-telemetry#689

* fix: Omit `nil` `net.peer.name` attributes (open-telemetry#693)

* ci: upgrade to latest stable version of toys (open-telemetry#694)

Fixes errors with incompatible versions defined in the configs:

https://github.com/open-telemetry/opentelemetry-ruby-contrib/actions/runs/6534421626/job/17741560442#step:5:10

* release: Release 12 gems (open-telemetry#695)

* release: Release 12 gems

* opentelemetry-instrumentation-gruf 0.1.1 (was 0.1.0)
* opentelemetry-instrumentation-active_support 0.4.3 (was 0.4.2)
* opentelemetry-instrumentation-action_view 0.6.1 (was 0.6.0)
* opentelemetry-instrumentation-action_pack 0.7.1 (was 0.7.0)
* opentelemetry-instrumentation-active_job 0.6.1 (was 0.6.0)
* opentelemetry-instrumentation-active_record 0.6.3 (was 0.6.2)
* opentelemetry-instrumentation-dalli 0.25.0 (was 0.24.2)
* opentelemetry-instrumentation-delayed_job 0.22.0 (was 0.21.0)
* opentelemetry-instrumentation-faraday 0.23.3 (was 0.23.2)
* opentelemetry-instrumentation-mysql2 0.25.0 (was 0.24.3)
* opentelemetry-instrumentation-pg 0.26.0 (was 0.25.3)
* opentelemetry-instrumentation-rails 0.28.1 (was 0.28.0)

* ci: Trigger builds

* fix: Apply suggestions from code review

Co-authored-by: Josef Šimánek <josef.simanek@gmail.com>

* fix: bump gem versions

* fix: bump gems

---------

Co-authored-by: Ariel Valentin <ariel@arielvalentin.com>
Co-authored-by: Ariel Valentin <arielvalentin@github.com>
Co-authored-by: Ariel Valentin <arielvalentin@users.noreply.github.com>
Co-authored-by: Josef Šimánek <josef.simanek@gmail.com>

* chore: Update CODEOWNERS (open-telemetry#692)

* chore: Update CODEOWNERS

Add @simi and @kaylareopelle and approvers to the list

* release: Release opentelemetry-instrumentation-all 0.51.0 (was 0.50.1) (open-telemetry#699)

* release: Release opentelemetry-instrumentation-all 0.51.0 (was 0.50.1)

* docs: Update all gem Changelog

---------

Co-authored-by: Ariel Valentin <ariel@arielvalentin.com>
Co-authored-by: Ariel Valentin <arielvalentin@users.noreply.github.com>

* feat(trilogy): instrument connect and ping (open-telemetry#704)

These can be just as slow as a query if not slower.

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>

* fix: Remove dependency on ActiveSupport core extensions from Grape instrumentation (open-telemetry#706)

* release: Release opentelemetry-instrumentation-trilogy 0.57.0 (was 0.56.3) (open-telemetry#708)

* release: Release opentelemetry-instrumentation-trilogy 0.57.0 (was 0.56.3)

* Empty commit

* Release instrumentation-all as well.

---------

Co-authored-by: Ariel Valentin <ariel@arielvalentin.com>
Co-authored-by: Francis Bogsanyi <francis.bogsanyi@shopify.com>

* chore(deps): update rubocop requirement from ~> 1.56.2 to ~> 1.57.1 (open-telemetry#702)

Updates the requirements on [rubocop](https://github.com/rubocop/rubocop) to permit the latest version.
- [Release notes](https://github.com/rubocop/rubocop/releases)
- [Changelog](https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md)
- [Commits](rubocop/rubocop@v1.56.2...v1.57.1)

---
updated-dependencies:
- dependency-name: rubocop
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1 in /instrumentation/base (open-telemetry#701)

chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1

Updates the requirements on [rubocop](https://github.com/rubocop/rubocop) to permit the latest version.
- [Release notes](https://github.com/rubocop/rubocop/releases)
- [Changelog](https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md)
- [Commits](rubocop/rubocop@v1.56.1...v1.57.1)

---
updated-dependencies:
- dependency-name: rubocop
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1 in /resource_detectors (open-telemetry#700)

chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1

Updates the requirements on [rubocop](https://github.com/rubocop/rubocop) to permit the latest version.
- [Release notes](https://github.com/rubocop/rubocop/releases)
- [Changelog](https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md)
- [Commits](rubocop/rubocop@v1.56.1...v1.57.1)

---
updated-dependencies:
- dependency-name: rubocop
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1 in /propagator/ottrace (open-telemetry#698)

chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1

Updates the requirements on [rubocop](https://github.com/rubocop/rubocop) to permit the latest version.
- [Release notes](https://github.com/rubocop/rubocop/releases)
- [Changelog](https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md)
- [Commits](rubocop/rubocop@v1.56.1...v1.57.1)

---
updated-dependencies:
- dependency-name: rubocop
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1 in /propagator/xray (open-telemetry#697)

chore(deps-dev): update rubocop requirement from ~> 1.56.1 to ~> 1.57.1

Updates the requirements on [rubocop](https://github.com/rubocop/rubocop) to permit the latest version.
- [Release notes](https://github.com/rubocop/rubocop/releases)
- [Changelog](https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md)
- [Commits](rubocop/rubocop@v1.56.1...v1.57.1)

---
updated-dependencies:
- dependency-name: rubocop
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Remove opentelemetry-resource_detectors all-in-one gem (open-telemetry#659)

* fix: remove call to ActiveSupport::Notifications.notifier#synchronize deprecated in Rails 7.2 (open-telemetry#707)

* wrap call to depcrecated private API behind version conditional

* convert Rails version string to Gem::Version

* fix module access?

* check for synchronize method instead of Rails version

* add comment

---------

Co-authored-by: Ariel Valentin <arielvalentin@users.noreply.github.com>

* chore: Fix linter issue

* release: Release 2 gems (open-telemetry#710)

* release: Release 2 gems

* opentelemetry-instrumentation-grape 0.1.5 (was 0.1.4)
* opentelemetry-instrumentation-active_support 0.4.4 (was 0.4.3)

* ci: Force

---------

Co-authored-by: Ariel Valentin <ariel@arielvalentin.com>
Co-authored-by: Ariel Valentin <arielvalentin@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ariel Valentin <arielvalentin@github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sam Bostock <sambostock@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ariel Valentin <arielvalentin@users.noreply.github.com>
Co-authored-by: Xuan <112967240+xuan-cao-swi@users.noreply.github.com>
Co-authored-by: Robert Mosolgo <rdmosolgo@gmail.com>
Co-authored-by: Yohei Kitamura <3087402+yoheyk@users.noreply.github.com>
Co-authored-by: Ariel Valentin <ariel@arielvalentin.com>
Co-authored-by: Josef Šimánek <josef.simanek@gmail.com>
Co-authored-by: Jean byroot Boussier <jean.boussier+github@shopify.com>
Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Co-authored-by: Muriel <m.picone.farias@catawiki.nl>
Co-authored-by: Francis Bogsanyi <francis.bogsanyi@shopify.com>
Co-authored-by: Sander Jans <scbjans@gmail.com>
Co-authored-by: Katherine Oelsner <49968061+octokatherine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants