ref(ddm): Dual write to Sentry metrics as well as Datadog #5474
Conversation
This PR wraps the Datadog and Sentry metrics into a single backend, so we can start dogfooding Sentry metrics. Currently the Sentry portion is feature flagged and has a sampling rate.
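The dual-write wrapper described here might look roughly like the following sketch. Class and parameter names are illustrative, not the actual snuba implementation; only the "feature flag plus sample rate" behavior is taken from the PR description:

```python
import random


class DualWriteMetricsBackend:
    """Hypothetical sketch of the dual-write pattern: every metric
    goes to Datadog, and a flagged, sampled fraction also goes to
    Sentry. Names and signatures are illustrative."""

    def __init__(self, datadog, sentry, sentry_enabled=True, sample_rate=0.1):
        self.datadog = datadog
        self.sentry = sentry
        self.sentry_enabled = sentry_enabled
        self.sample_rate = sample_rate

    def _use_sentry(self):
        # Feature flag plus sampling, as described in the PR summary.
        return self.sentry_enabled and random.random() < self.sample_rate

    def gauge(self, name, value, tags=None, unit=None):
        self.datadog.gauge(name, value, tags)
        if self._use_sentry():
            self.sentry.gauge(name, value, tags, unit)
```

The useful property of this shape is that callers keep a single `MetricsBackend`-style interface while the Sentry side can be dialed up or down without touching call sites.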
    if self._use_sentry():
        self.sentry.gauge(name, value, tags, unit)

def timing(
timings are distributions -- i don't think we need both. the implementation in the SDK is identical
They're not identical, because in the SDK they accept different units. The timing ones specifically only accept timing units, and distributions are more flexible. So we should be using `distribution` when we are logging things that aren't timing based. See https://github.com/getsentry/snuba/blob/master/snuba/clickhouse/http.py#L253 for an example that should be changed to `distribution`.
can we have a default impl for `timing` that uses `distribution` internally, or the other way around? The impl really is identical in the SDK (the only things that differ are the type hints and return value) -- i get that you want different defaults for `unit` in each case
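The suggestion here could be sketched as a base class where `timing` is a thin default over `distribution`, so concrete backends only implement one method. Class name and the default unit are hypothetical:

```python
class MetricsBackend:
    """Hypothetical abstract base: timing() delegates to
    distribution(), differing only in its default unit."""

    def distribution(self, name, value, tags=None, unit=None):
        raise NotImplementedError

    def timing(self, name, value, tags=None, unit="millisecond"):
        # timing is just distribution with a time-flavored default unit.
        self.distribution(name, value, tags=tags, unit=unit)
```

With this shape, a backend that only knows about distributions still gets `timing` for free.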
Why not both? Both exist in the Sentry SDK and in Datadog. Why not use the SDKs as intended?
because in both SDKs they're literally just aliases, and now you're forcing every backend to implement those aliases. the purpose of `sentry_sdk.metrics.timing` is to provide a decorator and context manager to measure a code block -- we're not using that
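To illustrate the decorator/context-manager use case being referenced: a timing helper can wrap a code block and emit the elapsed time itself, which a plain record-a-value call does not do. This is a hypothetical helper sketching the pattern, not the `sentry_sdk.metrics.timing` API itself; `emit` stands in for whatever backend records the metric:

```python
import time
from contextlib import ContextDecorator


class timed(ContextDecorator):
    """Hypothetical helper: measures a wrapped block and emits the
    elapsed seconds via the supplied emit(name, value) callable.
    Works both as a context manager and, via ContextDecorator,
    as a function decorator."""

    def __init__(self, name, emit):
        self.name = name
        self.emit = emit

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def __exit__(self, *exc):
        self.emit(self.name, time.monotonic() - self.start)
        return False  # don't swallow exceptions
```

Usage would look like `with timed("query", record): run_query()` or `@timed("query", record)` on a function, which is the convenience the alias exists to provide.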
"forcing every backend to implement those aliases"

This is true, but we are missing the `distribution` functionality from all our backends, so I would have to add that no matter what (timing is a subset of distribution). Why not leave timing in place, add distributions, and then our backends line up with both of the production SDKs (DD and Sentry)?

"literally just aliases"

If the aliases are already available, why not leverage them? Why in turn make timing another alias on distribution? The DD and Sentry SDKs already provide that alias for us.
"I would have to add that no matter what"

you really don't -- you could continue to use timing, or rename it to distribution if you want to. timing is not a subset of distribution in the master branch; it's just distribution with maybe an unfortunate (not-technically-correct) name. the semantic difference was created by adding `unit` to the interface. i don't see the point in adding more methods that do the same thing but only differ in their default for `unit`, but i don't feel too strongly about all of this so i'm approving
Codecov Report

@@            Coverage Diff             @@
##           master    #5474      +/-   ##
==========================================
- Coverage   90.19%   90.02%    -0.18%
==========================================
  Files         889      891        +2
  Lines       43447    43530      +83
  Branches      288      288
==========================================
- Hits        39189    39188       -1
- Misses       4216     4300      +84
  Partials       42       42

View full report in Codecov by Sentry.
Do we need to do something to blacklist the metrics consumer to avoid going into a loop, or is this handled by s4s?

at this sample rate, probably not -- but as we increase it I would override the sample rate to 0 for those particular deployments

to be clear, there's a sample rate setting that can be overridden with env vars. I suggest setting that env var for the particular consumers where we see this feedback loop effect -- either back to 0 or just to a lower value.
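A minimal sketch of such an env var override, with a hypothetical variable name (the actual setting name in snuba may differ):

```python
import os


def get_sentry_sample_rate(default=0.01):
    """Hypothetical sketch: a per-deployment sample-rate override read
    from an environment variable, so a consumer that would otherwise
    feed back into itself can be forced to 0 (or a lower rate).
    SNUBA_SENTRY_METRICS_SAMPLE_RATE is an assumed name."""
    raw = os.environ.get("SNUBA_SENTRY_METRICS_SAMPLE_RATE")
    if raw is None:
        return default
    try:
        return float(raw)
    except ValueError:
        # Fall back to the default on a malformed override.
        return default
```

Setting `SNUBA_SENTRY_METRICS_SAMPLE_RATE=0` on just the metrics-consumer deployment would then break the feedback loop without changing behavior anywhere else.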
PR reverted: 0b1242d
This reverts commit a32d014. Co-authored-by: evanh <2727124+evanh@users.noreply.github.com>
This PR wraps the Datadog and Sentry metrics into a single backend, so we can start dogfooding Sentry metrics. Currently the Sentry portion is feature flagged and has a sampling rate. Sentry also provides a way to specify the unit and record distributions, so add that to the abstract so we can start using it later.