-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Anomaly detection - peak/trough detection and caching of matrix profle #1081
Conversation
f393fef
to
d773ddb
Compare
37507c2
to
17276d6
Compare
@inject | ||
@sentry_sdk.trace | ||
def _combo_detect( | ||
self, ts_with_history: TimeSeriesWithHistory, config: AnomalyDetectionConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing a = injected
for DI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the config param is part of request payload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do need to remove the @inject
above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
transaction_name = ( | ||
"Stream AD for alert" | ||
if isinstance(request.context, AlertInSeer) | ||
else self._batch_detect(request.context) | ||
else ( | ||
"Stream AD for timeseries with history" | ||
if isinstance(request.context, TimeSeriesWithHistory) | ||
else "Batch AD for timeseries" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: nested ternary makes this section somewhat hard to follow. Could this be done using a switch or an if/else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
17276d6
to
aa84a44
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will there be a follow up PR for the peak/trough logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I will tweak the peak/trough logic some more based on your experiments in a follow-up PR. Wanted to get this out as frontend work is dependent on the API change.
return timeseries | ||
batch_detector = MPBatchAnomalyDetector() | ||
anomalies = batch_detector.detect(convert_external_ts_to_internal(timeseries), config) | ||
return timeseries, anomalies | ||
|
||
@inject | ||
@sentry_sdk.trace | ||
def _online_detect( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it would be good to add some brief docstrings to these to clarify the context in which each is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
else: | ||
transaction_name = "Batch AD for timeseries" | ||
|
||
with sentry_sdk.start_transaction(op="task", name=transaction_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any particular reason why we want a separate transaction here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one endpoint caters to three different types of anomaly detection - stateless batch, stateless online and stateful alert based. Each serve a different use case and will have different performance characteristics. So I think it is worth tracking them as different transactions with separate names. Any potential problem with this approach?
if np.isnan(mp_dist): | ||
return "none" | ||
if mp_dist < threshold_lower: | ||
return "none" | ||
if mp_dist < threshold_upper: | ||
return "anomaly_lower_confidence" | ||
return "anomaly_higher_confidence" | ||
|
||
def _adjust_flag_for_vicinity( | ||
self, flag: AnomalyFlags, ts_value: float, context: npt.NDArray[np.float64] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic is pretty difficult to follow without a docstring - specifically what is context
and how should i think about it? (i think i know the answer from talking to you but will probably be confused looking at this in 2 months)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code LGTM - do we feel confident about the peak/trough logic?
aa84a44
to
b72f2ca
Compare
The logic that is going in is incrementally better than what we had previously. It does need more tweaking. I will be following up on it immediately and have an updated version out before the internal release. |
Suspect IssuesThis pull request was deployed and Sentry observed the following issues:
Did you find this useful? React with a 👍 or 👎 |
This PR has multiple changes: