Process delayed alert conditions in batches of 10,000 #75302
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #75302 +/- ##
========================================
Coverage 78.21% 78.21%
========================================
Files 6786 6787 +1
Lines 302382 302486 +104
Branches 52035 52050 +15
========================================
+ Hits 236496 236600 +104
+ Misses 59523 59521 -2
- Partials 6363 6365 +2
- look at the size of the hash
- if under the limit, parse the groups as we were
- else (over the limit), break into batches of 10,000 and store the new batches in redis
…ush_to_hash there
…egroup data to that function
Nice fix, I was concerned that this batching might end up pretty complex, but this is easy to reason about
uniqueness across all of them for the centralized redis buffer. The batches are stored in redis because
we shouldn't pass complex objects in the celery task arguments, and we can't send a page of data in the
Jfyi: Mostly we don't like to pass things that would have to be pickled, like Django models etc. That said, this approach is probably better because I'm not sure if there are downsides to storing 10k ints per task in rabbitmq
@@ -47,7 +47,7 @@
 logger = logging.getLogger("sentry.rules.delayed_processing")
 EVENT_LIMIT = 100
 COMPARISON_INTERVALS_VALUES = {k: v[1] for k, v in COMPARISON_INTERVALS.items()}
-CHUNK_BATCH_SIZE = 10000
+CHUNK_BATCH_SIZE = options.get("delayed_processing.batch_size")
We can't set this as a module variable - it'll never update (and might cause problems on load). We just need to perform the check in the function.
heh, ty, i was just about to ask you to check if i did that right. thanks!
okay, just updated again, anything else i'm missing with the options? (I was looking at the docs here: https://develop.sentry.dev/backend/options/ to add it). ps, thanks for the help!
… using the test helper instead
src/sentry/options/defaults.py
register(
    "delayed_processing.batch_size",
    default=10000,
)
You just need to add flags=FLAG_AUTOMATOR_MODIFIABLE and then you'll be good to go
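With that suggestion applied, the registration would look roughly like this (a sketch, not verified against the codebase; `FLAG_AUTOMATOR_MODIFIABLE` is the flag named in the review, imported from Sentry's options module per the docs linked above):

```python
register(
    "delayed_processing.batch_size",
    default=10000,
    flags=FLAG_AUTOMATOR_MODIFIABLE,
)
```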
One thing: in apply_delayed, I think we need to edit cleanup_redis_buffer, b/c it uses the project_id when deleting the hash values rather than the generated uuid.
@@ -86,8 +86,10 @@ class RedisOperation(Enum):
 SORTED_SET_GET_RANGE = "zrangebyscore"
 SORTED_SET_DELETE_RANGE = "zremrangebyscore"
 HASH_ADD = "hset"
+HASH_ADD_BULK = "hmset"
According to https://redis.io/docs/latest/commands/hmset/, this command is deprecated as of Redis 4.0 and can be replaced with hset, which can take multiple key-value pairs. I'm not sure what version of redis we use tho
yeah, i had initially tried the hset with mapping syntax and it threw errors, so we'll need to use hmset until then.
ah ok, could we add a comment somewhere then referencing the future deprecation?
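One way to document the deprecation in code is a small helper that prefers `HSET` with a `mapping` (supported since Redis 4.0 / redis-py 3.5) and falls back to the deprecated `HMSET`. This is an illustrative sketch, not Sentry's code; the `FakeRedis` stub exists only so the example runs without a server.

```python
def hash_add_bulk(client, key: str, data: dict) -> None:
    """Write many field/value pairs to a Redis hash in one command."""
    try:
        # Preferred modern form; older redis-py clients raise TypeError
        # because hset() there does not accept a `mapping` keyword.
        client.hset(key, mapping=data)
    except TypeError:
        # HMSET is deprecated since Redis 4.0 but still widely supported.
        client.hmset(key, data)

class FakeRedis:
    """In-memory stand-in so the sketch runs without a Redis server."""
    def __init__(self):
        self.store = {}
    def hset(self, key, mapping=None):
        if mapping is None:
            raise TypeError("mapping required in this stub")
        self.store.setdefault(key, {}).update(mapping)
```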
def fetch_rulegroup_to_event_data(project_id: int) -> dict[str, str]:
    return buffer.backend.get_hash(model=Project, field={"project_id": project_id})

def fetch_rulegroup_to_event_data(project_id: int, batch_key: str | None = None) -> dict[str, str]:
    field: dict[str, models.Model | int | str] = {
Where does this models.Model typing come from on the key 🤔
It's like a weird nested definition on the function call to get_hash for the field variable. ideally i could not type this at all, and mypy would evaluate the field data type as dict[str, int | str], which adheres to dict[str, models.Model | int | str] (at least, that's how typescript works). unfortunately, it was throwing errors, and if i typed it explicitly it threw the same errors. since the dict does adhere to the type definition with models.Model, i just added it to appease the mypy overlords.

any recommendations on cleanup or better ways to appease the mypy?
that makes sense, I would think the implicit typing would work but I guess not lol
mypy is just a mystery sometimes 🔍
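A minimal reproduction of the typing discussion above (`Model` is a stand-in for `django.db.models.Model`, and `get_hash` only loosely mirrors the buffer signature): because `dict` is invariant in its value type, mypy will not treat a variable it already inferred as `dict[str, int]` as a `dict[str, Model | int | str]`, so the variable needs the explicit wider annotation.

```python
from __future__ import annotations

class Model:  # stand-in for django.db.models.Model
    pass

def get_hash(field: dict[str, Model | int | str]) -> dict[str, str]:
    # Toy body: the real buffer returns hash contents from redis.
    return {k: str(v) for k, v in field.items()}

# Annotating `field` with the exact parameter type satisfies mypy and
# still allows adding the optional batch_key afterwards:
field: dict[str, Model | int | str] = {"project_id": 42}
field["batch_key"] = "abc-123"
```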
@schew2381 🙏 💯 thanks for catching that. i'll update to pass the batch_key as well!
overall lgtm!
mock_delayed = Mock()
mock_apply_delayed.delayed = mock_delayed
process_rulegroups_in_batches(self.project.id)
mock_delayed.assert_called_once_with(self.project.id)
nit: I think any calls on mocks generate more mocks, so you can do:
-mock_delayed = Mock()
-mock_apply_delayed.delayed = mock_delayed
-process_rulegroups_in_batches(self.project.id)
-mock_delayed.assert_called_once_with(self.project.id)
+process_rulegroups_in_batches(self.project.id)
+mock_apply_delayed.delayed.assert_called_once_with(self.project.id)
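The nit can be demonstrated directly: attribute access on a `Mock` auto-creates and caches a child `Mock`, so binding it to a separate variable first is unnecessary.

```python
from unittest.mock import Mock

mock_apply_delayed = Mock()

# The same child mock is returned on every attribute access...
assert mock_apply_delayed.delayed is mock_apply_delayed.delayed

# ...so calls through the attribute can be asserted on directly.
mock_apply_delayed.delayed(1234)
mock_apply_delayed.delayed.assert_called_once_with(1234)
```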
PR reverted: efe481e
This reverts commit 215491d. Co-authored-by: saponifi3d <1569818+saponifi3d@users.noreply.github.com>
This reverts commit efe481e.
Description
Some orgs are sending 100k+ events per minute, and the processing is taking too long for a single task.
This PR will look at the size of the hash and determine if it needs to be batched.
There's some restrictions around the celery task / redis, info is outlined in a code comment here: https://github.com/getsentry/sentry/pull/75302/files#diff-f906e75a0e4419db4870fa45ca5a1608ca79beaa052c8bc50b4805607a665d27R482-R486
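The batching described above can be sketched as follows (a hedged sketch under assumed names, not the PR's actual implementation): if the rule/group hash exceeds the batch size, split it into chunks, stash each chunk under a fresh uuid key (standing in for the new redis hashes), and enqueue one task per batch carrying only small, picklable arguments.

```python
import uuid
from itertools import islice

BATCH_SIZE = 10_000

def iter_batches(data: dict, batch_size: int = BATCH_SIZE):
    """Yield successive chunks of at most batch_size items."""
    it = iter(data.items())
    while chunk := dict(islice(it, batch_size)):
        yield chunk

def process_in_batches(store: dict, project_id: int, data: dict):
    """Return (project_id, batch_key) task args; None key means no batching."""
    if len(data) <= BATCH_SIZE:
        return [(project_id, None)]  # under the limit: process as before
    tasks = []
    for chunk in iter_batches(data):
        batch_key = str(uuid.uuid4())
        store[batch_key] = chunk  # stand-in for writing the batch to redis
        tasks.append((project_id, batch_key))
    return tasks
```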