feat(seer grouping): Store Seer metadata on grouphashes during ingest #77956

Draft
wants to merge 8 commits into
base: master
from

Conversation

lobsterkatie
Member

@lobsterkatie lobsterkatie commented Sep 23, 2024

This adds the following Seer information to grouphash metadata during ingest:

  • When the grouphash was sent to Seer
  • Which event's stacktrace was sent (stored as an event id)
  • The Seer model version used to analyze the stacktrace
  • The matched hash returned by Seer, if any (stored as a reference to that hash's GroupHash record)
  • The similarity distance returned by Seer, if any

Given that a single event may generate multiple grouphashes, and only one hash value actually gets sent, this also adds a seer_grouphash_sent field, which holds a reference to the GroupHash corresponding to the sent hash. This not only lets us know which value we should find in the Seer database, it also lets us avoid storing all of the above information redundantly - we can store it only on the GroupHash which gets sent, and give the "sibling" GroupHashes access to the information, should they need it, via that link.

For example, if an event generates GroupHash records grouphashA and grouphashB, and grouphashA is sent:

  • The timestamp, event id, model version, and results will be stored in grouphashA's metadata
  • Both grouphashA and grouphashB will link to grouphashA as the representative which was sent
  • Determining if that event's group has been sent to Seer will be possible by checking for the existence of metadata.seer_grouphash_sent on either grouphashA or grouphashB.
  • The Seer data itself (for example, the send timestamp) will be available via both grouphashA.seer_grouphash_sent.metadata.seer_date_sent and grouphashB.seer_grouphash_sent.metadata.seer_date_sent.
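The linking described above can be sketched with plain Python objects. This is only an illustration under stated assumptions, not the code in this PR: the dataclasses are toy stand-ins for the real Django models, `record_seer_data` is a hypothetical helper name, and the event id and model version values are made up. Only the field names come from the description and migration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(eq=False)
class GroupHashMetadata:
    """Toy stand-in for the real model; only the Seer fields are shown."""

    # Link to the GroupHash whose hash value was actually sent to Seer.
    seer_grouphash_sent: GroupHash | None = None
    # The fields below are only populated on the sent GroupHash's metadata.
    seer_date_sent: datetime | None = None
    seer_event_sent: str | None = None
    seer_model: str | None = None
    seer_matched_grouphash: GroupHash | None = None
    seer_match_distance: float | None = None


@dataclass(eq=False)
class GroupHash:
    hash: str
    metadata: GroupHashMetadata = field(default_factory=GroupHashMetadata)


def record_seer_data(grouphashes, sent, event_id, model, matched=None, distance=None):
    """Hypothetical helper: store Seer data once, then link every sibling to it."""
    sent.metadata.seer_date_sent = datetime.now(timezone.utc)
    sent.metadata.seer_event_sent = event_id
    sent.metadata.seer_model = model
    sent.metadata.seer_matched_grouphash = matched
    sent.metadata.seer_match_distance = distance
    # Every grouphash from the event, including the sent one, points at `sent`.
    for grouphash in grouphashes:
        grouphash.metadata.seer_grouphash_sent = sent


# One event generates two grouphashes; only grouphash_a's value is sent.
grouphash_a = GroupHash("a" * 32)
grouphash_b = GroupHash("b" * 32)
record_seer_data(
    [grouphash_a, grouphash_b],
    sent=grouphash_a,
    event_id="0" * 32,  # made-up event id
    model="v1",  # made-up model version
)

# "Has this been sent?" can be answered from either grouphash.
was_sent = grouphash_b.metadata.seer_grouphash_sent is not None
# The data itself is reached through the link.
date_sent = grouphash_b.metadata.seer_grouphash_sent.metadata.seer_date_sent
```

In this sketch `grouphash_b` carries no Seer data of its own, so `grouphash_b.metadata.seer_date_sent` stays `None` and the timestamp is only reachable through `seer_grouphash_sent`.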

Because all of this linking makes for a kind of clunky API, at the point at which we actually go to use this information (in grouping info, as a check during future backfills if/when we update our Seer model, etc), we'll probably want to add appropriate properties and/or helper methods to Group or GroupHash, depending on the application. Until we get to that point, though, it made sense to me to wait rather than guess (potentially incorrectly) what we'll need.

Notes:

  • As a result of this change, we're no longer storing Seer results in event data. This is better - before, to see the Seer results you had to find the first event to generate a given hash (which is the first event in a group in cases where Seer doesn't find a match, but is some random event among a group's full event list in cases where Seer does match to an existing group). Now, you can find Seer results from any event in a group, via the group's GroupHash records. (It did mean I had to pull out some temporary logs I had added to group creation, but it's unclear if they're still necessary. If the issue which prompted them comes up again, I'll add them back in a different way.)

  • As agreed offline, I'm updating the GroupHashMetadata records which are created elsewhere - rather than waiting to create them until the Seer results are available (or until we decide not to call Seer) - because it's easier to reason about and because the cardinality here is low, given that more than 99% of events match to an existing group and therefore never hit this code. If, after the GroupHashMetadata MVP is done, we decide we need to optimize this, we can do so before GA.

  • Not done here: Updating the backfill code to store Seer results in grouphash metadata rather than on the group. This should be done before the next time we run a backfill, but given that the current backfill is already half-completed, having them in one place for new groups and a different (single) place for backfilled groups - while not ideal - seemed better than having them in one place for new groups and sometimes the same place but sometimes a different place for backfilled groups.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Sep 23, 2024
Contributor

This PR has a migration; here is the generated SQL for src/sentry/migrations/0765_add_seer_fields_to_grouphash_metadata.py

--
-- Add field seer_date_sent to grouphashmetadata
--
ALTER TABLE "sentry_grouphashmetadata" ADD COLUMN "seer_date_sent" timestamp with time zone NULL;
--
-- Add field seer_event_sent to grouphashmetadata
--
ALTER TABLE "sentry_grouphashmetadata" ADD COLUMN "seer_event_sent" varchar(32) NULL;
--
-- Add field seer_grouphash_sent to grouphashmetadata
--
ALTER TABLE "sentry_grouphashmetadata" ADD COLUMN "seer_grouphash_sent_id" bigint NULL;
--
-- Add field seer_match_distance to grouphashmetadata
--
ALTER TABLE "sentry_grouphashmetadata" ADD COLUMN "seer_match_distance" double precision NULL;
--
-- Add field seer_matched_grouphash to grouphashmetadata
--
ALTER TABLE "sentry_grouphashmetadata" ADD COLUMN "seer_matched_grouphash_id" bigint NULL;
--
-- Add field seer_model to grouphashmetadata
--
ALTER TABLE "sentry_grouphashmetadata" ADD COLUMN "seer_model" varchar NULL;
ALTER TABLE "sentry_grouphashmetadata" ADD CONSTRAINT "sentry_grouphashmeta_seer_grouphash_sent__831da653_fk_sentry_gr" FOREIGN KEY ("seer_grouphash_sent_id") REFERENCES "sentry_grouphash" ("id") DEFERRABLE INITIALLY DEFERRED NOT VALID;
ALTER TABLE "sentry_grouphashmetadata" VALIDATE CONSTRAINT "sentry_grouphashmeta_seer_grouphash_sent__831da653_fk_sentry_gr";
CREATE INDEX CONCURRENTLY "sentry_grouphashmetadata_seer_grouphash_sent_id_831da653" ON "sentry_grouphashmetadata" ("seer_grouphash_sent_id");
ALTER TABLE "sentry_grouphashmetadata" ADD CONSTRAINT "sentry_grouphashmeta_seer_matched_groupha_c92b0107_fk_sentry_gr" FOREIGN KEY ("seer_matched_grouphash_id") REFERENCES "sentry_grouphash" ("id") DEFERRABLE INITIALLY DEFERRED NOT VALID;
ALTER TABLE "sentry_grouphashmetadata" VALIDATE CONSTRAINT "sentry_grouphashmeta_seer_matched_groupha_c92b0107_fk_sentry_gr";
CREATE INDEX CONCURRENTLY "sentry_grouphashmetadata_seer_matched_grouphash_id_c92b0107" ON "sentry_grouphashmetadata" ("seer_matched_grouphash_id");


codecov bot commented Sep 23, 2024

❌ 11 Tests Failed:

Tests completed Failed Passed Skipped
21638 11 21627 206
View the full list of 3 ❄️ flaky tests
tests.snuba.api.endpoints.test_organization_events_span_indexed.OrganizationEventsEAPSpanEndpointTest test_simple

Flake rate in main: 66.67% (Passed 2 times, Failed 4 times)

Stack Traces | 17.3s run time
.../api/endpoints/test_organization_events_span_indexed.py:559: in test_simple
    assert data == [
E   AssertionError: assert [{'count()': ....status': ''}] == [{'count()': ...': 'success'}]
E
E     At index 0 diff: {'description': 'bar', 'span.status': '', 'count()': 1} != {'span.status': 'invalid_argument', 'description': 'bar', 'count()': 1}
E
E     Full diff:
E       [
E           {
E               'count()': 1,
E               'description': 'bar',
E     -         'span.status': 'invalid_argument',
E     ?                         ----------------
E     +         'span.status': '',
E           },
E           {
E               'count()': 1,
E               'description': 'foo',
E     -         'span.status': 'success',
E     ?                         -------
E     +         'span.status': '',
E           },
E       ]
tests.snuba.api.endpoints.test_organization_events_span_indexed.OrganizationEventsEAPSpanEndpointTest test_sentry_tags_syntax

Flake rate in main: 66.67% (Passed 2 times, Failed 4 times)

Stack Traces | 17.5s run time
.../api/endpoints/test_organization_events_span_indexed.py:160: in test_sentry_tags_syntax
    assert data[0]["sentry_tags[transaction.method]"] == "foo"
E   AssertionError: assert '' == 'foo'
E
E     - foo
tests.snuba.api.endpoints.test_organization_events_span_indexed.OrganizationEventsEAPSpanEndpointTest test_numeric_attr_with_spaces

Flake rate in main: 100.00% (Passed 0 times, Failed 4 times)

Stack Traces | 18s run time
.../api/endpoints/test_organization_events_span_indexed.py:702: in test_numeric_attr_with_spaces
    assert data[0]["tags[foo, string]"] == "five"
E   AssertionError: assert '' == 'five'
E
E     - five

To view individual test run time comparison to the main branch, go to the Test Analytics Dashboard

@lobsterkatie lobsterkatie force-pushed the kmclb-store-seer-results-in-grouphash-metadata branch from 4b444cd to 32565d6 Compare September 23, 2024 17:59
@github-actions github-actions bot added the Scope: Frontend Automatically applied to PRs that change frontend components label Sep 23, 2024

This comment was marked as off-topic.

@lobsterkatie lobsterkatie removed the Scope: Frontend Automatically applied to PRs that change frontend components label Sep 23, 2024
# that because of merging/unmerging, the sent GroupHash and this metadata's GroupHash (if not
# one and the same) aren't guaranteed to forever point to the same group (though they will when
# this field is written).
seer_grouphash_sent = FlexibleForeignKey(
Member

i would prefer not to keep track of this on GroupHashMetadata -- we can simply infer this from the relation between the grouphash and group.

Member

i see note above on merging/unmerging, i'll think further on this

Member

@JoshFerge JoshFerge left a comment

small note: for when moving out of draft, it would be preferred to isolate the model changes in a singular PR

Labels
Scope: Backend Automatically applied to PRs that change backend components
2 participants