Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(consumer): Add transactions consumer SLO #4442

Merged
merged 10 commits into from
Jul 28, 2023
Merged

Conversation

ayirr7
Copy link
Member

@ayirr7 ayirr7 commented Jun 29, 2023

The received field is (most likely) always present in transaction payloads, but we currently don't do anything with it in the relevant MessageProcessor. This PR allows us to grab the timestamp at which Relay received the transaction and pass it into the SLO calculation, to help us get the ingest transactions SLO.

I think, if we are confident that we are always containing the received field, then we should change the Kafka schema to make received a required field. Need to verify this is the case though.

@codecov
Copy link

codecov bot commented Jul 4, 2023

Codecov Report

Patch coverage: 78.76% and project coverage change: +0.03% 🎉

Comparison is base (6518d74) 90.34% compared to head (83ccf6e) 90.38%.
Report is 98 commits behind head on master.

❗ Current head 83ccf6e differs from pull request most recent head 842557c. Consider uploading reports for the commit 842557c to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4442      +/-   ##
==========================================
+ Coverage   90.34%   90.38%   +0.03%     
==========================================
  Files         804      823      +19     
  Lines       39576    40585    +1009     
  Branches      245      289      +44     
==========================================
+ Hits        35755    36681     +926     
- Misses       3789     3861      +72     
- Partials       32       43      +11     
Files Changed Coverage Δ
scripts/check-migrations.py 0.00% <ø> (ø)
setup.py 0.00% <0.00%> (ø)
snuba/admin/auth_roles.py 93.22% <ø> (ø)
snuba/admin/clickhouse/system_queries.py 70.37% <ø> (ø)
snuba/admin/static/api_client.tsx 2.03% <0.00%> (-0.11%) ⬇️
snuba/admin/static/clickhouse_queries/index.tsx 30.00% <ø> (+10.64%) ⬆️
snuba/admin/static/snql_to_sql/index.tsx 11.90% <ø> (ø)
snuba/admin/static/table.tsx 81.81% <ø> (ø)
snuba/cli/consumer.py 0.00% <0.00%> (ø)
snuba/cli/devserver.py 0.00% <0.00%> (ø)
... and 106 more

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ayirr7 ayirr7 marked this pull request as ready for review July 7, 2023 00:04
@ayirr7 ayirr7 requested a review from a team as a code owner July 7, 2023 00:04
@lynnagara
Copy link
Member

I think you can go ahead and change received in the transactions schema to required. It should fail Sentry / Snuba CI if there are tests where the payload is missing it. In prod, it should be there since Relay is processing events/transactions the same - but even if it's not, it's just a warning.

@ayirr7
Copy link
Member Author

ayirr7 commented Jul 7, 2023

It should fail Sentry / Snuba CI if there are tests where the payload is missing it

There was a test where the sample payload didn't include the received field, and I had to add it for the test to pass. @lynnagara

@@ -472,4 +474,11 @@ def process_message(
# the following operation modifies the event_dict and is therefore *not* order-independent
self._process_contexts_and_user(processed, event_dict)

return InsertBatch([processed], None)
if event_dict["data"]["received"] is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the schema it's defined as a number, and not nullable, so why do we need this check here?

Copy link
Member Author

@ayirr7 ayirr7 Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought process was that -- this is meant to capture the case where the received field is not passed through (since it's not required in the schema yet). Or, is it implied that the field is required if the field isn't nullable in the schema?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from schema point of view, if it's required and non-nullable, you probably don't need to check it. @lynnagara do you know what will happen if a payload is not schema compatible, will we drop the payload? Do you know what are the processes to ensure the payloads matched the schema?

Copy link
Member

@lynnagara lynnagara Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right now, you'll just get a sentry warning if the schema doesn't match but nothing bad will happen. i'd merge the schema change first to validate that if you're unsure if it's always present.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good. Once the payload is always schema compatible, then we probably don't need this extra check.

assert raw_received is not None
received = datetime.utcfromtimestamp(raw_received)
return InsertBatch([processed], received)
except Exception:
Copy link

@hubertsentry hubertsentry Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only check if it's KeyError, Exception can be wild range of exceptions

except KeyError:

Also, maybe looking into https://docs.python.org/3/tutorial/errors.html#exception-chaining

@ayirr7 ayirr7 requested a review from lynnagara July 25, 2023 20:14
received = datetime.utcfromtimestamp(raw_received)
return InsertBatch([processed], received)
except KeyError:
raise KeyError("Missing received timestamp field in transaction")
Copy link
Member

@lynnagara lynnagara Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't the point of the try/except to fall back to the old behavior in case we hit the KeyError rather than re-raising it?

Copy link
Member Author

@ayirr7 ayirr7 Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear: today, what happens if there is a payload missing the received field is that the MessageProcessor will just drop it - is this correct?

So I guess we can fall back to the old behavior (so that we don't drop the message), but we also want to see the error show up in Sentry so we can know if there were any payloads missing the field. What's the right way of getting that error to show up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the point was to log to sentry without throwing the exception, i.e just manually call logger.exception

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright I can do that then, thanks

received = datetime.utcfromtimestamp(raw_received)
return InsertBatch([processed], received)
except KeyError as err:
logger.exception("Missing required field received in payload", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.exception("Missing required field received in payload", err)
logger.exception(e)

return InsertBatch([processed], None)
try:
raw_received = _collapse_uint32(int(event_dict["data"]["received"]))
assert raw_received is not None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If received is not there, won't you get an AssertionError that isn't being caught?

Copy link
Member Author

@ayirr7 ayirr7 Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but wouldn't the first line (which throws KeyError) be thrown and caught first? So we go into the except clause?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I guess the AssertionError could be even easier, since it'd be guaranteed that it's being thrown at that line

@ayirr7 ayirr7 merged commit ea2e56b into master Jul 28, 2023
24 checks passed
@ayirr7 ayirr7 deleted the add-transactions-SLO branch July 28, 2023 19:18
davidtsuk pushed a commit that referenced this pull request Aug 10, 2023
* add transactions SLO

* pass received

* fix received typing to datetime

* fix test

* add received to test event

* do try/except

* fix typing

* except KeyError

* add logging

* fix try except
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants