Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref: Deprecate TOPIC_PARTITION_COUNTS setting #5868

Merged
merged 14 commits into from
May 20, 2024

Conversation

lynnagara
Copy link
Member

@lynnagara lynnagara commented May 6, 2024

Snuba should fetch the actual value from the broker on startup rather than rely on a hardcoded TOPIC_PARTITION_COUNTS settings. This will ensure that it always has the correct values and remove the manual process and associated drift in ops that is currently involved in keeping this up to date.

Depends on getsentry/sentry#70478

This is a prerequisite to https://github.com/getsentry/ops/pull/10392 and https://github.com/getsentry/ops/pull/10451 which will enable clean up of topicctl configuration in ops.

Snuba should fetch the actual value from the broker on startup rather than
rely on a hardcoded TOPIC_PARTITION_COUNTS settings. This will ensure that
it always has the correct values and remove the manual process and associated
drift in ops that is currently involved in keeping this up to date.
@lynnagara lynnagara changed the title ref: Deprecated TOPIC_PARTITION_COUNTS setting ref: Deprecate TOPIC_PARTITION_COUNTS setting May 6, 2024
@lynnagara lynnagara marked this pull request as ready for review May 6, 2024 19:51
@lynnagara lynnagara requested a review from a team as a code owner May 6, 2024 19:51
@getsentry getsentry deleted a comment from codecov bot May 6, 2024
config = get_default_kafka_configuration(self.__topic, None)
client = AdminClient(config)
topic_name = self.get_physical_topic_name()
print(topic_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm debugging a test that fails in CI but not locally

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed now

Copy link
Member

@evanh evanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiousity, if the partition check fails for some reason, what would happen

@property
def replication_factor(self) -> int:
return 1
config = get_default_kafka_configuration(self.__topic, None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This connection already happens in bootstrap.py. Is there a way we could leverage that code, to avoid connecting twice on startup?

Also, there is a utility function that does this: https://github.com/getsentry/snuba/blob/master/snuba/consumers/utils.py#L17

Could we use that instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugggh I just noticed a pretty bad bug with that other function. It does not properly resolve the topic name if it is overridden to something else and assumes the default. So I guess we can't safely use it right now, but also... this probably needs to be fixed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like it should be fixed. Did you want to add it to this PR?

@lynnagara
Copy link
Member Author

lynnagara commented May 7, 2024

Out of curiousity, if the partition check fails for some reason, what would happen

The pod would crash if Kafka or the topic is unavailable for some reason. This applies to the subscription scheduler pod as well as the API pod if the subscription POST endpoint is called.

@enochtangg
Copy link
Member

This applies to the subscription scheduler pod as well as the API pod if the subscription POST endpoint is called

Is there any way around this so that the API pod doesn't crash? This kinda reminds me of the old readiness health-check we had before where snuba API would crash if it isn't able to connect to ClickHouse. Ideally, the API still works for all other the other datasets even if one partition check fails for a single dataset.

lynnagara added a commit to getsentry/sentry that referenced this pull request May 13, 2024
@lynnagara
Copy link
Member Author

Discussed offline. Looks like exception handling and returning a proper response is already automatically handled by

snuba/snuba/web/views.py

Lines 173 to 194 in 177e23e

@application.errorhandler(InternalServerError)
def handle_internal_server_error(exception: InternalServerError) -> Response:
original = getattr(exception, "original_exception", None)
if original is None:
return Response(
json.dumps(
{"error": {"type": "internal_server_error", "message": str(exception)}},
indent=4,
),
500,
{"Content-Type": "application/json"},
)
return Response(
json.dumps(
{"error": {"type": "internal_server_error", "message": str(original)}},
indent=4,
),
500,
{"Content-Type": "application/json"},
)

@lynnagara
Copy link
Member Author

also needs getsentry/sentry#70809

lynnagara added a commit to getsentry/sentry that referenced this pull request May 20, 2024
fix test for getsentry/snuba#5868

the events topic needs to exist for subscriptions to work
@lynnagara lynnagara merged commit 4aedda7 into master May 20, 2024
30 checks passed
@lynnagara lynnagara deleted the deprecate-topic-partitions-count branch May 20, 2024 20:04
@lynnagara
Copy link
Member Author

sentry test is failing

@getsentry-bot
Copy link
Contributor

PR reverted: ed0426b

getsentry-bot added a commit that referenced this pull request May 20, 2024
This reverts commit 4aedda7.

Co-authored-by: lynnagara <1779792+lynnagara@users.noreply.github.com>
lynnagara added a commit that referenced this pull request May 20, 2024
lynnagara added a commit to getsentry/sentry that referenced this pull request May 20, 2024
we need to actually create the topic in CI
(this worked locally because of auto-create settings)

getsentry/snuba#5868 had to be reverted until we
get this fix in as CI will fail
cmanallen pushed a commit to getsentry/sentry that referenced this pull request May 21, 2024
fix test for getsentry/snuba#5868

the events topic needs to exist for subscriptions to work
cmanallen pushed a commit to getsentry/sentry that referenced this pull request May 21, 2024
we need to actually create the topic in CI
(this worked locally because of auto-create settings)

getsentry/snuba#5868 had to be reverted until we
get this fix in as CI will fail
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants