Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(generic-metrics): Add generic metrics interface #46957

Merged
merged 29 commits into from
Apr 13, 2023
Merged
Show file tree
Hide file tree
Changes from 26 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 61 additions & 0 deletions src/sentry/sentry_metrics/metrics_interface.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
from abc import ABC, abstractmethod
from typing import Mapping, Optional, Sequence, Union

from sentry.sentry_metrics.use_case_id_registry import UseCaseID


class GenericMetricsBackend(ABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this interface is going to produce on Kafka it will be asynchronous. Still you will likely want to allow the client to wait for the message to be acknowledged (callback). How do you plan to support this use case?

You will likely want an interface that allows you to swap the backend (relay or kafka) but still allow delivery guarantee. How will it look like in relay ? Will the call be synchronous ?

There are multiple ways to achieve this:

  • have a sync api and an async api
  • have only an async api and simulate the async behavior in relay with a background thread that wait for the HTTP response.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, to start, maybe we can have an aysnc api and a sync api. I would start working on this while building the Kafka backend, which will be the next PR. Is there anything else specific to this PR that would need to change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should figure out how the async interface looks like (how the client receives the callback, is that a future?) at this point as you will need such interface immediately for Kafka.

Copy link
Member Author

@ayirr7 ayirr7 Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am thinking about that for the Kafka implementation. Since this PR implements the UseCaseID though, it's a bit blocking for some other work. How would it be if we merged the current state of the interface and, in another PR, revisit it as needed + build the Kafka implementation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate from Filippo's comment, I think on the base class we should have docstrings for each method.

@abstractmethod
def counter(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: Union[int, float],
unit: Optional[str],
tags: Mapping[str, str],
) -> None:

"""
Used for emitting a counter metric. Ensure that the use_case_id
passed in has been registered in the UseCaseID enum.
"""

raise NotImplementedError()

@abstractmethod
def set(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: Sequence[int],
unit: Optional[str],
tags: Mapping[str, str],
) -> None:

"""
Used for emitting a set metric. Can support a sequence of values.
Ensure that the use_case_id passed in has been registered in the UseCaseID enum.
"""
raise NotImplementedError()

@abstractmethod
def distribution(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: Sequence[Union[int, float]],
unit: Optional[str],
tags: Mapping[str, str],
) -> None:

"""
Used for emitting a distribution metric. Can support a sequence of values.
Ensure that the use_case_id passed in has been registered in the UseCaseID enum.
"""
raise NotImplementedError()
43 changes: 43 additions & 0 deletions src/sentry/sentry_metrics/use_case_id_registry.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
from __future__ import annotations

from enum import Enum
from typing import Mapping

from sentry.sentry_metrics.configuration import UseCaseKey


class UseCaseID(Enum):
PERFORMANCE = "performance"
RELEASE_HEALTH = "release-health"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MRI namespaces used for these cases today are transactions and sessions, what's the reason for changing the names ?

Copy link
Member Author

@ayirr7 ayirr7 Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This UseCaseID is what is ultimately populating the use_case_id field in messages' payloads. It's also what Clickhouse knows to be the use_case_id. Is there a reason why we should change these from performance and release-health to transactions and sessions? (Also to be clear, we are not changing the namespaces. We are just introducing the UseCaseID enum to keep track of the "true" use cases - e.g. performance, RH, escalating_issues - that identify messages and that Clickhouse recognizes ultimately. We also have a mapping between UseCaseID and namespace. You can also see UseCaseKey which is what currently gets added onto message payloads before sending them to Snuba.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the true use cases should be transactions and sessions not performance and release_health so you avoid the mapping altogether.
I don't think we are storing anything as use case in Clickhouse today, so you would not be changing the existing value.

Another reason for not doing the pamming. Let's say you map what comes from MRI (transactions) to what the indexer consider the true use case (performance). What if somebody tomorow introduces a new MRI namespace called performance in the intent of creating a new use case? Now you would have a naming clash.

I'd just take transactions and sessions as use case ids and avoid the mapping.



# UseCaseKey will be renamed to MetricPathKey
METRIC_PATH_MAPPING: Mapping[UseCaseID, UseCaseKey] = {
UseCaseID.RELEASE_HEALTH: UseCaseKey.RELEASE_HEALTH,
UseCaseID.PERFORMANCE: UseCaseKey.PERFORMANCE,
}


john-z-yang marked this conversation as resolved.
Show resolved Hide resolved
def get_metric_path_from_usecase(use_case: UseCaseID) -> UseCaseKey:
return METRIC_PATH_MAPPING[use_case]


NAMESPACE_MAPPING: Mapping[UseCaseID, str] = {
UseCaseID.RELEASE_HEALTH: "sessions",
UseCaseID.PERFORMANCE: "transactions",
}


def get_namespace_from_usecase(use_case: UseCaseID) -> str:
if use_case in NAMESPACE_MAPPING:
return NAMESPACE_MAPPING[use_case]

return use_case.value


def get_usecase_from_namespace(namespace: str) -> UseCaseID:
for use_case in NAMESPACE_MAPPING:
if NAMESPACE_MAPPING[use_case] == namespace:
return use_case

return UseCaseID(namespace)