Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(generic-metrics): Add generic metrics interface #46957

Merged
merged 29 commits into from
Apr 13, 2023
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 42 additions & 0 deletions src/sentry/sentry_metrics/kafka_metrics_interface.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
from typing import Mapping

from sentry.sentry_metrics.metrics_interface import GenericMetricsBackend
from sentry.sentry_metrics.use_case_id_registry import UseCaseID


class KafkaGenericMetricsBackend(GenericMetricsBackend):
ayirr7 marked this conversation as resolved.
Show resolved Hide resolved
def __init__(self) -> None:
super().__init__()

def counter(
ayirr7 marked this conversation as resolved.
Show resolved Hide resolved
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite an opinionated interface. Why is org and project ID needed at all? Seems like it can't be used for any use cases that aren't within the context of a single org and project then?

Copy link
Contributor

@onewland onewland Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is (a required) part of the back-end schema so I don't think we should drop these.

Copy link
Member

@lynnagara lynnagara Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because the backend schema is a multi-tenant system and we need to keep individual customer data separate from other customers. For Sentry, this seems like it should always be org_id = 1 and project_id = 1. Why asking the user to pass it every time here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wouldn't be the Sentry org_id and project_id since this would be customer data. @onewland am I missing anything?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So just to clarify, the metrics platform is for aggregating customer data received during ingest only? I think this restriction wasn't super clear to me before, and I assumed it was valid to use metrics for anything at all.

Copy link
Contributor

@onewland onewland Apr 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think from both a query efficiency (the back-end tables are ordered first by org_id and project_id) and cost accounting perspective we should aim to include customer project IDs and org IDs where they make sense, and we can pass Sentry org IDs and project IDs everywhere else.

For example, on the escalating issues project, we're trying to keep a count of error by group ID on ingest. This has an obvious correspondence to a customer's project_id and org_id, and therefore that data should propagate through.

If we wanted to track something like the number of times a celery task is executed, and this is independent from customer data/behavior, then I think it would make more sense to do org_id = 1 and project_id = 1.

Lyn, how would you feel if these parameters were optional and the backends did some resolution if org_id/project_id aren't passed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't just hardcode org_id 1 and and project_id 1 though. Is there a plan for how these values will get resolved?

Does this interface make using Relay a non-starter? Or does there need to be other changes there to make this happen? The implication of this interface seems to be that a project can send data onto some other org/project not just itself. I am guessing this is not something generally allowed in Relay as it might have all kinds of implications in terms of security, quotas, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that fallback values could be configured in Sentry options so they could vary by deployment.

I don't see why this interface makes Relay a non-starter, nor why you think it implies that projects would send data on behalf of other projects. Keep in mind that this interface will only be supported by an internal Relay endpoint, and we would not be exposing this interface to anybody outside Sentry. If we ever reach the point where we want to accept metrics externally, it will be through a DSN which does auth and so there will always be a customer-specific org/project_id attached.

Copy link
Member

@untitaker untitaker Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What changes are planned on Relay-side? Without any, you won't be able to implement this interface for Relay. If you only want to implement this interface for Kafka all is well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, so just to clarify - the internal Relay endpoint mentioned above would allow us to skip authenticating as Sentry, and would allow us to send customer org_id and project_id to Relay via this interface

metric_name: str,
value: int,
ayirr7 marked this conversation as resolved.
Show resolved Hide resolved
tags: Mapping[str, str],
) -> None:
pass

def set(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: int,
tags: Mapping[str, str],
) -> None:
pass

def distribution(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: int,
tags: Mapping[str, str],
) -> None:
pass
42 changes: 42 additions & 0 deletions src/sentry/sentry_metrics/metrics_interface.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
import abc
from typing import Mapping

from sentry.sentry_metrics.use_case_id_registry import UseCaseID


class GenericMetricsBackend(metaclass=abc.ABCMeta):
ayirr7 marked this conversation as resolved.
Show resolved Hide resolved
@abc.abstractmethod
def counter(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: int,
tags: Mapping[str, str],
) -> None:
raise NotImplementedError()

@abc.abstractmethod
def set(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: int,
ayirr7 marked this conversation as resolved.
Show resolved Hide resolved
tags: Mapping[str, str],
) -> None:
raise NotImplementedError()

@abc.abstractmethod
def distribution(
self,
use_case_id: UseCaseID,
org_id: int,
project_id: int,
metric_name: str,
value: int,
ayirr7 marked this conversation as resolved.
Show resolved Hide resolved
tags: Mapping[str, str],
) -> None:
raise NotImplementedError()
64 changes: 64 additions & 0 deletions src/sentry/sentry_metrics/use_case_id_registry.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
from __future__ import annotations

from typing import Any, Iterator, Mapping, Optional, Set

from sentry.sentry_metrics.configuration import UseCaseKey

_HARDCODED_USE_CASES = {"PERFORMANCE": "performance", "RELEASE_HEALTH": "release-health"}

_REGISTERED_USE_CASES: dict[str, str] = {}


class _UseCaseID(type):
def __getattr__(self, attr: str) -> UseCaseID:
if attr not in _HARDCODED_USE_CASES and attr not in _REGISTERED_USE_CASES:
raise AttributeError(attr)

return UseCaseID(attr.lower())

def __iter__(self) -> Iterator[UseCaseID]:
return iter(
UseCaseID(value)
for value in {
**_HARDCODED_USE_CASES,
**_REGISTERED_USE_CASES,
}.values()
)


class UseCaseID(metaclass=_UseCaseID):
ayirr7 marked this conversation as resolved.
Show resolved Hide resolved
def __init__(self, value: str):
if (
value not in _HARDCODED_USE_CASES.values()
and value not in _REGISTERED_USE_CASES.values()
):
raise ValueError("Passed use case has not been registered")

self.value = value

def __hash__(self) -> int:
return hash(self.value)

def __eq__(self, other: Any) -> bool:
return isinstance(other, UseCaseID) and other.value == self.value

def __repr__(self) -> str:
return f"UseCaseID.{self.value.upper()}"


def register_use_case(key: str) -> UseCaseID:
_REGISTERED_USE_CASES[key.upper()] = key.lower()
return UseCaseID(key)


METRIC_PATH_MAPPING: Mapping[UseCaseKey, Set[UseCaseID]] = {
UseCaseKey.RELEASE_HEALTH: {UseCaseID.RELEASE_HEALTH},
UseCaseKey.PERFORMANCE: {UseCaseID.PERFORMANCE},
}


john-z-yang marked this conversation as resolved.
Show resolved Hide resolved
def get_metric_path_from_usecase(use_case: UseCaseID) -> Optional[UseCaseKey]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the separate UseCaseID and UseCaseKey?

Copy link
Member Author

@ayirr7 ayirr7 Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So UseCaseKey is currently used for both configuration (see MetricsIngestConfiguration), as well as the use_case_id that is used to populate message payloads and is used in the indexer interface (e.g. the resolve function being used to resolve (string, use_case_id=performance) to an integer). We're going to make it so that UseCaseKey is solely used for ingest configuration (so, it can be either metrics or generic metrics), and that UseCaseID becomes the representation for all use cases (performance, release health, escalating issues, etc.) within one of those metric paths.

This mapping exists so that we can map UseCaseID (individual use cases mentioned above) to the metric paths represented by UseCaseKey currently.

We're going to rename UseCaseKey to MetricPathKey. One plan is to do that once we switch UseCaseKey to UseCaseID in all the places where UseCaseKey is being used to refer to these individual use cases. This PR #46575 would address that. We could also do the rename earlier, please share if there are any opinions on when that should happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Should the metric path mapping be Mapping[UseCaseID, UseCaseKey] then instead of Mapping[UseCaseKey, Set[UseCaseID]]? It seems like this function would be simpler and also it removes the possibility of invalid data like putting the same use case ID into both keys, etc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that makes sense to me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this then means that the UseCaseID is now globally unique and not namespaced by the UseCaseKey/MetricPathKey like it was before... Is that right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify what you mean by "namespaced by" here? If I understood correctly, I think the answer to your question is yes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously the same use case ID could be present under both metrics paths. Now seems like it cannot. Just want to clarify the desired behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's correct

for metric_path_key, use_case_list in METRIC_PATH_MAPPING.items():
if use_case in use_case_list:
return metric_path_key
return None