
ref(generic-metrics): Remove 10 seconds granularity #4396

Closed
wants to merge 3 commits

Conversation

john-z-yang (Member)

Overview

Create migrations for the materialized views of sets, counters, and distributions to remove the 10-second granularity.

Testing

Ran Snuba locally with the migrations applied and sent 9 messages in total: one (set, counter, distribution) Kafka payload for each of 3 use case IDs.

SELECT
    use_case_id,
    org_id,
    project_id,
    metric_id,
    granularity
FROM generic_metric_sets_local
WHERE arrayExists(v -> match(v, 'gen_metric_e2e_.*FJ1D6416'), tags.raw_value)
UNION ALL
SELECT
    use_case_id,
    org_id,
    project_id,
    metric_id,
    granularity
FROM generic_metric_distributions_aggregated_local
WHERE arrayExists(v -> match(v, 'gen_metric_e2e_.*FJ1D6416'), tags.raw_value)
UNION ALL
SELECT
    use_case_id,
    org_id,
    project_id,
    metric_id,
    granularity
FROM generic_metric_counters_aggregated_local
WHERE arrayExists(v -> match(v, 'gen_metric_e2e_.*FJ1D6416'), tags.raw_value)

Query id: 3554e50f-e92d-4333-949e-4c8e00e36cef

┌─use_case_id──┬─org_id─┬─project_id─┬─metric_id─┬─granularity─┐
│ transactions │      1 │          3 │     65551 │           1 │
│ transactions │      1 │          3 │     65551 │           2 │
│ transactions │      1 │          3 │     65551 │           3 │
│ spans        │      1 │          3 │     65559 │           1 │
│ spans        │      1 │          3 │     65559 │           2 │
│ spans        │      1 │          3 │     65559 │           3 │
│ hello        │      1 │          3 │     65641 │           1 │
│ hello        │      1 │          3 │     65641 │           2 │
│ hello        │      1 │          3 │     65641 │           3 │
└──────────────┴────────┴────────────┴───────────┴─────────────┘
┌─use_case_id──┬─org_id─┬─project_id─┬─metric_id─┬─granularity─┐
│ transactions │      1 │          3 │     65553 │           1 │
│ transactions │      1 │          3 │     65553 │           2 │
│ transactions │      1 │          3 │     65553 │           3 │
│ spans        │      1 │          3 │     65556 │           1 │
│ spans        │      1 │          3 │     65556 │           2 │
│ spans        │      1 │          3 │     65556 │           3 │
│ hello        │      1 │          3 │     65643 │           1 │
│ hello        │      1 │          3 │     65643 │           2 │
│ hello        │      1 │          3 │     65643 │           3 │
└──────────────┴────────┴────────────┴───────────┴─────────────┘
┌─use_case_id──┬─org_id─┬─project_id─┬─metric_id─┬─granularity─┐
│ transactions │      1 │          3 │     65546 │           1 │
│ transactions │      1 │          3 │     65546 │           2 │
│ transactions │      1 │          3 │     65546 │           3 │
│ spans        │      1 │          3 │     65562 │           1 │
│ spans        │      1 │          3 │     65562 │           2 │
│ spans        │      1 │          3 │     65562 │           3 │
│ hello        │      1 │          3 │     65649 │           1 │
│ hello        │      1 │          3 │     65649 │           2 │
│ hello        │      1 │          3 │     65649 │           3 │
└──────────────┴────────┴────────────┴───────────┴─────────────┘
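The results above can be checked mechanically: every metric should appear at exactly the minute/hour/day granularities (1, 2, 3) and never at the removed 10-second granularity. A minimal Python sketch, with rows transcribed from the first result block (treating granularity code 0 as the removed 10-second bucket is an assumption of this sketch, not stated in the output):

```python
from collections import defaultdict

# (use_case_id, metric_id, granularity) rows from the first result block above.
rows = [
    ("transactions", 65551, 1), ("transactions", 65551, 2), ("transactions", 65551, 3),
    ("spans", 65559, 1), ("spans", 65559, 2), ("spans", 65559, 3),
    ("hello", 65641, 1), ("hello", 65641, 2), ("hello", 65641, 3),
]

by_metric = defaultdict(set)
for use_case_id, metric_id, granularity in rows:
    by_metric[metric_id].add(granularity)

# Every metric is written at exactly the minute/hour/day granularities,
# and the (assumed) 10-second granularity code 0 never appears.
assert all(grans == {1, 2, 3} for grans in by_metric.values())
```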

github-actions bot commented Jun 21, 2023

This PR has a migration; here is the generated SQL

-- start migrations

-- forward migration generic_metrics : 0014_sets_mv_2
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_sets_aggregation_mv_2 TO generic_metric_sets_local (org_id UInt64, project_id UInt64, metric_id UInt64, granularity UInt8, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tags Nested(key UInt64, indexed_value UInt64, raw_value String), value AggregateFunction(uniqCombined64, UInt64), use_case_id LowCardinality(String)) AS 
                SELECT
                    use_case_id,
                    org_id,
                    project_id,
                    metric_id,
                    arrayJoin(granularities) as granularity,
                    tags.key,
                    tags.indexed_value,
                    tags.raw_value,
                    toDateTime(multiIf(granularity=1,60,granularity=2,3600,granularity=3,86400,-1) *
                      intDiv(toUnixTimestamp(timestamp),
                             multiIf(granularity=1,60,granularity=2,3600,granularity=3,86400,-1))) as timestamp,
                    retention_days,
                    uniqCombined64State(arrayJoin(set_values)) as value
                FROM generic_metric_sets_raw_local
                WHERE materialization_version = 2
                  AND metric_type = 'set'
                GROUP BY
                    use_case_id,
                    org_id,
                    project_id,
                    metric_id,
                    tags.key,
                    tags.indexed_value,
                    tags.raw_value,
                    timestamp,
                    granularity,
                    retention_days
                ;
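The timestamp expression in the view rounds each raw timestamp down to the start of its granularity bucket. A small Python sketch of that arithmetic, with the bucket sizes taken from the `multiIf` above (the function and constant names here are illustrative, not from the codebase):

```python
# Bucket sizes per granularity code, as encoded in the SQL's multiIf:
# 1 -> 60s (minute), 2 -> 3600s (hour), 3 -> 86400s (day).
BUCKET_SECONDS = {1: 60, 2: 3600, 3: 86400}

def bucket_timestamp(unix_ts: int, granularity: int) -> int:
    """Round a unix timestamp down to the start of its granularity bucket,
    mirroring multiIf(...) * intDiv(toUnixTimestamp(ts), multiIf(...))."""
    size = BUCKET_SECONDS[granularity]
    return size * (unix_ts // size)

# 1687350896 is 2023-06-21 12:34:56 UTC.
assert bucket_timestamp(1687350896, 1) == 1687350840  # minute start
assert bucket_timestamp(1687350896, 2) == 1687348800  # hour start
assert bucket_timestamp(1687350896, 3) == 1687305600  # day start
```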
-- end forward migration generic_metrics : 0014_sets_mv_2




-- backward migration generic_metrics : 0014_sets_mv_2
Local op: DROP TABLE IF EXISTS generic_metric_sets_aggregation_mv_2;
-- end backward migration generic_metrics : 0014_sets_mv_2
-- forward migration generic_metrics : 0015_distributions_mv_2
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_distributions_aggregation_mv_2 TO generic_metric_distributions_aggregated_local (org_id UInt64, project_id UInt64, metric_id UInt64, granularity UInt8, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tags Nested(key UInt64, indexed_value UInt64, raw_value String), value AggregateFunction(uniqCombined64, UInt64), use_case_id LowCardinality(String)) AS 
                SELECT
                    use_case_id,
                    org_id,
                    project_id,
                    metric_id,
                    arrayJoin(granularities) as granularity,
                    tags.key,
                    tags.indexed_value,
                    tags.raw_value,
                    toDateTime(multiIf(granularity=1,60,granularity=2,3600,granularity=3,86400,-1) *
                      intDiv(toUnixTimestamp(timestamp),
                             multiIf(granularity=1,60,granularity=2,3600,granularity=3,86400,-1))) as timestamp,
                    retention_days,
                    quantilesState(0.5, 0.75, 0.9, 0.95, 0.99)((arrayJoin(distribution_values) AS values_rows)) as percentiles,
                    minState(values_rows) as min,
                    maxState(values_rows) as max,
                    avgState(values_rows) as avg,
                    sumState(values_rows) as sum,
                    countState(values_rows) as count,
                    histogramState(250)(values_rows) as histogram_buckets
                FROM generic_metric_distributions_raw_local
                WHERE materialization_version = 2
                  AND metric_type = 'distribution'
                GROUP BY
                    use_case_id,
                    org_id,
                    project_id,
                    metric_id,
                    tags.key,
                    tags.indexed_value,
                    tags.raw_value,
                    timestamp,
                    granularity,
                    retention_days
                ;
-- end forward migration generic_metrics : 0015_distributions_mv_2




-- backward migration generic_metrics : 0015_distributions_mv_2
Local op: DROP TABLE IF EXISTS generic_metric_distributions_aggregation_mv_2;
-- end backward migration generic_metrics : 0015_distributions_mv_2
-- forward migration generic_metrics : 0016_counters_mv_2
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_aggregation_mv_2 TO generic_metric_counters_aggregated_local (org_id UInt64, project_id UInt64, metric_id UInt64, granularity UInt8, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tags Nested(key UInt64, indexed_value UInt64, raw_value String), value AggregateFunction(sum, Float64), use_case_id LowCardinality(String)) AS 
                SELECT
                    use_case_id,
                    org_id,
                    project_id,
                    metric_id,
                    arrayJoin(granularities) as granularity,
                    tags.key,
                    tags.indexed_value,
                    tags.raw_value,
                    toDateTime(multiIf(granularity=1,60,granularity=2,3600,granularity=3,86400,-1) *
                      intDiv(toUnixTimestamp(timestamp),
                             multiIf(granularity=1,60,granularity=2,3600,granularity=3,86400,-1))) as timestamp,
                    retention_days,
                    sumState(count_value) as value
                FROM generic_metric_counters_raw_local
                WHERE materialization_version = 2
                  AND metric_type = 'counter'
                GROUP BY
                    use_case_id,
                    org_id,
                    project_id,
                    metric_id,
                    tags.key,
                    tags.indexed_value,
                    tags.raw_value,
                    timestamp,
                    granularity,
                    retention_days
                ;
-- end forward migration generic_metrics : 0016_counters_mv_2




-- backward migration generic_metrics : 0016_counters_mv_2
Local op: DROP TABLE IF EXISTS generic_metric_counters_aggregation_mv_2;
-- end backward migration generic_metrics : 0016_counters_mv_2

codecov bot commented Jun 21, 2023

Codecov Report

Patch coverage: 92.81% and project coverage change: +0.01 🎉

Comparison is base (7a07b74) 90.34% compared to head (371f77e) 90.35%.

❗ Current head 371f77e differs from pull request most recent head b724c2b. Consider uploading reports for the commit b724c2b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4396      +/-   ##
==========================================
+ Coverage   90.34%   90.35%   +0.01%     
==========================================
  Files         802      806       +4     
  Lines       39388    39558     +170     
  Branches      245      245              
==========================================
+ Hits        35584    35743     +159     
- Misses       3772     3783      +11     
  Partials       32       32              
Impacted Files Coverage Δ
snuba/cli/consumer.py 0.00% <0.00%> (ø)
snuba/cli/devserver.py 0.00% <ø> (ø)
snuba/cli/dlq_consumer.py 0.00% <ø> (ø)
snuba/datasets/processors/outcomes_processor.py 94.28% <ø> (ø)
snuba/migrations/group_loader.py 95.00% <ø> (ø)
snuba/settings/settings_test.py 100.00% <ø> (ø)
tests/test_search_issues_api.py 100.00% <ø> (ø)
snuba/consumers/consumer_builder.py 65.11% <33.33%> (-0.76%) ⬇️
snuba/consumers/dlq.py 86.02% <50.00%> (-2.62%) ⬇️
snuba/admin/auth.py 84.00% <78.26%> (-3.04%) ⬇️
... and 18 more

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@john-z-yang john-z-yang marked this pull request as ready for review June 21, 2023 21:47
@john-z-yang john-z-yang requested a review from a team as a code owner June 21, 2023 21:47
john-z-yang (Member, Author) commented Jun 26, 2023

Updated the PR to remove the 0 from granularities in the consumer; let me know if this needs to be split up. One of the CI actions is complaining that "Migration changes are coupled with other changes".
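The consumer-side change described here can be sketched as follows. The raw rows carry a granularities array, and the `arrayJoin(granularities)` in each view fans a row out to every listed granularity, so dropping code 0 from that array stops the 10-second rollup from being materialized at all. The constant and function names below are hypothetical, not the real identifiers:

```python
# Hypothetical sketch: granularity code 0 is assumed to be the 10-second
# bucket being removed; 1/2/3 (minute/hour/day) remain.
GRANULARITY_TEN_SECONDS = 0

def enabled_granularities(include_ten_seconds: bool = False) -> list:
    """Granularity codes written into each raw row's granularities array."""
    grans = [1, 2, 3]
    if include_ten_seconds:
        grans = [GRANULARITY_TEN_SECONDS] + grans
    return grans

# After this PR the consumer no longer emits the 10-second code.
assert enabled_granularities() == [1, 2, 3]
```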

evanh (Member) left a comment

I don't think the migration plan makes sense here.

If you have to change the processor in the same PR as the migration, then that means there will be a period of time where the processor is broken.

john-z-yang (Member, Author) commented Jun 28, 2023

I don't think the migration plan makes sense here.

If you have to change the processor in the same PR as the migration, then that means there will be a period of time where the processor is broken.

I think I'm going to close this due to some additional changes I'd like to make to the matview.

But a quick (and pedantic) question: would this really break? Wouldn't it need another PR to set the materialization version in the processor to the one specified in the migration before the matview kicks in?
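The point being made here is that the new views are inert until the processor opts in: each view's `WHERE materialization_version = 2` clause (in the generated SQL above) filters out rows written with the old version. A minimal sketch of that gating, with an illustrative row shape rather than the real processor output:

```python
# Illustrative predicate mirroring the counters view's WHERE clause:
# materialization_version = 2 AND metric_type = 'counter'.
def mv_2_consumes(row: dict) -> bool:
    """Would the new counters MV aggregate this raw row?"""
    return (
        row.get("materialization_version") == 2
        and row.get("metric_type") == "counter"
    )

# Rows written by a processor still emitting version 1 are ignored,
# so the new view does nothing until the processor bumps the version.
assert not mv_2_consumes({"materialization_version": 1, "metric_type": "counter"})
assert mv_2_consumes({"materialization_version": 2, "metric_type": "counter"})
```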
