Add experimental support for delta temporality #121
@ankitnayan, here is the modified query for the rate with delta. I will be doing more testing, but I believe we don't need a
SELECT
service_name,
toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), toIntervalSecond(60)) AS ts,
sum(value) / 60 AS value
FROM signoz_metrics.distributed_samples_v2
GLOBAL INNER JOIN
(
SELECT
JSONExtractString(labels, 'service_name') AS service_name,
fingerprint
FROM signoz_metrics.distributed_time_series_v2
WHERE (metric_name = 'signoz_calls_total') AND (temporality = 'Delta') AND (timestamp_ms >= (toUnixTimestamp(now() - toIntervalMinute(30)) * 1000)) AND (timestamp_ms <= (toUnixTimestamp(now()) * 1000))
) AS filtered_time_series USING (fingerprint)
WHERE (metric_name = 'signoz_calls_total')
GROUP BY
service_name,
ts
ORDER BY
service_name ASC,
ts ASC
Output of the query (same as final result of cumulative query/traces)
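To make the arithmetic of the rate query above concrete, here is a minimal Python sketch of what it computes: delta samples are bucketed into 60-second intervals per service, summed, and divided by the interval length. The function name `delta_rate` and the sample data are made up for illustration and are not part of this PR.

```python
from collections import defaultdict

def delta_rate(samples, step_s=60):
    """Per-second rate from delta samples.

    samples: iterable of (service_name, timestamp_ms, value) tuples.
    Returns {(service_name, bucket_start_seconds): rate_per_second}.
    """
    sums = defaultdict(float)
    for service, ts_ms, value in samples:
        # Same idea as toStartOfInterval(..., toIntervalSecond(step_s))
        bucket = (ts_ms // 1000) // step_s * step_s
        sums[(service, bucket)] += value
    # With delta temporality the sum over the bucket is already the increase,
    # so the rate is simply sum / interval length.
    return {key: total / step_s for key, total in sorted(sums.items())}

# Hypothetical delta points: (service_name, timestamp_ms, value)
samples = [
    ("frontend", 0, 30.0),
    ("frontend", 30_000, 90.0),
    ("frontend", 60_000, 60.0),
    ("cart", 15_000, 12.0),
]
rates = delta_rate(samples)
for key, rate in rates.items():
    print(key, rate)
```

Unlike the cumulative case, no difference between consecutive samples is needed here, which is why the query is a plain sum per interval.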
@srikanthccv what would a percentile query look like?
SELECT
service_name,
ts,
histogramQuantile(arrayMap(x -> toFloat64(x), groupArray(le)), groupArray(value), 0.95) AS value
FROM
(
SELECT
service_name,
le,
toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), toIntervalSecond(60)) AS ts,
sum(value) / 60 AS value
FROM signoz_metrics.distributed_samples_v2
GLOBAL INNER JOIN
(
SELECT
JSONExtractString(labels, 'service_name') AS service_name,
JSONExtractString(labels, 'le') AS le,
fingerprint
FROM signoz_metrics.distributed_time_series_v2
WHERE (metric_name = 'signoz_latency_bucket') AND (temporality = 'Delta')
) AS filtered_time_series USING (fingerprint)
WHERE metric_name = 'signoz_latency_bucket'
GROUP BY
service_name,
le,
ts
ORDER BY
service_name ASC,
le ASC,
ts ASC
)
GROUP BY
service_name,
ts
ORDER BY
service_name ASC,
ts ASC
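For readers unfamiliar with the outer step of the percentile query, the sketch below shows one common way to estimate a quantile from `le` buckets: Prometheus-style linear interpolation inside the bucket that contains the target rank. This is an assumption for illustration only; the internals of the `histogramQuantile` function used in the query are not shown in this thread and may differ.

```python
import math

def histogram_quantile(les, values, q):
    """Estimate quantile q from bucket upper bounds and cumulative-by-le values.

    les: upper bounds as strings, e.g. ["100", "250", "+Inf"].
    values: per-bucket counts or rates, cumulative across le.
    """
    buckets = sorted((float(le), v) for le, v in zip(les, values))
    if not buckets or not math.isinf(buckets[-1][0]):
        return math.nan  # a +Inf bucket is required to know the total
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_le, prev_count = 0.0, 0.0
    for le, count in buckets:
        if count >= rank:
            if math.isinf(le) or count == prev_count:
                return prev_le  # fall back to the last known bound
            # Linear interpolation inside the bucket holding the rank
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return math.nan

# Hypothetical per-bucket rates for one (service_name, ts) group
les = ["100", "250", "500", "+Inf"]
values = [10.0, 60.0, 95.0, 100.0]
p50 = histogram_quantile(les, values, 0.5)
print(p50)
```

Because the inner query already sums delta values per `le` and interval, the inputs to this step are the same shape as in the cumulative case; only the way the per-interval rate is obtained changes.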
1 Shard, 4 vCPUs, and 16 GB. Here is how the delta compares to the cumulative with the devrev data. The cumulative values are taken from an earlier exercise here https://github.com/SigNoz/engineering-pod/issues/903#issuecomment-1597998417
Queries for delta:
(1 service) 1 day - 5 min query
SELECT
destination_service,
ts,
histogramQuantile(arrayMap(x -> toFloat64(x), groupArray(le)), groupArray(value), 0.9) AS value
FROM
(
SELECT
destination_service,
le,
toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), toIntervalSecond(300)) AS ts,
sum(value) / 300 AS value
FROM signoz_metrics.distributed_samples_v2
INNER JOIN
(
SELECT
JSONExtractString(labels, 'destination_service') AS destination_service,
JSONExtractString(labels, 'le') AS le,
fingerprint
FROM signoz_metrics.time_series_v2
WHERE (metric_name = 'istio_request_bytes_bucket') AND (labels LIKE '%gateway%') AND (JSONExtractString(labels, 'destination_service') = 'gateway.gateway.svc.cluster.local')
) AS filtered_time_series USING (fingerprint)
WHERE (metric_name = 'istio_request_bytes_bucket') AND (timestamp_ms >= 1679500800000) AND (timestamp_ms <= 1679587200000)
GROUP BY
destination_service,
le,
ts
ORDER BY
destination_service ASC,
le ASC,
ts ASC
)
GROUP BY
destination_service,
ts
ORDER BY
destination_service ASC,
ts ASC
288 rows in set. Elapsed: 16.774 sec. Processed 189.05 million rows, 13.10 GB (11.27 million rows/s., 780.87 MB/s.)
(1 service) 7 days - 1 hr query
SELECT
destination_service,
ts,
histogramQuantile(arrayMap(x -> toFloat64(x), groupArray(le)), groupArray(value), 0.9) AS value
FROM
(
SELECT
destination_service,
le,
toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), toIntervalSecond(3600)) AS ts,
sum(value) / 3600 AS value
FROM signoz_metrics.distributed_samples_v2
INNER JOIN
(
SELECT
JSONExtractString(labels, 'destination_service') AS destination_service,
JSONExtractString(labels, 'le') AS le,
fingerprint
FROM signoz_metrics.time_series_v2
WHERE (metric_name = 'istio_request_bytes_bucket') AND (labels LIKE '%gateway%') AND (JSONExtractString(labels, 'destination_service') = 'gateway.gateway.svc.cluster.local')
) AS filtered_time_series USING (fingerprint)
WHERE (metric_name = 'istio_request_bytes_bucket') AND (timestamp_ms >= 1679500800000) AND (timestamp_ms <= 1680105600000)
GROUP BY
destination_service,
le,
ts
ORDER BY
destination_service ASC,
le ASC,
ts ASC
)
GROUP BY
destination_service,
ts
ORDER BY
destination_service ASC,
ts ASC
137 rows in set. Elapsed: 43.238 sec. Processed 744.89 million rows, 27.15 GB (17.23 million rows/s., 627.90 MB/s.)
(all services) 1 day - 5 min query
SELECT
destination_service,
ts,
histogramQuantile(arrayMap(x -> toFloat64(x), groupArray(le)), groupArray(value), 0.9) AS value
FROM
(
SELECT
destination_service,
le,
toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), toIntervalSecond(300)) AS ts,
sum(value) / 300 AS value
FROM signoz_metrics.distributed_samples_v2
INNER JOIN
(
SELECT
JSONExtractString(labels, 'destination_service') AS destination_service,
JSONExtractString(labels, 'le') AS le,
fingerprint
FROM signoz_metrics.time_series_v2
WHERE (metric_name = 'istio_request_bytes_bucket')
) AS filtered_time_series USING (fingerprint)
WHERE (metric_name = 'istio_request_bytes_bucket') AND (timestamp_ms >= 1679500800000) AND (timestamp_ms <= 1679587200000)
GROUP BY
destination_service,
le,
ts
ORDER BY
destination_service ASC,
le ASC,
ts ASC
)
GROUP BY
destination_service,
ts
ORDER BY
destination_service ASC,
ts ASC
17414 rows in set. Elapsed: 40.875 sec. Processed 189.05 million rows, 13.10 GB (4.62 million rows/s., 320.54 MB/s.)
(all services) 7 days - 1 hr query
SELECT
destination_service,
ts,
histogramQuantile(arrayMap(x -> toFloat64(x), groupArray(le)), groupArray(value), 0.9) AS value
FROM
(
SELECT
destination_service,
le,
toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), toIntervalSecond(3600)) AS ts,
sum(value)/3600 AS value
FROM signoz_metrics.distributed_samples_v2
INNER JOIN
(
SELECT
JSONExtractString(labels, 'destination_service') AS destination_service,
JSONExtractString(labels, 'le') AS le,
fingerprint
FROM signoz_metrics.time_series_v2
WHERE metric_name = 'istio_request_bytes_bucket'
) AS filtered_time_series USING (fingerprint)
WHERE (metric_name = 'istio_request_bytes_bucket') AND (timestamp_ms >= 1679500800000) AND (timestamp_ms <= 1680105600000)
GROUP BY
destination_service,
le,
ts
ORDER BY
destination_service ASC,
le ASC,
ts ASC
)
GROUP BY
destination_service,
ts
ORDER BY
destination_service ASC,
ts ASC
9417 rows in set. Elapsed: 96.222 sec. Processed 744.89 million rows, 27.15 GB (7.74 million rows/s., 282.19 MB/s.)
It's not clear what's happening with the 2nd query, but we can notice the difference otherwise.
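As a quick sanity check on the time ranges used in the benchmark queries, this small sketch decodes the millisecond bounds from the WHERE clauses and computes how many intervals each step size yields per series. The helper name `expected_buckets` is made up for illustration.

```python
from datetime import datetime, timezone

def expected_buckets(start_ms, end_ms, step_s):
    """Number of whole step_s-second intervals in [start_ms, end_ms)."""
    return (end_ms - start_ms) // (step_s * 1000)

# Bounds copied from the WHERE clauses of the benchmark queries
START_MS = 1679500800000
ONE_DAY_END_MS = 1679587200000
SEVEN_DAY_END_MS = 1680105600000

start_utc = datetime.fromtimestamp(START_MS / 1000, tz=timezone.utc)
one_day_5min = expected_buckets(START_MS, ONE_DAY_END_MS, 300)
seven_day_1hr = expected_buckets(START_MS, SEVEN_DAY_END_MS, 3600)
print(start_utc.isoformat(), one_day_5min, seven_day_1hr)
```

The 1 day / 5 min figure agrees with the 288 rows returned by the single-service query. The 7 days / 1 hr figure is an upper bound that assumes every hour has samples (the inclusive upper bound in the WHERE clause could add one more interval); the single-service query returned 137 rows, and the thread does not explain the difference.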
I expected more perf improvement. Can you try other common queries and measure performance? E.g., RPS or avg duration with
I think there should be at least service_name in the group by. Here is the RPS table
@ankitnayan please review
temporality with Set(3) index (Unspecified, Delta, Cumulative); add the __temporality__ label since we want different fingerprints for different temporality.
monotonic=true; otherwise Prometheus (receiver) converts them to gauges.
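To illustrate why adding a `__temporality__` label yields a different fingerprint per temporality, here is a minimal sketch that hashes the sorted label set. The `fingerprint` function below is hypothetical and is not the hashing scheme this project actually uses; it only shows that any label-set hash changes once the extra label is present.

```python
import hashlib

def fingerprint(labels):
    """Hypothetical series fingerprint: 64-bit hash of the sorted label pairs."""
    h = hashlib.sha256()
    for name, value in sorted(labels.items()):
        h.update(name.encode())
        h.update(b"\xff")  # separator so ("ab", "c") differs from ("a", "bc")
        h.update(value.encode())
        h.update(b"\xff")
    return int.from_bytes(h.digest()[:8], "big")

base = {"__name__": "signoz_calls_total", "service_name": "frontend"}
delta_fp = fingerprint({**base, "__temporality__": "Delta"})
cumulative_fp = fingerprint({**base, "__temporality__": "Cumulative"})
print(delta_fp != cumulative_fp)
```

With distinct fingerprints, the same metric reported with both temporalities is stored as two separate series, so a filter such as temporality = 'Delta' in the time series subquery selects only the delta one.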