[ffresty] [metric] HTTP Response Time and Complete Gauge Support #160

onelapahead · 2024-12-12T03:37:11Z

For Resty, we enhance the existing before/after hooks so that we record the elapsed time on responses. We also use more labels and smarter contexts to consistently derive request metadata such as the host.
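As a rough illustration of the before/after timing idea, here is a minimal sketch that uses only the standard library rather than Resty. The `recorder`, `timingTransport`, `stubTransport` and `hostLabel` names are hypothetical and are not taken from this PR; the stub transport answers locally so no network is needed.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
	"sync"
	"time"
)

// observation is one recorded response time together with its labels.
type observation struct {
	Host, Method, Status string
	Elapsed              time.Duration
}

// recorder is a hypothetical in-memory stand-in for a metrics manager.
type recorder struct {
	mu  sync.Mutex
	obs []observation
}

func (r *recorder) observe(o observation) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.obs = append(r.obs, o)
}

// stubTransport answers every request locally so the sketch needs no network.
type stubTransport struct{ status int }

func (s stubTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	time.Sleep(time.Millisecond) // simulate some latency
	return &http.Response{StatusCode: s.status, Body: http.NoBody, Request: req}, nil
}

// timingTransport plays the role of the before/after hooks: it notes the
// start time before the request and records the elapsed time afterwards.
type timingTransport struct {
	next http.RoundTripper
	rec  *recorder
}

func (t *timingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	start := time.Now() // "before" hook
	res, err := t.next.RoundTrip(req)
	if err != nil {
		return nil, err // a network error would be counted separately
	}
	// "after" hook: derive labels from the request and record the duration.
	t.rec.observe(observation{
		Host:    hostLabel(req.URL),
		Method:  req.Method,
		Status:  strconv.Itoa(res.StatusCode),
		Elapsed: time.Since(start),
	})
	return res, nil
}

// hostLabel falls back to "unknown" when no host can be derived.
func hostLabel(u *url.URL) string {
	if u == nil || u.Host == "" {
		return "unknown"
	}
	return u.Host
}

// timeOneRequest sends one GET through the timing transport and returns
// the observation that was recorded for it.
func timeOneRequest() (observation, error) {
	rec := &recorder{}
	tt := &timingTransport{next: stubTransport{status: http.StatusNoContent}, rec: rec}
	req, err := http.NewRequest(http.MethodGet, "http://example.com/api", nil)
	if err != nil {
		return observation{}, err
	}
	if _, err := tt.RoundTrip(req); err != nil {
		return observation{}, err
	}
	return rec.obs[0], nil
}

func main() {
	o, err := timeOneRequest()
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s -> %s in %s\n", o.Method, o.Host, o.Status, o.Elapsed)
}
```

In a real client the observation would feed a summary or histogram keyed by the same labels instead of an in-memory slice.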

Then, within the metric manager itself, we previously offered no way to increment or decrement a gauge, which is often more useful than calling set.
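To show why inc/dec can be more convenient than set, here is a small sketch that tracks in-flight work. The `gauge` type below is a hypothetical stand-in, not the metric manager's interface; with only `Set`, each caller would first need to know the current value.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// gauge is a hypothetical minimal gauge used only for this illustration.
type gauge struct{ v int64 }

func (g *gauge) Set(n int64)  { atomic.StoreInt64(&g.v, n) }
func (g *gauge) Inc()         { atomic.AddInt64(&g.v, 1) }
func (g *gauge) Dec()         { atomic.AddInt64(&g.v, -1) }
func (g *gauge) Value() int64 { return atomic.LoadInt64(&g.v) }

// inFlightDuring starts n workers that each Inc on entry and Dec on exit.
// It returns the gauge value while all workers are running and after they
// have all finished.
func inFlightDuring(g *gauge, n int) (during, after int64) {
	var started, done sync.WaitGroup
	release := make(chan struct{})
	for i := 0; i < n; i++ {
		started.Add(1)
		done.Add(1)
		go func() {
			defer done.Done() // runs last
			g.Inc()
			defer g.Dec() // runs before done.Done
			started.Done()
			<-release // hold the worker "in flight"
		}()
	}
	started.Wait()
	during = g.Value()
	close(release)
	done.Wait()
	after = g.Value()
	return during, after
}

func main() {
	g := &gauge{}
	during, after := inFlightDuring(g, 5)
	fmt.Printf("in flight: %d, after: %d\n", during, after)
}
```

No worker needs to read the gauge before updating it, which is the main convenience over set.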

pkg/ffresty/ffresty.go

firefly-common.iml

Chengxuan · 2024-12-18T10:26:32Z

pkg/ffresty/ffresty.go

-		metricsManager.NewCounterMetricWithLabels(ctx, "network_error", "Network error", []string{"host", "method"}, false)
+		metricsManager.NewCounterMetricWithLabels(ctx, metricsHTTPResponsesTotal, "HTTP response", []string{"status", "error", "host", "method"}, false)
+		metricsManager.NewCounterMetricWithLabels(ctx, metricsNetworkErrorsTotal, "Network error", []string{"host", "method"}, false)
+		metricsManager.NewSummaryMetricWithLabels(ctx, metricsHTTPResponseTime, "HTTP response time", []string{"status", "host", "method"}, false)


@onelapahead what was your consideration in choosing between Histogram and Summary for this timing metric?

Histogram is more costly in terms of cardinality because of all the buckets you make. A summary felt light enough, since it only needs to calculate average response time across a few basic dimensions.

Would love to toggle between histograms and summaries based on certain settings eventually, but that was a bigger change than I had an appetite for.
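The cardinality argument can be made concrete with some arithmetic. This sketch assumes a summary with no quantile objectives (only `_sum` and `_count`) and a histogram that exposes one series per configured bucket plus `le="+Inf"`, `_sum` and `_count`; the label counts and the bucket count of 11 are illustrative, not measured from this PR.

```go
package main

import "fmt"

// summarySeries: a summary without quantile objectives exposes
// _sum and _count for every label combination.
func summarySeries(labelSets int) int { return labelSets * 2 }

// histogramSeries: one series per configured bucket, plus the le="+Inf"
// bucket, _sum and _count, for every label combination.
func histogramSeries(labelSets, buckets int) int { return labelSets * (buckets + 3) }

func main() {
	labelSets := 3 * 2 * 2 // hypothetical: 3 statuses, 2 hosts, 2 methods
	fmt.Println("summary series:  ", summarySeries(labelSets))
	fmt.Println("histogram series:", histogramSeries(labelSets, 11))
}
```

Under these assumptions the histogram produces seven times as many series as the summary for the same labels.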

Chengxuan · 2024-12-18T10:27:55Z

pkg/ffresty/ffresty.go

+const (
+	metricsHTTPResponsesTotal = "http_responses_total"
+	metricsNetworkErrorsTotal = "network_errors_total"
+	metricsHTTPResponseTime   = "http_response_time_seconds"


Some size metrics for data transmitted would be useful to add as well.

Chengxuan

That is a great step forward, thanks @onelapahead.

I asked a question about the metrics type for the response time. Switching to histogram will have the following benefits, which could be compelling:

- reduce observation costs on the client
- enable aggregation across instances

(from https://prometheus.io/docs/practices/histograms/)

onelapahead · 2024-12-19T18:10:02Z

> enable aggregation across instances

To clarify - this is still entirely possible with summaries. They are similar to a histogram but expose fewer series:

# summary example
http_response_time_sum{}
http_response_time_count{}

# histogram example
http_response_time_bucket{le="1.0"}
http_response_time_bucket{le="0.5"}
http_response_time_bucket{le="0.1"}
http_response_time_sum{}
http_response_time_count{}

Histograms are great/preferred for aggregate percentiles and therefore performance engineering. Summaries are great for aggregated averages and therefore reliability engineering.
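To illustrate how averages stay aggregatable, here is a sketch that combines `_sum` and `_count` from several instances. The `instance` type and the sample numbers are made up for illustration; the same idea is usually expressed in PromQL by dividing the summed `_sum` rate by the summed `_count` rate.

```go
package main

import "fmt"

// instance holds the _sum and _count exposed by one process (hypothetical data).
type instance struct {
	Sum   float64 // total seconds observed
	Count uint64  // number of observations
}

// aggregateMean adds sums and counts across instances before dividing,
// which weights each instance by how many requests it served.
func aggregateMean(instances []instance) float64 {
	var sum float64
	var count uint64
	for _, in := range instances {
		sum += in.Sum
		count += in.Count
	}
	if count == 0 {
		return 0
	}
	return sum / float64(count)
}

// meanOfMeans is shown for contrast: averaging per-instance averages
// ignores traffic volume and gives a skewed answer.
func meanOfMeans(instances []instance) float64 {
	var total float64
	var n int
	for _, in := range instances {
		if in.Count == 0 {
			continue
		}
		total += in.Sum / float64(in.Count)
		n++
	}
	if n == 0 {
		return 0
	}
	return total / float64(n)
}

func main() {
	data := []instance{{Sum: 2, Count: 12}, {Sum: 6, Count: 4}}
	fmt.Println("aggregated mean:", aggregateMean(data))
	fmt.Println("mean of means:  ", meanOfMeans(data))
}
```

Pre-computed quantiles have no equivalent of this step, which is why they cannot be combined across instances.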

Quantiles are a special flavor of summary which, to your point, are basically worthless at any real scale because they cannot be aggregated. However, they cost less in cardinality than histograms, so they are useful if you care about the performance of an individual "thing".

Do hope ff-common can offer histograms as an option, but not by default since they are quite expensive. But will save that for a future contributor.

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

EnriqueL8

This looks good @onelapahead

Getting this error from the build: pkg/ffresty/ffresty.go:201:1: cyclomatic complexity 37 of func NewWithConfig is high (> 30) (gocyclo). Given this function is mostly doing configuration, we can probably ignore this complexity warning by adding the ignore comment. Or, if you are up for it, we can break it down.

We could move the onBeforeRequest registration into a separate function, but it might be messy.
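The two options could look roughly like the sketch below. It assumes the linter runs through golangci-lint, whose `//nolint:gocyclo` directive suppresses the warning for one declaration; the `client` type and the lowercase `newWithConfig` and `registerMetricHooks` functions are hypothetical and much simpler than the real code.

```go
package main

import "fmt"

// client is a hypothetical stand-in for the HTTP client being configured.
type client struct {
	before []func(method string)
	after  []func(status int)
}

// registerMetricHooks is option 2: a self-contained group of branches moved
// into a helper, which lowers the complexity counted against the caller.
func registerMetricHooks(c *client) {
	c.before = append(c.before, func(method string) {
		fmt.Println("before:", method)
	})
	c.after = append(c.after, func(status int) {
		fmt.Println("after:", status)
	})
}

// newWithConfig shows option 1 as well: keep one large function and silence
// the linter for this declaration only.
//
//nolint:gocyclo // mostly straight-line configuration
func newWithConfig(metricsEnabled bool) *client {
	c := &client{}
	// ... many independent configuration branches would live here ...
	if metricsEnabled {
		registerMetricHooks(c)
	}
	return c
}

func main() {
	c := newWithConfig(true)
	for _, hook := range c.before {
		hook("GET")
	}
	for _, hook := range c.after {
		hook(200)
	}
}
```

Either option alone would be enough; they are shown together only to compare them side by side.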

onelapahead changed the title ~~[ffresty] [metric] Richer HTTP Client Metrics and Complete Gauge Support~~ [ffresty] [metric] HTTP Response Time and Complete Gauge Support Dec 12, 2024

onelapahead marked this pull request as ready for review December 16, 2024 12:53

Chengxuan reviewed Dec 18, 2024

onelapahead added 13 commits December 19, 2024 21:51

[metric] Increment and Decerement for Gauges

9b99686

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

updating interface

fd3c3db

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

response time and request body size metrics

f1ada49

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

removing request body size

29bfa10

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

fix linter and inefficencies

f6fc2ac

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

fix response time metric

c5821a0

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

fix the linter

4efbf8c

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

simplify elapsed calculations using time utils

fb8c447

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

remove ff-common idea

27ea4fe

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

gitignore for intellij

1b2e68b

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

unit tests

4ac2fee

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

fix test and making host parsing logic more robust

052c797

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

other edge case of unknown host

8180cc5

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

onelapahead force-pushed the metric-gauge-incdec branch from 3d55fdd to 8180cc5 Compare December 20, 2024 02:59

fix merge

1d3ce59

Signed-off-by: hfuss <hayden.fuss@kaleido.io>

Chengxuan approved these changes Jan 6, 2025

EnriqueL8 reviewed Jan 6, 2025
