Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data source HTTP metrics #1130

Merged
merged 10 commits into from
Nov 5, 2024
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions backend/admission.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,15 +3,16 @@ package backend
import (
"context"

ctxHelpers "github.com/grafana/grafana-plugin-sdk-go/backend/context"
Copy link
Member

@wbrowne wbrowne Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I missed this somewhere, but why the need for a new package for this? Is it due to requiring the backend package in httpclient?

Yes this same old pesky problem. I recommend we find a better name for this just so we don't always have to alias the import. Though I dislike utility/helper packages, maybe ctxutil? In the meantime I might do a quick look to see how much work it would be to shift some things around to avoid this entirely

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like it, yes, a cyclic dependency because of this line:

		labels["endpoint"] = ctxHelpers.EndpointFromContext(ctx).String()

My suggestion to avoid the breaking change:

  • Create a package under backend/internal/endpointctx and expose EndpointCtxKeyType and EndpointCtxKey. Since it's within an internal folder, it won't be exposed outside of the repository.
  • Change backend/endpoint.go to use the above.
  • Change backend/httpclient/datasource_metrics_middleware.go and manually read the exposed key. Something like:
		labels["endpoint"] = ""
		if ep := ctx.Value(endpointctx.EndpointCtxKey); ep != nil {
			labels["endpoint"] = ep.(string)
		}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good alternative, I will implement and request a re-review 😊

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @andresmgot - I think that'll do well for now. Ket's keep an eye on if this becomes more common, then we may look at a broader change to these packages perhaps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented changes 😄

"github.com/grafana/grafana-plugin-sdk-go/genproto/pluginv2"
)

const (
// EndpointValidateAdmission friendly name for the validate admission endpoint/handler.
EndpointValidateAdmission Endpoint = "validateAdmission"
EndpointValidateAdmission ctxHelpers.Endpoint = "validateAdmission"

// EndpointMutateAdmission friendly name for the mutate admission endpoint/handler.
EndpointMutateAdmission Endpoint = "mutateAdmission"
EndpointMutateAdmission ctxHelpers.Endpoint = "mutateAdmission"
)

// AdmissionHandler is an EXPERIMENTAL service that allows checking objects before they are saved
Expand Down
File renamed without changes.
File renamed without changes.
4 changes: 3 additions & 1 deletion backend/conversion.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,13 @@ package backend

import (
"context"

pCtx "github.com/grafana/grafana-plugin-sdk-go/backend/context"
)

const (
// EndpointConvertObjects friendly name for the convert objects endpoint/handler.
EndpointConvertObjects Endpoint = "convertObjects"
EndpointConvertObjects pCtx.Endpoint = "convertObjects"
)

// ConversionHandler is an EXPERIMENTAL service that allows converting objects between versions
Expand Down
3 changes: 2 additions & 1 deletion backend/data.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,12 +7,13 @@ import (
"net/http"
"time"

ctxHelpers "github.com/grafana/grafana-plugin-sdk-go/backend/context"
"github.com/grafana/grafana-plugin-sdk-go/data"
jsoniter "github.com/json-iterator/go"
)

// EndpointQueryData friendly name for the query data endpoint/handler.
const EndpointQueryData Endpoint = "queryData"
const EndpointQueryData ctxHelpers.Endpoint = "queryData"

// QueryDataHandler handles data queries.
type QueryDataHandler interface {
Expand Down
6 changes: 4 additions & 2 deletions backend/diagnostics.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,16 @@ import (
"context"
"net/http"
"strconv"

ctxHelpers "github.com/grafana/grafana-plugin-sdk-go/backend/context"
)

const (
// EndpointCheckHealth friendly name for the check health endpoint/handler.
EndpointCheckHealth Endpoint = "checkHealth"
EndpointCheckHealth ctxHelpers.Endpoint = "checkHealth"

// EndpointCollectMetrics friendly name for the collect metrics endpoint/handler.
EndpointCollectMetrics Endpoint = "collectMetrics"
EndpointCollectMetrics ctxHelpers.Endpoint = "collectMetrics"
)

// CheckHealthHandler enables users to send health check
Expand Down
6 changes: 4 additions & 2 deletions backend/handler_middleware.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ import (
"context"
"errors"
"slices"

ctxHelpers "github.com/grafana/grafana-plugin-sdk-go/backend/context"
)

var (
Expand Down Expand Up @@ -46,9 +48,9 @@ func HandlerFromMiddlewares(finalHandler Handler, middlewares ...HandlerMiddlewa
}, nil
}

func (h *MiddlewareHandler) setupContext(ctx context.Context, pluginCtx PluginContext, endpoint Endpoint) context.Context {
func (h *MiddlewareHandler) setupContext(ctx context.Context, pluginCtx PluginContext, endpoint ctxHelpers.Endpoint) context.Context {
ctx = initErrorSource(ctx)
ctx = WithEndpoint(ctx, endpoint)
ctx = ctxHelpers.WithEndpoint(ctx, endpoint)
ctx = WithPluginContext(ctx, pluginCtx)
ctx = WithGrafanaConfig(ctx, pluginCtx.GrafanaConfig)
ctx = WithUser(ctx, pluginCtx.User)
Expand Down
159 changes: 159 additions & 0 deletions backend/httpclient/datasource_metrics_middleware.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,159 @@
package httpclient

import (
"errors"
"fmt"
"net/http"
"strconv"
"strings"
"time"

ctxHelpers "github.com/grafana/grafana-plugin-sdk-go/backend/context"
"github.com/grafana/grafana-plugin-sdk-go/backend/log"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
"github.com/prometheus/client_golang/prometheus/promhttp"
)

const kb = 1024
const mb = kb * kb
const gb = mb * kb

var (
datasourceRequestCounter = promauto.NewCounterVec(
prometheus.CounterOpts{
Namespace: "plugins",
Name: "datasource_request_total",
Help: "A counter for outgoing requests for an external data source",
},
[]string{"datasource", "datasource_type", "code", "method", "secure_socks_ds_proxy_enabled", "endpoint"},
)

datasourceRequestHistogram = promauto.NewHistogramVec(
prometheus.HistogramOpts{
Namespace: "plugins",
Name: "datasource_request_duration_seconds",
Help: "histogram of durations of outgoing external data source requests sent from Grafana",
Buckets: []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10, 25, 50, 100},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to consider native histogram here as well if it works for datasource_response_size_bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image
This doesn't look like a native histogram response :/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about

histogram_quantile(0.99, sum(rate(plugins_datasource_response_size_bytes{datasource_type="$plugin"}[$__rate_interval])) by (le))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope 😞 definitely shows native histograms aren't supported atm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created an issue to investigate this further #1132

FYI @wbrowne

}, []string{"datasource", "datasource_type", "code", "method", "secure_socks_ds_proxy_enabled", "endpoint"},
)

datasourceResponseHistogram = promauto.NewHistogramVec(
prometheus.HistogramOpts{
Namespace: "plugins",
Name: "datasource_response_size_bytes",
Help: "histogram of external data source response sizes returned to Grafana",
Buckets: []float64{128, 256, 512, 1 * kb, 2 * kb, 4 * kb, 8 * kb, 16 * kb, 32 * kb, 64 * kb, 128 * kb, 256 * kb, 512 * kb, 1 * mb,
2 * mb, 4 * mb, 8 * mb, 16 * mb, 32 * mb, 64 * mb, 128 * mb, 256 * mb, 512 * mb, 1 * gb,
2 * gb, 4 * gb, 8 * gb},
NativeHistogramBucketFactor: 1.1,
NativeHistogramMaxBucketNumber: 100,
NativeHistogramMinResetDuration: time.Hour,
Comment on lines +49 to +51
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to verify this works as expected for an external datasource plugin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this doesn't look like it works atm, maybe best to remove?

}, []string{"datasource", "datasource_type", "secure_socks_ds_proxy_enabled", "endpoint"},
)

datasourceRequestsInFlight = promauto.NewGaugeVec(
prometheus.GaugeOpts{
Namespace: "plugins",
Name: "datasource_request_in_flight",
Help: "A gauge of outgoing external data source requests currently being sent by Grafana",
},
[]string{"datasource", "datasource_type", "secure_socks_ds_proxy_enabled", "endpoint"},
)
)

// sanitizeLabelName removes all invalid chars from the label name.
// If the label name is empty or contains only invalid chars, it
// will return an error.
func sanitizeLabelName(name string) (string, error) {
if len(name) == 0 {
return "", errors.New("label name cannot be empty")
}

out := strings.Builder{}
for i, b := range name {
if (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') || b == '_' || (b >= '0' && b <= '9' && i > 0) {
out.WriteRune(b)
} else if b == ' ' {
out.WriteRune('_')
}
}

if out.Len() == 0 {
return "", fmt.Errorf("label name only contains invalid chars: %q", name)
}

return out.String(), nil
}

const DataSourceMetricsMiddlewareName = "datasource_metrics"

var executeMiddlewareFunc = executeMiddleware
aangelisc marked this conversation as resolved.
Show resolved Hide resolved

func DataSourceMetricsMiddleware() Middleware {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if this could be re-used by Grafana itself here https://github.com/grafana/grafana/blob/f062c66f89b09fca7716438ac488a76bf81ed7f8/pkg/infra/httpclient/httpclientprovider/http_client_provider.go#L28, perhaps a follow up. But then it would require an option to configure the namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed although I'm not sure how to change the namespace without m moving the creation of the metrics into a function which then causes duplicate register errors.

return NamedMiddlewareFunc(DataSourceMetricsMiddlewareName, func(opts Options, next http.RoundTripper) http.RoundTripper {
if opts.Labels == nil {
return next
}

datasourceName, exists := opts.Labels["datasource_name"]
if !exists {
return next
}

datasourceLabelName, err := sanitizeLabelName(datasourceName)
// if the datasource named cannot be turned into a prometheus
// label we will skip instrumenting these metrics.
if err != nil {
log.DefaultLogger.Error("failed to sanitize datasource name", "error", err)
return next
aangelisc marked this conversation as resolved.
Show resolved Hide resolved
}

datasourceType, exists := opts.Labels["datasource_type"]
if !exists {
return next
}
datasourceLabelType, err := sanitizeLabelName(datasourceType)
// if the datasource type cannot be turned into a prometheus
// label we will skip instrumenting these metrics.
if err != nil {
log.DefaultLogger.Error("failed to sanitize datasource type", "error", err)
return next
}

labels := prometheus.Labels{
"datasource": datasourceLabelName,
"datasource_type": datasourceLabelType,
"secure_socks_ds_proxy_enabled": strconv.FormatBool(opts.ProxyOptions != nil && opts.ProxyOptions.Enabled),
}

return executeMiddlewareFunc(next, labels)
})
}

func executeMiddleware(next http.RoundTripper, labels prometheus.Labels) http.RoundTripper {
return RoundTripperFunc(func(r *http.Request) (*http.Response, error) {
ctx := r.Context()
labels["endpoint"] = ctxHelpers.EndpointFromContext(ctx).String()
requestCounter := datasourceRequestCounter.MustCurryWith(labels)
requestHistogram := datasourceRequestHistogram.MustCurryWith(labels)
requestInFlight := datasourceRequestsInFlight.With(labels)
responseSizeHistogram := datasourceResponseHistogram.With(labels)

res, err := promhttp.InstrumentRoundTripperDuration(requestHistogram,
promhttp.InstrumentRoundTripperCounter(requestCounter,
promhttp.InstrumentRoundTripperInFlight(requestInFlight, next))).
RoundTrip(r)
if err != nil {
return nil, err
}

if res != nil && res.StatusCode != http.StatusSwitchingProtocols {
aangelisc marked this conversation as resolved.
Show resolved Hide resolved
res.Body = CountBytesReader(res.Body, func(bytesRead int64) {
responseSizeHistogram.Observe(float64(bytesRead))
})
}

return res, nil
})
}
Loading
Loading