Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(translator): Set statPrefix for HCM and TCPProxy #3728

Merged
merged 29 commits into from
Aug 2, 2024

Conversation

aoledk
Copy link
Contributor

@aoledk aoledk commented Jul 2, 2024

What this PR does / why we need it:

Improve metric observability.

  • set http-{port} for HTTP listener
  • set https-{port} for HTTPS listener
  • set tcp-{port} for TCP listener
  • set tls-passthrough-{port} for TLS passthrough listener
  • set tls-terminate-{port} for TLS terminate listener

Which issue(s) this PR fixes:

xref #3160

Signed-off-by: Dingkang Li <dingkang1743@gmail.com>
@aoledk aoledk requested a review from a team as a code owner July 2, 2024 11:29
Copy link

codecov bot commented Jul 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.59%. Comparing base (f4c53f4) to head (6345734).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3728   +/-   ##
=======================================
  Coverage   67.59%   67.59%           
=======================================
  Files         183      183           
  Lines       22444    22446    +2     
=======================================
+ Hits        15171    15173    +2     
+ Misses       6194     6193    -1     
- Partials     1079     1080    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zirain
Copy link
Contributor

zirain commented Jul 2, 2024

I'm not sure this's a good idea or not, especially put hostname on stats, which means more memory cost.

@aoledk
Copy link
Contributor Author

aoledk commented Jul 3, 2024

I put hostname in statPrefix to allow user to inspect metric of https or terminate listener on specific hostname. This will leading to more memory cost by envoy stats module and bandwidth cost between envoy and prometheus.

  • contour use {http/https/tcp}-{port} as statPrefix 1.
  • emissary use {gatewayNamespace}-{gatewayName}-{listenerIndex} as statPrefix 2.
  • istio can use multiple ways to set statPrefix 3, without hostname.

Maybe we can do similar things as contour, remove {hostname} based on previous proposal.

Footnotes

  1. https://github.com/projectcontour/contour/blob/20d2ed91fdd616e1122a64a6520bdd6e4c7016f5/internal/gatewayapi/listeners.go#L184

  2. https://github.com/emissary-ingress/emissary/blob/5f7ac3008006082e44f602ac048610d298906ed1/pkg/gateway/gw_transforms.go#L31

  3. https://github.com/istio/istio/blob/8f4506bf3d4526d7bee254b9fc390459ba555354/pilot/pkg/networking/core/listener_builder.go#L301

@zirain
Copy link
Contributor

zirain commented Jul 3, 2024

I put hostname in statPrefix to allow user to inspect metric of https or terminate listener on specific hostname. This will leading to more memory cost by envoy stats module and bandwidth cost between envoy and prometheus.

  • contour use {http/https/tcp}-{port} as statPrefix 1.
  • emissary use {gatewayNamespace}-{gatewayName}-{listenerIndex} as statPrefix 2.
  • istio can use multiple ways to set statPrefix 3, without hostname.

Maybe we can do similar things as contour, remove {hostname} based on previous proposal.

Footnotes

  1. https://github.com/projectcontour/contour/blob/20d2ed91fdd616e1122a64a6520bdd6e4c7016f5/internal/gatewayapi/listeners.go#L184
  2. https://github.com/emissary-ingress/emissary/blob/5f7ac3008006082e44f602ac048610d298906ed1/pkg/gateway/gw_transforms.go#L31
  3. https://github.com/istio/istio/blob/8f4506bf3d4526d7bee254b9fc390459ba555354/pilot/pkg/networking/core/listener_builder.go#L301

it should be as concise as possible or user input.

@aoledk
Copy link
Contributor Author

aoledk commented Jul 3, 2024

Then I prefer to {http/https/tcp/passthrough/terminate}-{port}, usually port maybe configured on cloud provider's load balancer target group, what's your option ? @zirain

@zirain
Copy link
Contributor

zirain commented Jul 3, 2024

Then I prefer to {http/https/tcp/passthrough/terminate}-{port}, usually port maybe configured on cloud provider's load balancer target group, what's your option ? @zirain

looks better, what's the different between https/passthrough/terminate?

@aoledk
Copy link
Contributor Author

aoledk commented Jul 3, 2024

http / tcp / https / passthrough / terminate are current statPrefix for HCM or TCPProxy:

  • https for HTTPS listener with TLS certificates.
  • passthrough for TLS listener without TLS certificates, just to inspect SNI for routing.
  • terminate for TLS listener with TLS certificates.

aoledk added 3 commits July 3, 2024 18:17
Signed-off-by: Dingkang Li <dingkang1743@gmail.com>
Signed-off-by: Dingkang Li <dingkang1743@gmail.com>
Signed-off-by: Dingkang Li <dingkang1743@gmail.com>
@zirain
Copy link
Contributor

zirain commented Jul 3, 2024

http / tcp / https / passthrough / terminate are current statPrefix for HCM or TCPProxy:

  • https for HTTPS listener with TLS certificates.
  • passthrough for TLS listener without TLS certificates, just to inspect SNI for routing.
  • terminate for TLS listener with TLS certificates.

what's the different between HTTPS and TLS?

@zhaohuabing
Copy link
Member

zhaohuabing commented Jul 3, 2024

http / tcp / https / passthrough / terminate are current statPrefix for HCM or TCPProxy:

  • https for HTTPS listener with TLS certificates.
  • passthrough for TLS listener without TLS certificates, just to inspect SNI for routing.
  • terminate for TLS listener with TLS certificates.

what's the different between HTTPS and TLS?

I think the alpns are HTTP vs others. These terminologies are quite confusing, I need to think a moment every time I see them.

These names could be more explicit: (plain)http/https(-terminate)/(plain)tcp/tls-passthrought/tls-terminate

@aoledk
Copy link
Contributor Author

aoledk commented Jul 3, 2024

https means it's a Gateway API listener whose protocol is HTTPS, will be HCM eventually. tls means it's a Gateway API listener whose protocol is TLS, will be TCPProxy eventually.

@aoledk
Copy link
Contributor Author

aoledk commented Jul 3, 2024

http / tcp / https / passthrough / terminate are current statPrefix for HCM or TCPProxy:

  • https for HTTPS listener with TLS certificates.
  • passthrough for TLS listener without TLS certificates, just to inspect SNI for routing.
  • terminate for TLS listener with TLS certificates.

what's the different between HTTPS and TLS?

I think the alpns are HTTP vs others. These terminologies are quite confusing, I need to think a moment every time I see them.

These names could be more explicit: (plain)http/https(-terminate)/(plain)tcp/tls-passthrought/tls-terminate

Leading with tls prefix for passthrough and terminate is a good idea.

@zhaohuabing
Copy link
Member

zhaohuabing commented Jul 5, 2024

@aoledk In the example of this issue #3160 , port is part of the envoy_listener_address lable of the metric. Do all the metrics have this label? If that's the case, we don't need to add the port to the stats prefix.

envoy_listener_http_downstream_rq_completed{envoy_http_conn_manager_prefix="https",envoy_listener_address="0.0.0.0_8080"} 0

@aoledk
Copy link
Contributor Author

aoledk commented Jul 5, 2024

@aoledk In the example of this issue #3160 , port is part of the envoy_listener_address lable of the metric. Do all the metrics have this label? If that's the case, we don't need to add the port to the stats prefix.

envoy_listener_http_downstream_rq_completed{envoy_http_conn_manager_prefix="https",envoy_listener_address="0.0.0.0_8080"} 0

Only Per listener statistics of HCM will have envoy_listener_address tag and it just covers very small part of all metrics of HCM. TCPProxy doesn't have Per listener statistics like HCM.

So IMO port should be added to statPrefix.

zhaohuabing
zhaohuabing previously approved these changes Jul 5, 2024
Copy link
Member

@zhaohuabing zhaohuabing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for adding the "tls-" prefix to the tls passthrough and tls terminate listeners, which makes the stat prefix easier to understand.

@guydc
Copy link
Contributor

guydc commented Jul 11, 2024

@aoledk - can you regenerate and update the PR description?

Signed-off-by: Dingkang Li <dingkang1743@gmail.com>
@aoledk aoledk force-pushed the set-statprefix-for-hcm-and-tcpproxy branch from 03fca6d to 7dbd2c4 Compare July 12, 2024 02:54
@aoledk
Copy link
Contributor Author

aoledk commented Jul 12, 2024

@aoledk - can you regenerate and update the PR description?

Done.

@aoledk aoledk requested a review from zhaohuabing July 12, 2024 03:10
zhaohuabing
zhaohuabing previously approved these changes Jul 12, 2024
zhaohuabing
zhaohuabing previously approved these changes Jul 24, 2024
@arkodg
Copy link
Contributor

arkodg commented Jul 31, 2024

can you check why tests are failing @aoledk ?
also would be great if you can share what the metrics looked like now vs earlier

@aoledk
Copy link
Contributor Author

aoledk commented Aug 1, 2024

can you check why tests are failing @aoledk ? also would be great if you can share what the metrics looked like now vs earlier

previous looks like

envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix="admin"} 19
envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix="eg-ready-http"} 45
envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix="http-10080"} 0

now looks like

envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix="admin"} 19
envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix="eg-ready-http"} 45
envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix="http"} 0

@arkodg
Copy link
Contributor

arkodg commented Aug 1, 2024

thanks for fixing the e2e test @aoledk, does this change impact the grafana dashboards as well https://gateway.envoyproxy.io/docs/tasks/observability/grafana-integration/
cc @shawnh2

@zirain
Copy link
Contributor

zirain commented Aug 2, 2024

/retest

@aoledk
Copy link
Contributor Author

aoledk commented Aug 2, 2024

/retest

Copy link
Contributor

@zirain zirain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for adding this.

@arkodg arkodg merged commit a43cc6c into envoyproxy:main Aug 2, 2024
24 checks passed
@aoledk aoledk deleted the set-statprefix-for-hcm-and-tcpproxy branch August 5, 2024 01:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants