
TLS terminated listener + proxy protocol conflict #3188

Closed
bhamon opened this issue Apr 14, 2024 · 10 comments

@bhamon

bhamon commented Apr 14, 2024

Description:

I want to expose a TCP service behind a load balancer with TLS + proxy protocol support.

I've tested a TLS-only setup first:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  gatewayClassName: gateway
  listeners:
    - name: mail-sieve
      hostname: sieve.<REDACTED>
      port: 4190
      protocol: TLS
      tls:
        certificateRefs:
          - name: sieve-<REDACTED>
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: mail-sieve
spec:
  parentRefs:
    - name: gateway
      sectionName: mail-sieve
  rules:
    - backendRefs:
        - name: mail
          port: 4190

I can reach my service through the load balancer external IP with openssl s_client.
The TLS session is properly terminated by envoy and the raw TCP stream is sent to my upstream service.

I then tested a proxy protocol only setup:

# Modified gateway listener (TLS protocol switched to TCP)
    - name: mail-sieve
      port: 4190
      protocol: TCP
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: proxy
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
  enableProxyProtocol: true
---
# Same TCPRoute as before
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: sieve
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: TCPRoute
    name: mail-sieve
  proxyProtocol:
    version: V2

I have set a provider-specific annotation on the envoy service to activate proxy protocol on the load balancer (through an EnvoyProxy CR properly registered with a parametersRef on the gateway class).
I modified my service config to support incoming proxy protocol.
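
For reference, a rough sketch of that wiring looks like this (the annotation key/value and the EnvoyProxy name/namespace below are placeholders for my actual values):

apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: gateway
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: proxy-config
    namespace: envoy-gateway-system
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: proxy-config
  namespace: envoy-gateway-system
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyService:
        annotations:
          # Placeholder: substitute the proxy protocol annotation
          # documented by your load balancer provider.
          example.com/proxy-protocol: "true"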

With this setup I can also reach my service (with netcat).

I then tried a TLS + proxy protocol setup:

  • Gateway listener from the first version (TLS only)
  • ClientTrafficPolicy from the second version (proxy protocol only)
  • BackendTrafficPolicy from the second version (proxy protocol only)
  • Proxy protocol annotation added to the load balancer
  • Proxy protocol support added to the service config

And it doesn't work. A call to openssl s_client fails:

CONNECTED(00000003)
40E752D5A6700000:error:0A00042E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version:../ssl/record/rec_layer_s3.c:1584:SSL alert number 70
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 7 bytes and written 323 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)

I activated the debug logs of envoy and got the following trace:

[2024-04-13 20:17:48.129][15][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:243] [Tags: "ConnectionId":"31"] new tcp proxy session
[2024-04-13 20:17:48.129][15][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:436] [Tags: "ConnectionId":"31"] Creating connection to cluster tcproute/mail/mail-sieve/rule/-1
[2024-04-13 20:17:48.129][15][debug][misc] [source/common/upstream/cluster_manager_impl.cc:2328] Allocating TCP conn pool
[2024-04-13 20:17:48.129][15][debug][pool] [source/common/conn_pool/conn_pool_base.cc:291] trying to create new connection
[2024-04-13 20:17:48.129][15][debug][pool] [source/common/conn_pool/conn_pool_base.cc:145] creating a new connection (connecting=0)
[2024-04-13 20:17:48.129][15][debug][connection] [source/common/network/connection_impl.cc:1009] [Tags: "ConnectionId":"32"] connecting to 100.64.0.203:14190
[2024-04-13 20:17:48.129][15][debug][connection] [source/common/network/connection_impl.cc:1028] [Tags: "ConnectionId":"32"] connection in progress
[2024-04-13 20:17:48.129][15][debug][conn_handler] [source/common/listener_manager/active_tcp_listener.cc:160] [Tags: "ConnectionId":"31"] new connection from 172.16.4.7:50004
[2024-04-13 20:17:48.129][15][debug][connection] [source/extensions/transport_sockets/tls/ssl_socket.cc:241] [Tags: "ConnectionId":"31"] remote address:172.16.4.7:50004,TLS_error:|268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER:TLS_error_end
[2024-04-13 20:17:48.129][15][debug][connection] [source/common/network/connection_impl.cc:278] [Tags: "ConnectionId":"31"] closing socket: 0
[2024-04-13 20:17:48.129][15][debug][connection] [source/extensions/transport_sockets/tls/ssl_socket.cc:241] [Tags: "ConnectionId":"31"] remote address:172.16.4.7:50004,TLS_error:|268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER:TLS_error_end:TLS_error_end
[2024-04-13 20:17:48.129][15][debug][pool] [source/common/conn_pool/conn_pool_base.cc:670] cancelling pending stream
[2024-04-13 20:17:48.129][15][debug][connection] [source/common/network/connection_impl.cc:146] [Tags: "ConnectionId":"32"] closing data_to_write=0 type=1
[2024-04-13 20:17:48.129][15][debug][connection] [source/common/network/connection_impl.cc:278] [Tags: "ConnectionId":"32"] closing socket: 1
[2024-04-13 20:17:48.129][15][debug][pool] [source/common/conn_pool/conn_pool_base.cc:495] [Tags: "ConnectionId":"32"] client disconnected, failure reason: 
[2024-04-13 20:17:48.129][15][debug][pool] [source/common/conn_pool/conn_pool_base.cc:463] invoking 1 idle callback(s) - is_draining_for_deletion_=false
[2024-04-13 20:17:48.129][15][debug][pool] [source/common/conn_pool/conn_pool_base.cc:463] invoking 0 idle callback(s) - is_draining_for_deletion_=false
[2024-04-13 20:17:48.129][15][debug][conn_handler] [source/common/listener_manager/active_stream_listener_base.cc:136] [Tags: "ConnectionId":"31"] adding to cleanup list

It seems like a filter order issue to me:

  • OpenSSL client connects to the load balancer.
  • Load balancer connects to the envoy proxy.
  • Load balancer prepends the proxy protocol header to the stream.
  • On the envoy side, a TLS handshake is expected first, but the proxy protocol header arrives before it, so the TLS handshake fails.

The proxy protocol header should be consumed before the TLS handshake starts.
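
For illustration only (hand-written, not what Envoy Gateway generates), the fix on the Envoy side corresponds to adding the proxy_protocol listener filter so the PROXY header is consumed before any bytes reach the TLS transport socket; names below are placeholders and the TLS/cluster details are omitted:

# Illustrative Envoy listener snippet, not generated config.
static_resources:
  listeners:
    - name: mail-sieve
      address:
        socket_address: { address: 0.0.0.0, port_value: 4190 }
      listener_filters:
        # Strips the PROXY protocol header first, so the TLS transport
        # socket sees the ClientHello as the first bytes of the stream.
        - name: envoy.filters.listener.proxy_protocol
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.listener.proxy_protocol.v3.ProxyProtocol
      filter_chains:
        - transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              # common_tls_context with the server certificate omitted for brevity.
          filters:
            - name: envoy.filters.network.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: mail_sieve
                cluster: mail-sieve   # cluster definition omitted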

Environment:

Envoy Gateway: 1.0.1

@bhamon bhamon added the triage label Apr 14, 2024
@liorokman
Contributor

I think that ClientTrafficPolicy only affects HTTPRoutes and GRPCRoutes.

There's a comment in the code that says that supporting ClientTrafficPolicy for TCP and TLS routes required #1635 to be merged first.

@bhamon
Author

bhamon commented Apr 14, 2024

Don't ClientTrafficPolicy CRs only affect downstream communication (between the load balancer and the envoy proxy)?

Moreover, in my proxy-protocol-without-TLS test it worked fine, with a gateway-wide ClientTrafficPolicy (I'm waiting for #3163 for a per-listener proxy protocol strategy) and a BackendTrafficPolicy attached to the TCPRoute.

@liorokman
Contributor

Don't ClientTrafficPolicy CRs only affect downstream communication (between the load balancer and the envoy proxy)?

The ClientTrafficPolicy CRs are interpreted internally by Envoy Gateway only for HTTP-based listeners attached to the Gateway.

Moreover, in my proxy-protocol-without-TLS test it worked fine, with a gateway-wide ClientTrafficPolicy,

If I had to guess: since you didn't configure TLS, the server was probably tolerant enough to ignore the proxy protocol's prefixed data.

@arkodg
Contributor

arkodg commented Apr 14, 2024

#3163 should fix this. BTP support for TCPRoute was recently added (available on v0.0.0-latest).

@bhamon
Author

bhamon commented Apr 14, 2024

Thank you for the explanations @liorokman and @arkodg.
I'll postpone my tests until after #3163, then.

@arkodg arkodg removed the triage label Apr 15, 2024
@arkodg arkodg added this to the v1.1.0-rc1 milestone Apr 15, 2024
@arkodg
Contributor

arkodg commented Apr 15, 2024

@bhamon rethinking this: for your use case (TCP proxying) you don't even need a CTP with enableProxyProtocol set. Since it's TCP, the proxy protocol header should just be proxied through over TCP. You'll still need the BTP config. Can you try this out on v0.0.0-latest?

@bhamon
Author

bhamon commented Apr 20, 2024

To follow up on your idea, I've tried different setups.

Upstream service

I've configured my service without proxy protocol support:

  • With a kubectl port-forward and netcat, I can communicate with it properly.
  • With a kubectl port-forward and netcat with proxy protocol header injection, I can communicate with it, but the server responds with an error right away. That's expected, because the proxy protocol header is not a valid packet for my upstream service.

I've then configured it with proxy protocol support:

  • With a kubectl port-forward and netcat, I get disconnected after a few bytes (I didn't send the proxy protocol header).
  • With a kubectl port-forward and netcat with proxy protocol header injection, I can communicate with it properly.

With this baseline I know exactly how my upstream service reacts in all cases.

Without TLS

I've then configured my load balancer with proxy protocol injection.
The gateway listener is in TCP mode and a TCPRoute is attached to the service.

When I reach my service through the external IP everything works.
You were right: the proxy protocol header is properly forwarded to the upstream service without adding anything in envoy. Everything is just passed through.

With TLS

I've then configured the gateway listener in TLS mode (my final use-case).

In this configuration envoy expects a proper TLS handshake but receives the proxy protocol header from the load balancer first.

For this use case to work, the proxy protocol header must be read by envoy before the TLS handshake (ClientTrafficPolicy on the TCP/TLS listener) and then re-injected into the upstream stream (BackendTrafficPolicy on the TCPRoute).

I've already confirmed the upstream proxy protocol support with a BackendTrafficPolicy attached to the TCPRoute on the current implementation.
When I set the LB to proxy mode + TCP listener + BackendTrafficPolicy, I can reach my service and I see an error (case 2 in the upstream service section above). That's because the proxy protocol header is injected once by my LB and a second time by the BackendTrafficPolicy.

So in the end, I'll wait for proper ClientTrafficPolicy support on TCP/TLS listeners (#3163) and replay my tests. The combined setup I'm aiming for is sketched below.
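
To make the target explicit, the setup I expect to work once #3163 lands is roughly the union of the manifests above:

# Gateway listener: TLS termination, as in the first test.
    - name: mail-sieve
      hostname: sieve.<REDACTED>
      port: 4190
      protocol: TLS
      tls:
        certificateRefs:
          - name: sieve-<REDACTED>
---
# Accept the proxy protocol header from the LB before the TLS handshake.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: proxy
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
  enableProxyProtocol: true
---
# Re-inject the proxy protocol header towards the upstream service.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: sieve
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: TCPRoute
    name: mail-sieve
  proxyProtocol:
    version: V2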

@arkodg
Contributor

arkodg commented Apr 23, 2024

thanks for running the tests @bhamon! You're right regarding TLS, #3163 is blocking it.

@arkodg
Contributor

arkodg commented May 23, 2024

should be fixed now that #3163 is in. @bhamon can you try using v0.0.0-latest?

@arkodg
Contributor

arkodg commented May 30, 2024

closing this one since it's done; feel free to reopen if you are still hitting this issue

@arkodg arkodg closed this as completed May 30, 2024