fix: Allow Policy to attach to multiple http listeners #2967

Merged
merged 6 commits into from
Mar 28, 2024
29 changes: 25 additions & 4 deletions internal/gatewayapi/clienttrafficpolicy.go
@@ -124,7 +124,7 @@ func (t *Translator) ProcessClientTrafficPolicies(resources *Resources,
// It must exist since we've already finished processing the gateways
gwXdsIR := xdsIR[irKey]
if string(l.Name) == section {
- err = validatePortOverlapForClientTrafficPolicy(l, gwXdsIR)
+ err = validatePortOverlapForClientTrafficPolicy(l, gwXdsIR, false)
if err == nil {
err = t.translateClientTrafficPolicyForListener(policy, l, xdsIR, infraIR, resources)
}
@@ -234,7 +234,7 @@ func (t *Translator) ProcessClientTrafficPolicies(resources *Resources,
irKey := t.getIRKey(l.gateway)
// It must exist since we've already finished processing the gateways
gwXdsIR := xdsIR[irKey]
- if err := validatePortOverlapForClientTrafficPolicy(l, gwXdsIR); err != nil {
+ if err := validatePortOverlapForClientTrafficPolicy(l, gwXdsIR, true); err != nil {
errs = errors.Join(errs, err)
} else if err := t.translateClientTrafficPolicyForListener(policy, l, xdsIR, infraIR, resources); err != nil {
errs = errors.Join(errs, err)
@@ -312,7 +312,7 @@ func resolveCTPolicyTargetRef(policy *egv1a1.ClientTrafficPolicy, gateways map[t
return gateway.GatewayContext, nil
}

- func validatePortOverlapForClientTrafficPolicy(l *ListenerContext, xds *ir.Xds) error {
+ func validatePortOverlapForClientTrafficPolicy(l *ListenerContext, xds *ir.Xds, attachedToGateway bool) error {
Contributor

adding comments for the new logic would really help

// Find Listener IR
// TODO: Support TLSRoute and TCPRoute once
// https://github.com/envoyproxy/gateway/issues/1635 is completed
@@ -328,8 +328,29 @@ func validatePortOverlapForClientTrafficPolicy(l *ListenerContext, xds *ir.Xds)

 	// IR must exist since we're past validation
 	if httpIR != nil {
+		// Get a list of all other non-TLS listeners on this Gateway that share a port with
+		// the listener in question.
 		if sameListeners := listenersWithSameHTTPPort(xds, httpIR); len(sameListeners) != 0 {
-			return fmt.Errorf("affects additional listeners: %s", strings.Join(sameListeners, ", "))
+			if attachedToGateway {
Contributor

this is now O(n^3)
can we instead hold a map containing listener name as the key and policy as the value ?

  • if key is missing, add key, and return
  • if key is present, policy is same, return
  • if key is present, and a different policy, reject

this takes an approach of first policy wins, which imo is better than no policy winning
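
The map-based check proposed in the bullets above could look roughly like the following. This is a minimal sketch, not code from the PR: the `policyClaims` type and its `claim` method are hypothetical names introduced only for illustration, and the sketch covers just the two-policy case the reviewer describes.

```go
package main

import "fmt"

// policyClaims is a hypothetical registry (not part of the PR) that maps a
// listener name to the name of the policy that claimed it first.
type policyClaims map[string]string

// claim records that policy wants to configure listener and applies the
// three rules from the review comment.
func (c policyClaims) claim(listener, policy string) error {
	owner, ok := c[listener]
	switch {
	case !ok:
		// Key is missing: add it and accept.
		c[listener] = policy
		return nil
	case owner == policy:
		// Key is present with the same policy: accept.
		return nil
	default:
		// Key is present with a different policy: reject the newcomer.
		return fmt.Errorf("listener %s is already claimed by policy %s", listener, owner)
	}
}

func main() {
	claims := policyClaims{}
	fmt.Println(claims.claim("default/gateway-1/http", "policy-a"))
	fmt.Println(claims.claim("default/gateway-1/http", "policy-a"))
	fmt.Println(claims.claim("default/gateway-1/http", "policy-b"))
}
```

Each lookup is a single map access, which is the complexity argument behind the suggestion. The sketch only detects a second policy arriving at an already claimed listener.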

Contributor Author

this takes an approach of first policy wins, which imo is better than no policy winning

"First policy wins" doesn't make sense here. This is not a conflict between two policies where one can win and the other can lose. The problem is that translating a single policy (with no other policies in the system) would make it affect multiple listeners, including listeners that the policy should not affect.

Contributor Author

this is now O(n^3)

Looking closely at the validatePortOverlapForClientTrafficPolicy method, it seems to me that it's O(n) where n is the number of HTTP listeners attached to a Gateway XDS representation.

func validatePortOverlapForClientTrafficPolicy(l *ListenerContext, xds *ir.Xds, attachedToGateway bool) error {
	irListenerName := irHTTPListenerName(l)
	var httpIR *ir.HTTPListener

	// O(n) loop
	for _, http := range xds.HTTP { 
		if http.Name == irListenerName {
			httpIR = http
			break
		}
	}

	// IR must exist since we're past validation
	if httpIR != nil {
		// listenersWithSameHTTPPort is O(n) and it is called once.
		// This is an 'if' statement, not a 'for' statement.
		if sameListeners := listenersWithSameHTTPPort(xds, httpIR); len(sameListeners) != 0 {
			if attachedToGateway {
				gatewayName := irListenerName[0:strings.LastIndex(irListenerName, "/")]
				conflictingListeners := []string{}

				// sameListeners is a slice with an upper bound of O(n) entries
				for _, currName := range sameListeners {
					if strings.Index(currName, gatewayName) != 0 {
						conflictingListeners = append(conflictingListeners, currName)
					}
				}
				if len(conflictingListeners) != 0 {
					return fmt.Errorf("affects additional listeners: %s", strings.Join(conflictingListeners, ", "))
				}
			} else {
				return fmt.Errorf("affects additional listeners: %s", strings.Join(sameListeners, ", "))
			}
		}
	}
	return nil
}
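
The gateway-prefix filter in the `attachedToGateway` branch above can be exercised in isolation. The sketch below is an illustration under assumptions, not the PR's code: `listenersOutsideGateway` is a hypothetical helper, and listener names are assumed to have the form `<namespace>/<gateway>/<listener>`. It makes one deliberate change from the snippet: the trailing `/` is kept in the prefix so that a gateway name that happens to be a prefix of another gateway name does not match by accident. Whether such names can occur in the real IR is not established here.

```go
package main

import (
	"fmt"
	"strings"
)

// listenersOutsideGateway is a hypothetical helper (not from the PR). It
// returns the entries of sameListeners that do not belong to the gateway that
// owns listenerName. Names are assumed to look like "<ns>/<gateway>/<listener>".
func listenersOutsideGateway(listenerName string, sameListeners []string) []string {
	idx := strings.LastIndex(listenerName, "/")
	if idx < 0 {
		// Assumption: a name without a gateway part owns nothing, so every
		// other listener counts as foreign.
		return sameListeners
	}
	// Keep the trailing "/" so "default/gateway-1/" does not match
	// "default/gateway-10/http".
	prefix := listenerName[:idx+1]

	var conflicting []string
	for _, name := range sameListeners { // single O(n) pass
		if !strings.HasPrefix(name, prefix) {
			conflicting = append(conflicting, name)
		}
	}
	return conflicting
}

func main() {
	same := []string{
		"default/gateway-1/http-2",
		"default/gateway-10/http",
		"default/gateway-2/http",
	}
	// prints [default/gateway-10/http default/gateway-2/http]
	fmt.Println(listenersOutsideGateway("default/gateway-1/http", same))
}
```

`strings.HasPrefix(name, prefix)` is equivalent to `strings.Index(name, prefix) == 0` but stops at the first mismatch instead of searching the whole string.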

Contributor

@liorokman adding some reasons why we should consider first policy wins

  • we already use "oldest config wins" in most places
    // Sort based on timestamp
  • this keeps existing traffic from being interrupted or dropped: if policy1 targets listener1 at t1 and policy2 targets listener2 at t2, listener1 is not affected
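
For reference, an "oldest config wins" ordering like the one mentioned in the first bullet is usually a stable sort on creation time. The following is a generic sketch, not the project's sorting code: the `policy` struct, `newPolicy`, and `sortOldestFirst` are hypothetical names, and the name tie-break is an assumption added to keep the order deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// policy is a hypothetical stand-in for a policy resource; only the fields
// needed for ordering are modelled.
type policy struct {
	Name    string
	Created time.Time
}

// newPolicy builds a policy from a Unix timestamp in seconds (illustration only).
func newPolicy(name string, unixSeconds int64) policy {
	return policy{Name: name, Created: time.Unix(unixSeconds, 0)}
}

// sortOldestFirst orders policies by creation time, oldest first, and falls
// back to the name when two timestamps are equal.
func sortOldestFirst(policies []policy) {
	sort.SliceStable(policies, func(i, j int) bool {
		if policies[i].Created.Equal(policies[j].Created) {
			return policies[i].Name < policies[j].Name
		}
		return policies[i].Created.Before(policies[j].Created)
	})
}

func main() {
	ps := []policy{
		newPolicy("policy2", 200),
		newPolicy("policy1", 100),
	}
	sortOldestFirst(ps)
	for _, p := range ps {
		fmt.Println(p.Name)
	}
}
```

With this ordering, a later policy that conflicts with an earlier one is the one that gets rejected, which is the behaviour the bullets argue for.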

Contributor Author

@arkodg The bug occurs when there is exactly one policy. The scenario where there are two policies is out of scope for this issue. There's no way to implement "first policy wins", because "first policy wins" means "reject the policy provided".

Here's a recreation of the bug that this PR is trying to solve:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: gateway
spec:
  gatewayClassName: envoy-gateway-class
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      hostname: foo.example.com
      allowedRoutes:
        namespaces:
          from: Same
    - name: http2
      port: 80
      protocol: HTTP
      hostname: bar.example.com
      allowedRoutes:
        namespaces:
          from: Same
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: target-gateway
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
    namespace: default
    sectionName: http
  path:
    escapedSlashesAction: RejectRequest
  timeout:
    http:
      requestReceivedTimeout: "5s"

The single ClientTrafficPolicy above targets only one listener, the one called http. But because both http and http2 are non-TLS and share a port, the requestReceivedTimeout configured for http2 will also be 5s. Even worse, http should have the RejectRequest action for escaped slashes while http2 should keep the default value (UnescapeAndForward), yet http2 will also be configured with RejectRequest.

Without this PR the translated IR will be correct - the various settings from the ClientTrafficPolicy will be set in the correct location - only on http and not on http2. The bug occurs because the correct IR is not translatable to valid XDS.
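
The overlap in the reproduction can be modelled without any of the translator types. The sketch below is a simplified model under assumptions, not the project's `listenersWithSameHTTPPort`: the `listener` struct and `sharedPortListeners` are hypothetical, and the only rule encoded is the one stated above, that non-TLS listeners on the same port cannot be configured independently.

```go
package main

import "fmt"

// listener is a hypothetical, simplified view of a Gateway listener.
type listener struct {
	Name string
	Port int
	TLS  bool
}

// sharedPortListeners returns the names of the other non-TLS listeners that
// use the same port as target. A non-empty result means a policy attached
// only to target would also change those listeners.
func sharedPortListeners(all []listener, target string) []string {
	var self *listener
	for i := range all {
		if all[i].Name == target {
			self = &all[i]
			break
		}
	}
	// Unknown or TLS targets are treated as not overlapping in this model.
	if self == nil || self.TLS {
		return nil
	}

	var shared []string
	for _, l := range all {
		if l.Name != self.Name && !l.TLS && l.Port == self.Port {
			shared = append(shared, l.Name)
		}
	}
	return shared
}

func main() {
	// The two listeners from the Gateway in the reproduction.
	gateway := []listener{
		{Name: "http", Port: 80},
		{Name: "http2", Port: 80},
	}
	// A policy with sectionName: http would also affect http2.
	fmt.Println(sharedPortListeners(gateway, "http")) // prints [http2]
}
```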

+				// If this policy is attached to an entire gateway and the mergeGateways feature
+				// is turned on, validate that all the listeners affected by this policy originated
+				// from the same Gateway resource. The name of the Gateway from which this listener
+				// originated is part of the listener's name by construction.
+				gatewayName := irListenerName[0:strings.LastIndex(irListenerName, "/")]
+				conflictingListeners := []string{}
+				for _, currName := range sameListeners {
+					if strings.Index(currName, gatewayName) != 0 {
+						conflictingListeners = append(conflictingListeners, currName)
+					}
+				}
+				if len(conflictingListeners) != 0 {
+					return fmt.Errorf("ClientTrafficPolicy is being applied to multiple http (non https) listeners (%s) on the same port, which is not allowed", strings.Join(conflictingListeners, ", "))
+				}
+			} else {
+				// If this policy is attached to a specific listener, any other listeners in the list
+				// would be affected by this policy but should not be, so this policy can't be accepted.
+				return fmt.Errorf("ClientTrafficPolicy is being applied to multiple http (non https) listeners (%s) on the same port, which is not allowed", strings.Join(sameListeners, ", "))
+			}
 		}
 	}
 	return nil
28 changes: 2 additions & 26 deletions internal/gatewayapi/securitypolicy.go
@@ -134,9 +134,7 @@ func (t *Translator) ProcessSecurityPolicies(securityPolicies []*egv1a1.Security
continue
}

- err := t.translateSecurityPolicyForRoute(policy, targetedRoute, resources, xdsIR)
-
- if err != nil {
+ if err := t.translateSecurityPolicyForRoute(policy, targetedRoute, resources, xdsIR); err != nil {
status.SetTranslationErrorForPolicyAncestors(&policy.Status,
parentGateways,
t.GatewayControllerName,
@@ -188,15 +186,7 @@ func (t *Translator) ProcessSecurityPolicies(securityPolicies []*egv1a1.Security
continue
}

- irKey := t.getIRKey(targetedGateway.Gateway)
- // Should exist since we've validated this
- xds := xdsIR[irKey]
- err := validatePortOverlapForSecurityPolicyGateway(xds)
- if err == nil {
- err = t.translateSecurityPolicyForGateway(policy, targetedGateway, resources, xdsIR)
- }
-
- if err != nil {
+ if err := t.translateSecurityPolicyForGateway(policy, targetedGateway, resources, xdsIR); err != nil {
status.SetTranslationErrorForPolicyAncestors(&policy.Status,
parentGateways,
t.GatewayControllerName,
@@ -508,20 +498,6 @@ func (t *Translator) translateSecurityPolicyForGateway(
return errs
}

-func validatePortOverlapForSecurityPolicyGateway(xds *ir.Xds) error {
-	affectedListeners := []string{}
-	for _, http := range xds.HTTP {
-		if sameListeners := listenersWithSameHTTPPort(xds, http); len(sameListeners) != 0 {
-			affectedListeners = append(affectedListeners, sameListeners...)
-		}
-	}
-
-	if len(affectedListeners) > 0 {
-		return fmt.Errorf("affects multiple listeners: %s", strings.Join(affectedListeners, ", "))
-	}
-	return nil
-}
-
func (t *Translator) buildCORS(cors *egv1a1.CORS) *ir.CORS {
var allowOrigins []*ir.StringMatch

23 changes: 19 additions & 4 deletions internal/gatewayapi/testdata/conflicting-policies.out.yaml
@@ -23,7 +23,8 @@ clientTrafficPolicies:
namespace: default
conditions:
- lastTransitionTime: null
- message: 'Affects additional listeners: default/gateway-1/http'
+ message: ClientTrafficPolicy is being applied to multiple http (non https)
+   listeners (default/gateway-1/http) on the same port, which is not allowed
reason: Invalid
status: "False"
type: Accepted
@@ -261,9 +262,9 @@ securityPolicies:
namespace: default
conditions:
- lastTransitionTime: null
- message: 'Affects multiple listeners: default/mfqjpuycbgjrtdww/http, default/gateway-1/http'
- reason: Invalid
- status: "False"
+ message: Policy has been accepted.
+ reason: Accepted
+ status: "True"
type: Accepted
controllerName: gateway.envoyproxy.io/gatewayclass-controller
xdsIR:
@@ -314,6 +315,20 @@ xdsIR:
- backendWeights:
invalid: 0
valid: 0
+ cors:
+   allowCredentials: true
+   allowMethods:
+   - PUT
+   - GET
+   - POST
+   - DELETE
+   - PATCH
+   - OPTIONS
+   allowOrigins:
+   - distinct: false
+     name: ""
+     safeRegex: http://.*\.foo\.com
+   maxAge: 10m0s
destination:
name: httproute/default/mfqjpuycbgjrtdww/rule/0
settings:
225 changes: 225 additions & 0 deletions internal/gatewayapi/testdata/merge-with-isolated-policies-2.in.yaml
@@ -0,0 +1,225 @@
+envoyproxy:
+  apiVersion: gateway.envoyproxy.io/v1alpha1
+  kind: EnvoyProxy
+  metadata:
+    namespace: envoy-gateway-system
+    name: test
+  spec:
+    mergeGateways: true
+gateways:
+- apiVersion: gateway.networking.k8s.io/v1beta1
+  kind: Gateway
+  metadata:
+    name: gateway-1
+    namespace: default
+  spec:
+    gatewayClassName: envoy-gateway-class
+    listeners:
+    - name: http
+      port: 80
+      protocol: HTTP
+      hostname: bar.example.com
+      allowedRoutes:
+        namespaces:
+          from: Same
+    - name: http-2
+      port: 80
+      hostname: foo.example.com
+      protocol: HTTP
+      allowedRoutes:
+        namespaces:
+          from: Same
+- apiVersion: gateway.networking.k8s.io/v1beta1
+  kind: Gateway
+  metadata:
+    name: gateway-2
+    namespace: default
+  spec:
+    gatewayClassName: envoy-gateway-class
+    listeners:
+    - name: http
+      port: 81
+      protocol: HTTP
+      hostname: bar.example.com
+      allowedRoutes:
+        namespaces:
+          from: Same
+    - name: http-2
+      port: 81
+      hostname: foo.example.com
+      protocol: HTTP
+      allowedRoutes:
+        namespaces:
+          from: Same
+httpRoutes:
+- apiVersion: gateway.networking.k8s.io/v1
+  kind: HTTPRoute
+  metadata:
+    namespace: default
+    name: httproute-1
+  spec:
+    hostnames:
+    - bar.example.com
+    parentRefs:
+    - namespace: default
+      name: gateway-1
+      sectionName: http
+    rules:
+    - matches:
+      - path:
+          value: "/"
+      backendRefs:
+      - name: service-1
+        port: 8080
+- apiVersion: gateway.networking.k8s.io/v1
+  kind: HTTPRoute
+  metadata:
+    namespace: default
+    name: httproute-2
+  spec:
+    hostnames:
+    - foo.example.com
+    parentRefs:
+    - namespace: default
+      name: gateway-1
+      sectionName: http-2
+    rules:
+    - matches:
+      - path:
+          value: "/"
+      backendRefs:
+      - name: service-2
+        port: 8080
+- apiVersion: gateway.networking.k8s.io/v1
+  kind: HTTPRoute
+  metadata:
+    namespace: default
+    name: httproute-3
+  spec:
+    hostnames:
+    - bar.example.com
+    parentRefs:
+    - namespace: default
+      name: gateway-2
+      sectionName: http
+    rules:
+    - matches:
+      - path:
+          value: "/"
+      backendRefs:
+      - name: service-1
+        port: 8080
+- apiVersion: gateway.networking.k8s.io/v1
+  kind: HTTPRoute
+  metadata:
+    namespace: default
+    name: httproute-4
+  spec:
+    hostnames:
+    - foo.example.com
+    parentRefs:
+    - namespace: default
+      name: gateway-2
+      sectionName: http-2
+    rules:
+    - matches:
+      - path:
+          value: "/"
+      backendRefs:
+      - name: service-2
+        port: 8080
+securityPolicies:
+- apiVersion: gateway.envoyproxy.io/v1alpha1
+  kind: SecurityPolicy
+  metadata:
+    namespace: default
+    name: policy-for-route-1
+  spec:
+    targetRef:
+      group: gateway.networking.k8s.io
+      kind: Gateway
+      name: gateway-1
+      namespace: default
+    cors:
+      allowOrigins:
+      - "*"
+      allowMethods:
+      - GET
+      - POST
+      allowHeaders:
+      - "x-header-5"
+      - "x-header-6"
+      exposeHeaders:
+      - "x-header-7"
+      - "x-header-8"
+      maxAge: 2000s
+- apiVersion: gateway.envoyproxy.io/v1alpha1
+  kind: SecurityPolicy
+  metadata:
+    namespace: default
+    name: policy-for-route-2
+  spec:
+    targetRef:
+      group: gateway.networking.k8s.io
+      kind: HTTPRoute
+      name: httproute-3
+      namespace: default
+    cors:
+      allowOrigins:
+      - "*"
+      allowMethods:
+      - GET
+      - POST
+      allowHeaders:
+      - "x-header-5"
+      - "x-header-6"
+      exposeHeaders:
+      - "x-header-7"
+      - "x-header-8"
+      maxAge: 2000s
+clientTrafficPolicies:
+- apiVersion: gateway.envoyproxy.io/v1alpha1
+  kind: ClientTrafficPolicy
+  metadata:
+    namespace: default
+    name: target-gateway-2
+  spec:
+    targetRef:
+      group: gateway.networking.k8s.io
+      kind: Gateway
+      name: gateway-2
+      sectionName: http
+      namespace: default
+    timeout:
+      http:
+        requestReceivedTimeout: "5s"
+- apiVersion: gateway.envoyproxy.io/v1alpha1
+  kind: ClientTrafficPolicy
+  metadata:
+    namespace: default
+    name: target-gateway
+  spec:
+    targetRef:
+      group: gateway.networking.k8s.io
+      kind: Gateway
+      name: gateway-1
+      namespace: default
+    timeout:
+      http:
+        requestReceivedTimeout: "5s"
+backendTrafficPolicies:
+- apiVersion: gateway.envoyproxy.io/v1alpha1
+  kind: BackendTrafficPolicy
+  metadata:
+    namespace: default
+    name: policy-for-gateway
+  spec:
+    targetRef:
+      group: gateway.networking.k8s.io
+      kind: Gateway
+      name: gateway-1
+      namespace: default
+    tcpKeepalive:
+      probes: 3
+      idleTime: 20m
+      interval: 60s