
bug: envoy proxy pods always restart when configuration changes #2965

Closed
zetaab opened this issue Mar 18, 2024 · 7 comments
zetaab (Contributor) commented Mar 18, 2024

Description:

 % kubectl get pods -n envoy-gateway-system
NAME                                          READY   STATUS        RESTARTS   AGE
envoy-eg-external-a514c411-5ffc5d664b-74g5j   2/2     Running       0          2d19h
envoy-eg-external-a514c411-5ffc5d664b-dhmmg   2/2     Running       0          2d19h
envoy-eg-internal-7f4ff7e4-7fb9c8d7df-8kjgk   2/2     Terminating   0          27s
envoy-eg-internal-7f4ff7e4-7fb9c8d7df-krcqc   2/2     Terminating   0          25s
envoy-eg-internal-7f4ff7e4-8685896d59-4z8n8   1/2     Terminating   0          4m31s
envoy-eg-internal-7f4ff7e4-8685896d59-gqtd7   2/2     Running       0          6s
envoy-eg-internal-7f4ff7e4-8685896d59-lrd4f   2/2     Running       0          4s
envoy-gateway-5987f4589-9h6ts                 1/1     Running       0          3d

When I modify resources like HTTPRoutes, it leads to Envoy pod restarts. This is not a good situation when running external load balancers in front of Envoy. I do understand that Envoy drains connections. However, since Envoy uses externalTrafficPolicy: Local by default, the external load balancer marks only the nodes that host Envoy pods as healthy. So when these pods move between machines, it always takes 10-30 seconds (depending on how the load balancer health checks are configured) before services start replying again.
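For context, a minimal sketch of the kind of Service sitting in front of Envoy here (name, selector, and ports are hypothetical). With externalTrafficPolicy: Local, traffic is only routed to nodes that actually host an Envoy pod, so the external load balancer has to re-learn node health every time the pods move:

apiVersion: v1
kind: Service
metadata:
  name: envoy-eg-internal            # hypothetical name
  namespace: envoy-gateway-system
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local       # only nodes hosting an Envoy pod pass the LB health check
  selector:
    app.kubernetes.io/name: envoy    # hypothetical selector
  ports:
    - name: https
      port: 443
      targetPort: 10443
      protocol: TCP
  # with Local, Kubernetes allocates a healthCheckNodePort for the external LB to probe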

Repro steps:
Use a Service of type LoadBalancer in front of Envoy; if needed, set the external load balancer health check interval to 60 seconds (to see how it really behaves). Then modify HTTPRoute configurations (a minimal example follows) and watch the pods restart and move between Kubernetes nodes -> the services become unavailable for some seconds.
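A minimal HTTPRoute of the kind whose modification triggers the rollout (names and hostname are hypothetical); editing the rules or hostnames is enough to reproduce:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: echo                 # hypothetical
  namespace: default
spec:
  parentRefs:
    - name: eg-internal      # hypothetical Gateway name
  hostnames:
    - echo.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: echo         # hypothetical backend Service
          port: 8080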

Instead of restarting pods, the Envoy configuration should be reloaded. Avoid modifying the Kubernetes Deployment spec itself all the time: each modification causes downtime when using external load balancers with externalTrafficPolicy: Local, because the health checks are not that fast.

Environment:
Envoy Gateway (eg) 1.0.0

@zetaab zetaab added the triage label Mar 18, 2024
arkodg (Contributor) commented Mar 18, 2024

@zetaab can you share a config to repro this? Is this specific to setting mergeGateways to true?
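For reference, a sketch of the mergeGateways setting being asked about, assuming it is enabled through an EnvoyProxy resource referenced from the GatewayClass (resource names are hypothetical):

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: proxy-config               # hypothetical
  namespace: envoy-gateway-system
spec:
  mergeGateways: true              # all Gateways of this class share one Envoy fleet
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: proxy-config
    namespace: envoy-gateway-system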

cnvergence (Member) commented:

Possibly related: #2637

github-actions bot commented:

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label Apr 17, 2024
arkodg (Contributor) commented Apr 18, 2024

Does this issue still exist, @zetaab?

@github-actions github-actions bot removed the stale label Apr 19, 2024
zetaab (Contributor, Author) commented Apr 19, 2024

I have not tried EG since then; I need to revisit when I have time.

zetaab (Contributor, Author) commented Apr 25, 2024

@arkodg the issue still exists with 1.0.1, at least. I am using merged gateways. When I create additional Gateways, it always restarts the Envoy pods. Here is a diff between the ReplicaSets:

>         - containerPort: 10443
>           name: echose-ebb62894
>           protocol: TCP
220,222d222
<           protocol: TCP
<         - containerPort: 10443
<           name: envoy-938fd695

So the issue is container port naming in the Kubernetes Deployment configuration. I'm not sure whether #3130 is part of 1.0.1? It could fix this, but I'm not sure.
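For anyone reproducing this, a sketch of how such a diff can be produced; the ReplicaSet names below are placeholders taken from the pod listing at the top of this issue:

# dump both ReplicaSets and diff them (the pod template is what changed)
kubectl get rs envoy-eg-internal-7f4ff7e4-7fb9c8d7df -n envoy-gateway-system -o yaml > old.yaml
kubectl get rs envoy-eg-internal-7f4ff7e4-8685896d59 -n envoy-gateway-system -o yaml > new.yaml
diff old.yaml new.yaml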

zetaab (Contributor, Author) commented Apr 26, 2024

I can confirm that this works much better on the latest main, so I think this is solved.
