Is your feature request related to a problem?
This is a change required to address a problem with the Policy Server.
Currently, if one of the defined policies cannot be loaded (for example, because it cannot be downloaded from the remote registry, or because the user provided invalid settings), the Policy Server process exits with an error.
Inside Kubernetes, the Pod running the Policy Server is restarted a number of times and is then left in a CrashLoopBackOff state. The only way to recover from this situation is to have someone look into the error message of the Policy Server Pod and fix the issue.
This behavior is dangerous. When rolling out a new policy (or making any change to the existing ones), the new Policy Server Pods could end up in this broken state. The old Pods, still running with the previous working configuration, will disappear if something happens to the node where they are scheduled.
Because of that, it's possible to end up with a broken cluster: all incoming admission requests are rejected because there are no working instances of the Policy Server.
Solution you'd like
Instead of exiting with an error, the Policy Server should boot normally, and Kubewarden should report the error status of the affected Policy back to the user.
Currently, the controller is in charge of changing the status of a Policy to Active and of configuring the webhook. Instead, the controller should wait for the PolicyServer to report back whether the policy was initialized successfully or failed to initialize, and act accordingly.
In the scenario of more than one PolicyServer replica, the aggregated status should be considered, similar to what Kubernetes does for the replicas/readyReplicas fields of a Deployment.
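For illustration, here is a minimal Go sketch of how such an aggregation could look, assuming each PolicyServer replica reports per-policy initialization results. All names here (ReplicaReport, AggregatedStatus, Aggregate) are hypothetical and not part of the Kubewarden codebase:

```go
package status

// ReplicaReport is a hypothetical per-replica report for a single policy.
type ReplicaReport struct {
	Replica string // e.g. the Pod name of the PolicyServer replica
	Ready   bool   // true if the policy was initialized successfully
	Error   string // initialization error, empty when Ready is true
}

// AggregatedStatus mirrors the replicas/readyReplicas idea of a Deployment.
type AggregatedStatus struct {
	Replicas      int
	ReadyReplicas int
	Errors        []string
}

// Aggregate folds the per-replica reports into a single status that the
// controller could then write into the Policy CRD.
func Aggregate(reports []ReplicaReport) AggregatedStatus {
	agg := AggregatedStatus{Replicas: len(reports)}
	for _, r := range reports {
		if r.Ready {
			agg.ReadyReplicas++
			continue
		}
		agg.Errors = append(agg.Errors, r.Replica+": "+r.Error)
	}
	return agg
}

// Active reports whether the policy can be considered Active, i.e. every
// replica initialized it, just like a Deployment is fully available only
// when readyReplicas == replicas.
func (a AggregatedStatus) Active() bool {
	return a.Replicas > 0 && a.ReadyReplicas == a.Replicas
}
```

With a shape like this, the controller would only flip a Policy to Active once ReadyReplicas equals Replicas, and could surface the collected errors otherwise.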
A possible implementation involves having the PolicyServer update the Policy CRD status fields directly, and adding the new error states to the Policy status state machine.
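As a rough sketch of that idea, the Policy status enum could gain an error state alongside the existing ones. The exact state names (especially the new "error" state) and the kubebuilder-style markers below are assumptions, not the actual Kubewarden types:

```go
package v1alpha2

// PolicyStatusEnum describes the lifecycle state of a policy.
// +kubebuilder:validation:Enum=unscheduled;scheduled;pending;active;error
type PolicyStatusEnum string

const (
	PolicyStatusUnscheduled PolicyStatusEnum = "unscheduled"
	PolicyStatusScheduled   PolicyStatusEnum = "scheduled"
	PolicyStatusPending     PolicyStatusEnum = "pending"
	PolicyStatusActive      PolicyStatusEnum = "active"
	// PolicyStatusError is the hypothetical new state entered when one or
	// more PolicyServer replicas failed to initialize the policy.
	PolicyStatusError PolicyStatusEnum = "error"
)

// PolicyStatus would be updated by the PolicyServer (or by the controller
// on its behalf) instead of being driven by the controller alone.
type PolicyStatus struct {
	PolicyStatus PolicyStatusEnum `json:"policyStatus,omitempty"`
	// Message carries the initialization error reported by the replicas.
	Message string `json:"message,omitempty"`
}
```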
Alternatives you've considered
No response
Anything else?
No response
@flavio This is an important spike that I believe needs priority, to make the Kubewarden Policy Server more resilient and a better product. We are wondering when this can be picked up so we can see some progress on it?