
Spike: PolicyServer should report the status of a Policy #740

Open
fabriziosestito opened this issue Apr 19, 2024 · 1 comment

@fabriziosestito (Contributor) commented Apr 19, 2024

Is your feature request related to a problem?

This is a change required to address a problem with the Policy Server.

Currently, if one of the defined policies cannot be loaded (for example, it cannot be downloaded from the remote registry, or the user provided invalid settings), the Policy Server process exits with an error.

Inside of Kubernetes, the Pod running the Policy Server will be restarted a number of times and then left in a crash loop state. The only way to recover from this situation is to have someone look at the error message of the Policy Server Pod and fix the issue.

This behavior is dangerous. When rolling out a new policy (or making any change to the existing ones), the new Policy Server Pods could end up in this broken state. The old ones, still running with the previous working configuration, will disappear if something happens to the node where they are scheduled.

Because of that, it's possible to end up with a broken cluster: all the incoming admission requests are rejected because there are no working instances of Policy Server.

Solution you'd like

Instead of exiting with an error, the Policy Server should boot regularly, but Kubewarden should report the error status of the Policy back to the user.
Currently, the controller is in charge of changing the status of a Policy to Active and configuring the webhook. Instead, we should wait for the PolicyServer to report back that the policy was initialized successfully or that the initialization failed and act accordingly.
In the scenario of more than one PolicyServer replica, the aggregated status should be considered, similar to what Kubernetes does with replicas/readyReplicas of a Deployment (see the sketch below).
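
To make the aggregation idea concrete, here is a minimal sketch of what an aggregated Policy status could look like, written as a Go status struct. The field names and counters are illustrative assumptions, not the current Kubewarden CRD API.

```go
// Package policystatus sketches an aggregated Policy status, mirroring the
// replicas/readyReplicas idea of a Deployment. All names are hypothetical.
package policystatus

// PolicyStatus is what the controller could expose after collecting the
// per-replica reports coming from the PolicyServer instances.
type PolicyStatus struct {
	// PolicyServerReplicas is the number of PolicyServer replicas expected
	// to serve this policy.
	PolicyServerReplicas int32 `json:"policyServerReplicas"`

	// LoadedReplicas is the number of replicas that initialized the policy
	// successfully.
	LoadedReplicas int32 `json:"loadedReplicas"`

	// FailedReplicas is the number of replicas that failed to initialize the
	// policy (download error, invalid settings, ...).
	FailedReplicas int32 `json:"failedReplicas"`

	// PolicyStatus is the aggregated, user-facing state derived from the
	// counters above (e.g. "pending", "active", "error").
	PolicyStatus string `json:"policyStatus"`
}
```

With something like this, the controller could mark the Policy as active and configure the webhook only once every expected replica has reported a successful initialization.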

A possible implementation involves the PolicyServer updating the Policy CRD status fields directly and adding the error statuses to the Policy status state machine.
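
As a rough illustration, the sketch below extends the state machine with a hypothetical error state and an aggregation helper. The state names other than active (mentioned above) and the helper itself are assumptions, not the current controller code.

```go
// Package policystatus sketches how the Policy status state machine could be
// extended with an error state. State names other than "active" are assumed.
package policystatus

// PolicyStatusEnum is the user-facing state of a Policy.
type PolicyStatusEnum string

const (
	// Existing states (assumed): the policy is waiting to be loaded, or it
	// has been loaded by every replica and is being enforced.
	PolicyStatusPending PolicyStatusEnum = "pending"
	PolicyStatusActive  PolicyStatusEnum = "active"

	// New, hypothetical state: at least one PolicyServer replica failed to
	// initialize the policy (download error, invalid settings, ...).
	PolicyStatusError PolicyStatusEnum = "error"
)

// Aggregate derives the user-facing state from the per-replica reports,
// similar to how a Deployment compares readyReplicas with replicas.
func Aggregate(loaded, failed, expected int32) PolicyStatusEnum {
	switch {
	case failed > 0:
		return PolicyStatusError
	case expected > 0 && loaded == expected:
		return PolicyStatusActive
	default:
		return PolicyStatusPending
	}
}
```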

Alternatives you've considered

No response

Anything else?

No response

@fabriziosestito fabriziosestito added this to the 1.13 milestone Apr 19, 2024
@flavio flavio modified the milestones: 1.13, 1.14 May 31, 2024
@flavio flavio removed this from the 1.14 milestone Jun 11, 2024
@ferhatguneri

@flavio This is an important spike that I believe needs priority to make the Kubewarden Policy Server more resilient and a better product. We are wondering when this can be picked up so that we can make some progress on it?

@flavio flavio added this to the 1.17 milestone Aug 9, 2024
Projects: Todo
Development: No branches or pull requests
4 participants