New Feature - Downscaling Jobs Using Admission Controller (Kyverno And Gatekeeper) #42

Merged
merged 5 commits into from
Jun 7, 2024

Conversation

samuel-esp
Collaborator

@samuel-esp samuel-esp commented May 19, 2024

Motivation

Hey everyone!

This pull request adds the ability to downscale jobs as well. The only requirement is to have an admission controller (Gatekeeper or Kyverno) installed in the cluster. The pull request comes with a new argument, --admission-controller, which is required only if the user supplies "jobs" in the "--include-resources" argument.
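To illustrate the rule above, here is a minimal sketch of how the conditional requirement could be checked at argument-parsing time. It is not the code from this PR: the function name `parse_args`, the program name, the default value of `--include-resources`, and the error wording are assumptions made for the example. Only the two flag names and the two controller names come from the description.

```python
import argparse

# Controller names taken from the PR description.
ADMISSION_CONTROLLERS = ("gatekeeper", "kyverno")


def parse_args(argv):
    """Parse arguments and enforce the jobs/admission-controller rule.

    Hypothetical helper: names and defaults are assumptions for this sketch.
    """
    parser = argparse.ArgumentParser(prog="downscaler-sketch")
    # The default below is an assumption for the example.
    parser.add_argument("--include-resources", default="deployments")
    parser.add_argument("--admission-controller", default="")
    args = parser.parse_args(argv)

    # Normalise the comma-separated resource list into a set.
    resources = {
        item.strip().lower()
        for item in args.include_resources.split(",")
        if item.strip()
    }
    controller = args.admission_controller.strip().lower()

    # The flag is only required when "jobs" is among the included resources.
    if "jobs" in resources and controller not in ADMISSION_CONTROLLERS:
        parser.error(
            "--admission-controller must be 'gatekeeper' or 'kyverno' "
            "when 'jobs' is listed in --include-resources"
        )
    return args
```

With this sketch, `parse_args(["--include-resources", "jobs"])` exits with an argparse usage error, while the same call without "jobs" succeeds even when no admission controller is given.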

Code reviews are welcome. I developed this feature separately, with slightly different logic here and there. This choice was made to keep all the previously developed features compatible with jobs as well.

This is an example of what the feature looks like:

2023-11-18 19:11:45,098 INFO: Downscaler vdev started with admission_controller=gatekeeper, debug=False, default_downtime=never, default_uptime=Mon-Sun 07:30-13:45 CET, deployment_time_annotation=None, downscale_period=never, downtime_replicas=0, dry_run=False, enable_events=False, exclude_deployments=kube-downscaler,downscaler, exclude_namespaces=kube-system, grace_period=30, include_resources=jobs, interval=60, matching_labels=, namespace=None, once=False, upscale_period=never
2023-11-18 19:11:45,135 INFO: Suspending jobs for Namespace/chaos-mesh (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,145 INFO: Suspending jobs for Namespace/crontest (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,168 INFO: Suspending jobs for Namespace/default (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,190 INFO: Suspending jobs for Namespace/example7-infrastructure-dev-namespace (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,207 INFO: Suspending jobs for Namespace/gatekeeper-system (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,227 INFO: Suspending jobs for Namespace/istio-operator (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,248 INFO: Suspending jobs for Namespace/istio-system (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,267 INFO: Suspending jobs for Namespace/keda (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,288 INFO: Suspending jobs for Namespace/kube-downscaler (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,307 INFO: Suspending jobs for Namespace/kube-node-lease (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,330 INFO: Suspending jobs for Namespace/kube-public (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,353 INFO: Suspending jobs for Namespace/spring-test-dev-namespace (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,376 INFO: Suspending jobs for Namespace/test-dev-namespace (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)
2023-11-18 19:11:45,405 INFO: Suspending jobs for Namespace/vpa (uptime: Mon-Sun 07:30-13:45 CET, downtime: never)

If a user tries to create a job during a downtime period, the following error is displayed:

Error from server (Forbidden): error when creating "jobtest.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [test-namespace] Job creation is not allowed in this namespace during a kube-downscaler downtime period.

Once the downscaling period ends, the following lines appear in the logs:

2023-11-18 19:15:01,478 INFO: Downscaler vdev started with admission_controller=gatekeeper, debug=False, default_downtime=never, default_uptime=Mon-Sun 07:30-23:45 CET, deployment_time_annotation=None, downscale_period=never, downtime_replicas=0, dry_run=False, enable_events=False, exclude_deployments=kube-downscaler,downscaler, exclude_namespaces=kube-system, grace_period=30, include_resources=jobs, interval=60, matching_labels=, namespace=None, once=False, upscale_period=never
2023-11-18 19:15:01,516 INFO: Unsuspending jobs for Namespace/chaos-mesh (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,524 INFO: Unsuspending jobs for Namespace/crontest (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,544 INFO: Unsuspending jobs for Namespace/default (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,558 INFO: Unsuspending jobs for Namespace/example7-infrastructure-dev-namespace (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,573 INFO: Unsuspending jobs for Namespace/gatekeeper-system (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,593 INFO: Unsuspending jobs for Namespace/istio-operator (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,609 INFO: Unsuspending jobs for Namespace/istio-system (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,621 INFO: Unsuspending jobs for Namespace/keda (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,642 INFO: Unsuspending jobs for Namespace/kube-downscaler (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,659 INFO: Unsuspending jobs for Namespace/kube-node-lease (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,672 INFO: Unsuspending jobs for Namespace/kube-public (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,691 INFO: Unsuspending jobs for Namespace/spring-test-dev-namespace (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,708 INFO: Unsuspending jobs for Namespace/test-dev-namespace (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)
2023-11-18 19:15:01,728 INFO: Unsuspending jobs for Namespace/vpa (uptime: Mon-Sun 07:30-23:45 CET, downtime: never)

Changes

With the newest commits:

  • added a new argument, --admission-controller. Users can choose the admission controller they want (Kyverno or Gatekeeper)
  • added the possibility to downscale jobs using dedicated CRDs managed by Kyverno or Gatekeeper
  • added checks to verify that the base Kyverno and Gatekeeper CRDs are installed in the cluster, in order to guarantee the correct use of this feature
  • refactored RBAC.yaml to let the admission controllers create, read and delete the dedicated policies for blocking and unblocking jobs. Delete permissions are fine-grained and won't affect resources not managed by the Kube Downscaler
  • added support for downscaler/exclude and downscaler/exclude-until at job level
  • added support for the EXCLUDE_DEPLOYMENTS environment variable and the --exclude-deployments argument for jobs
  • jobs created from cronjobs won't be blocked if jobs is specified in the --include-resources argument and cronjobs is not. So the user is able to downscale jobs and cronjobs separately if needed
  • added several tests dedicated to the changes I made
  • refactored docs to explain the "downscaling jobs" feature
  • refactored docs to better explain the "matching labels" feature
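The rule about jobs created from cronjobs can be sketched as a small decision function. This is only an illustration under stated assumptions, not the logic merged in this PR: the helper names `is_owned_by_cronjob` and `job_is_managed` are hypothetical, the job is represented as a plain dict shaped like a Kubernetes manifest, and the behaviour when both jobs and cronjobs are included is an assumption, since the description only covers the case where cronjobs is absent.

```python
def is_owned_by_cronjob(job):
    """Return True if the job manifest lists a CronJob in its ownerReferences."""
    owners = job.get("metadata", {}).get("ownerReferences") or []
    return any(owner.get("kind") == "CronJob" for owner in owners)


def job_is_managed(job, include_resources):
    """Decide whether a job falls under the downscaler's job handling.

    Hypothetical helper for this sketch; include_resources is a set of
    lowercase resource names such as {"jobs", "cronjobs"}.
    """
    if "jobs" not in include_resources:
        return False
    # Jobs spawned by a CronJob are left alone unless cronjobs are included too.
    if is_owned_by_cronjob(job) and "cronjobs" not in include_resources:
        return False
    return True


standalone_job = {"metadata": {"name": "one-off"}}
cron_spawned_job = {
    "metadata": {
        "name": "nightly-28599840",
        "ownerReferences": [{"kind": "CronJob", "name": "nightly"}],
    }
}
```

Under these assumptions, `job_is_managed(cron_spawned_job, {"jobs"})` is false, while `job_is_managed(standalone_job, {"jobs"})` is true, which matches the idea that jobs and cronjobs can be downscaled separately.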

Tests done

Built a dedicated image to test. Unit and Mock tests available inside test_scaler.py, test_resources.py and test_cmd.py

TODO

  • I've assigned myself to this PR

…d gatekeeper). Refactored docs in order to explain how to use the new feature and how to use matching_labels args
@eumel8
Member

eumel8 commented May 24, 2024

A rebase is needed due to the merge conflicts.

@samuel-esp
Collaborator Author

I will rebase as soon as I can, hopefully this week.

@samuel-esp
Collaborator Author

I'm still here, a bit behind schedule, but I'll try to rebase this week!

…d gatekeeper). Refactored docs in order to explain how to use the new feature and how to use matching_labels. Rebased args
…ownscaling-jobs-feature

# Conflicts:
#	README.md
#	chart/templates/rbac.yaml
#	kube_downscaler/cmd.py
#	kube_downscaler/scaler.py
#	tests/test_cmd.py
#	tests/test_scaler.py
@samuel-esp
Collaborator Author

Rebase completed! 🚀

Fovty
Fovty previously approved these changes Jun 6, 2024
Member

@Fovty Fovty left a comment

LGTM

@samuel-esp samuel-esp dismissed Fovty’s stale review June 6, 2024 19:46

The merge-base changed after approval.

@samuel-esp samuel-esp requested a review from Fovty June 7, 2024 07:13
@samuel-esp
Collaborator Author

Re-requested review. I think the last dismissal is a GitHub bug.

Fovty
Fovty previously approved these changes Jun 7, 2024
Member

@Fovty Fovty left a comment

LGTM

@samuel-esp samuel-esp dismissed Fovty’s stale review June 7, 2024 07:16

The merge-base changed after approval.

@Fovty
Member

Fovty commented Jun 7, 2024

@samuel-esp hm, I will try closing and reopening the PR. Maybe this helps.

@Fovty Fovty closed this Jun 7, 2024
@Fovty Fovty reopened this Jun 7, 2024
@Fovty Fovty merged commit f4ea670 into caas-team:main Jun 7, 2024
2 checks passed
4 participants