Scheduler should not place pods to the downed or disconnected RP #1420

Open
yb01 opened this issue Apr 16, 2022 · 0 comments

yb01 (Collaborator) commented Apr 16, 2022

In k8s, resources and applications are maintained in the same apiserver-etcd. Node status, heartbeats, the node lifecycle controller, and the scheduler all connect to that same apiserver-etcd, since the cluster state lives there. As a side effect, all controllers and the scheduler end up in a "frozen" state when the API server is down or disconnected, because leader election cannot find a leader for them.

This is important because it prevents controllers from acting when the cluster state is unknown.

In Arktos, apps and resources are separated into two sets of API server-etcd, in the TP and RP clusters. The scheduler uses the TP cluster for its leader election, so as long as the TP is up, the scheduler keeps functioning.

This introduces an issue: when an RP API server is disconnected, the scheduler continues to place pods on the nodes managed by that disconnected or downed RP, regardless of whether the nodes are actually alive. Depending on the node status, this can cause unexpected behavior for the placed pods.

The desired behavior is that the system does not place any pods on an RP that is disconnected. The scheduler should still function and schedule pods to the other RPs it is connected to (in a multi-RP environment).
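A minimal sketch of that behavior, written in Python for brevity and not taken from the Arktos code base. Every name here (`Node`, `RPHealthTracker`, `schedulable_nodes`, the timeout value) is hypothetical, and the sketch assumes the scheduler has some way to tell when it last reached each RP API server. The idea is only to drop nodes whose RP is currently unreachable before placement, while nodes on connected RPs remain candidates.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


# Hypothetical types for illustration; these are not Arktos or Kubernetes APIs.
@dataclass
class Node:
    name: str
    rp: str  # name of the RP that manages this node


class RPHealthTracker:
    """Remembers when each RP API server was last reachable."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def record_heartbeat(self, rp: str, now: Optional[float] = None) -> None:
        # Called whenever a request to the RP API server succeeds.
        self._last_seen[rp] = time.monotonic() if now is None else now

    def is_connected(self, rp: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        last = self._last_seen.get(rp)
        # An RP that has never been reached is treated as disconnected.
        return last is not None and (now - last) <= self.timeout_seconds


def schedulable_nodes(
    nodes: List[Node], tracker: RPHealthTracker, now: Optional[float] = None
) -> List[Node]:
    """Keep only nodes whose RP is currently considered connected."""
    return [n for n in nodes if tracker.is_connected(n.rp, now)]
```

With a 10-second timeout, a node on an RP last reached 5 seconds ago stays schedulable, while nodes on an RP last reached 20 seconds ago, or never reached, are skipped. Whether the real check should be timeout-based or driven by watch/connection errors is left open here.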

yb01 changed the title from "Scheduler should be in paused state when RP api server is down or disconnected" to "Scheduler should not place pods to the downed or disconnected RP" on Apr 16, 2022