V2: All model requests fail if the expected count of model replicas exceeds the count of server replicas #5124
Comments
Hi @lynnmatrix, does this happen when you schedule a new model and send it requests, or when an existing model loses some of its availability?
@agrski It is the first case: adding more replicas to an existing model.
If this is a completely new model, it's likely that it simply isn't scheduling. In any case, could you please check the output of: kubectl -n <namespace> get model <model name> -o jsonpath='{.status}' | jq
@agrski It goes like this. With Triton replicas=1 and model M replicas=1, all requests for model M work well.
{
"conditions": [
{
"lastTransitionTime": "2023-09-06T02:07:09Z",
"message": "ScheduleFailed",
"reason": "****",
"status": "False",
"type": "ModelReady"
},
{
"lastTransitionTime": "2023-09-06T02:07:09Z",
"message": "ScheduleFailed",
"reason": "****",
"status": "False",
"type": "Ready"
}
],
"replicas": 2
}
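For anyone who wants to check the same thing in a script rather than by eye, here is a minimal Python sketch that pulls the non-ready conditions out of a status object shaped like the one above. The helper name `failing_conditions` is hypothetical and not part of any Seldon tooling; the JSON is copied from the status posted in this thread, including the redacted `reason` values.

```python
import json


def failing_conditions(status: dict) -> list:
    """Return (type, message) for every condition whose status is not "True"."""
    return [
        (cond.get("type", ""), cond.get("message", ""))
        for cond in status.get("conditions", [])
        if cond.get("status") != "True"
    ]


# Status as posted above (the "reason" fields were redacted by the reporter).
status = json.loads("""
{
  "conditions": [
    {"lastTransitionTime": "2023-09-06T02:07:09Z", "message": "ScheduleFailed",
     "reason": "****", "status": "False", "type": "ModelReady"},
    {"lastTransitionTime": "2023-09-06T02:07:09Z", "message": "ScheduleFailed",
     "reason": "****", "status": "False", "type": "Ready"}
  ],
  "replicas": 2
}
""")

print(failing_conditions(status))
# [('ModelReady', 'ScheduleFailed'), ('Ready', 'ScheduleFailed')]
```

Both conditions report ScheduleFailed, which matches the symptom that the model stopped being ready once 2 replicas were requested.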
Thanks for providing the extra details @lynnmatrix. It looks like the scheduler is attempting to fully reschedule the model and removing its existing assignment(s). This might be just the routing being affected and the model remaining loaded on the one Triton server, or it might be unloading that as well. I believe the desired behaviour should be for existing assignments to remain and for the model to be considered partially available.
I looked at the code. After model scheduling fails (because there are no server replicas left on which to place a new model replica), model.server is reset to empty. Envoy then tries to remove and re-add the model's route, but fails to re-add it because model.server is empty.
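To make the failure mode described above easier to follow, here is a toy Python sketch. It is not Seldon's scheduler code: `ToyModel`, `schedule_reset_on_failure`, `schedule_keep_existing`, and the `routes` dict are all hypothetical stand-ins, and the routing layer is reduced to a plain dictionary. It only contrasts the reported behaviour (assignment and route wiped when scheduling fails) with the partially-available behaviour suggested earlier in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    # Hypothetical stand-in for a model record; not a Seldon type.
    name: str
    desired_replicas: int
    assigned: list = field(default_factory=list)  # indices of server replicas hosting the model


def schedule_reset_on_failure(model, server_replicas, routes):
    """Reported behaviour: a failed schedule clears the assignment and the route."""
    if model.desired_replicas > server_replicas:
        model.assigned = []            # assignment reset to empty
        routes.pop(model.name, None)   # route removed; nothing left to re-add
        return False
    model.assigned = list(range(model.desired_replicas))
    routes[model.name] = list(model.assigned)
    return True


def schedule_keep_existing(model, server_replicas, routes):
    """Suggested behaviour: place what fits and keep the model partially available."""
    placeable = min(model.desired_replicas, server_replicas)
    model.assigned = list(range(placeable))
    if model.assigned:
        routes[model.name] = list(model.assigned)
    else:
        routes.pop(model.name, None)
    return placeable == model.desired_replicas


def scale_up(schedule):
    """Start with 1 server replica and 1 model replica, then ask for 2 model replicas."""
    routes = {}
    model = ToyModel("M", desired_replicas=1)
    schedule(model, 1, routes)
    model.desired_replicas = 2
    fully_scheduled = schedule(model, 1, routes)
    return fully_scheduled, routes


print(scale_up(schedule_reset_on_failure))  # (False, {})
print(scale_up(schedule_keep_existing))     # (False, {'M': [0]})
```

In both cases the second schedule is reported as incomplete, but only the first variant leaves the model with no route at all, which is the situation where every request fails.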
@sakoush Does this sound like a regression from the reset server "fix" that was made for other reasons?
There was a regression, and it is fixed in #5074.
Thanks. #5074 fixes this issue.
@lynnmatrix I will close this issue, as it seems to be fixed in your case now.
Describe the bug
All model requests fail if the expected count of model replicas exceeds the count of server replicas
To reproduce
Expected behaviour
At least half of the requests succeed
Environment
Seldon Core v2.6.0