Open
Description
What is the bug?
A model gets stuck in the DEPLOYING state while it is being registered on the cluster. We have seen cases where the model is not found on a few of the nodes.
Scenario
- Model stuck in DEPLOYING state.
- Calling the model undeploy API on the cluster returns the following response (node IDs and the model ID are shown as placeholders):
{
  "NodeId": {
    "stats": {
      "ModelId": "not_found"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "not_found"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "undeployed"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "not_found"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "undeployed"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "not_found"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "not_found"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "undeployed"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "not_found"
    }
  },
  "NodeId": {
    "stats": {
      "ModelId": "not_found"
    }
  }
}
- Calling the GetModel API afterwards still returns the model state as DEPLOYING.
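For anyone triaging a similar case, here is a rough sketch of how the two observations above could be cross-checked. It only processes responses that have already been fetched; it does not call the cluster. The helper names `summarize_undeploy` and `looks_stuck`, the node IDs, and the model ID are all hypothetical and chosen for illustration. The response shape is assumed to match the one pasted above, with a distinct ID per node.

```python
def summarize_undeploy(response: dict, model_id: str) -> dict:
    """Group node IDs by the per-node status reported for model_id.

    Assumes the shape shown in this issue:
    {node_id: {"stats": {model_id: status}}}
    Nodes that do not mention the model are grouped under "missing".
    """
    by_status = {}
    for node_id, body in response.items():
        status = body.get("stats", {}).get(model_id, "missing")
        by_status.setdefault(status, []).append(node_id)
    return by_status


def looks_stuck(model_state: str, by_status: dict) -> bool:
    """True when the model still reports DEPLOYING although every node
    answered the undeploy call with either "undeployed" or "not_found"."""
    settled = {"undeployed", "not_found"}
    return model_state == "DEPLOYING" and set(by_status) <= settled


# Hypothetical sample data, not taken from a real cluster.
sample_response = {
    "node-1": {"stats": {"model-abc": "not_found"}},
    "node-2": {"stats": {"model-abc": "undeployed"}},
    "node-3": {"stats": {"model-abc": "not_found"}},
}
summary = summarize_undeploy(sample_response, "model-abc")
stuck = looks_stuck("DEPLOYING", summary)
```

With the response pasted in this issue, every node falls into `undeployed` or `not_found`, so a check like `looks_stuck` would flag the model as inconsistent with the DEPLOYING state that GetModel reports.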
What is the expected behavior?
The model should be undeployed.
Status
In Progress