[BUG] Model is getting stuck in deploying state #2970

@gaurav7830

What is the bug?
A model gets stuck in the DEPLOYING state while it is being registered on the cluster. We have seen cases where the model is not found on a few of the nodes.

Scenario

  1. The model is stuck in the DEPLOYING state.
  2. Calling the model undeploy API on the cluster returns the following response:
{
    "NodeId": {
        "stats": {
            "ModelId": "not_found"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "not_found"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "undeployed"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "not_found"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "undeployed"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "not_found"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "not_found"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "undeployed"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "not_found"
        }
    },
    "NodeId": {
        "stats": {
            "ModelId": "not_found"
        }
    }
}
  3. Calling the GetModel API still returns the model state as DEPLOYING.
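
To make the mismatch in the undeploy response easier to see, here is a minimal Python sketch that counts how many nodes reported each status for a model. It assumes the response has the shape shown above (node ID mapped to `stats`, which maps the model ID to a status string). The function name `summarize_undeploy`, the node IDs, and the model ID are hypothetical, since the issue shows only placeholders.

```python
from collections import Counter


def summarize_undeploy(response: dict, model_id: str) -> Counter:
    """Count how many nodes reported each status for model_id."""
    counts = Counter()
    for node_id, body in response.items():
        # Nodes that do not mention the model at all are counted as "missing".
        status = body.get("stats", {}).get(model_id, "missing")
        counts[status] += 1
    return counts


# Hypothetical node and model IDs; the real response uses actual IDs.
sample = {
    "node-1": {"stats": {"model-abc": "not_found"}},
    "node-2": {"stats": {"model-abc": "undeployed"}},
    "node-3": {"stats": {"model-abc": "not_found"}},
}

summary = summarize_undeploy(sample, "model-abc")
print(dict(summary))
```

Applied to the response in step 2, a count like this would show 7 nodes reporting `not_found` and 3 reporting `undeployed`.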

What is the expected behavior?
The model should be undeployed.

Labels: bug (Something isn't working)

Status: In Progress
