(Major?) Bug with mcr.microsoft.com/mlops/python:latest and Azure Machine Learning extension in DevOps #329

Closed
@kodonnell

Description


TL;DR: I believe this will prevent anyone from using MLOps inside of Azure DevOps.

We're running an MLOps workshop (in partnership with Microsoft) with customers, and things started failing at the Azure ML Model Deploy step:

/usr/local/envs/mlopspython_ci/bin/az ml model deploy -n mlops-aci --model oilwells_model.pkl:4 --ic /__w/1/s/oilwells/scoring/inference_config.yml --dc /__w/1/s/oilwells/scoring/deployment_config_aci.yml -g MLOps-2020-09-22-team-02-prod -w mlops-AML-WS --overwrite
The command failed with an unexpected error. Here is the traceback:

cannot import name 'PROFILE_METADATA_CPU_KEY' from 'azureml._model_management._constants' (/usr/local/envs/mlopspython_ci/lib/python3.7/site-packages/azureml/_model_management/_constants.py)
    op = import_module(mod_to_import)
  File "/usr/local/envs/mlopspython_ci/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/AzDevOps_azpcontainer/.azure/cliextensions/azure-cli-ml/azext_ml/model.py", line 17, in <module>
    from azureml._model_management._constants import ACI_WEBSERVICE_TYPE, AKS_ENDPOINT_TYPE, AKS_WEBSERVICE_TYPE, \
ImportError: cannot import name 'PROFILE_METADATA_CPU_KEY' from 'azureml._model_management._constants' (/usr/local/envs/mlopspython_ci/lib/python3.7/site-packages/azureml/_model_management/_constants.py)

We've traced the issue partly to mcr.microsoft.com/mlops/python:latest. There are no versioned tags on Docker Hub, so we had to go with latest (which we pulled about 30 minutes before this message). If you look at /usr/local/envs/mlopspython_ci/lib/python3.7/site-packages/azureml/_model_management/_constants.py, there is no PROFILE_METADATA_CPU_KEY (though there is PROFILE_RECOMMENDED_CPU_KEY, which seems suspicious). I can't find /home/AzDevOps_azpcontainer/.azure/cliextensions/azure-cli-ml/azext_ml/model.py, but the DevOps logs show that it is created later by /usr/local/envs/mlopspython_ci/bin/az extension add -n azure-cli-ml, which installs version 1.14.0 of the extension. If I check out that file, then (removing cruft) ...

from azureml._model_management._constants import ... PROFILE_METADATA_CPU_KEY ...

So it definitely seems to be an error. My guess is that it's a version mismatch somewhere, most likely between the azure-cli-ml extension and the azureml package in the image. If we have time tomorrow, that's what we'll be digging into. If there's no hotfix, we'll have to build our own Docker image on top of mcr.microsoft.com/mlops/python:latest where we fix any issues, which would be painful. (Hint hint to anyone else who comes across this and wants to help out!)
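For anyone who wants to reproduce the check inside the container, here is a minimal sketch of it in Python. The helper name `missing_names` is hypothetical (it is not part of azureml or the Azure CLI); the module and constant names are copied from the traceback above. It only reports which names the installed module fails to define, and it does not tell you which package version would fix the mismatch.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that `module_name` does not define.

    `missing_names` is an illustrative helper, not an azureml API.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself cannot be imported, so every name is missing.
        return list(names)
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    # Names taken from the traceback; run this with the same interpreter
    # the pipeline uses (/usr/local/envs/mlopspython_ci/bin/python).
    module_name = "azureml._model_management._constants"
    wanted = ["PROFILE_METADATA_CPU_KEY", "PROFILE_RECOMMENDED_CPU_KEY"]
    print("Missing from", module_name, ":", missing_names(module_name, wanted))
```

If PROFILE_METADATA_CPU_KEY shows up as missing, the azureml package in the image does not match what the installed extension expects, which is consistent with the ImportError above.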

The second day of the workshop is tomorrow, so needless to say, this is somewhat urgent.

Other info:

  • Latest version of the Azure Machine Learning extension in DevOps. We uninstalled and reinstalled it with the same issues.
  • Linux build agent on a VM scale set.
  • Was working maybe 10 hours ago, so I'm guessing the Docker image or azure-cli-ml was updated in that time. UPDATE: yes, azure-cli-ml was bumped to v1.14.0 7 hours ago: Azure/azure-cli-extensions@39d5ca4
