Vertex AI - model versioning doesn't work with CreateAutoMLTextTrainingJobOperator #37400

@devinmnorris

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

apache-airflow-providers-google==10.12.0

Apache Airflow version

2.6.3

Operating System

Ubuntu 22.04.3 LTS

Deployment

Docker-Compose

Deployment details

No response

What happened

When an AutoML text training job is created with CreateAutoMLTextTrainingJobOperator and the resource name or model ID of an existing model is passed to the parent_model parameter, an entirely new model at Version 1 appears in Vertex AI Model Registry instead of a new version of the existing model.

What you think should happen instead

Because an argument was provided for parent_model, the model uploaded by the job should be registered as a new version of the existing parent model.
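The expected outcome can be written as a small check on resource names. This is only a sketch: it assumes model resource names follow the usual `projects/<project>/locations/<region>/models/<model_id>` layout with an optional `@<version>` suffix, and the helper names `parse_model_name` and `is_version_of` are hypothetical, not part of the provider or the SDK. The sample names are made up.

```python
import re

# Assumed layout: projects/<project>/locations/<region>/models/<id>[@<version>]
_MODEL_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/models/(?P<model_id>[^/@]+)(?:@(?P<version>[^/@]+))?$"
)


def parse_model_name(resource_name: str) -> dict:
    """Split a model resource name into its parts (hypothetical helper)."""
    match = _MODEL_NAME.match(resource_name)
    if match is None:
        raise ValueError(f"Not a model resource name: {resource_name!r}")
    return match.groupdict()


def is_version_of(parent: str, candidate: str) -> bool:
    """Return True if `candidate` has the same region and model ID as `parent`.

    The project segment is ignored on purpose, because it may appear as a
    project ID in one name and as a project number in the other.
    """
    p, c = parse_model_name(parent), parse_model_name(candidate)
    return p["location"] == c["location"] and p["model_id"] == c["model_id"]


# Made-up names for illustration only.
parent = "projects/my-project/locations/us-central1/models/111"
expected = "projects/my-project/locations/us-central1/models/111@2"
observed = "projects/my-project/locations/us-central1/models/222@1"

print(is_version_of(parent, expected))  # True: same model ID, new version
print(is_version_of(parent, observed))  # False: a separate model was created
```

With the behavior described in this issue, the model produced by the operator would fall into the second case.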

How to reproduce

If your model registry already has an existing model to use as the parent model, skip to step 3. Otherwise:

  1. Train the initial model
  2. Get the initial model's resource name
  3. Train a new model, specifying parent_model=initial_model_resource_name
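
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonVirtualenvOperator
from airflow.providers.google.cloud.operators.vertex_ai.auto_ml import (
    CreateAutoMLTextTrainingJobOperator,
)

# PROJECT_ID, REGION and DATASET_ID are placeholders for your own values.


def get_parent_model(project_id: str):
    # Imported inside the callable because it runs in a separate virtualenv.
    from google.cloud import aiplatform

    aiplatform.init(project=project_id)
    # Pick the most recently updated model in the registry.
    models = aiplatform.Model.list()
    models.sort(key=lambda m: m.version_update_time, reverse=True)

    return models[0].resource_name


with DAG(
    dag_id="vertex_ai_model_versioning_repro",  # illustrative name
    start_date=datetime(2024, 1, 1),  # illustrative date
    schedule=None,
    catchup=False,
) as dag: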
    initial_model = CreateAutoMLTextTrainingJobOperator(
        task_id="create_auto_ml_training_job-1",
        project_id=PROJECT_ID,
        region=REGION,
        display_name="automl-training-job-1",
        training_fraction_split=0.8,
        test_fraction_split=0.2,
        dataset_id=DATASET_ID,
        prediction_type="classification",
    )

    initial_model_resource_name = PythonVirtualenvOperator(
        task_id="initial_model_resource_name",
        python_callable=get_parent_model,
        requirements=["google-cloud-aiplatform"],
        op_kwargs={
            "project_id": PROJECT_ID,
        },
    )

    model_version_2 = CreateAutoMLTextTrainingJobOperator(
        task_id="create_auto_ml_training_job-2",
        project_id=PROJECT_ID,
        region=REGION,
        display_name="automl-training-job-2",
        parent_model=initial_model_resource_name.output,
        training_fraction_split=0.8,
        test_fraction_split=0.2,
        dataset_id=DATASET_ID,
        prediction_type="classification",
    )

    initial_model >> initial_model_resource_name >> model_version_2

Anything else

This problem occurs only with CreateAutoMLTextTrainingJobOperator, not with the Vertex AI SDK for Python. For example, we were able to implement model versioning successfully with google-cloud-aiplatform==1.41.0 using something like:

from google.cloud import aiplatform

aiplatform.init(project=PROJECT, location=LOCATION)

text_dataset = aiplatform.TextDataset(DATASET_ID)

job = aiplatform.AutoMLTextTrainingJob(
    display_name=display_name,
    prediction_type="classification",
    multi_label=False,
)

model = job.run(
    dataset=text_dataset,
    model_display_name=model_display_name,
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    parent_model=PARENT_MODEL_ID,
    is_default_version=is_default_version,
)

model.wait()

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
