Description
Apache Airflow Provider(s)
google
Versions of Apache Airflow Providers
apache-airflow-providers-google==10.12.0
Apache Airflow version
2.6.3
Operating System
Ubuntu 22.04.3 LTS
Deployment
Docker-Compose
Deployment details
No response
What happened
When creating an AutoML Text training job with CreateAutoMLTextTrainingJobOperator and passing the resource name or model ID of an existing model to the parent_model parameter, the job does not add a new version to that model. Instead, an entirely new model at Version 1 shows up in the Vertex AI Model Registry.
What you think should happen instead
Since we provided an argument to parent_model, the model uploaded by the job should be a version of the existing parent model.
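For anyone trying to confirm the behaviour, here is a rough sketch of how the resource name of the trained model could be compared against the parent. It assumes the usual Vertex AI resource name layout, projects/{project}/locations/{location}/models/{model_id} with an optional @{version} suffix. The helper names parse_model_name and is_version_of are made up for this illustration and are not part of Airflow or the Vertex AI SDK.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout of a Vertex AI model resource name:
# projects/{project}/locations/{location}/models/{model_id}[@{version}]
_MODEL_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/models/(?P<model_id>[^/@]+)(?:@(?P<version>[^/@]+))?$"
)


class ModelRef(NamedTuple):
    project: str
    location: str
    model_id: str
    version: Optional[str]  # None when the name has no @version suffix


def parse_model_name(resource_name: str) -> ModelRef:
    """Split a model resource name into its parts (hypothetical helper)."""
    match = _MODEL_NAME.match(resource_name)
    if match is None:
        raise ValueError(f"Not a model resource name: {resource_name!r}")
    return ModelRef(**match.groupdict())


def is_version_of(trained: str, parent: str) -> bool:
    """Return True if both names refer to the same registry model.

    Only location and model ID are compared. The project segment is skipped
    because one name may carry the project ID and the other the project
    number; this sketch assumes both models live in the same project.
    """
    t, p = parse_model_name(trained), parse_model_name(parent)
    return (t.location, t.model_id) == (p.location, p.model_id)
```

With the expected behaviour, is_version_of(trained_name, parent_name) would be True and the trained model would carry a version above 1. With the behaviour reported here, the trained model gets a different model ID, so the check returns False.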

How to reproduce
If your model registry already has an existing model to use as the parent model, skip to step 3. Otherwise:
1. Train the initial model.
2. Get the initial model's resource name.
3. Train a new model, specifying parent_model=initial_model_resource_name.
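from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonVirtualenvOperator
from airflow.providers.google.cloud.operators.vertex_ai.auto_ml import (
    CreateAutoMLTextTrainingJobOperator,
)

# PROJECT_ID, REGION and DATASET_ID are placeholders for your own values.


def get_parent_model(project_id: str):
    # Imported inside the callable because it runs in its own virtualenv.
    from google.cloud import aiplatform

    aiplatform.init(project=project_id)
    # Return the resource name of the most recently updated model.
    models = list(aiplatform.Model.list())
    models.sort(key=lambda m: m.version_update_time, reverse=True)
    return models[0].resource_name


# dag_id and start_date are arbitrary values chosen for this reproduction.
with DAG(
    dag_id="automl_text_parent_model_repro",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Step 1: train the initial model.
    initial_model = CreateAutoMLTextTrainingJobOperator(
        task_id="create_auto_ml_training_job-1",
        project_id=PROJECT_ID,
        region=REGION,
        display_name="automl-training-job-1",
        training_fraction_split=0.8,
        test_fraction_split=0.2,
        dataset_id=DATASET_ID,
        prediction_type="classification",
    )

    # Step 2: look up the initial model's resource name.
    initial_model_resource_name = PythonVirtualenvOperator(
        task_id="initial_model_resource_name",
        python_callable=get_parent_model,
        requirements=["google-cloud-aiplatform"],
        op_kwargs={
            "project_id": PROJECT_ID,
        },
    )

    # Step 3: train again, passing the initial model as parent_model.
    model_version_2 = CreateAutoMLTextTrainingJobOperator(
        task_id="create_auto_ml_training_job-2",
        project_id=PROJECT_ID,
        region=REGION,
        display_name="automl-training-job-2",
        parent_model=initial_model_resource_name.output,
        training_fraction_split=0.8,
        test_fraction_split=0.2,
        dataset_id=DATASET_ID,
        prediction_type="classification",
    )

    initial_model >> initial_model_resource_name >> model_version_2

Anything else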
This problem only occurs when using the CreateAutoMLTextTrainingJobOperator, and not with the Vertex AI SDK for Python. For example, we were able to implement model versioning successfully using something like:
google-cloud-aiplatform==1.41.0

from google.cloud import aiplatform

aiplatform.init(project=PROJECT, location=LOCATION)
text_dataset = aiplatform.TextDataset(DATASET_ID)

job = aiplatform.AutoMLTextTrainingJob(
    display_name=display_name,
    prediction_type="classification",
    multi_label=False,
)

model = job.run(
    dataset=text_dataset,
    model_display_name=model_display_name,
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    parent_model=PARENT_MODEL_ID,
    is_default_version=is_default_version,
)
model.wait()

Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct