Description
- azure-ai-ml==1.26.0 (tried other versions)
- Python 3.10
Describe the bug
When I create a simple @pipeline in azure-ai-ml with one command step and submit the job first, updating the schedule afterwards fails with an error. This did not happen a few weeks ago. If I update the schedule first and then redeploy the job, there is no error.
To Reproduce
Steps to reproduce the behavior:
- I created a tool to make steps easier to create in SDK V2 (because we had it in V1) https://stackoverflow.com/a/77354028
- Create a step with the Step class
- Use the create_pipeline function
- Deploy the pipeline
- Create a Schedule
- Update the schedule (here is the bug)
from datetime import datetime

from azure.ai.ml.constants import TimeZone
from azure.ai.ml.entities import CronTrigger, JobSchedule

# Step and create_pipeline come from the helper tool linked above
step_1 = Step(
    display_name="step_1",
    description="step_1",
    environment=...,
    command="python main.py",
    code=...,
    is_deterministic=False,
)
pipeline_job = create_pipeline(
    steps_graph,
    default_compute="my_compute",
    name="my_pipeline",
    experiment_name="my_experiment",
)

# Submit the job
pipeline_job = ml_client.jobs.create_or_update(pipeline_job)

# Create the schedule
schedule_start_time = datetime.now()
cron_trigger = CronTrigger(
    expression="0 6 * * 1",
    start_time=schedule_start_time,
    time_zone=TimeZone.CENTRAL_EUROPEAN_STANDARD_TIME,
)
job_schedule = JobSchedule(
    name="my_schedule",
    trigger=cron_trigger,
    create_job=pipeline_job,
)

# Update the schedule (this call raises the error below)
job_schedule = ml_client.schedules.begin_create_or_update(
    schedule=job_schedule
).result()
Expected behavior
The schedule should be updated without errors.
Errors
Here is the error on Python side:
Source path of Step 'ds_view_check': /mnt/c/Users/XXXXX/Documents/GitHub/data_management
Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Warning: the provided asset name 'ds_view_check' will not be used for anonymous registration
Uploading data_management (15.81 MBs): 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15807217/15807217 [00:01<00:00, 10587620.66it/s]
Pipeline created successfully.
name: lucid_card_xxxxxxxx
display_name: ds_view_check
description: Default pipeline function to be executed.
tags:
TOOLS: AML
TARGET: OTHER
PROCESS: OTHER
FRAMEWORK_VERSION: 2.6.0
PYTHON: '3.10'
type: pipeline
jobs:
ds_view_check:
type: command
inputs:
DEPLOY_ENV: dev
SUBSCRIPTION_ENV: dev
component: azureml:azureml_anonymous:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
compute: azureml:cpu-16-128
identity:
type: managed_identity
client_id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
object_id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
creation_context:
created_at: '2025-03-31T07:33:44.055977+00:00'
created_by: UserX
created_by_type: User
experiment_name: ds_view_check
id: azureml:/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-xxxx/providers/Microsoft.MachineLearningServices/workspaces/mlw-xxxx/jobs/lucid_card_xxxxxxxx
properties:
mlflow.source.git.repoURL: [REDACTED]
mlflow.source.git.branch: main
mlflow.source.git.commit: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
azureml.git.dirty: 'True'
services:
Tracking:
endpoint: [REDACTED]
type: Tracking
Studio:
endpoint: [REDACTED]
type: Studio
status: NotStarted
Readonly attribute status will be ignored in class <class 'azure.ai.ml._restclient.v2023_04_01_preview.models._models_py3.JobService'>
Traceback (most recent call last):
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/azure/core/polling/base_polling.py", line 788, in run
self._poll()
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/azure/core/polling/base_polling.py", line 820, in _poll
raise OperationFailed("Operation failed or canceled")
azure.core.polling.base_polling.OperationFailed: Operation failed or canceled
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/c/Users/XXXXX/Documents/GitHub/data_management/aml_scheduling.py", line 98, in <module>
schedule = azureml_helper.create_or_update_schedule(
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/datalab_framework/aml_scheduling/azureml_helper.py", line 770, in create_or_update_schedule
).result()
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/azure/core/polling/_poller.py", line 254, in result
self.wait(timeout)
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 116, in wrapper_use_tracer
return func(*args, **kwargs)
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/azure/core/polling/_poller.py", line 273, in wait
raise self._exception # type: ignore
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/azure/core/polling/_poller.py", line 188, in _start
self._polling_method.run()
File "/home/XXXXX/anaconda3/envs/data_management/lib/python3.10/site-packages/azure/core/polling/base_polling.py", line 803, in run
raise HttpResponseError(response=self._pipeline_response.http_response, error=err) from err
azure.core.exceptions.HttpResponseError: (UserError) Invalid trigger definition, details: Microsoft.MachineLearning.Common.Core.ServiceInvocationException: Service invocation failed!
Request: POST smt.designer-westeurope.svc/studioservice/api/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/rg-xxxx/workspaces/mlw-xxxx/GenericTriggerJob/ParseJob
Status Code: 400 BadRequest
Error Code: UserError/BadArgument/ArgumentInvalid/InvalidPipelineJob/InvalidJobsOverride
Reason Phrase: Invalid jobs override for pipeline job since source job lucid_card_xxxxxxxx is specified.
Here is the error on AML UI side:
The schedule "[SCHEDULE_NAME]" could not be updated. Failure reason: UserError, Invalid trigger definition, details: Microsoft.MachineLearning.Common.Core.ServiceInvocationException: Service invocation failed!
Request: POST smt.designer-[REGION].svc/studioservice/api/subscriptions/[SUBSCRIPTION_ID]/resourceGroups/[RESOURCE_GROUP]/workspaces/[WORKSPACE]/GenericTriggerJob/ParseJob
Status Code: 400 BadRequest
Error Code: UserError/BadArgument/ArgumentInvalid/InvalidPipelineJob/InvalidJobsOverride
Reason Phrase: Invalid jobs override for pipeline job since source job [SOURCE_JOB_ID] is specified.
Response Body:
{
  "error": {
    "code": "UserError",
    "message": "Invalid jobs override for pipeline job since source job [SOURCE_JOB_ID] is specified.",
    "innerError": {
      "code": "BadArgument",
      "innerError": {
        "code": "ArgumentInvalid",
        "innerError": {
          "code": "InvalidPipelineJob",
          "innerError": {
            "code": "InvalidJobsOverride"
          }
        }
      }
    }
  },
  "correlation": {
    "operation": "[OPERATION_ID]",
    "request": "[REQUEST_ID]"
  },
  "environment": "[REGION]",
  "location": "[REGION]",
  "time": "2025-03-31T07:33:46.687716+00:00",
  "componentName": "Designer-MiddleTier-Service",
  "statusCode": 400
}
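For anyone matching this against their own logs: the Error Code line appears to be the chain of nested innerError codes from the response body, joined with slashes. A minimal sketch of that flattening, assuming the body has already been parsed into a dict (error_code_path is a hypothetical helper name, not part of azure-ai-ml):

```python
import json


def error_code_path(body: dict) -> str:
    """Join the nested innerError codes into one slash-separated path."""
    codes = []
    node = body.get("error")
    # Walk down the innerError chain until it ends.
    while isinstance(node, dict):
        code = node.get("code")
        if code:
            codes.append(code)
        node = node.get("innerError")
    return "/".join(codes)


# Trimmed copy of the response body shown above.
response_body = json.loads("""
{
  "error": {
    "code": "UserError",
    "innerError": {
      "code": "BadArgument",
      "innerError": {
        "code": "ArgumentInvalid",
        "innerError": {
          "code": "InvalidPipelineJob",
          "innerError": {"code": "InvalidJobsOverride"}
        }
      }
    }
  },
  "statusCode": 400
}
""")

print(error_code_path(response_body))
# UserError/BadArgument/ArgumentInvalid/InvalidPipelineJob/InvalidJobsOverride
```

The innermost code, InvalidJobsOverride, is the most specific one and the most useful to search for.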
When I update the schedule first and then create the job, it works. This issue is new: neither my code nor my azure-ai-ml version has changed.
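The ordering that works for me can be sketched as below. This is only a workaround based on what I observed, not a confirmed fix. The helper name schedule_then_submit is hypothetical; ml_client is assumed to be an azure.ai.ml.MLClient, and job_schedule a JobSchedule built with create_job set to the pipeline job before it was submitted. My guess (unverified) is that the submitted job object returned by jobs.create_or_update carries a source job ID that the service rejects when it is combined with the job definition.

```python
def schedule_then_submit(ml_client, pipeline_job, job_schedule):
    """Create or update the schedule first, then submit the pipeline job.

    Hypothetical helper. `pipeline_job` must be the local, not-yet-submitted
    definition, and `job_schedule` must have been built from that same object.
    """
    # 1. Update the schedule while pipeline_job is still a local definition.
    saved_schedule = ml_client.schedules.begin_create_or_update(
        schedule=job_schedule
    ).result()

    # 2. Only now submit the job, keeping the returned object in a separate
    #    variable so the local definition is not overwritten.
    submitted_job = ml_client.jobs.create_or_update(pipeline_job)

    return saved_schedule, submitted_job
```

The point is to avoid reassigning pipeline_job with the result of jobs.create_or_update before the schedule is built from it.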