
add hotfix for uploading kubeflow pipelines #40

Merged: 3 commits into main, Apr 25, 2023

Conversation

@PhilippeMoussalli (Contributor) commented Apr 25, 2023

Kubeflow requires unique pipeline names when uploading a pipeline.

Previous implementations relied on deleting the pipeline via the SDK: checking whether it exists using either `client.list_pipelines()` filtered on the pipeline name, or `client.get_pipeline_id(pipeline_name)`, then finding the existing versions from the id and deleting those before deleting the pipeline itself.
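Roughly, the name-based lookup worked like this (a sketch against the kfp v1 `Client` API; function name and structure are illustrative, not the exact original code):

```python
def get_pipeline_id_by_name(client, pipeline_name):
    """Find a pipeline's id by listing all pipelines and matching on name.

    `client` is expected to expose the kfp v1 `Client.list_pipelines()`
    method. This name-based call path is the one that started failing
    with the SQL error shown in the description.
    """
    response = client.list_pipelines(page_size=100)
    for pipeline in response.pipelines or []:
        if pipeline.name == pipeline_name:
            return pipeline.id
    return None
```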

Recent issues started appearing when pipelines were compiled with the v2 SDK. Both functions mentioned above stopped working. Error message:

`Failed to list pipelines with context &{0xc0001ee9a0}, options &{100 0xc0012d1880}: InternalServerError: Failed to execute SQL for listing pipelines: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(PARTITION BY PipelineId ORDER BY CreatedAtInSec DESC) rn FROM pipeline_versions' at line 1`

The issue is not related to the connection to the SQL server backing the pipelines, since methods like `client.get_experiments()` work just fine. The kfp v2 SDK is also still able to return the pipeline versions. I suspect something changed internally in the database while we were testing things with the v2 SDK, but it's not clear what. The issue should be resolved by starting from a clean slate (new kfp cluster and database).

For now, I implemented a fix that requires passing the `pipeline_id` (found in the UI) in order to delete existing pipelines before uploading. This circumvents the API calls that were causing the issue. Not an ideal permanent solution, but a temporary workaround.
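A minimal sketch of such an id-based workaround (function name is illustrative, not the actual Fondant code; it assumes the kfp v1 `Client` methods `list_pipeline_versions`, `delete_pipeline_version`, and `delete_pipeline`):

```python
def delete_pipeline_by_id(client, pipeline_id):
    """Delete all versions of a pipeline, then the pipeline itself.

    Only calls keyed on the pipeline id are used, avoiding the
    name-based lookups (`list_pipelines` / `get_pipeline_id`) that
    trigger the SQL error.
    """
    response = client.list_pipeline_versions(pipeline_id, page_size=100)
    for version in response.versions or []:
        client.delete_pipeline_version(version.id)
    client.delete_pipeline(pipeline_id)
```

The upload itself can then proceed with `client.upload_pipeline()` under the original pipeline name, since no pipeline with that name exists anymore.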

@RobbeSneyders (Member)

Would adding a timestamp to the pipeline name not be an easier solution?
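For reference, such a timestamp suffix could be as simple as the following sketch (function name is illustrative):

```python
from datetime import datetime

def unique_pipeline_name(base_name: str) -> str:
    # Append a second-resolution timestamp so every upload gets a
    # unique name and no pre-delete step is needed.
    return f"{base_name}-{datetime.now():%Y%m%d-%H%M%S}"
```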

@PhilippeMoussalli (Contributor, Author)

> Would adding a timestamp to the pipeline name not be an easier solution?

Mainly to avoid bloating the UI with too many submitted pipelines, especially when debugging.

@RobbeSneyders (Member)

Is redeploying the cluster a lot of work? Then we can rename it to Fondant as well 😅

@PhilippeMoussalli (Contributor, Author)

> Is redeploying the cluster a lot of work? Then we can rename it to Fondant as well 😅

Not really, just need to make sure no one is working on it :)

@NielsRogge (Contributor) left a comment

Temporary fix is fine for me, but it's definitely weird as I'm still using the v1 SDK to compile pipelines.

Looks like `existing_pipelines = client.list_pipelines(page_size=100).pipelines` triggers the error, so for some reason we're not able to programmatically get the existing pipelines. Isn't this an authorization issue?

@PhilippeMoussalli (Contributor, Author)

> Temporary fix is fine for me, but it's definitely weird as I'm still using the v1 SDK to compile pipelines.
>
> Looks like `existing_pipelines = client.list_pipelines(page_size=100).pipelines` triggers the error, so for some reason we're not able to programmatically get the existing pipelines. Isn't this an authorization issue?

I think we're sticking with the v1 SDK for the meanwhile; we will not update it. I don't think it's an authorization issue, since other API calls for listing experiments and runs work fine. I think something has gone wrong internally, either in Kubeflow's internal registry or in the database. Redeploying might be faster than attempting to fix the issue.

@PhilippeMoussalli PhilippeMoussalli merged commit 18ff584 into main Apr 25, 2023
@RobbeSneyders RobbeSneyders deleted the hotfix/pipeline-upload branch May 4, 2023 07:34
Hakimovich99 pushed a commit that referenced this pull request Oct 16, 2023