-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add hotfix for uploading kubeflow pipelines #40
Conversation
Would adding a timestamp to the pipeline name not be an easier solution? |
Mainly to avoid bloating the UI with too many submitted pipelines. Especially when debugging |
Is redeploying the cluster a lot of work? Then we can rename it to Fondant as well 😅 |
Not really, just need to make sure no one is working on it :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Temporary fix is fine for me, but it's definitely weird as I'm still using the v1 SDK to compile pipelines.
Looks like existing_pipelines = client.list_pipelines(page_size=100).pipelines
triggers the error, so for some reason we're not able to programmatically get the existing pipelines. Isn't this an authorization issue?
I think we're still sticking to the v for the meanwhile. We will not update the SDK. I don't think it's an authorization issue since other API calls for listing experiments and runs work fine. I think something has gone wrong internally either in the internal registry of kubeflow or the database. Redeploying it again might be faster than attempting to fix the the issue. |
Kubeflow requires unique pipeline names when uploading pipeline. Previous implementations relied on deleting the pipeline using the sdk by finding if it exits using either the `client.list_pipelines()` and filtering on the pipeline name or the `client.get_pipeline_id(pipeline_name)` and then finding existing versions from the the id, deleting them before deleting the pipeline. Recent issues started appearing when pipelines were compiled with the v2 SDK. Both functions mentioned above stopped working. Error message: `Failed to list pipelines with context \u0026{0xc0001ee9a0}, options \u0026{100 0xc0012d1880}: InternalServerError: Failed to execute SQL for listing pipelines: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(PARTITION BY PipelineId ORDER BY CreatedAtInSec DESC) rn FROM pipeline_versions' at line 1` The issue is not related to connection to the SQL server where the pipelines servers are used since methods like `client.get_experiments()` seem to work just fine. Also using the kfp v2 sdk seems to be also able to return the pipeline versions. I suspect something changed internally in the database when we were testing things the v2 sdk but it's not clear. The issue should be resolved when starting from a clean slate (new kfp cluster and database). For now, I implemented a fix that requires passing the `pipeline_id` (found in the UI) in order to delete existing pipelines before uploading. I was able to circumvent the api calls that were causing the issue. Not an ideal permanent solution but a temporary workaround.
Kubeflow requires unique pipeline names when uploading pipeline.
Previous implementations relied on deleting the pipeline using the sdk by finding if it exits using either the
client.list_pipelines()
and filtering on the pipeline name or theclient.get_pipeline_id(pipeline_name)
and then finding existing versions from the the id, deleting them before deleting the pipeline.Recent issues started appearing when pipelines were compiled with the v2 SDK. Both functions mentioned above stopped working. Error message:
Failed to list pipelines with context \u0026{0xc0001ee9a0}, options \u0026{100 0xc0012d1880}: InternalServerError: Failed to execute SQL for listing pipelines: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(PARTITION BY PipelineId ORDER BY CreatedAtInSec DESC) rn FROM pipeline_versions' at line 1
The issue is not related to connection to the SQL server where the pipelines servers are used since methods like
client.get_experiments()
seem to work just fine. Also using the kfp v2 sdk seems to be also able to return the pipeline versions. I suspect something changed internally in the database when we were testing things the v2 sdk but it's not clear. The issue should be resolved when starting from a clean slate (new kfp cluster and database).For now, I implemented a fix that requires passing the
pipeline_id
(found in the UI) in order to delete existing pipelines before uploading. I was able to circumvent the api calls that were causing the issue. Not an ideal permanent solution but a temporary workaround.