@amirmor1
Contributor

Fix a failure when attempting a partial update of a Dataplex Data Quality task: the operator always raised AirflowException because mandatory fields were missing.



amirmor1 and others added 5 commits November 14, 2024 16:18
When we update a Dataplex data quality task using the DataplexCreateOrUpdateDataQualityScanOperator, it first tries to create the task, and only if that fails with an AlreadyExists exception does it try to update the task. If you provide only partial parameters for the update (rather than replacing the entire data scan properties), the create attempt fails with AirflowException `Error creating Data Quality scan` because the DataScan is missing mandatory parameters, so the task is never updated.

I've added a check: if update_mask is not None, the operator does the update first; otherwise it tries to create the task.
I also moved the update section into a private function so it can be reused both for this check and later, when we do a full update of the task after AlreadyExists.
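
The control flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the provider's code: `FakeHook`, `ScanError`, the local `AlreadyExists` class, `REQUIRED_FIELDS`, `_update_scan`, and `create_or_update` are all hypothetical stand-ins, and the validation order inside `FakeHook.create` is an assumption chosen to mimic the failure described.

```python
class AlreadyExists(Exception):
    """Hypothetical stand-in for the API's 'already exists' error."""


class ScanError(Exception):
    """Hypothetical stand-in for AirflowException."""


# Assumed set of mandatory fields, for illustration only.
REQUIRED_FIELDS = {"data", "data_quality_spec"}


class FakeHook:
    """In-memory stand-in for a hook that talks to the service."""

    def __init__(self):
        self.scans = {}

    def create(self, scan_id, body):
        # Assumption: a body with missing mandatory fields is rejected
        # before the existence check, which is why a partial body never
        # reached the update path in the old flow.
        missing = REQUIRED_FIELDS - body.keys()
        if missing:
            raise ScanError(f"Error creating Data Quality scan: missing {sorted(missing)}")
        if scan_id in self.scans:
            raise AlreadyExists(scan_id)
        self.scans[scan_id] = dict(body)

    def update(self, scan_id, body, update_mask=None):
        if scan_id not in self.scans:
            raise ScanError(f"Error updating Data Quality scan: {scan_id} not found")
        if update_mask is None:
            # Full update: replace every property of the scan.
            self.scans[scan_id] = dict(body)
        else:
            # Partial update: touch only the fields named in the mask.
            for field in update_mask:
                self.scans[scan_id][field] = body[field]


def _update_scan(hook, scan_id, body, update_mask):
    """Shared update helper, reused by both branches below."""
    hook.update(scan_id, body, update_mask=update_mask)
    return "updated"


def create_or_update(hook, scan_id, body, update_mask=None):
    # New behaviour: when an update mask is given, update directly.
    if update_mask is not None:
        return _update_scan(hook, scan_id, body, update_mask)
    # Otherwise keep the old flow: create first, update on AlreadyExists.
    try:
        hook.create(scan_id, body)
        return "created"
    except AlreadyExists:
        return _update_scan(hook, scan_id, body, None)
```

With this shape, a call such as `create_or_update(hook, "scan", {"description": "x"}, update_mask=["description"])` goes straight to the update helper instead of failing in the create step.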
@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Nov 21, 2024
Member

@potiuk potiuk left a comment

Nice catch!

@potiuk potiuk merged commit b22e3c1 into apache:main Nov 22, 2024
got686-yandex pushed a commit to got686-yandex/airflow that referenced this pull request Jan 30, 2025
* 44012 - Update index.rst

* Fix Dataplex Data Quality Task partial update

* add empty line for lint

* add test to verify update when update_mask is not none

---------

Co-authored-by: Amir Mor <amir.mor26@gmail.com>