[DOP-33014] Remove job.tag_values not sent by integration #377
Change Summary
Upgrading Spark/Airflow/OL/etc. leads to both the old and the new versions being stored as job tags. To fix that, for job tags we now keep only the new tag_values and delete the old ones, because they are outdated. The exception is a job received without any tags: this is expected for parent jobs, so in that case the existing tags are left intact.
This is not the case for dataset tags: they can come from multiple sources, and there is no way to determine whether a tag value is outdated.
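The rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: it assumes tag values can be modeled as (name, value) pairs, and the function names, the `TagValue` alias and the sample tag names are all hypothetical.

```python
from __future__ import annotations

# Hypothetical model: a tag value is a (tag name, value) pair.
TagValue = tuple[str, str]


def reconcile_job_tag_values(
    stored: set[TagValue],
    received: set[TagValue],
) -> tuple[set[TagValue], set[TagValue]]:
    """Return (to_add, to_delete) for a single job."""
    if not received:
        # Job arrived without any tags (expected for parent jobs):
        # leave the stored tags intact.
        return set(), set()
    # Keep only what the integration sent; everything else is outdated.
    return received - stored, stored - received


def reconcile_dataset_tag_values(
    stored: set[TagValue],
    received: set[TagValue],
) -> tuple[set[TagValue], set[TagValue]]:
    """Return (to_add, to_delete) for a single dataset."""
    # Dataset tags may come from several sources, so nothing is deleted.
    return received - stored, set()


if __name__ == "__main__":
    old = {("spark.version", "3.4.0")}
    new = {("spark.version", "3.5.1")}
    print(reconcile_job_tag_values(old, new))
    print(reconcile_job_tag_values(old, set()))
    print(reconcile_dataset_tag_values(old, new))
```

With these assumptions, a version upgrade replaces the old job tag value, an empty incoming set changes nothing, and dataset tag values only accumulate.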
Ideally, this should be implemented as a run_tag_value table, but there are issues with this approach, and there is little user demand for a per-run tags feature.
Related issue number
Checklist
docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing the change (see CONTRIBUTING.rst for details).