@dolfinus
Contributor

@dolfinus dolfinus commented May 28, 2025

The OpenLineage facet for Task declares inlets and outlets as strings, but they are actually lists of strings:

{
    "run": {
        "facets": {
            "airflow": {
                "_producer": "https://github.com/apache/airflow/tree/providers-openlineage/1.11.0",
                "_schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunFacet",
                "task": {
                    "depends_on_past": false,
                    "downstream_task_ids": "['add_period_in_hive']",
                    "executor_config": {},
                    "ignore_first_depends_on_past": true,
                    "inlets": [],
                    "is_setup": false,
                    "is_teardown": false,
                    "mapped": false,
                    "multiple_outputs": false,
                    "operator_class": "PythonOperator",
                    "operator_class_path": "***.operators.python.PythonOperator",
                    "outlets": [],
                    "owner": "***",
                    "priority_weight": 1,
                    "queue": "default",
                    "retries": 2,
                    "retry_exponential_backoff": false,
                    "task_id": "add_params_in_xcom",
                    "trigger_rule": "all_success",
                    "upstream_task_ids": "['Sensors.Sensor__hdp2gp_oebsap_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsar_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsfa_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__postgres2gp_1c_sales_report__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsgl_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebspa_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsinv_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsxtr_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebspay_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsce_dm__oebs-dm-replication-to-greenplum']",
                    "wait_for_downstream": false,
                    "wait_for_past_depends_before_skipping": false,
                    "weight_rule": "<<non-serializable: _DownstreamPriorityWeightStrategy>>"
                }
            }
        }
    }
}

Same for DAG tags:

{
    "run": {
        "facets": {
            "airflow": {
                "_producer": "https://github.com/apache/airflow/tree/providers-openlineage/1.11.0",
                "_schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunFacet",
                "dag": {
                    "dag_id": "control_dm__oebsstatus",
                    "fileloc": "/data/airflow/dags/oebsstatus/master/control_dm.py",
                    "owner": "airflow",
                    "schedule_interval": "00 4 * * *",
                    "tags": [
                        "oebsstatus",
                        "master"
                    ],
                    "timetable": {
                        "expression": "00 4 * * *",
                        "timezone": "UTC"
                    }
                }
            }
        }
    }
}


@mobuchowski
Contributor

@dolfinus this was true in 1.11.0, but not since then; see #41786

@dolfinus
Contributor Author

But why should the tags list be serialized to JSON?

@dolfinus dolfinus closed this May 28, 2025
@dolfinus dolfinus deleted the bugfix/openlineage-inlets-outlets-wrong-type branch May 28, 2025 15:33
@mobuchowski
Contributor

@dolfinus it shouldn't, but it is, and a breaking change would cause problems for some consumers... Unfortunately, I feel that at this point it is easier to deal with this than to change it.
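
For consumers that have to accept both shapes, a minimal sketch of a normalizer could look like the following. It is not part of the provider or of any OpenLineage client; the helper name `as_string_list` is hypothetical, and it assumes the stringified values are either JSON arrays or Python list reprs such as `"['add_period_in_hive']"` from the facet above.

```python
import ast
import json
from typing import Any, List


def as_string_list(value: Any) -> List[str]:
    """Normalize a facet field that may be a real list or a stringified list.

    Hypothetical consumer-side helper, shown only as an illustration.
    """
    # Already a proper list (e.g. "inlets": [] or "tags": ["oebsstatus", "master"]).
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    # Anything that is not a non-empty string is treated as "no values".
    if not isinstance(value, str) or not value.strip():
        return []
    text = value.strip()
    # Try JSON first, then a Python literal (single-quoted list repr).
    for parse in (json.loads, ast.literal_eval):
        try:
            parsed = parse(text)
        except (ValueError, SyntaxError):
            continue
        if isinstance(parsed, (list, tuple)):
            return [str(item) for item in parsed]
        # A scalar was parsed; wrap it so callers always get a list.
        return [str(parsed)]
    # Not parseable at all: treat the raw string as a single entry.
    return [text]
```

With this, `as_string_list("['add_period_in_hive']")` and `as_string_list(["add_period_in_hive"])` yield the same list, so downstream code does not need to know which provider version produced the event.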

@kacpermuda
Contributor

I also wanted to improve that recently in #50399, and we decided not to do it. My idea for now is to use TagsJobFacet instead. I think it's a good idea to move some information from airflowRunFacet into more generic facets. @dolfinus WDYT? I'll probably work on it next week.

@dolfinus
Contributor Author

I agree
