Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openlineage: add support for system tests, add support for example_bigquery_queries system test #31717

Closed

Conversation

mobuchowski
Copy link
Contributor

@mobuchowski mobuchowski commented Jun 5, 2023

The idea for system tests is as follows:

  • add VariableTransport that will push OpenLineage events to Variable with key <DAG_ID>.<TASK_ID>.event.<EVENT_TYPE>.
  • System tests DAGs will be configured to use this transport
  • then, newly added OpenLineageTestOperator will compare the expected events with the ones that are stored in Variables.

This PR adds those components, as well as OL support for BigQueryGetDataOperator and OL support for system test utilizing it.

PR is based on previous PR: #31293 - only second commit is relevant to this PR.

Closes: #29676

@@ -235,11 +236,21 @@
trigger_rule=TriggerRule.ALL_DONE,
)

openlineage_test = OpenlineageTestOperator(
Copy link
Member

@pankajkoti pankajkoti Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if this is the right place to keep OpenLineage system tests or should we have it in its own home tests/system/providers/openlineage/...? WDYT?

Wondering if some config disables OpenLineage or uses a different lineage backend then would having this here affect the system test run?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's advantageous to have OL system tests as an addition to regular ones, because they are just an addition of one operator, and actively use rest of the dag.

Those aren't only "OpenLineage system tests" in vacuum, but test OL integration of this particular provider.

Having separate dags for system tests would result in fragmentation - we'd have to maintain the same dag, or at least subset of it, with just the addition of a single OpenLineageTestOperator.

However, I believe option to disable OpenLineage and make this special operator always pass is a very valid need, and will add such option.

airflow/providers/openlineage/extractors/base.py Outdated Show resolved Hide resolved
@@ -235,11 +236,21 @@
trigger_rule=TriggerRule.ALL_DONE,
)

openlineage_test = OpenlineageTestOperator(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@mobuchowski mobuchowski force-pushed the openlineage-system-tests branch 4 times, most recently from 61718b5 to 351086a Compare July 24, 2023 10:34
Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
@mobuchowski
Copy link
Contributor Author

Will reopen later

@mobuchowski mobuchowski closed this Sep 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement system tests that confirm OpenLineage integration in selected provider works
3 participants