Define and implement core AIP-60 to OpenLineage dataset mapping  #38767

@mobuchowski

We need to decide how and where the mapping will take place.

The natural place for this is the provider: each provider should "take ownership" of a particular AIP-60 URI scheme and be able to convert datasets with that scheme to OpenLineage datasets (with some help from hook code?).

Part of this task is to determine how the OpenLineage provider, given an AIP-60 dataset, would find out which provider is responsible for its URI scheme, and how it would use that information to get the converted dataset.
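One possible shape for this lookup, as a minimal sketch only: a registry keyed by URI scheme, where each provider registers a converter for the scheme it owns. Every name here (`OLDataset`, `register_converter`, `convert`, `postgres_converter`) is hypothetical and not an existing Airflow or OpenLineage API, and the `postgres://host:port/database/schema/table` URI layout is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict
from urllib.parse import urlsplit


@dataclass(frozen=True)
class OLDataset:
    """Hypothetical stand-in for an OpenLineage dataset (namespace + name)."""

    namespace: str
    name: str


# Hypothetical registry: URI scheme -> converter owned by a provider.
_CONVERTERS: Dict[str, Callable[[str], OLDataset]] = {}


def register_converter(scheme: str, converter: Callable[[str], OLDataset]) -> None:
    """Called by a provider to take ownership of a URI scheme."""
    _CONVERTERS[scheme.lower()] = converter


def convert(uri: str) -> OLDataset:
    """Find the converter registered for the URI's scheme and apply it."""
    scheme = urlsplit(uri).scheme.lower()
    try:
        converter = _CONVERTERS[scheme]
    except KeyError:
        raise LookupError(f"no provider registered for scheme {scheme!r}") from None
    return converter(uri)


def postgres_converter(uri: str) -> OLDataset:
    """Illustrative converter; assumes postgres://host:port/database/schema/table."""
    parts = urlsplit(uri)
    return OLDataset(
        namespace=f"postgres://{parts.netloc}",
        name=".".join(parts.path.strip("/").split("/")),
    )


register_converter("postgres", postgres_converter)
```

With this in place, `convert("postgres://db.example.com:5432/mydb/public/users")` would go through `postgres_converter`, while a URI with an unregistered scheme raises `LookupError`. How providers would actually advertise their converters (provider metadata, entry points, or something else) is left open here.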

There is a possibility that in some cases direct hook usage will be needed. Research those cases and determine whether we need a particular hook instance. If so, the HookLineageCollector would accept a (Dataset, Hook) pair; that should be defined as part of #38766.
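To make the (Dataset, Hook) idea concrete, here is a rough sketch of a collector that stores each dataset together with the hook instance that reported it. This is not the interface to be defined in #38766; `SketchDataset`, `SketchHookLineageCollector`, `add_input`, `add_output`, and `FakeHook` are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class SketchDataset:
    """Hypothetical stand-in for an AIP-60 dataset identified by its URI."""

    uri: str


@dataclass
class SketchHookLineageCollector:
    """Hypothetical collector that keeps (dataset, hook) pairs."""

    inputs: List[Tuple[SketchDataset, Optional[Any]]] = field(default_factory=list)
    outputs: List[Tuple[SketchDataset, Optional[Any]]] = field(default_factory=list)

    def add_input(self, dataset: SketchDataset, hook: Optional[Any] = None) -> None:
        # The hook is kept so a converter can later read connection details from it.
        self.inputs.append((dataset, hook))

    def add_output(self, dataset: SketchDataset, hook: Optional[Any] = None) -> None:
        self.outputs.append((dataset, hook))


class FakeHook:
    """Hypothetical hook carrying connection-specific state."""

    def __init__(self, conn_id: str) -> None:
        self.conn_id = conn_id


collector = SketchHookLineageCollector()
hook = FakeHook(conn_id="my_postgres")
collector.add_input(SketchDataset("postgres://db.example.com:5432/mydb/public/users"), hook)
collector.add_output(SketchDataset("s3://my-bucket/export/users.csv"))
```

Keeping the hook optional means cases that can be converted from the URI alone do not need to pass one, while cases that need connection details still have access to the exact instance.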

An additional complication is Object Storage: we need to check whether it is possible to use an AIP-60 URI scheme in object storage even if the corresponding provider is not present.
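One option to evaluate for the missing-provider case is a generic fallback for well-known object storage schemes. The sketch below assumes that the bucket maps to the namespace and the object key maps to the name; `PROVIDER_CONVERTERS`, `OBJECT_STORAGE_SCHEMES`, and both functions are hypothetical and exist only to illustrate the control flow.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import urlsplit

# Hypothetical: converters registered by installed providers, keyed by URI scheme.
PROVIDER_CONVERTERS: Dict[str, Callable[[str], Tuple[str, str]]] = {}

# Hypothetical: schemes that get a generic fallback when no provider is present.
OBJECT_STORAGE_SCHEMES = {"s3", "gs"}


def generic_object_storage_converter(uri: str) -> Tuple[str, str]:
    """Return (namespace, name), assuming scheme://bucket/key."""
    parts = urlsplit(uri)
    return f"{parts.scheme}://{parts.netloc}", parts.path.lstrip("/")


def to_namespace_and_name(uri: str) -> Tuple[str, str]:
    """Prefer a provider's converter; fall back for known object storage schemes."""
    scheme = urlsplit(uri).scheme.lower()
    converter = PROVIDER_CONVERTERS.get(scheme)
    if converter is not None:
        return converter(uri)
    if scheme in OBJECT_STORAGE_SCHEMES:
        return generic_object_storage_converter(uri)
    raise LookupError(f"no converter available for scheme {scheme!r}")
```

Under these assumptions, `to_namespace_and_name("s3://my-bucket/data/file.csv")` falls back to the generic converter when nothing is registered for `s3`. Whether such a fallback is acceptable, or whether conversion should fail without the provider, is exactly the question to research.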

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.

Metadata

Assignees

No one assigned

    Labels

    AIP-62, area:lineage, provider:openlineage, AIP-53
