Home
wjohnson edited this page Sep 10, 2021
·
6 revisions
Welcome to the pyapacheatlas wiki!
The purpose of this package is to make it easy to work with the Apache Atlas REST API without having to learn its nuances. The package can also read an Excel file and extract entities, lineage, column mappings, and type definitions, so you can get metadata into your data catalog without digging into Atlas internals.
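To give a sense of the kind of nuance the package hides, here is a rough standard-library sketch of the JSON body that the Atlas v2 bulk entity endpoint expects. The helper name `build_bulk_payload`, the `hive_table` type, and the qualified name are illustrative assumptions and are not part of pyapacheatlas.

```python
import json


def build_bulk_payload(entities):
    """Wrap simple entity dicts in the envelope sent to
    POST /api/atlas/v2/entity/bulk.

    Hypothetical helper for illustration only; pyapacheatlas builds
    this structure for you.
    """
    payload = {"entities": [], "referredEntities": {}}
    for offset, entity in enumerate(entities, start=1):
        payload["entities"].append({
            "typeName": entity["typeName"],
            # A negative guid is a placeholder for an entity that
            # has not been created in Atlas yet.
            "guid": str(-offset),
            "attributes": {
                "name": entity["name"],
                "qualifiedName": entity["qualifiedName"],
            },
        })
    return payload


# Example values are made up for this sketch.
payload = build_bulk_payload([
    {
        "typeName": "hive_table",
        "name": "customers",
        "qualifiedName": "example://warehouse/customers",
    }
])
print(json.dumps(payload, indent=2))
```

Even for a single table, the request needs a type name, a placeholder guid, and a unique qualified name, which is the bookkeeping the client and entity classes are meant to take care of.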
The package is broken up into several submodules:
- auth
  - Provides azure-identity (Managed Identity, Azure CLI), ServicePrincipal, and Basic authentication (for Apache Atlas) support.
- core
  - Provides an `AtlasClient` or `PurviewClient` to your Apache Atlas backed service.
  - Provides `AtlasEntity` and `AtlasProcess` classes to make it easier to work with an Entity and Process type.
  - Provides Entity and Relationship TypeDef support.
  - Provides a "What If" validator to help check if your entities are valid against a provided set of type defs.
- readers
  - A reader aids in extracting entities and types from standardized formats. Currently, the `ExcelReader` is the only provided reader. However, the `Reader` base class could be extended to support other formats you need.
  - A reader has a few standardized methods that take in a template you have filled in and produce a batch of entities, custom lineage, column mappings, or type definitions.
  - The `parse_update_lineage` function reads an Excel file's UpdateLineage tab, extracts your Process types, and prepares the metadata to be uploaded to Atlas or Purview.
  - The `parse_bulk_entities` function lets you define entities with attributes and their relationship to other entities (e.g. define a table, columns, and the connection between them).
  - The `parse_entity_defs` and `parse_classification_defs` functions extract entity and classification definitions, respectively.
  - You can generate an Excel template with the required headers by running `python -m pyapacheatlas --make-template ./template.xlsx` in the terminal.
- scaffolding
  - Create a type definition "payload" that provides the table, column, table lineage process, column lineage process, table to column relationship, and table lineage to column lineage relationship.
  - Import it with `from pyapacheatlas.scaffolding import column_lineage_scaffolding`.
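The reader workflow described in the list above can be sketched roughly as follows. This assumes a template generated with `--make-template` and filled in on its bulk entities tab, and it assumes that `parse_bulk_entities` returns a dict with an `entities` list of entity dicts. The helper names `read_bulk_entities` and `count_by_type` are made up for this example.

```python
from collections import Counter


def read_bulk_entities(template_path):
    """Parse the bulk entities tab of a filled-in Excel template."""
    # Imported inside the function so this sketch can be loaded even
    # where pyapacheatlas is not installed.
    from pyapacheatlas.readers import ExcelConfiguration, ExcelReader

    reader = ExcelReader(ExcelConfiguration())
    return reader.parse_bulk_entities(template_path)


def count_by_type(parsed):
    """Count parsed entities per typeName as a quick sanity check.

    Assumes `parsed` looks like {"entities": [{"typeName": ...}, ...]}.
    """
    return dict(Counter(e["typeName"] for e in parsed.get("entities", [])))


# Typical use, assuming ./template.xlsx exists and has been filled in:
#   parsed = read_bulk_entities("./template.xlsx")
#   print(count_by_type(parsed))
```

Checking the per-type counts before uploading is a cheap way to catch a mistyped `typeName` in the spreadsheet. The parsed result would then typically be passed to the client's `upload_entities` method.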
Thank you for your interest in using PyApacheAtlas! Please be sure to take a look at the more detailed pages in the wiki to get more specific information on the Excel Reader and Azure Purview Tips.