At the time of writing, Azure Purview supports only a few data processing systems; in practice, however, data engineers often use third-party tools or custom scripts for data processing. Azure Purview provides a way to add the lineage information that is missing in such cases.
This section walks through setting up custom lineage between two Azure Data Lake Storage Gen2 accounts in Azure Purview.
The next step is to assign a role to the application identity in Azure Purview. Azure Purview doesn't allow unauthenticated access to the service, so every API request must carry an authorization token.
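For readers who would rather script the token request than use Postman, here is a minimal Python sketch. It assumes the application identity authenticates with the Azure AD client-credentials flow, and that the token endpoint and the `https://purview.azure.net` resource value match what the "01 Get Token" request uses; both are assumptions, since the Postman collection is not reproduced here. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) the client-credentials token request."""
    # Assumed Azure AD v1 token endpoint; adjust if your tenant uses another one.
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumed resource identifier for Azure Purview.
        "resource": "https://purview.azure.net",
    }
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def get_token(tenant_id, client_id, client_secret):
    """Send the request and return the bearer token from the JSON response."""
    request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

Calling `get_token(...)` with your own Tenant_ID, Client_Id, and Client_secret should return the same kind of token that Postman shows; keeping the request builder separate makes it easy to inspect the request without touching the network.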
Copy the file "Azure Purview API Request.postman_collection.json" from API_requests and import it into Postman.
1) Open "01 Get Token", set Tenant_ID, Client_Id, and Client_secret, then click Send.
2) Copy the GUIDs of both storage assets, as shown in the image below.
3) Open "02 Create Lineage" in Postman, paste the token from step 1 under Auth, and set the input and output GUIDs in the inputs and outputs sections of the JSON in the Body tab. Click Send.
4) On a successful response, open the Lineage tab of the source or destination connection in the Azure portal. You should see lineage like that in the image below.
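The lineage request from the steps above can also be sketched in Python. This is an illustration under assumptions, not a copy of the "02 Create Lineage" request: it assumes Purview exposes the Apache Atlas v2 entity endpoint at `https://<account>.purview.azure.com/catalog/api/atlas/v2/entity`, and that the lineage is modelled as an entity of the built-in Atlas type `Process` whose `inputs` and `outputs` reference the two GUIDs. The function names, the account name, and the `name` and `qualifiedName` values are hypothetical.

```python
import json
import urllib.request


def build_lineage_payload(input_guid, output_guid, name, qualified_name):
    """Describe a Process entity that links one input asset to one output asset."""
    return {
        "entity": {
            "typeName": "Process",  # assumed: built-in Atlas process type
            "attributes": {
                "name": name,
                "qualifiedName": qualified_name,
                "inputs": [{"guid": input_guid}],
                "outputs": [{"guid": output_guid}],
            },
        }
    }


def build_lineage_request(account_name, token, payload):
    """Build (but do not send) the authorized POST request."""
    # Assumed Purview Atlas endpoint; verify it against your account.
    url = f"https://{account_name}.purview.azure.com/catalog/api/atlas/v2/entity"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def create_lineage(account_name, token, payload):
    """Send the request and return the parsed JSON response."""
    request = build_lineage_request(account_name, token, payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, build the payload with the two GUIDs copied in step 2, pass the token from step 1 to `create_lineage`, and then check the Lineage tab as in step 4.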
Thanks for reading and trying it out. Happy learning! If you have any questions or feedback, please open a PR.