docs(snowflake): update tag propagation automation doc #11747

Open
wants to merge 1 commit into
base: master
60 changes: 58 additions & 2 deletions docs/automations/snowflake-tag-propagation.md
@@ -9,11 +9,30 @@ import FeatureAvailability from '@site/src/components/FeatureAvailability';
Snowflake Tag Propagation is an automation that allows you to sync DataHub Glossary Terms and Tags on
both columns and tables back to Snowflake. This automation is available in DataHub Cloud (Acryl) only.

This automation is bidirectional: it works in unison with the Snowflake Ingestion Source to ensure that Tags which are ingested can
be synced back to Snowflake, and that Tags which are synced to Snowflake can be ingested back into DataHub correctly.

## Capabilities

- Automatically Add DataHub Glossary Terms to Snowflake Tables and Columns
- Automatically Add DataHub Tags to Snowflake Tables and Columns
- Automatically Add DataHub Glossary Terms to Snowflake Tables and Columns as Snowflake Tags
- Automatically Add DataHub Tags to Snowflake Tables and Columns as Snowflake Tags
- Automatically Remove DataHub Glossary Terms and Tags from Snowflake Tables and Columns when they are removed in DataHub
- Any tags that were previously provisioned by DataHub will be ingested as their original DataHub Tags or Glossary Terms.
- Any tags that were ingested from Snowflake into DataHub will be able to sync back into Snowflake as Tags (so long as you configure them to sync back)

## Caveats

- Currently, renaming a Tag or Glossary Term in DataHub will not rename the corresponding Snowflake Tag. The existing tag will continue to be used and applied. You can manually change or remove the old tag in Snowflake if needed.
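
If you do decide to change or remove the old tag by hand, the statements involved can be sketched as below. This is a minimal, hedged example that only builds SQL strings using Snowflake's `ALTER TAG ... RENAME TO` and `DROP TAG` commands; it does not connect to Snowflake. The helper names and the example tag names are hypothetical, and the default database and schema assume the tag was provisioned under `DATAHUB.SYNCED_TAGS` (see Tag Provisioning in Snowflake). Copy the exact Snowflake tag name from your account, since this page does not describe how DataHub names the tags it creates.

```python
def quote_ident(name: str) -> str:
    """Quote a Snowflake identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def rename_tag_statement(old: str, new: str,
                         database: str = "DATAHUB",
                         schema: str = "SYNCED_TAGS") -> str:
    """Build an ALTER TAG ... RENAME TO statement for one tag (hypothetical helper)."""
    prefix = f"{database}.{schema}"
    return (f"ALTER TAG IF EXISTS {prefix}.{quote_ident(old)} "
            f"RENAME TO {prefix}.{quote_ident(new)};")


def drop_tag_statement(name: str,
                       database: str = "DATAHUB",
                       schema: str = "SYNCED_TAGS") -> str:
    """Build a DROP TAG statement for one tag (hypothetical helper)."""
    return f"DROP TAG IF EXISTS {database}.{schema}.{quote_ident(name)};"


if __name__ == "__main__":
    # "pii" and "sensitive" are placeholder tag names, not names DataHub generates.
    print(rename_tag_statement("pii", "sensitive"))
    print(drop_tag_statement("pii"))
```

Review the generated statements before running them in a Snowflake worksheet with a role that owns the tag.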

### Tag Provisioning in Snowflake

Tags in Snowflake are associated with a single Database or Schema. To avoid creating duplicate tags in every Database or Schema,
DataHub will automatically sync new tags into a Database named `DATAHUB`, in a Schema named `SYNCED_TAGS`.

By default, we'll attempt to provision this Database and Schema if they do not exist. This requires that the service
account used for the Automation has the privileges to create databases and schemas. If you do not want to grant
these privileges, you can create the `DATAHUB` database and `SYNCED_TAGS` schema manually before enabling the automation.
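
For the manual route, the setup might look roughly like the statements generated below. This is a sketch under stated assumptions, not the documented procedure: the role name `DATAHUB_AUTOMATION_ROLE` is a placeholder, the helper is hypothetical, and the specific grants (`USAGE` and `CREATE TAG`) are an assumption about what the service account needs, since this page does not list the exact privileges. The script only prints SQL; it does not connect to Snowflake.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a Snowflake identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def provisioning_statements(role: str) -> List[str]:
    """Build SQL that pre-creates DATAHUB.SYNCED_TAGS (hypothetical helper).

    The grants are an assumed minimum for a role that creates tags in the
    schema; confirm the real requirements before relying on them.
    """
    role_sql = quote_ident(role)
    return [
        "CREATE DATABASE IF NOT EXISTS DATAHUB;",
        "CREATE SCHEMA IF NOT EXISTS DATAHUB.SYNCED_TAGS;",
        f"GRANT USAGE ON DATABASE DATAHUB TO ROLE {role_sql};",
        f"GRANT USAGE ON SCHEMA DATAHUB.SYNCED_TAGS TO ROLE {role_sql};",
        f"GRANT CREATE TAG ON SCHEMA DATAHUB.SYNCED_TAGS TO ROLE {role_sql};",
    ]


if __name__ == "__main__":
    # Placeholder role name; substitute the role used by your service account.
    for statement in provisioning_statements("DATAHUB_AUTOMATION_ROLE"):
        print(statement)
```

Run the printed statements as an administrator once, then enable the automation with the less-privileged service account.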


## Enabling Snowflake Tag Sync

@@ -85,3 +104,40 @@ You can view propagated Tags (and corresponding DataHub URNs) inside the Snowfla
<p align="center">
<img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/automation/saas/snowflake-tag-propagation/view-snowflake-tags.png"/>
</p>

## FAQ

### Can I provide a custom name for the default `DATAHUB` database or `SYNCED_TAGS` schema where Tags are created?

Not yet. We may add support for this in a future release, but as of today it cannot be changed via the UI.

### Can I create Tags in the same database and schema where the corresponding tables / columns are defined?

Currently, no. We may add support for this based on demand, but as of today it cannot be changed via the UI.
For now, all Tags that are provisioned by DataHub in Snowflake will exist under the special `DATAHUB.SYNCED_TAGS` schema.

### If I create a Tag on DataHub and apply it to a table or column, will it be synced back to Snowflake?

Yes. When you create a new Tag in DataHub and apply it to a table or column, it will be synced back to Snowflake as a new Tag within the
`DATAHUB.SYNCED_TAGS` schema.

### If I ingest my existing Tags from Snowflake into DataHub, will they be synced back to Snowflake after adding them to a table or column in DataHub?

Yes. When ingesting, we mint a new DataHub Tag from the Snowflake Tag. When that Tag is applied to columns or tables within DataHub, it will be synced back to Snowflake as the original Snowflake Tag.

### If I create a Tag on DataHub, and apply it to a table or column, and then it syncs back to Snowflake, and finally I add the tag to a table or column in _Snowflake_, will ingestion reflect the new relationship properly?

Yes. When you add a Tag that was created by DataHub to a table or column in Snowflake, it will be ingested back into DataHub as the same Tag or Glossary Term. This allows you to update the Tag associations
in either Snowflake or DataHub, although we recommend choosing one source of truth system to maintain sanity.

### If I rename a Tag or Glossary Term in DataHub, will the new name be reflected in Snowflake?

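No. Currently, renaming a Tag or Glossary Term in DataHub does not rename the corresponding Snowflake Tag, so a name change made in DataHub will not be reflected in Snowflake. The existing Snowflake Tag will continue to be used and applied; you can manually change or remove it in Snowflake if needed.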

### I've enabled the automation, but I don't see tags being created. Why not?

A few things to try:

1. Verify your Snowflake connection details & credentials are correct.
2. Ensure that the service account provided has the ability to create databases and schemas in Snowflake, if you did not create the `DATAHUB` database and `SYNCED_TAGS` schema manually.
3. If you have specific Tags or Glossary Terms selected in the automation configuration, ensure that you are testing with only those Tags and Glossary Terms.
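
To check the second point from the Snowflake side, a few `SHOW` commands are usually enough. The sketch below only assembles those statements as strings; the helper and the role name are hypothetical, and you would run the output yourself in a Snowflake worksheet while using the service account's role.

```python
from typing import List


def diagnostic_statements(role: str) -> List[str]:
    """Build read-only SHOW statements for inspecting the setup (hypothetical helper)."""
    role_sql = '"' + role.replace('"', '""') + '"'
    return [
        # Does the target database exist and is it visible to the current role?
        "SHOW DATABASES LIKE 'DATAHUB';",
        # Does the target schema exist inside it?
        "SHOW SCHEMAS LIKE 'SYNCED_TAGS' IN DATABASE DATAHUB;",
        # Which tags have been created there so far?
        "SHOW TAGS IN SCHEMA DATAHUB.SYNCED_TAGS;",
        # Which privileges does the service account's role hold?
        f"SHOW GRANTS TO ROLE {role_sql};",
    ]


if __name__ == "__main__":
    # Placeholder role name; substitute the role used by your service account.
    for statement in diagnostic_statements("DATAHUB_AUTOMATION_ROLE"):
        print(statement)
```

An empty result for the first two statements suggests the database or schema was never provisioned, which points back to the privileges described in step 2.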