Skip to content
This repository was archived by the owner on May 17, 2024. It is now read-only.

Allow for Snowflake key-pair authentication with --dbt #450

Closed
wants to merge 4 commits into from
Closed

Allow for Snowflake key-pair authentication with --dbt #450

wants to merge 4 commits into from

Conversation

jakedave
Copy link

@jakedave jakedave commented Mar 20, 2023

Adds support for key-pair authentication with Snowflake with dbt; namely supports dbt profile options:

...
private_key_passphrase: <...>
private_key_path: <...>
# or
private_key: <...>

Running locally with the correct key-pair provides message (indicating a successful authentication):

❯ poetry run python3 -m data_diff --dbt
Found 1 successful model runs from the last dbt command.
[15:07:12] ERROR - 002003 (02000): SQL compilation error:
Database 'XXX' does not exist or not authorized.

And with incorrect key-pair:

❯ poetry run python3 -m data_diff --dbt
Found 1 successful model runs from the last dbt command.
[15:49:02] ERROR - Bad decrypt. Incorrect password?

While this works, this code does not handle situations where a user incorrectly provides both a password and a key/key path; if both are provided, the password is preferred.
The code, prior to this contribution, does not appear to take a stance on error handling with respect to different types of authentication; I am hoping a maintainer can provide an opinion on this. I am happy to add unit tests and modify the existing code after the route forward has been decided

@jakedave
Copy link
Author

Handles issue: #442

@dlawin dlawin self-requested a review March 21, 2023 00:28
@dlawin dlawin added enhancement New feature or request --dbt Issues/features related to the dbt integration labels Mar 21, 2023
data_diff/dbt.py Outdated
if credentials.get("password") is not None:
conn_info["password"] = credentials.get("password")
else:
conn_info["private_key"] = self._get_snowflake_private_key(credentials)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really appreciate the contribution @jakedave

Want to make sure you saw some of the downstream snowflake code, which is currently in a separate repo (likely to change).

https://github.com/datafold/sqeleton/blob/6c532733e1a3efe329b573691d3d9d18dcf4f410/sqeleton/databases/snowflake.py#L166

If we set conn_info["key"], that if statement will eventually fire with some code similar to _get_snowflake_private_key

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the callout - I was not aware of this code!

To be clear, are you asking that I follow the pattern outlined there for error-handling? Happy to follow that example!

Copy link
Contributor

@dlawin dlawin Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More calling out that this change would not be DRY as is.

_get_snowflake_private_key should not be necessary and we should limit the connection code in dbt.py to only putting the needed values in conn_info, as much of the key functionality already exists. If we'd like to extend that functionality, we should do so in this file/repo

Copy link
Author

@jakedave jakedave Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, so remove the private key restriction that is imposed in dbt.py and handle downstream.

It seems by the way the code flows that a wider refactor would be required - it attempts to connect with the provided credentials before ever getting to the downstream code I believe that code is not called in the case of dbt

Our company is trying to evaluate this product; we are unable to do so while data-diff does not support key pair auth for dbt

Would it be possible to contribute this as is? I'm happy to provide further contribution in terms of DRYing out the code in the future

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying it's creating a snowflake connection somewhere other than here? https://github.com/datafold/sqeleton/blob/6c532733e1a3efe329b573691d3d9d18dcf4f410/sqeleton/databases/snowflake.py#L182

I may not be following, I can take a closer look later today

Copy link
Author

@jakedave jakedave Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an expert in this but it appears that in the main function, dbt has its own branch of logic which calls the DBT parser's connector as opposed to sqeleton's connector. I don't think the code you linked is called in the case of dbt specifically

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Jake, it definitely is, the entry point for the connection creation starts here. Eventually making its way to sqeleton

table1 = connect_to_table(

I can take a closer look at this later today, and implement the changes I'm describing if that helps.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, I was about to recorrect myself. Following the inheritance as an outsider here is non-trivial. I can keep looking but if you get to it that would be great too

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dlawin is the plan to refactor the sqeleton code while removing exceptions related to the presence of a priv key here? as is, sqeleton does not allow you to specify a keyphrase for a priv key and seems like it could benefit from at least the _get_snowflake_private_key function

@jakedave
Copy link
Author

jakedave commented Mar 22, 2023

@dlawin I made a PR which I tested locally via poetry add --editable ../sqeleton-fork. I've made some other modifications to this code as to make this compatible with non-dbt auth; happy to push that up too

It's quick and dirty - like I mentioned, just trying to get this usable for my org

@dlawin
Copy link
Contributor

dlawin commented Mar 31, 2023

Hey Jake, we want to get this in the next release (soon) so I ran with a few PRs of my own:
#468
datafold/sqeleton#34
#469

Appreciate your contributions here!

@dlawin dlawin closed this Mar 31, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
--dbt Issues/features related to the dbt integration enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants