-
Notifications
You must be signed in to change notification settings - Fork 288
Allow for Snowflake key-pair authentication with --dbt
#450
Conversation
Handles issue: #442 |
data_diff/dbt.py
Outdated
if credentials.get("password") is not None: | ||
conn_info["password"] = credentials.get("password") | ||
else: | ||
conn_info["private_key"] = self._get_snowflake_private_key(credentials) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really appreciate the contribution @jakedave
Want to make sure you saw some of the downstream snowflake code, which is currently in a separate repo (likely to change).
If we set conn_info["key"], that if statement will eventually fire with some code similar to _get_snowflake_private_key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the callout - I was not aware of this code!
To be clear, are you asking that I follow the pattern outlined there for error-handling? Happy to follow that example!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More calling out that this change would not be DRY as is.
_get_snowflake_private_key
should not be necessary and we should limit the connection code in dbt.py to only putting the needed values in conn_info, as much of the key functionality already exists. If we'd like to extend that functionality, we should do so in this file/repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, so remove the private key restriction that is imposed in dbt.py
and handle downstream.
It seems by the way the code flows that a wider refactor would be required - it attempts to connect with the provided credentials before ever getting to the downstream code I believe that code is not called in the case of dbt
Our company is trying to evaluate this product; we are unable to do so while data-diff does not support key pair auth for dbt
Would it be possible to contribute this as is? I'm happy to provide further contribution in terms of DRYing out the code in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying it's creating a snowflake connection somewhere other than here? https://github.com/datafold/sqeleton/blob/6c532733e1a3efe329b573691d3d9d18dcf4f410/sqeleton/databases/snowflake.py#L182
I may not be following, I can take a closer look later today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an expert in this but it appears that in the main function, dbt has its own branch of logic which calls the DBT parser's connector as opposed to sqeleton's connector. I don't think the code you linked is called in the case of dbt specifically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Jake, it definitely is, the entry point for the connection creation starts here. Eventually making its way to sqeleton
Line 153 in d21c140
table1 = connect_to_table( |
I can take a closer look at this later today, and implement the changes I'm describing if that helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, I was about to recorrect myself. Following the inheritance as an outsider here is non-trivial. I can keep looking but if you get to it that would be great too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dlawin is the plan to refactor the sqeleton code while removing exceptions related to the presence of a priv key here? as is, sqeleton does not allow you to specify a keyphrase for a priv key and seems like it could benefit from at least the _get_snowflake_private_key
function
Hey Jake, we want to get this in the next release (soon) so I ran with a few PRs of my own: Appreciate your contributions here! |
Adds support for key-pair authentication with Snowflake with dbt; namely supports dbt profile options:
Running locally with the correct key-pair provides message (indicating a successful authentication):
❯ poetry run python3 -m data_diff --dbt Found 1 successful model runs from the last dbt command. [15:07:12] ERROR - 002003 (02000): SQL compilation error: Database 'XXX' does not exist or not authorized.
And with incorrect key-pair:
❯ poetry run python3 -m data_diff --dbt Found 1 successful model runs from the last dbt command. [15:49:02] ERROR - Bad decrypt. Incorrect password?
While this works, this code does not handle situations where a user incorrectly provides both a password and a key/key path; if both are provided, the password is preferred.
The code, prior to this contribution, does not appear to take a stance on error handling with respect to different types of authentication; I am hoping a maintainer can provide an opinion on this. I am happy to add unit tests and modify the existing code after the route forward has been decided