-
Notifications
You must be signed in to change notification settings - Fork 1.2k
gdrive: use the normal file backend for saved credentials #5554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c9d6df9
to
9545908
Compare
9545908
to
a9ee9ff
Compare
if self._use_service_account: | ||
temporary_save_path = os.path.join( | ||
tempfile.gettempdir(), "google-creds.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The additional logic is needed because the saved file (cached authorization info) is different than the actual service json. The one that we have secret on the CI is the exported raw credentials json file. But the one pydrive's newer versions expect from this is the cached version. So we have to designate the temporary file as the raw credentials instead of the serialized ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, didn't realise that this is litterally /tmp/google-creds.json
. Indeed, that's seems a bit odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@isidentical Can we keep it in .dvc/tmp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to depend on self.repo
in this logic (which is how we get .dvc/tmp
, self.repo.tmp_dir
) since we intend to remove repo
access. Do you know a better way to get this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned above - I see where it comes from, but I feel we don't need these changes. We can keep everything the way it was before all these changes (cache on CI saved credentials instead of the raw JSON file). Otherwise, if we want to support an env var we should introduce another one, since it has a different semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@isidentical tmp_dir
will need to be passed as an option to the fs, it is fine. We might also switch to using proper global directories for it (e.g. using appdirs) in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@isidentical Wdyt about @shcheklein 's suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cache on CI saved credentials instead of the raw JSON file
Can do it. This would require us to switch the CI variable to a serialized version of this raw file @efiop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to a serialized version of this raw file
not raw but the one that GDrive saves into .dvc/tmp/....
? right? just to be sure we are on the same page...
Otherwise, if we want to support an env var we should introduce another one, since it has a different semantics.
I still like the idea behind this PR (to make it work with the raw file), @isidentical I think it's fine to create a ticket? It just feels that we need to do it bit differently (var name, path?, change docs, etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not raw but the one that GDrive saves into .dvc/tmp/....? right? just to be sure we are on the same page...
I guess we have to find proper names for these 2 different files since it might become very confusing (serialized file / raw file maybe? or cached file / raw file). Yes, that's what I meant by "serialized version of this raw file".
I still like the idea behind this PR (to make it work with the raw file), @isidentical I think it's fine to create a ticket? It just feels that we need to do it bit differently (var name, path?, change docs, etc)
My aim was make the the CI work, so I didn't see this as a feature though can create a ticket.
is_credentials_temp = os.getenv( | ||
GDriveFileSystem.GDRIVE_CREDENTIALS_DATA | ||
) | ||
if self._use_service_account: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GDRIVE_CREDENTIALS_DATA
this way has two difference meanings in different situations. I think this is confusing, complicates the code, etc. Can we just do the same as we had before? If we don't change anything and just revert the CI variable value, I think it should work? Or cache the saved credentials instead of the raw JSON file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that logic a bit weird, considering it would throw a random error when somebody tries to run GDRIVE_CREDENTIALS_DATA=... dvc status -c
in case of ...
is the raw JSON file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, I suggest we do a separate env variable - GDRIVE_SERVICE_ACCOUNT_DARA? It's bad to mix two different meanings into the same variable.
considering it would throw a random error when somebody tries to run
handle and show a nice message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handle and show a nice message?
It is an arbitrary key error :( Probably won't even worth the effort though if we want, I guess the best place is to that is wrapping around it in the pydrive2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
Raises:
ValueError, if the credential type is not :data:`SERVICE_ACCOUNT`.
KeyError, if one of the expected keys is not present in
the keyfile.
handle this in the auth.py
ServiceAuth()
(?) makes total sense
Resolves #5551,
awaiting iterative/PyDrive2#97