fs: simplify auth #211

Merged
merged 3 commits into from
Jul 21, 2022
20 changes: 20 additions & 0 deletions README.rst
@@ -39,6 +39,7 @@ Features of PyDrive2
  classes of each resource to make your program more object-oriented.
- Helps with common operations other than API calls, such as content fetching
  and pagination control.
- Provides an `fsspec`_ filesystem implementation.

How to install
--------------
@@ -125,6 +126,23 @@ File listing pagination made easy
    for file1 in file_list:
        print('title: {}, id: {}'.format(file1['title'], file1['id']))

Fsspec filesystem
-----------------

*PyDrive2* provides an easy way to work with your files through the `fsspec`_
compatible `GDriveFileSystem`_.

.. code:: python

    from pydrive2.fs import GDriveFileSystem

    fs = GDriveFileSystem("root", client_id=my_id, client_secret=my_secret)

    for root, dnames, fnames in fs.walk(""):
        ...

.. _`GDriveFileSystem`: https://docs.iterative.ai/PyDrive2/fsspec/

Concurrent access made easy
---------------------------

@@ -137,3 +155,5 @@ Thanks to all our contributors!

.. image:: https://contrib.rocks/image?repo=iterative/PyDrive2
   :target: https://github.com/iterative/PyDrive2/graphs/contributors

.. _`fsspec`: https://filesystem-spec.readthedocs.io/en/latest/
117 changes: 117 additions & 0 deletions docs/fsspec.rst
@@ -0,0 +1,117 @@
fsspec filesystem
=================

*PyDrive2* provides an easy way to work with your files through the `fsspec`_
compatible `GDriveFileSystem`_.

Installation
------------

.. code-block:: sh

    pip install 'pydrive2[fsspec]'

Local webserver
---------------

.. code-block:: python

    from pydrive2.fs import GDriveFileSystem

    fs = GDriveFileSystem(
        "root",
        client_id="my_client_id",
        client_secret="my_client_secret",
    )

By default, credentials are cached per ``client_id``. If several users share
the same client ID, pass ``profile`` to avoid accidentally using someone
else's cached credentials:

.. code-block:: python

    from pydrive2.fs import GDriveFileSystem

    fs = GDriveFileSystem(
        "root",
        client_id="my_client_id",
        client_secret="my_client_secret",
        profile="myprofile",
    )
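
The caching rule can be sketched as follows. This is a minimal illustration of the layout that ``_client_auth`` in this change appears to use: one directory per ``client_id`` and one file per ``profile``, with ``default`` as the fallback. The helper name ``credentials_cache_path`` and the ``cache_root`` argument are hypothetical; ``cache_root`` stands in for the per-user cache directory that the real code obtains from ``appdirs``.

```python
import os


def credentials_cache_path(cache_root, client_id, profile=None):
    """Return the JSON file where cached credentials would be stored.

    Hypothetical helper for illustration only; not part of PyDrive2.
    """
    # Each client_id gets its own directory, each profile its own file.
    profile = profile or "default"
    return os.path.join(cache_root, client_id, f"{profile}.json")


# Without a profile, every user of this client_id shares one file.
default_path = credentials_cache_path("/tmp/cache", "my_client_id")

# With a profile, the credentials land in a separate file.
alice_path = credentials_cache_path("/tmp/cache", "my_client_id", "alice")
```

Two users who pass different ``profile`` values therefore never read each other's cached credentials, even when they share a client ID.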

To write cached credentials to a file and reuse that file if it already
exists (which avoids interactive auth), pass ``client_json_file_path``:

.. code-block:: python

    from pydrive2.fs import GDriveFileSystem

    fs = GDriveFileSystem(
        "root",
        client_id="my_client_id",
        client_secret="my_client_secret",
        client_json_file_path="/path/to/keyfile.json",
    )

Using cached credentials from a JSON string (avoids interactive auth):

.. code-block:: python

    from pydrive2.fs import GDriveFileSystem

    fs = GDriveFileSystem(
        "root",
        client_id="my_client_id",
        client_secret="my_client_secret",
        client_json=json_string,
    )
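
The snippet above uses ``json_string`` without defining it. One way to obtain it, sketched here under the assumption that the JSON lives in a local file, is to read the file as text and check that it parses. The helper ``load_json_string`` and the placeholder content are hypothetical and not part of PyDrive2.

```python
import json
import os
import tempfile


def load_json_string(path):
    """Read a JSON file into a string, checking that it parses.

    Hypothetical helper for illustration only.
    """
    with open(path, encoding="utf-8") as fobj:
        json_string = fobj.read()
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    json.loads(json_string)
    return json_string


# Demonstration with a throwaway file that holds placeholder content,
# not real credentials.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "creds.json")
    with open(demo_path, "w", encoding="utf-8") as fobj:
        json.dump({"placeholder": "value"}, fobj)
    json_string = load_json_string(demo_path)
```

Validating early means a malformed file surfaces as a JSON error at load time instead of as an authentication failure later.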

Service account
---------------

Using a JSON keyfile path:

.. code-block:: python

    from pydrive2.fs import GDriveFileSystem

    fs = GDriveFileSystem(
        "root",
        use_service_account=True,
        client_json_file_path="/path/to/keyfile.json",
    )

Using a JSON keyfile string:

.. code-block:: python

    from pydrive2.fs import GDriveFileSystem

    fs = GDriveFileSystem(
        "root",
        use_service_account=True,
        client_json=json_string,
    )

Pass ``client_user_email`` if you are using `delegation of authority`_.

Using filesystem
----------------

.. code-block:: python

    for root, dnames, fnames in fs.walk(""):
        for dname in dnames:
            print(f"dir: {root}/{dname}")

        for fname in fnames:
            print(f"file: {root}/{fname}")

The filesystem instance offers many methods for inspecting and manipulating
files. Refer to the fsspec docs on `how to use a filesystem`_.

.. _`fsspec`: https://filesystem-spec.readthedocs.io/en/latest/
.. _`GDriveFileSystem`: /PyDrive2/pydrive2/#pydrive2.fs.GDriveFileSystem
.. _`delegation of authority`: https://developers.google.com/admin-sdk/directory/v1/guides/delegation
.. _`how to use a filesystem`: https://filesystem-spec.readthedocs.io/en/latest/usage.html#use-a-file-system
1 change: 1 addition & 0 deletions docs/index.rst
@@ -44,5 +44,6 @@ Table of Contents
   oauth
   filemanagement
   filelist
   fsspec
   pydrive2
   genindex
175 changes: 173 additions & 2 deletions pydrive2/fs/spec.py
@@ -1,10 +1,12 @@
import appdirs
import errno
import io
import logging
import os
import posixpath
import threading
from collections import defaultdict
from contextlib import contextmanager

from fsspec.spec import AbstractFileSystem
from funcy import cached_property, retry, wrap_prop, wrap_with
@@ -13,11 +15,24 @@

from pydrive2.drive import GoogleDrive
from pydrive2.fs.utils import IterStream
from pydrive2.auth import GoogleAuth

logger = logging.getLogger(__name__)

FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"

COMMON_SETTINGS = {
    "get_refresh_token": True,
    "oauth_scope": [
        "https://www.googleapis.com/auth/drive",
        "https://www.googleapis.com/auth/drive.appdata",
    ],
}


class GDriveAuthError(Exception):
    pass


def _gdrive_retry(func):
    def should_retry(exc):
@@ -49,13 +64,169 @@ def should_retry(exc):
    )(func)


@contextmanager
def _wrap_errors():
    try:
        yield
    except Exception as exc:
        # Handle AuthenticationError, RefreshError and other auth failures
        # It's hard to come up with a narrow exception, since PyDrive throws
        # a lot of different errors - broken credentials file, refresh token
        # expired, flow failed, etc.
        raise GDriveAuthError("Failed to authenticate GDrive") from exc
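
The effect of this context manager on callers can be shown with a small standalone sketch. The exception class and the wrapper below mirror the diff, while `failing_auth` is a hypothetical stand-in for an auth call that fails. Because the wrapper uses `raise ... from exc`, the original error stays reachable through `__cause__`.

```python
from contextlib import contextmanager


class GDriveAuthError(Exception):
    pass


@contextmanager
def wrap_errors():
    try:
        yield
    except Exception as exc:
        # Any failure inside the `with` body is re-raised as one error type.
        raise GDriveAuthError("Failed to authenticate GDrive") from exc


def failing_auth():
    # Hypothetical stand-in for an authentication call that fails.
    raise RuntimeError("refresh token expired")


try:
    with wrap_errors():
        failing_auth()
except GDriveAuthError as err:
    caught = err  # keep a reference; `err` is unbound after the except block
```

A caller only needs to catch `GDriveAuthError`, and can still inspect `caught.__cause__` to find out what actually went wrong.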


def _client_auth(
    client_id=None,
    client_secret=None,
    client_json=None,
    client_json_file_path=None,
    profile=None,
):
    if client_json:
        save_settings = {
            "save_credentials_backend": "dictionary",
            "save_credentials_dict": {"creds": client_json},
            "save_credentials_key": "creds",
        }
    else:
        creds_file = client_json_file_path
        if not creds_file:
            cache_dir = os.path.join(
                appdirs.user_cache_dir("pydrive2fs", appauthor=False),
                client_id,
            )
            os.makedirs(cache_dir, exist_ok=True)

            profile = profile or "default"
            creds_file = os.path.join(cache_dir, f"{profile}.json")

        save_settings = {
            "save_credentials_backend": "file",
            "save_credentials_file": creds_file,
        }

    settings = {
        **COMMON_SETTINGS,
        "save_credentials": True,
        **save_settings,
        "client_config_backend": "settings",
        "client_config": {
            "client_id": client_id,
            "client_secret": client_secret,
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://oauth2.googleapis.com/token",
            "revoke_uri": "https://oauth2.googleapis.com/revoke",
            "redirect_uri": "",
        },
    }

    auth = GoogleAuth(settings=settings)

    with _wrap_errors():
        auth.LocalWebserverAuth()

    return auth


def _service_auth(
    client_user_email=None,
    client_json=None,
    client_json_file_path=None,
):
    settings = {
        **COMMON_SETTINGS,
        "client_config_backend": "service",
        "service_config": {
            "client_user_email": client_user_email,
            "client_json": client_json,
            "client_json_file_path": client_json_file_path,
        },
    }

    auth = GoogleAuth(settings=settings)

    with _wrap_errors():
        auth.ServiceAuth()

    return auth


class GDriveFileSystem(AbstractFileSystem):
    # Removed by this change:
    # def __init__(self, path, google_auth, trash_only=True, **kwargs):
    def __init__(
        self,
        path,
        google_auth=None,
        trash_only=True,
        client_id=None,
        client_secret=None,
        client_user_email=None,
        client_json=None,
        client_json_file_path=None,
        use_service_account=False,
        profile=None,
        **kwargs,
    ):
        """Access to gdrive as a file-system

        :param path: gdrive path.
        :type path: str.
        :param google_auth: Authenticated GoogleAuth instance.
        :type google_auth: GoogleAuth.
        :param trash_only: Move files to trash instead of deleting.
        :type trash_only: bool.
        :param client_id: Client ID of the application.
        :type client_id: str.
        :param client_secret: Client secret of the application.
        :type client_secret: str.
        :param client_user_email: User email that authority was delegated to
            (only for service account).
        :type client_user_email: str.
        :param client_json: JSON keyfile loaded into a string.
        :type client_json: str.
        :param client_json_file_path: Path to JSON keyfile.
        :type client_json_file_path: str.
        :param use_service_account: Use service account.
        :type use_service_account: bool.
        :param profile: Profile name for caching credentials
            (ignored for service account).
        :type profile: str.
        :raises: GDriveAuthError
        """
        super().__init__(**kwargs)
        self.path = path
        self.root, self.base = self.split_path(self.path)

        if not google_auth:
            if (
                not client_json
                and not client_json_file_path
                and not (client_id and client_secret)
            ):
                raise ValueError(
                    "Specify credentials using one of these methods: "
                    "client_id/client_secret or "
                    "client_json or "
                    "client_json_file_path"
                )

            if use_service_account:
                google_auth = _service_auth(
                    client_json=client_json,
                    client_json_file_path=client_json_file_path,
                    client_user_email=client_user_email,
                )
            else:
                google_auth = _client_auth(
                    client_id=client_id,
                    client_secret=client_secret,
                    client_json=client_json,
                    client_json_file_path=client_json_file_path,
                    profile=profile,
                )

        self.client = GoogleDrive(google_auth)
        self._trash_only = trash_only
        # Removed by this change (the call now runs first, above):
        # super().__init__(**kwargs)
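
The credential check and the choice between the two auth helpers in `__init__` can be isolated into a small standalone sketch. The function `select_auth_mode` is hypothetical and exists only for illustration. It reproduces the condition from the diff (rewritten with `or`, which is logically equivalent) and returns a label instead of calling `_service_auth` or `_client_auth`.

```python
def select_auth_mode(
    client_id=None,
    client_secret=None,
    client_json=None,
    client_json_file_path=None,
    use_service_account=False,
):
    """Return "service" or "client", or raise ValueError if nothing usable.

    Hypothetical helper mirroring the branch logic in __init__.
    """
    has_credentials = (
        client_json
        or client_json_file_path
        or (client_id and client_secret)
    )
    if not has_credentials:
        raise ValueError(
            "Specify credentials using one of these methods: "
            "client_id/client_secret or "
            "client_json or "
            "client_json_file_path"
        )
    return "service" if use_service_account else "client"


client_mode = select_auth_mode(
    client_id="my_client_id", client_secret="my_client_secret"
)
service_mode = select_auth_mode(
    client_json_file_path="/path/to/keyfile.json", use_service_account=True
)

try:
    select_auth_mode(client_id="my_client_id")  # client_secret is missing
    rejected = False
except ValueError:
    rejected = True
```

Note that the check, as written in the diff, only requires that some credential source is present. It does not verify that a service account was given a keyfile specifically.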

    def split_path(self, path):
        parts = path.replace("//", "/").rstrip("/").split("/", 1)