-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-32248 - Implement importlib.resources #4911
Conversation
Doc/whatsnew/3.7.rst
Outdated
@@ -282,7 +282,14 @@ Other Language Changes | |||
New Modules | |||
=========== | |||
|
|||
* None yet. | |||
importlib.resource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
importlib.resources
*
Doc/library/importlib.rst
Outdated
*resource* may not contain path separators and it may | ||
not have sub-resources (i.e. it cannot be a directory). | ||
:type resource: ``Resource`` | ||
:returns: A context manager for use in a :keyword`with` statement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this line read :returns: A context manager for use in an :keyword:`with` statement.
?
@warsaw I left some review comments of what I seen so far that should be corrected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright looks good to me now.
Doc/library/importlib.rst
Outdated
|
||
Open for binary reading the *resource* within *package*. | ||
|
||
:param package: A package name or module object. See above for the API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of markup seems unusual in sdtlib documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unusual, so I vote to drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even I do not like this markup style in my own code and drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was copied from the standalone version, where it makes more sense and doesn't have the same consistency problems. I am rewriting it without the markup for this branch.
Doc/library/importlib.rst
Outdated
|
||
The following types are defined. | ||
|
||
.. class:: Package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks misleading to describe it as a class, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions on what directive to use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I think it's the first time a stdlib module exposes typing declarations. I'm not sure it's a good idea (see current python-dev discussion).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, typing defines types as classes, so with the lack of a better alternative, I think it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes... technically it's correct. What I mean is that a class is supposed to specify concrete behaviour, but being a type, Package
doesn't have any methods or attributes of its own. So I don't see the point of mentioning it in the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It describes the expected interfaces for arguments in the API, so it seems valuable to me. We'd still have to describe that somewhere and I don't want to have to repeat it for every method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... It might be worth discussing on python-dev to see what a recommended practice can look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ivan said that Guido prefers data::
and that makes more sense in the face of PEP 560/563. So I've made that switch in my branch for consistency. I do still think it makes sense to document them because user code may want to use those types in their own annotations.
Doc/library/importlib.rst
Outdated
|
||
Return an iterator over the contents of the package. The iterator can | ||
return resources (e.g. files) and non-resources (e.g. directories). The | ||
iterator does not recurse into subdirectories. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is one supposed to do with directories?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on how much you want to ignore the API. Basically this is an escape hatch for users if they know they are on a file system and so can actually use the subdirectory results. For others they should filter them out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what's a little confusing may be that contents()
returns str
names (i.e. entries) so you can do with them whatever you want. Anyway, if that's what's confusing here, I'm rewriting this section a bit to make that clearer.
Resource = Union[str, os.PathLike] | ||
|
||
|
||
def _get_package(package) -> ModuleType: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have docstrings on the helper functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
Lib/importlib/resources.py
Outdated
full_path = os.path.join(package_path, resource) | ||
try: | ||
return builtins_open(full_path, mode='rb') | ||
except IOError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit, but the canonical spelling is OSError
nowadays ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that in the original standalone version, but forgot to make that change suggestion. I think it's worth using OSError
here.
return BytesIO(data) | ||
|
||
|
||
def open_text(package: Package, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly the same as open_binary
, so how about sharing a common implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both text and binary used to be implemented as a single function, but they got split out, since it made documenting and reasoning about the API easier. And it's actually kind of tricky to refactor as you suggest because the text versions require different calls to open()
or wrapping in text wrappers. That makes the API for any refactored commonality less readable. I played around with a couple of options but wasn't super happy with them, so I didn't think it was worth it.
|
||
def open_text(package: Package, | ||
resource: Resource, | ||
encoding: str = 'utf-8', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does PEP 8 say about this? I would expect no spaces around the equals sign (i.e. encoding: str='utf-8'
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think PEP 8 says anything about type hints. This is just the style based on the type hint PEPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. But if we take the advice for default parameter values, then there should be no space around the equals :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it's one of those situations where I'm not sure which way to go, so I just went with the PEP that Guido wrote most recently. 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the extra spaces a little jarring to originally, but trying it both ways, I think I like the spaces actually. Type hints make default arguments read differently so I don't like the stronger association to str
than encoding
that the lack of spaces implies. It would be interesting to have that discussion on PEP 8!
|
||
|
||
@contextmanager | ||
def path(package: Package, resource: Resource) -> Iterator[Path]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it return an iterator? Is it the standard way to express context managers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the contextmanager
decorator transforms the function to return a context manager. But the undecorated function itself still only yields Path
instances. Basically it looks weird in source but at least mypy does the right thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this issue tries to get at better syntax for describing things like @contextmanager
s and such.
# contents, and make sure that this entry doesn't have sub-entries. | ||
archive_path = package.__spec__.loader.archive # type: ignore | ||
package_directory = Path(package.__spec__.origin).parent | ||
with ZipFile(archive_path) as zf: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm extremely surprised to see ZIpFile-specific code here. Isn't this whole module supposed to be an abstraction above the various concrete loaders? This seems wrong to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is until we get proper support in zipimport itself (which has not happened yet). It's also an accidental hold-over since the PyPI project has to support back to Python 3.4 and 2.7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. When we get built-in zipimport support, we can drop this code (and the in-stdlib version doesn't need to worry about Pythons < 3.7).
if archive_path is None: | ||
raise | ||
relpath = package_directory.relative_to(archive_path) | ||
with ZipFile(archive_path) as zf: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @pitrou has some legitimate change requests.
@@ -282,7 +282,14 @@ Other Language Changes | |||
New Modules | |||
=========== | |||
|
|||
* None yet. | |||
importlib.resources |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use :mod:`importlib.resources`
in titles so it links to the documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of the other entries in this section (i.e. the "Improved Modules") uses links, so I just went with consistency here.
When you're done making the requested changes, leave the comment: |
I have made the requested changes; please review again |
Thanks for making the requested changes! @brettcannon: please review the changes made to this pull request. |
--------------------------------------- | ||
|
||
.. module:: importlib.resources | ||
:synopsis: Package resource reading, opening, and access |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Block contents in reST directives should start with three spaces, not four.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we have some inconsistency in this file. I just copied it from every other section in that file except the top level .. module:: importlib
.
Lib/importlib/resources.py
Outdated
if reader is not None: | ||
return TextIOWrapper(reader.open_resource(resource), encoding, errors) | ||
# Using pathlib doesn't work well here due to the lack of 'strict' | ||
# argument for pathlib.Path.resolve() prior to Python 3.6. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A later comment mentions that this version of the code (I mean the stdlib version in comparison to the PyPI backport) can be compatible with 3.7+ only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'll fix this by actually using Path objects in the stdlib version (and removing the comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple minor things to make decisions on that will deviate the code even further from the PyPI code in minor ways.
from io import BytesIO, TextIOWrapper | ||
from pathlib import Path | ||
from types import ModuleType | ||
from typing import Iterator, Optional, Set, Union # noqa: F401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we drop the directive comment?
Or better yet, do we even need the typing
-related imports with the "lazy" type hint changes in Python 3.7?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has that landed yet? I tried removing them (except for Union
, which we'd still need because of the Package
and Resource
definitions). Python was not happy. So I guess we can address that again later.
I could remove the #noqa
directive, but it makes my editor happier, so I think I'll leave them in for now.
Lib/importlib/resources.py
Outdated
|
||
|
||
def is_resource(package: Package, name: str) -> bool: | ||
"""True if `name` is a resource inside `package`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'name'
, 'package'
based on what I think of as common practice for marking common-word parameter names in docstrings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Consistency within the project is probably useful, though I've tended to use backticks (i.e. markdown) for these in my own code.
Lib/importlib/resources.py
Outdated
toc = zf.namelist() | ||
relpath = package_directory.relative_to(archive_path) | ||
candidate_path = relpath / name | ||
for entry in toc: # pragma: nobranch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the pragma?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we'll never run coverage over this code inside the stdlib, so I'm okay with that. Plus this code will get replaced once we implement ResourceReader
in zipimport.
Lib/importlib/resources.py
Outdated
subparts = path.parts[len(relpath.parts):] | ||
if len(subparts) == 1: | ||
yield subparts[0] | ||
elif len(subparts) > 1: # pragma: nobranch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pragma
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
K
Looks like Windows is unhappy. |
This reverts commit 3607846. Let's see if this is the cause of the Windows failures.
This ports
importlib_resources
to the stdlib asimportlib.resource
. This includes documentation and tests.I think we should do the
ResourceReader
implementation as a separate branch.https://bugs.python.org/issue32248