[Docs] Clarify usage of include_package_data/package_data/exclude_package_data
on package data files
#4643
Merged
Commits (21):
1aaddc1 make setuptools inline literal (DanielYang59)
70b5ec5 add clarify of data files (DanielYang59)
0a5667e add index to second level titles (DanielYang59)
a377af0 fix case sensitivity in toml (DanielYang59)
1a4a129 adjust section positioning (DanielYang59)
d1e4b11 fix typo (DanielYang59)
a079e8b revision control system -> version control system (DanielYang59)
46fd8db revert version control -> revision control (DanielYang59)
3a84300 clarify usage of setuptools vs Setuptools (DanielYang59)
735f9ae Merge branch 'main' into clarify-pack-data-doc (DanielYang59)
bbd2167 apply @abravalheri 's suggestion as is first (DanielYang59)
b07bde5 remove source files for simplicity (DanielYang59)
e19f2d6 replace with sdist or wheel (DanielYang59)
cae1e68 add sketch (DanielYang59)
c46d060 update sketch (DanielYang59)
f32b975 update legacy with latest setuptools (DanielYang59)
5481166 fix typo (DanielYang59)
041e23d add note for version difference (DanielYang59)
e447010 remove note on bug behaviour < 58.5.3 (DanielYang59)
5eab47f remove custom notation (DanielYang59)
1595318 Update docs/userguide/datafiles.rst (abravalheri)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,13 +2,14 @@ | |
Data Files Support | ||
==================== | ||
|
||
Old packaging installation methods in the Python ecosystem | ||
have traditionally allowed installation of "data files", which | ||
are placed in a platform-specific location. However, the most common use case | ||
for data files distributed with a package is for use *by* the package, usually | ||
by including the data files **inside the package directory**. | ||
|
||
Setuptools focuses on this most common type of data files and offers three ways | ||
Old packaging installation methods in the Python ecosystem have | ||
traditionally allowed the inclusion of "data files" (files beyond | ||
:ref:`the default set <manifest>` ), which are placed in a platform-specific | ||
location. However, the most common use case for data files distributed | ||
with a package is for use *by* the package, usually by including the | ||
data files **inside the package directory**. | ||
|
||
``Setuptools`` focuses on this most common type of data files and offers three ways | ||
of specifying which files should be included in your packages, as described in | ||
the following section. | ||
|
||
|
@@ -19,10 +20,11 @@ Configuration Options | |
|
||
.. _include-package-data: | ||
|
||
include_package_data | ||
-------------------- | ||
1. ``include_package_data`` | ||
--------------------------- | ||
|
||
First, you can use the ``include_package_data`` keyword. | ||
|
||
For example, if the package tree looks like this:: | ||
|
||
project_root_directory | ||
|
@@ -35,16 +37,34 @@ For example, if the package tree looks like this:: | |
├── data1.txt | ||
└── data2.txt | ||
|
||
and you supply this configuration: | ||
When at least one of the following conditions is met: | ||
||
|
||
1. These files are included via the :ref:`MANIFEST.in <Using MANIFEST.in>` file, | ||
like so:: | ||
|
||
include src/mypkg/*.txt | ||
include src/mypkg/*.rst | ||
|
||
2. They are being tracked by a revision control system such as Git, Mercurial | ||
or SVN, **AND** you have configured an appropriate plugin such as | ||
:pypi:`setuptools-scm` or :pypi:`setuptools-svn`. | ||
(See the section below on :ref:`Adding Support for Revision | ||
Control Systems` for information on how to configure such plugins.) | ||
|
||
then all the ``.txt`` and ``.rst`` files will be included into | ||
the source distribution. | ||
|
||
To further include them into the ``wheels``, you can use the | ||
``include_package_data`` keyword: | ||
|
||
.. tab:: pyproject.toml | ||
|
||
.. code-block:: toml | ||
|
||
[tool.setuptools] | ||
# ... | ||
# By default, include-package-data is true in pyproject.toml, so you do | ||
# NOT have to specify this line. | ||
# By default, include-package-data is true in pyproject.toml, | ||
# so you do NOT have to specify this line. | ||
include-package-data = true | ||
|
||
[tool.setuptools.packages.find] | ||
|
@@ -76,33 +96,18 @@ and you supply this configuration: | |
include_package_data=True | ||
) | ||
|
||
then all the ``.txt`` and ``.rst`` files will be automatically installed with | ||
your package, provided: | ||
|
||
1. These files are included via the :ref:`MANIFEST.in <Using MANIFEST.in>` file, | ||
like so:: | ||
|
||
include src/mypkg/*.txt | ||
include src/mypkg/*.rst | ||
|
||
2. OR, they are being tracked by a revision control system such as Git, Mercurial | ||
or SVN, and you have configured an appropriate plugin such as | ||
:pypi:`setuptools-scm` or :pypi:`setuptools-svn`. | ||
(See the section below on :ref:`Adding Support for Revision | ||
Control Systems` for information on how to write such plugins.) | ||
|
||
.. note:: | ||
.. versionadded:: v61.0.0 | ||
The default value for ``tool.setuptools.include-package-data`` is ``True`` | ||
The default value for ``tool.setuptools.include-package-data`` is ``true`` | ||
when projects are configured via ``pyproject.toml``. | ||
This behaviour differs from ``setup.cfg`` and ``setup.py`` | ||
(where ``include_package_data=False`` by default), which was not changed | ||
(where ``include_package_data`` is ``False`` by default), which was not changed | ||
to ensure backwards compatibility with existing projects. | ||
|
||
.. _package-data: | ||
|
||
package_data | ||
------------ | ||
2. ``package_data`` | ||
------------------- | ||
|
||
By default, ``include_package_data`` considers **all** non ``.py`` files found inside | ||
the package directory (``src/mypkg`` in this case) as data files, and includes those that | ||
|
@@ -172,7 +177,7 @@ file, nor require to be added by a revision control system plugin. | |
|
||
.. note:: | ||
If your glob patterns use paths, you *must* use a forward slash (``/``) as | ||
the path separator, even if you are on Windows. Setuptools automatically | ||
the path separator, even if you are on Windows. ``Setuptools`` automatically | ||
|
||
converts slashes to appropriate platform-specific separators at build time. | ||
|
||
.. important:: | ||
|
@@ -271,8 +276,8 @@ we specify that ``data1.rst`` from ``mypkg1`` alone should be captured as well. | |
|
||
.. _exclude-package-data: | ||
|
||
exclude_package_data | ||
-------------------- | ||
3. ``exclude_package_data`` | ||
--------------------------- | ||
|
||
Sometimes, the ``include_package_data`` or ``package_data`` options alone | ||
aren't sufficient to precisely define what files you want included. For example, | ||
|
@@ -450,7 +455,7 @@ With :ref:`package-data`, the configuration might look like this: | |
} | ||
) | ||
|
||
In other words, we allow Setuptools to scan for namespace packages in the ``src`` directory, | ||
In other words, we allow ``Setuptools`` to scan for namespace packages in the ``src`` directory, | ||
which enables the ``data`` directory to be identified, and then, we separately specify data | ||
files for the root package ``mypkg``, and the namespace package ``data`` under the package | ||
``mypkg``. | ||
|
Thank you for your suggestion. I understand that your main objective was to clarify what “data files” are.
While this intention is very much appreciated and welcome, the proposed parenthetical seems to suggest: data file = file not included in the "default set". This could lead to confusion, as what defines a data file is not fundamentally related to whether a file is included in the default set or not. For example, we could change setuptools to start including .json files automatically, but that would not make them more or less “data file”-y.
Could you please have a look at the following suggestion (which adds a new paragraph before the original text)? Would this address your concerns?
Does this look good to you?
Yes, exactly. This is the opening sentence of the Data Files Support section, so we have to make sure users understand what is considered a "data file" in the first place, so that they can decide whether they even need these keywords at all.

For example, C source code and READMEs are not "Python" files, but we don't need to declare them to include them in the package. Therefore we might need to clarify the following definition of "data files":

Thanks a ton for the clarification! Yes, my sketch would indeed need more discussion and checking.

Personally, I think the following terms are already very confusing: "package data", "data file", "source distribution", "files inside the wheel". Their differences are pretty unclear. Can we somehow define/clarify them, and perhaps avoid introducing more synonyms ("resource files" in this case)?
I have no problems with omitting `resource files`; I just mentioned it before because other packages have started calling them that way (e.g. `importlib_resources`). Please feel free to use and modify my suggestion to align with your vision in the contribution.

Yes, in this case C/README files are not data files, because they don't fit in the remaining part of the suggested definition:
C files are not installed alongside Python files on the user's machine and are also not intended to be used at runtime, right?
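To make the "installed alongside Python files and used at runtime" criterion concrete, here is a minimal sketch of how a package might read one of its own data files through the standard-library `importlib.resources` API (Python 3.9+). The package name `demo_datapkg` and the helper `read_package_text` are hypothetical; the throwaway package is created on disk only so that the example is self-contained, and it does not involve setuptools at all.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_package_text(package: str, filename: str) -> str:
    """Read a text data file that lives inside an importable package."""
    resource = importlib.resources.files(package).joinpath(filename)
    return resource.read_text(encoding="utf-8")


# Build a throwaway package so the sketch can run anywhere.
# "demo_datapkg" is a hypothetical name, not something from the PR.
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_datapkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("", encoding="utf-8")
(pkg_dir / "data1.txt").write_text("hello from data1\n", encoding="utf-8")

# Make the throwaway package importable for this process only.
sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()

content = read_package_text("demo_datapkg", "data1.txt")
print(content)
```

A C source file or a README would normally never be looked up this way by the installed package, which is the distinction being drawn here.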
That is not 100% precise, is it? `MANIFEST.in` determines what goes into the `sdist`, and then the contents of the `sdist` influence what goes into the wheel (especially when `include-package-data = true`, which is the default)... So there is a potential indirect effect there. That is because the build process works more or less like in the mermaidjs diagram below [1]:

Footnotes

1. The "build process" creates distribution artifacts/packages. `sdist` distributions are meant to be platform independent (but may contain varying levels of optimisation, e.g. compiling Python to C via Cython). `wheel` distributions contain the final files meant to be directly copied to the user's machine, and therefore may be platform specific.
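One way to check the sdist-versus-wheel relationship discussed in this thread is to list the non-`.py` entries of both archives and compare them. The sketch below is only an illustration: the helper names (`non_python_members`, `list_sdist`, `list_wheel`) are hypothetical, and the two archives are hand-built stand-ins rather than real build output (a real wheel would also contain a `.dist-info` directory). It relies only on the facts that an sdist is a gzipped tarball and a wheel is a zip file.

```python
import io
import tarfile
import tempfile
import zipfile
from pathlib import Path


def non_python_members(names):
    """Return the sorted entry names that are neither directories nor .py files."""
    return sorted(n for n in names if not n.endswith("/") and not n.endswith(".py"))


def list_sdist(path):
    """List non-.py regular files in an sdist (.tar.gz) archive."""
    with tarfile.open(path, "r:gz") as tar:
        return non_python_members(m.name for m in tar.getmembers() if m.isfile())


def list_wheel(path):
    """List non-.py files in a wheel (.whl, i.e. zip) archive."""
    with zipfile.ZipFile(path) as whl:
        return non_python_members(whl.namelist())


def add_text(tar, name, text):
    """Add an in-memory text file to an open tarfile."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


# Hand-built stand-ins for build artifacts (assumption: not produced by setuptools).
tmp = Path(tempfile.mkdtemp())
sdist_path = tmp / "mypkg-0.1.tar.gz"
wheel_path = tmp / "mypkg-0.1-py3-none-any.whl"

with tarfile.open(sdist_path, "w:gz") as tar:
    add_text(tar, "mypkg-0.1/MANIFEST.in", "include src/mypkg/*.txt\n")
    add_text(tar, "mypkg-0.1/src/mypkg/__init__.py", "")
    add_text(tar, "mypkg-0.1/src/mypkg/data1.txt", "data\n")

with zipfile.ZipFile(wheel_path, "w") as whl:
    whl.writestr("mypkg/__init__.py", "")
    whl.writestr("mypkg/data1.txt", "data\n")

sdist_files = list_sdist(sdist_path)
wheel_files = list_wheel(wheel_path)
print("sdist:", sdist_files)
print("wheel:", wheel_files)
```

Pointing `list_sdist` and `list_wheel` at the archives produced by an actual build of your project should show which data files made it into each artifact.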