[Docs] Clarify usage of include_package_data/package_data/exclude_package_data on package data files #4643

Merged (21 commits) on Sep 26, 2024
Changes from 18 commits
104 changes: 76 additions & 28 deletions docs/userguide/datafiles.rst
@@ -2,6 +2,16 @@
Data Files Support
====================

In the Python ecosystem, the term "data files" is used in various complex scenarios
and can have nuanced meanings. For the purposes of this documentation,
we define "data files" as non-Python files that are installed alongside Python
modules and packages on the user's machine when they install a
:term:`distribution <Distribution Package>` from either a source distribution
or a binary distribution (for example, a ``.whl`` file).

These files are typically intended for use at **runtime** by the package itself or
to influence the behavior of other packages or systems.
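
As a minimal sketch of what runtime use can look like, the snippet below reads a data file through the standard library's ``importlib.resources`` (Python 3.9 or later). The helper name ``read_package_text`` is hypothetical and introduced only for illustration.

```python
from importlib.resources import files


def read_package_text(package: str, filename: str) -> str:
    """Return the text of a data file shipped inside an importable package."""
    # files() returns a Traversable rooted at the package's directory,
    # so this works without hard-coding a filesystem path.
    return files(package).joinpath(filename).read_text(encoding="utf-8")
```

Assuming a package named ``mypkg`` that ships ``data1.txt`` is installed, ``read_package_text("mypkg", "data1.txt")`` would return the contents of that file.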

Old packaging installation methods in the Python ecosystem
have traditionally allowed installation of "data files", which
are placed in a platform-specific location. However, the most common use case
@@ -19,10 +29,11 @@ Configuration Options

.. _include-package-data:

include_package_data
--------------------
1. ``include_package_data``
---------------------------

First, you can use the ``include_package_data`` keyword.

For example, if the package tree looks like this::

project_root_directory
@@ -35,16 +46,34 @@ For example, if the package tree looks like this::
├── data1.txt
└── data2.txt

and you supply this configuration:
When **at least one** of the following conditions is met:

1. These files are included via the :ref:`MANIFEST.in <Using MANIFEST.in>` file,
like so::

include src/mypkg/*.txt
include src/mypkg/*.rst

2. They are being tracked by a revision control system such as Git, Mercurial
or SVN, **AND** you have configured an appropriate plugin such as
:pypi:`setuptools-scm` or :pypi:`setuptools-svn`.
(See the section below on :ref:`Adding Support for Revision
Control Systems` for information on how to configure such plugins.)

then all the ``.txt`` and ``.rst`` files will be included in
the source distribution.
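
One way to confirm this is to inspect a built source distribution. The following is a rough sketch using only the standard library's ``tarfile`` module; the helper name ``sdist_data_files`` is hypothetical, and it assumes the sdist is a gzip-compressed tarball.

```python
import tarfile
from pathlib import PurePosixPath


def sdist_data_files(sdist_path: str) -> list[str]:
    """List the .txt and .rst files contained in a .tar.gz sdist."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        # Only regular files are of interest, not directories or links.
        names = [member.name for member in archive.getmembers() if member.isfile()]
    return sorted(
        name for name in names if PurePosixPath(name).suffix in {".txt", ".rst"}
    )
```

Calling it on an archive from the ``dist`` directory should list entries such as ``src/mypkg/data1.txt`` (prefixed by the top-level sdist folder) when one of the conditions above holds.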

To also include them in wheels, you can use the
``include_package_data`` keyword:

.. tab:: pyproject.toml

.. code-block:: toml

[tool.setuptools]
# ...
# By default, include-package-data is true in pyproject.toml, so you do
# NOT have to specify this line.
# By default, include-package-data is true in pyproject.toml,
# so you do NOT have to specify this line.
include-package-data = true

[tool.setuptools.packages.find]
@@ -76,33 +105,18 @@ and you supply this configuration:
include_package_data=True
)

then all the ``.txt`` and ``.rst`` files will be automatically installed with
your package, provided:

1. These files are included via the :ref:`MANIFEST.in <Using MANIFEST.in>` file,
like so::

include src/mypkg/*.txt
include src/mypkg/*.rst

2. OR, they are being tracked by a revision control system such as Git, Mercurial
or SVN, and you have configured an appropriate plugin such as
:pypi:`setuptools-scm` or :pypi:`setuptools-svn`.
(See the section below on :ref:`Adding Support for Revision
Control Systems` for information on how to write such plugins.)

.. note::
.. versionadded:: v61.0.0
The default value for ``tool.setuptools.include-package-data`` is ``True``
The default value for ``tool.setuptools.include-package-data`` is ``true``
when projects are configured via ``pyproject.toml``.
This behaviour differs from ``setup.cfg`` and ``setup.py``
(where ``include_package_data=False`` by default), which was not changed
(where ``include_package_data`` is ``False`` by default), which was not changed
to ensure backwards compatibility with existing projects.
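
Because the default differs between configuration styles, it can help to check what actually ended up in a built wheel. The sketch below relies on the fact that a wheel is a ZIP archive and uses the standard library's ``zipfile`` module; the helper name ``wheel_data_files`` is hypothetical.

```python
import zipfile


def wheel_data_files(wheel_path: str) -> list[str]:
    """List wheel members that are neither .py files nor .dist-info metadata."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
    return sorted(
        name
        for name in names
        # Skip Python modules and directory entries.
        if not name.endswith((".py", "/"))
        # Skip the wheel's own metadata directory.
        and ".dist-info/" not in name
    )
```

If ``include_package_data`` is in effect for the example tree, the result should contain entries such as ``mypkg/data1.txt``; an empty list suggests the data files were not picked up.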

.. _package-data:

package_data
------------
2. ``package_data``
-------------------

By default, ``include_package_data`` considers **all** non ``.py`` files found inside
the package directory (``src/mypkg`` in this case) as data files, and includes those that
@@ -172,7 +186,7 @@ file, nor require to be added by a revision control system plugin.

.. note::
If your glob patterns use paths, you *must* use a forward slash (``/``) as
the path separator, even if you are on Windows. Setuptools automatically
the path separator, even if you are on Windows. ``setuptools`` automatically
converts slashes to appropriate platform-specific separators at build time.

.. important::
@@ -271,8 +285,8 @@ we specify that ``data1.rst`` from ``mypkg1`` alone should be captured as well.

.. _exclude-package-data:

exclude_package_data
--------------------
3. ``exclude_package_data``
---------------------------

Sometimes, the ``include_package_data`` or ``package_data`` options alone
aren't sufficient to precisely define what files you want included. For example,
@@ -337,6 +351,40 @@ Any files that match these patterns will be *excluded* from installation,
even if they were listed in ``package_data`` or were included as a result of using
``include_package_data``.

.. _interplay_package_data_keywords:

Interplay between these keywords
--------------------------------

The interplay between these three keywords can be summarized with two
logical conditions. For a data file to be included in the source
distribution, the following condition has to be met::

m or (p and not e)

In plain language, the file must be either:

1. included in ``MANIFEST.in``; or
2. selected by ``package-data`` AND not excluded by ``exclude-package-data``.

For a data file to be included in the wheel (``.whl``)::

(not e) and ((i and m) or p)

In plain language, the file must not be excluded by ``exclude-package-data``
(highest priority), and must be either:

1. selected by ``package-data``; or
2. included in ``MANIFEST.in`` AND built with ``include-package-data = true``.

**Notation**::

i - "include-package-data = true" is set
e - file selected by "exclude-package-data"
p - file selected by "package-data"
m - file included in "MANIFEST.in"
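
The two conditions can also be written as a pair of small Python predicates. This is only an illustrative sketch of the logic stated in this section, not code taken from ``setuptools``; the function names ``in_sdist`` and ``in_wheel`` are hypothetical, and the parameters follow the notation above.

```python
def in_sdist(m: bool, p: bool, e: bool) -> bool:
    """Whether a data file ends up in the source distribution."""
    return m or (p and not e)


def in_wheel(i: bool, m: bool, p: bool, e: bool) -> bool:
    """Whether a data file ends up in the wheel."""
    # An exclusion always wins; otherwise either route is sufficient.
    return (not e) and ((i and m) or p)
```

For example, a file that is only listed in ``MANIFEST.in`` gives ``in_sdist(m=True, p=False, e=False) == True``, but ``in_wheel(i=False, m=True, p=False, e=False) == False`` because ``include-package-data`` is not enabled.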

.. note::
Different versions of ``setuptools`` might behave differently. The above
description applies to versions later than ``58.5.3``. For information
on the behavior of earlier versions and more details, refer to this
`GitHub repository <https://github.com/abravalheri/experiment-setuptools-package-data>`_.

Summary
-------
@@ -450,7 +498,7 @@ With :ref:`package-data`, the configuration might look like this:
}
)

In other words, we allow Setuptools to scan for namespace packages in the ``src`` directory,
In other words, we allow ``setuptools`` to scan for namespace packages in the ``src`` directory,
which enables the ``data`` directory to be identified, and then, we separately specify data
files for the root package ``mypkg``, and the namespace package ``data`` under the package
``mypkg``.