RECORD size and hash do not reflect rewritten shebangs #10744

Closed
1 task done
LukeShu opened this issue Dec 22, 2021 · 6 comments · Fixed by #11052
Labels
state: needs reproducer Need to reproduce issue type: bug A confirmed bug or unintended behavior

Comments

@LukeShu

LukeShu commented Dec 22, 2021

Description

When installing a wheel, pip rewrites the shebang in scripts in {dist}-{ver}.data/scripts/ from #!python to the appropriate path to the Python interpreter. However, the hash and size recorded in RECORD for the script correspond to the original version of the script, not the rewritten version.

Expected behavior

I expected the RECORD to reflect the files that actually get installed, so that an install can be integrity-checked.
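
For context, a RECORD row stores the file's SHA-256 digest as URL-safe base64 with the trailing `=` padding stripped, followed by the size in bytes. The sketch below is only an illustration of why rewriting the shebang invalidates both fields; it is not pip's code. The helper name `record_fields` and the interpreter path `/usr/bin/python` are assumptions made up for the example.

```python
import base64
import hashlib


def record_fields(data: bytes) -> tuple[str, int]:
    """Return the (hash, size) pair as it would appear in a RECORD row."""
    digest = hashlib.sha256(data).digest()
    # RECORD uses URL-safe base64 without the trailing '=' padding.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}", len(data)


# Script as shipped in the wheel, with the placeholder shebang.
original = b"#!python\nprint('hello')\n"
# Script as installed; the interpreter path is a hypothetical example.
installed = original.replace(b"#!python", b"#!/usr/bin/python", 1)

print(record_fields(original))
print(record_fields(installed))
```

The two printed pairs differ in both hash and size, so a RECORD row copied unchanged from the wheel can no longer describe the installed script.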

pip version

20.3.4

Python version

3.9.9

OS

Parabola GNU/Linux-libre (like Arch Linux)

How to Reproduce

  1. Use pip to install a wheel that contains a {dist}-{ver}.data/scripts/* script with a #!python shebang. One such wheel is websocket-client 0.57.0.
  2. Check whether the hash and size of the script match what got recorded in RECORD.

Output

$ wget https://files.pythonhosted.org/packages/4c/5f/f61b420143ed1c8dc69f9eaec5ff1ac36109d52c80de49d66e0c36c3dfdf/websocket_client-0.57.0-py2.py3-none-any.whl

$ pip install --ignore-installed --no-deps --prefix=testdir ./websocket_client-0.57.0-py2.py3-none-any.whl


$ # Observe what the resulting RECORD says
$ grep bin/w testdir/lib/python3.9/site-packages/websocket_client-0.57.0.dist-info/RECORD 
../../../bin/wsdump.py,sha256=S54et6zebnxb2VJcgBadSnvXblK1iBF93ap54hlc5O8,6403

$ # Observe whether this matches the resulting script file
$ sha256sum testdir/bin/wsdump.py | xargs python -c 'import sys, base64; print(base64.b64encode(bytes.fromhex(sys.argv[1])).decode("utf-8"))'
6GQkITdeQFmlpL7/T9+O/X0sWsKeddIZCvwtU0ld+hc=
$ wc -c < testdir/bin/wsdump.py 
6412

$ # Observe whether this matches the original script in the wheel
$ bsdtar xfO websocket_client-0.57.0-py2.py3-none-any.whl websocket_client-0.57.0.data/scripts/wsdump.py | sha256sum | xargs python -c 'import sys, base64; print(base64.b64encode(bytes.fromhex(sys.argv[1])).decode("utf-8"))'
S54et6zebnxb2VJcgBadSnvXblK1iBF93ap54hlc5O8=
$ bsdtar xfO websocket_client-0.57.0-py2.py3-none-any.whl websocket_client-0.57.0.data/scripts/wsdump.py |wc -c
6403
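
The manual comparison above can also be scripted. The following is a minimal sketch of an integrity check that walks every row of a RECORD file, assuming the paths in RECORD are relative to the directory that contains the `.dist-info` directory. The function name `find_record_mismatches` is hypothetical, and the sketch does not handle files that are listed but missing on disk (it would raise `FileNotFoundError`).

```python
import base64
import csv
import hashlib
from pathlib import Path


def find_record_mismatches(record_path: Path) -> list[str]:
    """Return RECORD paths whose on-disk hash or size differs from the row."""
    # Paths in RECORD are relative to the directory containing *.dist-info.
    base_dir = record_path.parent.parent
    mismatches = []
    with record_path.open(newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) != 3:
                continue
            path, hash_field, size_field = row
            if not hash_field:
                # Entries such as RECORD itself carry no hash or size.
                continue
            algo, _, expected = hash_field.partition("=")
            data = (base_dir / path).read_bytes()
            digest = hashlib.new(algo, data).digest()
            actual = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
            size_differs = bool(size_field) and int(size_field) != len(data)
            if actual != expected or size_differs:
                mismatches.append(path)
    return mismatches
```

Pointed at `testdir/lib/python3.9/site-packages/websocket_client-0.57.0.dist-info/RECORD` from the transcript above, a check like this would be expected to report `../../../bin/wsdump.py` on an affected pip version.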


@LukeShu LukeShu added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Dec 22, 2021
@pradyunsg
Member

Is this reproducible with the latest pip release?

@pradyunsg pradyunsg added state: needs reproducer Need to reproduce issue and removed S: needs triage Issues/PRs that need to be triaged labels Dec 23, 2021
@LukeShu
Author

LukeShu commented Dec 23, 2021

> Is this reproducible with the latest pip release?

Whoops, I'm used to thinking that Arch/Parabola have the latest of everything, but I guess Arch has a blocker for upgrading to pip 21.

OK, so using a virtualenv to get the latest pip: yes, it is reproducible with the latest pip release.

$ python -m virtualenv venv                                                                                                                                                                                                      
created virtual environment CPython3.9.9.final.0-64 in 468ms
  creator CPython3Posix(dest=…/venv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/lukeshu/.local/share/virtualenv)
    added seed packages: pip==21.3.1, setuptools==59.2.0, wheel==0.37.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator

$ . venv/bin/activate

(venv) $ pip --version
pip 21.3.1 from …/venv/lib/python3.9/site-packages/pip (python 3.9)

(venv) $ pip install --ignore-installed --no-deps --prefix=testdir ./websocket_client-0.57.0-py2.py3-none-any.whl
Processing ./websocket_client-0.57.0-py2.py3-none-any.whl
Installing collected packages: websocket-client
Successfully installed websocket-client-0.57.0

(venv) $ grep bin/w testdir/lib/python3.9/site-packages/websocket_client-0.57.0.dist-info/RECORD 
../../../bin/wsdump.py,sha256=S54et6zebnxb2VJcgBadSnvXblK1iBF93ap54hlc5O8,6403

@uranusjr
Member

I do wonder what changed in that time; I don't recall anything around this. Maybe this is not even a pip issue but an Arch issue? Could you try versions between 20.3.4 and 21.3.1 and see if the behaviour changed in a particular version?

@pfmoore
Member

pfmoore commented Dec 24, 2021

@uranusjr am I confused here? I thought the OP stated that the behaviour was the same in 20.3.4 and 21.3.1? Which makes this feel like a straight bug. In fact, I just confirmed it with Ubuntu (on WSL) with Python 3.8 and pip 21.3.1.

I suspect this is "simply" a case of us copying RECORD from the wheel without recalculating the hash for scripts where we rewrite the shebang. It's probably not been reported before because (a) scripts are uncommon (most people use entry points these days) and (b) very few people check hashes, I suspect.
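
To make that hypothesis concrete, here is a rough sketch of what "recalculating for rewritten scripts" could look like when the wheel's RECORD rows are carried over to the installed RECORD. This is not pip's implementation; `rehash`, `refresh_rows`, and the `changed` set are hypothetical names used only for illustration.

```python
import base64
import hashlib
from pathlib import Path


def rehash(path: Path) -> tuple[str, str]:
    """Compute the RECORD hash and size fields for a file on disk."""
    data = path.read_bytes()
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
    return "sha256=" + digest.decode("ascii"), str(len(data))


def refresh_rows(rows, changed, base_dir):
    """Copy RECORD rows, recomputing hash and size for paths in `changed`."""
    updated = []
    for path, hash_field, size_field in rows:
        if path in changed:
            # The installer modified this file (e.g. rewrote its shebang),
            # so the values copied from the wheel are stale.
            hash_field, size_field = rehash(Path(base_dir) / path)
        updated.append((path, hash_field, size_field))
    return updated
```

The important detail in a scheme like this is that the installer has to remember which paths it modified; if a rewritten script never makes it into `changed`, its row is copied through untouched, which would produce exactly the symptom reported here.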

@pradyunsg
Member

pradyunsg commented Dec 30, 2021

We do compute these things for generated files.

https://github.com/pypa/pip/blob/main/src/pip/_internal/operations/install/wheel.py#L266

I think we might want to start validating these hashes from wheels at some point, similar to how we'd want to validate the hashes by default in some contexts and get TUF-based protection of all the index interactions. 🤷🏻‍♂️

@pradyunsg
Member

pradyunsg commented Dec 30, 2021

And we check for files that have been changed, so that we can recompute their hashes and sizes.

https://github.com/pypa/pip/blob/main/src/pip/_internal/operations/install/wheel.py#L260

Someone should step through this chunk of code with a debugger to figure out what is going wrong here, because we certainly have code to account for changed files.

inmantaci pushed a commit to inmanta/inmanta-core that referenced this issue Jul 21, 2022
Bumps [pip](https://github.com/pypa/pip) from 22.1.2 to 22.2.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p>
<blockquote>
<h1>22.2 (2022-07-21)</h1>
<h2>Deprecations and Removals</h2>
<ul>
<li>Remove the <code>html5lib</code> deprecated feature flag. (<code>[#10825](pypa/pip#10825) &lt;https://github.com/pypa/pip/issues/10825&gt;</code>_)</li>
<li>Remove <code>--use-deprecated=backtrack-on-build-failures</code>. (<code>[#11241](pypa/pip#11241) &lt;https://github.com/pypa/pip/issues/11241&gt;</code>_)</li>
</ul>
<h2>Features</h2>
<ul>
<li>
<p>Add support to use <code>truststore &lt;https://pypi.org/project/truststore/&gt;</code>_ as an
alternative SSL certificate verification backend. The backend can be enabled on Python
3.10 and later by installing <code>truststore</code> into the environment, and adding the
<code>--use-feature=truststore</code> flag to various pip commands.</p>
<p><code>truststore</code> differs from the current default verification backend (provided by
<code>certifi</code>) in it uses the operating system’s trust store, which can be better
controlled and augmented to better support non-standard certificates. Depending on
feedback, pip may switch to this as the default certificate verification backend in
the future. (<code>[#11082](pypa/pip#11082) &lt;https://github.com/pypa/pip/issues/11082&gt;</code>_)</p>
</li>
<li>
<p>Add <code>--dry-run</code> option to <code>pip install</code>, to let it print what it would install but
not actually change anything in the target environment. (<code>[#11096](pypa/pip#11096) &lt;https://github.com/pypa/pip/issues/11096&gt;</code>_)</p>
</li>
<li>
<p>Record in wheel cache entries the URL of the original artifact that was downloaded
to build the cached wheels. The record is named <code>origin.json</code> and uses the PEP 610
Direct URL format. (<code>[#11137](pypa/pip#11137) &lt;https://github.com/pypa/pip/issues/11137&gt;</code>_)</p>
</li>
<li>
<p>Support <code>PEP 691 &lt;https://peps.python.org/pep-0691/&gt;</code>_. (<code>[#11158](pypa/pip#11158) &lt;https://github.com/pypa/pip/issues/11158&gt;</code>_)</p>
</li>
<li>
<p>pip's deprecation warnings now subclass the built-in <code>DeprecationWarning</code>, and
can be suppressed by running the Python interpreter with
<code>-W ignore::DeprecationWarning</code>. (<code>[#11225](pypa/pip#11225) &lt;https://github.com/pypa/pip/issues/11225&gt;</code>_)</p>
</li>
<li>
<p>Add <code>pip inspect</code> command to obtain the list of installed distributions and other
information about the Python environment, in JSON format. (<code>[#11245](pypa/pip#11245) &lt;https://github.com/pypa/pip/issues/11245&gt;</code>_)</p>
</li>
<li>
<p>Significantly speed up isolated environment creation, by using the same
sources for pip instead of creating a standalone installation for each
environment. (<code>[#11257](pypa/pip#11257) &lt;https://github.com/pypa/pip/issues/11257&gt;</code>_)</p>
</li>
<li>
<p>Add an experimental <code>--report</code> option to the install command to generate a JSON report
of what was installed. In combination with <code>--dry-run</code> and <code>--ignore-installed</code> it
can be used to resolve the requirements. (<code>[#53](pypa/pip#53) &lt;https://github.com/pypa/pip/issues/53&gt;</code>_)</p>
</li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>Fix <code>pip install --pre</code> for packages with pre-release build dependencies defined
both in <code>pyproject.toml</code>'s <code>build-system.requires</code> and <code>setup.py</code>'s
<code>setup_requires</code>. (<code>[#10222](pypa/pip#10222) &lt;https://github.com/pypa/pip/issues/10222&gt;</code>_)</li>
<li>When pip rewrites the shebang line in a script during wheel installation,
update the hash and size in the corresponding <code>RECORD</code> file entry. (<code>[#10744](pypa/pip#10744) &lt;https://github.com/pypa/pip/issues/10744&gt;</code>_)</li>
<li>Do not consider a <code>.dist-info</code> directory found inside a wheel-like zip file
as metadata for an installed distribution. A package in a wheel is (by</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/pypa/pip/commit/8e7e76e60f4e115ea1201bee2f176377a718fce1"><code>8e7e76e</code></a> Bump for release</li>
<li><a href="https://github.com/pypa/pip/commit/b6f6a94e36f10a4535ea5bbdc6b351f62003eede"><code>b6f6a94</code></a> Update AUTHORS.txt</li>
<li><a href="https://github.com/pypa/pip/commit/790725aca3f60c745e33827a6079d9600da373d8"><code>790725a</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/11274">#11274</a> from sbidoul/install-report-note-sbi</li>
<li><a href="https://github.com/pypa/pip/commit/d4b9e187aa7cc5ab14b2339f6171f7f2ea6504e9"><code>d4b9e18</code></a> Add clarifications to the installation report documentation</li>
<li><a href="https://github.com/pypa/pip/commit/b1a01ef762a78af1194958a1c874015eaf81fd04"><code>b1a01ef</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/11265">#11265</a> from finnagin/main</li>
<li><a href="https://github.com/pypa/pip/commit/48bcb0a4ccd30a9d00e58fe58827772e307a7e39"><code>48bcb0a</code></a> reformat to pass pre-commit check</li>
<li><a href="https://github.com/pypa/pip/commit/a7c1fe3bff5655393018c53b448b669b3525515b"><code>a7c1fe3</code></a> Remove utc fixture from tests</li>
<li><a href="https://github.com/pypa/pip/commit/0c574f72905185d62bcca741c813df9bae1d9282"><code>0c574f7</code></a> Remove time import</li>
<li><a href="https://github.com/pypa/pip/commit/246fef19149eea893f1cf3efd53f9b17c94c952f"><code>246fef1</code></a> Remove utc fixture</li>
<li><a href="https://github.com/pypa/pip/commit/c9cb7f4629bdd8c61b792feff6dacb1d2e848d57"><code>c9cb7f4</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/11270">#11270</a> from uranusjr/upgrade-pre-commit-hooks</li>
<li>Additional commits viewable in <a href="https://github.com/pypa/pip/compare/22.1.2...22.2">compare view</a></li>
</ul>
</details>
<br />

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 27, 2022

4 participants