Package download interrupted then pip reports invalid md5 hash #4930

Closed
tomzo opened this issue Dec 18, 2017 · 4 comments
Labels
auto-locked (Outdated issues that have been locked by automation)
C: download (About fetching data from PyPI and other sources)

Comments

tomzo commented Dec 18, 2017

  • Pip version: 9.0.1
  • Python version: tried on 2.7 and 3.5
  • Operating system: Linux
  • Devpi-server: 4.3.0

Description:

I am running devpi-server as a cache for public packages because I have a slow internet connection. This works fine except for the pandas package, which is unusually large (20-24 MB). When the server cache is empty, and the download is therefore slow, the download is interrupted at about 85%-95%. pip then does not report a timeout error; instead, it computes the hash of the part it did download and reports an md5 hash mismatch.
I am not sure whether this is caused by a pip timeout or by a bug in devpi.

What I've run:

When the server cache is cold, I can reproduce the error by running pip install or pip download.
Notice that the download is interrupted at a different point each time:

pip3 download pandas==0.20.3
Collecting pandas==0.20.3
  Downloading http://devpi.ai-traders.com/root/pypi/+f/a0f/df9f6d772ed81/pandas-0.20.3-cp35-cp35m-manylinux1_x86_64.whl (24.0MB)
    89% |############################    | 21.5MB 43.7MB/s eta 0:00:01
THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    pandas==0.20.3 from http://devpi.ai-traders.com/root/pypi/+f/a0f/df9f6d772ed81/pandas-0.20.3-cp35-cp35m-manylinux1_x86_64.whl#md5=a0fdf9f6d772ed8168ec6ce555ac7aba:
        Expected md5 a0fdf9f6d772ed8168ec6ce555ac7aba
             Got        d01f88a8b5728ce6cc6648762056c379

pip download pandas==0.20.3
Collecting pandas==0.20.3
  Downloading http://devpi.ai-traders.com/root/pypi/+f/a0f/df9f6d772ed81/pandas-0.20.3-cp35-cp35m-manylinux1_x86_64.whl (24.0MB)
    92% |#############################   | 22.2MB 26.4MB/s eta 0:00:01
THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    pandas==0.20.3 from http://devpi.ai-traders.com/root/pypi/+f/a0f/df9f6d772ed81/pandas-0.20.3-cp35-cp35m-manylinux1_x86_64.whl#md5=a0fdf9f6d772ed8168ec6ce555ac7aba:
        Expected md5 a0fdf9f6d772ed8168ec6ce555ac7aba
             Got        885ebdc266f92e3a580c93082b4f5a84

Install fails too:

pip3 install pandas==0.20.3
Collecting pandas==0.20.3
  Downloading http://devpi.ai-traders.com/root/pypi/+f/a0f/df9f6d772ed81/pandas-0.20.3-cp35-cp35m-manylinux1_x86_64.whl (24.0MB)
    82% |##########################      | 19.9MB 47.2MB/s eta 0:00:01
THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    pandas==0.20.3 from http://devpi.ai-traders.com/root/pypi/+f/a0f/df9f6d772ed81/pandas-0.20.3-cp35-cp35m-manylinux1_x86_64.whl#md5=a0fdf9f6d772ed8168ec6ce555ac7aba:
        Expected md5 a0fdf9f6d772ed8168ec6ce555ac7aba
             Got        55784939b2225b3653c4cecf13ac0fc5

Then I can run the download with a higher timeout and it passes:

pip download pandas==0.21.1 --timeout 180
Collecting pandas==0.21.1
  Downloading http://devpi.ai-traders.com/root/pypi/+f/664/ee5e5a3af1850/pandas-0.21.1-cp35-cp35m-manylinux1_x86_64.whl (25.7MB)
    100% |################################| 25.7MB 48kB/s 
  Saved ./pandas-0.21.1-cp35-cp35m-manylinux1_x86_64.whl
Collecting python-dateutil>=2 (from pandas==0.21.1)
  File was already downloaded /ide/work/python_dateutil-2.6.1-py2.py3-none-any.whl
Collecting numpy>=1.9.0 (from pandas==0.21.1)
  File was already downloaded /ide/work/numpy-1.13.3-cp35-cp35m-manylinux1_x86_64.whl
Collecting pytz>=2011k (from pandas==0.21.1)
  File was already downloaded /ide/work/pytz-2017.3-py2.py3-none-any.whl
Collecting six>=1.5 (from python-dateutil>=2->pandas==0.21.1)
  File was already downloaded /ide/work/six-1.11.0-py2.py3-none-any.whl
Successfully downloaded pandas python-dateutil numpy pytz six

But, strangely, running install with a higher timeout fails:

pip install pandas==0.21.1 --timeout 180
Collecting pandas==0.21.1
  Downloading http://devpi.ai-traders.com/root/pypi/+f/664/ee5e5a3af1850/pandas-0.21.1-cp35-cp35m-manylinux1_x86_64.whl (25.7MB)
    32% |##########                      | 8.5MB 61.0MB/s eta 0:00:01
THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    pandas==0.21.1 from http://devpi.ai-traders.com/root/pypi/+f/664/ee5e5a3af1850/pandas-0.21.1-cp35-cp35m-manylinux1_x86_64.whl#md5=664ee5e5a3af1850ce715ba6b7ededf7:
        Expected md5 664ee5e5a3af1850ce715ba6b7ededf7
             Got        d6d0fbd859cafe420e992405858bf81c

Once the server cache was populated, pip install succeeded.
I tested all of the above multiple times and can reproduce it consistently.

@pradyunsg
Member

I think this is proper-ish behaviour. It seems that the network link/server is not very reliable and, when pip is unable to download the files properly, for whatever reason, that is caught by the hash check and the installation/download is aborted.
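
To illustrate why a cut-off download would show up as a hash mismatch, and why the "Got" value would differ from run to run, here is a small offline sketch. It uses a synthetic payload as a stand-in for the wheel, and the cut-off fractions are only illustrative values loosely based on the percentages in the logs above. It does not reproduce pip's internals.

```python
import hashlib

# Synthetic 1 MiB payload standing in for the wheel (assumption: any bytes will do).
payload = bytes(range(256)) * 4096

# The digest an index would publish for the complete file.
expected = hashlib.md5(payload).hexdigest()


def md5_of_prefix(data, fraction):
    """Hash only the first `fraction` of `data`, as if the transfer stopped there."""
    cut = int(len(data) * fraction)
    return hashlib.md5(data[:cut]).hexdigest()


# Illustrative cut-off points; 1.0 means the transfer completed.
digests = {f: md5_of_prefix(payload, f) for f in (0.82, 0.89, 0.92, 1.0)}

for fraction, digest in digests.items():
    status = "match" if digest == expected else "MISMATCH"
    print(f"{fraction:>4.0%} of bytes -> {digest} {status}")
```

Each truncated prefix hashes to a different value, and only the complete payload matches the expected digest.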

Could you try and see if you are able to reproduce the same timeout + not-matching-hash behaviour with a simple requests-based script to download the same file? If so, I think this'll be an upstream issue.
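
A minimal sketch of such a script, under a few assumptions: the helper names `md5_of_chunks` and `download_and_check` are made up for this example, the 15-second timeout is chosen to mirror pip's default, and the comparison against `Content-Length` assumes the server sends that header and does not apply a content encoding.

```python
import hashlib


def md5_of_chunks(chunks):
    """Return (md5 hex digest, total byte count) for an iterable of byte chunks."""
    digest = hashlib.md5()
    total = 0
    for chunk in chunks:
        digest.update(chunk)
        total += len(chunk)
    return digest.hexdigest(), total


def download_and_check(url, expected_md5, timeout=15):
    """Stream `url` with requests and report size and md5 against expectations."""
    # Third-party dependency; imported here so md5_of_chunks works without it.
    import requests

    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        declared = response.headers.get("Content-Length")
        # Hash the body while it streams, 64 KiB at a time.
        actual_md5, received = md5_of_chunks(
            response.iter_content(chunk_size=64 * 1024)
        )

    return {
        "declared_bytes": int(declared) if declared is not None else None,
        "received_bytes": received,
        "md5_matches": actual_md5 == expected_md5,
    }
```

It could be called with the wheel URL and the md5 from the `#md5=` fragment in the pip output above. If `received_bytes` comes back smaller than `declared_bytes` without an exception being raised, the truncation is likely happening below pip, in the HTTP stack or on the server side.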

@pradyunsg pradyunsg added C: download About fetching data from PyPI and other sources S: awaiting response Waiting for a response/more information labels Jan 19, 2018
@pradyunsg pradyunsg added the S: needs triage Issues/PRs that need to be triaged label May 11, 2018
@pradyunsg
Member

Closing due to a lack of a response.

@pradyunsg pradyunsg removed the S: needs triage Issues/PRs that need to be triaged label Jul 15, 2018

Starxyz commented Aug 13, 2018

@pradyunsg thx

lock bot commented Jun 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 2, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 2, 2019
@pradyunsg pradyunsg removed the S: awaiting response Waiting for a response/more information label Mar 17, 2023