Latest pip prevents installation with Python > 3.9 due to cchardet #331

@mementum

Issue #276 was opened for Python 3.11, but the problem also affects other setups, for example installing under Windows with Python 3.10.

The current determination of whether to use cchardet or chardet happens here

try:
    import cchardet as chardet
except ImportError:
    import chardet

And it is then used here

return body.decode(chardet.detect(body)['encoding'])

The current requirements.txt is somewhat inconsistent: it allows skipping the installation of cchardet for Python 3.11, and at the same time lists chardet (the non-C version, i.e. pure Python) as a requirement for all versions.

cchardet==2.1.7; python_version < '3.11'
mock==5.0.1; python_version < '3.8'
certifi==2022.12.7
cffi==1.15.1
chardet==3.0.4
charset-normalizer==2.1.1

And charset-normalizer, as suggested in issue #276, should be used as a replacement for cchardet. Surprisingly, it is listed as a requirement but is not used.
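To make the effect of the marker on the cchardet line concrete, here is a minimal sketch that approximates `python_version < '...'` with a numeric comparison. The helper names (`parse_version`, `marker_applies`, `cchardet_required`) are made up for illustration, and this is not how pip evaluates markers internally; it only mirrors the comparison for plain `major.minor` strings. The bound `'3.10'` is the one suggested further down in this issue.

```python
def parse_version(text):
    """Turn '3.10' into (3, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def marker_applies(python_version, upper_bound):
    """Approximate the marker `python_version < '<upper_bound>'` (hypothetical helper)."""
    return parse_version(python_version) < parse_version(upper_bound)


def cchardet_required(upper_bound, versions=("3.8", "3.9", "3.10", "3.11")):
    """Map each interpreter version to whether the cchardet pin would apply."""
    return {version: marker_applies(version, upper_bound) for version in versions}


if __name__ == "__main__":
    # Current requirements.txt: cchardet is still demanded on Python 3.10.
    print("current  (< 3.11):", cchardet_required("3.11"))
    # Bound suggested in this issue: cchardet is skipped from Python 3.10 on.
    print("proposed (< 3.10):", cchardet_required("3.10"))
```

With the current bound, Python 3.10 still falls under the cchardet pin, which matches the installation failure reported here.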

Suggested actions

  1. Drop cchardet as a hard requirement for anything above Python 3.9, because wheels are only available up to that version, i.e.
    change cchardet==2.1.7; python_version < '3.11' to cchardet==2.1.7; python_version < '3.10'

  2. Change the imports to also import charset-normalizer and use it. The detect function is chardet compatible.

       try:
           import cchardet as chardet
       except ImportError:
           try:
               import charset_normalizer as chardet
           except ImportError:
               import chardet

    This is actually redundant, given that charset-normalizer has already been added as a hard requirement to requirements.txt. A better approach would be

       try:
           import cchardet as chardet
       except ImportError:
           import charset_normalizer as chardet

    And drop the pure-Python chardet from requirements.txt altogether. Both blacksheep and charset-normalizer support Python 3.8 through 3.11 (the lowest common denominator, taking blacksheep as the restriction).

The detect portion of the code needs no change.
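For illustration, the fallback import and the decode call can be sketched together as below. This is only a sketch under assumptions: `load_detector`, `decode_body`, and `CANDIDATES` are hypothetical names, not part of the project, and the guard for a missing encoding is an extra of this sketch rather than something this issue asks for (the `detect` functions of these libraries can report `None` as the encoding, and `bytes.decode(None)` raises `TypeError`).

```python
import importlib
from types import SimpleNamespace

# Preference order proposed in this issue: the C extension first,
# then charset-normalizer as the pure-Python fallback.
CANDIDATES = ("cchardet", "charset_normalizer")


def load_detector(candidates=CANDIDATES):
    """Return the first importable module exposing detect(), or None."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue
        if hasattr(module, "detect"):
            return module
    return None


def decode_body(body, detector, fallback="utf-8"):
    """Decode bytes with the detected encoding, falling back if detection fails."""
    encoding = None
    if detector is not None:
        encoding = detector.detect(body).get("encoding")
    try:
        return body.decode(encoding or fallback)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or wrong guess: decode leniently instead of raising.
        return body.decode(fallback, errors="replace")


# Stand-in detector so the sketch runs without any detection library installed.
stub_detector = SimpleNamespace(detect=lambda data: {"encoding": "latin-1"})

if __name__ == "__main__":
    detector = load_detector() or stub_detector
    print(decode_body("café".encode("latin-1"), detector))
```

Because both libraries expose a chardet-compatible `detect`, the calling code stays the same whichever module ends up being imported.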

Should you need a pull request, don't hesitate to say so and I will submit one.
