textwrap should treat Unicode em-dash like ASCII em-dash #74865

Open
jonathaneunice mannequin opened this issue Jun 15, 2017 · 4 comments
Labels: 3.10 (only security fixes), stdlib (Python modules in the Lib dir), type-feature (a feature request or enhancement)

Comments

@jonathaneunice
Mannequin

jonathaneunice mannequin commented Jun 15, 2017

BPO 30680
Nosy @bitdancer, @methane, @jonathaneunice
PRs
  • gh-74865: textwrap support for true (Unicode) em-dashes #2224
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2017-06-15.19:09:00.240>
    labels = ['type-feature', 'library', '3.10']
    title = 'textwrap should treat Unicode em-dash like ASCII em-dash'
    updated_at = <Date 2020-10-21.04:33:09.340>
    user = 'https://github.com/jonathaneunice'

    bugs.python.org fields:

    activity = <Date 2020-10-21.04:33:09.340>
    actor = 'methane'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2017-06-15.19:09:00.240>
    creator = 'jonathaneunice'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30680
    keywords = []
    message_count = 4.0
    messages = ['296124', '296126', '296127', '379189']
    nosy_count = 3.0
    nosy_names = ['r.david.murray', 'methane', 'jonathaneunice']
    pr_nums = ['2224']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue30680'
    versions = ['Python 3.10']

    @jonathaneunice
    Mannequin Author

    jonathaneunice mannequin commented Jun 15, 2017

    The textwrap module goes to great lengths to "do the right thing" when it finds the ASCII simulation of an em-dash (two or more consecutive hyphens), but it does nothing to recognize and similarly treat true (Unicode) em-dashes (aka '\N{EM DASH}', '\u2014', or U+2014). Real em-dashes should get at least as good a treatment as simulated em-dashes.
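A small sketch of the difference described above, using only the public `textwrap.wrap` function. The sample strings and the width are illustrative choices, not taken from the issue, and the comments describe what CPython produces without the proposed change.

```python
import textwrap

# Illustrative inputs (not from the issue): the same phrase with an
# ASCII-simulated em-dash and with a true Unicode em-dash (U+2014).
ascii_text = "alpha--beta"
unicode_text = "alpha\N{EM DASH}beta"

# break_long_words=False keeps textwrap from cutting words mid-way,
# so any line break must come from a recognized break opportunity.
ascii_lines = textwrap.wrap(ascii_text, width=7, break_long_words=False)
unicode_lines = textwrap.wrap(unicode_text, width=7, break_long_words=False)

print(ascii_lines)    # ['alpha--', 'beta']: "--" is a break opportunity
print(unicode_lines)  # a single item: the whole string, longer than width
```

With the simulated em-dash the text is split right after `--`; with U+2014 the whole string is treated as one unbreakable word and overflows the requested width.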

    @jonathaneunice (mannequin) added the 3.7 (EOL), stdlib, and type-feature labels on Jun 15, 2017
    @bitdancer
    Member

    This seems sensible to me (I haven't looked at the PR; I'm talking about adding the support). When textwrap was written, Python was pretty ASCII-oriented, so it is not too much of a surprise that Unicode em dashes were not supported.

    @jonathaneunice
    Mannequin Author

    jonathaneunice mannequin commented Jun 15, 2017

    Agreed. It makes great sense that textwrap started as highly ASCII-centric. But in the Python 3, Unicode-friendly era, ASCII-biased isn't where we should leave things.

    @methane
    Member

    methane commented Oct 21, 2020

    > Agreed. It makes great sense that textwrap started as highly ASCII-centric. But in the Python 3, Unicode-friendly era, ASCII-biased isn't where we should leave things.

    It needs Unicode experts. If we support Unicode, we should implement UAX #14.
    http://www.unicode.org/reports/tr14/tr14-45.html

    But I am not sure any core developer loves textwrap and Unicode enough to implement it.
    It can be implemented in a third-party package before it is added to the stdlib.

    So is U+2014 really important to implement even though we cannot implement UAX #14 in the foreseeable future?
    It doesn't make sense to me.
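For readers wondering what the third-party route might look like, here is a minimal sketch. It is not the approach taken in PR #2224 and it is nowhere near UAX #14. The class name `EmDashTextWrapper` is hypothetical, and the sketch relies on overriding `TextWrapper._split`, a private method that could change between Python versions.

```python
import re
import textwrap

EM_DASH = "\N{EM DASH}"  # U+2014


class EmDashTextWrapper(textwrap.TextWrapper):
    """Hypothetical subclass: makes U+2014 its own chunk, like "--"."""

    def _split(self, text):
        # Assumption: _split is a private hook of TextWrapper; overriding
        # it is not a supported extension point.
        chunks = []
        for chunk in super()._split(text):
            # The capturing group keeps each em-dash as a separate chunk,
            # so the wrapper may break the line on either side of it.
            parts = re.split(f"({EM_DASH})", chunk)
            chunks.extend(part for part in parts if part)
        return chunks


wrapper = EmDashTextWrapper(width=7, break_long_words=False)
lines = wrapper.wrap(f"alpha{EM_DASH}beta")
print(lines)  # two lines: "alpha" plus the em-dash, then "beta"
```

Unlike the built-in rule for `--`, which only applies between word characters, this sketch splits on every U+2014 regardless of context, so it is a rough approximation rather than an equivalent treatment.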

    @methane added the 3.10 label and removed the 3.7 (EOL) label on Oct 21, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022