Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argparse considers unknown optional arguments with spaces as a known positional argument #66623

Open
denkoren mannequin opened this issue Sep 17, 2014 · 8 comments
Open
Labels
3.10 only security fixes extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@denkoren
Copy link
Mannequin

denkoren mannequin commented Sep 17, 2014

BPO 22433
Nosy @blueyed
PRs
  • bpo-22433: do not consider "--foo='bar baz'" to be a positional argument #20924
  • Files
  • argparse.py.patch: Bugfix for argparse 1.1
  • argparse_bug_example.py: Bug demonstration
  • example.py
  • patch_1.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2014-09-17.15:57:56.123>
    labels = ['extension-modules', 'type-bug', '3.10']
    title = 'Argparse considers unknown optional arguments with spaces as a known positional argument'
    updated_at = <Date 2020-07-06.07:49:35.888>
    user = 'https://bugs.python.org/DenKoren'

    bugs.python.org fields:

    activity = <Date 2020-07-06.07:49:35.888>
    actor = 'terry.reedy'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2014-09-17.15:57:56.123>
    creator = 'DenKoren'
    dependencies = []
    files = ['36641', '36643', '36674', '36704']
    hgrepos = []
    issue_num = 22433
    keywords = ['patch']
    message_count = 6.0
    messages = ['227007', '227009', '227193', '227261', '227310', '227403']
    nosy_count = 4.0
    nosy_names = ['bethard', 'blueyed', 'paul.j3', 'DenKoren']
    pr_nums = ['20924']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue22433'
    versions = ['Python 3.10']

    @denkoren
    Copy link
    Mannequin Author

    denkoren mannequin commented Sep 17, 2014

    Argparse version 1.1 consider ANY unknown argument string containig ' ' (space character) as positional argument. As a result it can use such unknown optional argument as a value of known positional argument.

    Demonstration code:

    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("--known-optional-arg", "-k", action="store_true")
    parser.add_argument("known_positional", action="store", type=str)
    parser.parse_known_args(["--known-optional-arg", "--unknown-optional-arg=with spaces", "known positional arg"])
    parser.parse_known_args(["--known-optional-arg", "--unknown-optional-arg=without_spaces", "known positional arg"])

    Bugfix is attached to issue and affects ArgumentParser._parse_optional() method body.

    Sorry, if it is a better way to report (or, possibly, fix) a bug than this place. It is my first python bug report.

    Thanks.

    @denkoren denkoren mannequin added extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error labels Sep 17, 2014
    @denkoren
    Copy link
    Mannequin Author

    denkoren mannequin commented Sep 17, 2014

    I've signed 'Contributor's Agreement'.

    Bug demonstration code listed in issue description is in file attached to comment.

    It can be run with following command:
    python argparse_bug_example.py

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Sep 20, 2014

    Your patch fails:

    python3 -m unittest test_argparse.TestEmptyAndSpaceContainingArguments
    

    specifically these 2 subcases:

       (['-a badger'], NS(x='-a badger', y=None)),
       (['-y', '-a badger'], NS(x=None, y='-a badger')),
    

    --------------------

    At issue is the ambiguity when the user gives you a string that looks like an optionals flag, but doesn't match one of the defined arguments. When should it be treated as a positional, and when should it be treated as an unknown?

    The code, and test says - if it has the space, treat it like a positional. You are trying to reverse that choice - if it has the prefix, treat it like an unknown optional. At the point where you apply the patch, we already know that the string has a prefixchar. So you are, effectively, eliminating the 'space' test.

    I've added a simpler example of where the presence of the space flips that choice.

    @denkoren
    Copy link
    Mannequin Author

    denkoren mannequin commented Sep 22, 2014

    There is an standard way to solve this ambiguity.

    There is a special marker '--' used to force argument parsing function treat all arguments given in command after this marker as positional arguments. It was invented specially for tasks where you need to force argument parsing engine ignore option indicator and treat argument as positional.

    I know this argument as standard 'de facto', but can't find any documents about '--' interpretation.
    But you can see, that Linux 'libc' standard C library function getopt() knows about this flag:
    http://linux.die.net/man/3/getopt
    So, any utility using this function knows about '--'.

    For example: grep, tee, cat, tail, head, vim, less, rm, rmdir, common shells (sh, bash, csh, dash, ...) and so on. The list is large. Almost any utility except special ones like 'echo' which main function is to print any argument given to utility.

    So, I think, that the true way here is to parse all arguments staring with '-'/'--' and listed before special '--' marker as optional, and treat all of arguments listed after '--' as positional ones and do not try to use any secondary indicators to make a decision: 'optional' or not.

    For example:
    """
    any parameter before '--' special marker starting with '--' is treated as optional
    # python3 example.py --bar="one two" xxx
    (Namespace(pos='xxx'), ['--bar=one two'])

    all parameters after '--' special marker are treated as positional
    # python3 example.py -- --bar="one two" xxx
    (Namespace(pos='--bar=one two'), ['xxx'])
    """

    Yes, my patch does not the solution and argparse should be patched another way.

    Where and how can I get unittests for argparse module to check my code modifications before posting them here? I want to try to modify argparse and produce the patch changing argparse behaviour as described above.

    Thanks.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Sep 22, 2014

    Proposed patches like this are supposed to be generated against the current development version (3.5...), especially if they are 'enhancements' (as opposed to bugs). But there isn't much of a difference in argparse between 2.7+ and 3.4+ (except one nested yield expression).

    Tests are in 'lib/test/...'

    argparse does implement the '--' syntax. That is, anything after it is understood to be positional, regardless of its prefix characters. But before that, optionals and positionals can be freely mixed (within limits).

    I agree that '--foo=one two' looks a lot more like an unknown optional than the test case 'a badger'. Strings with '=' are tested earlier in _parse_optional.

    A fix that passes test_argparse.py and your example is:

            if '=' in arg_string:
                option_string, explicit_arg = arg_string.split('=', 1)
                if option_string in self._option_string_actions:
                    action = self._option_string_actions[option_string]
                    return action, option_string, explicit_arg
                else: # added for unrecognized
                    return None, option_string, explicit_arg
                    # or return None, arg_string, None

    But the '=' case is also tested in the following line:

        option_tuples = self._get_option_tuples(arg_string)

    The obvious difference is that _get_option_tuples handles abbreviations.

    I wonder if the two can be refined to reduce the duplication, and handler your case as well.

    There are other bug issues dealing with multiple '--', the mixing of optionals and positionals, and controlling whether abbreviations are allowed or not. I don't recall any others dealing with strings that contain '=' or space.

    @paulj3
    Copy link
    Mannequin

    paulj3 mannequin commented Sep 24, 2014

    I've added a patch with tests that I think handles this case, without messing with existing test cases. It adds a new 'space' test right before the existing one, one that explicitly handles an '=' arg_string.

    If the space is in the 1st part, it is marked as a positional (as it does in the existing code). But if the space is in the argument portion has no such effect; the arg_string will be handled as an unknown optional.

            if '=' in arg_string:
                option_prefix, explicit_arg = arg_string.split('=', 1)
                if ' ' in option_prefix:
                    return None
                else:
                    return None, arg_string, None

    The new testcase is in the TestParseKnownArgs class (near the end of test_argparse.py), and tests the different ways in which an arg_string can be allocated to a positional or unknown.

    The underlying idea is that elsewhere in '_parse_optional()', an arg_string with '=' is handled just like a two part optional. It first tries an exact match, and then tries an abbreviation match. In that spirit, this space test also focuses on the flag part of the arg_string, not the argument part.

    Much as I like this solution, I still worry about backward compatibility. As discussed in 'http://bugs.python.org/issue9334', 'argparse does not accept options taking arguments beginning with dash (regression from optparse)', compatibility in this _parse_optional() function is a serious issue. There the proposed patch adds a switch to the API. But an API change is itself messy, and not to be taken lightly if it isn't needed.

    This patch has some comments that should be stripped out before it is actually applied. There is a quite a backlog argparse issues, so I don't expect quick action here.

    @terryjreedy
    Copy link
    Member

    terryjreedy commented Jan 23, 2024

    I think the following, based on the closed duplicate, and run in IDLE 3.13.013, Win10, demonstrates the issue.

    >>> import argparse
    >>> parser = argparse.ArgumentParser()
    >>> parser.add_argument('args', nargs='*')
    _StoreAction(option_strings=[], dest='args', nargs='*', const=None, default=None, type=None, choices=None, required=True, help=None, metavar=None)
    >>> parser.parse_args(['--wrong-arg="foo bar"'])
    Namespace(args=['--wrong-arg="foo bar"'])
    >>> parser.parse_args(['--wrong-arg="foobar"'])
    usage: [-h] [args ...]
    : error: unrecognized arguments: --wrong-arg="foobar"

    @dbarnett
    Copy link

    Hi there, ran into this issue in 2024, found this existing issue for it, but seems like progress stalled out in 2014 (a decade ago... wow!).

    Is there general agreement that space is the wrong heuristic for determining what could be a conditional arg? Could anyone summarize status here and current options?

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
    Projects
    Status: Bugs
    Development

    No branches or pull requests

    2 participants