Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make "type: string" options greedy? #77

Closed
shadowspawn opened this issue Mar 9, 2022 · 21 comments · Fixed by #88
Closed

Make "type: string" options greedy? #77

shadowspawn opened this issue Mar 9, 2022 · 21 comments · Fixed by #88
Labels
enhancement New feature or request

Comments

@shadowspawn
Copy link
Collaborator

shadowspawn commented Mar 9, 2022

Short version

Making an option with an expected option-argument (i.e. type: string) always consume the next argument has some benefits:

  • simple rule
  • consistent with usage description (e.g. [-c option-argument])
  • no special support needed for negative numbers as option arguments
  • consistent with common implementations (e.g. getopts_long, Commander)
  • POSIX compliant

Long version

This is a revisit for a deeper dive into specifically whether a string option should be greedy. Previously raised in #25 and #75 (comment) but not considered in depth.

Given the program:

const { parseArgs } = require('@pkgjs/parseargs');
const options = {
    with: { short: 'w', type: 'string'}
};
console.log(parseArgs({ options }));

If there is an argument following the string option, does the parsing result depend on the contents of the argument? (Consider following argument one of: foo, -x, --xx, -5, true, -, --, ---, or empty string.)

Some simple examples:

node index.js --with foo
node index.js --with --floating-point-conversion
node index.js --w bar
node index.js --w -5

Let's take a look at the reference specifications and implementations I identified in #76.

POSIX

Utility Conventions and notes

An option with a mandatory option-argument (our "string" option) should be described in the usage like: [-c option_argument]

Guideline 7 states: Option-arguments should not be optional. This is clarified in the notes to make it clear that the option is greedy:

Guideline 7 allows any string to be an option-argument; an option-argument can begin with any character, can be - or --, and can be an empty string. For example, the commands pr -h -, pr -h --, pr -h -d, pr -h +2, and pr -h " contain the option-arguments -, --, -d, +2, and an empty string, respectively. Conversely, the command pr -h -- -d treats -d as an option, not as an argument, because the -- is an option-argument here, not a delimiter.

GNU

Program Argument Conventions adds long options and does not change the handling of mandatory option-arguments.

getopt_long

Greedy. An option with a mandatory option-argument consumes the next argument.

Commander

Greedy. An option with a mandatory option-argument consumes the next argument.

Minimist

The option-argument is optional. It is not consumed if it is option-like. Negative numbers are not consumed.

(The details actually vary between short and long options, but this may be a historical accident rather than by design! Different behaviours for -, leading ---, and empty string.)

Yargs

The option-argument is optional. It is not consumed if it is option-like. There is an exception added for negative numbers, which are consumed.

(The details actually vary between short and long options, but this may be a historical accident rather than by design! Different behaviours for - and leading ---.)

Minimist and Yargs likely rationale

Minimist and Yargs using zero-config parsing assume options may take an argument. This allows multiple options which are variously parsed as boolean or string options. For example with Minimist:

$ node index.js -b -w foo
{ _: [], b: true, w: 'foo' }

parseArgs does not have the same constraints as we are assuming auto-discovered options are boolean.

@ljharb
Copy link
Member

ljharb commented Mar 9, 2022

"It is not consumed if it is option-like." seems like the proper approach, because the user's intention (which is more important than the parseArgs consumer's) is clearly to pass it as an option.

@shadowspawn

This comment was marked as outdated.

@ljharb

This comment was marked as outdated.

@shadowspawn

This comment was marked as outdated.

@ljharb

This comment was marked as outdated.

@shadowspawn

This comment was marked as outdated.

@ljharb

This comment was marked as outdated.

@shadowspawn

This comment was marked as outdated.

@ljharb

This comment was marked as outdated.

@aaronccasanova
Copy link
Collaborator

Thanks for consolidating all that research @shadowspawn! Definitely helped me build a more informed opinion on the matter.

This is a tough one.. I like that type: 'string' being greedy immediately makes the parser behavior more consistent/predictable (e.g. type: 'string' in long and short forms always captures following values), would likely clean up the implementation, and enables support for signed numbers (which has already been brought up in two separate issues).

The tough part for me is while this change would align with notable standards (e.g. POSIX) it differs from other popular arg parsing Node libraries (e.g. Yargs and Minimist) with the exception of Commander. While my instinct says we should stick to established standards, a part of me thinks it might be better to use idiomatic conventions already used throughout the Node ecosystem.

That being said, I don't have a strong opinion either way and would be happy with either approach 👍

@shadowspawn
Copy link
Collaborator Author

shadowspawn commented Mar 11, 2022

For my own interest, I went through the ~800 Commander bugs for ones related to greedy/not-greedy. Commander supports required greedy option-argument and optional choosy option-argument, so gets issues both ways around. I don't know what the historical usage ratio is, but expect more required option-argument than optional option-argument.

Issues related to greedy behaviour: 3
Issues related to option not being greedy: 4

The not-greedy issues had examples calls of:

  • tests run --launchparams "--ez e10s true"
  • node index -e '-KkXJUkBZMYVJXiMo0Pe'
  • lookup -g -83,23
  • node something.js --value -10

@shadowspawn

This comment was marked as outdated.

@bakkot
Copy link
Collaborator

bakkot commented Mar 19, 2022

If it's not greedy, what's the alternative for when you want to pass a string starting with -?

Can you do it with =? For example, does (e.g.) --coordinates="-1" give you { values: { coordinates: '-1' } }? If that works, and if ultimately we decide that --coordinates -1 is an error [in strict: true], it's almost certainly worth adding a special error for this case (value-taking option where the next thing in args starts with -) which suggests the user write --coordinates="-1" instead.

The non-greedy behavior seems a lot more palatable if it's always possible to have an error message which tells the user what to write instead.


For example negative numbers would always be accepted as an option-argument unless there was a configured option for the starting digit.

For negative numbers in particular, that's pretty close to what Python does, except that Python says "any numeric options" rather than "specifically that digit".

I think "any numeric options" is the more reasonable of the two, since it's weird as a user if --coordinates -1 works but --coordinates -2 throws an error.

@shadowspawn
Copy link
Collaborator Author

If it's not greedy, what's the alternative for when you want to pass a string starting with -?

Can you do it with =? ...

Yes.

The non-greedy behavior seems a lot more palatable if it's always possible to have an error message which tells the user what to write instead.

Good suggestion, with caveat it only assists in strict:true mode. Extra work, but extra value.

@shadowspawn
Copy link
Collaborator Author

Modified proposal: the type:string option processing is greedy. Strict mode throws an error if the candidate option-argument starts with a dash, due to the potential ambiguity of user intent with (e.g.) --foo --bar.

The error in strict mode for --foo --bar is custom (as suggested in #77 (comment)), and a different message to foo missing a value


There is not a "right" answer, and split feedback. This enhanced approach might suit a majority.

@aaronccasanova
Copy link
Collaborator

Strict mode throws an error if the candidate option-argument starts with a dash, due to the potential ambiguity of user intent with (e.g.) --foo --bar.

I'm all for the modified proposal and gave my 👍 for whatever that's worth 😄 However, I want to put it out there that I don't think erroring on option-like-arguments reduces ambiguity of user intent. It feels very situational to the tool being created.

Take for instance this grep example. I'm trying to search for instances of a given CSS custom property (which by convention starts with --). At first grep errors unrecognized option '--brand-color'. However, when I add the -e option it greedily accepts the next token. IMO this satisfies both the author and user intent.

image

Apologies for prolonging a resolution 😬 Feel free to disregard this opinion as I think there will be more opportunities to refine greedy behavior post MVP (e.g. global parserConfig.allowHyphenValues, optionConfig.allowHyphenValue, etc.).

@shadowspawn
Copy link
Collaborator Author

I want to put it out there that I don't think erroring on option-like-arguments reduces ambiguity of user intent

My description was not great.

The "ambiguity" is: did the user intend that, or did they make a mistake? Following your example context:

grep -e --line-buffered styles.scss

Did the user forget the pattern argument after -e, or did they really want to search for --line-buffered?

We aren't sure so in strict mode we can throw an error just in case, and suggest how to change command if was intended. e.g.

$ grep -e --line-buffered styles.scss
Did you forget to specify the option argument for '-e'? 
To specify an option argument starting with a dash use '-e--line-buffered' (no space after option).

Apologies for prolonging a resolution

Feedback is good. I wasn't sure how this hybrid would be received, and checking it seems reasonable before pushing it. 😄

@bakkot
Copy link
Collaborator

bakkot commented Apr 2, 2022

Did you forget to specify the option argument for '-e'?

Tiny bit of feedback on the error message: if it's possible to suggest they use a long-form argument with = instead (so, --regexp=--line-buffered), I think that's probably better than suggesting the -eVALUE form, even if it requires expanding a short alias to its long version. This is because I expect more people to be familiar with the --long=value form than with the -evalue form, and so less likely to be confused by the message.

(But that's just my intuition; no hard data either way. And the custom message is much better than nothing!)

@shadowspawn
Copy link
Collaborator Author

In somewhat of an aside, the grep example starts with an ambiguity/conflict between whether an argument is an option or a positional. This is a nice example of the -- support allowing the user to resolve, without the author needing to do anything extra.

$ grep --brand-color styles.scss
grep: unrecognised option ` --brand-color'
...
$ grep -- --brand-color styles.scss

@aaronccasanova
Copy link
Collaborator

Thanks @shadowspawn The examples you provided help clear up the ambiguity we're trying to avoid 👍

@shadowspawn
Copy link
Collaborator Author

@bcoe Did you have an opinion on this? Ok to ignore if you are feeling bandwidth limited!

Summary of feedback and modified proposal in: #77 (comment)

@shadowspawn shadowspawn added enhancement New feature or request and removed discussion labels Apr 14, 2022
@bcoe bcoe closed this as completed in #88 Apr 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
4 participants