-
Notifications
You must be signed in to change notification settings - Fork 391
[Need discussion (cf #814)] refactor(_command_offset): implement Bash's $2 passed to completion functions #791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't help feeling that the fix here is so complex and relies on undocumented observed behavior, that is it really something we should be doing? Is something we do earlier in completions just wrong in the sense that we have dug ourselves a hole we need to try dig out of with even more complexity. Or is this just something we go into pretty great lengths to work around bash's shortcomings? Is it worth it, should we, or is there something we could suggest bash to make easier for us in future versions and leave existing ones where they are?
The rest of the refactorings/cleanups here are quite understandable and nice, I'd be happy to work on getting them merged first, and see about the actual fix for #790 i.e. c59655c separately.
self.check(bash, "a`b\\cd", "cd") | ||
self.check(bash, "a`b\\cde\\fg", "fg") | ||
self.check(bash, "a`b\\c\\\\a", "\\a") | ||
self.check(bash, "a`b\\c\\\\\\a", "a") | ||
self.check(bash, "a`b\\c\\\\\\\\a", "\\a") | ||
self.check(bash, "a`b\\c\\a\\a", "a") | ||
self.check(bash, "a`b\\", "") | ||
self.check(bash, "a`b\\\\", "\\") | ||
self.check(bash, "a`b\\\\\\", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use raw strings here to make these more readable? E.g. r"\\"
instead of "\\\\"
.
Ditto elsewhere in this file where we have "enough" backslashes (and no need to backslash escape single or double quotes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An annoying point is that Python's raw string literal doesn't accept odd numbers of backslashes at the end of the string. For example, r"\"
and r"\\\"
are syntax errors. So, we cannot use the raw strings in some of the test cases. I'll later try to create a separate commit to change the literal and see how it looks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have pushed 117e7dd. As mentioned in the previous reply, Python raw strings cannot handle the second last test case. I could use the raw strings also for the last and the third last case, but kept them with the normal strings for consistency with the second last case.
edit: I have updated more cases 6c467d0.
bash_completion
Outdated
[[ -v _comp_command_offset__mut_initialized && $COMP_WORDBREAKS == "$_comp_command_offset__mut_initialized" ]] && return | ||
_comp_command_offset__mut_initialized=$COMP_WORDBREAKS | ||
|
||
_comp_command_offset__mut_regex_closed_prefix='^([^\"'\'']|\\.|"([^\"]|\\.)*"|'\''[^'\'']*'\'')*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how long I'd have to eyeball regexps here to actually understand them and their usage fully :) There's no perfect bash syntax available to clarify them, but perhaps splitting at appropriate |
s over several lines, making use of +=
would help some? Mainly meaning for this, and _comp_command_offset__mut_regex_break
below.
But one thing I don't understand yet, is what makes a literal .
special so that it is present in here and _comp_command_offset__mut_regex_break
below? And why it is only there with double quotes, not single ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how long I'd have to eyeball regexps here to actually understand them and their usage fully :)
I think I can parametrize these sub-regexps by introducing the variables like regex_single_quote_literal
.
But one thing I don't understand yet, is what makes a literal
.
special so that it is present in here and_comp_command_offset__mut_regex_break
below? And why it is only there with double quotes, not single ones?
'\\.'
here becomes \\.
in the regular expression and is treated as the pattern that matches with « one literal backslash plus following any one character ». The difference between the string literals of single quotes and double quotes is that the backslash can escape a character only in the latter string literal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored the regular expression generation. Actually, a large part of the non-trivial is making the regular expression [$COMP_WORDBREAKS]
in a safe way such that ^
, ]
, [
, and -
would not be treated as special characters in the bracket expression [...]
. I have isolated that part in a new function.
I have also "named" sub-regexps by putting them in variables and referencing them as $name
in the main regexps. I have also added descriptions and avoided another regex hack.
edit: I updated comments 6c467d0..378a82b
Thank you. I'll split the PR. |
776935f
to
c5c1608
Compare
I have completed splitting the PR. #796, #797, #798, and #799. This PR actually depends on the change in #798, so the test of this PR will now fail. Also, this PR conflicts with the change in #799. I thought #797 would also conflict with the change remaining in this PR, but #797 turned out to be actually independent. Edit: For the sake of completeness, this PR also conflicts with #793. If this PR is merged first, maybe I'll resolve the conflicts. |
c5c1608
to
117e7dd
Compare
6c467d0
to
378a82b
Compare
378a82b
to
e9e9765
Compare
e9e9765
to
9e3b520
Compare
I have created an alternative fix at PR #814 to discuss which approach is better. |
Bash does not directly pass the current word, i.e., ${COMP_WORDS[COMP_CWORD]}, to the seond argument of completion functions specified by `complete -F func` but instead modify the value of the current word with a certain rule. External completion functions, which are not based on bash-completion, can possibly assume the modified value [1]. The current implementation of `_command_offset` directly passes `${COMP_WORDS[COMP_CWORD]}` to the completion function extracted from `complete -p`, which may cause problems for not bash-completion-based external completions. [1] scop#790 Co-authored-by: Ville Skyttä <ville.skytta@iki.fi>
Note that Python raw strings cannot represent the strings that end with odd numbers of consecutive backslashes, so only a part of the strings is represented by raw strings.
9e3b520
to
e003411
Compare
Closing in favor of #814. |
This includes a fix of #790, the renaming of the functions
_command_offset
,_command
, and_root_command
as suggested in #539, and also other refactorings._comp_command_offset__reduce_cur
.Fixes #790
Edit: I have separated changes to other PRs: #796, #797, #798, and #799.