Skip to content

Allow checking whether optionals provided explicitly #187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

kstrafe
Copy link

@kstrafe kstrafe commented Mar 14, 2024

Allow checking whether optionals provided explicitly

Since boolean optionals default to "off" we have no way to
check if a boolean optional was actually specified on the command line.

The ability to check if a boolean optional has been given is useful in
cases where we source values from a configuration file where we want any
explicitly given arguments to override the configuration.

An example that mostly works is (for plain optionals with an empty default):

    test -f ~/.standard-values-for-this-script && source "$_"
    MYOPT="${MYOPT:-$_arg_myopt}"
    ... # Code using MYOPT

This works since optionals can default to an empty string. We still are
not entirely sure if the optional was actually passed here, but it tends
to be "good enough" in most cases.

In the case where myopt is boolean, the above case would always use
"off" if the argument was not specified, thus overriding the sourced
configuration file.

This patch introduces _supplied_<argname>, a value that gets set if
and only if an argument is supplied for that option. This must be
enabled on a per-argument basis by declaring

    ARGBASH_INDICATE_SUPPLIED([long arg name])

Where long arg name matches the argument name for any
ARG_OPTIONAL_*.

This allows us to do the following:

    test -f ~/.standard-values-for-this-script && source "$_"
    if [[ "${_supplied_arg_myopt-x}" = 1 ]]; then
            MYOPT="$_arg_myopt" # We override
    fi
    # Alternatively
    if [[ -v _supplied_arg_myopt ]]; then
            # ...
    fi
    ... # Code using MYOPT

@kstrafe
Copy link
Author

kstrafe commented Mar 14, 2024

Posting this PR to get some feedback, it's why I haven't written any tests. Let me know if this is a better way to implement this. I think this feature is really useful.

@matejak matejak added this to the 2.11.0 milestone Apr 27, 2025
@matejak
Copy link
Owner

matejak commented Apr 27, 2025

Very interesting, would you mind adding tests? You basically have to modify e.g. https://github.com/matejak/argbash/blob/master/tests/regressiontests/test-simple.m4, adding an option that would print _optionals, so test in https://github.com/matejak/argbash/blob/master/tests/regressiontests/make/tests/tests-base.m4#L125 can be extended accordingly.

Also, could you give a shot to documenting this in https://github.com/matejak/argbash/blob/master/doc/guide.rst#using-parsing-results?

@kstrafe
Copy link
Author

kstrafe commented Apr 27, 2025

Very interesting, would you mind adding tests? You basically have to modify e.g. https://github.com/matejak/argbash/blob/master/tests/regressiontests/test-simple.m4, adding an option that would print _optionals, so test in https://github.com/matejak/argbash/blob/master/tests/regressiontests/make/tests/tests-base.m4#L125 can be extended accordingly.

Also, could you give a shot to documenting this in https://github.com/matejak/argbash/blob/master/doc/guide.rst#using-parsing-results?

@matejak I have added two tests. The initial implementation would fail on x*) and just print _optionals+=(arg), that is now fixed too.

In addition, shellcheck complained, I've inserted a fix around the shift in parse_commandline. This happened with shellcheck version 0.10.0, and yields the following:
Was fixed in previous merge commit.

In addition to that, I've changed all instances of /bin/bash to /usr/bin/env bash (or env -S bash -e for the cases where it was bash -e).

This is because on my system I do not have a /bin/bash, /usr/bin/env bash is supposed to support more systems. See https://askubuntu.com/questions/1402718/is-the-same-usr-bin-env-bash-than-bin-bash

Let me know if you want to include those two extra commits or not, I can remove them if necessary.

Short documentation string also added.

@kstrafe kstrafe force-pushed the master branch 3 times, most recently from fc899f3 to 5a06fe6 Compare April 27, 2025 22:21
@matejak
Copy link
Owner

matejak commented Apr 28, 2025

Great, tests have revealed the problem with non-bash shells that don't have arrays. What about an alternative approach that would use e.g. _arg_<name>_supplied variable that could be unset or set to yes or 1?
So one would use if [[ "$_arg_myopt_supplied" = yes ]]; then - this would even be quite readable.

@matejak
Copy link
Owner

matejak commented Apr 28, 2025

FTR, this PR fixes: #185

@kstrafe
Copy link
Author

kstrafe commented Apr 29, 2025

Great, tests have revealed the problem with non-bash shells that don't have arrays. What about an alternative approach that would use e.g. _arg_<name>_supplied variable that could be unset or set to yes or 1? So one would use if [[ "$_arg_myopt_supplied" = yes ]]; then - this would even be quite readable.

@matejak

This sounds like a good idea but can lead to collisions.
If you define myopt_supplied in addition to myopt, then setting myopt will write _arg_myopt_supplied. I think we need a different approach.

I've pushed a commit that uses _supplied_<arg-name>, so quiet would get _supplied_arg_quiet=1. It shouldn't be possible to generate collisions this way, and allows for both [[ -v _supplied_arg_name ]] and [ "${_supplied_arg_name-x}" = 1 ] for presence checks.

@kstrafe kstrafe force-pushed the master branch 2 times, most recently from 8f91b27 to ede3635 Compare April 29, 2025 23:13
@kstrafe
Copy link
Author

kstrafe commented Apr 29, 2025

Fixed the shellcheck error in the test.

Copy link
Owner

@matejak matejak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I see two minor issues that need addressing for the time being.

Generally, I like your solution very much.
I will help you with tests, so please bear with me for a couple of days.

Finally, please add yourself to AUTHORS file, respecting the alphabetical ordering :-)

@@ -373,6 +373,46 @@ test-simple: $(TESTDIR)/test-simple.sh
ERROR="require exactly 1" $(REVERSE) $<
$< pos -o 'uf ta' | grep -q 'OPT_S=uf ta,POS_S=pos,'
test -z "$(SHELLCHECK)" || $(SHELLCHECK) "$(TESTDIR)/test-simple.sh"
test-simple: $(TESTDIR)/test-simple.sh
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self - I have to refresh my memory how the test definitions work, this feels as unnecessary copy-pasting.

Copy link
Owner

@matejak matejak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I have pointed out a test optimization path for the script and also for the Makefile.

I have been thinking about this feature in general, and I came to a conclusion that the feature should be optional, because the target group is small, and having this on by default would generate larger parsing code for everybody.
There is a precedent for that - the ARG_POSITIONAL_DOUBLEDASH or ARGBASH_SET_DELIM allow you to set properties of the parsing code in a script-centric (not generator-centric) way.
It's not a big deal and I can help you with that - it is matter of defining a macro, and instead of having the literal assignment to the _supplied_... variable, invoke that macro that will either assign stuff, or won't do anything.
Do you have any thoughts on this or on the name of the macro?

@kstrafe
Copy link
Author

kstrafe commented May 3, 2025

Great, I have pointed out a test optimization path for the script and also for the Makefile.

I have been thinking about this feature in general, and I came to a conclusion that the feature should be optional, because the target group is small, and having this on by default would generate larger parsing code for everybody. There is a precedent for that - the ARG_POSITIONAL_DOUBLEDASH or ARGBASH_SET_DELIM allow you to set properties of the parsing code in a script-centric (not generator-centric) way. It's not a big deal and I can help you with that - it is matter of defining a macro, and instead of having the literal assignment to the _supplied_... variable, invoke that macro that will either assign stuff, or won't do anything. Do you have any thoughts on this or on the name of the macro?

I'm not sure on this compromise, but I agree that this feature may add a lot of ultimately unused lines. Even if we add an ARG_RECORD_SUPPLIED, we may only be interested whether some optionals were supplied, but not all of them.

One alternative is to tag an optional argument so we can choose which optionals to record. For instance, making another ARG: ARG_OPTIONAL_SINGLE_SUPPLIED, but I fear that this will become complicated if we ever want to add one or more more similar features to this one, since we have a combinatorial explosion on the types of arguments; we get ARG_OPTIONAL_SINGLE_SUPPLIED_X as well as ARG_OPTIONAL_SINGLE_X. I don't think that's desirable.

Alternatively, the "set supplied" flag can be added to the argument list for ARG_*, but since our arguments to ARG_* are untagged, and we already have a whole bunch of arguments, readability might be reduced. For example:

ARG_OPTIONAL_BOOLEAN([argument-name-long], [argument-name-short (optional)], [help message (optional)], [default (optional)], [supplied (optional)])

Not to mention that we must then provide all preceding optionals to make supplied work.

What about a separate ARG_SUPPLIED([argument-long-name])? Here, argument-long-name must be an existing ARG_OPTIONAL_*. This will tell the code generator to generate a _supplied_arg_<name>=1 line for that specific argument. This seems to me the most scalable method of going about this.

@matejak what are your thoughts on this? Would this be easy to implement with m4?

@matejak
Copy link
Owner

matejak commented May 4, 2025

Thank you for your points!
I also don't like the idea of extending the API with similar commands differing in only one aspect, or extending the amount of parameters.
I like the idea of ARG_SUPPLIED([argument-long-name], [another-argument-long-name], ...), and perhaps if called without any arguments, it can even act as a global "supplied on" switch. It should be doable in m4, and I will look into this in course of the next week. Perhaps the name can be different - ARGBASH_INDICATE_SUPPLIED or something slightly more verbose.
In the meantime, you can work on acceptance tests and documentation, and I will then push the implementation as a commit to your PR. Regarding acceptance tests, you can either create additional scripts and therefore also tests focused solely on testing the "supplied" functionality - you create a file something.m4, and introduce a test using ADD_TEST_BASH([something], [test body]). The framework will take care of generating the respective .sh file and of deleting it at the end.

@kstrafe kstrafe force-pushed the master branch 4 times, most recently from 9634ccd to 6af4f26 Compare May 4, 2025 15:55
@kstrafe
Copy link
Author

kstrafe commented May 4, 2025

@matejak
I've pushed an implementation that makes the user specify ARGBASH_INDICATE_SUPPLIED, updated documentation, implemented test.
I've not implemented a global ARGBASH_INDICATE_SUPPLIED flag, you have to specify the arguments for each item you wish to supply.

@kstrafe
Copy link
Author

kstrafe commented May 5, 2025

Would it be preferable to have _supplied_arg_name=0 initialized for all supplied arguments?
I'm fond of using set -u which makes the script fail on undefined variables.
This makes it so that if we do [ "$_supplied_arg_x" = 1 ] will fail if the variable doesn't exist, which is a guard against us forgetting to set ARGBASH_INDICATE_SUPPLIED, otherwise it would always be false.

@kstrafe
Copy link
Author

kstrafe commented May 5, 2025

Thinking of squashing commit Default-initialize supplied_arg=0 for ARGBASH_INDICATE_SUPPLIED into the previous one, but leaving it as a separate commit for now to see if the solution is acceptable.

@@ -17,7 +17,8 @@
# Now we take the parsed data and assign them no nice-looking variable names,
# sometimes after a basic validation
if [ "$_arg_print_optionals" = on ]; then
echo "_supplied_arg_prefix=${_supplied_arg_prefix-x},_supplied_arg_print_optionals=${_supplied_arg_print_optionals-x},_supplied_arg_la=${_supplied_arg_la-x},_supplied_arg_not_supplied=${_supplied_arg_not_supplied-x}"
set -u
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a debug session artifact.

Copy link
Author

@kstrafe kstrafe May 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that the arguments that were marked as supplied exist. If they do not exist then the script will fail. I think we should test this to make sure. Alternatively we can test if the _supplied_arg_x are set to 0 or 1 above the if-statement, and exit 1 if they are not.

@matejak
Copy link
Owner

matejak commented May 6, 2025

Great, I like it very much. I have a couple of suggestions, but they are related to code style rather than to behavior.
Please execute make check in resources if you can to catch errors early.

I find your PR really impressive regarding the complexity of the language the project is written in. That's amazing! Were you able to get meaningful help from an AI, or was it all your doing?

@kstrafe
Copy link
Author

kstrafe commented May 6, 2025

Great, I like it very much. I have a couple of suggestions, but they are related to code style rather than to behavior. Please execute make check in resources if you can to catch errors early.

I find your PR really impressive regarding the complexity of the language the project is written in. That's amazing! Were you able to get meaningful help from an AI, or was it all your doing?

I did run make check from resources for each build. These are OK locally, so not sure why CI fails. Does it use conditional tests based on what's installed on the system perhaps? I only see CI complain about stdin:43: error: Optional arguments '--not-supplied' don't have a short option, which is not supported in POSIX mode. I'll add a short to this optional argument and see if it passes.

Only had an AI chatbot explain to me m4sugar, other than that the biggest hurdle was figuring out the build system (why invoke make from resources? It's not intuitive, a root-dir makefile makes more sense to me) and wrapping it in a nix-develop shell, and then just writing incremental changes followed by inspection to see what changed. The biggest helper is that the code that's generated is deterministic, so it's easy to see what changes affect generated code.

deining and others added 6 commits May 6, 2025 16:09
rst2man shows an error message when converting from rst to man.

> rst2man argbash.rst | gzip > argbash.1.gz
> argbash.rst:42: (ERROR/3) Undefined substitution referenced: "OPTION_IN_PLACE".

To avoid this error message, add the in-place option in argbash-defs.rst
file.

Fixes: eae1874 ("Added the in-place option to the .m4 source.")
Fixes: c994630 ("Add --in-place option")
Shellcheck became very smart, and it can detect that if there are only
action optional args that exit upon call, there is no need for a 'for'
to loop through args and, specifically, for 'shift' after 'case'.
@kstrafe kstrafe force-pushed the master branch 3 times, most recently from 5102d84 to cb3a2bf Compare May 6, 2025 23:34
Since boolean optionals default to "off" we have no way to
check if a boolean optional was actually specified on the command line.

The ability to check if a boolean optional has been given is useful in
cases where we source values from a configuration file where we want any
explicitly given arguments to override the configuration.

An example that mostly works is (for plain optionals with an empty default):

        test -f ~/.standard-values-for-this-script && source "$_"
        MYOPT="${MYOPT:-$_arg_myopt}"
        ... # Code using MYOPT

This works since optionals can default to an empty string. We still are
not entirely sure if the optional was actually passed here, but it tends
to be "good enough" in most cases.

In the case where `myopt` is boolean, the above case would always use
"off" if the argument was not specified, thus overriding the sourced
configuration file.

This patch introduces `_supplied_<argname>`, a value that gets set to 1
if an argument is supplied for that option, otherwise it's set to 0.
This must be enabled on a per-argument basis by declaring:

        ARGBASH_INDICATE_SUPPLIED([long arg name])

Where `long arg name` matches the argument name for any
`ARG_OPTIONAL_*`.

This allows us to do the following:

        test -f ~/.standard-values-for-this-script && source "$_"
        if [ "$_supplied_arg_myopt" = 1 ]; then
                MYOPT="$_arg_myopt" # We override
        fi
        ... # Code using MYOPT
@kstrafe
Copy link
Author

kstrafe commented May 7, 2025

Added a commit that fails when we use ARGBASH_INDICATE_SUPPLIED on non-optional values.
@matejak could you help me out with unittesting this? I'm not sure how to integrate it into the existing framework here. We could just have a new set of tests that runs argbash <file> and then greps for a string in the output.

@@ -163,6 +163,7 @@ m4_define([__ADD_OPTIONAL_ARGUMENT], [m4_do(
[_FILL_IN_VALUES_FOR_AN_OPTIONAL_ARGUMENT([$1], [$3], _arg_varname, [$5], [$2], [$4])],
[m4_popdef([_arg_varname])],
[m4_define([_DISTINCT_OPTIONAL_ARGS_COUNT], m4_incr(_DISTINCT_OPTIONAL_ARGS_COUNT))],
[m4_set_add([OPTIONAL_SET], [$1])],
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please select a more descriptive name s.a. _ALL_OPTIONAL_ARGUMENTS

Comment on lines +532 to +534
[IS_SUBSET_OF([HAVE_SUPPLIED], [OPTIONAL_SET], [elem],
[],
[m4_fatal([ARGBASH_INDICATE_SUPPLIED: Not an optional argument:] elem)])],
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "subset" term is tricky here, you basically want to know whether there is any intersection between arguments and the OPTIONAL_SET, so instead of the macro, you can use a more descriptive construction involving m4_set_add_all([SUPPLIED_NOT_OPTIONAL] m4_set_difference([HAVE_SUPPLIED], [OPTIONAL_SET])) that (if the set is destroyed before) defines a set of problematic arguments, and then you can query it whether it is empty, and you can also format it using m4_set_list in the error message.

When doing this, it may be useful to check out tests/unittests and test the logic. Macros usually decouple the logic and fatal errors exactly for the purpose of unit-testing that they are able to produce correct input for fatal errors.

Comment on lines +343 to +354


dnl
dnl $1: Subset
dnl $2: Superset
dnl $3: Element binding
dnl $4: If subset element contained in superset
dnl $5: If subset element not contained in superset
m4_define([IS_SUBSET_OF], [m4_do(
[m4_set_foreach([$1], [$3],
[m4_set_contains([$2], $3, [$4], [$5])])],
)])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to modify this so it uses set operations as noted before. The idea of if-else branching and defining a macro named as the caller wants to use in the error message is very clever, and great for unit testing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this:

m4_define([CHECK_SUPPLIED_ARE_OPTIONAL], [m4_do(
	[m4_set_add_all([SUPPLIED_NOT_OPTIONAL], m4_set_difference([HAVE_SUPPLIED], [_ALL_OPTIONAL_ARGUMENTS]))],
	[m4_if(m4_set_size([SUPPLIED_NOT_OPTIONAL]), 0,
		[],
		[m4_fatal([ARGBASH_INDICATE_SUPPLIED: The following arguments are not optional:] m4_set_dump([SUPPLIED_NOT_OPTIONAL], [, ]))])],
)])

But this triggers the fatal even though everything is OK. Any pointers?

@matejak
Copy link
Owner

matejak commented May 8, 2025

Added a commit that fails when we use ARGBASH_INDICATE_SUPPLIED on non-optional values.

I have added some pointers toward unittests, and you can also add tests that are supposed to fail generation like here: A gentest testing e.g. infinitely many arguments that are not supported in Dash uses gen-test-infinity.m4 under the hood as the file that shouldn't compile by argbash with posix output.

One more thing before the merge - could you rebase the PR against the current master branch? There are some commits showing up that are already merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants