Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use /usr/bin/env -S bash instead of /bin/bash on the host system #5684

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,8 @@ class BashWrapperBuilder {

static final public List<String> BASH
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to rename BASH to BIN_BASH to make the difference clear.


static final public List<String> ENV_BASH

static private int level

@PackageScope
Expand All @@ -85,6 +87,7 @@ class BashWrapperBuilder {
log.warn "Invalid value for `NXF_DEBUG` variable: $str -- See http://www.nextflow.io/docs/latest/config.html#environment-variables"
}
BASH = Collections.unmodifiableList( level > 0 ? ['/bin/bash','-uex'] : ['/bin/bash','-ue'] )
ENV_BASH = Collections.unmodifiableList(['/usr/bin/env', '-S', 'bash', BASH[1]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why introducing this variable instead of relying on process.shell?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this. process.shell is /bin/bash ... by default. Does this answer your question?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that after #4292 you should be able to use process.shell = ['/usr/bin/env, '-S', 'bash'] in your config.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I still need clarification: Do you suggest this PR is not necessary at all because one could set process.shell = ['/usr/bin/env, '-S', 'bash'] in a pipeline's configuration or do you suggest a different implementation (not using ENV_BASH as I have proposed) would be a technically better approach to achieve this PR's goal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Yes", to my first or second interpretation? I feel like the are mutually exclusive. 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I suggest this PR is not necessary at all because one could set process.shell

Copy link
Author

@rollf rollf Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Disclaimer: I assume that the additional usage of /usr/bin/env -S bash on the host system (to invoke .command.run as well as for non-container task execution) is acceptable by the Nextflow maintainers and we just need to sort out the technical implementation. Note that the POSIX-compliant existence of /usr/bin/env is already expected by Nextflow and please also consider the PR description above.]

procoess.shell cannot not be used for various reasons:

  • The value of process.shell will be used for task execution in both the non-container as well as the container scenario. This means that one would need to set this to a value that works in the host as well as in the container env if one wants to freely disable/enable containerization. And this contradicts the whole point.
  • Setting process.shell =/usr/bin/env -S bash -ue (because I want -ue) in the nextflow.config causes problems during tracing (because due to Use shell directive with trace command in wrapper script #4292, Nextflow will favor process.shell's actual value [it's not BASH] and then use -ue which Nextflow specifically doesn't expect for tracing)
  • Even if it worked, that means that everybody without /bin/bash needs to adapt every pipeline they want to run. I'm not trying to fix a specific problem I have with a specific pipeline, I'd like to contribute here in order to solve the general problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of /usr/bin/env is not viable because adds another level of indirection of in the process execution making (even) more complex the composition of the launch command and the passing of shell option.

I propose to address the problem of providing a custom Bash path in a more structural manner in this PR #5696


}

Expand Down Expand Up @@ -563,12 +566,14 @@ class BashWrapperBuilder {
statsEnabled || fixOwnership()
}

protected String shellPath() {
// keep the shell path as "/bin/bash" when a non-custom "shell" attribute is specified
// to not introduce unexpected changes due to the fact BASH is defined as "/bin/bash -eu" by default
return shell.is(BASH)
? "/bin/bash"
: shell.join(' ')
protected String traceInterpreter() {
// if no specific shell was defined, do not use the default one (i.e. "...bash -ue[x]") neither because
// -ue[x] should not be used during tracing
if( shell.is(BASH) || shell.is(ENV_BASH) ) {
return shell.init().join(' ')
}
// otherwise, stick to the custom shell
return shell.join(' ')
}

protected String getLaunchCommand(String interpreter, String env) {
Expand All @@ -582,7 +587,7 @@ class BashWrapperBuilder {
final traceWrapper = isTraceRequired()
if( traceWrapper ) {
// executes the stub which in turn executes the target command
launcher = "${shellPath()} ${fileStr(wrapperFile)} nxf_trace"
launcher = "${traceInterpreter()} ${fileStr(wrapperFile)} nxf_trace"
}
else {
launcher = "${interpreter} ${fileStr(scriptFile)}"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -131,7 +131,7 @@ class LocalTaskHandler extends TaskHandler implements FusionAwareTask {
}

protected ProcessBuilder localProcessBuilder() {
final cmd = new ArrayList<String>(BashWrapperBuilder.BASH) << wrapperFile.getName()
final cmd = new ArrayList<String>(BashWrapperBuilder.ENV_BASH) << wrapperFile.getName()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sure that .command.run gets executed correctly on a host without /bin/bash.

log.debug "Launch cmd line: ${cmd.join(' ')}"
// make sure it's a posix file system
if( task.workDir.fileSystem != FileSystems.default )
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ class TaskBean implements Serializable, Cloneable {
this.useMicromamba = task.getCondaConfig()?.useMicromamba()
this.spackEnv = task.getSpackEnv()
this.moduleNames = task.config.getModule()
this.shell = task.config.getShell() ?: BashWrapperBuilder.BASH
this.shell = task.config.getShell(task.isContainerEnabled()) ?: BashWrapperBuilder.BASH
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, task will always (?) have a shell since it comes with the process defaults. I have not touched the "else"-block here since I think it is actually not necessary (and it did not cause issues for me even though I don't have /bin/bash).

this.script = task.getScript()
this.beforeScript = task.config.getBeforeScript()
this.afterScript = task.config.getAfterScript()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -374,10 +374,10 @@ class TaskConfig extends LazyMap implements Cloneable {
return (List<String>) get('secret')
}

List<String> getShell() {
List<String> getShell(boolean isContainerEnabled = true) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to adapt the unit tests but I don't know how 🤷 .

final value = get('shell')
if( !value )
return BashWrapperBuilder.BASH
return isContainerEnabled ? BashWrapperBuilder.BASH : BashWrapperBuilder.ENV_BASH

if( value instanceof List )
return (List)value
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,6 @@ class ProcessConfig implements Map<String,Object>, Cloneable {
static final Map<String,Object> DEFAULT_CONFIG = [
debug: false,
cacheable: true,
shell: BashWrapperBuilder.BASH,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file and the next one make sure that Nextflow does not assume /bin/bash execution by default. Rather, it used /bin/bash when the configuration requests task containerization, otherwise, /usr/bin/env -S bash is used (i.e. when the task is to be executed on the host system itself).

maxRetries: 1,
maxErrors: -1,
errorStrategy: ErrorStrategy.TERMINATE
Expand Down Expand Up @@ -169,14 +168,15 @@ class ProcessConfig implements Map<String,Object>, Cloneable {
*
* @param script The owner {@code BaseScript} configuration object
*/
protected ProcessConfig( BaseScript script ) {
protected ProcessConfig( BaseScript script, boolean isContainerEnabled = true) {
ownerScript = script
configProperties = new LinkedHashMap()
configProperties.putAll( DEFAULT_CONFIG )
configProperties.put('shell', isContainerEnabled ? BashWrapperBuilder.BASH : BashWrapperBuilder.ENV_BASH)
}

ProcessConfig( BaseScript script, String name ) {
this(script)
ProcessConfig( BaseScript script, String name, boolean isContainerEnabled = true) {
this(script, isContainerEnabled)
this.processName = name
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ class ProcessDef extends BindableDef implements IterableDef, ChainableDef {
assert processConfig==null

// the config object
processConfig = new ProcessConfig(owner,processName)
processConfig = new ProcessConfig(owner,processName,session.getContainerConfig().isEnabled())

// Invoke the code block which will return the script closure to the executed.
// As side effect will set all the property declarations in the 'taskConfig' object.
Expand Down