Skip to content

Compatibility mode to match Docker/OCI arg and env escaping / quoting behaviour #487

Closed
@player1537

Description

@player1537

Version of Singularity
What version of Singularity are you using? Run:

$ singularity version
3.8.4-1.el7

Describe the bug
The default runscript that Singularity produces doesn't handle special characters in the arguments array, and in particular, doesn't follow the ideas/principles behind Bernstein Chaining (ref and ref). What this ultimately means is that, if I've got a working command, and I tack singularity run myimage.sif in front of it, then my command should still work as intended. In practice, I've not found this to be the case, especially when trying to pass things like bash (bash -c) and awk scripts on the command line.

The crux of the issue is that the default runscript is written in a way where: 1) arguments get simply concatenated together with flaky quoting, and 2) the concatenated arguments then get passed through eval.

I believe the justification for the latter is somehow due to needing to be compatible with the Open Container Initiative (I believe this is what "OCI" refers to in the runscript). I believe as well that the eval is there because in Docker, you might specify the CMD or ENTRYPOINT as a small shell script that needs to be expanded properly (concretely with a dumb example, ENTRYPOINT ls -lah /* and ./test.sif /etc/passwd would need to expand the glob in order to be compatible, I believe).

To Reproduce

$ singularity pull test.sif docker://ubuntu:focal
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
Copying blob 7b1a6ab2e44d done  
Copying config e7132beceb done  
Writing manifest to image destination
Storing signatures
2021/12/16 23:15:54  info unpack layer: sha256:7b1a6ab2e44dbac178598dabe7cff59bd67233dba0b27e4fbd1f9d4b3c877a54
INFO:    Creating SIF file...
$
$ # Here is the default runscript
$ singularity inspect --runscript ./test.sif
#!/bin/sh
OCI_ENTRYPOINT=''
OCI_CMD='"bash"'
CMDLINE_ARGS=""
# prepare command line arguments for evaluation
for arg in "$@"; do
    CMDLINE_ARGS="${CMDLINE_ARGS} \"$arg\""
done

# ENTRYPOINT only - run entrypoint plus args
if [ -z "$OCI_CMD" ] && [ -n "$OCI_ENTRYPOINT" ]; then
    if [ $# -gt 0 ]; then
        SINGULARITY_OCI_RUN="${OCI_ENTRYPOINT} ${CMDLINE_ARGS}"
    else
        SINGULARITY_OCI_RUN="${OCI_ENTRYPOINT}"
    fi
fi

# CMD only - run CMD or override with args
if [ -n "$OCI_CMD" ] && [ -z "$OCI_ENTRYPOINT" ]; then
    if [ $# -gt 0 ]; then
        SINGULARITY_OCI_RUN="${CMDLINE_ARGS}"
    else
        SINGULARITY_OCI_RUN="${OCI_CMD}"
    fi
fi

# ENTRYPOINT and CMD - run ENTRYPOINT with CMD as default args
# override with user provided args
if [ $# -gt 0 ]; then
    SINGULARITY_OCI_RUN="${OCI_ENTRYPOINT} ${CMDLINE_ARGS}"
else
    SINGULARITY_OCI_RUN="${OCI_ENTRYPOINT} ${OCI_CMD}"
fi

# Evaluate shell expressions first and set arguments accordingly,
# then execute final command as first container process
eval "set ${SINGULARITY_OCI_RUN}"
exec "$@"
$
$ # This works as expected.
$ bash -c 'echo hello world'
hello world
$ ./test.sif bash -c 'echo hello world'
hello world
$
$ # This fails because of adding double quotes around arguments and not escaping any quotes inside
$ bash -c 'echo "hello world"'
hello world
$ ./test.sif -c 'echo "hello world"'
hello
$
$ # Fails because BASH_EXECUTION_STRING is evaluated at the wrong time.
$ # Instead of being evaluated within the `bash -c` command, it's evaluated within the runscript where the variable is empty.
$ bash -c 'echo BASH_EXECUTION_STRING=$BASH_EXECUTION_STRING'
BASH_EXECUTION_STRING=echo BASH_EXECUTION_STRING=$BASH_EXECUTION_STRING
$ ./test.sif bash -c 'echo BASH_EXECUTION_STRING=$BASH_EXECUTION_STRING'
BASH_EXECUTION_STRING=

The last example I feel is the most significant because it represents a potential vulnerability vector.

$ cat printargs.bash
#!/usr/bin/env bash

i=0
while [ "$i" -le "$#" ]; do
    printf $'argv[%d]=%q\n' "$i" "${!i}"
    ((++i))
done
$
$ # This works normally.
$ ./printargs.bash hello world
argv[0]=./printargs.bash
argv[1]=hello
argv[2]=world
$ ./test.sif ./printargs.bash hello world
argv[0]=/home/me/printargs.bash
argv[1]=hello
argv[2]=world
$
$ # This fails because it executes the subshell (which is a harmless date command but could be malicious).
$ ./printargs.bash '$(date)'
argv[0]=./printargs.bash
argv[1]=\$\(date\)
$ ./test.sif ./printargs.bash '$(date)'
argv[0]=/home/me/printargs.bash
argv[1]=Fri\ Dec\ 17\ 00:02:43\ UTC\ 2021

There's a lot more that could be done to show this, such as proving that the problem occurs not in the singularity executable but in the runscript (I've tested it and the problem is in the runscript, but I don't think it's necessary to show here).

Expected behavior
For any command run on the command line, the behavior should be identical when prefixed with either singularity run myimage.sif or ./myimage.sif (barring obvious things like paths being slightly different or environment variables).

OS / Linux Distribution
N/A. Though the system I copy-pasted from for the above code blocks has the following information.

$ cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.9"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.9 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.9:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.9"

Installation Method
N/A. This happens regardless of how Singularity is installed. In my case, I've experienced this behavior both on managed systems (where the sysadmins installed it somehow) and on my own systems (where I installed from a package manager).

Additional context
Here I'll describe a couple of potential fixes as well as a workaround for anyone else experiencing problems with this behavior.

Potential Fix 1 (without eval)
I believe the best fix is, rather than generate the lengthy default script Singularity currently has, the script can be significantly shorter and much simpler. For a container with ENTRYPOINT ["hello", "world"] and a CMD ["foo", "bar"], then the script could look like the following.

#!/usr/bin/env bash

if [ $# -eq 0 ]; then
    set -- "foo" "bar"  # CMD
fi
set -- "hello" "world" "$@"  # ENTRYPOINT
exec "$@"

I don't know Go very well, but I believe that this function could be written as follows:

func (cp *OCIConveyorPacker) insertRunScript() (err error) {
	f, err := os.Create(cp.b.RootfsPath + "/.singularity.d/runscript")
	if err != nil {
		return
	}

	defer f.Close()

	_, err = f.WriteString("#!/bin/sh\n\n")
	if err != nil {
		return
	}

	if len(cp.imgConfig.Cmd) > 0 {
		_, err = f.WriteString("if [ $# -eq 0 ]; then\n    set -- " +
			shell.ArgsQuoted(cp.imgConfig.Cmd) +
			"  # CMD\nfi\n")
		if err != nil {
			return
		}
	}

	if len(cp.imgConfig.Entrypoint) > 0 {
		_, err = f.WriteString("set -- " +
			shell.ArgsQuoted(cp.imgConfig.Entrypoint) +
			" \"$@\"  # ENTRYPOINT\n")
		if err != nil {
			return
		}
	}

	_, err = f.WriteString("exec \"$@\"\n")
	if err != nil {
		return
	}


	f.Sync()

	err = os.Chmod(cp.b.RootfsPath+"/.singularity.d/runscript", 0o755)
	if err != nil {
		return
	}

	return nil
}

Potential Fix 2 (with eval):
If it's actually necessary to have the eval in there-- though I'm inclined to believe that it is not --then the runscript can be modified as follows.

#!/usr/bin/env bash

if [ $# -eq 0 ]; then
    eval 'set -- "foo" "bar"'  # CMD
fi
eval 'set -- "hello" "world" "$@"'  # ENTRYPOINT
exec "$@"

The equivalent Go code would be:

func (cp *OCIConveyorPacker) insertRunScript() (err error) {
	f, err := os.Create(cp.b.RootfsPath + "/.singularity.d/runscript")
	if err != nil {
		return
	}

	defer f.Close()

	_, err = f.WriteString("#!/bin/sh\n\n")
	if err != nil {
		return
	}

	if len(cp.imgConfig.Cmd) > 0 {
		_, err = f.WriteString("if [ $# -eq 0 ]; then\n    eval 'set -- " +
			shell.EscapeSingleQuotes(shell.ArgsQuoted(cp.imgConfig.Cmd)) +
			"'  # CMD\nfi\n")
		if err != nil {
			return
		}
	}

	if len(cp.imgConfig.Entrypoint) > 0 {
		_, err = f.WriteString("eval 'set -- " +
			shell.EscapeSingleQuotes(shell.ArgsQuoted(cp.imgConfig.Entrypoint)) +
			" \"$@\"'  # ENTRYPOINT\n")
		if err != nil {
			return
		}
	}

	_, err = f.WriteString("exec \"$@\"\n")
	if err != nil {
		return
	}

	f.Sync()

	err = os.Chmod(cp.b.RootfsPath+"/.singularity.d/runscript", 0o755)
	if err != nil {
		return
	}

	return nil
}

Workaround:
I wrote this script which also has a docstring that explains everything going on so that it can be copied around as needed. Hopefully it helps someone else that runs into this problem until it is fixed.

#!/usr/bin/env python3
# MIT License
#
# Copyright (c) 2021 Tanner Hobson
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

r"""Safely pass arguments to Singularity (last checked: Singularity v3.8.4-1.el7).

https://github.com/sylabs/singularity/issues/487

Usage
-----

  $ workaround.py [first set of arguments] -- [second set of arguments]

The first set of arguments (before the first "--" separator, or the entire
string if no "--" is passed) will not be escaped and will be passed to execvp
directly.

The second set of arguments (after the first "--" separator) will be escaped in
a way that makes it safe to pass to the default runscript within a Singularity
image.

Example
-------

If you already have a command like the following that works on the host system:

  $ echo hello '"world"' \$foo '$(date)'
  hello "world" $foo $(date)

And you try running it under Singularity, then variables in the arguments are
expanded and subshells are executed:

  $ export SINGULARITYENV_foo=bar
  $ singularity run myimage.sif echo hello '"world"' \$foo '$(date)'
  hello world bar Fri Dec 17 03:32:39 UTC 2021

To fix this problem, use this script to escape characters that get interpreted
wrongly by Singularity:

  $ ./workaround.py singularity run myimage.sif -- echo hello '"world"' \$foo '$(date)'
  hello "world" $foo $(date)

Explanation
-----------

For this explanation, I'll use the same problematic command to explain what
happens. This command is:

  $ singularity run myimage.sif hello '"world"' \$foo '$(date)'

In Python, this set of arguments looks like:

  ['hello', '"world"', '$foo', '$(date)']

Singularity's default runscript first concatenates all command line arguments
after wrapping each one in double quotes individually.

  CMDLINE_ARGS=""
  # prepare command line arguments for evaluation
  for arg in "$@"; do
      CMDLINE_ARGS="${CMDLINE_ARGS} \"$arg\""
  done

In essence, it iteratively builds up the following variable:

  CMDLINE_ARGS=""
  CMDLINE_ARGS=" \"hello\""
  CMDLINE_ARGS=" \"hello\" \"\"world\"\""
  CMDLINE_ARGS=" \"hello\" \"\"world\"\" \"\$foo\""
  CMDLINE_ARGS=" \"hello\" \"\"world\"\" \"\$foo\" \"\$(date)\""

It's a little clearer to see what this variable is in Python's syntax:

  CMDLINE_ARGS = r' "hello" ""world"" "$foo" "$(date)"'

So far, there is no problem. None of the variables have been improperly expanded
or executed.

The next important step in the runscript is:

  SINGULARITY_OCI_RUN="${CMDLINE_ARGS}"
  eval "set ${SINGULARITY_OCI_RUN}"
  exec "$@"

This amounts to running:

  eval "set \"hello\" \"\"world\"\" \"\$foo\" \"\$(date)\""
  set "hello" ""world"" "$foo" "$(date)"

This is the point where variables get evaluated in the wrong contexts and
subcommands get run automatically when they shouldn't be run at all. In a
vulnerability sense, passing untrusted user input to a program running under
Singularity can cause attackers to run processes on your machine (within the
Singularity container). For a sense of scale, the following is problematic:

  $ user_word='bannanas$(rm -rf $PWD)'
  $ singularity run my_cool_spellchecker.sif "$user_word"

To work around this problem, this script makes use of the fact that variables
get expanded erroneously to escape problematic characters from the string.
Effectively, this postpones the evaluation of variables and subshells until
after the eval is complete.

For the above input, this script will run (in Python syntax):

  os.execvp("singularity", [
      r"singularity",
      r"run",
      r"myimage.sif",
      r"hello",
      r'\${doublequote}world\${doublequote}',
      r'\${dollar}foo',
      r'\${dollar}(date)',
  ])

This script also namespaces each of the special character variable names so they
don't collide with anything someone is actually using, and sets SINGULARITYENV_
variables to pass their values into the container.

"""

from os import execvp, environ


UNIQUE_PREFIX      = 'WORKAROUND_6ec97707_c1a7_430e_a47c_e0f9e55fba9f'
UNIQUE_DOLLAR      = UNIQUE_PREFIX + "_dollar"
UNIQUE_BACKSLASH   = UNIQUE_PREFIX + "_backslash"
UNIQUE_BACKTICK    = UNIQUE_PREFIX + "_backtick"
UNIQUE_DOUBLEQUOTE = UNIQUE_PREFIX + "_doublequote"


def main(unchanged, needs_escaped):
    arguments = unchanged
    for argument in needs_escaped:
        argument = argument.replace("$",  "${" + UNIQUE_DOLLAR      + ":?}")
        argument = argument.replace("\\", "${" + UNIQUE_BACKSLASH   + ":?}")
        argument = argument.replace("`",  "${" + UNIQUE_BACKTICK    + ":?}")
        argument = argument.replace('"',  "${" + UNIQUE_DOUBLEQUOTE + ":?}")
        arguments.append(argument)

    # Gotta be careful with these environment variables. They're also
    # susceptible to double expansion, though for an entirely separate reason
    # than the arguments.
    environ['SINGULARITYENV_' + UNIQUE_DOLLAR      ] = "\\$"
    environ['SINGULARITYENV_' + UNIQUE_BACKSLASH   ] = "\\\\"
    environ['SINGULARITYENV_' + UNIQUE_BACKTICK    ] = "\\`"
    environ['SINGULARITYENV_' + UNIQUE_DOUBLEQUOTE ] = '"'

    execvp(arguments[0], arguments)


def cli():
    from sys import argv

    try:
        index = argv.index('--')
    except ValueError:
        unchanged, needs_escaped = argv[1:], []
    else:
        unchanged, needs_escaped = argv[1:index], argv[index+1:]
    
    main(unchanged, needs_escaped)


if __name__ == '__main__':
    cli()

LASTLY:
While I was reading through the code and trying to find out if my solution works and if I'm not just missing something fundamental within Singularity, I found out that there is also a problem with environment variables.

$ SINGULARITYENV_foo='$(date >&2)' ./test.sif echo hello world
Fri Dec 17 03:49:10 UTC 2021
hello world

Internally, Singularity is injecting /.inject-singularity-env.sh with a bunch of export KEY=VALUE statements, where the value is the output of the Go shell.EscapeDoubleQuotes() function, wrapped in double quotes manually. This needs to be escaped more. I believe swapping the call to shell.EscapeDoubleQuotes to a call for shell.Escape is sufficient.

Metadata

Metadata

Labels

enhancementNew feature or requestroadmapFeatures / changes that are scheduled to be implementedtechdebt

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions