Use multi-stage build and replace CMD with ENTRYPOINT #125

dliappis · 2017-10-10T14:22:21Z

This PR moves away from running the es-docker wrapper script via CMD to a new ENTRYPOINT script.

The ENTRYPOINT runs as root, on vanilla docker and most orchestration frameworks, and switches to the pre-provisioned elasticsearch user with uid=gid=1000 before exec'ing elasticsearch.

It's still possible to run another CMD, through the ENTRYPOINT, such
as /bin/bash to inspect things or directly elasticsearch -Ex.y=z[4].

As there is at least one docker orchestration framework, Openshift Open[1], that overrides the Dockerfile USER to something with random uid and gid=0, the entrypoint script conditionally skips the privilege de-escalation in this case. To achieve this, the elasticsearch.tar.gz needs to be extracted so that all files+dirs are gid=0 and have the same group permissions as the owner. This is according to RedHat's best practices described in [1].

We avoid generating another large layer due to chown -R and chgrp -R, by utilizing multi-stage builds and the new COPY --chown= feature[2], available starting with docker-ce >=17.09. This is only a build requirement, that doesn't affect the portability of the image. As a side effect we are also able to significantly clean up the Dockerfile jinja2 template.

We also introduce the optional TAKE_FILE_OWNERSHIP flag which will. in most orchestration environments, forcefully change the uid and gid of the data and logdir. This can help users bind-mounting host directories that don't provide write access either to the default user 1000 or to gid 0. This flag is not expected to receive much use in orchestration frameworks as k8s has the securityContext option[3], elsewhere volumes are typically gid=0 and g=rwx and docker+docker swarm have named volumes and storage drivers.

The PR also includes three new tests:

ensure all extracted files under /usr/share/elasticsearch have the right gid.
tests that elasticsearch works ok when using a bind-mounted host directory for data.
tests that elasticsearch works ok also when a bind-mounted host dir has a random uid (but gid=0).

There are a number of more tests that we can add here (e.g. for the MUTATE_MOUNTS flag, testing that elasticsearch can start with --user forced to a different value), but I wanted to keep the size of this PR under control for now and add tests, if needed, in an additional PR.

[1]: https://docs.openshift.org/latest/creating_images/guidelines.html
[2]: moby/moby#34263
[3]: kubernetes/kubernetes#2630 (comment)
[4]: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html#_d_override_the_image_8217_s_default_ulink_url_https_docs_docker_com_engine_reference_run_cmd_default_command_or_options_cmd_ulink
f674bf9

Closes: #86, #90, #114,

We need to deal with two, rather conflicting requirements #90 and For k8s and other orchestration systems, there is no direct equivalent to docker native volumes[1] but instead volumes are owned by root by default. The GID at least can be influenced with `securityContext`[2], but for lots of people it's easier to have an easier mechanism. One solution is described in [3]. However the existing setup doesn't work if the container is executed with a different `--user` or in Openshift where the `USER` in the `Dockerfile` is overriden and the ENTRYPOINT/CMD is executed by a random UID>=1024, member of 0 group. This commit switches away from the esdocker CMD to an ENTRYPOINT. The ENTRYPOINT (`/usr/local/bin/docker-entrypoint.sh`) combines Docker + Kubernetes + Openshift best practices. The group ownership of `/usr/share/elasticsearch` is now 0, so in the case of Openshift, or forcing `--user <other_uid>`, elasticsearch can still run and write to bind mounted directories. The root group doesn't grant any additional special permissions, so should somehow the elasticsearch process in the container be compromised, it can't access/mutate critical files from the host os, even if `/` is mounted in the container. Optionally, a `MUTATE_MOUNTS` env var can be explicitly used to override the UID:GID of the `data` and `logs` dirs as well, when the ENTRYPOINT runs as the default root user (i.e. not in Openshift). It's still possible to run another CMD, through the ENTRYPOINT, such as `/bin/bash` to inspect things or directly `elasticsearch -Ex.y=z`[4]. Finally, CTRL-C is now handled more appropriately, when using the image interactively. [1]: https://success.docker.com/KBase/Different_Types_of_Volumes [2]: kubernetes/kubernetes#2630 (comment) [3]: #114 (comment) [4]: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html#_d_override_the_image_8217_s_default_ulink_url_https_docs_docker_com_engine_reference_run_cmd_default_command_or_options_cmd_ulink

Ensuring all files under /usr/share/elasticsearch have GID=0 **and** the same group permissions as those of the user is tricky. `-R` options of `chmod`/`chgrp` increase the image size with a large layer and introduce slowness in the build process. `tar` doesn't give us any option to force `gid` while extracting and of course we'd still need to match the group permissions. One solution is to prepare the needed files using multi-stage builds[1] in a separate stage. Unfortunately `COPY` will always use 0:0[2] overriding `USER`, so even if prepare the files at another stage, we'll still need to mangle ownership+perms or COPY and then extract an archive, which again increases image size. docker-ce 17.09 introduced the `--chown`[3] parameter, which solves this problem. Move code to extract elasticsearch, adjust ownership + group permissions, install x-pack and plugins to a stage `prep_es_files`. This allows have a very clean stage to build the final elasticsearch image, with a clean history and reasonable size. IMPORTANT: With this commit, the minimum docker-ce version on the building machine needs to be `17.09`. [1] https://docs.docker.com/engine/userguide/eng-image/multistage-build/ [2] moby/moby#7537 (comment) [3] moby/moby#34263

As required by Openshift Origin best practices[1] all files+directories under our application path should have group set to 0. [1] https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html

In order to support tests for bind-mounts, we need to make the, up to now hardcoded, `volume: ` definitions in docker-compose configurable. Allow overriding the default named volumes via the `--mount-datavolume1` and `--mount-datavolume2` pytest parameters.

Now that we can configure the data volume definitions in the docker-compose file through env vars and testinfra[1], test that elasticsearch can start and write to the data dir. [1] 9bdbf3a

For test_datadirs.py we also need to ensure that bind-mounted host dirs are empty before starting tests. Clean up local datadir1,2 directories on the host, where tests get executed, using privilege escalation through `docker run`.

Adjust the default umask in the entrypoint so that new files are group writable too. This is particularly important for Openshift, where a container with a random uid may start writing to a bind-mount, then get restarted with a new user with a different uid.

Add helper functions in toolbelt.py to abstract the tasks of creating, deleting and chown'ing directories via bind-mounts and docker run. Generalize fixture asserting that the bind-mounted data dir is writable and add test to try elasticsearch access to a bind-mounted data dir with a different uid + gid=0. Do not store empty directories in git, tests can now created/delete test directories as required.

dliappis · 2017-10-10T14:24:27Z

@drewr FYI as discussed

cc @xeraa

@agc93 This PR should resolve the issue #114 you reported using Openshift Open.

dliappis · 2017-10-10T15:24:22Z

Also FYI @tylerjl

agc93 · 2017-10-10T23:27:23Z

@dliappis That's awesome, thanks!

ghost

A bold new era!

ghost · 2017-10-11T02:43:23Z

build/elasticsearch/bin/docker-entrypoint.sh

+# https://docs.openshift.org/latest/creating_images/guidelines.html
+if ! whoami &> /dev/null; then
+    if [ -w /etc/passwd ]; then
+        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:/usr/share/elasticsearch:/sbin/nologin" >> /etc/passwd


I don't think this user should be called default. The default user is elasticsearch. This user is the opposite of default. It's an unusual, unexpected and custom user. Let's call it custom_user or something.

From a different angle, did you consider changing the UID of the existing elasticsearch user instead? Is that a bad idea?

From a different angle, did you consider changing the UID of the existing elasticsearch user instead? Is that a bad idea?

That's a good suggestion. One of my concerns was for people who want to inspect a running container and a new user (with something like --user <random> -ti <container_id> /bin/bash) .

But this is very much a corner case I think and testing it seems to work fine. Therefore I suggest an even bolder move: remove this entire section. I introduced it as suggested in the Openshift docs for fear of applications that want to lookup the associated user from /etc/passwd. I tried building things again now without it and every combination I tried doesn't give problems, including entering the container with an interactive shell. I also tried some of the binaries under bin/ and bin/x-pack and didn't see issues.

Adding one test case that attempts to run elasticsearch with --user=<random will cover us somewhat, but naturally we can't write test cases to execute every binary in this scenario.

I vote for removing it altogether, WDYT?

👍 to that. Less code is less complexity, and I'm always in favour of that.

Also, I think having an anonymous user that's identified only by UID makes it very clear that is a local, ephemeral user.

ghost · 2017-10-11T02:45:22Z

build/elasticsearch/bin/docker-entrypoint.sh

+# or simply to run /bin/bash to check the image
+if [[ "$1" != "eswrapper" ]]; then
+    if [[ "$(id -u)" == "0" ]] && [[ "$1" == *elasticsearch* ]]; then
+        exec chroot --userspec=1000 / "$@"


Can we get a comment about this very unusual use of chroot? (It's very clever but very weird).

Addressed in 9e3d43d

ghost · 2017-10-11T02:49:00Z

templates/Dockerfile.j2

+
+WORKDIR /usr/share/elasticsearch
+
+COPY --from=prep_es_files --chown=1000:0 /usr/share/elasticsearch /usr/share/elasticsearch


ghost · 2017-10-11T02:52:54Z

build/elasticsearch/bin/docker-entrypoint.sh

-exec bin/elasticsearch "${es_opts[@]}"
+if [[ "$(id -u)" == "0" ]]; then
+    # If requested and running as root, mutate the ownership of bind-mounts
+    if [[ -n "$MUTATE_MOUNTS" ]]; then


$TAKE_FILE_OWNERSHIP? It's not obvious what kind of "mutation" is going to happen.

Addressed in 74ca02a

Addresses #125 (comment)

To simulate edge cases like Openshift Open, simulate running Elasticsearch with a random high uid while bound-mounting a host directory that is owned by a different uid having gid=0.

This code block was added to prevent errors that there's no matching user entry in /etc/passwd when the container runs with a random `--user`. However all current tests show nothing (including inspecting the image with `-ti <container-name> /bin/bash`) requires a matching entry in `/etc/passwd` for the current uid, so less code is better.

The chroot that comes with the coreutils version of the centos:7 base image doesn't have the --skip-chdir parameter, thus changes the CWD to /. Using the full path `/usr/share/elasticsearch/bin/elasticsearch` is perfectly fine, but if a user wants to override the CMD to execute elasticsearch directly as documented currently as configuration option D[1], `elasticsearch -E x.y=z` would work, but not `bin/elasticsearch -E x.y=z`. Overwrite $1 argument of a custom CMD to `elasticsearch` when using configuration option D. so that we retain backwards compatibility with the docs. [1] https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docker.html#_d_override_the_image_8217_s_default_ulink_url_https_docs_docker_com_engine_reference_run_cmd_default_command_or_options_cmd_ulink

dliappis added 12 commits October 6, 2017 19:25

Break long Dockerfile commands to multilines

d577f1f

Adjust whitespace control for generated Dockerfiles from j2 template

a952118

Test that all entries in the elasticsearch path have gid==0

c8e40da

As required by Openshift Origin best practices[1] all files+directories under our application path should have group set to 0. [1] https://docs.openshift.com/enterprise/3.0/creating_images/guidelines.html

Add test for bind-mounted host dirs as the elasticsearch data dir

a97800c

Now that we can configure the data volume definitions in the docker-compose file through env vars and testinfra[1], test that elasticsearch can start and write to the data dir. [1] 9bdbf3a

tests: cleanup bind-mounted datadirs

4bf09b8

For test_datadirs.py we also need to ensure that bind-mounted host dirs are empty before starting tests. Clean up local datadir1,2 directories on the host, where tests get executed, using privilege escalation through `docker run`.

Fix typo in comment

fa88a5d

Build oss flavor too

2bb9382

dliappis requested a review from a user October 10, 2017 14:22

ghost approved these changes Oct 11, 2017

View reviewed changes

dliappis added 6 commits October 11, 2017 15:04

Add comment about user of chroot for privilege de-escalation

9e3d43d

Addresses #125 (comment)

Change flag name to more discoverable TAKE_FILE_OWNERSHIP

74ca02a

Addresses #125 (comment)

Add test to run elasticsearch as a different user

3eb66b4

To simulate edge cases like Openshift Open, simulate running Elasticsearch with a random high uid while bound-mounting a host directory that is owned by a different uid having gid=0.

Update version matrix

f824f0f

dliappis merged commit 0abc1f8 into master Oct 12, 2017

dliappis deleted the switch-to-entrypoint-as-root branch October 12, 2017 16:01

This was referenced Oct 12, 2017

Permissions prevent running on OpenShift #114

Closed

Entrypoint needs to chown data directory #90

Closed

Provide ENV variable for runtime user? #86

Closed

dliappis mentioned this pull request Nov 7, 2017

Set ownership on startup for es volume mounts #70

Closed

dliappis mentioned this pull request May 1, 2018

kibana-docker user and group ids clash with elasticsearch-docker elastic/kibana-docker#42

Closed

dliappis mentioned this pull request Aug 22, 2019

Adding USER id to avoid run as root in OpenShift elastic/elasticsearch#45376

Closed

dliappis mentioned this pull request Aug 30, 2019

Pin USER in Dockerfile complying with Docker best practices elastic/elasticsearch#46166

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use multi-stage build and replace CMD with ENTRYPOINT #125

Use multi-stage build and replace CMD with ENTRYPOINT #125

Uh oh!

dliappis commented Oct 10, 2017 •

edited

Loading

Uh oh!

dliappis commented Oct 10, 2017

Uh oh!

dliappis commented Oct 10, 2017

Uh oh!

agc93 commented Oct 10, 2017

Uh oh!

ghost left a comment

Uh oh!

ghost Oct 11, 2017

Uh oh!

dliappis Oct 11, 2017

Uh oh!

ghost Oct 12, 2017

Uh oh!

ghost Oct 11, 2017

Uh oh!

dliappis Oct 11, 2017

Uh oh!

ghost Oct 11, 2017

Uh oh!

ghost Oct 11, 2017

Uh oh!

dliappis Oct 11, 2017

Uh oh!

Uh oh!


		WORKDIR /usr/share/elasticsearch

		COPY --from=prep_es_files --chown=1000:0 /usr/share/elasticsearch /usr/share/elasticsearch

Use multi-stage build and replace CMD with ENTRYPOINT #125

Use multi-stage build and replace CMD with ENTRYPOINT #125

Uh oh!

Conversation

dliappis commented Oct 10, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dliappis commented Oct 10, 2017

Uh oh!

dliappis commented Oct 10, 2017

Uh oh!

agc93 commented Oct 10, 2017

Uh oh!

ghost left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dliappis commented Oct 10, 2017 •

edited

Loading