Docker image not starting on Fedora #4020

Closed
ioadler opened this issue Feb 5, 2017 · 22 comments
Labels
kind/question Questions that haven't been identified as being feature requests or bugs. status/blocked Issue that can’t be moved forward. Must include a comment on the reason for the blockage.

Comments

@ioadler

ioadler commented Feb 5, 2017

When I start Che, I get the following message:

INFO: Welcome to Eclipse Che!
INFO:
INFO: You are missing a mandatory parameter:
INFO: 1. Mount 'docker.sock' for accessing Docker with unix sockets.
INFO: 2. Or, set DOCKER_HOST to Docker's daemon location (unix or tcp).
INFO:
INFO: Mount Syntax:
INFO: Start with 'docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock' ...
INFO:
INFO: DOCKER_HOST Syntax:
INFO: Start with 'docker run -it --rm -e DOCKER_HOST= ...'

Reproduction Steps:

  • Install Fedora Server from scratch
  • Install Docker from Fedora repo as Admin User
  • Test Docker is running
  • Create data directory for Che
  • Try to run Che with: sudo docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock -v /tmp/chedata:/data eclipse/che start

OS and version:

  • Fedora Server 4.8.6-300.fc25.x86_64
  • Docker 1.12.6

Diagnostics:

INFO: Welcome to Eclipse Che!
INFO:
INFO: You are missing a mandatory parameter:
INFO: 1. Mount 'docker.sock' for accessing Docker with unix sockets.
INFO: 2. Or, set DOCKER_HOST to Docker's daemon location (unix or tcp).
INFO:
INFO: Mount Syntax:
INFO: Start with 'docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock' ...
INFO:
INFO: DOCKER_HOST Syntax:
INFO: Start with 'docker run -it --rm -e DOCKER_HOST= ...'

@TylerJewell TylerJewell added the kind/question Questions that haven't been identified as being feature requests or bugs. label Feb 5, 2017
@TylerJewell

TylerJewell commented Feb 5, 2017

I just did a test on Digital Ocean and everything worked as expected.

  1. Followed the Docker installation instructions to get Docker 1.13, using sudo as the root user:
    https://docs.docker.com/engine/installation/linux/fedora/

  2. Used the nightly Che image with the command:
    docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock -v /root/che:/data eclipse/che:nightly init

This ran and got past the particular errors that you are witnessing.

We are going to need the community to debug this as we cannot reproduce the issue. To test and try to debug what is going on:

  1. Clone the Che source repo to your system with Docker installed - say to /che.
  2. When you run the command, add some extra volume mounts so that the container uses the scripts from the source repository instead of the ones in the image:
docker run -it --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /root/che:/data \
  -v /che/dockerfiles/base/scripts/base:/scripts/base \
  eclipse/che:nightly init --skip:nightly
  3. Then, when this fails for you, do some debugging. The command that tests for valid access to the Docker daemon is 'docker ps', which is run inside our container. If this command fails in our container, then the container cannot reach the daemon. The check is here: https://github.com/eclipse/che/blob/master/dockerfiles/base/scripts/base/docker.sh#L218-L233.

Is it possible that the -z check fails on some versions of Fedora?
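
For readers unfamiliar with the `-z` test mentioned above, here is a minimal sketch of that kind of check. It assumes the check treats empty output from `docker ps` as "no access to the daemon". The function name `can_reach_docker` and the option to pass a substitute probe command are illustrative assumptions, not the contents of Che's docker.sh.

```shell
# Hypothetical helper, not taken from Che's docker.sh.
# Runs a probe command (default: docker ps) and succeeds only if the
# command exits 0 AND prints something. A working `docker ps` prints at
# least its header row, so empty output suggests no daemon access.
can_reach_docker() {
  if [ "$#" -eq 0 ]; then
    set -- docker ps
  fi
  # Capture stdout; a non-zero exit from the probe fails the check.
  probe_output=$("$@" 2>/dev/null) || return 1
  # Equivalent to the negation of a `-z` test on the captured output.
  [ -n "$probe_output" ]
}
```

Inside the container it could be used as `can_reach_docker || echo "mount docker.sock or set DOCKER_HOST"`.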

@TylerJewell TylerJewell added the status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering. label Feb 5, 2017
@ghost

ghost commented Feb 6, 2017

@ioadler is SELinux enabled on your machine? Can you try it this way?

-v /var/run/docker.sock:/var/run/docker.sock:Z

If it does not help, try changing permissions for /var/run/docker.sock - sudo chmod 777 /var/run/docker.sock. Not sure this helps but it's worth trying.

Another test to run is:

docker run -ti -v /var/run/docker.sock:/var/run/docker.sock docker:1.13.0-dind sh

When in a running container, run docker ps.

By the way, your user should be able to run docker commands without sudo. Have you added it to the docker group?

@ioadler
Author

ioadler commented Feb 7, 2017

Thank you very much for your replies. The problem (difference) was that I used the Fedora Docker installation.

Now I have the following problem after starting Che with:
sudo docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock -v /home/iar/chedata:/data eclipse/che:nightly start

...
INFO: (che config): Generating che configuration...
INFO: (che config): Customizing docker-compose for running in a container
INFO: (che start): Preflight checks
mem (1.5 GiB): [OK]
disk (100 MB): [OK]
port 8080 (http): [AVAILABLE]
conn (browser => ws): [NOT OK]
conn (server => ws): [NOT OK]
ERROR: Try 'docker run eclipse/che:nightly info --network' for more tests.

Running ... info --network gives:

INFO: (che cli): nightly - using docker 1.13.0 / native
INFO: (che download): Pulling image eclipse/che:nightly

nightly: Pulling from eclipse/che
Digest: sha256:c5bfd5bb50e6671a14f1e03849284459f3563991145610c5c6d74f9a024722ee
Status: Image is up to date for eclipse/che:nightly

INFO:
INFO: ---------------------------------------
INFO: -------- CONNECTIVITY TEST --------
INFO: ---------------------------------------
INFO: (che network): eclipse/che-ip:nightly: 172.17.0.1
INFO: (che network): Browser => Workspace Agent (localhost): Connection failed
INFO: (che network): Browser => Workspace Agent (172.17.0.1): Connection failed
INFO: (che network): Server => Workspace Agent (External IP): Connection failed
INFO: (che network): Server => Workspace Agent (Internal IP): Connection succeeded

What's the problem now? (I found some issues that sounded similar from June, but their description didn't help.)

@TylerJewell

I think 172.17.0.1 is the IP address of docker0 and not the IP address of the node itself. What do "info" and "info --bundle" generate for you?

If the auto-detected IP address is not the right one, provide the proper one with -e CHE_HOST=<ip> on the command line when starting Che.
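
One way to find a candidate value for CHE_HOST by hand is to list the host's global IPv4 addresses and skip the Docker-managed interfaces. The sketch below assumes the one-line format printed by `ip -o -4 addr show scope global`. The function name `pick_host_ip` and the interface-name filter are illustrative assumptions, not what eclipse/che-ip does.

```shell
# Hypothetical helper: reads `ip -o -4 addr show scope global` output on
# stdin and prints the first address whose interface name does not look
# Docker-managed (docker*, br-*, veth*).
pick_host_ip() {
  awk '!found && $2 !~ /^(docker|br-|veth)/ {
         split($4, parts, "/")   # strip the /prefix length
         print parts[1]
         found = 1
       }'
}
```

Typical use would be `ip -o -4 addr show scope global | pick_host_ip`, with the result passed as `-e CHE_HOST=<ip>`.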

@ghost

ghost commented Feb 9, 2017

@ioadler can you run ifconfig on your Fedora node?

@ioadler
Author

ioadler commented Feb 10, 2017

I tried it with and without -e CHE_HOST:
sudo docker run -e CHE_HOST=192.168.0.209 -it --rm -v /var/run/docker.sock:/var/run/docker.sock -v /home/iar/chedata:/data eclipse/che:nightly info

Without:

CLI:
 TTY:            true
 Daemon:         /var/run/docker.sock
 Image:          eclipse/che:nightly
 Version:        nightly
 Command:        info
 Parameters:     info
Mounts:
 /data:          /home/iar/chedata
 /data/instance: not set
 /data/backup:   not set
 /repo:          not set
 /assembly:      not set
 /sync:          not set
 /unison:        not set
 /chedir:        not set
System:
 Docker:         native
 Proxy:          not set
Internal:
 CHE_VERSION:    nightly
 CHE_HOST:       172.17.0.1
 CHE_INSTANCE:   /home/iar/chedata/instance
 CHE_CONFIG:     /home/iar/chedata
 CHE_BACKUP:     /home/iar/chedata/backup
 CHE_REGISTRY:   /version
 CHE_DEBUG:      false
 IP Detection:   172.17.0.1
 Initialized:    true
Image Registry:
 IMAGE_INIT=eclipse/che-init:nightly
 IMAGE_CHE=eclipse/che-server:nightly
 IMAGE_COMPOSE=docker/compose:1.8.1
 BOOTSTRAP_IMAGE_ALPINE=alpine:3.4
 BOOTSTRAP_IMAGE_CHEIP=eclipse/che-ip:nightly
 UTILITY_IMAGE_CHEACTION=eclipse/che-action:nightly
 UTILITY_IMAGE_CHEDIR=eclipse/che-dir:nightly
 UTILITY_IMAGE_CHETEST=eclipse/che-test:nightly
 UTILITY_IMAGE_CHEMOUNT=eclipse/che-mount:nightly
che.env:
 CHE_HOST=172.17.0.1

With:

CLI:
 TTY:            true
 Daemon:         /var/run/docker.sock
 Image:          eclipse/che:nightly
 Version:        nightly
 Command:        info
 Parameters:     info
Mounts:
 /data:          /home/iar/chedata
 /data/instance: not set
 /data/backup:   not set
 /repo:          not set
 /assembly:      not set
 /sync:          not set
 /unison:        not set
 /chedir:        not set
System:
 Docker:         native
 Proxy:          not set
Internal:
 CHE_VERSION:    nightly
 CHE_HOST:       **192.168.0.209**
 CHE_INSTANCE:   /home/iar/chedata/instance
 CHE_CONFIG:     /home/iar/chedata
 CHE_BACKUP:     /home/iar/chedata/backup
 CHE_REGISTRY:   /version
 CHE_DEBUG:      false
 IP Detection:   172.17.0.1
 Initialized:    true
Image Registry:
 IMAGE_INIT=eclipse/che-init:nightly
 IMAGE_CHE=eclipse/che-server:nightly
 IMAGE_COMPOSE=docker/compose:1.8.1
 BOOTSTRAP_IMAGE_ALPINE=alpine:3.4
 BOOTSTRAP_IMAGE_CHEIP=eclipse/che-ip:nightly
 UTILITY_IMAGE_CHEACTION=eclipse/che-action:nightly
 UTILITY_IMAGE_CHEDIR=eclipse/che-dir:nightly
 UTILITY_IMAGE_CHETEST=eclipse/che-test:nightly
 UTILITY_IMAGE_CHEMOUNT=eclipse/che-mount:nightly
che.env:
 CHE_HOST=172.17.0.1

Starting results in the same problem.

Result of ifconfig:

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 0.0.0.0
        inet6 fe80::42:8dff:fe0e:3520  prefixlen 64  scopeid 0x20<link>
        ether 02:42:8d:0e:35:20  txqueuelen 0  (Ethernet)
        RX packets 287  bytes 18140 (17.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 117  bytes 18933 (18.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.209  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::7d9:e299:80c2:e75b  prefixlen 64  scopeid 0x20<link>
        ether 90:e6:ba:52:f3:20  txqueuelen 1000  (Ethernet)
        RX packets 182335  bytes 269547863 (257.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 65718  bytes 4562459 (4.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Lokale Schleife)
        RX packets 20  bytes 1544 (1.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 20  bytes 1544 (1.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

@TylerJewell

TylerJewell commented Feb 11, 2017

@ioadler - in your "without" scenario, CHE_HOST was still set in your che.env file.

  1. Can you update your che.env with your IP address and rerun, rather than only passing it on the command line? I will double check, but I believe that che.env takes precedence over the command line. You can see in the info output that CHE_HOST was the wrong IP. Our IP detection is definitely picking up your docker0 address when it should be picking up the address of enp2s0.

  2. @benoitf - can you take a look at this ifconfig output? Is this an edge case where eclipse/che-ip:nightly does not detect the right IP? I notice that in your test scenario we assume an eth0 interface on Fedora. This is the first time I have seen an enp2s0 interface.

  3. @eivantsov @riuvshin @benoitf - today, the logic in the CLI is:

if is_initialized; then 
  use the value of CHE_HOST set in che.env.
  print warning message saying that che.env overrides command line CHE_HOST
fi

Should we change this? And instead have:

if is_initialized; then
  if `-e CHE_HOST=<ip>` != eclipse/che-ip:nightly; then
    update che.env CHE_HOST with value of `-e CHE_HOST=<ip>`
  fi
fi
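
The proposed precedence can be sketched as a small shell function. This is only an illustration of the rule under discussion, not the Che CLI's code. The name `resolve_che_host` and its three positional parameters are hypothetical.

```shell
# Hypothetical sketch of the proposed precedence; not the Che CLI's code.
# $1 = CHE_HOST passed with -e on the command line (may be empty)
# $2 = CHE_HOST stored in che.env (may be empty)
# $3 = IP reported by the detection utility
resolve_che_host() {
  cli_host=$1
  env_host=$2
  detected_ip=$3
  if [ -n "$cli_host" ] && [ "$cli_host" != "$detected_ip" ]; then
    echo "$cli_host"      # an explicit override wins
  elif [ -n "$env_host" ]; then
    echo "$env_host"      # otherwise keep the che.env value
  else
    echo "$detected_ip"   # fall back to auto-detection
  fi
}
```

With the values from this thread, `resolve_che_host 192.168.0.209 172.17.0.1 172.17.0.1` prints 192.168.0.209, which would then be written back to che.env.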

@TylerJewell TylerJewell added the kind/bug Outline of a bug - must adhere to the bug report template. label Feb 11, 2017
@ghost

ghost commented Feb 11, 2017

Definitely, -e CHE_HOST should override che.env.

@benoitf
Contributor

benoitf commented Feb 11, 2017

@ioadler could you provide the output of the commands docker run --rm --net host alpine:3.5 ip a show and docker run --rm alpine:3.5 uname -r, so that I can update the che-ip utility? Thanks.

@benoitf benoitf added status/open-for-dev An issue has had its specification reviewed and confirmed. Waiting for an engineer to take it. status/blocked Issue that can’t be moved forward. Must include a comment on the reason for the blockage. and removed status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering. status/open-for-dev An issue has had its specification reviewed and confirmed. Waiting for an engineer to take it. labels Feb 11, 2017
@l0rd
Contributor

l0rd commented Feb 13, 2017

@ioadler @TylerJewell Che should/can run correctly with the Fedora version of Docker. Just try:

sudo docker run -p 8080:8080 \
           --name che \
           -v /var/run/docker.sock:/var/run/docker.sock \
           -v <LOCAL_PATH>:/data:Z \
           --security-opt label:disable \
           -e CHE_DOCKER_SERVER__EVALUATION__STRATEGY=docker-local \
           eclipse/che-server:nightly

There are 3 things that are different with respect to other distributions:

  • SELinux is enabled by default and blocks access to the Docker socket. The workaround is to use the option --security-opt label:disable when running the Che server.
  • A firewall blocks communication between containers that goes through the docker0 bridge. The workaround is to use the option -e CHE_DOCKER_SERVER__EVALUATION__STRATEGY=docker-local when running the Che server.
  • Using docker without sudo is not recommended.

IMHO disabling SELinux system wide or using the version packaged by Docker Inc. are not good ideas (c.f. 1 and 2).
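
The first two workarounds can be made conditional on the host's SELinux mode. The sketch below is a hypothetical helper (`fedora_che_opts` is not part of Che) that only prints the extra `docker run` options. It assumes the mode string comes from `getenforce`.

```shell
# Hypothetical helper: prints the extra `docker run` options for the
# Fedora-specific workarounds described above.
# $1 = SELinux mode as printed by `getenforce`
#      (Enforcing, Permissive or Disabled)
fedora_che_opts() {
  if [ "$1" = "Enforcing" ]; then
    # Only needed while SELinux is actually enforcing.
    printf '%s ' '--security-opt label:disable'
  fi
  printf '%s\n' '-e CHE_DOCKER_SERVER__EVALUATION__STRATEGY=docker-local'
}
```

It could be spliced into the command as `sudo docker run $(fedora_che_opts "$(getenforce)") ...`, leaving the remaining options unchanged.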

@ioadler
Author

ioadler commented Feb 17, 2017

@TylerJewell

I changed che.env. Same result. docker ... info shows:

INFO: (che cli): nightly - using docker 1.13.0 / native
WARN: (che cli): 'CHE_HOST=192.168.0.209' is != discovered IP '172.17.0.1'
INFO: (che download): Pulling image eclipse/che:nightly
....

@benoitf

The output is:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP qlen 1000
    link/ether 90:e6:ba:52:f3:20 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.209/24 brd 192.168.0.255 scope global dynamic enp2s0
       valid_lft 863569sec preferred_lft 863569sec
    inet6 fe80::7d9:e299:80c2:e75b/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:57:71:98:fb brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:57ff:fe71:98fb/64 scope link
       valid_lft forever preferred_lft forever

@l0rd

Thanks - going back to square one ;-)

@ghost

ghost commented Feb 17, 2017

@ioadler how can I get exactly the same Fedora as you have? We all tried hard to reproduce this, but the Fedora images on Digital Ocean and AWS were merciful: Che started as expected.

@TylerJewell

TylerJewell commented Feb 17, 2017

@eivantsov - that warning message indicates that he has a che.env CHE_HOST that is overriding the actual IP address he should be using. I think we have a situation where Che isn't running properly because of the command line + che.env configuration.

@ioadler - fortunately for us, I added some fixes for this to the 5.3.0 release, which was published today :). In all the versions that you were running, the discovered IP address of 172.17.0.1 was being used. This is not what you want. In the new version, command-line parameters override anything in che.env.

Can you wipe out your Docker images and then run with:

sudo docker run -e CHE_HOST=192.168.0.209 -it --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /home/iar/chedata:/data \
  eclipse/che:5.3.0 start

@ioadler
Author

ioadler commented Feb 18, 2017

@TylerJewell

I took the image from https://getfedora.org/de_CH/server/download/.

I started as proposed. Result:

INFO: (che cli): 5.3.0 - using docker 1.13.0 / native
INFO: (che config): Generating che configuration...
INFO: (che config): Customizing docker-compose for running in a container
INFO: (che start): Preflight checks
         mem (1.5 GiB):           [OK]
         disk (100 MB):           [OK]
         port 8080 (http):        [AVAILABLE]
         conn (browser => ws):    [NOT OK]
         conn (server => ws):     [NOT OK]


ERROR: Try 'docker run <options> eclipse/che:5.3.0 info --network' for more tests.

docker ... info --network results in:

INFO: (che cli): 5.3.0 - using docker 1.13.0 / native
INFO:
INFO: ---------------------------------------
INFO: --------   CONNECTIVITY TEST   --------
INFO: ---------------------------------------
INFO: (che network): eclipse/che-ip:5.3.0: 192.168.0.209
INFO: (che network): Browser => Workspace Agent (localhost): Connection failed
INFO: (che network): Browser => Workspace Agent (192.168.0.209): Connection failed
INFO: (che network): Server  => Workspace Agent (External IP): Connection failed
INFO: (che network): Server  => Workspace Agent (Internal IP): Connection succeeded

@TylerJewell

Check the che.env file for network configuration parameters.

@ghost

ghost commented Feb 18, 2017

@ioadler can you make sure containers can communicate in your VM?

docker run -d -p 32791:80 nginx
Go to 192.168.0.209:32791 in your browser, and you should see the nginx start page.

$ docker run -ti appropriate/curl sh
# curl -v 192.168.0.209:32791

Can curl fetch the content that nginx serves on port 80 in the container, with that port exposed and published as 32791?

If you can, then I am puzzled. If you cannot, we should solve this problem first.

@ioadler
Author

ioadler commented Feb 20, 2017

@eivantsov - I can see the starting page. Curl shows the following result:

/ # curl -v 192.168.0.209:32791
* Rebuilt URL to: 192.168.0.209:32791/
*   Trying 192.168.0.209...
* TCP_NODELAY set
* connect to 192.168.0.209 port 32791 failed: Host is unreachable
* Failed to connect to 192.168.0.209 port 32791: Host is unreachable
* Closing connection 0
curl: (7) Failed to connect to 192.168.0.209 port 32791: Host is unreachable

Curl from the command line of 192.168.0.209 works correctly.

@ghost

ghost commented Feb 21, 2017

@ioadler so, this is a local problem with Docker containers being unable to communicate with one another/with host.

benoitf added a commit that referenced this issue Feb 22, 2017
…4020

Change-Id: I6b65eba02eaab76260d33ffb323fe7413d5ee162
Signed-off-by: Florent BENOIT <fbenoit@codenvy.com>
benoitf added a commit that referenced this issue Feb 23, 2017
…4020

Change-Id: I6b65eba02eaab76260d33ffb323fe7413d5ee162
Signed-off-by: Florent BENOIT <fbenoit@codenvy.com>
@ghost

ghost commented Mar 3, 2017

@ioadler have you been able to get past this problem? Playing with iptables is probably a good idea here.

@ghost ghost removed the kind/bug Outline of a bug - must adhere to the bug report template. label Mar 3, 2017
@ghost

ghost commented Mar 28, 2017

@ioadler can you provide any update on this issue please?

@ghost

ghost commented Apr 4, 2017

Closing due to inactivity. Feel free to reopen once you have more info to share.

@ghost ghost closed this as completed Apr 4, 2017
@csamarghitan

I had the same issue on Fedora. After opening the firewall for the port range 1025-65535 (TCP and UDP) on the 'docker0' connection, everything worked as expected.
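
The comment above does not say which firewall tool was used. Assuming firewalld, which is the default on Fedora, the change might look like the commands printed by the following dry-run sketch. The function name `print_docker0_firewall_cmds` is hypothetical, and the zone holding docker0 is an assumption that can be checked with `firewall-cmd --get-zone-of-interface=docker0`.

```shell
# Hypothetical dry-run helper: prints (does not execute) firewall-cmd
# invocations that open TCP and UDP ports 1025-65535 in one zone.
# $1 = firewalld zone that contains the docker0 interface (assumed)
print_docker0_firewall_cmds() {
  zone=$1
  for proto in tcp udp; do
    echo "firewall-cmd --permanent --zone=$zone --add-port=1025-65535/$proto"
  done
  # Permanent rules only take effect after a reload.
  echo "firewall-cmd --reload"
}
```

Printing the commands first makes it possible to review them before running them with sudo.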

JPinkney pushed a commit to JPinkney/che that referenced this issue Aug 17, 2017
…clipse-che#4020

Change-Id: I6b65eba02eaab76260d33ffb323fe7413d5ee162
Signed-off-by: Florent BENOIT <fbenoit@codenvy.com>
This issue was closed.