
Issue running pod under rkt (works fine under docker) #726

Closed
s-urbaniak opened this issue Oct 20, 2016 · 8 comments

Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@s-urbaniak

From @tomdee on September 22, 2016 20:1

Using minikube v0.1.0, the v0.0.4 minikube ISO, and the beta8 release of Kubernetes 1.4.

When I try to deploy "self-hosted" Calico, the containers can't run. Looking at the logs, I see something like this for both containers in my pod:

rpc error: code = 2 desc = pod "1ec08fc3-8f30-44a2-afb8-821154efd3f3" not found

To reproduce, run:

kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico-containers/master/docs/cni/kubernetes/manifests/calico-configmap.yaml -f https://raw.githubusercontent.com/projectcalico/calico-containers/master/docs/cni/kubernetes/manifests/calico-hosted.yaml
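After applying, the failure shows up on the calico-node pods. Roughly, something like the following should surface it (the pod name below is only an example; substitute the one from your cluster):

$ kubectl get pods --all-namespaces -o wide
$ kubectl describe pod calico-node-tvl1p            # example pod name; events show the rpc error above
$ kubectl logs calico-node-tvl1p -c calico-node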

Copied from original issue: coreos/minikube-iso#26

@s-urbaniak

@tomdee hmm ... the kubectl apply -f command above doesn't work for me, since https://raw.githubusercontent.com/projectcalico/calico-containers/master/docs/cni/kubernetes/manifests/calico-configmap.yaml contains a couple of __VARS__ placeholders.

Can you specify the values to use for those so I can reproduce?

@s-urbaniak

From @tomdee on September 27, 2016 22:19

The __VARS__ don't need to be filled out manually; they get filled in by the https://github.com/projectcalico/calico-cni/blob/master/k8s-install/scripts/install-cni.sh script (which is run automatically by the other .yaml file).

@s-urbaniak

@tomdee sorry, I was being dumb; I can reproduce it now.

@s-urbaniak

@tomdee I got it (partly) running now, and this is exactly the feedback I was hoping for when running rktnetes :-)

So, there are two issues to solve before the above can be deployed under rktnetes:

  1. /var/run/calico, which is mounted into calico-node [1], doesn't exist in the VM. This is a known issue in rktnetes, tracked in [2]. After I create this directory manually in the VM, calico-node starts, but then a second issue appears:
  2. Once calico-node starts, bootstrapping fails because /proc/sys/net/ipv4/conf/default/rp_filter is read-only inside the container, despite hostNetwork: true and securityContext: privileged: true (workaround and verification commands are sketched below, after the references):
$ kubectl logs calico-node-tvl1p -c calico-node
...
2016-09-28 08:47:15,198 [INFO][18853/140302822536960] calico.etcddriver.driver 465: Got snapshot headers, snapshot index is 5312; starting watcher...
  File "site-packages/gevent/greenlet.py", line 534, in run
  File "site-packages/calico/felix/felix.py", line 93, in _main_greenlet
  File "site-packages/calico/felix/devices.py", line 88, in configure_global_kernel_config
  File "site-packages/calico/felix/devices.py", line 272, in _write_proc_sys
IOError: [Errno 30] Read-only file system: '/proc/sys/net/ipv4/conf/default/rp_filter'
...

$ kubectl exec -ti calico-node-tvl1p /bin/sh
/ # mount | grep proc
...
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
...

This is a genuine bug in rkt itself, and I have filed [3].

I suggest moving the discussion to the corresponding rkt issues, since both problems relate to rkt and rktnetes rather than to minikube-iso.

[1] https://github.com/projectcalico/calico-containers/blob/master/docs/cni/kubernetes/manifests/calico-hosted.yaml#L52
[2] kubernetes/kubernetes#26816
[3] rkt/rkt#3245
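For reference, this is roughly what I did to work around (1) and to confirm (2). The minikube ssh step and the pod name are just from my local setup, so adjust as needed:

# (1) the hostPath /var/run/calico has to exist in the VM, so create it by hand
$ minikube ssh
$ sudo mkdir -p /var/run/calico
$ exit

# (2) even with hostNetwork and privileged set, /proc/sys is mounted read-only under rkt
$ kubectl exec -ti calico-node-tvl1p -c calico-node -- mount | grep '/proc/sys'
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)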

@leodotcloud

I can see the same error in felix.log even when using neither k8s nor rkt.

Here is my docker-compose.yml:

services:
  calico:
    image: calico/node:v1.0.2
    privileged: true
    network_mode: host
    depends_on:
      - calico-etcd
    labels:
      io.rancher.sidekicks: cni-driver
      io.rancher.scheduler.global: 'true'
      io.rancher.container.dns: 'true'
    volumes:
      - /var/run/calico:/var/run/calico:rw
      - /var/log/calico:/var/log/calico:rw
    environment:
      - ETCD_ENDPOINTS=http://calico-etcd.calico.rancher.internal:2379

Error:

2017-02-16 05:21:21,214 [INFO][702/4] calico.felix.fetcd 166: Config loaded, resync interval 3600.
2017-02-16 05:21:21,215 [ERROR][702/5] calico.felix.felix 282: Exception killing main greenlet
Traceback (most recent call last):
  File "site-packages/calico/felix/felix.py", line 93, in _main_greenlet
  File "site-packages/calico/felix/devices.py", line 88, in configure_global_kernel_config
  File "site-packages/calico/felix/devices.py", line 272, in _write_proc_sys
IOError: [Errno 30] Read-only file system: '/proc/sys/net/ipv4/conf/default/rp_filter'
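The same check applies here. Something like the following should show whether /proc/sys is also mounted read-only in the compose-managed container (the container name is a placeholder for whatever docker-compose assigned):

$ docker exec -it <calico-node-container> mount | grep '/proc/sys'
# Felix is trying to write this sysctl; check what the host itself reports
$ sysctl net.ipv4.conf.default.rp_filter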

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Dec 21, 2017.
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jan 20, 2018.
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
