Greetings,
This is the same issue as #368: after a node restart, the runner pool pod does not start. Here are the logs:
$ kubectl logs runner-pool-pod-9rmfv -c runner -n github-actions-runner-operator
# Runner removal
Cannot connect to server, because config files are missing. Skipping removing runner from the server.
Does not exist. Skipping Removing .credentials
Does not exist. Skipping Removing .runner
--------------------------------------------------------------------------------
|        ____ _ _   _   _       _          _        _   _                      |
|       / ___(_) |_| | | |_   _| |__      / \   ___| |_(_) ___  _ __  ___      |
|      | |  _| | __| |_| | | | | '_ \    / _ \ / __| __| |/ _ \| '_ \/ __|     |
|      | |_| | | |_|  _  | |_| | |_) |  / ___ \ (__| |_| | (_) | | | \__ \     |
|       \____|_|\__|_| |_|\__,_|_.__/  /_/   \_\___|\__|_|\___/|_| |_|___/     |
|                                                                              |
|                       Self-hosted runner registration                        |
|                                                                              |
--------------------------------------------------------------------------------
# Authentication
√ Connected to GitHub
# Runner Registration
A runner exists with the same name
A runner exists with the same name runner-pool-pod-9rmfv.
Indeed, the Runners section of the GitHub repo shows a runner with the same name in the "Offline" state.
Here's the CR:
apiVersion: garo.tietoevry.com/v1alpha1
kind: GithubActionRunner
metadata:
  name: runner-pool
  namespace: github-actions-runner-operator
spec:
  minRunners: 1
  maxRunners: 6
  organization: myorg
  reconciliationPeriod: 1m
  repository: "myrepo"
  podTemplateSpec:
    metadata:
      annotations:
        "prometheus.io/scrape": "true"
        "prometheus.io/port": "3903"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchExpressions:
                    - key: garo.tietoevry.com/pool
                      operator: In
                      values:
                        - runner-pool
      containers:
        - name: runner
          env:
            - name: RUNNER_DEBUG
              value: "true"
            - name: DOCKER_TLS_CERTDIR
              value: /certs
            - name: DOCKER_HOST
              value: tcp://localhost:2376
            - name: DOCKER_TLS_VERIFY
              value: "1"
            - name: DOCKER_CERT_PATH
              value: /certs/client
            - name: GH_ORG
              value: myorg
            #if runner for repo:
            - name: GH_REPO
              value: myrepo
          envFrom:
            - secretRef:
                name: runner-pool-regtoken
          # find the fixed-in-time tags at https://quay.io/repository/evryfs/github-actions-runner?tab=tags if you want to avoid pulling on a moving tag
          # due to https://github.com/actions/runner/issues/246 the runner sw needs to be recent
          # you can subscribe to release-feeds at https://github.com/evryfs/github-actions-runner/releases.atom
          image: quay.io/evryfs/github-actions-runner:master
          imagePullPolicy: Always
          resources: {}
          volumeMounts:
            - mountPath: /certs
              name: docker-certs
            - mountPath: /home/runner/_diag
              name: runner-diag
            - mountPath: /home/runner/_work
              name: runner-work
        - name: docker
          env:
            - name: DOCKER_TLS_CERTDIR
              value: /certs
          image: docker:stable-dind
          imagePullPolicy: Always
          args:
            # See linked issues from: https://github.com/evryfs/github-actions-runner-operator/issues/39
            - --mtu=1430
          resources: {}
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /var/lib/docker
              name: docker-storage
            - mountPath: /certs
              name: docker-certs
            - mountPath: /home/runner/_work
              name: runner-work
        - name: exporter
          image: quay.io/evryfs/github-actions-runner-metrics:v0.0.6
          ports:
            - containerPort: 3903
              protocol: TCP
          volumeMounts:
            - name: runner-diag
              mountPath: /_diag
              readOnly: true
      volumes:
        - name: runner-work
          emptyDir: {}
        - name: runner-diag
          emptyDir: {}
        - name: docker-storage
          emptyDir: {}
        - name: docker-certs
          emptyDir: {}
I installed the operator via the Helm chart:
helm install github-actions-runner-operator evryfs-oss/github-actions-runner-operator --namespace github-actions-runner-operator --set githubapp.existingSecret=github-runner-app --set githubapp.enabled=true
If I delete the runner from GitHub and then delete the runner-pool pod, the pod is recreated and works normally. However, my cluster node restarts daily, so doing this by hand each time is not a viable solution for me. Can this be fixed?
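For anyone hitting the same thing, the manual cleanup described above could be scripted as a stopgap until the operator handles it. The following is only a rough sketch, not something the operator provides: it assumes the GitHub REST endpoints for repository self-hosted runners (list and delete), a token with admin rights on the repo, and that stale runners can be recognised by the `runner-pool-pod-` name prefix seen in the logs. The function names (`offline_runner_ids`, `remove_stale_runners`) are hypothetical.

```python
import json
import urllib.request

API = "https://api.github.com"


def offline_runner_ids(payload, name_prefix):
    """Pick the IDs of runners that are offline and match the pool's name prefix."""
    return [
        r["id"]
        for r in payload.get("runners", [])
        if r.get("status") == "offline" and r.get("name", "").startswith(name_prefix)
    ]


def _request(method, url, token):
    """Send an authenticated request; return parsed JSON, or None for an empty body."""
    req = urllib.request.Request(
        url,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else None


def remove_stale_runners(owner, repo, token, name_prefix="runner-pool-pod-"):
    """Delete offline runners of the pool from the repo; return the deleted IDs.

    Assumption: fewer than 100 runners are registered, so one page is enough.
    """
    base = f"{API}/repos/{owner}/{repo}/actions/runners"
    payload = _request("GET", f"{base}?per_page=100", token)
    ids = offline_runner_ids(payload, name_prefix)
    for runner_id in ids:
        _request("DELETE", f"{base}/{runner_id}", token)
    return ids
```

Calling something like `remove_stale_runners("myorg", "myrepo", token)` after a node restart, followed by deleting the stuck pod with `kubectl delete pod`, would mirror the manual steps. It only works around the symptom; the stale registration itself still needs a fix in the operator or the runner image.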