
Missing documentation on how to redeploy broken registry. #10585

Open
Firstyear opened this issue Aug 23, 2016 · 15 comments
Labels
area/documentation component/imageregistry lifecycle/frozen priority/P2

Comments

@Firstyear

An OpenShift install shows the following output:

oc status
svc/docker-registry - 172.30.61.89:5000
dc/docker-registry deploys registry.access.redhat.com/openshift3/ose-docker-registry:v1.2.1
deployment #1 failed 2 hours ago

But the registry command says it's working:

openshift admin registry
Docker registry "docker-registry" service exists

oc get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
docker-registry 172.30.61.89 5000/TCP 2h
kubernetes 172.30.0.1 443/TCP,53/UDP,53/TCP 4d
router 172.30.48.169 80/TCP,443/TCP,1936/TCP 1h

There is no documentation about how to redeploy or rebuild a broken registry. This is causing new container builds to fail.

Version

oc v1.2.1
kubernetes v1.2.0-36-g4a3f9c5

Additional Information

[Note] Running diagnostic: ClusterRegistry
Description: Check that there is a working Docker registry

ERROR: [DClu1006 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:203]
The "docker-registry" service exists but has no associated pods, so it
is not available. Builds and deployments that use the registry will fail.

ERROR: [DClu1001 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:173]
The "docker-registry" service exists but no pods currently running, so it
is not available. Builds and deployments that use the registry will fail.
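
The missing pods can be confirmed directly. A minimal check, assuming the registry runs in the default project and its pods carry the standard deploymentconfig label that deployments apply:

$ oc get pods -n default -l deploymentconfig=docker-registry
No resources found.

An empty result here matches the DClu1006/DClu1001 errors above: the service has no backing pods.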

@201sandeep

Yes, even I am facing the same issue. Can someone please help?

@gregswift

FWIW I had the same or similar issue. I don't recall what led to it, but here is how I solved it.

I deleted everything related to the docker-registry deployment: the service, the router, and the deployments (all of the failed ones). Then I ran

oc deploy docker-registry --latest -n default

That errored because of a few existing things, primarily the service account, but the registry itself came up.
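
Spelled out, that cleanup looks roughly like the following. This is a sketch: the object names assume a default install in the default project, and the label used to find the failed deployments is the standard one the deployer puts on its replication controllers.

# Remove the broken service and the failed deployments (RCs).
$ oc delete svc/docker-registry -n default
$ oc delete rc -n default -l openshift.io/deployment-config.name=docker-registry
# Redeploy from the surviving deployment config.
$ oc deploy docker-registry --latest -n default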

@miminar

miminar commented Feb 14, 2017

A note to myself for documenting this.

The deployment config needs to be re-created after the service. Otherwise, the registry pod will lack the environment variables

${DOCKER_REGISTRY_SERVICE_HOST}
${DOCKER_REGISTRY_SERVICE_PORT}

To test it:

$ oc rsh dc/docker-registry bash -c 'echo ${DOCKER_REGISTRY_SERVICE_HOST}:${DOCKER_REGISTRY_SERVICE_PORT}'
172.30.30.30:5000
# If a pod is started before the service exists, it will look like this
$ oc rsh dc/docker-registry bash -c 'echo ${DOCKER_REGISTRY_SERVICE_HOST}:${DOCKER_REGISTRY_SERVICE_PORT}'
:

If they are undefined, DOCKER_REGISTRY_URL will be empty, causing the following problems:

Pushing image 172.30.91.135:5000/haowang/ruby-ex:latest ...
Pushed 4/5 layers, 82% complete
Pushed 5/5 layers, 100% complete
Registry server Address:
Registry server User Name: serviceaccount
Registry server Email: serviceaccount@example.org
Registry server Password: <<non-empty>>
error: build error: Failed to push image: received unexpected HTTP status: 500 Internal Server Error

$ oc logs -f dc/docker-registry
...
time="2017-02-14T08:57:23.804381606Z" level=error msg="error creating ImageStreamMapping: ImageStreamMapping \"ruby-ex\" is invalid: image.dockerImageReference: Invalid value: \":/zhouy/ruby-ex@sha256:79884cc0d892dd8096d3f7ca9b2484045c5210ef0e488755ce4b635f231f809a\": invalid reference format" go.version=go1.7.4 http.request.contenttype="application/vnd.docker.distribution.manifest.v1+prettyjws" http.request.host="172.30.91.135:5000" http.request.id=d49a6588-c7b4-4426-bf17-8933dbef9780 http.request.method=PUT http.request.remoteaddr="10.129.0.1:51862" http.request.uri="/v2/zhouy/ruby-ex/manifests/latest"
time="2017-02-14T08:57:23.804494035Z" level=error msg="response completed with error" err.code=unknown err.detail="ImageStreamMapping \"ruby-ex\" is invalid: image.dockerImageReference: Invalid value: \":/zhouy/ruby-ex@sha256:79884cc0d892dd8096d3f7ca9b2484045c5210ef0e488755ce4b635f231f809a\": invalid reference format" err.message="unknown error" go.version=go1.7.4 http.request.contenttype="application/vnd.docker.distribution.manifest.v1+prettyjws" http.request.host="172.30.91.135:5000" http.request.id=d49a6588-c7b4-4426-bf17-8933dbef9780 http.request.method=PUT http.request.remoteaddr="10.129.0.1:51862" http.request.uri="/v2/zhouy/ruby-ex/manifests/latest"
...
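
If the pod shows the second (empty) case, a fresh rollout after the service exists repopulates the variables. A sketch, assuming the default project:

# The service must exist before the pod starts.
$ oc get svc docker-registry -n default
# Roll out a new deployment so the fresh pod picks up
# DOCKER_REGISTRY_SERVICE_HOST and DOCKER_REGISTRY_SERVICE_PORT.
$ oc rollout latest dc/docker-registry -n default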

Update: since 3.9, the following will be printed if the variables aren't set:

level=fatal msg="error parsing configuration file: configuration error in openshift.server.addr: REGISTRY_OPENSHIFT_SERVER_ADDR variable must be set when running outside of Kubernetes cluster"
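
On 3.9+, an alternative to recreating the deployment config might be to set the address explicitly. A sketch; both the use of oc set env here and the service-DNS value are assumptions, not a documented fix:

# Set the registry's server address directly on the dc (assumed workaround).
$ oc set env dc/docker-registry -n default REGISTRY_OPENSHIFT_SERVER_ADDR=docker-registry.default.svc:5000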

@201sandeep

I got it resolved by modifying the line below in the Ansible inventory and re-running the OSO install.

[nodes]
master.example.com openshift_node_labels="{'region':'infra','zone':'default'}" openshift_schedulable=true

openshift_schedulable=true is the important parameter: if it is set to "false", the registry pod cannot be scheduled in case you have a single node in the infra region.

Once done, it took a couple of minutes for my OSO registry to start working.

Regards, Sandeep
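
On a cluster that is already installed, the same schedulability flip can be made without re-running the installer. A sketch; the node name is the placeholder from the inventory above:

# Mark the node schedulable again (3.x syntax).
$ oc adm manage-node master.example.com --schedulable=true
# The node should no longer report SchedulingDisabled.
$ oc get nodes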

@walidshaari

What works for me is the following:
oc rollout latest docker-registry
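
To confirm the rollout actually brought a registry pod up (a sketch, assuming the default project):

# Watch the new deployment complete, then check for a running pod.
$ oc rollout status dc/docker-registry -n default
$ oc get pods -n default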

@davistran86

@walidshaari Thanks, you helped me out. Your command worked for me 👍

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Feb 19, 2018
@Caplost

Caplost commented Mar 21, 2018

@walidshaari THX, worked for me

@miminar

miminar commented Apr 10, 2018

Fix: openshift/openshift-docs#8666

miminar pushed a commit to miminar/openshift-docs that referenced this issue Apr 16, 2018
Resolves: openshift/origin#10585

Signed-off-by: Michal Minář <miminar@redhat.com>
@dmage

dmage commented May 9, 2018

/remove-lifecycle stale

@openshift-ci-robot removed the lifecycle/stale label May 9, 2018
@miminar removed their assignment Aug 1, 2018
@miminar

miminar commented Aug 1, 2018

The doc work I started needs a rewrite. But the registry operator will change things in a way that makes all of this guidance obsolete. This could be resolved with documentation for the operator.

@openshift-merge-robot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Oct 30, 2018
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Nov 29, 2018
@Firstyear

/lifecycle rotten
/remove-lifecycle stale

@Firstyear

/lifecycle frozen

@openshift-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label Nov 30, 2018