
Missing documentation on how to redeploy broken registry. #10585

Open
Firstyear opened this issue Aug 23, 2016 · 15 comments
Labels
area/documentation component/imageregistry lifecycle/frozen priority/P2

Comments

@Firstyear

An OpenShift install shows the following output:

oc status
svc/docker-registry - 172.30.61.89:5000
dc/docker-registry deploys registry.access.redhat.com/openshift3/ose-docker-registry:v1.2.1
deployment #1 failed 2 hours ago

But the registry command says it's working:

openshift admin registry
Docker registry "docker-registry" service exists

oc get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
docker-registry 172.30.61.89 5000/TCP 2h
kubernetes 172.30.0.1 443/TCP,53/UDP,53/TCP 4d
router 172.30.48.169 80/TCP,443/TCP,1936/TCP 1h

There is no documentation about how to redeploy or rebuild a broken registry. This is causing new container builds to fail.

Version

oc v1.2.1
kubernetes v1.2.0-36-g4a3f9c5

Additional Information

[Note] Running diagnostic: ClusterRegistry
Description: Check that there is a working Docker registry

ERROR: [DClu1006 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:203]
The "docker-registry" service exists but has no associated pods, so it
is not available. Builds and deployments that use the registry will fail.

ERROR: [DClu1001 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:173]
The "docker-registry" service exists but no pods currently running, so it
is not available. Builds and deployments that use the registry will fail.
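
The missing pods can be confirmed directly. A minimal check, assuming the registry runs in the default project and its pods carry the standard deploymentconfig label that deployments apply:

$ oc get pods -n default -l deploymentconfig=docker-registry
No resources found.

An empty result here matches the DClu1006/DClu1001 errors above: the service has no backing pods.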

@201sandeep

Yes, even I am facing the same issue. Can someone please help?

@gregswift

FWIW I had the same or similar issue. I don't recall what led to it, but here is how I solved it.

I deleted everything related to the docker-registry deployment: the service, the router, and the deployments (all of the failed ones). Then I ran

oc deploy docker-registry --latest -n default

That errored because of a few existing things, primarily the service account, but the registry itself came up.
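
Spelled out, that cleanup looks roughly like the following. This is a sketch: the object names assume a default install in the default project, and the label used to find the failed deployments is the standard one the deployer puts on its replication controllers.

# Remove the broken service and the failed deployments (RCs).
$ oc delete svc/docker-registry -n default
$ oc delete rc -n default -l openshift.io/deployment-config.name=docker-registry
# Redeploy from the surviving deployment config.
$ oc deploy docker-registry --latest -n default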

@miminar

miminar commented Feb 14, 2017

A note to myself for documenting this.

The deployment config needs to be re-created after the service. Otherwise, the registry pod will lack the environment variables

${DOCKER_REGISTRY_SERVICE_HOST}
${DOCKER_REGISTRY_SERVICE_PORT}

To test it:

$ oc rsh dc/docker-registry bash -c 'echo ${DOCKER_REGISTRY_SERVICE_HOST}:${DOCKER_REGISTRY_SERVICE_PORT}'
172.30.30.30:5000
# If a pod is started before the service exists, it will look like this
$ oc rsh dc/docker-registry bash -c 'echo ${DOCKER_REGISTRY_SERVICE_HOST}:${DOCKER_REGISTRY_SERVICE_PORT}'
:

If they are undefined, DOCKER_REGISTRY_URL will be empty, causing the following problems:

Pushing image 172.30.91.135:5000/haowang/ruby-ex:latest ...
Pushed 4/5 layers, 82% complete
Pushed 5/5 layers, 100% complete
Registry server Address:
Registry server User Name: serviceaccount
Registry server Email: serviceaccount@example.org
Registry server Password: <<non-empty>>
error: build error: Failed to push image: received unexpected HTTP status: 500 Internal Server Error

$ oc logs -f dc/docker-registry
...
time="2017-02-14T08:57:23.804381606Z" level=error msg="error creating ImageStreamMapping: ImageStreamMapping \"ruby-ex\" is invalid: image.dockerImageReference: Invalid value: \":/zhouy/ruby-ex@sha256:79884cc0d892dd8096d3f7ca9b2484045c5210ef0e488755ce4b635f231f809a\": invalid reference format" go.version=go1.7.4 http.request.contenttype="application/vnd.docker.distribution.manifest.v1+prettyjws" http.request.host="172.30.91.135:5000" http.request.id=d49a6588-c7b4-4426-bf17-8933dbef9780 http.request.method=PUT http.request.remoteaddr="10.129.0.1:51862" http.request.uri="/v2/zhouy/ruby-ex/manifests/latest"
time="2017-02-14T08:57:23.804494035Z" level=error msg="response completed with error" err.code=unknown err.detail="ImageStreamMapping \"ruby-ex\" is invalid: image.dockerImageReference: Invalid value: \":/zhouy/ruby-ex@sha256:79884cc0d892dd8096d3f7ca9b2484045c5210ef0e488755ce4b635f231f809a\": invalid reference format" err.message="unknown error" go.version=go1.7.4 http.request.contenttype="application/vnd.docker.distribution.manifest.v1+prettyjws" http.request.host="172.30.91.135:5000" http.request.id=d49a6588-c7b4-4426-bf17-8933dbef9780 http.request.method=PUT http.request.remoteaddr="10.129.0.1:51862" http.request.uri="/v2/zhouy/ruby-ex/manifests/latest"
...
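
If the pod shows the second (empty) case, a fresh rollout after the service exists repopulates the variables. A sketch, assuming the default project:

# The service must exist before the pod starts.
$ oc get svc docker-registry -n default
# Roll out a new deployment so the fresh pod picks up
# DOCKER_REGISTRY_SERVICE_HOST and DOCKER_REGISTRY_SERVICE_PORT.
$ oc rollout latest dc/docker-registry -n default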

Update: since 3.9, the following will be printed if the variables aren't set:

level=fatal msg="error parsing configuration file: configuration error in openshift.server.addr: REGISTRY_OPENSHIFT_SERVER_ADDR variable must be set when running outside of Kubernetes cluster"
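
On 3.9+, an alternative to recreating the deployment config might be to set the address explicitly. A sketch; both the use of oc set env here and the service-DNS value are assumptions, not a documented fix:

# Set the registry's server address directly on the dc (assumed workaround).
$ oc set env dc/docker-registry -n default REGISTRY_OPENSHIFT_SERVER_ADDR=docker-registry.default.svc:5000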

@201sandeep

I got it resolved by modifying the line below in the Ansible inventory and re-running the OSO install.

[nodes]
master.example.com openshift_node_labels="{'region':'infra','zone':'default'}" openshift_schedulable=true

openshift_schedulable=true is the important parameter: if it is set to "false", the registry pod cannot be scheduled in case you have a single node in the infra region.

Once done, it took a couple of minutes for my OSO registry to start working.

Regards, Sandeep
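
On a cluster that is already installed, the same schedulability flip can be made without re-running the installer. A sketch; the node name is the placeholder from the inventory above:

# Mark the node schedulable again (3.x syntax).
$ oc adm manage-node master.example.com --schedulable=true
# The node should no longer report SchedulingDisabled.
$ oc get nodes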

@walidshaari

What works for me is the following:
oc rollout latest docker-registry
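
To confirm the rollout actually brought a registry pod up (a sketch, assuming the default project):

# Watch the new deployment complete, then check for a running pod.
$ oc rollout status dc/docker-registry -n default
$ oc get pods -n default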

@davistran86

@walidshaari Thanks, you helped me out. Your command worked for me 👍

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Feb 19, 2018
@Caplost

Caplost commented Mar 21, 2018

@walidshaari THX, worked for me

@miminar

miminar commented Apr 10, 2018

Fix: openshift/openshift-docs#8666

miminar pushed a commit to miminar/openshift-docs that referenced this issue Apr 16, 2018
Resolves: openshift/origin#10585

Signed-off-by: Michal Minář <miminar@redhat.com>
@dmage

dmage commented May 9, 2018

/remove-lifecycle stale

@openshift-ci-robot removed the lifecycle/stale label May 9, 2018
@miminar removed their assignment Aug 1, 2018
@miminar

miminar commented Aug 1, 2018

The doc work I started needs a rewrite. But the registry operator will change things in a way that makes all of this guidance obsolete. This could be resolved with documentation for the operator.

@openshift-merge-robot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Oct 30, 2018
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Nov 29, 2018
@Firstyear

/lifecycle rotten
/remove-lifecycle stale

@Firstyear

/lifecycle frozen

@openshift-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label Nov 30, 2018