This repository has been archived by the owner on Jul 30, 2024. It is now read-only.
I was able to reproduce this in the lab in Reston by turning off the local DNS server and running an install with only external DNS servers configured (208.67.222.222 and 208.67.220.220). The installation failed at the same point as we saw in the factory.
Also note that the IP address "10.0.2.3:53" in the error below is not something I have in my lab or configured anywhere; I'm not sure where it is coming from.
STEP 8: WAIT FOR THE OPENSHIFT INSTALLATION TO COMPLETE
write upgrade status to tmp file localhost>localhost
wait for bootstrap to complete
failed: localhost: non-zero return code
there was an error during install
failed: localhost: level=info msg=Waiting up to 20m0s for the Kubernetes API at https://api.edge.rdc100.lan:6443...
level=error msg=Attempted to gather ClusterOperator status after wait failure: listing ClusterOperator objects: Get "https://api.edge.rdc100.lan:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp: lookup api.edge.rdc100.lan on 10.0.2.3:53: no such host
level=info msg=Use the following commands to gather logs from the cluster
level=info msg=openshift-install gather bootstrap --help
level=fatal msg=failed waiting for Kubernetes API: Get "https://api.edge.rdc100.lan:6443/version?timeout=32s": dial tcp: lookup api.edge.rdc100.lan on 10.0.2.3:53: no such host
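For triage, it can help to pull the failing hostname and the resolver address out of these log lines, so it is obvious which DNS server actually answered "no such host". The following is a minimal sketch only; the function name `parse_lookup_failure` and the regular expression are illustrative assumptions, not part of the installer. As a side note, 10.0.2.3 is the default DNS address of QEMU user-mode (slirp) networking, which may or may not be relevant to this setup.

```python
import re
from typing import NamedTuple, Optional


class LookupFailure(NamedTuple):
    host: str      # name that failed to resolve
    resolver: str  # DNS server (ip:port) that was asked


# Matches the "lookup <host> on <resolver>: no such host" fragment
# that appears in the error lines above.
_LOOKUP_RE = re.compile(
    r"lookup (?P<host>\S+) on (?P<resolver>\S+): no such host"
)


def parse_lookup_failure(line: str) -> Optional[LookupFailure]:
    """Return the host and resolver from a failed-lookup log line, if present."""
    match = _LOOKUP_RE.search(line)
    if match is None:
        return None
    return LookupFailure(match.group("host"), match.group("resolver"))


if __name__ == "__main__":
    sample = (
        'level=fatal msg=failed waiting for Kubernetes API: Get '
        '"https://api.edge.rdc100.lan:6443/version?timeout=32s": dial tcp: '
        "lookup api.edge.rdc100.lan on 10.0.2.3:53: no such host"
    )
    print(parse_lookup_failure(sample))
```

Running this over the install log shows whether every failure goes through the same resolver, or whether the configured DNS servers (208.67.222.222 and 208.67.220.220) ever appear at all.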
Please hold - I'm doing some more testing. It seems this may be related to the external DNS we were using: the install worked when using 1.1.1.1 as the external DNS.
I've done several installs over the last couple of days and reproduced this in the Reston lab. Whenever the system points to an external DNS and is forced to resolve addresses on the private network (192.168.8.xx), it seems to have problems connecting to the cluster address, and the install fails.
A fresh install on the same system with the same config.sh, changing only the DNS to a local server on the external network with the suggested DNS entries, works fine.
I've even tried changing the DNS after an install without re-installing. In general, once it fails, it tends to stay on the private network and does not work, even after I modify the DNS field to point to something external. I generally need to re-install the whole thing (RHEL plus OCS) to reset it.
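Since a failed install seems to require a full re-install, a quick pre-flight check that the cluster name resolves with whatever DNS the host is currently using could save a cycle. This is a rough sketch, not something the installer provides: the helper name `check_names` is made up for illustration, and the only hostname taken from this issue is `api.edge.rdc100.lan` (any other names you add would come from your own DNS entries).

```python
import socket
from typing import Callable, Dict, Iterable


def check_names(
    names: Iterable[str],
    resolve: Callable = socket.getaddrinfo,
    port: int = 6443,
) -> Dict[str, bool]:
    """Map each name to True if it resolves via the system resolver, else False.

    `resolve` is injectable so the check can be exercised without a network;
    by default it uses the host's configured DNS through socket.getaddrinfo.
    """
    results: Dict[str, bool] = {}
    for name in names:
        try:
            resolve(name, port)
            results[name] = True
        except socket.gaierror:
            # Raised when the resolver cannot find the name ("no such host").
            results[name] = False
    return results


if __name__ == "__main__":
    # Hostname from the error output in this issue.
    for name, ok in check_names(["api.edge.rdc100.lan"]).items():
        print(f"{name}: {'resolves' if ok else 'does NOT resolve'}")
```

If the name does not resolve here, the "Waiting up to 20m0s for the Kubernetes API" step would be expected to fail the same way, so the DNS setup can be fixed before starting the install.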
Summary of Problem: Installation fails without a local DNS server
This is related to the factory install failing with can't open "API at https://api.edge.rdc100.lan:6443"