Why does the OpenShift installer for the AWS provider fail, unable to connect to the Kubernetes API?

12/18/2019

The OpenShift installer for AWS fails while waiting for the Kubernetes API to become available, with the fatal error "waiting for Kubernetes API: context deadline exceeded":

$ openshift-install create cluster --dir=$HOME/openshift --log-level debug
...
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout 
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout 
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout 
DEBUG Still waiting for the Kubernetes API: Get https://api.cluster-name.'IP_ADDRESS'.nip.io:6443/version?timeout=32s: dial tcp 'IP_ADDRESS':6443: i/o timeout 
DEBUG Fetching "Install Config"...                 
DEBUG Loading "Install Config"...                  
DEBUG   Loading "SSH Key"...                       
DEBUG   Using "SSH Key" loaded from state file     
DEBUG   Loading "Base Domain"...                   
DEBUG     Loading "Platform"...                    
DEBUG     Using "Platform" loaded from state file  
DEBUG   Using "Base Domain" loaded from state file 
DEBUG   Loading "Cluster Name"...                  
DEBUG     Loading "Base Domain"...                 
DEBUG   Using "Cluster Name" loaded from state file 
DEBUG   Loading "Pull Secret"...                   
DEBUG   Using "Pull Secret" loaded from state file 
DEBUG   Loading "Platform"...                      
DEBUG Using "Install Config" loaded from state file 
DEBUG Reusing previously-fetched "Install Config"  
INFO Use the following commands to gather logs from the cluster 
... 
FATAL waiting for Kubernetes API: context deadline exceeded 

The problem is also described here

-- Vic K
amazon-web-services
kubernetes
openshift

1 Answer

12/18/2019

In my case the installer tried to connect to a Kubernetes API address that pointed to a non-existent endpoint. One indication of this is that the oc client hangs when you run a simple command like oc whoami: it tries to connect to the same endpoint (provided that KUBECONFIG is set).
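A quick way to tell a DNS problem from a network problem is to check the API hostname outside of oc. The following is a minimal sketch, not part of the installer: the helper names (api_host, resolves, reachable) are made up for illustration, and the server URL is assumed to be the one from your kubeconfig.

```python
import socket
from urllib.parse import urlparse


def api_host(server_url):
    """Extract the hostname from a kubeconfig 'server' URL."""
    return urlparse(server_url).hostname


def resolves(host):
    """Return True if the local resolver can resolve the hostname."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def reachable(host, port=6443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical URL; substitute the 'server' value from your kubeconfig.
    host = api_host("https://api.cluster-name.openshift.example.com:6443")
    print("host:", host)
    print("resolves:", resolves(host))
    print("reachable:", reachable(host))
```

If the name does not resolve, or resolves but nothing answers on port 6443, the cause is likely on the DNS or networking side rather than in oc itself.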
It turned out that the cause was the Route 53 hosted zone, specifically the subdomain. When OpenShift is installed against a subdomain (as in my case), the parent domain needs an NS record set that delegates to the subdomain's hosted zone. So, for openshift.example.com, do the following in the AWS console:
Go to Route 53 -> Hosted zones -> click openshift.example.com. (create the hosted zone if it does not exist) -> copy its NS records, e.g.:

ns-711.awsdns-24.net.   
ns-126.awsdns-15.com.   
ns-1274.awsdns-31.org.   
ns-1556.awsdns-02.co.uk.

Back in Hosted zones -> example.com. -> Create Record Set:
Name: openshift.example.com, Type: NS - Name server, Value: paste the copied NS records.
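The console steps above could also be scripted. The sketch below only builds the change batch for the NS delegation record. The function name and the TTL of 300 seconds are assumptions for illustration, and the commented-out boto3 call needs AWS credentials and the ID of the parent hosted zone (PARENT_ZONE_ID is a placeholder).

```python
import json


def ns_delegation_change_batch(subdomain, name_servers, ttl=300):
    """Build a Route 53 ChangeBatch that delegates subdomain to name_servers.

    ttl=300 is an arbitrary illustrative default, not a recommendation.
    """
    return {
        "Comment": f"Delegate {subdomain} to its own hosted zone",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": subdomain,
                    "Type": "NS",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ns} for ns in name_servers],
                },
            }
        ],
    }


if __name__ == "__main__":
    # NS values copied from the openshift.example.com hosted zone.
    batch = ns_delegation_change_batch(
        "openshift.example.com.",
        [
            "ns-711.awsdns-24.net.",
            "ns-126.awsdns-15.com.",
            "ns-1274.awsdns-31.org.",
            "ns-1556.awsdns-02.co.uk.",
        ],
    )
    print(json.dumps(batch, indent=2))

    # To apply it to the parent zone (example.com.) with boto3:
    # import boto3
    # boto3.client("route53").change_resource_record_sets(
    #     HostedZoneId=PARENT_ZONE_ID,  # placeholder: ID of the example.com. zone
    #     ChangeBatch=batch,
    # )
```

The record is created in the parent zone (example.com.), not in the subdomain's own zone, which matches the console procedure.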

After this change the installation went through successfully.

-- Vic K
Source: StackOverflow