Error restoring Rancher: This cluster is currently Unavailable; areas that interact directly with it will not be available until the API is ready

10/6/2019

I am trying to back up and restore a Rancher server (single-node install), following the documentation: https://rancher.com/docs/rancher/v2.x/en/backups/backups/single-node-backups/

After taking the backup, I shut down the original Rancher server node and started a new Rancher container on a new node (same network, but a different IP address). I then restored from the backup file, following https://rancher.com/docs/rancher/v2.x/en/backups/restorations/single-node-restoration/

After restoring, I logged in to the Rancher UI, and it showed the error quoted in the title.

So I checked the Rancher server log, which showed the following:

2019-10-05 16:41:32.197641 I | http: TLS handshake error from 127.0.0.1:38388: EOF
2019-10-05 16:41:32.202442 I | http: TLS handshake error from 127.0.0.1:38380: EOF
2019-10-05 16:41:32.210378 I | http: TLS handshake error from 127.0.0.1:38376: EOF
2019-10-05 16:41:32.211106 I | http: TLS handshake error from 127.0.0.1:38386: EOF
2019/10/05 16:42:26 [ERROR] ClusterController c-4pgjl [user-controllers-controller] failed with : failed to start user controllers for cluster c-4pgjl: failed to contact server: Get https://192.168.94.154:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect
2019/10/05 16:44:34 [ERROR] ClusterController c-4pgjl [user-controllers-controller] failed with : failed to start user controllers for cluster c-4pgjl: failed to contact server: Get https://192.168.94.154:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect
2019/10/05 16:48:50 [ERROR] ClusterController c-4pgjl [user-controllers-controller] failed with : failed to start user controllers for cluster c-4pgjl: failed to contact server: Get https://192.168.94.154:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect
2019-10-05 16:50:19.114475 I | mvcc: store.index: compact 75951
2019-10-05 16:50:19.137825 I | mvcc: finished scheduled compaction at 75951 (took 22.527694ms)
2019-10-05 16:55:19.120803 I | mvcc: store.index: compact 76282
2019-10-05 16:55:19.124813 I | mvcc: finished scheduled compaction at 76282 (took 2.746382ms)
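
To make the repeated failure easier to see, here is a rough Python sketch that groups the [ERROR] lines from the log above by cluster ID, API endpoint and reason. The regular expression is only my guess at the line format based on these few entries, and the names summarize_errors and SAMPLE_LOG are made up for this illustration.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# A few lines copied from the log above; the other line types are kept to show they are skipped.
SAMPLE_LOG = """\
2019-10-05 16:41:32.197641 I | http: TLS handshake error from 127.0.0.1:38388: EOF
2019/10/05 16:42:26 [ERROR] ClusterController c-4pgjl [user-controllers-controller] failed with : failed to start user controllers for cluster c-4pgjl: failed to contact server: Get https://192.168.94.154:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect
2019/10/05 16:44:34 [ERROR] ClusterController c-4pgjl [user-controllers-controller] failed with : failed to start user controllers for cluster c-4pgjl: failed to contact server: Get https://192.168.94.154:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect
2019/10/05 16:48:50 [ERROR] ClusterController c-4pgjl [user-controllers-controller] failed with : failed to start user controllers for cluster c-4pgjl: failed to contact server: Get https://192.168.94.154:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect
2019-10-05 16:50:19.114475 I | mvcc: store.index: compact 75951
"""

# Assumed line shape: "<date> <time> [ERROR] ClusterController <id> ... Get <url>: <reason>"
ERROR_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[ERROR\] ClusterController "
    r"(?P<cluster>\S+) .*?Get (?P<url>https://\S+?): (?P<reason>.+)$"
)


def summarize_errors(log_text):
    """Map (cluster, host, port, reason) to the list of timestamps it occurred at."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match is None:
            continue  # TLS handshake and mvcc lines do not match and are ignored
        target = urlsplit(match.group("url"))
        key = (match.group("cluster"), target.hostname, target.port, match.group("reason"))
        groups[key].append(match.group("ts"))
    return dict(groups)


if __name__ == "__main__":
    for (cluster, host, port, reason), stamps in summarize_errors(SAMPLE_LOG).items():
        print(f"{cluster} -> {host}:{port}: {reason} ({len(stamps)} times)")
```

On this excerpt every [ERROR] entry falls into a single group: the same cluster, the same API endpoint and the same reason, "waiting for cluster agent to connect".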

After that, I checked the logs on the master nodes and found that the Rancher agent is still trying to connect to the old Rancher server (the old IP address) rather than the new one, which leaves the cluster unavailable.
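
A rough way to confirm which server address the agent is configured with, sketched in Python. It assumes the agent reads the server URL from an environment variable named CATTLE_SERVER on the cattle-cluster-agent deployment in the cattle-system namespace; that is my understanding of Rancher 2.x, not something I have verified for this version. The deployment dictionary and both IP addresses below are placeholders, not my real values.

```python
from typing import Optional
from urllib.parse import urlsplit


def agent_server_url(deployment: dict, env_name: str = "CATTLE_SERVER") -> Optional[str]:
    """Return the value of env_name from the first container that defines it."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        for env in container.get("env", []):
            if env.get("name") == env_name:
                return env.get("value")
    return None


def points_at(url: Optional[str], expected_host: str) -> bool:
    """True if url is set and its host part equals expected_host."""
    return url is not None and urlsplit(url).hostname == expected_host


# Hypothetical, trimmed-down deployment object with a placeholder "old" address.
SAMPLE_DEPLOYMENT = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "cluster-register",
                        "env": [
                            {"name": "CATTLE_SERVER", "value": "https://192.0.2.10"},
                        ],
                    }
                ]
            }
        }
    }
}

NEW_SERVER_HOST = "192.0.2.20"  # placeholder for the new Rancher server address

if __name__ == "__main__":
    url = agent_server_url(SAMPLE_DEPLOYMENT)
    print("agent server URL:", url)
    print("points at new server:", points_at(url, NEW_SERVER_HOST))
```

With real data, the dictionary could be loaded with json.load from the output of kubectl -n cattle-system get deployment cattle-cluster-agent -o json, assuming the deployment has that name in the cluster.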

How can I fix this?

-- taibc
kubernetes
rancher
rke

0 Answers