OKD 3.11 Installation failed "Control plane pods didn't come up"
Environment
Ansible inventory file
## OpenShift master nodes
# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd
# host group for masters
[masters]
SBSTJVMLX605 openshift_ip=192.168.62.95
# host group for etcd
[etcd]
SBSTJVMLX605 openshift_ip=192.168.62.95
# host group for nodes, includes region info
[nodes]
SBSTJVMLX605 openshift_node_group_name='node-config-master-infra' openshift_schedulable=true openshift_ip=192.168.62.95
[OSEv3:vars]
openshift_deployment_type=origin
#openshift_release="3.11"
openshift_disable_check=disk_availability,docker_storage,package_version,memory_availability
openshift_enable_docker_excluder=false
debug_level=4
Errors from Ansible
Failure summary:
1. Hosts: SBSTJVMLX605
Play: Configure masters
Task: Report control plane errors
Message: Control plane pods didn't come up
Errors from journalctl
31207 kubelet.go:2101] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Also
origin-node[31207]: W0418 10:26:47.866972 31207 docker_service.go:545] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.866991 31207 docker_service.go:238] Hairpin mode set to "hairpin-veth"
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: W0418 10:26:47.867101 31207 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: W0418 10:26:47.870697 31207 cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.870720 31207 plugins.go:159] Loaded network plugin "cni"
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.870745 31207 docker_service.go:253] Docker cri networking managed by cni
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.883666 31207 docker_service.go:258] Docker Info: &{ID:BHX3:QGBT:WAEZ:TNJL:PCFX:4S6K:CEV6:QMMQ:NELX:RCXS:EXOE:AXYY Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:5 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:false BridgeNfIP6tables:false Debug:false NFd:16 OomKillDisable:true NGoroutines:23 SystemTime:2019-04-18T10:26:47.873639519+05:30 LoggingDriver:json-file CgroupDriver:systemd NEventsListener:0 KernelVersion:3.10.0-957.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc4208af5e0 NCPU:4 MemTotal:8370298880 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy:http://sbstjsvfw01:8080/ HTTPSProxy:http://sbstjsvfw01:8080/ NoProxy: 192.168.62.95, localhost,.cluster.local,.svc,127.0.0.1,169.254.169.254,172.30.0.1,192.168.62.95,sbstjvmlx605.suntecsbs.com Name:SBSTJVMLX605.SUNTECSBS.COM Labels:[] ExperimentalBuild:false ServerVersion:1.13.1 ClusterStore: ClusterAdvertise: Runtimes:map[docker-runc:{Path:/usr/libexec/docker/docker-runc-current Args:[]} runc:{Path:docker-runc Args:[]}] DefaultRuntime:docker-runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:0xc420c46dc0} LiveRestoreEnabled:false Isolation: InitBinary:/usr/libexec/docker/docker-init-current ContainerdCommit:{ID: Expected:aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1} RuncCommit:{ID:df5c38a9167e87f53a9894d77c0950e178a745e7 Expected:9df8b306d01f59d3a8029be411de015b7304dd8f} 
InitCommit:{ID:fec3683b971d9c3ef73f284f176672c44b448662 Expected:949e6facb77383876aeff8a6944dde66b3089574} SecurityOptions:[name=seccomp,profile=/etc/docker/seccomp.json name=selinux]}
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.883740 31207 docker_service.go:271] Setting cgroupDriver to systemd
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.883810 31207 kubelet.go:628] Starting the GRPC server for the docker CRI shim.
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.883872 31207 docker_server.go:59] Start dockershim grpc server
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: W0418 10:26:47.883909 31207 util_unix.go:75] Using "/var/run/dockershim.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/dockershim.sock".
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.890481 31207 oom_linux.go:65] attempting to set "/proc/19601/oom_score_adj" to "-999"
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.890657 31207 oom_linux.go:65] attempting to set "/proc/19610/oom_score_adj" to "-999"
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.900614 31207 remote_runtime.go:43] Connecting to runtime service /var/run/dockershim.sock
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: W0418 10:26:47.900638 31207 util_unix.go:75] Using "/var/run/dockershim.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/dockershim.sock".
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.900715 31207 remote_image.go:40] Connecting to image service /var/run/dockershim.sock
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: W0418 10:26:47.900728 31207 util_unix.go:75] Using "/var/run/dockershim.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/dockershim.sock".
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.900860 31207 plugins.go:56] Registering credential provider: .dockercfg
Apr 18 10:26:47 SBSTJVMLX605.SUNTECSBS.COM origin-node[31207]: I0418 10:26:47.902596 31207 kuberuntime_manager.go:186] Container runtime docker initialized, version: 1.13.1, apiVersion: 1.26.0
There were also many "connection refused" errors on port 8443 (the master API).
Has anyone come across something similar? I had a cluster running on 3.9 that was working properly. I tried this as a fresh installation after uninstalling 3.9.
Phew, I finally got it up!
The key log message was: Unable to update cni config: No networks found in /etc/cni/net.d.
The network plugin was probably not ready because of this, which is also what @capt2101akash pointed out.
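To confirm that log message on a node, a minimal sketch like the following lists whatever network configs exist in the CNI directory. The function name find_cni_configs is hypothetical, and the extension set is an assumption about which files are picked up as CNI configs (.conf, .conflist, .json).

```python
from pathlib import Path

# Assumption: files with these extensions are treated as CNI network configs.
CNI_EXTENSIONS = {".conf", ".conflist", ".json"}


def find_cni_configs(conf_dir="/etc/cni/net.d"):
    """Return the sorted names of CNI config files in conf_dir (empty if none)."""
    directory = Path(conf_dir)
    if not directory.is_dir():
        # A missing directory means no network configs at all.
        return []
    return sorted(
        p.name
        for p in directory.iterdir()
        if p.is_file() and p.suffix in CNI_EXTENSIONS
    )


if __name__ == "__main__":
    configs = find_cni_configs()
    if configs:
        print("CNI configs found:", ", ".join(configs))
    else:
        print("No CNI network configs found in /etc/cni/net.d")
```

An empty result matches the "No networks found" warning in the journal output.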
I found a workaround on a mailing list.
Basically, I created the file /etc/cni/net.d/80-openshift-network.conf with the following content:
{ "cniVersion": "0.2.0", "name": "openshift-sdn", "type": "openshift-sdn" }
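For anyone who prefers to script that step, here is a rough sketch that writes the same content to the same path. The helper name write_sdn_config and its parameters are made up for illustration; the JSON and the file name are the ones quoted above. Writing under /etc normally requires root.

```python
import json
from pathlib import Path

# Config content and file name as quoted in the post.
SDN_CONFIG = {
    "cniVersion": "0.2.0",
    "name": "openshift-sdn",
    "type": "openshift-sdn",
}


def write_sdn_config(conf_dir="/etc/cni/net.d",
                     filename="80-openshift-network.conf"):
    """Write the workaround CNI config and return the path that was written."""
    target = Path(conf_dir) / filename
    # Create the directory if the installer has not created it yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(SDN_CONFIG, indent=2) + "\n")
    return target


if __name__ == "__main__":
    print("Wrote", write_sdn_config())
```

This only reproduces the manual workaround; it does not address why the installer did not create the file itself.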
It still did not come up; the Ansible playbook reported the same "Control plane pods didn't come up" error.
This time the master logs (/usr/local/bin/master-logs api api) showed a "command not found" error in /etc/origin/master/master.env.
I suspected a problem in the inventory file: the proxy entry had an extra \ and the no-proxy entry contained some spaces.
I sanitized the inventory file.
Then I redeployed, and the cluster is now up and accessible.
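The kind of inventory problem described above (a stray backslash in a proxy value, spaces in the no-proxy list) can be caught before deploying with a small check along these lines. This is only a sketch: find_suspect_proxy_values is a hypothetical helper, the variable names in the sample are assumed (the proxy lines were not shown in the inventory above), and it simply flags any key containing "proxy" whose value has a backslash or whitespace.

```python
def find_suspect_proxy_values(inventory_text):
    """Return (line_number, key, reason) for proxy settings that look malformed."""
    problems = []
    for number, raw in enumerate(inventory_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines, comments, [section] headers, and lines without key=value.
        if not line or line.startswith(("#", ";", "[")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if "proxy" not in key.lower():
            continue
        # Ignore surrounding quotes so only the value itself is inspected.
        value = value.strip().strip("'\"")
        if "\\" in value:
            problems.append((number, key, "contains a backslash"))
        if any(ch.isspace() for ch in value):
            problems.append((number, key, "contains whitespace"))
    return problems


if __name__ == "__main__":
    # Hypothetical sample; variable names and values are assumptions.
    sample = (
        "[OSEv3:vars]\n"
        "openshift_http_proxy=http://proxy.example:8080/\\\n"
        "openshift_no_proxy=a.example, b.example\n"
    )
    for number, key, reason in find_suspect_proxy_values(sample):
        print(f"line {number}: {key} {reason}")
```

Values flagged this way end up in /etc/origin/master/master.env, where stray characters can plausibly produce the "command not found" error seen in the master logs.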
This is only a workaround; I will post an update if I find the real cause.
You need a network plugin for your installation to complete. It is failing because it can't find a network plugin to use, for example a CNI plugin such as weave-net. Here you can find the list of plugins supported by Kubernetes.