I have installed jenkins on a GKE cluster using the stable helm chart.
I am able to access and log in to the UI.
However, when trying to run a simple job, the agent pod fails to be created.
The Jenkins logs are not very informative about this:
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.523+0000 [id=184] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-ld008, template=PodTemplate{inheritFrom='', name='default', slaveConnectTimeout=0, label='jenkins-kos-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins/agent', command='', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='500m', resourceRequestMemory='1Gi', resourceLimitCpu='4000m', resourceLimitMemory='8Gi', envVars=[ContainerEnvVar [getValue()=http://jenkins-kos.jenkins.svc.cluster.local:8080/jenkins, getKey()=JENKINS_URL]]}]}
jenkins-kos-58586644f9-vh278 jenkins java.lang.IllegalStateException: Pod has terminated containers: jenkins/default-ld008 (jnlp)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:166)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:187)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:127)
jenkins-kos-58586644f9-vh278 jenkins at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:132)
jenkins-kos-58586644f9-vh278 jenkins at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:290)
jenkins-kos-58586644f9-vh278 jenkins at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
jenkins-kos-58586644f9-vh278 jenkins at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
jenkins-kos-58586644f9-vh278 jenkins at java.util.concurrent.FutureTask.run(FutureTask.java:266)
jenkins-kos-58586644f9-vh278 jenkins at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
jenkins-kos-58586644f9-vh278 jenkins at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
jenkins-kos-58586644f9-vh278 jenkins at java.lang.Thread.run(Thread.java:748)
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.524+0000 [id=184] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-ld008
jenkins-kos-58586644f9-vh278 jenkins Terminated Kubernetes instance for agent jenkins/default-ld008
jenkins-kos-58586644f9-vh278 jenkins Disconnected computer default-ld008
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.559+0000 [id=184] INFO o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent jenkins/default-ld008
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:46.560+0000 [id=184] INFO o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer default-ld008
jenkins-kos-58586644f9-vh278 jenkins 2020-01-28 18:30:56.009+0000 [id=53
Here are the kubernetes events:
0s Normal Scheduled pod/default-zkwp4 Successfully assigned jenkins/default-zkwp4 to gke-kos-nodepool1-kq69
0s Normal Pulled pod/default-zkwp4 Container image "docker.io/istio/proxyv2:1.4.0" already present on machine
0s Normal Created pod/default-zkwp4 Created container
0s Normal Started pod/default-zkwp4 Started container
0s Normal Pulled pod/default-zkwp4 Container image "jenkins/jnlp-slave:3.27-1" already present on machine
0s Normal Created pod/default-zkwp4 Created container
0s Normal Started pod/default-zkwp4 Started container
0s Normal Pulled pod/default-zkwp4 Container image "docker.io/istio/proxyv2:1.4.0" already present on machine
1s Normal Created pod/default-zkwp4 Created container
0s Normal Started pod/default-zkwp4 Started container
0s Warning Unhealthy pod/default-zkwp4 Readiness probe failed: Get http://10.15.2.113:15020/healthz/ready: dial tcp 10.15.2.113:15020: connect: connection refused
0s Warning Unhealthy pod/default-zkwp4 Readiness probe failed: Get http://10.15.2.113:15020/healthz/ready: dial tcp 10.15.2.113:15020: connect: connection refused
0s Normal Killing pod/default-zkwp4 Killing container with id docker://istio-proxy:Need to kill Pod
The TCP port for agent communication is fixed to 50000.
I am using jenkins/jnlp-slave:3.27-1 for the agent image.
Any ideas what might be causing this?
UPDATE 1: Here is a gist with the description of the failed agent.
UPDATE 2: Managed to pinpoint the actual error in the jnlp logs using stackdriver (although I am not aware of the root cause yet):
SEVERE: Failed to connect to http://jenkins-kos.jenkins.svc.cluster.local:8080/jenkins/tcpSlaveAgentListener/: Connection refused (Connection refused)
UPDATE 3: Here comes the weird(est) part: from a pod I spin up within the jenkins namespace, the same URL is reachable:
/ # dig +short jenkins-kos.jenkins.svc.cluster.local
10.14.203.189
/ # nc -zv -w 3 jenkins-kos.jenkins.svc.cluster.local 8080
jenkins-kos.jenkins.svc.cluster.local (10.14.203.189:8080) open
/ # curl http://jenkins-kos.jenkins.svc.cluster.local:8080/jenkins/tcpSlaveAgentListener/
Jenkins
UPDATE 4: I can confirm that this occurs on a GKE cluster using istio 1.4.0 but NOT on another one using 1.1.15.
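You can disable the Istio sidecar proxy for the agent pods. The events in the question show an istio-proxy container being injected into the agent pod and failing its readiness probe, so the jnlp container is probably trying to reach Jenkins before the proxy is ready to carry traffic.
Go to Manage Jenkins -> Configuration -> Kubernetes Cloud.
In the pod template, select the Annotations option and add the annotation below.
sidecar.istio.io/inject: "false"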
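For reference, here is a minimal sketch of how the annotation should appear on the agent pod once the pod template carries it. The pod name, namespace, and jnlp image are taken from the logs in the question. Everything else is abbreviated, so treat this as an illustration and not the exact manifest the Kubernetes plugin generates.

```yaml
# Sketch of an agent pod with Istio sidecar injection disabled.
# Only the annotation matters here; the rest of the spec is abbreviated.
apiVersion: v1
kind: Pod
metadata:
  name: default-zkwp4        # example agent pod name from the events in the question
  namespace: jenkins
  annotations:
    sidecar.istio.io/inject: "false"   # annotation values are strings, hence the quotes
spec:
  containers:
    - name: jnlp
      image: jenkins/jnlp-slave:3.27-1
```

With the annotation in place, the Istio injector should skip the pod, so the agent is expected to start with only the jnlp container and no istio-proxy.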