My ingress pod is having trouble reaching two ClusterIP services by IP. There are plenty of other ClusterIP services it has no trouble reaching, including some in the same namespace. Another pod has no problem reaching the service (I tried the default backend in the same namespace and it was fine).
Where should I look? Here are two of my actual services; the ingress pod cannot reach the first but can reach the second:
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2019-08-23T16:59:10Z"
    labels:
      app: pka-168-emtpy-id
      app.kubernetes.io/instance: palletman-pka-168-emtpy-id
      app.kubernetes.io/managed-by: Tiller
      helm.sh/chart: pal-0.0.1
      release: palletman-pka-168-emtpy-id
    name: pka-168-emtpy-id
    namespace: palletman
    resourceVersion: "108574168"
    selfLink: /api/v1/namespaces/palletman/services/pka-168-emtpy-id
    uid: 539364f9-c5c7-11e9-8699-0af40ce7ce3a
  spec:
    clusterIP: 100.65.111.47
    ports:
    - port: 80
      protocol: TCP
      targetPort: 8080
    selector:
      app: pka-168-emtpy-id
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2019-03-05T19:57:26Z"
    labels:
      app: production
      app.kubernetes.io/instance: palletman
      app.kubernetes.io/managed-by: Tiller
      helm.sh/chart: pal-0.0.1
      release: palletman
    name: production
    namespace: palletman
    resourceVersion: "81337664"
    selfLink: /api/v1/namespaces/palletman/services/production
    uid: e671c5e0-3f80-11e9-a1fc-0af40ce7ce3a
  spec:
    clusterIP: 100.65.82.246
    ports:
    - port: 80
      protocol: TCP
      targetPort: 8080
    selector:
      app: production
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
My ingress pod:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    sumologic.com/format: text
    sumologic.com/sourceCategory: 103308/CT/LI/kube_ingress
    sumologic.com/sourceName: kube_ingress
  creationTimestamp: "2019-08-21T19:34:48Z"
  generateName: ingress-nginx-65877649c7-
  labels:
    app: ingress-nginx
    k8s-addon: ingress-nginx.addons.k8s.io
    pod-template-hash: "2143320573"
  name: ingress-nginx-65877649c7-5npmp
  namespace: kube-ingress
  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: ingress-nginx-65877649c7
    uid: 97db28a9-c43f-11e9-920a-0af40ce7ce3a
  resourceVersion: "108278133"
  selfLink: /api/v1/namespaces/kube-ingress/pods/ingress-nginx-65877649c7-5npmp
  uid: bcd92d96-c44a-11e9-8699-0af40ce7ce3a
spec:
  containers:
  - args:
    - /nginx-ingress-controller
    - --default-backend-service=$(POD_NAMESPACE)/nginx-default-backend
    - --configmap=$(POD_NAMESPACE)/ingress-nginx
    - --publish-service=$(POD_NAMESPACE)/ingress-nginx
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.13
    imagePullPolicy: Always
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /healthz
        port: 10254
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: ingress-nginx
    ports:
    - containerPort: 80
      name: http
      protocol: TCP
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-dg5wn
      readOnly: true
  dnsPolicy: ClusterFirst
  nodeName: ip-10-55-131-177.eu-west-1.compute.internal
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 60
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: default-token-dg5wn
    secret:
      defaultMode: 420
      secretName: default-token-dg5wn
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-08-21T19:34:48Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2019-08-21T19:34:50Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2019-08-21T19:34:48Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://d597673f4f38392a52e9537e6dd2473438c62c2362a30e3d58bf8a98e177eb12
    image: gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.13
    imageID: docker-pullable://gcr.io/google_containers/nginx-ingress-controller@sha256:c9d2e67f8096d22564a6507794e1a591fbcb6461338fc655a015d76a06e8dbaa
    lastState: {}
    name: ingress-nginx
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: "2019-08-21T19:34:50Z"
  hostIP: 10.55.131.177
  phase: Running
  podIP: 172.6.218.18
  qosClass: BestEffort
  startTime: "2019-08-21T19:34:48Z"
It could be a connectivity problem on the node where your pod is running, or something related to the network overlay. First, check which node that pod is running on:
$ kubectl get pod <pod-name> -n <namespace> -o=json | jq -r '.spec.nodeName'
Check if the node is 'Ready':
$ kubectl get node <node-from-above>
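The two checks above can be combined into a pair of small helpers. This is only a minimal sketch, assuming bash and a kubectl context that points at the cluster; the function names node_of_pod and node_is_ready are hypothetical, not part of kubectl.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of kubectl.

# Print the node a specific pod is scheduled on.
# Usage: node_of_pod <pod-name> <namespace>
node_of_pod() {
  kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.nodeName}'
}

# Succeed only if the node's Ready condition has status "True".
# Usage: node_is_ready <node-name>
node_is_ready() {
  local status
  status=$(kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
  [ "$status" = "True" ]
}
```

For the ingress pod in the question this would be called as node=$(node_of_pod ingress-nginx-65877649c7-5npmp kube-ingress), followed by node_is_ready "$node" && echo "$node is Ready".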
If it's ready, then ssh into the node to further troubleshoot:
$ ssh <node-from-above>
Is your network overlay pod (Calico, Weave, or whichever CNI plugin you use) running on the node?
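One way to answer that question without ssh is to list the pods scheduled on that node and filter for the overlay. The sketch below assumes bash and kubectl; the function name overlay_pods_on_node is hypothetical, and the grep pattern is an assumption that should be adjusted to match the pod names of your own network plugin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the overlay-network pods scheduled on one node.
# Usage: overlay_pods_on_node <node-name>
overlay_pods_on_node() {
  # The field selector restricts the listing to pods on the given node.
  # The grep pattern is an assumption; change it to match your CNI plugin.
  kubectl get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=$1" \
    | grep -Ei 'calico|weave'
}
```

For the node in the question: overlay_pods_on_node ip-10-55-131-177.eu-west-1.compute.internal. The function returns a non-zero status when no matching pod is listed, and a listed pod that is not Running would also be worth a closer look.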
You can troubleshoot further by connecting to the pod's container:
# From <node-from-above>
$ docker exec -it <container-id-in-pod> bash
# Check connectivity (ping, dig, curl, etc)
Alternatively, do the same with the kubectl command line (if you have network connectivity to the node):
$ kubectl exec -it <pod-id> -c <container-name> -- bash
# Troubleshoot...
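Once you have a shell, a plain TCP probe against each ClusterIP is a quick first check. The following is a rough sketch, assuming the container has bash and the coreutils timeout command (neither is guaranteed in every image); the function name probe is hypothetical. It uses bash's /dev/tcp redirection, so curl or nc are not needed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: try to open a TCP connection and report the result.
# Usage: probe <host> <port>
probe() {
  local host=$1 port=$2
  # /dev/tcp/<host>/<port> is handled by bash itself, not the filesystem.
  # timeout gives up after 3 seconds so a silently dropped packet cannot hang.
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port unreachable"
  fi
}
```

With the services from the question, you would run probe 100.65.111.47 80 and probe 100.65.82.246 80 inside the ingress pod, then repeat both from the node itself. If the first address fails only from inside the pod, that suggests the problem sits between the pod and the node rather than in the service.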