The title may sound a little cryptic, but here is the thing.
I have a Kubernetes cluster on GKE (v1.20.10-gke.1600) which hosts a bunch of small applications.
Some of them need to be reachable only through an internal IP. We use a VPN to act as if we were inside the network, and it works flawlessly for everything else.
I'm fairly sure these applications were reachable when I first deployed them, but it's been two months and they are rarely used, so when I got a call today telling me that users were getting an error, I was baffled.
When I set this up I followed this guide: https://cloud.google.com/kubernetes-engine/docs/how-to/internal-load-balance-ingress
These are the YAML configurations; hopefully I've redacted all the sensitive data:
Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo-app
  labels:
    app: "foo"
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: "foo"
  template:
    metadata:
      labels:
        app: "foo"
    spec:
      containers:
        - name: foo-container
          image: eu.gcr.io/project-12345/foo:--DEPLOYMENT-NUMBER--
          imagePullPolicy: Always
          resources:
            limits:
              memory: 300Mi
              cpu: 300m
            requests:
              memory: 100Mi
              cpu: 100m
          ports:
            - containerPort: 8080
              name: tcp
      nodeSelector:
        bha.preemptible: "false"
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: foo-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 95
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
Service and Ingress
apiVersion: v1
kind: Service
metadata:
  name: foo-nodeport
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
  labels:
    app: "foo"
spec:
  selector:
    app: "foo"
  type: NodePort
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: foo-ingress
  annotations:
    kubernetes.io/ingress.class: gce-internal
    kubernetes.io/ingress.regional-static-ip-name: foo-ingress
spec:
  rules:
    - host: foo.domain.net
      http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: foo-nodeport
                port:
                  number: 80
Running a kubectl apply on those two YAML files seems to work without issues.
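For context, the deploy step essentially does the following (the file names below are just how I split the manifests above, so treat them as placeholders):

kubectl apply -f foo-deployment.yaml        # Deployment + HPA
kubectl apply -f foo-service-ingress.yaml   # Service + Ingress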
I've taken note of the warning about the firewall rules, run the proper update, and followed the troubleshooting steps, but everything is healthy.
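For reference, the firewall update I ran was along these lines; the proxy-only subnet range below is a placeholder for the one reserved in our VPC, while the network name is the shared VPC that shows up in the load balancer logs:

# Allow the internal HTTP(S) LB proxy-only subnet to reach the container port 8080.
# PROXY_ONLY_SUBNET_RANGE is a placeholder for our proxy-only subnet CIDR.
gcloud compute firewall-rules create allow-proxy-connection \
  --allow=tcp:8080 \
  --source-ranges=PROXY_ONLY_SUBNET_RANGE \
  --network=network-shared-vpc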
I created the internal static IP (10.11.0.3) prior to the deployment, and I can see it being used as the load balancer frontend. The load balancer backend is correctly pointing at the right node IP, 10.101.1.2.
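This is how I double-checked the reserved address (region taken from the generated load balancer resources):

# Should show the reserved regional internal address, i.e. address: 10.11.0.3
gcloud compute addresses describe foo-ingress --region=europe-west2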
However, if I try to open 10.11.0.3 or foo.domain.net from the browser I get an upstream request timeout. If I try to open 10.101.1.2 it loads up the application.
I checked whether the container had any problems, but everything seems fine: the nginx process is running and there are no errors in the logs. It looks to me like the issue is somewhere in the load balancer configuration, but I can't figure out what's wrong.
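In case it helps, these are the checks I've run on the Kubernetes side so far and nothing looks off to me (the frontend-apps namespace is inferred from the generated load balancer resource names):

# Ingress events plus the annotations GKE adds (backends, forwarding rule, URL map)
kubectl describe ingress foo-ingress -n frontend-apps

# The NEG controller records the zonal NEG names in this annotation on the Service
kubectl get service foo-nodeport -n frontend-apps \
  -o jsonpath='{.metadata.annotations.cloud\.google\.com/neg-status}'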
Does anyone know at least where to look to better understand where the problem is?
Edit:
At @cesar's suggestion I checked Logs Explorer for any occurrence of the domain name, and the only things that come up are warnings with payloads like this:
{
  "insertId": "ut0lskdjtr8",
  "jsonPayload": {
    "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry"
  },
  "httpRequest": {
    "requestMethod": "GET",
    "requestUrl": "http://foo.domain.net/",
    "requestSize": "517",
    "status": 504,
    "responseSize": "118",
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36",
    "remoteIp": "10.13.209.144:52409",
    "serverIp": "10.101.1.2:8080",
    "latency": "30.001717s",
    "protocol": "HTTP/1.1"
  },
  "resource": {
    "type": "internal_http_lb_rule",
    "labels": {
      "region": "europe-west2",
      "project_id": "project-12345",
      "backend_type": "NETWORK_ENDPOINT_GROUP",
      "backend_target_type": "BACKEND_SERVICE",
      "network_name": "network-shared-vpc",
      "backend_scope": "europe-west2-c",
      "matched_url_path_rule": "/",
      "backend_scope_type": "ZONE",
      "url_map_name": "k8s2-um-zeslpm63-ns-frontend-apps-foo-ingress-usro40mc",
      "backend_name": "k8s1-009b06c9-ns-frontend-apps-foo-nodeport-80-8d5eff42",
      "backend_target_name": "k8s1-009b06c9-ns-frontend-apps-foo-nodeport-80-8d5eff42",
      "forwarding_rule_name": "k8s2-fr-zeslpm63-ns-frontend-apps-foo-ingress-usro40mc",
      "target_proxy_name": "k8s2-tp-zeslpm63-ns-frontend-apps-foo-ingress-usro40mc"
    }
  },
  "timestamp": "2021-11-12T09:14:29.481099Z",
  "severity": "WARNING",
  "logName": "projects/project-12345/logs/loadbalancing.googleapis.com%2Frequests",
  "receiveTimestamp": "2021-11-12T09:15:07.271078019Z"
}
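Since the entry shows a 504 after roughly 30 seconds, my next step is to check the health of the backend it names, along these lines (backend and NEG names copied from the log entry; I'm assuming the zonal NEG shares the backend service's name):

# Health of the endpoints behind the regional backend service
gcloud compute backend-services get-health \
  k8s1-009b06c9-ns-frontend-apps-foo-nodeport-80-8d5eff42 \
  --region=europe-west2

# Endpoints registered in the zonal NEG (zone from backend_scope in the log)
gcloud compute network-endpoint-groups list-network-endpoints \
  k8s1-009b06c9-ns-frontend-apps-foo-nodeport-80-8d5eff42 \
  --zone=europe-west2-c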