I have an existing microservice infrastructure in which I use Keycloak as the auth provider. Everything worked fine until I enabled the Istio service mesh for the k8s namespace. Now all other containers work except Keycloak. My Deployment for Keycloak looks like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: keycloak
  namespace: lms
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: keycloak
    spec:
      containers:
      - name: keycloak
        image: jboss/keycloak
        ports:
        - name: http
          containerPort: 8080
        - name: https
          containerPort: 8443
        env:
        - name: DB_DATABASE
          value: lms
        - name: DB_USER
          value: root
        - name: DB_PASSWORD
          value: "some pass"
        - name: DB_ADDR
          value: mysql
        - name: DB_PORT
          value: "3306"
        - name: KEYCLOAK_USER
          value: admin
        - name: KEYCLOAK_PASSWORD
          value: "some pass"
        - name: PROXY_ADDRESS_FORWARDING
          value: "true"
        readinessProbe:
          httpGet:
            path: /auth/
            port: 8080
          initialDelaySeconds: 120
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /auth/
            port: 8080
          initialDelaySeconds: 300
          periodSeconds: 10
The k8s readiness probe fails with the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned lms/keycloak-8cccb54c6-7czmq to gke-ing-standard-cluster-default-pool-59e1dee5-d4sn
Normal Pulled 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Container image "docker.io/istio/proxy_init:1.2.4" already present on machine
Normal Created 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Created container
Normal Started 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Started container
Normal Pulling 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn pulling image "jboss/keycloak"
Normal Pulled 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Successfully pulled image "jboss/keycloak"
Normal Created 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Created container
Normal Started 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Started container
Normal Pulled 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Container image "docker.io/istio/proxyv2:1.2.4" already present on machine
Normal Created 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Created container
Normal Started 10m kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Started container
Warning Unhealthy 8m30s kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn Readiness probe failed: Get http://10.8.0.67:8080/health: read tcp 10.8.0.1:34376->10.8.0.67:8080: read: connection reset by peer
Warning Unhealthy 30s (x31 over 7m) kubelet, gke-ing-standard-cluster-default-pool-59e1dee5-d4sn (combined from similar events): Readiness probe failed: Get http://10.8.0.67:8080/health: read tcp 10.8.0.1:37320->10.8.0.67:8080: read: connection reset by peer
The istio-proxy sidecar log looks like this:
2020-02-28T10:55:25.880386Z info FLAG: --zipkinAddress="zipkin.istio-system:9411"
2020-02-28T10:55:25.880485Z info Version root@ubuntu-docker.io/istio-94746ccd404a8e056483dd02e4e478097b950da6-dirty-94746ccd404a8e056483dd02e4e478097b950da6-dirty-Modified
2020-02-28T10:55:25.880701Z info Obtained private IP [10.8.0.67]
2020-02-28T10:55:25.881093Z info Proxy role: &model.Proxy{ClusterID:"", Type:"sidecar", IPAddresses:[]string{"10.8.0.67", "10.8.0.67"}, ID:"keycloak-8cccb54c6-7czmq.lms", Locality:(*core.Locality)(nil), DNSDomain:"lms.svc.cluster.local", TrustDomain:"cluster.local", PilotIdentity:"", MixerIdentity:"", ConfigNamespace:"", Metadata:map[string]string{}, SidecarScope:(*model.SidecarScope)(nil), ServiceInstances:[]*model.ServiceInstance(nil), WorkloadLabels:model.LabelsCollection(nil)}
2020-02-28T10:55:25.881244Z info PilotSAN []string{"spiffe://cluster.local/ns/istio-system/sa/istio-pilot-service-account"}
2020-02-28T10:55:25.882101Z info Effective config: binaryPath: /usr/local/bin/envoy
concurrency: 2
configPath: /etc/istio/proxy
connectTimeout: 10s
controlPlaneAuthPolicy: MUTUAL_TLS
discoveryAddress: istio-pilot.istio-system:15011
drainDuration: 45s
parentShutdownDuration: 60s
proxyAdminPort: 15000
serviceCluster: keycloak.lms
statNameLength: 189
tracing:
zipkin:
address: zipkin.istio-system:9411
2020-02-28T10:55:25.882325Z info Monitored certs: []string{"/etc/certs/key.pem", "/etc/certs/root-cert.pem", "/etc/certs/cert-chain.pem"}
2020-02-28T10:55:25.882399Z info waiting 2m0s for /etc/certs/key.pem
2020-02-28T10:55:25.882487Z info waiting 2m0s for /etc/certs/root-cert.pem
2020-02-28T10:55:25.882556Z info waiting 2m0s for /etc/certs/cert-chain.pem
2020-02-28T10:55:25.882634Z info PilotSAN []string{"spiffe://cluster.local/ns/istio-system/sa/istio-pilot-service-account"}
2020-02-28T10:55:25.882932Z info Opening status port 15020
2020-02-28T10:55:25.883419Z info Starting proxy agent
2020-02-28T10:55:25.883790Z info watching /etc/certs for changes
2020-02-28T10:55:25.883872Z info Received new config, resetting budget
2020-02-28T10:55:25.883950Z info Reconciling retry (budget 10)
2020-02-28T10:55:25.884016Z info Epoch 0 starting
2020-02-28T10:55:25.914398Z info Envoy command: [-c /etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster keycloak.lms --service-node sidecar~10.8.0.67~keycloak-8cccb54c6-7czmq.lms~lms.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --allow-unknown-fields -l warning --component-log-level misc:error --concurrency 2]
[2020-02-28 10:55:25.976][12][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, no healthy upstream
[2020-02-28 10:55:25.976][12][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:49] Unable to establish new stream
2020-02-28T10:55:28.462367Z info Envoy proxy is ready
[2020-02-28 10:56:01.031][12][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 13,
[2020-02-28T10:57:00.599Z] "- - -" 0 - "-" "-" 1432 1491 226010 - "-" "-" "-" "-" "10.8.2.28:3306" outbound|3306||mysql.lms.svc.cluster.local 10.8.0.67:49060 10.8.2.28:3306 10.8.0.67:49058 -
[2020-02-28T10:56:51.178Z] "- - -" 0 - "-" "-" 48058 142271 235436 - "-" "-" "-" "-" "10.8.2.28:3306" outbound|3306||mysql.lms.svc.cluster.local 10.8.0.67:48996 10.8.2.28:3306 10.8.0.67:48994 -
[2020-02-28T10:56:50.072Z] "- - -" 0 - "-" "-" 2350 2551 236544 - "-" "-" "-" "-" "10.8.2.28:3306" outbound|3306||mysql.lms.svc.cluster.local 10.8.0.67:48980 10.8.2.28:3306 10.8.0.67:48978 -
[2020-02-28 11:01:02.564][12][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 13,
[2020-02-28T11:02:08.889Z] "- - -" 0 - "-" "-" 1432 1491 247625 - "-" "-" "-" "-" "10.8.2.28:3306" outbound|3306||mysql.lms.svc.cluster.local 10.8.0.67:50964 10.8.2.28:3306 10.8.0.67:50962 -
[2020-02-28T11:01:59.772Z] "- - -" 0 - "-" "-" 48058 142271 256746 - "-" "-" "-" "-" "10.8.2.28:3306" outbound|3306||mysql.lms.svc.cluster.local 10.8.0.67:50908 10.8.2.28:3306 10.8.0.67:50906 -
[2020-02-28T11:01:58.988Z] "- - -" 0 - "-" "-" 2350 2551 257532 - "-" "-" "-" "-" "10.8.2.28:3306" outbound|3306||mysql.lms.svc.cluster.local 10.8.0.67:50902 10.8.2.28:3306 10.8.0.67:50900 -
[2020-02-28 11:07:05.000][12][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 13,
Any ideas how to fix this?
After a lot of headaches, I figured out that the Keycloak server actually starts fine and is reachable from inside the pod. Because the failing health check keeps forcing the container to be restarted, the reported root cause is misleading: the kubelet's probe request arrives as plain HTTP from outside the mesh, so the Envoy sidecar (which expects mesh traffic, e.g. mutual TLS) resets the connection. Eventually, adding the annotation sidecar.istio.io/rewriteAppHTTPProbers: "true" to the pod template fixed the issue:
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: keycloak
      annotations:
        sidecar.istio.io/rewriteAppHTTPProbers: "true"
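For context: this annotation tells Istio's sidecar injector to rewrite httpGet probes so they target the pilot-agent's status port (15020), which then forwards the request to the application over localhost inside the pod, bypassing Envoy's inbound listener. After injection, the readiness probe in the running pod spec ends up looking roughly like this (the exact /app-health path is generated by the injector; the one below is illustrative, not copied from a real pod):

```yaml
readinessProbe:
  httpGet:
    # Rewritten by the Istio sidecar injector: pilot-agent listens on
    # 15020 and proxies this request to http://localhost:8080/auth/
    # inside the pod, so the kubelet never hits Envoy directly.
    path: /app-health/keycloak/readyz
    port: 15020
  initialDelaySeconds: 120
  periodSeconds: 10
```

If you would rather enable this mesh-wide instead of per pod, Istio in this version range also supports turning probe rewriting on globally at install time via the sidecar injector setting (the Helm value values.sidecarInjectorWebhook.rewriteAppHTTPProbe=true); check the docs for your exact Istio release.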