Problem: I have 2 replicas of my app, library. I have configured a Service to talk to the two replicas, and a Dockerfile that starts the app on port 8080. The load balancer and the app containers all seem to be running, but I am not able to connect to them: every time I open the external IP shown, I get "Bad Request (400)".
This log from the container seems to indicate something going wrong with the DB connection:
2019-12-07 07:33:56.669 IST
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
    self.connect()
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 195, in connect
    self.connection = self.get_new_connection(conn_params)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 126, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: could not connect to server: Connection refused
$ kubectl get services
NAME          TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
kubernetes    ClusterIP      10.19.240.1     <none>         443/TCP        61m
library-svc   LoadBalancer   10.19.254.164   34.93.141.11   80:30227/TCP   50m
$ kubectl get pods
NAME                       READY   STATUS    RESTARTS   AGE
libary-6f9b45fcdb-g5xfv    1/1     Running   0          54m
library-745d6798d8-m45gq   3/3     Running   0          12m
In settings.py
ALLOWED_HOSTS = ['library-259506.appspot.com', '127.0.0.1', 'localhost', '*']
Here is the library.yaml file that goes with it.
# [START kubernetes_deployment]
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: library
  labels:
    app: library
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: library
    spec:
      containers:
      - name: library-app
        # Replace with your project ID or use `make template`
        image: gcr.io/library-259506/library
        # This setting makes nodes pull the docker image every time before
        # starting the pod. This is useful when debugging, but should be turned
        # off in production.
        imagePullPolicy: Always
        env:
        # [START cloudsql_secrets]
        - name: DATABASE_USER
          valueFrom:
            secretKeyRef:
              name: cloudsql
              key: username
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: cloudsql
              key: password
        # [END cloudsql_secrets]
        ports:
        - containerPort: 8080
      # [START proxy_container]
      - image: gcr.io/cloudsql-docker/gce-proxy:1.16
        name: cloudsql-proxy
        command: ["/cloud_sql_proxy", "--dir=/cloudsql",
                  "-instances=library-259506:asia-south1:library=tcp:3306",
                  "-credential_file=/secrets/cloudsql/credentials.json"]
        volumeMounts:
        - name: cloudsql-oauth-credentials
          mountPath: /secrets/cloudsql
          readOnly: true
        - name: ssl-certs
          mountPath: /etc/ssl/certs
        - name: cloudsql
          mountPath: /cloudsql
      # [END proxy_container]
      # [START volumes]
      volumes:
      - name: cloudsql-oauth-credentials
        secret:
          secretName: cloudsql-oauth-credentials
      - name: ssl-certs
        hostPath:
          path: /etc/ssl/certs
      - name: cloudsql
        emptyDir:
      # [END volumes]
# [END kubernetes_deployment]
---
# [START service]
# The library-svc service provides a load-balancing proxy over the library app
# pods. By specifying the type as 'LoadBalancer', Container Engine will
# create an external network load balancer.
# The service directs traffic to the deployment by matching the service's
# selector to the labels on the deployment's pods.
#
# For more information about external load balancing, see:
# https://cloud.google.com/container-engine/docs/load-balancer
apiVersion: v1
kind: Service
metadata:
  name: library-svc
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: library
# [END service]
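For completeness, the DATABASE_USER / DATABASE_PASSWORD env vars and the proxy's credentials.json above come from two Secrets that the manifest references by name (cloudsql and cloudsql-oauth-credentials). A minimal sketch of what they would look like as manifests, with the names and keys taken from the deployment and the values as placeholders (they can equally be created with kubectl create secret generic):

apiVersion: v1
kind: Secret
metadata:
  name: cloudsql
type: Opaque
stringData:
  username: REPLACE_WITH_DB_USER      # placeholder
  password: REPLACE_WITH_DB_PASSWORD  # placeholder
---
apiVersion: v1
kind: Secret
metadata:
  name: cloudsql-oauth-credentials
type: Opaque
data:
  # placeholder: base64-encoded service account key file
  credentials.json: REPLACE_WITH_BASE64_KEY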
The Dockerfile:
FROM python:3
# Send Python output straight to the container logs (no buffering)
ENV PYTHONUNBUFFERED 1
RUN mkdir /code
WORKDIR /code
# Install dependencies first so this layer is cached between code changes
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/
# Server: listens on 8080, matching containerPort/targetPort in library.yaml
EXPOSE 8080
STOPSIGNAL SIGINT
ENTRYPOINT ["python", "manage.py"]
CMD ["runserver", "0.0.0.0:8080"]
library-app container logs
2019-12-07T02:03:58.742639999Z Performing system checks...
2019-12-07T02:03:58.742701271Z
2019-12-07T02:03:58.816567541Z System check identified no issues (0 silenced).
2019-12-07T02:03:59.338790311Z December 07, 2019 - 02:03:59
2019-12-07T02:03:59.338986187Z Django version 2.2.6, using settings 'locallibrary.settings'
2019-12-07T02:03:59.338995688Z Starting development server at http://0.0.0.0:8080/
2019-12-07T02:03:59.338999467Z Quit the server with CONTROL-C.
2019-12-07T02:04:00.814238478Z [07/Dec/2019 02:04:00] "GET / HTTP/1.1" 400 26
library container logs:
2019-12-07T02:03:56.568839208Z Performing system checks...
2019-12-07T02:03:56.568912262Z
2019-12-07T02:03:56.624835039Z System check identified no issues (0 silenced).
2019-12-07T02:03:56.669088750Z Exception in thread django-main-thread:
2019-12-07T02:03:56.669204639Z Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
self.connect()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 195, in connect
self.connection = self.get_new_connection(conn_params)
File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
connection = Database.connect(**conn_params)
File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 126, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: could not connect to server: Connection refused
2019-12-07T02:03:56.672779570Z Is the server running on host "127.0.0.1" and accepting
2019-12-07T02:03:56.672783910Z TCP/IP connections on port 3306?
2019-12-07T02:03:56.672826889Z
2019-12-07T02:03:56.672903098Z
2019-12-07T02:03:56.672909494Z The above exception was the direct cause of the following exception:
2019-12-07T02:03:56.672913216Z
2019-12-07T02:03:56.672962576Z Traceback (most recent call last):
File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.8/site-packages/django/utils/autoreload.py", line 54, in wrapper
fn(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/django/core/management/commands/runserver.py", line 120, in inner_run
self.check_migrations()
File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 453, in check_migrations
executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/executor.py", line 18, in __init__
self.loader = MigrationLoader(self.connection)
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 49, in __init__
self.build_graph()
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 212, in build_graph
self.applied_migrations = recorder.applied_migrations()
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 73, in applied_migrations
if self.has_table():
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 56, in has_table
return self.Migration._meta.db_table in self.connection.introspection.table_names(self.connection.cursor())
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 256, in cursor
return self._cursor()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 233, in _cursor
self.ensure_connection()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
self.connect()
File "/usr/local/lib/python3.8/site-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
self.connect()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 195, in connect
self.connection = self.get_new_connection(conn_params)
File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
connection = Database.connect(**conn_params)
File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 126, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: could not connect to server: Connection refused
2019-12-07T02:03:56.684080787Z Is the server running on host "127.0.0.1" and accepting
2019-12-07T02:03:56.684085077Z TCP/IP connections on port 3306?
Cloud SQL proxy container logs:
2019-12-07T02:03:57.648625670Z 2019/12/07 02:03:57 current FDs rlimit set to 1048576, wanted limit is 8500. Nothing to do here.
2019-12-07T02:03:57.660316236Z 2019/12/07 02:03:57 using credential file for authentication; email=library-service@library-259506.iam.gserviceaccount.com
2019-12-07T02:03:58.167649397Z 2019/12/07 02:03:58 Listening on 127.0.0.1:3306 for library-259506:asia-south1:library
2019-12-07T02:03:58.167761109Z 2019/12/07 02:03:58 Ready for new connections
2019-12-07T02:03:58.865747581Z 2019/12/07 02:03:58 New connection for "library-259506:asia-south1:library"
2019-12-07T03:03:29.000559014Z 2019/12/07 03:03:29 ephemeral certificate for instance library-259506:asia-south1:library will expire soon, refreshing now.
2019-12-07T04:02:59.004152307Z 2019/12/07 04:02:59 ephemeral certificate for instance library-259506:asia-south1:library will expire soon, refreshing now.
I can't believe it: the issue was that my library.yaml was missing the securityContext below for the Cloud SQL proxy container. I believe runAsUser controls which user ID the containers run with; here it forces the proxy to run as a non-root (daemon) user, and a runAsUser set in a container's own securityContext overrides any pod-level setting on a per-container basis. The snippet is shown in context further below.
https://cloud.google.com/solutions/best-practices-for-operating-containers#avoid_running_as_root
# [START cloudsql_security_context]
securityContext:
  runAsUser: 2  # non-root user
  allowPrivilegeEscalation: false
# [END cloudsql_security_context]
From docs https://kubernetes.io/docs/concepts/policy/pod-security-policy/#users-and-groups: "MustRunAsNonRoot - Requires that the pod be submitted with a non-zero runAsUser or have the USER directive defined (using a numeric UID) in the image. Pods which have specified neither runAsNonRoot nor runAsUser settings will be mutated to set runAsNonRoot=true, thus requiring a defined non-zero numeric USER directive in the container. No default provided. Setting allowPrivilegeEscalation=false is strongly recommended with this strategy."
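For anyone wondering where exactly this goes: in my case it belongs on the cloudsql-proxy container in library.yaml. A sketch of that container with the fix in place; everything except the securityContext block is unchanged from the manifest in the question:

      # [START proxy_container]
      - image: gcr.io/cloudsql-docker/gce-proxy:1.16
        name: cloudsql-proxy
        command: ["/cloud_sql_proxy", "--dir=/cloudsql",
                  "-instances=library-259506:asia-south1:library=tcp:3306",
                  "-credential_file=/secrets/cloudsql/credentials.json"]
        # [START cloudsql_security_context]
        securityContext:
          runAsUser: 2                    # non-root user
          allowPrivilegeEscalation: false
        # [END cloudsql_security_context]
        volumeMounts:                     # unchanged from the manifest above
        - name: cloudsql-oauth-credentials
          mountPath: /secrets/cloudsql
          readOnly: true
        - name: ssl-certs
          mountPath: /etc/ssl/certs
        - name: cloudsql
          mountPath: /cloudsql
      # [END proxy_container]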