I have two microservices deployed as backends in minikube; call them deployment A and deployment B. Each deployment runs a different number of pod replicas.
Deployment B is exposed as service B. The pods of deployment A call the pods of deployment B through service B, which is of type ClusterIP.
The pods of deployment B run a scrapyd application with scraping spiders deployed in it. Each celery worker (a pod of deployment A) takes a task from a Redis queue and calls the scrapyd service to schedule spiders on it.
Everything works fine, but after I scale the application (deployments A and B separately), kubectl top pods shows that resource consumption is not uniform.
Some of the pods of deployment B are not used at all. The pattern I observe is that the pods of deployment B that come up after all the pods of deployment A are already running are never utilized.
Is this normal behavior? I suspect the connections between the pods of deployment A and deployment B are persistent, but I am confused as to why requests are not distributed uniformly across the pods of deployment B. Sorry for the naive question; I am new to this field.
The manifest for deployment A is:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-worker
  labels:
    deployment: celery-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      pod: celery-worker
  template:
    metadata:
      labels:
        pod: celery-worker
    spec:
      containers:
        - name: celery-worker
          image: celery:latest
          imagePullPolicy: Never
          command: ['celery', '-A', 'mysite', 'worker', '-E', '-l', 'info']
          resources:
            limits:
              cpu: 500m
            requests:
              cpu: 200m
      terminationGracePeriodSeconds: 200
and that of deployment B is:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scrapyd
  labels:
    app: scrapyd
spec:
  replicas: 1
  selector:
    matchLabels:
      pod: scrapyd
  template:
    metadata:
      labels:
        pod: scrapyd
    spec:
      containers:
        - name: scrapyd
          image: scrapyd:latest
          imagePullPolicy: Never
          ports:
            - containerPort: 6800
          resources:
            limits:
              cpu: 800m
            requests:
              cpu: 800m
      terminationGracePeriodSeconds: 100
---
kind: Service
apiVersion: v1
metadata:
  name: scrapyd
spec:
  selector:
    pod: scrapyd
  ports:
    - protocol: TCP
      port: 6800
      targetPort: 6800
Output of kubectl top pods:
The solution I figured out for the above problem is as follows:
In the current setup, install Linkerd by following the guide: Linkerd Installation in Kubernetes
After that, inject the Linkerd proxy into the celery deployment as follows:
cat celery/deployment.yaml | linkerd inject - | kubectl apply -f -
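This ensures that requests from celery first pass through this proxy and are then load balanced directly across the scrapyd pods at layer 7, that is, per HTTP request. This matters because kube-proxy chooses a backend pod only when a TCP connection is opened, so a persistent connection keeps sending every request to the same pod. With the proxy in place, kube-proxy is bypassed and its default layer 4 (per connection) load balancing no longer applies.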
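To illustrate why pods started later were never used, here is a minimal simulation sketch of per-connection (L4) versus per-request (L7) balancing. It is a toy model, not how kube-proxy or Linkerd are implemented: the pod names, the function names, and the round-robin picker are all assumptions made for the example, and the real proxies use their own selection policies.

```python
import itertools
from collections import Counter


def l4_per_connection(clients, backends_at_connect, requests_per_client):
    """Toy model of L4 balancing: each client opens ONE persistent connection,
    and the backend is chosen once, when that connection is opened."""
    counts = Counter()
    picker = itertools.cycle(backends_at_connect)  # round-robin is an assumption
    for _ in range(clients):
        backend = next(picker)  # decided at connect time, never revisited
        counts[backend] += requests_per_client
    return counts


def l7_per_request(clients, backends_now, requests_per_client):
    """Toy model of L7 balancing: a backend is chosen again for every request,
    from the set of backends that exist right now."""
    counts = Counter()
    picker = itertools.cycle(backends_now)  # round-robin is an assumption
    for _ in range(clients * requests_per_client):
        counts[next(picker)] += 1
    return counts


# Hypothetical pod names: two scrapyd pods exist when the workers connect,
# and two more are added afterwards by scaling up.
old_pods = ["scrapyd-1", "scrapyd-2"]
all_pods = old_pods + ["scrapyd-3", "scrapyd-4"]

# Workers connected before the scale-up, so they only ever saw the old pods.
l4 = l4_per_connection(clients=4, backends_at_connect=old_pods, requests_per_client=10)
# With per-request balancing, the new pods receive traffic as soon as they exist.
l7 = l7_per_request(clients=4, backends_now=all_pods, requests_per_client=10)

print("L4 (per connection):", {pod: l4[pod] for pod in all_pods})
print("L7 (per request):   ", {pod: l7[pod] for pod in all_pods})
```

In the L4 case the pods added after the workers connected receive no requests at all, which matches the pattern seen in kubectl top pods. In the L7 case all four pods share the load.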