I have two jobs that will run only once. One is called Master and one is called Slave. As the names imply, the Master pod needs some info from the Slave and then queries an API online. A simple scheme of how they communicate looks like this:
Slave --- port 6666 ---> Master ---- port 8888 ---> internet:www.example.com
To achieve this I created five YAML files:
apiVersion: batch/v1
kind: Job
metadata:
  name: master-job
  labels:
    app: master-job
    role: master-job
spec:
  template:
    metadata:
      name: master
    spec:
      containers:
        - name: master
          image: registry.gitlab.com/example
          command: ["python", "run.py", "-wait"]
          ports:
            - containerPort: 6666
      imagePullSecrets:
        - name: regcred
      restartPolicy: Never
apiVersion: v1
kind: Service
metadata:
  name: master-service
  labels:
    app: master-job
    role: master-job
spec:
  selector:
    app: master-job
    role: master-job
  ports:
    - protocol: TCP
      port: 6666
      targetPort: 6666
apiVersion: v1
kind: Service
metadata:
  name: master-np-service
spec:
  type: NodePort
  selector:
    app: master-job
  ports:
    - protocol: TCP
      port: 8888
      targetPort: 8888
      nodePort: 31000
apiVersion: batch/v1
kind: Job
metadata:
  name: slave-job
  labels:
    app: slave-job
spec:
  template:
    metadata:
      name: slave
    spec:
      containers:
        - name: slave
          image: registry.gitlab.com/example2
          ports:
            - containerPort: 6666
          #command: ["python", "run.py", "master-service.default.svc.cluster.local"]
          #command: ["python", "run.py", "10.106.146.155"]
          command: ["python", "run.py", "master-service"]
      imagePullSecrets:
        - name: regcred
      restartPolicy: Never
apiVersion: v1
kind: Service
metadata:
  name: slave-service
spec:
  selector:
    app: slave-job
  ports:
    - protocol: TCP
      port: 6666
      targetPort: 6666
But no matter what I do (as can be seen in the commented lines of the job_slave.yaml file), they cannot communicate with each other except when I put the IP of the Master node in the command section of the Slave. Also, the Master node cannot communicate with the outside world (even though I created a configMap with upstreamNameservers: | ["8.8.8.8"]).
Everything is running in a minikube environment. But I cannot pinpoint what my problem is. Any help is appreciated.
Your Job spec has two parts: a description of the Job itself, and a description of the Pods it creates. (Using a Job here is a little odd and I'd probably pick a Deployment instead, but the same applies here.) The Service object has a selector: that has to match the labels: of the Pods.
In the YAML files you show, the Jobs have correct labels but the generated Pods don't. You need to add (potentially duplicate) labels to the pod template part:
apiVersion: batch/v1
kind: Job
metadata:
  name: master-job
  labels: {...}
spec:
  template:
    metadata:
      # name: will get ignored here
      labels:
        app: master-job
        role: master-job
You should be able to verify with kubectl describe service master-service. At the end of its output will be a line that says Endpoints:. If the Service selector and the Pod labels don't match this will say <none>; if they do match you will see the Pod IP addresses.
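To make the matching rule concrete, here is a minimal Python sketch of how an equality-based selector is compared against Pod labels. It is only an illustration, not Kubernetes code: the function name selector_matches is made up, and the job-name and controller-uid entries stand in for the labels the Job controller adds on its own (the UID value is a placeholder).

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Return True if every key/value pair in the selector is present on the Pod.

    This mirrors the equality-based matching a Service selector uses:
    extra labels on the Pod are fine, missing or different ones are not.
    """
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Selector taken from master-service in the question.
service_selector = {"app": "master-job", "role": "master-job"}

# Pod created from the original Job: the template sets no labels, so only the
# labels added by the Job controller are present (values are placeholders).
pod_without_template_labels = {
    "job-name": "master-job",
    "controller-uid": "<uid>",
}

# Pod created after adding labels under spec.template.metadata.
pod_with_template_labels = {
    **pod_without_template_labels,
    "app": "master-job",
    "role": "master-job",
}

if __name__ == "__main__":
    print(selector_matches(service_selector, pod_without_template_labels))
    print(selector_matches(service_selector, pod_with_template_labels))
```

The first Pod is the situation where Endpoints: shows <none>; the second is the one where the Pod IP appears.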
(You don't need a NodePort service unless you need to accept requests from outside the cluster; it could be the same as the service you use to accept requests from within the cluster. You don't need to include objects' types in their names. Nothing you've shown has any obvious relevance to communication out of the cluster.)
Try a headless service:
apiVersion: v1
kind: Service
metadata:
  name: master-service
  labels:
    app: master-job
    role: master-job
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: master-job
    role: master-job
  ports:
    - protocol: TCP
      port: 6666
      targetPort: 6666
and use command: ["python", "run.py", "master-service"] in your job_slave.yaml.
Make sure your master job is listening on port 6666 inside your container.
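Since the contents of run.py are not shown, the following is only a rough sketch of what "listening on port 6666" could look like on the Master side and how the Slave might connect by Service name. The helper names open_listener and connect_with_retry are hypothetical. The two points it illustrates are binding to 0.0.0.0 rather than 127.0.0.1 (a socket bound to loopback is not reachable from another Pod) and retrying on the client side, because the two Jobs may not start in a fixed order.

```python
import socket
import time


def open_listener(host: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Master side: listen on all interfaces so other Pods can reach the port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen()
    return srv


def connect_with_retry(host: str, port: int = 6666,
                       attempts: int = 10, delay: float = 1.0) -> socket.socket:
    """Slave side: resolve the name and connect, retrying while the Master is not ready.

    DNS failures (socket.gaierror) and refused connections are both OSError
    subclasses, so a single except clause covers them.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=5)
        except OSError as exc:
            last_error = exc
            time.sleep(delay)
    raise ConnectionError(f"could not reach {host}:{port}") from last_error


# Hypothetical usage inside the Pods:
#   Master: srv = open_listener(); conn, addr = srv.accept()
#   Slave:  sock = connect_with_retry("master-service")
```

If the Slave can connect by IP but not by name, the listener is fine and the problem is on the Service side (selector, labels, or DNS).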