When I'm trying to execute the following command
["/bin/sh", "-c", "cqlsh cassandra.my-namespace.svc.cluster.local -f /path/to/schema.cql"]
from my Job, I am receiving the following error:
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 2443, in <module>
    main(*read_options(sys.argv[1:], os.environ))
  File "/usr/bin/cqlsh.py", line 2421, in main
    encoding=options.encoding)
  File "/usr/bin/cqlsh.py", line 485, in __init__
    load_balancing_policy=WhiteListRoundRobinPolicy([self.hostname]),
  File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.11.0-bb96859b.zip/cassandra-driver-3.11.0-bb96859b/cassandra/policies.py", line 417, in __init__
socket.gaierror: [Errno -2] Name or service not known
My Job is defined as a Helm hook with the post-install annotation. My Cassandra Pod is defined using a StatefulSet:
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra:3
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        env:
        - name: CASSANDRA_SEEDS
          value: cassandra-0.cassandra.default.svc.cluster.local
        - name: MAX_HEAP_SIZE
          value: 256M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_CLUSTER_NAME
          value: "Cassandra"
        - name: CASSANDRA_DC
          value: "DC1"
        - name: CASSANDRA_RACK
          value: "Rack1"
        - name: CASSANDRA_ENDPOINT_SNITCH
          value: GossipingPropertyFileSnitch
        volumeMounts:
        - name: cassandra-data
          mountPath: /var/lib/cassandra/data
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
      annotations: # comment line if you want to use a StorageClass or specify which StorageClass
        volume.beta.kubernetes.io/storage-class: "" # comment line if you want to use a StorageClass or specify which StorageClass
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
And this is my Service:
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
When I run the cqlsh command manually from the container, everything works. Unfortunately, the automated solution throws the error shown above.
Am I missing something in the Service configuration? I assumed that, since I am connecting to the Service from the Pod created by the Job, it should work.
EDIT: Job looks like this:
apiVersion: batch/v1
kind: Job
metadata:
  name: init-db
spec:
  template:
    metadata:
      name: init-db
      annotations:
        "helm.sh/hooks": post-install
    spec:
      containers:
      - name: cqlsh
        image: <cassandra-image>
        command: ["/bin/sh", "-c", "cqlsh cassandra.my-namespace.svc.cluster.local -f /path/to/schema.cql"]
        volumeMounts:
        - name: cass-init
          mountPath: /etc/config
      volumes:
      ...
And here is the output of /etc/resolv.conf:
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
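To make the effect of the search list and the ndots option above concrete, here is a minimal Python sketch of the order in which a glibc-style resolver would try names. The function name `candidate_names` is hypothetical, and this is a simplified model for illustration, not the resolver's actual code; other libc implementations may behave differently. Note that the first search domain is default.svc.cluster.local, which suggests the Pod that produced this file runs in the default namespace.

```python
def candidate_names(name, search_domains, ndots):
    """Return the names a glibc-style resolver would try, in order.

    Simplified model: a trailing dot means "absolute, try as-is only";
    a name with fewer than `ndots` dots is tried with each search
    domain appended first, and as an absolute name last.
    """
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain}" for domain in search_domains]
    absolute = [name + "."]
    if name.count(".") >= ndots:
        return absolute + expanded
    return expanded + absolute


# Values copied from the resolv.conf output above.
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for candidate in candidate_names("cassandra.my-namespace.svc.cluster.local", search, 5):
    print(candidate)
```

The name in the Job's command has four dots, fewer than ndots:5, so in this model the three search-domain expansions are tried before the name itself. A short name such as cassandra would first expand to cassandra.default.svc.cluster.local.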
The StatefulSet doesn't have a selector, so the YAML in the question doesn't work. If you use the YAML below, it works and the Job can connect as well. Everything will be deployed in the default namespace. You need to use an appropriate storage class (the example below uses standard). You can follow the Cassandra deployment from here.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 1
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra:3
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        env:
        - name: CASSANDRA_SEEDS
          value: cassandra-0.cassandra.default.svc.cluster.local
        - name: MAX_HEAP_SIZE
          value: 256M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_CLUSTER_NAME
          value: "Cassandra"
        - name: CASSANDRA_DC
          value: "DC1"
        - name: CASSANDRA_RACK
          value: "Rack1"
        - name: CASSANDRA_ENDPOINT_SNITCH
          value: GossipingPropertyFileSnitch
        volumeMounts:
        - name: cassandra-data
          mountPath: /var/lib/cassandra/data
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard
      resources:
        requests:
          storage: 1Gi
The Service YAML:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
  selector:
    app: cassandra
The Job YAML. The command has been changed to describe cluster, so that it connects and prints the cluster info:
apiVersion: batch/v1
kind: Job
metadata:
  name: init-db
spec:
  template:
    metadata:
      name: init-db
      annotations:
        "helm.sh/hooks": post-install
    spec:
      restartPolicy: Never
      containers:
      - name: cqlsh
        image: cassandra:3
        command: ["/bin/sh", "-c", "cqlsh cassandra.default.svc.cluster.local 9042 -e 'describe cluster'"]
The error you posted indicates that wherever you're running the cqlsh command, it cannot resolve the service name.
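The traceback ends in socket.gaierror, which Python's socket module raises when a host name cannot be resolved. As a rough way to check this from inside the Job's container before cqlsh runs, here is a minimal sketch that retries the lookup a few times. The helper name `wait_for_dns` and its `attempts`, `delay`, `resolver`, and `sleep` parameters are hypothetical and exist only for illustration; it assumes a Python interpreter is available in the image.

```python
import socket
import time


def wait_for_dns(host, attempts=5, delay=2.0,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Try to resolve `host`, retrying on failure.

    Returns the sorted list of resolved addresses, or an empty list
    if every attempt raised socket.gaierror.
    """
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(host, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            print(f"attempt {attempt}/{attempts}: cannot resolve {host}: {exc}")
            if attempt < attempts:
                sleep(delay)
    return []


# Example call, using the Service name from the question:
# wait_for_dns("cassandra.my-namespace.svc.cluster.local")
```

If the list comes back empty for the fully qualified name but not for a name in another namespace, the namespace part of the host name is a likely suspect.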
Depending on how your k8s cluster is configured and where the job runs (inside the same k8s cluster or external), you'll need to expose access to the pods with Ingress or NodePort.
Aleš Nosek has a good explanation of how to access pods in his blog post here. Cheers!
Since you are running Istio as a service mesh, there might be some problems with your setup.
First I would recommend enabling envoy logging for both pods:
kubectl exec -it <pod-name> -c istio-proxy -- curl -X POST http://localhost:15000/logging?level=trace
Now follow the Istio sidecar logs with:
kubectl logs <pod-name> -c istio-proxy -f
This may already tell you why the connection is failing.
Some ideas about what the problem might be:
1) The Cassandra pod doesn't have a sidecar
If the Cassandra pod doesn't have a sidecar and you are enforcing mTLS, it will receive encrypted traffic but be unable to decrypt it. This contradicts the documentation and seems to be a bug. You can disable mTLS for traffic to the Cassandra service with a DestinationRule:
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: cassandra-disable-mtls
spec:
  host: "cassandra-host" # insert the correct Cassandra host name here
  trafficPolicy:
    tls:
      mode: DISABLE
2) The Job's pod doesn't have a sidecar
If the Cassandra pod has a sidecar but the Job's pod doesn't, and you enforce mTLS, the Cassandra sidecar will deny access because the traffic isn't encrypted. You can either inject a sidecar into the Job's pod (recommended) or enable PERMISSIVE mode for Cassandra, so that the sidecar accepts both TLS-encrypted and plain-text traffic:
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: cassandra-allow-plain-traffic
spec:
  selector:
    matchLabels:
      app: cassandra
  mtls:
    mode: PERMISSIVE
Please let me know if you need further assistance.