I'm using the Google spark-operator and some Calico network policies to protect the namespaces.
The Spark driver pods need to be able to communicate with the `kubernetes` service in the `default` namespace to speak with the api-server.
This is what I get:
Operation: [get] for kind: [Pod] with name: [xx] in namespace: [xx] failed.
*The problem is:
using any kind of network policy blocks communication toward the `default` namespace. Restoring connectivity is possible, but selecting the `kubernetes.default` service is still impossible because it is a special service (it has no selectors), so you can't communicate with it!*
I tried opening communication to all pods in the `default` and `kube-system` namespaces. It works for all services except `kubernetes.default`, which is still unreachable!
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: mynetpol
spec:
  selector: all()
  types:
    - Egress
  egress:
    # Allows comm to kube-system namespace
    - action: Allow
      destination:
        selector: all()
        namespaceSelector: ns == 'kube-system'
    - action: Allow
      source:
        selector: all()
        namespaceSelector: ns == 'kube-system'
    # Allows comm to default namespace
    - action: Allow
      destination:
        selector: all()
        namespaceSelector: ns == 'default'
    - action: Allow
      source:
        selector: all()
        namespaceSelector: ns == 'default'
For some reason, curling `kubernetes.default.svc.cluster.local:443` times out even though all communication is wide open.
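The curl test can also be reproduced from inside a driver pod with a small TCP probe. This is only a sketch: `can_connect` is a hypothetical helper name, and the 3 second timeout is an arbitrary choice.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False
```

Calling `can_connect("kubernetes.default.svc.cluster.local", 443)` before and after applying a policy should show whether the policy is what blocks the api-server traffic.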
So... in the end...
Network policies don't work on services that don't target pods, which is the case for this particular `kubernetes` service sitting quietly in the `default` namespace. Policy selectors match pods, and this service has no selector, so a pod-based rule has nothing to match. It's a special service that always points to the api-server.
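To see the difference, a quick check on a Service manifest is enough. The sketch below assumes the manifest was obtained with something like `kubectl get service kubernetes --namespace default -o json`; `has_selector` is a hypothetical helper, and both sample manifests are trimmed, made-up illustrations rather than real cluster output.

```python
def has_selector(service: dict) -> bool:
    """Return True if the Service manifest selects pods through spec.selector."""
    return bool(service.get("spec", {}).get("selector"))


# Trimmed, illustrative manifests (hypothetical values, not real cluster output).
kubernetes_svc = {"metadata": {"name": "kubernetes", "namespace": "default"},
                  "spec": {"ports": [{"port": 443}]}}
ordinary_svc = {"metadata": {"name": "my-app", "namespace": "default"},
                "spec": {"selector": {"app": "my-app"}, "ports": [{"port": 80}]}}

print(has_selector(kubernetes_svc))  # False
print(has_selector(ordinary_svc))    # True
```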
The solution is to retrieve the api-server's real IP and allow egress to it.
To find this IP you can use this command:
kubectl get endpoints --namespace default kubernetes
Courtesy of @Dave McNeill
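If you want to script this step, the same command with `-o json` can be parsed. A minimal sketch, assuming the standard Endpoints layout (`subsets[].addresses[].ip` and `subsets[].ports[].port`); `api_server_targets` is a hypothetical helper and the sample address is a placeholder from the documentation range, not a real api-server.

```python
import json


def api_server_targets(endpoints: dict) -> list[tuple[str, int]]:
    """Return the (ip, port) pairs listed in an Endpoints object."""
    targets = []
    for subset in endpoints.get("subsets", []):
        ports = [p["port"] for p in subset.get("ports", [])]
        for address in subset.get("addresses", []):
            for port in ports:
                targets.append((address["ip"], port))
    return targets


# Illustrative sample shaped like the output of
# `kubectl get endpoints --namespace default kubernetes -o json`.
sample = json.loads("""
{"subsets": [{"addresses": [{"ip": "192.0.2.10"}],
              "ports": [{"name": "https", "port": 6443, "protocol": "TCP"}]}]}
""")
print(api_server_targets(sample))  # [('192.0.2.10', 6443)]
```

A cluster with several control-plane nodes would list several addresses, and each of them needs to be allowed.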
Then you can allow this IP in your network policy.
If you are using the default netpol API, check Dave's answer: https://stackoverflow.com/a/56494510/5512455
If you are using Calico policies, which I encourage you to do because the Kube ones suck, the following is the working YAML:
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-egress-api-server
spec:
  selector: all()
  types:
    - Egress
  egress:
    # Allow api-server
    - action: Allow
      protocol: TCP
      destination:
        nets:
          - <Your api-server IP>/32
        # Use the port reported by the endpoints command above (6443 here)
        ports:
          - 6443
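Since the api-server IP can differ per cluster, the policy can be generated instead of edited by hand. The following is just a sketch: `render_policy` is a hypothetical helper, it assumes the `projectcalico.org/v3` API version used in the question's policy, and it relies on the standard-library `ipaddress` module to reject malformed input and pick the single-host prefix.

```python
import ipaddress


def render_policy(ips: list[str], port: int = 6443,
                  name: str = "allow-egress-api-server") -> str:
    """Render a Calico egress policy allowing TCP to the given api-server IPs."""
    if not ips:
        raise ValueError("at least one api-server IP is required")
    # ip_network() on a bare address yields a single-host network (/32 for IPv4)
    # and raises ValueError on anything that is not a valid address.
    nets = "\n".join(f"          - {ipaddress.ip_network(ip)}" for ip in ips)
    return (
        "apiVersion: projectcalico.org/v3\n"
        "kind: NetworkPolicy\n"
        "metadata:\n"
        f"  name: {name}\n"
        "spec:\n"
        "  selector: all()\n"
        "  types:\n"
        "    - Egress\n"
        "  egress:\n"
        "    - action: Allow\n"
        "      protocol: TCP\n"
        "      destination:\n"
        "        nets:\n"
        f"{nets}\n"
        "        ports:\n"
        f"          - {port}\n"
    )


# Placeholder addresses from the documentation range, not real api-server IPs.
print(render_policy(["192.0.2.10", "192.0.2.11"]))
```

The printed YAML can then be saved to a file and applied the same way as the hand-written policy.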