I have an application running in my GKE cluster that needs access to www.googleapis.com. I also use Network Policy to enhance security.
With a default deny-all egress policy in place, I naturally cannot connect to www.googleapis.com. I get the error:
INFO 0827 14:33:53.313241 retry_util.py] Retrying request, attempt #3...
DEBUG 0827 14:33:53.313862 http_wrapper.py] Caught socket error, retrying: timed out
DEBUG 0827 14:33:53.314035 http_wrapper.py] Retrying request to url https://www.googleapis.com/storage/v1/b?project=development&projection=noAcl&key=AIzaSyDnac<key>bmJM&fields=nextPageToken%2Citems%2Fid&alt=json&maxResults=1000 after exception timed out
I found out that the hostname www.googleapis.com corresponds to the IP 216.58.207.36, so I went ahead and created an egress entry in my Network Policy:
spec:
  egress:
  - ports:
    - port: 443
      protocol: TCP
    to:
    - ipBlock:
        cidr: 216.58.207.36/32
Now, from within the Pod, I can telnet to this endpoint:
$ telnet googleapis.com 443
Trying 216.58.207.36...
Connected to googleapis.com.
Escape character is '^]'.
But for some reason I'm still encountering the same error:
INFO 0827 14:36:15.767508 retry_util.py] Retrying request, attempt #5...
DEBUG 0827 14:36:15.768018 http_wrapper.py] Caught socket error, retrying: timed out
DEBUG 0827 14:36:15.768128 http_wrapper.py] Retrying request to url https://www.googleapis.com/storage/v1/b?project=development&projection=noAcl&key=AIzaSyDnac<key>bmJM&fields=nextPageToken%2Citems%2Fid&alt=json&maxResults=1000 after exception timed out
However, if I delete the Network Policy, I can connect:
INFO 0827 14:40:24.177456 base_api.py] Body: (none)
INFO 0827 14:40:24.177595 transport.py] Attempting refresh to obtain initial access_token
WARNING 0827 14:40:24.177864 multiprocess_file_storage.py] Credentials file could not be loaded, will ignore and overwrite.
DEBUG 0827 14:40:24.177957 multiprocess_file_storage.py] Read credential file
WARNING 0827 14:40:24.178036 multiprocess_file_storage.py] Credentials file could not be loaded, will ignore and overwrite.
DEBUG 0827 14:40:24.178090 multiprocess_file_storage.py] Read credential file
WARNING 0827 14:40:24.356631 multiprocess_file_storage.py] Credentials file could not be loaded, will ignore and overwrite.
DEBUG 0827 14:40:24.356972 multiprocess_file_storage.py] Read credential file
DEBUG 0827 14:40:24.357510 multiprocess_file_storage.py] Wrote credential file /var/lib/jenkins/.gsutil/credstore2.
connect: (www.googleapis.com, 443)
send: 'GET /storage/v1/b?project=development&fields=nextPageToken%2Citems%2Fid&alt=json&projection=noAcl&maxResults=1000 HTTP/1.1\r\nHost: www.googleapis.com\r\ncontent-length: 0\r\nauthorization: REDACTED
My Network Policy allows all ingress traffic by default:
ingress:
- {}
podSelector: {}
Any idea what I might be missing here? Is there some other IP address that I need to whitelist in this case?
EDIT
With the Network Policy in place, I ran a test using curl and got:
* Trying 2a00:1450:4001:80b::200a...
* TCP_NODELAY set
* Immediate connect fail for 2a00:1450:4001:80b::200a: Cannot assign requested address
* Trying 2a00:1450:4001:80b::200a...
* TCP_NODELAY set
* Immediate connect fail for 2a00:1450:4001:80b::200a: Cannot assign requested address
* Trying 2a00:1450:4001:80b::200a...
* TCP_NODELAY set
* Immediate connect fail for 2a00:1450:4001:80b::200a: Cannot assign requested address
* Trying 2a00:1450:4001:80b::200a...
* TCP_NODELAY set
* Immediate connect fail for 2a00:1450:4001:80b::200a: Cannot assign requested address
* Trying 2a00:1450:4001:80b::200a...
* TCP_NODELAY set
* Immediate connect fail for 2a00:1450:4001:80b::200a: Cannot assign requested address
* Trying 2a00:1450:4001:80b::200a...
* TCP_NODELAY set
* Immediate connect fail for 2a00:1450:4001:80b::200a: Cannot assign requested address
This does not happen when the Network Policy is deleted.
The comment from @mensi is correct: there are multiple IPs behind www.googleapis.com. You can see this, for example, by pinging the hostname multiple times; you'll most likely get a different IP each time.
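As a rough way to check this without relying on ping, the sketch below lists every address a single DNS lookup returns, using Python's standard socket.getaddrinfo. The helper names unique_addresses and resolve are made up for illustration, and the addresses you see depend on your resolver and location. Note that the result can include IPv6 addresses as well as IPv4 ones.

```python
import socket


def unique_addresses(addrinfo):
    """Return the sorted, de-duplicated IP strings from getaddrinfo() results."""
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({entry[4][0] for entry in addrinfo})


def resolve(hostname, port=443):
    """Resolve hostname once and return every address in the answer."""
    return unique_addresses(
        socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    )


# Example (needs network access and working DNS from inside the Pod):
#   for ip in resolve("www.googleapis.com"):
#       print(ip)
```

Running the lookup several times, or from different nodes, will likely show that a single /32 entry such as 216.58.207.36/32 does not cover every address the client may be handed.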
The easiest solution would be to allow all egress by default with:
spec:
  podSelector: {}
  egress:
  - {}
  policyTypes:
  - Egress
You could also try allowing all of the Google APIs' public IP ranges, but as Google doesn't seem to publish a list of those (only the ranges for restricted.googleapis.com and private.googleapis.com), that might be a bit tougher.
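If you do end up with a list of ranges to allow, writing one ipBlock per range by hand gets tedious. Here is a minimal sketch that builds such a policy from a list of CIDRs and prints it as JSON. The function name egress_policy and the policy name allow-googleapis-egress are hypothetical, and the single CIDR is just the address from the question, not a complete list; you would have to supply the real ranges yourself.

```python
import ipaddress
import json


def egress_policy(name, cidrs, port=443):
    """Build a NetworkPolicy dict allowing TCP egress to the given CIDRs."""
    # ip_network() raises ValueError on malformed input, so typos fail early.
    blocks = [{"ipBlock": {"cidr": str(ipaddress.ip_network(c))}} for c in cidrs]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name},  # hypothetical policy name
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {"ports": [{"port": port, "protocol": "TCP"}], "to": blocks}
            ],
        },
    }


# Placeholder input: only the one address from the question.
policy = egress_policy("allow-googleapis-egress", ["216.58.207.36/32"])
print(json.dumps(policy, indent=2))
```

kubectl accepts JSON manifests as well as YAML, so the printed output can be saved to a file and applied with kubectl apply -f.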