OpenShift/OCP: not able to resolve a hostname within the same namespace. The Service appears to map to the Pod name, which is not resolvable

8/3/2020

I am new to the Kubernetes/OCP world. I am deploying my application to an OCP namespace and trying to connect to a GemFire server that runs in the same namespace. To access it, I created a ClusterIP Service: ocp-gemfire. This Service exposes ports 10334 and 40404, and the same ports are exposed by the underlying container. My expectation is that when I pass this Service name to my application, it connects as below:

ocp-gemfire/xx.xx.xx.xx(Service IP):10334
ocp-gemfire/xx.xx.xx.xx(Service IP):40404

But what actually happens is that it maps to the underlying Pod name rather than the IP. I am able to telnet to the IP or the Service name from the application on both ports; however, the Pod name is not resolvable.

I am not sure why it's mapping to the Pod name rather than the IP.

2020-08-03 16:55:20 INFO  - AutoConnectionSource discovered new locators [myapp-gemfire-1-9npl6:10334]
2020-08-03 16:55:20 WARN  - Could not connect to: myapp-gemfire-1-9npl6:40404
java.net.UnknownHostException: myapp-gemfire-1-9npl6

My service :

apiVersion: v1
kind: Service
metadata:
  labels:
    app: ${APPLICATION_NAME}-gemfire
  name: ${APPLICATION_NAME}-gemfire
  namespace: ${PROJECT_NAMESPACE}
spec:
  type: ClusterIP
  ports:
    - name: 10334-tcp
      port: 10334
      protocol: TCP
      targetPort: 10334
    - name: 40404-tcp
      port: 40404
      protocol: TCP
      targetPort: 40404
  selector:
    app: ${APPLICATION_NAME}-gemfire
    deploymentconfig: ${APPLICATION_NAME}-gemfire
-- parinda
kubernetes
open-closed-principle

1 Answer

8/4/2020

As you can read in the Kubernetes documentation on DNS for Services and Pods:

In general a pod has the following DNS resolution:

pod-ip-address.my-namespace.pod.cluster-domain.example.

For example, if a pod in the default namespace has the IP address 172.17.0.3, and the domain name for your cluster is cluster.local, then the Pod has a DNS name:

172-17-0-3.default.pod.cluster.local.

Any pods created by a Deployment or DaemonSet exposed by a Service have the following DNS resolution available:

pod-ip-address.deployment-name.my-namespace.svc.cluster-domain.example.

So your attempts to resolve Pod names via DNS are doomed to failure right from the start.
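The naming scheme quoted above can be sketched in a few lines. This is only an illustration of the string format: the helper name `pod_dns_name` is made up for this example, and `cluster.local` is assumed as the cluster domain.

```python
def pod_dns_name(pod_ip: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the IP-based DNS name of a Pod: dots in the IP become dashes."""
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{cluster_domain}"


# The example from the documentation quote: a Pod in "default" with IP 172.17.0.3.
print(pod_dns_name("172.17.0.3", "default"))  # 172-17-0-3.default.pod.cluster.local
```

Note that the Pod's name (for example myapp-gemfire-1-9npl6) does not appear anywhere in this record, which is why looking it up fails.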

You may want to take a look at my other answer, which touches on a very similar topic.

First of all, your application shouldn't refer to your Pods' names at all. For connectivity with your Pods it should use only the Service. The short form of the name is totally enough when the Service is deployed in the same namespace as the Pods you're connecting to it from; use the FQDN if it lives in a different namespace than your Pods.
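As a rough sketch of the two forms a Service name can take, assuming the usual `<service>.<namespace>.svc.<cluster-domain>` layout: the helper name, the namespace `my-namespace`, and the `cluster.local` domain are illustrative assumptions, while `ocp-gemfire` is the Service name from the question.

```python
def service_dns_names(service: str, namespace: str, cluster_domain: str = "cluster.local") -> dict:
    """Return the short name (same namespace) and the FQDN (any namespace) of a Service."""
    return {
        "short": service,
        "fqdn": f"{service}.{namespace}.svc.{cluster_domain}",
    }


names = service_dns_names("ocp-gemfire", "my-namespace")  # namespace is hypothetical
print(names["short"])  # ocp-gemfire
print(names["fqdn"])   # ocp-gemfire.my-namespace.svc.cluster.local
```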

Maybe I don't quite understand something here...

2020-08-03 16:55:20 INFO  - AutoConnectionSource discovered new locators [myapp-gemfire-1-9npl6:10334]
2020-08-03 16:55:20 WARN  - Could not connect to: myapp-gemfire-1-9npl6:40404
java.net.UnknownHostException: myapp-gemfire-1-9npl6

but the above error message looks like it's coming from your application. So what actually happens when you telnet from another Pod to the Service name, its FQDN, or its cluster IP? You should be redirected to one of your Pods. Do you get a similar message saying that your Pod name cannot be resolved?
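If telnet is not available in the Pod, the same check can be approximated with a short script. This is a minimal sketch, not a diagnostic tool: the function name `check_endpoint` and the three status strings are choices made for this example.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connection and report whether the name resolved and the port answered."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.gaierror:
        # Name resolution failed (the Python counterpart of UnknownHostException).
        return "unresolvable"
    except OSError:
        # Name resolved, but the connection was refused or timed out.
        return "unreachable"
```

Running `check_endpoint("ocp-gemfire", 10334)` and `check_endpoint("myapp-gemfire-1-9npl6", 40404)` from the application Pod would show whether the Service name and the Pod name behave differently.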

I'm pretty sure that your Service maps correctly to your Pods, as it uses their labels under the hood. So even if a Pod is destroyed and another one is created under a completely different IP and name, the Service takes care of updating the endpoints list so that it can always direct your traffic to the selected Pods (selected by the selector in the Service definition, which matches those Pods' labels).
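The label matching described above can be modelled roughly as follows. This is a toy model rather than the Kubernetes implementation: the function name, the second Pod, and all IP addresses are invented for illustration, and the selector assumes APPLICATION_NAME is `myapp`.

```python
def select_endpoints(selector: dict, pods: list) -> list:
    """Return the IPs of Pods whose labels contain every key/value pair in the selector."""
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Selector as in the question's Service, with APPLICATION_NAME assumed to be "myapp".
selector = {"app": "myapp-gemfire", "deploymentconfig": "myapp-gemfire"}
gemfire_labels = {"app": "myapp-gemfire", "deploymentconfig": "myapp-gemfire"}

pods = [
    {"name": "myapp-gemfire-1-9npl6", "ip": "10.128.0.12", "labels": gemfire_labels},
    {"name": "myapp-web-1-abcde", "ip": "10.128.0.20", "labels": {"app": "myapp-web"}},
]
print(select_endpoints(selector, pods))  # ['10.128.0.12']

# The GemFire Pod is replaced: new name, new IP, same labels, so it is still selected.
pods[0] = {"name": "myapp-gemfire-1-x7k2p", "ip": "10.128.0.31", "labels": gemfire_labels}
print(select_endpoints(selector, pods))  # ['10.128.0.31']
```

The Pod name plays no part in the selection; only labels and IPs matter to the Service.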

As for your Service definition, it looks correct, and it shouldn't map to your Pod names under any circumstances, as that would contradict the very design of Kubernetes itself, right? If Pod names are not resolvable by definition, it would be strange for a Service, being part of the solution, to use Pod names that are not resolvable.

The error message, which I believe comes from your app, shows something different: it looks as if your Pods are advertising themselves to the app by their names, and then the app tries to connect to them by resolving those names, which of course cannot be done.

A simple Google search for the phrase AutoConnectionSource discovered new locators is enough to figure out that it's a process typical of GemFire, so nothing directly related to Kubernetes/OpenShift. So as we can see here:

AutoConnectionSource discovered new locators [myapp-gemfire-1-9npl6:10334]

for some reason this discovery uses bare Pod names and tries to connect to them using a <pod-name>:port connection string, which cannot work because those names are not resolvable.
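The two-step failure can be illustrated with a toy model. It is not GemFire's actual discovery protocol: the lookup table stands in for cluster DNS, the cluster IP is invented, and all function names are hypothetical.

```python
# Names the client Pod can resolve in this toy model: only the Service name.
RESOLVABLE = {"ocp-gemfire": "172.30.0.10"}  # cluster IP is a made-up value


def resolve(host: str) -> str:
    """Stand-in for a DNS lookup; raises LookupError for unknown names."""
    if host not in RESOLVABLE:
        raise LookupError(f"unknown host: {host}")
    return RESOLVABLE[host]


def discover_and_connect(locator_host: str, advertised_host: str) -> list:
    """Step 1: reach the locator by Service name. Step 2: resolve the host it advertised."""
    steps = []
    for label, host in (("locator", locator_host), ("advertised member", advertised_host)):
        try:
            steps.append(f"{label} {host} -> {resolve(host)}")
        except LookupError as err:
            steps.append(f"{label} {host} -> FAILED ({err})")
    return steps


for line in discover_and_connect("ocp-gemfire", "myapp-gemfire-1-9npl6"):
    print(line)
```

The first step succeeds because the client was given the Service name; the second fails because the name handed back during discovery is a Pod name, matching the pattern in the log above.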

-- mario
Source: StackOverflow