OpenShift Origin pods cannot access other pods via service name

7/21/2017

I have tried two different applications, both consisting of a web application frontend that needs to connect to a relational database.

In both cases the frontend application is unable to connect to the database. In both instances the database is also running as a container (pod) in OpenShift, and the web application uses the service name as the URL. Both applications have worked in other OpenShift environments.

Version

  • OpenShift Master: v1.5.1+7b451fc
  • Kubernetes Master: v1.5.2+43a9be4
  • Installed using openshift-ansible
  • Single node, with master on this node
  • Host OS: CentOS 7 Minimal

I am not sure where to look in OpenShift to debug this issue. The only way I was able to reach the db pod from the web pod was by using the cluster IP address.
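For anyone debugging the same thing: internal service names resolve as `<service>.<project>.svc.cluster.local`, so one quick check is to look that name up from inside the frontend pod (the service and project names below are placeholders):

```shell
# Build the internal DNS name for a service ("db" and "myproject" are
# placeholder names for this sketch).
service_fqdn() {
  echo "$1.$2.svc.cluster.local"
}

service_fqdn db myproject

# From inside the web pod (needs a live cluster and the oc client):
# oc rsh <web-pod> nslookup "$(service_fqdn db myproject)"
```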

-- Magick
kubernetes
openshift

4 Answers

7/25/2017

I have been facing issues connecting to databases via SkyDNS as well, e.g. with phpMyAdmin. As a workaround I entered the ClusterIP instead of the SkyDNS name, and it worked. Have you tried using the service ClusterIP instead?
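As a rough sketch of that workaround (the service name, project, port, and database name are all placeholders): look up the ClusterIP of the service and splice it into the connection URL in place of the SkyDNS name:

```shell
# Look up the ClusterIP of the database service (needs a live cluster):
# oc get svc mysql -n myproject -o jsonpath='{.spec.clusterIP}'

# Then rebuild the frontend's connection URL from the IP instead of
# the service name:
build_db_url() {
  local ip="$1" port="$2" db="$3"
  echo "mysql://${ip}:${port}/${db}"
}

build_db_url 172.30.21.45 3306 appdb
```

Note this only bypasses DNS resolution; the service proxying itself still has to work.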

-- egyparadox
Source: StackOverflow

7/24/2017

In OpenShift, SkyDNS is part of the master, so you can restart the master to restart the internal DNS. But first I suggest you try this:

1. Check whether DNS can resolve your service name using dig.
2. If it fails, it is a DNS problem. Otherwise it is likely an iptables problem, in which case you can restart kube-proxy (part of the node service) to re-sync the proxy rules.
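The two steps above might look like this on the node (the service name, project, and DNS IP are placeholders, and the cluster-dependent commands are shown commented out since they need a live node):

```shell
# 1. Ask the cluster DNS for the service record directly:
# dig +short db.myproject.svc.cluster.local @172.30.0.1

# 2. If dig answers with the ClusterIP but connections still fail, suspect
#    the proxy rules and restart the node service (which embeds kube-proxy):
# systemctl restart origin-node

# Small helper to interpret the dig answer:
check_dns() {
  if [ -z "$1" ]; then
    echo "dns-problem"    # empty answer: SkyDNS did not resolve the name
  else
    echo "dns-ok"         # got an address: look at iptables/kube-proxy next
  fi
}

check_dns ""
check_dns 172.30.21.45
```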

-- Haoran Wang
Source: StackOverflow

7/27/2017

In order for internal DNS resolution to work, you need to ensure that dnsmasq.service is running and that /etc/resolv.conf contains the IP address of the OCP node itself rather than other DNS servers (those should be listed in /etc/dnsmasq.d/origin-upstream-dns.conf).

Example:

# ip a s eth0
    ...
        inet 10.0.0.1/24


# cat /etc/resolv.conf 
    ...
    nameserver 10.0.0.1
    # nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh

^^ note the dispatcher script referenced in /etc/resolv.conf

# systemctl status dnsmasq.service
    ● dnsmasq.service - DNS caching server.
       Loaded: loaded (/usr/lib/systemd/system/dnsmasq.service; enabled; vendor preset: disabled)
       Active: active (running)


# cat /etc/dnsmasq.d/origin-dns.conf 
    no-resolv
    domain-needed
    server=/cluster.local/172.18.0.1

^^ this IP should be the kubernetes service ClusterIP (check with oc get svc -n default)
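To cross-check that the dnsmasq forwarder matches the kubernetes service IP, something like the following could work (the sed pattern assumes the `server=/cluster.local/<ip>` line format shown above; the oc command needs a live cluster):

```shell
# Extract the cluster.local forwarder IP from a dnsmasq config file,
# e.g. from the line "server=/cluster.local/172.18.0.1".
dnsmasq_forwarder() {
  sed -n 's|^server=/cluster.local/||p' "$1"
}

# Write a sample config to demonstrate (on a real node, read
# /etc/dnsmasq.d/origin-dns.conf instead):
printf 'no-resolv\ndomain-needed\nserver=/cluster.local/172.18.0.1\n' > /tmp/origin-dns.conf
dnsmasq_forwarder /tmp/origin-dns.conf

# Compare the printed IP against the kubernetes service ClusterIP:
# oc get svc kubernetes -n default -o jsonpath='{.spec.clusterIP}'
```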

# cat /etc/dnsmasq.d/origin-upstream-dns.conf 
    server=<dns ip 1>
    server=<dns ip 2>

If OpenShift is running on an OpenStack instance, AWS, or similar, it may happen that cloud-init does not trigger the NetworkManager dispatcher script, so resolv.conf is never modified to point to dnsmasq. Try restarting the whole network, e.g.:

# systemctl restart network.service

I hope this helps.

-- agolis
Source: StackOverflow

7/25/2017

If the route is not reachable, it is a DNS issue.
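A quick way to test that (the route hostname below is a placeholder) is to check whether the name resolves at all before debugging anything inside the cluster:

```shell
# Check whether a hostname resolves ("myapp-myproject.apps.example.com"
# stands in for the real route hostname):
resolves() {
  getent hosts "$1" > /dev/null 2>&1 && echo "resolvable" || echo "not-resolvable"
}

resolves no-such-host.invalid
# resolves myapp-myproject.apps.example.com
```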

-- Humble
Source: StackOverflow