Unable to connect to Kubernetes Service after exposing it as a NodePort

11/13/2018

I am unable to connect to any service I expose from my GKE cluster despite being able to see the pod up and running. Any ideas what I am doing wrong?

I have a GKE cluster with private nodes (and a public master). It is configured to assign pod IPs from 10.0.x.x and service IPs from 10.2.x.x.

The Terraform configuration for the cluster is shown below:

resource "google_container_cluster" "playground" {
  provider    = "google-beta"
  name        = "playground"
  description = "Playground cluster"
  project     = "${module.playground_project.project_id}"
  zone        = "europe-west4-a"

  min_master_version = "1.11.2-gke.9"

  master_auth {
    username = "admin"
    password = "xxx"
  }
  lifecycle {
    ignore_changes = ["initial_node_count", "node_config", "node_pool", "network", "subnetwork"]
  }

  network = "${google_compute_network.playground.self_link}"

  subnetwork = "${google_compute_subnetwork.playground-gke.self_link}"

  private_cluster_config {
    enable_private_endpoint = false
    enable_private_nodes    = true
    master_ipv4_cidr_block  = "172.30.16.0/28"
  }



  master_authorized_networks_config {
    cidr_blocks = [
      {
        cidr_block   = "${var.my_ip}"
      },
    ]
  }
  node_pool {
    name = "default-pool" # Default empty node pool
  }
  ip_allocation_policy {
    # create_subnetwork = true  # subnetwork_name   = "gke-playground"
    cluster_secondary_range_name  = "subnet-play-gke-pods" # 10.0.0.0/15
    services_secondary_range_name = "subnet-play-gke-services" #10.2.0.0/15
  }
}

resource "google_container_node_pool" "np" {
  provider = "google-beta"
  name     = "node-pool-1"
  project  = "${module.playground_project.project_id}"
  zone     = "europe-west4-a"
  cluster  = "playground"

  depends_on = ["google_container_cluster.playground"]

  management {
    auto_upgrade = true
    auto_repair  = true
  }

  lifecycle {
    ignore_changes = ["node_count"]
  }

  # Enable this or autoscaling, not both
  # node_count = 1

  autoscaling {
    min_node_count = 1
    max_node_count = 3
  }
  initial_node_count = 1
  node_config {
    # preemptible  = true
    machine_type = "n1-standard-1"
    disk_size_gb = "20"
    disk_type    = "pd-standard"

    # metadata
    # labels
    # tags
    tags = ["gke"]

    labels = [
      {
        environment = "playground"
      },
    ]

    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}

I have set up a bastion within the VPC that can reach the nodes (verified by SSHing to them).

I am able to deploy an application like so:

local $ kubectl run hello --image=gcr.io/google-samples/hello-app:1.0 --port=8080
deployment.apps "hello" created

local $ kubectl get deployment hello
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
hello     1         1         1            1           1m

local $ kubectl expose deployment hello --target-port=8080 --type=NodePort
service "hello" exposed

local $ kubectl describe service hello
Name:                     hello
Namespace:                default
Labels:                   run=hello
Annotations:              <none>
Selector:                 run=hello
Type:                     NodePort
IP:                       10.2.109.113
Port:                     <unset>  8080/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  32420/TCP
Endpoints:                10.0.2.13:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

Now, from my bastion, I can query the pod directly on 10.0.2.13:8080, which works:

bastion $ curl 10.0.2.13:8080
Hello, world!
Version: 1.0.0
Hostname: hello-68669bb559-x7zpb

But if I try to connect to the service IP on 10.2.109.113, my connection times out:

bastion $ curl -vvvv --connect-timeout 10 10.2.109.113:32420
* Rebuilt URL to: 10.2.109.113:32420/
*   Trying 10.2.109.113...
* TCP_NODELAY set
* Connection timed out after 10001 milliseconds
* Curl_http_done: called premature == 1
* stopped the pause stream!
* Closing connection 0
curl: (28) Connection timed out after 10001 milliseconds

A dump of my firewall rules from gcloud is shown below:

local $ gcloud compute firewall-rules list
NAME                                     NETWORK   DIRECTION  PRIORITY  ALLOW                         DENY  DISABLED
default-allow-icmp                       default   INGRESS    65534     icmp                                False
default-allow-internal                   default   INGRESS    65534     tcp:0-65535,udp:0-65535,icmp        False
default-allow-rdp                        default   INGRESS    65534     tcp:3389                            False
default-allow-ssh                        default   INGRESS    65534     tcp:22                              False
egress-from-bastion-to-me-over-ssh       vpc-play  EGRESS     1000      tcp:22                              False
gke-playground-f9a5cbc4-all              vpc-play  INGRESS    1000      sctp,tcp,udp,icmp,esp,ah            False
gke-playground-f9a5cbc4-master           vpc-play  INGRESS    1000      tcp:10250,tcp:443                   False
gke-playground-f9a5cbc4-vms              vpc-play  INGRESS    1000      icmp,tcp:1-65535,udp:1-65535        False
ingress-from-bastion-to-gke-over-all     vpc-play  INGRESS    1000      all                                 False
ingress-from-me-to-bastion-over-ssh      vpc-play  INGRESS    1000      tcp:22                              False
k8s-fw-l7--fff685c495e2595e              vpc-play  INGRESS    1000      tcp:30000-32767                     False
nat-europe-west4-a                       vpc-play  INGRESS    1000      all                                 False
nat-gateway-europe-west4-a-vm-ssh        vpc-play  INGRESS    1000      tcp:22                              False
-- McGin
google-cloud-platform
google-kubernetes-engine
kubernetes
networking

1 Answer

11/16/2018

Service IPs (ClusterIPs) are not routable outside the cluster; they only exist in the rules that kube-proxy programs on the nodes themselves. Note also that you were curling the ClusterIP (10.2.109.113) with the NodePort port (32420), while the ClusterIP serves on port 8080. If you want to test the service from the bastion, try curl [node_IP]:32420 instead. This will hit the NodePort on the node, which will forward traffic to your service endpoint.
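As a sketch, the check from the bastion might look like this. The jsonpath expression and the 32420 NodePort come from the question's output; the node IP that comes back will depend on your cluster:

```shell
# Look up the internal IP of the first node in the cluster
NODE_IP=$(kubectl get nodes \
  -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')

# Hit the NodePort (32420 per the "kubectl describe service hello" output above)
# on the node's internal IP; kube-proxy forwards this to the pod endpoint.
curl "http://${NODE_IP}:32420"
```

NodePorts are allocated from the 30000-32767 range by default, which is why the k8s-fw-l7 firewall rule above opens tcp:30000-32767.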

-- Patrick W
Source: StackOverflow