Kubernetes 1.0.1 External Load Balancer on GCE with CoreOS

8/5/2015

Using a previous version of Kubernetes (0.16.x), I was able to create a cluster of CoreOS-based VMs on GCE that could generate external network load balancers for services. With the release of Kubernetes v1, the configuration required for this functionality seems to have changed. Could anyone offer any advice, or point me towards documentation that might help me with this issue?

I suspect the problem has to do with IP addressing and naming: I was previously using kube-register to handle this, and that component no longer seems to be necessary. My current configuration creates internal service load balancers without issue, and will even create external service load balancers, but those are only visible through the gcloud UI and are never registered or displayed in kubectl output. The external IPs that do get generated also fail to actually proxy any traffic through.
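
In case it helps, this is roughly how I have been checking the services from kubectl (illustrative commands; autobot-cache-graph is the service that shows up in the log below):

# List services; no external IP ever shows up in this output for me
kubectl get services

# Inspect the load balancer status of the affected service
kubectl describe service autobot-cache-graph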

The kube-controller-manager service log looks like this:

Aug 05 12:15:42 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: I0805 12:15:42.516360    1604 gce.go:515] Firewall doesn't exist, moving on to deleting target pool.
Aug 05 12:15:42 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: E0805 12:15:42.516492    1604 servicecontroller.go:171] Failed to process service delta. Retrying: googleapi: Error 404: The resource 'projects/staging-infrastructure/global/firewalls/k8s-fw-a4db9328c3b6b11e5ab9f42010af0397' was not found, notFound
Aug 05 12:15:42 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: I0805 12:15:42.516539    1604 servicecontroller.go:601] Successfully updated 2 out of 2 external load balancers to direct traffic to the updated set of nodes
Aug 05 12:16:07 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: E0805 12:16:07.620094    1604 servicecontroller.go:171] Failed to process service delta. Retrying: failed to create external load balancer for service default/autobot-cache-graph: googleapi: Error 400: Invalid value for field 'resource.targetTags[0]': 'europe-west1-b-k8s-node-0.c.staging-infrastructure.int'.  Must be a match of regex '(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)', invalid
Aug 05 12:16:12 europe-west1-b-k8s-master.c.staging-infrastructure.internal hyperkube[1604]: I0805 12:16:12.804512    1604 servicecontroller.go:275] Deleting old LB for previously uncached service default/autobot-cache-graph whose endpoint &{[{146.148.114.97 }]} doesn't match the service's desired IPs []
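
Reading that 400 error, it looks like GCE is rejecting the node name when it is used as a firewall target tag: tags must match the regex in the message (lowercase letters, digits, and hyphens only), whereas 'europe-west1-b-k8s-node-0.c.staging-infrastructure.int' contains dots and appears to be a truncated FQDN. The instance names and tags can be checked directly with gcloud (sketch commands using names from my project):

# Show the instance's details, including any network tags currently applied to it
gcloud compute instances describe europe-west1-b-k8s-node-0 --zone europe-west1-b

# List the firewall rules Kubernetes has managed to create so far
gcloud compute firewall-rules list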

Here is the config I am using (download, chmod, etc. omitted for clarity).

On the master:

- name: kube-apiserver.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes API Server
    Requires=setup-network-environment.service etcd.service generate-serviceaccount-key.service
    After=setup-network-environment.service etcd.service generate-serviceaccount-key.service
    [Service]
    EnvironmentFile=/etc/network-environment
    ExecStart=/opt/bin/hyperkube apiserver \
    --cloud-provider=gce \
    --service_account_key_file=/opt/bin/kube-serviceaccount.key \
    --service_account_lookup=false \
    --admission_control=NamespaceLifecycle,NamespaceAutoProvision,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota \
    --runtime_config=api/v1 \
    --allow_privileged=true \
    --insecure_bind_address=0.0.0.0 \
    --insecure_port=8080 \
    --kubelet_https=true \
    --secure_port=6443 \
    --service-cluster-ip-range=10.100.0.0/16 \
    --etcd_servers=http://127.0.0.1:2379 \
    --bind-address=${DEFAULT_IPV4} \
    --logtostderr=true
    Restart=always
    RestartSec=10

- name: kube-controller-manager.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Controller Manager
    Requires=kube-apiserver.service
    After=kube-apiserver.service
    [Service]
    ExecStart=/opt/bin/hyperkube controller-manager \
    --cloud-provider=gce \
    --service_account_private_key_file=/opt/bin/kube-serviceaccount.key \
    --master=127.0.0.1:8080 \
    --logtostderr=true
    Restart=always
    RestartSec=10

- name: kube-scheduler.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Scheduler
    Requires=kube-apiserver.service
    After=kube-apiserver.service
    [Service]
    ExecStart=/opt/bin/hyperkube scheduler --master=127.0.0.1:8080
    Restart=always
    RestartSec=10

And on the node:

- name: kubelet.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Kubelet
    Requires=setup-network-environment.service
    After=setup-network-environment.service
    [Service]
    EnvironmentFile=/etc/network-environment
    WorkingDirectory=/root
    ExecStart=/opt/bin/hyperkube kubelet \
    --cloud-provider=gce \
    --address=0.0.0.0 \
    --port=10250 \
    --api_servers=<master_ip>:8080 \
    --allow_privileged=true \
    --logtostderr=true \
    --cadvisor_port=4194 \
    --healthz_bind_address=0.0.0.0 \
    --healthz_port=10248
    Restart=always
    RestartSec=10

- name: kube-proxy.service
  command: start
  content: |
    [Unit]
    Description=Kubernetes Proxy
    Requires=setup-network-environment.service
    After=setup-network-environment.service

    [Service]
    ExecStart=/opt/bin/hyperkube proxy \
    --master=<master_ip>:8080 \
    --logtostderr=true
    Restart=always
    RestartSec=10

To me it looks like a mismatch between naming and IPs, but I'm not sure how to adjust my config to resolve it. Any guidance would be greatly appreciated.
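
One thing I have been wondering about (purely a guess on my part) is whether the kubelet's hostname should be overridden with the short GCE instance name, since the cloud provider presumably matches nodes by name:

# The GCE metadata server returns the instance's short name; the result could be
# passed to the kubelet's --hostname_override flag (untested guess, not a confirmed fix)
curl -s -H 'Metadata-Flavor: Google' \
  http://metadata.google.internal/computeMetadata/v1/instance/name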

-- i0n
coreos
google-cloud-platform
kubernetes

1 Answer

8/24/2015

How did you create the nodes in your cluster? We've seen another instance of this issue caused by bugs in the cluster bootstrapping script that was used, which didn't apply the expected node names and tags.

If you recreate your cluster using the following two commands, as recommended in the issue linked above, creating load balancers should work for you:

export OS_DISTRIBUTION=coreos
cluster/kube-up.sh

Otherwise, you may need to wait for the issues to be fixed.
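
Once the cluster comes back up, you can sanity-check the end-to-end flow (sketch commands; the external IP and port depend on your service definition):

# The external IP should appear in kubectl output once the load balancer is provisioned
kubectl get services

# Traffic should then flow through that IP to your pods
curl http://<external-ip>:<port>/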

-- Alex Robinson
Source: StackOverflow