Can't delete underlying VM for a node in Kubernetes

3/29/2017

I'm running a three node cluster on GCE. I want to drain one node and delete the underlying VM.

The documentation for the kubectl drain command says:

Once it returns (without giving an error), you can power down the node (or equivalently, if on a cloud platform, delete the virtual machine backing the node)

I execute the following commands:

  1. Get the nodes

    $ kl get nodes
    NAME                                      STATUS    AGE
    gke-jcluster-default-pool-9cc4e660-6q21   Ready     43m
    gke-jcluster-default-pool-9cc4e660-rx9p   Ready     6m
    gke-jcluster-default-pool-9cc4e660-xr4z   Ready     23h
  2. Drain node rx9p.

    $ kl drain gke-jcluster-default-pool-9cc4e660-rx9p --force
    node "gke-jcluster-default-pool-9cc4e660-rx9p" cordoned
    WARNING: Deleting pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: fluentd-cloud-logging-gke-jcluster-default-pool-9cc4e660-rx9p, kube-proxy-gke-jcluster-default-pool-9cc4e660-rx9p
    node "gke-jcluster-default-pool-9cc4e660-rx9p" drained
  3. Delete gcloud VM.

     $ gcloud compute instances delete gke-jcluster-default-pool-9cc4e660-rx9p
  4. List VMs.

     $ gcloud compute instances list

In the result, I still see the VM I deleted above (rx9p). If I run kubectl get nodes, the rx9p node is listed too.

What's going on? Is something restarting the VM I'm deleting? Do I have to wait for some timeout between the commands?

-- Petko M
google-compute-engine
google-kubernetes-engine
kubernetes

1 Answer

3/29/2017

You are on the right track with draining the node first.

The nodes (compute instances) are part of a managed instance group. If you delete them directly with the gcloud compute instances delete command, the managed instance group will recreate them.

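To delete one properly, use this command (after you have drained it!). It removes the instance through the group and reduces the group's target size, so the instance is not recreated: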

    $ gcloud compute instance-groups managed delete-instances \
        gke-jcluster-default-pool-9cc4e660-grp \
        --instances=gke-jcluster-default-pool-9cc4e660-rx9p \
        --zone=...
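
As a rough sketch, the drain-then-delete sequence can be wrapped in a small script that prints the commands for review instead of running them. It assumes the group name is the node name with its last suffix replaced by -grp, which matches the names in this question but may not hold elsewhere; check it with gcloud compute instance-groups managed list. The function names and the ZONE placeholder are hypothetical.

```shell
#!/bin/sh

# Assumption: the managed instance group is named after the node pool,
# i.e. the node name with its final "-xxxx" suffix replaced by "-grp".
# This matches the names in the question, but verify it with:
#   gcloud compute instance-groups managed list
group_for_node() {
  printf '%s-grp\n' "${1%-*}"
}

# Print (do not run) the two commands needed to remove a node:
# first drain it, then delete it through its managed instance group.
# $1 = node name, $2 = zone (hypothetical placeholder, fill in your own).
remove_node_commands() {
  node=$1
  zone=$2
  echo "kubectl drain $node --force"
  echo "gcloud compute instance-groups managed delete-instances $(group_for_node "$node") --instances=$node --zone=$zone"
}

remove_node_commands gke-jcluster-default-pool-9cc4e660-rx9p ZONE
```

Printing the commands first makes it easy to confirm the derived group name before anything is deleted.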
-- Janos Lenart
Source: StackOverflow