In a Kubernetes cluster, is there a way to migrate etcd from external to internal?

1/24/2019

I made a Kubernetes cluster one year ago with an external etcd cluster (3 members).

At the time, I did not know that it was possible to run etcd internally, so I made an external cluster and connected Kubernetes to it.

Now I see that an internal cluster is possible, and it is a cleaner solution because the etcd nodes are updated when you update your Kubernetes cluster.

I can't find a clean solution to migrate an external etcd cluster to an internal cluster. I hope there is a solution with zero downtime. Do you know if it is possible, please?

Thank you for your response and have a nice day!

-- Nurza
etcd
external
internals
kubernetes
migration

2 Answers

10/23/2019

As I understand it, you have 3 etcd cluster members that are external from the Kubernetes cluster's perspective, and the expected outcome is to have all three members running on the Kubernetes master nodes. Since some information is left undisclosed, I will try to explain several possible options.

First of all, there are several reasonable ways to run the etcd process to use it as the Kubernetes control-plane key-value storage:

  • etcd run as a static pod, with its startup configuration in the /etc/kubernetes/manifests/etcd.yaml file
  • etcd run as a systemd service defined in /etc/systemd/system/etcd.service or a similar file (see the unit-file sketch below)
  • etcd run as a Docker container configured using command-line options (this solution is not really safe unless you make the container restart after a failure or host reboot)
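
For illustration, a minimal unit file for the systemd option might look like the sketch below (the binary path, member name, IP address and certificate paths are assumptions, modelled on the static-pod manifest shown later in this answer):

[Unit]
Description=etcd key-value store
After=network-online.target

[Service]
ExecStart=/usr/local/bin/etcd \
  --name=etcd-member-name \
  --data-dir=/var/lib/etcd \
  --listen-peer-urls=https://10.128.0.10:2380 \
  --listen-client-urls=https://10.128.0.10:2379,https://127.0.0.1:2379 \
  --advertise-client-urls=https://10.128.0.10:2379 \
  --initial-advertise-peer-urls=https://10.128.0.10:2380 \
  --initial-cluster=etcd-member-name=https://10.128.0.10:2380 \
  --cert-file=/etc/kubernetes/pki/etcd/server.crt \
  --key-file=/etc/kubernetes/pki/etcd/server.key \
  --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt \
  --client-cert-auth=true \
  --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt \
  --peer-key-file=/etc/kubernetes/pki/etcd/peer.key \
  --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt \
  --peer-client-cert-auth=true
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target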

For experimental purposes, you can also run etcd:

  • as a simple process in Linux userspace
  • as a StatefulSet in the Kubernetes cluster
  • as an etcd cluster managed by etcd-operator

My personal recommendation is a 5-member etcd cluster: 3 members run as static pods on the 3 Kubernetes master nodes, and two more run as static pods on external (Kubernetes-independent) hosts. Since the quorum of a 5-member cluster is 3, you will still have quorum if at least one master node is running (together with the two external members), or if you lose both external nodes for any reason.

There are at least two ways to migrate an etcd cluster from external instances to the Kubernetes cluster master nodes. Both work in the opposite direction too.

Migration

This is a fairly straightforward way to migrate the cluster. During this procedure, members are turned off (one at a time), moved to another host and started again. Your cluster shouldn't have any problems as long as you keep quorum in the etcd cluster. My recommendation is to have at least a 3-node, or better a 5-node, etcd cluster to make the migration safer. For bigger clusters it may be more convenient to use the other solution from my second answer.

The process of moving etcd member to another IP address is described in the official documentation:

To migrate a member:

  1. Stop the member process.
  2. Copy the data directory of the now-idle member to the new machine.
  3. Update the peer URLs for the replaced member to reflect the new machine according to the runtime reconfiguration instructions.
  4. Start etcd on the new machine, using the same configuration and the copy of the data directory.

Now let's look closer on each step:

0.1 Ensure your etcd cluster is healthy and all members are in a good condition. I would also recommend checking the logs of all etcd members, just in case.

(To successfully run the following commands, please refer to step 3 for the auth variables and aliases.)

# the last two commands only show the members specified via the --endpoints command-line option
# the following commands are supposed to run with root privileges because the certificates are not accessible by a regular user

e2 cluster-health
e3 endpoint health
e3 endpoint status

0.2 Check each etcd member's configuration and find out where the etcd data-dir is located, then ensure that it will remain accessible after the etcd process terminates. In most cases it is located under /var/lib/etcd on the host machine and is used directly or mounted as a volume into the etcd pod or Docker container.
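
For example, for a static pod you can check the manifest, or inspect the running process's command line (a sketch; the manifest path assumes a kubeadm-style layout):

grep -- --data-dir /etc/kubernetes/manifests/etcd.yaml
# or
ps -ef | grep '[e]tcd' | tr ' ' '\n' | grep -- --data-dir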

0.3 Create a snapshot of each etcd cluster member; it's better to have it and not need it than to need it and not have it.
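
For example, a snapshot can be taken with the etcdctl v3 API against the local member's client endpoint (a sketch; the snapshot path is arbitrary, the certificate paths assume the kubeadm layout used in step 3, and the endpoint should be adjusted if your member does not listen on localhost):

sudo ETCDCTL_API=3 etcdctl --endpoints https://127.0.0.1:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/peer.crt \
  --key /etc/kubernetes/pki/etcd/peer.key \
  snapshot save /root/etcd-member-name-snapshot.db
# verify the saved snapshot file (works offline)
sudo ETCDCTL_API=3 etcdctl snapshot status /root/etcd-member-name-snapshot.db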

1. Stop etcd member process.

If you use kubelet to start etcd, as recommended here, move the etcd.yaml file out of /etc/kubernetes/manifests/. Right after that, the etcd Pod will be terminated by kubelet:

sudo mv /etc/kubernetes/manifests/etcd.yaml ~/
sudo chmod 644 ~/etcd.yaml 

If you start the etcd process as a systemd service, you can stop it using the following command:

sudo systemctl stop etcd-service-name.service

If etcd runs as a Docker container, you can stop it using the following commands:

docker ps -a 
docker stop <etcd_container_id>
docker rm <etcd_container_id>

If you run the etcd process from the command line, you can kill it using the following command:

kill `pgrep etcd`

2. Copy the data directory of the now-idle member to the new machine.

Not much complexity here. Archive the etcd data-dir into a file and copy it to the destination instance. I also recommend copying the etcd manifest or systemd service configuration if you plan to run etcd on the new instance in the same way.

tar -C /var/lib -czf etcd-member-name-data.tar.gz etcd
tar -czf etcd-member-name-conf.tar.gz [etcd.yaml] [/etc/systemd/system/etcd.service]  [/etc/kubernetes/manifests/etcd.conf ...]
scp etcd-member-name-data.tar.gz destination_host:~/
scp etcd-member-name-conf.tar.gz destination_host:~/

3. Update the peer URLs for the replaced member to reflect the new member IP address according to the runtime reconfiguration instructions.

There are two ways to do it: by using the etcd API or by running the etcdctl utility.

Here is how the etcdctl way may look:
(replace the etcd endpoint variables with the correct etcd cluster member IP addresses)

# all etcd cluster members should be specified
export ETCDSRV="--endpoints https://etcd.ip.addr.one:2379,https://etcd.ip.addr.two:2379,https://etcd.ip.addr.three:2379"
#authentication parameters for v2 and v3 etcdctl APIs
export ETCDAUTH2="--ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file  /etc/kubernetes/pki/etcd/peer.key"
export ETCDAUTH3="--cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/peer.crt --key=/etc/kubernetes/pki/etcd/peer.key"

# etcdctl API v3 alias
alias e3="ETCDCTL_API=3 etcdctl $ETCDAUTH3 $ETCDSRV"
# etcdctl API v2 alias
alias e2="ETCDCTL_API=2 etcdctl $ETCDAUTH2 $ETCDSRV"

# list all etcd cluster members and their IDs
e2 member list

e2 member update member_id http://new.etcd.member.ip:2380
#or
e3 member update member_id --peer-urls="https://new.etcd.member.ip:2380"

Here is how the etcd API way may look:

export CURL_ETCD_AUTH="--cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt"

curl https://healthy.etcd.instance.ip:2379/v2/members/member_id -XPUT -H "Content-Type: application/json" -d '{"peerURLs":["http://new.etcd.member.ip:2380"]}' ${CURL_ETCD_AUTH}
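
To verify the change, you can list the members again through the same v2 API (a sketch, reusing the auth variable defined above):

curl https://healthy.etcd.instance.ip:2379/v2/members ${CURL_ETCD_AUTH}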

4. Start etcd on the new machine, using the adjusted configuration and the copy of the data directory.

Unpack etcd data-dir on the new host:

tar -xzf etcd-member-name-data.tar.gz -C /var/lib/

Adjust the etcd startup configuration according to your needs. At this point it's easy to select another way to run etcd. Depending on your choice, prepare the manifest or service definition file and replace the old IP address with the new one there. E.g.:

sed -i  's/\/10.128.0.12:/\/10.128.0.99:/g' etcd.yaml

Now it's time to start etcd again, either by moving etcd.yaml back to /etc/kubernetes/manifests/ or, if you run etcd as a systemd service, by starting the service. Both are shown below.
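
For the static-pod case (the reverse of the move in step 1, using the home-directory path from there):

sudo mv ~/etcd.yaml /etc/kubernetes/manifests/etcd.yaml

For the systemd-service case: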

sudo systemctl start etcd-service-name.service

5. Check the updated etcd process logs and the etcd cluster health to ensure that the member is healthy.

To do that you can use the following commands:

$ e2 cluster-health

$ kubectl logs etcd_pod_name -n kube-system

$ docker logs etcd_container_id 2>&1 | less

$ journalctl -e -u etcd_service_name     
-- VAS
Source: StackOverflow

10/23/2019

The second solution I mentioned in my other answer is:

Growing and then shrinking etcd cluster

The downside of this method is that the etcd quorum size is temporarily increased, and if several nodes fail, the etcd cluster may break (for example, a 6-member cluster needs 4 members for quorum but still tolerates only 2 failures, the same as a 5-member cluster). To avoid this, you may want to remove one existing etcd cluster member before adding each new one.

Here is the brief overview of the process:

  1. generate certificates for all additional members using the etcd ca.crt and ca.key from an existing etcd node folder (/etc/kubernetes/pki/etcd/)
  2. add the new member to the cluster using the etcdctl command
  3. create the etcd config for the new member
  4. start the new etcd member using the new keys and config
  5. check cluster health
  6. repeat steps 2-5 until all required etcd nodes are added
  7. remove one excess etcd cluster member using the etcdctl command
  8. check cluster health
  9. repeat steps 7-8 until the desired size of the etcd cluster is achieved
  10. adjust all etcd.yaml files for all etcd cluster members
  11. adjust the etcd endpoints in all kube-apiserver.yaml manifests

Another possible sequence:

  1. generate certificates for all additional members using the etcd ca.crt and ca.key from an existing etcd node folder (/etc/kubernetes/pki/etcd/)
  2. remove one etcd cluster member using the etcdctl command
  3. add the new member to the cluster using the etcdctl command
  4. create the etcd config for the new member
  5. start the new etcd member using the new keys and config
  6. check cluster health
  7. repeat steps 2-6 until the required etcd configuration is achieved
  8. adjust all etcd.yaml files for all etcd cluster members
  9. adjust the etcd endpoints in all kube-apiserver.yaml manifests

How to generate certificates:

Note: If you have an etcd cluster, you likely have the etcd CA certificate somewhere. Consider using it along with the etcd CA key to generate certificates for all additional etcd members.

Note: In case you choose to generate the certificates manually, the usual Kubernetes certificate parameters are:

  • Signature Algorithm: sha256WithRSAEncryption
  • Public Key Algorithm: rsaEncryption
  • RSA Public-Key: (2048 bit)
  • CA certs age: 10 years
  • other certs age: 1 year
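
One possible way to generate them, assuming the etcd CA certificate and key have already been copied to /etc/kubernetes/pki/etcd/ on the new node, is to use the kubeadm certificate phases instead of creating the certificates by hand (a sketch; these phases are available in kubeadm v1.13+):

kubeadm init phase certs etcd-server
kubeadm init phase certs etcd-peer
kubeadm init phase certs etcd-healthcheck-client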

You can check the content of the certificates using the following command:

find /etc/kubernetes/pki/ -name '*.crt' | xargs -l bash -c 'echo $0; openssl x509 -in $0 -text -noout'

How to remove a member from the etcd cluster

(Please refer to my other answer, step 3, for the variable and alias definitions.)

e3 member list

b67816d38b8e9d2, started, kube-ha-m3, https://10.128.0.12:2380, https://10.128.0.12:2379
3de72bd56f654b1c, started, kube-ha-m1, https://10.128.0.10:2380, https://10.128.0.10:2379
ac98ece88e3519b5, started, kube-etcd2, https://10.128.0.14:2380, https://10.128.0.14:2379
cfb0839e8cad4c8f, started, kube-ha-m2, https://10.128.0.11:2380, https://10.128.0.11:2379
eb9b83c725146b96, started, kube-etcd1, https://10.128.0.13:2380, https://10.128.0.13:2379
401a166c949e9584, started, kube-etcd3, https://10.128.0.15:2380, https://10.128.0.15:2379  # Let's remove this one

e2 member remove 401a166c949e9584

The member will shut down instantly. To prevent further attempts to rejoin the cluster, move or delete etcd.yaml from /etc/kubernetes/manifests/ or shut down the etcd service on that member's node.
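
For example (a sketch; the paths and service name are the same placeholders used earlier in this answer):

sudo mv /etc/kubernetes/manifests/etcd.yaml /root/
# or, for a systemd-managed member
sudo systemctl disable --now etcd-service-name.service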


How to add a member to the etcd cluster

e3 member add kube-etcd3 --peer-urls="https://10.128.0.16:2380"

The output shows the parameters required to start the new etcd cluster member, e.g.:

ETCD_NAME="kube-etcd3"
ETCD_INITIAL_CLUSTER="kube-ha-m3=https://10.128.0.15:2380,kube-ha-m1=https://10.128.0.10:2380,kube-etcd2=https://10.128.0.14:2380,kube-ha-m2=https://10.128.0.11:2380,kube-etcd1=https://10.128.0.13:2380,kube-etcd3=https://10.128.0.16:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.128.0.16:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

Note: the ETCD_INITIAL_CLUSTER variable contains all existing etcd cluster members plus the new node. If you need to add several nodes, it should be done one node at a time.

Note: all ETCD_INITIAL_* variables and the corresponding command-line parameters are only required for the first start of the etcd Pod. After the node has successfully been added to the etcd cluster, these parameters are ignored and can be removed from the startup configuration. All required information is stored in the etcd database file in the /var/lib/etcd folder.

The default etcd.yaml manifest can be generated using the following kubeadm command:

kubeadm init phase etcd local

It's better to move the etcd.yaml file from /etc/kubernetes/manifests/ somewhere else to make the adjustments.

Also delete the content of the /var/lib/etcd folder: it contains the data of a new etcd cluster, so it can't be used to add a member to the existing cluster.
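
A sketch of these two preparation steps (the destination path for the manifest is arbitrary):

sudo mv /etc/kubernetes/manifests/etcd.yaml /root/etcd.yaml
# remove the freshly initialized data so the member can join the existing cluster
sudo rm -rf /var/lib/etcd
sudo mkdir -p /var/lib/etcd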

Then it should be adjusted according to the member add command output (--advertise-client-urls, --initial-advertise-peer-urls, --initial-cluster, --initial-cluster-state, --listen-client-urls, --listen-peer-urls). E.g.:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.128.0.16:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://10.128.0.16:2380
    - --initial-cluster=kube-ha-m3=https://10.128.0.15:2380,kube-ha-m1=https://10.128.0.10:2380,kube-etcd2=https://10.128.0.14:2380,kube-ha-m2=https://10.128.0.11:2380,kube-etcd1=https://10.128.0.13:2380,kube-etcd3=https://10.128.0.16:2380
    - --initial-cluster-state=existing
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://10.128.0.16:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://10.128.0.16:2380
    - --name=kube-etcd3
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.3.10
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data

After you move the adjusted etcd.yaml back to /etc/kubernetes/manifests/, kubelet will start the etcd Pod. Check the etcd container logs to ensure it has joined the cluster.


How to check cluster health

$ e2 cluster-health
member b67816d38b8e9d2 is healthy: got healthy result from https://10.128.0.15:2379
member 3de72bd56f654b1c is healthy: got healthy result from https://10.128.0.10:2379
member ac98ece88e3519b5 is healthy: got healthy result from https://10.128.0.14:2379
member cfb0839e8cad4c8f is healthy: got healthy result from https://10.128.0.11:2379
member eb9b83c725146b96 is healthy: got healthy result from https://10.128.0.13:2379
cluster is healthy

$ e2 member list
b67816d38b8e9d2: name=kube-ha-m3 peerURLs=https://10.128.0.15:2380 clientURLs=https://10.128.0.15:2379 isLeader=true
3de72bd56f654b1c: name=kube-ha-m1 peerURLs=https://10.128.0.10:2380 clientURLs=https://10.128.0.10:2379 isLeader=false
ac98ece88e3519b5: name=kube-etcd2 peerURLs=https://10.128.0.14:2380 clientURLs=https://10.128.0.14:2379 isLeader=false
cfb0839e8cad4c8f: name=kube-ha-m2 peerURLs=https://10.128.0.11:2380 clientURLs=https://10.128.0.11:2379 isLeader=false
eb9b83c725146b96: name=kube-etcd1 peerURLs=https://10.128.0.13:2380 clientURLs=https://10.128.0.13:2379 isLeader=false

$ e3 endpoint health
# the output includes only the etcd members specified in the --endpoints CLI option or the corresponding environment variable. I've included only three out of five members
https://10.128.0.13:2379 is healthy: successfully committed proposal: took = 2.310436ms
https://10.128.0.15:2379 is healthy: successfully committed proposal: took = 1.795723ms
https://10.128.0.14:2379 is healthy: successfully committed proposal: took = 2.41462ms

$ e3 endpoint status
# the output includes only the etcd members specified in the --endpoints CLI option or the corresponding environment variable. I've included only three out of five members
https://10.128.0.13:2379 is healthy: successfully committed proposal: took = 2.531676ms
https://10.128.0.15:2379 is healthy: successfully committed proposal: took = 1.285312ms
https://10.128.0.14:2379 is healthy: successfully committed proposal: took = 2.266932ms

How to check etcd Pod logs without using kubectl?

If you run the etcd member using kubelet only, you can check its logs using the following command:

docker logs `docker ps -a | grep etcd | grep -v pause | awk '{print $1}' | head -n1` 2>&1 | less

Note: Usually, only one etcd Pod can run on a node at a time, because it uses the database in the host directory /var/lib/etcd/, which cannot be shared between two pods. The etcd Pod also uses the node's network interface to communicate with the etcd cluster.
Of course, you can configure the etcd Pod to use a different host directory and different host ports as a workaround, but the above command assumes that only one etcd Pod is present on the node.
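
On nodes where the container runtime is containerd rather than Docker, a similar check can be done with crictl (a sketch, assuming crictl is installed and configured for the node's runtime socket):

crictl ps --name etcd
crictl logs $(crictl ps --name etcd -q | head -n1) 2>&1 | less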

-- VAS
Source: StackOverflow