I made a Kubernetes cluster one year ago with an external etcd cluster (3 members).
At the time, I did not know that it was possible to run etcd internally, so I made an external cluster and connected Kubernetes to it.
Now I see that an internal etcd cluster is an option, and it is a cleaner solution because the etcd nodes are upgraded when you upgrade your Kubernetes cluster.
I can't find a clean way to migrate an external etcd cluster to an internal one, and I hope there is a solution with zero downtime. Do you know if this is possible?
Thank you for your response and have a nice day!
As I understand it, you have 3 etcd cluster members that are external from the Kubernetes cluster's perspective. The expected outcome is to have all three members running on the Kubernetes master nodes. Some information is left undisclosed, so I'll try to explain several possible options.
First of all, there are several reasonable ways to run the etcd process used as the Kubernetes control-plane key-value storage:
- as a static Pod managed by kubelet, via the /etc/kubernetes/manifests/etcd.yaml manifest file;
- as a systemd service, via /etc/systemd/system/etcd.service or a similar unit file.
For experimental purposes, you can also run etcd as a Docker container or as a plain process started from the command line.
My personal recommendation is a 5-member etcd cluster: 3 members run as static pods on the 3 Kubernetes master nodes, and two more run as static pods on external (Kubernetes-independent) hosts. With 5 members the quorum is 3, so you will still have quorum if at least one master node is running (together with the two external members), or if you lose both external nodes for any reason.
There are at least two ways to migrate an etcd cluster from external instances to the Kubernetes master nodes. They work in the opposite direction too.
The first is a quite straightforward way to migrate the cluster. During this procedure the members are turned off (one at a time), moved to another host and started again. Your cluster shouldn't have any problems as long as you keep quorum in the etcd cluster. My recommendation is to have at least a 3-member, or better a 5-member, etcd cluster to make the migration safer. For bigger clusters it may be more convenient to use the other solution from my second answer.
The process of moving an etcd member to another IP address is described in the official documentation:
To migrate a member:
- Stop the member process.
- Copy the data directory of the now-idle member to the new machine.
- Update the peer URLs for the replaced member to reflect the new machine according to the runtime reconfiguration instructions.
- Start etcd on the new machine, using the same configuration and the copy of the data directory.
Now let's look closer at each step:
0.1 Ensure your etcd cluster is healthy and all members are in good condition. I would also recommend checking the logs of all etcd members, just in case.
(To successfully run the following commands, please refer to step 3 for the auth variables and aliases.)
# the last two commands only show the members specified with the --endpoints command line option
# the following commands are supposed to be run with root privileges, because the certificates are not accessible by a regular user
e2 cluster-health
e3 endpoint health
e3 endpoint status
0.2 Check each etcd member's configuration and find out where the etcd data-dir is located, then ensure that it will remain accessible after the etcd process is terminated. In most cases it's located under /var/lib/etcd on the host machine and is either used directly or mounted as a volume into the etcd pod or Docker container.
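For example, a couple of quick checks, assuming the static pod or systemd setups mentioned above (adjust the file names to your environment):
# static pod: look for --data-dir and the hostPath volumes in the manifest
grep -- '--data-dir\|hostPath\|path:' /etc/kubernetes/manifests/etcd.yaml
# systemd service: look for --data-dir in the unit's ExecStart line
systemctl cat etcd.service 2>/dev/null | grep -i 'data-dir\|ExecStart'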
0.3 Create a snapshot of each etcd cluster member: it's better to have it and not need it than to need it and not have it.
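For example, a snapshot of one member may look like this (a sketch using the auth variables from step 3; run it as root and adjust the endpoint and file path to your environment):
# snapshots are taken from a single endpoint, one member at a time
ETCDCTL_API=3 etcdctl $ETCDAUTH3 --endpoints https://etcd.ip.addr.one:2379 snapshot save /root/etcd-member-one-snapshot.db
# verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status /root/etcd-member-one-snapshot.db -w table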
1. Stop the etcd member process.
If you use kubelet to start etcd, as recommended here, move the etcd.yaml file out of /etc/kubernetes/manifests/. Right after that, the etcd Pod will be terminated by kubelet:
sudo mv /etc/kubernetes/manifests/etcd.yaml ~/
sudo chmod 644 ~/etcd.yaml
If you start the etcd process as a systemd service, you can stop it with the following command:
sudo systemctl stop etcd-service-name.service
If etcd runs as a Docker container, you can stop it with the following commands:
docker ps -a
docker stop <etcd_container_id>
docker rm <etcd_container_id>
If you run the etcd process from the command line, you can kill it using the following command:
kill `pgrep etcd`
2. Copy the data directory of the now-idle member to the new machine.
Not much complexity here. Archive the etcd data-dir into a file and copy it to the destination instance. I also recommend copying the etcd manifest or the systemd service configuration if you plan to run etcd on the new instance in the same way.
tar -C /var/lib -czf etcd-member-name-data.tar.gz etcd
tar -czf etcd-member-name-conf.tar.gz [etcd.yaml] [/etc/systemd/system/etcd.service] [/etc/kubernetes/manifests/etcd.conf ...]
scp etcd-member-name-data.tar.gz destination_host:~/
scp etcd-member-name-conf.tar.gz destination_host:~/
3. Update the peer URLs for the replaced member to reflect the new member IP address according to the runtime reconfiguration instructions.
There are two ways to do it: by using the etcd API or by running the etcdctl utility.
Here is how the etcdctl way may look (replace the etcd endpoint variables with the correct etcd cluster members' IP addresses):
# all etcd cluster members should be specified
export ETCDSRV="--endpoints https://etcd.ip.addr.one:2379,https://etcd.ip.addr.two:2379,https://etcd.ip.addr.three:2379"
#authentication parameters for v2 and v3 etcdctl APIs
export ETCDAUTH2="--ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key"
export ETCDAUTH3="--cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/peer.crt --key=/etc/kubernetes/pki/etcd/peer.key"
# etcdctl API v3 alias
alias e3="ETCDCTL_API=3 etcdctl $ETCDAUTH3 $ETCDSRV"
# etcdctl API v2 alias
alias e2="ETCDCTL_API=2 etcdctl $ETCDAUTH2 $ETCDSRV"
# list all etcd cluster members and their IDs
e2 member list
e2 member update member_id http://new.etcd.member.ip:2380
#or
e3 member update member_id --peer-urls="https://new.etcd.member.ip:2380"
Here is how the etcd API way may look:
export CURL_ETCD_AUTH="--cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt"
curl https://health.etcd.instance.ip:2379/v2/members/member_id -XPUT -H "Content-Type: application/json" -d '{"peerURLs":["http://new.etcd.member.ip:2380"]}' ${CURL_ETCD_AUTH}
4. Start etcd on the new machine, using the adjusted configuration and the copy of the data directory.
Unpack etcd data-dir on the new host:
tar -xzf etcd-member-name-data.tar.gz -C /var/lib/
Adjust the etcd startup configuration according to your needs. At this point it's easy to switch to another way of running etcd. Depending on your choice, prepare the manifest or the service definition file and replace the old IP address there with the new one. E.g.:
sed -i 's/\/10.128.0.12:/\/10.128.0.99:/g' etcd.yaml
Now it's time to start etcd, either by moving etcd.yaml back to /etc/kubernetes/manifests/, or by running the following command (if you run etcd as a systemd service):
sudo systemctl start etcd-service-name.service
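If you run etcd as a static pod, starting it is simply moving the adjusted manifest back in place (assuming you parked it in your home directory in step 1):
sudo mv ~/etcd.yaml /etc/kubernetes/manifests/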
5. Check the updated etcd process logs and the etcd cluster health to ensure that the member is healthy.
To do that you can use the following commands:
$ e2 cluster-health
$ kubectl logs etcd_pod_name -n kube-system
$ docker logs etcd_container_id 2>&1 | less
$ journalctl -e -u etcd_service_name
The second solution, which I mentioned in my other answer, is to add new etcd members running on the Kubernetes master nodes to the existing cluster and then remove the old external members, one at a time, using etcd runtime reconfiguration.
The downside of this method is that the etcd quorum size is temporarily increased, and if several nodes fail, the etcd cluster may break. To avoid that, you may want to remove one existing etcd cluster member before adding another one.
Here is a brief overview of the process:
- Copy ca.crt and ca.key from an existing etcd node folder (/etc/kubernetes/pki/etcd/) and generate certificates for the new member.
- Add the new member to the cluster and register it using the etcdctl command.
Another possible sequence:
- Copy ca.crt and ca.key from an existing etcd node folder (/etc/kubernetes/pki/etcd/) and generate certificates for the new member.
- Add the new member and adjust the etcd.yaml files for all etcd cluster members.
Note: If you have an etcd cluster, you likely have an etcd CA certificate somewhere. Consider using it along with the etcd CA key to generate certificates for all additional etcd members.
Note: In case you choose to generate the certificates manually, use the same parameters as the existing Kubernetes etcd certificates. You can check the content of the existing certificates using the following command:
find /etc/kubernetes/pki/ -name '*.crt' | xargs -L1 bash -c 'echo $0 ; openssl x509 -in $0 -text -noout'
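If ca.crt and ca.key are already present in /etc/kubernetes/pki/etcd/ on the new node, one convenient option is to let kubeadm sign the member certificates. A sketch, assuming kubeadm v1.13+ (the certificate SANs are derived from the node name and addresses, or from a kubeadm configuration file if you pass one with --config):
# run on the new etcd node; requires /etc/kubernetes/pki/etcd/ca.crt and ca.key
sudo kubeadm init phase certs etcd-server
sudo kubeadm init phase certs etcd-peer
sudo kubeadm init phase certs etcd-healthcheck-client
sudo kubeadm init phase certs apiserver-etcd-client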
(Please refer to my other answer, step 3, for the variable and alias definitions.)
e3 member list
b67816d38b8e9d2, started, kube-ha-m3, https://10.128.0.12:2380, https://10.128.0.12:2379
3de72bd56f654b1c, started, kube-ha-m1, https://10.128.0.10:2380, https://10.128.0.10:2379
ac98ece88e3519b5, started, kube-etcd2, https://10.128.0.14:2380, https://10.128.0.14:2379
cfb0839e8cad4c8f, started, kube-ha-m2, https://10.128.0.11:2380, https://10.128.0.11:2379
eb9b83c725146b96, started, kube-etcd1, https://10.128.0.13:2380, https://10.128.0.13:2379
401a166c949e9584, started, kube-etcd3, https://10.128.0.15:2380, https://10.128.0.15:2379 # Let's remove this one
e2 member remove 401a166c949e9584
The member will shut down instantly. To prevent further attempts to rejoin the cluster, move/delete etcd.yaml from /etc/kubernetes/manifests/ or shut down the etcd service on the removed member's node.
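On the removed member's node that means, for example (depending on how the member was started, as in step 1 of my other answer):
sudo mv /etc/kubernetes/manifests/etcd.yaml ~/      # static pod
sudo systemctl stop etcd-service-name.service       # or systemd service
Then register the new member: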
e3 member add kube-etcd3 --peer-urls="https://10.128.0.16:2380"
The output shows the parameters required to start the new etcd cluster member, e.g.:
ETCD_NAME="kube-etcd3"
ETCD_INITIAL_CLUSTER="kube-ha-m3=https://10.128.0.15:2380,kube-ha-m1=https://10.128.0.10:2380,kube-etcd2=https://10.128.0.14:2380,kube-ha-m2=https://10.128.0.11:2380,kube-etcd1=https://10.128.0.13:2380,kube-etcd3=https://10.128.0.16:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.128.0.16:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
Note: The ETCD_INITIAL_CLUSTER variable contains all existing etcd cluster members plus the new node. If you need to add several nodes, it should be done one node at a time.
Note: All ETCD_INITIAL_* variables and the corresponding command line parameters are only required for the first start of the etcd Pod. After the node has successfully joined the etcd cluster, these parameters are ignored and can be removed from the startup configuration. All required information is stored in the etcd database file in the /var/lib/etcd folder.
The default etcd.yaml manifest can be generated using the following kubeadm command:
kubeadm init phase etcd local
It's better to move the generated etcd.yaml file out of /etc/kubernetes/manifests/ somewhere else while making the adjustments.
Also delete the content of the /var/lib/etcd folder: it contains the data of a brand-new etcd cluster, so it can't be used to add a member to the existing cluster.
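For example:
sudo mv /etc/kubernetes/manifests/etcd.yaml ~/   # park the manifest for editing
sudo rm -rf /var/lib/etcd/*                      # drop the freshly initialized data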
Then it should be adjusted according to the member add command output (--advertise-client-urls, --initial-advertise-peer-urls, --initial-cluster, --initial-cluster-state, --listen-client-urls, --listen-peer-urls). E.g.:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.128.0.16:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://10.128.0.16:2380
    - --initial-cluster=kube-ha-m3=https://10.128.0.15:2380,kube-ha-m1=https://10.128.0.10:2380,kube-etcd2=https://10.128.0.14:2380,kube-ha-m2=https://10.128.0.11:2380,kube-etcd1=https://10.128.0.13:2380,kube-etcd3=https://10.128.0.16:2380
    - --initial-cluster-state=existing
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://10.128.0.16:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://10.128.0.16:2380
    - --name=kube-etcd3
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.3.10
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
After saving the file, kubelet will restart the etcd Pod. Check the etcd container logs to ensure it has joined the cluster.
$ e2 cluster-health
member b67816d38b8e9d2 is healthy: got healthy result from https://10.128.0.15:2379
member 3de72bd56f654b1c is healthy: got healthy result from https://10.128.0.10:2379
member ac98ece88e3519b5 is healthy: got healthy result from https://10.128.0.14:2379
member cfb0839e8cad4c8f is healthy: got healthy result from https://10.128.0.11:2379
member eb9b83c725146b96 is healthy: got healthy result from https://10.128.0.13:2379
cluster is healthy
$ e2 member list
b67816d38b8e9d2: name=kube-ha-m3 peerURLs=https://10.128.0.15:2380 clientURLs=https://10.128.0.15:2379 isLeader=true
3de72bd56f654b1c: name=kube-ha-m1 peerURLs=https://10.128.0.10:2380 clientURLs=https://10.128.0.10:2379 isLeader=false
ac98ece88e3519b5: name=kube-etcd2 peerURLs=https://10.128.0.14:2380 clientURLs=https://10.128.0.14:2379 isLeader=false
cfb0839e8cad4c8f: name=kube-ha-m2 peerURLs=https://10.128.0.11:2380 clientURLs=https://10.128.0.11:2379 isLeader=false
eb9b83c725146b96: name=kube-etcd1 peerURLs=https://10.128.0.13:2380 clientURLs=https://10.128.0.13:2379 isLeader=false
$ e3 endpoint health
# the output only includes the etcd members specified in the --endpoints CLI option or the corresponding environment variable; I've included only three of the five members
https://10.128.0.13:2379 is healthy: successfully committed proposal: took = 2.310436ms
https://10.128.0.15:2379 is healthy: successfully committed proposal: took = 1.795723ms
https://10.128.0.14:2379 is healthy: successfully committed proposal: took = 2.41462ms
$ e3 endpoint status
# the output only includes the etcd members specified in the --endpoints CLI option or the corresponding environment variable; I've included only three of the five members
https://10.128.0.13:2379 is healthy: successfully committed proposal: took = 2.531676ms
https://10.128.0.15:2379 is healthy: successfully committed proposal: took = 1.285312ms
https://10.128.0.14:2379 is healthy: successfully committed proposal: took = 2.266932ms
If you run the etcd member using only kubelet, you can check its logs with the following command:
docker logs `docker ps -a | grep etcd | grep -v pause | awk '{print $1}' | head -n1` 2>&1 | less
Note: Usually only one etcd Pod can run on a node at a time, because it uses the database in the host directory /var/lib/etcd/, which cannot be shared between two pods, and the etcd Pod uses the node's network interface to communicate with the etcd cluster.
Of course, as a workaround you can configure an etcd Pod to use a different host directory and different host ports, but the command above assumes that only one etcd Pod is present on the node.