etcd restore overview on a kubernetes cluster

Hi!

I would like to get some clarification about the general process of recovering a permanently failed etcd cluster for which we have a snapshot.
For the sake of simplicity let's assume we only have a single master node with one single etcd instance running and one single worker node.
So from what I read in the class content and also in the documentation about etcd restore, in my mind the overall procedure to restore a failed master node would be:

1 - Deploy a brand new VM (I am assuming the master node is a VM)
2 - Run kubeadm and initialize a new cluster
3 - Stop the API server, kubelet, and other control plane processes
4 - Restore the etcd database from the snapshot
5 - Restart the processes (API server, kubelet, etc.)
6 - Check and verify the cluster status
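For what it's worth, the steps above could be sketched roughly as below for a kubeadm-based cluster. This is only an illustration, not an official procedure: the snapshot path (/backup/etcd-snapshot.db) and the restored data directory are assumptions, and the exact manifest paths depend on your kubeadm defaults.

```shell
# Step 2 - initialize a fresh control plane on the new VM
kubeadm init

# Step 3 - stop the kubelet so the static pods (API server, etcd) go down
systemctl stop kubelet
# Move the static pod manifests aside so the pods are not restarted
mv /etc/kubernetes/manifests /etc/kubernetes/manifests.bak

# Step 4 - restore the snapshot into a fresh data directory
# (path /backup/etcd-snapshot.db is hypothetical)
etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir /var/lib/etcd-restored
# Point etcd at the restored data: either edit the hostPath volume in
# the etcd static pod manifest (etcd.yaml) to /var/lib/etcd-restored,
# or move the restored directory over /var/lib/etcd

# Step 5 - bring the static pods and the kubelet back up
mv /etc/kubernetes/manifests.bak /etc/kubernetes/manifests
systemctl start kubelet

# Step 6 - verify
kubectl get nodes
kubectl get pods -A
```

Note that on etcd v3.5+ the restore subcommand has moved to the `etcdutl` binary (`etcdutl snapshot restore`), though `etcdctl snapshot restore` still works on older releases.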

Would that be close to a valid "restore procedure"?

Many thanks!
