Recover cluster. How can I restart kube-apiserver?
I am trying to post a question but it says "I am blocked"... What can I do?
Answers
(it is likely because I am pasting some command output that is flagged as "suspicious")
I will create my question removing some of the command output:
I had a computer crash while the two VMs of the cluster were running. After rebooting, the cluster is not accessible via kubectl:
kubectl get nodes
couldn't get current server API group list: Get "https://k8scp:6443/api?timeout=32s": dial tcp 192.168.138.131:6443: connect: connection refused
Nothing is listening on the port:
nc: connect to 192.168.138.131 port 6443 (tcp) failed: Connection refused
Kubelet is running:
sudo service kubelet status
Active: active (running) since Wed 2024-08-14 08:43:13 UTC; 2min 52s ago
...
I tried "sudo swapoff -a" and others...
How can I recover the cluster?
I thought of "reset" or even "init", but I would need to rejoin all nodes, I suppose. This is something that could happen in real life...
Hi @ilmx,
After rebooting the VMs, do they preserve their original IP addresses? If the IPs change, the cluster's identity changes, and the access credentials of kubectl are invalidated.
If the IPs are preserved, verify the kubelet and containerd services. They need to be active and running.
sudo systemctl status kubelet
sudo systemctl status containerd
If you do decide to reset your cluster, then all nodes need to be reset. First, on the control plane, run reset and init, then re-generate the .kube/config file. The worker then needs a reset and a join, using the newly generated join command from the current init output. Also, when listing nodes with
kubectl get nodes -o wide
ensure only current nodes are listed. If any of the old node entries are shown, remove them from the cluster with kubectl delete node node-name.
Last but not least, I would recommend that the VMs have IP addresses that are NOT from the 192.168.0.0/16 subnet.
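The check-then-reset sequence above could be sketched roughly as the shell helpers below. This is only an illustration under stated assumptions: the function names and the `RUN` switch are hypothetical, the script only prints each command unless `RUN=1` is set, `kubeadm init` is shown without options because the options originally used are not given in this thread, and the `admin.conf` copy follows the steps `kubeadm init` normally prints at the end of its output.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the recovery steps. By default every command is only
# printed with a leading "+". Set RUN=1 to really execute them.
# All function names here are hypothetical helpers, not kubeadm features.

run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# Step 1 (every node): both services must be active (running).
check_services() {
  run sudo systemctl status kubelet --no-pager
  run sudo systemctl status containerd --no-pager
}

# Step 2 (control plane only): reset, init, then re-generate .kube/config.
reset_control_plane() {
  run sudo kubeadm reset
  # Assumption: add the same init options that were used when the
  # cluster was first created; they are not shown in this thread.
  run sudo kubeadm init
  run mkdir -p "$HOME/.kube"
  run sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
  run sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
}

# Step 3 (each worker): reset, then run the join command that the new
# init printed. The join command itself is cluster-specific, so it is
# deliberately not reproduced here.
reset_worker() {
  run sudo kubeadm reset
}

# Step 4 (control plane): list nodes and delete one stale entry by name.
remove_stale_node() {
  run kubectl get nodes -o wide
  run kubectl delete node "$1"
}
```

For example, after sourcing the file, `check_services` prints the two status commands, and `RUN=1 remove_stale_node node-name` would list the nodes and then delete the named entry. Keeping the default as a dry run is a deliberate choice, since `kubeadm reset` is destructive.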
Regards,
-Chris
Thanks Chris for the information and the suggestion