master node not available after rebooting EC2

I stopped my EC2 VMs, and after restarting them the master node is not available:
- ubuntu@ip-xxx-xxx-xx-xxx:~$ systemctl status kubelet
- ● kubelet.service - kubelet: The Kubernetes Node Agent
- Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
- Drop-In: /etc/systemd/system/kubelet.service.d
- └─10-kubeadm.conf
- Active: active (running) since Tue 2019-01-08 19:12:13 UTC; 9min ago
- Docs: https://kubernetes.io/docs/home/
- Main PID: 1288 (kubelet)
- Tasks: 21
- Memory: 95.0M
- CPU: 1min 7.658s
- CGroup: /system.slice/kubelet.service
- ├─1288 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni
- └─4535 /opt/cni/bin/calico
- Jan 08 19:22:03 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:03.267101 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
- Jan 08 19:22:03 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:03.432768 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
- Jan 08 19:22:03 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:03.597614 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
- Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: 2019-01-08 19:22:04.089 [INFO][4535] calico.go 341: Extracted identifiers ContainerID="2af3731145389d511fb6c156e6fbcf5adb586d7290aa32f7d523ea80dceeb45b" Node="ip-xxx-xxx-xx-xxx" Orchestr
- Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: 2019-01-08 19:22:04.089 [INFO][4535] client.go 202: Loading config from environment
- Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.169439 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
- Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.334586 1288 azure_dd.go:147] failed to get azure cloud in GetVolumeLimits, plugin.host: ip-xxx-xxx-xx-xxx
- Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.824445 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
- Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.945939 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
- Jan 08 19:22:05 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:05.071637 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
I assume it should be possible to reboot, or stop and restart, the VMs.
Comments
Hi @crixo ,
It has been a while since I worked on AWS, but I remember being able to stop instances and then start them back up when I was ready to continue with my labs.
-Chris
@crixo
I went through lab 2.1 on 2 EC2 instances on AWS and had no trouble completing the lab, then stopping and starting my instances. Aside from new public IPs being assigned, the node IPs remained the same, but the pod IPs changed. I was also able to retest the service by accessing the nginx webserver via curl and a browser.
Can you provide any other details?
Can you look into the bootstrap-config, kubelet-config or config files to see whether the master IP has changed?
Thanks,
-Chris
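One way to check whether the master IP recorded in the kubelet configuration still matches the instance is to print the API server address from the kubeconfig files. This is a minimal sketch: the two paths come from the kubelet command line in the original post, while the helper name `kubeconfig_server` is made up for illustration.

```shell
#!/bin/sh
# Print every API server URL ("server:" entry) found in a kubeconfig-style file.
# kubeconfig_server is a hypothetical helper name, not part of any Kubernetes tool.
kubeconfig_server() {
    awk '$1 == "server:" { print $2 }' "$1"
}

# Paths taken from the kubelet command line shown in the original post.
for conf in /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/kubelet.conf; do
    if [ -r "$conf" ]; then
        printf '%s -> %s\n' "$conf" "$(kubeconfig_server "$conf")"
    else
        printf '%s is not readable (try sudo, or the file may not exist)\n' "$conf"
    fi
done
```

The printed address can then be compared by hand with the instance's current private IP, for example the one shown in the EC2 console.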
Hi @chrispokorni,
I destroyed the previous VM, and after creating a new one I was able to reboot and continue working with the cluster.
I noticed the VM went wild because kswapd0 was using most of the CPU.
Following the suggestion in the lab setup script to execute "sudo swapoff -a", I added the same command to /etc/rc.local on the AWS VM so that it runs after each reboot.
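A commonly used alternative to running swapoff from /etc/rc.local is to comment out any swap entry in /etc/fstab, so that swap is not enabled at boot in the first place. The sketch below only prints a modified copy and changes nothing on disk. Whether this VM had a swap entry in /etc/fstab at all is an assumption, and `comment_out_swap` is a hypothetical helper name.

```shell
#!/bin/sh
# Print a copy of an fstab-style file with active swap entries commented out.
# Nothing is modified in place; review the output before replacing /etc/fstab.
# comment_out_swap is a hypothetical helper name used only for this sketch.
comment_out_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "${1:-/etc/fstab}"
}

# Preview the result for the local fstab, if it can be read.
if [ -r /etc/fstab ]; then
    comment_out_swap /etc/fstab
fi
```

Piping the output into `diff /etc/fstab -` shows exactly which lines would change before anything is written back.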
AWS instances typically have swap disabled by default, at least the ones I've looked at and used. Perhaps it is something else? Did any of the containers restart? If there is a lot of activity and not enough resources, as when only one node is ready in a cluster, the termination of running containers because of OOM issues can cause a stampede. After rebooting, the worker node is ready to share the workload and things work better.
Glad it's working now.
Regards,
Hi @crixo,
You can verify swap by running one of the following:
cat /etc/fstab
cat /proc/swaps
swapon -s
free -h
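The checks above can be folded into a small script that reports whether any swap area is currently active. This is a minimal sketch that relies on the /proc/swaps layout (one header line followed by one line per active swap area); the function name `count_swap_areas` is made up for illustration.

```shell
#!/bin/sh
# Count active swap areas listed in a /proc/swaps-style file.
# The first line is a header; every further non-empty line is one active area.
# count_swap_areas is a hypothetical helper name used only for this sketch.
count_swap_areas() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "${1:-/proc/swaps}"
}

# Report the state of the local machine, if /proc/swaps can be read.
if [ -r /proc/swaps ]; then
    n=$(count_swap_areas /proc/swaps)
    if [ "$n" -eq 0 ]; then
        echo "swap is off"
    else
        echo "swap is on: $n active area(s)"
    fi
fi
```

Running it once before and once after a reboot shows whether swap comes back on its own.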
There seems to be a known issue with kswapd0 using a lot of CPU, and there are a few solutions posted online, but I have not tried any of them, so I am not sure what works and what doesn't.
Regards,
-Chris