master node not available after rebooting EC2

I stopped the EC2 VMs, and after restarting them the master node is not available:
$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Tue 2019-01-08 19:12:13 UTC; 9min ago
Docs: https://kubernetes.io/docs/home/
Main PID: 1288 (kubelet)
Tasks: 21
Memory: 95.0M
CPU: 1min 7.658s
CGroup: /system.slice/kubelet.service
├─1288 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni
└─4535 /opt/cni/bin/calico
Jan 08 19:22:03 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:03.267101 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
Jan 08 19:22:03 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:03.432768 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
Jan 08 19:22:03 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:03.597614 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: 2019-01-08 19:22:04.089 [INFO][4535] calico.go 341: Extracted identifiers ContainerID="2af3731145389d511fb6c156e6fbcf5adb586d7290aa32f7d523ea80dceeb45b" Node="ip-xxx-xxx-xx-xxx" Orchestr
Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: 2019-01-08 19:22:04.089 [INFO][4535] client.go 202: Loading config from environment
Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.169439 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.334586 1288 azure_dd.go:147] failed to get azure cloud in GetVolumeLimits, plugin.host: ip-xxx-xxx-xx-xxx
Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.824445 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
Jan 08 19:22:04 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:04.945939 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
Jan 08 19:22:05 ip-xxx-xxx-xx-xxx kubelet[1288]: E0108 19:22:05.071637 1288 kubelet.go:2236] node "ip-xxx-xxx-xx-xxx" not found
I assume it should be possible to reboot, or stop and restart, the VMs.
Comments
Hi @crixo ,
It has been a while since I worked on AWS, but I remember being able to stop instances and then start them back up when I was ready to continue with my labs.
-Chris
@crixo
I went through lab 2.1 on 2 EC2 instances on AWS, and I had no trouble completing the lab, stopping and then starting my instances. Aside from being assigned new public IPs, the node IPs remained the same, but the pod IPs changed. I was also able to retest the service by accessing the nginx webserver via curl and a browser.
Can you provide any other details?
Can you look into the bootstrap-config, kubelet-config, or config files to see whether the master IP has changed?
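For example, here is one way to pull the API server endpoint out of a kubeadm-style kubelet config so it can be compared with the node's current private IP. The server: line below is a hypothetical sample; on a real node it lives in /etc/kubernetes/kubelet.conf:

```shell
# Hypothetical sample of the 'server:' line from /etc/kubernetes/kubelet.conf
conf_line='    server: https://172.31.10.5:6443'
# Extract the recorded API server IP so it can be compared with the
# node's current private IP (e.g. as reported by 'hostname -I')
api_ip=$(echo "$conf_line" | sed -E 's|.*https://([0-9.]+):.*|\1|')
echo "$api_ip"   # the IP recorded at kubeadm init time
```

If the extracted IP no longer matches the instance's current private IP, the kubelet cannot find "its" node object, which matches the "node not found" errors in the log above.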
Thanks,
-Chris
Hi @chrispokorni,
I destroyed the previous VM, and after creating a new one I was able to reboot and continue working with the cluster.
I noticed the VM went wild because kswapd0 was using most of the CPU.
Following the lab setup script's suggestion to run "sudo swapoff -a", I added the same command to
/etc/rc.local on the AWS VM so it executes after each reboot.
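An alternative to rc.local is commenting out the swap entry in /etc/fstab, so swap never comes back after a reboot. A minimal sketch, using a hypothetical fstab line (on a real system you would run the sed in place against /etc/fstab with sudo):

```shell
# Hypothetical swap entry as it might appear in /etc/fstab
fstab_line='/swapfile none swap sw 0 0'
# Prefixing '#' keeps the entry documented but prevents swap from
# being re-enabled on the next boot
echo "$fstab_line" | sed '/ swap / s/^/#/'
# -> #/swapfile none swap sw 0 0
```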
AWS instances typically have swap disabled by default, at least the ones I've looked at and used. Perhaps something else was going on? Did any of the containers restart? If there is a lot of activity and not enough resources, like when only one node in a cluster is ready, the termination of running containers because of OOM issues can cause a stampede of restarts. After rebooting, the worker node is ready to share the workload and things work better.
Glad it's working now.
Regards,
Hi @serewicz, Thanks a lot for clarifying. How do I check whether the AWS instance has swap disabled?
And if swap is disabled, is it normal that kswapd0 shows up as active, actually very active, among the processes listed by top?
Hi @crixo,
You can verify swap by running any of the following:
cat /etc/fstab
cat /proc/swaps
swapon -s
free -h
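As a rough sketch of interpreting /proc/swaps: the file has one header line, so any additional lines are active swap devices. The helper name count_swaps is made up here for illustration:

```shell
# count_swaps prints the number of active swap devices listed in a
# /proc/swaps-style file (NR-1 skips the single header line)
count_swaps() { awk 'END{print NR-1}' "$1"; }

# Usage on a real system:
#   count_swaps /proc/swaps    # 0 means swap is fully disabled
```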
There seems to be a known issue with kswapd0 using a lot of CPU, and there are a few solutions posted online, but I have not tried any of them, so I am not sure what works and what doesn't.
Regards,
-Chris