Lab 3.3 - coredns CrashLoopBackoff
Following hardware issues I had to reinstall the Kubernetes master and workers on another PC. I'm running an Ubuntu 20.04 based host with QEMU/KVM Ubuntu 18.04 server guests: master, worker1 to worker4.
Things I did in addition to the lab tutorial: Comment out swap creation in /etc/fstab. Networking is done by NetworkManager using static IP. The master also runs a bind9 DNS server (see further down).
All VMs are connected to a bridged network bridge0. ufw firewall is disabled on the VMs and the host.
Here is the output of
kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS                  RESTARTS   AGE
kube-system   calico-kube-controllers-578894d4cd-mrj4l   1/1     Running                 4          25h
kube-system   calico-node-9vsvc                          0/1     Init:CrashLoopBackOff   7          25h
kube-system   calico-node-g9q9x                          0/1     Running                 4          134m
kube-system   calico-node-knppq                          0/1     Completed               2          93m
kube-system   calico-node-wpfzq                          1/1     Running                 4          24h
kube-system   coredns-66bff467f8-gqgt9                   0/1     Completed               0          57m
kube-system   coredns-66bff467f8-qnsjk                   0/1     CrashLoopBackOff        11         56m
kube-system   etcd-master                                1/1     Running                 4          25h
kube-system   kube-apiserver-master                      1/1     Running                 6          25h
kube-system   kube-controller-manager-master             1/1     Running                 7          25h
kube-system   kube-proxy-8wshb                           1/1     Running                 4          134m
kube-system   kube-proxy-gxnjw                           0/1     Error                   2          93m
kube-system   kube-proxy-hr92t                           1/1     Running                 4          24h
kube-system   kube-proxy-z8cx6                           1/1     Running                 4          25h
kube-system   kube-scheduler-master                      1/1     Running                 7          25h
I now disabled the ufw firewall on the host and deleted the calico-node... pods. Then I deleted the coredns-... pods and this is the result:
kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS             RESTARTS   AGE
kube-system   calico-kube-controllers-578894d4cd-mrj4l   1/1     Running            4          25h
kube-system   calico-node-knppq                          1/1     Running            4          110m
kube-system   calico-node-kt5lh                          1/1     Running            0          2m25s
kube-system   calico-node-wpfzq                          1/1     Running            4          24h
kube-system   calico-node-z8h2t                          1/1     Running            0          107s
kube-system   coredns-66bff467f8-9sjgq                   0/1     CrashLoopBackOff   1          16s
kube-system   coredns-66bff467f8-hfk5b                   0/1     Running            0          33s
I removed the bind9 DNS server on the master, but this led to other problems: among other things, names sometimes resolve and at other times don't. Right now name resolution doesn't work, even though I tried to reverse the steps and have systemd-resolved up and running.
I guess I will be reinstalling the host, then the VMs and see if that solves the issues. I'm afraid the bind9 server on the master VM didn't help.
The other possible issue could be libvirt networking. I had manually configured a bridged network which usually works fine when editing the xml guest config files to enable bridged networking. Next time I will try to configure the bridge within virt-manager and see if it makes a difference.
Any suggestions as to the above CrashLoopBackoff errors are welcome. Perhaps I'm looking in the wrong place altogether.
Comments
-
Hello Heiko,
I would agree with you that the issue is tied to how QEMU/KVM is passing the traffic. With both calico-node and coredns failing, I would guess that on the worker node the host is not properly passing all the network traffic back to the master. As the image was loaded by Docker, we can tell that overall the nodes have access to the Internet, so the issue may only be between nodes.
If you start nginx or busybox with many replicas, do they reach Ready state on both the master and the worker? Does the output of kubectl get pod -o wide show any other issues that only happen on the worker node?
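One way to answer the per-node question at a glance is to summarise the READY column by node. This is only a minimal sketch: `ready_per_node` is a made-up helper name (not part of kubectl), and it assumes the column layout that kubectl v1.18 prints for `kubectl get pods -o wide` in a single namespace, where READY is the 2nd column and NODE the 7th.

```shell
# ready_per_node: hypothetical helper, reads `kubectl get pods -o wide`
# output on stdin and prints "<node> <ready pods>/<total pods>" per node.
# Assumption: READY is column 2 and NODE is column 7 (no --all-namespaces).
ready_per_node() {
  awk 'NR > 1 {
         split($2, r, "/")             # READY looks like "1/1"
         total[$7]++
         if (r[1] == r[2]) ok[$7]++
       }
       END { for (n in total) printf "%s %d/%d\n", n, ok[n], total[n] }' | sort
}

# Usage (needs a working kubectl):
#   kubectl get pods -o wide | ready_per_node
```

A node that shows fewer ready pods than the others would be the first place to capture traffic.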
To troubleshoot the issue I would start Wireshark on the primary interface of each node. Is there only one interface per VM? Multiple interfaces can be an issue as well. If you terminate a calico or coredns pod on the worker, you should see traffic going from the worker to the master. Of course it is all using TLS, so you could add these two flags: --insecure-port, to set a port which will be bound in insecure mode, and --insecure-bind-address, to set the interface/IP to use.
You can also set the --bind-interface if you have multiple interfaces on the nodes, to narrow down where traffic goes.
I have a feeling that the calico and coredns traffic is not being sent to the master. If other pod traffic works, if you can get them running, I'd lean towards a bug.
Another idea is to put in a virtual switch with OvS. If it works, then we know the issue is QEMU/KVM networking and can explore options for changing the network type inside QEMU.
Regards,
1 -
Hello Heiko,
I spun up two U18 VMs on my RHEL 8 system. Fresh install, then set up the cluster. No issue. Could be either something buggy in U20 (which uses SELinux AND apparmor and may have eBPF in there), or something about the networking of your QEMU/KVM instances. This is what I see:
1 -
Hello Tim, thanks for the detailed answers and the effort to replicate the problem.
In the meantime I reinstalled the host and the master. Since I already had installed a master and multiple workers on another PC and that went fine, I started to suspect some VM config issues. In fact I had taken the configs from my other PC without change and noticed that I had over-provisioned the VMs. The PC I'm using now has only 32Gig memory and 6 cores / 12 threads, making it borderline specs for a master and 4 workers. In fact, the whole PC would freeze at times, often followed by a crash of a worker VM.
I changed the VM configs but still get errors. So now I deleted the nodes and will try to recreate them.
I had configured my nodes to use static IPs so I could easily access them via SSH. After I modified /etc/netplan/0...yaml to activate NetworkManager, I used nmcli to configure the VM network interfaces. Here is an example for the master:
nmcli con add con-name static0 type ethernet ifname enp1s0 autoconnect yes ipv4.addresses 192.168.0.130/24 gw4 192.168.0.1 ipv4.method manual ipv4.dns 8.8.8.8
Unfortunately this didn't solve the issues. I then created a bridge on the host (bridge0), which is used in the VM configurations:
In the past I had always configured a bridge on the host to be used for communication between VMs and host and VM to VM. Never had any issue. Seeing that you used virt-manager, I have some questions:
- Did you set up a virtual network in virt-manager, other than the default network with the virbr0 device that uses DHCP range 192.168.122.2 - 192.168.122.254?
- Did you create a bridge on the host, or leave networking at default?
I will first
kubectl delete node worker1
... and recreate them to see if that helps. If not, I'm grudgingly going to try the default network setup with DHCP. Thanks again for the help.
0 -
Seems like I created a mess that needed some cleaning up. If you look at my IP range for the VMs (host: 192.168.0.129, VMs: 192.168.0.130-134), it's the same as the calico range (192.168.0.0).
I edited the calico.yaml and kubeadm-config.yaml files and changed the address range to 192.168.1.0/16 (in the calico file that meant to uncomment two lines).
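The overlap described above can be checked before running kubeadm init. What follows is a minimal sketch in plain shell arithmetic: the helper names `ip_to_int` and `in_cidr` are made up for illustration, and 10.244.0.0/16 is only an example of a range that does not collide with a 192.168.0.x LAN. One thing worth keeping in mind is that a /16 mask ignores the third octet, so 192.168.1.0/16 covers the same addresses as 192.168.0.0/16.

```shell
# ip_to_int: hypothetical helper, converts a dotted-quad IPv4 address
# to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1                       # split the address on dots
  echo "$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))"
}

# in_cidr: hypothetical helper, succeeds (exit 0) if address $1 lies
# inside the CIDR block $2, for example 192.168.0.0/16.
in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Compare the node addresses from this thread with candidate pod ranges.
for node in 192.168.0.129 192.168.0.130 192.168.0.134; do
  for cidr in 192.168.0.0/16 192.168.1.0/16 10.244.0.0/16; do
    if in_cidr "$node" "$cidr"; then
      echo "$node is inside $cidr"
    else
      echo "$node is outside $cidr"
    fi
  done
done
```

If a node address is reported as inside the pod range, pod and LAN routes can shadow each other, so picking a range outside the LAN (or a narrower prefix) is the safer choice.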
Since I had used kubeadm reset on the master and the workers several times, I finally RTFM-ed the output of that command. It reminds us to remove the files in /etc/cni/net.d/*, but more importantly it mentions that the iptables rules are NOT deleted. Moreover, /var/lib contains remnants of the deployment that should be removed.
So here is what I did on the master and each worker:
kubeadm reset
rm /etc/cni/net.d/*
cd /var/lib
ls
rm -rf calico
rm -rf cni
rm -rf kubelet
cd /etc/kubernetes/
ls
ls manifests/
ls pki/
cd
iptables -F
iptables -L
systemctl restart docker.service
iptables -L
kubeadm join k8smaster:6443 --token 4id98t.tzleaeq49ew8ahgm --discovery-token-ca-cert-hash sha256:1727ba505a5fe1dec308530497c56109bb7c92263c1464e78b6f19401ae1ec23
iptables -F flushes the iptables rules. The restart of docker.service is necessary to create new iptables rules for docker.
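For repeating this on several nodes, the destructive steps could be wrapped in a small script with a dry-run default. This is a sketch only: `run` and `cleanup_node` are hypothetical helper names, the `APPLY` variable is an invented switch, and the paths are simply the ones listed above. Note that `iptables -F` without `-t` only flushes the filter table; other tables such as nat are left alone.

```shell
# run: hypothetical helper, prints the command and executes it only
# when APPLY=1 is set, so the default is a harmless dry run.
run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# cleanup_node: the cleanup steps from this post, in order.
cleanup_node() {
  run kubeadm reset
  run sh -c 'rm -f /etc/cni/net.d/*'
  run rm -rf /var/lib/calico /var/lib/cni /var/lib/kubelet
  run iptables -F                        # filter table only
  run systemctl restart docker.service   # lets Docker recreate its rules
}

# Dry run:           cleanup_node
# For real, as root: APPLY=1 cleanup_node
```

The `kubeadm join` step is left out on purpose, since the token and hash differ per cluster.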
Here is the result:
kubectl get nodes
NAME      STATUS   ROLES    AGE     VERSION
master    Ready    master   51m     v1.18.1
worker1   Ready    <none>   35m     v1.18.1
worker2   Ready    <none>   28m     v1.18.1
worker3   Ready    <none>   8m45s   v1.18.1
worker4   Ready    <none>   3m12s   v1.18.1
Unfortunately my hardware cannot handle more than 4 nodes, as it freezes and crashes a worker node once I try to run the 5th node (4th worker node).
kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-578894d4cd-dtdzx   1/1     Running   0          41m
kube-system   calico-node-2x7qz                          1/1     Running   0          4m40s
kube-system   calico-node-4qqlc                          1/1     Running   0          41m
kube-system   calico-node-lgdhf                          0/1     Running   1          30m
kube-system   calico-node-nhr62                          1/1     Running   0          36m
kube-system   calico-node-w29hn                          1/1     Running   0          10m
kube-system   coredns-66bff467f8-2x44x                   1/1     Running   0          52m
kube-system   coredns-66bff467f8-9s6hc                   1/1     Running   0          52m
kube-system   etcd-master                                1/1     Running   0          52m
kube-system   kube-apiserver-master                      1/1     Running   0          52m
kube-system   kube-controller-manager-master             1/1     Running   2          52m
kube-system   kube-proxy-c45nx                           1/1     Running   0          36m
kube-system   kube-proxy-d2z9k                           1/1     Running   1          30m
kube-system   kube-proxy-rpmtk                           1/1     Running   0          10m
kube-system   kube-proxy-tjwc5                           1/1     Running   0          4m40s
kube-system   kube-proxy-z2ph4                           1/1     Running   0          52m
kube-system   kube-scheduler-master                      1/1     Running   2
Still I'd say it works as advertised. I will remove the 4th worker and see how stable it is.
0 -
So I got rid of worker4:
kubectl delete nodes worker4
But that doesn't seem the right way, as calico still runs pods for the removed node:
NAMESPACE     NAME                                       READY   STATUS                  RESTARTS   AGE
kube-system   calico-kube-controllers-578894d4cd-dtdzx   1/1     Running                 1          57m
kube-system   calico-node-4qqlc                          1/1     Running                 1          57m
kube-system   calico-node-lgdhf                          0/1     Init:CrashLoopBackOff   2          45m
kube-system   calico-node-nhr62                          0/1     Running                 2          52m
kube-system   calico-node-w29hn                          0/1     Running                 1          25m
kube-system   coredns-66bff467f8-2x44x                   1/1     Running                 1          67m
kube-system   coredns-66bff467f8-9s6hc                   1/1     Running                 1          67m
This also affects the running pods, as you can see from the 0/1 READY state above. So I deleted the pods, which finally gave me 1/1 results:
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-578894d4cd-dtdzx   1/1     Running   1          73m
kube-system   calico-node-4qqlc                          1/1     Running   1          73m
kube-system   calico-node-57cpc                          1/1     Running   0          62s
kube-system   calico-node-pqg6d                          1/1     Running   0          62s
kube-system   calico-node-rlbtk                          1/1     Running   0          102s
kube-system   coredns-66bff467f8-2x44x                   1/1     Running   1          84m
kube-system   coredns-66bff467f8-9s6hc                   1/1     Running   1          84m
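Picking out the pods that are not fully ready can be scripted instead of read off the table by eye. A minimal sketch, assuming the `--all-namespaces` column layout shown above (NAMESPACE, NAME, READY, ...); `not_ready` is a made-up helper name, not a kubectl feature.

```shell
# not_ready: hypothetical helper, reads `kubectl get pods --all-namespaces`
# output on stdin and prints "<namespace> <pod>" for every pod whose
# READY column (e.g. "0/1") shows fewer ready containers than expected.
not_ready() {
  awk 'NR > 1 { split($3, r, "/"); if (r[1] != r[2]) print $1, $2 }'
}

# Usage (needs a working kubectl), listing first and deleting second:
#   kubectl get pods --all-namespaces | not_ready
#   kubectl get pods --all-namespaces | not_ready |
#     while read -r ns pod; do kubectl delete pod -n "$ns" "$pod"; done
```

Pods in the Completed state also show 0/1, so it is worth reviewing the list before piping it into a delete.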
0 -
Great Heiko! I'm glad you got it working.
Sounds like a combo of leftover config files and some resource issues. Congrats, you found the two most difficult things to troubleshoot.
If you want to add the fifth node, one idea would be to lower the resources on all the workers and the proxy to fit. If you don't have any big deployments, the worker nodes and the HAProxy node don't use all that many resources. I always suggest folks use the same size, as there is less chance of running out of resources with a high replica count, but the lab should work with a smaller worker/proxy.
Regards,
0 -
Now that the cluster is running I've been testing it until the smoke comes out.
In exercise 3.4 I deployed the nginx container and web service. However, when I tried to access the nginx pod on worker1 from the master using curl IP-address, it didn't work. I then realised that there was no tunl0 interface on the master. The workers are fine and all have the tunl0 interface. I suppose this isn't normal? I probably forgot to delete some stuff before I ran the kubeadm init command again. Is there a simple way to (re)activate the tunl0 interface?
Having fun with running
for i in {1..10000}; do curl 10.110.201.77:80 &>/dev/null; done
in one terminal window, watching tcpdump in another
sudo tcpdump -i tunl0
and watching and deleting the pods in a third window
kubectl get pods -o wide
kubectl delete pod nginx-d46f5678b-xwdlj
The IP address in the curl command is the cluster-ip so the tcpdump shows different IPs for the end-points. In the worst case I managed to create a "service blackout" of 5-7 seconds in between pod deletion and recreation. In some instances the curl command would employ 2 nodes.
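The length of such a blackout can be measured rather than estimated by logging one timestamped probe per line and then looking for the longest run of failures. A rough sketch under stated assumptions: `probe` and `longest_outage` are hypothetical helper names, the one-second timeout and one-second interval are arbitrary, and the result is only accurate to about a second.

```shell
# probe: hypothetical helper, hits URL $1 a total of $2 times and prints
# "<epoch seconds> ok" or "<epoch seconds> fail" for each attempt.
probe() {
  local url=$1 count=$2 i=0
  while [ "$i" -lt "$count" ]; do
    if curl -s -o /dev/null --max-time 1 "$url"; then
      echo "$(date +%s) ok"
    else
      echo "$(date +%s) fail"
    fi
    i=$((i + 1))
    sleep 1
  done
}

# longest_outage: hypothetical helper, reads probe output on stdin and
# prints the longest span in seconds from a first failed probe to the
# next successful one (or to the last failure if it never recovers).
longest_outage() {
  awk '$2 == "fail" { if (start == "") start = $1; last = $1; next }
       start != ""  { d = $1 - start; if (d > max) max = d; start = "" }
       END { if (start != "") { d = last - start; if (d > max) max = d }
             print max + 0 }'
}

# Usage, with the cluster IP from the curl loop above:
#   probe 10.110.201.77:80 120 | tee probe.log | longest_outage
```

Deleting a pod in another terminal while the probe runs then gives a number to compare between runs.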
0 -
One more thing: You mentioned multiple interfaces could be an issue. Well, here is what I have on the master and the workers:
nmcli con show
NAME     UUID                                  TYPE      DEVICE
docker0  fa09a716-5e2c-4f26-9be4-20641e117a28  bridge    docker0
static2  ce36f612-d8da-423b-bc98-cde20aec5de7  ethernet  enp1s0
There are two interfaces, one that I created (static2) and one that docker created. Is that OK like that?
0 -
Hi @heiko_s,
Once installed, Docker will create the docker bridge on the nodes where it is running. However, that bridge does not get utilized by Kubernetes, as it uses a third-party networking plugin. In other words, the docker bridge is harmless.
Regards,
-Chris
0 -
Hello,
Indeed. I was speaking of outbound interfaces, something like having eth0 and eth1 etc. I don't think that's the case with your setup, but there are extra considerations if your instances have multiple outbound interfaces.
I have found that an existing cluster which goes through a kubeadm reset will have odd networking issues. I would rebuild with freshly installed VMs. That way you know what you are working with is a typical setup instead of an unknown and little-tested combination of kubeadm reset plus some less typical networking configuration.
Regards,
0 -
Thanks Chris for clarifying the Docker bridge. Makes sense.
Tim, today I booted the PC and VMs and the master now has a tunl0 interface. At first everything seems to be running fine:
kubectl get nodes -o wide
NAME      STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
master    Ready    master   19h   v1.18.1   192.168.0.130   <none>        Ubuntu 18.04.5 LTS   4.15.0-112-generic   docker://19.3.6
worker1   Ready    <none>   19h   v1.18.1   192.168.0.131   <none>        Ubuntu 18.04.5 LTS   4.15.0-112-generic   docker://19.3.6
worker2   Ready    <none>   19h   v1.18.1   192.168.0.132   <none>        Ubuntu 18.04.5 LTS   4.15.0-112-generic   docker://19.3.6
worker3   Ready    <none>   18h   v1.18.1   192.168.0.133   <none>        Ubuntu 18.04.5 LTS   4.15.0-112-generic   docker://19.3.6

kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP                NODE      NOMINATED NODE   READINESS GATES
nginx-d46f5678b-2mk4s   1/1     Running   1          11h   192.168.189.70    worker2   <none>           <none>
nginx-d46f5678b-d9v2v   1/1     Running   1          11h   192.168.189.69    worker2   <none>           <none>
nginx-d46f5678b-js92b   1/1     Running   1          10h   192.168.235.137   worker1   <none>           <none>

kubectl get service nginx
NAME    TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
nginx   LoadBalancer   10.105.137.239   <pending>     80:30747/TCP   10h

kubectl get ep nginx
NAME    ENDPOINTS                                                 AGE
nginx   192.168.189.69:80,192.168.189.70:80,192.168.235.137:80   10h

kubectl get deployments.apps nginx -o wide
NAME    READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES   SELECTOR
nginx   3/3     3            3           13h   nginx        nginx    app=nginx

heiko@master:~$ curl 10.105.137.239:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

"oldy" is the host:

heiko@oldy:~$ curl 192.168.0.130:30747
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
nginx is running on worker1 and worker2. I can access the webpage from the master now, using the cluster-ip. Access also works from outside (from host "oldy") using the LoadBalancer.
As I type this, the host froze again, I got an OOM, and worker1 (its qemu-... process) was killed. I don't understand why a worker running in a VM would cause an OOM on the host. Need to look into that. Any ideas are welcome.
P.S.: All nodes / pods were running idle - no work load except kubernetes and the Firefox browser on the host.
P.P.S.: Something is wrong with this web forum page. I get delays and what not, and when accessing it from the Macbook it warns me that the page uses a lot of energy and suggests closing the window. This is very odd, and indeed when I'm on this forum page the laptop draws more battery power and slows down. Someone needs to look into that.
0 -
OOM problem solved: It was of course my mistake. I had reserved 24Gig huge pages out of a total of 32Gig at boot time. But I forgot to edit the worker VM configurations to add:
<memoryBacking>
  <hugepages/>
</memoryBacking>
Without the above option, the VMs use regular memory, not the huge pages. No wonder that everything ground to a halt when running 4 VMs each 4Gig and having only 8Gig left on the host, plus swap space. The host was essentially using memory and swap space to provide the memory that I had allocated.
I discovered this when I ran the stress container in lab 4.1 and scaled to 3 replicas, with an eye on the host memory:
watch -n 3 free -h
It quickly shrank to 460Mi (from 8Gig total).
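A quick way to spot this kind of mismatch is to compare reserved and used huge pages on the host while the VMs are running: if nothing is used, the guests are drawing on regular memory. A minimal sketch that reads the standard HugePages_Total, HugePages_Free and Hugepagesize fields of /proc/meminfo; `hugepage_summary` is a hypothetical helper name.

```shell
# hugepage_summary: hypothetical helper, prints how much memory is
# reserved for huge pages and how much of that is actually in use.
# Reads /proc/meminfo by default, or the file given as $1.
hugepage_summary() {
  awk '/^HugePages_Total:/ { total = $2 }
       /^HugePages_Free:/  { free = $2 }
       /^Hugepagesize:/    { size_kb = $2 }
       END {
         printf "reserved_MiB=%d used_MiB=%d\n",
                total * size_kb / 1024, (total - free) * size_kb / 1024
       }' "${1:-/proc/meminfo}"
}

# Usage on the host, with the VMs started:
#   hugepage_summary
```

With 24Gig reserved and four running VMs, a used value of 0 would point straight at the missing memoryBacking setting.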
0 -
Hi Heiko,
Glad you found the issue as I couldn't think of what could be causing a host OOM. As far as the forum causing issues, it may be tied to the browser you are using. Are you using chrome? Does a different browser have the same high utilization?
I only have Linux systems to test with, and don't typically use Chrome. But when I do use Chrome, and especially if I have more than one tab open, I will notice it consumes a lot of resources.
Regards,
0 -
@serewicz said:
Hi Heiko,...As far as the forum causing issues, it may be tied to the browser you are using. Are you using chrome? Does a different browser have the same high utilization?
I only have Linux systems to test with, and don't typically use Chrome. But when I do use Chrome, and especially if I have more than one tab open, I will notice it consumes a lot of resources.
Regards,
I have several systems that I'm using, mostly Linux:
On the Linux systems I run Firefox.
My Macbook runs Safari. I never use Chrome - seems we have the same experience. I typically have multiple tabs open (sometimes several dozen). However, until now this never caused any issue.
The message I got on the Macbook with the "using a lot of energy" was a first for me.
0 -
Hi @heiko_s,
I experienced similar behavior with Chrome on Ubuntu, and I suspected it had something to do with my browser plugins. I disabled most of them, yet I was still experiencing page freezes with the forum. I always thought it must be an isolated case caused by my setup and did not think much else of it.
Could it be related to the caching mechanism responsible for saving drafts?
Regards,
-Chris
0 -
I had the same issue when using Ubuntu 18.04.5 LTS for the master and worker. Everything worked fine until I applied the calico.yaml. Then docker was not able to pull the nginx image or do any DNS lookup; they all timed out. I found out it was due to systemd-resolved. When I removed calico.yaml again, docker was able to pull the nginx image and DNS lookups worked fine again. I solved it by using Debian 10 instead of Ubuntu. Never liked Ubuntu.
0