CoreDNS CrashLoopBackOff
I have set up the cluster without error and am running through Lab 2.1. I noticed that the CoreDNS pods are failing.
I am running the nodes on bare metal.
Debug info is:
kubectl get pod -n kube-system
NAME                                       READY   STATUS             RESTARTS   AGE
calico-etcd-wr2cf                          1/1     Running            3          13h
calico-kube-controllers-57c8947c94-g2lbc   1/1     Running            3          13h
calico-node-lsjm9                          2/2     Running            17         13h
calico-node-zhgnd                          2/2     Running            9          13h
coredns-576cbf47c7-56thg                   0/1     CrashLoopBackOff   54         13h
coredns-576cbf47c7-nmznf                   0/1     CrashLoopBackOff   54         13h
etcd-nuc1                                  1/1     Running            4          13h
kube-apiserver-nuc1                        1/1     Running            4          13h
kube-controller-manager-nuc1               1/1     Running            3          13h
kube-proxy-ct89j                           1/1     Running            3          13h
kube-proxy-lbdxr                           1/1     Running            5          13h
kube-scheduler-nuc1                        1/1     Running            3          13h
kubectl describe pods -n kube-system coredns-576cbf47c7-56thg
Name: coredns-576cbf47c7-56thg
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: nuc1/10.10.0.53
Start Time: Sat, 29 Dec 2018 23:06:32 +1100
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: <none>
Status: Running
IP: 192.168.21.71
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID: docker://5491ac6a53be7f653036af7baaecfb318679882d3ad4b60c7c02b8846f3a4f9d
Image: k8s.gcr.io/coredns:1.2.2
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sun, 30 Dec 2018 12:39:49 +1100
Finished: Sun, 30 Dec 2018 12:39:50 +1100
Ready: False
Restart Count: 54
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-zwdp6 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-zwdp6:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-zwdp6
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13h default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-56thg to nuc1
Warning NetworkNotReady 13h (x8 over 13h) kubelet, nuc1 network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
Normal Pulled 13h (x4 over 13h) kubelet, nuc1 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Created 13h (x4 over 13h) kubelet, nuc1 Created container
Normal Started 13h (x4 over 13h) kubelet, nuc1 Started container
Warning BackOff 12h (x255 over 13h) kubelet, nuc1 Back-off restarting failed container
Normal SandboxChanged 3h31m (x2 over 3h31m) kubelet, nuc1 Pod sandbox changed, it will be killed and re-created.
Warning BackOff 3h31m (x3 over 3h31m) kubelet, nuc1 Back-off restarting failed container
Normal Pulled 3h30m (x2 over 3h31m) kubelet, nuc1 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Created 3h30m (x2 over 3h31m) kubelet, nuc1 Created container
Normal Started 3h30m (x2 over 3h31m) kubelet, nuc1 Started container
Normal Pulled 3h29m (x4 over 3h30m) kubelet, nuc1 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Created 3h29m (x4 over 3h30m) kubelet, nuc1 Created container
Normal Started 3h29m (x4 over 3h30m) kubelet, nuc1 Started container
Warning BackOff 3h5m (x124 over 3h30m) kubelet, nuc1 Back-off restarting failed container
Warning FailedMount 92m kubelet, nuc1 MountVolume.SetUp failed for volume "coredns-token-zwdp6" : couldn't propagate object cache: timed out waiting for the condition
Normal SandboxChanged 91m (x2 over 92m) kubelet, nuc1 Pod sandbox changed, it will be killed and re-created.
Normal Pulled 90m (x4 over 91m) kubelet, nuc1 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Created 90m (x4 over 91m) kubelet, nuc1 Created container
Normal Started 90m (x4 over 91m) kubelet, nuc1 Started container
Warning BackOff 57m (x169 over 91m) kubelet, nuc1 Back-off restarting failed container
Normal SandboxChanged 49m (x2 over 50m) kubelet, nuc1 Pod sandbox changed, it will be killed and re-created.
Normal Pulled 47m (x4 over 49m) kubelet, nuc1 Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
Normal Created 47m (x4 over 49m) kubelet, nuc1 Created container
Normal Started 47m (x4 over 49m) kubelet, nuc1 Started container
Warning BackOff 4m49s (x214 over 49m) kubelet, nuc1 Back-off restarting failed container
Comments
-
@bryonbaker ,
You can try to delete the 2 coredns pods, and they will be re-created.
Are you in Lab 2.1 of LFD259?
Thanks,
-Chris
-
Hi,
The issue is actually thoroughly documented on the CoreDNS website. It happens because CoreDNS detects a forwarding loop and terminates; this is expected behaviour. The solution is to change the DNS settings in /etc/resolv.conf. For those using Ubuntu, I have documented what to do here, as it can be tricky, especially with the Ubuntu Desktop edition.
There are other ways to solve it, but in the end I set up an external DNS server with bind9 for resolving hostnames. Overkill, I know.
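For anyone who wants to check whether their node is likely to hit the same loop, here is a minimal sketch in Python. It assumes the loop comes from a loopback nameserver entry (for example 127.0.0.53 on Ubuntu with systemd-resolved) in the resolv.conf that the node hands to CoreDNS. The function name loopback_nameservers is made up for illustration and is not part of any Kubernetes or CoreDNS tooling.

```python
import ipaddress


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver addresses in resolv.conf text that are loopback addresses."""
    found = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines ('#' or ';' at the start).
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        try:
            address = ipaddress.ip_address(fields[1])
        except ValueError:
            # Not a plain IP address; ignore it rather than guess.
            continue
        if address.is_loopback:
            found.append(fields[1])
    return found


# Example with an inline sample instead of the real file:
sample = "nameserver 127.0.0.53\noptions edns0\nsearch lan\n"
print(loopback_nameservers(sample))
```

On a real node you would pass in the contents of /etc/resolv.conf instead of the sample string. A non-empty result only suggests a possible loop; the CoreDNS pod logs are what confirm it.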