coreDNS CrashLoopBackoff

bryonbaker · December 2018

I set up the cluster without error and am running through Lab 2.1. I noticed that the CoreDNS pods are failing.

I am running the nodes on bare metal.

Debug info is:

kubectl get pod -n kube-system

NAME                                       READY   STATUS             RESTARTS   AGE
calico-etcd-wr2cf                          1/1     Running            3          13h
calico-kube-controllers-57c8947c94-g2lbc   1/1     Running            3          13h
calico-node-lsjm9                          2/2     Running            17         13h
calico-node-zhgnd                          2/2     Running            9          13h
coredns-576cbf47c7-56thg                   0/1     CrashLoopBackOff   54         13h
coredns-576cbf47c7-nmznf                   0/1     CrashLoopBackOff   54         13h
etcd-nuc1                                  1/1     Running            4          13h
kube-apiserver-nuc1                        1/1     Running            4          13h
kube-controller-manager-nuc1               1/1     Running            3          13h
kube-proxy-ct89j                           1/1     Running            3          13h
kube-proxy-lbdxr                           1/1     Running            5          13h
kube-scheduler-nuc1                        1/1     Running            3          13h

kubectl describe pods -n kube-system coredns-576cbf47c7-56thg

Name:               coredns-576cbf47c7-56thg
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               nuc1/10.10.0.53
Start Time:         Sat, 29 Dec 2018 23:06:32 +1100
Labels:             k8s-app=kube-dns
                    pod-template-hash=576cbf47c7
Annotations:        <none>
Status:             Running
IP:                 192.168.21.71
Controlled By:      ReplicaSet/coredns-576cbf47c7
Containers:
  coredns:
    Container ID:  docker://5491ac6a53be7f653036af7baaecfb318679882d3ad4b60c7c02b8846f3a4f9d
    Image:         k8s.gcr.io/coredns:1.2.2
    Image ID:      docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 30 Dec 2018 12:39:49 +1100
      Finished:     Sun, 30 Dec 2018 12:39:50 +1100
    Ready:          False
    Restart Count:  54
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-zwdp6 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-zwdp6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-zwdp6
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason           Age                     From               Message
  ----     ------           ----                    ----               -------
  Normal   Scheduled        13h                     default-scheduler  Successfully assigned kube-system/coredns-576cbf47c7-56thg to nuc1
  Warning  NetworkNotReady  13h (x8 over 13h)       kubelet, nuc1      network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
  Normal   Pulled           13h (x4 over 13h)       kubelet, nuc1      Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
  Normal   Created          13h (x4 over 13h)       kubelet, nuc1      Created container
  Normal   Started          13h (x4 over 13h)       kubelet, nuc1      Started container
  Warning  BackOff          12h (x255 over 13h)     kubelet, nuc1      Back-off restarting failed container
  Normal   SandboxChanged   3h31m (x2 over 3h31m)   kubelet, nuc1      Pod sandbox changed, it will be killed and re-created.
  Warning  BackOff          3h31m (x3 over 3h31m)   kubelet, nuc1      Back-off restarting failed container
  Normal   Pulled           3h30m (x2 over 3h31m)   kubelet, nuc1      Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
  Normal   Created          3h30m (x2 over 3h31m)   kubelet, nuc1      Created container
  Normal   Started          3h30m (x2 over 3h31m)   kubelet, nuc1      Started container
  Normal   Pulled           3h29m (x4 over 3h30m)   kubelet, nuc1      Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
  Normal   Created          3h29m (x4 over 3h30m)   kubelet, nuc1      Created container
  Normal   Started          3h29m (x4 over 3h30m)   kubelet, nuc1      Started container
  Warning  BackOff          3h5m (x124 over 3h30m)  kubelet, nuc1      Back-off restarting failed container
  Warning  FailedMount      92m                     kubelet, nuc1      MountVolume.SetUp failed for volume "coredns-token-zwdp6" : couldn't propagate object cache: timed out waiting for the condition
  Normal   SandboxChanged   91m (x2 over 92m)       kubelet, nuc1      Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled           90m (x4 over 91m)       kubelet, nuc1      Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
  Normal   Created          90m (x4 over 91m)       kubelet, nuc1      Created container
  Normal   Started          90m (x4 over 91m)       kubelet, nuc1      Started container
  Warning  BackOff          57m (x169 over 91m)     kubelet, nuc1      Back-off restarting failed container
  Normal   SandboxChanged   49m (x2 over 50m)       kubelet, nuc1      Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled           47m (x4 over 49m)       kubelet, nuc1      Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
  Normal   Created          47m (x4 over 49m)       kubelet, nuc1      Created container
  Normal   Started          47m (x4 over 49m)       kubelet, nuc1      Started container
  Warning  BackOff          4m49s (x214 over 49m)   kubelet, nuc1      Back-off restarting failed container

chrispokorni · December 2018

@bryonbaker,
You can try to delete the 2 coredns pods, and they will be re-created.
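A minimal sketch of that step, assuming the pods carry the `k8s-app=kube-dns` label shown in the `kubectl describe` output above. The function name `restart_coredns` and the `KUBECTL` override variable are hypothetical, added only so the command can be swapped out or dry-run:

```shell
# Delete every CoreDNS pod in kube-system by label; the ReplicaSet
# (coredns-576cbf47c7 in this thread) then creates replacements.
# KUBECTL is a hypothetical override; it defaults to plain kubectl.
restart_coredns() {
    "${KUBECTL:-kubectl}" delete pod -n kube-system -l k8s-app=kube-dns
}
```

Call `restart_coredns`, then watch the replacements with `kubectl get pod -n kube-system -w`. This only restarts the pods: if whatever makes the container exit is still present, the new pods will go back into CrashLoopBackOff.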
Are you in Lab 2.1 of LFD259?
Thanks,
-Chris

bryonbaker · January 2019

Hi,
The issue is actually thoroughly documented on the CoreDNS website. CoreDNS detects a DNS forwarding loop and terminates, so the crash is expected behaviour.

The solution is to change the DNS setting in /etc/resolv.conf. For those using Ubuntu, I have documented what to do here, as it can be tricky, especially with the Ubuntu Desktop edition.
There are other ways to solve it, but in the end I set up an external DNS server with bind9 for resolving hostnames. Overkill, I know.
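To check whether a node is likely to be affected, one rough approach is to look for loopback nameservers in the node's resolv.conf, since a loopback upstream is what sends CoreDNS queries back to itself. The helper below is only a sketch; the function name `loopback_nameservers` is hypothetical, and it assumes the standard `nameserver <address>` line format:

```shell
# Print every nameserver in a resolv.conf-style file that points at a
# loopback address (127.0.0.0/8 or ::1). Prints nothing if none do.
# The file path defaults to /etc/resolv.conf.
loopback_nameservers() {
    file="${1:-/etc/resolv.conf}"
    awk '$1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { print $2 }' "$file"
}
```

Run `loopback_nameservers` on each node. On Ubuntu systems that use systemd-resolved, this typically prints 127.0.0.53, the local stub resolver. The container's own output, from `kubectl logs -n kube-system coredns-576cbf47c7-56thg`, should confirm whether loop detection is what terminated it.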
