Coredns pods are Running but Not in Ready state

Hi experts,

I have run into a situation where the CoreDNS pods are Running but never become Ready. I have seen many posts about this and tried a few of the suggested fixes, but none of them resolved it. Any help would be appreciated.

Master node info:

root@pf-cloud$ cat /etc/os-release
NAME="Common Base Linux Mariner"
VERSION="2.0.20230407"
ID=mariner
VERSION_ID="2.0"
PRETTY_NAME="CBL-Mariner/Linux"
ANSI_COLOR="1;34"
HOME_URL="https://aka.ms/cbl-mariner"
BUG_REPORT_URL="https://aka.ms/cbl-mariner"
SUPPORT_URL="https://aka.ms/cbl-mariner"

Steps followed:

STEP 1: sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=40.40.40.1 --cri-socket=unix:///var/run/containerd/containerd.sock --node-name=40.40.40.1
STEP 2: mkdir -p $HOME/.kube
STEP 3: sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
STEP 4: sudo chown $(id -u):$(id -g) $HOME/.kube/config
STEP 5: kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Some of the captured output:

root@pf-cloud$ kubectl logs pods/coredns-787d4945fb-6z9lt -n kube-system
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration SHA512 = 591cf328cccc12bc490481273e738df59329c62c0b729d94e8b61db9961c2fa5f046dd37f1cf888b953814040d180f52594972691cd6ff41be96639138a43908
CoreDNS-1.9.3
linux/amd64, go1.18.2, 45b0a11
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.96.0.1:443/version": dial tcp 10.96.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.96.0.1:443/version": dial tcp 10.96.0.1:443: i/o timeout

root@pf-cloud$ kubectl describe pods/coredns-787d4945fb-6z9lt -n kube-system
Name: coredns-787d4945fb-6z9lt
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Service Account: coredns
Node: 40.40.40.1/40.40.40.1
Start Time: Fri, 28 Apr 2023 22:52:55 +0000
Labels: k8s-app=kube-dns
pod-template-hash=787d4945fb
Annotations:
Status: Running
IP: 10.244.0.14
IPs:
IP: 10.244.0.14
Controlled By: ReplicaSet/coredns-787d4945fb
Containers:
coredns:
Container ID: containerd://e173dfd514cc80f5368b6de08876d8d45c0ca17a88c1c69facf44a33106ed2bb
Image: registry.k8s.io/coredns/coredns:v1.9.3
Image ID: registry.k8s.io/coredns/coredns@sha256:8e352a029d304ca7431c6507b56800636c321cb52289686a581ab70aaa8a2e2a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Fri, 28 Apr 2023 22:52:56 +0000
Ready: False
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ccdnt (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
kube-api-access-ccdnt:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
Warning Unhealthy 50s (x1803 over 4h25m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503

Comments

  • manasfirst
    Some more captures attached here.

  • chrispokorni
    chrispokorni Posts: 2,190

    Hi @manasfirst,

    Several of the steps noted above do not seem to be in sync with the lab guide: the OS release, the CNI plugin, and running kubectl as root. Is the API server advertising a public IP address?

    What are the outputs of the following commands?

    kubectl get nodes -o wide

    kubectl get pods -o wide -A

    What are the sizes of your nodes, and what VPC firewall rules do you have in place?

    Regards,
    -Chris

  • linchar
    linchar Posts: 1

    Hi @manasfirst ,

    I had a similar issue, and it turned out to be my firewall setup. Try disabling the firewall and see whether the problem persists.

    Best,

    Charles

  • jsr92
    jsr92 Posts: 1

    Same issue for me.
    After analysis:
    The CoreDNS container tries to reach https://10.96.0.1:443/version and times out because of the firewall on the node that hosts the control plane.

    This address (10.96.0.1) is a cluster IP that redirects to the default/kubernetes endpoint (generally a public IP address; assume 1.2.3.4 below) on port 6443 (the control plane).
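
    One way to check this path from the control-plane node itself is a plain TCP probe with a timeout. The sketch below is only an illustration: probe_tcp is a hypothetical helper name, and it assumes bash and the coreutils timeout command are installed. A probe from the host does not follow exactly the same path as pod traffic, so a success here does not prove that pods can connect, but a timeout is a strong hint that something is dropping packets.

    ```shell
    # probe_tcp HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper, not part of kubectl)
    # Returns 0 if a TCP connection can be opened, non-zero on refusal or timeout.
    probe_tcp() {
      local host=$1 port=$2 secs=${3:-5}
      # bash opens a TCP socket when /dev/tcp/HOST/PORT is used in a redirection;
      # timeout kills the attempt if the SYN is silently dropped.
      timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
    }

    # Example usage (addresses taken from this thread; 1.2.3.4 is the placeholder above):
    #   probe_tcp 10.96.0.1 443 && echo "reachable" || echo "refused or timed out"
    #   probe_tcp 1.2.3.4 6443  && echo "reachable" || echo "refused or timed out"
    ```

    If the cluster IP times out but the endpoint address on port 6443 answers, the service translation rather than the API server is the more likely suspect; if both time out, look at the firewall first.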

    If a node has an iptables rule that denies access to 1.2.3.4:6443, you will get this error. Add an accept rule (for example, limited to source 10.0.0.0/8, or a tighter restriction that fits your network) to make this work. Example rule for the control-plane node:
    iptables -A INPUT -p tcp -m tcp --dport 6443 -s 10.0.0.0/8 -m state --state NEW -j ACCEPT

    or, if eth0 is your public network interface:

    iptables -A INPUT -p tcp -m tcp --dport 6443 ! -i eth0 -m state --state NEW -j ACCEPT

    Note that -A appends to the end of the chain, so if an earlier rule already drops this traffic, insert the new rule ahead of it with -I instead.

    Or any other rule that allows this traffic.
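
    To find out whether such a deny rule exists in the first place, it can help to filter the rule listing for anything that drops or rejects packets. This is a rough sketch rather than a complete audit: find_blocking_rules is a hypothetical helper name, it expects the output format of iptables -S on stdin, and it will miss rules that jump to a custom chain which does the dropping.

    ```shell
    # find_blocking_rules   (hypothetical helper)
    # Reads `iptables -S` output on stdin and prints, with line numbers,
    # every rule that jumps to DROP or REJECT and every chain whose policy is DROP.
    find_blocking_rules() {
      grep -nE -e '-j (DROP|REJECT)' -e '^-P [A-Za-z0-9_-]+ DROP'
    }

    # Example usage on the control-plane node:
    #   sudo iptables -S INPUT   | find_blocking_rules
    #   sudo iptables -S FORWARD | find_blocking_rules
    ```

    Any match that sits before your accept rule for port 6443 is worth a closer look.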
