Lab 1c Step 5 - DaemonSet on my system lists 0
I ran the YAML provided in the lab and got "daemonset.apps/fluentd-ds created". However, when I run "kubectl get ds", I get this:
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd-ds   0         0         0       0            0           <none>          15m
Did I miss something here?
Comments
-
Can you post the results of the following:
kubectl describe ds fluentd-ds
kubectl get pods
I recall having to diagnose a couple of other issues with this daemonset; in the meantime I will look at what else could be the problem.
-
Kubernetes 1.24 (specifically kubeadm 1.24, which came out last Tuesday) introduced another control plane node taint (node-role.kubernetes.io/control-plane), which the current LFS242 docs are not equipped to handle.
There are a couple of ways to handle this:
- Remove the new control-plane taint with:
kubectl taint node --all node-role.kubernetes.io/control-plane-
- Add another entry under the tolerations key in the DaemonSet's pod template. Basically going from this:
spec:
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  terminationGracePeriodSeconds: 30
to
spec:
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  - key: node-role.kubernetes.io/control-plane
    effect: NoSchedule
  terminationGracePeriodSeconds: 30
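If you want to confirm which taints your control plane node actually carries before choosing an option, you can inspect the node (this is just a check, not a lab step; substitute your own node name):
kubectl get nodes
kubectl describe node <your-node-name> | grep -i taint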
-
kubectl describe ds fluentd-ds
Name: fluentd-ds
Selector: k8s-app=fluentd-logging
Node-Selector:
Labels: k8s-app=fluentd-logging
version=v1
Annotations: deprecated.daemonset.template.generation: 1
Desired Number of Nodes Scheduled: 0
Current Number of Nodes Scheduled: 0
Number of Nodes Scheduled with Up-to-date Pods: 0
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: k8s-app=fluentd-logging
version=v1
Containers:
fluentd-ds:
Image: fluent/fluentd:latest
Port:
Host Port:
Limits:
memory: 200Mi
Environment:
FLUENTD_CONF: fluentd-kube.conf
Mounts:
/fluentd/etc from fluentd-conf (rw)
Volumes:
fluentd-conf:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: fluentd-config
Optional: false
Events:
-
$ kubectl get ds
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd-ds   1         1         0       1            0           <none>          36m

$ kubectl describe ds fluentd-ds
Name: fluentd-ds
Selector: k8s-app=fluentd-logging
Node-Selector:
Labels: k8s-app=fluentd-logging
version=v1
Annotations: deprecated.daemonset.template.generation: 2
Desired Number of Nodes Scheduled: 1
Current Number of Nodes Scheduled: 1
Number of Nodes Scheduled with Up-to-date Pods: 1
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: k8s-app=fluentd-logging
version=v1
Containers:
fluentd-ds:
Image: fluent/fluentd:latest
Port:
Host Port:
Limits:
memory: 200Mi
Environment:
FLUENTD_CONF: fluentd-kube.conf
Mounts:
/fluentd/etc from fluentd-conf (rw)
Volumes:
fluentd-conf:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: fluentd-config
Optional: false
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 4m34s daemonset-controller Created pod: fluentd-ds-d7dzr
Normal SuccessfulDelete 3m12s daemonset-controller Deleted pod: fluentd-ds-d7dzr
Normal SuccessfulCreate 3m12s daemonset-controller Created pod: fluentd-ds-tzbqw
-
This is the pod info: (Note: I have redacted the Node information)
$ kubectl describe pods
Name: fluentd-ds-tzbqw
Namespace: default
Priority: 0
Node: ip-##-###-##-###/##.###.##.###
Start Time: Tue, 10 May 2022 22:24:56 +0000
Labels: controller-revision-hash=85777dbb94
k8s-app=fluentd-logging
pod-template-generation=2
version=v1
Annotations:
Status: Pending
IP:
IPs:
Controlled By: DaemonSet/fluentd-ds
Containers:
fluentd-ds:
Container ID:
Image: fluent/fluentd:latest
Image ID:
Port:
Host Port:
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
memory: 200Mi
Requests:
memory: 200Mi
Environment:
FLUENTD_CONF: fluentd-kube.conf
Mounts:
/fluentd/etc from fluentd-conf (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tl27j (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
fluentd-conf:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: fluentd-config
Optional: false
kube-api-access-tl27j:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Burstable
Node-Selectors:
Tolerations: node-role.kubernetes.io/control-plane:NoSchedule
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned default/fluentd-ds-tzbqw to ip-##-###-##-###
Warning FailedMount 10m kubelet MountVolume.SetUp failed for volume "fluentd-conf" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedCreatePodSandBox 10m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3ac30f1bf7ed618c3f76015d80806cb5bf5cc7714744fed696b2654f2005796f": failed to find network info for sandbox "3ac30f1bf7ed618c3f76015d80806cb5bf5cc7714744fed696b2654f2005796f"
Warning FailedCreatePodSandBox 9m49s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f741a2b2f91f0d7e9e2b9c306c7a932d320ae6d69ba31fa70d55a973657292b2": failed to find network info for sandbox "f741a2b2f91f0d7e9e2b9c306c7a932d320ae6d69ba31fa70d55a973657292b2"
Warning FailedCreatePodSandBox 9m36s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e768452f57755756b054aa2e700488e0ace12d3aca567ea40a7967acc13f183f": failed to find network info for sandbox "e768452f57755756b054aa2e700488e0ace12d3aca567ea40a7967acc13f183f"
Warning FailedCreatePodSandBox 9m24s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5ee1c66e96f93eaa0a01be450ce49d6b8c3207f599d376ec4812b788f3203c1c": failed to find network info for sandbox "5ee1c66e96f93eaa0a01be450ce49d6b8c3207f599d376ec4812b788f3203c1c"
Warning FailedCreatePodSandBox 9m10s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "194b530ee6e70e08d0f44d6ac349e78044fba6b5837bdcf6377878bc6d98d148": failed to find network info for sandbox "194b530ee6e70e08d0f44d6ac349e78044fba6b5837bdcf6377878bc6d98d148"
Warning FailedCreatePodSandBox 8m56s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3d0532a468ca557bf72b4d3f6d0177539b077f3d7d637ef84e099cda0e379dee": failed to find network info for sandbox "3d0532a468ca557bf72b4d3f6d0177539b077f3d7d637ef84e099cda0e379dee"
Warning FailedCreatePodSandBox 8m41s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "47948cc827e4f2a77c1cdfc36a525277595494a3dff8e73fc1a74ce58f4794ab": failed to find network info for sandbox "47948cc827e4f2a77c1cdfc36a525277595494a3dff8e73fc1a74ce58f4794ab"
Warning FailedCreatePodSandBox 8m26s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d92b4e4c7ae7b8e40b1c7c2ffbf6f0c4ea455654b08bcaeea20597a8bce95c2f": failed to find network info for sandbox "d92b4e4c7ae7b8e40b1c7c2ffbf6f0c4ea455654b08bcaeea20597a8bce95c2f"
Warning FailedCreatePodSandBox 8m15s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "8cb48d3687dcb16e73f1b0b0b29cbe5728c3c90d42a85705485761c2cc9bbfc3": failed to find network info for sandbox "8cb48d3687dcb16e73f1b0b0b29cbe5728c3c90d42a85705485761c2cc9bbfc3"
Warning FailedCreatePodSandBox 4m40s (x16 over 8m3s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "769e461876b24b0a83930a286cc4a9313916d85c2872a65af200c82eb7cd9857": failed to find network info for sandbox "769e461876b24b0a83930a286cc4a9313916d85c2872a65af200c82eb7cd9857"
-
Ok, can you run:
kubectl get nodes
kubectl describe node <name of the node>
kubectl get pods -n kube-system
-
$ kubectl get nodes
NAME               STATUS   ROLES           AGE    VERSION
ip-##-###-##-###   Ready    control-plane   125m   v1.24.0

$ kubectl describe node ip-##-###-##-###
Name: ip-##-###-##-###
Roles: control-plane
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=ip-##-###-##-###
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 10 May 2022 20:49:38 +0000
Taints:
Unschedulable: false
Lease:
HolderIdentity: ip-##-###-##-###
AcquireTime:
RenewTime: Tue, 10 May 2022 22:55:26 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 10 May 2022 21:14:11 +0000 Tue, 10 May 2022 21:14:11 +0000 WeaveIsUp Weave pod has set this
MemoryPressure False Tue, 10 May 2022 22:51:21 +0000 Tue, 10 May 2022 20:49:35 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 10 May 2022 22:51:21 +0000 Tue, 10 May 2022 20:49:35 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 10 May 2022 22:51:21 +0000 Tue, 10 May 2022 20:49:35 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 10 May 2022 22:51:21 +0000 Tue, 10 May 2022 21:14:15 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: ##.###.##.###
Hostname: ip-##-###-##-###
Capacity:
cpu: 2
ephemeral-storage: 30428560Ki
hugepages-2Mi: 0
memory: 8139472Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 28042960850
hugepages-2Mi: 0
memory: 8037072Ki
pods: 110
System Info:
Machine ID: 5a1fb6aff7f74c4c824ceecb54c86725
System UUID: ec2423c5-ec39-cf74-f3ce-77cadc5e1cc6
Boot ID: eda8788e-0a3c-4dd7-ba06-948853e31fee
Kernel Version: 5.13.0-1022-aws
OS Image: Ubuntu 20.04.4 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.4
Kubelet Version: v1.24.0
Kube-Proxy Version: v1.24.0
Non-terminated Pods: (9 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default fluentd-ds-tzbqw 0 (0%) 0 (0%) 200Mi (2%) 200Mi (2%) 30m
kube-system coredns-6d4b75cb6d-ng4t6 100m (5%) 0 (0%) 70Mi (0%) 170Mi (2%) 125m
kube-system coredns-6d4b75cb6d-rvnsh 100m (5%) 0 (0%) 70Mi (0%) 170Mi (2%) 125m
kube-system etcd-ip-##-###-##-### 100m (5%) 0 (0%) 100Mi (1%) 0 (0%) 125m
kube-system kube-apiserver-ip-##-###-##-### 250m (12%) 0 (0%) 0 (0%) 0 (0%) 125m
kube-system kube-controller-manager-ip-##-###-##-### 200m (10%) 0 (0%) 0 (0%) 0 (0%) 125m
kube-system kube-proxy-fjrl9 0 (0%) 0 (0%) 0 (0%) 0 (0%) 125m
kube-system kube-scheduler-ip-##-###-##-### 100m (5%) 0 (0%) 0 (0%) 0 (0%) 125m
kube-system weave-net-fpmpt 100m (5%) 0 (0%) 200Mi (2%) 0 (0%) 101m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 950m (47%) 0 (0%)
memory 640Mi (8%) 540Mi (6%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:

$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6d4b75cb6d-ng4t6 0/1 ContainerCreating 0 125m
coredns-6d4b75cb6d-rvnsh 0/1 ContainerCreating 0 125m
etcd-ip-##-###-##-### 1/1 Running 0 125m
kube-apiserver-##-###-##-### 1/1 Running 0 125m
kube-controller-manager-ip-##-###-##-### 1/1 Running 0 125m
kube-proxy-fjrl9 1/1 Running 0 125m
kube-scheduler-ip-##-###-##-### 1/1 Running 0 125m
weave-net-fpmpt 2/2 Running 1 (101m ago) 101m
-
Those coredns pods are having some trouble. Can you describe one of them with
kubectl describe pod coredns-...
-
I get "Error from server (NotFound): pods "coredns-6d4b75cb6d-rvnsh" not found" from both.
-
Right, you need to add -n kube-system to look those pods up (they exist in the kube-system namespace).
-
$ kubectl -n kube-system describe pod coredns-6d4b75cb6d-rvnsh
Name: coredns-6d4b75cb6d-rvnsh
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: ip-##-###-##-###/##.###.##.###
Start Time: Tue, 10 May 2022 21:14:15 +0000
Labels: k8s-app=kube-dns
pod-template-hash=6d4b75cb6d
Annotations:
Status: Pending
IP:
IPs:
Controlled By: ReplicaSet/coredns-6d4b75cb6d
Containers:
coredns:
Container ID:
Image: k8s.gcr.io/coredns/coredns:v1.8.6
Image ID:
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-srmb5 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
kube-api-access-srmb5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreatePodSandBox 29s (x559 over 123m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5b25c77911d0171de59c384205e0f655dee36a9b312b35beeb727391902074d8": failed to find network info for sandbox "5b25c77911d0171de59c384205e0f655dee36a9b312b35beeb727391902074d8"
-
At this point we're a bit far into a rabbit hole with containerd and coredns - I think switching back over to Docker might be a quicker path. If you can, start anew with a different machine.
- Reset your cluster using:
sudo kubeadm reset
sudo rm -drf /etc/cni/net.d/
- Set up Docker to use systemd as its cgroup manager:
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl restart docker
- Install cri-dockerd (replaces dockershim, which was removed in v1.24):
VER=$(curl -s https://api.github.com/repos/Mirantis/cri-dockerd/releases/latest|grep tag_name | cut -d '"' -f 4)
wget https://github.com/Mirantis/cri-dockerd/releases/download/${VER}/cri-dockerd-${VER}-linux-amd64.tar.gz
tar xvf cri-dockerd-${VER}-linux-amd64.tar.gz
sudo mv cri-dockerd /usr/local/bin/
sudo wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/50c048cb54e52cd9058f044671e309e9fbda82e4/packaging/systemd/cri-docker.service
sudo wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/50c048cb54e52cd9058f044671e309e9fbda82e4/packaging/systemd/cri-docker.socket
sudo mv cri-docker.socket cri-docker.service /etc/systemd/system/
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
sudo mkdir -p /etc/systemd/system/cri-docker.service.d/
cat <<EOF | sudo tee /etc/systemd/system/cri-docker.service.d/cni.conf
[Service]
ExecStart=
ExecStart=/usr/local/bin/cri-dockerd --container-runtime-endpoint fd:// --network-plugin=cni --cni-bin-dir=/opt/cni/bin --cni-cache-dir=/var/lib/cni/cache --cni-conf-dir=/etc/cni/net.d
EOF
sudo systemctl daemon-reload
sudo systemctl enable cri-docker.service
sudo systemctl enable --now cri-docker.socket
- Re-initialize the cluster (the optional sanity checks after this list can help confirm the previous steps first) with:
sudo kubeadm init --cri-socket=unix:///run/cri-dockerd.sock
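Before running the kubeadm init above, a couple of optional sanity checks (my suggestion, not part of the lab steps) can confirm Docker picked up the systemd cgroup driver and that the cri-dockerd socket exists:
docker info --format '{{.CgroupDriver}}'     # should print "systemd"
sudo systemctl status cri-docker.socket --no-pager
ls -l /run/cri-dockerd.sock                  # the socket kubeadm init will point at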
Apologies I could not help you get this current cluster working - the v1.24 update introduced quite a lot of changes that affected this particular lab.
-
Thanks for the detailed steps. What steps should I take after these?
-
Complete the rest of step 3 (summarized below):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl taint nodes --all node-role.kubernetes.io/master- node-role.kubernetes.io/control-plane-
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
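Once the Weave manifest is applied, you can watch for the weave-net and coredns pods to come up (just a convenience, not a lab step):
kubectl get pods -n kube-system -w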
Then post the results of:
kubectl get nodes
kubectl describe node <your node>
kubectl get pods -n kube-system
-
The /etc/kubernetes/admin.conf file does not exist. What should I do next?
-
It should exist after running:
sudo kubeadm init --cri-socket=unix:///run/cri-dockerd.sock
Can you post the output of that command?
-
Your last command did the trick, and this looks much better. Thank you very much for the help. If you don't mind, could you please recap/summarize what happened and how you resolved it? To be honest, I am a bit lost and trying to figure out whether I messed up a step or whether this is a Kubernetes/Docker thing.
ubuntu@ip-##-###-##-###:~$ kubectl describe node ip-##-###-##-###
Name: ip-##-###-##-###
Roles: control-plane
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=ip-##-###-##-###
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/cri-dockerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 11 May 2022 17:32:14 +0000
Taints:
Unschedulable: false
Lease:
HolderIdentity: ip-##-###-##-###
AcquireTime:
RenewTime: Wed, 11 May 2022 17:36:22 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Wed, 11 May 2022 17:35:55 +0000 Wed, 11 May 2022 17:35:55 +0000 WeaveIsUp Weave pod has set this
MemoryPressure False Wed, 11 May 2022 17:36:22 +0000 Wed, 11 May 2022 17:32:12 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 11 May 2022 17:36:22 +0000 Wed, 11 May 2022 17:32:12 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 11 May 2022 17:36:22 +0000 Wed, 11 May 2022 17:32:12 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 11 May 2022 17:36:22 +0000 Wed, 11 May 2022 17:36:02 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: ##.###.##.###
Hostname: ip-##-###-##-###
Capacity:
cpu: 2
ephemeral-storage: 30428560Ki
hugepages-2Mi: 0
memory: 8139472Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 28042960850
hugepages-2Mi: 0
memory: 8037072Ki
pods: 110
System Info:
Machine ID: 5a1fb6aff7f74c4c824ceecb54c86725
System UUID: ec2423c5-ec39-cf74-f3ce-77cadc5e1cc6
Boot ID: eda8788e-0a3c-4dd7-ba06-948853e31fee
Kernel Version: 5.13.0-1022-aws
OS Image: Ubuntu 20.04.4 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.15
Kubelet Version: v1.24.0
Kube-Proxy Version: v1.24.0
Non-terminated Pods: (8 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system coredns-6d4b75cb6d-2r5nk 100m (5%) 0 (0%) 70Mi (0%) 170Mi (2%) 4m5s
kube-system coredns-6d4b75cb6d-c7cx2 100m (5%) 0 (0%) 70Mi (0%) 170Mi (2%) 4m4s
kube-system etcd-ip-##-###-##-### 100m (5%) 0 (0%) 100Mi (1%) 0 (0%) 4m12s
kube-system kube-apiserver-ip-##-###-##-### 250m (12%) 0 (0%) 0 (0%) 0 (0%) 4m10s
kube-system kube-controller-manager-ip-##-###-##-### 200m (10%) 0 (0%) 0 (0%) 0 (0%) 4m10s
kube-system kube-proxy-fm2vr 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4m5s
kube-system kube-scheduler-ip-##-###-##-### 100m (5%) 0 (0%) 0 (0%) 0 (0%) 4m12s
kube-system weave-net-gfffz 100m (5%) 0 (0%) 200Mi (2%) 0 (0%) 41s
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 950m (47%) 0 (0%)
memory 440Mi (5%) 340Mi (4%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 4m3s kube-proxy
Normal NodeHasSufficientMemory 4m22s (x5 over 4m22s) kubelet Node ip-##-###-##-### status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 4m22s (x4 over 4m22s) kubelet Node ip-##-###-##-### status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 4m22s (x4 over 4m22s) kubelet Node ip-##-###-##-### status is now: NodeHasSufficientPID
Normal NodeHasSufficientMemory 4m11s kubelet Node ip-##-###-##-### status is now: NodeHasSufficientMemory
Warning InvalidDiskCapacity 4m11s kubelet invalid capacity 0 on image filesystem
Normal NodeHasNoDiskPressure 4m11s kubelet Node ip-##-###-##-### status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 4m11s kubelet Node ip-##-###-##-### status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 4m11s kubelet Updated Node Allocatable limit across pods
Normal Starting 4m11s kubelet Starting kubelet.
Normal RegisteredNode 4m6s node-controller Node ip-##-###-##-### event: Registered Node ip-##-###-##-### in Controller
Normal NodeReady 26s kubelet Node ip-##-###-##-### status is now: NodeReady

ubuntu@ip-##-###-##-###:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6d4b75cb6d-2r5nk 1/1 Running 0 4m24s
coredns-6d4b75cb6d-c7cx2 1/1 Running 0 4m23s
etcd-ip-##-###-##-### 1/1 Running 1 4m31s
kube-apiserver-ip-##-###-##-### 1/1 Running 1 4m29s
kube-controller-manager-ip-##-###-##-### 1/1 Running 1 4m29s
kube-proxy-fm2vr 1/1 Running 0 4m24s
kube-scheduler-ip-##-###-##-### 1/1 Running 1 4m31s
weave-net-gfffz 2/2 Running 0 60s

ubuntu@ip-##-###-##-###:~$ kubectl cluster-info
Kubernetes control plane is running at https://##.###.##.###:6443
CoreDNS is running at https://##.###.##.###:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
-
Sure, what you faced was a Kubernetes/Docker issue. You did not miss any steps. I will try to be brief:
- Kubernetes 1.24 removed the in-code interface (known as Dockershim) that allowed Kubernetes to use Docker, which the lab relied on
- Your original fix, I believe, had Kubernetes use containerd, which gets installed alongside Docker when you use the get.docker.com script
- Part of setting up a container runtime to work with Kubernetes involves telling the container runtime to implement a networking strategy (usually CNI) - I do not think that was accounted for during the fix and this caused the network to not initialize properly
- To fix this, I had you install cri-dockerd, which is the same Dockershim code (as far as I know) but now hosted and maintained outside of the Kubernetes codebase (https://github.com/Mirantis/cri-dockerd). This allows Kubernetes to properly interact with Docker just like we originally wrote the lab to, and we also set it up to use CNI (its service has --network-plugin=cni for this).
- After installing cri-dockerd, I had you re-init the cluster with Docker as the underlying container runtime through the cri-dockerd socket.
To summarize: it seems containerd was not properly configured to support the Kubernetes container network interface (CNI) plugin, which caused coredns to fail. Reinitializing the cluster to use Docker with CNI support properly enabled was the fix.
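If you ever want to double-check which runtime a node ended up using after a change like this, the wide node listing shows it (illustrative, not a required step):
kubectl get nodes -o wide
The CONTAINER-RUNTIME column should now show docker:// rather than containerd://, matching the Container Runtime Version line in the describe output above.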
-
Thank you very much again for the clarification. Greatly appreciate the help and quick responses.
-
One more thing: will the lab manual be updated? If I redo the lab some time in the future, I want to make sure I perform the correct steps, because I am sure I will forget what happened here.
Once it's updated, I would like to get the latest copy.
-
There is a bit of a process that I need to follow in order to get an update out there; I'll see if I can get that started.
In the meantime, I'll pin or use this post as a reference. Thanks for sticking with it! If you have any more issues or questions, please continue to post on this forum.