Exercise 3.3 - Worker node not ready
I have built the lab in Google Cloud as suggested, and I believe I have followed the instructions to the letter. However, after I joined my worker node to my cluster (Exercise 3.2), the worker node is in a "NotReady" state and has been like that for a few hours. I think it's the network connectivity between the nodes: when I try to ping one from the other I get "No route to host", yet if I deploy a 3rd VM on the same network, both nodes can ping the 3rd VM and the 3rd VM can ping the other two. Cheers.
Comments
-
Hi @jhurlstone,
Are all VMs in the same custom VPC, same subnet, and the VPC firewall rule allows all inbound traffic as per the demo video from the introductory chapter?
Regards,
-Chris
-
I believe that I followed the video exactly. I have given both the master and worker VMs a reboot and they can now ping each other. I have been running "kubectl get nodes" on the master periodically, and very occasionally it reports that the worker is ready, then a few seconds later it goes back to being "NotReady".
-
Hi @jhurlstone,
What are the outputs of
kubectl get nodes -o wide
kubectl get pods -A -o wide
Regards,
-Chris
-
kubectl get nodes -o wide
NAME     STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
master   Ready      control-plane   5h37m   v1.25.1   10.2.0.6      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
worker   NotReady   <none>          3h44m   v1.25.1   10.2.0.7      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
kubectl get pods -A -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS       AGE     IP            NODE     NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-74677b4c5f-p9c9c   1/1     Running   1 (128m ago)   5h29m   10.2.219.69   master   <none>           <none>
kube-system   calico-node-qhv2t                          0/1     Running   1 (30m ago)    40m     10.2.0.7      worker   <none>           <none>
kube-system   calico-node-tv5cx                          0/1     Running   1 (128m ago)   5h29m   10.2.0.6      master   <none>           <none>
kube-system   coredns-565d847f94-885qq                   1/1     Running   1 (128m ago)   5h38m   10.2.219.68   master   <none>           <none>
kube-system   coredns-565d847f94-rc2l2                   1/1     Running   1 (128m ago)   5h38m   10.2.219.70   master   <none>           <none>
kube-system   etcd-master                                1/1     Running   1 (128m ago)   5h38m   10.2.0.6      master   <none>           <none>
kube-system   kube-apiserver-master                      1/1     Running   1 (128m ago)   5h38m   10.2.0.6      master   <none>           <none>
kube-system   kube-controller-manager-master             1/1     Running   1 (128m ago)   5h38m   10.2.0.6      master   <none>           <none>
kube-system   kube-proxy-2xfv5                           1/1     Running   4 (30m ago)    3h44m   10.2.0.7      worker   <none>           <none>
kube-system   kube-proxy-rlsxq                           1/1     Running   1 (128m ago)   5h38m   10.2.0.6      master   <none>           <none>
kube-system   kube-scheduler-master                      1/1     Running   1 (128m ago)   5h38m   10.2.0.6      master   <none>           <none>
root@master:~#
-
Hi @jhurlstone,
What is the machine type of your GCE VMs?
Are you running kubectl as root? Why?
Regards,
-Chris
-
I may have spotted a typo in my installation: when I used "kubeadm-config.yaml", I did not replace the controlPlaneEndpoint: "k8scp:6443" value with the actual hostname. My hostname for the control plane VM is "master".
-
You don't have to replace it. As long as the alias is in the /etc/hosts file, it should all work. Also, make sure you have the correct control plane node IP in both the cp and worker /etc/hosts files.
-
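The alias check described above can be sketched as a small shell helper. This is only an illustration: check_alias is a hypothetical function name, and the IP and alias in the usage comment are the values quoted later in this thread.

```shell
# Hypothetical helper: succeeds only if the given hosts file maps IP to ALIAS.
# Run it on both the control plane and the worker node.
check_alias() {
  local hosts_file="$1" ip="$2" alias_name="$3"
  # Strip comments, then look for a line whose first field is the IP
  # and whose remaining fields include the alias.
  awk -v ip="$ip" -v name="$alias_name" '
    { sub(/#.*/, "") }
    $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit !found }
  ' "$hosts_file"
}

# Usage with the values from this thread:
#   check_alias /etc/hosts 10.2.0.6 k8scp && echo "alias OK"
```

If the helper fails on either node, the kubelet on that node may be unable to resolve the control plane endpoint.
-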
The machine types are "e2-standard-2".
-
Hi @jhurlstone,
Since you modified your installation and configured root with kubectl (not a good practice), what other changes have you made?
When running the following commands:
kubectl describe node worker
kubectl -n kube-system describe pod calico-node-tv5cx
What are the Events at the very bottom of both outputs?
Regards,
-Chris
-
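Since only the Events at the bottom of each output are of interest, the describe output can be trimmed before posting. A minimal sketch, assuming the Events section starts with a line beginning "Events:" (as in the pod output later in this thread); events_only is a hypothetical helper name.

```shell
# Hypothetical helper: print only the Events section of `kubectl describe`
# output read from stdin (from the "Events:" line to the end).
events_only() {
  sed -n '/^Events:/,$p'
}

# Usage:
#   kubectl describe node worker | events_only
#   kubectl -n kube-system describe pod calico-node-tv5cx | events_only
#
# Alternative that queries events directly for one object:
#   kubectl -n kube-system get events --field-selector involvedObject.name=calico-node-tv5cx
```
-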
I have double-checked /etc/hosts, which contains the same entry on both nodes: "10.2.0.6 k8scp". I have just run kubectl get nodes -o wide multiple times and, as you can see, the "worker" does occasionally become "Ready" then flicks back to "NotReady".
student@master:~$ kubectl get nodes -o wide
NAME     STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
master   Ready      control-plane   6h1m    v1.25.1   10.2.0.6      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
worker   Ready      <none>          4h8m    v1.25.1   10.2.0.7      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
student@master:~$ kubectl get nodes -o wide
NAME     STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
master   Ready      control-plane   6h1m    v1.25.1   10.2.0.6      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
worker   Ready      <none>          4h8m    v1.25.1   10.2.0.7      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
student@master:~$ kubectl get nodes -o wide
NAME     STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
master   Ready      control-plane   6h2m    v1.25.1   10.2.0.6      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
worker   Ready      <none>          4h8m    v1.25.1   10.2.0.7      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
student@master:~$ kubectl get nodes -o wide
NAME     STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
master   Ready      control-plane   6h2m    v1.25.1   10.2.0.6      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
worker   NotReady   <none>          4h9m    v1.25.1   10.2.0.7      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
student@master:~$ kubectl get nodes -o wide
NAME     STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
master   Ready      control-plane   6h2m    v1.25.1   10.2.0.6      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
worker   NotReady   <none>          4h9m    v1.25.1   10.2.0.7      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
student@master:~$ kubectl get nodes -o wide
NAME     STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
master   Ready      control-plane   6h2m    v1.25.1   10.2.0.6      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
worker   NotReady   <none>          4h9m    v1.25.1   10.2.0.7      <none>        Ubuntu 20.04.5 LTS   5.15.0-1030-gcp   containerd://1.6.18
student@master:~$
-
kubectl describe node worker
Most recent events:
Normal   Starting                 8m57s                   kubelet          Starting kubelet.
Warning  InvalidDiskCapacity      8m57s                   kubelet          invalid capacity 0 on image filesystem
Normal   NodeAllocatableEnforced  8m54s                   kubelet          Updated Node Allocatable limit across pods
Normal   NodeHasNoDiskPressure    7m35s (x7 over 8m57s)   kubelet          Node worker status is now: NodeHasNoDiskPressure
Normal   NodeHasSufficientPID     7m35s (x7 over 8m57s)   kubelet          Node worker status is now: NodeHasSufficientPID
Normal   RegisteredNode           6m50s                   node-controller  Node worker event: Registered Node worker in Controller
Normal   NodeHasSufficientMemory  3m10s (x10 over 8m57s)  kubelet          Node worker status is now: NodeHasSufficientMemory
Normal   NodeNotReady             2m30s (x2 over 6m10s)   node-controller  Node worker status is now: NodeNotReady
kubectl -n kube-system describe pod calico-node-tv5cx
Events:
Type     Reason          Age                    From     Message
----     ------          ----                   ----     -------
Warning  Unhealthy       15m (x396 over 144m)   kubelet  (combined from similar events): Readiness probe failed: 2023-03-14 16:55:26.722 [INFO][25020] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Normal   SandboxChanged  9m33s                  kubelet  Pod sandbox changed, it will be killed and re-created.
Normal   Pulled          9m33s                  kubelet  Container image "docker.io/calico/cni:v3.25.0" already present on machine
Normal   Created         9m33s                  kubelet  Created container upgrade-ipam
Normal   Started         9m33s                  kubelet  Started container upgrade-ipam
Normal   Pulled          9m31s                  kubelet  Container image "docker.io/calico/cni:v3.25.0" already present on machine
Normal   Created         9m31s                  kubelet  Created container install-cni
Normal   Started         9m31s                  kubelet  Started container install-cni
Normal   Pulled          9m27s                  kubelet  Container image "docker.io/calico/node:v3.25.0" already present on machine
Normal   Created         9m27s                  kubelet  Created container mount-bpffs
Normal   Started         9m27s                  kubelet  Started container mount-bpffs
Normal   Pulled          9m26s                  kubelet  Container image "docker.io/calico/node:v3.25.0" already present on machine
Normal   Created         9m26s                  kubelet  Created container calico-node
Normal   Started         9m26s                  kubelet  Started container calico-node
Warning  Unhealthy       9m25s                  kubelet  Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/bird/bird.ctl: connect: no such file or directory
Warning  Unhealthy       9m23s (x2 over 9m24s)  kubelet  Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused
Warning  Unhealthy       9m13s                  kubelet  Readiness probe failed: 2023-03-14 17:01:32.575 [INFO][248] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Warning  Unhealthy       9m3s                   kubelet  Readiness probe failed: 2023-03-14 17:01:42.940 [INFO][272] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Warning  Unhealthy       5m23s                  kubelet  Readiness probe failed: 2023-03-14 17:05:22.612 [INFO][946] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Warning  Unhealthy       2m13s                  kubelet  Readiness probe failed: 2023-03-14 17:08:32.587 [INFO][1457] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Warning  Unhealthy       2m3s                   kubelet  Readiness probe failed: 2023-03-14 17:08:42.513 [INFO][1489] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Warning  Unhealthy       113s                   kubelet  Readiness probe failed: 2023-03-14 17:08:52.532 [INFO][1515] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Warning  Unhealthy       113s                   kubelet  Readiness probe failed: 2023-03-14 17:08:52.765 [INFO][1537] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
Warning  Unhealthy       93s (x2 over 103s)     kubelet  (combined from similar events): Readiness probe failed: 2023-03-14 17:09:12.519 [INFO][1578] confd/health.go 180: Number of node(s) with BGP peering established = 0
  calico/node is not ready: BIRD is not ready: BGP not established with 10.2.0.7
-
Hi @jhurlstone,
I would recommend revisiting the infra provisioning steps and following the custom VPC, subnet, and firewall config as presented in the video. Once fixed, BGP should be established, with the two calico-node pods at 1/1 Running, and the worker should become Ready as well.
Regards,
-Chris
-
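Calico's BGP sessions run over TCP port 179, so one way to test whether the firewall is in the way is to try opening that port from one node to the other. The sketch below assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available; port_open is a hypothetical helper name, and the IP in the usage comment is the worker address from the outputs above.

```shell
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT
# can be opened within 3 seconds.
port_open() {
  local host="$1" port="$2"
  # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage from the control plane node:
#   port_open 10.2.0.7 179 && echo "BGP port reachable" || echo "BGP port blocked or closed"
```

A failure here does not prove a firewall problem on its own, since the port also appears closed when BIRD is not listening on the target node.
-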
I have run through the video and checked that all the settings are the same, with the following exceptions: "Region" (I have chosen a region local to me in the UK), and in the firewall setup the instructor selected "IP Ranges" as the source filter. That is not an available option, so I have chosen "IPv4 ranges". These are the only differences I can see. As for routing, there is an option for "Dynamic routing mode", which in my setup is configured identically to the video, i.e. set to "Regional".
Many thanks for your ongoing assistance with this.
-
Just to let you know I have got it working by following the instructions on this link
https://www.unixcloudfusion.in/2022/02/solved-caliconode-is-not-ready-bird-is.html
Cheers
-
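The linked post is not reproduced in this thread, so the following is an assumption rather than a record of what was actually run. A common remedy for "BGP not established" is to tell Calico which network interface to use for IP autodetection through the IP_AUTODETECTION_METHOD environment variable on the calico-node DaemonSet, which would fit the "eth" or "ens" remark further down. The helper name autodetect_value and the interface name ens4 are hypothetical; check the real name with `ip -o -4 addr show` first.

```shell
# Hypothetical helper: turn an interface name such as "ens4" into a Calico
# autodetection value such as "interface=ens.*".
autodetect_value() {
  local iface="$1" prefix
  # Strip trailing digits so the regex matches the same NIC family on every node.
  prefix=$(printf '%s' "$iface" | sed 's/[0-9]*$//')
  printf 'interface=%s.*\n' "$prefix"
}

# Usage (this changes the cluster and restarts the calico-node pods):
#   kubectl -n kube-system set env daemonset/calico-node \
#     IP_AUTODETECTION_METHOD="$(autodetect_value ens4)"
```
-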
Hi @jhurlstone,
Thank you for posting the solution that worked for you.
Unfortunately, I was not able to reproduce the issue you reported, so I could not test the suggested solution either, but I will keep it in mind for the future.
Regards,
-Chris
-
Hi @chrispokorni,
Another thought: when I built my VMs following the instructions in the video and selected "Ubuntu 20.04 LTS", I was forced to select the x86/64 architecture. Whether this made a difference to Calico's ability to identify the Ethernet card property ("eth" or "ens") would be a guess. At the moment I am just happy I could resolve the issue and continue with the training. Cheers, Jonathan.