Cilium pod is isolating the cp node.
Hello,
After applying cilium-cni.yaml, the cp node is isolated and no connections are allowed on any port, except via the console.
It seems that the Cilium deployment never finishes; one of the two pods is not running.
From what I can see, there is an issue between the cilium-cni.yaml file and AppArmor.
ade@cp:~$ kubectl get nodes -o wide
NAME      STATUS     ROLES           AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
cp        Ready      control-plane   28m     v1.29.1   192.168.0.201   <none>        Ubuntu 20.04.6 LTS   5.4.0-190-generic   containerd://1.7.19
worker1   NotReady   <none>          7m51s   v1.29.1   192.168.0.27    <none>        Ubuntu 20.04.6 LTS   5.4.0-190-generic   containerd://1.7.19

ade@cp:~$ ping 192.168.0.27
PING 192.168.0.27 (192.168.0.27) 56(84) bytes of data.
64 bytes from 192.168.0.27: icmp_seq=1 ttl=64 time=0.284 ms
64 bytes from 192.168.0.27: icmp_seq=2 ttl=64 time=0.411 ms
^C
--- 192.168.0.27 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1019ms
rtt min/avg/max/mdev = 0.284/0.347/0.411/0.063 ms
ade@cp:~$ kubectl apply -f /home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml
serviceaccount/cilium unchanged
serviceaccount/cilium-operator unchanged
secret/cilium-ca unchanged
secret/hubble-server-certs unchanged
configmap/cilium-config unchanged
clusterrole.rbac.authorization.k8s.io/cilium unchanged
clusterrole.rbac.authorization.k8s.io/cilium-operator unchanged
clusterrolebinding.rbac.authorization.k8s.io/cilium unchanged
clusterrolebinding.rbac.authorization.k8s.io/cilium-operator unchanged
role.rbac.authorization.k8s.io/cilium-config-agent unchanged
rolebinding.rbac.authorization.k8s.io/cilium-config-agent unchanged
service/hubble-peer unchanged
daemonset.apps/cilium created
deployment.apps/cilium-operator created

ade@cp:~$ kubectl get pods -A
NAMESPACE     NAME                               READY   STATUS              RESTARTS      AGE
kube-system   cilium-7d2wv                       0/1     Init:0/6            0             7s
kube-system   cilium-f7s4q                       0/1     Init:4/6            0             7s
kube-system   cilium-operator-56bdb99ff6-vqntk   0/1     ContainerCreating   0             7s
kube-system   cilium-operator-56bdb99ff6-zm658   1/1     Running             0             7s
kube-system   coredns-76f75df574-7pdsg           0/1     Unknown             0             30m
kube-system   coredns-76f75df574-s5jnz           0/1     Unknown             0             30m
kube-system   etcd-cp                            1/1     Running             1 (13m ago)   30m
kube-system   kube-apiserver-cp                  1/1     Running             1 (13m ago)   30m
kube-system   kube-controller-manager-cp         1/1     Running             1 (13m ago)   30m
kube-system   kube-proxy-6thx7                   1/1     Running             1 (13m ago)   30m
kube-system   kube-proxy-7qgl9                   1/1     Running             0             10m
kube-system   kube-scheduler-cp                  1/1     Running             1 (13m ago)   30m
Answers
I am using Ubuntu 20.04 installed on Proxmox (as a VM).
I have installed Cilium as a normal user (not root).
Hi @andriesadelina,
I suspect your VM IP addresses overlap the Cilium CNI pod network 192.168.0.0/16 that is set in the cilium-cni.yaml manifest. This can be modified at line 198 to a different subnet, one that is distinct from both your VM subnet and the default Kubernetes Service subnet 10.96.0.0/12. You could try 10.200.0.0/16 for the pod network to avoid any future issues.

First, remove Cilium:
kubectl delete -f /home/student/LFS258/SOLUTIONS/s_03/cilium-cni.yaml
Then edit the cilium-cni.yaml manifest at line 198 with the desired pod network and re-deploy the manifest.
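For reference, a quick way to confirm the value you just set before re-applying (the exact line number can shift between Cilium manifest versions, so grepping for the key name is simply a convenient check); with the suggested subnet the output would look something like this:

student@cp:~$ grep -n cluster-pool-ipv4-cidr /home/student/LFS258/SOLUTIONS/s_03/cilium-cni.yaml
198:  cluster-pool-ipv4-cidr: "10.200.0.0/16"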
Regards,
-Chris
Hello @chrispokorni,
I have modified the following, but the issue persists:
In file "ciulium-cni.yaml"
198 cluster-pool-ipv4-cidr: "10.200.0.0/16"
In file "/root/kubeadm-config.yaml"
networking: podSubnet: 10.10.0.0/16 serviceSubnet: 10.96.0.0/12
After that, I ran the command below:
ade@cp:~$ kubectl delete -f /home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": serviceaccounts "cilium" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": serviceaccounts "cilium-operator" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": secrets "cilium-ca" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": secrets "hubble-server-certs" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": configmaps "cilium-config" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": clusterroles.rbac.authorization.k8s.io "cilium" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": clusterroles.rbac.authorization.k8s.io "cilium-operator" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": clusterrolebindings.rbac.authorization.k8s.io "cilium" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": clusterrolebindings.rbac.authorization.k8s.io "cilium-operator" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": roles.rbac.authorization.k8s.io "cilium-config-agent" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": rolebindings.rbac.authorization.k8s.io "cilium-config-agent" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": services "hubble-peer" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": daemonsets.apps "cilium" not found
Error from server (NotFound): error when deleting "/home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml": deployments.apps "cilium-operator" not found
ade@cp:~$ kubectl get all -A -o wide
NAMESPACE     NAME                             READY   STATUS    RESTARTS       AGE    IP              NODE      NOMINATED NODE   READINESS GATES
kube-system   pod/etcd-cp                      1/1     Running   12 (10m ago)   2d2h   192.168.0.201   cp        <none>           <none>
kube-system   pod/kube-apiserver-cp            1/1     Running   12 (10m ago)   2d2h   192.168.0.201   cp        <none>           <none>
kube-system   pod/kube-controller-manager-cp   1/1     Running   12 (10m ago)   2d2h   192.168.0.201   cp        <none>           <none>
kube-system   pod/kube-proxy-6thx7             1/1     Running   12 (10m ago)   2d2h   192.168.0.201   cp        <none>           <none>
kube-system   pod/kube-proxy-7qgl9             1/1     Running   1 (61m ago)    2d2h   192.168.0.27    worker1   <none>           <none>
kube-system   pod/kube-scheduler-cp            1/1     Running   12 (10m ago)   45m    192.168.0.201   cp        <none>           <none>

NAMESPACE     NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE    SELECTOR
default       service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  2d2h   <none>
kube-system   service/kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   2d2h   k8s-app=kube-dns

NAMESPACE     NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE    CONTAINERS   IMAGES                               SELECTOR
kube-system   daemonset.apps/kube-proxy   2         2         1       2            1           kubernetes.io/os=linux   2d2h   kube-proxy   registry.k8s.io/kube-proxy:v1.29.1   k8s-app=kube-proxy

And apply the cilium-cni.yaml manifest again:
kubectl apply -f /home/ade/LFS458/SOLUTIONS/s_03/cilium-cni.yaml
serviceaccount/cilium created
serviceaccount/cilium-operator created
secret/cilium-ca created
secret/hubble-server-certs created
configmap/cilium-config created
clusterrole.rbac.authorization.k8s.io/cilium created
clusterrole.rbac.authorization.k8s.io/cilium-operator created
clusterrolebinding.rbac.authorization.k8s.io/cilium created
clusterrolebinding.rbac.authorization.k8s.io/cilium-operator created
role.rbac.authorization.k8s.io/cilium-config-agent created
rolebinding.rbac.authorization.k8s.io/cilium-config-agent created
service/hubble-peer created
daemonset.apps/cilium created
deployment.apps/cilium-operator created
The two pods that are assigned to the worker1 node are in "Pending" status:
On worker1, I am receiving the below error:
Could you please tell me what could be the issue?
Thank you very much for your time!
Hi @andriesadelina,
There seems to be a mix of data in your source files. You seem to be following LFS258, while the cluster resources are being applied from LFS458 files.
Please follow the training material (lectures, lab guide and lab resources/solutions) released for the course you enrolled in - LFS258.
Please ensure that the hypervisor is configured with a single bridged network adapter per VM. Keeping the VMs' IP addresses on the 192.168.0.0/16 network will avoid further confusion. Also, ensure the hypervisor does not block any ingress (inbound) traffic to the VMs (meaning that ALL protocols should be allowed to ALL destination ports from ALL sources).
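As a quick sanity check that nothing is filtered between the VMs, you could probe the API server port of the control plane from the worker once the control plane is up (6443 is the default port; netcat is assumed to be installed on the node):

student@worker1:~$ nc -zv 192.168.0.201 6443

A "succeeded" message indicates the port is reachable; a timeout usually points to filtering at the hypervisor or a host firewall.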
In an attempt to salvage your current cluster, please complete the following:
1 - remove the Cilium installation (assuming you downloaded the correct LFS258 SOLUTIONS tarball, and replace "student" in the path with your user ID):

student@cp:~$ kubectl delete -f /home/student/LFS258/SOLUTIONS/s_03/cilium-cni.yaml

2 - delete the worker1 node from the cluster (run the command on the control plane node, as the regular non-root user):

student@cp:~$ kubectl delete node worker1

3 - reset the worker1 node (run the command as root on the worker1 node), and confirm the reset when prompted:

root@worker1:~# kubeadm reset

4 - reset the cp node (run the command as root on the cp node), and confirm the reset when prompted:

root@cp:~# kubeadm reset

5 - edit the /etc/hosts file on the control plane node with the required control plane alias k8scp assigned to the private IP of the control plane node (not the control plane node hostname) [lab guide exercise 3.1 step 19]:

...
192.168.x.x k8scp
...

6 - correct the /root/kubeadm-config.yaml manifest on the control plane node as follows (control plane alias and desired pod subnet CIDR for the CNI plugin) [lab guide exercise 3.1 step 20]:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.29.1
controlPlaneEndpoint: "k8scp:6443"
networking:
  podSubnet: 10.200.0.0/16

7 - initialize the control plane (run the command as root on the cp node) [lab guide exercise 3.1 step 21]:

root@cp:~# kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out

8 - reset the cluster admin credentials for the non-root user [lab guide exercise 3.1 step 22]:

student@cp:~$ rm $HOME/.kube/config
student@cp:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
student@cp:~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

9 - update the cilium-cni.yaml manifest on line 198 with the desired pod CIDR, to match the one supplied in the kubeadm-config.yaml manifest earlier:

...
cluster-pool-ipv4-cidr: "10.200.0.0/16"
...

10 - deploy Cilium again [lab guide exercise 3.1 step 23]:

student@cp:~$ kubectl apply -f /path/to/cilium-cni.yaml

11 - extract and copy the join command from the /root/kubeadm-init.out file generated by the init command on the control plane node (or just run the following command to generate it):

student@cp:~$ sudo kubeadm token create --print-join-command

12 - edit the /etc/hosts file on the worker1 node with the required control plane alias k8scp assigned to the private IP of the control plane node (not the control plane node hostname, not the worker1 node hostname, not the worker1 node private IP address) [lab guide exercise 3.2 step 12]

13 - run the join command on the worker1 node (run the command as root on the worker1 node) [lab guide exercise 3.2 step 13]:

root@worker1:~# kubeadm join ....................
If completing these steps does not produce a working cluster, please decommission the VMs, and start over with two new VMs, provisioned to match the networking requirements described above.
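To confirm the rebuild before deciding whether to start over, you could check that both nodes eventually report Ready and that the Cilium and CoreDNS pods reach Running (the k8s-app=cilium label is assumed from the stock manifest):

student@cp:~$ kubectl get nodes
student@cp:~$ kubectl -n kube-system get pods -l k8s-app=cilium
student@cp:~$ kubectl -n kube-system get pods -l k8s-app=kube-dns

The CoreDNS pods only leave the Pending/ContainerCreating state once the CNI plugin is healthy, so they are a useful indicator as well.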
Regards,
-Chris
Hello @chrispokorni,
After performing all of the above steps, the issue was fixed.
Thank you so much for your help!