gatekeeper CrashLoopBackOff
Hi,
I set up two Ubuntu 18.04 servers, one master and one worker, as described in the tutorial.
When I run
kubectl create -f gatekeeper.yaml
the pods scheduled on the master node reach the Running state, but the pods on the worker node do not.
State: Waiting
  Reason: CrashLoopBackOff
Last State: Terminated
  Reason: Error
  Exit Code: 2
  Started: Wed, 04 Aug 2021 14:26:11 +0000
  Finished: Wed, 04 Aug 2021 14:26:39 +0000
Ready: False
Restart Count: 9
Limits:
  cpu: 1
  memory: 512Mi
Requests:
  cpu: 100m
  memory: 256Mi
Liveness: http-get http://:9090/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9090/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
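As a rough sanity check on the output above, the container's lifetime can be compared with the liveness probe settings. The Python sketch below is not part of the original post; it uses only the Started/Finished timestamps and the period=10s and #failure=3 values shown. Reading a lifetime close to period times failure threshold as a hint of a probe-triggered restart is an assumption, not something the output proves.

```python
from datetime import datetime

# Timestamps copied from the "Last State" block above.
FMT = "%a, %d %b %Y %H:%M:%S %z"
started = datetime.strptime("Wed, 04 Aug 2021 14:26:11 +0000", FMT)
finished = datetime.strptime("Wed, 04 Aug 2021 14:26:39 +0000", FMT)

# Liveness probe settings from the same output: period=10s, #failure=3.
period_seconds = 10
failure_threshold = 3

# How long the container ran before it terminated.
lifetime = (finished - started).total_seconds()

# Approximate time needed for the probe to fail failure_threshold times.
probe_budget = period_seconds * failure_threshold

print(f"container lifetime: {lifetime:.0f}s")
print(f"time to {failure_threshold} failed probes: about {probe_budget}s")
```

The lifetime works out to 28 s, close to the 30 s needed for three probes at a 10 s period to fail. That would be consistent with the kubelet restarting the container after failed liveness probes, although it does not rule out the process exiting on its own.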
I deployed nginx to check whether there is a problem on the worker node, but the nginx pods on the worker node are Running, so it seems there is no problem with Calico.
Regards
Comments
-
Thanks for the fast reply.
Here is the pod status:
argela@argela:~/gk$ kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
gatekeeper-system gatekeeper-audit-54b5f86d57-nkz6q 0/1 Running 16 37m 192.168.171.72 worker <none> <none>
gatekeeper-system gatekeeper-controller-manager-5b96bd668-88scv 1/1 Running 0 37m 192.168.132.69 argela <none> <none>
gatekeeper-system gatekeeper-controller-manager-5b96bd668-f8z6j 0/1 CrashLoopBackOff 15 37m 192.168.171.73 worker <none> <none>
gatekeeper-system gatekeeper-controller-manager-5b96bd668-sdv2n 0/1 CrashLoopBackOff 15 37m 192.168.171.74 worker <none> <none>
kube-system calico-kube-controllers-5f6cfd688c-sl5d5 1/1 Running 0 9h 192.168.132.65 argela <none> <none>
kube-system calico-node-hsfbl 1/1 Running 0 9h 192.168.20.225 argela <none> <none>
kube-system calico-node-m6qw2 1/1 Running 0 8h 192.168.20.232 worker <none> <none>
kube-system coredns-74ff55c5b-dbtct 1/1 Running 0 9h 192.168.132.66 argela <none> <none>
kube-system coredns-74ff55c5b-wkgdx 1/1 Running 0 9h 192.168.132.67 argela <none> <none>
kube-system etcd-argela 1/1 Running 0 9h 192.168.20.225 argela <none> <none>
kube-system kube-apiserver-argela 1/1 Running 0 9h 192.168.20.225 argela <none> <none>
kube-system kube-controller-manager-argela 1/1 Running 0 9h 192.168.20.225 argela <none> <none>
kube-system kube-proxy-p5nlm 1/1 Running 0 9h 192.168.20.225 argela <none> <none>
kube-system kube-proxy-pbp6r 1/1 Running 0 8h 192.168.20.232 worker <none> <none>
kube-system kube-scheduler-argela 1/1 Running 0 9h 192.168.20.225 argela <none> <none>
It seems there is a problem with the readiness and liveness URL checks, but there is no log output for these pods.
For example, the gatekeeper-audit-54b5f86d57-nkz6q pod is in "0/1 Running" status, but there is no log for troubleshooting:
argela@argela:~/gk$ kubectl -n gatekeeper-system logs gatekeeper-audit-54b5f86d57-nkz6q
argela@argela:~/gk$
I've disabled the firewall. To be sure, I deployed nginx, and it ran successfully on both nodes. There is no ERROR line for Calico.
Last State: Terminated Reason: Error Exit Code: 2
I searched for exit code 2; one article says it indicates an error from the application.
If you want other details, I can provide them.
This is a new setup for CKS, on VMware.
Thanks
-
Hi, OK, I'll try more, thank you.
Yes, if a gatekeeper pod is scheduled on the master node, it reaches the Running state with no problem; if it is scheduled on the worker node, CrashLoopBackOff occurs.
Here is the resource status:
master:
argela@argela:~/gk$ free -m
total used free shared buff/cache available
Mem: 7976 1285 1679 2 5011 6742
Swap: 0 0 0
argela@argela:~/gk$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 3
On-line CPU(s) list: 0-2
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 3
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Stepping: 4
CPU MHz: 2294.609
BogoMIPS: 4589.21
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-2
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xsaves arat pku ospke md_clear flush_l1d arch_capabilities
argela@argela:~/gk$
1 [##** 2.7%] Tasks: 92, 401 thr; 2 running
2 [###** 4.6%] Load average: 1.04 0.81 0.48
3 [###** 5.3%] Uptime: 2 days, 07:48:25
Mem[||||||||||||||####************************************************** 1.26G/7.79G]
Swp[ 0K/0K]
worker:
1 [## 2.0%] Tasks: 60, 196 thr; 1 running
2 [##** 2.7%] Load average: 0.03 0.06 0.07
3 [##* 2.0%] Uptime: 09:33:19
Mem[|||||##*************** 446M/7.79G]
Swp[ 0K/0K]
argela@worker:~$ free -m
total used free shared buff/cache available
Mem: 7976 468 5979 1 1528 7281
Swap: 0 0 0
argela@worker:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 3
On-line CPU(s) list: 0-2
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 3
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Stepping: 4
CPU MHz: 2294.609
BogoMIPS: 4589.21
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-2
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xsaves arat pku ospke md_clear flush_l1d arch_capabilities
argela@worker:~$
-
Hi again, there is a problem with the liveness and readiness checks, but I couldn't find the cause because there is no application log.
Here is the information from the system:
argela@argela:~/gk$ kubectl logs -n gatekeeper-system gatekeeper-audit-54b5f86d57-c4fpm
argela@argela:~/gk$
Events:
gatekeeper-system 0s Normal Pulled pod/gatekeeper-audit-54b5f86d57-c4fpm Successfully pulled image "openpolicyagent/gatekeeper:v3.3.0" in 1.680295184s
gatekeeper-system 0s Normal Created pod/gatekeeper-audit-54b5f86d57-c4fpm Created container manager
gatekeeper-system 0s Normal Started pod/gatekeeper-audit-54b5f86d57-c4fpm Started container manager
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-controller-manager-5b96bd668-c8vfp Liveness probe failed: Get "http://192.168.171.83:9090/healthz": dial tcp 192.168.171.83:9090: connect: connection refused
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-controller-manager-5b96bd668-c8vfp Readiness probe failed: Get "http://192.168.171.83:9090/readyz": dial tcp 192.168.171.83:9090: connect: connection refused
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-audit-54b5f86d57-c4fpm Liveness probe failed: Get "http://192.168.171.84:9090/healthz": dial tcp 192.168.171.84:9090: connect: connection refused
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-controller-manager-5b96bd668-5sd92 Liveness probe failed: Get "http://192.168.171.82:9090/healthz": dial tcp 192.168.171.82:9090: connect: connection refused
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-controller-manager-5b96bd668-5sd92 Readiness probe failed: Get "http://192.168.171.82:9090/readyz": dial tcp 192.168.171.82:9090: connect: connection refused
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-audit-54b5f86d57-c4fpm Readiness probe failed: Get "http://192.168.171.84:9090/readyz": dial tcp 192.168.171.84:9090: connect: connection refused
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-controller-manager-5b96bd668-c8vfp Liveness probe failed: Get "http://192.168.171.83:9090/healthz": dial tcp 192.168.171.83:9090: connect: connection refused
gatekeeper-system 0s Warning Unhealthy pod/gatekeeper-controller-manager-5b96bd668-c8vfp Readiness probe failed: Get "http://192.168.171.83:9090/readyz": dial tcp 192.168.171.83:9090: connect: connection refused
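The "connection refused" in these events means the TCP connection to port 9090 on the pod IP was rejected before any HTTP exchange took place. A minimal Python sketch of the same kind of check, run by hand from a node, could look like the following. The helper name port_accepts_connections is hypothetical, and a plain TCP connect is only an approximation of the kubelet's HTTP probe.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the address and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, calling port_accepts_connections("192.168.171.83", 9090) on the worker node and again on the master node would show whether the refusal depends on where the request originates.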
Other information
argela@argela:~$ kubectl describe nodes | grep -i Taint
Taints:
Taints:
master:
argela@argela:~$ systemctl status ufw
ufw.service - Uncomplicated firewall
Loaded: loaded (/lib/systemd/system/ufw.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:ufw(8)
argela@argela:~$ systemctl status apparmor.service
● apparmor.service - AppArmor initialization
Loaded: loaded (/lib/systemd/system/apparmor.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:apparmor(7)
http://wiki.apparmor.net/
argela@argela:~$
worker:
argela@worker:~$ systemctl status ufw
● ufw.service - Uncomplicated firewall
Loaded: loaded (/lib/systemd/system/ufw.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:ufw(8)
argela@worker:~$ systemctl status apparmor.service
● apparmor.service - AppArmor initialization
Loaded: loaded (/lib/systemd/system/apparmor.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:apparmor(7)
http://wiki.apparmor.net/
nginx deployment (scheduled on both nodes):
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default nginx-deployment-66b6c48dd5-266gb 1/1 Running 0 5s 192.168.132.75 argela
default nginx-deployment-66b6c48dd5-724gh 1/1 Running 0 25s 192.168.171.76 worker
default nginx-deployment-66b6c48dd5-8fp49 1/1 Running 0 25s 192.168.132.73 argela
default nginx-deployment-66b6c48dd5-8hfwn 1/1 Running 0 35s 192.168.171.73 worker
default nginx-deployment-66b6c48dd5-cdgtf 1/1 Running 0 104s 192.168.171.69 worker
default nginx-deployment-66b6c48dd5-csfcw 1/1 Running 0 35s 192.168.171.74 worker
default nginx-deployment-66b6c48dd5-d6rml 1/1 Running 0 5s 192.168.132.74 argela
default nginx-deployment-66b6c48dd5-f8gvk 1/1 Running 0 55s 192.168.171.71 worker
default nginx-deployment-66b6c48dd5-lbjqc 1/1 Running 0 55s 192.168.171.72 worker
default nginx-deployment-66b6c48dd5-mlgj4 1/1 Running 0 104s 192.168.171.70 worker
default nginx-deployment-66b6c48dd5-pw87h 1/1 Running 0 5s 192.168.171.79 worker
default nginx-deployment-66b6c48dd5-t2hw9 1/1 Running 0 25s 192.168.171.77 worker
default nginx-deployment-66b6c48dd5-tz95g 1/1 Running 0 104s 192.168.171.68 worker
default nginx-deployment-66b6c48dd5-xw6rd 1/1 Running 0 25s 192.168.171.75 worker
default nginx-deployment-66b6c48dd5-zxmkm 1/1 Running 0 25s 192.168.171.78 worker
Thanks for your help
Regards,
Yavuz
-
Hi, I've changed "--pod-network-cidr" to 10.0.0.33/16 and it is OK now.
I'm not sure whether it is a problem when the nodes' IP addresses fall inside Calico's pod network CIDR range.
I mean, my node IP addresses are 192.168.20.225 and 192.168.20.232, and they are in the 192.168.0.0/16 range.
Anyway, it seems the problem is resolved.
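The overlap described here can be checked mechanically. The sketch below uses Python's standard ipaddress module with the node addresses and CIDR values quoted in this thread; the helper name nodes_inside is made up for illustration. It only shows whether node IPs fall inside a pod CIDR, on the assumption that such an overlap is what caused the routing trouble.

```python
import ipaddress

# Node addresses and the original pod network, as reported in this thread.
node_ips = ["192.168.20.225", "192.168.20.232"]
old_pod_cidr = ipaddress.ip_network("192.168.0.0/16")

# strict=False accepts a value with host bits set, such as 10.0.0.33/16,
# and normalises it to the enclosing network.
new_pod_cidr = ipaddress.ip_network("10.0.0.33/16", strict=False)


def nodes_inside(cidr, ips):
    """Return the node IPs that fall inside the given pod CIDR."""
    return [ip for ip in ips if ipaddress.ip_address(ip) in cidr]


print(old_pod_cidr, nodes_inside(old_pod_cidr, node_ips))
print(new_pod_cidr, nodes_inside(new_pod_cidr, node_ips))
```

With the old range both node addresses are reported as inside the pod network; with the new range neither is. That matches the 10.0.x.x pod IPs in the output that follows.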
Thanks for your guidance.
Every 2.0s: kubectl get pods -A -o wide argela: Sat Aug 7 10:31:41 2021
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
gatekeeper-system gatekeeper-audit-54b5f86d57-pjcfd 1/1 Running 0 87s 10.0.171.65 worker <none> <none>
gatekeeper-system gatekeeper-controller-manager-5b96bd668-dd2cq 1/1 Running 0 87s 10.0.132.68 argela <none> <none>
gatekeeper-system gatekeeper-controller-manager-5b96bd668-pftv8 1/1 Running 0 87s 10.0.171.67 worker <none> <none>
gatekeeper-system gatekeeper-controller-manager-5b96bd668-vnrj4 1/1 Running 0 87s 10.0.171.66 worker <none> <none>
kube-system calico-kube-controllers-5f6cfd688c-6r48g 1/1 Running 0 5m7s 10.0.132.66 argela <none> <none>
kube-system calico-node-4hr2r 1/1 Running 0 3m8s 192.168.20.232 worker <none> <none>
kube-system calico-node-lmkq6 1/1 Running 0 5m7s 192.168.20.225 argela <none> <none>
kube-system coredns-74ff55c5b-f6nl5 1/1 Running 0 5m7s 10.0.132.67 argela <none> <none>
kube-system coredns-74ff55c5b-hzc8f 1/1 Running 0 5m7s 10.0.132.65 argela <none> <none>
kube-system etcd-argela 1/1 Running 0 5m15s 192.168.20.225 argela <none> <none>
kube-system kube-apiserver-argela 1/1 Running 0 5m15s 192.168.20.225 argela <none> <none>
kube-system kube-controller-manager-argela 1/1 Running 0 5m15s 192.168.20.225 argela <none> <none>
kube-system kube-proxy-h6wt2 1/1 Running 0 3m8s 192.168.20.232 worker <none> <none>
kube-system kube-proxy-z8nk9 1/1 Running 0 5m7s 192.168.20.225 argela <none> <none>
kube-system kube-scheduler-argela 1/1 Running 0 5m15s 192.168.20.225 argela <none> <none>
regards,