Lab 11.1 pods stuck in 'pending'
Hi-
Not sure if anyone else has had this issue. While going through Lab 11.1, everything up to step 7 worked as intended. However, after commenting out the 'nodeSelector' and 'status' lines and then deleting and recreating the pods, they remain in Pending status. They also never try to deploy on the worker node. I have rebooted the master and deleted and recreated vip.yaml, but I can never get the pods to redeploy without being stuck in Pending status.
Comments
After completing Lab 12.3, I was able to see from the dashboard that there are '0/2 nodes available: 1 insufficient cpu, and 1 didn't match nodes affinity'.
Hi @jcremp77,
What are the sizes of your nodes (CPU, MEM, Disk) and what are you using as infrastructure (what cloud or what hypervisor) to provision your node instances?
Regards,
-Chris
Hi @chrispokorni ,
Provisioned in GCP.
description: Efficient Instance, 2 vCPUs, 8 GB RAM
guestCpus: 2
id: '335002'
imageSpaceGb: 0
isSharedCpu: false
kind: compute#machineType
maximumPersistentDisks: 128
maximumPersistentDisksSizeGb: '263168'
memoryMb: 8192
name: e2-standard-2
Hi @jcremp77,
The node affinity mismatch may come from a typo or just a missing property, while the insufficient CPU could be caused by a previously deployed application consuming an excessive amount of CPU on one of your nodes. If you run the 'top' command on both nodes, is there a process that seems to use a lot of CPU?
Do you still have the 'hog' applications running in your cluster? If misconfigured, they could consume all the CPU of the node(s) where they got deployed.
Regards,
-Chris
Hi @chrispokorni ,
top - 18:21:13 up 1:31, 1 user, load average: 0.29, 0.21, 0.24
Tasks: 170 total, 1 running, 169 sleeping, 0 stopped, 0 zombie
%Cpu(s): 5.4 us, 3.0 sy, 0.0 ni, 90.9 id, 0.0 wa, 0.0 hi, 0.5 si, 0.2 st
MiB Mem : 7961.4 total, 5014.6 free, 999.1 used, 1947.7 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 6846.9 avail Mem
Hi @jcremp77,
Can you 'describe' both nodes? That may also indicate the amount of CPU allocated to each pod.
Regards,
-Chris
Name:               controller-1
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=controller-1
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=
                    node-role.kubernetes.io/master=
                    status=vip
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.240.0.10/32
                    projectcalico.org/IPv4IPIPTunnelAddr: 192.168.166.128
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 13 Apr 2021 00:39:31 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  controller-1
  AcquireTime:     <unset>
  RenewTime:       Sat, 01 May 2021 01:04:07 +0000
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Fri, 30 Apr 2021 20:44:00 +0000   Fri, 30 Apr 2021 20:44:00 +0000   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Sat, 01 May 2021 00:59:36 +0000   Tue, 13 Apr 2021 00:39:30 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Sat, 01 May 2021 00:59:36 +0000   Tue, 13 Apr 2021 00:39:30 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Sat, 01 May 2021 00:59:36 +0000   Tue, 13 Apr 2021 00:39:30 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Sat, 01 May 2021 00:59:36 +0000   Thu, 15 Apr 2021 02:29:15 +0000   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.240.0.10
  Hostname:    controller-1
Capacity:
  cpu:                2
  ephemeral-storage:  203070420Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8152424Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  187149698763
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8050024Ki
  pods:               110
System Info:
  Machine ID:                 b52c1cfa2137cb91e869e0394ee57863
  System UUID:                b52c1cfa-2137-cb91-e869-e0394ee57863
  Boot ID:                    20a1d288-dca4-49a6-9b8e-774d06163abc
  Kernel Version:             5.4.0-1042-gcp
  OS Image:                   Ubuntu 20.04.2 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.8
  Kubelet Version:            v1.20.1
  Kube-Proxy Version:         v1.20.1
PodCIDR:                      192.168.0.0/24
PodCIDRs:                     192.168.0.0/24
Non-terminated Pods:          (9 in total)
  Namespace    Name                                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------    ----                                      ------------  ----------  ---------------  -------------  ---
  kube-system  calico-kube-controllers-69496d8b75-pcm8c  0 (0%)        0 (0%)      0 (0%)           0 (0%)         15d
  kube-system  calico-node-qqg8c                         250m (12%)    0 (0%)      0 (0%)           0 (0%)         17d
  kube-system  coredns-74ff55c5b-22hbv                   100m (5%)     0 (0%)      70Mi (0%)        170Mi (2%)     5d22h
  kube-system  coredns-74ff55c5b-clz4w                   100m (5%)     0 (0%)      70Mi (0%)        170Mi (2%)     5d22h
  kube-system  etcd-controller-1                         100m (5%)     0 (0%)      100Mi (1%)       0 (0%)         15d
  kube-system  kube-apiserver-controller-1               250m (12%)    0 (0%)      0 (0%)           0 (0%)         15d
  kube-system  kube-controller-manager-controller-1      200m (10%)    0 (0%)      0 (0%)           0 (0%)         15d
  kube-system  kube-proxy-t7n5p                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         15d
  kube-system  kube-scheduler-controller-1               100m (5%)     0 (0%)      0 (0%)           0 (0%)         15d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                1100m (55%)  0 (0%)
  memory             240Mi (3%)   340Mi (4%)
  ephemeral-storage  100Mi (0%)   0 (0%)
  hugepages-1Gi      0 (0%)       0 (0%)
  hugepages-2Mi      0 (0%)       0 (0%)
Events:              <none>
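For anyone reading along, the CPU numbers in the controller-1 output above can be checked with a few lines of Python. This is only a sketch: the request values are copied by hand from the Non-terminated Pods table, and the variable names (requests_m, allocatable_m, headroom_m) are made up for illustration, not anything kubectl provides.

```python
# CPU requests in millicores, copied from the "Non-terminated Pods"
# table of the controller-1 describe output (illustrative names).
requests_m = {
    "calico-kube-controllers-69496d8b75-pcm8c": 0,
    "calico-node-qqg8c": 250,
    "coredns-74ff55c5b-22hbv": 100,
    "coredns-74ff55c5b-clz4w": 100,
    "etcd-controller-1": 100,
    "kube-apiserver-controller-1": 250,
    "kube-controller-manager-controller-1": 200,
    "kube-proxy-t7n5p": 0,
    "kube-scheduler-controller-1": 100,
}

# "Allocatable: cpu: 2" expressed in millicores.
allocatable_m = 2 * 1000

requested_m = sum(requests_m.values())
headroom_m = allocatable_m - requested_m
percent = requested_m * 100 // allocatable_m

print(f"requested: {requested_m}m ({percent}%)")
print(f"headroom:  {headroom_m}m")
```

This reproduces the 1100m (55%) shown under Allocated resources and leaves 900m unrequested on controller-1. The scheduler decides on these requests rather than on the live usage that top reports, so a pod whose CPU request is larger than the remaining headroom on a node would presumably be reported as 'insufficient cpu' there even when the node looks idle.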
Hi @jcremp77,
Nothing out of the ordinary so far. What is the YAML manifest describing the pod you are attempting to deploy?
Regards,
-Chris
Hi @chrispokorni ,
I am going to go through all the labs again. I am not sure what is going on, but the problem was with vip.yaml. I will see what happens the second time through. Thanks again.