Lab 11.1 pods stuck in 'pending'

Hi-

Not sure if anyone else has had this issue. While going through Lab 11.1, everything up to step 7 worked as intended. However, after commenting out the 'nodeSelector' and 'status' lines and deleting and recreating the pods, they remain in Pending status and never attempt to schedule on the worker node. I have rebooted the master and deleted and recreated vip.yaml, but the pods always come back stuck in Pending.
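
For reference, a rough sketch of what I mean (this is not the exact lab manifest; the commented lines reflect the 'status: vip' selector from the lab, everything else is shorthand):

    # in vip.yaml, the selector lines are now commented out:
    #    nodeSelector:
    #      status: vip
    kubectl delete -f vip.yaml --ignore-not-found
    kubectl create -f vip.yaml
    kubectl get pods -o wide     # pods sit in Pending and never land on the worker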

Comments

  • jcremp77
    jcremp77 Posts: 37

    After completing Lab 12.3, I was able to see from the dashboard that the pod reports '0/2 nodes available: 1 insufficient cpu, and 1 didn't match node affinity'.
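
    The same message should also show up in the pod's events from the CLI, something like the commands below (the pod name here is only a guess based on the lab's vip.yaml):

    kubectl describe pod vip | grep -A10 Events
    kubectl get events --sort-by=.metadata.creationTimestamp | tail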

  • chrispokorni
    chrispokorni Posts: 2,155

    Hi @jcremp77,

    What are the sizes of your nodes (CPU, MEM, Disk) and what are you using as infrastructure (what cloud or what hypervisor) to provision your node instances?

    Regards,
    -Chris

  • jcremp77
    jcremp77 Posts: 37
    edited April 2021

    Hi @chrispokorni ,

    Provisioned in GCP.

    description: Efficient Instance, 2 vCPUs, 8 GB RAM
    guestCpus: 2
    id: '335002'
    imageSpaceGb: 0
    isSharedCpu: false
    kind: compute#machineType
    maximumPersistentDisks: 128
    maximumPersistentDisksSizeGb: '263168'
    memoryMb: 8192
    name: e2-standard-2
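
    (For reference, that listing is what something like the gcloud command below returns; the zone is a placeholder.)

    gcloud compute machine-types describe e2-standard-2 --zone us-central1-a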

  • chrispokorni
    chrispokorni Posts: 2,155
    edited April 2021

    Hi @jcremp77,

    Node affinity may be a typo or just a missing property, while the increased CPU usage could be from a previously deployed application consuming an excessive amount of CPU on one of your nodes. If you run the top command on both nodes, is there a process that seems to use a lot of CPU?

    Do you still have the hog applications running in your cluster? If misconfigured, they could consume all the CPU of the node(s) where they got deployed.
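
    If they are still around, something like the commands below should show them (the name "hog" is an assumption based on the earlier lab; adjust it to whatever you actually deployed):

    kubectl get deployments --all-namespaces
    kubectl top pods --all-namespaces     # needs the metrics-server from the earlier lab
    kubectl delete deployment hog         # only if it is still listed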

    Regards,
    -Chris

  • jcremp77
    jcremp77 Posts: 37

    Hi @chrispokorni ,

    top - 18:21:13 up 1:31,  1 user,  load average: 0.29, 0.21, 0.24
    Tasks: 170 total,   1 running, 169 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  5.4 us,  3.0 sy,  0.0 ni, 90.9 id,  0.0 wa,  0.0 hi,  0.5 si,  0.2 st
    MiB Mem :   7961.4 total,   5014.6 free,    999.1 used,   1947.7 buff/cache
    MiB Swap:      0.0 total,      0.0 free,      0.0 used.   6846.9 avail Mem

  • chrispokorni
    chrispokorni Posts: 2,155

    Hi @jcremp77,

    Can you describe both nodes? That may also indicate the amount of CPU allocated to each pod.
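
    Something along the lines of the commands below would do it (the grep context sizes are arbitrary):

    kubectl describe nodes | grep -A8 "Allocated resources"
    kubectl describe nodes | grep -i -A2 taint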

    Regards,
    -Chris

  • jcremp77
    jcremp77 Posts: 37

    Name:               controller-1
    Roles:              control-plane,master
    Labels:             beta.kubernetes.io/arch=amd64
                        beta.kubernetes.io/os=linux
                        kubernetes.io/arch=amd64
                        kubernetes.io/hostname=controller-1
                        kubernetes.io/os=linux
                        node-role.kubernetes.io/control-plane=
                        node-role.kubernetes.io/master=
                        status=vip
    Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                        node.alpha.kubernetes.io/ttl: 0
                        projectcalico.org/IPv4Address: 10.240.0.10/32
                        projectcalico.org/IPv4IPIPTunnelAddr: 192.168.166.128
                        volumes.kubernetes.io/controller-managed-attach-detach: true
    CreationTimestamp:  Tue, 13 Apr 2021 00:39:31 +0000
    Taints:             <none>
    Unschedulable:      false
    Lease:
      HolderIdentity:  controller-1
      AcquireTime:     <unset>
      RenewTime:       Sat, 01 May 2021 01:04:07 +0000
    Conditions:
      Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
      ----                 ------  -----------------                 ------------------                ------                       -------
      NetworkUnavailable   False   Fri, 30 Apr 2021 20:44:00 +0000   Fri, 30 Apr 2021 20:44:00 +0000   CalicoIsUp                   Calico is running on this node
      MemoryPressure       False   Sat, 01 May 2021 00:59:36 +0000   Tue, 13 Apr 2021 00:39:30 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
      DiskPressure         False   Sat, 01 May 2021 00:59:36 +0000   Tue, 13 Apr 2021 00:39:30 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
      PIDPressure          False   Sat, 01 May 2021 00:59:36 +0000   Tue, 13 Apr 2021 00:39:30 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
      Ready                True    Sat, 01 May 2021 00:59:36 +0000   Thu, 15 Apr 2021 02:29:15 +0000   KubeletReady                 kubelet is posting ready status. AppArmor enabled
    Addresses:
      InternalIP:  10.240.0.10
      Hostname:    controller-1
    Capacity:
      cpu:                2
      ephemeral-storage:  203070420Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             8152424Ki
      pods:               110
    Allocatable:
      cpu:                2
      ephemeral-storage:  187149698763
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             8050024Ki
      pods:               110
    System Info:
      Machine ID:                 b52c1cfa2137cb91e869e0394ee57863
      System UUID:                b52c1cfa-2137-cb91-e869-e0394ee57863
      Boot ID:                    20a1d288-dca4-49a6-9b8e-774d06163abc
      Kernel Version:             5.4.0-1042-gcp
      OS Image:                   Ubuntu 20.04.2 LTS
      Operating System:           linux
      Architecture:               amd64
      Container Runtime Version:  docker://19.3.8
      Kubelet Version:            v1.20.1
      Kube-Proxy Version:         v1.20.1
    PodCIDR:                      192.168.0.0/24
    PodCIDRs:                     192.168.0.0/24
    Non-terminated Pods:          (9 in total)
      Namespace    Name                                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
      ---------    ----                                      ------------  ----------  ---------------  -------------  ---
      kube-system  calico-kube-controllers-69496d8b75-pcm8c  0 (0%)        0 (0%)      0 (0%)           0 (0%)         15d
      kube-system  calico-node-qqg8c                         250m (12%)    0 (0%)      0 (0%)           0 (0%)         17d
      kube-system  coredns-74ff55c5b-22hbv                   100m (5%)     0 (0%)      70Mi (0%)        170Mi (2%)     5d22h
      kube-system  coredns-74ff55c5b-clz4w                   100m (5%)     0 (0%)      70Mi (0%)        170Mi (2%)     5d22h
      kube-system  etcd-controller-1                         100m (5%)     0 (0%)      100Mi (1%)       0 (0%)         15d
      kube-system  kube-apiserver-controller-1               250m (12%)    0 (0%)      0 (0%)           0 (0%)         15d
      kube-system  kube-controller-manager-controller-1      200m (10%)    0 (0%)      0 (0%)           0 (0%)         15d
      kube-system  kube-proxy-t7n5p                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         15d
      kube-system  kube-scheduler-controller-1               100m (5%)     0 (0%)      0 (0%)           0 (0%)         15d
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource           Requests     Limits
      --------           --------     ------
      cpu                1100m (55%)  0 (0%)
      memory             240Mi (3%)   340Mi (4%)
      ephemeral-storage  100Mi (0%)   0 (0%)
      hugepages-1Gi      0 (0%)       0 (0%)
      hugepages-2Mi      0 (0%)       0 (0%)
    Events:              <none>
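
    For easier reading, the Allocated resources summary alone can be pulled out with something like the command below. Going by the numbers above, controller-1 has 2 CPUs allocatable and 1100m already requested, so roughly 900m is left for new pods on that node.

    kubectl describe node controller-1 | grep -A9 "Allocated resources"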

  • jcremp77
    jcremp77 Posts: 37

    Hi @chrispokorni - Attached as .txt

  • chrispokorni
    chrispokorni Posts: 2,155

    Hi @jcremp77,

    Nothing out of the ordinary so far. What is the YAML manifest describing the pod you are attempting to deploy?

    Regards,
    -Chris

  • jcremp77
    jcremp77 Posts: 37

    Hi @chrispokorni ,

    I am going to go through all the labs again. Not sure what is going on, but the issue was with the vip.yaml. I will see what happens the second time through. Thanks again.
