Unable to start pod/container in lab 2.3 - Error message is "Error from server (BadRequest) .."

tanwee · November 2021

Hi,

I have completed lab 2.2 to set up the Kubernetes cluster. When I try to create a pod/container in lab 2.3, I keep seeing this error:

Error from server (BadRequest): container "nginx" in pod "nginx" is waiting to start: ContainerCreating

When I do a kubectl describe pod nginx, I see these errors:

.....
Warning FailedCreatePodSandBox 95s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to mount container k8s_POD_nginx_default_e1106b28-303c-4e75-afc2-d6d14bd67913_0 in pod sandbox k8s_nginx_default_e1106b28-303c-4e75-afc2-d6d14bd67913_0(507780f27bf6a769b6e7178ebe52a032e8967f2af9d720f1931933e0a202c917): error creating overlay mount to /var/lib/containers/storage/overlay/0c9cccedaee7f6a42d1546dc06d3100072fb4ac860040aeb7b58d85d3e39c9ac/merged, mount_data="nodev,metacopy=on,lowerdir=/var/lib/containers/storage/overlay/l/4NMXH7DOMDNBJNKEOKA65YCBZS,upperdir=/var/lib/containers/storage/overlay/0c9cccedaee7f6a42d1546dc06d3100072fb4ac860040aeb7b58d85d3e39c9ac/diff,workdir=/var/lib/containers/storage/overlay/0c9cccedaee7f6a42d1546dc06d3100072fb4ac860040aeb7b58d85d3e39c9ac/work": invalid argument
.....

What am I doing wrong?

Thanks

TW

serewicz · November 2021

Hello,

I notice you have AppArmor enabled, which could be the cause of some headaches. Does the problem persist when you disable it?

As all the failed pods are on your worker, I would suspect AppArmor, a GCE VPC firewall issue, or a networking issue where the nodes use IP addresses that overlap with the host.

Could you disable AppArmor on all nodes, ensure your VPC allows all traffic, and then show the IP ranges used by the primary interface (something like ens4) on both nodes, along with the results?

Regards,

tanwee · December 2021

Hi,

Thanks for your help. I recreated the VMs in GCE again and it is working now. I must have misconfigured VPC the first time. I managed to pass the exam after going through the labs.

Rgds

TW

chrispokorni · November 2021

Hi @tanwee,

Is this a generic symptom observed on multiple pods or just this one? Please provide the output of the following command:

kubectl get pods -A -o wide

Regards,
-Chris

tanwee · November 2021

Hi Chris,

This error is occurring for any pod that I try to create on the cluster. The output of the command is:

NAMESPACE     NAME                                       READY   STATUS              RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
default       nginx                                      0/1     ContainerCreating   0          52s     <none>           worker   <none>           <none>
kube-system   calico-kube-controllers-5d995d45d6-6mk6b   1/1     Running             1          2d23h   192.168.242.65   cp       <none>           <none>
kube-system   calico-node-s824n                          0/1     Init:0/3            0          2d23h   10.2.0.5         worker   <none>           <none>
kube-system   calico-node-zkxrn                          1/1     Running             1          2d23h   10.2.0.4         cp       <none>           <none>
kube-system   coredns-78fcd69978-4svtg                   1/1     Running             1          2d23h   192.168.242.66   cp       <none>           <none>
kube-system   coredns-78fcd69978-m4nsp                   1/1     Running             1          2d23h   192.168.242.67   cp       <none>           <none>
kube-system   etcd-cp                                    1/1     Running             1          2d23h   10.2.0.4         cp       <none>           <none>
kube-system   kube-apiserver-cp                          1/1     Running             1          2d23h   10.2.0.4         cp       <none>           <none>
kube-system   kube-controller-manager-cp                 1/1     Running             1          2d23h   10.2.0.4         cp       <none>           <none>
kube-system   kube-proxy-fn5xm                           0/1     ContainerCreating   0          2d23h   10.2.0.5         worker   <none>           <none>
kube-system   kube-proxy-trxxb                           1/1     Running             1          2d23h   10.2.0.4         cp       <none>           <none>
kube-system   kube-scheduler-cp                          1/1     Running             1          2d23h   10.2.0.4         cp       <none>           <none>

Rgds

Tan Wee
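
To see at a glance which node the stuck pods share, the listing can be filtered on its STATUS column. The sketch below is only an illustration: it runs awk over a few rows adapted from the listing above (with <none> standing in for the empty IP field) and assumes RESTARTS is a single field, as it is in this output. The variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: list pods that are not Running, with the node they were scheduled to.
# The sample rows are adapted from the listing in this thread, trimmed to the
# first eight columns.

sample='NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default nginx 0/1 ContainerCreating 0 52s <none> worker
kube-system calico-node-s824n 0/1 Init:0/3 0 2d23h 10.2.0.5 worker
kube-system calico-node-zkxrn 1/1 Running 1 2d23h 10.2.0.4 cp
kube-system kube-proxy-fn5xm 0/1 ContainerCreating 0 2d23h 10.2.0.5 worker
kube-system kube-proxy-trxxb 1/1 Running 1 2d23h 10.2.0.4 cp'

# Column 4 is STATUS and column 8 is NODE; NR > 1 skips the header row.
not_running=$(printf '%s\n' "$sample" |
  awk 'NR > 1 && $4 != "Running" { print $8, $2, $4 }')

echo "$not_running"
```

On a live cluster the same awk program could read from kubectl get pods -A -o wide instead of the sample text. Here every row it prints names the worker node, which matches the later observation that nothing scheduled to the worker is able to start.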

serewicz · November 2021

Hello,

From the look of things Calico is not running on your worker. There are a few reasons this could happen, but chances are it has to do with a networking configuration error or a firewall between your instances.

What are you using to run the lab exercises: GCE, AWS, DigitalOcean, VMware, VirtualBox, or two Linux laptops?

Regards,

tanwee · November 2021

Hi,

I created the 2 VMs in GCE following the GCE Lab setup video.

Thanks

TW

chrispokorni · November 2021

Hi @tanwee,

Thank you for the provided output. It seems that none of the containers scheduled to the worker node are able to start. The node itself may not be ready.
What may help are the outputs of the following two commands:

kubectl get nodes

kubectl describe node worker

Regards,
-Chris

serewicz · November 2021

I would also double-check that the VPC is allowing all traffic between your VMs.

Regards,

tanwee · November 2021

Hi,

This is the output of kubectl describe node worker. You can see the last message:

Node worker status is now: NodeReady

Rgds

TW

Name: worker
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=worker
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/crio/crio.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sat, 20 Nov 2021 12:45:36 +0000
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: worker
AcquireTime: <unset>
RenewTime: Wed, 24 Nov 2021 12:52:02 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Wed, 24 Nov 2021 12:48:48 +0000 Wed, 24 Nov 2021 12:48:38 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 24 Nov 2021 12:48:48 +0000 Wed, 24 Nov 2021 12:48:38 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 24 Nov 2021 12:48:48 +0000 Wed, 24 Nov 2021 12:48:38 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 24 Nov 2021 12:48:48 +0000 Wed, 24 Nov 2021 12:48:48 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.2.0.5
Hostname: worker
Capacity:
cpu: 2
ephemeral-storage: 20145724Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 7977Mi
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 18566299208
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 7877Mi
pods: 110
System Info:
Machine ID: 9df579dd8f5e6ed7ca105568417ac070
System UUID: 9DF579DD-8F5E-6ED7-CA10-5568417AC070
Boot ID: e3e3b6a9-26d9-4c48-bd0a-c0233a067a5c
Kernel Version: 4.15.0-1006-gcp
OS Image: Ubuntu 18.04 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: cri-o://1.22.1
Kubelet Version: v1.22.1
Kube-Proxy Version: v1.22.1
PodCIDR: 192.168.1.0/24
PodCIDRs: 192.168.1.0/24
Non-terminated Pods: (3 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default nginx 0 (0%) 0 (0%) 0 (0%) 0 (0%) 24h
kube-system calico-node-s824n 250m (12%) 0 (0%) 0 (0%) 0 (0%) 4d
kube-system kube-proxy-fn5xm 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 250m (12%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NodeHasSufficientMemory 4d (x2 over 4d) kubelet Node worker status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 4d (x2 over 4d) kubelet Node worker status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 4d (x2 over 4d) kubelet Node worker status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 4d kubelet Updated Node Allocatable limit across pods
Normal Starting 4d kubelet Starting kubelet.
Normal NodeReady 4d kubelet Node worker status is now: NodeReady
Normal NodeAllocatableEnforced 24h kubelet Updated Node Allocatable limit across pods
Normal Starting 24h kubelet Starting kubelet.
Normal NodeHasSufficientMemory 24h (x2 over 24h) kubelet Node worker status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 24h (x2 over 24h) kubelet Node worker status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 24h (x2 over 24h) kubelet Node worker status is now: NodeHasSufficientPID
Warning Rebooted 24h kubelet Node worker has been rebooted, boot id: 603e4fdf-67f1-4f49-8986-5e9f749a6d95
Normal NodeNotReady 24h kubelet Node worker status is now: NodeNotReady
Normal NodeReady 24h kubelet Node worker status is now: NodeReady
Normal Starting 3m27s kubelet Starting kubelet.
Normal NodeHasSufficientMemory 3m26s (x2 over 3m26s) kubelet Node worker status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 3m26s (x2 over 3m26s) kubelet Node worker status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 3m26s (x2 over 3m26s) kubelet Node worker status is now: NodeHasSufficientPID
Warning Rebooted 3m26s kubelet Node worker has been rebooted, boot id: e3e3b6a9-26d9-4c48-bd0a-c0233a067a5c
Normal NodeNotReady 3m26s kubelet Node worker status is now: NodeNotReady
Normal NodeAllocatableEnforced 3m26s kubelet Updated Node Allocatable limit across pods
Normal NodeReady 3m16s kubelet Node worker status is now: NodeReady

chrispokorni · December 2021

Glad it all worked out and congratulations on passing the exam @tanwee!

Regards,
-Chris

sgurenkov · December 2021

Hello, I have the same problem using GCE

kubectl get pods -A -o wide

sgurenkov · December 2021

Any advice on how I can debug why calico-node-82fl is not running?
I checked the firewall and it has an Ingress allow-all rule, but I am not sure how to debug further.

serewicz · December 2021

Hello,

When you say you added an Ingress allow all, are you talking about the GCE VPC? Be sure to allow all traffic between nodes from the Google perspective.

Other than that, what IP ranges did you choose? Did you assign 192.168 to your nodes by any chance?

Any deviation from the lab setup and exercise?

Regards,
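
The 192.168 question can be checked mechanically. The sketch below assumes the pod network is 192.168.0.0/16, which is consistent with the 192.168.x.x pod addresses and PodCIDR shown earlier in the thread but is still an assumption about any given cluster. overlaps_pod_range is a hypothetical helper name, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: report whether an IPv4 address falls inside 192.168.0.0/16.
# ASSUMPTION: the pod network is 192.168.0.0/16; adjust the pattern if the
# cluster was initialised with a different pod CIDR.

# Hypothetical helper: returns 0 when the address is in the assumed pod range.
overlaps_pod_range() {
  case "$1" in
    192.168.*) return 0 ;;   # inside the assumed pod range
    *)         return 1 ;;   # outside it
  esac
}

# Example: the worker's InternalIP from this thread, then a conflicting address.
for ip in 10.2.0.5 192.168.1.7; do
  if overlaps_pod_range "$ip"; then
    echo "$ip overlaps the pod range"
  else
    echo "$ip is outside the pod range"
  fi
done
```

On a node, each address printed by hostname -I could be passed through the same function. A node address inside the pod range would be a reason to rebuild the VPC subnet or pick a different pod CIDR.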

sgurenkov · December 2021

Yes, I created a new GCE VPC network with a 10.2.0.0/16 subnet, then added a firewall rule with Ingress allow for all ports. Then I made sure my instances used that network.
I am able to ping one node from the other. It feels like a network issue, but lacking the experience, I am not able to find the root cause. The current UI is a little different from the GCE lab setup video, but I tried to follow it as closely as possible.

I am now trying this guide, which uses a slightly different network setup: https://projectcalico.docs.tigera.io/getting-started/kubernetes/self-managed-public-cloud/gce
I will let you know whether it works.

sgurenkov · December 2021

Using that guide and the gcloud SDK, I was able to create a VPC network and two instances, and then ran the provisioning scripts from the LFD259 solutions. Calico now starts successfully.

chrispokorni · December 2021

Hi @sgurenkov,

The guide from tigera.io/project-calico/ may still be using docker as the container runtime, which is different from the cri-o runtime recommended for this class. In the near future docker will no longer be supported as a runtime for Kubernetes, so the labs have migrated to a different container runtime.

The 10.8x.0.0 network connecting the calico-kube-controllers and the coredns pods was most likely initiated by Podman. During the init phase, the pods were somehow assigned IP addresses from Podman's bridge network instead of being exposed over their respective nodes' IP addresses, which is the expected behavior.

A simple delete of these pods typically allows for the IP address assignment to be corrected.

Regards,
-Chris

elmoussaoui · January 2022

Hi everyone, I am also facing the same issue. Can you please help?

I am using two VirtualBox VMs connected with a NAT network. AppArmor is uninstalled on both instances. All installation steps and requirements are exactly as in the lab exercises. I removed the VMs and started fresh multiple times.

Thanks in advance!

Result of kubectl get nodes:

Result of kubectl get pods -A -o wide:

Result of kubectl describe node wn1:

Name: wn1
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=wn1
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/crio/crio.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Thu, 13 Jan 2022 15:56:16 +0000
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: wn1
AcquireTime: <unset>
RenewTime: Thu, 13 Jan 2022 16:32:22 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Thu, 13 Jan 2022 16:31:59 +0000 Thu, 13 Jan 2022 15:56:16 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 13 Jan 2022 16:31:59 +0000 Thu, 13 Jan 2022 15:56:16 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Thu, 13 Jan 2022 16:31:59 +0000 Thu, 13 Jan 2022 15:56:16 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Thu, 13 Jan 2022 16:31:59 +0000 Thu, 13 Jan 2022 15:56:26 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.0.2.5
Hostname: wn1
Capacity:
cpu: 2
ephemeral-storage: 25668836Ki
hugepages-2Mi: 0
memory: 4039160Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 23656399219
hugepages-2Mi: 0
memory: 3936760Ki
pods: 110
System Info:
Machine ID: 05291b4c92144126979989eab08c9a58
System UUID: 7CDF5996-062F-4049-B6C0-087A3C62288F
Boot ID: a43595d1-f2b7-40dc-bef2-3063db315ff0
Kernel Version: 4.15.0-166-generic
OS Image: Ubuntu 18.04.6 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: cri-o://1.22.1
Kubelet Version: v1.22.1
Kube-Proxy Version: v1.22.1
PodCIDR: 192.168.1.0/24
PodCIDRs: 192.168.1.0/24
Non-terminated Pods: (2 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system calico-node-zph59 250m (12%) 0 (0%) 0 (0%) 0 (0%) 36m
kube-system kube-proxy-82vv9 0 (0%) 0 (0%) 0 (0%) 0 (0%) 36m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 250m (12%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 36m kubelet Starting kubelet.
Normal NodeHasSufficientMemory 36m (x2 over 36m) kubelet Node wn1 status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 36m (x2 over 36m) kubelet Node wn1 status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 36m (x2 over 36m) kubelet Node wn1 status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 36m kubelet Updated Node Allocatable limit across pods
Normal NodeReady 35m kubelet Node wn1 status is now: NodeReady

Last lines of the output of kubectl describe pod calico-node-zph59 --namespace=kube-system:

Warning FailedCreatePodSandBox 2m5s (x151 over 35m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to mount container k8s_POD_calico-node-zph59_kube-system_e8d117b7-aa0a-432f-8376-fe45ce85a4fe_0 in pod sandbox k8s_calico-node-zph59_kube-system_e8d117b7-aa0a-432f-8376-fe45ce85a4fe_0(d7ebf9a863e5dfa8018a6e1a233f0726a4ec9f16c73596e6d183f29313f3cd0c): error creating overlay mount to /var/lib/containers/storage/overlay/71684365cde8ce52493971416573fd46038082aaa807b64f55690d1744e53a78/merged, mount_data="nodev,metacopy=on,lowerdir=/var/lib/containers/storage/overlay/l/N6RBQYMDVUFBOBV74KB3GTYZF7,upperdir=/var/lib/containers/storage/overlay/71684365cde8ce52493971416573fd46038082aaa807b64f55690d1744e53a78/diff,workdir=/var/lib/containers/storage/overlay/71684365cde8ce52493971416573fd46038082aaa807b64f55690d1744e53a78/work": invalid argument
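
The useful detail in an event like this is the mount_data string, which is easier to read with one option per line. The following is a rough sketch using sed and tr; the event text is shortened, with <id> standing in for the long layer hashes, and the variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: split the mount_data string from a FailedCreatePodSandBox event
# into one mount option per line.
# The event text is abbreviated; "<id>" replaces the long layer hashes.

event='error creating overlay mount to /var/lib/containers/storage/overlay/<id>/merged, mount_data="nodev,metacopy=on,lowerdir=/var/lib/containers/storage/overlay/l/<id>,upperdir=/var/lib/containers/storage/overlay/<id>/diff,workdir=/var/lib/containers/storage/overlay/<id>/work": invalid argument'

# Keep only the quoted value after mount_data=, then break it on commas.
options=$(printf '%s\n' "$event" |
  sed -n 's/.*mount_data="\([^"]*\)".*/\1/p' |
  tr ',' '\n')

echo "$options"
```

Laid out this way, the options are nodev, metacopy=on, and the three overlay directories. Since the kernel rejects the whole mount with "invalid argument", each option can then be checked against what the running kernel supports.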

elmoussaoui · January 2022

Hi Everyone, I have solved the issue by adding

sudo sed -i 's/,metacopy=on//g' /etc/containers/storage.conf

to k8sSecond.sh after sudo apt-get install -y cri-o cri-o-runc podman buildah

This was an issue related to Ubuntu 18.04.
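
For anyone who wants to preview what that substitution does before touching the node, here is a minimal sketch that applies the same sed expression to a scratch file. The two-line file content is an assumption made for illustration; the real /etc/containers/storage.conf depends on the installed package. As far as I know, overlayfs metacopy support arrived in Linux 4.19, which would explain why the 4.15 kernels shown in this thread reject the option, but treat that as a likely explanation rather than a confirmed one.

```shell
#!/usr/bin/env bash
# Sketch: preview the effect of the sed expression on a scratch copy,
# leaving the real /etc/containers/storage.conf untouched.
# ASSUMPTION: the mountopt line below is illustrative only.

conf=$(mktemp)
cat > "$conf" <<'EOF'
[storage.options.overlay]
mountopt = "nodev,metacopy=on"
EOF

# Same substitution as in the post, written to stdout instead of in place.
result=$(sed 's/,metacopy=on//g' "$conf" | grep '^mountopt')

echo "$result"
rm -f "$conf"
```

If CRI-O is already running when the real file is changed, it will probably need a restart before new pod sandboxes pick up the edited mount options.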

chrispokorni · January 2022

Hi @elmoussaoui,

What type of VMs are you using, and which Ubuntu 18.04 edition did you have installed (desktop or server)?

Regards,
-Chris

elmoussaoui · January 2022

Hi @chrispokorni,

Two VirtualBox VMs connected with a NAT network, running ubuntu-18.04.6-live-server-amd64.

Regards,

chrispokorni · January 2022

Hi @elmoussaoui,

This seems to be encountered on local installs, while cloud Ubuntu 18.04 server images do not display the same behavior.

Regards,
-Chris
