
Lab 2.2 - 'kubeadm init' fails

Hello,

When deploying a Control Plane Node using kubeadm (Exercise 2.2), I'm running 'sudo kubeadm init --config=$(find / -name kubeadm.yaml 2>/dev/null )' (command taken from k8scp.sh), which ultimately fails with:

I1129 18:10:45.586621    1407 checks.go:855] pulling: k8s.gcr.io/coredns:v1.8.4
[preflight] Some fatal errors occurred:
    [ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns:v1.8.4: output: time="2021-11-29T18:10:48+01:00" level=fatal msg="pulling image: rpc error: code = Unknown desc = reading manifest v1.8.4 in k8s.gcr.io/coredns: manifest unknown: Failed to fetch \"v1.8.4\" from request \"/v2/coredns/manifests/v1.8.4\"."
, error: exit status 1

Any suggestions on how to fix it?

Thanks.

Answers

  • Hi @fbui,

    Is it possible there is a firewall that blocks access to the Google Container Registry? This could be a guest system firewall, and/or an infrastructure level firewall (at hypervisor level, VPC, etc.). Sometimes corporate VPNs may block such connection attempts.
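
    A quick way to test this from inside the guest (a general connectivity check, not a lab step; it assumes curl and nslookup are installed):

    # the registry endpoint should answer with an HTTP response if it is reachable
    curl -sSIL https://k8s.gcr.io/v2/
    # and the registry hostname should resolve
    nslookup k8s.gcr.io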

    Regards,
    -Chris

  • fbui Posts: 3

    Hi,

    I turned the firewall off as advised in the exercise preliminaries, so I don't think the problem is related to that.

    Pulling the image manually gave me the same error:

    crictl -r unix:///var/run/crio/crio.sock pull k8s.gcr.io/coredns:v1.8.4
    FATA[0000] pulling image: rpc error: code = Unknown desc = reading manifest v1.8.4 in k8s.gcr.io/coredns: manifest unknown: Failed to fetch "v1.8.4" from request "/v2/coredns/manifests/v1.8.4".
    

    It seems that the issue is more related to the version of the manifest.

  • chrispokorni Posts: 2,155
    edited November 2021

    Hi @fbui,

    Your pull command seems to be incomplete.

    I attempted to bootstrap a control-plane node following all the steps from the lab guide as provided, and it was successful. I also tried to pull the image with both crictl and podman, and neither of them complained:

    sudo crictl pull k8s.gcr.io/coredns/coredns:v1.8.4

    sudo podman image pull k8s.gcr.io/coredns/coredns:v1.8.4

    If these are not successful, I would start by looking at the networking configuration of the VMs, the local hypervisor or the cloud VPC, any associated firewalls, security groups, etc. I would also recommend watching the AWS/GCP setup videos from the introductory chapter, as they include essential tips for setting up the environment.
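
    If those pulls work, all the images kubeadm needs can also be pre-pulled before running init, reusing the same config lookup as the lab command (a general kubeadm feature, not a lab step):

    sudo kubeadm config images pull --config=$(find / -name kubeadm.yaml 2>/dev/null)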

    Regards,
    -Chris

  • fbui Posts: 3

    Hi,

    Both commands

    crictl pull k8s.gcr.io/coredns/coredns:v1.8.4
    podman image pull k8s.gcr.io/coredns/coredns:v1.8.4

    succeeded.

    Therefore it seems that the issue is due to
    sudo kubeadm init --config=$(find / -name kubeadm.yaml 2>/dev/null )

    The command (taken from k8scp.sh) makes crictl pull the image from the wrong path, i.e. k8s.gcr.io/coredns:v1.8.4 instead of k8s.gcr.io/coredns/coredns:v1.8.4.
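
    One way to confirm which image names kubeadm actually resolves from the config (a diagnostic sketch reusing the same config lookup):

    sudo kubeadm config images list --config=$(find / -name kubeadm.yaml 2>/dev/null)

    If the resolved name really is k8s.gcr.io/coredns:v1.8.4, a possible workaround (assuming root podman and CRI-O share the default image store) is to pull the image from its new path and re-tag it under the name kubeadm expects:

    sudo podman pull k8s.gcr.io/coredns/coredns:v1.8.4
    sudo podman tag k8s.gcr.io/coredns/coredns:v1.8.4 k8s.gcr.io/coredns:v1.8.4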

  • Hi @fbui,

    Can you provide the line where crictl is called from your kubeadm.yaml file? In my version of the lab guide and SOLUTIONS there is no such call, but it all works as expected every single time I run the kubeadm init command.

    Since this behavior cannot be reproduced, can you describe your environment? What VMs are you using, on what platform (hypervisor, cloud, instance type, CPU, memory, OS), what type of VM networking have you configured, any firewalls, etc.? Do you have a history of the commands all the way up to the step where you noticed the error? Can you provide the cp.out file?

    Regards,
    -Chris

  • I have the same issue on ubuntu-1804-lts (running in GCP). I have uploaded the cp.out file; could you please help?

    $ sudo kubeadm init --config=$(find / -name kubeadm.yaml 2>/dev/null )
    W0320 23:41:47.568929 8885 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
    [init] Using Kubernetes version: v1.23.1
    [preflight] Running pre-flight checks
    error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR CRI]: container runtime is not running: output: time="2022-03-20T23:41:49Z" level=fatal msg="connect: connect endpoint 'unix:///var/run/crio/crio.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded"
    , error: exit status 1
    [preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...
    To see the stack trace of this error execute with --v=5 or higher
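
    For this particular error, a first check (a general CRI-O troubleshooting step, not a lab instruction) is whether the crio service is running and answering on that socket:

    sudo systemctl status crio
    sudo systemctl enable --now crio        # start it if it is not running
    sudo crictl --runtime-endpoint unix:///var/run/crio/crio.sock info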

  • akovi Posts: 16

    @bvssnraju, @fbui You must use Ubuntu 20.04 as the base OS.
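
    A quick way to confirm the base OS version on a node (general commands, not from the lab guide):

    lsb_release -a
    cat /etc/os-release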

  • dowzer Posts: 2

    Thanks, @akovi!

    > @akovi said:
    > @bvssnraju, @fbui You must use Ubuntu 20.04 as the base OS.

    I think this is quite an important point. Have you considered adding a remark on the lab page?
    I was confused when the cluster stopped working for no apparent reason.

    Also, did you notice that for Ubuntu 20.04 there is a warning message when running the init script?

    Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
    poddisruptionbudget.policy/calico-kube-controllers created
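
    For reference, the warning only concerns the apiVersion of that PodDisruptionBudget; the policy/v1 form it points to looks like this (a minimal illustrative sketch with a hypothetical name, not the actual calico-kube-controllers object):

    apiVersion: policy/v1          # replaces policy/v1beta1
    kind: PodDisruptionBudget
    metadata:
      name: example-pdb            # hypothetical name for illustration
    spec:
      minAvailable: 1
      selector:
        matchLabels:
          app: example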

  • serewicz Posts: 1,000

    Hello,

    The lab guide does call out the use of Ubuntu 20.04. It is mentioned in a few places, notably in the Overview section before the installation lab.

    Also, the warning you see comes from CNCF software. I assume they are aware of it and will be updating what kubeadm uses soon. You can search through the issues under https://github.com/kubernetes and open one if there isn't already work being done to take care of the warning.

    Regards,

  • dowzer Posts: 2

    Thank you @serewicz!

  • juetten Posts: 3

    Hello there,

    I'm also experiencing this kind of problem.

    My setup is:

    • Vagrant VirtualBox Ubuntu 20.04
    • apparmor stopped and disabled
    • ENV prepared with company proxy, all (http|https|no)_proxy variables set

    Running kubeadm init ... fails to pull the necessary images, whereas podman is able to pull them.

    I finally ended up:
    1. running kubeadm init and letting it fail,
    2. pulling the missing image with podman,
    3. repeating from step 1.

    Any advice on how to overcome this is highly appreciated :)

  • chrispokorni Posts: 2,155

    Hi @juetten,

    What errors did you see while kubeadm init was running?

    After the cluster initialized, were you able to run the basicpod application and the firstpod deployment as instructed by Lab exercises 2.3 and 2.5 respectively?
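
    One more thing worth checking in a proxied environment (a general note, and an assumption since your runtime configuration is not shown): kubeadm pulls images through CRI-O, which runs as a systemd service and does not inherit the shell's (http|https|no)_proxy variables, while podman run from your shell does. A minimal sketch of a systemd drop-in that passes the proxy to the service (the proxy address is a placeholder):

    # /etc/systemd/system/crio.service.d/proxy.conf
    [Service]
    Environment="HTTP_PROXY=http://proxy.example.com:3128"
    Environment="HTTPS_PROXY=http://proxy.example.com:3128"
    Environment="NO_PROXY=localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16"

    Then: sudo systemctl daemon-reload && sudo systemctl restart crio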

    Regards,
    -Chris
