Lab 2.2 - 'kubeadm init' fails

Hello,

When deploying a Control Plane Node using kubeadm (exercise 2.2), I'm launching 'sudo kubeadm init --config=$(find / -name kubeadm.yaml 2>/dev/null )' (command taken from k8scp.sh) which ultimately fails with:

I1129 18:10:45.586621    1407 checks.go:855] pulling: k8s.gcr.io/coredns:v1.8.4
[preflight] Some fatal errors occurred:
    [ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns:v1.8.4: output: time="2021-11-29T18:10:48+01:00" level=fatal msg="pulling image: rpc error: code = Unknown desc = reading manifest v1.8.4 in k8s.gcr.io/coredns: manifest unknown: Failed to fetch \"v1.8.4\" from request \"/v2/coredns/manifests/v1.8.4\"."
, error: exit status 1

Any suggestions on how to fix it?

Thanks.

Answers

  • Hi @fbui,

    Is it possible there is a firewall that blocks access to the Google Container Registry? This could be a guest system firewall, and/or an infrastructure level firewall (at hypervisor level, VPC, etc.). Sometimes corporate VPNs may block such connection attempts.

    Regards,
    -Chris

  • fbui (Posts: 3)

    Hi,

    I turned the firewall off as advised in the exercise preliminary, so I don't think the problem is related.

    Pulling the image manually gave me the same error:

    crictl -r unix:///var/run/crio/crio.sock pull k8s.gcr.io/coredns:v1.8.4
    FATA[0000] pulling image: rpc error: code = Unknown desc = reading manifest v1.8.4 in k8s.gcr.io/coredns: manifest unknown: Failed to fetch "v1.8.4" from request "/v2/coredns/manifests/v1.8.4".
    

    It seems that the issue is more related to the version of the manifest.

  • chrispokorni (Posts: 2,274)
    edited November 2021

    Hi @fbui,

    Your pull command seems to be incomplete: the image path should be k8s.gcr.io/coredns/coredns, not k8s.gcr.io/coredns.

    I attempted to bootstrap a control-plane node following all the steps from the lab guide as provided, and it was successful. I also tried to pull the image with crictl and podman, and neither of them complained:

    sudo crictl pull k8s.gcr.io/coredns/coredns:v1.8.4

    sudo podman image pull k8s.gcr.io/coredns/coredns:v1.8.4

    If these are not successful, I would start by looking at the networking configuration of the VMs, the local hypervisor or the cloud VPC, any associated firewalls, security groups, etc. I would also recommend watching the AWS/GCP setup videos from the introductory chapter, as they include essential tips for setting up the environment.

    Regards,
    -Chris

  • fbui (Posts: 3)

    Hi,

    Both commands

    crictl pull k8s.gcr.io/coredns/coredns:v1.8.4
    podman image pull k8s.gcr.io/coredns/coredns:v1.8.4

    succeeded.

    Therefore it seems that the issue is due to
    sudo kubeadm init --config=$(find / -name kubeadm.yaml 2>/dev/null )

    The command (taken from k8scp.sh) launches crictl with the wrong image path, i.e. k8s.gcr.io/coredns:v1.8.4 instead of k8s.gcr.io/coredns/coredns:v1.8.4.
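
    A minimal sketch for checking which CoreDNS image reference kubeadm resolves from the config before running init. kubeadm config images list --config is an existing kubeadm subcommand; the helper name coredns_path_style is hypothetical, and the script assumes the config is the first kubeadm.yaml that find returns.

    ```shell
    # Hypothetical helper: classify a CoreDNS image reference by its path.
    # "nested" is the coredns/coredns form that pulled successfully above;
    # "flat" is the form that failed with "manifest unknown".
    coredns_path_style() {
      case "$1" in
        */coredns/coredns:*) echo nested ;;
        */coredns:*)         echo flat ;;
        *)                   echo other ;;
      esac
    }

    # Only runs where kubeadm is installed; prints each resolved CoreDNS image.
    if command -v kubeadm >/dev/null 2>&1; then
      # Assumption: the first kubeadm.yaml found is the one the lab uses.
      cfg=$(find / -name kubeadm.yaml 2>/dev/null | head -n 1)
      if [ -n "$cfg" ]; then
        kubeadm config images list --config "$cfg" | while read -r ref; do
          case "$ref" in
            *coredns*) echo "$ref -> $(coredns_path_style "$ref")" ;;
          esac
        done
      fi
    fi
    ```

    If the output shows "flat", the path is decided by what kubeadm resolves (possibly the kubeadm version or an imageRepository setting in kubeadm.yaml) rather than by crictl itself. That is a possible explanation to check, not a confirmed diagnosis.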

  • Hi @fbui,

    Can you provide the line where crictl is called from your kubeadm.yaml file? In my version of the lab guide and SOLUTIONS there is no such call, but it all works as expected every single time I run the kubeadm init command.

    Since this behavior cannot be reproduced, can you describe your environment? What VMs are you using, on what platform (what hypervisor, what cloud, what instance types, CPU, MEM, OS), what type of VM networking have you configured, any firewalls, etc.? Do you have a history of commands all the way up to the step where you noticed the error? Can you provide the cp.out file?

    Regards,
    -Chris

  • I have the same issue on ubuntu-1804-lts (running in GCP). I uploaded the cp.out file; could you please help?

    $ sudo kubeadm init --config=$(find / -name kubeadm.yaml 2>/dev/null )
    W0320 23:41:47.568929 8885 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
    [init] Using Kubernetes version: v1.23.1
    [preflight] Running pre-flight checks
    error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR CRI]: container runtime is not running: output: time="2022-03-20T23:41:49Z" level=fatal msg="connect: connect endpoint 'unix:///var/run/crio/crio.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded"
    , error: exit status 1
    [preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...
    To see the stack trace of this error execute with --v=5 or higher

  • akovi (Posts: 16)

    @bvssnraju, @fbui You must use Ubuntu 20.04 as the base OS.
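
    A quick way to confirm the base OS before bootstrapping, sketched under the assumption that /etc/os-release is present (it is on Ubuntu). The function name is_ubuntu_2004 is hypothetical.

    ```shell
    # Hypothetical helper: succeed only if the given os-release file
    # (default /etc/os-release) identifies Ubuntu 20.04.
    is_ubuntu_2004() {
      (
        # os-release is a list of shell-compatible KEY=value assignments.
        . "${1:-/etc/os-release}" 2>/dev/null || exit 1
        [ "${ID:-}" = "ubuntu" ] && [ "${VERSION_ID:-}" = "20.04" ]
      )
    }

    if is_ubuntu_2004; then
      echo "Base OS is Ubuntu 20.04"
    else
      echo "Base OS is not Ubuntu 20.04; the lab expects 20.04"
    fi
    ```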

  • dowzer (Posts: 2)

    Thanks, @akovi.

    @akovi said:
    > @bvssnraju, @fbui You must use Ubuntu 20.04 as the base OS.

    I think this is quite an important one. Did you consider adding a remark on the lab page?
    I was confused when the cluster stopped working for me without a reason.

    Also, did you notice that for Ubuntu 20.04 there is a warning message when running the init script?

    Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
    poddisruptionbudget.policy/calico-kube-controllers created

  • serewicz (Posts: 1,000)

    Hello,

    The lab guide does call out the use of Ubuntu 20.04. It is mentioned in a few places, most notably in the Overview section before the installation lab.

    Also, the warning you see comes from CNCF software. I assume they are aware of the warning and will be updating what kubeadm uses soon. You can search through the issues under https://github.com/kubernetes and add one if there isn't already work being done to take care of the warning.

    Regards,

  • dowzer (Posts: 2)

    Thank you @serewicz!

  • juetten (Posts: 3)

    Hello there,

    I'm also experiencing this kind of problem.

    My setup is:

    • Vagrant VirtualBox Ubuntu 20.04
    • apparmor stopped and disabled
    • ENV prepared with company proxy, all (http|https|no)_proxy variables set

    Running kubeadm init ... fails to pull the necessary images, whereas podman is able to pull them.

    I finally ended up:
    1. running kubeadm init and letting it fail
    2. pulling the missing image with podman
    3. repeating from step 1

    Any advice on how to overcome this is highly appreciated :)
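
    The fail, pull, repeat loop above can be collapsed into a single pre-pull pass, sketched here. kubeadm config images list and podman pull are existing commands; pull_all is a hypothetical helper, and the sketch assumes (as the workaround above suggests) that images pulled by podman as root are visible to the container runtime kubeadm uses.

    ```shell
    # Hypothetical helper: read image references on stdin and run the given
    # pull command on each; report failures and return non-zero if any failed.
    pull_all() {
      rc=0
      while read -r ref; do
        [ -n "$ref" ] || continue
        # </dev/null keeps the pull command from consuming the image list.
        "$@" "$ref" </dev/null || { echo "failed: $ref" >&2; rc=1; }
      done
      return "$rc"
    }

    # Only runs where both tools are installed.
    if command -v kubeadm >/dev/null 2>&1 && command -v podman >/dev/null 2>&1; then
      # Assumption: the first kubeadm.yaml found is the one the lab uses.
      cfg=$(find / -name kubeadm.yaml 2>/dev/null | head -n 1)
      if [ -n "$cfg" ]; then
        kubeadm config images list --config "$cfg" | pull_all sudo podman pull
      fi
    fi
    ```

    Since podman can pull while kubeadm cannot, it may also be worth checking whether the container runtime service itself has the proxy settings: a systemd service does not inherit variables exported in a login shell.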

  • chrispokorni (Posts: 2,274)

    Hi @juetten,

    What errors did you see while kubeadm init was running?

    After the cluster initialized, were you able to run the basicpod application and the firstpod deployment as instructed by Lab exercises 2.3 and 2.5 respectively?

    Regards,
    -Chris
