3.2 Node Not Ready

ilmx
ilmx Posts: 18
edited October 2023 in LFS258 Class Forum

I followed the steps in the PDF to add a node to the cluster, but the node reports a "NotReady" status.

Reason          Message
------          -------
KubeletNotReady container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

What could be the issue? I am using local VMs, and network communication between them works, since the node was able to join the cluster.

Answers

  • chrispokorni
    chrispokorni Posts: 2,349

    Hi @ilmx,

    There could be several reasons why the node remains in a NotReady state. What hypervisor are you using, what are the sizes (CPU, memory, disk) of the VMs, what guest OS are the VMs running (distribution, release), what type of network are the VMs attached to, and is all ingress traffic enabled at the hypervisor level?

    Regards,
    -Chris

  • Is kubelet running?

  • ilmx
    ilmx Posts: 18

    Hi,

    I am using VMware Workstation. I have created a Host-Only network for them, and assigned IPs statically to each machine. No firewall in place. They can communicate with each other (ping, netcat, etc).

    I have created one CP and two worker nodes. The VMs are exactly the same:

    2 CPU / 4 GB RAM / 500 GB disk
    Distributor ID: Ubuntu
    Description:    Ubuntu 22.04 LTS
    Release:        22.04
    Codename:       jammy
    

    The cilium pods on the CP run fine, but the ones on the worker nodes (WN) don't:

    cilium-operator-788c7d7585-592lg   0/1     ContainerCreating   0   5h15m   10.10.10.4      wn2
    cilium-operator-788c7d7585-hr5tk   0/1     ContainerCreating   0   5h15m   10.10.10.3      wn1
    cilium-tk29d                       0/1     Init:0/6            0   5h15m   10.10.10.3      wn1
    cilium-vth84                       0/1     Init:0/6            0   5h13m   10.10.10.4      wn2
    cilium-zjqdb                       1/1     Running             0   5h13m   10.10.10.2      cp 
    coredns-5d78c9869d-lpnbq           1/1     Running             0   5h13m   192.168.0.214   cp 
    coredns-5d78c9869d-n8xd2           1/1     Running             0   5h13m   192.168.0.246   cp 
    etcd-cp                            1/1     Running             0   2d23h   10.10.10.2      cp
    kube-apiserver-cp                  1/1     Running             2   2d23h   10.10.10.2      cp
    kube-controller-manager-cp         1/1     Running             17  2d23h   10.10.10.2      cp
    kube-proxy-26bvd                   1/1     Running             0   2d23h   10.10.10.2      cp
    kube-proxy-lzf86                   0/1     ContainerCreating   0   43h     10.10.10.4      wn2
    kube-proxy-r8pz6                   0/1     ContainerCreating   0   5h13m   10.10.10.3      wn1
    kube-scheduler-cp                  1/1     Running             16  5h13m   10.10.10.2      cp
    

    I tried to create a second node to start from scratch but the same issue happened.

    I made a change to the provided YAML file (cilium-cni.yaml) because I realised there was a network CIDR defined inside, so I changed it to 10.10.10.0/16 to match my network. Then I re-applied it, but nothing changed.

    I am not sure how to investigate further.
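
One way to narrow this down is to look only at the Pods that are not fully ready and see which node they sit on. The helper below is a minimal sketch, not an official tool: `not_ready_pods` is a made-up name, and it assumes the seven-column layout of the listing above (name, ready, status, restarts, age, IP, node) with a plain number in the restarts column.

```shell
#!/bin/sh
# not_ready_pods: hypothetical helper, not part of kubectl.
# Reads lines shaped like `kubectl get pods -o wide --no-headers` on stdin
# and prints "node pod status" for every Pod whose READY column is not n/n.
# Assumes RESTARTS is a single number; a value such as "2 (5h ago)" would
# shift the columns and break the field positions used here.
not_ready_pods() {
  awk '{
    split($2, ready, "/")                          # READY column, e.g. "0/1"
    if (ready[1] != ready[2]) print $7, $1, $3     # NODE, NAME, STATUS
  }'
}

# Against a live cluster (skipped when kubectl is not installed):
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n kube-system get pods -o wide --no-headers | not_ready_pods
fi
```

From there, `kubectl -n kube-system describe pod <pod-name>` on one of the listed Pods shows its events, which usually name the step that is failing (for example, an image that cannot be pulled).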

  • ilmx
    ilmx Posts: 18

    @kishorevaishnav said:
    Is kubelet running?

    Yes, kubelet is running on every machine:

    $ systemctl status kubelet
    ● kubelet.service - kubelet: The Kubernetes Node Agent
         Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
        Drop-In: /etc/systemd/system/kubelet.service.d
                 └─10-kubeadm.conf
         Active: active (running) since Tue 2023-10-24 20:36:24 UTC; 2 days ago
    
  • chrispokorni
    chrispokorni Posts: 2,349

    Hi @ilmx,

    Please use the OS release recommended by the lab guide - Ubuntu 20.04 LTS.
    The CPU seems adequate per VM, memory could be increased to 8 GB per VM, and 20-50 GB disk space should suffice for the purpose of the lab exercises.
    All VMs should have access to/from the internet, to/from each other. All ingress traffic to the VMs from all sources, to all ports and all protocols should be allowed. Please ensure the network type selected satisfies these requirements.
    Defining a Cilium Pod network CIDR that is distinct from your existing VM network will help to understand the different networks that take part in the Kubernetes traffic management.

    Regards,
    -Chris
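
To check whether a Pod CIDR collides with the VM network, the comparison can be done with a little shell arithmetic. This is a rough sketch for IPv4 only; the function names are made up, and the /24 prefix for the 10.10.10.x VM network is an assumption, since the thread does not state the netmask.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of kubeadm or Cilium.

# ip_to_int: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# cidr_overlap: print "overlap" or "distinct" for two IPv4 CIDRs.
# Two ranges overlap exactly when they agree on the shorter of the two prefixes.
cidr_overlap() (
  ip1=${1%/*}; len1=${1#*/}
  ip2=${2%/*}; len2=${2#*/}
  if [ "$len1" -lt "$len2" ]; then len=$len1; else len=$len2; fi
  if [ "$len" -eq 0 ]; then
    mask=0
  else
    # 4294967295 is 0xFFFFFFFF; keep only the top $len bits
    mask=$(( (4294967295 << (32 - $len)) & 4294967295 ))
  fi
  n1=$(ip_to_int "$ip1")
  n2=$(ip_to_int "$ip2")
  if [ $(( $n1 & $mask )) -eq $(( $n2 & $mask )) ]; then
    echo overlap
  else
    echo distinct
  fi
)

# VM network (prefix length assumed) against the Pod CIDR from the earlier post:
cidr_overlap 10.10.10.0/24 10.10.10.0/16    # prints: overlap
# Same VM network against an example Pod CIDR in a different private range:
cidr_overlap 10.10.10.0/24 192.168.0.0/16   # prints: distinct
```

A result of `overlap` means Pod and node addresses can be confused with each other, which is what keeping the two ranges distinct avoids.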

  • ilmx
    ilmx Posts: 18

    Thanks. I will also try with Ubuntu 20.04, but I would like to understand how to solve the problem. The network meets the requirements except for Internet access (although I can work around this to, e.g., pull a Pod's image). But I believe the lack of Internet access should not be an issue for my specific problem, should it?

    Is there anything specific I can do to investigate the "cni plugin not initialized" error?
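
This message typically comes from the container runtime when it finds no usable CNI configuration on the node, which is normally the case until the CNI agent Pod (here, cilium) has started there and written its config. A minimal sketch of two node-side checks follows; `cni_conf_count` is a made-up helper, and `/etc/cni/net.d` is the conventional default directory, which a runtime can be configured to override.

```shell
#!/bin/sh
# cni_conf_count: hypothetical helper. Prints how many CNI config files
# (*.conf, *.conflist, *.json) exist directly in the given directory.
# Defaults to /etc/cni/net.d, the conventional location (an assumption here).
cni_conf_count() {
  dir=${1:-/etc/cni/net.d}
  find "$dir" -maxdepth 1 -type f \
    \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) 2>/dev/null |
    wc -l | tr -d ' '
}

# Run on the NotReady worker node. A count of 0 matches the error above.
echo "CNI config files: $(cni_conf_count)"

# Recent kubelet log lines that mention CNI or the network plugin
# (skipped on systems without journalctl):
if command -v journalctl >/dev/null 2>&1; then
  journalctl -u kubelet --no-pager -n 200 2>/dev/null |
    grep -iE 'cni|network plugin' || true
fi
```

If the count stays at 0, the next thing to look at is why the cilium Pod on that node never gets past its init containers.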

  • marco.ferretti
    marco.ferretti Posts: 15
    edited December 2023

    Hi,
    I ran into the same issue and it took me almost a week to solve it.
    I first took the course two versions ago, when the CNI plugin was Calico, and it worked like a charm. Unfortunately, the instructors, or whoever is supposed to test the material, did not do a great job with this release.
    The bottom line is that, despite the course documentation claiming the labs "have been written to be vendor-agnostic so could run on AWS, local hardware, or inside of virtualization to give you the most flexibility and options", they are not. I presume the material takes for granted some prerequisites that are not covered when it explains how to install the required software and spin up the cluster.

    In particular, to get the worker nodes to connect in my VirtualBox environment, I had to drop the kubeadm-config.yaml step and initialize the control plane with:

    kubeadm init --pod-network-cidr=$PODSUBNET --skip-phases=addon/kube-proxy --control-plane-endpoint $CONTROL_IP

    Here $CONTROL_IP is the IP address of my VirtualBox control plane node, $PODSUBNET is the default Pod subnet of Cilium, and --skip-phases=addon/kube-proxy skips the installation of kube-proxy.

    Then I added an extra parameter to my kubelet:

    echo "Environment=\"KUBELET_EXTRA_ARGS=--node-ip=$CONTROL_IP\"" | sudo tee -a /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

    Finally, I installed the cilium CLI and let it handle the deployment of the Cilium CNI:

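    # Download the latest stable cilium CLI release (amd64 assumed) and verify its checksum
    CLI_ARCH=amd64
    CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/master/stable.txt)
    curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
    sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
    # Unpack the binary into /usr/local/bin and clean up the downloads
    sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
    rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
    # CILIUM_VERSION (the Cilium release to deploy) is not defined above; set it before running this line
    sudo cilium install --version ${CILIUM_VERSION}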

    At the end I just enabled Hubble (I don't think it's necessary):

    sudo cilium hubble enable
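
An alternative for the kubelet `--node-ip` step, on Debian or Ubuntu packages, is to put the flag in `/etc/default/kubelet`, which the kubeadm drop-in reads as an environment file, rather than appending to `10-kubeadm.conf`. The sketch below only renders the line; `kubelet_node_ip_line` is a made-up name, the IP shown is an example, and each node would need its own address.

```shell
#!/bin/sh
# kubelet_node_ip_line: hypothetical helper. Prints the KUBELET_EXTRA_ARGS
# line for a given node IP. The check is only a rough sanity test
# (digits and dots), not full IPv4 validation.
kubelet_node_ip_line() {
  case $1 in
    ''|*[!0-9.]*)
      echo "not an IPv4 address: $1" >&2
      return 1
      ;;
  esac
  printf 'KUBELET_EXTRA_ARGS=--node-ip=%s\n' "$1"
}

# Preview the line for an example worker IP:
kubelet_node_ip_line 10.10.10.3

# To apply it on that node (left commented out because it changes system
# files; note that tee overwrites any existing /etc/default/kubelet):
#   kubelet_node_ip_line 10.10.10.3 | sudo tee /etc/default/kubelet
#   sudo systemctl daemon-reload && sudo systemctl restart kubelet
```

Keeping the flag in the environment file leaves the packaged drop-in untouched, so a package upgrade that rewrites `10-kubeadm.conf` would not silently remove it.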
