
Issues getting worker node to join cluster in Lab 3.2

As the title says - I'm having issues with the worker node in Lab 3.2.

The install finished successfully on the cp node, and I can complete every step on the worker node up to the "kubeadm join" command. When I run the join command, it hangs at the following step:

[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is not healthy after 4m1.996320433s

When I look at "journalctl -xeu kubelet" for detailed errors, it tells me that conf files are missing/not found:

"Unhandled Error" err="unable to read existing bootstrap client config from /etc/kubernetes/kubelet.conf: invalid configuration
"command failed" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
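To see which of the two files named in those messages is actually on the worker, a small check along these lines could help. This is only a sketch: the function name `kubelet_config_status` and the "missing"/"empty"/"present" labels are made up for illustration, and the only things taken from the log are the two file paths. It may need to be run with sudo depending on file permissions.

```python
from pathlib import Path

# File names copied from the kubelet error messages above.
KUBELET_FILES = ("kubelet.conf", "bootstrap-kubelet.conf")


def kubelet_config_status(config_dir="/etc/kubernetes"):
    """Return a dict mapping each file name to 'missing', 'empty', or 'present'."""
    status = {}
    for name in KUBELET_FILES:
        path = Path(config_dir) / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            # A zero-byte file exists but cannot hold a usable kubeconfig.
            status[name] = "empty"
        else:
            status[name] = "present"
    return status


if __name__ == "__main__":
    for name, state in kubelet_config_status().items():
        print(f"{name}: {state}")
```

The check only reports whether each file exists and has content. It does not validate the kubeconfig itself, so a "present" kubelet.conf can still be the one the kubelet rejects as invalid.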

Wouldn't it be expected behavior that the conf files are not there until after joining the cluster and receiving them from the cp? Are there steps missing in the lab instructions?

I am running the lab on local VMs. I can confirm that they have basic connectivity on all ports.

Thanks for the help!

Comments

  • chrispokorni Posts: 2,533

    Hi @vtvash,

    The lab guide is complete with the steps necessary to successfully bootstrap a two node Kubernetes cluster.

    However, the steps will fail if the virtual environment is inadequate. Each VM should have 2 vCPUs, 8 GB RAM, a 20 GB virtual disk (fully allocated), a single bridged network interface, and an IP address that does not overlap the 10.96.0.0/12 or 192.168.0.0/16 ranges. The guest OS should be Ubuntu 24.04 LTS. The hypervisor firewall should allow all incoming traffic from all sources, over all protocols, to all ports (promiscuous mode enabled and set to allow-all).
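The address requirement above can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `overlapping_ranges` and the sample addresses are illustrative assumptions, and only the two CIDR ranges come from the requirements listed here.

```python
import ipaddress

# The two ranges the node addresses must stay out of, as listed above.
RESERVED_RANGES = ("10.96.0.0/12", "192.168.0.0/16")


def overlapping_ranges(node_ip, reserved=RESERVED_RANGES):
    """Return the reserved CIDR ranges that contain node_ip (empty list if none)."""
    addr = ipaddress.ip_address(node_ip)
    return [cidr for cidr in reserved if addr in ipaddress.ip_network(cidr)]


if __name__ == "__main__":
    # Hypothetical node addresses, used for illustration only.
    for ip in ("10.0.0.5", "10.100.0.5", "192.168.1.20", "172.16.4.10"):
        hits = overlapping_ranges(ip)
        print(ip, "overlaps " + ", ".join(hits) if hits else "is fine")
```

Note that 10.96.0.0/12 spans 10.96.0.0 through 10.111.255.255, so an address such as 10.0.0.5 is outside it even though both are in the 10.0.0.0/8 private block.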

    What is the host OS and architecture? What hypervisor are you running? Is nested virtualization enabled?

    Are there any work-related security controls active on your host?

    Regards,
    -Chris

  • vtvash Posts: 2

    The VMs meet the minimum requirements, are on a bridged network interface, and use an RFC 1918 address range that avoids the two ranges you mention (I don't see those /12 and /16 ranges listed anywhere, but thankfully the addresses fit). The network policy is wide open.

    All that being said, the error talks about a conf file on the local machine. Based on the pre-checks, it appears to be validating the local config before it even attempts network connectivity. Are the outputs a red herring? Is there anything else I can do to try to make it work? Do I just need to blow the VM away, start over, and hope that works?
