
Lab 3.1 Struggling to get a functional cluster

lee.muzzy Posts: 5
edited February 4 in LFS258 Class Forum

Hey, I'm having a lot of trouble getting a cluster installed correctly. I am able to run the init command successfully, but whenever I try to print a join command or check the status, I get the mess in the image below.
Errors:

This is what is in my kubeadm config file

And this is what is in my hosts file

Any help appreciated!

Some extra notes: I am using Microsoft Azure Pods, and I usually stop and restart them in between working on the class, if that is an issue.

Best Answer

  • chrispokorni Posts: 2,419
    Answer ✓

    Hi @lee.muzzy,

    The clarification is in the screenshot you provided of step 19. If you take a closer look, you see the control plane node hostname is "cp" (as suggested by the prompt root@cp:~#), while its alias in the hosts file is k8scp.
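
    A minimal sketch of how that mismatch can be checked from a shell. The lookup_alias helper and the sample address 10.0.0.4 are hypothetical and only for illustration; on the lab VM the file to inspect would be /etc/hosts, with whatever private IP the control plane node actually has.

```shell
#!/bin/sh
# Print the IP address that a hosts-format file maps to a given name.
# $1 = path to the hosts file, $2 = name (alias) to look up.
lookup_alias() {
  awk -v name="$2" '
    $1 !~ /^#/ {                      # skip comment lines
      for (i = 2; i <= NF; i++)
        if ($i == name) { print $1; exit }
    }' "$1"
}

# Hypothetical sample file; the real one is /etc/hosts and the IP will differ.
sample="$(mktemp)"
printf '127.0.0.1 localhost\n10.0.0.4 k8scp\n' > "$sample"

alias_ip="$(lookup_alias "$sample" k8scp)"
echo "alias k8scp resolves to: ${alias_ip}"

# The node's own hostname; per the lab guide prompt this should be cp, not k8scp.
echo "node hostname: $(uname -n)"
```

    If the second line prints k8scp instead of a separate name such as cp, the alias has been used as the actual hostname.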

    Regards,
    -Chris

Answers

  • chrispokorni Posts: 2,419

    Hi @lee.muzzy,

    Before we attempt to sort out the cluster, please clarify which course you are enrolled in.

    You are posting in the LFD259 forum, yet the screenshots seem to display configuration options from the LFS258 course - some slightly incorrect though.

    Also, please describe what the "Microsoft Azure Pods" hosting your lab environment systems are. Are they perhaps Azure Virtual Machines (VMs)?

    Regards,
    -Chris

  • @chrispokorni Hey, thanks. Yes, this is class LFS258. I'm not sure how I got to this forum; I thought I followed the link, but I guess I was wrong. I will repost in the correct forum. Yes, I meant Azure Virtual Machines. My brain is a bit fried from spending the last 3 hours trying to figure this out, haha.

  • chrispokorni Posts: 2,419

    No need to repost.
    I can move this to the correct forum.

  • chrispokorni Posts: 2,419

    Hi @lee.muzzy,

    There are a few inconsistencies I need to point out from the screenshots above:
    1 - The kubectl command should not work when run by root. It was configured to run as a regular user; please review and follow the lab guide steps as presented.
    2 - The k8scp name was intended to be only an alias for the control plane node, not the actual hostname of the control plane node.
    3 - The intended version for the initial cluster bootstrapping was 1.31.1. However, deviating to 1.31.5 should not prevent you from moving forward.
    4 - Before launching the Cilium CNI, please correct the cilium-cni.yaml manifest around line 222, so the entry is:

      cluster-pool-ipv4-cidr: "192.168.0.0/16"
    

    (replacing "10.0.0.0/8").
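
    As a rough sketch, the edit can be scripted with sed. The block below works on a temporary one-line stand-in for the manifest (hypothetical, since the real file is much longer), and the fix_pod_cidr helper name is an assumption. Pointing it at the real cilium-cni.yaml should behave the same way, provided the line still reads as quoted above.

```shell
#!/bin/sh
# Replace the default pod CIDR with the one requested in this thread.
# $1 = path to the manifest; a backup is kept next to it as <file>.bak.
fix_pod_cidr() {
  sed -i.bak \
    's|cluster-pool-ipv4-cidr: "10.0.0.0/8"|cluster-pool-ipv4-cidr: "192.168.0.0/16"|' \
    "$1"
}

# Hypothetical stand-in for the relevant line of cilium-cni.yaml.
manifest="$(mktemp)"
printf '  cluster-pool-ipv4-cidr: "10.0.0.0/8"\n' > "$manifest"

fix_pod_cidr "$manifest"

# Show the line (with its line number) to confirm the new value.
grep -n 'cluster-pool-ipv4-cidr' "$manifest"
```

    Running the grep on the real manifest before and after the change is a quick way to confirm the entry around line 222 was actually updated.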

    For VM size (CPU, RAM, disk size), recommended OS, and firewall/security group requirements please watch the two video guides for setting up cloud VMs on GCP and AWS. The requirements are similar for Azure cloud also.

    Regards,
    -Chris

  • Thanks @chrispokorni. It appears my CP is crashing shortly after initializing. Is there a way to obtain logs?
    Here are all the commands I ran after init

  • chrispokorni Posts: 2,419

    Hi @lee.muzzy,

    Did you manage to fix the inconsistencies I pointed out earlier, before initializing the cluster? From the little information revealed by the latest screenshots, it seems the hostname of the control plane node is still the intended alias, and I do not see where the cilium-cni.yaml manifest is updated.

    Also, it is still unclear to me what OS runs on the VMs, what the sizes of your VMs are (CPU, RAM, disk), and how the firewall is configured to filter inbound traffic to the VMs.

    Try to reboot the control plane node to allow the installed software to reset itself.

    After reboot, try to display the state of the kubelet and containerd services:

    sudo systemctl status kubelet
    sudo systemctl status containerd
    

    If possible, also try the following commands to display the state of your node and any available pods:

    kubectl get nodes -o wide
    kubectl get pods -A -o wide
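
    Regarding the earlier question about obtaining logs: a minimal sketch, assuming the node uses systemd and containerd as in this thread, and that crictl is installed and configured for containerd (an assumption; the thread does not confirm it). The run_diag and collect_cp_logs helper names are hypothetical. Run it as root or via sudo so the journal and the container runtime are readable.

```shell
#!/bin/sh
# Run one diagnostic command if its binary exists; never abort the script.
run_diag() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "== $*"
    "$@" 2>&1 | tail -n 40          # keep only the most recent lines
  else
    echo "== skipped (not installed): $1"
  fi
}

# Collect recent service logs and the container list on the control plane node.
collect_cp_logs() {
  run_diag journalctl -u kubelet --no-pager -n 40
  run_diag journalctl -u containerd --no-pager -n 40
  run_diag crictl ps -a             # includes exited (crashed) containers
}

collect_cp_logs
```

    The crictl listing is useful when the API server itself keeps crashing, since kubectl cannot reach the cluster in that state.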
    

    Regards,
    -Chris

  • lee.muzzy Posts: 5
    edited February 6

    @chrispokorni Thank you, I will try what you suggested. However, I don't understand what you mean by "The k8scp name was intended to be only an alias to the control plane node, not the actual hostname of the control plane node." The instructions seem to tell me that is exactly what I should use. What should I put instead of k8scp?

  • @chrispokorni Thanks. I did update the hostname, but it did not fix my issue. So I started from scratch, made sure all my commands ran correctly and that I used the correct operating system (I was on a newer version), and it worked! I appreciate your help and you answering my questions; it was super helpful!
