
Lab 2.8: Kubeadm init

When I issue the command, "kubeadm init", I receive the following:

[init] Using Kubernetes version: v1.24.7
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR IsPrivilegedUser]: user is not running as root
[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...
To see the stack trace of this error execute with --v=5 or higher

Do we need to run this command as root?
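
For context, the IsPrivilegedUser preflight check fails whenever kubeadm is run without root privileges, so the usual fix is to prefix the command with sudo. A minimal shell sketch of that check follows; the helper name is_root is hypothetical and not part of kubeadm:

```shell
# Hypothetical helper: succeeds only when the given (or current) UID is 0.
is_root() {
  [ "${1:-$(id -u)}" -eq 0 ]
}

if is_root; then
  echo "running as root: kubeadm init should pass the IsPrivilegedUser check"
else
  echo "not root: run 'sudo kubeadm init' instead"
fi
```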

Answers

  • Hi @tstaffordsmith,

    Prior to running the kubeadm init command, did you happen to inspect the k8scp.sh script file for the correct syntax? I would also encourage you to install the Kubernetes version recommended by the lab guide, using the --kubernetes-version 1.24.1 flag.

    Regards,
    -Chris

  • @chrispokorni I started from scratch and received this after running sudo kubeadm init:

    sudo kubeadm init
    [init] Using Kubernetes version: v1.25.4
    [preflight] Running pre-flight checks
    [WARNING SystemVerification]: missing optional cgroups: blkio
    error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR Port-6443]: Port 6443 is in use
    [ERROR Port-10259]: Port 10259 is in use
    [ERROR Port-10257]: Port 10257 is in use
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
    [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
    [ERROR Port-10250]: Port 10250 is in use
    [ERROR Port-2379]: Port 2379 is in use
    [ERROR Port-2380]: Port 2380 is in use
    [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
    [preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...
    To see the stack trace of this error execute with --v=5 or higher
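
    Taken together, these errors suggest that a control plane from an earlier kubeadm init is still present on the node: the ports are the ones used by the API server, etcd, the kubelet, the scheduler and the controller manager, and the manifest files are the ones init writes. A rough sketch for pulling the port numbers out of saved preflight output, so they can be compared with the listening sockets (for example with sudo ss -ltnp); the function name ports_in_use is illustrative only:

    ```shell
    # Hypothetical helper: reads kubeadm preflight output on stdin and prints
    # each port number reported as in use, one per line.
    ports_in_use() {
      grep -o 'ERROR Port-[0-9]*' | sed 's/.*Port-//'
    }

    # Sample lines copied from the output above.
    printf '%s\n' \
      '[ERROR Port-6443]: Port 6443 is in use' \
      '[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty' \
      '[ERROR Port-2379]: Port 2379 is in use' | ports_in_use
    ```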

  • Troubleshot successfully. In case anyone else encounters this issue:

    To resolve '[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty':
    sudo rm -rf /var/lib/etcd

    To resolve '[ERROR FileAvailable .....]':
    sudo kubeadm reset
    sudo kubeadm init

  • chrispokorni (edited November 2022)

    Hi @tstaffordsmith,

    Running consecutive kubeadm init commands will not fix the previous errors. If you really want to start from scratch, run sudo kubeadm reset prior to running init again.

    After a successful reset, run the following command. It should install version 1.25.1, which the latest course release recommends, and it assumes that your pod network plugin (Calico) will manage the 192.168.0.0/16 network:

    sudo kubeadm init --kubernetes-version "1.25.1" --pod-network-cidr "192.168.0.0/16"

    Regards,
    -Chris
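
    The reset-then-init order described above can be sketched as a small script. This is only an illustration: the DRY_RUN variable and the run helper are hypothetical conveniences, and by default the script prints the commands instead of executing them.

    ```shell
    # Hypothetical wrapper: prints each command, and only executes it when DRY_RUN=0.
    DRY_RUN="${DRY_RUN:-1}"
    run() {
      echo "+ $*"
      if [ "$DRY_RUN" = "0" ]; then
        "$@"
      fi
    }

    # Clear the state left by the previous init, then initialize again with the
    # version and pod network CIDR given in the post above.
    run sudo kubeadm reset
    run sudo kubeadm init --kubernetes-version "1.25.1" --pod-network-cidr "192.168.0.0/16"
    ```

    Setting DRY_RUN=0 makes the same script run the commands for real.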

  • Thank you @chrispokorni. The cp node seems to be working.

    Before joining my worker node to my cp node via kubeadm join, I'm trying to install calico on my cp node.

    sudo kubeadm init

    sudo kubeadm init --pod-network-cidr "192.168.0.0/16"

    mkdir -p $HOME/.kube

    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    Then, following Calico's installation instructions here: https://projectcalico.docs.tigera.io/getting-started/kubernetes/quickstart, I ran:

    kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/tigera-operator.yaml


    Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
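
    One common cause of this x509 error, assuming the cluster was initialized more than once, is a $HOME/.kube/config left over from an earlier init: cp -i asks before overwriting, so the old file can survive and still carry the previous cluster's CA. A minimal sketch for checking whether the two files differ; the function name kubeconfig_is_current is illustrative, and reading /etc/kubernetes/admin.conf normally requires sudo:

    ```shell
    # Hypothetical helper: succeeds when the user's kubeconfig is byte-identical
    # to the admin kubeconfig written by the most recent kubeadm init.
    kubeconfig_is_current() {
      local admin_conf="${1:-/etc/kubernetes/admin.conf}"
      local user_conf="${2:-$HOME/.kube/config}"
      cmp -s "$admin_conf" "$user_conf" 2>/dev/null
    }

    if kubeconfig_is_current; then
      echo "kubeconfig matches admin.conf"
    else
      echo "kubeconfig differs from admin.conf (or a file is missing or unreadable)"
    fi
    ```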

  • Hi @tstaffordsmith,

    Have you tried following the calico installation instructions provided in the course material instead?

    Regards,
    -Chris

  • Yes, I got this working.

    The control plane node seems to be running fine but running any command on the worker node results in the error: "The connection to the server localhost:8080 was refused - did you specify the right host or port?"
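
    For reference, kubectl falls back to localhost:8080 when it finds no kubeconfig, and by default kubeadm does not place an admin kubeconfig on worker nodes, so this message is expected there unless one has been copied over. A small sketch for checking which kubeconfig, if any, would be picked up; the function name find_kubeconfig is hypothetical, and it simplifies by treating KUBECONFIG as a single path:

    ```shell
    # Hypothetical helper: prints the kubeconfig path kubectl would likely use,
    # or fails if neither $KUBECONFIG nor $HOME/.kube/config points to a file.
    # Simplification: $KUBECONFIG is treated as one path, not a colon-separated list.
    find_kubeconfig() {
      if [ -n "${KUBECONFIG:-}" ] && [ -f "$KUBECONFIG" ]; then
        echo "$KUBECONFIG"
      elif [ -f "$HOME/.kube/config" ]; then
        echo "$HOME/.kube/config"
      else
        return 1
      fi
    }

    find_kubeconfig || echo "no kubeconfig found: kubectl will try localhost:8080"
    ```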

  • Hi @tstaffordsmith,

    Can you provide the commands run on the worker node that produced the connection refused error?

    Regards,
    -Chris

  • ^above is the history of my worker node commands

  • I was able to troubleshoot. The problem stemmed from my AWS EC2 instances' security group. Thank you @chrispokorni.
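
    For anyone checking the same thing: the worker has to reach the control plane's API server port (6443 by default) for kubeadm join to succeed, and an EC2 security group can block that traffic. A rough reachability sketch using bash's /dev/tcp; the function name port_open and the CP_HOST placeholder are hypothetical, and the sketch assumes bash and the coreutils timeout command are available:

    ```shell
    # Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
    port_open() {
      timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
    }

    # CP_HOST is a placeholder for the control plane node's address.
    CP_HOST="${CP_HOST:-127.0.0.1}"
    if port_open "$CP_HOST" 6443; then
      echo "port 6443 on $CP_HOST is reachable"
    else
      echo "port 6443 on $CP_HOST is not reachable"
    fi
    ```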
