Lab 2.2: kubeadm join failed because of connection refused

Hi There,

I could use some help shedding light on how to resolve the connection refused error I get when trying to join the worker node via the kubeadm join command. Following is the error from this command:

vagrant@worker:~$ sudo kubeadm join 10.0.2.15:6443 --token yajnah.8v1n4d2ivgbo6hlx \
--discovery-token-ca-cert-hash sha256:84ba35a6760a1f74c9b1876fc34ce066e0c6c07e7d88890e3c24d23080519f09

[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: Get "https://10.0.2.15:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": dial tcp 10.0.2.15:6443: connect: connection refused
To see the stack trace of this error execute with --v=5 or higher
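
As a quick sanity check (a generic sketch, assuming nc is available on the worker; it is not part of the lab scripts), one can first test whether the API server address and port from the join command are reachable at all:

# run on the worker node; address and port are taken from the join command above
nc -vz 10.0.2.15 6443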

  1. I did not run the following, since I can see the Calico network pods are already provisioned. If I do need to run it, how do I get the exact name of the .yaml?

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

  2. LAN IPs for CP and Worker nodes:
    172.16.0.100 CP-node, 172.16.0.102 Worker Node

  3. I can ping the CP node from the worker node:
    vagrant@worker:~$ ping 172.16.0.100
    PING 172.16.0.100 (172.16.0.100) 56(84) bytes of data.
    64 bytes from 172.16.0.100: icmp_seq=1 ttl=64 time=0.583 ms

  4. CP node pods info:

vagrant@cp:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6799f5f4b4-6xbkb 1/1 Running 0 48m
kube-system calico-node-trznz 1/1 Running 0 48m
kube-system coredns-6d4b75cb6d-jxmtz 1/1 Running 0 48m
kube-system coredns-6d4b75cb6d-k6cf8 1/1 Running 0 48m
kube-system etcd-cp 1/1 Running 0 48m
kube-system kube-apiserver-cp 1/1 Running 0 48m
kube-system kube-controller-manager-cp 1/1 Running 0 48m
kube-system kube-proxy-67mbw 1/1 Running 0 48m
kube-system kube-scheduler-cp 1/1 Running 0 48m

  5. CP node IP info:
    vagrant@cp:~$ ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
    valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:a2:6b:fd brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
    valid_lft 79051sec preferred_lft 79051sec
    inet6 fe80::a00:27ff:fea2:6bfd/64 scope link
    valid_lft forever preferred_lft forever
    3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:13:16:91 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.100/24 brd 172.16.0.255 scope global eth1
    valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe13:1691/64 scope link
    valid_lft forever preferred_lft forever
    4: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 192.168.242.64/32 scope global tunl0
    valid_lft forever preferred_lft forever
    7: cali549db1682f5@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-4645a764-3f02-3264-4374-c7257cf21be1
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
    valid_lft forever preferred_lft forever
    8: cali2d70479e511@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-71509c2a-11e2-3cd0-0ce6-098d2a3f091f
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
    valid_lft forever preferred_lft forever
    9: cali8bdaed7ef27@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-c7333fb0-3f78-f392-0a72-1d0159f59103
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
    valid_lft forever preferred_lft forever
    vagrant@cp:~$

Thanks
Shao

Answers

  • Hi @caishaoping,

    Similar issues have already been reported and solved several times in the forum.

    On VMs with multiple network interfaces, the control plane gets advertised on the default interface, in this case the one with IP 10.0.2.15. However, it seems that the intent may have been to use the 172.16.x.x private IP address. One solution would be to ensure your VMs only receive a single network interface each during provisioning, connected to a bridged network (promiscuous mode set to allow all). If both interfaces are needed, then the kubeadm init command from the k8scp.sh script file should include the --apiserver-advertise-address=CP-node-private-IP option.
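
    A minimal sketch of what that adjusted command could look like, assuming 172.16.0.100 is the CP node's private IP as used in this thread:

    # advertise the API server on the private LAN IP instead of the NAT interface
    sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=172.16.0.100

    # then regenerate the join command so the worker targets the same address
    sudo kubeadm token create --print-join-command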

    The network plugin is installed as part of the same k8scp.sh script, so there is no need to install it manually. I would recommend inspecting both script files, k8scp.sh and k8sWorker.sh, to understand what they do in terms of installation and configuration on each VM.
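
    To double-check which address the API server is currently advertised on, the static pod manifest on the CP node can be inspected (a generic check, not something the lab scripts require):

    sudo grep advertise-address /etc/kubernetes/manifests/kube-apiserver.yaml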

    Regards,
    -Chris

  • Thanks.

    I checked k8scp.sh and it has the following. If I assign a 192.168.x.x private address to the VMs, will this also help avoid the issue?

    Configure the cluster

    sudo kubeadm init --pod-network-cidr=192.168.0.0/16

    Regards

  • Hi @caishaoping,

    Please ensure that there is no overlap between subnets of nodes, pods, and services.

    By default, services use 10.96.0.0/12, managed by the cluster, and the default pod subnet is 192.168.0.0/16, managed by the pod network plugin (Calico). With that in mind, the node subnet should not overlap with the service and pod subnets.
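
    If in doubt, the subnets a kubeadm-built cluster is actually using can be read back from the kubeadm-config ConfigMap; a quick sketch, with the output values shown only as an illustration:

    kubectl -n kube-system get configmap kubeadm-config -o yaml | grep -i subnet
    # podSubnet: 192.168.0.0/16
    # serviceSubnet: 10.96.0.0/12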

    Regards,
    -Chris

  • Thanks @chrispokorni. This did help me understand why the VMs' private IPs should not be 192.168.x.x. After a few tries, I am now able to join my worker node to the control-plane node:

    vagrant@cp:~$ kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    cp Ready control-plane 27m v1.24.1
    worker Ready <none> 5m34s v1.24.1
    vagrant@cp:~$

    1. On the control-plane node (CP node): after sudo kubeadm reset, I did the following before "sudo kubeadm init ....", roughly following the steps in the k8scp.sh file:
      sudo systemctl enable --now kubelet
      sudo swapoff -a
      ..
      sudo systemctl restart containerd
      sudo systemctl enable containerd

    sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=172.16.0.100

    Then I got a new join command for the worker node (its general shape is sketched after these steps):
    sudo kubeadm token create --print-join-command

    2. On the worker node, it is simple:
      sudo systemctl enable --now kubelet
      sudo swapoff -a
      sudo kubeadm reset
      sudo kubeadm join .....
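
    For reference, the command printed by --print-join-command generally has this shape (token and hash here are placeholders, not real values):

    kubeadm join 172.16.0.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>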

    Thanks
    Shao

  • Hello again @chrispokorni, I want to take this thread further with one related question:

    Today, after a restart of my host Windows machine, I started up the two VMs (CP + worker nodes), but I am not able to talk to the cluster via kubectl commands such as "kubectl get nodes":

    vagrant@cp:~$ kubectl get nodes
    The connection to the server 172.16.0.100:6443 was refused - did you specify the right host or port?
    vagrant@cp:~$

    The setup followed the steps mentioned in my previous post, including the kubeadm init command "sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=172.16.0.100".

    Following is the message I get when running "sudo systemctl status kubelet":

    vagrant@cp:~$ sudo systemctl status kubelet

    ● kubelet.service - kubelet: The Kubernetes Node Agent
    Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
    └─10-kubeadm.conf
    Active: activating (auto-restart) (Result: exit-code) since Sat 2022-09-10 01:53:56 UTC; 6s ago
    Docs: https://kubernetes.io/docs/home/
    Process: 11595 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILUR>
    Main PID: 11595 (code=exited, status=1/FAILURE)

    Sep 10 01:53:56 cp systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Sep 10 01:53:56 cp systemd[1]: kubelet.service: Failed with result 'exit-code'.
    lines 1-11/11 (END)

    Where should I start the troubleshooting? Or did I miss any of the earlier procedures for setting up the CP and worker nodes?
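
    A common first step for a kubelet stuck in this auto-restart loop (a generic sketch, not taken from the lab scripts) is to read its journal and confirm the container runtime is up:

    sudo journalctl -u kubelet --no-pager | tail -n 50   # full kubelet error messages
    sudo systemctl status containerd                     # kubelet cannot come up if the runtime is down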

    Thanks
    Shao

  • Hi there, here is a quick update on the previous observation and question. I guess the reason might simply be that my VM nodes are really slow to start up.

    After spending a few minutes going through past threads to get some ideas, I tried again: my CP node came up ready, followed by the worker node a couple of minutes later.

    vagrant@cp:~$ kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    cp Ready control-plane 29h v1.24.1
    worker Ready <none> 28h v1.24.1

    I did run "sudo swapoff -a" on both VMs, though I am not sure whether that is what helped. With my limited knowledge of Linux administration, my question is: do I need to run "swapoff -a" every time after rebooting?

    Thanks
    Shao

  • Hi There,
    sorry for asking before thinking it through :) . Let me follow up to conclude this thread:
    when I started my VMs today, swap was indeed active, so I did need to run "swapoff -a". Here is the check:

    This system is built by the Bento project by Chef Software
    More information can be found at https://github.com/chef/bento
    Last login: Sat Sep 10 01:42:55 2022 from 172.16.0.1

    1. To check whether swap is active (yes, it is indeed active):
      vagrant@worker:~$ sudo swapon -s
      Filename Type Size Used Priority
      /swap.img file 1999868 0 -2

    2. To disable swap:
      vagrant@worker:~$ sudo swapoff -a

    3. To recheck whether swap is off (yes, it is off now):
      vagrant@worker:~$ sudo swapon -s
      vagrant@worker:~$

    4. Further, I ran sudo vim /etc/fstab and commented out the swap-related line as below (a one-liner alternative is sketched after these steps):

    #/swap.img none swap sw 0 0

    5. Rebooted the VMs and verified that disabling swap now survives a reboot:

    This system is built by the Bento project by Chef Software
    More information can be found at https://github.com/chef/bento
    Last login: Sat Sep 10 16:12:07 2022 from 172.16.0.1

    vagrant@cp:~$ sudo swapon -s
    vagrant@cp:~$

    vagrant@cp:~$ kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    cp Ready control-plane 43h v1.24.1
    worker Ready <none> 43h v1.24.1
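
    For reference, a one-liner that performs the same fstab edit; it assumes the swap entry has the word swap surrounded by whitespace, so double-check /etc/fstab afterwards:

    sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab   # comment out swap entries
    sudo swapoff -a                              # also turn swap off for the current boot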

    Happy Ending! Thanks to all!
