
kubeadm join ISSUE: [discovery] Failed to request cluster info, will try again:

edited October 2019 in LFS258 Class Forum

Hi,

I've been trying to add a worker node to the cluster. I've followed the doc but I'm hitting this issue and I can't find a way past it. I've obviously missed something. Here is the command and error:

PART1

  root@ip-172-31-18-206:~# kubeadm join --token od1wg1.a9wd79hstxz3ll4z 172.31.19.37:6443 --discovery-token-ca-cert-hash sha256:4aed0a78c329495d91e031a336668ccaf07528c84b7120f230f2f161a98e7693 --v=2
  I1018 15:46:36.761858 25485 join.go:367] [preflight] found NodeName empty; using OS hostname as NodeName
  I1018 15:46:36.761930 25485 initconfiguration.go:105] detected and using CRI socket: /var/run/dockershim.sock
  [preflight] Running pre-flight checks
  I1018 15:46:36.762004 25485 preflight.go:90] [preflight] Running general checks
  I1018 15:46:36.762037 25485 checks.go:254] validating the existence and emptiness of directory /etc/kubernetes/manifests
  I1018 15:46:36.762083 25485 checks.go:292] validating the existence of file /etc/kubernetes/kubelet.conf
  I1018 15:46:36.762124 25485 checks.go:292] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
  I1018 15:46:36.762140 25485 checks.go:105] validating the container runtime
  I1018 15:46:36.806159 25485 checks.go:131] validating if the service is enabled and active
  [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
  I1018 15:46:36.858635 25485 checks.go:341] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
  I1018 15:46:36.858693 25485 checks.go:341] validating the contents of file /proc/sys/net/ipv4/ip_forward
  I1018 15:46:36.858729 25485 checks.go:653] validating whether swap is enabled or not
  I1018 15:46:36.858762 25485 checks.go:382] validating the presence of executable ip
  I1018 15:46:36.858793 25485 checks.go:382] validating the presence of executable iptables
  I1018 15:46:36.858813 25485 checks.go:382] validating the presence of executable mount
  I1018 15:46:36.858834 25485 checks.go:382] validating the presence of executable nsenter
  I1018 15:46:36.858851 25485 checks.go:382] validating the presence of executable ebtables
  I1018 15:46:36.858870 25485 checks.go:382] validating the presence of executable ethtool
  I1018 15:46:36.858891 25485 checks.go:382] validating the presence of executable socat
  I1018 15:46:36.858910 25485 checks.go:382] validating the presence of executable tc
  I1018 15:46:36.858927 25485 checks.go:382] validating the presence of executable touch
  I1018 15:46:36.858950 25485 checks.go:524] running all checks
  I1018 15:46:36.873553 25485 checks.go:412] checking whether the given node name is reachable using net.LookupHost
  I1018 15:46:36.882411 25485 checks.go:622] validating kubelet version
  I1018 15:46:36.937337 25485 checks.go:131] validating if the service is enabled and active
  I1018 15:46:36.943627 25485 checks.go:209] validating availability of port 10250
  I1018 15:46:36.943778 25485 checks.go:292] validating the existence of file /etc/kubernetes/pki/ca.crt
  I1018 15:46:36.943797 25485 checks.go:439] validating if the connectivity type is via proxy or direct
  I1018 15:46:36.943826 25485 join.go:427] [preflight] Discovering cluster-info
  I1018 15:46:36.944224 25485 token.go:200] [discovery] Trying to connect to API Server "172.31.19.37:6443"
  I1018 15:46:36.944877 25485 token.go:75] [discovery] Created cluster-info discovery client, requesting info from "https://172.31.19.37:6443"
  I1018 15:47:06.945803 25485 token.go:83] [discovery] Failed to request cluster info, will try again: [Get https://172.31.19.37:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 172.31.19.37:6443: i/o timeout]
  I1018 15:47:41.946445 25485 token.go:83] [discovery] Failed to request cluster info, will try again: [Get https://172.31.19.37:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 172.31.19.37:6443: i/o timeout]
  ^C
  root@ip-172-31-18-206:~#

I'm able to telnet to port 22 from the worker to the master:

  root@ip-172-31-18-206:~# telnet 172.31.19.37 22
  Trying 172.31.19.37...
  Connected to 172.31.19.37.
  Escape character is '^]'.
  SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.8
  ^]
  telnet> quit
  Connection closed.
  root@ip-172-31-18-206:~#
  root@ip-172-31-18-206:~# telnet 172.31.19.37 6443
  Trying 172.31.19.37...
PART2 with more details follows once it gets approved.

Answers

  • Hi @dmccuk,

    Similar discussions have been posted recently in the forum, where a second node fails to join the cluster.
    From your output, the failure is a timeout when accessing port 6443 on the master node.
    Port 22 being open is irrelevant in this scenario: Kubernetes uses a number of other individual ports and port ranges, and 6443 (the API server) is one of them.

    Read the special instructions at the beginning of Lab exercise 3.1 carefully. They are critical for setting up your infrastructure's networking (firewall rules) for inter-node communication.
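
    A quick way to confirm where things break (assuming netcat is installed on the worker; the IP is the master's private address from your output) is to probe the API server port directly:

    # Run on the worker: -z only scans, -v is verbose, -w 5 sets a 5-second timeout.
    # A timeout here means the packets never reach the master, which points at the
    # infrastructure firewall rather than anything on the nodes themselves.
    nc -zv -w 5 172.31.19.37 6443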

    Regards,
    -Chris

  • Hi Chris,

    Thanks for your message. I worked out what I hadn't done. I'll write it here so others can benefit:

    1) In AWS, create a new security group and open up all the ports (an equivalent AWS CLI sketch follows this list).
    2) Select one of your Kubernetes instances --> Actions --> Networking.
    3) Tick the new Kubernetes security group, adding it to your instance.
    4) Repeat for all the other Kubernetes instances.
    5) Retry the failing command.
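
    For anyone who prefers the AWS CLI, a rough sketch of the same steps (the VPC, security group, and instance IDs below are placeholders for your own, and the CIDR should match your VPC):

    # Create a security group that allows all traffic from within the VPC (-1 = all protocols)
    aws ec2 create-security-group --group-name k8s-lab-allopen --description "LFS258 lab - all ports" --vpc-id vpc-xxxxxxxx
    aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol=-1 --cidr 172.31.0.0/16

    # Attach it to each instance. Note that modify-instance-attribute replaces the
    # whole list of groups, so include any groups that are already attached.
    aws ec2 modify-instance-attribute --instance-id i-xxxxxxxx --groups sg-existing sg-xxxxxxxx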

    I hope that helps.

    Dennis

  • PART2:

    The firewall (ufw) is inactive on both the worker and the master:

    WORKER:

    root@ip-172-31-18-206:~# sudo ufw status
    Status: inactive
    root@ip-172-31-18-206:~#
    root@ip-172-31-18-206:~# service ufw status
    ufw.service - Uncomplicated firewall
    Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
    Active: inactive (dead) since Fri 2019-10-18 16:00:34 UTC; 1min 13s ago
    Process: 26404 ExecStop=/lib/ufw/ufw-init stop (code=exited, status=0/SUCCESS)
    Main PID: 396 (code=exited, status=0/SUCCESS)

    Oct 18 14:51:35 ubuntu systemd[1]: Started Uncomplicated firewall.
    Oct 18 16:00:34 ip-172-31-18-206 systemd[1]: Stopping Uncomplicated firewall...
    Oct 18 16:00:34 ip-172-31-18-206 ufw-init[26404]: Skip stopping firewall: ufw (not enabled)
    Oct 18 16:00:34 ip-172-31-18-206 systemd[1]: Stopped Uncomplicated firewall.
    Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
    root@ip-172-31-18-206:~#

    MASTER:

    ubuntu@ip-172-31-19-37:~$ sudo ufw status
    Status: inactive
    ubuntu@ip-172-31-19-37:~$ sudo service ufw status
    ufw.service - Uncomplicated firewall
    Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
    Active: inactive (dead) since Fri 2019-10-18 16:02:49 UTC; 14s ago
    Process: 6637 ExecStop=/lib/ufw/ufw-init stop (code=exited, status=0/SUCCESS)
    Main PID: 379 (code=exited, status=0/SUCCESS)

    Oct 18 14:51:23 ubuntu systemd[1]: Started Uncomplicated firewall.
    Oct 18 16:02:49 ip-172-31-19-37 systemd[1]: Stopping Uncomplicated firewall...
    Oct 18 16:02:49 ip-172-31-19-37 ufw-init[6637]: Skip stopping firewall: ufw (not enabled)
    Oct 18 16:02:49 ip-172-31-19-37 systemd[1]: Stopped Uncomplicated firewall.
    Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
    ubuntu@ip-172-31-19-37:~$
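
    On EC2, even with ufw inactive, the instance's security group still filters traffic before it reaches the node. As a sanity check, the groups attached to an instance can be listed from the instance itself via the metadata service (names only; the actual rules have to be inspected in the console or CLI):

    # Lists the security groups attached to this EC2 instance
    curl http://169.254.169.254/latest/meta-data/security-groups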

    Here are the pods in all namespaces on the master:

    ubuntu@ip-172-31-19-37:~$ kubectl get pods --all-namespaces
    NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
    kube-system   calico-node-9zmmr                         2/2     Running   0          63m
    kube-system   coredns-fb8b8dccf-mbg2w                   1/1     Running   0          65m
    kube-system   coredns-fb8b8dccf-nbm88                   1/1     Running   0          65m
    kube-system   etcd-ip-172-31-19-37                      1/1     Running   0          64m
    kube-system   kube-apiserver-ip-172-31-19-37            1/1     Running   0          64m
    kube-system   kube-controller-manager-ip-172-31-19-37   1/1     Running   0          64m
    kube-system   kube-proxy-tztvb                          1/1     Running   0          65m
    kube-system   kube-scheduler-ip-172-31-19-37            1/1     Running   0          64m

    I've been through this link and the steps I'm taking are identical:
    https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network

    The token and the CA certificate hash (from openssl) that I'm using in my join command:

    ubuntu@ip-172-31-19-37:~$ kubeadm token list
    TOKEN                     TTL   EXPIRES                USAGES                   DESCRIPTION                                                 EXTRA GROUPS
    od1wg1.a9wd79hstxz3ll4z   22h   2019-10-19T14:58:11Z   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token
    ubuntu@ip-172-31-19-37:~$
    ubuntu@ip-172-31-19-37:~$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
    4aed0a78c329495d91e031a336668ccaf07528c84b7120f230f2f161a98e7693
    ubuntu@ip-172-31-19-37:~$
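
    As a side note, the whole join command can also be regenerated on the master in one step (assuming a recent enough kubeadm), which avoids copying the token and hash by hand:

    # Prints a ready-to-use "kubeadm join ... --token ... --discovery-token-ca-cert-hash ..." line
    sudo kubeadm token create --print-join-command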

    NETSTAT from the master:

    ubuntu@ip-172-31-19-37:~$ netstat -tnlp
    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address   State    PID/Program name
    tcp        0      0 127.0.0.1:9099          0.0.0.0:*         LISTEN   -
    tcp        0      0 172.31.19.37:2379       0.0.0.0:*         LISTEN   -
    tcp        0      0 127.0.0.1:2379          0.0.0.0:*         LISTEN   -
    tcp        0      0 172.31.19.37:2380       0.0.0.0:*         LISTEN   -
    tcp        0      0 127.0.0.1:10257         0.0.0.0:*         LISTEN   -
    tcp        0      0 127.0.0.1:43122         0.0.0.0:*         LISTEN   -
    tcp        0      0 127.0.0.1:10259         0.0.0.0:*         LISTEN   -
    tcp        0      0 0.0.0.0:22              0.0.0.0:*         LISTEN   -
    tcp        0      0 127.0.0.1:42623         0.0.0.0:*         LISTEN   -
    tcp        0      0 127.0.0.1:10248         0.0.0.0:*         LISTEN   -
    tcp        0      0 127.0.0.1:10249         0.0.0.0:*         LISTEN   -
    tcp6       0      0 :::10250                :::*              LISTEN   -
    tcp6       0      0 :::10251                :::*              LISTEN   -
    tcp6       0      0 :::6443                 :::*              LISTEN   -
    tcp6       0      0 :::10252                :::*              LISTEN   -
    tcp6       0      0 :::10256                :::*              LISTEN   -
    tcp6       0      0 :::22                   :::*              LISTEN   -
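
    So the API server is listening on :::6443. A way to test whether that port is reachable from the worker without involving kubeadm at all (assuming curl is installed there) is:

    # Run on the worker; -k skips certificate verification. Any HTTP response at all,
    # even a 401/403, proves the port is reachable; the i/o timeout above means the
    # packets never arrive.
    curl -k https://172.31.19.37:6443/version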

    I'm stuck! If anyone can help I would really appreciate it!
