kubeadm join ISSUE: [discovery] Failed to request cluster info, will try again:

Hi,
I've been trying to add a worker node to the cluster. I've followed the doc but I'm hitting this issue and I can't find a way past it. I've obviously missed something. Here is the command and error:
PART1
[email protected]:~# kubeadm join --token od1wg1.a9wd79hstxz3ll4z 172.31.19.37:6443 --discovery-token-ca-cert-hash sha256:4aed0a78c329495d91e031a336668ccaf07528c84b7120f230f2f161a98e7693 --v=2
I1018 15:46:36.761858 25485 join.go:367] [preflight] found NodeName empty; using OS hostname as NodeName
I1018 15:46:36.761930 25485 initconfiguration.go:105] detected and using CRI socket: /var/run/dockershim.sock
[preflight] Running pre-flight checks
I1018 15:46:36.762004 25485 preflight.go:90] [preflight] Running general checks
I1018 15:46:36.762037 25485 checks.go:254] validating the existence and emptiness of directory /etc/kubernetes/manifests
I1018 15:46:36.762083 25485 checks.go:292] validating the existence of file /etc/kubernetes/kubelet.conf
I1018 15:46:36.762124 25485 checks.go:292] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
I1018 15:46:36.762140 25485 checks.go:105] validating the container runtime
I1018 15:46:36.806159 25485 checks.go:131] validating if the service is enabled and active
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
I1018 15:46:36.858635 25485 checks.go:341] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I1018 15:46:36.858693 25485 checks.go:341] validating the contents of file /proc/sys/net/ipv4/ip_forward
I1018 15:46:36.858729 25485 checks.go:653] validating whether swap is enabled or not
I1018 15:46:36.858762 25485 checks.go:382] validating the presence of executable ip
I1018 15:46:36.858793 25485 checks.go:382] validating the presence of executable iptables
I1018 15:46:36.858813 25485 checks.go:382] validating the presence of executable mount
I1018 15:46:36.858834 25485 checks.go:382] validating the presence of executable nsenter
I1018 15:46:36.858851 25485 checks.go:382] validating the presence of executable ebtables
I1018 15:46:36.858870 25485 checks.go:382] validating the presence of executable ethtool
I1018 15:46:36.858891 25485 checks.go:382] validating the presence of executable socat
I1018 15:46:36.858910 25485 checks.go:382] validating the presence of executable tc
I1018 15:46:36.858927 25485 checks.go:382] validating the presence of executable touch
I1018 15:46:36.858950 25485 checks.go:524] running all checks
I1018 15:46:36.873553 25485 checks.go:412] checking whether the given node name is reachable using net.LookupHost
I1018 15:46:36.882411 25485 checks.go:622] validating kubelet version
I1018 15:46:36.937337 25485 checks.go:131] validating if the service is enabled and active
I1018 15:46:36.943627 25485 checks.go:209] validating availability of port 10250
I1018 15:46:36.943778 25485 checks.go:292] validating the existence of file /etc/kubernetes/pki/ca.crt
I1018 15:46:36.943797 25485 checks.go:439] validating if the connectivity type is via proxy or direct
I1018 15:46:36.943826 25485 join.go:427] [preflight] Discovering cluster-info
I1018 15:46:36.944224 25485 token.go:200] [discovery] Trying to connect to API Server "172.31.19.37:6443"
I1018 15:46:36.944877 25485 token.go:75] [discovery] Created cluster-info discovery client, requesting info from "https://172.31.19.37:6443"
I1018 15:47:06.945803 25485 token.go:83] [discovery] Failed to request cluster info, will try again: [Get https://172.31.19.37:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 172.31.19.37:6443: i/o timeout]
I1018 15:47:41.946445 25485 token.go:83] [discovery] Failed to request cluster info, will try again: [Get https://172.31.19.37:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: dial tcp 172.31.19.37:6443: i/o timeout]
^C
[email protected]:~#
I'm able to telnet to port 22 on the master from the worker, but a telnet to port 6443 just hangs:
[email protected]:~# telnet 172.31.19.37 22
Trying 172.31.19.37...
Connected to 172.31.19.37.
Escape character is '^]'.
SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.8
^]
telnet> quit
Connection closed.
[email protected]:~#
[email protected]:~# telnet 172.31.19.37 6443
Trying 172.31.19.37...
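(For anyone hitting the same symptom: an equivalent quick reachability test from the worker, assuming nc and curl are installed, would be the two commands below. Any HTTP response at all from curl, even 401/403, means the port is open; a timeout means the traffic is being filtered somewhere on the path.)

nc -vz -w 3 172.31.19.37 6443                                    # TCP-level check of the API server port
curl -k --connect-timeout 5 https://172.31.19.37:6443/version    # TLS/HTTP-level check, -k skips certificate verification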
PART2 with more details follows once it gets approved.
Answers
Hi @dmccuk,
Similar discussions have been posted recently in the forum, where a second node fails to join the cluster.
From your output, the failure is a timeout when accessing port 6443 on the master node.
Port 22 being reachable is irrelevant in this scenario: Kubernetes uses a number of its own ports and port ranges, and 6443 (the API server) is one of them. Read carefully the special instructions at the beginning of Lab exercise 3.1. They are critical for setting up your infrastructure's networking (firewall rules) so the nodes can communicate with each other.
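(A quick way to see which of the usual control-plane ports answer from the worker, assuming nc is installed; only 6443 strictly has to be reachable from a worker node, the others are used on the control plane itself and by the kubelet:)

for p in 6443 2379 2380 10250 10251 10252; do
  nc -vz -w 3 172.31.19.37 $p   # prints "succeeded"/"open" or times out, per port
done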
Regards,
-Chris
Hi Chris,
Thanks for your message. I worked out what I hadn't done. I'll write it here so others can benefit:
1) In AWS, create a new security group and open up all the ports.
2) Select one of your Kubernetes instances --> actions --> networking
3) Tick the new kubernetes group, adding it to your instance.
4) Repeat for all the other kubernetes instances.
5) Retry the failing command.
I hope that helps (a rough AWS CLI equivalent of steps 1-4 is sketched below).
Dennis
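(For the CLI-inclined, roughly the same thing as steps 1-4 above. All IDs and the CIDR are placeholders; substitute your own VPC ID, security group IDs, instance IDs and VPC CIDR.)

aws ec2 create-security-group --group-name k8s-lab --description "wide-open group for the kubeadm lab nodes" --vpc-id vpc-0abc1234
aws ec2 authorize-security-group-ingress --group-id sg-0abc1234 --protocol tcp --port 0-65535 --cidr 172.31.0.0/16
aws ec2 authorize-security-group-ingress --group-id sg-0abc1234 --protocol udp --port 0-65535 --cidr 172.31.0.0/16
# --groups REPLACES the instance's security group list, so include any existing
# groups (e.g. the one allowing SSH) alongside the new one, for every instance:
aws ec2 modify-instance-attribute --instance-id i-0abc1234 --groups sg-0abc1234 sg-0def5678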
PART2:
The firewall (ufw) is off on both the worker and the master:
WORKER:
[email protected]:~# sudo ufw status
Status: inactive
[email protected]:~#
[email protected]:~# service ufw status
● ufw.service - Uncomplicated firewall
   Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2019-10-18 16:00:34 UTC; 1min 13s ago
  Process: 26404 ExecStop=/lib/ufw/ufw-init stop (code=exited, status=0/SUCCESS)
 Main PID: 396 (code=exited, status=0/SUCCESS)

Oct 18 14:51:35 ubuntu systemd[1]: Started Uncomplicated firewall.
Oct 18 16:00:34 ip-172-31-18-206 systemd[1]: Stopping Uncomplicated firewall...
Oct 18 16:00:34 ip-172-31-18-206 ufw-init[26404]: Skip stopping firewall: ufw (not enabled)
Oct 18 16:00:34 ip-172-31-18-206 systemd[1]: Stopped Uncomplicated firewall.
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
[email protected]:~#
MASTER:
[email protected]:~$ sudo ufw status
Status: inactive
[email protected]:~$ sudo service ufw status
● ufw.service - Uncomplicated firewall
   Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2019-10-18 16:02:49 UTC; 14s ago
  Process: 6637 ExecStop=/lib/ufw/ufw-init stop (code=exited, status=0/SUCCESS)
 Main PID: 379 (code=exited, status=0/SUCCESS)

Oct 18 14:51:23 ubuntu systemd[1]: Started Uncomplicated firewall.
Oct 18 16:02:49 ip-172-31-19-37 systemd[1]: Stopping Uncomplicated firewall...
Oct 18 16:02:49 ip-172-31-19-37 ufw-init[6637]: Skip stopping firewall: ufw (not enabled)
Oct 18 16:02:49 ip-172-31-19-37 systemd[1]: Stopped Uncomplicated firewall.
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
[email protected]:~$
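(Since ufw is inactive on both hosts, one way to check whether the worker's packets even reach the master, i.e. whether something outside the hosts such as a cloud security group is dropping them, is to capture on the master while retrying the join from the worker:)

sudo tcpdump -ni any tcp port 6443   # run on the master; if no SYNs arrive from the worker's IP, the traffic is filtered before it reaches the host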
Here are the pods in all namespaces on the master:
[email protected]:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
kube-system   calico-node-9zmmr                         2/2     Running   0          63m
kube-system   coredns-fb8b8dccf-mbg2w                   1/1     Running   0          65m
kube-system   coredns-fb8b8dccf-nbm88                   1/1     Running   0          65m
kube-system   etcd-ip-172-31-19-37                      1/1     Running   0          64m
kube-system   kube-apiserver-ip-172-31-19-37            1/1     Running   0          64m
kube-system   kube-controller-manager-ip-172-31-19-37   1/1     Running   0          64m
kube-system   kube-proxy-tztvb                          1/1     Running   0          65m
kube-system   kube-scheduler-ip-172-31-19-37            1/1     Running   0          64m
I've been through this link and the steps I'm taking are identical:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network

The token and the CA cert hash (computed with openssl) that I'm using in my join command:
[email protected]:~$ kubeadm token list
TOKEN                     TTL   EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
od1wg1.a9wd79hstxz3ll4z   22h   2019-10-19T14:58:11Z   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token
[email protected]:~$
[email protected]:~$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
4aed0a78c329495d91e031a336668ccaf07528c84b7120f230f2f161a98e7693
[email protected]:~$
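(If in doubt about the token or hash, recent kubeadm versions can print a complete, ready-to-paste join command on the master:)

kubeadm token create --print-join-command   # creates a new token and prints the full 'kubeadm join ...' line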
NETSTAT from the master:
[email protected]:~$ netstat -tnlp
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address   State    PID/Program name
tcp        0      0 127.0.0.1:9099          0.0.0.0:*         LISTEN   -
tcp        0      0 172.31.19.37:2379       0.0.0.0:*         LISTEN   -
tcp        0      0 127.0.0.1:2379          0.0.0.0:*         LISTEN   -
tcp        0      0 172.31.19.37:2380       0.0.0.0:*         LISTEN   -
tcp        0      0 127.0.0.1:10257         0.0.0.0:*         LISTEN   -
tcp        0      0 127.0.0.1:43122         0.0.0.0:*         LISTEN   -
tcp        0      0 127.0.0.1:10259         0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:22              0.0.0.0:*         LISTEN   -
tcp        0      0 127.0.0.1:42623         0.0.0.0:*         LISTEN   -
tcp        0      0 127.0.0.1:10248         0.0.0.0:*         LISTEN   -
tcp        0      0 127.0.0.1:10249         0.0.0.0:*         LISTEN   -
tcp6       0      0 :::10250                :::*              LISTEN   -
tcp6       0      0 :::10251                :::*              LISTEN   -
tcp6       0      0 :::6443                 :::*              LISTEN   -
tcp6       0      0 :::10252                :::*              LISTEN   -
tcp6       0      0 :::10256                :::*              LISTEN   -
tcp6       0      0 :::22                   :::*              LISTEN   -
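(The API server is clearly listening on :::6443. A request from the master itself, assuming curl is installed, should answer instantly; any HTTP response, even 401/403, confirms the API server is fine and that the problem is the network path between the instances:)

curl -k https://172.31.19.37:6443/healthz   # run on the master; -k skips certificate verification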
I'm stuck! If anyone can help I would really appreciate it!