Running pre-flight checks hangs
Hello,
I am stuck at task 8 of Ex 2.2. I used the suggested grep command to get the precise sudo kubeadm command to use on the worker node, and I made sure to copy it line by line. Unfortunately, the command hangs at running pre-flight checks. I also used the --v=5 flag (as prompted by kubeadm), and it cannot connect to the IP address I specified, even though it is the same as the one written in the cp.out file. I even used the kubectl get nodes -o wide command to check the IP address of the control plane node, and it's the same. Does anyone have any suggestions on how to tackle this problem? Was I supposed to run any other command before using sudo kubeadm join? Thanks in advance.
Best Answer
Hi @gmmajal,
After adding
10.0.0.10 k8scp
to the two /etc/hosts files, perform the following to attempt to grow the cluster:
On the CP node (your control plane, with assumed private IP 10.0.0.10), run the following command:
sudo kubeadm token create --print-join-command
On the WORKER node (with an assumed private IP 10.0.0.x), run the following commands:
sudo kubeadm reset
sudo kubeadm join ... #<-- the entire join command generated on the CP node
If this join is still not successful, then please review the VPC and firewall configuration steps from the demo video for GCP. Also, ensure both VMs (CP and WORKER) are created in the same VPC/subnet, so that they are both protected by the same firewall (open to all inbound traffic, all protocols, from all sources, to all port destinations).
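The /etc/hosts step above can be made repeatable with a small helper. This is only a sketch: ensure_hosts_entry is a hypothetical function name, not something from the lab material, and the alias k8scp and private IP 10.0.0.10 are simply the values used in this thread.

```shell
#!/usr/bin/env bash
# ensure_hosts_entry FILE IP NAME
# Hypothetical helper: append "IP NAME" to FILE only if NAME is not
# already listed as a hostname on an uncommented line.
ensure_hosts_entry() {
  local file="$1" ip="$2" name="$3"
  if awk -v n="$name" '
       /^[[:space:]]*#/ { next }                 # skip comment lines
       { for (i = 2; i <= NF; i++) {
           if ($i ~ /^#/) break                  # stop at a trailing comment
           if ($i == n) found = 1
         } }
       END { exit !found }                       # exit 0 only when NAME was seen
     ' "$file" 2>/dev/null; then
    echo "entry for $name already present in $file"
  else
    printf '%s %s\n' "$ip" "$name" >> "$file"
    echo "added: $ip $name"
  fi
}

# Example call (left commented out on purpose):
# ensure_hosts_entry /tmp/hosts.test 10.0.0.10 k8scp
```

To try it safely, point it at a copy of the file first (for example, one made with cp /etc/hosts /tmp/hosts.test); editing the real /etc/hosts requires root.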
Regards,
-Chris
Answers
Hi @gmmajal,
Please provide the output produced by the kubeadm join command, using the code format.
Also, keep in mind that correctly setting up the infrastructure is essential. Did you follow the provisioning videos from the introductory chapter? The most important aspects are the VPC network and firewall configuration.
What cloud or what local hypervisor provisions your infrastructure? What is the guest OS of the VMs? How many network interfaces are on each VM? Are your firewalls disabled as instructed?
Regards,
-Chris
I0410 18:17:52.181740 6386 join.go:413] [preflight] found NodeName empty; using OS hostname as NodeName
I0410 18:17:52.182229 6386 initconfiguration.go:122] detected and using CRI socket: unix:///var/run/containerd/containerd.sock
[preflight] Running pre-flight checks
I0410 18:17:52.182428 6386 preflight.go:93] [preflight] Running general checks
I0410 18:17:52.182509 6386 checks.go:280] validating the existence of file /etc/kubernetes/kubelet.conf
I0410 18:17:52.182540 6386 checks.go:280] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
I0410 18:17:52.182564 6386 checks.go:104] validating the container runtime
I0410 18:17:52.225113 6386 checks.go:639] validating whether swap is enabled or not
I0410 18:17:52.225242 6386 checks.go:370] validating the presence of executable crictl
I0410 18:17:52.225287 6386 checks.go:370] validating the presence of executable conntrack
I0410 18:17:52.225320 6386 checks.go:370] validating the presence of executable ip
I0410 18:17:52.225353 6386 checks.go:370] validating the presence of executable iptables
I0410 18:17:52.225389 6386 checks.go:370] validating the presence of executable mount
I0410 18:17:52.225430 6386 checks.go:370] validating the presence of executable nsenter
I0410 18:17:52.225462 6386 checks.go:370] validating the presence of executable ebtables
I0410 18:17:52.225494 6386 checks.go:370] validating the presence of executable ethtool
I0410 18:17:52.225522 6386 checks.go:370] validating the presence of executable socat
I0410 18:17:52.225552 6386 checks.go:370] validating the presence of executable tc
I0410 18:17:52.225580 6386 checks.go:370] validating the presence of executable touch
I0410 18:17:52.225615 6386 checks.go:516] running all checks
I0410 18:17:52.244927 6386 checks.go:401] checking whether the given node name is valid and reachable using net.LookupHost
I0410 18:17:52.250203 6386 checks.go:605] validating kubelet version
I0410 18:17:52.331698 6386 checks.go:130] validating if the "kubelet" service is enabled and active
I0410 18:17:52.347137 6386 checks.go:203] validating availability of port 10250
I0410 18:17:52.347514 6386 checks.go:280] validating the existence of file /etc/kubernetes/pki/ca.crt
I0410 18:17:52.347552 6386 checks.go:430] validating if the connectivity type is via proxy or direct
I0410 18:17:52.347613 6386 checks.go:329] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0410 18:17:52.347697 6386 checks.go:329] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0410 18:17:52.347753 6386 join.go:532] [preflight] Discovering cluster-info
I0410 18:17:52.347806 6386 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "10.0.0.6:6443"
I0410 18:18:02.349715 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0410 18:18:18.743849 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0410 18:18:34.340314 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0410 18:18:50.171644 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
couldn't validate the identity of the API Server
k8s.io/kubernetes/cmd/kubeadm/app/discovery.For
    cmd/kubeadm/app/discovery/discovery.go:45
k8s.io/kubernetes/cmd/kubeadm/app/cmd.(*joinData).TLSBootstrapCfg
    cmd/kubeadm/app/cmd/join.go:533
k8s.io/kubernetes/cmd/kubeadm/app/cmd.(*joinData).InitCfg
    cmd/kubeadm/app/cmd/join.go:543
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join.runPreflight
    cmd/kubeadm/app/cmd/phases/join/preflight.go:98
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:259
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
    cmd/kubeadm/app/cmd/join.go:180
github.com/spf13/cobra.(*Command).execute
    vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
    vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
    vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
    cmd/kubeadm/app/kubeadm.go:50
main.main
    cmd/kubeadm/kubeadm.go:25
runtime.main
    /usr/local/go/src/runtime/proc.go:267
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1650
error execution phase preflight
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
    cmd/kubeadm/app/cmd/join.go:180
github.com/spf13/cobra.(*Command).execute
    vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
    vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
    vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
    cmd/kubeadm/app/kubeadm.go:50
main.main
    cmd/kubeadm/kubeadm.go:25
runtime.main
    /usr/local/go/src/runtime/proc.go:267
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1650
The block above is my (truncated) output when I run the sudo kubeadm join command with the --v=5 flag. kubeadm prompted me to add this flag in order to get more verbose output to identify the nature of the error.
With regard to your questions:
1) I am using Google Compute Engine, and I am connecting to the VM instances via PuTTY.
2) The OS is Ubuntu 20.04 LTS.
3) I have ensured that I have chosen the VPC network that I made specifically for this class (following the instructions provided in the first lesson). There's just one network per VM.
4) I have also made sure I have disabled the firewall. I have also added a screenshot of the firewall rule that's operational.
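When reading verbose output like the block above, it can help to pull out the exact endpoint the discovery client tried to contact, so that it can be compared with the INTERNAL-IP that kubectl get nodes -o wide reports on the control plane. The following is a minimal sketch; discovery_endpoint is a hypothetical helper name, and it relies only on the "requesting info from" wording that appears in the log above.

```shell
#!/usr/bin/env bash
# discovery_endpoint (hypothetical helper)
# Reads kubeadm join --v=5 output on stdin and prints the first
# host:port found in a 'requesting info from "..."' log line.
discovery_endpoint() {
  sed -n 's/.*requesting info from "\([^"]*\)".*/\1/p' | head -n 1
}

# Example call (left commented out; the join arguments are elided as in the thread):
# sudo kubeadm join ... --v=5 2>&1 | discovery_endpoint
```

For the log posted above, the matching line is the one that names "10.0.0.6:6443".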
Hi @gmmajal,
Thank you for the detailed output.
What are the custom entries of the /etc/hosts files, and what are the private IP addresses and the hostnames of the two VMs?
What are the outputs of kubectl get nodes -o wide and kubectl get pods -A -o wide?
Regards,
-Chris
NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
kube-system   cilium-k62nk                       1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
kube-system   cilium-operator-58684c48c9-b4c8f   1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
kube-system   coredns-76f75df574-725bc           1/1     Running   0          10m   10.0.0.5     cp     <none>           <none>
kube-system   coredns-76f75df574-gccb4           1/1     Running   0          10m   10.0.0.245   cp     <none>           <none>
kube-system   etcd-cp                            1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
kube-system   kube-apiserver-cp                  1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
kube-system   kube-controller-manager-cp         1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
kube-system   kube-proxy-bqh7r                   1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
kube-system   kube-scheduler-cp                  1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>

NAME   STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
cp     Ready    control-plane   18m   v1.29.1   10.0.0.10     <none>        Ubuntu 20.04.6 LTS   5.15.0-1053-gcp   containerd://1.6.31
Hi Chris,
Thanks for the prompt response. The first block is the output of the kubectl get pods command on the control plane node. The second block is the output of the kubectl get nodes command on the control plane node.
The entries inside the hosts file are the following:
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
169.254.169.254 metadata.google.internal metadata
The hostnames of the VM instances are worker and cp. Their IP addresses are 34.91.60.229 and 34.32.234.112, respectively.
With regard to the firewall rule, I just wanted to recheck one thing. There are a few rules created by default for an instance of a VPC network on Google Cloud. Are we supposed to delete them entirely before inserting our own firewall rule?
Regards,
GMMajal
Hi @gmmajal,
You probably missed a step in the lab exercise. You must configure both /etc/hosts files, one on each node, with the same additional entry: CP-NODE-PRIVATE-IP k8scp. In your case the additional entry should be 10.0.0.10 k8scp.
Regards,
-Chris
I made the additional entry to the /etc/hosts files on both nodes. Unfortunately, the problem still persists. Can you tell me which part of the exercise is responsible for making the configuration you mentioned in your earlier message? I couldn't find it. If I run kubectl get nodes on my worker node, I get the following output:
E0412 10:07:35.586706 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0412 10:07:35.587280 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0412 10:07:35.588770 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0412 10:07:35.589209 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0412 10:07:35.590655 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Even running a curl request:
curl https://10.0.0.10:6443
results in a connection timed out error:
curl: (28) Failed to connect to 10.0.0.10 port 6443: Connection timed out
Is there somewhere else where I need to modify entries to allow this connection to happen?
Regards,
GMMajal
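A timeout on port 6443 (as opposed to an immediate "connection refused") often points to packets being dropped between the two VMs, which is worth ruling out before rerunning the join. Below is a minimal sketch of a reachability check that could be run from the worker; can_reach is a hypothetical helper name, and the sketch assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# can_reach HOST PORT [SECS]  (hypothetical helper)
# Returns 0 if a TCP connection to HOST:PORT can be opened within SECS
# seconds (default 5), and non-zero if it is refused or times out.
can_reach() {
  local host="$1" port="$2" secs="${3:-5}"
  # The inner bash receives HOST as $0 and PORT as $1.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example call (left commented out; 10.0.0.10 is the CP private IP from this thread):
# if can_reach 10.0.0.10 6443; then
#   echo "port 6443 is reachable"
# else
#   echo "port 6443 is blocked or nothing is listening"
# fi
```

A successful check only shows that the TCP port is open; it says nothing about whether the join token or certificates are valid.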
Hi @chrispokorni,
Thanks for your response. I tried what you suggested about growing the cluster first and then trying to connect the worker node to the cp. Unfortunately, that did not work. I started all over again, creating the VPC and VM instances from scratch. I followed each instruction carefully, and it seems the original problem was indeed with my firewall rule. I followed all the instructions in the exercise as stated, and this time it worked. I did not have to insert any additional information in the /etc/hosts file. The problem was with my firewall setup to begin with.
Regards,
GMMajal
Can you please describe how you configured your firewall rule? I've been stuck for a week. I also don't see instructions about editing /etc/hosts.
Hi @arenasgt,
You may find the videos in the introductory chapter helpful; they describe the infrastructure provisioning and networking configuration on AWS and GCP, respectively.
Regards,
-Chris