Welcome to the Linux Foundation Forum!

kubeadm join hangs at "Running pre-flight checks"

Hello,

I am getting stuck at task 8 of Exercise 2.2. I used the suggested grep command to get the precise sudo kubeadm join command to run on the worker node, and I made sure I copied it line by line. Unfortunately, it hangs at the "Running pre-flight checks" step. I also ran it with the --v=5 flag (as the tool suggested), and it cannot connect to the IP address I specified, even though it is the same one written in the cp.out file. I even used the kubectl get nodes -o wide command to check the IP address of the control plane node, and it is the same. Does anyone have suggestions on how to tackle this problem? Was I supposed to run any other command before sudo kubeadm join? Thanks in advance.

Best Answer

  • Posts: 2,444
    Answer ✓

    Hi @gmmajal,

    After adding 10.0.0.10 k8scp to the two /etc/hosts files, perform the following to attempt to grow the cluster:

    On the CP node (your control plane with assumed private IP 10.0.0.10) run the following command:
    sudo kubeadm token create --print-join-command

    On the WORKER node (with an assumed private IP 10.0.0.x) run the following commands:
    sudo kubeadm reset
    sudo kubeadm join ... #<-- the entire join command generated on the CP node

    If this join is still not successful, please review the VPC and firewall configuration steps from the demo video for GCP. Also, ensure both VMs (CP and WORKER) are created in the same VPC/subnet, so that they are both protected by the same firewall rule (open to all inbound traffic, all protocols, from all sources, to all destination ports).
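
    For reference, the command printed by kubeadm token create --print-join-command generally has the shape below; the token and hash here are placeholders, and the endpoint reflects whatever was set when the CP node was initialized (k8scp:6443 in this lab):

    sudo kubeadm join k8scp:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash>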

    Regards,
    -Chris

Answers

  • Posts: 2,444
    edited April 2024

    Hi @gmmajal,

    Please provide the output produced by the kubeadm join command, using the code format.

    Also, keep in mind that correctly setting up the infrastructure is essential. Did you follow the provisioning videos from the introductory chapter? The most important aspects are the VPC network and firewall configuration.
    What cloud or what local hypervisor provisions your infrastructure? What is the guest OS of the VMs? How many network interfaces on each VM? Are your firewalls disabled as instructed?
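
    For reference, commands along these lines, run on each VM, should surface that information (the exact set is only a suggestion):

    hostnamectl                # guest OS and hostname
    ip -brief address          # network interfaces and their IP addresses
    sudo ufw status            # Ubuntu host firewall state (expected to be inactive)
    cat /etc/hosts             # custom host entries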

    Regards,
    -Chris

  • Posts: 7
    edited April 2024
    I0410 18:17:52.181740 6386 join.go:413] [preflight] found NodeName empty; using OS hostname as NodeName
    I0410 18:17:52.182229 6386 initconfiguration.go:122] detected and using CRI socket: unix:///var/run/containerd/containerd.sock
    [preflight] Running pre-flight checks
    I0410 18:17:52.182428 6386 preflight.go:93] [preflight] Running general checks
    I0410 18:17:52.182509 6386 checks.go:280] validating the existence of file /etc/kubernetes/kubelet.conf
    I0410 18:17:52.182540 6386 checks.go:280] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
    I0410 18:17:52.182564 6386 checks.go:104] validating the container runtime
    I0410 18:17:52.225113 6386 checks.go:639] validating whether swap is enabled or not
    I0410 18:17:52.225242 6386 checks.go:370] validating the presence of executable crictl
    I0410 18:17:52.225287 6386 checks.go:370] validating the presence of executable conntrack
    I0410 18:17:52.225320 6386 checks.go:370] validating the presence of executable ip
    I0410 18:17:52.225353 6386 checks.go:370] validating the presence of executable iptables
    I0410 18:17:52.225389 6386 checks.go:370] validating the presence of executable mount
    I0410 18:17:52.225430 6386 checks.go:370] validating the presence of executable nsenter
    I0410 18:17:52.225462 6386 checks.go:370] validating the presence of executable ebtables
    I0410 18:17:52.225494 6386 checks.go:370] validating the presence of executable ethtool
    I0410 18:17:52.225522 6386 checks.go:370] validating the presence of executable socat
    I0410 18:17:52.225552 6386 checks.go:370] validating the presence of executable tc
    I0410 18:17:52.225580 6386 checks.go:370] validating the presence of executable touch
    I0410 18:17:52.225615 6386 checks.go:516] running all checks
    I0410 18:17:52.244927 6386 checks.go:401] checking whether the given node name is valid and reachable using net.LookupHost
    I0410 18:17:52.250203 6386 checks.go:605] validating kubelet version
    I0410 18:17:52.331698 6386 checks.go:130] validating if the "kubelet" service is enabled and active
    I0410 18:17:52.347137 6386 checks.go:203] validating availability of port 10250
    I0410 18:17:52.347514 6386 checks.go:280] validating the existence of file /etc/kubernetes/pki/ca.crt
    I0410 18:17:52.347552 6386 checks.go:430] validating if the connectivity type is via proxy or direct
    I0410 18:17:52.347613 6386 checks.go:329] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
    I0410 18:17:52.347697 6386 checks.go:329] validating the contents of file /proc/sys/net/ipv4/ip_forward
    I0410 18:17:52.347753 6386 join.go:532] [preflight] Discovering cluster-info
    I0410 18:17:52.347806 6386 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "10.0.0.6:6443"
    I0410 18:18:02.349715 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    I0410 18:18:18.743849 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    I0410 18:18:34.340314 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    I0410 18:18:50.171644 6386 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Get "https://10.0.0.6:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    couldn't validate the identity of the API Server
    k8s.io/kubernetes/cmd/kubeadm/app/discovery.For
    cmd/kubeadm/app/discovery/discovery.go:45
    k8s.io/kubernetes/cmd/kubeadm/app/cmd.(*joinData).TLSBootstrapCfg
    cmd/kubeadm/app/cmd/join.go:533
    k8s.io/kubernetes/cmd/kubeadm/app/cmd.(*joinData).InitCfg
    cmd/kubeadm/app/cmd/join.go:543
    k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join.runPreflight
    cmd/kubeadm/app/cmd/phases/join/preflight.go:98
    k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:259
    k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
    k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
    k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
    cmd/kubeadm/app/cmd/join.go:180
    github.com/spf13/cobra.(*Command).execute
    vendor/github.com/spf13/cobra/command.go:940
    github.com/spf13/cobra.(*Command).ExecuteC
    vendor/github.com/spf13/cobra/command.go:1068
    github.com/spf13/cobra.(*Command).Execute
    vendor/github.com/spf13/cobra/command.go:992
    k8s.io/kubernetes/cmd/kubeadm/app.Run
    cmd/kubeadm/app/kubeadm.go:50
    main.main
    cmd/kubeadm/kubeadm.go:25
    runtime.main
    /usr/local/go/src/runtime/proc.go:267
    runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1650
    error execution phase preflight
    k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
    k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
    k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
    cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
    k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
    cmd/kubeadm/app/cmd/join.go:180
    github.com/spf13/cobra.(*Command).execute
    vendor/github.com/spf13/cobra/command.go:940
    github.com/spf13/cobra.(*Command).ExecuteC
    vendor/github.com/spf13/cobra/command.go:1068
    github.com/spf13/cobra.(*Command).Execute
    vendor/github.com/spf13/cobra/command.go:992
    k8s.io/kubernetes/cmd/kubeadm/app.Run
    cmd/kubeadm/app/kubeadm.go:50
    main.main
    cmd/kubeadm/kubeadm.go:25
    runtime.main
    /usr/local/go/src/runtime/proc.go:267
    runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1650

    Hi @chrispokorni

    The block above is my (truncated) output from running the sudo kubeadm join command with the --v=5 flag. I was prompted to add this flag in order to get more verbose output and identify the nature of the error.

    With regards to your questions:
    1) I am using Google Compute Engine, and I am connecting to the VM instances via PuTTY.
    2) The guest OS is Ubuntu 20.04 LTS.
    3) I have ensured that I chose the VPC network that I created specifically for this class (following the instructions provided in the first lesson). There is just one network interface per VM.
    4) I have also made sure that the firewall is disabled. I have also attached a screenshot of the firewall rule that is operational.

  • Posts: 2,444

    Hi @gmmajal,

    Thank you for the detailed output.
    What are the custom entries in the two /etc/hosts files, and what are the private IP addresses and hostnames of the two VMs?

    What are the outputs of kubectl get nodes -o wide and kubectl get pods -A -o wide?

    Regards,
    -Chris

  • Posts: 7
    NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
    kube-system   cilium-k62nk                       1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
    kube-system   cilium-operator-58684c48c9-b4c8f   1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
    kube-system   coredns-76f75df574-725bc           1/1     Running   0          10m   10.0.0.5     cp     <none>           <none>
    kube-system   coredns-76f75df574-gccb4           1/1     Running   0          10m   10.0.0.245   cp     <none>           <none>
    kube-system   etcd-cp                            1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
    kube-system   kube-apiserver-cp                  1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
    kube-system   kube-controller-manager-cp         1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
    kube-system   kube-proxy-bqh7r                   1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>
    kube-system   kube-scheduler-cp                  1/1     Running   0          10m   10.0.0.10    cp     <none>           <none>

    NAME   STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
    cp     Ready    control-plane   18m   v1.29.1   10.0.0.10     <none>        Ubuntu 20.04.6 LTS   5.15.0-1053-gcp   containerd://1.6.31

    Hi Chris,
    Thanks for the prompt response. The first block is the output of the kubectl get pods command on the control plane node. The second block is the output of the kubectl get nodes command, also on the control plane node.

    The content of the /etc/hosts file is the following:

    127.0.0.1 localhost

    # The following lines are desirable for IPv6 capable hosts
    ::1 ip6-localhost ip6-loopback
    fe00::0 ip6-localnet
    ff00::0 ip6-mcastprefix
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters
    ff02::3 ip6-allhosts
    169.254.169.254 metadata.google.internal metadata

    The hostnames of the VM instances are worker and cp. Their IP addresses are 34.91.60.229 and 34.32.234.112, respectively.

    With regards to the firewall rule, I just wanted to recheck one thing. There are a few rules created by default for a VPC network on Google Cloud. Are we supposed to delete them entirely before inserting our own firewall rule?

    Regards,
    GMMajal

  • Posts: 2,444

    Hi @gmmajal,

    You probably missed a step in the lab exercise. You must configure both /etc/hosts files, one on each node, with the same additional entry CP-NODE-PRIVATE-IP k8scp. In your case the additional entry should be 10.0.0.10 k8scp.
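
    One way to append that entry on each node (taking the IP from your own CP node's private address) is:

    echo "10.0.0.10 k8scp" | sudo tee -a /etc/hosts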

    Regards,
    -Chris

  • Posts: 7

    Hi @chrispokorni

    I made the additional entry in the /etc/hosts files on both nodes. Unfortunately, the problem still persists. Can you tell me which part of the exercise covers the configuration you mentioned in your earlier message? I could not really find it. If I run kubectl get nodes on my worker node, I get the following output:

    E0412 10:07:35.586706 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
    E0412 10:07:35.587280 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
    E0412 10:07:35.588770 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
    E0412 10:07:35.589209 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
    E0412 10:07:35.590655 16538 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
    The connection to the server localhost:8080 was refused - did you specify the right host or port?

    Even running a curl request (curl https://10.0.0.10:6443) results in a connection timed out error:
    curl: (28) Failed to connect to 10.0.0.10 port 6443: Connection timed out
    Is there somewhere else I need to modify entries to allow this connection to happen?
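
    For reference, checks along these lines (assuming the CP node's private IP really is 10.0.0.10) can narrow down whether the API server is not listening at all or is simply unreachable over the network:

    # On the CP node: confirm the API server is listening on port 6443
    sudo ss -tlnp | grep 6443

    # On the worker node: test raw TCP reachability of the CP node
    nc -zv -w 5 10.0.0.10 6443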

    Regards,
    GMMajal

  • Posts: 7

    Hi @chrispokorni ,

    Thanks for your response. I tried what you suggested about growing the cluster (regenerating the join command on the CP node, then resetting and re-joining the worker node). Unfortunately, that did not work. I then started all over again, recreating the VPC and the VM instances. I followed each instruction carefully, and it turned out the original problem was indeed my firewall rule. Following all the instructions in the exercise exactly as stated, it worked this time. I did not have to insert any additional information into the /etc/hosts files; the problem was with my firewall setup to begin with.

    Regards,
    GMMajal

  • Posts: 1

    Can you please describe how you configured your firewall rule? I've been stuck for a week. I also don't see instructions about editing /etc/hosts.

  • Posts: 2,444

    Hi @arenasgt,

    You may find the videos in the introductory chapter helpful; they describe the infrastructure provisioning and networking configuration on AWS and GCP, respectively.
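
    For illustration only, a wide-open ingress rule of the kind shown in the GCP video could be created with a command like the one below; the rule and network names here are made up, and opening everything like this is only appropriate for a disposable lab VPC:

    gcloud compute firewall-rules create lab-allow-all \
        --network=lab-vpc \
        --direction=INGRESS \
        --action=ALLOW \
        --rules=all \
        --source-ranges=0.0.0.0/0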

    Regards,
    -Chris
