[Exercise 2.2: Deploy a New Cluster] Hit "node xx not found" issue
I followed the guide to create my cluster on Ali Cloud, using two instances with 2 CPUs and 8G of memory each.
root@master:~# cat /etc/hosts
10.250.115.210 master
10.250.115.211 slaver
root@master:~# hostname
master
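A quick sanity check for this setup is to confirm that the node name maps to the address you expect in the hosts file. The following is only a minimal sketch: `lookup_host_ip` is a hypothetical helper name, and it reads a hosts-format file (defaulting to `/etc/hosts`) rather than querying DNS.

```shell
# lookup_host_ip NAME [HOSTS_FILE]
# Print the first IP address that a hosts-format file maps NAME to.
# Hypothetical helper for illustration; HOSTS_FILE defaults to /etc/hosts.
lookup_host_ip() {
    awk -v name="$1" '
        { sub(/#.*/, "") }        # drop trailing comments
        NF < 2 { next }           # skip blank or incomplete lines
        {
            # Field 1 is the address; every later field is a name or alias.
            for (i = 2; i <= NF; i++) {
                if ($i == name) { print $1; found = 1; exit }
            }
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}
```

On the master shown above, `lookup_host_ip "$(hostname)"` would be expected to print the address listed for `master`; a non-zero exit status means the name has no entry in the file.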
kubeadm init always hangs at the following step:
I0201 00:57:06.271718 29692 waitcontrolplane.go:91] [wait-control-plane] Waiting for the API server to be healthy
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
I googled it and found the following issues, which look like what I am hitting:
https://github.com/cri-o/cri-o/issues/2357
https://github.com/kubernetes/kubeadm/issues/1153
https://github.com/kubernetes/kubeadm/issues/2370
https://github.com/kubernetes/kubernetes/issues/106464
I removed Docker where it was installed and double-checked that the cgroup driver is the same in CRI-O and the kubelet. The kubelet still reports the error below:
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.198073 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.298393 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.398656 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.499651 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.599724 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.700032 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.800410 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:19 master kubelet[29902]: E0201 00:57:19.900674 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:20 master kubelet[29902]: E0201 00:57:20.001051 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Feb 01 00:57:20 master kubelet[29902]: E0201 00:57:20.101439 29902 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
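The "node not found" line repeats roughly every 100 ms, so it tends to bury whatever the kubelet logged before it. One way to surface the other messages is to strip the timestamps and count distinct lines. This is a rough sketch that assumes klog-style lines like the ones above; `summarize_klog` is a hypothetical name, and lines that do not match the pattern pass through unchanged.

```shell
# summarize_klog
# Read kubelet journal lines on stdin, drop the journal prefix and the
# klog timestamp/thread id, and print each distinct "file:line message"
# with its count, most frequent first.
summarize_klog() {
    sed -E 's/^.*[IWEF][0-9]{4} [0-9:.]+ +[0-9]+ ([^ ]+:[0-9]+)\] /\1 /' |
        sort | uniq -c | sort -rn
}
```

A possible invocation is `journalctl -u kubelet --no-pager | summarize_klog | head -n 20`. The entries with low counts near the bottom of the full output are often the more informative ones.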
I tried upgrading kubeadm, kubelet, and kubectl to the newest version, 1.23.3, but that did not help. Can anyone offer some insight? Thanks.
BTW, below is the kubeadm.yaml I pass to kubeadm init.
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  name: master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.23.3
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 192.168.0.0/16
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
resolvConf: /run/systemd/resolve/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
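The kubelet section of this config sets `cgroupDriver: systemd`, and CRI-O has to use the same driver. The sketch below compares the two values as read from files. It assumes the commonly used locations `/etc/crio/crio.conf` (key `cgroup_manager`) and `/var/lib/kubelet/config.yaml` (key `cgroupDriver`); your install may differ, and drop-in files under `/etc/crio/crio.conf.d/` are not merged here. All three function names are hypothetical.

```shell
# Print the cgroup_manager value from a crio.conf-style TOML file.
# Commented-out lines are ignored; the last active setting wins.
crio_cgroup_manager() {
    sed -n -E 's/^[[:space:]]*cgroup_manager[[:space:]]*=[[:space:]]*"([^"]*)".*$/\1/p' \
        "${1:-/etc/crio/crio.conf}" | tail -n 1
}

# Print the top-level cgroupDriver value from a KubeletConfiguration YAML file.
kubelet_cgroup_driver() {
    sed -n -E 's/^cgroupDriver:[[:space:]]*"?([A-Za-z]+)"?[[:space:]]*$/\1/p' \
        "${1:-/var/lib/kubelet/config.yaml}" | tail -n 1
}

# cgroup_drivers_match [CRIO_CONF] [KUBELET_CONFIG]
# Print both values and succeed only if they are equal and non-empty.
cgroup_drivers_match() {
    crio_value=$(crio_cgroup_manager "$1")
    kubelet_value=$(kubelet_cgroup_driver "$2")
    echo "crio=${crio_value:-unset} kubelet=${kubelet_value:-unset}"
    [ -n "$crio_value" ] && [ "$crio_value" = "$kubelet_value" ]
}
```

Running `cgroup_drivers_match` with no arguments checks the default paths. A result of `crio=unset` only means the key is not set explicitly in that file, in which case CRI-O falls back to its built-in default.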
Best Answer
It is caused by the API server not being healthy.
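Since kubeadm is stuck waiting for the API server to become healthy, it can help to probe the health endpoint directly from the control-plane node. A minimal sketch, assuming `curl` is installed and the API server listens on the `bindPort` from the config (6443); `probe_healthz` is a hypothetical helper name.

```shell
# probe_healthz [URL]
# Succeed only if the endpoint answers with a non-error HTTP status
# within 5 seconds. -k skips TLS verification, which is acceptable for a
# local smoke test only.
probe_healthz() {
    curl -k -s -S -f --max-time 5 "${1:-https://127.0.0.1:6443/healthz}"
}
```

If `probe_healthz` fails, the next place to look is usually the API server container itself, for example `crictl --runtime-endpoint unix:///var/run/crio/crio.sock ps -a` to find the kube-apiserver container, followed by `crictl logs` on its container ID.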
Answers
One thing I forgot to mention: I also tested this locally in VMware Pro with the same configuration, and the problem is the same.
Hi @yang.wang11,
Kubernetes is highly sensitive to VM instance/Node networking configuration. Have you had a chance to watch the two cluster setup videos for AWS and GCP? While they are different cloud providers, it is possible you may find some networking and firewall configuration tips that can be used in other cloud settings or local hypervisors.
I would also stick with the recommended Kubernetes v1.22.1, as per the lab guide, and the VM guest OS - Ubuntu 18.04 LTS. Disable guest OS firewalls if any are enabled by default, and disable swap as well.
Regards,
-Chris
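The swap advice above is easy to verify before rerunning kubeadm init. A small sketch, assuming a Linux host where `/proc/swaps` lists active swap devices after a single header line; `swap_is_active` is a hypothetical helper name.

```shell
# swap_is_active [SWAPS_FILE]
# Succeed if the file lists at least one swap device after the header line.
# SWAPS_FILE defaults to /proc/swaps.
swap_is_active() {
    awk 'NR > 1 && NF > 0 { n++ } END { exit (n ? 0 : 1) }' "${1:-/proc/swaps}"
}
```

For example, `swap_is_active && echo "swap is still on"` flags a node that still needs `sudo swapoff -a` (plus commenting out the swap entry in `/etc/fstab` to keep it off after a reboot). On Ubuntu, `sudo ufw status` shows whether the guest firewall is enabled.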