I need help with kubeadm join

I executed the following command to join a node:
kubeadm join --token 6bjbzh.e2sy2wc9uau0kofc k8scp:6443 --discovery-token-ca-cert-hash sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

and I got the following error:
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: couldn't validate the identity of the API Server: cluster CA found in cluster-info ConfigMap is invalid: none of the public keys "sha256:a0318ce2a89902bbbcf0fa578f4b7d912f5e957cbc3967d7bb64f6884d288f11" are pinned
To see the stack trace of this error execute with --v=5 or higher

Can someone please help me?

Comments

  • etofran810
    etofran810 Posts: 50

    I checked syslog and saw the following messages:

    Mar 10 17:53:42 workerkub systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
    Mar 10 17:53:42 workerkub systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 108.
    Mar 10 17:53:42 workerkub systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:42 workerkub systemd[1]: Started kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:43 workerkub kubelet[5822]: E0310 17:53:43.007203 5822 server.go:204] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory" path="/var/lib/kubelet/config.yaml"
    Mar 10 17:53:43 workerkub systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Mar 10 17:53:43 workerkub systemd[1]: kubelet.service: Failed with result 'exit-code'.
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 109.
    Mar 10 17:53:53 workerkub systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:53 workerkub systemd[1]: Started kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:53 workerkub kubelet[5854]: E0310 17:53:53.259710 5854 server.go:204] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory" path="/var/lib/kubelet/config.yaml"
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Failed with result 'exit-code'

  • serewicz
    serewicz Posts: 997

    Hello,

    Thank you for the note. First off, I notice there is no IP address in your command; one would normally appear right after the join statement. Were there any errors on your control plane node? Please run kubectl get pod --all-namespaces on the control plane, as perhaps something isn't running correctly. The error also points to a missing file that would typically be present. To assist with troubleshooting, what version of the OS are you using, and what version of Kubernetes were you installing?
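
    One thing worth verifying is the discovery hash itself. The expected value can be recomputed on the control plane node with the command from the kubeadm documentation (a sketch; the path assumes a default kubeadm install):

    openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
        openssl rsa -pubin -outform der 2>/dev/null | \
        openssl dgst -sha256 -hex | sed 's/^.* //'

    If that output differs from the sha256 value passed to kubeadm join, the "none of the public keys ... are pinned" error above is expected.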

    Regards,

  • etofran810
    etofran810 Posts: 50

    I didn't indicate an IP address because I used the alias k8scp:6443; the alias is defined in /etc/hosts.

    The command was executed on the master node:
    @masterkub:~$ kubectl get pod --all-namespaces
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
    kube-system   calico-kube-controllers-6fd7b9848d-lcdrn   1/1     Running   0          35h
    kube-system   calico-node-d7vx5                          1/1     Running   0          35h
    kube-system   coredns-558bd4d5db-mbkqr                   1/1     Running   0          35h
    kube-system   coredns-558bd4d5db-vskr5                   1/1     Running   0          35h
    kube-system   etcd-masterkub                             1/1     Running   0          35h
    kube-system   kube-apiserver-masterkub                   1/1     Running   0          35h
    kube-system   kube-controller-manager-masterkub          1/1     Running   0          35h
    kube-system   kube-proxy-dhpdq                           1/1     Running   0          35h
    kube-system   kube-scheduler-masterkub                   1/1     Running   0          35h
    @masterkub:~$

  • chrispokorni
    chrispokorni Posts: 1,615

    Hi @etofran810,

    When using an alias such as k8scp for the init and join phases, the same alias needs to be defined as the controlPlaneEndpoint in the ClusterConfiguration resource found in kubeadm.yaml. In addition, this alias and the control-plane node's private IP address should be added to each node's /etc/hosts file.
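
    As a minimal sketch (the apiVersion and podSubnet values here are assumptions; use the values from your setup), kubeadm.yaml would contain something like:

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    kubernetesVersion: 1.21.1
    controlPlaneEndpoint: "k8scp:6443"
    networking:
      podSubnet: 192.168.0.0/16

    and each node's /etc/hosts would map the alias to the control-plane node's private IP, for example (the IP is only a placeholder):

    10.2.0.3   k8scp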

    If the alias is not defined there, the init phase does not use it when binding the CA and other certificates to this endpoint, causing authentication failures in the cluster.

    Assuming all of the above is defined as expected, you could try sudo kubeadm reset on your worker node. Then generate a new token on the control-plane node and display the matching join command with sudo kubeadm token create --print-join-command. Copy the displayed join command and run it on the worker node (sudo kubeadm join ...).
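
    In sequence, with placeholder values (the real token and hash come from the --print-join-command output):

    # worker node: clear the failed join state
    sudo kubeadm reset
    # control-plane node: create a token and print the matching join command
    sudo kubeadm token create --print-join-command
    # worker node: run the printed command, which looks like
    sudo kubeadm join k8scp:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>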

    If all else fails, rebuilding your cluster from scratch should provide you with a clean start.

    Regards,
    -Chris

  • etofran810
    etofran810 Posts: 50

    Can I execute kubeadm reset on the master node, and then run the init command to initialize the control plane again?

  • chrispokorni
    chrispokorni Posts: 1,615

    Hi @etofran810,

    Running sudo kubeadm reset on the control plane node will clean up all the cluster configuration that was created during the init process, including all certificates, keys, cluster admin authentication credentials, etc. This also invalidates every node that had joined the cluster, so those nodes should be reset as well before being added into a new cluster.

    After a successful reset of the control plane node you can run the init command again on the control plane node, followed by new join commands for the additional nodes that need to be added back into the cluster.
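
    Roughly, assuming the original kubeadm.yaml is reused (your flags may differ):

    # control-plane node
    sudo kubeadm reset
    sudo kubeadm init --config kubeadm.yaml
    # each additional node, after its own reset
    sudo kubeadm join ...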

    Regards,
    -Chris

  • etofran810
    etofran810 Posts: 50

    Thanks. I started the whole process over; when I executed init, the output indicated the following command:
    kubeadm join masterkub-1:6443 --token zyjg96.kb6iia18tdmff3wm \
        --discovery-token-ca-cert-hash sha256:e4e0083a590af909aa19a1bc6a0497c237e5fd9b4a43dd3e836f8a4a55ac8daf \
        --control-plane --certificate-key ea813cf4c836c4270513227cc1b6fd7750e3809044cb425c6341283192202760

    Right now it has joined:
    This node has joined the cluster and a new control plane instance was created:

    NAME          STATUS   ROLES                  AGE    VERSION
    masterkub-1   Ready    control-plane,master   22m    v1.21.1
    workerkub     Ready    control-plane,master   3m5s   v1.21.1

  • chrispokorni
    chrispokorni Posts: 1,615

    Hi @etofran810,

    The kubeadm join command allows us to add both types of nodes into the cluster - control plane and worker. For a control plane node it requires the additional flags and certificate key, as provided above. So it seems that when adding the workerkub node to the cluster, the join command used was the one specific to control plane nodes, and as a result the worker node shows the control-plane ROLE.
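
    Side by side, with values elided, the two variants look like this:

    # joins the node as a worker
    sudo kubeadm join masterkub-1:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash>

    # joins the node as an additional control plane instance
    sudo kubeadm join masterkub-1:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash> \
        --control-plane --certificate-key <key>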

    Regards,
    -Chris

  • etofran810
    etofran810 Posts: 50

    OK, I have a question. When I power the node off and back on and run kubectl get nodes, it shows no information:
    kubectl get nodes
    The connection to the server masterkub-1:6443 was refused - did you specify the right host or port?
    @masterkub-1:~$ kubectl describe node masterkub-1
    The connection to the server masterkub-1:6443 was refused - did you specify the right host or port?

    Must I execute the init command every time I power on the node?

  • etofran810
    etofran810 Posts: 50
    edited March 24

    I checked syslog and saw:
    Mar 24 00:35:46 masterkub-1 kubelet[1133]: E0324 00:35:46.859423 1133 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://masterkub-1:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/masterkub-1?timeout=10s": dial tcp 10.2.0.3:6443: connect: connection refused

    Mar 24 00:35:51 masterkub-1 kubelet[1133]: E0324 00:35:51.391056 1133 kubelet.go:2291] "Error getting node" err="node \"masterkub-1\" not found"
    Mar 24 00:35:51 masterkub-1 kubelet[1133]: E0324 00:35:51.491907 1133 kubelet.go:2291] "Error getting node" err="node \"masterkub-1\" not found"

    I checked ip addr, and I don't see the interfaces created by Calico.

  • etofran810
    etofran810 Posts: 50

    I rebooted several times and now the kubectl command works.

  • etofran810
    etofran810 Posts: 50

    Can I change the role of the worker node?
    kubectl get nodes
    NAME          STATUS   ROLES                  AGE   VERSION
    masterkub-1   Ready    control-plane,master   2d    v1.22.1
    workerkub     Ready    control-plane,master   47h   v1.21.1

    The master node has already been upgraded.

  • chrispokorni
    chrispokorni Posts: 1,615

    Hi @etofran810,

    You can drain the worker node, delete it from the cluster, generate a new join command on the control plane node, reset the worker node, and then run the newly generated join command on the worker node:

    control plane: kubectl drain <worker> --ignore-daemonsets
    control plane: kubectl delete node <worker>
    control plane: sudo kubeadm token create --print-join-command
    worker node: sudo kubeadm reset
    worker node: sudo kubeadm join ...

    Now the worker node should no longer show the control-plane ROLE.
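
    Afterwards, kubectl get nodes should show the worker without that role, along the lines of (ages and versions will differ):

    NAME          STATUS   ROLES                  AGE   VERSION
    masterkub-1   Ready    control-plane,master   2d    v1.22.1
    workerkub     Ready    <none>                 1m    v1.21.1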

    Regards,
    -Chris

  • etofran810
    etofran810 Posts: 50

    I am installing a new cluster on the master (CP) node, following the same process.

    Now I get the following error when I execute kubeadm init:

    [control-plane] Creating static Pod manifest for "kube-scheduler"
    [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [kubelet-check] Initial timeout of 40s passed.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.

    Unfortunately, an error has occurred:
        timed out waiting for the condition
    
    This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
    
    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
    
    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.
    
    Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
    

    error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
    To see the stack trace of this error execute with --v=5 or higher

    Please help me.

  • etofran810
    etofran810 Posts: 50

    Syslog output:
    Apr 20 21:27:57 masterkub systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Apr 20 21:27:57 masterkub systemd[1]: Started kubelet: The Kubernetes Node Agent.
    Apr 20 21:27:57 masterkub kubelet[5722]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
    Apr 20 21:27:57 masterkub kubelet[5722]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
    Apr 20 21:27:57 masterkub systemd[1]: Started Kubernetes systemd probe.
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.670271 5722 server.go:446] "Kubelet version" kubeletVersion="v1.23.1"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.670686 5722 server.go:874] "Client rotation is on, will bootstrap in background"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.674947 5722 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.676330 5722 dynamic_cafile_content.go:156] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.739949 5722 server.go:693] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.740333 5722 container_manager_linux.go:281] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.740439 5722 container_manager_linux.go:286] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity: Percentage:0.15} GracePeriod:0s MinReclaim:} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:} {Signal:nodefs.available Operator:LessThan Value:{Quantity: Percentage:0.1} GracePeriod:0s MinReclaim:} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity: Percentage:0.05} GracePeriod:0s MinReclaim:}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742777 5722 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742819 5722 container_manager_linux.go:321] "Creating device plugin manager" devicePluginEnabled=true
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742871 5722 state_mem.go:36] "Initialized new in-memory state store"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742951 5722 kubelet.go:313] "Using dockershim is deprecated, please consider using a full-fledged CRI implementation"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742988 5722 client.go:80] "Connecting to docker on the dockerEndpoint" endpoint="unix:///var/run/docker.sock"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.743013 5722 client.go:99] "Start docker client with request timeout" timeout="2m0s"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.752719 5722 docker_service.go:571] "Hairpin mode is set but kubenet is not enabled, falling back to HairpinVeth" hairpinMode=promiscuous-bridge
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.752762 5722 docker_service.go:243] "Hairpin mode is set" hairpinMode=hairpin-veth
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.752920 5722 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.758152 5722 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.758260 5722 docker_service.go:258] "Docker cri networking managed by the network plugin" networkPluginName="cni"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.758393 5722 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.767434 5722 docker_service.go:264] "Docker Info" dockerInfo=&{ID:7QRH:E3HB:5Y4A:2ALN:QJGZ:LA2S:XGZ6:ZDID:IHSL:HG2C:76M3:P5WG Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:7 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:false KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:23 OomKillDisable:true NGoroutines:34 SystemTime:2022-04-20T21:27:57.7595313Z LoggingDriver:json-file CgroupDriver:cgroupfs CgroupVersion:1 NEventsListener:0 KernelVersion:5.4.0-1069-gcp OperatingSystem:Ubuntu 18.04.6 LTS OSVersion:18.04 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0003fcd20 NCPU:2 MemTotal:7812435968 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:masterkub Labels:[] ExperimentalBuild:false ServerVersion:20.10.7 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:} io.containerd.runtime.v1.linux:{Path:runc Args:[] Shim:} runc:{Path:runc Args:[] Shim:}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster: Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID: Expected:} RuncCommit:{ID: Expected:} InitCommit:{ID: Expected:} SecurityOptions:[name=apparmor name=seccomp,profile=default] ProductLicense: DefaultAddressPools:[] Warnings:[WARNING: No swap limit support]}
    Apr 20 21:27:57 masterkub kubelet[5722]: E0420 21:27:57.767497 5722 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
    Apr 20 21:27:57 masterkub systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Apr 20 21:27:57 masterkub systemd[1]: kubelet.service: Failed with result 'exit-code'.
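
    The last error line shows the root cause: the kubelet is configured for the systemd cgroup driver while Docker is using cgroupfs. The fix documented at kubernetes.io is to switch Docker to the systemd driver and restart it; a sketch (this overwrites /etc/docker/daemon.json, so merge by hand if the file already has content):

    cat <<EOF | sudo tee /etc/docker/daemon.json
    {
      "exec-opts": ["native.cgroupdriver=systemd"]
    }
    EOF
    sudo systemctl restart docker
    sudo systemctl restart kubelet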

  • etofran810
    etofran810 Posts: 50

    Syslog output:
    Apr 20 21:46:22 masterkub kubelet[1055]: E0420 21:46:22.431207 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
    Apr 20 21:46:23 masterkub kubelet[1055]: I0420 21:46:23.842211 1055 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:46:27 masterkub kubelet[1055]: E0420 21:46:27.441583 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
    Apr 20 21:46:28 masterkub kubelet[1055]: I0420 21:46:28.843277 1055 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:46:31 masterkub kubelet[1055]: E0420 21:46:31.965335 1055 summary_sys_containers.go:48] "Failed to get system container stats" err="failed to get cgroup stats for \"/system.slice/docker.service\": failed to get container info for \"/system.slice/docker.service\": unknown container \"/system.slice/docker.service\"" containerName="/system.slice/docker.service"
    Apr 20 21:46:32 masterkub kubelet[1055]: E0420 21:46:32.450832 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
    Apr 20 21:46:33 masterkub kubelet[1055]: I0420 21:46:33.844389 1055 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:46:37 masterkub kubelet[1055]: E0420 21:46:37.461076 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"

    @masterkub:~# ls -ltr /etc/cni/net.d
    total 0
    @masterkub:~#
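
    The empty /etc/cni/net.d explains the "cni config uninitialized" messages: no network plugin has written its configuration yet. The directory is normally populated once the Calico manifest is applied and the calico-node pod starts; a sketch of the usual check and re-apply (confirm the manifest URL against your course material):

    kubectl get pods -n kube-system | grep calico
    kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml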

  • etofran810
    etofran810 Posts: 50

    I restarted several times and now the cluster is OK.

    @masterkub:~$ kubectl get node

    NAME        STATUS   ROLES                  AGE   VERSION
    masterkub   Ready    control-plane,master   14h   v1.23.1
    workerkub   Ready    <none>                 13h   v1.23.1
