I need help with kubeadm join

I ran the following command to join a node:
kubeadm join --token 6bjbzh.e2sy2wc9uau0kofc k8scp:6443 --discovery-token-ca-cert-hash sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

and I got the following error:
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: couldn't validate the identity of the API Server: cluster CA found in cluster-info ConfigMap is invalid: none of the public keys "sha256:a0318ce2a89902bbbcf0fa578f4b7d912f5e957cbc3967d7bb64f6884d288f11" are pinned
To see the stack trace of this error execute with --v=5 or higher

Can someone please help me?
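
One detail worth checking, offered as an observation rather than a confirmed diagnosis: the value pinned in the join command above, sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855, is the SHA-256 digest of empty input. That is what a hash pipeline produces when the CA certificate it is supposed to read does not exist, for example if it was run on the worker instead of the control plane node. The sketch below prints the empty-input digest for comparison and wraps the openssl pipeline from the kubeadm documentation in a function. The function name ca_cert_hash is illustrative, and the default path is the usual kubeadm location, assumed here.

```shell
# Digest of empty input, for comparison with the hash passed to kubeadm join.
empty_hash=$(printf '' | sha256sum | cut -d' ' -f1)
echo "sha256 of empty input: $empty_hash"

# Hypothetical helper: print the discovery hash for a CA certificate.
# Intended to be run on the control plane node; the default path is the
# usual kubeadm location and is only an assumption here.
ca_cert_hash() {
  local cert="${1:-/etc/kubernetes/pki/ca.crt}"
  # Fail loudly instead of silently hashing empty input when the file is missing.
  [ -r "$cert" ] || { echo "cannot read $cert" >&2; return 1; }
  openssl x509 -pubkey -in "$cert" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}
```

On the control plane node, calling ca_cert_hash and comparing its output with the sha256: value in the join command shows whether the two match.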

Comments

  • I checked syslog and saw the following messages:

    Mar 10 17:53:42 workerkub systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
    Mar 10 17:53:42 workerkub systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 108.
    Mar 10 17:53:42 workerkub systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:42 workerkub systemd[1]: Started kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:43 workerkub kubelet[5822]: E0310 17:53:43.007203 5822 server.go:204] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory" path="/var/lib/kubelet/config.yaml"
    Mar 10 17:53:43 workerkub systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Mar 10 17:53:43 workerkub systemd[1]: kubelet.service: Failed with result 'exit-code'.
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 109.
    Mar 10 17:53:53 workerkub systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:53 workerkub systemd[1]: Started kubelet: The Kubernetes Node Agent.
    Mar 10 17:53:53 workerkub kubelet[5854]: E0310 17:53:53.259710 5854 server.go:204] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory" path="/var/lib/kubelet/config.yaml"
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Mar 10 17:53:53 workerkub systemd[1]: kubelet.service: Failed with result 'exit-code'
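
For context on the log above: /var/lib/kubelet/config.yaml is written by kubeadm init or kubeadm join, so on a node that has not yet joined successfully, a kubelet restarting in a loop with this error is expected rather than a separate fault. A small check along these lines can confirm which state a node is in. The function name kubelet_config_state is made up for illustration.

```shell
# Hypothetical helper: report whether the kubelet config file exists.
# The default path is the one named in the syslog error above.
kubelet_config_state() {
  local cfg="${1:-/var/lib/kubelet/config.yaml}"
  if [ -f "$cfg" ]; then
    echo "present: $cfg (kubeadm init or join has written the kubelet config)"
  else
    echo "missing: $cfg (expected until kubeadm init or join succeeds)"
  fi
}

kubelet_config_state
```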

  • serewicz
    serewicz Posts: 1,000

    Hello,

    Thank you for the note. First off, I notice there is no IP address in your command, which would normally come right after the join statement. Were there any errors on your control plane node? Please run kubectl get pod --all-namespaces on the control plane, as perhaps something isn't running correctly. Also, the syslog error is about a missing file that would typically be there. To assist with troubleshooting, what version of the OS are you using, and what version of Kubernetes were you installing?

    Regards,

  • I didn't specify an IP address because I used the alias k8scp:6443. The alias is defined in /etc/hosts.

    The command was executed on the master node:
    franktorres@masterkub:~$ kubectl get pod --all-namespaces
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
    kube-system   calico-kube-controllers-6fd7b9848d-lcdrn   1/1     Running   0          35h
    kube-system   calico-node-d7vx5                          1/1     Running   0          35h
    kube-system   coredns-558bd4d5db-mbkqr                   1/1     Running   0          35h
    kube-system   coredns-558bd4d5db-vskr5                   1/1     Running   0          35h
    kube-system   etcd-masterkub                             1/1     Running   0          35h
    kube-system   kube-apiserver-masterkub                   1/1     Running   0          35h
    kube-system   kube-controller-manager-masterkub          1/1     Running   0          35h
    kube-system   kube-proxy-dhpdq                           1/1     Running   0          35h
    kube-system   kube-scheduler-masterkub                   1/1     Running   0          35h
    franktorres@masterkub:~$

  • chrispokorni
    chrispokorni Posts: 2,315

    Hi @etofran810,

    When using an alias such as k8scp for the init and join phases, the same alias needs to be defined as the controlPlaneEndpoint in the ClusterConfiguration resource found in kubeadm.yaml. In addition, this alias and the control-plane node's private IP address should be added to each node's /etc/hosts file.

    If the alias is not defined, the init phase does not bind the CA and other certificates to this endpoint, causing authentication failures in the cluster.

    Assuming all of the above is defined as expected, you could try sudo kubeadm reset on your worker node. Then generate a new token on the control-plane node and display the matching join command with sudo kubeadm token create --print-join-command. Copy the displayed join command and run it on the worker node: sudo kubeadm join ....

    If all else fails, rebuilding your cluster from scratch should provide you with a clean start.

    Regards,
    -Chris
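
The alias setup described in the post above might look like the following sketch. The file name kubeadm-config.yaml, the v1beta2 API version, the Kubernetes version, and the pod subnet are assumptions for illustration, not values confirmed for this cluster, and the IP address is left as a placeholder.

```shell
# Write a minimal kubeadm configuration that names the alias as the
# control plane endpoint (all values below are illustrative assumptions).
cat > kubeadm-config.yaml <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: 1.21.1
controlPlaneEndpoint: "k8scp:6443"
networking:
  podSubnet: 192.168.0.0/16
EOF

# Each node's /etc/hosts also needs a line mapping the alias to the
# control plane node's private IP, for example:
#   <control-plane-private-IP> k8scp
# The control plane would then be initialized with something like:
#   sudo kubeadm init --config=kubeadm-config.yaml --upload-certs
cat kubeadm-config.yaml
```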

  • Can I run kubeadm reset on the master node and then run init again to initialize the control plane?

  • chrispokorni
    chrispokorni Posts: 2,315

    Hi @etofran810,

    Running sudo kubeadm reset on the control plane node will clean up all the cluster configuration that was created during the init process, including all certificates, keys, cluster admin authentication credentials, etc. This will also invalidate all other joined nodes, which should be reset as well before they are added to a new cluster.

    After a successful reset of the control plane node you can run the init command again on the control plane node, followed by new join commands for the additional nodes that need to be added back into the cluster.

    Regards
    -Chris

  • Thanks. I started the whole process over. When I ran init, the output showed the following command:
    kubeadm join masterkub-1:6443 --token zyjg96.kb6iia18tdmff3wm \
        --discovery-token-ca-cert-hash sha256:e4e0083a590af909aa19a1bc6a0497c237e5fd9b4a43dd3e836f8a4a55ac8daf \
        --control-plane --certificate-key ea813cf4c836c4270513227cc1b6fd7750e3809044cb425c6341283192202760

    The node has now joined:
    This node has joined the cluster and a new control plane instance was created:

    NAME          STATUS   ROLES                  AGE    VERSION
    masterkub-1   Ready    control-plane,master   22m    v1.21.1
    workerkub     Ready    control-plane,master   3m5s   v1.21.1

  • chrispokorni
    chrispokorni Posts: 2,315

    Hi @etofran810,

    The kubeadm join command allows us to add into the cluster both types of nodes - control plane and worker. For the control plane type of node it requires the additional flags and hash, as provided above. So it seems that when adding the workerkub node to the cluster the join command selected was the one specific to control plane nodes, and as a result the worker node shows the control-plane ROLE.

    Regards,
    -Chris
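
To make the distinction in the post above concrete: the two kinds of join command differ by the --control-plane flag (together with its --certificate-key). A quick guard like the sketch below can be run on a copied join command before executing it on a node that is meant to be a worker. The helper name join_role is hypothetical, and the token and hash in the example are placeholders.

```shell
# Hypothetical helper: classify a kubeadm join command line by its flags.
join_role() {
  case " $* " in
    *" --control-plane "*) echo "control-plane" ;;
    *)                     echo "worker" ;;
  esac
}

# Example with placeholder token and hash values.
join_role kubeadm join k8scp:6443 --token '<token>' \
  --discovery-token-ca-cert-hash 'sha256:<hash>'
```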

  • OK, I have a question. When I power the node off and on again and then run kubectl get nodes, it shows no information:
    kubectl get nodes
    The connection to the server masterkub-1:6443 was refused - did you specify the right host or port?
    @masterkub-1:~$ kubectl describe node masterkub-1
    The connection to the server masterkub-1:6443 was refused - did you specify the right host or port?

    Do I have to run init every time I power on the node?

  • etofran810
    etofran810 Posts: 51
    edited March 2022

    I checked syslog and saw:
    Mar 24 00:35:46 masterkub-1 kubelet[1133]: E0324 00:35:46.859423 1133 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://masterkub-1:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/masterkub-1?timeout=10s": dial tcp 10.2.0.3:6443: connect: connection refused

    Mar 24 00:35:51 masterkub-1 kubelet[1133]: E0324 00:35:51.391056 1133 kubelet.go:2291] "Error getting node" err="node \"masterkub-1\" not found"
    Mar 24 00:35:51 masterkub-1 kubelet[1133]: E0324 00:35:51.491907 1133 kubelet.go:2291] "Error getting node" err="node \"masterkub-1\" not found"

    I ran ip add, and I don't see the IP address created by Calico.

  • I rebooted several times and the kubectl command works now.
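
On the earlier question about re-running init after a power cycle: a completed kubeadm init does not normally have to be repeated after a reboot. Assuming the kubelet service is enabled and swap stays disabled, the kubelet brings the control plane static Pods back up on its own, which can take a little while, and kubectl reports "connection refused" until the API server is listening again. A retry loop such as this sketch avoids guessing when it is ready. The helper name wait_for is hypothetical and the timeout in the usage comment is arbitrary.

```shell
# Hypothetical helper: retry a command about once per second until it
# succeeds or the given number of seconds has elapsed.
wait_for() {
  local timeout="$1"; shift
  local waited=0
  until "$@" >/dev/null 2>&1; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}

# After a reboot of the control plane node, for example:
#   wait_for 180 kubectl get nodes && kubectl get nodes
```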

  • Can I change the role of the worker node?
    kubectl get nodes
    NAME          STATUS   ROLES                  AGE   VERSION
    masterkub-1   Ready    control-plane,master   2d    v1.22.1
    workerkub     Ready    control-plane,master   47h   v1.21.1

    The master node has already been upgraded.

  • chrispokorni
    chrispokorni Posts: 2,315

    Hi @etofran810,

    You can drain the worker node, then delete the worker node from the cluster, then generate a new join command from the control plane node, reset the worker node, and then run the newly generated join command on the worker node.

    control plane: kubectl drain <worker> --ignore-daemonsets
    control plane: kubectl delete node <worker>
    control plane: sudo kubeadm token create --print-join-command
    worker node: sudo kubeadm reset
    worker node: sudo kubeadm join ...

    Now the worker node should no longer show the control-plane ROLE.

    Regards,
    -Chris

  • I am installing a new cluster and running the same process on the control plane (master) node.

    Now I get the following error when I run kubeadm init:

    [control-plane] Creating static Pod manifest for "kube-scheduler"
    [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [kubelet-check] Initial timeout of 40s passed.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.

    Unfortunately, an error has occurred:
        timed out waiting for the condition
    
    This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
    
    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
    
    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.
    
    Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
    

    error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
    To see the stack trace of this error execute with --v=5 or higher

    Please help me.

  • syslog output:
    Apr 20 21:27:57 masterkub systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
    Apr 20 21:27:57 masterkub systemd[1]: Started kubelet: The Kubernetes Node Agent.
    Apr 20 21:27:57 masterkub kubelet[5722]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
    Apr 20 21:27:57 masterkub kubelet[5722]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
    Apr 20 21:27:57 masterkub systemd[1]: Started Kubernetes systemd probe.
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.670271 5722 server.go:446] "Kubelet version" kubeletVersion="v1.23.1"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.670686 5722 server.go:874] "Client rotation is on, will bootstrap in background"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.674947 5722 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.676330 5722 dynamic_cafile_content.go:156] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.739949 5722 server.go:693] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.740333 5722 container_manager_linux.go:281] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.740439 5722 container_manager_linux.go:286] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity: Percentage:0.15} GracePeriod:0s MinReclaim:} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:} {Signal:nodefs.available Operator:LessThan Value:{Quantity: Percentage:0.1} GracePeriod:0s MinReclaim:} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity: Percentage:0.05} GracePeriod:0s MinReclaim:}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742777 5722 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742819 5722 container_manager_linux.go:321] "Creating device plugin manager" devicePluginEnabled=true
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742871 5722 state_mem.go:36] "Initialized new in-memory state store"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742951 5722 kubelet.go:313] "Using dockershim is deprecated, please consider using a full-fledged CRI implementation"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.742988 5722 client.go:80] "Connecting to docker on the dockerEndpoint" endpoint="unix:///var/run/docker.sock"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.743013 5722 client.go:99] "Start docker client with request timeout" timeout="2m0s"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.752719 5722 docker_service.go:571] "Hairpin mode is set but kubenet is not enabled, falling back to HairpinVeth" hairpinMode=promiscuous-bridge
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.752762 5722 docker_service.go:243] "Hairpin mode is set" hairpinMode=hairpin-veth
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.752920 5722 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.758152 5722 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.758260 5722 docker_service.go:258] "Docker cri networking managed by the network plugin" networkPluginName="cni"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.758393 5722 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:27:57 masterkub kubelet[5722]: I0420 21:27:57.767434 5722 docker_service.go:264] "Docker Info" dockerInfo=&{ID:7QRH:E3HB:5Y4A:2ALN:QJGZ:LA2S:XGZ6:ZDID:IHSL:HG2C:76M3:P5WG Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:7 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:false KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:23 OomKillDisable:true NGoroutines:34 SystemTime:2022-04-20T21:27:57.7595313Z LoggingDriver:json-file CgroupDriver:cgroupfs CgroupVersion:1 NEventsListener:0 KernelVersion:5.4.0-1069-gcp OperatingSystem:Ubuntu 18.04.6 LTS OSVersion:18.04 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0003fcd20 NCPU:2 MemTotal:7812435968 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:masterkub Labels:[] ExperimentalBuild:false ServerVersion:20.10.7 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:} io.containerd.runtime.v1.linux:{Path:runc Args:[] Shim:} runc:{Path:runc Args:[] Shim:}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster: Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID: Expected:} RuncCommit:{ID: Expected:} InitCommit:{ID: Expected:} SecurityOptions:[name=apparmor name=seccomp,profile=default] ProductLicense: DefaultAddressPools:[] Warnings:[WARNING: No swap limit support]}
    Apr 20 21:27:57 masterkub kubelet[5722]: E0420 21:27:57.767497 5722 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
    Apr 20 21:27:57 masterkub systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
    Apr 20 21:27:57 masterkub systemd[1]: kubelet.service: Failed with result 'exit-code'.
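
The last kubelet error in this log is the decisive one: the kubelet is configured for the systemd cgroup driver while Docker reports cgroupfs. One common way to align them, and the direction the preflight warning in the first post points in, is to set Docker's driver to systemd in its daemon.json. The sketch below writes the file to the current directory so it can be reviewed first. Copying it into place and restarting the services are left as commented steps, and it assumes there is no existing /etc/docker/daemon.json whose settings would need to be merged.

```shell
# Write a Docker daemon configuration selecting the systemd cgroup driver.
cat > daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
cat daemon.json

# Then, on the affected node (review before running):
#   sudo cp daemon.json /etc/docker/daemon.json
#   sudo systemctl restart docker
#   docker info --format '{{.CgroupDriver}}'   # expected to report systemd
#   sudo systemctl restart kubelet
```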

  • syslog output:
    Apr 20 21:46:22 masterkub kubelet[1055]: E0420 21:46:22.431207 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
    Apr 20 21:46:23 masterkub kubelet[1055]: I0420 21:46:23.842211 1055 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:46:27 masterkub kubelet[1055]: E0420 21:46:27.441583 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
    Apr 20 21:46:28 masterkub kubelet[1055]: I0420 21:46:28.843277 1055 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:46:31 masterkub kubelet[1055]: E0420 21:46:31.965335 1055 summary_sys_containers.go:48] "Failed to get system container stats" err="failed to get cgroup stats for \"/system.slice/docker.service\": failed to get container info for \"/system.slice/docker.service\": unknown container \"/system.slice/docker.service\"" containerName="/system.slice/docker.service"
    Apr 20 21:46:32 masterkub kubelet[1055]: E0420 21:46:32.450832 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
    Apr 20 21:46:33 masterkub kubelet[1055]: I0420 21:46:33.844389 1055 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
    Apr 20 21:46:37 masterkub kubelet[1055]: E0420 21:46:37.461076 1055 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"

    ls -ltr /etc/cni/net.d
    root@masterkub:~# ls -ltr /etc/cni/net.d
    total 0
    root@masterkub:~#
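
An empty /etc/cni/net.d matches the "cni config uninitialized" messages: the kubelet reports the network as not ready until a network plugin (Calico, in the earlier cluster in this thread) has been deployed and has written its configuration there. A check along these lines shows whether any configuration has appeared yet. The helper name cni_config_count is hypothetical.

```shell
# Hypothetical helper: count CNI configuration files in a directory.
cni_config_count() {
  local dir="${1:-/etc/cni/net.d}"
  find "$dir" -maxdepth 1 -type f \
    \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) 2>/dev/null \
    | wc -l | tr -d ' '
}

echo "CNI config files found: $(cni_config_count)"
# If the count stays at 0, the network plugin has probably not been
# applied yet, e.g. with kubectl apply -f <calico-manifest>.
```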

  • I restarted several times and now the cluster is OK.

    @masterkub:~$ kubectl get node

    NAME        STATUS   ROLES                  AGE   VERSION
    masterkub   Ready    control-plane,master   14h   v1.23.1
    workerkub   Ready    <none>                 13h   v1.23.1