Welcome to the Linux Foundation Forum!

Lab 2.2 - Unable To Start Control Plane Node

Hello everyone,

I am currently facing an issue during Exercise 2.2.

My setup is the following:

Using an AWS instance (t2.large) with the following specs:

2 CPU
8G memory
20G disk space

After starting up and connecting, I did the following:

checked the firewall status (disabled)
disabled swap
checked SELinux (disabled)
disabled AppArmor with the following commands:

sudo systemctl stop apparmor
sudo systemctl disable apparmor
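
For completeness, the other prep steps can be sketched like this (a minimal sketch assuming a stock Ubuntu AMI; ufw and a space-delimited swap entry in /etc/fstab are assumptions about the instance):

```shell
# Check that the firewall is inactive (Ubuntu's default frontend is ufw)
sudo ufw status

# Turn swap off for the running system, then comment out any swap
# entries in /etc/fstab so it stays off across reboots
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
```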

When running the provided k8scp.sh shell script, I get the success message:

Your Kubernetes control-plane has initialized successfully!

But the kubectl command at the end of the script shows the following output:

The connection to the server 172.31.39.164:6443 was refused - did you specify the right host or port?

After some time, I can run kubectl commands, but they show the CP node as NotReady.

The describe command for this node returns:

Name:               ip-172-31-39-164
Roles:              control-plane
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-172-31-39-164
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=
                    node.kubernetes.io/exclude-from-external-load-balancers=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 25 Aug 2022 09:19:42 +0000
Taints:             node-role.kubernetes.io/control-plane:NoSchedule
                    node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ip-172-31-39-164
  AcquireTime:     <unset>
  RenewTime:       Thu, 25 Aug 2022 09:21:18 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Addresses:
  InternalIP:  172.31.39.164
  Hostname:    ip-172-31-39-164
Capacity:
  cpu:                2
  ephemeral-storage:  20134592Ki
  hugepages-2Mi:      0
  memory:             8137712Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  18556039957
  hugepages-2Mi:      0
  memory:             8035312Ki
  pods:               110
System Info:
  Machine ID:                 18380e0a74d14c1db72eeaba35b3daa2
  System UUID:                ec2c0143-a6ec-7352-60c1-21888f960243
  Boot ID:                    50f8ff11-1232-4069-bcee-9df6ba3da059
  Kernel Version:             5.15.0-1017-aws
  OS Image:                   Ubuntu 22.04.1 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.7
  Kubelet Version:            v1.24.1
  Kube-Proxy Version:         v1.24.1
Non-terminated Pods:          (4 in total)
  Namespace                   Name                                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                        ------------  ----------  ---------------  -------------  ---
  kube-system                 etcd-ip-172-31-39-164                       100m (5%)     0 (0%)      100Mi (1%)       0 (0%)         24s
  kube-system                 kube-apiserver-ip-172-31-39-164             250m (12%)    0 (0%)      0 (0%)           0 (0%)         24s
  kube-system                 kube-controller-manager-ip-172-31-39-164    200m (10%)    0 (0%)      0 (0%)           0 (0%)         18s
  kube-system                 kube-scheduler-ip-172-31-39-164             100m (5%)     0 (0%)      0 (0%)           0 (0%)         17s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                650m (32%)  0 (0%)
  memory             100Mi (1%)  0 (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:
  Type     Reason                   Age                  From     Message
  ----     ------                   ----                 ----     -------
  Warning  InvalidDiskCapacity      107s                 kubelet  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  107s (x3 over 107s)  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    107s (x3 over 107s)  kubelet  Node ip-172-31-39-164 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     107s (x2 over 107s)  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  107s                 kubelet  Updated Node Allocatable limit across pods
  Normal   Starting                 107s                 kubelet  Starting kubelet.
  Normal   NodeAllocatableEnforced  97s                  kubelet  Updated Node Allocatable limit across pods
  Normal   Starting                 97s                  kubelet  Starting kubelet.
  Warning  InvalidDiskCapacity      97s                  kubelet  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  97s                  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientMemory
  Normal   NodeHasSufficientPID     97s                  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientPID
  Normal   NodeHasNoDiskPressure    97s                  kubelet  Node ip-172-31-39-164 status is now: NodeHasNoDiskPressure
  Normal   Starting                 33s                  kubelet  Starting kubelet.
  Warning  InvalidDiskCapacity      33s                  kubelet  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  32s (x8 over 33s)    kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    32s (x7 over 33s)    kubelet  Node ip-172-31-39-164 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     32s (x7 over 33s)    kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  32s                  kubelet  Updated Node Allocatable limit across pods

After some time, the node seems to go down, and any kubectl command returns this error message:

The connection to the server 172.31.39.164:6443 was refused - did you specify the right host or port?

I have the feeling that there is some networking issue, but I can't figure out what exactly. I have tried the steps several times, each time with a fresh AWS instance.
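
For anyone hitting the same thing, these checks on the node itself narrow it down (a diagnostic sketch; the socket path is containerd's default as set up by kubeadm):

```shell
# Is the kubelet itself healthy, and what is it logging?
sudo systemctl status kubelet
sudo journalctl -u kubelet --no-pager | tail -n 50

# "cni plugin not initialized" normally means this directory is still
# empty because no network plugin (e.g. Calico) has been installed yet
ls -l /etc/cni/net.d/

# List all control-plane containers, including exited ones
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a
```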

Can anyone please help me with this issue?

Many thanks in advance

Best Answer

  • amayorga (edited August 2022) Answer ✓

    Hi @chrispokorni, thanks for the help and tips. After reading other threads in the forum, I tried Ubuntu 20.04 LTS instead of 22.04. The default EC2 image does not come with containerd installed, but that was easy to solve :)

    Now everything seems to be working fine:

     kubectl get node
    NAME               STATUS   ROLES           AGE   VERSION
    ip-172-31-41-155   Ready    <none>          11m   v1.24.1
    ip-172-31-47-37    Ready    control-plane   23h   v1.24.1
    
    kubectl get pod -n kube-system
    NAME                                       READY   STATUS              RESTARTS      AGE
    calico-kube-controllers-5b97f5d8cf-sfwfb   1/1     Running             1 (46m ago)   24h
    calico-node-5h77g                          0/1     Init:0/3            0             24m
    calico-node-9vz4r                          1/1     Running             1 (46m ago)   24h
    coredns-6d4b75cb6d-b5tf6                   1/1     Running             1 (46m ago)   24h
    coredns-6d4b75cb6d-wknrz                   1/1     Running             1 (46m ago)   24h
    etcd-ip-172-31-47-37                       1/1     Running             1 (46m ago)   24h
    kube-apiserver-ip-172-31-47-37             1/1     Running             1 (46m ago)   24h
    kube-controller-manager-ip-172-31-47-37    1/1     Running             1 (46m ago)   24h
    kube-proxy-8wpqj                           1/1     Running             1 (46m ago)   24h
    kube-proxy-dk9p6                           0/1     ContainerCreating   0             24m
    kube-scheduler-ip-172-31-47-37             1/1     Running             1 (46m ago)   24h
    

    BR
    Alberto

Answers

  • j0hns0n

    When checking the pods in the kube-system namespace, I can see that some of them are caught in a restart loop.

    NAME                                       READY   STATUS             RESTARTS      AGE
    coredns-6d4b75cb6d-5mv6l                   0/1     Pending            0             51s
    coredns-6d4b75cb6d-ht77w                   0/1     Pending            0             51s
    etcd-ip-172-31-39-164                      1/1     Running            2 (94s ago)   85s
    kube-apiserver-ip-172-31-39-164            1/1     Running            1 (94s ago)   85s
    kube-controller-manager-ip-172-31-39-164   1/1     Running            2 (94s ago)   79s
    kube-proxy-292zd                           1/1     Running            1 (50s ago)   52s
    kube-scheduler-ip-172-31-39-164            0/1     CrashLoopBackOff   2 (5s ago)    78s
    

    Looking closer at the kube-scheduler pod, I can see the following:

    Annotations:          kubernetes.io/config.hash: 641b4e44950584cb2848b582a6bae80f
                          kubernetes.io/config.mirror: 641b4e44950584cb2848b582a6bae80f
                          kubernetes.io/config.seen: 2022-08-25T09:20:49.832469811Z
                          kubernetes.io/config.source: file
                          seccomp.security.alpha.kubernetes.io/pod: runtime/default
    Status:               Running
    IP:                   172.31.39.164
    IPs:
      IP:           172.31.39.164
    Controlled By:  Node/ip-172-31-39-164
    Containers:
      kube-scheduler:
        Container ID:  containerd://be09d0a5460bd2cc62849d9a66f4ea2e771471ca6bba0eebf5b18a576dd328d8
        Image:         k8s.gcr.io/kube-scheduler:v1.24.4
        Image ID:      k8s.gcr.io/kube-scheduler@sha256:378509dd1111937ca2791cf4c4814bc0647714e2ab2f4fc15396707ad1a987a2
        Port:          <none>
        Host Port:     <none>
        Command:
          kube-scheduler
          --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
          --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
          --bind-address=127.0.0.1
          --kubeconfig=/etc/kubernetes/scheduler.conf
          --leader-elect=true
        State:          Running
          Started:      Thu, 25 Aug 2022 09:22:43 +0000
        Last State:     Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Thu, 25 Aug 2022 09:20:51 +0000
          Finished:     Thu, 25 Aug 2022 09:22:18 +0000
        Ready:          False
        Restart Count:  3
        Requests:
          cpu:        100m
        Liveness:     http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=8
        Startup:      http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=24
        Environment:  <none>
        Mounts:
          /etc/kubernetes/scheduler.conf from kubeconfig (ro)
    Conditions:
      Type              Status
      Initialized       True 
      Ready             False 
      ContainersReady   False 
      PodScheduled      True 
    Volumes:
      kubeconfig:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes/scheduler.conf
    HostPathType:  FileOrCreate
    QoS Class:         Burstable
    Node-Selectors:    <none>
    Tolerations:       :NoExecute op=Exists
    Events:
      Type     Reason          Age                 From     Message
      ----     ------          ----                ----     -------
      Normal   SandboxChanged  31s (x2 over 119s)  kubelet  Pod sandbox changed, it will be killed and re-created.
      Normal   Killing         31s                 kubelet  Stopping container kube-scheduler
      Warning  BackOff         22s (x5 over 31s)   kubelet  Back-off restarting failed container
      Normal   Pulled          6s (x2 over 118s)   kubelet  Container image "k8s.gcr.io/kube-scheduler:v1.24.4" already present on machine
      Normal   Created         6s (x2 over 118s)   kubelet  Created container kube-scheduler
      Normal   Started         6s (x2 over 117s)   kubelet  Started container kube-scheduler
    
    
  • j0hns0n

    The logs for this pod look like this:

    I0825 09:23:21.869581       1 serving.go:348] Generated self-signed cert in-memory
    I0825 09:23:22.199342       1 server.go:147] "Starting Kubernetes Scheduler" version="v1.24.4"
    I0825 09:23:22.199377       1 server.go:149] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
    I0825 09:23:22.203198       1 secure_serving.go:210] Serving securely on 127.0.0.1:10259
    I0825 09:23:22.203278       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
    I0825 09:23:22.203296       1 shared_informer.go:255] Waiting for caches to sync for RequestHeaderAuthRequestController
    I0825 09:23:22.203323       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
    I0825 09:23:22.211009       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
    I0825 09:23:22.211197       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0825 09:23:22.211296       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
    I0825 09:23:22.211417       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    I0825 09:23:22.304407       1 shared_informer.go:262] Caches are synced for RequestHeaderAuthRequestController
    I0825 09:23:22.304694       1 leaderelection.go:248] attempting to acquire leader lease kube-system/kube-scheduler...
    I0825 09:23:22.312381       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    I0825 09:23:22.312443       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0825 09:23:22.313870       1 leaderelection.go:258] successfully acquired lease kube-system/kube-scheduler
    
  • amayorga

    Hello everyone.
    I'm facing a similar issue during this exercise.
    Same AWS instance configuration, firewall settings, and rules before running k8scp.sh.

    Your Kubernetes control-plane has initialized successfully!

    For the first few minutes after boot I can run kubectl commands, but after a while I can't.

    Any clue on this?

    Thanks

    kubectl get pods -n kube-system
    NAME                                       READY   STATUS             RESTARTS       AGE
    coredns-6d4b75cb6d-2wrn2                   0/1     Pending            0              23m
    coredns-6d4b75cb6d-vg5f7                   0/1     Pending            0              23m
    etcd-ip-172-31-34-203                      1/1     Running            9 (87s ago)    24m
    kube-apiserver-ip-172-31-34-203            1/1     Running            8 (87s ago)    24m
    kube-controller-manager-ip-172-31-34-203   0/1     CrashLoopBackOff   10 (9s ago)    24m
    kube-proxy-fhl7c                           1/1     Running            11 (68s ago)   23m
    kube-scheduler-ip-172-31-34-203            1/1     Running            10 (64s ago)   24m
    
    NAME                                       READY   STATUS             RESTARTS        AGE
    coredns-6d4b75cb6d-2wrn2                   0/1     Pending            0               25m
    coredns-6d4b75cb6d-vg5f7                   0/1     Pending            0               25m
    etcd-ip-172-31-34-203                      1/1     Running            11 (65s ago)    26m
    kube-apiserver-ip-172-31-34-203            1/1     Running            8 (3m34s ago)   26m
    kube-controller-manager-ip-172-31-34-203   0/1     CrashLoopBackOff   11 (85s ago)    26m
    kube-proxy-fhl7c                           1/1     Running            12 (98s ago)    25m
    kube-scheduler-ip-172-31-34-203            1/1     Running            11 (85s ago)    27m
    
  • amayorga

    My control-plane node information

  • chrispokorni

    Hello @j0hns0n and @amayorga,

    Prior to provisioning the EC2 instances and any SGs needed for the lab environment, did you happen to watch the demo video from the introductory chapter of the course? It may provide tips for configuring the networking required by the EC2 instances to support the Kubernetes installation.

    From all pod listings it seems that the pod network plugin (Calico) is not running. It may not have been installed, or it may have failed to start due to possible provisioning or networking issues.

    Regards,
    -Chris
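
    A quick way to confirm this suspicion is to check for the Calico pods directly (a sketch; the manifest URL is an assumption and may differ from the version pinned in your copy of k8scp.sh):

```shell
# If this returns no pods, the network plugin was never applied
kubectl get pods -n kube-system -l k8s-app=calico-node

# Re-apply the Calico manifest; this URL is an assumption - use the
# one referenced in the lab's k8scp.sh script
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
```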

  • j0hns0n

    Hello @chrispokorni ,

    many thanks for your reply. I watched the videos three times and read the introductory instructions several times.

    I tried adjusting the script so that Calico is initialized afterwards. In this case, the node reaches a Ready state, but unfortunately it goes down again after several minutes. I could see that the kube-controller-manager pod had an error, which seems to bring down the whole node.

    When describing the kube-controller-manager pod, I get the following output:

    Name:                 kube-controller-manager-ip-172-31-15-79
    Namespace:            kube-system
    Priority:             2000001000
    Priority Class Name:  system-node-critical
    Node:                 ip-172-31-15-79/172.31.15.79
    Start Time:           Mon, 29 Aug 2022 19:22:10 +0000
    Labels:               component=kube-controller-manager
                          tier=control-plane
    Annotations:          kubernetes.io/config.hash: 779a2592f7699f3e79c55431781e2f49
                          kubernetes.io/config.mirror: 779a2592f7699f3e79c55431781e2f49
                          kubernetes.io/config.seen: 2022-08-29T19:21:32.165202650Z
                          kubernetes.io/config.source: file
                          seccomp.security.alpha.kubernetes.io/pod: runtime/default
    Status:               Running
    IP:                   172.31.15.79
    IPs:
      IP:           172.31.15.79
    Controlled By:  Node/ip-172-31-15-79
    Containers:
      kube-controller-manager:
        Container ID:  containerd://bef1b64a79c090852db4331f0d7f92fa15347ed5b5a72e4f97920678c948aeb2
        Image:         k8s.gcr.io/kube-controller-manager:v1.24.4
        Image ID:      k8s.gcr.io/kube-controller-manager@sha256:f9400b11d780871e4e87cac8a8d4f8fc6bb83d7793b58981020b43be55f71cb9
        Port:          <none>
        Host Port:     <none>
        Command:
          kube-controller-manager
          --allocate-node-cidrs=true
          --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
          --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
          --bind-address=127.0.0.1
          --client-ca-file=/etc/kubernetes/pki/ca.crt
          --cluster-cidr=192.168.0.0/16
          --cluster-name=kubernetes
          --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
          --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
          --controllers=*,bootstrapsigner,tokencleaner
          --kubeconfig=/etc/kubernetes/controller-manager.conf
          --leader-elect=true
          --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
          --root-ca-file=/etc/kubernetes/pki/ca.crt
          --service-account-private-key-file=/etc/kubernetes/pki/sa.key
          --service-cluster-ip-range=10.96.0.0/12
          --use-service-account-credentials=true
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    2
          Started:      Mon, 29 Aug 2022 19:30:14 +0000
          Finished:     Mon, 29 Aug 2022 19:30:21 +0000
        Ready:          False
        Restart Count:  10
        Requests:
          cpu:        200m
        Liveness:     http-get https://127.0.0.1:10257/healthz delay=10s timeout=15s period=10s #success=1 #failure=8
        Startup:      http-get https://127.0.0.1:10257/healthz delay=10s timeout=15s period=10s #success=1 #failure=24
        Environment:  <none>
        Mounts:
          /etc/ca-certificates from etc-ca-certificates (ro)
          /etc/kubernetes/controller-manager.conf from kubeconfig (ro)
          /etc/kubernetes/pki from k8s-certs (ro)
          /etc/pki from etc-pki (ro)
          /etc/ssl/certs from ca-certs (ro)
          /usr/libexec/kubernetes/kubelet-plugins/volume/exec from flexvolume-dir (rw)
          /usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
          /usr/share/ca-certificates from usr-share-ca-certificates (ro)
    Conditions:
      Type              Status
      Initialized       True 
      Ready             False 
      ContainersReady   False 
      PodScheduled      True 
    Volumes:
      ca-certs:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/ssl/certs
        HostPathType:  DirectoryOrCreate
      etc-ca-certificates:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/ca-certificates
        HostPathType:  DirectoryOrCreate
      etc-pki:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/pki
        HostPathType:  DirectoryOrCreate
      flexvolume-dir:
        Type:          HostPath (bare host directory volume)
        Path:          /usr/libexec/kubernetes/kubelet-plugins/volume/exec
        HostPathType:  DirectoryOrCreate
      k8s-certs:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes/pki
        HostPathType:  DirectoryOrCreate
      kubeconfig:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes/controller-manager.conf
        HostPathType:  FileOrCreate
      usr-local-share-ca-certificates:
        Type:          HostPath (bare host directory volume)
        Path:          /usr/local/share/ca-certificates
        HostPathType:  DirectoryOrCreate
      usr-share-ca-certificates:
        Type:          HostPath (bare host directory volume)
        Path:          /usr/share/ca-certificates
        HostPathType:  DirectoryOrCreate
    QoS Class:         Burstable
    Node-Selectors:    <none>
    Tolerations:       :NoExecute op=Exists
    Events:
      Type     Reason          Age                    From     Message
      ----     ------          ----                   ----     -------
      Normal   Killing         9m28s                  kubelet  Stopping container kube-controller-manager
      Warning  Unhealthy       9m23s                  kubelet  Startup probe failed: Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
      Normal   SandboxChanged  9m8s                   kubelet  Pod sandbox changed, it will be killed and re-created.
      Warning  BackOff         7m12s (x3 over 7m15s)  kubelet  Back-off restarting failed container
      Normal   Created         6m59s (x2 over 9m8s)   kubelet  Created container kube-controller-manager
      Normal   Started         6m59s (x2 over 9m8s)   kubelet  Started container kube-controller-manager
      Normal   Pulled          6m59s (x2 over 9m8s)   kubelet  Container image "k8s.gcr.io/kube-controller-manager:v1.24.4" already present on machine
      Normal   Killing         76s (x2 over 2m57s)    kubelet  Stopping container kube-controller-manager
      Normal   SandboxChanged  75s (x3 over 4m24s)    kubelet  Pod sandbox changed, it will be killed and re-created.
      Warning  BackOff         58s (x11 over 2m57s)   kubelet  Back-off restarting failed container
      Normal   Pulled          46s (x3 over 4m23s)    kubelet  Container image "k8s.gcr.io/kube-controller-manager:v1.24.4" already present on machine
      Normal   Created         46s (x3 over 4m23s)    kubelet  Created container kube-controller-manager
      Normal   Started         46s (x3 over 4m23s)    kubelet  Started container kube-controller-manager
    
    

    For some reason the startup probe seems to fail, which indicates a possible network issue. But I followed all the steps in the instructions. Do you have any idea?
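
    The probe failure can be reproduced by hand from the control-plane node (a sketch; port 10257 matches the controller-manager's liveness/startup probes shown in the describe output):

```shell
# Hit the same endpoint the kubelet's startup probe uses; -k is needed
# because the controller-manager serves a self-signed certificate
curl -k https://127.0.0.1:10257/healthz

# "connection refused" means the process exited before the probe ran,
# so pull the logs of the most recent controller-manager container
sudo crictl logs $(sudo crictl ps -a --name kube-controller-manager -q | head -n 1)
```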

    Many thanks in advance

  • chrispokorni (edited August 2022)

    Hi @j0hns0n,

    Did you experience the same behavior on Kubernetes v1.24.1, as presented by the lab guide?

    A similar behavior was observed some years back, prior to a new version release. Since 1.24.4 is currently the last release prior to 1.25.0, I suspect some unexpected changes in the code are causing this behavior.

    By delaying the Calico start, did you eventually see all Calico pods in a Running state?

    Can you provide a screenshot of the SG configuration, and the output of:

    kubectl get pods --all-namespaces -o wide
    OR
    kubectl get po -A -owide

    Just to rule out any possible node and pod networking issues.

    Regards,
    -Chris

  • chrispokorni

    Hi @amayorga,

    I am glad it all works now, although a bit surprised that containerd did not get installed by the k8scp.sh and k8sWorker.sh scripts.

    If you look at the k8scp.sh and k8sWorker.sh script files, can you find the containerd configuration and installation commands in each file? If they did not install containerd on either of the nodes, can you provide the content of the cp.out and worker.out files? I'd be curious to see if any errors were generated and recorded.

    Regards,
    -Chris

  • amayorga

    Hi @chrispokorni.
    I've checked the k8scp.sh script, and the containerd installation section is present:

    # Install the containerd software
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
    sudo apt update
    sudo apt install containerd.io -y
    

    But it did not work for me :'(

    Sorry, but I don't have the output from the run in which the containerd installation failed.
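
    If it happens again, a few quick checks confirm whether containerd actually made it onto the node (a sketch; the socket path is the package default that kubeadm expects):

```shell
# Verify the binary and the service
containerd --version
sudo systemctl is-active containerd

# kubeadm preflight expects this CRI socket to exist
ls -l /run/containerd/containerd.sock
```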

    BR
    Alberto

  • j0hns0n

    Hello @chrispokorni & @amayorga ,

    Using Ubuntu 20.04 LTS did the trick :) I don't even have problems with containerd. After running k8scp.sh, my control plane is up and running! :smiley:

    ubuntu@ip-172-31-1-47:~$ kubectl get node
    NAME             STATUS   ROLES           AGE     VERSION
    ip-172-31-1-47   Ready    control-plane   3m51s   v1.24.1
    ubuntu@ip-172-31-1-47:~$ kubectl get po -n kube-system
    NAME                                       READY   STATUS    RESTARTS   AGE
    calico-kube-controllers-6799f5f4b4-w45vp   1/1     Running   0          3m36s
    calico-node-ws5dl                          1/1     Running   0          3m36s
    coredns-6d4b75cb6d-64n2n                   1/1     Running   0          3m36s
    coredns-6d4b75cb6d-w5nhv                   1/1     Running   0          3m36s
    etcd-ip-172-31-1-47                        1/1     Running   0          3m50s
    kube-apiserver-ip-172-31-1-47              1/1     Running   0          3m50s
    kube-controller-manager-ip-172-31-1-47     1/1     Running   0          3m50s
    kube-proxy-xx6tr                           1/1     Running   0          3m36s
    kube-scheduler-ip-172-31-1-47              1/1     Running   0          3m52s
    

    Thanks to both of you ;)
