
Lab 2.2 - Unable To Start Control Plane Node

Hello everyone,

I am currently facing an issue with exercise 2.2.

My setup is the following:

Using an AWS instance (t2.large) with the following specs:

2 CPUs
8 GB memory
20 GB disk space

After starting up and connecting, I did the following (the checks are sketched below):

checked the firewall status - disabled
disabled swap
checked SELinux - disabled
disabled AppArmor with the following commands:

  sudo systemctl stop apparmor
  sudo systemctl disable apparmor
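
A rough sketch of those checks on a fresh Ubuntu instance (the exact commands can vary slightly between images):

  # Firewall: confirm ufw is inactive (AWS security groups do the filtering)
  sudo ufw status

  # Swap: turn it off now and comment out any swap entries so it stays off
  sudo swapoff -a
  sudo sed -i '/ swap / s/^/#/' /etc/fstab

  # SELinux: Ubuntu images normally ship without it
  sestatus || echo "SELinux not installed"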

When running the shell script k8scp.sh mentioned in the lab, I get the success message:

Your Kubernetes control-plane has initialized successfully!

But the kubectl command at the end of the script shows the following output:

The connection to the server 172.31.39.164:6443 was refused - did you specify the right host or port?

After some time, I can run kubectl commands again, but the control plane node is shown as NotReady.
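
While the API server is refusing connections, the kubelet and containerd services can still be checked directly on the node (generic checks; the containerd socket path is the default one):

  sudo systemctl status kubelet containerd --no-pager
  sudo journalctl -u kubelet --no-pager | tail -n 50
  # list all containers known to containerd, including exited ones
  sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a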

The describe command for this node returns the following:

Name:               ip-172-31-39-164
Roles:              control-plane
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-172-31-39-164
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=
                    node.kubernetes.io/exclude-from-external-load-balancers=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 25 Aug 2022 09:19:42 +0000
Taints:             node-role.kubernetes.io/control-plane:NoSchedule
                    node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ip-172-31-39-164
  AcquireTime:     <unset>
  RenewTime:       Thu, 25 Aug 2022 09:21:18 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Thu, 25 Aug 2022 09:20:57 +0000   Thu, 25 Aug 2022 09:19:38 +0000   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Addresses:
  InternalIP:  172.31.39.164
  Hostname:    ip-172-31-39-164
Capacity:
  cpu:                2
  ephemeral-storage:  20134592Ki
  hugepages-2Mi:      0
  memory:             8137712Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  18556039957
  hugepages-2Mi:      0
  memory:             8035312Ki
  pods:               110
System Info:
  Machine ID:                 18380e0a74d14c1db72eeaba35b3daa2
  System UUID:                ec2c0143-a6ec-7352-60c1-21888f960243
  Boot ID:                    50f8ff11-1232-4069-bcee-9df6ba3da059
  Kernel Version:             5.15.0-1017-aws
  OS Image:                   Ubuntu 22.04.1 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.7
  Kubelet Version:            v1.24.1
  Kube-Proxy Version:         v1.24.1
Non-terminated Pods:          (4 in total)
  Namespace    Name                                       CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------    ----                                       ------------  ----------  ---------------  -------------  ---
  kube-system  etcd-ip-172-31-39-164                      100m (5%)     0 (0%)      100Mi (1%)       0 (0%)         24s
  kube-system  kube-apiserver-ip-172-31-39-164            250m (12%)    0 (0%)      0 (0%)           0 (0%)         24s
  kube-system  kube-controller-manager-ip-172-31-39-164   200m (10%)    0 (0%)      0 (0%)           0 (0%)         18s
  kube-system  kube-scheduler-ip-172-31-39-164            100m (5%)     0 (0%)      0 (0%)           0 (0%)         17s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                650m (32%)  0 (0%)
  memory             100Mi (1%)  0 (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:
  Type     Reason                   Age                  From     Message
  ----     ------                   ----                 ----     -------
  Warning  InvalidDiskCapacity      107s                 kubelet  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  107s (x3 over 107s)  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    107s (x3 over 107s)  kubelet  Node ip-172-31-39-164 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     107s (x2 over 107s)  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  107s                 kubelet  Updated Node Allocatable limit across pods
  Normal   Starting                 107s                 kubelet  Starting kubelet.
  Normal   NodeAllocatableEnforced  97s                  kubelet  Updated Node Allocatable limit across pods
  Normal   Starting                 97s                  kubelet  Starting kubelet.
  Warning  InvalidDiskCapacity      97s                  kubelet  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  97s                  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientMemory
  Normal   NodeHasSufficientPID     97s                  kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientPID
  Normal   NodeHasNoDiskPressure    97s                  kubelet  Node ip-172-31-39-164 status is now: NodeHasNoDiskPressure
  Normal   Starting                 33s                  kubelet  Starting kubelet.
  Warning  InvalidDiskCapacity      33s                  kubelet  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  32s (x8 over 33s)    kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    32s (x7 over 33s)    kubelet  Node ip-172-31-39-164 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     32s (x7 over 33s)    kubelet  Node ip-172-31-39-164 status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  32s                  kubelet  Updated Node Allocatable limit across pods

After some time, the node seems to go down again and any kubectl command returns this error message:

The connection to the server 172.31.39.164:6443 was refused - did you specify the right host or port?

I have the feeling that there is some networking issue, but I can't figure out what exactly. I have tried the steps several times, each time with a fresh AWS instance.
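
The "cni plugin not initialized" condition above points at the pod network plugin rather than the node itself. A minimal check, assuming the lab's calico.yaml manifest is in the home directory:

  # was the Calico manifest applied at all?
  kubectl get pods -n kube-system -o wide | grep calico

  # if not, apply it (the exact manifest comes from the lab instructions)
  kubectl apply -f calico.yaml

  # a CNI configuration should appear here once the plugin initializes
  ls /etc/cni/net.d/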

Can anyone please help me with this issue?

Many thanks in advance


Best Answer

  • Posts: 6
    edited August 2022 Answer ✓

    Hi @chrispokorni, thanks for the help and tips. After reading other threads in the forum, I tried using LTS version 20.04 instead of 22.04. The default image in EC2 does not have containerd installed, but that was easy to solve :)

    Now it seems to be working fine:

    kubectl get node
    NAME               STATUS   ROLES           AGE   VERSION
    ip-172-31-41-155   Ready    <none>          11m   v1.24.1
    ip-172-31-47-37    Ready    control-plane   23h   v1.24.1

    kubectl get pod -n kube-system
    NAME                                       READY   STATUS              RESTARTS      AGE
    calico-kube-controllers-5b97f5d8cf-sfwfb   1/1     Running             1 (46m ago)   24h
    calico-node-5h77g                          0/1     Init:0/3            0             24m
    calico-node-9vz4r                          1/1     Running             1 (46m ago)   24h
    coredns-6d4b75cb6d-b5tf6                   1/1     Running             1 (46m ago)   24h
    coredns-6d4b75cb6d-wknrz                   1/1     Running             1 (46m ago)   24h
    etcd-ip-172-31-47-37                       1/1     Running             1 (46m ago)   24h
    kube-apiserver-ip-172-31-47-37             1/1     Running             1 (46m ago)   24h
    kube-controller-manager-ip-172-31-47-37    1/1     Running             1 (46m ago)   24h
    kube-proxy-8wpqj                           1/1     Running             1 (46m ago)   24h
    kube-proxy-dk9p6                           0/1     ContainerCreating   0             24m
    kube-scheduler-ip-172-31-47-37             1/1     Running             1 (46m ago)   24h

    BR
    Alberto

Answers

  • Posts: 5

    When checking the pods in the kube-system namespace, I can see that some of them are caught in a restart loop.

    NAME                                       READY   STATUS             RESTARTS      AGE
    coredns-6d4b75cb6d-5mv6l                   0/1     Pending            0             51s
    coredns-6d4b75cb6d-ht77w                   0/1     Pending            0             51s
    etcd-ip-172-31-39-164                      1/1     Running            2 (94s ago)   85s
    kube-apiserver-ip-172-31-39-164            1/1     Running            1 (94s ago)   85s
    kube-controller-manager-ip-172-31-39-164   1/1     Running            2 (94s ago)   79s
    kube-proxy-292zd                           1/1     Running            1 (50s ago)   52s
    kube-scheduler-ip-172-31-39-164            0/1     CrashLoopBackOff   2 (5s ago)    78s
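
    For reference, output like the excerpts below can be gathered with the usual commands (pod name taken from the listing above):

      kubectl -n kube-system describe pod kube-scheduler-ip-172-31-39-164
      kubectl -n kube-system logs kube-scheduler-ip-172-31-39-164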

    Looking closer at the kube-scheduler pod, I can see the following:

                      kubernetes.io/config.hash: 641b4e44950584cb2848b582a6bae80f
                      kubernetes.io/config.mirror: 641b4e44950584cb2848b582a6bae80f
                      kubernetes.io/config.seen: 2022-08-25T09:20:49.832469811Z
                      kubernetes.io/config.source: file
                      seccomp.security.alpha.kubernetes.io/pod: runtime/default
    Status:           Running
    IP:               172.31.39.164
    IPs:
      IP:             172.31.39.164
    Controlled By:    Node/ip-172-31-39-164
    Containers:
      kube-scheduler:
        Container ID:  containerd://be09d0a5460bd2cc62849d9a66f4ea2e771471ca6bba0eebf5b18a576dd328d8
        Image:         k8s.gcr.io/kube-scheduler:v1.24.4
        Image ID:      k8s.gcr.io/kube-scheduler@sha256:378509dd1111937ca2791cf4c4814bc0647714e2ab2f4fc15396707ad1a987a2
        Port:          <none>
        Host Port:     <none>
        Command:
          kube-scheduler
          --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
          --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
          --bind-address=127.0.0.1
          --kubeconfig=/etc/kubernetes/scheduler.conf
          --leader-elect=true
        State:          Running
          Started:      Thu, 25 Aug 2022 09:22:43 +0000
        Last State:     Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Thu, 25 Aug 2022 09:20:51 +0000
          Finished:     Thu, 25 Aug 2022 09:22:18 +0000
        Ready:          False
        Restart Count:  3
        Requests:
          cpu:        100m
        Liveness:     http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=8
        Startup:      http-get https://127.0.0.1:10259/healthz delay=10s timeout=15s period=10s #success=1 #failure=24
        Environment:  <none>
        Mounts:
          /etc/kubernetes/scheduler.conf from kubeconfig (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      kubeconfig:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes/scheduler.conf
        HostPathType:  FileOrCreate
    QoS Class:         Burstable
    Node-Selectors:    <none>
    Tolerations:       :NoExecute op=Exists
    Events:
      Type     Reason          Age                 From     Message
      ----     ------          ----                ----     -------
      Normal   SandboxChanged  31s (x2 over 119s)  kubelet  Pod sandbox changed, it will be killed and re-created.
      Normal   Killing         31s                 kubelet  Stopping container kube-scheduler
      Warning  BackOff         22s (x5 over 31s)   kubelet  Back-off restarting failed container
      Normal   Pulled          6s (x2 over 118s)   kubelet  Container image "k8s.gcr.io/kube-scheduler:v1.24.4" already present on machine
      Normal   Created         6s (x2 over 118s)   kubelet  Created container kube-scheduler
      Normal   Started         6s (x2 over 117s)   kubelet  Started container kube-scheduler
  • Posts: 5

    The logs for this pod look like this:

    I0825 09:23:21.869581       1 serving.go:348] Generated self-signed cert in-memory
    I0825 09:23:22.199342       1 server.go:147] "Starting Kubernetes Scheduler" version="v1.24.4"
    I0825 09:23:22.199377       1 server.go:149] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
    I0825 09:23:22.203198       1 secure_serving.go:210] Serving securely on 127.0.0.1:10259
    I0825 09:23:22.203278       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
    I0825 09:23:22.203296       1 shared_informer.go:255] Waiting for caches to sync for RequestHeaderAuthRequestController
    I0825 09:23:22.203323       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
    I0825 09:23:22.211009       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
    I0825 09:23:22.211197       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0825 09:23:22.211296       1 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
    I0825 09:23:22.211417       1 shared_informer.go:255] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    I0825 09:23:22.304407       1 shared_informer.go:262] Caches are synced for RequestHeaderAuthRequestController
    I0825 09:23:22.304694       1 leaderelection.go:248] attempting to acquire leader lease kube-system/kube-scheduler...
    I0825 09:23:22.312381       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    I0825 09:23:22.312443       1 shared_informer.go:262] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0825 09:23:22.313870       1 leaderelection.go:258] successfully acquired lease kube-system/kube-scheduler
  • Hello everyone.
    I'm facing a similar issue during this exercise.
    Same AWS instance configuration, firewall, and rules before running k8scp.sh.

    Your Kubernetes control-plane has initialized successfully!

    For the first few minutes after boot I can run kubectl commands, but after a while I can't.

    Any clue on this?

    Thanks

    kubectl get pods -n kube-system
    NAME                                       READY   STATUS             RESTARTS        AGE
    coredns-6d4b75cb6d-2wrn2                   0/1     Pending            0               23m
    coredns-6d4b75cb6d-vg5f7                   0/1     Pending            0               23m
    etcd-ip-172-31-34-203                      1/1     Running            9 (87s ago)     24m
    kube-apiserver-ip-172-31-34-203            1/1     Running            8 (87s ago)     24m
    kube-controller-manager-ip-172-31-34-203   0/1     CrashLoopBackOff   10 (9s ago)     24m
    kube-proxy-fhl7c                           1/1     Running            11 (68s ago)    23m
    kube-scheduler-ip-172-31-34-203            1/1     Running            10 (64s ago)    24m

    NAME                                       READY   STATUS             RESTARTS        AGE
    coredns-6d4b75cb6d-2wrn2                   0/1     Pending            0               25m
    coredns-6d4b75cb6d-vg5f7                   0/1     Pending            0               25m
    etcd-ip-172-31-34-203                      1/1     Running            11 (65s ago)    26m
    kube-apiserver-ip-172-31-34-203            1/1     Running            8 (3m34s ago)   26m
    kube-controller-manager-ip-172-31-34-203   0/1     CrashLoopBackOff   11 (85s ago)    26m
    kube-proxy-fhl7c                           1/1     Running            12 (98s ago)    25m
    kube-scheduler-ip-172-31-34-203            1/1     Running            11 (85s ago)    27m
  • My control-plane node information

  • Posts: 2,451

    Hello @j0hns0n and @amayorga,

    Prior to provisioning the EC2 instances and any SGs needed for the lab environment, did you happen to watch the demo video from the introductory chapter of the course? It may provide tips for configuring the networking required by the EC2 instances to support the Kubernetes installation.

    From all the pod listings it seems that the pod network plugin (Calico) is not running. It may not have been installed, or it did not start due to possible provisioning and networking issues.

    Regards,
    -Chris

  • Posts: 5

    Hello @chrispokorni ,

    Many thanks for your reply. I have watched the videos three times and read the introductory instructions several times.

    I tried adjusting the script so that Calico is initialized afterwards (a sketch of the adjustment is below). In this case I got the node into a Ready state. Unfortunately, the node shuts down after several minutes. I could see that the kube-controller-manager pod had an error, which seems to cause the whole node to go down.
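
    Roughly, the idea was to remove the Calico apply from the script and run it by hand once the API server answered (a sketch; calico.yaml is the manifest referenced by the lab script):

      # after k8scp.sh finishes, wait for the API server, then apply the network plugin
      until kubectl get nodes; do sleep 5; done
      kubectl apply -f calico.yaml
      kubectl get pods -n kube-system -w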

    When describing the kube-controller-manager pod, I get the following output:

    Name:                 kube-controller-manager-ip-172-31-15-79
    Namespace:            kube-system
    Priority:             2000001000
    Priority Class Name:  system-node-critical
    Node:                 ip-172-31-15-79/172.31.15.79
    Start Time:           Mon, 29 Aug 2022 19:22:10 +0000
    Labels:               component=kube-controller-manager
                          tier=control-plane
    Annotations:          kubernetes.io/config.hash: 779a2592f7699f3e79c55431781e2f49
                          kubernetes.io/config.mirror: 779a2592f7699f3e79c55431781e2f49
                          kubernetes.io/config.seen: 2022-08-29T19:21:32.165202650Z
                          kubernetes.io/config.source: file
                          seccomp.security.alpha.kubernetes.io/pod: runtime/default
    Status:               Running
    IP:                   172.31.15.79
    IPs:
      IP:                 172.31.15.79
    Controlled By:        Node/ip-172-31-15-79
    Containers:
      kube-controller-manager:
        Container ID:  containerd://bef1b64a79c090852db4331f0d7f92fa15347ed5b5a72e4f97920678c948aeb2
        Image:         k8s.gcr.io/kube-controller-manager:v1.24.4
        Image ID:      k8s.gcr.io/kube-controller-manager@sha256:f9400b11d780871e4e87cac8a8d4f8fc6bb83d7793b58981020b43be55f71cb9
        Port:          <none>
        Host Port:     <none>
        Command:
          kube-controller-manager
          --allocate-node-cidrs=true
          --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
          --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
          --bind-address=127.0.0.1
          --client-ca-file=/etc/kubernetes/pki/ca.crt
          --cluster-cidr=192.168.0.0/16
          --cluster-name=kubernetes
          --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
          --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
          --controllers=*,bootstrapsigner,tokencleaner
          --kubeconfig=/etc/kubernetes/controller-manager.conf
          --leader-elect=true
          --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
          --root-ca-file=/etc/kubernetes/pki/ca.crt
          --service-account-private-key-file=/etc/kubernetes/pki/sa.key
          --service-cluster-ip-range=10.96.0.0/12
          --use-service-account-credentials=true
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    2
          Started:      Mon, 29 Aug 2022 19:30:14 +0000
          Finished:     Mon, 29 Aug 2022 19:30:21 +0000
        Ready:          False
        Restart Count:  10
        Requests:
          cpu:        200m
        Liveness:     http-get https://127.0.0.1:10257/healthz delay=10s timeout=15s period=10s #success=1 #failure=8
        Startup:      http-get https://127.0.0.1:10257/healthz delay=10s timeout=15s period=10s #success=1 #failure=24
        Environment:  <none>
        Mounts:
          /etc/ca-certificates from etc-ca-certificates (ro)
          /etc/kubernetes/controller-manager.conf from kubeconfig (ro)
          /etc/kubernetes/pki from k8s-certs (ro)
          /etc/pki from etc-pki (ro)
          /etc/ssl/certs from ca-certs (ro)
          /usr/libexec/kubernetes/kubelet-plugins/volume/exec from flexvolume-dir (rw)
          /usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
          /usr/share/ca-certificates from usr-share-ca-certificates (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      ca-certs:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/ssl/certs
        HostPathType:  DirectoryOrCreate
      etc-ca-certificates:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/ca-certificates
        HostPathType:  DirectoryOrCreate
      etc-pki:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/pki
        HostPathType:  DirectoryOrCreate
      flexvolume-dir:
        Type:          HostPath (bare host directory volume)
        Path:          /usr/libexec/kubernetes/kubelet-plugins/volume/exec
        HostPathType:  DirectoryOrCreate
      k8s-certs:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes/pki
        HostPathType:  DirectoryOrCreate
      kubeconfig:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/kubernetes/controller-manager.conf
        HostPathType:  FileOrCreate
      usr-local-share-ca-certificates:
        Type:          HostPath (bare host directory volume)
        Path:          /usr/local/share/ca-certificates
        HostPathType:  DirectoryOrCreate
      usr-share-ca-certificates:
        Type:          HostPath (bare host directory volume)
        Path:          /usr/share/ca-certificates
        HostPathType:  DirectoryOrCreate
    QoS Class:         Burstable
    Node-Selectors:    <none>
    Tolerations:       :NoExecute op=Exists
    Events:
      Type     Reason          Age                    From     Message
      ----     ------          ----                   ----     -------
      Normal   Killing         9m28s                  kubelet  Stopping container kube-controller-manager
      Warning  Unhealthy       9m23s                  kubelet  Startup probe failed: Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
      Normal   SandboxChanged  9m8s                   kubelet  Pod sandbox changed, it will be killed and re-created.
      Warning  BackOff         7m12s (x3 over 7m15s)  kubelet  Back-off restarting failed container
      Normal   Created         6m59s (x2 over 9m8s)   kubelet  Created container kube-controller-manager
      Normal   Started         6m59s (x2 over 9m8s)   kubelet  Started container kube-controller-manager
      Normal   Pulled          6m59s (x2 over 9m8s)   kubelet  Container image "k8s.gcr.io/kube-controller-manager:v1.24.4" already present on machine
      Normal   Killing         76s (x2 over 2m57s)    kubelet  Stopping container kube-controller-manager
      Normal   SandboxChanged  75s (x3 over 4m24s)    kubelet  Pod sandbox changed, it will be killed and re-created.
      Warning  BackOff         58s (x11 over 2m57s)   kubelet  Back-off restarting failed container
      Normal   Pulled          46s (x3 over 4m23s)    kubelet  Container image "k8s.gcr.io/kube-controller-manager:v1.24.4" already present on machine
      Normal   Created         46s (x3 over 4m23s)    kubelet  Created container kube-controller-manager
      Normal   Started         46s (x3 over 4m23s)    kubelet  Started container kube-controller-manager

    For some reason the startup probe seems to fail. This indicates a possible network issue, but I followed all the steps from the instructions. Do you have any idea?
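
    To narrow it down, the endpoint the probe uses can be hit directly on the node, and the container logs read through the runtime (the containerd socket path is the default one; <container-id> is a placeholder):

      # is anything listening on the controller manager's secure port?
      sudo ss -tlnp | grep 10257

      # probe the health endpoint the kubelet uses (self-signed cert, hence -k)
      curl -k https://127.0.0.1:10257/healthz

      # logs of the failing container, via the container runtime
      sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube-controller-manager
      sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs <container-id>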

    Many thanks in advance

  • Posts: 2,451
    edited August 2022

    Hi @j0hns0n,

    Did you experience the same behavior on Kubernetes v1.24.1, as presented by the lab guide?

    A similar behavior was observed some years back prior to a new version release. Since 1.24.4 is currently the last release prior to 1.25.0, I am suspecting some unexpected changes in the code are causing this behavior.
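
    If the installation ended up on 1.24.4, one way to test against the guide's version is to pin the packages before running the script again (a sketch; the -00 package revisions and the legacy apt.kubernetes.io repository are assumptions):

      sudo apt-get update
      sudo apt-get install -y --allow-downgrades --allow-change-held-packages \
          kubeadm=1.24.1-00 kubelet=1.24.1-00 kubectl=1.24.1-00
      sudo apt-mark hold kubeadm kubelet kubectl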

    By delaying calico start, did you eventually see all calico pods in a Running state?

    Can you provide a screenshot of the SG configuration, and the output of:

    kubectl get pods --all-namespaces -o wide
    OR
    kubectl get po -A -owide

    Just to rule out any possible node and pod networking issues.

    Regards,
    -Chris

  • Posts: 2,451

    Hi @amayorga,

    I am glad it all works now, although a bit surprised that containerd did not get installed by the k8scp.sh and k8sWorker.sh scripts.

    If you look at the k8scp.sh and k8sWorker.sh script files, can you find the containerd configuration and installation commands in each file? If they did not install containerd on either of the nodes, can you provide the content of the cp.out and worker.out files? I'd be curious to see if any errors were generated and recorded.

    Regards,
    -Chris

  • Hi @chrispokorni.
    I've checked the k8scp.sh script and the containerd installation section is present.

    # Install the containerd software
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
    sudo apt update
    sudo apt install containerd.io -y

    But it did not work for me :'(

    Sorry, but I don't have the output of the run in which the containerd installation failed.
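
    If it happens again, a quick check before continuing, plus keeping the script output for later (cp.out is the file name mentioned above):

      # confirm containerd is actually installed and running
      containerd --version
      sudo systemctl is-active containerd

      # re-run while keeping a copy of the full output
      bash k8scp.sh | tee cp.out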

    BR
    Alberto

  • Posts: 5

    Hello @chrispokorni & @amayorga ,

    Using LTS version 20.04 did the trick :) I don't even have problems with containerd. After running k8scp.sh, my control plane is up and running! :smiley:

    ubuntu@ip-172-31-1-47:~$ kubectl get node
    NAME             STATUS   ROLES           AGE     VERSION
    ip-172-31-1-47   Ready    control-plane   3m51s   v1.24.1
    ubuntu@ip-172-31-1-47:~$ kubectl get po -n kube-system
    NAME                                       READY   STATUS    RESTARTS   AGE
    calico-kube-controllers-6799f5f4b4-w45vp   1/1     Running   0          3m36s
    calico-node-ws5dl                          1/1     Running   0          3m36s
    coredns-6d4b75cb6d-64n2n                   1/1     Running   0          3m36s
    coredns-6d4b75cb6d-w5nhv                   1/1     Running   0          3m36s
    etcd-ip-172-31-1-47                        1/1     Running   0          3m50s
    kube-apiserver-ip-172-31-1-47              1/1     Running   0          3m50s
    kube-controller-manager-ip-172-31-1-47     1/1     Running   0          3m50s
    kube-proxy-xx6tr                           1/1     Running   0          3m36s
    kube-scheduler-ip-172-31-1-47              1/1     Running   0          3m52s

    Thanks to both of you ;)
