Help with lab - Upgrade the Cluster

edited August 2021 in LFS258 Class Forum

I have a problem with lab 4.1, Upgrade the Cluster: after I finished upgrading the CP node, the Calico kube-controllers pod (calico-kube-controllers) does not start.

    NAMESPACE     NAME                                       READY   STATUS             RESTARTS   AGE
    kube-system   calico-kube-controllers-5f6cfd688c-pgm6x   0/1     CrashLoopBackOff   5          4m50s
    kube-system   calico-node-bbtns                          1/1     Running            0          7m59s
    kube-system   calico-node-lhxc4                          1/1     Running            0          28m
    kube-system   coredns-558bd4d5db-g2l9k                   0/1     Running            0          92s
    kube-system   coredns-558bd4d5db-z84b5                   0/1     Running            0          92s
    kube-system   coredns-74ff55c5b-d2v8d                    0/1     Running            0          4m50s
    kube-system   etcd-cp                                    1/1     Running            1          64s
    kube-system   kube-apiserver-cp                          1/1     Running            1          63s
    kube-system   kube-controller-manager-cp                 1/1     Running            0          63s
    kube-system   kube-proxy-7x9gp                           1/1     Running            0          14s
    kube-system   kube-proxy-95lcf                           1/1     Running            0          38s
    kube-system   kube-scheduler-cp                          1/1     Running            0          64s
    kube-system   upgrade-health-check-8bws2                 0/1     Completed          0          38s

I simply followed the instructions and then did the same for the worker node. When I drain the worker node, this happens:

    error when evicting pods/"calico-kube-controllers-5f6cfd688c-pgm6x" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.

I checked the pods, and calico-kube-controllers does not become ready:

    students@cp:~$ kubectl get pods --all-namespaces
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
    kube-system   calico-kube-controllers-5f6cfd688c-pgm6x   0/1     Running   10         10m
    kube-system   calico-node-bbtns                          1/1     Running   0          13m
    kube-system   calico-node-lhxc4                          1/1     Running   0          34m
    kube-system   coredns-558bd4d5db-csnp8                   1/1     Running   0          3m21s
    kube-system   coredns-558bd4d5db-x2g5s                   1/1     Running   0          3m21s
    kube-system   etcd-cp                                    1/1     Running   1          6m32s
    kube-system   kube-apiserver-cp                          1/1     Running   1          6m31s
    kube-system   kube-controller-manager-cp                 1/1     Running   0          6m31s
    kube-system   kube-proxy-7x9gp                           1/1     Running   0          5m42s
    kube-system   kube-proxy-95lcf                           1/1     Running   0          6m6s
    kube-system   kube-scheduler-cp                          1/1     Running   0          6m32s
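
In a listing like this, the pod that is "Running" but 0/1 ready is easy to overlook. A rough sketch of a filter that prints only pods whose READY count is incomplete; `not_ready` is a made-up helper name, and it assumes the column layout of `kubectl get pods --all-namespaces` shown above (NAMESPACE, NAME, READY, STATUS first):

```shell
# Hypothetical helper: reads `kubectl get pods --all-namespaces` output on stdin
# and prints namespace/name, READY and STATUS for pods that are not fully ready.
# Completed pods (finished jobs such as upgrade-health-check) are skipped.
not_ready() {
  awk 'NR > 1 && $4 != "Completed" {
         split($3, r, "/")              # READY column, e.g. "0/1"
         if (r[1] != r[2]) print $1 "/" $2, $3, $4
       }'
}

# Possible usage on the CP node:
#   kubectl get pods --all-namespaces | not_ready
```

It only looks at the first four columns, so it should not care how the RESTARTS and AGE columns are formatted.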

When I inspect the logs from calico-kube-controllers, I get this:

main.go 118: Failed to initialize Calico datastore error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded

I thought Calico could not connect to CoreDNS, so I deleted those pods. After that, all pods were running fine, but I don't know why this happens.

I have tried this lab twice, and it happened both times. On the second attempt I managed to solve the problem.

Comments

  • I had this issue too: calico-kube-controllers was stuck in a crash loop. Deleting the pod fixed it, and I was then able to drain the worker.
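
For anyone landing here, the workaround described in this thread (delete the stuck pod, then retry the drain) could be sketched roughly as below. This is not from the lab guide. The function name is made up, the label `k8s-app=calico-kube-controllers` is an assumption based on the stock Calico manifest, and the node name `worker` in the usage line is a placeholder for your own worker node.

```shell
# Hypothetical helper: recreate the calico-kube-controllers pod, wait for its
# Deployment to report ready, then drain the given node.
# Assumption: the pod carries the label k8s-app=calico-kube-controllers.
recreate_calico_controllers_and_drain() {
  node="$1"
  # Deleting the pod lets the Deployment create a fresh replacement.
  kubectl -n kube-system delete pod -l k8s-app=calico-kube-controllers || return 1
  # Wait up to two minutes for the replacement to become ready.
  kubectl -n kube-system rollout status deployment/calico-kube-controllers --timeout=120s || return 1
  # With the pod ready again, eviction should no longer violate the disruption budget.
  kubectl drain "$node" --ignore-daemonsets
}

# Possible usage (node name is a placeholder):
#   recreate_calico_controllers_and_drain worker
```

If the rollout never becomes ready, the function stops before draining, which is probably what you want while the controller is still crash looping.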
