Office hours - Oct 25 (LFS258)

chrispokorni · October 2022

Hi,

Today we looked at two distinct Kubernetes clusters, both with startup troubles.

The first cluster was installed on VirtualBox VM instances. After a period of successful operation, the cluster admin resized the LV hosting the virtual disks of the two instances while the hypervisor was still dynamically allocating space for those disks. This combination eventually caused the kubelet agents to panic: the worker node became NotReady, and the control plane agents later crashed as well. A temporary fix was to restart the VMs, re-bootstrap the worker node and re-join it to the control plane node; however, the cluster was still slow at processing kubectl requests. The recommended solution is to rebuild the cluster, ensuring that its nodes are provisioned with the expected amount of CPU, memory and disk space, and that the hypervisor provisions fixed-size virtual disks.
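Before rebuilding, it may help to confirm what each VM actually sees in terms of CPU, memory and disk. The sketch below is only an illustration of such a check, run on the node itself. The minimum values and the function names (`node_resources`, `shortfalls`) are hypothetical placeholders, not values from the course or from this session; substitute the requirements from your own lab guide.

```python
import os
import shutil

# Hypothetical minimums for illustration only; replace them with the
# values your course material or environment actually calls for.
MIN_CPUS = 2
MIN_MEM_GIB = 8.0
MIN_DISK_GIB = 20.0


def total_memory_gib():
    """Return total physical memory in GiB, or None if it cannot be read."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        phys_pages = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are not available on this platform.
        return None
    return page_size * phys_pages / 2**30


def node_resources(path="/"):
    """Collect CPU count, total memory and free disk space for this node."""
    return {
        "cpus": os.cpu_count() or 0,
        "mem_gib": total_memory_gib(),
        "disk_free_gib": shutil.disk_usage(path).free / 2**30,
    }


def shortfalls(resources, min_cpus=MIN_CPUS, min_mem_gib=MIN_MEM_GIB,
               min_disk_gib=MIN_DISK_GIB):
    """Return a list of human-readable problems; empty if all checks pass."""
    problems = []
    if resources["cpus"] < min_cpus:
        problems.append(f"CPUs: {resources['cpus']} < {min_cpus}")
    mem = resources["mem_gib"]
    # Memory is skipped when it could not be determined (None).
    if mem is not None and mem < min_mem_gib:
        problems.append(f"memory: {mem:.1f} GiB < {min_mem_gib} GiB")
    if resources["disk_free_gib"] < min_disk_gib:
        problems.append(
            f"free disk: {resources['disk_free_gib']:.1f} GiB < {min_disk_gib} GiB"
        )
    return problems
```

Calling `shortfalls(node_resources())` on each node returns an empty list when the placeholder thresholds are met. Note that this only reports what the guest OS sees; it cannot tell whether the hypervisor backs the virtual disk with a fixed-size or a dynamically allocated file, so that setting still has to be verified in VirtualBox itself.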

The second cluster was installed on VMware ESXi instances. The issue there is that the calico network plugin does not get installed, so the pod network is never initialized. Since this is a corporate development environment, chances are high that firewalls are in place, blocking certain protocols or ports between the instances. In addition, there may be incompatibilities between the calico network plugin and the networking implementation of the ESXi hypervisor. To make this work, several approaches need to be examined: ensure there are no firewalls between the instances; if possible, find specific settings that make calico compatible with the infrastructure; or find a different, compatible network plugin.
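As a first pass at the firewall question, a simple TCP reachability check from one instance to another can narrow things down. This is a minimal sketch, not something we ran in the session. The port list assumes commonly documented defaults (6443 for the API server, 10250 for the kubelet, 179 for Calico BGP, 5473 for Calico Typha), and the function names are made up for illustration; adjust both to your setup.

```python
import socket

# Assumed default ports; your cluster may use different ones.
DEFAULT_PORTS = {
    6443: "Kubernetes API server",
    10250: "kubelet API",
    179: "Calico BGP",
    5473: "Calico Typha",
}


def check_tcp_port(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def scan(host, ports=DEFAULT_PORTS, timeout=2.0):
    """Map each port number to True (reachable) or False (not reachable)."""
    return {port: check_tcp_port(host, port, timeout) for port in ports}


# Example use from a worker node, where "cp-node" is a placeholder hostname:
#   for port, ok in scan("cp-node").items():
#       print(port, DEFAULT_PORTS[port], "open" if ok else "blocked/closed")
```

A `False` result does not by itself prove a firewall is involved, since a port with no listener is also reported as unreachable. The check also covers TCP only, so it says nothing about IP-in-IP (IP protocol 4) or VXLAN (UDP 4789) traffic, which Calico may use for encapsulation depending on how it is configured.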

The session was not recorded.

Regards,
-Chris
