Unable to setup cluster on Lab 3.1

asmoljo · February 2021

Hi Chris,

I attached logs from the containers that are constantly restarting: the API server, controller manager, and scheduler.

top
top - 09:57:41 up 36 min, 1 user, load average: 0.29, 0.16, 0.23
Tasks: 4 total, 0 running, 4 sleeping, 0 stopped, 0 zombie
%Cpu(s): 8.8 us, 2.4 sy, 0.0 ni, 88.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 6103588 total, 3482592 free, 333140 used, 2287856 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 5506916 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
26816 root      20   0 1856936  96232  63604 S   2.0  1.6   0:27.10 kubelet
29878 root      20   0  746656  45204  31028 S   1.0  0.7   0:19.92 kube-scheduler
25248 root      20   0 10.121g  42052  20236 S   1.0  0.7   0:22.01 etcd
23202 root      20   0 1351992 104864  50108 S   1.0  1.7   0:43.68 dockerd

df -h
Filesystem                         Size  Used Avail Use% Mounted on
udev                               2.9G     0  2.9G   0% /dev
tmpfs                              597M  1.1M  595M   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   19G  7.3G   11G  42% /
tmpfs                              3.0G     0  3.0G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                              3.0G     0  3.0G   0% /sys/fs/cgroup
/dev/sda2                          976M   78M  832M   9% /boot
tmpfs                              597M     0  597M   0% /run/user/1000
overlay                             19G  7.3G   11G  42% /var/lib/docker/overlay2/07fb7deba941efdc033be06b21acb9e16370cd970afbdadfbf3687fd1a82dbec/merged
overlay                             19G  7.3G   11G  42% /var/lib/docker/overlay2/b4449e9b143491ef05d5958d8d7102ba73f43bc121dcdd600273e466328fcd34/merged
overlay                             19G  7.3G   11G  42% /var/lib/docker/overlay2/0151f780322fb4f35a2cf556b560fe13552d9a0edee9c63b75b07b10d58f77b8/merged
overlay                             19G  7.3G   11G  42% /var/lib/docker/overlay2/fd31cabfb8ef7dc15b148de17cd53bdbbe4286be546389376fd8cf5b086b6e18/merged
shm                                 64M     0   64M   0% /var/lib/docker/containers/aedf361be28fe4a9cb7e3dc5b19f8482a2c6ea89ed3fd949cf38b23d89fd390c/mounts/shm
shm                                 64M     0   64M   0% /var/lib/docker/containers/9e8931ec407841b6086cf483669d1a887bfe0c127f7f732c3298cb8387d109b1/mounts/shm
shm                                 64M     0   64M   0% /var/lib/docker/containers/e72fac362f215b647811d748c3be5d39d136a70991a0f111c9e0b551c4164733/mounts/shm
shm                                 64M     0   64M   0% /var/lib/docker/containers/7ca7e2c303e6ccc6fe0b014123b03a2fe2869e97f47576681fcc1a9d332cb302/mounts/shm
overlay                             19G  7.3G   11G  42% /var/lib/docker/overlay2/d5416c9a3486899fb5d9468453491ef14e35e72dceaeb7ffb1b2ca4e108c65d0/merged
overlay                             19G  7.3G   11G  42% /var/lib/docker/overlay2/22a4483bc4a6dcb6a643670e18d3d3f5a766d5469e5822b1fd733061a104abed/merged

asmoljo · February 2021

And here is the apiserver log.

chrispokorni · February 2021

Hi @asmoljo,

It seems that the controller and scheduler errors are timeouts caused by unsuccessful attempts to communicate with the API server. Did you see any errors during kubeadm init? Was the VirtualBox VM for the control-plane node newly provisioned for this cluster, or has it been used for other clusters as well (you mentioned prior RKE and Kubespray clusters)?

What VM provisioning process do you go through in VirtualBox, and what other actions do you take before beginning Lab 3.1?
What is your CLI command history as you go through Exercise 3.1?

Regards,
-Chris

asmoljo · February 2021

Hi Chris,
Yes, I saw the errors after the init command; I have already sent them. I created a brand new server for the LFS258 course. VM creation procedure:
New, enter name, machine folder, type = Linux and version = Ubuntu 64-bit. RAM 4094 MB, create virtual disk now, VDI disk, fixed size 20 GB. Set 2 CPUs, set the network to bridged adapter, set the optical drive to the Ubuntu Server 18.04 image, and install. Unfortunately, I no longer have the command history.
I switched from VirtualBox to VMware Workstation, and for now everything works fine for me.

shasha · February 2021

Hi friends,
I'm following the instructions in the book, and when I reached step 14 I ran this command:
kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out # Save output for future review

I got this output:
invalid configuration for GroupVersionKind /, Kind=: kind and apiVersion is mandatory information that must be specified
To see the stack trace of this error execute with --v=5 or higher

How can I solve this? What is wrong in my lab?
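
The error message says that the file kubeadm parsed had no usable top-level `apiVersion` and `kind` fields, for example because those lines were lost or nested under another key during copy and paste. The following is a minimal Python sketch, not a substitute for kubeadm's own validation, that checks each YAML document in a config text for those two keys at column 0. The helper name `missing_keys` and the sample contents are assumptions for illustration, not the lab's actual file.

```python
import re

REQUIRED_KEYS = ("apiVersion", "kind")


def missing_keys(config_text):
    """For each YAML document in config_text, list the required
    top-level keys that are absent (hypothetical helper)."""
    # Multi-document YAML files separate documents with a line of '---'.
    documents = [
        doc
        for doc in re.split(r"^---[ \t]*$", config_text, flags=re.MULTILINE)
        if doc.strip()
    ]
    if not documents:
        # An empty file has neither key.
        return [list(REQUIRED_KEYS)]
    report = []
    for doc in documents:
        # Heuristic: only keys that start at column 0 count as top-level.
        present = set(re.findall(r"^([A-Za-z]+):", doc, flags=re.MULTILINE))
        report.append([key for key in REQUIRED_KEYS if key not in present])
    return report


# Illustrative contents only; the values are assumptions.
good = "apiVersion: kubeadm.k8s.io/v1beta2\nkind: ClusterConfiguration\n"
bad = "networking:\n  podSubnet: 192.168.0.0/16\n"

print(missing_keys(good))  # [[]]
print(missing_keys(bad))   # [['apiVersion', 'kind']]
```

In practice the text would be read from `kubeadm-config.yaml`; an empty list for every document means both keys were found.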

serewicz · February 2021

@shasha Please see my response to your other, identical post.

RonaldHeirbaut · February 2021

I had a network/DNS problem as soon as I applied calico.yaml: DNS names were no longer resolved. It has to do with systemd-resolved. When I "unapplied" calico.yaml, DNS queries were fine again. I switched to Debian 10 instead of Ubuntu 18.04, and that worked for me. I did not feel like solving Ubuntu problems.

chrispokorni · February 2021

Hi @RonaldHeirbaut,

Where did you set up your Ubuntu 18 VMs: in the cloud, or through a local hypervisor? What Ubuntu image did you use, and from where?

Regards,
-Chris

RonaldHeirbaut · March 2021

Hi Chris, I use libvirtd (on both CentOS 7 and Debian 10) and downloaded the Ubuntu image from Ubuntu: version 18.04 LTS from https://ubuntu.com/download/server#releases. Installation was as simple as next-next-finish.

chrispokorni · March 2021

Hi @RonaldHeirbaut,

The "next-next-finish" install implies all default configs, including networking and DNS. The networking type and how the hypervisor configures guest VMs' DNS play a key role in the behavior of your VMs during the cluster bootstrapping phase.

I typically use VirtualBox locally, and its defaults do not allow me to bootstrap my cluster, so I have to deviate from them, especially where the hypervisor configures networking for each VM. In VBox terms this translates to a single adapter on a bridged network, with promiscuous mode set to allow all traffic.
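
As a rough illustration of that adapter setup, the Python sketch below only builds the corresponding `VBoxManage modifyvm` command line; it does not run it. The VM name `k8s-cp` and host interface `eth0` are placeholders, and the option spellings are those of the VirtualBox 6.x command line, so check `VBoxManage modifyvm` help for your version. The VM needs to be powered off when the command is applied.

```python
import shlex


def bridged_nic_command(vm_name, host_interface):
    """Build a VBoxManage call that sets adapter 1 to bridged mode with
    promiscuous mode 'allow-all'. Both arguments are placeholders."""
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--nic1", "bridged",
        "--bridgeadapter1", host_interface,
        "--nicpromisc1", "allow-all",
    ]


# Hypothetical VM name and host interface, for illustration only.
cmd = bridged_nic_command("k8s-cp", "eth0")
print(shlex.join(cmd))
```

The same settings are available in the GUI under the VM's network settings for Adapter 1.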

Regards,
-Chris

RonaldHeirbaut · March 2021

@chrispokorni, the strange thing is that I can do a docker pull, dig, nslookup, ping, whatever, until I apply calico.yaml. Then it all fails. When I delete calico, everything works again.

serewicz · March 2021

Hello,

When you apply the calico.yaml file, new pods are created. These pods are responsible for handling traffic inside the node, and they do so by manipulating iptables rules. As a result the node's firewall is modified, which could affect all traffic.
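
One way to see what changed is to capture the rules before and after applying calico.yaml (for example with `sudo iptables-save > before.txt`) and compare the two dumps. The Python sketch below lists the `-A` rule lines that appear only in the second dump. The helper name `new_rules` is hypothetical and the sample dumps are simplified, made-up text; only the `cali-` chain prefix reflects how Calico names its chains.

```python
def new_rules(before, after):
    """Return the '-A' rule lines present in `after` but not in `before`."""
    def rules(dump):
        # Keep only rule lines; chain headers carry packet counters
        # that change between dumps and would show up as noise.
        return [line.strip() for line in dump.splitlines()
                if line.strip().startswith("-A ")]

    seen = set(rules(before))
    return [rule for rule in rules(after) if rule not in seen]


# Illustrative dumps; real iptables-save output is much longer.
before = """*filter
:INPUT ACCEPT [0:0]
-A INPUT -i lo -j ACCEPT
COMMIT
"""
after = """*filter
:INPUT ACCEPT [12:840]
:cali-INPUT - [0:0]
-A INPUT -j cali-INPUT
-A INPUT -i lo -j ACCEPT
COMMIT
"""

print(new_rules(before, after))  # ['-A INPUT -j cali-INPUT']
```

Rules that touch DNS traffic (port 53) or the node's resolver address would be the first place to look for the behavior described above.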

Regards,
