Unable to join master node

When I try to join the master node from the worker node, I get the following error. The command I type is:

I get:

Both the hash for the ca.crt certificate and the token are correct, and a ping against the master IP and the k8smaster DNS name returns correct answers. Could you help me with this issue?

Thanks in advance

Comments

  • chrispokorni

    Hi @suarna,

    It seems that the join has trouble resolving the "k8smaster" alias. Edit the /etc/hosts file of the worker and ensure that the correct alias is used, together with the private IP of the master.
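
    For example, assuming the master's private IP is 192.168.1.1 (adjust to your setup), the worker's /etc/hosts should contain a line like:

    192.168.1.1   k8smaster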

    Also, keep in mind that running the join command several times will cause more errors than it helps. In between join attempts I would recommend running sudo kubeadm reset.
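
    A typical reset-and-retry sequence on the worker would look like this (the token and hash are placeholders, use your own values):

    sudo kubeadm reset
    sudo kubeadm join k8smaster:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>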

    Regards,
    -Chris

  • serewicz

    Hello,

    Please also include the exact kubeadm init command you used on the master along with the contents of the kubeadm-config.yaml file.

    Regards,

  • suarna

    Looking in depth, I found the problem. I am using two VMs: on one of them I deploy the master node, and on the other the worker node. Each VM has two network devices: one connected via NAT through the host system to the outside world, and the other connected to a shared internal network. I want the service to listen on the internal network device instead of the NATed device, but I do not know why the service is created on the other device despite the contents of the config file.
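
    For reference, the device-to-address mapping can be checked on each VM with ip addr (the interface names below are only illustrative, typical of VirtualBox):

    ip addr show
    # enp0s3 -> 10.0.2.15   (NATed device)
    # enp0s8 -> 192.168.1.1 (internal network device)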

    The init command used is:

    The IP 192.168.1.1 is connected to the internal network (this is the device on which I want to run the service).

    Here you can see the kubeadm-config.yaml file contents:

    And finally, the admin.conf created when the master node is deployed: the service (and the certs) are created for the IP 10.0.2.15, which corresponds to the NATed device, instead of the IP 192.168.1.1, which corresponds to the device connected to the internal network and is the one reflected in the hosts file and in the kubeadm-config.yaml file (in the image below I have intentionally omitted the certificate contents).

    How can I control on which device the service listening on port 6443 is deployed?

  • serewicz

    Hello,

    Your pod subnet (using 192.168.x.y) should not be the same as the host network (k8smaster also being 192.168.x.y); it will cause issues with routing. If you read the lab exercise, you will find in Exercise 3.1, steps 9 and 10, a note not to use the same IP range for both.

    Also, the labs are tested with single interfaces; you may encounter other issues if you use a multi-interface configuration. You can read up on the use of the --apiserver-advertise-address setting to kubeadm.
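
    As a sketch only (the advertise address is taken from your post; the pod CIDR is an arbitrary non-overlapping example), the listening address can be pinned on the command line:

    sudo kubeadm init --apiserver-advertise-address=192.168.1.1 --pod-network-cidr=10.244.0.0/16

    Note that kubeadm does not allow mixing --config with this flag; the config-file equivalent is the localAPIEndpoint.advertiseAddress field of an InitConfiguration section.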

    Regards,

  • suarna

    Hi @serewicz

    Thanks for the quick response... I will try another network configuration and see what happens.

    Many thanks

  • chrispokorni

    Hi @suarna,

    In addition, your kubeadm-config.yaml file seems to be improperly formatted. The last line with the podSubnet: ... entry should only be indented 2 (two) spaces from the left. Yours seems to be indented about 8 (eight) spaces, and that will cause the property and its assigned value to go unrecognized by the YAML-to-JSON converter.
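
    For comparison, a correctly indented file would look something like this (the version and subnet values here are only illustrative):

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    kubernetesVersion: 1.19.0
    controlPlaneEndpoint: "k8smaster:6443"
    networking:
      podSubnet: 10.244.0.0/16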

    Regards,
    -Chris

  • suarna

    Thanks for the remark @chrispokorni, I will look closely at that.

  • suarna

    Finally I was able to join the worker node to the master node, but the node shows the NotReady state when I run the kubectl get nodes command.

    Running the command kubectl describe node worker shows the following error; it seems that there is an issue with the network plugin:

    Looking at the pods, I can see two containers stuck in the Init:0/3 and ContainerCreating states; if I delete them, they automatically try to run again with the same result.

  • chrispokorni

    Hi,

    To troubleshoot the problem pods, you could run the kubectl describe command and study the Events section of the output:

    kubectl -n kube-system describe pod kube-proxy-hkbbm
    kubectl -n kube-system describe pod calico-node-9zfd6

    Any meaningful clues?

    Regards,
    -Chris

  • suarna

    I have obtained the following warning... both the calico and the kube-proxy containers show the same.

  • chrispokorni

    It seems your worker cannot resolve the container registry address.

    Can you compare the /etc/resolv.conf files of the worker and the master, and see what is missing from the worker's resolv file?
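
    For example, on each node (the nameserver value below is only illustrative):

    cat /etc/resolv.conf
    # expect at least one reachable entry, e.g.:
    # nameserver 8.8.8.8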

    Regards,
    -Chris

  • serewicz

    Hello,

    This issue seems tied to host networking, not Kubernetes. Did you make sure the pod network and the host network no longer overlap? Both had been 192.168.x.y.

    I note a previous error from docker said "network plugin is not ready". If you log into your worker VM and run sudo docker run hello-world, what output do you see?

    Are you using VirtualBox or some other tool to host your VMs? Do both VMs have a single interface or more than one?

    Regards,

  • suarna

    Hi

    Now it is working fine... I had an error in a netplan config file on the worker side; I was driving traffic to the incorrect gateway. I noticed the error when executing the docker command suggested by @serewicz, which showed that docker wasn't able to download the image from the docker registry. Many thanks to both.
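
    For anyone hitting the same problem, this is a minimal sketch of the shape the worker's netplan file should have (the file name, interface names, and addresses are assumptions for illustration; only the NATed device carries the default route):

    # /etc/netplan/01-netcfg.yaml
    network:
      version: 2
      ethernets:
        enp0s3:                       # NATed device, provides the default route
          dhcp4: true
        enp0s8:                       # internal network device, no gateway here
          addresses: [192.168.1.2/24]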

  • serewicz

    Great! Glad you have it working. :-)
