lab2.1 kubectl untainted not working

raghava.cheruku · July 2018

Hi,

Can anyone explain the mistake here? I searched https://kubernetes.io/docs for some hint on this "untainted", but found no information.

admin1@k8master:~$ kubectl taint nodes --all node-role.kubernetes.io/master-node “k8master” untainted

error: at least one taint update is required

admin1@k8master:~$

However, I went on to the next command, deploying the firstpod, and it got deployed on my k8sseond node.

Can someone explain how to make this "untainted" work?

regards

Raghava

chrispokorni · July 2018

Hi,

When you use kubeadm to build a cluster with master and minion nodes, the master is tainted in order to prevent the cluster from scheduling pods on the master. Since we are only using a few nodes for these labs, in Lab 2 we remove that taint from the master, allowing the cluster to schedule pods on the master as well.

The lab manual (and I hope we are both looking at the same version/content) suggests using the following command in order to remove the taint (to untaint the master):


kubectl taint nodes --all node-role.kubernetes.io/master-

(note the "-" at the end of the command; it instructs kubectl to "remove" the taint)

and then the output is the following:


node “ckad-1” untainted
taint "node-role.kubernetes.io/master:" not found

where the first line of the output is a confirmation of the node "ckad-1" (master) being successfully untainted, and the second line is the attempt to untaint the second node, but no taint is being found (note the "--all" option used above, which instructs kubectl to remove the taint from "all" nodes, both master and minions).
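
The removal syntax described above can be sketched as a small shell helper. This is only an illustration: the function name taint_removal_arg is hypothetical and not part of the lab, and the kubectl call is left as a comment because it needs a live cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the lab): turn a taint key into the
# argument that asks kubectl to remove it, i.e. the key plus a trailing "-".
taint_removal_arg() {
  local key="$1"
  printf '%s-\n' "$key"
}

# Show the argument for the master taint used in the lab.
taint_removal_arg "node-role.kubernetes.io/master"

# On a live cluster the full command would then be (not run here):
#   kubectl taint nodes --all "$(taint_removal_arg node-role.kubernetes.io/master)"
```

Everything after the trailing minus sign in the lab manual is output, not something to type.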

From the official documentation I found this page:

https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/

and there are quite a few hits if you search for "taint". Searching for "untaint" will not return many results.

Good luck!

-Chris

bryonbaker · August 2018

I know why you made the mistake. The lab script is really confusing - I made the same mistake...

The steps make you think you have to type this command:
kubectl taint nodes --all node-role.kubernetes.io/master-node “ckad-1” untainted

But the last part is actually the output. You only type this:
kubectl taint nodes --all node-role.kubernetes.io/master-

And the output is:
node/master untainted
error: taint "node-role.kubernetes.io/master:" not found

All in all the lab doco is pretty bad I think.

bixue · November 2018

Hi, I was unable to remove the taint node.kubernetes.io/not-ready on my minion node.

command:
kubectl taint nodes --all node.kubernetes.io/not-ready:NoSchedule-
result:
node/pc1-node2 untainted
error: taint "node.kubernetes.io/not-ready:NoSchedule" not found

I have run this thousands of times, but every time it prints the same output. Then I ran

command:
kubectl describe nodes | grep -i taint
result:
Taints: <none>
Taints: node.kubernetes.io/not-ready:NoSchedule

The taint is so stubborn that it does not go away. Could someone tell me how I can debug this issue?

Regards,
Bin

chrispokorni · November 2018

Hi Bin,
"NoSchedule" is the effect of the taint, and it does not need to be part of the taint removal command; the lab uses only the taint key followed by the minus sign.
Look closely at the lab exercise, and compare with your command provided above.

-Chris

serewicz · November 2018

If you read the paragraph for that step, you should find the sentence "Note the minus sign (-) at the end, which is the syntax to remove a taint". There is also an extra space to indicate the end of the command and the start of the output.
Regards,

bixue · November 2018

Hi Chris,

Apologies, I tried both
command
kubectl taint nodes --all node.kubernetes.io/not-ready-

and

kubectl taint nodes --all node.kubernetes.io/not-ready:NoSchedule-

neither worked

Both of them returned
node/pc1-node2 untainted
error: taint "node.kubernetes.io/not-ready:" not found

It's strange that the command returned success but the taint is actually not removed.

Regards,
Bin

bixue · November 2018

Hi serewicz,

I had the minus sign (-) at the end. What do you mean by the extra space? I tried

kubectl taint nodes --all node.kubernetes.io/not-ready-

with an extra space at the end of the command, but the result is

node/pc1-node2 untainted
error: taint "node.kubernetes.io/not-ready:" not found

Regards,
Bin

chrispokorni · November 2018

Hi Bin,
From both responses above the very first command seems to be correct.
The lab instructions mention that it takes a while and a few attempts for the taint removal to be successful.
It is worth noting that once the command is successful and the taint is removed, any subsequent attempt would produce that same output: the error that the taint is not found.
Did you verify that the taints are still there or have been removed?
Run the first command in step 12. Can you provide that output?

Thanks,
-Chris

serewicz · November 2018

Hello Bin,

The extra space is seen between the command and the output in the book. Typically command output is on the very next line, but the book shows an extra space to help illustrate that the last character to be typed would be the minus sign (-), and what follows in the book is output.

The output you posted,

node/pc1-node2 untainted

error: taint "node.kubernetes.io/not-ready:" not found

indicates the taint was removed from the first node, but was not on the second node. If you had three nodes in your cluster you would see three lines of output. If the taint has not been set on a node you would get a "not found" output. If it is set on the node you would see the "untainted" output.
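
To make the per-node reading of that output concrete, here is a minimal sketch that tallies the two kinds of lines. The function name summarize_untaint_output is made up for this example, and it assumes the output was captured with stderr merged into stdout (2>&1), since the "not found" line is reported as an error.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how many nodes were untainted and how many
# did not carry the taint, based on the text printed by "kubectl taint".
summarize_untaint_output() {
  local untainted=0 not_found=0 line
  while IFS= read -r line; do
    case "$line" in
      *" untainted") untainted=$((untainted + 1)) ;;  # taint was present and removed
      *"not found")  not_found=$((not_found + 1)) ;;  # taint was not set on that node
    esac
  done
  printf 'untainted=%d not_found=%d\n' "$untainted" "$not_found"
}

# Sample text copied from this thread; prints: untainted=1 not_found=1
summarize_untaint_output <<'EOF'
node/pc1-node2 untainted
error: taint "node.kubernetes.io/not-ready:" not found
EOF

# Assumed usage on a live cluster (not run here):
#   kubectl taint nodes --all node-role.kubernetes.io/master- 2>&1 | summarize_untaint_output
```

One line per node is expected, so the two counts together should add up to the number of nodes in the cluster.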

Regards,

bixue · November 2018

Hi Chris,

The output is
Taints: <none>
Taints: node.kubernetes.io/not-ready:NoSchedule

I have been running
kubectl taint nodes --all node.kubernetes.io/not-ready-
since yesterday and the output is always like above

Regards,
Bin

bixue · November 2018

Hi Serewicz,

Thank you very much for the detailed explanation. I totally understand what the book is trying to say. It's just that my setup does not seem to produce the expected result. Btw, I'm using VirtualBox on a Mac with 2 nodes running Ubuntu 16.04.

Regards,
Bin

bixue · November 2018

Hi Guys,

Thanks for the help. I reset the cluster and set it up again and everything starts working. Appreciate your time:)

Regards,
Bin

suser · March 2020

Hello
I have the same problem here: I am not able to untaint my nodes during lab 2.2. I use a QEMU VM on Proxmox. Is resetting the whole cluster really the only solution to this problem?
I tried many times, waiting long minutes, and I typed it with no extra spaces.

kubectl get node
NAME STATUS ROLES AGE VERSION
kmaster NotReady master 25h v1.17.1
kw1 NotReady <none> 23h v1.17.1

kubectl taint nodes --all node.kubernetes.io/not-ready-
node/kmaster untainted
node/kw1 untainted

kubectl get node
NAME STATUS ROLES AGE VERSION
kmaster NotReady master 25h v1.17.1
kw1 NotReady <none> 23h v1.17.1

kubectl describe nodes | grep -i taints
Taints: node.kubernetes.io/not-ready:NoSchedule
Taints: node.kubernetes.io/not-ready:NoSchedule

Regards,
Stefan

chrispokorni · March 2020

Hello Stefan,

The taints found on your nodes are generated by the cluster to indicate that the nodes are not ready, and they will automatically be removed once your nodes become ready. Issuing kubectl describe node <node-name> command may indicate why your nodes are not ready. Please provide the output of that command and the output of kubectl get pods --all-namespaces.
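
Since the not-ready taint follows the node status, it can help to check readiness first. The following is a minimal sketch, assuming the default column layout of kubectl get nodes (NAME first, STATUS second); the function name not_ready_nodes is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "kubectl get nodes" output on stdin and print
# the names of nodes whose STATUS column does not start with "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Sample listing based on the output posted in this thread.
# Prints kmaster and kw1, one per line.
not_ready_nodes <<'EOF'
NAME STATUS ROLES AGE VERSION
kmaster NotReady master 25h v1.17.1
kw1 NotReady <none> 23h v1.17.1
EOF

# Assumed usage on a live cluster (not run here):
#   kubectl get nodes | not_ready_nodes
```

As long as a node shows up in this list, removing the not-ready taint by hand is unlikely to stick, because the cluster adds it again.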

Those outputs may help us troubleshoot your cluster.

Regards,
-Chris

suser · March 2020

Thanks Chris,

Meanwhile I ran kubeadm reset and I re-created the master node only so far, and I am still unable to untaint my kmaster node.
The output of kubectl get pods --all-namespaces

vio@kmaster:~$ kubectl describe nodes | grep -i taints
Taints: node.kubernetes.io/not-ready:NoExecute
vio@kmaster:~$ kubectl describe nodes | grep -i taints
Taints: node.kubernetes.io/not-ready:NoExecute
vio@kmaster:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kmaster NotReady master 8m25s v1.17.1
vio@kmaster:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6955765f44-rvdv7 0/1 Pending 0 8m13s
kube-system coredns-6955765f44-sszvz 0/1 Pending 0 8m13s
kube-system etcd-kmaster 1/1 Running 0 8m8s
kube-system kube-apiserver-kmaster 1/1 Running 0 8m8s
kube-system kube-controller-manager-kmaster 1/1 Running 0 8m8s
kube-system kube-proxy-z24hm 1/1 Running 0 8m13s
kube-system kube-scheduler-kmaster 1/1 Running 0 8m8s
vio@kmaster:~$

My attempts to untaint keep reading:

vio@kmaster:~$ kubectl taint nodes --all node.kubernetes.io/not-ready-
node/kmaster untainted
vio@kmaster:~$ kubectl describe nodes | grep -i taints
Taints: node-role.kubernetes.io/master:NoSchedule

I also attached the output of kubectl describe node kmaster command

Thanks in advance!

Stefan

chrispokorni · March 2020

Hi Stefan,

It seems your coredns pods are not running. That is the reason why your node never reaches ready state. Delete both coredns pods and allow the cluster to re-create them for you, while keeping an eye on their state. Once they show a running state, check your master node again. It should now show ready.

Regards,
-Chris

serewicz · March 2020

Hello,

Notice that your coredns pods are both showing as Pending. This would cause the node to be listed as NoSchedule.

What environment are you using to run the labs? GCE, AWS, DigitalOcean?

When you run kubectl describe for one of the coredns pods, what are the messages in the output at the end?

The issue is these pods, not the taint.
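
For reference, kubectl describe expects a resource type before the name, and the coredns pods live in the kube-system namespace. A minimal sketch of how that command is put together follows; the helper name describe_pod_command is hypothetical, and it only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kubectl command that describes one pod
# in a given namespace (resource type "pod" comes before the pod name).
describe_pod_command() {
  local namespace="$1" pod="$2"
  printf 'kubectl --namespace %s describe pod %s\n' "$namespace" "$pod"
}

# Pod name taken from the listing posted in this thread.
# Prints: kubectl --namespace kube-system describe pod coredns-6955765f44-rvdv7
describe_pod_command kube-system coredns-6955765f44-rvdv7
```

The Events section at the end of the describe output is usually where the reason for a Pending pod shows up.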

Regards,

suser · March 2020

Hi Chris,

I use Qemu VM on local proxmox type 1 hypervisor with plenty of resources, latency shouldn't be an issue.

vio@kmaster:~$ sudo kubectl describe coredns-6955765f44-rvdv7
error: the server doesn't have a resource type "coredns-6955765f44-rvdv7"

Now the pods show a different status:

vio@kmaster:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6955765f44-rvdv7 0/1 ContainerCreating 0 89m
kube-system coredns-6955765f44-sszvz 0/1 ContainerCreating 0 89m
kube-system etcd-kmaster 1/1 Running 0 89m
kube-system kube-apiserver-kmaster 1/1 Running 0 89m
kube-system kube-controller-manager-kmaster 1/1 Running 0 89m
kube-system kube-proxy-dqb6x 1/1 Running 0 38m
kube-system kube-proxy-z24hm 1/1 Running 0 89m
kube-system kube-scheduler-kmaster 1/1 Running 0 89m

Stefan

chrispokorni · March 2020

Coredns pods not running has nothing to do with latency issues. They cannot run because they never receive IP addresses, which should be provided by calico. You have no calico pods running. Did calico get downloaded together with the required rbac file? Have the calico pods been started and the rbac permissions created?
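
A quick way to confirm whether any calico pods exist is to scan the pod listing. This is a rough sketch, assuming the calico pod names start with "calico-" and that the input is the output of kubectl get pods --all-namespaces (NAMESPACE first, NAME second); the function name count_calico_pods is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "kubectl get pods --all-namespaces" output on
# stdin and print how many pods have a name starting with "calico-".
count_calico_pods() {
  awk 'NR > 1 && $2 ~ /^calico-/ { n++ } END { print n + 0 }'
}

# Listing copied from this thread; it contains no calico pods, so this prints 0.
count_calico_pods <<'EOF'
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6955765f44-rvdv7 0/1 Pending 0 8m13s
kube-system coredns-6955765f44-sszvz 0/1 Pending 0 8m13s
kube-system etcd-kmaster 1/1 Running 0 8m8s
kube-system kube-apiserver-kmaster 1/1 Running 0 8m8s
kube-system kube-controller-manager-kmaster 1/1 Running 0 8m8s
kube-system kube-proxy-z24hm 1/1 Running 0 8m13s
kube-system kube-scheduler-kmaster 1/1 Running 0 8m8s
EOF

# Assumed usage on a live cluster (not run here):
#   kubectl get pods --all-namespaces | count_calico_pods
```

A count of 0 suggests the network plugin manifests were never applied successfully.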

Regards,
-Chris

suser · March 2020

Chris,

I ran the init command as set on lab 2.2 file "sudo kubeadm init --kubernetes-version 1.17.1 --pod-network-cidr 192.168.0.0/16", but VM IP is 10.1.10.30, could this be the issue?

Stefan

suser · March 2020

Chris,
I ran the Lab 2.2 script k8sMaster.sh again step by step and discovered that the lines
wget --no-check-certificate https://tinyurl.com/yb4xturm -O rbac-kdd.yaml
and
wget --no-check-certificate https://tinyurl.com/y2vqsobb -O calico.yaml
do not download any YAML files, but some GIF files instead.
Therefore the lines
kubectl apply -f rbac-kdd.yaml
kubectl apply -f calico.yaml
generate format-related errors, and I am not able to set up the environment correctly using the given materials.
Can you help with the correct files rbac-kdd.yaml and calico.yaml?

Thank you!

chrispokorni · March 2020

It seems to be a strange behavior, which I was not able to reproduce. However, the script in the Solutions tarball does not use the --no-check-certificate option.

It could be that your instances do not correctly resolve the URLs. Did you check the resolv.conf files of your instances?

If all else fails, just download the correct files by clicking on the two working links above and then create the yaml files manually.
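
Whichever way the files are obtained, it may be worth checking that they really are manifests before running kubectl apply. Here is a minimal sketch of such a check; the function name looks_like_manifest is hypothetical, and the test is deliberately crude: it rejects files that start with the GIF signature and otherwise looks for a top-level apiVersion: or kind: line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file does not start with the
# GIF signature and contains a top-level "apiVersion:" or "kind:" line.
looks_like_manifest() {
  local file="$1"
  # GIF images begin with the ASCII bytes "GIF87a" or "GIF89a".
  if [ "$(head -c 3 "$file")" = "GIF" ]; then
    return 1
  fi
  grep -Eq '^(apiVersion|kind):' "$file"
}

# Demonstration with two throwaway files in a temporary directory.
tmpdir="$(mktemp -d)"
printf 'GIF89a' > "$tmpdir/bad.yaml"
printf 'apiVersion: v1\nkind: ServiceAccount\n' > "$tmpdir/good.yaml"

for f in "$tmpdir/bad.yaml" "$tmpdir/good.yaml"; do
  if looks_like_manifest "$f"; then
    echo "$(basename "$f"): looks like a manifest"
  else
    echo "$(basename "$f"): not a manifest, do not apply"
  fi
done

rm -rf "$tmpdir"
```

In the lab this check would be run against rbac-kdd.yaml and calico.yaml right after the download step.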

Regards,
-Chris

serewicz · March 2020

Hello Stefan,

Please share, what operating system are you using?

Where are you running the labs? GCE, AWS, Digital Ocean?

Regards,

suser · March 2020

Hi Chris,

I run the labs on VMs which run on a local hypervisor (Proxmox, as I mentioned); I do not use any of the vendors you mentioned.
I use Ubuntu 18 LTS as the OS on the VMs.

Stefan

suser · March 2020

@chrispokorni Hello, I have to add the flag --no-check-certificate because the site https://tinyurl.com/y2vqsobb uses a self signed certificate, and I cannot connect without this flag from my end and I guess this is the normal default behavior. Maybe the class materials are not good.

Stefan

serewicz · March 2020

Hello,

Thank you for letting us know what service and OS you are using for the labs; I must have missed that before. I'm not familiar with Proxmox, but I'll take a look. In my experience, when these sorts of issues happen it ends up being some feature or security setting which blocks the appropriate messages from being sent.

When the lab is run using GCE, AWS, DigitalOcean, VirtualBox, QEMU/KVM, and bare metal it works as written. This would lead me to believe there is something unknown inside of Proxmox which is causing the issues. Assuming you are using copy and paste, and not typing URLs by hand, my first guess is something network related: that something is blocked between nodes or VMs. My second guess is that the hypervisor is not translating the commands as expected through LXC, or not presenting the network interfaces in an expected manner.

Are you in a secure and locked-down environment such that you cannot use self-signed certificates?

Regards,

suser · March 2020

Hello,
Yes, behind my firewall I cannot accept self signed certificate, but I am fine with adding the flag manually. But yet I am not getting the yaml files this way. Can you supply them here for me?
@Chris What DNS setting should I use? I can change that. (Currently my VM uses the host dns 1.1.1.1 and 1.0.0.1 through the router).

Stefan

chrispokorni · March 2020

Hi Stefan,

Since you have the links, just open them in a browser, copy the files over and you should be able to produce the yaml manifests in your environment. It seems to be a simple workaround since your environment does not permit the download of such files. Providing them "here" would change the yaml formatting which would not help you in any way.

So far I have not seen such behavior on GCE, AWS EC2, DigitalOcean, Virtualbox, or even on Minikube. Using different environment variations may imply slightly different configuration and installation options. That is to be expected since each environment treats features differently: networking, firewall rules, provisioning and some of the installation options. Once the cluster is up and running, then all else would work as presented.

Regards,
-Chris

suser · March 2020

Thank you very much. I got the files using a VPN; at this point I am unable to allow tinyurl redirections to the long URL from my end. I will let you know how it goes.

Stefan

suser · March 2020

Thanks @serewicz and @chrispokorni, it just worked smoothly once I had the yaml files. I had never used tinyurl before; please note that it is not suitable for every environment.
Stefan
