
coredns issue on Ubuntu 16.04 on Virtualbox

Hi,
I decided to practice for the exam now that I have a bit of time.
I created a VM using Ubuntu 16.04 and created a NAT Network on 10.0.2.0.
I then did a fresh install of Kubernetes using the manual from the class.

I see coredns stuck in Pending when I do a kubectl get pods -A.
I didn't add RBAC or Calico yet; I just ran the kubeadm commands.
I also didn't add the worker node or remove the taint, since I figured this should be up and running right out of the box on a fresh install.
I also upgraded kubectl, kubeadm, etc. to the latest versions, but that didn't help.

Here is my describe output (note: I tried to use the code function, but it put a special character at the first and last position of every line; is there an easier way to paste a block of text?):

kubectl describe pod coredns-5c98db65d4-q5zbd -n kube-system

Host Ports:  0/UDP, 0/TCP, 0/TCP
Args:
  -conf
  /etc/coredns/Corefile
Limits:
  memory:  170Mi
Requests:
  cpu:     100m
  memory:  70Mi
Liveness:   http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness:  http-get http://:8080/health delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
Mounts:
  /etc/coredns from config-volume (ro)
  /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-68lsj (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-68lsj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-68lsj
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  16s (x54 over 5m21s)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.

Comments

  • chrispokorni Posts: 2,155
    edited September 2019

    Hi @btanoue,
    You need to either remove the taints from the master so the scheduler can place the coredns pods there (since you did not attach a worker node), or have at least one worker node join the cluster. You also need Calico started on the cluster for the coredns pods to run successfully, as they receive their IPs from Calico.
    The installation steps are in a particular sequence for a good reason :smile:
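
    In case it helps, clearing the master taint and generating the worker join command usually look something like this (the taint key matches the one shown in your describe output; adjust if your version differs):

    kubectl taint nodes --all node-role.kubernetes.io/master:NoSchedule-
    kubeadm token create --print-join-command    # run on the master, then run the printed kubeadm join command on the worker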

    Regards,
    -Chris

  • Thanks chrispokorni.

    Thank you. I wasn't sure if the default install was supposed to start coredns on the k8smaster. I'm still kind of learning how it all connects and debugging this stuff is fun.

    So based on what you said, Calico is like DHCP for the pods. I wasn't sure how that worked, but now I do.

    I guess stopping to check what's running where at every step did more harm than good, even though I was just trying to really understand how it all connects. In this case, I was shooting myself in the foot for no reason LOL.

  • OK, I installed RBAC and Calico.
    Installed the second node.
    Followed the directions and removed the taints.

    coredns stays stuck in ContainerCreating.

    Warning FailedCreatePodSandBox 78s (x4 over 81s) kubelet, kubemaster (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "54394195e138ac17122a338a6e25ad6de0b1ba544f0bd8560439c5b95aad1cdb" network for pod "coredns-5c98db65d4-4zwlq": NetworkPlugin cni failed to set up pod "coredns-5c98db65d4-4zwlq_kube-system" network: no podCidr for node kubemaster
    Normal Scheduled default-scheduler Successfully assigned kube-system/coredns-5c98db65d4-4zwlq to kubemaster

    I then created the nginx deployment and it stays in that same state with ContainerCreating.
    I feel it has something to do with Calico.

    My network is on 10.0.2.0.
    Calico and kubeadm init were set to 10.0.1.0.

    Any thoughts?
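
    In case it's useful for debugging, one way to see whether the node ever got a pod CIDR assigned is something like:

    kubectl get node kubemaster -o jsonpath='{.spec.podCIDR}'

    (kubemaster is just the name of my master node.)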

  • @btanoue , did you get to Step 6 in Exercise 3.3? It may help with the coredns pods.
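
    That step boils down to deleting the existing coredns pods so their Deployment recreates them once the pod network is in place; assuming the default k8s-app=kube-dns label on the coredns pods, something like:

    kubectl -n kube-system delete pods -l k8s-app=kube-dns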

    -Chris

  • btanoue Posts: 59
    edited September 2019

    Yes, I did delete the coredns containers, and when they respawned they went back to the ContainerCreating state.

    I'm starting to think it has something to do with the podCIDR and CNI. I just don't know how to fix it.

  • OK, so I fixed it but I'm not sure exactly how this works.

    kubectl patch node kubemaster -p '{"spec":{"podCIDR":"10.0.1.0/16"}}'
    kubectl patch node kubeworker -p '{"spec":{"podCIDR":"10.0.1.0/16"}}'

    I understand that I pushed the CIDR to the nodes.
    But what I don't understand is why the kubeadm init config and Calico didn't set this up.

    Any ideas? I'd like to understand why it didn't work, and also get a better sense of how the patch works.
    But it did create pods and deployments now. I can scale nginx to 3 replicas and they are Running.
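
    For reference, my understanding is that the per-node podCIDR is supposed to be handed out automatically when the cluster is initialized with a pod network CIDR, something like this (192.168.0.0/16 being the Calico default):

    kubeadm init --pod-network-cidr=192.168.0.0/16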

  • Understanding IP network sizes would help in this case, more specifically what the minimum and maximum IP addresses in such a range are. Look at the size of the default Calico pod network 192.168.0.0/16, then at the size of 10.0.1.0/16 and its relationship to 10.0.2.0.
    Once you have that part figured out, keep in mind that IP blocks should not overlap: node IPs with pod IPs, and both with service IPs.
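
    As a quick sanity check, a shell one-liner like this prints the real range covered by 10.0.1.0/16 (Python 3 assumed to be installed on the node):

    python3 -c "import ipaddress; n = ipaddress.ip_network('10.0.1.0/16', strict=False); print(n, n[0], '-', n[-1])"

    It prints 10.0.0.0/16 10.0.0.0 - 10.0.255.255, i.e. the whole 10.0.x.x space, which overlaps the 10.0.2.0 node network.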

    Regards,
    -Chris
