CoreDNS issue on Ubuntu 16.04 on VirtualBox
Hi,
I decided to practice for the exam now that I have a bit of time.
I created a VM running Ubuntu 16.04 and created a NAT network on 10.0.2.0.
I then did a fresh install of Kubernetes using the manual from the class.
I see coredns in the Pending state when I run kubectl get pods -A.
I didn't add rbac or calico yet. I just ran the kubeadm commands.
I also didn't add the worker node or remove the taint, since I figured this should be up and running right out of the box on a fresh install.
I also upgraded kubectl, kubeadm etc. to the latest version, but that didn't help.
Here is my output (note: I tried to use the code function, but it added a special character at the start and end of every line; is there an easier way to post a block of text?):
kubectl describe pod coredns-5c98db65d4-q5zbd -n kube-system
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8080/health delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-68lsj (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-68lsj:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-68lsj
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 16s (x54 over 5m21s) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Comments
Hi @btanoue,
You need to remove the taints from the master so the scheduler can place the coredns pods on it (since you did not attach a worker node), OR have at least one worker node join the cluster. In either case you also need calico started on the cluster for the coredns pods to run successfully, as they receive their IPs from calico.
The installation steps are in a particular sequence for a good reason.
Regards,
-Chris
Thanks crispokorni.
Thank you. I wasn't sure if the default install was supposed to start coredns on the k8smaster. I'm still kind of learning how it all connects and debugging this stuff is fun.
So Calico is like DHCP for the pods based on what you said. I wasn't sure how that was working but now I do.
I guess my constantly checking what's running where did more harm than good, since I was trying to really understand how it is all connected. In this case, I was shooting myself in the foot for no reason LOL.
OK, I installed rbac and calico.
Installed the second node.
Followed the directions and removed taints.
coredns stays in ContainerCreating:
Warning FailedCreatePodSandBox 78s (x4 over 81s) kubelet, kubemaster (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "54394195e138ac17122a338a6e25ad6de0b1ba544f0bd8560439c5b95aad1cdb" network for pod "coredns-5c98db65d4-4zwlq": NetworkPlugin cni failed to set up pod "coredns-5c98db65d4-4zwlq_kube-system" network: no podCidr for node kubemaster
Normal Scheduled default-scheduler Successfully assigned kube-system/coredns-5c98db65d4-4zwlq to kubemaster
I then created the nginx deployment and it stays in that same ContainerCreating state.
I feel it has something to do with Calico.
My network is on 10.0.2.0.
Calico and kubeadm-init were set to 10.0.1.0.
Any thoughts?
@btanoue , did you get to Step 6 in Exercise 3.3? It may help with the coredns pods.
-Chris
Yes, I did delete the coredns containers, and when they respawned they went back to the ContainerCreating state.
I'm starting to think it has something to do with the podcidr and cni. I just don't know how to fix it.
OK, so I fixed it but I'm not sure exactly how this works.
kubectl patch node kubemaster -p '{"spec":{"podCIDR":"10.0.1.0/16"}}'
kubectl patch node kubeworker -p '{"spec":{"podCIDR":"10.0.1.0/16"}}'
I understand that I pushed the CIDR to the nodes.
But what I don't understand is why the kubeadm-init file and Calico didn't set this up.
Any ideas? I'd like to understand why it didn't work, and also understand how the patch works a little better.
But it did create pods and deployments now. I can scale nginx to 3 replicas and they are Running.
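On the "how does the patch work" part, here is a rough sketch in Python of what a merge-style patch does to an object. By default kubectl patch sends a strategic merge patch; for a single scalar field such as spec.podCIDR that should behave like the plain merge shown here. The merge_patch helper and the trimmed-down node dictionary are hypothetical illustrations, not the real API server code or a real Node object.

```python
import copy
import json


def merge_patch(target, patch):
    """Return a new object with a JSON-merge-style patch applied.

    Simplified illustration only: dicts are merged key by key,
    a null value removes a key, anything else replaces the old value.
    """
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical, heavily trimmed Node object; a real one has many more fields.
node = {"metadata": {"name": "kubemaster"}, "spec": {"taints": []}}

# The same JSON body that was passed to kubectl patch with -p in the thread.
patch = json.loads('{"spec":{"podCIDR":"10.0.1.0/16"}}')

patched = merge_patch(node, patch)
print(json.dumps(patched, indent=2))
```

Only spec.podCIDR is added; the other fields under spec and metadata are left as they were, which is why the patch can be this small.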
Understanding IP network sizes would help in this case. More specifically understanding what are the minimum and maximum IP addresses in such a range. Understand the size of the default calico pod network 192.168.0.0/16, then the size of 10.0.1.0/16 and its relationship with 10.0.2.0.
After you have this part figured out, keep in mind that IP blocks should not overlap: node IPs, with pod IPs, and with service IPs.
Regards,
-Chris
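A minimal sketch of the comparison suggested above, using Python's standard ipaddress module. The /24 prefix on the 10.0.2.0 VirtualBox network is an assumption; the thread gives only the base address.

```python
import ipaddress

# 10.0.1.0/16 has host bits set, so strict=False is needed; the network
# it actually names starts at 10.0.0.0.
pod_net = ipaddress.ip_network("10.0.1.0/16", strict=False)

# Assumed prefix length for the VirtualBox NAT network from the thread.
node_net = ipaddress.ip_network("10.0.2.0/24")

# Default calico pod network mentioned in the thread.
calico_default = ipaddress.ip_network("192.168.0.0/16")

for label, net in [("pod", pod_net), ("node", node_net), ("calico default", calico_default)]:
    # net[0] is the lowest address in the block, net[-1] the highest.
    print(f"{label}: {net} spans {net[0]} - {net[-1]} ({net.num_addresses} addresses)")

print("pod range overlaps node range:", pod_net.overlaps(node_net))
print("calico default overlaps node range:", calico_default.overlaps(node_net))
```

Under that assumption, the pod range given to kubeadm and calico covers the whole of 10.0.x.x, including the node network, while the calico default does not touch it.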