calico-node pod: CrashLoopBackOff, coredns pod: ContainerCreating in vagrant

b10s · September 2019

Hello,
I'm trying to follow LAB 3_[1-5] but got the following error:

[error screenshot]

I deleted the two coredns pods as recommended in LAB_3.3, but the newly spawned pods end up in the same state.

Some logs:

kubectl -n kube-system describe pods coredns-5c98db65d4-dt499 gives

...
  Warning  FailedCreatePodSandBox  7m8s (x4 over 7m11s)     kubelet, master-node  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "bb3a35937e92bf16d7d2d565ecef5c7259e2a7a391141df19f21c4cd6cc08172" network for pod "coredns-5c98db65d4-dt499": NetworkPlugin cni failed to set up pod "coredns-5c98db65d4-dt499_kube-system" network: no podCidr for node master-node
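The event above complains that master-node has no podCidr. A minimal sketch for checking what the API server has recorded for a node, assuming kubectl is already configured for the cluster: `.spec.podCIDR` is the standard Node field, while `report_pod_cidr` and the `KUBECTL` override are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a node has a pod CIDR assigned.
# KUBECTL can be overridden (for example with a stub) and defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

report_pod_cidr() {
  local node="$1" cidr
  # .spec.podCIDR is empty when no range has been allocated to the node.
  cidr="$("$KUBECTL" get node "$node" -o jsonpath='{.spec.podCIDR}')"
  if [ -z "$cidr" ]; then
    echo "$node: no podCIDR assigned"
  else
    echo "$node: podCIDR $cidr"
  fi
}

# Usage (on the master, as the user that owns ~/.kube/config):
#   report_pod_cidr master-node
```

An empty result would match the error in the event, but it would not by itself say why the range was never allocated.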

kubectl -n kube-system describe pods calico-node-w62gg gives:

...
Warning  Unhealthy  10m (x7 over 11m)  kubelet, worker-node  Readiness probe failed: Threshold time for bird readiness check:  30s
calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: dial tcp 127.0.0.1:9099: connect: connection refused
...

Promiscuous mode is enabled on both the master and the worker, e.g. on the master node:

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:32:7c:93:db:e1 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::32:7cff:fe93:dbe1/64 scope link
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:97:49:fb brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.10/24 brd 10.0.0.255 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe97:49fb/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:0f:c8:1b:f4 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
5: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1
    link/ipip 0.0.0.0 brd 0.0.0.0

I have a few questions:

What is the cause of this problem?
How can I fix it?

b10s · September 2019

Here are my two Vagrantfiles, one for the master node and one for the worker node.

master:

# -*- mode: ruby -*-                                                                
# vi: set ft=ruby :                                                                 

$configureMasterNodeBox = <<-SCRIPT                                                 
   #LFS258                                                                          
   apt-get update && apt-get upgrade -y                                             
   apt-get install -y docker.io                                                     

   cat <<EOF >/etc/apt/sources.list.d/kubernetes.list                               
   deb http://apt.kubernetes.io/ kubernetes-xenial main                             
EOF                                                                                 

  curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -  
  apt-get update                                                                    
  apt-get install -y kubeadm=1.15.1-00 kubelet=1.15.1-00 kubectl=1.15.1-00          
  wget https://tinyurl.com/yb4xturm -O rbac-kdd.yaml                                
  wget https://tinyurl.com/y8lvqc9g -O calico.yaml                                  
  IP_ADDR=`ifconfig enp0s8 | grep Mask | awk '{print $2}'| cut -f2 -d:`             
  echo "$IP_ADDR k8smaster" >> /etc/hosts                                           

  cat <<EOF >kubeadm-config.yaml                                                    
  apiVersion: kubeadm.k8s.io/v1beta2                                                
  kind: ClusterConfiguration                                                        
  kubernetesVersion: 1.15.1 #<-- Use the word stable for newest version             
  controlPlaneEndpoint: "k8smaster:6443" #<-- Use the node alias not the IP         
  networking:                                                                       
    podSubnet: 192.168.0.0/16                                                       
EOF                                                                                 

  kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out # Save output for future review
  #copying credentials to regular user - vagrant                                    
  sudo --user=vagrant mkdir -p /home/vagrant/.kube                                  
  cp -i /etc/kubernetes/admin.conf /home/vagrant/.kube/config                       
  chown $(id -u vagrant):$(id -g vagrant) /home/vagrant/.kube/config                
SCRIPT                                                                              


$kubectl = <<-SCRIPT                                                                
  mkdir -p $HOME/.kube                                                              
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config                          
  sudo chown $(id -u):$(id -g) $HOME/.kube/config                                   
  sudo cp /root/rbac-kdd.yaml .                                                     
  kubectl apply -f rbac-kdd.yaml                                                    
  sudo cp /root/calico.yaml .                                                       
  kubectl apply -f calico.yaml                                                      
  source <(kubectl completion bash)                                                 
  echo "source <(kubectl completion bash)" >> ~/.bashrc                             
  sudo kubeadm config print init-defaults                                           
SCRIPT                                                                              

Vagrant.configure("2") do |config|                                                  
  config.vm.box = "ubuntu/xenial64"                                                 
  config.vm.hostname = "master-node"                                                
  config.vm.network :private_network, ip: "10.0.0.10"                               
  config.vm.provision "shell", inline: $configureMasterNodeBox                      
  config.vm.provision "shell", inline: $kubectl, privileged: false                  
end

worker:

# -*- mode: ruby -*-                                                                
# vi: set ft=ruby :                                                                 

$configureWorkerNodeBox = <<-SCRIPT                                                 
  apt-get update && apt-get upgrade -y                                              
  apt-get install -y docker.io                                                      

  cat <<EOF >/etc/apt/sources.list.d/kubernetes.list                                
  deb http://apt.kubernetes.io/ kubernetes-xenial main                              
EOF                                                                                 

  curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -   
  apt-get update                                                                    
  apt-get install -y kubeadm=1.15.1-00 kubelet=1.15.1-00 kubectl=1.15.1-00          
SCRIPT                                                                              

Vagrant.configure("2") do |config|                                                  
  config.vm.box = "ubuntu/xenial64"                                                 
  config.vm.hostname = "worker-node"                                                
  config.vm.network :private_network, ip: "10.0.0.11"                               
  config.vm.provision "shell", inline: $configureWorkerNodeBox                      
end

On the worker node I run the following by hand:

# vim /etc/hosts
10.0.0.10 k8smaster

kubeadm join ...
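The manual /etc/hosts edit could also be scripted so that the worker is provisioned the same way as the master. A minimal sketch, assuming the same alias and address as above: `add_hosts_entry` is a hypothetical helper, and the file path is a parameter so it can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "<ip> <name>" to a hosts file unless the
# name is already mapped there, so re-running provisioning is harmless.
add_hosts_entry() {
  local ip="$1" name="$2" hosts_file="${3:-/etc/hosts}"
  # Look for the name as a whole word following whitespace.
  if grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$hosts_file"; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >>"$hosts_file"
}

# Usage (as root on the worker):
#   add_hosts_entry 10.0.0.10 k8smaster
```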

Inspired by https://github.com/ecomm-integration-ballerina/kubernetes-cluster/blob/master/Vagrantfile which actually works.

Just wondering, what did I miss?

b10s · September 2019

It seems I have found where the problem is.

It works if I use the following provisioning command:

IP_ADDR=`ifconfig enp0s8 | grep Mask | awk '{print $2}'| cut -f2 -d:`          
HOST_NAME=$(hostname -s)                                                       
kubeadm init --apiserver-advertise-address=$IP_ADDR --apiserver-cert-extra-sans=$IP_ADDR  --node-name $HOST_NAME --pod-network-cidr=172.16.0.0/16

It does NOT work if I use the following provisioning command:

cat <<EOF >kubeadm-config.yaml                                                 
apiVersion: kubeadm.k8s.io/v1beta2                                             
kind: ClusterConfiguration                                                     
kubernetesVersion: 1.15.1 #<-- Use the word stable for newest version          
controlPlaneEndpoint: "k8smaster:6443" #<-- Use the node alias not the IP      
networking:                                                                    
  podSubnet: 192.168.0.0/16                                                    
EOF                                                                              

kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out # Save output for future review

I'm still not sure why this is happening. Could someone explain in detail?
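One visible difference between the two commands is that the flag-based one pins the API server advertise address and the node name, while the config file leaves both to kubeadm's defaults. Whether that is what breaks the second variant is not confirmed in this thread. As a sketch for comparing the two on equal terms, the same settings can be carried in a kubeadm v1beta2 config file: `localAPIEndpoint.advertiseAddress`, `nodeRegistration.name`, `apiServer.certSANs` and `networking.podSubnet` are the config counterparts of the flags, and `write_kubeadm_config` is a hypothetical helper name. The pod subnet is kept at the lab's 192.168.0.0/16 rather than the 172.16.0.0/16 used with the flags.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a kubeadm config that sets, in file form,
# what the flag-based command sets on the command line.
write_kubeadm_config() {
  local ip_addr="$1" node_name="$2" out_file="$3"
  cat >"$out_file" <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "${ip_addr}"   # counterpart of --apiserver-advertise-address
nodeRegistration:
  name: "${node_name}"             # counterpart of --node-name
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: 1.15.1
controlPlaneEndpoint: "k8smaster:6443"
apiServer:
  certSANs:
  - "${ip_addr}"                   # counterpart of --apiserver-cert-extra-sans
networking:
  podSubnet: 192.168.0.0/16        # counterpart of --pod-network-cidr
EOF
}

# Usage (as root on the master, with IP_ADDR set as in the Vagrantfile):
#   write_kubeadm_config "$IP_ADDR" "$(hostname -s)" kubeadm-config.yaml
#   kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.out
```

If the cluster comes up with this file but not with the shorter one, the advertise address or node name defaults would be the place to look next.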

chrispokorni · September 2019

Hi @b10s ,

From your first output, it seems that the networking between your nodes is not configured properly.
When provisioning local nodes it is recommended to enable promiscuous mode with allow-all-traffic (all sources, all destinations, all ports, all protocols) in order to allow all Kubernetes agents to talk to each other.
On the Ubuntu nodes once provisioned, also check any firewalls which may block some traffic.
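For the firewall check, a minimal sketch assuming the nodes use ufw, the default firewall front end on Ubuntu: `ufw_is_active` and the `UFW` override are hypothetical names, and the check relies on `ufw status` printing "Status: active" or "Status: inactive" on its first line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when ufw reports itself as active.
# UFW can be overridden (for example with a stub) and defaults to ufw.
UFW="${UFW:-ufw}"

ufw_is_active() {
  "$UFW" status | head -n 1 | grep -q '^Status: active'
}

# Usage (as root on each node):
#   if ufw_is_active; then echo "ufw is active, review its rules"; fi
```

Other firewalls, such as hand-written iptables rules, would need their own check.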

If the tutorial you follow works, then you can always just replace the configuration options with the ones from the lab exercise in this course, and see what happens then.

If all else fails, you can simply spin up two Ubuntu 16.04 LTS VMs with VirtualBox, configure them with promiscuous mode set to allow-all-traffic, and continue from there.

Good luck!
-Chris

b10s · September 2019

@chrispokorni thank you for checking.

If by "not configured properly" you meant promiscuous mode, then you can see PROMISC in my first output.

It is at the very beginning, in the output of ip a.

Or have I misunderstood you? Can you explain why you think I didn't configure promiscuous mode?

chrispokorni · September 2019

When configuring the promiscuous mode you have 3 available options: allow all, allow VMs, deny. Is yours configured to allow all?
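The three options are a VirtualBox adapter setting on the host, separate from the PROMISC flag that ip a shows inside the guest. A minimal sketch for setting the policy from the host on a running VM, using `VBoxManage controlvm ... nicpromisc<N>`: the VM name is a placeholder, adapter 2 is an assumption (Vagrant normally puts the private network on the second adapter), and `set_promisc_allow_all` and the `VBOXMANAGE` override are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set a VirtualBox adapter's promiscuous policy to
# allow-all on a running VM. Other valid policies are deny and allow-vms.
VBOXMANAGE="${VBOXMANAGE:-VBoxManage}"

set_promisc_allow_all() {
  local vm="$1" nic="$2"
  "$VBOXMANAGE" controlvm "$vm" "nicpromisc${nic}" allow-all
}

# Usage (on the host; take the VM name from `VBoxManage list vms`):
#   set_promisc_allow_all "<vm-name>" 2
```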

b10s · September 2019

@chrispokorni I hope it is allow all since VirtualBox does it as a bridge
