Welcome to the Linux Foundation Forum!

Lab 10 task 14 Unable to install vim

Hello.
I am working on Lab 10.1.14.
When I try to install vim in the thirdpage pod, it fails to retrieve the package lists.

"
root@thirdpage-5867bc9dfd-nmp8x:/# apt-get update
Err:1 http://security.debian.org/debian-security buster/updates InRelease
Temporary failure resolving 'security.debian.org'
Err:2 http://deb.debian.org/debian buster InRelease
Temporary failure resolving 'deb.debian.org'
Err:3 http://deb.debian.org/debian buster-updates InRelease
Temporary failure resolving 'deb.debian.org'
Reading package lists... Done
W: Failed to fetch http://deb.debian.org/debian/dists/buster/InRelease Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://security.debian.org/debian-security/dists/buster/updates/InRelease Temporary failure resolving 'security.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/buster-updates/InRelease Temporary failure resolving 'deb.debian.org'
W: Some index files failed to download. They have been ignored, or old ones used instead.
root@thirdpage-5867bc9dfd-nmp8x:/# apt-get install vim
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package vim
"

Please help.

Answers

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @hhness,

    It seems that your container cannot resolve deb.debian.org. Can you ping and/or traceroute that domain from inside the container? What about ping/traceroute from the node/VM running that container?

    Regards,
    -Chris

  • hhness
    hhness Posts: 18

    Hi @chrispokorni
    Thank you for replying.

    I may be misunderstanding what you mean by domain, but I tried the following.

    When trying either ping or traceroute from inside the container I get:
    "
    root@thirdpage-5867bc9dfd-nmp8x:/# ping thirdpage-5867bc9dfd-nmp8x
    bash: ping: command not found
    root@thirdpage-5867bc9dfd-nmp8x:/# traceroute thirdpage-5867bc9dfd-nmp8x
    bash: traceroute: command not found
    root@thirdpage-5867bc9dfd-nmp8x:/#
    "
    So I believe the container does not have ping or traceroute installed.

    I also tried it from the node where thirdpage is running (the worker node):
    "
    student@worker:~$ ping thirdpage-5867bc9dfd-nmp8x
    ping: thirdpage-5867bc9dfd-nmp8x: Temporary failure in name resolution
    student@worker:~$ traceroute thirdpage-5867bc9dfd-nmp8x
    thirdpage-5867bc9dfd-nmp8x: Temporary failure in name resolution
    Cannot handle "host" cmdline arg `thirdpage-5867bc9dfd-nmp8x' on position 1 (argc 1)
    "

  • hhness
    hhness Posts: 18

    I also found that coredns is unavailable.
    "
    student@master:~$ kubectl get deployments --all-namespaces
    NAMESPACE         NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
    default           nginx                     1/1     1            1           5d19h
    default           secondapp                 1/1     1            1           4d22h
    default           thirdpage                 1/1     1            1           4d20h
    kube-system       calico-kube-controllers   1/1     1            1           17d
    kube-system       coredns                   0/2     2            0           17d
    low-usage-limit   limited-hog               1/1     1            1           12d
    "
    Perhaps that will have an effect?

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @hhness,

    I was hoping to see ping and traceroute results from the container to "deb.debian.org", and from the node running the container to the same "deb.debian.com".

    Since you mentioned coredns, please run kubectl get pods -A -o wide and share the output.

    Regards,
    -Chris

  • hhness
    hhness Posts: 18

    @chrispokorni

    "
    student@master:~$ kubectl get pods -A -o wide
    NAMESPACE         NAME                                       READY   STATUS             RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
    default           nginx-6799fc88d8-rltqc                     1/1     Running            0          5d20h   192.168.219.90   master
    default           secondapp-959796d85-lq89k                  1/1     Running            0          4d22h   192.168.171.95   worker
    default           thirdpage-5867bc9dfd-nmp8x                 1/1     Running            0          4d21h   192.168.171.96   worker
    kube-system       calico-kube-controllers-69496d8b75-4825x   1/1     Running            2          12d     192.168.219.82   master
    kube-system       calico-node-5ct7s                          1/1     Running            3          15d     10.2.0.3         worker
    kube-system       calico-node-7bvnb                          1/1     Running            4          17d     10.2.0.2         master
    kube-system       coredns-74ff55c5b-mfc27                    0/1     CrashLoopBackOff   1648       5d20h   192.168.219.91   master
    kube-system       coredns-74ff55c5b-r2d2p                    0/1     CrashLoopBackOff   1646       5d20h   192.168.219.92   master
    kube-system       etcd-master                                1/1     Running            2          12d     10.2.0.2         master
    kube-system       kube-apiserver-master                      1/1     Running            2          12d     10.2.0.2         master
    kube-system       kube-controller-manager-master             1/1     Running            2          12d     10.2.0.2         master
    kube-system       kube-proxy-26hcr                           1/1     Running            2          12d     10.2.0.3         worker
    kube-system       kube-proxy-prdrv                           1/1     Running            2          12d     10.2.0.2         master
    kube-system       kube-scheduler-master                      1/1     Running            2          12d     10.2.0.2         master
    kube-system       traefik-ingress-controller-lj2x8           1/1     Running            0          4d21h   10.2.0.2         master
    kube-system       traefik-ingress-controller-ttl9p           1/1     Running            0          4d21h   10.2.0.3         worker
    low-usage-limit   limited-hog-7c5ddc8c74-dnx5b               1/1     Running            2          12d     192.168.171.69   worker
    student@master:~$
    "

  • hhness
    hhness Posts: 18

    Here are the results from ping and traceroute.
    It is still not possible to run either ping or traceroute in the container.
    I tried it on both the master and worker node and got the same results.
    "
    student@worker:~$ ping deb.debian.org
    PING debian.map.fastlydns.net (151.101.86.132) 56(84) bytes of data.
    64 bytes from 151.101.86.132 (151.101.86.132): icmp_seq=1 ttl=53 time=9.43 ms
    64 bytes from 151.101.86.132 (151.101.86.132): icmp_seq=2 ttl=53 time=9.42 ms
    64 bytes from 151.101.86.132 (151.101.86.132): icmp_seq=3 ttl=53 time=9.55 ms
    64 bytes from 151.101.86.132 (151.101.86.132): icmp_seq=4 ttl=53 time=9.44 ms
    64 bytes from 151.101.86.132 (151.101.86.132): icmp_seq=5 ttl=53 time=9.45 ms
    ^C
    --- debian.map.fastlydns.net ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 4005ms
    rtt min/avg/max/mdev = 9.422/9.462/9.553/0.116 ms
    student@worker:~$ traceroute deb.debian.org
    traceroute to deb.debian.org (151.101.86.132), 30 hops max, 60 byte packets
    1 * * *
    2 * * *
    3 * * *
    [hops 4 through 29 omitted; every hop timed out with * * *]
    30 * * *
    student@worker:~$
    "

  • hhness
    hhness Posts: 18

    And using deb.debian.com
    "
    student@worker:~$ traceroute deb.debian.com
    deb.debian.com: Name or service not known
    Cannot handle "host" cmdline arg `deb.debian.com' on position 1 (argc 1)
    student@worker:~$ ping deb.debian.com
    ping: deb.debian.com: Name or service not known
    "

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @hhness,

    Thanks for checking. The nodes seem to be OK, but because the coredns pods are not running in the cluster, your applications are not being configured with working DNS. That is why you cannot install any packages in the thirdpage container.

    Let's try to find out why the coredns pods stopped running. Please run kubectl -n kube-system describe pod coredns-xxxx-yyy for each coredns pod and share the output.

    Regards,
    -Chris
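The two describe commands can also be driven by the pods' label instead of their generated names. A sketch, assuming the default k8s-app=kube-dns label that kubeadm puts on the coredns pods (the label also appears in the describe output):

```shell
# Describe every coredns pod without typing each generated pod name:
for p in $(kubectl -n kube-system get pods -l k8s-app=kube-dns -o name); do
  kubectl -n kube-system describe "$p"
done
```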

  • hhness
    hhness Posts: 18

    @chrispokorni
    The first one:
    "
    student@master:~$ kubectl -n kube-system describe pod coredns-74ff55c5b-mfc27
    Name:                 coredns-74ff55c5b-mfc27
    Namespace:            kube-system
    Priority:             2000000000
    Priority Class Name:  system-cluster-critical
    Node:                 master/10.2.0.2
    Start Time:           Thu, 15 Apr 2021 15:06:17 +0000
    Labels:               k8s-app=kube-dns
                          pod-template-hash=74ff55c5b
    Annotations:          cni.projectcalico.org/podIP: 192.168.219.91/32
                          cni.projectcalico.org/podIPs: 192.168.219.91/32
    Status:               Running
    IP:                   192.168.219.91
    IPs:
      IP:  192.168.219.91
    Controlled By:  ReplicaSet/coredns-74ff55c5b
    Containers:
      coredns:
        Container ID:   docker://87a2c1857f012f0af0c973b7fd50fd676296609ec80c7b5860e7f31404296ff1
        Image:          k8s.gcr.io/coredns:1.7.0
        Image ID:       docker-pullable://k8s.gcr.io/coredns@sha256:73ca82b4ce829766d4f1f10947c3a338888f876fbed0540dc849c89ff256e90c
        Ports:          53/UDP, 53/TCP, 9153/TCP
        Host Ports:     0/UDP, 0/TCP, 0/TCP
        Args:
          -conf
          /etc/coredns/Corefile
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Thu, 22 Apr 2021 08:38:35 +0000
          Finished:     Thu, 22 Apr 2021 08:38:35 +0000
        Ready:          False
        Restart Count:  1899
        Limits:
          memory:  170Mi
        Requests:
          cpu:     100m
          memory:  70Mi
        Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
        Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
        Environment:
        Mounts:
          /etc/coredns from config-volume (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-f5js7 (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      config-volume:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      coredns
        Optional:  false
      coredns-token-f5js7:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  coredns-token-f5js7
        Optional:    false
    QoS Class:       Burstable
    Node-Selectors:  kubernetes.io/os=linux
    Tolerations:     CriticalAddonsOnly op=Exists
                     node-role.kubernetes.io/control-plane:NoSchedule
                     node-role.kubernetes.io/master:NoSchedule
                     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                     node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason   Age                        From     Message
      ----     ------   ----                       ----     -------
      Warning  BackOff  4m35s (x45850 over 6d17h)  kubelet  Back-off restarting failed container
    "

    and the second one:

    "
    student@master:~$ kubectl -n kube-system describe pod coredns-74ff55c5b-r2d2p
    Name:                 coredns-74ff55c5b-r2d2p
    Namespace:            kube-system
    Priority:             2000000000
    Priority Class Name:  system-cluster-critical
    Node:                 master/10.2.0.2
    Start Time:           Thu, 15 Apr 2021 15:06:17 +0000
    Labels:               k8s-app=kube-dns
                          pod-template-hash=74ff55c5b
    Annotations:          cni.projectcalico.org/podIP: 192.168.219.92/32
                          cni.projectcalico.org/podIPs: 192.168.219.92/32
    Status:               Running
    IP:                   192.168.219.92
    IPs:
      IP:  192.168.219.92
    Controlled By:  ReplicaSet/coredns-74ff55c5b
    Containers:
      coredns:
        Container ID:   docker://3ac0076f52cb966985ac19d795a89b278adfd2fd5ac366abc1c5ea87d0deb76a
        Image:          k8s.gcr.io/coredns:1.7.0
        Image ID:       docker-pullable://k8s.gcr.io/coredns@sha256:73ca82b4ce829766d4f1f10947c3a338888f876fbed0540dc849c89ff256e90c
        Ports:          53/UDP, 53/TCP, 9153/TCP
        Host Ports:     0/UDP, 0/TCP, 0/TCP
        Args:
          -conf
          /etc/coredns/Corefile
        State:          Waiting
          Reason:       CrashLoopBackOff
        Last State:     Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Thu, 22 Apr 2021 08:38:14 +0000
          Finished:     Thu, 22 Apr 2021 08:38:14 +0000
        Ready:          False
        Restart Count:  1898
        Limits:
          memory:  170Mi
        Requests:
          cpu:     100m
          memory:  70Mi
        Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
        Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
        Environment:
        Mounts:
          /etc/coredns from config-volume (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-f5js7 (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      config-volume:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      coredns
        Optional:  false
      coredns-token-f5js7:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  coredns-token-f5js7
        Optional:    false
    QoS Class:       Burstable
    Node-Selectors:  kubernetes.io/os=linux
    Tolerations:     CriticalAddonsOnly op=Exists
                     node-role.kubernetes.io/control-plane:NoSchedule
                     node-role.kubernetes.io/master:NoSchedule
                     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                     node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason   Age                     From     Message
      ----     ------   ----                    ----     -------
      Warning  BackOff  47s (x45886 over 6d17h)  kubelet  Back-off restarting failed container
    "

  • hhness
    hhness Posts: 18

    Seems like the problem is CrashLoopBackOff?
    Whatever that means...
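CrashLoopBackOff means the container starts, exits with an error, and the kubelet restarts it with an increasing back-off delay (hence the large restart counts). The reason for the exit is usually visible in the log of the previous, crashed run. A sketch, using the pod names from the earlier output:

```shell
# Print the output of the last terminated run of each coredns container;
# --previous selects the crashed instance rather than the current one:
kubectl -n kube-system logs --previous coredns-74ff55c5b-mfc27
kubectl -n kube-system logs --previous coredns-74ff55c5b-r2d2p
```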

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @hhness,

    During lab exercise 9.3 when you worked with coredns, did you encounter any issues?

    I would recommend revisiting it and ensuring that all the edits to the coredns configmap were properly saved, then deleting the coredns pods again in order to force the controller to restart them.

    Regards,
    -Chris
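As commands, the suggested steps might look like this (a sketch; coredns is the configmap name kubeadm creates, and deleting the pods is safe because the deployment controller recreates them):

```shell
# 1. Verify that the Corefile edits from exercise 9.3 were actually saved:
kubectl -n kube-system get configmap coredns -o yaml

# 2. Re-edit the configmap if the edits are missing or malformed:
kubectl -n kube-system edit configmap coredns

# 3. Delete the coredns pods; the deployment immediately recreates
#    them with the current configmap:
kubectl -n kube-system delete pods -l k8s-app=kube-dns
```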

  • hhness
    hhness Posts: 18

    Hi @chrispokorni

    Yes i posted the problem under "Exercise 9.3 pod named nettool instead of ubuntu".

    This caused me to be unable to complete one of the tasks involving dig on the nettool container.

  • serewicz
    serewicz Posts: 1,000

    Hello,

    What are you using to run your lab VMs? It may be that your environment is having network issues, which would cause problems for the cluster pods.

    Regards,

  • hhness
    hhness Posts: 18

    @serewicz
    Hi,
    Sorry for the late reply.

    I am using Google Cloud; I followed the setup instructions from the start of the course.
    Any recommendations on how to check for errors?

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @hhness,

    How are your VPC and firewall rule(s) configured? Is all traffic allowed from all sources, to all ports, all protocols? Are there any firewalls active on your nodes?

    Regards,
    -Chris

  • hhness
    hhness Posts: 18

    @chrispokorni
    As I said in my last post, I set up the VPC according to the recommendations and followed the video as best I could.
    I am no expert here, but it seems to me that there are no restrictive rules applied.
    Attached you see the firewall rules of the subnet from Google Cloud.

    As for the nodes, I ran ufw status and got Status: inactive on both the master and worker node.
    "
    student@worker:~$ sudo ufw status
    Status: inactive
    "

  • mikerossiter

    I found GCP to be very confusing (it wouldn't even create an instance, saying there weren't enough resources in my region; in the whole of the London data center?) and it has already changed a bit since the install video was made. AWS seemed to work, though I found the best way was to run two VMs with static IPs in VirtualBox.

  • hhness
    hhness Posts: 18

    @mikerossiter
    Yes, I experienced the same, having to try different regions before I got the VM up and running.
    I have managed all other tasks until this one, so I suspect it is related to the setup of the containers and not GCP.
    The VMs seem to work as intended.

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @hhness,

    What VPC is configured for your VM instances?

    Regards,
    -Chris

  • hhness
    hhness Posts: 18

    @chrispokorni
    More or less the exact same as in the tutorial, except for the region.
    I have increased the maximum transmission unit (MTU) to 1500 instead of 1460, for some reason.
