LAB 3.4 - Failed to pull image when Master and Worker are ready
The deployment never becomes ready when both the master and the worker are Ready; if only the master is Ready, the error does not appear.
USE CASE OK:
u2004@k8sm0:~$ date
Wed 23 Dec 2020 06:09:17 PM UTC
u2004@k8sm0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8sm0 Ready control-plane,master 6h22m v1.20.1
k8sw0 NotReady <none> 4h58m v1.20.1
u2004@k8sm0:~$ kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
u2004@k8sm0:~$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 1/1 1 1 11s <------------------------------------------------- READY 1/1
u2004@k8sm0:~$ kubectl get events --sort-by='.lastTimestamp'
36s Normal ScalingReplicaSet deployment/nginx Scaled up replica set nginx-6799fc88d8 to 1
36s Normal SuccessfulCreate replicaset/nginx-6799fc88d8 Created pod: nginx-6799fc88d8-vr2mr
35s Normal Pulling pod/nginx-6799fc88d8-vr2mr Pulling image "nginx"
32s Normal Pulled pod/nginx-6799fc88d8-vr2mr Successfully pulled image "nginx" in 2.797267818s
32s Normal Created pod/nginx-6799fc88d8-vr2mr Created container nginx
32s Normal Started pod/nginx-6799fc88d8-vr2mr Started container nginx
u2004@k8sm0:~$ kubectl delete deployment nginx
deployment.apps "nginx" deleted
u2004@k8sm0:~$ kubectl get deployment
No resources found in default namespace.
USE CASE NOK:
u2004@k8sm0:~$ date
Wed 23 Dec 2020 06:13:49 PM UTC
u2004@k8sm0:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8sm0 Ready control-plane,master 6h26m v1.20.1
k8sw0 Ready <none> 5h2m v1.20.1
u2004@k8sm0:~$ kubectl get deployment
No resources found in default namespace.
u2004@k8sm0:~$ kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
u2004@k8sm0:~$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 0/1 1 0 16s
u2004@k8sm0:~$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 0/1 1 0 41s
u2004@k8sm0:~$ kubectl get events --sort-by='.lastTimestamp'
2m23s Normal NodeAllocatableEnforced node/k8sw0 Updated Node Allocatable limit across pods
2m23s Normal NodeHasSufficientPID node/k8sw0 Node k8sw0 status is now: NodeHasSufficientPID
2m23s Normal NodeHasNoDiskPressure node/k8sw0 Node k8sw0 status is now: NodeHasNoDiskPressure
2m23s Normal NodeHasSufficientMemory node/k8sw0 Node k8sw0 status is now: NodeHasSufficientMemory
2m23s Normal Starting node/k8sw0 Starting kubelet.
2m23s Warning Rebooted node/k8sw0 Node k8sw0 has been rebooted, boot id: 3148704a-d187-451c-b603-43b3a30be807
2m23s Normal NodeReady node/k8sw0 Node k8sw0 status is now: NodeReady
2m12s Normal Starting node/k8sw0 Starting kube-proxy.
117s Normal SuccessfulCreate replicaset/nginx-6799fc88d8 Created pod: nginx-6799fc88d8-z8hpq
117s Normal ScalingReplicaSet deployment/nginx Scaled up replica set nginx-6799fc88d8 to 1
116s Normal Pulling pod/nginx-6799fc88d8-z8hpq Pulling image "nginx"
0s Warning Failed pod/nginx-6799fc88d8-z8hpq Failed to pull image "nginx": rpc error: code = Unknown desc = dial tcp: lookup registry-1.docker.io: Temporary failure in name resolution
0s Warning Failed pod/nginx-6799fc88d8-z8hpq Error: ErrImagePull
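In a long event listing like the one above, the two failing lines are easy to miss among the node events. A minimal sketch of a filter that keeps only the Warning rows, assuming the default column order of `kubectl get events` (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE); the function name `warning_events` is illustrative, not part of kubectl:

```shell
# Print only Warning events from `kubectl get events` output read on stdin.
# Assumes the event type is the second whitespace-separated column.
warning_events() {
    awk '$2 == "Warning"'
}

# On a live cluster (not run here):
#   kubectl get events --sort-by='.lastTimestamp' | warning_events
```

kubectl can also do this filtering itself with `--field-selector type=Warning`; the awk version is only convenient when the output has already been saved to a file.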
Comments
-
Name resolution fails when both nodes are up and running:
u2004@k8sm0:/tmp$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8sm0 Ready control-plane,master 6h47m v1.20.1
k8sw0 Ready <none> 5h23m v1.20.1
u2004@k8sm0:/tmp$ ping google.es
ping: google.es: Temporary failure in name resolution
u2004@k8sm0:/tmp$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8sm0 Ready control-plane,master 6h51m v1.20.1
k8sw0 NotReady <none> 5h26m v1.20.1
u2004@k8sm0:/tmp$ ping google.es
PING google.es (172.217.17.3) 56(84) bytes of data.
64 bytes from mad07s09-in-f3.1e100.net (172.217.17.3): icmp_seq=1 ttl=128 time=15.9 ms
64 bytes from mad07s09-in-f3.1e100.net (172.217.17.3): icmp_seq=2 ttl=128 time=21.0 ms
64 bytes from mad07s09-in-f3.1e100.net (172.217.17.3): icmp_seq=3 ttl=128 time=18.0 ms
64 bytes from mad07s09-in-f3.1e100.net (172.217.17.3): icmp_seq=4 ttl=128 time=19.8 ms
64 bytes from mad07s09-in-f3.1e100.net (172.217.17.3): icmp_seq=5 ttl=128 time=20.4 ms
^C
--- google.es ping statistics ---
0 -
My Lab:
- VMware
- Ubuntu 20.04 LTS
- K8S 1.20.1
The latest version of every component, as the Kubernetes documentation recommends.
This is a strange case: is there any dependency between node status and the network?
I believe this case is not related to the versions.
Regards
0 -
To rule out the versions, I have redeployed my lab following the current course documentation.
The issue still reproduces, both on my laptop and in another environment.
###### VERSIONS #######
student@master:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
student@master:~$ apt-show-versions | grep -i kube
cri-tools:amd64/kubernetes-xenial 1.13.0-01 uptodate
kubeadm:amd64/kubernetes-xenial 1.18.1-00 upgradeable to 1.20.1-00
kubectl:amd64/kubernetes-xenial 1.18.1-00 upgradeable to 1.20.1-00
kubelet:amd64/kubernetes-xenial 1.18.1-00 upgradeable to 1.20.1-00
kubernetes-cni:amd64/kubernetes-xenial 0.8.7-00 uptodate
###### NOK USE CASE #####
student@master:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 19m v1.18.1
worker Ready <none> 4m11s v1.18.1
student@master:~$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 0/1 1 0 2m2s
student@master:~$ kubectl get events --sort-by='.lastTimestamp'
13m Normal NodeReady node/worker Node worker status is now: NodeReady
6m9s Normal SuccessfulCreate replicaset/nginx-6799fc88d8 Created pod: nginx-6799fc88d8-9xd8t
6m9s Normal ScalingReplicaSet deployment/nginx Scaled up replica set nginx-6799fc88d8 to 1
6m9s Normal Scheduled pod/nginx-6799fc88d8-9xd8t Successfully assigned default/nginx-6799fc88d8-9xd8t to worker
4m4s Normal Pulling pod/nginx-6799fc88d8-9xd8t Pulling image "nginx"
3m49s Warning Failed pod/nginx-6799fc88d8-9xd8t Error: ErrImagePull
3m49s Warning Failed pod/nginx-6799fc88d8-9xd8t Failed to pull image "nginx": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
3m22s Warning Failed pod/nginx-6799fc88d8-9xd8t Error: ImagePullBackOff
3m22s Normal BackOff pod/nginx-6799fc88d8-9xd8t Back-off pulling image "nginx"
###### OK USE CASE #####
student@master:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 33m v1.18.1
worker NotReady <none> 17m v1.18.1
student@master:~$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 1/1 1 1 20s
student@master:~$ kubectl get events --sort-by='.lastTimestamp'
5m41s Normal NodeNotReady node/worker Node worker status is now: NodeNotReady
89s Normal ScalingReplicaSet deployment/nginx Scaled up replica set nginx-6799fc88d8 to 1
89s Normal SuccessfulCreate replicaset/nginx-6799fc88d8 Created pod: nginx-6799fc88d8-wn9m4
89s Normal Scheduled pod/nginx-6799fc88d8-wn9m4 Successfully assigned default/nginx-6799fc88d8-wn9m4 to master
88s Normal Pulling pod/nginx-6799fc88d8-wn9m4 Pulling image "nginx"
74s Normal Started pod/nginx-6799fc88d8-wn9m4 Started container nginx
74s Normal Created pod/nginx-6799fc88d8-wn9m4 Created container nginx
74s Normal Pulled pod/nginx-6799fc88d8-wn9m4 Successfully pulled image "nginx"
###### OTHER TESTS ######
When both nodes in the cluster are Ready, all name resolution on the master is affected:
student@master:~$ sudo apt install apt-show-versions
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
libapt-pkg-perl
The following NEW packages will be installed:
apt-show-versions libapt-pkg-perl
0 upgraded, 2 newly installed, 0 to remove and 5 not upgraded.
Need to get 96.6 kB of archives.
After this operation, 312 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Err:1 http://es.archive.ubuntu.com/ubuntu bionic/main amd64 libapt-pkg-perl amd64 0.1.33build1
Temporary failure resolving 'es.archive.ubuntu.com'
0% [Working]^C
student@master:~$ ping google.es
ping: google.es: Temporary failure in name resolution
When only the master is Ready, name resolution works fine:
student@master:~$ sudo apt install apt-show-versions
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
libapt-pkg-perl
The following NEW packages will be installed:
apt-show-versions libapt-pkg-perl
0 upgraded, 2 newly installed, 0 to remove and 5 not upgraded.
Need to get 96.6 kB of archives.
After this operation, 312 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://es.archive.ubuntu.com/ubuntu bionic/main amd64 libapt-pkg-perl amd64 0.1.33build1 [68.0 kB]
Get:2 http://es.archive.ubuntu.com/ubuntu bionic/universe amd64 apt-show-versions all 0.22.7ubuntu1 [28.6 kB]
Fetched 96.6 kB in 16s (6,140 B/s)
Selecting previously unselected package libapt-pkg-perl.
(Reading database ... 67530 files and directories currently installed.)
Preparing to unpack .../libapt-pkg-perl_0.1.33build1_amd64.deb ...
Unpacking libapt-pkg-perl (0.1.33build1) ...
Selecting previously unselected package apt-show-versions.
Preparing to unpack .../apt-show-versions_0.22.7ubuntu1_all.deb ...
Unpacking apt-show-versions (0.22.7ubuntu1) ...
Setting up libapt-pkg-perl (0.1.33build1) ...
Setting up apt-show-versions (0.22.7ubuntu1) ...
** initializing cache. This may take a while **
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
student@master:~$ ping google.es
PING google.es (216.58.209.67) 56(84) bytes of data.
64 bytes from mad07s22-in-f3.1e100.net (216.58.209.67): icmp_seq=1 ttl=128 time=38.2 ms
64 bytes from mad07s22-in-f3.1e100.net (216.58.209.67): icmp_seq=2 ttl=128 time=20.6 ms
64 bytes from mad07s22-in-f3.1e100.net (216.58.209.67): icmp_seq=3 ttl=128 time=42.9 ms
64 bytes from mad07s22-in-f3.1e100.net (216.58.209.67): icmp_seq=4 ttl=128 time=17.5 ms
0 -
Check and compare how the routes and DNS settings change between the OK and NOK use cases.
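One way to make that comparison concrete is to save the `ip route` output in each state and print only the lines that are new in the failing case. This is a minimal sketch; the function name `new_lines` and the file names `routes-ok.txt` and `routes-nok.txt` are hypothetical, not taken from the course material:

```shell
# new_lines OLD NEW: print the lines of NEW that do not appear verbatim in OLD.
# Both arguments are plain-text files, for example saved `ip route` output.
new_lines() {
    grep -vxFf "$1" "$2"
}

# Capture on the node in each state (not run here):
#   ip route > routes-ok.txt     # worker NotReady, resolution works
#   ip route > routes-nok.txt    # worker Ready, resolution fails
#   new_lines routes-ok.txt routes-nok.txt
```

Per-pod routes will also show up as differences whenever pods were recreated between the two captures, so the interesting entries are the ones that point at a different device or gateway.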
0 -
There is no firewall.
The routes appear to be OK.
# Use case OK
student@master:~$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.10.2 0.0.0.0 UG 100 0 0 ens33
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.10.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
192.168.10.2 0.0.0.0 255.255.255.255 UH 100 0 0 ens33
192.168.219.64 0.0.0.0 255.255.255.192 U 0 0 0 *
192.168.219.76 0.0.0.0 255.255.255.255 UH 0 0 0 cali4f2dae3ae57
192.168.219.77 0.0.0.0 255.255.255.255 UH 0 0 0 cali3b44909318d
192.168.219.78 0.0.0.0 255.255.255.255 UH 0 0 0 calif48570d0d2e
student@master:~$ ip route
default via 192.168.10.2 dev ens33 proto dhcp src 192.168.10.133 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.10.0/24 dev ens33 proto kernel scope link src 192.168.10.133
192.168.10.2 dev ens33 proto dhcp scope link src 192.168.10.133 metric 100
192.168.219.76 dev cali4f2dae3ae57 scope link
192.168.219.77 dev cali3b44909318d scope link
192.168.219.78 dev calif48570d0d2e scope link
# Use case NOK
student@master:~$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.10.2 0.0.0.0 UG 100 0 0 ens33
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.10.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
192.168.10.2 192.168.10.134 255.255.255.255 UGH 0 0 0 tunl0
192.168.10.2 0.0.0.0 255.255.255.255 UH 100 0 0 ens33
192.168.171.64 192.168.10.134 255.255.255.192 UG 0 0 0 tunl0
192.168.219.64 0.0.0.0 255.255.255.192 U 0 0 0 *
192.168.219.76 0.0.0.0 255.255.255.255 UH 0 0 0 cali4f2dae3ae57
192.168.219.77 0.0.0.0 255.255.255.255 UH 0 0 0 cali3b44909318d
192.168.219.78 0.0.0.0 255.255.255.255 UH 0 0 0 calif48570d0d2e
student@master:~$ ip route
default via 192.168.10.2 dev ens33 proto dhcp src 192.168.10.133 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.10.0/24 dev ens33 proto kernel scope link src 192.168.10.133
192.168.10.2 via 192.168.10.134 dev tunl0 proto bird onlink
192.168.10.2 dev ens33 proto dhcp scope link src 192.168.10.133 metric 100
192.168.171.64/26 via 192.168.10.134 dev tunl0 proto bird onlink
blackhole 192.168.219.64/26 proto bird
192.168.219.73 dev cali3b44909318d scope link
192.168.219.74 dev cali4f2dae3ae57 scope link
192.168.219.75 dev calif48570d0d2e scope link
0 -
The problem is with DNS resolution.
When the second node is Ready, DNS resolution fails.
By default on Ubuntu 18.04 (standard installation from the official Server ISO), name resolution is managed by the systemd-resolved service.
#### Use Case OK
student@master:/run/systemd/resolve$ dig google.es
; <<>> DiG 9.11.3-1ubuntu1.13-Ubuntu <<>> google.es
;; global options: +cmd
;; connection timed out; no servers could be reached
student@master:/run/systemd/resolve$ dig google.es
; <<>> DiG 9.11.3-1ubuntu1.13-Ubuntu <<>> google.es
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27420
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;google.es. IN A
;; ANSWER SECTION:
google.es. 5 IN A 172.217.17.3
;; Query time: 24 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Sat Dec 26 12:30:10 UTC 2020
;; MSG SIZE rcvd: 54
student@master:/run/systemd/resolve$ netstat -anp | grep -i ":53"
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
udp 0 0 127.0.0.53:53 0.0.0.0:* -
#### Use Case NOK
student@master:~$ dig google.es
; <<>> DiG 9.11.3-1ubuntu1.13-Ubuntu <<>> google.es
;; global options: +cmd
;; connection timed out; no servers could be reached
student@master:~$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=128 time=62.9 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=128 time=23.0 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=128 time=24.0 ms
student@master:/run/systemd/resolve$ netstat -anp | grep -i ":53"
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
udp 0 0 127.0.0.53:53 0.0.0.0:* -
udp 0 0 192.168.219.64:35529 192.168.10.2:53 ESTABLISHED -
I would like to understand why Kubernetes, Calico, or some other component changes this behaviour.
As a workaround I could configure 8.8.8.8 as the DNS server, but I would prefer to understand this use case.
Regards
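One detail in the output of this thread may be worth checking. The netstat listing shows the resolver querying 192.168.10.2:53 from source address 192.168.219.64, and the NOK routing table contains a host route for 192.168.10.2 through tunl0 (metric 0) next to the usual one through ens33 (metric 100). The sketch below only looks for such a route in saved or live `ip route` output; it does not prove that this route is the cause. The function name `routes_via_dev` is illustrative.

```shell
# routes_via_dev ADDR DEV: read `ip route` output on stdin and print the
# routes whose destination is exactly ADDR and that leave through DEV.
routes_via_dev() {
    awk -v addr="$1" -v dev="$2" '
        $1 == addr {
            for (i = 2; i < NF; i++)
                if ($i == "dev" && $(i + 1) == dev) { print; next }
        }'
}

# On the node (not run here); the address and device come from the
# output earlier in this thread:
#   ip route | routes_via_dev 192.168.10.2 tunl0
```

If this prints a line while resolution is failing, `ip route get 192.168.10.2` would show which of the two routes the kernel actually selects for traffic to the DNS server.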
0 -
The issue could be related to the following:
Some Linux distributions (e.g. Ubuntu) use a local DNS resolver by default (systemd-resolved). Systemd-resolved moves and replaces /etc/resolv.conf with a stub file that can cause a fatal forwarding loop when resolving names in upstream servers. This can be fixed manually by using kubelet's --resolv-conf flag to point to the correct resolv.conf (With systemd-resolved, this is /run/systemd/resolve/resolv.conf). kubeadm automatically detects systemd-resolved, and adjusts the kubelet flags accordingly.
https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
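To see which situation a node is in, it can help to check whether `/etc/resolv.conf` points at the systemd-resolved stub (127.0.0.53, the address seen in the dig output earlier in this thread) and which upstream servers the non-stub file lists. A minimal sketch under those assumptions; the two function names are illustrative:

```shell
# uses_stub_resolver FILE: succeed if FILE (default /etc/resolv.conf)
# lists the systemd-resolved stub address 127.0.0.53 as a nameserver.
uses_stub_resolver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.0\.0\.53([[:space:]]|$)' "${1:-/etc/resolv.conf}"
}

# upstream_nameservers FILE: print the nameserver addresses in FILE
# (default: the non-stub file named in the documentation quoted above).
upstream_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/run/systemd/resolve/resolv.conf}"
}

# On the node (not run here):
#   uses_stub_resolver && echo "stub resolver in use"
#   upstream_nameservers
```

On a kubeadm-built node, the value the kubelet was given can usually be inspected with `grep resolvConf /var/lib/kubelet/config.yaml`; that path is the kubeadm default and is an assumption about this lab.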
0 -
There are no errors in CoreDNS.
I moved my network to 172.16.0.0/12.
In the end I solved the issue by reinstalling the platform completely, after first disabling systemd-resolved.