Lab 12.7: Timeout problems on the dashboard

Hi, I am having problems with Lab 12.7. I also had problems in other 12.x labs and fixed them, but this one I cannot solve. I am getting timeouts. I think it is a network problem, but what is the best way to fix it?
[email protected]:~/k8s$ kubectl -n kubernetes-dashboard logs kubernetes-dashboard-b65488c4-2cp6s
2020/02/05 05:23:28 Using namespace: kubernetes-dashboard
2020/02/05 05:23:28 Using in-cluster config to connect to apiserver
2020/02/05 05:23:28 Starting overwatch
2020/02/05 05:23:28 Using secret token for csrf signing
2020/02/05 05:23:28 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc00050f740)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:40 +0x3b4
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:65
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc000381b80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:487 +0xc7
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0xc000381b80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:455 +0x47
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:536
main.main()
/home/travis/build/kubernetes/dashboard/src/app/backend/dashboard.go:105 +0x212
Thanks
Comments
-
Hi @CharcoGreen,
From your output, it seems you are experiencing a timeout on port 443. Is it in use by another application, or is it blocked by a firewall of your OS or a firewall at the infrastructure level?
The first step would be to determine why your traffic is blocked, and after that, come up with an action plan to fix the issue.
Regards,
-Chris
-
Thanks for your help.
I'm renewing my cluster and my firewall rules.
-
I have the same issue. Running nodes on VMWare Fusion. The metrics-server pod logs show:
Error: Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 10.96.0.1:443: i/o timeout
If I utilise nodeSelector to force it to the master it works fine.
But, trying to run it on a worker I always get that error.
I have the extra args:
- args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  image: k8s.gcr.io/metrics-server-amd64:v0.3.6
From the worker node I can curl https://10.96.0.1:443/ just fine, and also from within a pod on the same node (I used the kube-proxy pod container to test from).
# curl -k https://10.96.0.1
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
  },
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {
  },
  "code": 403
}
No promiscuous mode on any of my interfaces:
netstat -i #can see no P
Kernel Interface table
Iface       MTU   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
calib8f1   1440    1109      0      0      0    1074      0      0      0 BMRU
docker0    1500       0      0      0      0       0      0      0      0 BMU
eth0       1500  199755      0      0      0   86684      0      0      0 BMRU
eth1       1500  263085      0      0      0  219869      0      0      0 BMRU
lo        65536  208914      0      0      0  208914      0      0      0 LRU
tunl0      1440   12756      0      0      0   12719      0      0      0 ORU
Here are my interfaces on the worker:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:9d:50:59 brd ff:ff:ff:ff:ff:ff
    inet 192.168.134.131/24 brd 192.168.134.255 scope global dynamic eth0
       valid_lft 1624sec preferred_lft 1624sec
    inet6 fe80::20c:29ff:fe9d:5059/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:9d:50:63 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.3/24 brd 192.168.10.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe9d:5063/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:cc:78:79:00 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
7: [email protected]: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 192.168.230.192/32 brd 192.168.230.192 scope global tunl0
       valid_lft forever preferred_lft forever
74: [email protected]: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
As I have multiple interfaces I've edited the calico daemonset to ensure the vmware private network interface is being used (btw does anyone know how to set this based on specific nodes? If I wanted eth0 on one node, but eth1 on another?):
containers:
- env:
  - name: IP_AUTODETECTION_METHOD
    value: interface=eth1
Running tshark on the worker when the metrics-server pod is starting I see:
# tshark -i any 'port 443'
Running as user "root" and group "root". This could be dangerous.
Capturing on 'any'
    1 0.000000000 192.168.230.250 → 10.96.0.1 TCP 76 55514 → 443 [SYN] Seq=0 Win=28000 Len=0 MSS=1400 SACK_PERM=1 TSval=2513108884 TSecr=0 WS=128
    2 1.018008286 192.168.230.250 → 10.96.0.1 TCP 76 [TCP Retransmission] 55514 → 443 [SYN] Seq=0 Win=28000 Len=0 MSS=1400 SACK_PERM=1 TSval=2513109902 TSecr=0 WS=128
    3 3.033443940 192.168.230.250 → 10.96.0.1 TCP 76 [TCP Retransmission] 55514 → 443 [SYN] Seq=0 Win=28000 Len=0 MSS=1400 SACK_PERM=1 TSval=2513111918 TSecr=0 WS=128
    4 7.192949413 192.168.230.250 → 10.96.0.1 TCP 76 [TCP Retransmission] 55514 → 443 [SYN] Seq=0 Win=28000 Len=0 MSS=1400 SACK_PERM=1 TSval=2513116077 TSecr=0 WS=128
    5 15.385327465 192.168.230.250 → 10.96.0.1 TCP 76 [TCP Retransmission] 55514 → 443 [SYN] Seq=0 Win=28000 Len=0 MSS=1400 SACK_PERM=1 TSval=2513124269 TSecr=0 WS=128
Which shows the traffic via tunl0 I suppose, but I'm now a little lost as to where to go from here.
I've looked through IPTables and can't see anything, and I wondered if maybe there was something with nftables in there but I checked and there are only iptables modules loaded - no nft.
Any more ideas? I feel pretty lost now.
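One way to read the tshark capture in this post: only SYN packets leave the pod, no SYN-ACK ever comes back, and the gaps between retransmissions roughly double each time, which is the usual TCP backoff for an unanswered connection attempt. A small Python sketch of that arithmetic follows. The timestamps are copied from the capture; the variable names are illustrative only.

```python
# Timestamps (seconds) of the five SYN packets in the tshark capture.
syn_times = [0.000000000, 1.018008286, 3.033443940, 7.192949413, 15.385327465]

# Gap between each SYN and the retransmission that follows it.
gaps = [later - earlier for earlier, later in zip(syn_times, syn_times[1:])]

# Ratio between consecutive gaps; values close to 2 suggest exponential backoff.
ratios = [b / a for a, b in zip(gaps, gaps[1:])]

for gap in gaps:
    print(f"gap: {gap:.3f}s")
print("ratios:", [round(r, 2) for r in ratios])
```

If the API server had answered at any point, the capture would show a SYN-ACK from 10.96.0.1 and the retransmissions would stop, so the pattern points at replies being lost or misrouted rather than at a slow server.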
-
I've actually got it to 'work' but I neither like nor understand the solution, which irks me.
I edited the metrics-server deployment and added hostNetwork: true. A new pod starts up on the worker and all is fine. I don't actually understand what this is doing, however, or why it works. I also then see nothing from tshark.
So, I wonder why this is working, and does it indicate where I can fix the problem properly?
-
Hi @dnx,
hostNetwork is a feature borrowed from container runtimes, where a container can share the host's network namespace and hence expose itself directly under the host's IP address. While convenient, it also raises security concerns, since the container now has access to the node's network stack, which the resource isolation a container is meant to provide would otherwise not allow.
A Kubernetes pod operates the same way when the hostNetwork attribute is set to true. The pod is exposed directly under the node's IP address, sharing the host's network namespace. Easy to implement and use, yet not the most secure. In this case, the pod no longer receives its IP address from the CNI network plugin (calico), as it is exposed directly via the node's IP address, thus eliminating a level of traffic routing and network abstraction. Although it seems an easy fix, it is not how things were intended to work in Kubernetes. If a pod does not operate as expected over the pod network implemented by the CNI network plugin, there may be several issues with your setup. Several aspects could play a role in why your pod does not behave as expected: the infrastructure networking overall, (in)compatibility between your infra and the CNI plugin, or just a missed configuration option specific to the mix of technologies in your setup.
Part of being a Kubernetes admin is to figure out compatibilities and incompatibilities between your infrastructure and cluster components, and to discover the specifics of certain configuration options in order to overcome such issues (where such options are available). Unfortunately, Kubernetes does not fix misconfigured networks, infrastructure, or incompatibilities for us.
Regards,
-Chris
-
Thanks for the explanation of hostNetwork @chrispokorni . Given all the things that I've checked and listed above do you have any tips or ideas as to where to check next? I've spent a whole day so far on this and feel like I've run to the end of my abilities thus far.
-
I went back to basics and checked my cluster init. After destroying the cluster and recreating with some changes it now works fine.
The two things I changed:
- added --apiserver-advertise-address to kubeadm, set to the IP of eth1 (VMware private network)
- changed calico.yaml and --pod-network-cidr to 172.16.0.0/16, as I was using 192.168 ranges for eth0 and eth1
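The second change suggests that the underlying problem was an address overlap between the pod network and the node networks. A minimal Python sketch of that check follows, using the eth0 and eth1 subnets from the interface listing earlier in the thread. The old pod CIDR of 192.168.0.0/16 is an assumption: it is Calico's usual default and is consistent with the 192.168.230.x addresses seen on tunl0 and in the tshark capture, but the thread never states it. The helper name `overlapping` is hypothetical.

```python
import ipaddress

# Node subnets taken from the worker's interface listing in this thread.
node_networks = {
    "eth0": ipaddress.ip_network("192.168.134.0/24"),
    "eth1": ipaddress.ip_network("192.168.10.0/24"),
}

# ASSUMPTION: the original pod CIDR is not stated in the thread.
# 192.168.0.0/16 is assumed here because it matches the pod addresses shown.
old_pod_cidr = ipaddress.ip_network("192.168.0.0/16")

# The replacement pod CIDR reported in the post.
new_pod_cidr = ipaddress.ip_network("172.16.0.0/16")


def overlapping(pod_cidr, networks):
    """Return the names of the node networks that overlap the pod CIDR."""
    return [name for name, net in networks.items() if pod_cidr.overlaps(net)]


print("old pod CIDR overlaps:", overlapping(old_pod_cidr, node_networks))
print("new pod CIDR overlaps:", overlapping(new_pod_cidr, node_networks))
```

Under that assumption, both node interfaces fall inside the old pod range and neither falls inside the new one, which would explain why routing between pods on the worker and the API server became ambiguous. Running a check like this against the planned --pod-network-cidr before kubeadm init is a cheap way to catch the problem early.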
-
I am glad it all works now.
I was going to suggest exploring the networking section of your hypervisor's documentation, cross-referenced with the calico network plugin documentation, to find the missing link. It seems that you found it in the meantime. Great work!
Regards,
-Chris