Lab 12.3 Cannot make metrics-server work.
I followed step 1 to download the package. Step 2 did not work because the package has changed, so I installed metrics-server with the command I found on the project's website:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
When I typed "student@master:~$ kubectl top node", I got "error: metrics not available yet".
I checked the pod and found "metrics-server-5f956b6d5f-2lbzv 1/1 Running 0 3m58s".
I deleted the metrics-server deployment, svc, and sa, and ran:
sudo kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
to install it again. I got the same error.
What should I do to fix this problem?
Thanks,
Wei
Comments
-
Hi Wei,
Typically, the metrics-server needs a good few minutes to start collecting metrics from your cluster. If you issue the kubectl top command too soon, you will see the "metrics not available yet" message.
Regards,
-Chris
0 -
Hi Chris,
Thank you very much for your response.
But it still did not work.
Wei
0 -
Hello,
I have just run the lab and it worked. Here is what I did:
git clone https://github.com/kubernetes-incubator/metrics-server.git
less README.md
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
kubectl -n kube-system edit deployments.apps metrics-server
....
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls #<<---------------Added this insecure TLS line
image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
....
sleep 120 ; kubectl -n kube-system top pod
NAME CPU(cores) MEMORY(bytes)
calico-kube-controllers-578894d4cd-48hrz 1m 6Mi
calico-node-cx7gb 25m 25Mi
calico-node-hxfwn 30m 24Mi
coredns-66bff467f8-rqhvr 3m 6Mi
coredns-66bff467f8-sbnn4 3m 6Mi
etcd-master 19m 42Mi
....
It works.
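The fixed "sleep 120" above is a guess at how long metrics-server needs before it has data. A minimal sketch of a polling alternative, assuming a POSIX shell; retry_until_ok is a hypothetical helper name, and the attempt count and delay in the example are illustrative values, not part of the lab:

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (retry_until_ok is a hypothetical helper name, not a kubectl feature.)
retry_until_ok() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the command exits with status 0.
    if "$@"; then
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (needs a live cluster): poll every 15 seconds, up to 20 times.
# retry_until_ok 20 15 kubectl -n kube-system top pod
```

The function returns as soon as the command exits successfully, so a cluster that is ready early is not kept waiting for the full two minutes.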
Regards,
0 -
It works. Thank you very much for your help.
0 -
Great! I've added it to the update and it should be part of the 1.19 release.
0 -
The metrics-server only monitors the worker node and its pods, not the master node and its pods.
student@master:~$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
worker 88m 4% 839Mi 11%
master
0 -
Hello,
Other than waiting for a bit to see if it just needs to update the other node, there are a few things that could lead to this. What are you using for your lab environment (platform, OS, K8s version)?
Please show the output of kubectl get node.
Is the master node still tainted?
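One way to tell whether a node is still waiting for its first metrics is to filter the output of kubectl top node for rows that carry no CPU reading. A minimal sketch, assuming that kubectl prints either nothing or a placeholder such as <unknown> in the CPU column for such a node; nodes_without_metrics is a hypothetical helper name:

```shell
# Print the names of nodes whose CPU(cores) column holds no reading.
# Reads "kubectl top node" output on stdin.
# (nodes_without_metrics is a hypothetical helper name.)
nodes_without_metrics() {
  # Skip the header row; a normal reading looks like "88m".
  awk 'NR > 1 && $2 !~ /^[0-9]+m$/ { print $1 }'
}

# Example (needs a live cluster):
# kubectl top node | nodes_without_metrics
```

An empty result suggests every node is reporting; a node name that keeps appearing after several minutes points at something other than startup delay.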
Regards,
0 -
I'm unclear on the "allowed" websites during the exam. "github.com/kubernetes-sigs" doesn't appear to be listed here: https://www.cncf.io/certification/cka/faq/ but seems to be needed to install metrics-server.
"in order to access assets at https://kubernetes.io/docs/ and its subdomains, https://github.com/kubernetes/ and its subdomains, or https://kubernetes.io/blog/ . No other tabs may be opened and no other sites may be navigated to."
@serewicz can you clarify please?
0 -
Hello,
There is only so much I can say about an exam due to the confidentiality agreement we all accept in order to take the exam. I can't talk directly about what may or may not be on the exam, other than suggest the Candidate Handbook https://training.linuxfoundation.org/go/cka-ckad-candidate-handbook or the general certification email address.
The certification team are top notch. I'm sure that they know what is available and not as they design the questions.
Regards,
0 -
Please email certificationsupport@linuxfoundation.org for any exam-related question if the Candidate Handbook does not provide the answers you are looking for.
0 -
Regarding the metrics-server only monitoring the worker node and its pods: I restarted both VMs, and now it only monitors the master node and its pods, not the worker node and its pods.
I am using two GCE VMs with the ubuntu-1804-bionic-v20200701 image.
Other info:
student@master:~$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
master 198m 9% 1225Mi 16%
worker
student@master:~$ kubectl top pod --all-namespaces
NAMESPACE NAME CPU(cores) MEMORY(bytes)
kube-system calico-kube-controllers-578894d4cd-pq67q 2m 8Mi
kube-system calico-node-hxrhx 23m 64Mi
kube-system coredns-66bff467f8-2mrnn 3m 6Mi
kube-system coredns-66bff467f8-m7ls6 3m 6Mi
kube-system etcd-master 19m 47Mi
kube-system kube-apiserver-master 37m 286Mi
kube-system kube-controller-manager-master 12m 38Mi
kube-system kube-proxy-mzj9b 1m 13Mi
kube-system kube-scheduler-master 4m 11Mi
kube-system metrics-server-6c7dddc9f8-7w8w2 1m 10Mi
student@master:~$
student@master:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:38:50Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:30:47Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
0 -
Hi @zhangwe,
Could you provide the output of the following command?
kubectl get ds,po -o wide --all-namespaces
Regards,
-Chris
0 -
student@master:~$ kubectl get node --show-labels
NAME STATUS ROLES AGE VERSION LABELS
master Ready master 16d v1.18.1 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master,kubernetes.io/os=linux,node-role.kubernetes.io/master=
worker Ready 16d v1.18.1 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker,kubernetes.io/os=linux
0 -
Hello,
It looks like the versions are okay. Could you verify you have an allow all traffic rule in the GCE VPC firewall?
As Chris mentioned, it would also be good to see the output of kubectl get ds,po -o wide --all-namespaces.
Regards,
0 -
student@master:~$ kubectl get ds,po -o wide --all-namespaces
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-system daemonset.apps/calico-node 2 2 2 2 2 kubernetes.io/os=linux 16d calico-node calico/node:v3.15.1 k8s-app=calico-node
kube-system daemonset.apps/kube-proxy 2 2 2 2 2 kubernetes.io/os=linux 16d kube-proxy k8s.gcr.io/kube-proxy:v1.18.1 k8s-app=kube-proxy
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default pod/hello-1597090080-8kpd5 0/1 Completed 0 2m17s 192.168.171.107 worker
default pod/hello-1597090140-7wvc6 0/1 Completed 0 76s 192.168.171.78 worker
default pod/hello-1597090200-w8bz8 0/1 Completed 0 16s 192.168.171.95 worker
default pod/mypod 0/1 ContainerCreating 0 41m worker
default pod/nginx 0/1 Pending 0 30m
default pod/nginx-pod 0/1 Completed 0 10m 192.168.171.79 worker
kube-system pod/calico-kube-controllers-578894d4cd-pq67q 1/1 Running 12 16d 192.168.219.97 master
kube-system pod/calico-node-gqrcc 1/1 Running 13 16d 10.142.0.10 worker
kube-system pod/calico-node-hxrhx 1/1 Running 12 16d 10.142.0.9 master
kube-system pod/coredns-66bff467f8-2mrnn 1/1 Running 12 16d 192.168.219.98 master
kube-system pod/coredns-66bff467f8-m7ls6 1/1 Running 12 16d 192.168.219.102 master
kube-system pod/etcd-master 1/1 Running 12 16d 10.142.0.9 master
kube-system pod/kube-apiserver-master 1/1 Running 14 16d 10.142.0.9 master
kube-system pod/kube-controller-manager-master 1/1 Running 12 16d 10.142.0.9 master
kube-system pod/kube-proxy-kd42s 1/1 Running 12 16d 10.142.0.10 worker
kube-system pod/kube-proxy-mzj9b 1/1 Running 12 16d 10.142.0.9 master
kube-system pod/kube-scheduler-master 1/1 Running 12 16d 10.142.0.9 master
kube-system pod/metrics-server-6c7dddc9f8-24jr7 1/1 Running 2 3d22h 192.168.171.126 worker
kube-system pod/metrics-server-6c7dddc9f8-7w8w2 1/1 Running 3 4d1h 192.168.219.101 master
namespace1 pod/pod1 0/1 Completed 0 2d23h worker
namespace2 pod/pod2 0/1 Completed 0 2d23h master
namespace3 pod/pod3 0/1 Completed 0 2d23h worker
namespace4 pod/pod4 0/1 Completed 0 2d23h worker
qos-example pod/qos-demo 1/1 Running 10 10d 192.168.171.112 worker
student@master:~$
0 -
Hello,
Thank you for the info. I notice that you have a fair number of restarts on all of the long-running containers, and a pending container. Are you using VMs with 2 CPUs and 7.5 GB of memory or larger?
Also, please verify you have an allow all traffic rule for your GCE VPC.
I just reproduced the lab and can see both VMs. I'm leaning towards something in your VPC firewall settings.
Regards,
0 -
Yes, I am using 2cpu/7.5GB VMs with 15GB of disk space.
I am using the same network policies and VPC firewall settings as before, and metrics-server worked on both nodes then. I used an instance template to create the VMs.
This time, when I created metrics-server, I forgot to add "- --kubelet-insecure-tls". Following your suggestion, I added it and restarted everything. I found that metrics-server worked on only one node.
0 -
Hello,
Do you mean that the settings you used before allow all traffic to all IPs? Perhaps your previous configuration wasn't quite correct.
We have looked at most of the differences, and I'm not seeing why your metrics server would not be reporting from both nodes; the VPC firewall remains the primary suspect. Obviously the image is fine, because it is reporting from the master. The difference is the network that connects the nodes. Check your firewall settings.
Perhaps during Friday's office hours you can share your screen and we can look at your VPC firewall settings. I just ran the ten commands and both nodes are reporting metrics.
Regards,
0 -
Hi @zhangwe,
According to step 4 of the metrics-server lab exercise, do you have both of the following lines in the metrics-server Deployment YAML manifest, and are they properly aligned?
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
Also, what is the output of:
kubectl -n kube-system get svc,ep
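The two flags from step 4 can also be appended without opening an editor, by passing a JSON Patch to kubectl patch. A minimal sketch under stated assumptions: metrics-server is the first container in the pod template, its args list already exists (as in the v0.3.7 excerpt earlier in this thread), and the flags contain no characters that need JSON escaping. append_args_patch is a hypothetical helper name:

```shell
# Build a JSON Patch that appends each given flag to the args list of the
# first container in a Deployment's pod template.
# (append_args_patch is a hypothetical helper name; it assumes the flags
# contain no characters that would need JSON escaping.)
append_args_patch() {
  sep=""
  printf '['
  for flag in "$@"; do
    # A path ending in "/-" appends to the end of the existing array.
    printf '%s{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"%s"}' "$sep" "$flag"
    sep=","
  done
  printf ']\n'
}

# Example (needs a live cluster):
# kubectl -n kube-system patch deployment metrics-server --type=json \
#   -p "$(append_args_patch --kubelet-insecure-tls \
#         --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname)"
```

Because the patch appends rather than replaces, running it twice would add the flags twice, so it is worth checking the current args first.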
Regards,
-Chris
0 -
Hi Chris,
Yes, I had missed the line "- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname" too. After I added it, everything works well now.
Thank you very much for your help.
Wei
0 -
Interesting. I did not add that preferred address line to my configuration, and it works.
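Since setups evidently differ on which flags they need, it can help to see which of the two a given deployment actually carries. A minimal sketch that reads the container args on stdin and reports each flag; check_metrics_flags is a hypothetical helper name, and the jsonpath query in the example assumes metrics-server is the first container in the pod template:

```shell
# Read container args on stdin and report which of the two flags discussed
# in this thread are present. Returns non-zero if either one is missing.
# (check_metrics_flags is a hypothetical helper name.)
check_metrics_flags() {
  args=$(cat)
  missing=0
  for flag in --kubelet-insecure-tls --kubelet-preferred-address-types; do
    # -F matches the flag literally; -e lets the pattern start with a dash.
    if printf '%s\n' "$args" | grep -qF -e "$flag"; then
      echo "present: $flag"
    else
      echo "MISSING: $flag"
      missing=1
    fi
  done
  return "$missing"
}

# Example (needs a live cluster):
# kubectl -n kube-system get deployment metrics-server \
#   -o jsonpath='{.spec.template.spec.containers[0].args}' | check_metrics_flags
```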