
Lab 12.3 Cannot make metrics-server work.

I followed step 1 to download the package. Step 2 did not work because the package has changed, so I installed metrics-server with the command from the project's web site:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
When I typed "student@master:~$ kubectl top node", I got "error: metrics not available yet".
I checked and found "metrics-server-5f956b6d5f-2lbzv 1/1 Running 0 3m58s".
I deleted the deployment, svc, and sa of "metrics-server", and ran:
sudo kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
again. I got the same error.
What should I do to fix this problem?

Thanks,

Wei

Comments

  • chrispokorni
    chrispokorni Posts: 2,349

    Hi Wei,

    Typically, the metrics-server needs a few minutes to start collecting metrics from your cluster. If you issue the kubectl top command too soon, you will see the "metrics not available yet" message.

    Regards,
    -Chris
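
The waiting advice above can be sketched as a small polling helper instead of re-running the command by hand. This is only an illustration, assuming a POSIX shell; the function name retry_until_ok, the attempt count, and the delay are hypothetical choices, not part of the lab.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (retry_until_ok is a hypothetical helper name, not a kubectl feature.)
retry_until_ok() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the command; stop as soon as it exits with status 0
    if "$@"; then
      return 0
    fi
    # Pause before the next attempt, but not after the final one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (needs a working cluster): poll for roughly five minutes
# retry_until_ok 20 15 kubectl top node
```

If the helper still returns non-zero after several minutes, waiting longer is unlikely to help and the metrics-server configuration is worth checking.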

  • zhangwe
    zhangwe Posts: 45

    Hi Chris,

    Thank you very much for your response.
    But it still did not work.

    Wei

  • serewicz
    serewicz Posts: 1,000

    Hello,

    I have just run the lab and it worked. Here is what I did:
    git clone https://github.com/kubernetes-incubator/metrics-server.git
    less metrics-server/README.md
    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
    kubectl -n kube-system edit deployments.apps metrics-server

    ....
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-insecure-tls   # <-- added this insecure TLS line
    image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
    ....
    sleep 120 ; kubectl -n kube-system top pod
    NAME                                       CPU(cores)   MEMORY(bytes)
    calico-kube-controllers-578894d4cd-48hrz   1m           6Mi
    calico-node-cx7gb                          25m          25Mi
    calico-node-hxfwn                          30m          24Mi
    coredns-66bff467f8-rqhvr                   3m           6Mi
    coredns-66bff467f8-sbnn4                   3m           6Mi
    etcd-master                                19m          42Mi
    ....

    It works.

    Regards,
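
As a possible alternative to editing the Deployment interactively, the same argument could be appended with kubectl patch and a JSON Patch. This is a sketch under assumptions, not the lab's method: it assumes metrics-server is the first container (index 0) in the pod template, and the function names build_arg_patch and add_metrics_server_arg are hypothetical.

```shell
# Build a JSON Patch that appends one argument to the args list of the
# first container. Assumption: metrics-server is container index 0.
build_arg_patch() {
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"%s"}]' "$1"
}

# Apply the patch to the metrics-server Deployment in kube-system.
add_metrics_server_arg() {
  kubectl -n kube-system patch deployment metrics-server \
    --type=json -p "$(build_arg_patch "$1")"
}

# Example (needs a working cluster):
# add_metrics_server_arg --kubelet-insecure-tls
```

Patching the Deployment triggers a new rollout, so the wait before kubectl top returns data applies here as well.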

  • zhangwe
    zhangwe Posts: 45

    It works. Thank you very much for your help.

  • serewicz
    serewicz Posts: 1,000

    Great! I've added it to the update and it should be part of the 1.19 release.

  • zhangwe
    zhangwe Posts: 45

    The metrics-server only monitors the worker node and the pods on the worker node, not the master node and the pods on the master node.
    student@master:~$ kubectl top node
    NAME     CPU(cores)   CPU%        MEMORY(bytes)   MEMORY%
    worker   88m          4%          839Mi           11%
    master   <unknown>    <unknown>   <unknown>       <unknown>

  • serewicz
    serewicz Posts: 1,000

    Hello,
    Other than waiting for a bit to see if it just needs to update the other node, there are a few things that could lead to this. What are you using for your lab environment, platform/OS/K8s version?

    Please show the output of kubectl get node

    Is the master node still tainted?

    Regards,
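
When checking whether every node is reporting, the output of kubectl top node can be filtered for nodes that have no values. The helper below is a rough sketch: the name nodes_without_metrics is hypothetical, and it assumes a node without metrics shows either an empty CPU column or the placeholder "<unknown>".

```shell
# Read `kubectl top node` output on stdin and print the names of nodes
# whose CPU(cores) column is empty or "<unknown>" (assumed placeholder).
nodes_without_metrics() {
  # NR > 1 skips the header row; NF > 0 skips blank lines
  awk 'NR > 1 && NF > 0 && ($2 == "" || $2 == "<unknown>") { print $1 }'
}

# Example (needs a working cluster):
# kubectl top node | nodes_without_metrics
```

An empty result means every listed node returned CPU figures; any name printed is a node the metrics-server could not scrape.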

  • I'm unclear on the "allowed" websites during the exam. "github.com/kubernetes-sigs" doesn't appear to be listed here: https://www.cncf.io/certification/cka/faq/ but it seems to be needed to install metrics-server.

    "in order to access assets at https://kubernetes.io/docs/ and its subdomains, https://github.com/kubernetes/ and its subdomains, or https://kubernetes.io/blog/ . No other tabs may be opened and no other sites may be navigated to."

    @serewicz can you clarify please?

  • serewicz
    serewicz Posts: 1,000

    Hello,

    There is only so much I can say about an exam due to the confidentiality agreement we all accept in order to take the exam. I can't talk directly about what may or may not be on the exam, other than suggest the Candidate Handbook https://training.linuxfoundation.org/go/cka-ckad-candidate-handbook or the general certification email address.

    The certification team are top notch. I'm sure that they know what is available and not as they design the questions.

    Regards,

  • fcioanca
    fcioanca Posts: 2,151

    Please email certificationsupport@linuxfoundation.org for any exam-related question if the Candidate Handbook does not provide the answers you are looking for.

  • zhangwe
    zhangwe Posts: 45

    Regarding the metrics-server only monitoring the worker node and its pods: I restarted both VMs, and now it only monitors the master node and the pods on the master node, not the worker node and the pods on the worker node.
    I am using two GCE VMs with the ubuntu-1804-bionic-v20200701 image.
    Other info:
    student@master:~$ kubectl top node
    NAME     CPU(cores)   CPU%        MEMORY(bytes)   MEMORY%
    master   198m         9%          1225Mi          16%
    worker   <unknown>    <unknown>   <unknown>       <unknown>

    student@master:~$ kubectl top pod --all-namespaces
    NAMESPACE     NAME                                       CPU(cores)   MEMORY(bytes)
    kube-system   calico-kube-controllers-578894d4cd-pq67q   2m           8Mi
    kube-system   calico-node-hxrhx                          23m          64Mi
    kube-system   coredns-66bff467f8-2mrnn                   3m           6Mi
    kube-system   coredns-66bff467f8-m7ls6                   3m           6Mi
    kube-system   etcd-master                                19m          47Mi
    kube-system   kube-apiserver-master                      37m          286Mi
    kube-system   kube-controller-manager-master             12m          38Mi
    kube-system   kube-proxy-mzj9b                           1m           13Mi
    kube-system   kube-scheduler-master                      4m           11Mi
    kube-system   metrics-server-6c7dddc9f8-7w8w2            1m           10Mi
    student@master:~$

    student@master:~$ kubectl version
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:38:50Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:30:47Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

  • chrispokorni
    chrispokorni Posts: 2,349

    Hi @zhangwe,

    Could you provide the output of kubectl get ds,po -o wide --all-namespaces ?

    Regards,
    -Chris

  • zhangwe
    zhangwe Posts: 45

    student@master:~$ kubectl get node --show-labels
    NAME     STATUS   ROLES    AGE   VERSION   LABELS
    master   Ready    master   16d   v1.18.1   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master,kubernetes.io/os=linux,node-role.kubernetes.io/master=
    worker   Ready    <none>   16d   v1.18.1   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker,kubernetes.io/os=linux

  • serewicz
    serewicz Posts: 1,000

    Hello,

    It looks like the versions are okay. Could you verify that you have an allow-all-traffic rule in the GCE VPC firewall?

    As Chris mentioned it would also be good to see the output of kubectl get ds,po -o wide --all-namespaces

    Regards,

  • zhangwe
    zhangwe Posts: 45

    student@master:~$ kubectl get ds,po -o wide --all-namespaces
    NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE   CONTAINERS    IMAGES                          SELECTOR
    kube-system   daemonset.apps/calico-node   2         2         2       2            2           kubernetes.io/os=linux   16d   calico-node   calico/node:v3.15.1             k8s-app=calico-node
    kube-system   daemonset.apps/kube-proxy    2         2         2       2            2           kubernetes.io/os=linux   16d   kube-proxy    k8s.gcr.io/kube-proxy:v1.18.1   k8s-app=kube-proxy

    NAMESPACE     NAME                                           READY   STATUS              RESTARTS   AGE     IP                NODE     NOMINATED NODE   READINESS GATES
    default       pod/hello-1597090080-8kpd5                     0/1     Completed           0          2m17s   192.168.171.107   worker   <none>           <none>
    default       pod/hello-1597090140-7wvc6                     0/1     Completed           0          76s     192.168.171.78    worker   <none>           <none>
    default       pod/hello-1597090200-w8bz8                     0/1     Completed           0          16s     192.168.171.95    worker   <none>           <none>
    default       pod/mypod                                      0/1     ContainerCreating   0          41m     <none>            worker   <none>           <none>
    default       pod/nginx                                      0/1     Pending             0          30m     <none>            <none>   <none>           <none>
    default       pod/nginx-pod                                  0/1     Completed           0          10m     192.168.171.79    worker   <none>           <none>
    kube-system   pod/calico-kube-controllers-578894d4cd-pq67q   1/1     Running             12         16d     192.168.219.97    master   <none>           <none>
    kube-system   pod/calico-node-gqrcc                          1/1     Running             13         16d     10.142.0.10       worker   <none>           <none>
    kube-system   pod/calico-node-hxrhx                          1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/coredns-66bff467f8-2mrnn                   1/1     Running             12         16d     192.168.219.98    master   <none>           <none>
    kube-system   pod/coredns-66bff467f8-m7ls6                   1/1     Running             12         16d     192.168.219.102   master   <none>           <none>
    kube-system   pod/etcd-master                                1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-apiserver-master                      1/1     Running             14         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-controller-manager-master             1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-proxy-kd42s                           1/1     Running             12         16d     10.142.0.10       worker   <none>           <none>
    kube-system   pod/kube-proxy-mzj9b                           1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-scheduler-master                      1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/metrics-server-6c7dddc9f8-24jr7            1/1     Running             2          3d22h   192.168.171.126   worker   <none>           <none>
    kube-system   pod/metrics-server-6c7dddc9f8-7w8w2            1/1     Running             3          4d1h    192.168.219.101   master   <none>           <none>
    namespace1    pod/pod1                                       0/1     Completed           0          2d23h   <none>            worker   <none>           <none>
    namespace2    pod/pod2                                       0/1     Completed           0          2d23h   <none>            master   <none>           <none>
    namespace3    pod/pod3                                       0/1     Completed           0          2d23h   <none>            worker   <none>           <none>
    namespace4    pod/pod4                                       0/1     Completed           0          2d23h   <none>            worker   <none>           <none>
    qos-example   pod/qos-demo                                   1/1     Running             10         10d     192.168.171.112   worker   <none>           <none>
    student@master:~$

  • serewicz
    serewicz Posts: 1,000

    Hello,

    Thank you for the info. I notice that you have a fair number of restarts on all of the long-running containers, and a pending pod. Are you using VMs with 2 CPUs and 7.5 GB of memory, or larger?

    Also, please verify that you have an allow-all-traffic rule for your GCE VPC.

    I just reproduced the lab and can see both VMs. I'm leaning towards something in your VPC firewall settings.

    Regards,

  • zhangwe
    zhangwe Posts: 45

    Yes, I am using 2cpu/7.5GB VMs with 15GB of disk space.

    I used the same network policies and VPC firewall settings as before, and before, metrics-server worked on both nodes. I used an instance template to create the VMs.
    This time, when I created metrics-server, I forgot to add "- --kubelet-insecure-tls". Following your suggestion, I added it and restarted everything. I found that metrics-server only worked on one node.

  • serewicz
    serewicz Posts: 1,000

    Hello,

    Do you mean that the settings you used before allow all traffic to all IPs? Perhaps your previous settings weren't quite correct.

    We have looked at many of the differences and I'm not seeing why your metrics server would not be reporting from both nodes, which leaves the VPC firewall as the primary suspect. Obviously the image is correct because it is reporting from the master. The difference is the network which connects the nodes. Check your firewall settings.

    Perhaps during Friday's office hours you can share your screen and we can look at your VPC firewall settings. I just ran the ten commands and both nodes are reporting metrics.

    Regards,

  • chrispokorni
    chrispokorni Posts: 2,349
    edited August 2020

    Hi @zhangwe,

    According to step 4 of the metrics-server lab exercise, do you have both of the following lines in the metrics-server Deployment YAML manifest, and are they properly aligned?

    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    

    Also, what is the output of:

    kubectl -n kube-system get svc,ep

    Regards,
    -Chris
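
A quick way to see whether either of the two lines is absent is to scan the live Deployment manifest for them. The helper below is a sketch with a hypothetical name (missing_kubelet_flags); it only looks for the flag text anywhere in the manifest and does not verify indentation or alignment.

```shell
# Read a Deployment manifest on stdin and print each of the two kubelet
# flags that does not appear in it. Prints nothing when both are present.
# Note: this is a plain substring check, not a YAML-aware validation.
missing_kubelet_flags() {
  manifest=$(cat)
  for flag in --kubelet-insecure-tls --kubelet-preferred-address-types; do
    case "$manifest" in
      *"$flag"*) ;;                     # flag text found in the manifest
      *) printf '%s\n' "$flag" ;;       # flag text not found
    esac
  done
}

# Example (needs a working cluster):
# kubectl -n kube-system get deployment metrics-server -o yaml | missing_kubelet_flags
```

Because the check is text-based, a flag that is present but indented incorrectly would still pass, so the manifest should also be reviewed by eye.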

  • zhangwe
    zhangwe Posts: 45

    Hi Chris,
    Yes, I missed the line "- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname" too. After I added it, everything works well now.

    Thank you very much for your help.

    Wei

  • serewicz
    serewicz Posts: 1,000

    Interesting. I did not add that preferred address line to my configuration, and it works.
