Lab 12.3: Cannot get metrics-server to work

I followed step 1 to download the package. Step 2 did not work because the package has changed, so I installed metrics-server with the command I found on the web site:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
When I ran "kubectl top node", I got "error: metrics not available yet".
I checked and found "metrics-server-5f956b6d5f-2lbzv 1/1 Running 0 3m58s".
I deleted the deployment, svc, and sa of "metrics-server", and applied the manifest again:
sudo kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
I got the same error.
What should I do to fix this problem?

Thanks,

Wei

Comments

  • chrispokorni Posts: 640

    Hi Wei,

    Typically, the metrics-server needs a good few minutes to start collecting metrics from your cluster. If you issue the kubectl top command too soon, you will see the "metrics not available yet" message.

    Regards,
    -Chris
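
A minimal sketch of the waiting step described above, assuming a POSIX shell. The helper name retry_until_ok and the attempt and delay values are illustrative and not part of the lab; the function just reruns a command until it exits successfully.

```shell
# Retry a command until it succeeds or the attempt budget runs out.
# Usage: retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (retry_until_ok is a hypothetical helper name, not a kubectl feature.)
retry_until_ok() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    # Only sleep when another attempt is still coming.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  echo "still failing after $max_attempts attempts" >&2
  return 1
}

# Example (needs a working cluster): poll every 15 seconds, up to 20 times.
# retry_until_ok 20 15 kubectl top node
```

Polling like this avoids guessing how long metrics-server needs before its first metrics are available.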

  • zhangwe Posts: 45

    Hi Chris,

    Thank you very much for your response.
    But it still did not work.

    Wei

  • serewicz Posts: 761

    Hello,

    I have just run the lab and it worked. Here is what I did:
    git clone https://github.com/kubernetes-incubator/metrics-server.git
    cd metrics-server ; less README.md
    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
    kubectl -n kube-system edit deployments.apps metrics-server

    ....
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-insecure-tls #<<---------------Added this insecure TLS line
    image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
    ....
    sleep 120 ; kubectl -n kube-system top pod
    NAME                                       CPU(cores)   MEMORY(bytes)
    calico-kube-controllers-578894d4cd-48hrz   1m           6Mi
    calico-node-cx7gb                          25m          25Mi
    calico-node-hxfwn                          30m          24Mi
    coredns-66bff467f8-rqhvr                   3m           6Mi
    coredns-66bff467f8-sbnn4                   3m           6Mi
    etcd-master                                19m          42Mi
    ....

    It works.

    Regards,
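
The interactive kubectl edit step above could also be done non-interactively. The following is a sketch only, assuming that metrics-server is the first container in the Deployment and that it already has an args list (as the edit view above suggests for v0.3.7). The function name add_metrics_server_arg is made up for illustration.

```shell
# Append one argument to the first container of the metrics-server Deployment
# using a JSON Patch "add" operation ("/-" appends to the end of the list).
# Assumptions: container index 0 is metrics-server and "args" already exists.
add_metrics_server_arg() {
  flag="$1"
  patch="[{\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/args/-\",\"value\":\"$flag\"}]"
  kubectl -n kube-system patch deployment metrics-server --type=json -p "$patch"
}

# Example (needs a working cluster):
# add_metrics_server_arg --kubelet-insecure-tls
```

Patching the Deployment triggers a new rollout, so the wait before running kubectl top still applies.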

  • zhangwe Posts: 45

    It works. Thank you very much for your help.

  • serewicz Posts: 761

    Great! I've added it to the update and it should be part of the 1.19 release.

  • zhangwe Posts: 45

    The metrics-server only monitors the worker node and the pods on the worker node, not the master node or the pods on the master node.
    [email protected]:~$ kubectl top node
    NAME     CPU(cores)   CPU%        MEMORY(bytes)   MEMORY%
    worker   88m          4%          839Mi           11%
    master   <unknown>    <unknown>   <unknown>       <unknown>
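
To spot this symptom quickly, the output of kubectl top node can be filtered for rows that carry no metric values. This is a rough sketch, assuming the usual five-column layout; nodes_without_metrics is a hypothetical helper name, and it treats both missing columns and "<unknown>" values as "no metrics".

```shell
# Print the names of nodes that have no usable metrics in `kubectl top node`
# output read from stdin: rows with "<unknown>" values or missing columns.
# The first line is assumed to be the header and is skipped.
nodes_without_metrics() {
  awk 'NR > 1 && NF > 0 && (NF < 5 || $2 == "<unknown>") { print $1 }'
}

# Example (needs a working cluster):
# kubectl top node | nodes_without_metrics
```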

  • serewicz Posts: 761

    Hello,
    Other than waiting for a bit to see if it just needs to update the other node, there are a few things that could lead to this. What are you using for your lab environment, platform/OS/K8s version?

    Please show the output of kubectl get node

    Is the master node still tainted?

    Regards,

  • I'm unclear on the "allowed" websites during the exam. "github.com/kubernetes-sigs" doesn't appear to be listed here: https://www.cncf.io/certification/cka/faq/ but seems to be needed to install metrics-server.

    "in order to access assets at https://kubernetes.io/docs/ and its subdomains, https://github.com/kubernetes/ and its subdomains, or https://kubernetes.io/blog/ . No other tabs may be opened and no other sites may be navigated to."

    @serewicz can you clarify please?

  • serewicz Posts: 761

    Hello,

    There is only so much I can say about an exam due to the confidentiality agreement we all accept in order to take the exam. I can't talk directly about what may or may not be on the exam, other than to suggest the Candidate Handbook https://training.linuxfoundation.org/go/cka-ckad-candidate-handbook or the general certification email address.

    The certification team are top notch. I'm sure that they know what is and is not available as they design the questions.

    Regards,

  • fcioanca Posts: 558

    Please email [email protected] for any exam-related question if the Candidate Handbook does not provide the answers you are looking for.

  • zhangwe Posts: 45

    Regarding the metrics-server only monitoring the worker node and its pods: I restarted both VMs, and now it only monitors the master node and the pods on the master node, not the worker node or the pods on the worker node.
    I am using two GCE VMs built from ubuntu-1804-bionic-v20200701.
    Other info:
    [email protected]:~$ kubectl top node
    NAME     CPU(cores)   CPU%        MEMORY(bytes)   MEMORY%
    master   198m         9%          1225Mi          16%
    worker   <unknown>    <unknown>   <unknown>       <unknown>

    [email protected]:~$ kubectl top pod --all-namespaces
    NAMESPACE     NAME                                       CPU(cores)   MEMORY(bytes)
    kube-system   calico-kube-controllers-578894d4cd-pq67q   2m           8Mi
    kube-system   calico-node-hxrhx                          23m          64Mi
    kube-system   coredns-66bff467f8-2mrnn                   3m           6Mi
    kube-system   coredns-66bff467f8-m7ls6                   3m           6Mi
    kube-system   etcd-master                                19m          47Mi
    kube-system   kube-apiserver-master                      37m          286Mi
    kube-system   kube-controller-manager-master             12m          38Mi
    kube-system   kube-proxy-mzj9b                           1m           13Mi
    kube-system   kube-scheduler-master                      4m           11Mi
    kube-system   metrics-server-6c7dddc9f8-7w8w2            1m           10Mi
    [email protected]:~$

    [email protected]:~$ kubectl version
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:38:50Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:30:47Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

  • chrispokorni Posts: 640

    Hi @zhangwe,

    Could you provide the output of kubectl get ds,po -o wide --all-namespaces ?

    Regards,
    -Chris

  • zhangwe Posts: 45

    [email protected]:~$ kubectl get node --show-labels
    NAME     STATUS   ROLES    AGE   VERSION   LABELS
    master   Ready    master   16d   v1.18.1   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master,kubernetes.io/os=linux,node-role.kubernetes.io/master=
    worker   Ready    <none>   16d   v1.18.1   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker,kubernetes.io/os=linux

  • serewicz Posts: 761

    Hello,

    It looks like the versions are okay. Could you verify you have an allow all traffic rule in the GCE VPC firewall?

    As Chris mentioned it would also be good to see the output of kubectl get ds,po -o wide --all-namespaces

    Regards,

  • zhangwe Posts: 45

    [email protected]:~$ kubectl get ds,po -o wide --all-namespaces
    NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE   CONTAINERS    IMAGES                          SELECTOR
    kube-system   daemonset.apps/calico-node   2         2         2       2            2           kubernetes.io/os=linux   16d   calico-node   calico/node:v3.15.1             k8s-app=calico-node
    kube-system   daemonset.apps/kube-proxy    2         2         2       2            2           kubernetes.io/os=linux   16d   kube-proxy    k8s.gcr.io/kube-proxy:v1.18.1   k8s-app=kube-proxy

    NAMESPACE     NAME                                           READY   STATUS              RESTARTS   AGE     IP                NODE     NOMINATED NODE   READINESS GATES
    default       pod/hello-1597090080-8kpd5                     0/1     Completed           0          2m17s   192.168.171.107   worker   <none>           <none>
    default       pod/hello-1597090140-7wvc6                     0/1     Completed           0          76s     192.168.171.78    worker   <none>           <none>
    default       pod/hello-1597090200-w8bz8                     0/1     Completed           0          16s     192.168.171.95    worker   <none>           <none>
    default       pod/mypod                                      0/1     ContainerCreating   0          41m     <none>            worker   <none>           <none>
    default       pod/nginx                                      0/1     Pending             0          30m     <none>            <none>   <none>           <none>
    default       pod/nginx-pod                                  0/1     Completed           0          10m     192.168.171.79    worker   <none>           <none>
    kube-system   pod/calico-kube-controllers-578894d4cd-pq67q   1/1     Running             12         16d     192.168.219.97    master   <none>           <none>
    kube-system   pod/calico-node-gqrcc                          1/1     Running             13         16d     10.142.0.10       worker   <none>           <none>
    kube-system   pod/calico-node-hxrhx                          1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/coredns-66bff467f8-2mrnn                   1/1     Running             12         16d     192.168.219.98    master   <none>           <none>
    kube-system   pod/coredns-66bff467f8-m7ls6                   1/1     Running             12         16d     192.168.219.102   master   <none>           <none>
    kube-system   pod/etcd-master                                1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-apiserver-master                      1/1     Running             14         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-controller-manager-master             1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-proxy-kd42s                           1/1     Running             12         16d     10.142.0.10       worker   <none>           <none>
    kube-system   pod/kube-proxy-mzj9b                           1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/kube-scheduler-master                      1/1     Running             12         16d     10.142.0.9        master   <none>           <none>
    kube-system   pod/metrics-server-6c7dddc9f8-24jr7            1/1     Running             2          3d22h   192.168.171.126   worker   <none>           <none>
    kube-system   pod/metrics-server-6c7dddc9f8-7w8w2            1/1     Running             3          4d1h    192.168.219.101   master   <none>           <none>
    namespace1    pod/pod1                                       0/1     Completed           0          2d23h   <none>            worker   <none>           <none>
    namespace2    pod/pod2                                       0/1     Completed           0          2d23h   <none>            master   <none>           <none>
    namespace3    pod/pod3                                       0/1     Completed           0          2d23h   <none>            worker   <none>           <none>
    namespace4    pod/pod4                                       0/1     Completed           0          2d23h   <none>            worker   <none>           <none>
    qos-example   pod/qos-demo                                   1/1     Running             10         10d     192.168.171.112   worker   <none>           <none>
    [email protected]:~$

  • serewicz Posts: 761

    Hello,

    Thank you for the info. I notice that you have a fair number of restarts on all of the long-running containers, and a pending pod. Are you using VMs with 2 CPUs and 7.5 GB of memory or larger?

    Also, please verify that you have an allow-all-traffic rule for your GCE VPC.

    I just reproduced the lab and can see both VMs. I'm leaning towards something in your VPC firewall settings.

    Regards,

  • zhangwe Posts: 45

    Yes, I am using 2 CPU / 7.5 GB VMs with 15 GB of disk space.

    I am using the same network policies and VPC firewall settings as before, and back then metrics-server worked on both nodes. I used an instance template to create the VMs.
    This time, when I deployed metrics-server, I forgot to add "- --kubelet-insecure-tls". Following your suggestion, I added it and restarted everything. After that, metrics-server worked on only one node.

  • serewicz Posts: 761

    Hello,

    Do you mean that the settings you used before allow all traffic to all IPs? Perhaps your previous configuration wasn't quite correct.

    We have looked at most of the differences, and I'm not seeing why your metrics server would not be reporting from both nodes; the VPC firewall remains the primary suspect. Obviously the image is fine, because it is reporting from the master. The difference is the network which connects the nodes. Check your firewall settings.

    Perhaps during Friday's office hours you can share your screen and we can look at your VPC firewall settings. I just ran the ten commands and both nodes are reporting metrics.

    Regards,

  • chrispokorni Posts: 640
    edited August 11

    Hi @zhangwe,

    According to step 4 of the metrics-server lab exercise, do you have both of the following lines in the metrics-server Deployment YAML manifest, and are they properly aligned?

    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    

    Also, what is the output of:

    kubectl -n kube-system get svc,ep

    Regards,
    -Chris
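
One way to check for the two lines without opening the editor is to read the container args back from the Deployment. This is a sketch under the assumption that metrics-server is the first container; missing_kubelet_flags is a hypothetical helper name. It prints each expected flag that is absent and prints nothing when both are present.

```shell
# List which of the two kubelet flags from the lab are absent from the
# metrics-server container args. Assumes metrics-server is container 0.
missing_kubelet_flags() {
  # jsonpath with args[*] prints the arguments separated by spaces.
  args="$(kubectl -n kube-system get deployment metrics-server \
    -o jsonpath='{.spec.template.spec.containers[0].args[*]}')"
  for flag in --kubelet-insecure-tls --kubelet-preferred-address-types; do
    case " $args " in
      *" $flag "*|*" $flag="*) ;;   # flag present, bare or with a value
      *) echo "$flag" ;;            # flag absent
    esac
  done
}

# Example (needs a working cluster):
# missing_kubelet_flags
```

This only confirms that the flags exist in the args list; whether they are indented correctly is already enforced, since a misaligned manifest would not have been accepted as valid args.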

  • zhangwe Posts: 45

    Hi Chris,
    Yes, I had also missed the line "- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname". After I added it, everything works well now.

    Thank you very much for your help.

    Wei

  • serewicz Posts: 761

    Interesting. I did not add that preferred address line to my configuration, and it works.
