Welcome to the Linux Foundation Forum!

LAB 13.3: Adding tools for monitoring and metrics - Metrics API not available after 10+ minutes

I'm working through Lab 13.3. I completed everything up to step 6 with no issues (and I've double-checked every step), but at step 7, after waiting 15 minutes for a different output from "kubectl top pod" or "kubectl top nodes", I still get the same error:

error: Metrics API not available

Can anybody tell me whether something is missing from the instructions?

Thank you in advance.

Comments

  • Hi @juanalmaraz,

    From your metrics-server deployment, can you provide the snippet showing the container args and the image, similar to the one shown in step 5 of Lab 13.3 in the lab guide? Typos in this section are a common cause of metrics-server issues.

    Regards,
    -Chris
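
One way to collect the image and args requested here, without copying them out of the full describe output, is a jsonpath query. This is only a minimal sketch: it assumes the Deployment is named metrics-server in the kube-system namespace and that metrics-server is the first container in the pod template. The function name show_metrics_server_spec is hypothetical.

```shell
# Print the image and the args of the first container in the metrics-server
# Deployment, one item per line.
# Assumptions: the Deployment is kube-system/metrics-server and the
# metrics-server container is at index 0.
show_metrics_server_spec() {
  kubectl -n kube-system get deployment metrics-server \
    -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}{range .spec.template.spec.containers[0].args[*]}{@}{"\n"}{end}'
}
```

Running show_metrics_server_spec against the cluster should print the image on the first line and each argument on its own line, which is easy to paste into a reply.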

  • @chrispokorni I have the same problem. This is the container section from kubectl -n kube-system describe deployment metrics-server:

    Containers:
       metrics-server:
        Image:      k8s.gcr.io/metrics-server/metrics-server:v0.3.7
        Port:       4443/TCP
        Host Port:  0/TCP
        Args:
          --cert-dir=/tmp
          --secure-port=4443
          --kubelet-insecure-tls
          --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    
  • Hi @zmicier0k,

    Since Kubernetes release v1.22, metrics-server v0.3.x may no longer be compatible with the latest releases. I would suggest installing the latest metrics-server release, v0.6.x, and at step 5 providing the following arguments when editing the metrics-server Deployment resource:

    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP
    

    Regards,
    -Chris
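
For readers who prefer a non-interactive alternative to editing the Deployment by hand, the first of the two arguments above could be appended with a JSON patch. This is a hedged sketch, not the lab guide's method: it assumes the args list already exists in the manifest and that metrics-server is the first container. The function name add_insecure_tls_flag is hypothetical, and the address-types argument would still need to be edited in place if the manifest already sets it.

```shell
# Append --kubelet-insecure-tls to the first container's args via JSON patch.
# Assumptions: the Deployment is kube-system/metrics-server, the container
# is at index 0, and its args array already exists.
add_insecure_tls_flag() {
  kubectl -n kube-system patch deployment metrics-server --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
}
```

The trailing "-" in the patch path appends to the end of the array, so the existing arguments are left untouched.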

  • lzambra

    Hi @chrispokorni, I am facing the same issue. After reading the documentation, I am not sure whether I have to add an additional node. It is still not working for me, even after applying the new version.

  • chrispokorni

    Hi @lzambra,

    Please ensure that the metrics-server (latest release) installation command from step 3 runs successfully and that all necessary artifacts are created. Step 4 should then display the metrics-server pod in a running state. If the metrics-server pod is not listed, the previous step may have failed.
    Only once the pod is visible should you proceed to step 5 and edit the metrics-server deployment, as described in the lab guide and in my comment above.

    These steps should ensure the installation and proper configuration of your metrics-server deployment.

    When installing, do you see any errors?
    When listing pods, what is the state of the metrics-server pod?

    Regards,
    -Chris
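
The two questions above can be answered with a couple of read-only queries. The sketch below is an assumption about how one might check, not a step from the lab guide: it uses the k8s-app=metrics-server label, which matches the Deployment selector shown in this thread, and the v1beta1.metrics.k8s.io APIService that metrics-server registers. The function name check_metrics_server is hypothetical.

```shell
# Show the state of the metrics-server pods, then whether the metrics API
# is registered and reported as available.
# Assumption: the pods carry the label k8s-app=metrics-server.
check_metrics_server() {
  kubectl -n kube-system get pods -l k8s-app=metrics-server -o wide
  kubectl get apiservice v1beta1.metrics.k8s.io
}
```

A pod that is Running but not Ready, together with an APIService that is not available, would be consistent with the "Metrics API not available" error from kubectl top.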

  • lzambra
    edited January 3

    I've added the configuration mentioned earlier in this thread, and it is still not working. These are the logs from one pod:

    I0103 01:17:10.298497 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
    I0103 01:17:11.417949 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
    I0103 01:17:11.418056 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
    I0103 01:17:11.418124 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
    I0103 01:17:11.418199 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0103 01:17:11.418247 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
    I0103 01:17:11.418317 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    I0103 01:17:11.418679 1 secure_serving.go:267] Serving securely on [::]:4443
    I0103 01:17:11.418808 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
    I0103 01:17:11.419381 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
    W0103 01:17:11.419697 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
    I0103 01:17:11.519010 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    I0103 01:17:11.519024 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0103 01:17:11.519052 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
    E0103 01:17:24.912919 1 scraper.go:140] "Failed to scrape node" err="Get \"https://worker:10250/metrics/resource\": context deadline exceeded" node="worker"
    E0103 01:17:24.912964 1 scraper.go:140] "Failed to scrape node" err="Get \"https://cp:10250/metrics/resource\": context deadline exceeded" node="cp"
    I0103 01:17:29.229201 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
    I0103 01:17:39.233714 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
    E0103 01:17:39.913491 1 scraper.go:140] "Failed to scrape node" err="Get \"https://cp:10250/metrics/resource\": context deadline exceeded" node="cp"
    E0103 01:17:39.913492 1 scraper.go:140] "Failed to scrape node" err="Get \"https://worker:10250/metrics/resource\": context deadline exceeded" node="worker"
    I0103 01:17:49.230115 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
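
The repeated "Failed to scrape node" errors in this log are requests to each kubelet on port 10250 that time out. To see at a glance which nodes are affected, the log can be passed through a small filter. This is a sketch that assumes the log line format shown above, where each error line ends with node="<name>"; the function name failed_scrape_nodes is hypothetical.

```shell
# Read metrics-server log lines on stdin and print the unique names of the
# nodes that failed to scrape, sorted alphabetically.
# Assumption: error lines end with node="<name>", as in the log above.
failed_scrape_nodes() {
  grep 'Failed to scrape node' \
    | sed -n 's/.* node="\([^"]*\)"$/\1/p' \
    | sort -u
}
```

For example, kubectl -n kube-system logs deployment/metrics-server | failed_scrape_nodes would print cp and worker for the log above, meaning that no node in this cluster is being scraped successfully.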

  • chrispokorni

    Hi @lzambra,

    What is the output of kubectl -n kube-system describe deployment metrics-server ?

    Regards,
    -Chris

  • lzambra

    Name:                   metrics-server
    Namespace:              kube-system
    CreationTimestamp:      Wed, 03 Jan 2024 01:08:57 +0000
    Labels:                 k8s-app=metrics-server
    Annotations:            deployment.kubernetes.io/revision: 3
    Selector:               k8s-app=metrics-server
    Replicas:               1 desired | 1 updated | 2 total | 0 available | 2 unavailable
    StrategyType:           RollingUpdate
    MinReadySeconds:        0
    RollingUpdateStrategy:  0 max unavailable, 25% max surge
    Pod Template:
      Labels:           k8s-app=metrics-server
      Service Account:  metrics-server
      Containers:
       metrics-server:
        Image:      registry.k8s.io/metrics-server/metrics-server:v0.6.4
        Port:       4443/TCP
        Host Port:  0/TCP
        Args:
          --cert-dir=/tmp
          --secure-port=4443
          --kubelet-insecure-tls
          --kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP
          --kubelet-use-node-status-port
          --metric-resolution=15s
        Requests:
          cpu:        100m
          memory:     200Mi
        Liveness:     http-get http://:http/livez delay=0s timeout=1s period=10s #success=1 #failure=3
        Readiness:    http-get http://:http/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
        Environment:
        Mounts:
          /tmp from tmp-dir (rw)
      Volumes:
       tmp-dir:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
        SizeLimit:
      Priority Class Name:  system-cluster-critical
    Conditions:
      Type           Status  Reason
      ----           ------  ------
      Available      False   MinimumReplicasUnavailable
      Progressing    False   ProgressDeadlineExceeded
    OldReplicaSets:  metrics-server-fbb469ccc (0/0 replicas created), metrics-server-67865f7db4 (1/1 replicas created)
    NewReplicaSet:   metrics-server-b58456f69 (1/1 replicas created)
    Events:
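
The key signal in this output is the Replicas line: two pods exist, but none is available. When re-checking after a change, that number can be pulled out of the describe output with a small filter. This is a sketch that assumes the Replicas line format shown above; the function name available_replicas is hypothetical.

```shell
# Read `kubectl describe deployment` output on stdin and print the number
# of available replicas taken from the "Replicas:" line.
# Assumption: the line has the form "... | N available | M unavailable".
available_replicas() {
  sed -n 's/^ *Replicas:.*| *\([0-9][0-9]*\) available.*/\1/p'
}
```

For example, kubectl -n kube-system describe deployment metrics-server | available_replicas prints 0 for the output above. A value of at least 1 would suggest that the readiness probe has started to pass.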
