LAB 13.3: Adding tools for monitoring and metrics - Metrics API not available after 10+ minutes
I'm working through Lab 13.3. Everything up to step 6 went without issues (and I've already double-checked every step), but at step 7, after waiting 15 minutes for a different output from the command "kubectl top pod" or "kubectl top nodes", I'm still getting the same:
error: Metrics API not available
Can anybody tell me whether something is missing from the instructions?
Thank you in advance.
Comments
Hi @juanalmaraz,
From your metrics-server deployment, can you provide the code snippet representing the container args and the image, similar to the snippet shown in Lab 13.3 step 5 of the lab guide? Typically, typos in this section can cause issues with the metrics-server.
Regards,
-Chris
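One possible way to pull out just the image and args that Chris asks for, as a minimal sketch. It assumes the Deployment is named metrics-server in the kube-system namespace (as in this thread) and that metrics-server is the first container in the pod template; the function name `show_metrics_server_spec` is made up for illustration.

```shell
# Print the image and the container args of the metrics-server Deployment.
# Assumption: metrics-server is the first (index 0) container in the pod template.
show_metrics_server_spec() {
  kubectl -n kube-system get deployment metrics-server \
    -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
  kubectl -n kube-system get deployment metrics-server \
    -o jsonpath='{.spec.template.spec.containers[0].args}{"\n"}'
}
```

Calling `show_metrics_server_spec` on a node where kubectl is configured should print the image on one line and the args list on the next, which is easier to compare against the lab guide than the full describe output.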
@chrispokorni I've got the same problem. This is the output of kubectl -n kube-system describe deployment metrics-server:
Containers:
  metrics-server:
    Image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
    Port: 4443/TCP
    Host Port: 0/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
      --kubelet-insecure-tls
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
Hi @zmicier0k,
Since Kubernetes release v1.22, metrics-server v0.3.x may no longer be compatible with the latest releases. I would suggest installing the latest metrics-server release, v0.6.x, and at step 5 providing the following arguments when editing the metrics-server Deployment resource:
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP
Regards,
-Chris
Hi @chrispokorni, I am facing the same issue. After reading the documentation, I am not sure whether I have to add an additional node. It is still not working for me, even after applying the new version.
Hi @lzambra,
Please ensure that the metrics-server (latest release) installation command from step 3 runs successfully and all necessary artifacts are created. The following step 4 should display the metrics-server pod in a running state. If the metrics-server pod is not listed, the previous step may have failed.
Once the pod is visible, only then proceed to step 5 and edit the metrics-server deployment, as described in the lab guide and my comment above. These steps should ensure the installation and proper configuration of your metrics-server deployment.
When installing, do you see any errors?
When listing pods, what is the state of the metrics-server pod?
Regards,
-Chris
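The two checks above (is the pod listed, and what state is it in) could be run roughly as follows. This is a sketch rather than the lab guide's exact commands: it assumes the pod carries the k8s-app=metrics-server label and that the Metrics API is registered under the APIService name v1beta1.metrics.k8s.io, which is what the stock metrics-server manifest uses. The function name `check_metrics_server` is hypothetical.

```shell
# List the metrics-server pod(s) and the Metrics API registration.
# Assumptions: label k8s-app=metrics-server, APIService v1beta1.metrics.k8s.io.
check_metrics_server() {
  # Pod state: look for Running and READY 1/1.
  kubectl -n kube-system get pods -l k8s-app=metrics-server -o wide
  # API registration: AVAILABLE should read True once metrics are being served.
  kubectl get apiservice v1beta1.metrics.k8s.io
}
```

If the pod is Running but never becomes Ready, or the APIService is not available, `kubectl top` will typically keep reporting that the Metrics API is not available.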
I've added the configuration previously mentioned in this thread, and it is still not working. These are the logs from one pod:
I0103 01:17:10.298497 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0103 01:17:11.417949 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0103 01:17:11.418056 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0103 01:17:11.418124 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0103 01:17:11.418199 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0103 01:17:11.418247 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0103 01:17:11.418317 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0103 01:17:11.418679 1 secure_serving.go:267] Serving securely on [::]:4443
I0103 01:17:11.418808 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0103 01:17:11.419381 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0103 01:17:11.419697 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0103 01:17:11.519010 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0103 01:17:11.519024 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0103 01:17:11.519052 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
E0103 01:17:24.912919 1 scraper.go:140] "Failed to scrape node" err="Get \"https://worker:10250/metrics/resource\": context deadline exceeded" node="worker"
E0103 01:17:24.912964 1 scraper.go:140] "Failed to scrape node" err="Get \"https://cp:10250/metrics/resource\": context deadline exceeded" node="cp"
I0103 01:17:29.229201 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
I0103 01:17:39.233714 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E0103 01:17:39.913491 1 scraper.go:140] "Failed to scrape node" err="Get \"https://cp:10250/metrics/resource\": context deadline exceeded" node="cp"
E0103 01:17:39.913492 1 scraper.go:140] "Failed to scrape node" err="Get \"https://worker:10250/metrics/resource\": context deadline exceeded" node="worker"
I0103 01:17:49.230115 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
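In logs like these, the lines that matter are the "Failed to scrape node" errors: metrics-server cannot reach the kubelet on port 10250 of either node, so it has no metrics to serve. A small sketch for pulling the affected node names out of a longer log follows; the helper name `failed_scrape_nodes` is invented for illustration, and the pattern assumes the log line format shown above.

```shell
# Read metrics-server logs on stdin and print each node that failed scraping, once.
# Assumption: lines look like: "Failed to scrape node" err="..." node="<name>"
failed_scrape_nodes() {
  sed -n 's/.*"Failed to scrape node".* node="\([^"]*\)".*/\1/p' | sort -u
}

# Example usage against a live cluster (requires kubectl access):
#   kubectl -n kube-system logs deployment/metrics-server | failed_scrape_nodes
```

If every node in the cluster shows up in that list, the problem is more likely in how the kubelets are being addressed or reached than in a single misbehaving node.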
Hi @lzambra,
What is the output of kubectl -n kube-system describe deployment metrics-server?
Regards,
-Chris
Name: metrics-server
Namespace: kube-system
CreationTimestamp: Wed, 03 Jan 2024 01:08:57 +0000
Labels: k8s-app=metrics-server
Annotations: deployment.kubernetes.io/revision: 3
Selector: k8s-app=metrics-server
Replicas: 1 desired | 1 updated | 2 total | 0 available | 2 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 0 max unavailable, 25% max surge
Pod Template:
Labels: k8s-app=metrics-server
Service Account: metrics-server
Containers:
metrics-server:
Image: registry.k8s.io/metrics-server/metrics-server:v0.6.4
Port: 4443/TCP
Host Port: 0/TCP
Args:
--cert-dir=/tmp
--secure-port=4443
--kubelet-insecure-tls
--kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP
--kubelet-use-node-status-port
--metric-resolution=15s
Requests:
cpu: 100m
memory: 200Mi
Liveness: http-get http://:http/livez delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:http/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
Environment:
Mounts:
/tmp from tmp-dir (rw)
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:
Priority Class Name: system-cluster-critical
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing False ProgressDeadlineExceeded
OldReplicaSets: metrics-server-fbb469ccc (0/0 replicas created), metrics-server-67865f7db4 (1/1 replicas created)
NewReplicaSet: metrics-server-b58456f69 (1/1 replicas created)
Events: