
Lab 11.1 - linkerd-identity pod not running in latest version - known issue?

Hi, I am trying to get Linkerd (latest version) running on a CentOS 7 Kubernetes cluster. The linkerd-identity pod seems to have a problem, as shown by the log of the identity container inside it. Has anyone had similar problems?

general info

cat /etc/redhat-release 
CentOS Linux release 7.9.2009 (Core)
kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:16:20Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:09:57Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}

procedure

  • linkerd installation went fine
  • linkerd check --pre went fine
  • running linkerd install | kubectl apply -f - followed by linkerd check yields:
 Linkerd core checks
===================

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
| No running pods for "linkerd-destination" 

observations

  • the linkerd-identity pod remains in status CrashLoopBackOff and all other pods keep waiting
  • containers within the linkerd-identity pod
  • log from the identity container
  • log from the linkerd-proxy container

Any help understanding the problem is appreciated.

Answers

  • chrispokorni
    chrispokorni Posts: 2,155

    Hi @t.sander,

    Several issues have been reported in the forum with linkerd 2.11. It seems, however, that downgrading to 2.10.0 resolved most of them.

    Regards,
    -Chris

  • How do you downgrade? I'll post it if I find it.

  • Not quite there yet, but I've made some progress:

    export LINKERD2_VERSION=stable-2.10.2
    curl -s -L https://run.linkerd.io/install | sh -
    vi ~/.bashrc

    Add these two lines

    export LINKERD2_VERSION=stable-2.6.0
    export PATH=$PATH:$HOME/.linkerd2/bin

    $ source ~/.bashrc
    $ linkerd version
    Client version: stable-2.6.0
    Server version: unavailable
    $ linkerd check --pre

    ??grant cluster-admin privilege to the ServiceAccount??

    $ linkerd install | kubectl apply -f -
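
    After the install completes, a quick sanity check (a minimal sketch, assuming the downloaded 2.10.x CLI is now the one found first on PATH) is to re-run the version and health checks:

    $ linkerd version --client
    $ linkerd check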

  • chrispokorni
    chrispokorni Posts: 2,155

    Hi @cccsss000111,

    Downloading the linkerd setup.sh and modifying the linkerd version also works:

    $ curl -sL run.linkerd.io/install > setup.sh
    $ vim setup.sh

    Locate and edit the following line:

    LINKERD2_VERSION=${LINKERD2_VERSION:-stable-2.10.1}

    Then run setup.sh and continue with the rest of the steps from the lab guide:
    $ sh setup.sh
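
    A note on that line: the ${LINKERD2_VERSION:-stable-2.10.1} syntax makes the script use the LINKERD2_VERSION environment variable when it is set and fall back to the hard-coded default otherwise, so a minimal sketch that avoids editing the file at all (effectively what the export in the earlier reply does) is:

    $ export LINKERD2_VERSION=stable-2.10.1
    $ sh setup.sh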

    Regards,
    -Chris

  • bullgo
    bullgo Posts: 1

    Hi @chrispokorni, could you also share the process you used to debug this and determine that the issue is related to the version? How do you develop ideas for debugging this kind of error?

  • alihasanahmedk
    alihasanahmedk Posts: 34
    edited January 2022

    @t.sander
    I am using virtual machines and Linkerd version 2.11.1 works fine for me.
    Try describing the pod to check its events and logs:

    kubectl describe pod pod_name -n linkerd

    Check what errors are showing (a sketch of the log commands is below). If you are still stuck, let me know.
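
    For the container logs themselves, a minimal sketch (pod_name is a placeholder; the container names identity and linkerd-proxy match those mentioned in the original post):

    kubectl logs -n linkerd pod_name -c identity
    kubectl logs -n linkerd pod_name -c identity --previous
    kubectl logs -n linkerd pod_name -c linkerd-proxy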

  • @alihasanahmedk
    Hi, I am running two virtual machines with CentOS 7 (as seen in my first post). Using version 2.11.1, I get the following three pods:

    kubectl -n linkerd get pods                                       
    
    NAME                                      READY   STATUS            RESTARTS      AGE
    linkerd-destination-75cdb6c9c-nrztn       0/4     PodInitializing   0             2m56s
    linkerd-identity-54795b9f9f-5lz6s         0/2     Running           4 (23s ago)   2m56s
    linkerd-proxy-injector-6b5699bdcc-75pbx   0/2     PodInitializing   0             2m56s
    

    The faulty one is the linkerd-identity pod:

    kubectl -n linkerd describe pod linkerd-identity-54795b9f9f-5lz6s
    
    Events:
      Type     Reason     Age                 From               Message
      ----     ------     ----                ----               -------
      Normal   Scheduled  116s                default-scheduler  Successfully assigned linkerd/linkerd-identity-54795b9f9f-5lz6s to cpu-rsm-cn02
      Normal   Pulled     115s                kubelet            Container image "cr.l5d.io/linkerd/proxy-init:v1.4.0" already present on machine
      Normal   Created    115s                kubelet            Created container linkerd-init
      Normal   Started    115s                kubelet            Started container linkerd-init
      Normal   Created    113s                kubelet            Created container linkerd-proxy
      Normal   Started    113s                kubelet            Started container linkerd-proxy
      Normal   Pulled     113s                kubelet            Container image "cr.l5d.io/linkerd/proxy:stable-2.11.1" already present on machine
      Warning  Unhealthy  85s (x2 over 95s)   kubelet            Liveness probe failed: Get "http://192.168.168.110:9990/ping": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
      Warning  Unhealthy  85s (x5 over 111s)  kubelet            Readiness probe failed: Get "http://192.168.168.110:9990/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
      Normal   Created    83s (x2 over 113s)  kubelet            Created container identity
      Normal   Pulled     83s (x2 over 113s)  kubelet            Container image "cr.l5d.io/linkerd/controller:stable-2.11.1" already present on machine
      Normal   Started    83s (x2 over 113s)  kubelet            Started container identity
      Warning  Unhealthy  82s (x6 over 111s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 503
    

    Regards

  • Please share the output of these commands.
    kubectl -n linkerd describe pod linkerd-destination-75cdb6c9c-nrztn
    kubectl -n linkerd describe pod linkerd-proxy-injector-6b5699bdcc-75pbx

  • t.sander
    t.sander Posts: 6
    edited January 2022

    Here is the complete output of both pods (see attached files).

  • According to the provided logs, your Pods are working fine. Can you share the latest output of kubectl get pods -n linkerd?

  • kubectl -n linkerd get pods 
    NAME                                      READY   STATUS             RESTARTS          AGE
    linkerd-destination-5df7f655b5-zbk2b      0/4     PodInitializing    0                 19h
    linkerd-heartbeat-27393840-2s6pn          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-d7n6n          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-j848j          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-jcgb2          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-m8nw4          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-s5jjm          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-vj6cl          0/1     Error              0                 19h
    linkerd-identity-54795b9f9f-p4xfg         0/2     CrashLoopBackOff   367 (4m49s ago)   19h
    linkerd-proxy-injector-56b89fc6d4-p98qg   0/2     PodInitializing    0                 19h
    

    The main problem is shown by the log of the identity container in the linkerd-identity pod:

    time="2022-02-01T07:49:07Z" level=fatal msg="Failed to initialize identity service: Post \"https://10.96.0.1:443/apis/authorization.k8s.io/v1/selfsubjectaccessreviews\": dial tcp 10.96.0.1:443: i/o timeout"
    
  • Try deleting the identity pod, then check it again and wait for the pod status:
    kubectl delete pod -n linkerd linkerd-identity-54795b9f9f-p4xfg
    Otherwise, let me know and we will set up a call to resolve this issue.

  • @alihasanahmedk
    As expected the pod is failing again after deleting it:

    kubectl -n linkerd get pod
    NAME                                      READY   STATUS             RESTARTS      AGE
    linkerd-destination-5df7f655b5-zbk2b      0/4     PodInitializing    0             25h
    linkerd-heartbeat-27395280-28kl8          0/1     Error              0             66m
    linkerd-heartbeat-27395280-659gq          0/1     Error              0             69m
    linkerd-heartbeat-27395280-84bqq          0/1     Error              0             77m
    linkerd-heartbeat-27395280-bxfmp          0/1     Error              0             72m
    linkerd-heartbeat-27395280-dpd75          0/1     Error              0             83m
    linkerd-heartbeat-27395280-j9clg          0/1     Error              0             75m
    linkerd-heartbeat-27395280-mmlhp          0/1     Error              0             80m
    linkerd-identity-54795b9f9f-rd44s         0/2     CrashLoopBackOff   4 (65s ago)   4m
    linkerd-proxy-injector-56b89fc6d4-p98qg   0/2     PodInitializing    0             25h
    
  • alihasanahmedk
    alihasanahmedk Posts: 34
    edited February 2022

    @t.sander We can arrange an online meeting to resolve this issue. You can reach me at alihasanahmedkhan@gmail.com

  • Hi @t.sander,

    Your describe attachments indicate that your Pods and Nodes may use overlapping subnets. If that is the case, the networking inside your cluster is impacted as a result. Is your Calico using the default 192.168.0.0/16 network? Are your nodes assigned IP addresses from a 192.168.0.0/x subnet?
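
    A quick way to compare the two (a minimal sketch; the ippools resource only exists once Calico's CRDs are installed) is:

    $ kubectl get nodes -o wide
    $ kubectl get pods -n linkerd -o wide
    $ kubectl get ippools -o yaml | grep cidr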

    Regards,
    -Chris

  • Hi @chrispokorni,
    I followed the instructions from the 3.1 Labs and set CALICO_IPV4POOL_CIDR to 192.168.0.0/16 in calico.yaml and edited the kubeadm-config.yaml accordingly.

    The nodes have the following IP addresses:
    192.168.149.111/24 (k8scp)
    192.168.149.112/24 (worker)
    so they are on the 192.168.149.0/24 network.

    The linkerd-identity pod got 192.168.168.109/32

  • Hi @t.sander,

    As suspected, the two networks overlap (the node subnet 192.168.149.0/24 falls inside the Calico Pod CIDR 192.168.0.0/16), and this eventually causes routing issues within the cluster.

    This can be resolved by ensuring that the two networks do not overlap - either by altering the CIDR in calico.yaml and kubeadm-config.yaml before cluster init, or by provisioning the VMs with IP addresses from a different subnet. A rough sketch of the first option follows.
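
    For illustration only (10.200.0.0/16 is an arbitrary example that avoids both the node subnet and the default 10.96.0.0/12 Service CIDR; the exact file layout depends on the lab's manifests), the matching edits would look roughly like:

    # calico.yaml - CALICO_IPV4POOL_CIDR env var of the calico-node container
    - name: CALICO_IPV4POOL_CIDR
      value: "10.200.0.0/16"

    # kubeadm-config.yaml - Pod network passed to kubeadm init
    networking:
      podSubnet: 10.200.0.0/16

    As noted above, both files must use the same CIDR, and they are applied when the cluster is initialized.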

    Regards,
    -Chris
