Lab 11.1 - linkerd-identity pod not running in latest version - known issue?

Hi, I am trying to get Linkerd (latest version) running on a CentOS 7 Kubernetes cluster. The linkerd-identity pod seems to have a problem, judging by the log of the identity container inside it. Has anyone had similar problems?

general info

cat /etc/redhat-release 
CentOS Linux release 7.9.2009 (Core)
kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:16:20Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:09:57Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}

procedure

  • linkerd installation went fine
  • linkerd check --pre went fine
  • running linkerd install | kubectl apply -f - followed by linkerd check yields:
 Linkerd core checks
===================

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
| No running pods for "linkerd-destination" 

observations

  • the linkerd-identity pod remains in status CrashLoopBackOff and all other pods keep waiting
  • containers within the linkerd-identity pod
  • log from the identity container
  • log from the linkerd-proxy container (the commands to pull these are sketched below)
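
For reference, the container logs above can be pulled with something like this (the pod name is a placeholder):

kubectl -n linkerd get pods
kubectl -n linkerd logs <linkerd-identity-pod> -c identity
kubectl -n linkerd logs <linkerd-identity-pod> -c linkerd-proxy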

Any help understanding the problem is appreciated.

Answers

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @t.sander,

    Several issues have been reported in the forum with linkerd 2.11. It seems, however, that downgrading to 2.10.0 resolved most of them.

    Regards,
    -Chris

  • How do you downgrade? I'll post it if I find it.
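
    One piece that seems to be needed first is removing the already-installed 2.11 control plane. The Linkerd docs describe piping linkerd uninstall into kubectl delete; a minimal sketch, assuming the 2.11 CLI is still on the PATH:

    $ linkerd uninstall | kubectl delete -f -

    After that, the older CLI can be installed and linkerd install | kubectl apply -f - run again with it.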

  • Not quite there yet, but I've made some progress:

    export LINKERD2_VERSION=stable-2.10.2
    curl -s -L https://run.linkerd.io/install | sh -
    vi ~/.bashrc

    Add these two lines

    export LINKERD2_VERSION=stable-2.6.0
    export PATH=$PATH:$HOME/.linkerd2/bin

    $ source ~/.bashrc
    $ linkerd version
    Client version: stable-2.6.0
    Server version: unavailable
    $ linkerd check --pre

    ?? grant the cluster-admin privilege to the ServiceAccount ?? (see the sketch below)

    $ linkerd install | kubectl apply -f -
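
    About the cluster-admin note above: the idea is a ClusterRoleBinding that grants the cluster-admin ClusterRole to a ServiceAccount. This is only a sketch - the binding name and ServiceAccount below are placeholders, not necessarily the names used in the lab guide:

    $ kubectl create clusterrolebinding linkerd-sa-admin \
        --clusterrole=cluster-admin \
        --serviceaccount=default:default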

  • chrispokorni
    chrispokorni Posts: 2,301

    Hi @cccsss000111,

    Downloading the linkerd setup.sh and modifying the linkerd version also works:

    $ curl -sL run.linkerd.io/install > setup.sh
    $ vim setup.sh

    Locate and edit the following line:

    LINKERD2_VERSION=${LINKERD2_VERSION:-stable-2.10.1}

    Then run setup.sh and continue with the rest of the steps from the lab guide:
    $ sh setup.sh
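
    Alternatively, because the script falls back with ${LINKERD2_VERSION:-...}, exporting the variable before running the unmodified script should have the same effect (the version string below is just an example):

    $ export LINKERD2_VERSION=stable-2.10.2
    $ curl -sL run.linkerd.io/install | sh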

    Regards,
    -Chris

  • bullgo
    bullgo Posts: 1

    Hi @chrispokorni, could you also share the process you used to debug this and determine that the issue is related to the version? How do you develop ideas when debugging this kind of error?

  • alihasanahmedk
    alihasanahmedk Posts: 34
    edited January 2022

    @t.sander
    I am using virtual machines, and Linkerd version 2.11.1 works fine for me.
    Check the events and logs of the pod:

    kubectl describe pod pod_name -n linkerd

    Check what errors are showing. If still stuck let me know.
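
    Since the pod is crash-looping, the log of the previous (crashed) container attempt is often the most telling; something along these lines, with pod_name again a placeholder:

    kubectl logs pod_name -c identity -n linkerd --previous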

  • @alihasanahmedk
    Hi, I am running two virtual machines with CentOS 7 (as shown in my first post). Using version 2.11.1, I get the following three pods:

    kubectl -n linkerd get pods                                       
    
    NAME                                      READY   STATUS            RESTARTS      AGE
    linkerd-destination-75cdb6c9c-nrztn       0/4     PodInitializing   0             2m56s
    linkerd-identity-54795b9f9f-5lz6s         0/2     Running           4 (23s ago)   2m56s
    linkerd-proxy-injector-6b5699bdcc-75pbx   0/2     PodInitializing   0             2m56s
    

    The faulty one is linkerd-identity:

    kubectl -n linkerd describe pod linkerd-identity-54795b9f9f-5lz6s
    
    Events:
      Type     Reason     Age                 From               Message
      ----     ------     ----                ----               -------
      Normal   Scheduled  116s                default-scheduler  Successfully assigned linkerd/linkerd-identity-54795b9f9f-5lz6s to cpu-rsm-cn02
      Normal   Pulled     115s                kubelet            Container image "cr.l5d.io/linkerd/proxy-init:v1.4.0" already present on machine
      Normal   Created    115s                kubelet            Created container linkerd-init
      Normal   Started    115s                kubelet            Started container linkerd-init
      Normal   Created    113s                kubelet            Created container linkerd-proxy
      Normal   Started    113s                kubelet            Started container linkerd-proxy
      Normal   Pulled     113s                kubelet            Container image "cr.l5d.io/linkerd/proxy:stable-2.11.1" already present on machine
      Warning  Unhealthy  85s (x2 over 95s)   kubelet            Liveness probe failed: Get "http://192.168.168.110:9990/ping": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
      Warning  Unhealthy  85s (x5 over 111s)  kubelet            Readiness probe failed: Get "http://192.168.168.110:9990/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
      Normal   Created    83s (x2 over 113s)  kubelet            Created container identity
      Normal   Pulled     83s (x2 over 113s)  kubelet            Container image "cr.l5d.io/linkerd/controller:stable-2.11.1" already present on machine
      Normal   Started    83s (x2 over 113s)  kubelet            Started container identity
      Warning  Unhealthy  82s (x6 over 111s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 503
    

    Regards

  • Please share the output of these commands.
    kubectl -n linkerd describe pod linkerd-destination-75cdb6c9c-nrztn
    kubectl -n linkerd describe pod linkerd-proxy-injector-6b5699bdcc-75pbx
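
    It may also be worth testing the failing probe endpoints directly from the node that runs the identity pod, using the IP and port reported in the events above. If the Pod IP is not reachable from the node, that points at a pod-network/routing problem rather than at Linkerd itself:

    $ curl -v --max-time 3 http://192.168.168.110:9990/ping
    $ curl -v --max-time 3 http://192.168.168.110:9990/ready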

  • t.sander
    t.sander Posts: 6
    edited January 2022

    These are the complete outputs of both pods (see the attached files).

  • According to the provided logs, your Pods are working fine. Can you share the latest result of kubectl get pods -n linkerd?

  • kubectl -n linkerd get pods 
    NAME                                      READY   STATUS             RESTARTS          AGE
    linkerd-destination-5df7f655b5-zbk2b      0/4     PodInitializing    0                 19h
    linkerd-heartbeat-27393840-2s6pn          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-d7n6n          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-j848j          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-jcgb2          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-m8nw4          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-s5jjm          0/1     Error              0                 19h
    linkerd-heartbeat-27393840-vj6cl          0/1     Error              0                 19h
    linkerd-identity-54795b9f9f-p4xfg         0/2     CrashLoopBackOff   367 (4m49s ago)   19h
    linkerd-proxy-injector-56b89fc6d4-p98qg   0/2     PodInitializing    0                 19h
    

    The main problem is shown by the log of the identity container hosted by the linkerd-identity pod

    time="2022-02-01T07:49:07Z" level=fatal msg="Failed to initialize identity service: Post \"https://10.96.0.1:443/apis/authorization.k8s.io/v1/selfsubjectaccessreviews\": dial tcp 10.96.0.1:443: i/o timeout"
    
  • Try deleting the identity pod, then check it again and wait for the pod status:
    kubectl delete pod -n linkerd linkerd-identity-54795b9f9f-p4xfg
    Otherwise, let me know and we will set up a call to resolve this issue.
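
    Another thing worth checking, given the i/o timeout to 10.96.0.1:443 in the identity log, is whether any Pod can reach the kube-apiserver over the service network at all. A throwaway curl Pod is one way to test this (the image and Pod name are just examples, not a lab step):

    $ kubectl run apitest --rm -it --restart=Never \
        --image=curlimages/curl --command -- \
        curl -sk --max-time 5 https://10.96.0.1:443/version

    If that also times out, the problem is cluster networking rather than Linkerd.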

  • @alihasanahmedk
    As expected, the pod is failing again after deleting it:

    kubectl -n linkerd get pod
    NAME                                      READY   STATUS             RESTARTS      AGE
    linkerd-destination-5df7f655b5-zbk2b      0/4     PodInitializing    0             25h
    linkerd-heartbeat-27395280-28kl8          0/1     Error              0             66m
    linkerd-heartbeat-27395280-659gq          0/1     Error              0             69m
    linkerd-heartbeat-27395280-84bqq          0/1     Error              0             77m
    linkerd-heartbeat-27395280-bxfmp          0/1     Error              0             72m
    linkerd-heartbeat-27395280-dpd75          0/1     Error              0             83m
    linkerd-heartbeat-27395280-j9clg          0/1     Error              0             75m
    linkerd-heartbeat-27395280-mmlhp          0/1     Error              0             80m
    linkerd-identity-54795b9f9f-rd44s         0/2     CrashLoopBackOff   4 (65s ago)   4m
    linkerd-proxy-injector-56b89fc6d4-p98qg   0/2     PodInitializing    0             25h
    
  • alihasanahmedk
    alihasanahmedk Posts: 34
    edited February 2022

    @t.sander, we can arrange an online meeting to resolve this issue. You can reach me at alihasanahmedkhan@gmail.com.

  • Hi @t.sander,

    Your describe attachments indicate that your Pods and Nodes may use overlapping subnets. If that is the case, the networking inside your cluster is impacted as a result. Is your Calico using the default 192.168.0.0/16 network? Are your nodes assigned IP addresses from a 192.168.0.0/x subnet?
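
    A quick way to compare the ranges (assuming the manifest-based Calico install from the labs, where the default pool is an IPPool resource named default-ipv4-ippool):

    $ kubectl get nodes -o wide
    $ kubectl -n linkerd get pods -o wide
    $ kubectl get ippools.crd.projectcalico.org default-ipv4-ippool -o jsonpath='{.spec.cidr}{"\n"}'

    The first two show the node and Pod IPs; the last prints the Pod CIDR Calico is actually using.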

    Regards,
    -Chris

  • Hi @chrispokorni,
    I followed the instructions from the 3.1 Labs and set CALICO_IPV4POOL_CIDR to 192.168.0.0/16 in calico.yaml and edited the kubeadm-config.yaml accordingly.

    The nodes have the following IP addresses:
    192.168.149.111/24 (k8scp)
    192.168.149.112/24 (worker)
    so they are on the 192.168.149.0/24 network.

    The linkerd-identity pod got 192.168.168.109/32.

  • Hi @t.sander,

    As suspected, the two networks overlap, and it eventually causes routing issues within the cluster.

    This can be resolved by ensuring that the two networks do not overlap - either by altering the CIDR in calico.yaml and kubeadm-config.yaml before cluster init, or by provisioning the VMs with IP addresses from a different subnet.
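
    A sketch of the two places to change before re-initializing - 10.244.0.0/16 is only an example of a range that does not overlap with 192.168.149.0/24, and only the relevant fields are shown:

    # calico.yaml (calico-node DaemonSet environment)
    - name: CALICO_IPV4POOL_CIDR
      value: "10.244.0.0/16"

    # kubeadm-config.yaml (passed to kubeadm init --config)
    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    kubernetesVersion: 1.23.0
    networking:
      podSubnet: 10.244.0.0/16

    Because the Pod CIDR cannot easily be changed on a running cluster, this effectively means a kubeadm reset followed by a fresh kubeadm init.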

    Regards,
    -Chris
