
LAB 2.4 Adding fluentd to basic.yaml failing to start

kelloggpa
kelloggpa Posts: 3
edited November 2022 in LFD259 Class Forum

Good day,

I am running my cp and worker nodes as VMs via Parallels Desktop on my MacBook Pro M1. After making some tweaks to the setup scripts, I have my cp and worker nodes running on Ubuntu 20.04 arm64 VMs. I've been able to expose the nginx pod to my host OS.

Now I've added the fdlogger container to my basic.yaml pod, but it is failing to start. I can only get 1/2 containers running in the pod, with kubectl get pod reporting:

NAME       READY   STATUS             RESTARTS     AGE
basicpod   1/2     CrashLoopBackOff   1 (4s ago)   6s

Here is my current basic.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: basicpod
  labels:
    type: webserver
spec:
  containers:
  - name: webcont
    image: nginx
    ports:
    - containerPort: 80
  - name: fdlogger
    image: fluent/fluentd

Does anyone have some pointers on how I can further troubleshoot?

This is about all I've been able to find, and I don't know how to look further:

% kubectl logs --since=1h -c fdlogger basicpod
exec /bin/entrypoint.sh: exec format error

Thank you for any assistance you can provide,
Phil

Best Answer

  • kelloggpa
    kelloggpa Posts: 3
    Answer ✓

    Hi @chrispokorni,

    I was able to resolve my issue. It seems that nginx publishes a 'multi-platform' image, but fluentd still has multiple architecture-specific images. I appended the tag edge-debian-arm64 to the image declaration, restarted, and it worked. I'm not sure if that is the tag I ultimately want, but for now that should be fine.

    I had tried edge-debian-armhf without success earlier, but missed the 'arm64' version until now.

    I think the default behavior, i.e., when an image is specified without a tag, is that the pod pulls the image tagged latest. If this is indeed true, then I was pulling a 4-year-old image built for amd64.

    Thanks for your help.

    Phil
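
The architecture mismatch described above can be checked before picking a tag. The following is a minimal sketch, not part of the lab material: `normalize_arch` and `node_arches` are hypothetical helper names, and the mapping only covers the architectures mentioned in this thread. It assumes the nodes report their architecture in `.status.nodeInfo.architecture`.

```shell
#!/bin/sh
# Map a kernel machine name (as printed by `uname -m`) to the architecture
# name commonly used in container image tags and manifests.
# Hypothetical helper; only covers the architectures discussed in this thread.
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    armv7l|armhf)  echo "arm" ;;
    *)             echo "unknown" ;;
  esac
}

# Print the architecture each node reports to the API server.
# Defined but not called here, because it needs a reachable cluster.
node_arches() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.architecture}{"\n"}{end}'
}

# Architecture of the machine this script runs on.
echo "this machine: $(normalize_arch "$(uname -m)")"
```

Running `node_arches` on the cp node should list each node with its architecture. If a Docker CLI is available, `docker manifest inspect fluent/fluentd:edge-debian` is one way to see which platforms a given tag provides.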

Answers

  • Hi @kelloggpa,

    The logs command returns output only if the container is running and therefore producing logs.

    Would you be able to provide the output of the following commands?

    kubectl describe pod basicpod

    kubectl get pod -A -o wide

    Regards,
    -Chris
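
For a container that keeps crashing, the output of its previous instance can often still be read with the `--previous` flag of `kubectl logs`. A minimal sketch, assuming the pod and container names from this thread; `crash_logs` and `last_exit_code` are hypothetical wrapper names, not kubectl subcommands:

```shell
#!/bin/sh
# Show the logs of the previous (terminated) instance of one container.
crash_logs() {
  pod="$1"; container="$2"
  kubectl logs "$pod" -c "$container" --previous
}

# Show the exit code of the container's last terminated instance.
last_exit_code() {
  pod="$1"; container="$2"
  kubectl get pod "$pod" \
    -o jsonpath="{.status.containerStatuses[?(@.name==\"$container\")].lastState.terminated.exitCode}"
}
```

Usage would look like `crash_logs basicpod fdlogger` followed by `last_exit_code basicpod fdlogger`.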

  • kelloggpa
    kelloggpa Posts: 3
    edited November 2022

    Hi @chrispokorni,

    Thanks for your reply. Here is the information you requested:

    % kubectl describe pod basicpod
    Name:         basicpod
    Namespace:    default
    Priority:     0
    Node:         kube-worker/10.211.55.13
    Start Time:   Sun, 06 Nov 2022 11:43:40 -0700
    Labels:       type=webserver
    Annotations:  cni.projectcalico.org/containerID: c626e7ff191220ef23aaf4699f6b6b1033e71234a2df3782a9ed1b3f261ea03a
                  cni.projectcalico.org/podIP: 192.168.73.140/32
                  cni.projectcalico.org/podIPs: 192.168.73.140/32
    Status:       Running
    IP:           192.168.73.140
    IPs:
      IP:  192.168.73.140
    Containers:
      webcont:
        Container ID:   containerd://797264fe2365e0235d4220d3ddb00b479d913bcec263d5d7aef9e3c985ddc8c3
        Image:          nginx
        Image ID:       docker.io/library/nginx@sha256:943c25b4b66b332184d5ba6bb18234273551593016c0e0ae906bab111548239f
        Port:           80/TCP
        Host Port:      0/TCP
        State:          Running
          Started:      Sun, 06 Nov 2022 11:43:41 -0700
        Ready:          True
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5wpkz (ro)
      fdlogger:
        Container ID:   containerd://fefa39d4c3939f8e6a1bf2989749355f0cfa4c416204cce308ef01342074400d
        Image:          fluent/fluentd
        Image ID:       docker.io/fluent/fluentd@sha256:7eece00d1bc784ac1e9722b2580911cd3ead5afd740dad6594be945b3b1dd884
        Port:           <none>
        Host Port:      <none>
        State:          Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Sun, 06 Nov 2022 16:17:22 -0700
          Finished:     Sun, 06 Nov 2022 16:17:22 -0700
        Last State:     Terminated
          Reason:       Error
          Exit Code:    1
          Started:      Sun, 06 Nov 2022 15:50:35 -0700
          Finished:     Sun, 06 Nov 2022 15:50:35 -0700
        Ready:          False
        Restart Count:  23
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5wpkz (ro)
    Conditions:
      Type              Status
      Initialized       True 
      Ready             False 
      ContainersReady   False 
      PodScheduled      True 
    Volumes:
      kube-api-access-5wpkz:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason   Age                     From     Message
      ----     ------   ----                    ----     -------
      Warning  BackOff  170m (x277 over 4h33m)  kubelet  Back-off restarting failed container
      Normal   Pulling  15s (x25 over 4h33m)    kubelet  Pulling image "fluent/fluentd"
    
    % kubectl get pod -A -o wide
    NAMESPACE     NAME                                      READY   STATUS             RESTARTS         AGE     IP               NODE          NOMINATED NODE   READINESS GATES
    default       basicpod                                  1/2     CrashLoopBackOff   23 (2m25s ago)   4h36m   192.168.73.140   kube-worker   <none>           <none>
    kube-system   calico-kube-controllers-66bfd4dbc-9hxrg   1/1     Running            0                29h     192.168.55.131   kube-cp       <none>           <none>
    kube-system   calico-node-5bwvx                         1/1     Running            0                21h     10.211.55.13     kube-worker   <none>           <none>
    kube-system   calico-node-rzrb4                         1/1     Running            0                29h     10.211.55.12     kube-cp       <none>           <none>
    kube-system   coredns-6d4b75cb6d-6hjb7                  1/1     Running            0                29h     192.168.55.129   kube-cp       <none>           <none>
    kube-system   coredns-6d4b75cb6d-lmrgf                  1/1     Running            0                29h     192.168.55.130   kube-cp       <none>           <none>
    kube-system   etcd-kube-cp                              1/1     Running            0                29h     10.211.55.12     kube-cp       <none>           <none>
    kube-system   kube-apiserver-kube-cp                    1/1     Running            0                29h     10.211.55.12     kube-cp       <none>           <none>
    kube-system   kube-controller-manager-kube-cp           1/1     Running            1 (22h ago)      29h     10.211.55.12     kube-cp       <none>           <none>
    kube-system   kube-proxy-6h89h                          1/1     Running            0                29h     10.211.55.12     kube-cp       <none>           <none>
    kube-system   kube-proxy-hm87k                          1/1     Running            0                21h     10.211.55.13     kube-worker   <none>           <none>
    kube-system   kube-scheduler-kube-cp                    1/1     Running            1 (22h ago)      29h     10.211.55.12     kube-cp       <none>           <none>
    
  • tjuanico
    tjuanico Posts: 4

    Same issue on my arm64 laptop. This entry saved me a few hours of googling. Thanks!

  • ashfaqahmed
    ashfaqahmed Posts: 1
    edited July 2023

    Same here. Without any tag, I get an image-not-found error. I am using Google Compute Engine nodes with Ubuntu 20.04.

    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Normal   Scheduled  17s                default-scheduler  Successfully assigned default/basicpod to worker
      Normal   Pulling    16s                kubelet            Pulling image "nginx"
      Normal   Pulled     16s                kubelet            Successfully pulled image "nginx" in 294.211729ms (294.23196ms including waiting)
      Normal   Created    16s                kubelet            Created container webcont
      Normal   Started    16s                kubelet            Started container webcont
      Normal   Pulling    16s                kubelet            Pulling image "fluent/fluentd"
      Warning  Failed     16s                kubelet            Failed to pull image "fluent/fluentd": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/fluent/fluentd:latest": failed to resolve reference "docker.io/fluent/fluentd:latest": docker.io/fluent/fluentd:latest: not found
      Warning  Failed     16s                kubelet            Error: ErrImagePull
      Normal   BackOff    14s (x2 over 15s)  kubelet            Back-off pulling image "fluent/fluentd"
      Warning  Failed     14s (x2 over 15s)  kubelet            Error: ImagePullBackOff
    
    

    When I use the above-mentioned tag edge-debian-arm64, the image is pulled successfully, but the flogger container is unable to start:

    Events:
      Type     Reason     Age                     From               Message
      ----     ------     ----                    ----               -------
      Normal   Scheduled  7m39s                   default-scheduler  Successfully assigned default/basicpod to worker
      Normal   Pulling    7m38s                   kubelet            Pulling image "nginx"
      Normal   Pulled     7m38s                   kubelet            Successfully pulled image "nginx" in 355.034454ms (355.095081ms including waiting)
      Normal   Created    7m38s                   kubelet            Created container webcont
      Normal   Started    7m38s                   kubelet            Started container webcont
      Normal   Pulling    7m38s                   kubelet            Pulling image "fluent/fluentd:edge-debian-arm64"
      Normal   Pulled     7m30s                   kubelet            Successfully pulled image "fluent/fluentd:edge-debian-arm64" in 7.670395324s (7.670421793s including waiting)
      Normal   Created    6m44s (x4 over 7m30s)   kubelet            Created container flogger
      Normal   Started    6m44s (x4 over 7m30s)   kubelet            Started container flogger
      Normal   Pulled     6m44s (x3 over 7m28s)   kubelet            Container image "fluent/fluentd:edge-debian-arm64" already present on machine
      Warning  BackOff    2m31s (x24 over 7m27s)  kubelet            Back-off restarting failed container flogger in pod basicpod_default(a83c00d0-0c5b-4a13-a947-72d44cf0b4cd)
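
The two event listings above show two different failures: the first never gets an image (ErrImagePull / ImagePullBackOff), while the second pulls the image but the container exits right after starting (BackOff restarting failed container). A rough sketch of how the symptoms in this thread map to likely causes; `diagnose` is a hypothetical helper, and its suggestions are only the ones raised in this discussion:

```shell
#!/bin/sh
# Suggest a likely cause from a pod STATUS value and, optionally, a log line.
# Hypothetical helper based only on the cases seen in this thread.
diagnose() {
  status="$1"; log_line="${2:-}"
  case "$status" in
    ErrImagePull|ImagePullBackOff)
      echo "image or tag could not be pulled; check the image name and tag" ;;
    CrashLoopBackOff)
      case "$log_line" in
        *"exec format error"*)
          echo "image architecture does not match the node" ;;
        *)
          echo "container starts and exits; check its logs" ;;
      esac ;;
    *)
      echo "no suggestion for status: $status" ;;
  esac
}
```

For example, `diagnose CrashLoopBackOff "exec /bin/entrypoint.sh: exec format error"` points at an architecture mismatch, which would be consistent with running an arm64-only tag on x86-64 nodes.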
    
    
  • @chrispokorni

    fluent/fluentd:edge-debian-arm64

    Same issue as @ashfaqahmed describes above.

    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Normal   Scheduled  83s                default-scheduler  Successfully assigned default/basicpod to worker
      Normal   Pulling    82s                kubelet            Pulling image "nginx"
      Normal   Pulled     82s                kubelet            Successfully pulled image "nginx" in 327.712348ms (327.721573ms including waiting)
      Normal   Created    82s                kubelet            Created container webcont
      Normal   Started    82s                kubelet            Started container webcont
      Normal   Pulling    82s                kubelet            Pulling image "fluent/fluentd:edge-debian-arm64"
      Normal   Pulled     75s                kubelet            Successfully pulled image "fluent/fluentd:edge-debian-arm64" in 6.607935264s (6.607946163s including waiting)
      Normal   Created    37s (x4 over 75s)  kubelet            Created container fdlogger
      Normal   Started    37s (x4 over 75s)  kubelet            Started container fdlogger
      Normal   Pulled     37s (x3 over 73s)  kubelet            Container image "fluent/fluentd:edge-debian-arm64" already present on machine
      Warning  BackOff    11s (x6 over 72s)  kubelet            Back-off restarting failed container fdlogger in pod basicpod_default(38cc413b-0e44-471a-949f-8a8d4062a513)

  • mkevinmchugh
    mkevinmchugh Posts: 17
    edited July 2023

    @ashfaqahmed @chrispokorni

    @chrispokorni - first, thank you for your input above. I would never have found an answer without those clues.

    After a bit of poking around (and searching on Chris' tag), I found edge-debian. I am on GCE, where the architecture is x86-64. I was honestly guessing a bit, but I tried it and it worked.

    https://hub.docker.com/r/fluent/fluentd/

    "v1.16.1-debian-1.0, v1.16-debian-1, edge-debian (multiarch image for arm64(AArch64) and amd64(x86_64))"

  • zite
    zite Posts: 11

    Hi, I'm also having trouble running the fluentd.
    I receive:
    NAME       READY   STATUS             RESTARTS   AGE
    basicpod   1/2     ImagePullBackOff   0          10s
    What should I do?

  • chrispokorni
    chrispokorni Posts: 2,273

    Hi @ashfaqahmed, @mkevinmchugh, @zite,

    It seems that on Docker Hub the Fluentd image repositories have been reorganized following the release of fluentd v1.

    Please update basic.yaml to use image: fluentd instead. It runs fluentd v0, the version tested in this lab exercise and in the later lab exercise 5.3. That image no longer seems to be maintained; however, image: fluent/fluentd:edge-debian, which runs fluentd v1, may not work in lab exercise 5.3.

    Regards,
    -Chris

  • Setting the image to fluent/fluentd:v1.16.2-1.0 resolved the issue for me:

    $ cat <<'HERE' | kubectl create --filename - 
    apiVersion: v1
    kind: Pod
    metadata:
      name: basicpod
      labels:
        type: webserver
    spec:
      containers:
      - name: webcont
        image: nginx
        ports:
        - containerPort: 80
      - name: fdlogger
        image: fluent/fluentd:v1.16.2-1.0 
    HERE
    pod/basicpod created
    $ kubectl get pod basicpod 
    NAME       READY   STATUS    RESTARTS   AGE
    basicpod   2/2     Running   0          38s
    
