
Lab 8.3: NFS volume fails to mount

Hello,

I was going to ask for some guidance, but I just figured out the solution. I'll post it here in case anyone encounters the same problem.

I was stuck between steps 7 and 8 of lab 8.3.

The PV and PVC were created OK, but the pod created in step 6 failed to mount the NFS volume, and I was uncertain whether I had made a mistake or there was another problem.

kubectl get pods:

NAME                         READY   STATUS              RESTARTS   AGE
nginx-nfs-5f58fd64fd-qsqs8   0/1     ContainerCreating   0          16m

kubectl describe pod nginx-nfs-5f58fd64fd-qsqs8:

Name:           nginx-nfs-5f58fd64fd-qsqs8
Namespace:      default
Priority:       0
Node:           k8s-worker/172.31.46.65
Start Time:     Wed, 21 Jul 2021 16:14:36 +0000
Labels:         pod-template-hash=5f58fd64fd
                run=nginx
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/nginx-nfs-5f58fd64fd
Containers:
  nginx:
    Container ID:
    Image:          nginx
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /opt from nfs-vol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jg5t5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  nfs-vol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-one
    ReadOnly:   false
  kube-api-access-jg5t5:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                   From               Message
  ----     ------       ----                  ----               -------
  Normal   Scheduled    16m                   default-scheduler  Successfully assigned default/nginx-nfs-5f58fd64fd-qsqs8 to k8s-worker
  Warning  FailedMount  3m4s (x3 over 9m54s)  kubelet            Unable to attach or mount volumes: unmounted volumes=[nfs-vol], unattached volumes=[kube-api-access-jg5t5 nfs-vol]: timed out waiting for the condition
  Warning  FailedMount  47s (x4 over 14m)     kubelet            Unable to attach or mount volumes: unmounted volumes=[nfs-vol], unattached volumes=[nfs-vol kube-api-access-jg5t5]: timed out waiting for the condition
  Warning  FailedMount  10s (x16 over 16m)    kubelet            MountVolume.SetUp failed for volume "pvvol-1" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs k8scp:/opt/sfw /var/lib/kubelet/pods/c79023bd-4e26-4ffd-b065-a198c8c03303/volumes/kubernetes.io~nfs/pvvol-1
Output: mount: /var/lib/kubelet/pods/c79023bd-4e26-4ffd-b065-a198c8c03303/volumes/kubernetes.io~nfs/pvvol-1: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.

Googling the error message suggested that nfs-common was not installed, but it was installed on the cp node. While preparing this post I realized the pod is running on the worker node, which didn't have it.

Running sudo apt -y install nfs-common there and then recreating the pod resolved the problem.
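For anyone who wants to check a node before recreating the pod, here is a minimal sketch that tests for the mount.nfs helper named in the error above. The function name check_nfs_helper and its optional directory argument are my own additions for illustration; the /sbin default follows the path mentioned in the mount error.

```shell
# Sketch: report whether the NFS mount helper used by `mount -t nfs` exists.
# check_nfs_helper is a hypothetical helper, not part of the lab material.
# Optional argument: directory to search (defaults to /sbin).
check_nfs_helper() {
    helper_dir="${1:-/sbin}"
    if [ -x "$helper_dir/mount.nfs" ]; then
        echo "present"
    else
        echo "missing"
    fi
}
```

Run check_nfs_helper on the node where the pod was scheduled (kubectl get pod -o wide shows the node). If it prints "missing", installing nfs-common on that node is the likely fix.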

I see now that I missed that step 5 of lab 8.2 should have been run on the worker node, so it was my mistake.

Comments

  • Thanks for this hint!

  • Hi, I did the whole thing, including installing "nfs-common", and I am not making any progress. Has anyone else faced the same problem?

  • Hi @lzambra,

    Please provide the sequence of commands you executed on each node as part of the NFS installation, and their corresponding outputs.

    Regards,
    -Chris

  • OK, some info for anyone who might face the same issue:

    The previous lab uses the namespace "small". If you then create the new PVC, it is also named "pvc-one", so you end up with two claims with the same name. I removed the pvc-one that belonged to "small" and recreated the PVC without a namespace, but that still didn't work. The actual issue was that the PersistentVolume had not been created. Once I created the PersistentVolume (PVol.yaml), "pvc.yaml" worked!

  • vbalas

    Hi, I also encountered this error, with the pod failing to mount the NFS volume. However, in my case the specific error was "Output: mount.nfs: Failed to resolve server cp: Name or service not known".

    I did not have "cp" defined in /etc/hosts on the nodes, just "k8scp" as stated in the labs. There are many reports online of a pod being unable to resolve the NFS server hostname even when DNS is working fine, and the solution was to use the IP address of the "cp" node in PVol.yaml instead of the hostname:

    server: 1.2.3.4
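
A rough sketch of making that change from the shell, in case it helps. The helper name set_nfs_server is made up for illustration, and 1.2.3.4 is the same placeholder as above, not a real address. It only prints the edited manifest, so the original PVol.yaml is left untouched.

```shell
# Sketch: print a PV manifest with the value of its "server:" line replaced.
# set_nfs_server is a hypothetical helper. It assumes the manifest has a
# single "server:" key, as in the lab's PVol.yaml.
set_nfs_server() {
    manifest="$1"
    server="$2"
    # Keep the original indentation; anything after "server:" (including a
    # trailing comment) is replaced by the new value.
    sed -E "s|^([[:space:]]*server:).*|\1 ${server}|" "$manifest"
}
```

For example, set_nfs_server PVol.yaml 1.2.3.4 > PVol-ip.yaml writes an edited copy that can be reviewed before the PersistentVolume is recreated from it.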

  • Hi,
    I have followed these steps and yet I encounter this issue in lab 9.3 (previously 8.3):
    kubectl describe pod

    Name:             sakshi-nginx-nfs-68686f6d59-tljf7
    Namespace:        default
    Priority:         0
    Service Account:  default
    Node:             sakshi-k8s-worker-node/134.221.126.56
    Start Time:       Tue, 13 Aug 2024 13:11:12 +0000
    Labels:           pod-template-hash=68686f6d59
                      run=nginx
    Annotations:      <none>
    Status:           Pending
    IP:
    IPs:              <none>
    Controlled By:    ReplicaSet/sakshi-nginx-nfs-68686f6d59
    Containers:
      nginx:
        Container ID:
        Image:          nginx
        Image ID:
        Port:           80/TCP
        Host Port:      0/TCP
        State:          Waiting
          Reason:       ContainerCreating
        Ready:          False
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /opt from nfs-vol (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mqqj7 (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      nfs-vol:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  pvc-one
        ReadOnly:   false
      kube-api-access-mqqj7:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason       Age                  From               Message
      ----     ------       ----                 ----               -------
      Normal   Scheduled    10m                  default-scheduler  Successfully assigned default/sakshi-nginx-nfs-68686f6d59-tljf7 to sakshi-k8s-worker-node
      Warning  FailedMount  97s (x4 over 7m57s)  kubelet            MountVolume.SetUp failed for volume "pvvol-1" : mount failed: exit status 32
    Mounting command: mount
    Mounting arguments: -t nfs sakshi-k8s-cp-node:/opt/sfw /var/lib/kubelet/pods/a15e58a8-2031-4658-9a31-133257090160/volumes/kubernetes.io~nfs/pvvol-1
    Output: mount.nfs: Connection timed out
    

    This is what my PVol.yaml looks like:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pvvol-1
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      nfs:
        path: /opt/sfw
        server: sakshi-k8s-cp-node   #<-- Edit to match cp node
        readOnly: false
    
    

    Can anyone suggest a solution, or has someone else faced something similar?

  • chrispokorni

    Hi @sakshi1120,

    Typically, completing steps 1 through 4 of lab exercise 9.2 (Creating a Persistent NFS Volume) on the control plane node, followed by step 5 of lab exercise 9.2 on the worker node, is sufficient to set up the NFS server on the control plane node and the NFS client on the worker node, respectively. Ensure that the steps work with the k8scp alias as presented, and/or with the control plane node hostname.

    However, network settings may affect these operations and, eventually, the outcome of the subsequent lab exercise 9.3. In this particular scenario, can the worker node resolve the control plane node hostname? If in doubt, you can add the control plane node hostname alongside the recommended k8scp alias to the /etc/hosts file of the worker node. The updated entry should look like this: cp-private-IP k8scp cp-node-hostname (where you substitute cp-private-IP with the private IP of your control plane node, and cp-node-hostname with the hostname of your control plane node).
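
To verify such an entry, something along these lines could be run on the worker node. This is only a sketch: hosts_has_name is a hypothetical helper that reads a hosts-format file and reports whether a given name appears as a hostname or alias on a non-comment line.

```shell
# Sketch: print "yes" if NAME is listed as a hostname or alias in a
# hosts-format file, otherwise "no". hosts_has_name is a hypothetical helper.
# Usage: hosts_has_name FILE NAME
hosts_has_name() {
    hosts_file="$1"
    name="$2"
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }            # skip full-line comments
        {
            # field 1 is the IP address; names start at field 2
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == n) found = 1
            }
        }
        END { print (found ? "yes" : "no") }
    ' "$hosts_file"
}
```

For example, hosts_has_name /etc/hosts k8scp checks the alias, and the same call with the control plane node hostname checks the second name. Running getent hosts k8scp on the worker node is another quick way to see what the name actually resolves to.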

    Try to aim for consistency; if the k8scp alias works in lab exercise 9.2 (step 5), then use the same k8scp alias in the PV definition manifest in step 6 of lab exercise 9.2, otherwise, attempt the same with the control plane node hostname instead. I would recommend using the hostname approach because the solution remains operational even in chapter 16 when the k8scp alias will be assigned to another server.

    Regards,
    -Chris

  • Thanks Chris! This worked! I had used the control plane hostname in the PV definition manifest, which the worker node couldn't resolve. Using the k8scp alias solved it.

  • darkozepp

    Thank you very much.
