
Lab 3.2 Section 15

I am stuck on Section 15 of Lab 3.2, which verifies that simpleapp is running on the second node using "sudo crictl ps". I set up the crictl config first, by mistake, and now I am getting the following error.

I know I have 6 simpleapp instances deployed from the cp node, so I am not sure what this error means: "FATA[0000] validate service connection: CRI v1 runtime API is not implemented for endpoint "unix:///run/containerd/containerd.sock": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"


Best Answer

  • chrispokorni

    Hi @kenneth.kon001,

    After some quick sanity testing, I can confirm that the custom containerd config is preserved across several cp and worker node reboots. This assumes the custom config is applied only once, following the recommended sequence, after local-repo-setup.sh has also been executed only once. No other edits of config.toml are needed; even after several reboots, the custom entries are preserved.
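
    A quick way to confirm the custom entries survived a reboot (a sketch; the exact section names depend on the edits your lab version makes to config.toml):

    sudo grep -n -A 3 'registry' /etc/containerd/config.toml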

    Per my earlier edit, the repo/simpleapp image needs to be pushed into the local registry again (step 12, command #2), because the registry catalog is cleared by the reboot. Without repopulating the registry catalog, the try1 Deployment's replicas will show ErrImagePull and CrashLoopBackOff statuses. Once the image push repopulates the registry, the try1 replicas will eventually retrieve the image and reach the Running state.
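
    For reference, the re-push looks something like this (a sketch assuming the $repo variable set by local-repo-setup.sh; substitute podman if that is what your lab version uses):

    # Repopulate the local registry after the reboot cleared its catalog
    sudo docker push $repo/simpleapp
    # Watch the try1 replicas recover once the image is pullable again
    kubectl get pods -l app=try1 -w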

    Regards,
    -Chris

Answers

  • Hi @kenneth.kon001,

    Your worker node does not seem to host any active workload. It is the same issue as before: a misconfigured containerd runtime. Per the earlier discussion, apply the same solution on the worker node as well to configure the containerd runtime, which in turn enables the crictl CLI.
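
    For anyone landing here later, a generic sketch of that fix (an assumption: the packaged config.toml shipped with the CRI plugin disabled, which produces exactly this Unimplemented error; re-apply any lab-specific registry entries afterwards):

    # On the worker node: regenerate a full default containerd config,
    # which enables the CRI plugin (packaged defaults often ship with
    # disabled_plugins = ["cri"])
    containerd config default | sudo tee /etc/containerd/config.toml
    # Re-apply the lab's custom registry entries to config.toml, then:
    sudo systemctl restart containerd
    # Point crictl at the containerd socket (writes /etc/crictl.yaml)
    sudo crictl config --set runtime-endpoint=unix:///run/containerd/containerd.sock
    sudo crictl ps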

    Regards,
    -Chris

  • Weird, I thought I ran the process on the worker node. Let me try that. @chrispokorni, I appreciate you messaging me back.

  • @chrispokorni

    Alright, this is what I thought. I did follow the steps you mentioned. However, when I do "sudo reboot", it resets the config.toml, so I have to repeat the step.

    So now I am running into another problem: I can run "sudo crictl ps", however I am not seeing the simpleapp container on the worker node. When I created the try1 Deployment of simpleapp from the cp node with "kubectl create deployment try1 --image=$repo/simpleapp", the node it was created on says cp instead of worker. Do you know what might be causing this issue?


  • chrispokorni

    Hi @kenneth.kon001,

    Your try1 Deployment's pod replicas are all hosted by the control plane node as a result of the scheduling process. I would expect some, though not necessarily all, of these six replicas to be scheduled onto the worker node as well. If none are scheduled onto the worker node, that may indicate an unreachable, unschedulable, or unhealthy worker node. Also, the "Terminating" status of all six replicas is not desired.

    From your output, it seems your worker node is only hosting the three networking infrastructure pods. What is the state of your cluster overall? What output is produced by:

    kubectl get nodes -o wide
    kubectl get pods -A -o wide
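
    If the worker shows Ready but still receives no replicas, checking for taints and scheduling failures can also help (a sketch; "worker" is a placeholder for your node's actual name):

    kubectl describe node worker | grep -i taint
    kubectl get events -A --field-selector reason=FailedScheduling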
    

    I will take another look at the containerd configuration solution and check it against node reboots.

    EDIT: The registry catalog is expected to be empty after a reboot, so a new simpleapp image push is necessary after a reboot in order to launch the try1 Deployment.
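
    You can verify the catalog state with the standard registry v2 API (a sketch, assuming $repo resolves to the registry's host:port):

    curl http://$repo/v2/_catalog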

    Regards,
    -Chris

  • Hi @chrispokorni,

    Pushing the repo/simpleapp image into the local registry did the trick. Thank you for the help. Hopefully I do not run into any more weird bugs.
