
Lab 3.1: Cilium endpoints are not reachable

I have installed a K8s cluster on AWS EC2 instances following Lab 3.1.

Ubuntu 24.04
K8S 1.30.1
Cilium 1.16.1

AWS VPC: 10.0.0.0/16
POD CIDR: 192.168.0.0/24

The hosts are able to see each other, but the Cilium endpoints are not reachable. What might be the problem?

ubuntu@ip-10-0-1-70:~$ kubectl get nodes
NAME      STATUS   ROLES           AGE     VERSION
k8scp     Ready    control-plane   7m55s   v1.30.1
worker1   Ready    <none>          6m39s   v1.30.1

ubuntu@ip-10-0-1-70:~$ kubectl get pods -n kube-system
NAME                               READY   STATUS    RESTARTS   AGE
cilium-2lc9m                       1/1     Running   0          7m38s
cilium-envoy-85hfc                 1/1     Running   0          7m38s
cilium-envoy-vnh2z                 1/1     Running   0          7m12s
cilium-fh2tk                       1/1     Running   0          7m12s
cilium-operator-64767f6566-jrqzm   1/1     Running   0          7m38s
cilium-operator-64767f6566-rpvmh   1/1     Running   0          7m38s
coredns-7db6d8ff4d-lpmcp           1/1     Running   0          8m11s
coredns-7db6d8ff4d-vkk9h           1/1     Running   0          8m11s
etcd-k8scp                         1/1     Running   0          8m25s
kube-apiserver-k8scp               1/1     Running   0          8m26s
kube-controller-manager-k8scp      1/1     Running   0          8m25s
kube-proxy-q7m68                   1/1     Running   0          8m11s
kube-proxy-tj6kr                   1/1     Running   0          7m12s
kube-scheduler-k8scp               1/1     Running   0          8m25s

root@ip-10-0-1-70:/home/cilium# cilium-dbg status
KVStore: Ok Disabled
Kubernetes: Ok 1.30 (v1.30.1) [linux/amd64]
Kubernetes APIs: ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement: False
Host firewall: Disabled
SRv6: Disabled
CNI Chaining: none
CNI Config file: successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium: Ok 1.16.1 (v1.16.1-68579055)
NodeMonitor: Listening for events on 15 CPUs with 64x4096 of shared memory
Cilium health daemon: Ok
IPAM: IPv4: 4/254 allocated from 192.168.0.0/24,
IPv4 BIG TCP: Disabled
IPv6 BIG TCP: Disabled
BandwidthManager: Disabled
Routing: Network: Tunnel [vxlan] Host: Legacy
Attach Mode: TCX
Device Mode: veth
Masquerading: IPTables [IPv4: Enabled, IPv6: Disabled]
Controller Status: 29/29 healthy
Proxy Status: OK, ip 192.168.0.199, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range: min 256, max 65535
Hubble: Ok Current/Max Flows: 1612/4095 (39.37%), Flows/s: 3.31 Metrics: Disabled
Encryption: Disabled
Cluster health: 1/2 reachable (2025-02-21T19:26:11Z)
  Name      IP           Node        Endpoints
  worker1   10.0.1.152   reachable   unreachable
Modules Health: Stopped(0) Degraded(0) OK(40)
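
For anyone debugging the same "Endpoints unreachable" symptom: Cilium's node-to-node health checks and the VXLAN overlay use their own ports, which the AWS security groups must allow in addition to whatever already lets the hosts reach each other over SSH or ICMP. A quick reachability check from the control plane, assuming Cilium's default ports and the worker IP shown above (nc from netcat-openbsd):

# TCP 4240 is the cilium-health port, UDP 8472 is the default VXLAN port
nc -zv 10.0.1.152 4240
nc -zvu 10.0.1.152 8472    # UDP probes with nc are best-effort only

If the TCP 4240 probe fails while SSH between the nodes works, the security group rules are the first thing to check.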

Answers

  • Not able to attach files here, so pasting the relevant log lines instead:

    time="2025-02-21T19:18:06Z" level=info msg=" --cluster-pool-ipv4-cidr='192.168.0.0/16'" subsys=daemon
    time="2025-02-21T19:18:06Z" level=info msg=" --cluster-pool-ipv4-mask-size='24'" subsys=daemon

  • chrispokorni

    Hi @miro.gospodinov,

    To ensure a proper setup of your AWS EC2 instances, VPC, and SGs, please review the demo video guide from the introductory chapter of the course.

    Also, ensure consistency between the podSubnet CIDR value in the kubeadm-config.yaml manifest and the cluster-pool-ipv4-cidr value in the cilium-cni.yaml manifest. I recommend double-checking both manifests before initializing the cluster and editing them if necessary so that both are set to 192.168.0.0/16, to avoid conflicts with the EC2 private IP addresses or with the default Kubernetes Service subnet. Using inconsistent ranges, such as /16 in one place and /24 somewhere else, may cause routing conflicts in the cluster (a minimal example of both settings is sketched after the replies below).

    From the output, it seems your Nodes are Ready, and all the Pods are Running - as they should be.

    Regards,
    -Chris

  • Hello Chris,

    The issue has been resolved. The root cause was a misconfiguration of the security groups; after allowing all traffic ("All traffic") between the nodes, the Cilium endpoints became reachable.
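
For reference, here is a minimal sketch of the two settings Chris mentions, assuming the lab's usual kubeadm-config.yaml and cilium-cni.yaml layout (the actual manifests in the course may differ slightly); the point is that networking.podSubnet and cluster-pool-ipv4-cidr name the same 192.168.0.0/16 range:

    # kubeadm-config.yaml (excerpt)
    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    kubernetesVersion: 1.30.1
    networking:
      podSubnet: 192.168.0.0/16

    # cilium-cni.yaml -- cilium-config ConfigMap (excerpt)
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cilium-config
      namespace: kube-system
    data:
      ipam: cluster-pool
      cluster-pool-ipv4-cidr: 192.168.0.0/16
      cluster-pool-ipv4-mask-size: "24"

With both set to the same range, the log lines quoted in the first reply (--cluster-pool-ipv4-cidr='192.168.0.0/16') match what kubeadm was asked to allocate.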
