Lab 9.1 (Services and Endpoints) - Master can't curl to endpoints on Worker/Node
Hi, I was just going through all the labs, got to 9.1 and have hit an interesting problem.
When I create the NGINX deployment, then create the endpoints using the command:
kubectl -n accounting expose deployment nginx-one
It works fine, and I get the endpoints as below:
[username@k8smaster1 ~]$ kubectl -n accounting get endpoints nginx-one
NAME        ENDPOINTS                                 AGE
nginx-one   192.168.249.49:8080,192.168.249.50:8080   12m
The problem is that curl will only work from the worker nodes, not the master node
[username@k8smaster1 ~]$ curl -l 192.168.249.49:80
^C (after several minutes)
Whereas on the worker:
[username@k8snode1 ~]$ curl -l 192.168.249.49:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
I assume this has something to do with my networking setup, but since the kubectl commands on the master can return the information, and I can ping back and forth no problem from the two virtual machines, I'm a little stumped as to what would be causing this issue.
In all honesty, I'm not great on the networking side, so perhaps there is something wrong with the Calico networking that is set up, but I would really appreciate it if anyone could point me in some troubleshooting directions for this.
Best Answer
-
Hello,
If you are using Azure there are known problems, and have been for a while.
I would suggest you use AWS, GCE, Digital Ocean, VirtualBox, VMWare, QEMU/KVM, or bare metal instead, which the labs have been tested upon. The testing is primarily done using GCE VMs, but every so often I test on others. Only Azure, so far, has issues.
Regards,
Answers
-
Hi @psyrus,
Such timeouts are common when the VM to VM (or node to node) networking is not properly configured, outside of the Kubernetes cluster.
What are you using as infrastructure for your cluster? Are you in the cloud or a local hypervisor?
Regards,
-Chris
-
Hi @chrispokorni I am using virtual machines hosted in Azure.
They are on the same VNET, on the same subnet with no networking restrictions in place between the nodes.
The Azure layer networking has a CIDR range of:
10.0.22.0/23
The K8s Calico setup is such that:
[username@k8smaster1 ~]$ kubectl get configmaps -n kube-system kubeadm-config -o yaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: k8smaster:6443
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.20.6
    networking:
      dnsDomain: cluster.local
      podSubnet: 192.168.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
  ClusterStatus: |
    apiEndpoints:
      k8smaster1:
        advertiseAddress: 10.0.22.4
        bindPort: 6443
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
kind: ConfigMap
metadata:
  creationTimestamp: "2021-04-08T04:49:09Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:ClusterConfiguration: {}
        f:ClusterStatus: {}
    manager: kubeadm
    operation: Update
    time: "2021-04-08T04:49:09Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "10235"
  uid: 198fd9ad-8fe7-4b01-93b5-2ecb087dfdb0
I am curious if the service subnet is the problem. Both the nodes are available from the k8s perspective:
[username@k8smaster1 ~]$ kubectl get nodes
NAME         STATUS   ROLES                  AGE   VERSION
k8smaster1   Ready    control-plane,master   39d   v1.20.6
k8snode1     Ready    <none>                 39d   v1.20.6
-
Hi @psyrus,
As long as your networks do not overlap (Pods, Nodes, Services) your cluster should be fine.
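The overlap condition can be checked directly against the three ranges quoted in this thread (nodes 10.0.22.0/23, pods 192.168.0.0/16, services 10.96.0.0/12). Below is a minimal sketch in POSIX shell; the function names `ip_to_int`, `cidr_range`, and `cidrs_overlap` are made up for illustration and are not part of kubectl or Calico.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Print "first last" (as integers) for a CIDR block such as 10.96.0.0/12.
cidr_range() (
  base=$(ip_to_int "${1%/*}")
  prefix=${1#*/}
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  first=$(( base & mask ))
  last=$(( first | (~mask & 0xFFFFFFFF) ))
  echo "$first $last"
)

# Exit status 0 if the two CIDR blocks share at least one address.
cidrs_overlap() {
  # After this line: $1=first1 $2=last1 $3=first2 $4=last2
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Ranges as posted in this thread.
nodes=10.0.22.0/23
pods=192.168.0.0/16
services=10.96.0.0/12

for pair in "$nodes $pods" "$nodes $services" "$pods $services"; do
  set -- $pair
  if cidrs_overlap "$1" "$2"; then
    echo "OVERLAP: $1 and $2"
  else
    echo "ok: $1 and $2 do not overlap"
  fi
done
```

With these values all three pairs come out as non-overlapping, so the service subnet by itself does not look like the cause here.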
On Azure the network plugin may not behave as expected due to Azure's specific network implementation. For the Calico network plugin you can find Azure-specific installation details, or you may search the forum for solutions posted by learners who completed the labs on Azure.
Regards,
-Chris
-
Thanks @chrispokorni, I will check the Azure-specific stuff. I hadn't thought of that because I figured it's all virtualized and shouldn't make a difference unless I wanted to take specific advantage of Azure's networking 'special features'. Will post back after some investigation.
-
@chrispokorni FYI I am not using AKS-Engine, and instead running directly on virtual machines, and as such there is nothing special that needs to be done from a Calico networking perspective.
Quoted below for completeness:
https://docs.projectcalico.org/getting-started/kubernetes/self-managed-public-cloud/azure#other-options-and-tools
Other options and tools
Calico networking
You can also deploy Calico for both networking and policy enforcement. In this mode, Calico uses a VXLAN-based overlay network that masks the IP addresses of the pods from the underlying Azure VNET. This can be useful in large deployments or when running multiple clusters and IP address space is a big concern.
Unfortunately, aks-engine does not support this mode, so you must use a different tool chain to install and manage the cluster. Some options:
Use Terraform to provision the Azure networks and VMs, then kubeadm to install the Kubernetes cluster.
Use Kubespray
Are there any tips/tricks that I could run (kubectl interrogation commands) or Linux-level checks that I can do to root out the cause of the "iptables proxy mode" (as far as I can tell) misbehaving?
-
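As a starting point for that kind of interrogation, here is a hedged sketch of read-only checks that can be run on the master and on the worker to compare results. The function name `pod_network_checks` is hypothetical, the labels `k8s-app=calico-node` and `k8s-app=kube-proxy` are assumed to match the stock Calico and kubeadm manifests, and the defaults are the namespace and pod IP from the first post.

```shell
#!/bin/sh
# Read-only checks; nothing here changes cluster or host state.
# Usage: pod_network_checks [namespace] [pod-ip]
pod_network_checks() {
  ns=${1:-accounting}          # namespace used in the lab
  pod_ip=${2:-192.168.249.49}  # pod IP from the endpoints output

  echo "== pods and the nodes they run on"
  kubectl -n "$ns" get pods -o wide

  echo "== calico-node pods (expect one per node, Running and Ready)"
  kubectl -n kube-system get pods -l k8s-app=calico-node -o wide

  echo "== kube-proxy mode, if the startup line is still in the log"
  kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=-1 | grep -i proxier

  echo "== route this host would use to reach the pod"
  ip route get "$pod_ip"

  echo "== direct request to the pod, giving up after 5 seconds"
  curl --max-time 5 -sS -o /dev/null -w '%{http_code}\n' "http://$pod_ip:80"
}
```

Running `pod_network_checks` on both machines and comparing the route and curl results should at least show whether the master has a route to the worker's pod range and whether packets sent along it ever get an answer.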
Thanks for the information guys, although it doesn't really help with tracking down the problems. There really should be some way to troubleshoot these networking issues from the k8s side, but perhaps that's not what the forum is about. I will just deal with this myself.
-
Hi @psyrus,
You may also try one of my prior suggestions:
search the forum for solutions posted by learners who completed the labs on Azure
Between the LFS258 and LFD259 forums I am sure that lessons learned and suggestions have been shared for the benefit of future learners.
Regards,
-Chris
-
Hi @lhensley,
When provisioning the GCE instances, did you happen to follow the video guide from the introductory chapter? It also covers the VPC networking requirements and firewalls needed to enable traffic between your instances.
Regards,
-Chris
-
That's probably it. I was initially trying to use local hardware and after pretty quickly coming upon some unexpected behavior decided to use GCE instead - but did not backtrack and follow those steps closely.
-
Hi, I had the same problem trying to run Kubernetes on Azure VMs with Ubuntu Server.
I have one CP node and two worker nodes. No node can reach the pods running on other nodes with curl. I'm using Calico for pod networking. The problem is twofold: the Calico setup and Azure routing (or lack thereof).
To fix...
TL;DR: disable bird in the Calico setup and create an Azure routing table to map routes between internal node IPs and VM IPs.
See full description with picture on Stack Overflow: https://stackoverflow.com/questions/60222243/calico-k8s-on-azure-cant-access-pods
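One possible reading of the "Azure routing table" half, as a rough sketch rather than the linked answer's exact steps: add one route per node that sends that node's pod address block to the node's VM IP. The script below only prints the `az` commands for review and does not run them. The resource group `my-rg`, the route table `k8s-pod-routes`, the block `192.168.249.0/26`, and the node IP `10.0.22.5` are all made-up placeholders, and `print_pod_routes` is a hypothetical helper name.

```shell
#!/bin/sh
# Print (do not execute) one "az network route-table route create" command
# per "pod-block=node-ip" argument.
# Usage: print_pod_routes RESOURCE_GROUP ROUTE_TABLE BLOCK=NODE_IP...
print_pod_routes() {
  rg=$1
  table=$2
  shift 2
  for entry in "$@"; do
    block=${entry%%=*}    # e.g. 192.168.249.0/26 (placeholder pod block)
    node_ip=${entry#*=}   # e.g. 10.0.22.5 (placeholder VM address)
    # Derive a route name from the node IP, dots replaced by dashes.
    name=pods-on-$(echo "$node_ip" | sed 's/\./-/g')
    echo az network route-table route create \
      --resource-group "$rg" --route-table-name "$table" \
      --name "$name" --address-prefix "$block" \
      --next-hop-type VirtualAppliance --next-hop-ip-address "$node_ip"
  done
}

# Placeholder values only; substitute the real blocks and VM IPs.
print_pod_routes my-rg k8s-pod-routes 192.168.249.0/26=10.0.22.5
```

The route table would still need to be created and associated with the subnet the VMs sit on, and Azure may also need IP forwarding enabled on each VM's network interface before it delivers traffic addressed to pod IPs, so treat this as a direction to investigate rather than a complete recipe.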