Issue with worker node on Lab 3.2 Step 30 (worker pull from registry)

Hi,

I have a problem in Step 30 of Lab 3.2 :(

Context: I have two EC2 instances with TCP port 5000 opened in their security group.

worker node (cannot pull from local registry):

ubuntu@ip-172-31-16-147:~/LFD259/SOLUTIONS/s_02$ cat /etc/docker/daemon.json
{ "insecure-registries":["10.111.241.52:5000"] }

ubuntu@ip-172-31-16-147:~/LFD259/SOLUTIONS/s_02$ k get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP    24h
nginx        ClusterIP   10.107.231.29   <none>        443/TCP    61m
registry     ClusterIP   10.111.241.52   <none>        5000/TCP   61m

ubuntu@ip-172-31-16-147:~/LFD259/SOLUTIONS/s_02$ docker pull 10.111.241.52:5000/simpleapp
Using default tag: latest
Error response from daemon: Get http://10.111.241.52:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

ubuntu@ip-172-31-16-147:~/LFD259/SOLUTIONS/s_02$ curl http://10.111.241.52:5000/v2/
^C

master node (works fine):

ubuntu@ip-172-31-17-134:~/LFD259/SOLUTIONS/s_02$ k get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP    24h
nginx        ClusterIP   10.107.231.29   <none>        443/TCP    58m
registry     ClusterIP   10.111.241.52   <none>        5000/TCP   58m

ubuntu@ip-172-31-17-134:~/LFD259/SOLUTIONS/s_02$ bat /etc/docker/daemon.json
{ "insecure-registries":["10.111.241.52:5000"] }

ubuntu@ip-172-31-17-134:~/LFD259/SOLUTIONS/s_02$ curl http://10.111.241.52:5000/v2/
{}

ubuntu@ip-172-31-17-134:~/LFD259/SOLUTIONS/s_02$ sudo docker pull 10.111.241.52:5000/tagtest
Using default tag: latest
latest: Pulling from tagtest
Digest: sha256:134c7fe821b9d359490cd009ce7ca322453f4f2d018623f849e580a89a685e5d
Status: Image is up to date for 10.111.241.52:5000/tagtest:latest

Any ideas? Thank you!

Comments

  • Typically, docker commands run as root. Also, I am not sure what the ownership of the daemon.json file is; normally it should be root.
    I see an interesting setup there with kubectl running on both nodes.
    As far as curl goes on the worker, there may still be an SG issue.
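
    A quick way to check that ownership (a sketch, assuming the default file location):
    ls -l /etc/docker/daemon.json
    # typically: -rw-r--r-- 1 root root ... /etc/docker/daemon.json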

    Regards,
    -Chris

  • rcougil (edited December 2019)

    I've added my user to the docker group, so there's no need for sudo, I guess. If I run docker info I can see that the daemon.json config has been picked up after restarting the docker service on both the worker and master nodes.
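
    For reference, a sketch of those steps (assuming Ubuntu with systemd; the group change needs a re-login to take effect):
    sudo usermod -aG docker $USER                 # add the current user to the docker group
    sudo systemctl restart docker                 # reload /etc/docker/daemon.json
    docker info | grep -A2 "Insecure Registries"  # confirm the registry address was picked up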

    Yeah, I've installed kubectl on the worker node too, to troubleshoot, but the problem was there before :|

    curl does not work from the worker to the master using that inner Kubernetes cluster IP (10.111.241.52), but ping works with the master node IP (the private IP of the EC2 instances within the same VPC). I've opened all TCP ports in the SG with the same result. So... I'm not really confident this is an SG issue.

    Some colleagues suggested that maybe the type of the registry service should be NodePort instead of ClusterIP, in order to publish the address and port to the worker node. I've actually changed it to NodePort and it worked, but I am not sure that was the right approach, since the lab is pretty clear about using the inner Kubernetes IP from the worker node (type ClusterIP).
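
    For reference, a minimal sketch of that change (service name taken from the outputs above):
    kubectl patch svc registry -p '{"spec":{"type":"NodePort"}}'
    kubectl get svc registry    # shows the high port (30000-32767) now mapped alongside 5000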

    Any other ideas? Do you need more info from my side?
    Many thanks for the help

    Regards,
    Rubén

  • serewicz

    Hello,

    I see from your context that you mention opening TCP port 5000 in your security group. Could you please try the lab again, this time with the firewall fully removed as suggested in the lab setup guide and video, and see if the issue persists?

    Kind regards,

  • Hi @rcougil,
    Is your SG overall restrictive with individual rules to allow traffic to specific ports? An SG configured to allow all ingress traffic may be the solution in this case. For an extra level of security, a custom VPC may be suitable for the all-open SG, with the Kubernetes cluster nodes deployed in the custom VPC.
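
    As a hypothetical AWS CLI sketch (placeholder group ID), one common pattern is to allow all traffic between instances that share the security group:
    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123456789abcdef0 \
        --protocol all \
        --source-group sg-0123456789abcdef0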

    Regards,
    -Chris

  • serewicz

    Hello again,
    Please reference the docker-compose.yaml file in the lab. You will note that there is an nginx registry running on port 443. If you add this port to your network security settings, does the pull command work? Also, when you run the sudo docker pull command, does including -vvv give any more detail?

  • Hello, I had a similar problem; maybe my fix could also solve your case.
    As the lab uses the Calico network plugin, TCP port 179 (used by Calico's BGP peering) should be open in every node's iptables.

    A good way to verify whether this is your case is via
    sudo calicoctl node status

    You can get calicoctl from
    curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v3.11.1/calicoctl
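
    The download is a bare binary, so it will likely need the execute bit (and optionally a move onto your PATH):
    chmod +x calicoctl
    sudo mv calicoctl /usr/local/bin/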

    If you see anything other than Established in the Info column, it means that the connection between your nodes is not fully established.

    In my case, TCP port 179 was not accepting incoming connections on the worker machine. I fixed it in that machine's iptables with
    sudo iptables -I INPUT 5 -i eth0 -p tcp --dport 179 -m state --state NEW,ESTABLISHED -j ACCEPT
    and everything started working properly, with calicoctl now showing Established.
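
    Note that a rule inserted this way does not survive a reboot; a sketch for persisting it (assuming Ubuntu and the iptables-persistent package):
    sudo apt-get install iptables-persistent
    sudo netfilter-persistent save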
