Unable to expose a TFTP service from within a Kubernetes cluster

4/8/2019

I'm using Rancher's nginx ingress controller, rancher/nginx-ingress-controller:0.21.0-rancher3, which should be based on https://github.com/kubernetes/ingress-nginx AFAIK.

My udp-services ConfigMap is configured as:

apiVersion: v1
kind: ConfigMap
metadata:
  name: udp-services
  namespace: ingress-nginx
data:
  69: "default/tftp:69"
  8881: "default/test:8881"
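For reference, with upstream ingress-nginx the controller reads this ConfigMap via the --udp-services-configmap flag, and unless it runs with hostNetwork (Rancher's DaemonSet usually does) the UDP ports also need to be opened on the controller's Service. A rough sketch, with the Service name and selector assumed rather than taken from my setup:

# Sketch only: extra UDP ports on the controller Service, if one is used.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx          # assumed name, adjust to your deployment
  namespace: ingress-nginx
spec:
  selector:
    app: ingress-nginx         # assumed label
  ports:
    - name: tftp-udp
      port: 69
      targetPort: 69
      protocol: UDP
    - name: test-udp
      port: 8881
      targetPort: 8881
      protocol: UDP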

Running nc -l -u -p 8881 on default/test can communicate with an out-of-cluster client just fine, which should mean that the UDP proxying works, at least in some cases. However, TFTP requests to default/tftp time out consistently.
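The sanity check looked roughly like this (the node IP placeholder is whatever address the ingress controller listens on):

# Inside the default/test pod: listen for UDP datagrams on 8881
nc -l -u -p 8881

# From a machine outside the cluster, send a datagram through the ingress
echo hello | nc -u <node-ip> 8881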

Roughly, a TFTP read should work as below:

  1. Client port A => Server port 69 (request)
  2. Server port B => Client port A (sends data; note this is a new port, B)
  3. Client port A => Server port B (acknowledgement)
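The port change in step 2 is easy to observe on the server with a capture along these lines (interface name is a guess):

# Watch all UDP traffic on the TFTP server; the data packets in step 2
# show up with a source port other than 69.
tcpdump -n -i eth0 udp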

tcpdump running on the tftp server shows the communication is like:

  1. Host port A => Server port 69 (requests a file)
  2. Server port B => Host port A (sends data back to port A)
  3. Host => Server, ICMP port unreachable (the host rejects the data, reporting port A as unreachable)

At the same time the ingress logs something like:

TIMESTAMP [error] ... upstream timed out (110: Connection timed out) while proxying connection, udp client: ::1, server: [::]:69, upstream: "...:69", bytes from/to client:..., bytes from/to upstream:...

TFTP requests from another in-cluster container work just fine, which should mean that the TFTP server itself is not the source of the problem and that the issue is in how the ingress controller handles the requests.
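The in-cluster check was along these lines (image and file name are just placeholders):

# Run a throwaway pod and fetch a file from the default/tftp service;
# this path works, so the server itself is fine.
kubectl run tftp-test --rm -it --restart=Never --image=busybox -- \
  tftp -g -r somefile.bin tftp.default.svc.cluster.local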

I found that tftpd has a --port-range argument which pins the ports tftpd may use to respond. I tried pinning it to port 8881 (--port-range 8881:8881), but the requests are still dropped.
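The attempt looked roughly like this (a sketch assuming tftpd-hpa's in.tftpd; image and root directory are placeholders, and 8881 is already proxied via the udp-services ConfigMap above):

# Excerpt from the tftp Deployment: force reply traffic onto a single,
# already-proxied UDP port.
containers:
  - name: tftpd
    image: example/tftpd-hpa          # placeholder image
    args: ["--foreground", "--secure", "--port-range", "8881:8881", "/tftpboot"]
    ports:
      - containerPort: 69
        protocol: UDP
      - containerPort: 8881
        protocol: UDP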

My guess is that the ingress does not route the reply back to the client because it comes from port B rather than port 69.

Has anyone succeeded in exposing a TFTP service from within a Kubernetes cluster?

-- Vin
kubernetes
kubernetes-ingress

1 Answer

2/24/2020

It is not a 100% solution, but I found a workaround for the exact same thing. The problem is that tftp replies over a new outbound UDP connection that isn't known in the host's state table, so the host treats it like an outgoing request rather than a reply. I will also note that TFTP client apps handle this fine, but PXE drivers (at least Intel ones) do not.
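You can see this on the node that handled the request: the client-to-port-69 flow is in conntrack, but the data packets coming back from the new source port do not match any entry. One way to look (assuming conntrack-tools is installed):

# List UDP conntrack entries touching the TFTP port; the replies from
# the server's ephemeral port won't correspond to any of them.
conntrack -L -p udp | grep -w 69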

If you are using Calico as your CNI, you can disable "natOutgoing" on the IPPool. If you still need NAT for other traffic, you can create a second IPPool without NAT and put the TFTP pods in it.
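A rough sketch of such a second pool (the CIDR is a placeholder and must not overlap your existing pools; apply it with calicoctl):

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: no-nat-pool
spec:
  cidr: 10.244.100.0/24     # placeholder, pick an unused range
  natOutgoing: false        # replies keep their pod source address
  ipipMode: Always          # match whatever encapsulation your default pool uses
  nodeSelector: all()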

https://docs.projectcalico.org/networking/assign-ip-addresses-topology#features

I disabled it for the default pool with:

calicoctl get ippool -oyaml | sed 's/natOutgoing: true/natOutgoing: false/g' | calicoctl apply -f -
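To check that it stuck:

calicoctl get ippool -o yaml | grep natOutgoing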

Other CNI plugins likely have a similar workaround.

-- Roogles
Source: StackOverflow