Docker Implementation of Published Ports

The default single-host Docker networking implementation uses the iptables NAT table to implement published ports (Docker Swarm uses a load balancer on every swarm member). In this part of the article we’ll decode the intricate setup it needs to get the job done.

We’ll start with a simple web server and publish its HTTP port to host port 8080.

$ docker run --rm -d --name web_1 -p 8080:80 webapp
4bcbe1c9b3d0347b9ab4166692ca2d5f220766dac3ae648f8eee2fbe3dc43dcb
$ dps
NAMES               IMAGE               PORTS
web_1               webapp              0.0.0.0:8080->80/tcp

We’ll use the alias dps='docker ps --format "table {{.Names}}\t{{.Image}}\t{{.Ports}}"' throughout the rest of this article to shorten the printouts.

After starting our web server and publishing its HTTP port to host port 8080, the host NAT table contains these rules:

$ sudo iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N DOCKER
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 172.17.0.2/32 -d 172.17.0.2/32 -p tcp -m tcp 
   --dport 80 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 8080 -j DNAT
   --to-destination 172.17.0.2:80

To understand these rules we have to consider two different paths a packet can take to reach a container-based service through a published port.

Packets received from external clients are sent to one of the host’s IP addresses, so they should go through the PREROUTING and INPUT chains. However, because the DNAT rule invoked from the PREROUTING chain changes the destination IP address to the container IP address, the packets get routed toward the container and thus go through the PREROUTING and POSTROUTING chains (the NAT table has no FORWARD chain).

Containers running in independent network namespaces look like independent IP hosts connected to an internal Linux bridge. Packets sent to containers are thus routed by the host TCP/IP stack.

Packets generated by local processes should go through the OUTPUT and INPUT chains, but as with the PREROUTING chain, the OUTPUT chain changes the destination IP address, so the packets traverse the OUTPUT and POSTROUTING chains.

With this in mind, let’s analyze individual rules, starting with the DOCKER chain where the true magic happens:

-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 8080 -j DNAT 
   --to-destination 172.17.0.2:80

The DOCKER chain contains two sets of rules:

  • No NAT is performed if a packet arrives from a Docker-created Linux bridge.
  • If the destination port matches a published port, the destination IP address and port are rewritten to the container IP address and port.
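
The two rule sets can be read as a simple first-match decision. The following is a minimal sketch of that logic for the single-container example above; it only models the rules and never touches iptables. The function name docker_chain_verdict and the NO-MATCH label are made up for illustration.

```shell
# Toy model of the two DOCKER-chain rules shown above (hypothetical helper).
# Usage: docker_chain_verdict IN_INTERFACE DEST_PORT
docker_chain_verdict() {
  dcv_in_if=$1
  dcv_dport=$2
  # Rule 1: -A DOCKER -i docker0 -j RETURN
  if [ "$dcv_in_if" = "docker0" ]; then
    echo "RETURN"
    return 0
  fi
  # Rule 2: -A DOCKER ! -i docker0 -p tcp --dport 8080 -j DNAT
  #         --to-destination 172.17.0.2:80
  if [ "$dcv_dport" = "8080" ]; then
    echo "DNAT 172.17.0.2:80"
    return 0
  fi
  # No rule matched: the packet leaves the chain unchanged.
  echo "NO-MATCH"
}
```

For example, docker_chain_verdict eth0 8080 reports the DNAT target, while docker_chain_verdict docker0 8080 reports RETURN because the first rule wins.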

Adding containers with published ports expands the second part of the DOCKER chain. Adding custom Docker networks expands the first part of the DOCKER chain:

The DOCKER chain is expanded after creating a custom Docker network
$ docker network create --driver=bridge --subnet=192.168.99.0/24 br0
c4d101f845543b007068763d017d35d4c24b55bc63a817aa76d74d4e1510814c
$ sudo iptables -t nat -S DOCKER
-N DOCKER
-A DOCKER -i br-c4d101f84554 -j RETURN
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 8080 -j DNAT 
   --to-destination 172.17.0.2:80

Note that the DOCKER chain generated after we added the second Docker network effectively prevents destination NAT of published ports for packets received from any container. We’ll explore the fix Docker uses in the next section.

The DOCKER chain is used in PREROUTING and OUTPUT chains. In the PREROUTING chain, the destination address type is checked, and the DOCKER chain is invoked for local destinations, ensuring published ports work only with local addresses:

DOCKER chain used in PREROUTING chain
-P PREROUTING ACCEPT
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER

The OUTPUT chain is a bit more convoluted: the DOCKER chain is invoked only if the destination is a local address and not a loopback address. The interesting question is thus “how can we connect to a published port through a loopback address?”, as we did when exploring published ports. We’ll address this question in the next section.

DOCKER chain used in OUTPUT chain
$ sudo iptables -t nat -S OUTPUT
-P OUTPUT ACCEPT
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
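
The match condition of this OUTPUT rule can be sketched as a small shell function. This is only a model: the real addrtype match asks the kernel whether the destination is a local address, whereas this sketch assumes the caller passes the host's local addresses as arguments. The function name output_invokes_docker is hypothetical.

```shell
# Toy model of: -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
# Usage: output_invokes_docker DEST_IP LOCAL_ADDR...
# (the LOCAL_ADDR list is an assumption standing in for the kernel's view)
output_invokes_docker() {
  oid_dest=$1
  shift
  # "! -d 127.0.0.0/8": loopback destinations never reach the DOCKER chain.
  case "$oid_dest" in
    127.*) echo "skip"; return 0 ;;
  esac
  # "--dst-type LOCAL": the destination must be one of the host's addresses.
  for oid_addr in "$@"; do
    if [ "$oid_dest" = "$oid_addr" ]; then
      echo "jump DOCKER"
      return 0
    fi
  done
  echo "skip"
}
```

Calling output_invokes_docker 127.0.0.1 192.168.33.2 172.17.0.1 yields skip, which is exactly the loopback puzzle raised above.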

Finally, the POSTROUTING chain is used to implement outbound container access:

-P POSTROUTING ACCEPT
-A POSTROUTING -s 192.168.99.0/24 ! -o br-c4d101f84554 -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 172.17.0.2/32 -d 172.17.0.2/32 -p tcp -m tcp 
   --dport 80 -j MASQUERADE

For every Docker network, the POSTROUTING chain contains a rule saying “if the source IP address belongs to a Docker network, but the outgoing interface is not the bridge of that Docker network, perform source NAT”. It also contains weird rules covering the cases where a container with a published port would send itself a packet through the host TCP/IP stack. Please don’t ask me under what scenario one might hit those rules…
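
The per-network MASQUERADE rules above boil down to a source-subnet check followed by an outgoing-interface check. Here is a rough sketch of that decision for the two networks in this example. It assumes the prefixes are octet-aligned (/16 and /24), so a plain string prefix match is enough, and it deliberately ignores the container-to-itself rule. The function name needs_masquerade is hypothetical.

```shell
# Toy model of the two per-network MASQUERADE rules (hypothetical helper).
# Usage: needs_masquerade SRC_IP OUT_INTERFACE
needs_masquerade() {
  nm_src=$1
  nm_out_if=$2
  # Pick the bridge that belongs to the source subnet, if any.
  case "$nm_src" in
    192.168.99.*) nm_bridge="br-c4d101f84554" ;;  # -s 192.168.99.0/24
    172.17.*)     nm_bridge="docker0" ;;          # -s 172.17.0.0/16
    *)            echo "no"; return 0 ;;          # not a Docker source address
  esac
  # "! -o <bridge>": source NAT only when leaving through another interface.
  if [ "$nm_out_if" != "$nm_bridge" ]; then
    echo "yes"
  else
    echo "no"
  fi
}
```

With these rules, needs_masquerade 172.17.0.2 eth0 answers yes, while traffic staying on docker0 is left alone.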

Binding a Published Port to a Single IP Address

Binding a published port to a single IP address simply makes the rules in the DOCKER chain a bit more specific:

Binding a published port to a single IP address
$ docker run --rm -d --name web_2 -p 192.168.33.2:8081:80 webapp
4db11d87e36a4d48f9acb21ce33d794b10900b903b0b8b3432fdd8bfa2247be9
$ sudo iptables -t nat -S DOCKER
-N DOCKER
-A DOCKER -i br-c4d101f84554 -j RETURN
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 8080 -j DNAT 
   --to-destination 172.17.0.2:80
-A DOCKER -d 192.168.33.2/32 ! -i docker0 -p tcp -m tcp 
   --dport 8081 -j DNAT --to-destination 172.17.0.3:80 

When you don’t specify an IP address with a published port, the corresponding DOCKER chain rule checks only the destination TCP port number; when a published port is bound to a single IP address, the corresponding rule checks both the destination TCP port and the destination IP address.
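
To see at a glance which address and port each DNAT rule matches, the rule listing can be summarized with a short awk filter. This is a sketch, not a Docker tool: the function name list_published_ports is made up, it assumes each rule arrives on a single line (the listings in this article wrap long rules for readability), and it prints 0.0.0.0 when a rule has no -d match.

```shell
# Summarize DNAT rules read from stdin as
#   host_ip:host_port -> container_ip:container_port
# (hypothetical helper; expects one iptables rule per input line)
list_published_ports() {
  awk '
    /-j DNAT/ {
      ip = "0.0.0.0"; port = ""; target = ""
      for (i = 1; i < NF; i++) {
        if ($i == "-d")               { ip = $(i + 1); sub(/\/32$/, "", ip) }
        if ($i == "--dport")          { port = $(i + 1) }
        if ($i == "--to-destination") { target = $(i + 1) }
      }
      if (port != "" && target != "") print ip ":" port " -> " target
    }'
}
```

A typical invocation would be sudo iptables -t nat -S DOCKER | list_published_ports, which prints one line per published port and makes the difference between the two rules obvious.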
