This is one of those pieces that you keep in your head for ages but never get around to writing up. Some tcpdumping I did recently brought it back, so here it is.
We all know the 3-way handshake in TCP: SYN + SYN/ACK + ACK and voilà! But this is not the end of the story. Page 23 of RFC793 shows the state machine of a TCP connection. It contains a transition from the “LISTEN” state to “SYN SENT”. The RFC says that a listening socket can send a SYN packet to a host and thereby change state to “SYN SENT”, just like clients normally do when they try to connect to a listening server. If that host sends a SYN to the server at precisely the same time, the handshake will have four stages, so to speak:
1. server: SYN -> client (server changes state from “LISTEN” to “SYN SENT”)
2. client: SYN -> server (client changes state from “CLOSED” to “SYN SENT”)
3. server: SYN/ACK -> client (server changes state from “SYN SENT” to “SYN RCVD”)
4. client: SYN/ACK -> server (client changes state from “SYN SENT” to “SYN RCVD”)
If the ACKs are successfully received, both client and server change from “SYN RCVD” to the “ESTAB” state, which is a properly open TCP connection. Steps 3 and 4 can be swapped around of course; it is only important that the first two happen “simultaneously enough” to trigger the four-stage exchange instead of the normal one. Another way of thinking about it: a normal 3-way handshake is a special case of the 4-way handshake in which one end is late enough that it can combine its own SYN with the ACK of the SYN that has already arrived.
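The transitions above can be traced with a toy state table. This is only a sketch of the few RFC793 transitions mentioned here, not a TCP implementation; the function and event names are made up for illustration. The segment sent when leaving “SYN SENT” on an incoming SYN is written as SYN/ACK, because that is what the tcpdump capture further down shows ([S.]).

```python
# Toy model of the RFC 793 transitions involved in a simultaneous open.
# Only the transitions discussed in the text are present.
TRANSITIONS = {
    # (state, event): (next state, segment sent)
    ("CLOSED", "active open"): ("SYN SENT", "SYN"),
    ("LISTEN", "send"): ("SYN SENT", "SYN"),
    ("SYN SENT", "rcv SYN"): ("SYN RCVD", "SYN/ACK"),
    ("SYN RCVD", "rcv ACK of SYN"): ("ESTAB", None),
}


def step(state, event):
    """Return (next_state, segment_sent) for one transition."""
    return TRANSITIONS[(state, event)]


def simultaneous_open(a="CLOSED", a_event="active open",
                      b="CLOSED", b_event="active open"):
    """Walk both ends through a simultaneous open and return the trace.

    Each trace entry is (segment sent by A, segment sent by B,
    new state of A, new state of B).
    """
    trace = []
    rounds = [
        (a_event, b_event),                    # both ends send a SYN
        ("rcv SYN", "rcv SYN"),                # the SYNs cross in flight
        ("rcv ACK of SYN", "rcv ACK of SYN"),  # each SYN/ACK acks the peer's SYN
    ]
    for event_a, event_b in rounds:
        a, sent_a = step(a, event_a)
        b, sent_b = step(b, event_b)
        trace.append((sent_a, sent_b, a, b))
    return trace


if __name__ == "__main__":
    # Server/client variant: one end starts in LISTEN and sends a SYN.
    for row in simultaneous_open(a="LISTEN", a_event="send"):
        print(row)
```

Both the server/client variant (one end starting in “LISTEN”) and the variant with two ends starting in “CLOSED” go through the same states and finish in “ESTAB”.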
The client remains typical, but the server socket required to reproduce the 4-way handshake is a weird creature. As far as I could tell from the Linux kernel sources, listen(2) and connect(2) are mutually exclusive on the same socket. Even though the RFC accommodates it, the Linux API prohibits it. Which makes sense.
But then again, looking at the grand scheme of things, from the perspective of a client that has just switched to “SYN SENT” there is no way of telling whether the incoming SYN is from some weird server… or from another client. Therefore, if the four-stage server-client sequence above is valid, a client-client variation thereof is valid, too. We can establish a TCP connection using connect(2) on both ends:
1. client A: SYN -> client B
2. client B: SYN -> client A
3. client A: SYN/ACK -> client B
4. client B: SYN/ACK -> client A
As I already mentioned a couple of times, the SYN packets must be sent out at the same time, so the clients must synchronize their connect(2) calls. The faster the network, the tighter the synchronization must be. On WANs, however, latencies are long enough for it to trigger with a little help from NTP. Also, both ports must be known in advance, so both clients must bind(2) first.
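A minimal sketch of such a client, assuming both ends agree out of band on a wall-clock start time (here: the next multiple of ten seconds) and use the same port number on both sides. The helper names are hypothetical and this is not necessarily how any particular tool does it; it only shows the bind(2), wait, connect(2) order.

```python
import socket
import time


def next_boundary(period=10.0, now=time.time):
    """Return the next wall-clock instant that is a multiple of `period` seconds."""
    return (now() // period + 1) * period


def wait_until(start_at, now=time.time, sleep=time.sleep):
    """Sleep until the wall-clock time `start_at` (seconds since the epoch)."""
    remaining = start_at - now()
    if remaining > 0:
        sleep(remaining)


def simultaneous_connect(peer_ip, port, start_at, timeout=5.0):
    """Bind to a known local port, wait for the agreed time, then connect.

    The peer is assumed to run the same function with our address,
    the same port and the same start_at.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", port))        # the source port must be known to the peer
    s.settimeout(timeout)     # give up if the SYNs do not cross in time
    wait_until(start_at)      # relies on both clocks being NTP-synchronized
    s.connect((peer_ip, port))
    return s
```

Each side would call something like simultaneous_connect(peer_ip, 5555, next_boundary()). If one SYN arrives before the other end has sent its own, the attempt fails or times out and has to be retried at the next boundary.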
What does it all have to do with firewalls?
Let’s consider two hosts hidden behind firewall/NAT machines:
hostA <—> fwA <—> Fluffy Internet Cloud <—> fwB <—> hostB
1. hostA sends a SYN to the public IP of fwB and a known port. fwA translates the source IP to its own public IP and starts tracking the session. The assumption here is that fwA does not change the source port of the SYN packet, which is often the case.
2. Simultaneously, hostB sends a SYN to the public IP of fwA and a known port. fwB translates the source IP, doesn’t change the source port and starts tracking the session for this packet.
3. The SYN from step 1 arrives at fwB. Because its addresses and ports match the session fwB started tracking in step 2, it is assumed to be a continuation of that TCP setup, so fwB passes the packet along to hostB, which replies to it with an ACK.
4. The same happens with the SYN packet sent by hostB: fwA treats it as part of the session setup and lets it through, and hostA replies with an ACK.
Despite incoming traffic being blocked on both sides, a TCP connection is open directly between hostA and hostB, with no need for a jump host.
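The firewall logic in the steps above can be modelled in a few lines. This is a deliberately simplified sketch: a real NAT/connection tracker also looks at TCP flags, timeouts and sequence numbers, and the class name and the IP addresses (documentation ranges) are placeholders, not the ones from my setup.

```python
from dataclasses import dataclass, field


@dataclass
class Firewall:
    """Toy NAT that keeps the source port and tracks outbound flows."""
    public_ip: str
    # Tracked flows as (local port, remote IP, remote port).
    sessions: set = field(default_factory=set)

    def outbound(self, src_port, dst_ip, dst_port):
        """An inside host sends a packet: track the flow, rewrite only the source IP."""
        self.sessions.add((src_port, dst_ip, dst_port))
        return (self.public_ip, src_port, dst_ip, dst_port)

    def inbound(self, packet):
        """Let a packet from outside in only if it matches a tracked flow."""
        src_ip, src_port, dst_ip, dst_port = packet
        return (dst_ip == self.public_ip
                and (dst_port, src_ip, src_port) in self.sessions)


def demo(port=5555):
    fw_a = Firewall("203.0.113.1")   # placeholder public IP of fwA
    fw_b = Firewall("198.51.100.1")  # placeholder public IP of fwB

    # A SYN from fwA's address before hostB has sent anything is dropped.
    early = fw_b.inbound((fw_a.public_ip, port, fw_b.public_ip, port))

    # Both hosts send their SYN; each firewall starts tracking a session.
    syn_a = fw_a.outbound(port, fw_b.public_ip, port)
    syn_b = fw_b.outbound(port, fw_a.public_ip, port)

    # Each SYN now looks like a reply within the other side's tracked session.
    return early, fw_b.inbound(syn_a), fw_a.inbound(syn_b)


if __name__ == "__main__":
    print(demo())
```

The first value shows an unsolicited packet being dropped; the other two show each SYN being accepted once the opposite firewall has seen an outbound packet with the mirrored addresses and ports.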
Here is a POC that passes messages in the setup described above. Specifically, I tested it between my laptop and an AWS EC2 instance:
laptop <—> router (public IP: 126.96.36.199) <— … —> AWS <—> EC2 instance (public IP: 188.8.131.52)
- I made sure the EC2 instance has a public IP but traffic is blocked.
- I made sure the EC2 and laptop use the same NTP servers and are synchronized (NTP offsets were of 10-20ms on both ends).
- Then I ran on my laptop: ./tcp-4way.py 184.108.40.206 5555 hello
- And on the EC2 instance: ./tcp-4way.py 220.127.116.11 5555 world
Tcpdump shows the [S], [S], [S.], [S.] sequence in the handshake, after which the transfer continues.
00:03:53.859554 IP 192.168.1.6.5555 > 18.104.22.168.5555: Flags [S], seq 995877679, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 1185156363 ecr 0,sackOK,eol], length 0
00:00:00.017374 IP 22.214.171.124.5555 > 192.168.1.6.5555: Flags [S], seq 413927417, win 26883, options [mss 1460,sackOK,TS val 123624052 ecr 0,nop,wscale 7], length 0
00:00:00.000124 IP 192.168.1.6.5555 > 126.96.36.199.5555: Flags [S.], seq 995877679, ack 413927418, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 1185156380 ecr 123624052,sackOK,eol], length 0
00:00:00.013627 IP 188.8.131.52.5555 > 192.168.1.6.5555: Flags [S.], seq 413927417, ack 995877680, win 26883, options [mss 1460,sackOK,TS val 123624055 ecr 1185156363,nop,wscale 7], length 0