What could cause one side of a TCP connection to be in FIN_WAIT2 while the other side is fully closed (i.e. not in CLOSE_WAIT)?

1/29/2022

I recently encountered a TCP-related issue and hope someone can shed some light on it.

I have application1 running in container1/pod1, connected to a server (client_ip:12345 <-> server_ip:443). After running for a while, the server decided to close the connection: it sent a FIN to the client, and the client sent an ACK back (I saw both packets in the pcap). This should leave the client in CLOSE_WAIT and the server in FIN_WAIT2.

In this situation, the client should call close(), which sends a FIN back to the server. However, I found that the application never calls close(), so in theory the client should be stuck in CLOSE_WAIT and the server should stay in FIN_WAIT2 until the FIN_WAIT2 timeout expires. In the meantime, port 12345 on the client side shouldn't be reusable by any new connection.
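To make the expected sequence concrete, here is a minimal sketch on loopback, not the real application code. The listener stands in for the server, and the function name `demo_close_on_eof` is made up for illustration. It shows that `recv()` returning an empty byte string is how the client learns the peer has sent a FIN, and that the `close()` call after it is the step the application was missing.

```python
import socket


def demo_close_on_eof():
    # Loopback listener standing in for the real server (hypothetical setup).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)
    server_side, _ = listener.accept()

    # The server closes first: it sends a FIN and ends up in FIN_WAIT2
    # once the client's kernel has ACKed it.
    server_side.close()

    # An empty read means the peer sent a FIN. The client socket is now
    # in CLOSE_WAIT and stays there until the application closes it.
    data = client.recv(1024)

    # The step missing from the application: close() sends the client's FIN.
    client.close()
    listener.close()
    return data


if __name__ == "__main__":
    print(demo_close_on_eof())
```

If the `client.close()` line is dropped, the client socket stays in CLOSE_WAIT for as long as the process keeps the descriptor open.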

However, it seems the client_ip:12345 <-> server_ip:443 socket on the client side somehow left CLOSE_WAIT (it became fully closed and the port became available again). When application2 in container2 came up, it happened to pick the same port 12345 (the kernel assigns the ephemeral source port from the local port range) to connect to the server's port 443. Because the server side was still in FIN_WAIT2, the connection couldn't be established, and the service was interrupted until the FIN_WAIT2 state timed out (300 secs).
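One way to check what state the client socket is really in is to read `/proc/net/tcp` on Linux. The sketch below parses that format and filters by local port. The helper names and the sample row are invented for illustration, and the address decoding assumes a little-endian host. Note that `/proc/net/tcp` is per network namespace, so it has to be read from inside the pod or container in question.

```python
import socket
import struct

# State codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def decode_addr(hex_addr):
    """Turn '0100007F:3039' into ('127.0.0.1', 12345)."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumption: little-endian host, so the IPv4 bytes appear reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Return (local, remote, state) tuples for each row of the table."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        state = TCP_STATES.get(fields[3].upper(), fields[3])
        rows.append((decode_addr(fields[1]), decode_addr(fields[2]), state))
    return rows


def find_local_port(text, port):
    """Keep only the rows whose local port matches."""
    return [row for row in parse_proc_net_tcp(text) if row[0][1] == port]


# Made-up sample row for illustration, not taken from the real system.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue\n"
    "   0: 0100007F:3039 0200007F:01BB 08 00000000:00000000\n"
)

if __name__ == "__main__":
    print(find_local_port(SAMPLE, 12345))
```

On a live system the input would come from `open("/proc/net/tcp").read()` instead of `SAMPLE`. An empty result for port 12345 inside the first container would confirm that the CLOSE_WAIT socket is gone from that namespace.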

I understand that I should fix the application code by adding close(). However, I'm curious what could make the CLOSE_WAIT state disappear or reset on the client side, allowing another application to pick the same port 12345 to connect to the server.

I found an F5 bug report that mentions a similar situation: "Client side connection has been fully closed. This may occur if a client SSL profile is in use and an 'Encrypted Alert' has been received."

https://cdn.f5.com/product/bugtracker/ID812693.html

I'm wondering whether there are any other possibilities that could leave one side in FIN_WAIT2 and the other side fully closed (not in CLOSE_WAIT).

For example, could the process using this socket have been killed? But AFAIK, when a process is killed, its socket file descriptors are closed by the kernel and TCP should still send a FIN.
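The killed-process case can be checked with a small experiment. This is a sketch that assumes a POSIX system such as Linux, and the names `CHILD` and `peer_sees_after_kill` are hypothetical. A child process connects, never calls close(), and is then killed. The parent, playing the server, reads from the accepted socket: an empty read means the kernel closed the child's socket and sent a FIN on its behalf.

```python
import socket
import subprocess
import sys

# Child program: connect to the given port, then hold the socket open
# without ever calling close().
CHILD = """
import socket, sys, time
s = socket.create_connection(("127.0.0.1", int(sys.argv[1])))
time.sleep(60)
"""


def peer_sees_after_kill():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    listener.settimeout(10)
    port = listener.getsockname()[1]

    child = subprocess.Popen([sys.executable, "-c", CHILD, str(port)])
    try:
        conn, _ = listener.accept()
        conn.settimeout(10)

        # Kill the child (SIGKILL on POSIX). It gets no chance to run any
        # cleanup code, so whatever happens to the socket is the kernel's doing.
        child.kill()
        child.wait()

        # b"" here means an orderly shutdown: the kernel sent a FIN.
        data = conn.recv(1024)
        conn.close()
        return data
    finally:
        if child.poll() is None:
            child.kill()
            child.wait()
        listener.close()


if __name__ == "__main__":
    print(peer_sees_after_kill())
```

This only covers a plain kill with no unread data on the socket. It says nothing about what happens when the whole container or its network namespace goes away.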

I hope someone could shed some light on it!

-- CSL
kubernetes
linux
tcp

0 Answers