A possible LAST_ACK DoS and fix (Please CC: me)

From: Wang Jian (lark@linux.net.cn)
Date: Sat Apr 08 2000 - 20:44:06 EST

Hello all,

The attached patch fixes a DoS effect we have seen on a large SMTP
server. I think it is useful, so I am posting it here.

Here are some references:



It seems that this problem has been discussed a few times on the kernel
list and the networking list, but no fix has been made.

The LAST_ACK DoS works by blocking a server with thousands of sockets
left in the LAST_ACK state.

I have seen the DoS effect on the SMTP servers of a large ICP. The scenario:

       [ACE Layer 4 Switch] ------Internet------ [PIX firewall]
                |                                      |
                ^                                 Some Clients
          -------------
          |     |     |
     linux smtp servers pool

The ACE switch does not seem to affect the results. Here is how the DoS looks:

1. The 3-way handshake completes and a socket is established.
2. The SMTP server-end socket S sends the welcome message "220 ..." to
   client C.
3. For some unknown reason, the PIX (or the client) sends a FIN to the
   server socket.
4. S receives the FIN, passes it to the application, and changes to the
   CLOSE_WAIT state.
5. The application closes the socket; S sends out a FIN and changes to
   the LAST_ACK state. There is still unsent data in the skb.
6. No ACK arrives from C, so S stays in the LAST_ACK state for about
   15-20 minutes while it retries sending the data.
7. A new socket is established, and it again becomes a "ghost
   connection".
8. Steps 1-7 repeat again and again until the server is stuck with too
   many LAST_ACK sockets.
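
The server-side sequence in steps 3-6 can be sketched as a tiny state
machine. This is only an illustration of the passive-close path described
above, not kernel code; the event names (recv_fin, app_close, recv_ack)
are made up for the example.

```python
# Simplified server-side (passive close) TCP transitions from the steps above.
# Event names are hypothetical labels, not kernel identifiers.
TRANSITIONS = {
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",   # step 4
    ("CLOSE_WAIT", "app_close"): "LAST_ACK",     # step 5
    ("LAST_ACK", "recv_ack"): "CLOSED",          # the ACK that never arrives
}

def run(events, state="ESTABLISHED"):
    """Apply events in order; unknown events leave the state unchanged."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state

# Without the final ACK the socket stays in LAST_ACK until retries run out.
print(run(["recv_fin", "app_close"]))
print(run(["recv_fin", "app_close", "recv_ack"]))
```

The point is that only an incoming ACK (or the retransmission limit)
moves the socket out of LAST_ACK; the application has no event of its own
left to send.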

Because there are many users (clients) behind the PIX firewall, the
number of sockets in the LAST_ACK state keeps increasing until it
reaches a fatal threshold at which the server no longer responds.

The problem here is the PIX firewall. It should do something in
response to the "false" data from the SMTP server, i.e., send back a
RST. But it doesn't. With this interaction bug, and given that many
PIX firewalls are out there, the number of sockets in the LAST_ACK
state can soar to 4000 in 5 minutes.
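
To watch that number grow, one option is to count LAST_ACK entries in
/proc/net/tcp. The sketch below assumes the usual layout of that file
(one header row, connection state as the fourth whitespace-separated
column) and that state code 09 means LAST_ACK; check both against your
kernel before relying on it. The function name is made up for the
example.

```python
# Sketch: count sockets in LAST_ACK from /proc/net/tcp-style text.
LAST_ACK = "09"  # assumed hex state code for LAST_ACK in the "st" column

def count_last_ack(proc_net_tcp_text):
    """Return how many rows have the LAST_ACK state code."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and fields[3].upper() == LAST_ACK:
            count += 1
    return count
```

On a live server it could be fed with open("/proc/net/tcp").read() and
polled every few seconds.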

The annoying thing is that denying the PIX is not a good option,
because we can't risk denying normal users. If it is true that normal
users are just victims of this PIX bug, we can't be that cruel to
them. And we can't deny so many PIXes out there!

How can a bad guy attack servers? Established TCP connections are
under the application's control, so a bad guy can't create too many
connections in a short time if the application is well designed.
But 20 minutes gives him a good chance if he can make use of LAST_ACK
sockets. Normally, a bad guy can't attack servers by dropping
connections without further interaction, because applications can
handle that. But LAST_ACK is a kernel matter and applications have no
control over it, so a bad guy can exploit it to create as many "ghost
connections" as he wants until the server stops responding.

Solaris can be tuned by setting the tcp_abort...(?) parameter to a small
number with the ndd tool, but Linux has no such thing. Generally, we can do

 # echo 4 > /proc/sys/net/ipv4/tcp_retries1
 # echo 5 > /proc/sys/net/ipv4/tcp_retries2

but the side effects on slow links can't be ignored.

Consider the characteristics of the SMTP service: an SMTP server can
safely discard the unsent data in the LAST_ACK state, because an SMTP
session should be closed by the server, not the client. If a client
closes the socket first, it doesn't matter whether the server has
unsent data. The same holds for other services such as HTTP and telnet.

The attached patch handles the LAST_ACK state differently. A
/proc/sys/net/ipv4/tcp_last_ack_retries entry is added, and its value
takes effect if it is between 1 and tcp_retries2. If it is <= 0, the
feature is disabled; if it is > tcp_retries2, tcp_retries2 takes
effect. I think my patch is simple and effective enough, but I may be
wrong. The patch is against vanilla 2.2.14.
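
The selection rule just described can be written out as follows. This is
a Python restatement of the rule for illustration only, not the patch
itself; the function name is made up.

```python
def effective_last_ack_retries(tcp_last_ack_retries, tcp_retries2):
    """Retry limit applied to a socket in LAST_ACK, per the rule above."""
    if tcp_last_ack_retries <= 0:
        # Feature disabled: LAST_ACK behaves like any other state.
        return tcp_retries2
    # Values above tcp_retries2 are clamped to tcp_retries2.
    return min(tcp_last_ack_retries, tcp_retries2)
```

For example, with tcp_retries2 = 15, a setting of 4 gives 4, a setting of
0 gives 15, and a setting of 20 gives 15.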

I suggest using "4" (3+6+12+24 = 45s), but "1" seems OK :-)
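
The arithmetic behind that figure is a doubling retransmission timer. The
sketch below assumes a 3 second initial timeout, as in the sum above, and
an optional cap on the timeout; the real timer depends on the measured
round-trip time, so treat the numbers as rough estimates. The function
and parameter names are made up for the example.

```python
def total_wait_seconds(retries, initial_rto=3, max_rto=None):
    """Sum the waits for `retries` timeouts that double each time."""
    total = 0
    rto = initial_rto
    for _ in range(retries):
        total += rto
        rto *= 2
        if max_rto is not None:
            rto = min(rto, max_rto)  # assumed upper bound on the timer
    return total

print(total_wait_seconds(4))                # 3+6+12+24
print(total_wait_seconds(15, max_rto=120))  # assumed: 15 retries, 120 s cap
```

With 15 retries and a 120 second cap (both assumptions, not taken from
the patch), the total comes to 1269 seconds, about 21 minutes, which is
in the same range as the 15-20 minutes observed for stuck LAST_ACK
sockets.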

Feedback is welcome. I am not on the kernel list, so please CC: me.

