Re: After many hours all outbound connections get stuck in SYN_SENT

From: Ilpo Järvinen
Date: Thu Dec 20 2007 - 16:06:19 EST


On Thu, 20 Dec 2007, James Nichols wrote:

> > You'd probably should also investigate the Linux kernel,
> > especially the size and locks of the components of the Sack data
> > structures and what happens to those data structures after Sack is
> > disabled (presumably the Sack data structure is in some unhappy
> > circumstance, and disabling Sack allows the data to be discarded,
> > magically unclaging the box).

...Not sure if you want now to invent such structure. Yes, we have per skb
->sacked but again in SYN_SENT there are very few things who touch it at
all, and they just set it to zero (though it would not even be mandatory
for tcp_transmit_skb, IIRC, checked that just couple of days ago due to
other things).

Another thing is the rx_opt.sack_ok which is just couple flag bits that
tell the TCP variant in use (and it's mostly used only after SYN handshake
completes). The rest (the actual SACK blocks) is in the ack_skb but again
it has very little meaning in SYN_SENT state unless somebody is crazy
enough to add SACK blocks to SYN-ACKs :-).

> > In the absence of the reporter wanting to dump the kernel's
> > core, how about a patch to print the Sack datastructure when
> > the command to disable Sack is received by the kernel?
> > Maybe just print the last 16b of the IP address?
>
> Given the fact that I've had this problem for so long, over a variety
> of networking hardware vendors and colo-facilities, this really sounds
> good to me. It will be challenging for me to justify a kernel core
> dump, but a simple patch to dump the Sack data would be do-able.

If your symptoms really are: SYNs leaving (if they show up in tcpdump, for
sure they've left TCP code already) and SYN-ACK not showing up even in
something as early as in tcpdump (for sure TCP side code didn't execute at
that point yet), there's very little change that Linux' TCP code has some
bug in it, only things that do something in such scenario are the SYN
generation and retransmitting SYNs (and those are trivially verifiable
from tcpdump).


--
i.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/