Re: What is TCPRenoRecoveryFail ?

From: Bjorn Wesen (bjorn.wesen@axis.com)
Date: Wed Feb 27 2002 - 13:22:29 EST


On Wed, 27 Feb 2002, bert hubert wrote:
> On Wed, Feb 27, 2002 at 01:46:55PM +0000, Bjorn Wesen wrote:
> > I have a TCP connection that is sending bulk data from a Linux 2.4.17
> > machine to a client. At some point, one of the packets from the Linux
> > machine is lost, so the client asks for a retransmit by acking the last
> > received correct packet. Then the Linux machine just keeps filling the
> > clients open window, ignoring that and subsequent retransmit requests,
> > never retransmitting any data.
>
> Please show a tcpdump -v of this happening, including the initial SYN
> packets. I strongly suspect something in your network of mucking with TCP
> options.

OK, this is mangled by the email client, but I attached the binary dump of
the relevant packets. The dump was taken on the Windows machine, which
complicates the analysis because the network card itself could be
screwing up, but that is unlikely because all communication after the
TCP timeout works fine (as does parallel communication). tcpdump says
'squid' because the src port happened to be 3128.

23:46:43.009000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 4269884068:4269885528(1460) ack 7148250 win 5840 (DF) (ttl 64, id
37958, len 1500)
23:46:43.009000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 1460 win 8760 (DF) (ttl 128, id 54605, len 40)
23:46:43.009000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 1460:2920(1460) ack 1 win 5840 (DF) (ttl 64, id 37959, len 1500)
23:46:43.009000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 2920 win 8760 (DF) (ttl 128, id 54861, len 40)
23:46:43.010000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 2920:4380(1460) ack 1 win 5840 (DF) (ttl 64, id 37960, len 1500)
23:46:43.010000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 5840:7300(1460) ack 1 win 5840 (DF) (ttl 64, id 37962, len 1500)
^^^^-- the last data packet Windows receives

23:46:43.010000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55117, len 40)
^^^^-- first in a row of futile ACKs at relative seq 4380

23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 7300:8760(1460) ack 1 win 5840 (DF) (ttl 64, id 37963, len 1500)
23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55373, len 40)
23:46:43.011000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: . [tcp
sum ok] 8760:10220(1460) ack 1 win 5840 (DF) (ttl 64, id 37964, len 1500)
23:46:43.011000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55629, len 40)
23:46:43.012000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: P [tcp
sum ok] 10220:11680(1460) ack 1 win 5840 (DF) (ttl 64, id 37965, len 1500)
23:46:43.012000 dh10-13-18-213.axis.se.squid > 10.13.18.46.http: . [tcp
sum ok] ack 4380 win 8760 (DF) (ttl 128, id 55885, len 40)

.. long timeout here until the server finally gives up the connection ..

23:56:46.111000 10.13.18.46.http > dh10-13-18-213.axis.se.squid: F [tcp
sum ok] 11680:11680(0) ack 1 win 5840 (DF) (ttl 64, id 37966, len 40)
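
To make the trace above easier to check, here is a rough sketch that takes
the relative sequence ranges and ACK numbers as they appear in the dump,
finds the hole in the data stream, and counts the repeated ACKs. The helper
names (find_gaps, count_duplicate_acks) are made up for illustration, and
the numbers are typed in by hand from the dump, not parsed from it. As far
as I know, three duplicate ACKs is the usual fast retransmit threshold, so
this is the point where a retransmit would normally be expected.

```python
# Relative sequence ranges of the data segments in the dump (server -> client).
# The first segment is taken as relative 0, the way tcpdump prints it here.
segments = [(0, 1460), (1460, 2920), (2920, 4380), (5840, 7300),
            (7300, 8760), (8760, 10220), (10220, 11680)]

# Cumulative ACK numbers sent by the client, in capture order.
acks = [1460, 2920, 4380, 4380, 4380, 4380]


def find_gaps(segs):
    """Return the (start, end) byte ranges missing from the segment list."""
    gaps = []
    expected = segs[0][0]
    for start, end in sorted(segs):
        if start > expected:
            gaps.append((expected, start))
        expected = max(expected, end)
    return gaps


def count_duplicate_acks(ack_list):
    """Count how often each ACK number repeats the ACK just before it."""
    dups = {}
    prev = None
    for ack in ack_list:
        if ack == prev:
            dups[ack] = dups.get(ack, 0) + 1
        prev = ack
    return dups


gaps = find_gaps(segments)
dup_acks = count_duplicate_acks(acks)
print(gaps)      # [(4380, 5840)]
print(dup_acks)  # {4380: 3}
```

So the segment 4380:5840 (IP id 37961 would fit the sequence of ids) never
shows up in the dump, and the client acks 4380 four times in a row, i.e.
three duplicates.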

I just hope I'm not doing anything stupid... although that would be the
best outcome, of course, since it would mean less work to get this going :)

The seq numbers are close to 4G but do not wrap, btw.
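
A quick way to double-check the wrap claim for the captured part of the
connection, using only the absolute sequence number of the first segment in
the dump and the highest relative one (the FIN). The variable names are just
for this sketch, and it says nothing about bytes sent before the capture
started.

```python
FIRST_SEQ = 4269884068   # absolute seq of the first data segment in the dump
LAST_REL = 11680         # highest relative seq in the dump (the FIN)
SEQ_SPACE = 2 ** 32      # TCP sequence numbers are 32 bits wide

last_abs = FIRST_SEQ + LAST_REL
wraps = last_abs >= SEQ_SPACE
headroom = SEQ_SPACE - last_abs

print(last_abs)   # 4269895748
print(wraps)      # False
print(headroom)   # 25071548 bytes left before the sequence space wraps
```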

-BW


