Re: Windows 9x and RFC1323 [problems with tcp SACK option, help?]

Paul (set@pobox.com)
Thu, 9 Dec 1999 18:47:21 -0500 (EST)


On Thu, 9 Dec 1999, Marek Habersack wrote:

@>
@>Hi,
@>
@> For some time I have had a certain problem with users accessing a 2.2.13
@>Linux machine from Windows. The problem was/is that when they, e.g., open a
@>WWW page on my server the page either a) starts loading after a loong time
@>and goes really slow (albeit the ping and traceroute show that delay isn't
@>really big - around 140ms at average) or b) the page shows up with random
@>parts NOT being sent to the opening party - either the images don't show up,
@>or parts of text are missing (after the reload everything's just fine).
@>Also, it seems that POP3 connections break at random times with "server
@>timeout" error message on the user's screen. Now, I started to look at the
@>TCP tunable options in the Linux kernel, since I didn't notice the problem
@>until kernel 2.2.5 or so (don't remember exactly) - so I thought it might be
@>some optimization of the tcp code, or some bug out there. I noticed that
@>three options (tcp_sack, tcp_window_scaling, tcp_timestamps) are turned on -
@>they used to default to 0 before AFAIR. The comment on line 77 of
@>net/ipv4/tcp_input says that these variables are on for testing and that
@>they might be turned off in the final 2.2 (which, I suppose, was the case
@>until some version of 2.2). I turned them off by hand and the problems with
@>WWW and such seem to disappear for now. My question is - is it known that
@>Win9x tcp stack DOESN'T fully support these RFC1323 options? The RFC says
@>they should not interfere with any client code, but it seems they do...
@>

Ive been planning to post something about my problems with the tcp
SACK option, but haven't gotten around too it, partly because of an
overwhelming uninterest in a previous post on the subject on linux-net
I made:)

In short, I haven't been able to find any computer that supports the
SACK option out there in the world that when the option is actually
used, works. At first I thought it was a broken sun implementation or
something, because my machine seemed to be doing the right thing, but
recently I had the same exact problems with ftp.redhat.com. (Im assuming
thats running linux, so even linux to linux its not working.)

Thus I generally run with the tcp_sack option off.

Why do I have problems? Well, because of an incompatibility with my
isp and my modem, I have to run with modem error correction off. This
virtually guarantees a dropped packet on a sufficiently large download,
and if SACK is enabled, it will trigger the use of this tcp option. I
can also get it to drop a packet by jiggling the phone hook, if I want
to force the issue:) In further testing, I was completely unable to
cause SACK to occur with my ether-network. (disconnecting the cables,
playing with the terminator resistor, etc.) So, I only have seen my problems
with PPP.

What, you may ask, are my problems? Well, before I post mind numbing
anotated tcpdumps, I would really like someone to contact me that would
be interested in helping with this problem. Ive read through the tcp code,
but couldnt see what _might_ be wrong....
In short, if Im doing an ftp download, I miss a dropped packet at some
point. Then I start SACK'ing the subsequent packets. Eventually, the
other side has to go back and fill in the missing packet, but it sends
the packet _before_ the one Im missing-- it sends a packet Ive been ACKing
as received the entire time the SACK option is being used. I of course,
ACK this packet, and maintain my SACK options. It sends the same packet again.
Repeat endlessly.

This is not a hardware problem, and it always, and only occurs when the
tcp_sack option is enabled. I am willing to work with anyone to figure out
what could be causing my problems.

Paul

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/