Re: TCP stack bug related to F-RTO?

From: Joe Cao
Date: Sat Sep 26 2009 - 12:54:00 EST


Hi Ilpo,

Can you elaborate on "Some retransmission would happen here as step 3"? When the second timeout happens, it will again go into FRTO and then retransmit the write queue head.
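
To put a rough number on what that loop would cost, here is a toy model in Python (not kernel code). It assumes, as the 2.6.24 trace suggests, that each lost packet costs one full timeout and that the timeout doubles every time. The function name, the 1 s initial RTO and the optional 120 s cap are my own illustrative assumptions, not values taken from the trace.

```python
# Toy model only, not kernel code. Assumption: every lost packet costs one
# full timeout, and the timeout doubles each time (as in the 2.6.24 trace).

def recovery_time(lost_packets, initial_rto, rto_max=None):
    """Total seconds spent waiting on timers to retransmit all lost packets."""
    total = 0.0
    rto = float(initial_rto)
    for _ in range(lost_packets):
        total += rto          # wait one timeout, then retransmit the queue head
        rto *= 2              # exponential backoff before the next timeout
        if rto_max is not None:
            rto = min(rto, rto_max)   # optional cap on the backed-off timer
    return total


# 19 lost packets (pkt #14-#32), hypothetical 1 s initial RTO, no cap:
print(recovery_time(19, 1.0))                 # 524287.0
# Same, with a hypothetical 120 s cap on the timer:
print(recovery_time(19, 1.0, rto_max=120.0))  # 1567.0
```

Even with the cap, that is about 26 minutes of timer waits, so the client giving up with a RST first is what I would expect if only the head is retransmitted per timeout.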

I looked at the patch (Debian Bug#478062) that is probably the fix you mentioned. All it does is exclude the SACK case when considering FRTO. But in my case SACK was enabled, as seen in the trace.

In other words, do we still have a problem with FRTO when SACK is enabled in the latest kernel?
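
For reference, this is how I understand the basic (non-SACK) F-RTO decision described in RFC 5682, written as a small Python sketch. It is a simplification: the function name is mine, it ignores SACK and the "recover" sequence-number checks, and it is not meant to mirror the kernel's actual code paths.

```python
# Simplified reading of the basic F-RTO steps in RFC 5682; names are mine,
# and this ignores SACK and the "recover" sequence-number checks.

def frto_verdict(acks_advance):
    """acks_advance: for the first two ACKs after the RTO retransmission,
    True if the ACK advances the window, False if it is a duplicate ACK."""
    first, second = acks_advance
    if not first:
        # Duplicate ACK right after the retransmission: conventional recovery.
        return "conventional RTO recovery"
    # First ACK advanced the window: sender transmits up to two NEW segments.
    if not second:
        # Duplicate ACK for the new data: the timeout was genuine, so the
        # sender should now retransmit the remaining unacknowledged segments.
        return "conventional RTO recovery"
    # Second ACK also advanced the window: the timeout is declared spurious.
    return "spurious timeout"


# The pattern in my trace: the ACK for the retransmission advances the
# window, then the new data only draws duplicate ACKs.
print(frto_verdict((True, False)))   # conventional RTO recovery
```

By this reading the dup-ACKs in my trace should have pushed the sender into conventional recovery, that is, retransmitting the rest of the outstanding data rather than waiting for another timeout.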

Thanks,
Joe

--- On Fri, 9/25/09, Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx> wrote:

> From: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx>
> Subject: Re: TCP stack bug related to F-RTO?
> To: "Joe Cao" <caoco2002@xxxxxxxxx>
> Cc: "Ray Lee" <ray-lk@xxxxxxxxxxxxx>, "Netdev" <netdev@xxxxxxxxxxxxxxx>, "LKML" <linux-kernel@xxxxxxxxxxxxxxx>
> Date: Friday, September 25, 2009, 11:03 AM
> On Fri, 25 Sep 2009, Joe Cao wrote:
>
> > Thanks for the reply!  Do you happen to know which patch fixed the
> > problem?
>
> You can find those patches in the stable queue git tree. In the last
> mail I gave you a hint about which release to look in. However, as
> 2.6.24 is obsolete anyway, my recommendation is that you should
> probably consider upgrading to fix all the other bugs that have been
> found since 2.6.24 was obsoleted.
>
> > Is there a bug tracking system for the Linux kernel?
>
> Nothing that knows everything about everything.
>
> > I studied the FRTO code in the latest kernel, 2.6.31. It seems the
> > problem is still there:
> >
> > 1. Every time an RTO fires, because tcp_is_sackfrto(tp) returns 1,
> > tcp_use_frto() returns true, and the server TCP enters FRTO.
> > 2. After the head of the write queue is retransmitted, two new data
> > packets are transmitted and the server receives two dup-ACKs. That
> > makes the TCP enter tcp_enter_frto_loss(); however, that only resets
> > ssthresh and some other fields.
>
> Perhaps those other fields are far more important than you think... :-)
> ...Some retransmission would happen here as step 3.
>
> > 3. After another, longer RTO fires, because tcp_is_sackfrto(tp)
> > returns 1, tcp_use_frto() again returns true. The stack enters FRTO
> > again.
> > 4. The above repeats and the stack can't retransmit the lost packets
> > any faster.
> >
> > Is my understanding above correct?
>
> ...No. All the magic that happens in tcp_enter_frto_loss should be
> enough to really do more than a single retransmission (that is, in any
> kernel other than the 2.6.24 series). There was an unfortunate bug in
> this area in 2.6.24 which basically undid the effect of the correct
> actions tcp_enter_frto_loss took, which effectively prevented
> tcp_xmit_retransmit_queue from doing its part.
>
> --
> i.
>
> --- On Fri, 9/25/09, Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx> wrote:
>
> > From: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx>
> > Subject: Re: TCP stack bug related to F-RTO?
> > To: "Ray Lee" <ray-lk@xxxxxxxxxxxxx>
> > Cc: "Joe Cao" <caoco2002@xxxxxxxxx>, "Netdev" <netdev@xxxxxxxxxxxxxxx>, "LKML" <linux-kernel@xxxxxxxxxxxxxxx>, jcaoco2002@xxxxxxxxx
> > Date: Friday, September 25, 2009, 6:09 AM
> > On Thu, 24 Sep 2009, Ray Lee wrote:
> >
> > > [adding netdev cc:]
> > >
> > > On Thu, Sep 24, 2009 at 10:43 AM, Joe Cao <caoco2002@xxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > I have found the following behavior with different versions of the
> > > > Linux kernel. The attached pcap trace was collected with the server
> > > > (192.168.0.13) running 2.6.24 and shows the problem. Basically the
> > > > behavior is like this:
> > > >
> > > > 1. The client opens up a big window.
> > > > 2. The server sends 19 packets in a row (pkt #14-#32 in the
> > > > trace), but all of them are dropped due to some congestion.
> > > > 3. The server hits RTO and retransmits pkt #14 in #33.
> > > > 4. The client immediately acks #33 (=#14), and the server (which
> > > > seems to enter F-RTO) expands the window and sends *NEW* pkt #35 &
> > > > #36. The timeout is doubled to 2*RTO. The client immediately sends
> > > > two dup-ACKs in response to #35 and #36.
> > > > 5. After 2*RTO, pkt #15 is retransmitted in #39.
> > > > 6. The client immediately acks #39 (=#15) in #40, and the server
> > > > continues to expand the window and sends two *NEW* pkts #41 & #42.
> > > > Now the timeout is doubled to 4*RTO.
> > > > 8. After the 4*RTO timeout, #16 is retransmitted.
> > > > 9....
> > > > 10. The above steps repeat for retransmitting pkt #16-#32 and each
> > > > time the timeout is doubled.
> > > > 11. It takes a very long time to retransmit all the lost packets,
> > > > and before that is done, the client sends a RST because of a
> > > > timeout.
> > > >
> > > > The above behavior looks like F-RTO is in effect, and there seems
> > > > to be a bug in the TCP congestion control and retransmission
> > > > algorithm. Why doesn't the TCP on the server (running 2.6.24) enter
> > > > slow start? Why should the server take that long to recover from a
> > > > short period of packet loss?
> > > >
> > > > Has anyone else noticed a similar problem before? If my analysis is
> > > > wrong, can anyone give me some pointers to what's really wrong and
> > > > how to fix it?
> >
> > Yes, 2.6.24 is an obsoleted version with known bugs in its FRTO
> > implementation. The fixes never went into the 2.6.24 stable series as
> > it was _already_ obsoleted when the problems were reported and found.
> > The correct fixes may be found in 2.6.25.7 (.7 iirc) and are included
> > from 2.6.26 onward too.
> >
> > Just in case you happen to run an Ubuntu-based kernel from that era
> > (of course you should be reporting the bug here then...), a word of
> > warning: it seemed nearly impossible for them to get a simple thing
> > like that fixed. I haven't checked whether they eventually came to
> > some sensible conclusion in that matter or whether it is still
> > unresolved (or, e.g., closed without real resolution).
>
>



