Re: networking todo, was Re: Linux-2.4.0-test9-pre2

From: Andi Kleen (ak@suse.de)
Date: Tue Sep 19 2000 - 21:38:24 EST


On Tue, Sep 19, 2000 at 06:54:30PM -0700, David S. Miller wrote:
> Date: Wed, 20 Sep 2000 03:51:37 +0200
> From: "Andi Kleen" <ak@suse.de>
>
> > Receiver side SMP reordering is still there, but I'm not sure if it is
> > fixable (but it'll surely hit people that cannot use Linux senders, I
> > just see the reports)
> >
> > Reordering is a non-issue for ipv4/ipv6, all of the massive TCP input
> > rewrite was strictly about dealing with this. No SMP reordering
> > should cause any bogus fast retransmits etc. for example.
>
> You mean TCP output rewrite ? The fix was strictly sender side only
> (I first thought one of the ack changes alexey did was an attempt at a
> hackish receiver side solution, but he told me that was false)
>
> > What is the problem? Are you referring to the LAPB/X25 stuff?
>
> When you have a non linux 2.4 sender you lose.
>
> Alexey's changes detect reordering on the input side, regardless of
> whether it is speaking to a Linux sender or not, to avoid false
> retransmits.

We must be talking about different things. It of course detects it on
ACK input, but only for data it did send itself. Every TCP detects
reordering automatically on the input with the sequence number check,
but all we still do is to send an ACK immediately and go into
quickack mode.

Or did I miss something ?
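
A minimal sketch of the receiver-side sequence number check described
above, for illustration only. The names rx_state, rx_segment and
seq_before are hypothetical and not taken from the kernel source, and
partially overlapping segments are ignored for brevity:

```c
#include <stdint.h>

/* Wraparound-safe test: is sequence number a before b? */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

enum rx_action { RX_IN_ORDER, RX_OUT_OF_ORDER, RX_DUPLICATE };

/* Hypothetical, heavily reduced receiver state. */
struct rx_state {
    uint32_t rcv_nxt;   /* next sequence number expected */
    int quickack;       /* set: ACK immediately instead of delaying */
};

/*
 * Classify an incoming segment. Anything that does not start exactly
 * at rcv_nxt switches the receiver into quickack mode, which is all
 * the receiver does about reordering in this sketch.
 */
static enum rx_action rx_segment(struct rx_state *rx, uint32_t seq,
                                 uint32_t len)
{
    if (seq == rx->rcv_nxt) {
        rx->rcv_nxt += len;
        return RX_IN_ORDER;
    }
    rx->quickack = 1;
    return seq_before(seq, rx->rcv_nxt) ? RX_DUPLICATE : RX_OUT_OF_ORDER;
}
```

Note that the receiver here only learns that a segment arrived out of
order; it cannot tell whether the cause was loss or reordering, so the
immediate ACK it sends looks like a duplicate ACK to the sender.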

The only way to do true reordering handling on the receiver I can think of
would be to use something like the soft-timers to do very very fast delayed
acks even for rcv_mss sized packets and hope that you collect packets from
all CPUs in the delay, but overall it could still cost you a lot by
disturbing the ACK clock.

[I talked a lot with Andrea about this during OLS, and we couldn't
figure out a good way]

> Please show me (and even more importantly Alexey) an example of where
> receive reordering detection is dependent upon Linux TCP sender
> behavior, his code is as generic as I can imagine it to be. In
> fact, his code got lots of the testing on a web server serving almost
> exclusively clients running windows :-)

Web clients probably do not send enough data to make reordering a problem
because the request fits into 1-2 packets and the 3-way handshake is not
reordering sensitive.
(I haven't looked at SpecWeb that closely, but does it really have any
client-sent data that is larger than a packet?)

> Ok, linux_mib is obviously not exact but in that case I would argue
> that the extra size needed (to get to the next power of 2) would
> outweigh whatever instruction performance gain we'd get.
>
> As for the udp etc. case, how do we pick a "number" to make these
> arrays as you say they should be?

Just use __cacheline_aligned instead, like I did with the ip_local_data
in the patch you just rejected. There is still the problem that
generic SMP x86 kernels use a 32-byte cacheline. Not a problem
currently because the only x86 SMP is pII/pIII which has 32-byte lines, but
with Willamette/Athlon SMP it'll be a problem because these have 64-byte
and 128-byte cache lines. With __cacheline_aligned it'll be easy though
to pick up such changes.
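
A userspace sketch of the idea, under stated assumptions: per-CPU
statistics slots padded to a cache line so that two CPUs never write to
the same line. CACHE_LINE_BYTES, NCPUS, mib_slot and the helper names
are hypothetical, and the alignment attribute is the GCC/Clang
extension rather than the kernel's __cacheline_aligned macro itself:

```c
#include <stddef.h>

#define CACHE_LINE_BYTES 64   /* assumption: change in one place per CPU type */
#define NCPUS 4               /* assumption: fixed CPU count for the sketch */

/* Stand-in for the kernel macro, using the GCC/Clang attribute. */
#define my_cacheline_aligned __attribute__((aligned(CACHE_LINE_BYTES)))

/* One slot per CPU; the alignment also rounds sizeof() up to a full line. */
struct mib_slot {
    unsigned long in_pkts;
    unsigned long out_pkts;
} my_cacheline_aligned;

static struct mib_slot stats[NCPUS];

/* Fast path: each CPU touches only its own cache line. */
static void count_in(int cpu)
{
    stats[cpu].in_pkts++;
}

/* Slow path: the reader sums over all CPUs. */
static unsigned long total_in(void)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < NCPUS; i++)
        sum += stats[i].in_pkts;
    return sum;
}
```

Because the padding comes from a single constant, moving to a CPU with
a larger line size only needs that constant changed, at the cost of
more memory per slot.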

> I would only make these changes for the snmp mibs which are "smaller"
> than this "number" we pick, the larger ones won't see much false
> sharing at all.
>
> I think this is really a small and trite issue actually.

I don't think so. There was so much pain to avoid cache line bouncing for
fast paths, it would be a shame to add it again for silly statistics keeping.

> Removing a lot of code also means undoing all the testing done so far
> with that code present :-)))

True.

On the other hand, the inetpeer code is only really exercised on machines
that talk to lots and lots of destinations (=real servers), and 2.4 testing
on such machines still has to begin.

Given that 2.4 testing has not really begun yet I would guess that it is
safer to remove it now @)

-Andi (who is off to bed now)