Re: [patch] longstanding chksum patch

Gerhard.Stegmann (mingo@chiara.csoma.elte.hu)
Fri, 10 Sep 1999 13:51:24 +0200 (CEST)


On Fri, 10 Sep 1999, Artur Skawina wrote:

> > clc (as all the other flag-manipulation instructions) is non-pairable.
>
> on a ppro+ CLC is a 1 uop instruction, and as such "pairable". (check
> with an intel manual if you don't believe me). [...]

check out the Intel Optimization Guide, 24281603.pdf, page 121:

CLC - Clear Carry Flag
NP

NP - not pairable, executes in U-pipe.

> [...] Like i said, I've
> tried exchanging the testl for a clc, and, in one case, it _was_ faster.
> so i can believe it could be faster in andreas case too.

i can clearly demonstrate with code that the testl variant pairs nicely
while clc doesn't - and the Intel docs agree with me.
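
For reference, a minimal sketch of the two carry-clearing variants being compared. It is not the code from the patch: the function names are made up for illustration, and it assumes GCC-style inline asm on x86. Both clc and testl leave CF clear before the adcl, so they are interchangeable in function and differ only in how they schedule. The stc in front is there only to show that the carry really does get cleared.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* Hypothetical helper: clear CF with clc, then add with carry. */
static uint32_t adc_after_clc(uint32_t a, uint32_t b)
{
    __asm__ ("stc\n\t"            /* force CF=1 so the clearing is observable */
             "clc\n\t"            /* CF=0 */
             "adcl %1, %0"        /* a = a + b + CF */
             : "+r"(a)
             : "r"(b)
             : "cc");
    return a;
}

/* Hypothetical helper: clear CF as a side effect of testl instead. */
static uint32_t adc_after_testl(uint32_t a, uint32_t b)
{
    __asm__ ("stc\n\t"            /* force CF=1 so the clearing is observable */
             "testl %0, %0\n\t"   /* TEST always clears CF (and OF) */
             "adcl %1, %0"        /* a = a + b + CF */
             : "+r"(a)
             : "r"(b)
             : "cc");
    return a;
}

#else

/* Fallback for other targets (assumption: plain addition stands in for
 * "adc with CF known to be clear"). */
static uint32_t adc_after_clc(uint32_t a, uint32_t b)   { return a + b; }
static uint32_t adc_after_testl(uint32_t a, uint32_t b) { return a + b; }

#endif
```

Both helpers return the plain 32-bit sum, e.g. adc_after_clc(1, 2) and adc_after_testl(1, 2) both give 3. Anything about which one is faster has to come from measurement, not from this sketch.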

> [...you made me curious so i tried his latest patch w/ testl/clc...]
> Hmm, most of the time they come out equal, differences are in the noise
> and depend on measurement method (for a single csum run timed with
> rdtsc the results are almost always identical and rarely CLC wins,
> for 10 runs testl wins by a narrow margin).

unless you _really_ know how to measure pairing effects, be careful before
jumping to conclusions. It's very easy to mess up the measurement.
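
One common way to cut down the noise, as a sketch only: warm the caches with an untimed call, time many runs, and keep the minimum rather than a single sample or an average. Everything named here is an assumption for illustration. csum_sketch is a plain C stand-in and not the kernel's checksum routine, read_ticks and min_ticks are hypothetical helpers, and the tick source is rdtsc on x86 with a clock_gettime fallback elsewhere. This does not by itself isolate pairing effects; it only removes the most obvious sources of error.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* Tick source on x86: the time stamp counter. */
static uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Fallback tick source (assumption): monotonic clock in nanoseconds. */
static uint64_t read_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Hypothetical routine under test: 16-bit ones'-complement sum over
 * little-endian words, folded to 16 bits. */
static unsigned int csum_sketch(const unsigned char *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint64_t)buf[i] | ((uint64_t)buf[i + 1] << 8);
    if (len & 1)
        sum += buf[len - 1];          /* trailing odd byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (unsigned int)sum;
}

/* Time fn over `runs` calls and return the smallest tick delta seen.
 * Returns UINT64_MAX if runs <= 0. */
static uint64_t min_ticks(unsigned int (*fn)(const unsigned char *, size_t),
                          const unsigned char *buf, size_t len, int runs)
{
    uint64_t best = UINT64_MAX;
    volatile unsigned int sink;       /* keeps the calls from being optimized out */
    int i;

    sink = fn(buf, len);              /* warm-up: not timed */
    for (i = 0; i < runs; i++) {
        uint64_t t0 = read_ticks();
        sink = fn(buf, len);
        uint64_t t1 = read_ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    (void)sink;
    return best;
}
```

Usage would be along the lines of min_ticks(csum_sketch, buf, len, 1000) for each variant, comparing the minima. The minimum still includes the overhead of reading the counter, so small differences between two variants should be treated with suspicion.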

-- mingo
