Re: IPv4 kernel messages

MOLNAR Ingo (mingo@chiara.csoma.elte.hu)
Wed, 9 Sep 1998 20:17:25 +0200 (CEST)


On Wed, 9 Sep 1998, Oliver Xymoron wrote:

> According to Stevens, I had another detail wrong as well - the checksum is
> 16-bit 1's-complement of 16-bit words, not 8-bit words. This is still
> manageable, I think.. We can mimic the sums generated by csum_partial by
> reading in 64-bits, splitting it into two registers, and calculating a 64
> bit checksum. This shouldn't overflow for any packets we're likely to
> handle.. 32-bit carries can be folded back into the lower half of the sum
> at the end. Same number of operations as before in the inner loop.

i have implemented the suggested method this way:

+ movq (%%esi),%%mm1;
+ movq %%mm1, %%mm2
+ punpckhwd %%mm0, %%mm1
+ punpcklwd %%mm0, %%mm2
+ paddd %%mm1, %%mm6
+ paddd %%mm2, %%mm7
+
+ movq 8(%%esi),%%mm1;
+ movq %%mm1, %%mm2
+ punpckhwd %%mm0, %%mm1
+ punpcklwd %%mm0, %%mm2
+ paddd %%mm1, %%mm6
+ paddd %%mm2, %%mm7
[...]
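
For readers who don't think in MMX, here is a rough C sketch of what one
unrolled step above does. It is an illustration only, not code from the
patch: the function name and the acc[] layout are made up, and it assumes
the length is a multiple of 8 (a real routine also has to handle the tail).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name, not from the patch. acc[0..1] play the role of the
 * two 32-bit lanes of %mm7, acc[2..3] those of %mm6. */
static void csum_widen_accumulate(const unsigned char *buf, size_t len,
                                  uint32_t acc[4])
{
    /* Assumption: len is a multiple of 8. */
    for (size_t off = 0; off + 8 <= len; off += 8) {
        uint16_t w[4];

        memcpy(w, buf + off, sizeof w);  /* movq (%esi),%mm1 */
        acc[0] += w[0];                  /* punpcklwd + paddd: low two words */
        acc[1] += w[1];
        acc[2] += w[2];                  /* punpckhwd + paddd: high two words */
        acc[3] += w[3];
    }
}
```

Each 32-bit lane takes one 16-bit word per 8 bytes of input, so a lane can
absorb 65536 maximal words (65536 * 0xffff is still below 2^32) before it
could overflow. That is 512 KiB of data, far more than any single packet.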

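(%%mm0 holds zero. This is the only slightly nontrivial issue: the punpck*
instructions merely interleave their operands and do not zero-extend, so
the zero register supplies the upper 16 bits of each 32-bit lane.) My
original method was something like this: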

+ movq %%mm1, %%mm3;
+ paddd (%%esi),%%mm1;
+ pcmpgtd %%mm1, %%mm3;
+ psubd %%mm3, %%mm1;
+
+ movq %%mm1, %%mm3;
+ paddd 8(%%esi),%%mm1;
+ pcmpgtd %%mm1, %%mm3;
+ psubd %%mm3, %%mm1;

which is about as fast as the suggested extension-based method, but it's a
bit more complex, so i think the extension-based method has more potential
to be faster on other and future CPUs.
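
As a rough C sketch of the carry-based idea (again an illustration with
made-up names, assuming the length is a multiple of 8): add 32 bits at a
time and add 1 back whenever the addition wrapped. Note that the C
comparison below is unsigned, while pcmpgtd is a signed compare, so an MMX
version would presumably have to bias the operands (for example by flipping
the sign bits) to get exactly this test.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 32-bit add with the carry wrapped back in. */
static uint32_t add32_end_around(uint32_t sum, uint32_t x)
{
    uint32_t t = sum + x;    /* paddd: wraps modulo 2^32 */

    return t + (t < sum);    /* unsigned "old > new" means a carry occurred */
}

/* Hypothetical name; lane[0..1] stand in for the two halves of %mm1.
 * Assumption: len is a multiple of 8. */
static void csum_carry_accumulate(const unsigned char *buf, size_t len,
                                  uint32_t lane[2])
{
    for (size_t off = 0; off + 8 <= len; off += 8) {
        uint32_t d[2];

        memcpy(d, buf + off, sizeof d);
        lane[0] = add32_end_around(lane[0], d[0]);
        lane[1] = add32_end_around(lane[1], d[1]);
    }
}
```

The extra compare and subtract per load is the added complexity mentioned
above; the widening method needs no carry handling inside the loop at all.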

the final folding of the packed sums (two 32 bit lanes in each of %%mm6 and
%%mm7) into a 32 bit result checksum can be done this way:

+ paddd %%mm6, %%mm7
+
+ movd %%mm7, %%ecx
+ psrlq $32, %%mm7
+ movd %%mm7, %%ebx
+
+ addl %%ecx, %%eax
+ adcl %%ebx, %%eax
+ adcl $0, %%eax

(%%eax is the incoming 32 bit checksum)
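
The same fold written out in C, as a sketch only. The function name and the
acc[] layout (acc[0..1] for %%mm7, acc[2..3] for %%mm6) are assumptions made
for illustration; the 64-bit temporaries stand in for the carry flag.

```c
#include <stdint.h>

/* Hypothetical name. acc[0..1] model the lanes of %mm7, acc[2..3] those
 * of %mm6; sum is the incoming 32-bit checksum (%eax). */
static uint32_t csum_fold_lanes(const uint32_t acc[4], uint32_t sum)
{
    uint32_t lo = acc[0] + acc[2];   /* paddd %mm6,%mm7 (low lane)  -> %ecx */
    uint32_t hi = acc[1] + acc[3];   /* paddd %mm6,%mm7 (high lane) -> %ebx */
    uint64_t t;

    t = (uint64_t)sum + lo;                  /* addl %ecx,%eax */
    t = (t & 0xffffffffu) + (t >> 32) + hi;  /* adcl %ebx,%eax */
    t = (t & 0xffffffffu) + (t >> 32);       /* adcl $0,%eax   */
    return (uint32_t)t;
}
```

The last step cannot carry again: if the second addition produced a carry,
its low 32 bits are at most 0xfffffffe, so adding 1 still fits.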

i'll post patches to linux-kernel soon; they will benchmark the various
checksumming routines at bootup and pick the fastest one.

-- mingo
