IP Checksumming

Richard B. Johnson (root@analogic.com)
Mon, 18 Nov 1996 17:49:02 -0500 (EST)


I can't figure out the %)&^$ GNU pseudo assembly that thinks that
all assembly is written like a 68k (source->dest). However, using
Intel mnemonics, the following checksums an IP packet the fastest.
It can also copy while checksumming.

It minimizes the number of jumps, which flush the prefetch buffers and
mess up the speed.

I have watched Linux use different checksum methods throughout the
years and it seems that things are getting worse.

If someone could convert this to the GNU stuff without breaking the
logic, I know it would perform MUCH better than the present kernel
IP checksums even though it uses words for memory access. The present
checksum routines try several tricks. However, every time a branch
on compare occurs, there is an enormous penalty because the prefetch
buffer must be flushed. Further, the present routines in ../../asm
don't take advantage of the Intel architecture. These routines should
not be 'portable', but should be put into ../../i386/lib to take advantage
of the string primitives built into the Intel devices. Other machines
often have other built-in primitives that could help them also.

I have used this in a TCP/IP "stack" for about 4 years here at
Analogic and I can get over 90 percent of 10BASE-T bandwidth with
a 33MHz machine!

Contrary to the documentation written in /i386/lib/checksum.c, the
Pentium does not suffer from lack of alignment as long as you use
the internal routines.

I am sure that if someone takes the time to convert this stuff, they
will be very pleasantly surprised.

;
;
;       inst    dest, source
;
        mov     esi,offset source_addr  ; Get location of string
IFDEF   COPY_WHILE_CHKSUM
        mov     edi,offset dest_addr    ; Where to copy
ENDIF
        mov     ecx,string_length       ; Get string length in 16-bit words (must be > 0)
        xor     eax,eax                 ; Zero register, clear CY
        mov     edx,eax                 ; Zero accumulator
        cld                             ; Forwards
;
l0:     lodsw                           ; Get word into ax, source_addr += 2
IFDEF   COPY_WHILE_CHKSUM
        stosw                           ; Put word, dest_addr += 2
ENDIF
        adc     edx,eax                 ; Sum to accumulator + CY
        loop    l0                      ; lodsw/stosw/loop leave CY untouched
;
        adc     edx,0                   ; Possible last carry
IFDEF   GONNA_SUM_INTO_WHAT_YOU_CHECKSUMMED
        not     edx                     ; Invert all bits
ENDIF
        mov     eax,edx                 ; To return in eax (32-bit sum, not yet folded to 16 bits)
;
; That's all folks!
;
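
For anyone who wants to check a conversion against something readable,
here is a rough C sketch of the same arithmetic. It is not the kernel's
routine and not a tuned replacement for the assembly above. The names
csum_words and csum_fold are made up for this illustration. It assumes the
length is given in 16-bit words, as above, and that the packet is no
longer than 65536 bytes, so the 32-bit accumulator cannot overflow and the
carry handling done by adc is not needed. It also adds the final fold from
32 bits down to 16, which the assembly leaves to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sum nwords 16-bit words starting at src into a 32-bit accumulator.
 * If dst is not NULL, each word is also copied there (copy while
 * checksumming). Hypothetical helper, for illustration only.
 */
static uint32_t csum_words(const void *src, void *dst, size_t nwords)
{
    const unsigned char *s = src;
    unsigned char *d = dst;
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < nwords; i++) {
        uint16_t w;

        memcpy(&w, s + 2 * i, sizeof w);        /* like lodsw, any alignment */
        if (d != NULL)
            memcpy(d + 2 * i, &w, sizeof w);    /* like stosw */
        sum += w;                               /* no overflow for <= 32768 words */
    }
    return sum;
}

/*
 * Fold the 32-bit sum to 16 bits with end-around carry, then invert.
 * The result is in host byte order, ready to be stored as-is into the
 * checksum field of the buffer that was summed.
 */
static uint16_t csum_fold(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);         /* add the carries back in */
    sum = (sum & 0xffff) + (sum >> 16);         /* the first add may carry again */
    return (uint16_t)~sum;
}
```

Typical use would be csum_fold(csum_words(hdr, NULL, len_in_words)) with
the checksum field zeroed first. Summing a buffer that already contains
a correct checksum and folding the result should give 0.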