Re: [rc4-amd64] RC4 optimized for AMD64

From: dean gaudet
Date: Mon Nov 01 2004 - 23:51:54 EST


On Mon, 1 Nov 2004, Marc Bevand wrote:

> I have just published a small paper about optimizing RC4 for
> AMD64 (x86-64). A working implementation is also provided:
>
> http://epita.fr/~bevand_m/papers/rc4-amd64.html
>
> Kernel people may be interested given the fact that Linux
> already implements RC4.

you've made a non-portable flags assumption:

> dec %r11b
> ror $8, %r8 # (ror does not change ZF)
> jnz 1b

the contents of ZF are undefined after a rotation... most importantly they
differ between p4 (ZF is set according to result) and k8 (ZF unchanged).

do you really measure a perf improvement from this assumption? note that
p4 would prefer "sub $1, %r11b" here instead of dec... but the difference
is likely minimal.
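
one way to avoid depending on what ror does to ZF is to rotate first, so that dec is the last flag-writing instruction before jnz. a minimal sketch in C with GCC-style inline asm, assuming the rotate and the counter are independent as in the quoted loop. the helper name ror8_loop and the plain C fallback are hypothetical and not from the paper, and any scheduling cost of the reordering is not measured here:

```c
#include <stdint.h>

/* Hypothetical helper (not from the paper): rotates v right by 8 bits
 * `count` times. count is expected to be in 1..255. */
static uint64_t ror8_loop(uint64_t v, uint8_t count)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile(
        "1:\n\t"
        "rorq $8, %0\n\t"  /* rotate first; its effect on flags no longer matters */
        "decb %1\n\t"      /* dec sets ZF from the counter, right before the branch */
        "jnz 1b"
        : "+r"(v), "+q"(count)
        :
        : "cc");
#else
    /* Fallback for other targets: same loop written in plain C. */
    do {
        v = (v >> 8) | (v << 56);
    } while (--count != 0);
#endif
    return v;
}
```

with this ordering the branch only ever sees the flags written by dec, so the loop behaves the same regardless of how a given core treats ZF on a rotate.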

-dean