RE: x86/csum: Remove unnecessary odd handling

From: David Laight
Date: Sun Jan 07 2024 - 06:44:59 EST


From: H. Peter Anvin
> Sent: 07 January 2024 01:09
>
> On January 6, 2024 2:08:48 PM PST, David Laight <David.Laight@xxxxxxxxxx> wrote:
...
> >The best loop for 256+ bytes is an adcx/adox one.
> >However that requires the run-time patching.
...
> Rather than runtime patching perhaps separate paths...

It will need to detect the CPU type earlier, so a static
branch is probably enough.
That is easier than substituting the entire code block.
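
A rough userspace sketch of the 'detect once, then branch' idea.
This is not the kernel code: the plain bool only stands in for a
static key, and csum_init(), csum_words(), use_adx_loop and the
helpers are names made up for illustration. The 'two chain' loop
is ordinary C that merely mimics the two independent carry chains
an adcx/adox loop would use; the 256 byte threshold comes from the
remark quoted above. It assumes the length is a multiple of 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical flag, written once at start-up (stand-in for a static key). */
static bool use_adx_loop;

/* ADX (adcx/adox) is reported in CPUID leaf 7, subleaf 0, EBX bit 19. */
static bool cpu_has_adx(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return (ebx >> 19) & 1;
#endif
	return false;
}

/* 64-bit add with end-around carry (ones' complement addition). */
static uint64_t add64_carry(uint64_t sum, uint64_t v)
{
	sum += v;
	return sum + (sum < v);
}

/* Fold a 64-bit ones' complement sum down to 16 bits. */
static uint16_t fold64(uint64_t sum)
{
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Generic path: a single dependent carry chain. */
static uint64_t sum_one_chain(const unsigned char *p, size_t nwords)
{
	uint64_t sum = 0, w;

	for (size_t i = 0; i < nwords; i++) {
		memcpy(&w, p + 8 * i, 8);
		sum = add64_carry(sum, w);
	}
	return sum;
}

/* Placeholder for the adcx/adox loop: two independent accumulators. */
static uint64_t sum_two_chains(const unsigned char *p, size_t nwords)
{
	uint64_t a = 0, b = 0, w;
	size_t i = 0;

	for (; i + 1 < nwords; i += 2) {
		memcpy(&w, p + 8 * i, 8);
		a = add64_carry(a, w);
		memcpy(&w, p + 8 * i + 8, 8);
		b = add64_carry(b, w);
	}
	if (i < nwords) {
		memcpy(&w, p + 8 * i, 8);
		a = add64_carry(a, w);
	}
	return add64_carry(a, b);
}

/* Detect the CPU feature once, before any checksum is computed. */
void csum_init(void)
{
	use_adx_loop = cpu_has_adx();
}

/* Checksum len bytes (len must be a multiple of 8 in this sketch). */
uint16_t csum_words(const void *buf, size_t len)
{
	const unsigned char *p = buf;

	assert(len % 8 == 0);
	if (use_adx_loop && len >= 256)
		return fold64(sum_two_chains(p, len / 8));
	return fold64(sum_one_chain(p, len / 8));
}
```

Both paths give the same folded result, so the flag only selects
which loop runs. In the kernel the 'if' would be a static branch
rather than a load and test of a variable.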

I think it is Silvermont and Knights Landing (Intel Atom family)
that have a 4 clock penalty for 64-bit adcx.
That might only be a decode penalty, so it doesn't affect
the loop 'that much' (adc is 2 clocks on those CPUs).
So it is probably not actually worth doing a run-time
performance check.

I might 'cook up' a full checksum function later.

David
