Re: [PATCH] crypto: x86/crc32c-intel - Don't match some Zhaoxin CPUs

From: Ard Biesheuvel
Date: Sat Dec 12 2020 - 05:55:52 EST


On Sat, 12 Dec 2020 at 10:36, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Fri, 11 Dec 2020 at 20:07, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> >
> > On Fri, Dec 11, 2020 at 07:29:04PM +0800, Tony W Wang-oc wrote:
> > > The crc32c-intel driver matches CPUs supporting X86_FEATURE_XMM4_2.
> > > On platforms with Zhaoxin CPUs supporting this x86 feature, when
> > > crc32c-intel and crc32c-generic are both registered, the system will
> > > use crc32c-intel because its .cra_priority is greater than that of
> > > crc32c-generic. On some Zhaoxin CPUs, crc32c-generic is expected to
> > > perform better, so remove support for these Zhaoxin CPUs from
> > > crc32c-intel.
> > >
> > > Signed-off-by: Tony W Wang-oc <TonyWWang-oc@xxxxxxxxxxx>
> >
> > Does this mean that the performance of the crc32c instruction on those CPUs is
> > actually slower than a regular C implementation? That's very weird.
> >
>
> This driver does not use CRC instructions, but carryless
> multiplication and aggregation. So I suppose the pclmulqdq instruction
> triggers some pathological performance limitation here.
>

I just noticed that the driver uses both CRC32 instructions and
PCLMULQDQ instructions. Sorry for the noise.

> That means the crct10dif driver probably needs the same treatment.

Tony, can you confirm that the problem is in the CRC instructions and
not in the PCLMULQDQ code path that supersedes it when available?
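
For anyone following along, the selection behaviour described in the
patch description can be modelled in user space roughly as below. This
is only a sketch: struct alg_model and pick_alg() are made-up names, not
kernel API, and the priority values 100 and 200 are illustrative
assumptions rather than values quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for an algorithm descriptor. Only the fields
 * needed to illustrate priority-based selection are modelled here.
 */
struct alg_model {
	const char *cra_name;        /* algorithm name, e.g. "crc32c" */
	const char *cra_driver_name; /* name of one implementation */
	int cra_priority;            /* higher value wins */
};

/*
 * Return the highest-priority entry whose cra_name matches 'name',
 * or NULL if no registered entry provides that algorithm.
 */
static const struct alg_model *pick_alg(const struct alg_model *algs,
					size_t n, const char *name)
{
	const struct alg_model *best = NULL;
	size_t i;

	assert(name != NULL);

	for (i = 0; i < n; i++) {
		if (strcmp(algs[i].cra_name, name) != 0)
			continue;
		if (!best || algs[i].cra_priority > best->cra_priority)
			best = &algs[i];
	}
	return best;
}

/* Assumed, illustrative priorities; check the drivers for real values. */
static const struct alg_model both[] = {
	{ "crc32c", "crc32c-generic", 100 },
	{ "crc32c", "crc32c-intel",   200 },
};
```

With both entries present, pick_alg(both, 2, "crc32c") returns the
crc32c-intel entry. With only the first entry present it returns
crc32c-generic, which is the outcome the patch is after on the affected
CPUs.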