RE: [PATCH 0/2] Optimize memchr()

From: David Laight
Date: Mon Aug 15 2022 - 06:59:45 EST


From: Pavel Machek
> Sent: 12 August 2022 20:07
>
> Hi!
>
> > This patche series optimized "memchr()" and add a macro for
> > "memchr_inv()" so that both funtions can use it to generate bit mask.
> >
> > The original implementaion of "memchr()" is based on byte-wise comparison,
> > which do not fully use 64-bit or 32-bit register in CPU. We implement a
> > word-wise comparison so that at least 4 bytes can be compared at the same
> > time. The optimized "memchr()" is nearly 4x faster than the original one
> > for long strings. In Linux Kernel, we find that the length of the string
>
> Well... how much slower is it for short strings?

And cold cache??

David

> > searched by "memchr()" is up to 512 bytes in drivers/misc/lkdtm/heap.c.
> > In our test, the optimized version is about 20% faster if the target
> > character is at the end of the string when going through a 512-byte
> > string.
>
> "What is the average length passed to memchr" would be more useful question.
>
> Best regards,
> Pavel

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)