Re: [PATCH v3] lib: optimize cpumask_next_and()

From: Alexey Dobriyan
Date: Thu Oct 26 2017 - 08:58:07 EST


> - Refactored _find_next_common_bit into _find_next_bit, as suggested
> by Yury Norov. This has no adverse effects on the performance side,
> as the compiler successfully inlines the code.

1)
Gentoo ships GCC 5.4.0, which doesn't inline this code on x86_64 defconfig
(which has OPTIMIZE_INLINING).


ffffffff813556c0 <find_next_bit>:
ffffffff813556c0: 55 push rbp
ffffffff813556c1: 48 89 d1 mov rcx,rdx
ffffffff813556c4: 45 31 c0 xor r8d,r8d
ffffffff813556c7: 48 89 f2 mov rdx,rsi
ffffffff813556ca: 31 f6 xor esi,esi
ffffffff813556cc: 48 89 e5 mov rbp,rsp
ffffffff813556cf: e8 7c ff ff ff call ffffffff81355650 <_find_next_bit>
ffffffff813556d4: 5d pop rbp
ffffffff813556d5: c3 ret
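
For readers who don't want to decode the register moves: under the x86-64
System V calling convention the listing above forwards (addr, size, offset)
as (addr, NULL, size, offset, 0) to an out-of-line helper. Below is a rough
userspace sketch of that shape. The sketch_* names are made up for
illustration, and the meaning of the second and fifth arguments (an optional
second bitmap and an invert mask) is an assumption, not something the
disassembly proves.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Count trailing zero bits; the caller guarantees w != 0. */
static unsigned long sketch_ctz(unsigned long w)
{
	unsigned long n = 0;

	while (!(w & 1UL)) {
		w >>= 1;
		n++;
	}
	return n;
}

/* Fetch one word, AND in the optional second bitmap, apply the invert mask. */
static unsigned long sketch_word(const unsigned long *addr1,
				 const unsigned long *addr2,
				 unsigned long idx, unsigned long invert)
{
	unsigned long tmp = addr1[idx];

	if (addr2)
		tmp &= addr2[idx];
	return tmp ^ invert;
}

/* Hypothetical stand-in for the five-argument helper being called. */
static unsigned long sketch_find_next(const unsigned long *addr1,
				      const unsigned long *addr2,
				      unsigned long nbits, unsigned long start,
				      unsigned long invert)
{
	unsigned long tmp;

	if (start >= nbits)
		return nbits;

	tmp = sketch_word(addr1, addr2, start / SKETCH_BITS_PER_LONG, invert);
	/* Ignore bits below the starting offset in the first word. */
	tmp &= ~0UL << (start % SKETCH_BITS_PER_LONG);
	start -= start % SKETCH_BITS_PER_LONG;

	while (!tmp) {
		start += SKETCH_BITS_PER_LONG;
		if (start >= nbits)
			return nbits;
		tmp = sketch_word(addr1, addr2, start / SKETCH_BITS_PER_LONG,
				  invert);
	}

	start += sketch_ctz(tmp);
	return start < nbits ? start : nbits;
}

/* The wrapper from the listing: same arguments, NULL and 0 filled in. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long offset)
{
	return sketch_find_next(addr, NULL, size, offset, 0UL);
}
```

If the helper is not inlined, every sketch_find_next_bit() call pays for the
extra call plus the addr2 and invert checks at run time, which is the cost
the listing is pointing at.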

2)
Making the "and" operation the centerpiece of this code is kind of meh:
find_next_or_bit() will be hard to implement.
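
To make the concern concrete, here is a deliberately naive userspace sketch
in which the combining operation is a parameter instead of a hard-wired "&".
Nothing here is from the patch or from the kernel; the sketch_* names and the
enum are hypothetical, and the loop walks bit by bit for clarity rather than
speed.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical selector for how two bitmap words are combined. */
enum sketch_op { SKETCH_OP_AND, SKETCH_OP_OR };

static unsigned long sketch_combine(unsigned long a, unsigned long b,
				    enum sketch_op op)
{
	return op == SKETCH_OP_AND ? (a & b) : (a | b);
}

/*
 * Return the index of the first bit >= start that is set in
 * (addr1 op addr2), or nbits if there is none.
 */
static unsigned long sketch_find_next_op(const unsigned long *addr1,
					 const unsigned long *addr2,
					 unsigned long nbits,
					 unsigned long start,
					 enum sketch_op op)
{
	unsigned long i;

	for (i = start; i < nbits; i++) {
		unsigned long word = sketch_combine(
			addr1[i / SKETCH_BITS_PER_LONG],
			addr2[i / SKETCH_BITS_PER_LONG], op);

		if (word & (1UL << (i % SKETCH_BITS_PER_LONG)))
			return i;
	}
	return nbits;
}
```

With a helper that only knows how to "&" in an optional second bitmap, an
"or" variant needs either another argument like the one above or a second
copy of the loop.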