Re: [PATCH] KVM: PPC: BOOK3S: book3s_hv_nested.c: improve branch prediction for k.alloc

From: Michael Ellerman
Date: Tue Apr 11 2023 - 02:35:25 EST


Kautuk Consul <kconsul@xxxxxxxxxxxxxxxxxx> writes:
> On 2023-04-07 09:01:29, Sean Christopherson wrote:
>> On Fri, Apr 07, 2023, Bagas Sanjaya wrote:
>> > On Fri, Apr 07, 2023 at 05:31:47AM -0400, Kautuk Consul wrote:
>> > > I used the unlikely() macro on the return values of the k.alloc
>> > > calls and found that it changes the code generation a bit.
>> > > Optimize all return paths of k.alloc calls by improving
>> > > branch prediction on return value of k.alloc.
>>
>> Nit, this is improving code generation, not branch prediction.
> Sorry my mistake.
>>
>> > What about below?
>> >
>> > "Improve branch prediction on kmalloc() and kzalloc() call by using
>> > unlikely() macro to optimize their return paths."
>>
>> Another nit, using unlikely() doesn't necessarily provide a measurable optimization.
>> As above, it does often improve code generation for the happy path, but that doesn't
>> always equate to improved performance, e.g. if the CPU can easily predict the branch
>> and/or there is no impact on the cache footprint.

> I see. I will submit a v2 of the patch with a better and more accurate
> description. Does anyone else have any comments before I do so ?

In general I think unlikely should be saved for cases where either the
compiler is generating terrible code, or the likelyness of the condition
might be surprising to a human reader.

eg. if you had some code that does a NULL check and it's *expected* that
the value is NULL, then wrapping that check in likely() actually adds
information for a human reader.

Also please don't use unlikely in init paths or other cold paths, it
clutters the code (only slightly but a little) and that's not worth the
possible tiny benefit for code that only runs once or infrequently.

I would expect the compilers to do the right thing in all
these cases without the unlikely. But if you can demonstrate that they
meaningfully improve the code generation with a before/after
dissassembly then I'd be interested.

cheers