Re: [tip: x86/boot] x86/boot: Use 32-bit XOR to clear registers

From: Andrew Cooper
Date: Fri Mar 01 2024 - 11:02:29 EST


On 01/03/2024 1:10 pm, Ard Biesheuvel wrote:
> On Fri, 1 Mar 2024 at 13:51, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
>> On Fri, Mar 1, 2024 at 1:45 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>>> On Fri, 1 Mar 2024 at 13:39, tip-bot2 for Uros Bizjak
>>> <tip-bot2@xxxxxxxxxxxxx> wrote:
>>>> The following commit has been merged into the x86/boot branch of tip:
>>>>
>>>> Commit-ID: 721f791ce1cddfa5f2bf524ac14741bfa0f72697
>>>> Gitweb: https://git.kernel.org/tip/721f791ce1cddfa5f2bf524ac14741bfa0f72697
>>>> Author: Uros Bizjak <ubizjak@xxxxxxxxx>
>>>> AuthorDate: Wed, 24 Jan 2024 11:38:59 +01:00
>>>> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
>>>> CommitterDate: Fri, 01 Mar 2024 12:47:37 +01:00
>>>>
>>>> x86/boot: Use 32-bit XOR to clear registers
>>>>
>>>> x86_64 zero extends 32-bit operations, so for 64-bit operands,
>>>> XORL r32,r32 is functionally equal to XORQ r64,r64, but avoids
>>>> a REX prefix byte when legacy registers are used.
>>>>
>>> ... and so this change is pointless churn when not using legacy
>>> registers, right?
>> Although there is no code size change with REX registers, it would
>> look weird to use XORQ with REX registers and XORL with legacy regs.
> You are changing an isolated occurrence of XORQ into XORL on the basis
> that XORQ 'looks weird', and would produce a longer opcode if the
> occurrence in question would be using a different register than it
> actually uses.
>
> Apologies for the bluntness, but in my book, this really falls firmly
> into the 'pointless churn' territory. The startup code is not
> performance critical, neither in terms of size nor in speed, and so
> I'd prefer to avoid these kinds of changes. Just my 2c, though - Ingo
> has already merged the patch.

Without trying to get into an argument here...

The better reason is that Silvermont Atoms don't recognise the 64bit
form as a zeroing idiom.  They only recognise the 32bit form of the idiom.

Therefore in fastpaths it is (marginally) important to xorl %r15d, %r15d
rather than xorq %r15, %r15.

But this instance is not a fastpath, and it also doesn't save any
encoding space, so I'm not sure it was really worth changing.

~Andrew