Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm"

From: Andrey Ryabinin
Date: Fri Jul 21 2017 - 05:11:14 EST




On 07/20/2017 11:56 PM, Josh Poimboeuf wrote:
> On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote:
>> FWIW below is my understanding of what's going on.
>>
>> It seems clang treats a local named register variable almost the same
>> as an ordinary local variable.
>> The only difference is that before reading the register variable clang
>> puts the variable's value into the specified register.
>>
>> So clang just assigns a stack slot for the variable __sp where it's
>> going to keep the variable's value.
>> But since __sp is uninitialized (we haven't assigned anything to it),
>> the value of __sp is some garbage from the stack.
>> The inline asm specifies __sp as an input, so clang assumes that it has
>> to load __sp into 'rsp' because the inline asm is going to use
>> it. And it just loads garbage from the stack into 'rsp'.
>>
>> In fact, such behavior (I mean storing the value on the stack and
>> loading it into the reg before the use) is very useful.
>> Clang's behavior makes it possible to keep the value assigned to a
>> call-clobbered register across function calls.
>>
>> Unlike clang, gcc assigns the value to the register right away and
>> doesn't store the value anywhere else. So if the reg is a
>> call-clobbered register you have to be absolutely sure that there is
>> no subsequent function call that might clobber the register.
>>
>> E.g. see some real examples
>> https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm:
>> Fix improper register assignment").
>> These bugs shouldn't happen with clang.
>>
>> But a global named register works slightly differently in clang. For
>> the global, the value is just the value of the register itself,
>> whatever it is. A read/write of a global named register is just like
>> a direct read/write of the register.
>
> Thanks, that clears up a lot of the confusion for me.
>
> Still, unfortunately, I don't think that's going to work for GCC.
> Changing the '__sp' register variable to global in the header file
> causes it to make a *bunch* of changes across the kernel, even in
> functions which don't do inline asm. It seems to be disabling some
> optimizations across the board.

All I see is just a bunch of reorderings of independent instructions, like this:

-ffffffff81012760:  5b           pop    %rbx
-ffffffff81012761:  31 c0        xor    %eax,%eax
+ffffffff81012760:  31 c0        xor    %eax,%eax
+ffffffff81012762:  5b           pop    %rbx

-ffffffff810c29ae:  48 83 c4 28  add    $0x28,%rsp
-ffffffff810c29b2:  89 d8        mov    %ebx,%eax
+ffffffff810c29ae:  89 d8        mov    %ebx,%eax
+ffffffff810c29b0:  48 83 c4 28  add    $0x28,%rsp

I haven't noticed a single bad/harmful change. The size of .text remained the same.

And btw, arm/arm64 already use global current_stack_pointer just fine.
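
To make the global variant concrete, here is a minimal userspace sketch (not kernel code). The declaration follows the arm/arm64 current_stack_pointer pattern; SP_REG and stack_distance() are names I made up for the illustration, and I'm assuming unsigned long is pointer-sized. A read of the global is a read of the register itself, so the value should land close to the address of a local variable.

```c
/* Pick the stack pointer's register name for the target (assumption:
 * x86-64, i386, or an arm-like target where the name is "sp"). */
#if defined(__x86_64__)
#define SP_REG "rsp"
#elif defined(__i386__)
#define SP_REG "esp"
#else
#define SP_REG "sp"
#endif

/* Global named register variable: no storage is allocated for it,
 * every read is a direct read of the register. */
register unsigned long current_stack_pointer __asm__(SP_REG);

/* Hypothetical helper: distance in bytes between the stack pointer and
 * a local variable of the current frame. */
unsigned long stack_distance(void)
{
    volatile char probe = 0;
    unsigned long sp = current_stack_pointer;
    unsigned long addr = (unsigned long)&probe;

    /* The local may sit above or below sp (e.g. in the red zone),
     * so return the absolute difference. */
    return addr > sp ? addr - sp : sp - addr;
}
```

Nothing is ever assigned to current_stack_pointer here, which is exactly the case that goes wrong with the local variant under clang.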

> I do have another idea, which is to replace all uses of
>
> asm(" ... call foo ... " : outputs : inputs : clobbers);
>
> with a new ASM_CALL macro:
>
> ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers);
>
> Then the compiler differences can be abstracted out, with GCC adding
> "sp" as an output constraint and clang doing nothing (for now).
>
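
If I read the idea right, it could look roughly like the sketch below. This is only my guess at the shape, not a proposed patch: ASM_ARGS(), ASM_CALL_SP_REG, DEMO_BODY, asm_call_sp and asm_call_demo() are made-up names, the operand lists are wrapped in ASM_ARGS() so their commas survive macro expansion, and the GCC branch as written needs at least one output operand. The demo body is a plain increment standing in for the real "call foo" sequence, because doing an actual call from userspace inline asm needs extra care (red zone, alignment, clobbers).

```c
/* Stack pointer register name for the target (assumption: x86-64,
 * i386, or an arm-like target where the name is "sp"). */
#if defined(__x86_64__)
#define ASM_CALL_SP_REG "rsp"
#elif defined(__i386__)
#define ASM_CALL_SP_REG "esp"
#else
#define ASM_CALL_SP_REG "sp"
#endif

/* Placeholder for "call foo": increments operand 0. */
#if defined(__x86_64__) || defined(__i386__)
#define DEMO_BODY "inc %0"
#else
#define DEMO_BODY "add %0, %0, #1"
#endif

/* Lets a comma-separated operand list travel as one macro argument. */
#define ASM_ARGS(...) __VA_ARGS__

#if defined(__clang__)
/* clang: emit the asm unchanged, no stack pointer operand. */
#define ASM_CALL(str, outputs, inputs, clobbers) \
    __asm__ __volatile__(str : outputs : inputs : clobbers)
#else
/* GCC: append the stack pointer as an extra in/out operand after the
 * caller's outputs, so existing %0, %1, ... numbering is unchanged.
 * Limitation of this sketch: 'outputs' must not be empty. */
#define ASM_CALL(str, outputs, inputs, clobbers)                      \
    do {                                                              \
        register unsigned long asm_call_sp __asm__(ASM_CALL_SP_REG);  \
        __asm__ __volatile__(str                                      \
                             : outputs, "+r"(asm_call_sp)             \
                             : inputs                                 \
                             : clobbers);                             \
    } while (0)
#endif

/* Hypothetical caller: one in/out operand, no inputs, flags clobbered. */
long asm_call_demo(long v)
{
    ASM_CALL(DEMO_BODY, ASM_ARGS("+r"(v)), ASM_ARGS(), ASM_ARGS("cc"));
    return v;
}
```

Call sites would then stay identical for both compilers, and only the macro would need to change if clang later needs an operand of its own.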