Re: [RFC][PATCH 15/22] x86,vmx: Remove .fixup usage

From: Sean Christopherson
Date: Fri Nov 05 2021 - 14:18:14 EST


On Thu, Nov 04, 2021, Peter Zijlstra wrote:
> In the vmread exceptin path, use the, thus far, unused output register
> to push the @fault argument onto the stack. This, in turn, enables the
> exception handler to not do pushes and only modify that register when
> an exception does occur.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> arch/x86/kvm/vmx/vmx_ops.h | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> --- a/arch/x86/kvm/vmx/vmx_ops.h
> +++ b/arch/x86/kvm/vmx/vmx_ops.h
> @@ -80,9 +80,11 @@ static __always_inline unsigned long __v
> * @field, and bounce through the trampoline to preserve
> * volatile registers.
> */
> - "push $0\n\t"
> + "xorl %k1, %k1\n\t"
> + "2:\n\t"
> + "push %1\n\t"
> "push %2\n\t"

This trick doesn't work if the compiler selects the same GPR for %1 and %2, as
the "field" will get lost.

0x00000000000005a2 <+66>: 0f 78 c0 vmread %rax,%rax
0x00000000000005a5 <+69>: 3e 77 0b ja,pt 0x5b3 <vmx_read_guest_seg_selector+83>
0x00000000000005a8 <+72>: 31 c0 xor %eax,%eax
0x00000000000005aa <+74>: 50 push %rax
0x00000000000005ab <+75>: 50 push %rax
0x00000000000005ac <+76>: e8 00 00 00 00 callq 0x5b1 <vmx_read_guest_seg_selector+81>
0x00000000000005b1 <+81>: 58 pop %rax
0x00000000000005b2 <+82>: 58 pop %rax

I'm struggling to think of a not-awful way to handle this without custom fixup.
Once "asm goto with output" is ubiquitous, we can get rid of this mess entirely,
but until then...

The best idea I can come up with is to abuse the current spec and use bits N:16
to indicate a fault, i.e. use EX_TYPE_EFAULT_REG. VMCS fields are technically
32-bit values, but bits 31:15 are reserved. That would yield a false positive
if KVM really screwed up and/or state was corrupted and the VMREAD actually tried
to read the non-existed field -EFAULT, but that's very, very unlikely (-1 would
be slightly more likey, hence EX_TYPE_EFAULT_REG instead of EX_TYPE_NEG_REG).
The fault patch would also lose the field, but vmread_error() doesn't consume
that info, I can't think of a reason that KVM would want to, and if the debugger
cares, most of the time the field is hardcoded at the call site.

Another horrific idea would be to encode the fault information in EFLAGS. On
VM-Fail, the arithmetic flags are all set to specific values. That would require
yet another flavor of _ASM_EXTABLE_ though, and would make the asm subroutine
even more ugly than it already is. Probably not worth it.

Forcing different registers would be another option, but I would strongly prefer
to give the compiler free reign to optimize the happy path.

Tested all paths of this on 64-bit and 32-bit KVM.