Re: [PATCH] Fix undefined operation VMXOFF during reboot and crash

From: Thomas Gleixner
Date: Wed Jun 10 2020 - 17:34:45 EST


"David P. Reed" <dpreed@xxxxxxxxxxxx> writes:
> +/*
> + * Fix any unwanted undefined operation fault due to VMXOFF instruction that
> + * is needed to ensure that CPU is not in VMX root operation at time of
> + * a reboot/panic CPU reset. There is no safe and reliable way to know
> + * if a processor is in VMX root operation, other than to skip the
> + * VMXOFF. It is safe to just skip any VMXOFF that might generate this
> + * exception, when VMX operation is enabled in CR4. In the extremely
> + * rare case that a VMXOFF is erroneously executed while VMX is enabled,
> + * but VMXON has not been executed yet, the undefined opcode fault
> + * should not be missed by valid code, though it would be an error.
> + * To detect this, we could somehow restrict the instruction address
> + * to the specific use during reboot/panic.
> + */
> +static int fixup_emergency_vmxoff(struct pt_regs *regs, int trapnr)
> +{
> +	const static u8 insn_vmxoff[3] = { 0x0f, 0x01, 0xc4 };
> +	u8 ud[3];
> +
> +	if (trapnr != X86_TRAP_UD)
> +		return 0;
> +	if (!cpu_vmx_enabled())
> +		return 0;
> +	if (!this_cpu_read(doing_emergency_vmxoff))
> +		return 0;
> +
> +	/* undefined instruction must be in kernel and be VMXOFF */
> +	if (regs->ip < TASK_SIZE_MAX)
> +		return 0;
> +	if (probe_kernel_address((u8 *)regs->ip, ud))
> +		return 0;
> +	if (memcmp(ud, insn_vmxoff, sizeof(insn_vmxoff)))
> +		return 0;
> +
> +	regs->ip += sizeof(insn_vmxoff);
> +	return 1;

We have exception fixups all over the place to avoid exactly this kind
of horrible workaround.

static inline int cpu_vmxoff_safe(void)
{
	int err;

	asm volatile("2: vmxoff; xor %[err],%[err]\n"
		     "1:\n\t"
		     ".section .fixup,\"ax\"\n\t"
		     "3: mov %[fault],%[err] ; jmp 1b\n\t"
		     ".previous\n\t"
		     _ASM_EXTABLE(2b, 3b)
		     : [err] "=a" (err)
		     : [fault] "i" (-EFAULT)
		     : "memory");
	return err;
}

static inline void __cpu_emergency_vmxoff(void)
{
	if (!cpu_vmx_enabled())
		return;
	if (!cpu_vmxoff_safe())
		cr4_clear_bits(X86_CR4_VMXE);
}
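
For anyone not familiar with the mechanism: _ASM_EXTABLE(2b, 3b) records
a pair (address of the instruction that may fault, address to resume at),
and the trap handler consults that table before treating the fault as
fatal. Below is a rough userspace model of that lookup, nothing more.
All model_* names are made up for illustration. The real table stores
relative offsets plus handler information and is sorted so it can be
binary searched; the linear search here is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pt_regs: only the instruction pointer. */
struct model_regs {
	uintptr_t ip;		/* address of the faulting instruction */
};

/* Hypothetical stand-in for one exception table entry. */
struct model_extable_entry {
	uintptr_t insn;		/* instruction that is allowed to fault */
	uintptr_t fixup;	/* where execution resumes if it does */
};

/* Linear lookup of the faulting address (simplification, see above). */
static const struct model_extable_entry *
model_search_extable(const struct model_extable_entry *tbl, size_t n,
		     uintptr_t ip)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].insn == ip)
			return &tbl[i];
	}
	return NULL;
}

/*
 * Returns 1 and redirects regs->ip to the fixup address when the
 * faulting instruction has a table entry. Returns 0 and leaves regs
 * untouched otherwise, so the caller can treat it as a real fault.
 */
static int model_fixup_exception(struct model_regs *regs,
				 const struct model_extable_entry *tbl,
				 size_t n)
{
	const struct model_extable_entry *e;

	assert(regs != NULL);

	e = model_search_extable(tbl, n, regs->ip);
	if (!e)
		return 0;

	regs->ip = e->fixup;
	return 1;
}
```

In terms of the asm above, label 2 would be the insn field and label 3
the fixup field. A #UD raised at label 2 resumes at label 3, which loads
-EFAULT into err and jumps back to label 1. No opcode matching and no
per-CPU flag are needed.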

Problem solved.

Thanks,

tglx