Re: [PATCH 05/14] KVM: x86: Emulator fixes for eip canonical checks on near branches

From: Nadav Amit
Date: Sat Oct 25 2014 - 15:58:06 EST



> On Oct 24, 2014, at 20:53, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
> On 10/24/2014 08:07 AM, Paolo Bonzini wrote:
>> From: Nadav Amit <namit@xxxxxxxxxxxxxxxxx>
>>
>> Before changing rip (during jmp, call, ret, etc.) the target should be asserted
>> to be canonical one, as real CPUs do. During sysret, both target rsp and rip
>> should be canonical. If any of these values is noncanonical, a #GP exception
>> should occur. The exception to this rule are syscall and sysenter instructions
>> in which the assigned rip is checked during the assignment to the relevant
>> MSRs.
>
> Careful here. AMD CPUs (IIUC) send #PF (or maybe #GP) from CPL3 instead
> of #GP from CPL0 on sysret to a non-canonical address. That behavior is
> *far* better than the Intel behavior, and it may be important.
I wasn’t aware of this discrepancy, and it is really not written clearly in AMD manual (I have to take your word). It is possible AMD decided to inject #GP from CPL3 (#PF makes no sense).

Anyhow, I think it is much harder to emulate AMD’s behaviour on Intel. Theoretically, the easy way would be for the host to set a non-canonical guest RIP/RSP and inject #GP, but Intel CPUs don’t allow the host to do so. Instead, the host needs to emulate the entire exception injection. This is very hard and error-prone process due to the variety of scenarios (interrupt/task-gate on the IDT, #DF, nested-exceptions, etc.)


>
> If an OS relies on that behavior on AMD CPUs, and guest ring 3 can force
> guest ring 0 to do an emulated sysret to a non-canonical address, than
> the guest ring3 code can own the guest ring0 code.
>
> —Andy

Sysexit (I mistakenly wrote sysret on the description), out of all the control transfer instructions, seems the hardest to exploit, since it must be executed in CPL0.
Remember that this bug does not result in host crashing, but in guest crashing: If guest userspace is able to cause KVM to emulate a jump instruction to a non-canonical address, it can crash the entire guest (by preventing VM-entry from succeeding). To use sysexit for such exploit, the guest userspace needs also to somehow fool the guest kernel into returning into non-canonical RIP.

Nadav--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/