Re: [PATCH] x86/mm: determine whether the fault address is canonical

From: Sean Christopherson
Date: Mon Oct 07 2019 - 11:13:27 EST


On Mon, Oct 07, 2019 at 04:44:23PM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> > > All the other reasons would require a fairly egregious kernel bug, hence
> > > the speculation that the #GP is due to a non-canonical address. Something
> > > like the following would be more precise, though highly unlikely to ever
> > > be exercised, e.g. KVM had a fatal bug related to injecting a non-zero
> > > error code that went unnoticed for years.
> > >
> > > >   WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. %s?\n",
> > > >             (IS_ENABLED(CONFIG_X86_64) && !error_code) ? "Non-canonical address" :
> > > >                                                          "Segmentation bug");
> >
> > Instead of trying to guess the reason of the #GPF (which guess might be
> > wrong), please just state it as the reason if we are sure that the cause
> > is a non-canonical address - and provide a best-guess if it's not but
> > clearly signal that it's a guess.
> >
> > I.e. if I understood all the cases correctly we'd have three types of
> > messages generated:
> >
> > !error_code:
> > "General protection fault in user access, due to non-canonical address."

A non-canonical #GP always has an error code of '0', but the reverse isn't
technically true, i.e. a #GP with an error code of '0' isn't necessarily due
to a non-canonical address. And 32-bit mode obviously can't generate a
non-canonical address.
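
For anyone following along, "canonical" can be illustrated with a small
standalone userspace sketch. This is not the kernel's helper; the
sketch_* names are made up for illustration, and it assumes 48-bit
virtual addresses (4-level paging), which would be 57 bits with 5-level
paging:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits (4-level paging). */
#define SKETCH_VADDR_BITS 48

/*
 * Hypothetical helper, not a kernel API.  An address is canonical when
 * bits 63..47 are all copies of bit 47, i.e. all zeros or all ones.
 */
static bool sketch_is_canonical(uint64_t addr)
{
	/* Keep only bits 63..(SKETCH_VADDR_BITS - 1), shifted down to bit 0. */
	uint64_t upper = addr >> (SKETCH_VADDR_BITS - 1);
	/* Mask with one bit set for each of those upper bits (17 bits here). */
	uint64_t all_ones = (UINT64_C(1) << (64 - SKETCH_VADDR_BITS + 1)) - 1;

	return upper == 0 || upper == all_ones;
}
```

With those assumptions, 0x00007fffffffffff and 0xffff800000000000 are
canonical, while 0x0000800000000000 is not.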

But practically speaking, since _ASM_EXTABLE_UA() should only be used for
reg<->mem instructions, the only way to get a #GP on a usercopy instruction
would be to corrupt the code itself or have a bad segment loaded in 32-bit
mode. So qualifying the non-canonical message on '64-bit && !error_code'
is technically more precise/correct, but likely meaningless in practice.
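
To make the '64-bit && !error_code' qualification concrete, here is a
rough standalone sketch of the decision only. It mirrors the condition
in the WARN_ONCE() suggestion quoted above; the enum and function names
are hypothetical, and is_64bit stands in for
IS_ENABLED(CONFIG_X86_64):

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum sketch_gp_guess {
	SKETCH_GP_NON_CANONICAL,	/* "Non-canonical address?" */
	SKETCH_GP_SEGMENTATION,		/* "Segmentation bug?" */
};

/*
 * Best guess at the cause of a #GP on a user access.  A zero error
 * code on 64-bit is consistent with a non-canonical address, though
 * it does not prove it; everything else is lumped in as a
 * segmentation problem.
 */
static enum sketch_gp_guess sketch_guess_gp_cause(bool is_64bit,
						  unsigned long error_code)
{
	if (is_64bit && error_code == 0)
		return SKETCH_GP_NON_CANONICAL;

	return SKETCH_GP_SEGMENTATION;
}
```

Either result is still only a guess, which is why both messages end
in a question mark.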

> > error_code && !is_canonical_addr(fault_addr):
> > "General protection fault in user access. Non-canonical address?"
> >
> > error_code && is_canonical_addr(fault_addr):
> > "General protection fault in user access. Segmentation bug?"
>
> Now that I've read the rest of the thread, since fault_addr is always 0
> we can ignore most of this I suspect ...