Re: [PATCH] x86/fpu/xstate: Fix XSTATE_WARN_ON() to emit relevant diagnostics

From: Andrew Cooper
Date: Wed Sep 28 2022 - 13:58:12 EST


On 28/09/2022 17:27, Borislav Petkov wrote:
> On Wed, Aug 10, 2022 at 11:19:09PM +0100, Andrew Cooper wrote:
>> "XSAVE consistency problem" has been reported under Xen, but that's the extent
>> of my divination skills.
>>
>> Modify XSTATE_WARN_ON() to force the caller to provide relevant diagnostic
>> information, and modify each caller suitably.
>>
>> For check_xstate_against_struct(), this removes a double WARN() where one will
>> do perfectly fine.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> CC: Ingo Molnar <mingo@xxxxxxxxxx>
>> CC: Borislav Petkov <bp@xxxxxxxxx>
>> CC: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>> CC: x86@xxxxxxxxxx
>> CC: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>>
>> RFC: CC stable? This has been wonky debugging for 7 years.
>>
>> Apparently "size 832 != kernel_size 0" so let the debugging continue...
> I've got a similar bug report from people running Linux guest on some
> other prop. HV. And I wanted to give them a debugging patch which dumps
> *all* the relevant data along the path of paranoid_xstate_size_valid(),
> the loop in there and xstate_calculate_size().
>
> Looking how you might need something like that too, how about you extend
> your patch to do that and have it being toggled on by a xstate=debug
> cmdline?
>
> It feels like this would be a useful thing to have with the gazillion of
> XSTATE features and dynamic buffer allocation...

So we've actually found and fixed the issue, but XSAVE and therefore
automatically gnarly.

https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=c3bd0b83ea5b7c0da6542687436042eeea1e7909

There is no real hardware with XSAVEC but not XSAVES; the spec does try
to distinguish the two, and it's useful for virt to offer XSAVEC without
XSAVES.

CPUID.0xd[1].ebx is spec'd as the total size for XSAVES of all current
XCR0|XSS states.  This is known dodgy already for native, as it leaks
the current MSR_XSS setting into userspace.

I had written the logic originally to hide this dynamic field if XSAVES
wasn't enumerated, but Linux now uses it if XSAVEC is enumerated, to
cross-check what it can see elsewhere in the CPUID state.

I'm pretty sure things will break again when the host MSR_XSS becomes
non-zero, but I have no free time to spend on any of this in the first
place.

Ultimately, the issue here is that there is a privileged state leak, and
Linux is now relying on it for a sanity check.

~Andrew