Re: [PATCH v10 10/28] x86/fpu/xstate: Update the XSTATE save function to support dynamic states

From: Bae, Chang Seok
Date: Sat Oct 02 2021 - 18:55:36 EST


On Oct 2, 2021, at 14:31, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Fri, Oct 01 2021 at 17:41, Thomas Gleixner wrote:
>> On Wed, Aug 25 2021 at 08:53, Chang S. Bae wrote:
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 74dde635df40..7c46747f6865 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -9899,11 +9899,16 @@ static void kvm_save_current_fpu(struct fpu *fpu)
>>> * KVM does not support dynamic user states yet. Assume the buffer
>>> * always has the minimum size.
>
> I have to come back to this because that assumption is just broken.
>
> create_vcpu()
> vcpu->user_fpu = alloc_default_fpu_size();
> vcpu->guest_fpu = alloc_default_fpu_size();
>
> vcpu_task()
> get_amx_permission()
> use_amx()
> #NM
> alloc_larger_state()
> ...
> kvm_arch_vcpu_ioctl_run()
> kvm_arch_vcpu_ioctl_run()
> kvm_load_guest_fpu()
> kvm_save_current_fpu(vcpu->arch.user_fpu);
> save_fpregs_to_fpstate(fpu); <- Out of bounds write
>
> Adding a comment that KVM does not yet support dynamic user states does
> not cut it, really.
>
> Even if the above is unlikely, it is possible and has to be handled
> correctly at the point where AMX support is enabled in the kernel
> independent of guest support.

I see.

> You have two options:
>
> 1) Always allocate the large buffer size which is required to
> accomodate all possible features.
>
> Trivial, but waste of memory.
>
> 2) Make the allocation dynamic which seems to be trivial to do in
> kvm_load_guest_fpu() at least for vcpu->user_fpu.
>
> The vcpu->guest_fpu handling can probably be postponed to the
> point where AMX is actually exposed to guests, but it's probably
> not the worst idea to think about the implications now.

FWIW, the proposed KVM patch for AMX looks to take (1) here [1] and
Paolo said [2]:

Most guests will not need the whole xstate feature set. So perhaps you
could set XFD to the host value | the guest value, trap #NM if the
host XFD is zero, and possibly reflect the exception to the guest's XFD
and XFD_ERR.

In addition, loading the guest XFD MSRs should use the MSR autoload
feature (add_atomic_switch_msr).

And then I guess discussion goes on..

Thanks,
Chang

[1] https://lore.kernel.org/kvm/20210207154256.52850-4-jing2.liu@xxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/kvm/ae5b0195-b04f-8eef-9e0d-2a46c761d2d5@xxxxxxxxxx/