Re: [RFC PATCH 07/10] x86/fpu: Rellocate fpstate on save_fpregs_to_fpstate

From: Jiaxun Yang
Date: Fri Dec 03 2021 - 10:52:02 EST

On December 3, 2021, at 3:18 PM, Dave Hansen wrote:
> On 12/3/21 3:39 AM, Jiaxun Yang wrote:
>>>> if (likely(use_xsave())) {
>>>> + xstate_update_size(fpu);
>>>> os_xsave(fpu->fpstate);
>>>> update_avx_timestamp(fpu);
>>>> return;
>>> Have you considered what exactly happens when you hit that WARN_ON_FPU()
>>> which otherwise ignores the allocation error? Have you considered what
>>> happens on the os_xsave() that follows it immediately? How about what
>>> happens the next time this task runs after that failure?
>> Thank you for the catch.
>> This is a few questions that I don't have answer, so it's a RFC.
>>
>> I thought it is unlikely to happen as kmalloc has emergency pool.
>> But in case it happens, I guess the best way to handle it is just
>> send SIGILL to corresponding user process or panic if it's kernel
>> fpu use?
>
> We've thought a *LOT* about this exact problem over the past few years.
>
> Intel even added hardware (XFD) to prevent the situation where you land
> in the context switch code, fail a memory allocation, and have to
> destroy user data in registers. Without XFD, there are also zero ways
> to avoid this happening to apps, *other* than preallocating the memory
> in the first place.
>
> I don't think there is *any* viable path forward with this series.

Hmm, actually I can think of some ways to work around it.
For example, we could keep a preallocated emergency pool of buffers
sized for max_feature and fall back to it if an allocation fails during
context switch. We would get a chance to refill the pool after
returning from interrupt context :-)
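
To make the idea concrete, here is a rough user-space sketch, not kernel
code. All names in it (emergency_pool, pool_refill, state_alloc,
POOL_SLOTS, MAX_STATE_SIZE) are made up for illustration, malloc stands
in for the kernel allocator, and the allocator is passed in as a hook
only so that a failure can be simulated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS     4      /* hypothetical pool depth */
#define MAX_STATE_SIZE 8192   /* hypothetical size of a max_feature buffer */

/* Buffers reserved in advance, each large enough for the biggest state. */
struct emergency_pool {
	void *slot[POOL_SLOTS];
	size_t avail;             /* number of buffers currently in the pool */
};

/* Allocator hook, so the caller can inject a failing allocator. */
typedef void *(*alloc_fn)(size_t size);

/*
 * Top the pool back up. Meant to run from a context where a failed
 * allocation is harmless. Returns how many buffers are available.
 */
static size_t pool_refill(struct emergency_pool *p)
{
	assert(p->avail <= POOL_SLOTS);
	while (p->avail < POOL_SLOTS) {
		void *buf = malloc(MAX_STATE_SIZE);

		if (!buf)
			break;        /* try again on the next refill */
		p->slot[p->avail++] = buf;
	}
	return p->avail;
}

/*
 * Try the normal allocator first and fall back to the pool on failure.
 * Returns NULL if the request is too large or the pool is empty.
 */
static void *state_alloc(struct emergency_pool *p, size_t size,
			 alloc_fn try_alloc)
{
	void *buf;

	if (size > MAX_STATE_SIZE)
		return NULL;
	buf = try_alloc(size);
	if (buf)
		return buf;
	if (p->avail == 0)
		return NULL;      /* pool exhausted: the hard case remains */
	return p->slot[--p->avail];
}

/* Simulates an allocator that always fails. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

Note that the sketch only moves the failure point: once POOL_SLOTS
allocations have failed before the next refill, state_alloc returns NULL
again, so the caller still needs an answer for that case.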

But maybe you are right. It's not for me, a first-year undergraduate student,
to comment on solutions from thousands of brilliant brains at Intel.

I appreciate your comments, which helped me understand the nature of the problem.

Thanks.
--
- Jiaxun