Re: [RFC PATCH 1/4] x86/entry/nmi: Switch to the entry stack before switching to the thread stack

From: Thomas Gleixner
Date: Sat Jun 26 2021 - 03:03:31 EST


On Fri, Jun 25 2021 at 13:00, Peter Zijlstra wrote:
> On Fri, Jun 25, 2021 at 12:40:53PM +0200, Peter Zijlstra wrote:
>> On Sat, Jun 19, 2021 at 08:13:15PM -0700, Andy Lutomirski wrote:
>> >
>> >
>> > On Sat, Jun 19, 2021, at 3:51 PM, Thomas Gleixner wrote:
>> > > On Tue, Jun 01 2021 at 14:52, Lai Jiangshan wrote:
>> > > > From: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>
>> > > >
>> > > > The current kernel has no code to prevent data breakpoints from
>> > > > being placed on the thread stack. If there is a data breakpoint on
>> > > > the top area of the thread stack, there might be a problem.
>> > >
>> > > And because the kernel does not prevent data breakpoints on the thread
>> > > stack we need to do more complicated things in the already horrible
>> > > entry code instead of just doing the obvious and preventing data
>> > > breakpoints on the thread stack?
>> >
>> > Preventing breakpoints on the thread stack is a bit messy: it’s
>> > possible for a breakpoint to be set before the address in question is
>> > allocated for the thread stack.
>>
>> How about we call into C from the entry stack and have the from-user
>> stack swizzle there. The from-kernel entries land on the ISTs and those
>> are already excluded.
>>
>> > None of this is NMI-specific. #DB itself has the same problem. We
>> > could plausibly solve it differently by disarming breakpoints in the
>> > entry asm before switching stacks. I’m not sure how much I like that
>> > approach.
>>
>> I'm not sure I see how, from-user #DB already doesn't clear DR7, and if
>> we recurse, we'll get a from-kernel trap, which will land on the IST,
>> which is excluded, and then we clear DR7 there.
>>
>> IST and entry stack are excluded, the only problem we have is thread
>> stack, and that can be solved by calling into C from the entry stack.
>>
>> I should put teaching objtool about .data references from .noinstr.text
>> and .entry.text higher on the todo list I suppose ...
>
> Also, I think we can run the from-user exceptions on the entry stack,
> without ever switching to the kernel stack, except for #PF, which is
> magical and schedules.

No. Pretty much any exception coming from user space can schedule, and
even if it does not schedule voluntarily, it can be preempted.

Thanks,

tglx