Re: #PF from NMI

From: Paul E. McKenney
Date: Fri Nov 13 2020 - 20:05:49 EST


On Sat, Nov 14, 2020 at 12:13:58AM +0100, Thomas Gleixner wrote:
> On Fri, Nov 13 2020 at 13:53, Peter Zijlstra wrote:
> > [ 139.226724] WARNING: CPU: 9 PID: 2290 at kernel/rcu/tree.c:932 __rcu_irq_enter_check_tick+0x84/0xd0
> > [ 139.226753] irqentry_enter+0x25/0x40
> > [ 139.226753] exc_page_fault+0x38/0x4c0
> > [ 139.226753] asm_exc_page_fault+0x1e/0x30
>
> ...
>
> > [ 139.226757] perf_callchain_user+0xf4/0x280
> >
> > Which is a #PF from NMI context, which is perfectly fine. However
> > __rcu_irq_enter_check_tick() is triggering WARN.
> >
> > AFAICT the right thing is to simply remove the warn like so.
> >
> > ---
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 430ba58d8bfe..9bda92d8b914 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -928,8 +928,8 @@ void __rcu_irq_enter_check_tick(void)
> > {
> > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> >
> > - // Enabling the tick is unsafe in NMI handlers.
> > - if (WARN_ON_ONCE(in_nmi()))
> > + // if we're here from NMI, there's nothing to do.
> > + if (in_nmi())
> > return;
> >
> > RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
>
> Yes. That's right.
>
> To answer Pauls question:
>
> > But is a corresponding change required on return-from-NMI side?
> > Looks OK to me at first glance, but I could be missing something.
>
> No. The corresponding issue is not return from NMI. The corresponding
> problem is the return from the page fault handler, but there is nothing
> to worry about. That part is NMI safe already.

In that case:

Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

Or let me know (and get me a Signed-off-by) if you want me to take it.

Thanx, Paul

> And Luto's as well:
>
> > with the following caveat that has nothing to do with NMI: the rest of
> > irqentry_enter() has tracing calls in it. Does anything prevent
> > infinite recursion if one of those tracing calls causes a page fault?
>
> nmi:
> ...
> trace_hardirqs_off_finish() {
> if (!this_cpu_read(tracing_irq_cpu)) {
> this_cpu_write(tracing_irq_cpu, 1);
> ...
> }
> ...
> perf()
>
> #PF
> save_cr2()
>
> irqentry_enter()
> trace_hardirqs_off_finish()
> if (!this_cpu_read(tracing_irq_cpu)) {
>
> So yes, it is recursion protected unless I'm missing something.
>
> Thanks,
>
> tglx