Re: [RFC PATCH] Fix abnormal rcu dynticks_nesting values related to async page fault

From: Paul E. McKenney
Date: Tue Nov 27 2012 - 10:25:59 EST


On Tue, Nov 27, 2012 at 09:01:33AM -0500, Sasha Levin wrote:
> On 11/27/2012 08:07 AM, Gleb Natapov wrote:
> > Those rcu_irq_enter()/rcu_irq_exit() were introduced by commit
> > c5e015d4949aa665 "KVM guest: exit idleness when handling
> > KVM_PV_REASON_PAGE_NOT_PRESENT", but now I am starting to question this
> > commit. KVM_PV_REASON_PAGE_NOT_PRESENT should not kick cpu out of
> > idleness. kvm_async_pf_task_wait() checks that cpu is idle and calls
> > halt if it is. After that commit schedule() may be called between
> > rcu_irq_enter()/rcu_irq_exit() which is probably illegal. Paul?

It is legal to call rcu_irq_enter() and then schedule().

In fact, it turns out that it -has- to be legal, due to some
architectures' quaint habit of entering interrupt/exception handlers
that they never leave, and possibly vice versa.

> otoh, calling schedule() apparently kicks cpu out of idleness now.

But if you call rcu_irq_enter() and then schedule(), and if schedule()
switches to the idle thread, and if execution proceeds to the point
where rcu_idle_enter() is called, then RCU will quite naturally
decide that it is fully idle. At that point, it is illegal to invoke
rcu_irq_exit() unless/until you have either: (1) exited the idle loop
(that is, called rcu_idle_exit()) or (2) taken an interrupt, which will
call rcu_irq_enter().
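
To make the ordering rule concrete, here is a rough user-space toy model.
It is a sketch only, not the kernel's code: struct toy_rcu and every
toy_*() function are hypothetical names invented for illustration, and
the real dynticks_nesting bookkeeping is more involved than a flag plus
a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state; hypothetical, not the kernel's data structure. */
struct toy_rcu {
	bool in_idle_loop;	/* set by toy_rcu_idle_enter() */
	int irq_nesting;	/* irq entries RCU still knows about */
};

/* RCU watches the CPU unless it is idle with no irq in flight. */
static bool toy_rcu_watching(const struct toy_rcu *r)
{
	return !r->in_idle_loop || r->irq_nesting > 0;
}

static void toy_rcu_irq_enter(struct toy_rcu *r)
{
	r->irq_nesting++;
}

/* Returns false where the corresponding real call would be illegal. */
static bool toy_rcu_irq_exit(struct toy_rcu *r)
{
	if (!toy_rcu_watching(r))
		return false;
	if (r->irq_nesting > 0)
		r->irq_nesting--;
	return true;
}

/* Idle entry wins: an irq_enter left unbalanced is forgotten. */
static void toy_rcu_idle_enter(struct toy_rcu *r)
{
	r->in_idle_loop = true;
	r->irq_nesting = 0;
}

static void toy_rcu_idle_exit(struct toy_rcu *r)
{
	r->in_idle_loop = false;
	r->irq_nesting = 0;
}

/* Walk through the three orderings discussed above. */
static void toy_demo(void)
{
	struct toy_rcu a = { false, 0 }, b = { false, 0 }, c = { false, 0 };

	/* irq_enter, schedule() into idle, then irq_exit: illegal. */
	toy_rcu_irq_enter(&a);
	toy_rcu_idle_enter(&a);
	assert(!toy_rcu_irq_exit(&a));

	/* Case (1): the idle loop is exited first, so irq_exit is fine. */
	toy_rcu_irq_enter(&b);
	toy_rcu_idle_enter(&b);
	toy_rcu_idle_exit(&b);
	assert(toy_rcu_irq_exit(&b));

	/* Case (2): an interrupt arrives in idle and calls irq_enter. */
	toy_rcu_irq_enter(&c);
	toy_rcu_idle_enter(&c);
	toy_rcu_irq_enter(&c);
	assert(toy_rcu_irq_exit(&c));
}
```

Calling toy_demo() from a small main() exercises all three sequences.
The toy deliberately lets idle entry discard the outstanding irq_enter,
which is the assumption that stands in for "RCU decides it is fully idle".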

And to think that when I started coding RCU's dyntick-idle functionality,
I was thinking in terms of a simple nesting counter. Silly me! ;-)

Thanx, Paul
