Re: [patch V6 12/37] x86/entry: Provide idtentry_entry/exit_cond_rcu()

From: Andy Lutomirski
Date: Tue May 19 2020 - 16:25:00 EST


On Tue, May 19, 2020 at 1:20 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
> > Andy Lutomirski <luto@xxxxxxxxxx> writes:
> >> On Fri, May 15, 2020 at 5:10 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >>> The pagefault handler cannot use the regular idtentry_enter() because that
> >>> invokes rcu_irq_enter() if the pagefault was caused in the kernel. Not a
> >>> problem per se, but kernel side page faults can schedule which is not
> >>> possible without invoking rcu_irq_exit().
> >>>
> >>> Adding rcu_irq_exit() and a matching rcu_irq_enter() into the actual
> >>> pagefault handling code would be possible, but not pretty either.
> >>>
> >>> Provide idtentry_entry/exit_cond_rcu() which calls rcu_irq_enter() only
> >>> when RCU is not watching. The conditional RCU enabling is a correctness
> >>> issue: A kernel page fault which hits an RCU idle region can neither
> >>> schedule nor is it likely to survive. But avoiding RCU warnings or RCU side
> >>> effects is at least increasing the chance for useful debug output.
> >>>
> >>> The function is also useful for implementing lightweight reschedule IPI and
> >>> KVM posted interrupt IPI entry handling later.
> >>
> >> Why is this conditional? That is, couldn't we do this for all
> >> idtentry_enter() calls instead of just for page faults? Evil things
> >> like NMI shouldn't go through this path at all.
> >
> > I thought about that, but then ended up with the conclusion that RCU
> > might be unhappy, but my conclusion might be fundamentally wrong.
>
> It's about this:
>
> rcu_nmi_enter()
> {
>         if (!rcu_is_watching()) {
>                 make it watch;
>         } else if (!in_nmi()) {
>                 do_magic_nohz_dyntick_muck();
>         }
> }
>
> So if we make all irq/system vector entries conditional then the
> do_magic() never gets executed. After that I got lost...

I'm also baffled by that magic, but I'm also not suggesting doing this
to *all* entries -- just the not-super-magic ones that use
idtentry_enter().

Paul, what is this code actually trying to do?
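
For reference, the conditional scheme from the patch description (call
rcu_irq_enter() only when RCU is not watching, and undo it on exit only
if entry did it) can be modeled in user space roughly as below. This is
a toy sketch, not the kernel code: every model_* name and both state
variables are made up for illustration, and the real RCU dyntick
accounting is far more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for RCU's per-CPU state. Hypothetical, not kernel data. */
static bool model_rcu_watching;
static int model_rcu_irq_nesting;

static bool model_rcu_is_watching(void)
{
	return model_rcu_watching;
}

/* Models rcu_irq_enter(): make RCU watch and track the nesting depth. */
static void model_rcu_irq_enter(void)
{
	model_rcu_irq_nesting++;
	model_rcu_watching = true;
}

/* Models rcu_irq_exit(): the outermost exit puts RCU back to idle. */
static void model_rcu_irq_exit(void)
{
	assert(model_rcu_irq_nesting > 0);
	if (--model_rcu_irq_nesting == 0)
		model_rcu_watching = false;
}

/*
 * Conditional entry: touch RCU only if it is not already watching.
 * The return value tells the exit path whether it has to undo that.
 */
static bool model_enter_cond_rcu(void)
{
	if (model_rcu_is_watching())
		return false;

	model_rcu_irq_enter();
	return true;
}

/* Conditional exit: undo the entry work only if entry actually did it. */
static void model_exit_cond_rcu(bool rcu_exit)
{
	if (rcu_exit)
		model_rcu_irq_exit();
}
```

With this shape, a fault taken while RCU is already watching leaves the
RCU state untouched, so the handler is free to schedule. Note that the
"else if (!in_nmi())" branch quoted above is never reached through this
path, which is exactly the part in question.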