Re: [RFC][PATCH 2/3] locking,entry: #PF vs TRACE_IRQFLAGS

From: peterz
Date: Mon Aug 10 2020 - 07:57:59 EST


On Fri, Aug 07, 2020 at 04:21:48PM -0400, Steven Rostedt wrote:
> On Fri, 07 Aug 2020 21:23:38 +0200
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > Much of the complexity in irqentry_{enter,exit}() is due to #PF being
> > the sole exception that can schedule from kernel context.
> >
> > One additional wrinkle with #PF is that it is non-maskable, it can
> > happen _anywhere_. Due to this, and the wonders of tracing, we can get
> > the 'normal' NMI nesting vs TRACE_IRQFLAGS:
> >
> >   local_irq_disable()
> >     raw_local_irq_disable();
> >     trace_hardirqs_off();
> >
> >   local_irq_enable();
>
> Do you mean to have that ';' there? That is, the below is called
> from local_irq_enable(), right? A ';' means that local_irq_enable()
> is completed.

Indeed, it's just really hard not to type ';' at the end :-)

>
> >     trace_hardirqs_on();
> >     <#PF>
> >       trace_hardirqs_off()
> >       ...
> >       if (!regs_irqs_disabled(regs))
>
> regs has it disabled, so this is false, right?

Yup, I'll add: // false, after it to clarify.

> >         trace_hardirqs_on();
> >     </#PF>
>
> I missed the '/' in the above. At first I thought this was another page
> fault :-/
>
> >     // WHOOPS -- lockdep thinks IRQs are disabled again!
> >     raw_local_irq_enable();
> >
> > Rework irqentry_{enter,exit}() to save/restore the software state.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > ---
> >  include/linux/entry-common.h |    1 +
> >  kernel/entry/common.c        |   52 ++++++++++++++++++++-----------------------
> >  2 files changed, 26 insertions(+), 27 deletions(-)
> >
> > --- a/include/linux/entry-common.h
> > +++ b/include/linux/entry-common.h
> > @@ -310,6 +310,7 @@ void irqentry_exit_to_user_mode(struct p
> >  #ifndef irqentry_state
> >  typedef struct irqentry_state {
> >  	bool	exit_rcu;
> > +	bool	irqs_enabled;
>
> Instead of passing a structure around, should we look at converting
> "irqentry_state" into a flags field?

Probably. On x86_64-linux sizeof(_Bool) == 1, so the struct is two bytes
and fits perfectly fine in a normal return value. But yeah, this is common
code now and we can't rely on sizeof(_Bool) being sane.
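
For reference, a flags version could look roughly like the below. This is
an untested userspace sketch, not part of the patch: the type name
irqentry_flags_t, the IRQENTRY_* bit names and the helper functions are all
made up for illustration. The point is only that a single unsigned int has
a well-defined size on every architecture and still carries both bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit names, not taken from the patch. */
#define IRQENTRY_EXIT_RCU	(1U << 0)	/* mirrors the exit_rcu bool */
#define IRQENTRY_IRQS_ENABLED	(1U << 1)	/* mirrors the irqs_enabled bool */

/* Hypothetical replacement for the two-bool struct: one flags word. */
typedef unsigned int irqentry_flags_t;

/* Pack both pieces of entry state into a single return value. */
static inline irqentry_flags_t irqentry_flags_make(bool exit_rcu,
						   bool irqs_enabled)
{
	irqentry_flags_t flags = 0;

	if (exit_rcu)
		flags |= IRQENTRY_EXIT_RCU;
	if (irqs_enabled)
		flags |= IRQENTRY_IRQS_ENABLED;

	return flags;
}

/* Accessors the exit path would use to read the saved state back. */
static inline bool irqentry_flags_exit_rcu(irqentry_flags_t flags)
{
	return (flags & IRQENTRY_EXIT_RCU) != 0;
}

static inline bool irqentry_flags_irqs_enabled(irqentry_flags_t flags)
{
	return (flags & IRQENTRY_IRQS_ENABLED) != 0;
}
```

The entry side would return irqentry_flags_make(...) and the exit side
would test the bits with the accessors, so nothing depends on how a given
ABI lays out or returns a struct of _Bool.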