Re: [PATCHv3 3/3] arm64/entry-common: supplement irq accounting

From: Pingfan Liu
Date: Fri Oct 01 2021 - 10:11:08 EST


On Thu, Sep 30, 2021 at 02:53:14PM +0100, Mark Rutland wrote:
> On Thu, Sep 30, 2021 at 09:17:08PM +0800, Pingfan Liu wrote:
> > At present, the irq entry/exit accounting, which is performed by
> > handle_domain_irq(), overlaps with arm64 exception entry code somehow.
> >
> > By supplementing irq accounting on arm64 exception entry code, the
> > accounting in handle_domain_irq() can be dropped totally by selecting
> > the macro HAVE_ARCH_IRQENTRY.
>
> I think we need to be more thorough and explain the specific problem and
> solution. How about we crib some wording from patch 1, and say:
>
> arm64: entry: avoid double-accounting IRQ RCU entry
>
> When an IRQ is taken, some accounting needs to be performed to enter
> and exit IRQ context around the IRQ handler. On arm64 some of this
> accounting is performed by both the architecture code and the IRQ
> domain code, resulting in calling rcu_irq_enter() twice per exception
> entry, violating the expectations of the core RCU code, and resulting
> in failing to identify quiescent periods correctly (e.g. in
> rcu_is_cpu_rrupt_from_idle()).
>
> To fix this, we must perform all the accounting from the architecture
> code. We prevent the IRQ domain code from performing any accounting by
> selecting HAVE_ARCH_IRQENTRY, and must call irq_enter_rcu() and
> irq_exit_rcu() around invoking the root IRQ handler.
>
> When we take a pNMI from a context with IRQs disabled, we'll perform
> the necessary accounting as part of arm64_enter_nmi() and
> arm64_exit_nmi(), and should only call irq_enter_rcu() and
> irq_exit_rcu() when we may have taken a regular interrupt.
>
That is a wonderful and detailed commit log.
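
For illustration only, the double-accounting problem can be sketched as a
small userspace model. Nothing below is kernel code: struct cpu_model and
every model_*() name are hypothetical stand-ins, and the idle check is
assumed to depend only on the interrupt nesting depth being exactly one,
which is a simplification of what the core RCU code really tracks.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: hypothetical names, not kernel APIs. */
struct cpu_model {
	int irq_nesting;	/* stands in for RCU's interrupt nesting depth */
};

static void model_irq_enter(struct cpu_model *cpu)
{
	cpu->irq_nesting++;
}

static void model_irq_exit(struct cpu_model *cpu)
{
	assert(cpu->irq_nesting > 0);	/* exits must pair with entries */
	cpu->irq_nesting--;
}

/* Assumed idle check: true only for a first-level interrupt. */
static bool model_interrupted_idle(const struct cpu_model *cpu)
{
	return cpu->irq_nesting == 1;
}

/*
 * Take one interrupt from idle and report what the handler would observe.
 * domain_accounts == true models the IRQ domain code repeating the entry
 * accounting that the architecture code has already done.
 */
static bool model_take_irq_from_idle(bool domain_accounts)
{
	struct cpu_model cpu = { .irq_nesting = 0 };
	bool seen_idle;

	model_irq_enter(&cpu);		/* architecture entry code */
	if (domain_accounts)
		model_irq_enter(&cpu);	/* IRQ domain code, second time */

	seen_idle = model_interrupted_idle(&cpu);

	if (domain_accounts)
		model_irq_exit(&cpu);
	model_irq_exit(&cpu);

	assert(cpu.irq_nesting == 0);	/* accounting is balanced on return */
	return seen_idle;
}
```

In this model, model_take_irq_from_idle(false) reports that idle was
interrupted, while model_take_irq_from_idle(true) does not, because the
second entry makes the interrupt look nested.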

> That way it's clear what specifically the overlap is and the problem(s)
> it results in. The bit at the end explains why we don't call
> irq_{enter,exit}_rcu() when we're certain we've taken a pNMI.
>
I have learned a lot from the commit logs you contributed to this series,
and I will keep working on improving my own commit messages. Thanks again!
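
To restate the pNMI rule in code form, here is a minimal userspace sketch of
the decision. It is not the arm64 entry code: the model_*() names, the
counters and the two boolean inputs are hypothetical, and the model assumes
that an interrupt arriving while IRQs were masked can only be a pNMI when
pseudo-NMI support is configured.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: hypothetical names, not kernel APIs. */
enum model_accounting { MODEL_ACCT_NMI, MODEL_ACCT_IRQ };

struct model_counts {
	int nmi_enter, nmi_exit;		/* models arm64_{enter,exit}_nmi() */
	int irq_enter_rcu, irq_exit_rcu;	/* models irq_{enter,exit}_rcu() */
	int handler_runs;			/* models the root IRQ handler */
};

/* NMI accounting only when we are certain a pNMI was taken. */
static enum model_accounting model_pick_accounting(bool pnmi_configured,
						   bool irqs_were_masked)
{
	return (pnmi_configured && irqs_were_masked) ? MODEL_ACCT_NMI
						     : MODEL_ACCT_IRQ;
}

static struct model_counts model_handle_interrupt(bool pnmi_configured,
						  bool irqs_were_masked)
{
	struct model_counts c = { 0 };

	if (model_pick_accounting(pnmi_configured, irqs_were_masked) ==
	    MODEL_ACCT_NMI) {
		c.nmi_enter++;
		c.handler_runs++;
		c.nmi_exit++;
	} else {
		c.irq_enter_rcu++;
		c.handler_runs++;
		c.irq_exit_rcu++;
	}

	/* Exactly one kind of entry accounting per interrupt. */
	assert(c.nmi_enter + c.irq_enter_rcu == 1);
	return c;
}
```

Either way the handler runs exactly once, and only one of the two kinds of
accounting brackets it.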

> > Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx>
> > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Cc: Will Deacon <will@xxxxxxxxxx>
> > Cc: Mark Rutland <mark.rutland@xxxxxxx>
> > Cc: Marc Zyngier <maz@xxxxxxxxxx>
> > Cc: Joey Gouly <joey.gouly@xxxxxxx>
> > Cc: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> > Cc: Julien Thierry <julien.thierry@xxxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: Yuichi Ito <ito-yuichi@xxxxxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > To: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> > ---
> > arch/arm64/Kconfig | 1 +
> > arch/arm64/kernel/entry-common.c | 4 ++++
> > 2 files changed, 5 insertions(+)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 5c7ae4c3954b..d29bae38a951 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -98,6 +98,7 @@ config ARM64
> >  	select ARCH_HAS_UBSAN_SANITIZE_ALL
> >  	select ARM_AMBA
> >  	select ARM_ARCH_TIMER
> > +	select HAVE_ARCH_IRQENTRY
>
> Please put this with the other HAVE_ARCH_* entries in
> arch/arm64/Kconfig -- it should be between HAVE_ARCH_HUGE_VMAP and
> HAVE_ARCH_JUMP_LABEL to keep that in alphabetical order.
>
OK, I will fix it in V4.

> With that and the title and commit message above:
>
> Reviewed-by: Mark Rutland <mark.rutland@xxxxxxx>
>
Thanks,

Pingfan