Re: [PATCH] net: stmmac: Don't call _irqoff() with hardirqs enabled

From: Jakub Kicinski
Date: Sun Oct 11 2020 - 11:57:03 EST


On Sun, 11 Oct 2020 15:42:24 +0200 Heiner Kallweit wrote:
> On 10.10.2020 17:22, Jakub Kicinski wrote:
> > On Sat, 10 Oct 2020 15:08:15 +0200 Heiner Kallweit wrote:
> >> On 09.10.2020 18:06, Heiner Kallweit wrote:
> >>> On 09.10.2020 17:58, Jakub Kicinski wrote:
> >>>> On Fri, 9 Oct 2020 16:54:06 +0200 Heiner Kallweit wrote:
> >>>>> I'm thinking about a __napi_schedule version that disables hard irq's
> >>>>> conditionally, based on variable force_irqthreads, exported by the irq
> >>>>> subsystem. This would allow to behave correctly with threadirqs set,
> >>>>> whilst not loosing the _irqoff benefit with threadirqs unset.
> >>>>> Let me come up with a proposal.
> >>>>
> >>>> I think you'd need to make napi_schedule_irqoff() behave like that,
> >>>> right? Are there any uses of napi_schedule_irqoff() that are disabling
> >>>> irqs and not just running from an irq handler?
> >>>>
> >>> Right, the best approach depends on the answer to the latter question.
> >>> I didn't check this yet, therefore I described the least intrusive approach.
> >>>
> >>
> >> With some help from coccinelle I identified the following functions that
> >> call napi_schedule_irqoff() or __napi_schedule_irqoff() and do not run
> >> from an irq handler (at least not at the first glance).
> >>
> >> dpaa2_caam_fqdan_cb
> >> qede_simd_fp_handler
> >> mlx4_en_rx_irq
> >> mlx4_en_tx_irq
> >
> > Don't know the others but FWIW the mlx4 ones run from an IRQ handler,
> > AFAICT:
> >
> > static irqreturn_t mlx4_interrupt(int irq, void *dev_ptr)
> > static irqreturn_t mlx4_msi_x_interrupt(int irq, void *eq_ptr)
> > mlx4_eq_int()
> > mlx4_cq_completion
> > cq->comp()
> >
> >> qeth_qdio_poll
> >> netvsc_channel_cb
> >> napi_watchdog
> >
> > This one runs from a hrtimer, which I believe will be a hard irq
> > context on anything but RT. I could be wrong.
> >
>
> Typically forced irq threading will not be enabled, therefore going
> back to use napi_schedule() in drivers in most cases will cause
> losing the benefit of the irqoff version. Something like the following
> should be better. Only small drawback I see is that in case of forced
> irq threading hrtimers will still run in hardirq context and we lose
> the benefit of the irqoff version in napi_watchdog().
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a146bac84..7d18560b2 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6393,7 +6393,11 @@ EXPORT_SYMBOL(napi_schedule_prep);
> */
> void __napi_schedule_irqoff(struct napi_struct *n)
> {
> - ____napi_schedule(this_cpu_ptr(&softnet_data), n);
> + /* hard irqs may not be masked in case of forced irq threading */
> + if (force_irqthreads)
> + __napi_schedule(n);
> + else
> + ____napi_schedule(this_cpu_ptr(&softnet_data), n);
> }
> EXPORT_SYMBOL(__napi_schedule_irqoff);

Does

if (force_irqthreads)
local_irq_save(flags);
____napi_schedule(this_cpu_ptr(&softnet_data), n);
if (force_irqthreads)
local_irq_restore(flags);

not produce more concise assembly?