Re: [PATCH RFC net-next v2 7/7] net: skbuff: always try to recycle PP pages directly when in softirq

From: Jakub Kicinski
Date: Thu Jul 20 2023 - 13:12:37 EST


On Thu, 20 Jul 2023 18:46:02 +0200 Alexander Lobakin wrote:
> From: Jakub Kicinski <kuba@xxxxxxxxxx>
> Date: Wed, 19 Jul 2023 13:51:50 -0700
>
> > On Wed, 19 Jul 2023 18:34:46 +0200 Alexander Lobakin wrote:
> [...]
> >>
> >> If we're on the same CPU where the NAPI would run and in the same
> >> context, i.e. softirq, in which the NAPI would run, what is the problem?
> >> If there really is a good one, I can handle it here.
> >
> > #define SOFTIRQ_BITS 8
> > #define SOFTIRQ_MASK (__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
> > # define softirq_count() (preempt_count() & SOFTIRQ_MASK)
> > #define in_softirq() (softirq_count())
>
> I do remember those, don't worry :)
>
> > I don't know what else to add beyond that and the earlier explanation.
>
> My question was "how can two things race on one CPU in one context if it
> implies they won't ever happen simultaneously", but maybe my zero
> knowledge of netcons hides something from me.

One of them is in hardirq.

> > AFAIK pages as allocated by page pool do not benefit from the usual
> > KASAN / KMSAN checkers, so if we were to double-recycle a page once
> > a day because of a netcons race - it's going to be a month long debug
> > for those of us using Linux in production.
>
> if (!test_bit(NAPI_STATE_NPSVC, &napi->state))

If you have to, the right check is !in_hardirq().
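
To illustrate why in_softirq() alone does not rule out the race: the
softirq bits of preempt_count stay set when a hardirq interrupts softirq
processing on the same CPU, so code running in that hardirq still sees
in_softirq() as true. Below is a minimal userspace sketch of that, not
kernel code. The MODEL_* names and model_*() helpers are hypothetical,
the PREEMPT/HARDIRQ bit widths are assumed (only SOFTIRQ_BITS is quoted
above), and model_may_recycle_direct() is just one way the combined
check could look.

```c
#include <stdbool.h>

/*
 * Userspace model only. SOFTIRQ_BITS (8) comes from the macros quoted
 * above; PREEMPT_BITS (8) and HARDIRQ_BITS (4) are assumed neighbours
 * in the preempt_count layout.
 */
#define MODEL_PREEMPT_BITS   8
#define MODEL_SOFTIRQ_BITS   8
#define MODEL_HARDIRQ_BITS   4

#define MODEL_SOFTIRQ_SHIFT  MODEL_PREEMPT_BITS
#define MODEL_HARDIRQ_SHIFT  (MODEL_SOFTIRQ_SHIFT + MODEL_SOFTIRQ_BITS)

#define MODEL_IRQ_MASK(bits) ((1UL << (bits)) - 1)
#define MODEL_SOFTIRQ_MASK   (MODEL_IRQ_MASK(MODEL_SOFTIRQ_BITS) << MODEL_SOFTIRQ_SHIFT)
#define MODEL_HARDIRQ_MASK   (MODEL_IRQ_MASK(MODEL_HARDIRQ_BITS) << MODEL_HARDIRQ_SHIFT)

#define MODEL_SOFTIRQ_OFFSET (1UL << MODEL_SOFTIRQ_SHIFT)
#define MODEL_HARDIRQ_OFFSET (1UL << MODEL_HARDIRQ_SHIFT)

/* Entering a context bumps the matching field of the counter. */
static unsigned long model_softirq_enter(unsigned long count)
{
	return count + MODEL_SOFTIRQ_OFFSET;
}

static unsigned long model_hardirq_enter(unsigned long count)
{
	return count + MODEL_HARDIRQ_OFFSET;
}

/* Same shape as in_softirq(): any softirq bit set. */
static bool model_in_softirq(unsigned long count)
{
	return (count & MODEL_SOFTIRQ_MASK) != 0;
}

/* Same shape as in_hardirq(): any hardirq bit set. */
static bool model_in_hardirq(unsigned long count)
{
	return (count & MODEL_HARDIRQ_MASK) != 0;
}

/* Hypothetical guard: softirq context, but not a hardirq nested in it. */
static bool model_may_recycle_direct(unsigned long count)
{
	return model_in_softirq(count) && !model_in_hardirq(count);
}
```

With a count that went through model_softirq_enter() and then
model_hardirq_enter(), model_in_softirq() is still true, while
model_may_recycle_direct() is false.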

> ? It would mean we're not netpolling.
> Otherwise, if this still is not enough, I'd go back to my v1 approach
> with having a NAPI flag, which would tell for sure we're good to go. I
> got confused by your "wouldn't just checking for softirq be enough"! T.T
> Joking :D

I guess the problem I'm concerned about can already happen.
I'll send a lockdep annotation shortly.