Re: [PATCH 0/3] softirq: uncontroversial change

From: Jason Xing
Date: Fri Apr 21 2023 - 05:48:39 EST


On Fri, Apr 21, 2023 at 5:33 PM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
>
> On Fri, 2023-04-21 at 10:48 +0800, Jason Xing wrote:
> >
> > > My understanding is that we want to avoid adding more heuristics here,
> > > preferring a consistent refactor.
> > >
> > > I would like to propose a revert of:
> > >
> > > 4cd13c21b207 softirq: Let ksoftirqd do its job
> > >
> > > the its follow-ups:
> > >
> > > 3c53776e29f8 Mark HI and TASKLET softirq synchronous
> > > 0f50524789fc softirq: Don't skip softirq execution when softirq thread is parking
> >
> > More than this, I list some related patches mentioned in the above
> > commit 3c53776e29f8:
> > 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred")
> > 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do
> > not get to run")
> > 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()")
>
[...]
> The first 2 changes replace plain timers with HR ones, could possibly
> be reverted, too, but it should not be a big deal either way.
>
> I think instead we want to keep the third commit above, as it should be
> useful when napi threaded is enabled.
>
> Generally speaking I would keep the initial revert to the bare minimum.

I agree with you :)

>
> > > The problem originally addressed by 4cd13c21b207 can now be tackled
> > > with the threaded napi, available since:
> > >
> > > 29863d41bb6e net: implement threaded-able napi poll loop support
> > >
> > > Reverting the mentioned commit should address the latency issues
> > > mentioned by Jakub - I verified it solves a somewhat related problem in
> > > my setup - and reduces the layering of heuristics in this area.
> >
> > Sure, it is. I also can verify its usefulness in the real workload.
> > Some days ago I also sent a heuristics patch [1] that can bypass the
> > ksoftirqd if the user chooses to mask some type of softirq. Let the
> > user decide it.
> >
> > But I observed that if we mask some softirqs, or we can say,
> > completely revert the commit 4cd13c21b207, the load would go higher
> > and the kernel itself may occupy/consume more time than before. They
> > were tested under the similar workload launched by our applications.
> >
> > [1]: https://lore.kernel.org/all/20230410023041.49857-1-kerneljasonxing@xxxxxxxxx/
>
> Thanks for the reference, I would have missed that patch otherwise.
>
> My understanding is that adding more knobs here is in the opposite
> direction of what Thomas is suggesting, and IMHO the 'now mask' should
> not be exposed to user-space.

Could you please share the link about what Thomas is suggesting? I
missed it. At the beginning, I didn't have the guts to revert the
commit directly. Instead I wrote a compromised patch that is not that
elegant as you said. Anyway, the idea is common, but reverting the
whole commit may involve more work. I will spend some time digging
into this part.

More suggestions are also welcome :)

Thanks,
Jason

>
> >
> Thanks for the feedback,
>
> Paolo
>