RE: [RFC][PATCH v3 0/3] Softirq -rt Optimizations

From: David Laight
Date: Wed Sep 28 2022 - 09:51:32 EST


From: Qais Yousef
> Sent: 28 September 2022 14:01
>
> Hi John
>
> On 09/21/22 01:25, John Stultz wrote:
> > Hey all,
> >
> > This series is a set of patches that optimize scheduler decisions around
> > realtime tasks and softirqs. This series is a rebased and reworked set
> > of changes that have been shipping on Android devices for a number of
> > years, originally created to resolve audio glitches seen on devices
> > caused by softirqs for network or storage drivers.
> >
> > Long running softirqs cause issues because they aren’t currently taken
> > into account when a realtime task is woken up, but they will delay
> > realtime tasks from running if the realtime tasks are placed on a cpu
> > currently running a softirq.
>
> Thanks a lot for sending this series. I've raised this problem in various
> venues in the past, but it seems it is hard to do something better than what
> you propose here.
>
> Borrowing some behaviours from PREEMPT_RT (like threaded irqs) won't cut it
> outside PREEMPT_RT AFAIU.
>
> Peter did suggest an alternative at one point in the past to be more aggressive
> in limiting softirqs [1] but I never managed to find the time to verify it
> - especially its impact on network throughput as this seems to be the tricky
> trade-off (and tricky thing to verify for me at least). I'm not sure if BLOCK
> softirqs are as sensitive.

I've had issues with the opposite problem:
long-running RT tasks stopping the softint code from running.

If an RT task is running, the softint will run in the context of the
RT task - so it has priority over it.
If the RT task isn't running, the softint stops the RT task from being
scheduled.
These are really the same: either way the softint takes priority over
the RT task.

If the softint defers back to thread context it won't be scheduled
until any RT task finishes. This is the opposite priority.

IIRC there is another strange case where the RT thread has been woken
but isn't yet running - can't remember the exact details.

I can (mostly) handle the RT task being delayed (there are a lot of RT
threads sharing the work) but it is paramount that the ethernet receive
code actually runs - I can't afford to drop packets (they contain audio
the RT threads are processing).

In my case threaded NAPI (mostly) fixes it - provided the NAPI threads are RT.
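
For anyone wanting to try the same setup, here is a rough sketch (not the
exact configuration used here) of switching an interface to threaded NAPI
and moving its NAPI kthreads to SCHED_FIFO. Assumptions: the kernel has the
/sys/class/net/<iface>/threaded attribute, the NAPI kthreads show up in
/proc with a comm of the form "napi/<iface>-<id>", and the caller has
CAP_SYS_NICE. The function names, the interface name and the priority of 50
are illustrative only.

```python
import os
from pathlib import Path


def enable_threaded_napi(iface, sys_root="/sys"):
    """Ask the kernel to poll this interface's NAPI instances from kthreads."""
    # Assumes the per-device 'threaded' sysfs attribute is present.
    Path(sys_root, "class", "net", iface, "threaded").write_text("1\n")


def find_napi_threads(iface, proc_root="/proc"):
    """Return PIDs of tasks whose comm starts with 'napi/<iface>-'."""
    # Assumption: comm may be truncated for long interface names, in which
    # case this prefix match will miss the thread.
    prefix = f"napi/{iface}-"
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # task exited while we were scanning
        if comm.startswith(prefix):
            pids.append(int(entry.name))
    return sorted(pids)


def make_napi_threads_rt(iface, priority=50, proc_root="/proc"):
    """Move the interface's NAPI kthreads to SCHED_FIFO at 'priority'."""
    # The priority is a placeholder; pick it relative to the RT threads
    # that consume the packets.
    param = os.sched_param(priority)
    pids = find_napi_threads(iface, proc_root)
    for pid in pids:
        os.sched_setscheduler(pid, os.SCHED_FIFO, param)
    return pids
```

Typical use would be enable_threaded_napi("eth0") followed by
make_napi_threads_rt("eth0"), run as root after the interface is up. The
kthreads may be recreated when the interface is reconfigured, so the
priority change may need to be reapplied.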

David


>
> I think the proposed approach is not intrusive and offers a good balance that
> is well contained and easy to improve upon in the future. It's protected with
> a configuration option so users that don't want it can easily disable it.
>
> [1] https://gitlab.arm.com/linux-arm/linux-qy/-/commits/core/softirq/
>
>
> Thanks
>
> --
> Qais Yousef
