Re: [RFC 1/2] softirq: Defer net rx/tx processing to ksoftirqd context

From: Eric Dumazet
Date: Thu Jan 11 2018 - 15:37:46 EST


On Thu, Jan 11, 2018 at 12:34 PM, Dmitry Safonov <dima@xxxxxxxxxx> wrote:
> On Thu, 2018-01-11 at 12:22 -0800, Linus Torvalds wrote:
>> On Thu, Jan 11, 2018 at 12:16 PM, Eric Dumazet <edumazet@xxxxxxxxxx>
>> wrote:
>> >
>> > Note that when I implemented TCP Small queues, I did experiments
>> > between using a work queue or a tasklet, and workqueues added
>> > unacceptable P99 latencies, when many user threads are competing
>> > with kernel threads.
>>
>> Yes.
>>
>> So I think one solution might be to have a hybrid system, where we do
>> the softirq's synchronously normally (which is what you really want
>> for good latency).
>>
>> But then fall down on a threaded model - but that fallback case
>> should be per-softirq, not global. So if one softirq uses a lot of
>> CPU time, that shouldn't affect the latency of other softirqs.
>>
>> So maybe we could get rid of the per-cpu ksoftirqd entirely, and
>> replace it with per-cpu and per-softirq workqueues?
>>
>> Would something like that sound sane?
>>
>> Just a SMOP/SMOT (small matter of programming/testing).
>
> I could try to write a PoC for that..
> What should be the trigger to fall into workqueue?
> How to tell if there're too many softirqs of the kind?
> Current logic with if (pending) in the end of __do_softirq()
> looks working selectively..
> It looks to be still possible to starve a cpu.

I guess we would need to track the amount of time spent processing
softirqs (while interrupting a non-idle task).
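
To make the idea concrete, here is a rough user-space model of that
accounting, not kernel code. Every name in it (softirq_acct,
softirq_account(), softirq_acct_reset(), NR_SOFTIRQS_MODEL,
SOFTIRQ_BUDGET_NS) is hypothetical, and the 2 ms budget is an arbitrary
placeholder rather than a measured value. Time is charged per softirq and
only when the handler interrupted a non-idle task, so one busy softirq can
be pushed to a thread without touching the others.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_SOFTIRQS_MODEL 4            /* hypothetical: number of softirq kinds modelled */
#define SOFTIRQ_BUDGET_NS 2000000ULL   /* hypothetical: 2 ms budget per accounting period */

/* Per-CPU accounting state in this model (one instance per CPU is assumed). */
struct softirq_acct {
	uint64_t stolen_ns[NR_SOFTIRQS_MODEL]; /* time taken away from non-idle tasks */
	bool deferred[NR_SOFTIRQS_MODEL];      /* true: run this softirq from a thread */
};

/*
 * Charge one handler run of softirq 'nr' that lasted from start_ns to end_ns.
 * Time is only charged if the handler interrupted a non-idle task.
 * Returns true if this softirq should now be deferred to a thread.
 */
bool softirq_account(struct softirq_acct *a, unsigned int nr,
		     uint64_t start_ns, uint64_t end_ns, bool interrupted_idle)
{
	assert(nr < NR_SOFTIRQS_MODEL);

	if (!interrupted_idle)
		a->stolen_ns[nr] += end_ns - start_ns;

	/* The decision is per softirq, so other softirqs stay synchronous. */
	if (a->stolen_ns[nr] > SOFTIRQ_BUDGET_NS)
		a->deferred[nr] = true;

	return a->deferred[nr];
}

/* Start a new accounting period, e.g. from a periodic tick in this model. */
void softirq_acct_reset(struct softirq_acct *a)
{
	memset(a, 0, sizeof(*a));
}
```

How long the accounting period should be, and when a deferred softirq goes
back to running synchronously, are left open here.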