Re: [PATCH v5 0/7] SCHED_DEADLINE server infrastructure

From: Daniel Bristot de Oliveira
Date: Tue Feb 20 2024 - 03:32:34 EST


On 2/20/24 04:28, Huang, Ying wrote:
> Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> writes:
>
>> Hi
>>
>> On 2/19/24 08:33, Huang, Ying wrote:
>>> Hi, Daniel,
>>>
>>> Thanks a lot for your great patchset!
>>>
>>> We have a similar starvation issue in mm subsystem too. Details are in
>>> the patch description of the below commit. In short, task A is busy
>>> looping on some event, while task B will signal the event after some
>>> work. If the priority of task A is higher than that of task B, task B
>>> may be starved.
>>
>> ok...
>>
>>>
>>> IIUC, if task A is RT task while task B is fair task, then your patchset
>>> will solve the issue.
>>
>> This patch set will not solve the issue. It will mitigate the effect of the
>> problem. Still the system will perform very poorly...
>
> I don't think that it's common (or even reasonable) for real-time tasks
> to use swap. So, IMHO, performance isn't very important here. But, we
> need to avoid live-lock anyway. I think that your patchset solves the
> live-lock issue.

I mean, if this solves your problem as a user, be happy :-) Play with the parameters...
find a way to tune your system as a user... use it :)

But your problem is also "solved" by RT throttling without RT_RUNTIME_SHARE (the
default since... two years ago, I think). So there is not much new here.

IMHO, it is not a solution. Speaking as a developer, there is a synchronization problem
in the swap code, and pushing a workaround to the scheduling side is not the way to go...

>
>>> If both task A and task B is RT tasks, is there
>>> some way to solve the issue?
>>
>> I would say reworking the swap algorithm, as it is not meant to be used when
>> real-time tasks are in place.
>>
>> As an exercise, let's say that we add a server per priority on FIFO, with a default
>> 50ms/1s runtime period. Your "real-time" workload would suffer a 950ms latency,
>> busy loop in vain.
>
> If the target is only the live-lock avoidance, is it possible to run
> lower priority runnable tasks for a short while if we run long enough in
> the busy loop?

If you do it on the algorithm side (instead of relying on scheduling), it could be a
thing.

I think NAPI still uses something like this: busy-loop for two jiffies in the softirq
context (a priority higher than all threads on the !rt kernel), then move to the
thread context to avoid starvation. In the swap case, it could run for two jiffies
and then go to sleep for a while. How well will swap people receive this as a solution...
I do not know :) I would first try something better than this using synchronization
primitives.
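
To make the "spin for a bounded time, then sleep" idea concrete, here is a minimal
user-space sketch. It is not kernel code and not what swap or NAPI actually do; the
helper names (now_ns, bounded_spin_wait), the nanosecond budgets and the max_sleeps
limit are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: wait for *flag to become true.
 *
 * Spin for at most spin_budget_ns, then sleep for sleep_ns so that
 * lower-priority tasks (including the one that sets the flag) can run,
 * then spin again.
 *
 * Returns the number of sleeps taken before the flag was seen, or -1
 * if the flag was still clear after max_sleeps sleeps plus one final
 * spin round.
 */
static int bounded_spin_wait(atomic_bool *flag, uint64_t spin_budget_ns,
			     uint64_t sleep_ns, int max_sleeps)
{
	struct timespec nap = { .tv_sec = 0, .tv_nsec = (long)sleep_ns };

	/* nanosleep() requires tv_nsec below one second. */
	assert(spin_budget_ns > 0 && sleep_ns < 1000000000ull);

	for (int sleeps = 0; ; sleeps++) {
		uint64_t deadline = now_ns() + spin_budget_ns;

		/* Bounded busy loop: the analogue of "two jiffies". */
		do {
			if (atomic_load_explicit(flag, memory_order_acquire))
				return sleeps;
		} while (now_ns() < deadline);

		if (sleeps == max_sleeps)
			return -1;

		/* Give up the CPU instead of relying on the scheduler. */
		nanosleep(&nap, NULL);
	}
}
```

A caller would pick the spin budget and sleep length for its own latency needs, e.g.
bounded_spin_wait(&done, 2000000, 1000000, 100). The point is only that the waiter
itself yields the CPU, so the task that signals the event can make progress.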

This patch set is for things outside of kernel control. For example, people running
poll-mode DPDK in user-space with FIFO priority, that is, FIFO tasks running in
user-space for too long... handled with a better design than RT throttling.

Will this patch set help with misbehaving kernel activities? Yes. Is it a reason not to
fix kernel problems? I do not think so, and I bet many other people do not think so
either.

-- Daniel