Re: [PATCH v5 0/7] SCHED_DEADLINE server infrastructure

From: Huang, Ying
Date: Tue Feb 20 2024 - 03:43:57 EST


Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> writes:

> On 2/20/24 04:28, Huang, Ying wrote:
>> Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> writes:
>>
>>> Hi
>>>
>>> On 2/19/24 08:33, Huang, Ying wrote:
>>>> Hi, Daniel,
>>>>
>>>> Thanks a lot for your great patchset!
>>>>
>>>> We have a similar starvation issue in the mm subsystem too. Details are
>>>> in the patch description of the below commit. In short, task A is busy
>>>> looping on some event, while task B will signal the event after some
>>>> work. If the priority of task A is higher than that of task B, task B
>>>> may be starved.
>>>
>>> ok...
>>>
>>>>
>>>> IIUC, if task A is an RT task while task B is a fair task, then your
>>>> patchset will solve the issue.
>>>
>>> This patch set will not solve the issue. It will mitigate the effect of the
>>> problem. Still the system will perform very poorly...
>>
>> I don't think that it's common (or even reasonable) for real-time tasks
>> to use swap. So, IMHO, performance isn't very important here. But, we
>> need to avoid live-lock anyway. I think that your patchset solves the
>> live-lock issue.
>
> I mean, if for you this is solving your user problem, be happy :-) Play with parameters...
> find a way to tune your system as a user... use it :)
>
> But your problem is also "solved" with RT throttling without RT_RUNTIME_SHARE (the
> default since... two years ago, I think). So there is not much news here.
>
> IMHO, it is not a solution. As a developer, there is a synchronization problem
> in swap code, and pushing a workaround to the scheduling side is not the way to go...
>
>>
>>>> If both task A and task B are RT tasks, is there
>>>> some way to solve the issue?
>>>
>>> I would say reworking the swap algorithm, as it is not meant to be used when
>>> real-time tasks are in place.
>>>
>>> As an exercise, let's say that we add a server per priority on FIFO, with a default
>>> 50ms/1s runtime/period. Your "real-time" workload would suffer a 950ms latency,
>>> busy-looping in vain.
>>
>> If the target is only the live-lock avoidance, is it possible to run
>> lower priority runnable tasks for a short while if we run long enough in
>> the busy loop?
>
> If you do it on the algorithm side (instead of relying on scheduling), it could be a
> thing.
>
> I think NAPI still uses something like this: Busy-loop for two jiffies in the softirq
> context (a priority higher than all threads on the !rt kernel), then move to
> the thread context to avoid starvation. In the swap case, it could run for two jiffies
> and then go to sleep for a while. How well will swap people receive this as a solution...
> I do not know :) I would first try something better than this using synchronization
> primitives.
>
> This patch set is for things outside of kernel control. For example, people running
> poll-mode DPDK in user-space with FIFO priority, or FIFO tasks running in user-space
> for too long... and it handles them with a better design than RT throttling.
>
> Will this patch help with misbehaving kernel activities? Yes. Is it a reason not to
> fix kernel problems? I do not think so, and I bet many other people do not think so
> either.

I totally agree with you that we need to fix the kernel problems. And
thanks for the information!
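
To make the "busy-loop for a bounded time, then sleep" idea from the
quoted discussion concrete, here is a minimal user-space sketch. It is
not kernel or swap code and not anything proposed in this thread. The
names now_ns() and bounded_spin_wait(), and the nanosecond budgets
(standing in for "two jiffies"), are hypothetical and chosen only for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: wait until *flag becomes true.
 * Spin for at most spin_budget_ns, then sleep for sleep_ns so that
 * lower-priority tasks get a chance to run, and repeat.
 * Returns the number of times the caller went to sleep.
 */
static unsigned int bounded_spin_wait(atomic_bool *flag,
                                      uint64_t spin_budget_ns,
                                      uint64_t sleep_ns)
{
    unsigned int sleeps = 0;

    /* nanosleep() needs tv_nsec below one second. */
    assert(spin_budget_ns > 0 && sleep_ns > 0 && sleep_ns < 1000000000ull);

    while (!atomic_load_explicit(flag, memory_order_acquire)) {
        uint64_t deadline = now_ns() + spin_budget_ns;

        /* Phase 1: busy-wait, but only within the budget. */
        while (now_ns() < deadline) {
            if (atomic_load_explicit(flag, memory_order_acquire))
                return sleeps;
        }

        /* Phase 2: block, which gives up the CPU even for a FIFO task. */
        struct timespec ts = { .tv_sec = 0, .tv_nsec = (long)sleep_ns };
        nanosleep(&ts, NULL);
        sleeps++;
    }
    return sleeps;
}
```

The sketch blocks in nanosleep() rather than calling sched_yield(),
because a yield from a SCHED_FIFO task only lets tasks of the same
priority run; a lower-priority task that must signal the event would
still be starved.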

--
Best Regards,
Huang, Ying