Re: [PATCH] sched/pelt: avoid underestimate of task utilization

From: Vincent Guittot
Date: Wed Nov 22 2023 - 12:37:45 EST


The same but with plain text instead of html ...

On Wed, 22 Nov 2023 at 17:40, Hongyan Xia <hongyan.xia2@xxxxxxx> wrote:
>
> Hi Vincent,
>
> On 22/11/2023 14:01, Vincent Guittot wrote:
> > It has been reported that a thread's util_est can significantly decrease as
> > a result of sharing the CPU with other threads. The use case can be easily
> > reproduced with a periodic task TA that runs 1ms and sleeps 100us.
> > When the task is alone on the CPU, its max utilization and its util_est are
> > around 888. If another similar task starts to run on the same CPU, TA will
> > have to share the CPU runtime and its maximum utilization will decrease to
> > around half the CPU capacity (512); TA's util_est will then follow this new
> > maximum trend, which is only the result of sharing the CPU with other
> > tasks. Such a situation can be detected with runnable_avg, which is close or
> > equal to util_avg when TA is alone but increases above util_avg when TA
> > shares the CPU with other threads and waits on the runqueue.
>
> Thanks for bringing this case up. I'm a bit nervous skipping util_est
> updates this way. While it is true that this avoids dropping util_est
> when the task is still busy doing stuff, it also avoids dropping
> util_est when the task really is becoming less busy. If a task has a
> legitimate reason to drop its utilization, it looks weird to me that its
> util_est dropping can be stopped by a new task joining this rq which
> pushes up runnable_avg.

We prefer a util_est that overestimates rather than underestimates.
In the underestimate case you will not provide enough performance to
the task, which will remain under-provisioned. In the overestimate
case you will create some idle time, which reduces contention and, as
a result, lets util_est decrease. So the overestimate will be
transient whereas the underestimate will remain.
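
To make the trade-off concrete, here is a minimal userspace sketch of
the kind of check described in the quoted commit message. It is not
the patch itself: the function names, the margin value and the
simplified update rule (no EWMA filtering) are assumptions made for
illustration only.

```c
#include <stdbool.h>

/* Full CPU capacity, matching the 1024 scale used in this thread. */
#define CAPACITY_SCALE 1024u

/* Hypothetical tolerance (about 1% of capacity); an assumption, not a
 * value taken from the patch. */
#define SKETCH_MARGIN (CAPACITY_SCALE / 100u)

/*
 * runnable_avg stays close to util_avg while the task gets the CPU
 * whenever it wants it, and rises above util_avg once the task also
 * spends time waiting on the runqueue.
 */
bool task_waited_on_rq(unsigned int util_avg, unsigned int runnable_avg)
{
	return runnable_avg > util_avg + SKETCH_MARGIN;
}

/*
 * Simplified update rule (hypothetical helper):
 *  - an increase is always accepted;
 *  - a decrease is skipped when the task had to wait, because the lower
 *    utilization may only reflect CPU sharing;
 *  - otherwise the decrease is taken as genuine.
 */
unsigned int next_util_est(unsigned int util_est, unsigned int util_avg,
			   unsigned int runnable_avg)
{
	if (util_avg >= util_est)
		return util_avg;
	if (task_waited_on_rq(util_avg, runnable_avg))
		return util_est;
	return util_avg;
}
```

With these assumptions, a task whose util_est is 888 and whose
util_avg drops to 512 while runnable_avg is around 1000 keeps 888,
whereas the same drop with runnable_avg close to 512 is accepted.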

> Also, something about rt-app. Is there an easy way to ask an rt-app
> thread to achieve a certain amount of throughput (like loops per
> second)? I think 'runs 1ms and sleeps 100us' may not entirely simulate a
> task that really wants to preserve a util_est of 888. If its utilization


We can do this in rt-app with a timer instead of sleep, but in this
case there is no sleep and, as a result, no update of util_est. In the
case raised in [1] by Lukasz, and according to the shared charts, there
are some sleep phases even when the task must share the CPU. This can
typically happen when you have a pipeline of threads: a task prepares
some data, wakes up the next step and waits for the result. Then the
1st task is woken up when the result is ready, prepares the next data,
and so on. In this case, the task will slow down because of time
sharing and there is still a sleep phase.

>
>
> really is that high, its sleep time will become less and less when
> sharing the rq with another task, or it may even have no idle time and
> become 1024, which will trigger overutilization and migration.
>
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > ---
> >
> > This patch implements what I mentioned in [1]. I have been able to
> > reproduce such pattern with rt-app.
> >
> > [1] https://lore.kernel.org/lkml/CAKfTPtDd-HhF-YiNTtL9i5k0PfJbF819Yxu4YquzfXgwi7voyw@xxxxxxxxxxxxxx/#t
> >