Re: [PATCH v2] sched/psi: Fix avgs_work re-arm in psi_avgs_work()

From: Suren Baghdasaryan
Date: Fri Oct 14 2022 - 12:41:40 EST


On Fri, Oct 14, 2022 at 7:18 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Fri, Oct 14, 2022 at 07:05:51PM +0800, Chengming Zhou wrote:
> > Pavan reported a problem that PSI avgs_work idle shutoff is not
> > working at all. Because PSI_NONIDLE condition would be observed in
> > psi_avgs_work()->collect_percpu_times()->get_recent_times() even if
> > only the kworker running avgs_work on the CPU.
> >
> > Although commit 1b69ac6b40eb ("psi: fix aggregation idle shut-off")
> > avoided the ping-pong wake problem when the worker sleep, psi_avgs_work()
> > still will always re-arm the avgs_work, so shutoff is not working.
> >
> > This patch changes to use PSI_STATE_RESCHEDULE to flag whether to
> > re-arm avgs_work in get_recent_times(). For the current CPU, we re-arm
> > avgs_work only when (NR_RUNNING > 1 || NR_IOWAIT > 0 || NR_MEMSTALL > 0),
> > for other CPUs we can just check PSI_NONIDLE delta. The new flag
> > is only used in psi_avgs_work(), so we check in get_recent_times()
> > that current_work() is avgs_work.
> >
> > One potential problem is that the brief period of non-idle time
> > incurred between the aggregation run and the kworker's dequeue will
> > be stranded in the per-cpu buckets until avgs_work run next time.
> > The buckets can hold 4s worth of time, and future activity will wake
> > the avgs_work with a 2s delay, giving us 2s worth of data we can leave
> > behind when shut off the avgs_work. If the kworker run other works after
> > avgs_work shut off and doesn't have any scheduler activities for 2s,
> > this maybe a problem.
> >
> > Reported-by: Pavan Kondeti <quic_pkondeti@xxxxxxxxxxx>
> > Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Please make sure to test the final version. With that done,

Acked-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>