Re: [PATCH v6 2/2] cpuidle: teo: Introduce util-awareness

From: Qais Yousef
Date: Mon Sep 18 2023 - 20:04:48 EST


On 09/18/23 12:41, Kajetan Puchalski wrote:

> Yes very much agreed on this part, they definitely can be improved. I'm
> currently exploring some other approaches to this issue as well but
> that's more of a long-term thing so it'll likely take a couple of months
> to see whether they're workable or not.

Sounds good :)

> The role I think the util bits are supposed to play here is mainly to
> give us a view from a bit higher up than the metrics themselves do
> because of how quickly those decay. Another way to do it would be some
> way to make the threshold self-adjusting in the same way the metrics
> are, e.g. increase the threshold if we're suddenly hitting too many too
> shallow sleeps and decrease when we're hitting too many deep sleeps as a
> result of the util checks. This would take care of the edge cases
> currently falling through the cracks of their util being right around the
> threshold, the mechanism would adjust within a few seconds of a workload
> like video playback running. Could be a step in the right direction at least.

I'm not keen on the 'too many' part personally. It does feel finger in the air
to me. But maybe it can work.

Beside timers, tasks wake up on synchronization mechanisms and IPC, we can
potentially gather more info from there. For example, if a task is blocked on
a kernel lock, then chances its in kernel contention and it supposed to wake up
soon for the kernel to finish servicing the operation (ie: finish the syscall
and return to user mode).

Android has the notion of cpus being in deeper idle states in EAS. I think we
could benefit from such thing in upstream too. Not all idle cpus are equal when
idle states are taken into account, for both power and latency reasons. So
there's room to improve on the wake up side to help keep CPUs in deeper idle
states too.

Load balancer state can give indications too. It'll try to spread to idle CPUs
first. Its activity is a good indication on the demand for an idle CPU to be
ready to run something soon.

Generally maybe scheduler and idle governor can coordinate better to
provide/request some low latency idle CPUs, but allow the rest to go into
deeper idle states, per TEO's rule of course.

On HMP systems this will be trickier for us as not all CPUs are equal. And
cluster idle can complicate things further. So the nomination of a low latency
CPU needs to consider more things into account compared to other systems
(except NUMA maybe) where any idle CPU is as good as any other.


Thanks!

--
Qais Yousef