Re: [PATCH v5 03/10] cpufreq/schedutil: add rt utilization tracking

From: Patrick Bellasi
Date: Fri Jun 01 2018 - 13:24:16 EST


On 01-Jun 18:23, Peter Zijlstra wrote:
> On Thu, May 31, 2018 at 10:46:07AM +0200, Juri Lelli wrote:
> > On 30/05/18 17:46, Quentin Perret wrote:
>
> > > So I understand why we want to go to max freq when a RT task is running,
> > > but I think there are use cases where we might want to be more conservative
> > > and use the util_avg of the RT rq instead. The first use case is
> > > battery-powered devices where going to max isn't really affordable from
> > > an energy standpoint. Android, for example, has been using a RT
> > > utilization signal to select OPPs for quite a while now, because going
> > > to max blindly is _very_ expensive.
> > >
> > > And the second use-case is thermal pressure. On some modern CPUs, going to
> > > max freq can lead to stringent thermal capping very quickly, to the
> > > point where your CPUs might not have enough capacity to serve your tasks
> > > properly. And that can ultimately hurt the very RT tasks you originally
> > > tried to run fast. In these systems, in the long term, you'd be better off
> > > not asking for more than what you really need ...
> >
> > Proposed the same at last LPC. Peter NAKed it (since RT is all about
> > meeting deadlines, and when using FIFO/RR we don't really know how fast
> > the CPU should go to meet them, so go to max is the only safe decision).
> >
> > > So what about having a sched_feature to select between going to max and
> > > using the RT util_avg ? Obviously the default should keep the current
> > > behaviour.
> >
> > Peter, would SCHED_FEAT make a difference? :)
>
> Hurmph...
>
> > Or Patrick's utilization capping applied to RT..
>
> There might be something there, IIRC that tracks the max potential
> utilization for the running tasks. So at that point we can set a
> frequency to minimize idle time.

Or we can do the opposite: we go to max by default (as it is now) and
if you think that some RT tasks don't need the full speed, you can
apply a util_max to them.

That way, when a RT task is running alone on a CPU, we can run it
only at a custom max freq which is known to be ok according to your
latency requirements.

If instead it's running with other CFS tasks, we already add the CFS
utilization, which will result in a speedup of the RT task to give
the CPU back to CFS.
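
To make the idea concrete, here is a minimal sketch in plain user-space
C, not kernel code. The helper names (sketch_cpu_util, sketch_next_freq)
and the rt_util_max parameter are hypothetical, made up for
illustration. The 25% headroom is an assumption that mirrors what
schedutil does when mapping utilization to frequency.

```c
#define SCHED_CAPACITY_SCALE 1024ULL

/*
 * Hypothetical helper: utilization requested for a CPU.
 * rt_util_max is the clamp applied to the RT task; leaving it at
 * SCHED_CAPACITY_SCALE keeps the current "go to max" behaviour.
 */
static unsigned long long sketch_cpu_util(int rt_running,
					  unsigned long long rt_util_max,
					  unsigned long long cfs_util)
{
	unsigned long long util = cfs_util;

	if (rt_running) {
		/* The RT request is capped by its util_max ... */
		if (rt_util_max > SCHED_CAPACITY_SCALE)
			rt_util_max = SCHED_CAPACITY_SCALE;
		/* ... and the CFS utilization is added on top of it. */
		util += rt_util_max;
	}

	/* Never ask for more than the CPU capacity. */
	return util > SCHED_CAPACITY_SCALE ? SCHED_CAPACITY_SCALE : util;
}

/*
 * Hypothetical helper: map a utilization to a frequency in kHz,
 * assuming 25% headroom: freq = 1.25 * max_freq * util / scale.
 */
static unsigned long long sketch_next_freq(unsigned long long max_freq,
					   unsigned long long util)
{
	unsigned long long freq;

	freq = (max_freq + (max_freq >> 2)) * util / SCHED_CAPACITY_SCALE;

	return freq > max_freq ? max_freq : freq;
}
```

With a 2 GHz CPU, an RT task clamped to util_max = 512 and running
alone would get sketch_next_freq(2000000, sketch_cpu_util(1, 512, 0)),
i.e. 1.25 GHz. Add 300 of CFS utilization and the request grows to 812,
so the RT task gets a higher frequency and releases the CPU sooner.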

> It's not perfect, because while the clamping thing effectively sets a
> per-task bandwidth, the max filter is wrong. Also there's no CBS to
> enforce anything.

Right, well... from user-space, if you carefully configure the cpu
controller for RT (both bandwidth and clamping) and keep track of the
allocated bandwidth, you can potentially still ensure that all your RT
tasks will be able to run, according to their prio.

> With RT servers we could aggregate the group bandwidth and limit from
> that...

What we certainly miss, I think, is an EDF scheduler: it's not
possible to run certain RT tasks before others irrespective of their
relative priority.

--
#include <best/regards.h>

Patrick Bellasi