Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice

From: Peter Zijlstra
Date: Thu Sep 05 2019 - 07:40:51 EST


On Thu, Sep 05, 2019 at 12:18:55PM +0100, Patrick Bellasi wrote:

> Right, we have this dualism to deal with and current mainline behaviour
> is somehow in the middle.
>
> BTW, the FB requirement is the same we have in Android.
> We want some CFS tasks to have very small latency and a low chance
> to be preempted by the wake-up of less-important "background" tasks.
>
> I'm not totally against the usage of a signed range, but I'm thinking
> that since we are introducing a new (non POSIX) concept we can get the
> chance to make it more human friendly.

I'm arguing that signed _is_ more human friendly ;-)

> Give the two extremes above, would not be much simpler and intuitive to
> have 0 implementing the FB/Android (no latency) case and 1024 the
> (max latency) Oracle case?

See, I find the signed thing more natural, negative is a bias away from
latency sensitive, positive is a bias towards latency sensitive.

Also; 0 is a good default value ;-)

> Moreover, we will never match completely the nice semantic, give that
> a 1 nice unit has a proper math meaning, isn't something like 10% CPU
> usage change for each step?

Only because we were nice when implementing it. Posix leaves it
unspecified and we could change it at any time. The only real semantics
is a relative 'weight' (opengroup uses the term 'favourable').

> Could changing the name to "latency-tolerance" break the tie by marking
> its difference wrt prior/nice levels? AFAIR, that was also the original
> proposal [1] by PaulT during the OSPM discussion.

latency torrerance could still be a signed entity, positive would
signify we're more tolerant of latency (ie. less sensitive) while
negative would be less tolerant (ie. more sensitive).

> For latency-nice instead we will likely base our biasing strategies on
> some predefined (maybe system-wide configurable) const thresholds.

I'm not quite sure; yes, for some of these things, like the idle search
on wakeup, certainly. But say for wakeup-preemption, we could definitely
make it a task relative attribute.