Re: RFC: documentation of the autogroup feature [v2]

From: Michael Kerrisk (man-pages)
Date: Fri Nov 25 2016 - 11:33:41 EST


Hi Peter,

On 11/25/2016 05:04 PM, Peter Zijlstra wrote:
> On Fri, Nov 25, 2016 at 04:04:25PM +0100, Michael Kerrisk (man-pages) wrote:
>>>> ┌─────────────────────────────────────────────────────┐
>>>> │FIXME                                                │
>>>> ├─────────────────────────────────────────────────────┤
>>>> │How do the nice value of a process and the nice      │
>>>> │value of an autogroup interact? Which has priority?  │
>>>> │                                                     │
>>>> │It *appears* that the autogroup nice value is used   │
>>>> │for CPU distribution between task groups, and that   │
>>>> │the process nice value has no effect there. (I.e.,   │
>>>> │suppose two autogroups each contain a CPU-bound      │
>>>> │process, with one process having nice==0 and the     │
>>>> │other having nice==19. It appears that they each     │
>>>> │get 50% of the CPU.) It appears that the process     │
>>>> │nice value has effect only with respect to           │
>>>> │scheduling relative to other processes in the *same* │
>>>> │autogroup. Is this correct?                          │
>>>> └─────────────────────────────────────────────────────┘
>>>
>>> Yup, entity nice level affects distribution among peer entities.
>>
>> Huh! I only just learned about this via my experiments while
>> investigating autogroups.
>>
>> How long have things been like this? Always? (I don't think
>> so.) Since the arrival of CFS? Since the arrival of
>> autogrouping? (I'm guessing not.) Since some other point?
>> (When?)
>
> Ever since cfs-cgroup,

Okay. That still leaves the question open, though.

> this is a fundamental design point of cgroups,
> and has therefore always been the case for autogroups (as that is
> nothing more than an application of the cgroup code).

Understood.

>> It seems to me that this renders the traditional process
>> nice pretty much useless. (I bet I'm not the only one who'd
>> be surprised by the current behavior.)
>
> It's really rather fundamental to how the whole hierarchical thing
> works.
>
> CFS is a weighted fair queueing scheduler; this means each entity
> receives:
>
>                w_i
>   dt_i = dt --------
>             \Sum w_j
>
>
>              CPU
>        ______/ \______
>       /    |     |    \
>      A     B     C     D
>
>
> So if each entity {A,B,C,D} has equal weight, then they will receive
> equal time. Explicitly, for C you get:
>
>
>                       w_C
>   dt_C = dt -----------------------
>             (w_A + w_B + w_C + w_D)
>
>
> Extending this to a hierarchy, we get:
>
>
>              CPU
>        ______/ \______
>       /    |     |    \
>      A     B     C     D
>                 / \
>                E   F
>
> Where C becomes a 'server' for entities {E,F}. The weight of C does not
> depend on its child entities. This way the time of {E,F} becomes a
> straight product of their ratio with C. That is, the whole thing
> becomes the following, where l denotes the level in the hierarchy
> and i an entity on that level:
>
>                   l      w_g,i
>   dt_l,i = dt   \Prod  ----------
>                  g=0   \Sum w_g,j
>
>
> Or more concretely, for E:
>
>                         w_E
>   dt_1,E = dt_0,C  -----------
>                    (w_E + w_F)
>
>                            w_C               w_E
>          = dt  -----------------------  -----------
>                (w_A + w_B + w_C + w_D)  (w_E + w_F)
>
>
> And this 'trivially' extends to SMP, with the tricky bit being that the
> sums over all entities end up being machine wide, instead of per CPU,
> which is a real and royal pain for performance.

Okay -- you're really quite the ASCII artist. And somehow,
I think you needed to compose the mail in LaTeX. But thanks
for the detail. It's helpful, for me at least.
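
For readers who prefer code to ASCII art, the hierarchical formula above
can be sketched for a single CPU in a few lines of Python. This is only an
illustration of the arithmetic, not how the kernel implements it: the
function name `shares`, the nested-dict layout, and the unit weights are
all assumptions made for the example, and the SMP complications Peter
mentions are ignored.

```python
from fractions import Fraction


def shares(tree, total=Fraction(1)):
    """Return {leaf name: fraction of CPU time} for a weighted hierarchy.

    `tree` maps an entity name to (weight, children), where `children`
    is a dict of the same shape (empty for a plain task). This layout is
    hypothetical and exists only for this sketch.
    """
    result = {}
    # Each entity competes only with its peers on the same level.
    weight_sum = sum(weight for weight, _ in tree.values())
    for name, (weight, children) in tree.items():
        slice_ = total * Fraction(weight, weight_sum)
        if children:
            # A group acts as a 'server': its slice is divided among its
            # children, and its own weight does not depend on them.
            result.update(shares(children, slice_))
        else:
            result[name] = slice_
    return result


# The example hierarchy from the mail, with every weight set to 1.
hierarchy = {
    "A": (1, {}),
    "B": (1, {}),
    "C": (1, {"E": (1, {}), "F": (1, {})}),
    "D": (1, {}),
}

cpu = shares(hierarchy)
for name in sorted(cpu):
    print(name, cpu[name])
```

With equal weights, A, B and D each receive 1/4 of the CPU, while E and F
each receive 1/4 * 1/2 = 1/8, matching the expansion for dt_1,E.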

> Note that this property, where the weight of the server entity is
> independent from its child entities, is a desired feature. Without that
> it would be impossible to control the relative weights of groups, and
> that is the sole parameter of the WFQ model.
>
> It is also why Linus so likes autogroups: each session competes equally
> with the others.

I get it. But, the behavior changes for the process nice value are
undocumented, and they should be documented. I understand
what the behavior change was. But not yet when.
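
As a rough illustration of the behavior that needs documenting, the
sketch below works through the scenario from the FIXME: two CPU-bound
processes, one at nice 0 and one at nice 19, placed either in separate
autogroups or in the same one. The weights 1024 (nice 0) and 15 (nice 19)
are the corresponding entries of the kernel's nice-to-weight table; the
assumption that both autogroups carry the default weight (autogroup nice
0), the single-CPU model, and all names in the code are mine.

```python
from fractions import Fraction

# Two entries from the kernel's nice-to-weight table.
NICE_TO_WEIGHT = {0: 1024, 19: 15}

# Assumption: both autogroups are left at autogroup nice 0.
AUTOGROUP_WEIGHT = NICE_TO_WEIGHT[0]


def split(weights):
    """Divide one unit of CPU time among peers in proportion to weight."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


# Case 1: each process sits alone in its own autogroup. The groups
# compete with each other; a lone process inherits its group's whole
# share, so its own nice value never enters the calculation.
groups = split({"group1": AUTOGROUP_WEIGHT, "group2": AUTOGROUP_WEIGHT})
separate = {"nice0": groups["group1"], "nice19": groups["group2"]}

# Case 2: both processes share one autogroup, so they are peers and
# their nice values decide the split.
same = split({"nice0": NICE_TO_WEIGHT[0], "nice19": NICE_TO_WEIGHT[19]})

print("separate autogroups:", {k: float(v) for k, v in separate.items()})
print("same autogroup:     ", {k: float(v) for k, v in same.items()})
```

In the first case each process gets 1/2 of the CPU regardless of its nice
value; in the second the nice-0 process gets 1024/1039 (roughly 98.6%).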

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/