Re: [PATCH 0/2] sched/eevdf: sched_attr::sched_runtime slice hint

From: Mike Galbraith
Date: Sun Sep 17 2023 - 23:45:32 EST


On Sat, 2023-09-16 at 22:33 +0100, Qais Yousef wrote:
>
> Example of conflicting requirements that come across frequently:
>
>         1. Improve wake up latency for SCHED_OTHER. Many tasks
>            end up using SCHED_FIFO/SCHED_RR to compensate for this
>            shortcoming. RT tasks lack power management and fairness and
>            can be hard and error prone to use correctly and portably.

The latency nice patch set appears to deal with this bit about as
nicely as can be done in a fair class, and it handles both individual
tasks and groups thereof, i.e. it has cgroups support.

Its trade of slice for latency fits EEVDF nicely IMHO. As its name
implies, the trade agreement language is relative niceness, which I
find more appropriate than time units, use of which would put the deal
squarely into the realm of RT, and thus have no place in a fair class.

I don't yet know how effective it is. I dinged up schedtool to play
with both it and $subject, but have yet to target any pet piglets or
measure the impact of the shiny new lipstick cannon.
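
For reference, a minimal userspace sketch of poking $subject, assuming
the patches are applied so that sched_attr::sched_runtime is taken as a
slice request for SCHED_OTHER (without them the field is presumably
ignored for that policy). The struct name slice_attr and both helpers
are hypothetical, made up for illustration; only the sched_setattr
syscall and the first 48 bytes of the uapi struct sched_attr layout are
real.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sched.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

/*
 * Local mirror of the original 48-byte uapi struct sched_attr layout.
 * Named slice_attr (hypothetical) so it cannot clash with libc or
 * kernel headers that already define struct sched_attr.
 */
struct slice_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;   /* assumed to carry the slice hint, in ns */
	uint64_t sched_deadline;
	uint64_t sched_period;
};

static_assert(sizeof(struct slice_attr) == 48,
	      "layout must match the 48-byte sched_attr");

/* Build an attr asking for SCHED_OTHER with the given slice in ns. */
static struct slice_attr make_slice_hint(uint64_t slice_ns)
{
	struct slice_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.sched_policy = SCHED_OTHER;
	attr.sched_runtime = slice_ns;
	/* sched_nice is left at 0, so this also sets the task's nice to 0. */
	return attr;
}

/* Apply the hint to pid (0 means the caller); returns 0 or -1 with errno. */
static int request_slice(pid_t pid, uint64_t slice_ns)
{
	struct slice_attr attr = make_slice_hint(slice_ns);

	return (int)syscall(SYS_sched_setattr, pid, &attr, 0);
}
```

Something like request_slice(0, 3000000) would then ask for a 3ms slice
for the calling task. Whether and how the kernel clamps the value is up
to the patches.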

>         2. Prefer spreading vs prefer packing on wake up for a group of
>            tasks. Geekbench-like workloads would benefit from
>            parallelising on different CPUs. hackbench type of workloads
>            can benefit from waking up on the same CPUs or a CPU that
>            is closer in the cache hierarchy.
>
>         3. Nice values for SCHED_OTHER are system wide and require
>            privileges. Many workloads would like a way to set relative
>            nice value so they can preempt each other, but neither
>            impact nor be impacted by other tasks belonging to
>            different workloads on the system.
>
>         4. Provide a way to tag some tasks as 'background' to keep them
>            out of the way. SCHED_IDLE is too strong for some of these
>            tasks but yet they can be computationally heavy. Example
>            tasks are garbage collectors. Their work is both important
>            and not important.

All three of those make my eyebrows twitch mightily even in their not
well defined form: any notion of applying badges to identify groups of
tasks would constitute creation of yet another cgroups.

-Mike