Re: [PATCH 0/2] sched/eevdf: sched_attr::sched_runtime slice hint

From: Qais Yousef
Date: Tue Sep 19 2023 - 17:09:01 EST


On 09/18/23 05:43, Mike Galbraith wrote:
> On Sat, 2023-09-16 at 22:33 +0100, Qais Yousef wrote:
> >
> > Example of conflicting requirements that come across frequently:
> >
> >         1. Improve wake up latency for SCHED_OTHER. Many tasks
> >            end up using SCHED_FIFO/SCHED_RR to compensate for this
> >            shortcoming. RT tasks lack power management and fairness and
> >            can be hard and error prone to use correctly and portably.
>
> This bit appears to be dealt with about as nicely as it can be in a
> fair class by the latency nice patch set, and deals with both
> individual tasks and groups thereof, ie has cgroups support.

AFAIU the latency_nice patch set is no longer going forward, but I could be
mistaken.

> Its trade slice for latency fits EEVDF nicely IMHO. As its name
> implies, the trade agreement language is relative niceness, which I
> find more appropriate than time units, use of which would put the deal
> squarely into the realm of RT, thus have no place in a fair class.

Nice (or latency nice) values are a global indication that only makes sense
within the specific context they were tested on, like RT priorities.

An abstract notion is fine if you have a better suggestion, but being globally
relative is a problem IMO. The intended consumers are application writers, who
have no prior knowledge of the system they'll be running on. I think that
was the main point against latency_nice IIUC.
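
To make the global-relative concern concrete, here is a rough sketch of how a
nice value turns into CPU share. It is a toy model, not kernel code. It assumes
weight ~= 1024 / 1.25^nice, which only approximates the kernel's
sched_prio_to_weight table, and the helper names are made up for illustration.

```python
# Toy model of how fair-class weights turn nice values into CPU share.
# Assumption: weight ~= 1024 / 1.25**nice, an approximation of the kernel's
# sched_prio_to_weight table. weight() and cpu_share() are hypothetical helpers.

def weight(nice):
    """Approximate load weight for a given nice value."""
    return 1024 / (1.25 ** nice)

def cpu_share(nice, other_nices):
    """Fraction of one CPU a task gets against the other runnable tasks."""
    mine = weight(nice)
    total = mine + sum(weight(n) for n in other_nices)
    return mine / total

# The same nice value on two different systems.
quiet = cpu_share(-5, [0])          # one default-nice competitor
busy = cpu_share(-5, [-5, -5, -5])  # everyone else asked for the same boost
print(f"quiet system: {quiet:.2f}, busy system: {busy:.2f}")
```

The task asks for the same nice value in both cases, but what it actually gets
depends entirely on what else is running, which the application writer cannot
know in advance.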

> I don't yet know how effective it is. I dinged up schedtool to play
> with both it and $subject, but have yet to target any pet piglets or
> measure impact of shiny new lipstick cannon.
>
> >         2. Prefer spreading vs prefer packing on wake up for a group of
> >            tasks. Geekbench-like workloads would benefit from
> >            parallelising on different CPUs. hackbench type of workloads
> >            can benefit from waking up on the same CPUs or a CPU that is
> >            closer in the cache hierarchy.
> >
> >         3. Nice values for SCHED_OTHER are system wide and require
> >            privileges. Many workloads would like a way to set relative
> >            nice value so they can preempt each other, but not impact
> >            or be impacted by other tasks belonging to different
> >            workloads on the system.
> >
> >         4. Provide a way to tag some tasks as 'background' to keep them
> >            out of the way. SCHED_IDLE is too strong for some of these
> >            tasks but yet they can be computationally heavy. Example
> >            tasks are garbage collectors. Their work is both important
> >            and not important.
>
> All three of those make my eyebrows twitch mightily even in their not
> well defined form: any notion of applying badges to identify groups of
> tasks would constitute creation of yet another cgroups.

cgroups require root privilege, and they are intended for sysadmins to split
system resources between apps. They don't help an app describe the
relationship between its own tasks, or any requirements those tasks have to do
their job properly. Rather, they impose something on the tasks regardless of
what they want.


Cheers

--
Qais Yousef