Re: [PATCH v8 3/6] cpuset: Add cpuset.sched.load_balance flag to v2

From: Patrick Bellasi
Date: Fri May 25 2018 - 08:00:07 EST


On 24-May 11:22, Waiman Long wrote:
> On 05/24/2018 11:16 AM, Juri Lelli wrote:
> > On 24/05/18 11:09, Waiman Long wrote:
> >> On 05/24/2018 10:36 AM, Juri Lelli wrote:
> >>> On 17/05/18 16:55, Waiman Long wrote:
> >>>
> >>> [...]
> >>>
> >>>> + A parent cgroup cannot distribute all its CPUs to child
> >>>> + scheduling domain cgroups unless its load balancing flag is
> >>>> + turned off.
> >>>> +
> >>>> + cpuset.sched.load_balance
> >>>> + A read-write single value file which exists on non-root
> >>>> + cpuset-enabled cgroups. It is a binary value flag that accepts
> >>>> + either "0" (off) or a non-zero value (on). This flag is set
> >>>> + by the parent and is not delegatable.
> >>>> +
> >>>> + When it is on, tasks within this cpuset will be load-balanced
> >>>> + by the kernel scheduler. Tasks will be moved from CPUs with
> >>>> + high load to other CPUs within the same cpuset with less load
> >>>> + periodically.
> >>>> +
> >>>> + When it is off, there will be no load balancing among CPUs on
> >>>> + this cgroup. Tasks will stay in the CPUs they are running on
> >>>> + and will not be moved to other CPUs.
> >>>> +
> >>>> + The initial value of this flag is "1". This flag is then
> >>>> + inherited by child cgroups with cpuset enabled. Its state
> >>>> + can only be changed on a scheduling domain cgroup with no
> >>>> + cpuset-enabled children.
> >>> [...]
> >>>
> >>>> + /*
> >>>> + * On default hierachy, a load balance flag change is only allowed
> >>>> + * in a scheduling domain with no child cpuset.
> >>>> + */
> >>>> + if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys) && balance_flag_changed &&
> >>>> + (!is_sched_domain(cs) || css_has_online_children(&cs->css))) {
> >>>> + err = -EINVAL;
> >>>> + goto out;
> >>>> + }
> >>> The rule is actually
> >>>
> >>> - no child cpuset
> >>> - and it must be a scheduling domain
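
To make sure I read the quoted check correctly, here is a minimal
standalone sketch of the predicate. The struct and the function name are
hypothetical stand-ins of mine, not kernel code; each field mirrors one
of the helpers called in the hunk above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the quoted check inspects. */
struct cpuset_state {
	bool on_default_hierarchy;	/* cgroup_subsys_on_dfl(cpuset_cgrp_subsys) */
	bool is_sched_domain;		/* is_sched_domain(cs) */
	bool has_online_children;	/* css_has_online_children(&cs->css) */
};

/*
 * Returns 0 if the load balance flag change is allowed, -EINVAL otherwise.
 * On the default hierarchy a change is only accepted when the cpuset is a
 * scheduling domain and has no online child cpuset.
 */
static int check_balance_flag_change(const struct cpuset_state *cs,
				     bool balance_flag_changed)
{
	if (cs->on_default_hierarchy && balance_flag_changed &&
	    (!cs->is_sched_domain || cs->has_online_children))
		return -EINVAL;

	return 0;
}
```

If that reading is right, both conditions must hold at once: a scheduling
domain with children is rejected, and so is a childless cpuset that is not
a scheduling domain.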

I'm always a bit confused by the usage of "scheduling domain", which
overlaps with the SD concept from the scheduler standpoint.

AFAIU a cpuset sched domain is not guaranteed to be turned into an
actual scheduler SD, am I wrong?

If that's the case, why not better disambiguate these two concepts by
calling the cpuset one a "cpus partition" or possibly a "cpuset domain"?

--
#include <best/regards.h>

Patrick Bellasi