Re: [RFC PATCH v1] sched/uclamp: Introduce SCHED_FLAG_RESET_UCLAMP_ON_FORK flag

From: Saravana Kannan
Date: Wed Apr 19 2023 - 18:50:00 EST


On Wed, Apr 19, 2023 at 10:54 AM Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
>
> Hi David!
>
> On 04/16/23 14:34, David Dai wrote:
> > A userspace service may manage uclamp dynamically for individual tasks and
> > a child task will unintentionally inherit a pesudo-random uclamp setting.
> > This could result in the child task being stuck with a static uclamp value
> > that results in poor performance or poor power.
> >
> > Using SCHED_FLAG_RESET_ON_FORK is too coarse for this usecase and will
> > reset other useful scheduler attributes. Adding a
> > SCHED_FLAG_RESET_UCLAMP_ON_FORK will allow userspace to have finer control
> > over scheduler attributes of child processes.
>
> Thanks a lot for the patch. This has a been a known limitation for a while but
> didn't manage to find the time to push anything yet.
>
> ADPF (Android Dynamic Performance Framework) exposes APIs to manage performance
> for a set of pids [1]. Only these tasks belong to the session and any forked
> tasked is expected to have its uclamp values reset. But as you pointed out, the
> current RESET_ON_FORK resets everything, but we don't want that as these
> attributes don't belong to ADPF to decide whether they should be reset too or
> not. And not resetting them means we can end up with tasks inheriting random
> uclamp values unintentionally. We can't tell these tasks not to fork anything.
> If the forked tasks are expected to be part of the session, then their pids
> must be added explicitly.
>
> [1] https://developer.android.com/reference/android/os/PerformanceHintManager#createHintSession(int%5B%5D,%20long)
>
> >
> > Cc: Qais Yousef <qyousef@xxxxxxxxxx>
> > Cc: Quentin Perret <qperret@xxxxxxxxxx>
> > Cc: Saravana Kannan <saravanak@xxxxxxxxxx>
> > Signed-off-by: David Dai <davidai@xxxxxxxxxx>
> > ---
> > include/linux/sched.h | 3 +++
> > include/uapi/linux/sched.h | 4 +++-
> > kernel/sched/core.c | 6 +++++-
> > tools/include/uapi/linux/sched.h | 4 +++-
> > 4 files changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 63d242164b1a..b1676b9381f9 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -885,6 +885,9 @@ struct task_struct {
> > unsigned sched_reset_on_fork:1;
>
> nit: can't we convert to a flag and re-use?
>
> > unsigned sched_contributes_to_load:1;
> > unsigned sched_migrated:1;
> > +#ifdef CONFIG_UCLAMP_TASK
> > + unsigned sched_reset_uclamp_on_fork:1;
> > +#endif
> >
> > /* Force alignment to the next boundary: */
> > unsigned :0;
> > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> > index 3bac0a8ceab2..7515106e1f1a 100644
> > --- a/include/uapi/linux/sched.h
> > +++ b/include/uapi/linux/sched.h
> > @@ -132,12 +132,14 @@ struct clone_args {
> > #define SCHED_FLAG_KEEP_PARAMS 0x10
> > #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> > #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> > +#define SCHED_FLAG_RESET_UCLAMP_ON_FORK 0x80
> >
> > #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> > SCHED_FLAG_KEEP_PARAMS)
> >
> > #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> > - SCHED_FLAG_UTIL_CLAMP_MAX)
> > + SCHED_FLAG_UTIL_CLAMP_MAX | \
> > + SCHED_FLAG_RESET_UCLAMP_ON_FORK)
>
> I was considering to have something a bit more generic that allows selecting
> which attributes to reset.
>
> For example a syscall with SCHED_FLAG_RESET_ON_FORK_SEL combined with
> SCHED_FLAG_UCLAMP_MIN/MAX will only reset those. This should make it extensible
> if we have other similar use cases in the future. The downside it *might*
> require to be done in a separate syscall to the one that sets these parameter.
> But it should be done once.

In addition to the downside you mentioned, I'm not a huge fan of this
suggestion since the meaning of the SCHED_FLAG_RESET_ON_FORK_SEL flag
changes based on what other flags or attrs are set. I'd rather we have
explicit flags.

SCHED_FLAG_RESET_ON_FORK_SEL makes it harder to maintain the userspace
code/makes it easy to accidentally introduce bugs. For example, a
syscall could be setting UCLAMP_MIN and RESET_ON_FORK_SEL. Someone
else might come and change the call to also set a nice value but not
remember to split it up into two calls. Whereas with an explicit flag
like David's proposal, we won't hit such an issue.

Also, we'll need to have separate flags internally to track what needs
to be reset on fork vs not. So we really aren't saving anything by
adding RESET_ON_FORK_SEL.

At least, that's my 2 cents.

-Saravana

> Maybe there's a better interface, but I think it makes sense to do it in a way
> that we won't have to do this again. Would be good to hear from maintainers
> first before you take my word for it ;-)
>
>
> Cheers
>
> --
> Qais Yousef
>
> >
> > #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> > SCHED_FLAG_RECLAIM | \
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 0d18c3969f90..f2d5f7911855 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1943,6 +1943,10 @@ static void __setscheduler_uclamp(struct task_struct *p,
> > uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> > attr->sched_util_max, true);
> > }
> > +
> > + p->sched_reset_uclamp_on_fork = !!(attr->sched_flags &
> > + SCHED_FLAG_RESET_UCLAMP_ON_FORK);
> > +
> > }
> >
> > static void uclamp_fork(struct task_struct *p)
> > @@ -1956,7 +1960,7 @@ static void uclamp_fork(struct task_struct *p)
> > for_each_clamp_id(clamp_id)
> > p->uclamp[clamp_id].active = false;
> >
> > - if (likely(!p->sched_reset_on_fork))
> > + if (likely(!p->sched_reset_on_fork && !p->sched_reset_uclamp_on_fork))
> > return;
> >
> > for_each_clamp_id(clamp_id) {
> > diff --git a/tools/include/uapi/linux/sched.h b/tools/include/uapi/linux/sched.h
> > index 3bac0a8ceab2..d52c59a2e0d0 100644
> > --- a/tools/include/uapi/linux/sched.h
> > +++ b/tools/include/uapi/linux/sched.h
> > @@ -132,12 +132,14 @@ struct clone_args {
> > #define SCHED_FLAG_KEEP_PARAMS 0x10
> > #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> > #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> > +#define SCHED_FLAG_RESET_UCLAMP_ON_FORK 0x80
> >
> > #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> > SCHED_FLAG_KEEP_PARAMS)
> >
> > #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> > - SCHED_FLAG_UTIL_CLAMP_MAX)
> > + SCHED_FLAG_UTIL_CLAMP_MAX | \
> > + SCHED_FLAG_RESET_UCLAMP_ON_FORK)
> >
> > #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> > SCHED_FLAG_RECLAIM | \
> > --
> > 2.40.0.634.g4ca3ef3211-goog
> >