Re: [RFC PATCH v1] sched/uclamp: Introduce SCHED_FLAG_RESET_UCLAMP_ON_FORK flag

From: Qais Yousef
Date: Wed Apr 19 2023 - 13:54:27 EST


Hi David!

On 04/16/23 14:34, David Dai wrote:
> A userspace service may manage uclamp dynamically for individual tasks and
> a child task will unintentionally inherit a pesudo-random uclamp setting.
> This could result in the child task being stuck with a static uclamp value
> that results in poor performance or poor power.
>
> Using SCHED_FLAG_RESET_ON_FORK is too coarse for this usecase and will
> reset other useful scheduler attributes. Adding a
> SCHED_FLAG_RESET_UCLAMP_ON_FORK will allow userspace to have finer control
> over scheduler attributes of child processes.

Thanks a lot for the patch. This has a been a known limitation for a while but
didn't manage to find the time to push anything yet.

ADPF (Android Dynamic Performance Framework) exposes APIs to manage performance
for a set of pids [1]. Only these tasks belong to the session and any forked
tasked is expected to have its uclamp values reset. But as you pointed out, the
current RESET_ON_FORK resets everything, but we don't want that as these
attributes don't belong to ADPF to decide whether they should be reset too or
not. And not resetting them means we can end up with tasks inheriting random
uclamp values unintentionally. We can't tell these tasks not to fork anything.
If the forked tasks are expected to be part of the session, then their pids
must be added explicitly.

[1] https://developer.android.com/reference/android/os/PerformanceHintManager#createHintSession(int%5B%5D,%20long)

>
> Cc: Qais Yousef <qyousef@xxxxxxxxxx>
> Cc: Quentin Perret <qperret@xxxxxxxxxx>
> Cc: Saravana Kannan <saravanak@xxxxxxxxxx>
> Signed-off-by: David Dai <davidai@xxxxxxxxxx>
> ---
> include/linux/sched.h | 3 +++
> include/uapi/linux/sched.h | 4 +++-
> kernel/sched/core.c | 6 +++++-
> tools/include/uapi/linux/sched.h | 4 +++-
> 4 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 63d242164b1a..b1676b9381f9 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -885,6 +885,9 @@ struct task_struct {
> unsigned sched_reset_on_fork:1;

nit: can't we convert to a flag and re-use?

> unsigned sched_contributes_to_load:1;
> unsigned sched_migrated:1;
> +#ifdef CONFIG_UCLAMP_TASK
> + unsigned sched_reset_uclamp_on_fork:1;
> +#endif
>
> /* Force alignment to the next boundary: */
> unsigned :0;
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index 3bac0a8ceab2..7515106e1f1a 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -132,12 +132,14 @@ struct clone_args {
> #define SCHED_FLAG_KEEP_PARAMS 0x10
> #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> +#define SCHED_FLAG_RESET_UCLAMP_ON_FORK 0x80
>
> #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> SCHED_FLAG_KEEP_PARAMS)
>
> #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> - SCHED_FLAG_UTIL_CLAMP_MAX)
> + SCHED_FLAG_UTIL_CLAMP_MAX | \
> + SCHED_FLAG_RESET_UCLAMP_ON_FORK)

I was considering to have something a bit more generic that allows selecting
which attributes to reset.

For example a syscall with SCHED_FLAG_RESET_ON_FORK_SEL combined with
SCHED_FLAG_UCLAMP_MIN/MAX will only reset those. This should make it extensible
if we have other similar use cases in the future. The downside it *might*
require to be done in a separate syscall to the one that sets these parameter.
But it should be done once.

Maybe there's a better interface, but I think it makes sense to do it in a way
that we won't have to do this again. Would be good to hear from maintainers
first before you take my word for it ;-)


Cheers

--
Qais Yousef

>
> #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> SCHED_FLAG_RECLAIM | \
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0d18c3969f90..f2d5f7911855 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1943,6 +1943,10 @@ static void __setscheduler_uclamp(struct task_struct *p,
> uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> attr->sched_util_max, true);
> }
> +
> + p->sched_reset_uclamp_on_fork = !!(attr->sched_flags &
> + SCHED_FLAG_RESET_UCLAMP_ON_FORK);
> +
> }
>
> static void uclamp_fork(struct task_struct *p)
> @@ -1956,7 +1960,7 @@ static void uclamp_fork(struct task_struct *p)
> for_each_clamp_id(clamp_id)
> p->uclamp[clamp_id].active = false;
>
> - if (likely(!p->sched_reset_on_fork))
> + if (likely(!p->sched_reset_on_fork && !p->sched_reset_uclamp_on_fork))
> return;
>
> for_each_clamp_id(clamp_id) {
> diff --git a/tools/include/uapi/linux/sched.h b/tools/include/uapi/linux/sched.h
> index 3bac0a8ceab2..d52c59a2e0d0 100644
> --- a/tools/include/uapi/linux/sched.h
> +++ b/tools/include/uapi/linux/sched.h
> @@ -132,12 +132,14 @@ struct clone_args {
> #define SCHED_FLAG_KEEP_PARAMS 0x10
> #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> +#define SCHED_FLAG_RESET_UCLAMP_ON_FORK 0x80
>
> #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> SCHED_FLAG_KEEP_PARAMS)
>
> #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> - SCHED_FLAG_UTIL_CLAMP_MAX)
> + SCHED_FLAG_UTIL_CLAMP_MAX | \
> + SCHED_FLAG_RESET_UCLAMP_ON_FORK)
>
> #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> SCHED_FLAG_RECLAIM | \
> --
> 2.40.0.634.g4ca3ef3211-goog
>