Re: [RFC PATCH v1] sched/uclamp: Introduce SCHED_FLAG_RESET_UCLAMP_ON_FORK flag

From: Qais Yousef
Date: Fri Apr 28 2023 - 07:45:04 EST


Hi Vincent

Sorry for the late response.

On 04/21/23 17:10, Vincent Guittot wrote:
> On Thu, 20 Apr 2023 at 18:26, Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> >
> > On Thu, Apr 20, 2023 at 6:44 AM Vincent Guittot
> > <vincent.guittot@xxxxxxxxxx> wrote:
> > >
> > > On Thu, 20 Apr 2023 at 11:37, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> > > >
> > > > On 20/04/2023 03:11, David Dai wrote:
> > > > > On Tue, Apr 18, 2023 at 10:18 PM Dietmar Eggemann
> > > > > <dietmar.eggemann@xxxxxxx> wrote:
> > > > >>
> > > > >
> > > > > Hi Dietmar, thanks for your time,
> > > > >
> > > > >> On 16/04/2023 23:34, David Dai wrote:
> > > > >>> A userspace service may manage uclamp dynamically for individual tasks and
> > > > >>> a child task will unintentionally inherit a pesudo-random uclamp setting.
> > > > >>> This could result in the child task being stuck with a static uclamp value
> > > > >>
> > > > >> Could you explain this with a little bit more detail? Why isn't the
> > > > >> child task also managed by the userspace service?
> > > > >
> > > > > See Qais’ reply that contains more detail on how it’s being used in
> > > > > Android. In general, if a dynamic userspace service will adjust uclamp
> > > > > on the fly for a given task, but has no knowledge or control over if
> > > > > or when a task forks. Depending on the timing of the fork, a child
> > > > > task may inherit a very large or a small uclamp_min or uclamp_max
> > > > > value. The intent of this patch is to provide more flexibility to the
> > > > > uclamp APIs such that child tasks do not get stuck with a poor uclamp
> > > > > value when spawned while retaining other sched attributes. When
> > > > > RESET_ON_FORK is set on the parent task, it will reset uclamp values
> > > > > for the child but also reset other sched attributes as well.
> > > >
> > > > OK, in this case, why not just change behavior and always reset the
> > > > uclamp values at fork?
> > > >
> > > > Do we anticipate a use-case in which uclamp inheritance would be required?
> > > >
> > > > Let's not over-complicate the sched_[sg]etattr() unnecessarily.
> > >
> > > I was about to ask the same question and I'm aligned with Dietmar.
> > > Use RESET_ON_FORK and set all attributes
> >
> > That's racy though. If we have an external service (that's only
> > responsible for setting uclamp) setting all the attributes, the forked
> > thread could also be trying to set some of the attributes. Also, how
> > is this external service going to keep track of all the threads being
> > forked and set the right attributes for all of them?
>
> My assumption was that you didn't use RESET_ON_FORK because there were
> other attributes that you wanted to keep from parent but it doesn't

Correct.

> seem to be the case so use RESET_ON_FORK and if needed the forked
> thread will set its other attributes

Hmm, it seems you think it's the parent who's trying to control the fork for
the child. But the situation we're in is different.

ADPF is a shared library/middleware/system service that provides higher level
API to manage the performance requirements for a group of tasks. They provide
a set of PIDs and a deadline, then continuously tell it how long they took to
finish their work. ADPF uses this info to manipulate uclamp_min dynamically on
their behalf so they finish their work in time. Explaining it very simply and
briefly here. It should make using uclamp_max easier too (though we are not
there yet) since the most efficient point is platform specific and the
middleware can help abstract this better in a portable way.

The problem is these tasks can be RT and/or can have their priorities/nice
values set by something else. But what we want is to make sure that we control
uclamp values only for the PIDs we were asked to manage, and anything forked
must have their uclamp values reset, but not the rest as this is outside of
this service available context. It gets delegated to handle performance hints
only and make using uclamp easier and generic. But this should not interfere
with policy and priority management done outside of its scope.

Apps that manage their own uclamp values wouldn't need this. But if we are to
build a smart middleware that provides a consistent and easier to use higher
level APIs, which is what we have here, we need this selective reset on fork.

Does this explain the problem better?

Note we noticed because we saw problems here. So we are not trying to be
theoretically cautious.


Thanks!

--
Qais Yousef