Re: [patch] Re: autogroup: sched_setscheduler() fails

From: Mike Galbraith
Date: Tue Feb 22 2011 - 09:48:03 EST


On Tue, 2011-02-22 at 13:24 +0100, torbenh wrote:
> On Fri, Feb 18, 2011 at 01:50:12PM +0100, Mike Galbraith wrote:
> > On Fri, 2011-02-18 at 12:09 +0100, torbenh wrote:
> > > On Tue, Feb 15, 2011 at 05:43:30PM +0100, Mike Galbraith wrote:
> > > > On Tue, 2011-02-15 at 16:46 +0100, torbenh wrote:
> > > > > On Mon, Jan 17, 2011 at 02:16:00PM +0100, Peter Zijlstra wrote:
> > > > > > On Thu, 2011-01-13 at 04:54 +0100, Mike Galbraith wrote:
> > > > > > > sched, autogroup: fix CONFIG_RT_GROUP_SCHED sched_setscheduler() failure.
> > > > > > >
> > > > > > > If CONFIG_RT_GROUP_SCHED is set, __sched_setscheduler() fails due to autogroup
> > > > > > > not allocating rt_runtime. Free unused/unusable rt_se and rt_rq, redirect RT
> > > > > > > tasks to the root task group, and tell __sched_setscheduler() that it's ok.
> > > > > > >
> > > > > > > Signed-off-by: Mike Galbraith <efault@xxxxxx>
> > > > > > > Reported-by: Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx>
> > > > > >
> > > > > > Thanks, applied!
> > > > >
> > > > > while this behaviour is certainly necessary, i think this is a hack.
> > > > > it fixes the problem for autogroups.
> > > > > But it's not fixed for things which want to control the cfs shares via
> > > > > normal cgroups.
> > > >
> > > > You mean automated control ala systemd? For a static set of groups, it
> > > > works fine. I was wondering how systemd would deal with it.
> > >
> > > but i can not get the same behaviour as if CONFIG_RT_GROUP_SCHED was
> > > off. i.e. N cgroups with different cpu.shares values, but each with
> > > rt_runtime_us=950000
> >
> > ? if CONFIG_RT_GROUP_SCHED was a noop, it wouldn't exist.
> >
> > > if the rt_runtime_us was in a different subsystem, it's my understanding
> > > that i could leave rt_runtime_us alone, and have all tasks in the root
> > > group in the rt_runtime subsystem.
> >
> > Sounds like you just want to turn CONFIG_RT_GROUP_SCHED off.
>
> but distros turn it on.
> we could prevent debian from turning it on.
> now opensuse 11.4 has turned it on.

If you or anyone else turns on RT_GROUP_SCHED, you will count your
beans, and pay up front, or you will not play. That's a very sensible
policy for realtime.
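
A rough sketch of the bean counting, for illustration only. It is a
simplified model, not kernel code: it assumes every group uses the same
rt_period_us so that runtimes add directly, and the function name is
hypothetical. The point is that sibling groups' rt_runtime_us must fit
inside the parent's budget, so N groups cannot each hold the full 950000.

```python
# Simplified model of RT_GROUP_SCHED budget checking, not kernel code.
# Assumption: all groups share one rt_period_us, so runtimes add directly.
ROOT_RT_RUNTIME_US = 950000  # value quoted in this thread


def fits_in_parent(parent_runtime_us, child_runtimes_us):
    """Return True if the children's RT runtime sums to no more than the parent's."""
    return sum(child_runtimes_us) <= parent_runtime_us


# Three groups each asking for the full budget: does not fit.
print(fits_in_parent(ROOT_RT_RUNTIME_US, [950000, 950000, 950000]))  # False
# The same budget split three ways: fits.
print(fits_in_parent(ROOT_RT_RUNTIME_US, [300000, 300000, 300000]))  # True
```

Under this model, different cpu.shares per group are unaffected, but the
RT budget has to be carved up explicitly among the groups.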

> > > > The allocation problem was shamelessly punted back to the user, where I
> > > > think it truly belongs.
> > >
> > > sure it belongs to the user. but what if user wants to have different
> > > cpu.shares, but full rt_runtime_us for all tasks ?
> >
> > Then the user doesn't want CONFIG_RT_GROUP_SCHED set.
>
> but distros force it onto the user.
> if systemd shows up, and puts things into different cgroups,
> we end up with a pretty fragmented rt_runtime_us.

If systemd deals with it at all, seems to me it can only make a mess of
it. But who knows, maybe they made a clever allocator. If they didn't,
they'll need an escape hatch methinks.

-Mike
