Re: [BUG] cpu controller can't provide fair CPU time for each group

From: Yasunori Goto
Date: Thu Nov 19 2009 - 02:10:22 EST


Hi, Peter-san.
Sorry for the late response.

> > On Wed, 2009-11-11 at 15:21 +0900, Yasunori Goto wrote:
> > > > Someone needs to fix that if they really care.
> > >
> > > To be honest, I don't have any good idea because I'm not familiar with
> > > the scheduler's code.
> >
> > You could possibly try something like the below. It's rather gross but
> > might work... has other issues though.
>
> Thank you very much for your patch. I'm very glad.
>
> Unfortunately, my test staff and I don't have a test box today,
> so I'll report the result later.
>
> Thanks again for your time.

We tested your patch, and it works exactly as we expected.

In addition, we confirmed the conditions under which this issue occurs:
it can be reproduced even without using CPU affinity or cpusets.

For example, we made two groups with the same share values on a box
with two cores.

            # of tasks   shares   CPU time             CPU time
                                  w/o your patch       with your patch
  group A        1        1024    Avg  71%             Avg 100%
  group B        3        1024    Avg 129%             Avg 100%

This is probably the worst case for the current cpu controller,
but your patch fixes it.
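
Just to spell out the expected numbers (a sketch of the share
arithmetic only, ignoring everything else the scheduler does): each
group is entitled to its shares divided by the total shares, applied
to the 200% of CPU time the two cores provide, regardless of how many
tasks it runs.

#include <stdio.h>

int main(void)
{
	unsigned long shares_a = 1024, shares_b = 1024;
	double total_cpu = 200.0;	/* 2 cores x 100% */

	/* equal shares => each group should get 100%, i.e. one core */
	printf("group A: %.0f%%\n",
	       total_cpu * shares_a / (shares_a + shares_b));
	printf("group B: %.0f%%\n",
	       total_cpu * shares_b / (shares_a + shares_b));
	return 0;
}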

So, I would like to push this patch into mainline if you are OK with it.
Though you may feel the patch is ugly, I think it is good for the stable kernel.
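
For reference, here is my reading of what your patch does, as a
minimal user-space sketch (a hypothetical standalone program, not the
kernel function itself): the real per-cpu weights are now accumulated
in rq_weight, while the padded weights (idle cpus pretending to carry
NICE_0_LOAD) go into sum_weight, which is used only as a fallback when
the whole group is idle.

#include <stdio.h>

#define NR_CPUS		2
#define NICE_0_LOAD	1024UL

int main(void)
{
	/* group A from the table above: one task on cpu0, cpu1 idle */
	unsigned long cfs_weight[NR_CPUS] = { 1024UL, 0UL };
	unsigned long tg_shares = 1024UL;
	unsigned long rq_weight = 0, sum_weight = 0;
	int i;

	for (i = 0; i < NR_CPUS; i++) {
		unsigned long weight = cfs_weight[i];

		rq_weight += weight;		/* real load only */
		if (!weight)
			weight = NICE_0_LOAD;	/* pretend weight for idle cpu */
		sum_weight += weight;		/* padded load */
	}

	if (!rq_weight)			/* whole group idle: fall back */
		rq_weight = sum_weight;

	/*
	 * Distribute the group's shares in proportion to each cpu's
	 * real weight. Dividing by the padded sum_weight (the old
	 * behaviour) would give cpu0 only 512 of the 1024 shares,
	 * which is how a busy group got diluted by its idle cpus.
	 */
	for (i = 0; i < NR_CPUS; i++)
		printf("cpu%d share = %lu\n", i,
		       tg_shares * cfs_weight[i] / rq_weight);

	return 0;
}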

Anyway, thanks a lot for your patch.

Regards.

Acked-by: Yasunori Goto <y-goto@xxxxxxxxxxxxxx>
Tested-by: Kengo Kuwahara <kuwahara.kengo@xxxxxxxxxxxxxx>


>
> >
> >
> > ---
> >  kernel/sched.c |    9 +++++++--
> >  1 files changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 91642c1..65629a3 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -1614,7 +1614,7 @@ static void update_group_shares_cpu(struct task_group *tg, int cpu,
> >   */
> >  static int tg_shares_up(struct task_group *tg, void *data)
> >  {
> > -	unsigned long weight, rq_weight = 0, shares = 0;
> > +	unsigned long weight, rq_weight = 0, sum_weight = 0, shares = 0;
> >  	unsigned long *usd_rq_weight;
> >  	struct sched_domain *sd = data;
> >  	unsigned long flags;
> > @@ -1630,6 +1630,7 @@ static int tg_shares_up(struct task_group *tg, void *data)
> >  		weight = tg->cfs_rq[i]->load.weight;
> >  		usd_rq_weight[i] = weight;
> >
> > +		rq_weight += weight;
> >  		/*
> >  		 * If there are currently no tasks on the cpu pretend there
> >  		 * is one of average load so that when a new task gets to
> > @@ -1638,10 +1639,14 @@ static int tg_shares_up(struct task_group *tg, void *data)
> >  		if (!weight)
> >  			weight = NICE_0_LOAD;
> >
> > -		rq_weight += weight;
> > +		sum_weight += weight;
> > +
> >  		shares += tg->cfs_rq[i]->shares;
> >  	}
> >
> > +	if (!rq_weight)
> > +		rq_weight = sum_weight;
> > +
> >  	if ((!shares && rq_weight) || shares > tg->shares)
> >  		shares = tg->shares;
> >
> >
>
> --
> Yasunori Goto

--
Yasunori Goto
