Re: CFS Bandwidth Control - Test results of cgroups tasks pinned vs unpinned

From: Peter Zijlstra
Date: Fri Sep 09 2011 - 08:31:26 EST


On Thu, 2011-09-08 at 20:45 +0530, Srivatsa Vaddagiri wrote:
> * Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> [2011-09-07 21:22:22]:
>
> > On Wed, 2011-09-07 at 20:50 +0530, Srivatsa Vaddagiri wrote:
> > >
> > > Fix excessive idle time reported when cgroups are capped.
> >
> > Where from? The whole idea of bandwidth caps is to introduce idle time,
> > so what's excessive and where does it come from?
>
> We have set up cgroups and their hard limits so that, in theory, they should
> consume the entire capacity available on the machine, leading to 0% idle time.
> That's not what we see. A more detailed description of the setup and the problem
> is here:
>
> https://lkml.org/lkml/2011/6/7/352

That's frigging irrelevant, isn't it? A patch should contain its own
justification.

> Machine : 16-cpus (2 Quad-core w/ HT enabled)
> Cgroups : 5 in number (C1-C5), each having {2, 2, 4, 8, 16} tasks respectively.
> Further, each task is placed in its own (sub-)cgroup with
> a capped usage of 50% CPU.

So that's loads: {512,512}, {512,512}, {256,256,256,256}, {128,..} and {64,..}
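(The per-task weights quoted above follow from each cgroup's default share of 1024 being split evenly among its runnable tasks. A minimal sketch of that arithmetic, assuming the kernel's default nice-0 weight of 1024 per cgroup:)

```python
# Sketch: derive the per-task load weights quoted above, assuming each
# cgroup carries the default share of 1024 (the nice-0 load weight),
# divided evenly among its runnable tasks.
NICE_0_LOAD = 1024

# Tasks per cgroup C1..C5, from the setup described in the thread.
tasks_per_cgroup = [2, 2, 4, 8, 16]

weights = [[NICE_0_LOAD // n] * n for n in tasks_per_cgroup]
for group, w in zip(["C1", "C2", "C3", "C4", "C5"], weights):
    print(group, w)
# C1 and C2 tasks weigh 512 each, C3 256, C4 128, C5 64 -- a badly
# skewed weight distribution for the balancer to spread over 16 CPUs.
```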

And you expect that to be balanced perfectly when a bandwidth cap is
introduced? I think you need some expectation adjustments.


> From what I could find out, the "excess" idle time crops up because the
> load-balancer is not perfect. For example, there are instances when a
> CPU has just 1 task on its runqueue (rather than the ideal number of 2
> tasks/cpu). When that lone task exceeds its 50% limit, the CPU is forced
> to become idle.
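(To make the capacity arithmetic behind the quoted claim concrete: 32 tasks, each capped at 50% of a CPU, exactly saturate 16 CPUs, so any observed idle time must come from imbalance such as the lone-task runqueue described above. A hedged sketch of that accounting, not kernel code:)

```python
# Sketch of the capacity arithmetic from the thread: 32 tasks, each
# hard-limited to 50% of one CPU, on a 16-CPU machine.
n_cpus = 16
tasks_per_cgroup = [2, 2, 4, 8, 16]
cap_per_task = 0.5  # 50% CPU quota per task's sub-cgroup

# Total demand in CPUs' worth of work: 32 * 0.5 = 16, i.e. the whole box.
total_demand = sum(tasks_per_cgroup) * cap_per_task
expected_idle = max(0.0, n_cpus - total_demand) / n_cpus
print(f"demand = {total_demand} CPUs, expected idle = {expected_idle:.0%}")

# But if the balancer leaves a single capped task alone on a CPU, that
# CPU idles for the other 50% of every period once the task is
# throttled -- the "excess" idle time being reported.
lone_task_cpu_idle = 1.0 - cap_per_task
print(f"lone-task CPU idle = {lone_task_cpu_idle:.0%}")
```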

So try and cure that instead of frobbing crap like this.

> > > The patch introduces the notion of "steal"
> >
> > The virt folks already claimed steal-time and have it mean something
> > entirely different. You get to pick a new name.
>
> grace time?

Well, ideally this frobbing of symptoms instead of fixing of causes
isn't going to happen at all; it's just retarded. And it most certainly
shouldn't be the first approach to any problem.


> > Ok, so this is a solution to an unstated problem. Why is it a good
> > solution?
>
> I am not sure if there are any "good" solutions to this problem!

Good, so then we're not going to do it, full stop.

> One
> possibility is to make the idle load balancer become aggressive in
> pulling tasks across sched-domain boundaries i.e when a CPU becomes idle
> (after a task got throttled) and invokes the idle load balancer, it
> should try "harder" at pulling a task from far-off cpus (across
> package/node boundaries)?

How about we just live with it? You set up a nearly impossible
(non-scalable) problem and then complain we don't do well. Tough fscking
luck, don't do that.

I mean, I'm all for improving things, but your frobbing here is just not
going to happen, most certainly not without very _very_ good
justification, and your patch frankly didn't have any.

Furthermore, your patch frobs the bandwidth accounting but doesn't spend
a single word explaining how, if at all, it keeps the accounting a
zero-sum game.

Seriously, you suck, your patch sucks and your method sucks. Go away.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/