Re: [RFC PATCH 0/4] Gang scheduling in CFS

From: Benjamin Herrenschmidt
Date: Mon Dec 19 2011 - 17:11:24 EST


On Mon, 2011-12-19 at 16:51 +0100, Peter Zijlstra wrote:
> On Mon, 2011-12-19 at 14:03 +0530, Nikunj A. Dadhania wrote:
> > The following patches implement gang scheduling. These patches
> > are *highly* experimental in nature and are not proposed for
> > inclusion at this time.
>
> Nor will they ever be, I've always strongly opposed the whole concept
> and I'm not about to change my mind. Gang scheduling is a scalability
> nightmare.
>
> > Gang scheduling can be helpful in virtualization scenarios. It will
> > help in avoiding the lock-holder-preemption[1] problem, and other
> > benefits include improved lock-acquisition times. This feature
> > will help address some limitations of KVM on Power.
>
> Use paravirt ticket locks or a pause-loop-filter like thing.
>
> > On Power, we have an interesting hardware restriction on guests
> > running across SMT threads: on any single core, we can only run one
> > mm context at any given time.
>
> OMFG are your hardware engineers insane?

No, we can run separate mm contexts, but we can only run one -partition-
at a time. Sadly, the host kernel is also a partition as far as the MMU
is concerned, so all 4 threads of a core must be running the same guest
and must enter/exit the guest at the same time.
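
To make the constraint concrete, here is a rough sketch in C. It is only
a toy model, not code from the patches or from KVM: the struct, the
function names and the HOST_PARTITION id are all made up for
illustration, and the thread count of 4 is taken from the paragraph
above.

```c
#include <stdbool.h>
#include <stddef.h>

#define THREADS_PER_CORE 4 /* "all 4 threads", as described above */
#define HOST_PARTITION   0 /* hypothetical id: the host is a partition too */

/* Hypothetical model: the partition each SMT thread of one core is in. */
struct core_state {
	int partition[THREADS_PER_CORE];
};

/* The constraint: every thread of the core is in the same partition. */
static bool core_state_valid(const struct core_state *c)
{
	for (size_t i = 1; i < THREADS_PER_CORE; i++)
		if (c->partition[i] != c->partition[0])
			return false;
	return true;
}

/* Guest entry/exit in this model: all threads switch together. */
static void core_switch_partition(struct core_state *c, int partition)
{
	for (size_t i = 0; i < THREADS_PER_CORE; i++)
		c->partition[i] = partition;
}
```

In this model, a core where one thread has dropped back to the host
while the other three are still in the guest fails core_state_valid(),
which is the situation the hardware does not allow.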

> Anyway, I had a look at your patches and I don't see how they could ever
> work. You gang-schedule cgroup entities, but there's no guarantee the
> load-balancer will have at least one task for each group on every cpu.

Cheers,
Ben.


