Re: [PATCH 2/3] sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem

From: Tejun Heo
Date: Tue May 19 2015 - 11:51:48 EST


Hello, Peter.

On Tue, May 19, 2015 at 05:16:59PM +0200, Peter Zijlstra wrote:
> .gitconfig:
>
> [diff "default"]
> xfuncname = "^[[:alpha:]$_].*[^:]$"
>
> Will avoid keying on labels like that and show us this is
> __cgroup_procs_write().

Ah, nice trick.
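
For anyone who wants to check what that pattern accepts, here is a minimal user-space sketch using POSIX regcomp() with REG_EXTENDED, on the assumption that it approximates how git evaluates xfuncname. The helper name is_hunk_header() and the sample lines in the usage note are made up for illustration; they are not taken from git or from the cgroup source.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* The xfuncname pattern quoted above, as a POSIX extended regex. */
static const char *XFUNCNAME = "^[[:alpha:]$_].*[^:]$";

/*
 * Hypothetical helper: returns true if `line` matches the pattern, i.e.
 * it starts in column 0 with a letter, '$' or '_' and does not end in ':'.
 */
static bool is_hunk_header(const char *line)
{
	regex_t re;
	bool match;

	if (regcomp(&re, XFUNCNAME, REG_EXTENDED | REG_NOSUB) != 0)
		return false;	/* treat a compile failure as "no match" */

	match = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}
```

A function definition line starting in column 0 matches, while a label such as "out_unlock:" is rejected because of its trailing colon, and an indented statement is rejected because it does not start with a letter, '$' or '_'.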

> So my only worry with this patch-set is that these operations will be
> hugely expensive.
>
> Now it looks like the cgroup_update_dfl_csses() thing is very rare, it's
> when you change which controllers are active in a given subtree under
> the uber-l337-super-comount design.
>
> The other one, __cgroup_procs_write() is every /procs, /tasks write to a
> cgroup, and that does worry me, this could be a somewhat common thing.
>
> The Changelog states task migration is a cold path, but is tens of
> milliseconds per task really no problem?

The latency is bounded by synchronize_sched_expedited(). Given the way
cgroups are used in the majority of setups (process migration happening
only during service / session setup), I think this should be okay.

I agree that something closer to lglock in characteristics would fit
the workload better, though. If this actually becomes a problem, we
can come up with a different percpu locking scheme which puts a bit
more overhead on the reader side to reduce the latency / overhead on
the writer side. That shouldn't be too difficult, but let's see
whether we need to get there at all.
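
To make the lglock comparison a bit more concrete, here is a rough user-space sketch of that style of lock: one lock per slot, readers take only their own slot, and the writer takes all of them. This is not the kernel's lglock or percpu_rwsem code. The names br_lock and NR_SLOTS are hypothetical, and pthread mutexes stand in for per-CPU spinlocks.

```c
#include <pthread.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

/* One mutex per slot; in the kernel these would be per-CPU locks. */
struct br_lock {
	pthread_mutex_t slot[NR_SLOTS];
};

static void br_init(struct br_lock *l)
{
	for (int i = 0; i < NR_SLOTS; i++)
		pthread_mutex_init(&l->slot[i], NULL);
}

/* Reader side: touch only the caller's own slot. */
static void br_read_lock(struct br_lock *l, unsigned int cpu)
{
	pthread_mutex_lock(&l->slot[cpu % NR_SLOTS]);
}

static void br_read_unlock(struct br_lock *l, unsigned int cpu)
{
	pthread_mutex_unlock(&l->slot[cpu % NR_SLOTS]);
}

/* Writer side: take every slot, always in ascending order. */
static void br_write_lock(struct br_lock *l)
{
	for (int i = 0; i < NR_SLOTS; i++)
		pthread_mutex_lock(&l->slot[i]);
}

/* Release in reverse order of acquisition. */
static void br_write_unlock(struct br_lock *l)
{
	for (int i = NR_SLOTS - 1; i >= 0; i--)
		pthread_mutex_unlock(&l->slot[i]);
}
```

The trade-off is visible in the loop bounds: each reader pays for one uncontended lock operation, while the writer pays for NR_SLOTS acquisitions but never has to wait for a grace period. Note that in this sketch two readers mapped to the same slot exclude each other, which is a simplification.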

Thanks.

--
tejun