Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

From: Peter Zijlstra
Date: Wed Jul 13 2016 - 04:21:37 EST


On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
> Hey Tejun,
>
> So Dmitry Shmidt recently noticed that with 4.4 based systems we're
> seeing quite a bit of performance overhead from
> __cgroup_procs_write().
>
> With the 4.4 tree as it stands, we're seeing __cgroup_procs_write()
> quite often take tens of milliseconds to execute (with max times up
> in the 80ms range).
>
> While with 4.1 it was quite often in the single usec range, and max
> time values still in the sub-millisecond range.
>
> The majority of these performance regressions seem to come from the
> locking changes in:
>
> 3014dde762f6 ("cgroup: simplify threadgroup locking")
> and
> 1ed1328792ff ("sched, cgroup: replace signal_struct->group_rwsem with
> a global percpu_rwsem")
>
> Dmitry has found that by reverting these two changes (which don't
> revert easily), we can get back down to the 10-100 usec range for
> most calls, with max values occasionally spiking to ~18ms.
>
> Those two commits do talk about performance regressions, that were
> supposedly alleviated by percpu_rwsem changes, but I'm not sure we are
> seeing this.

Do you have 'funny' RCU options that quickly force a grace period when
you go idle or something?

But yes, it does not surprise me to find this commit is causing
problems.