Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

From: John Stultz
Date: Thu Jul 14 2016 - 13:30:55 EST


On Thu, Jul 14, 2016 at 10:13 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 07/14, John Stultz wrote:
>>
>> So I am seeing synchronize_sched called, and it's taking the
>> !rcu_gp_is_expedited path when I see the particularly bad latencies.
>>
>> I wonder if I just mucked up applying the patch?
>
> Probably yes...

Hm. So I applied peterz's patch to 4.7-rc7 and then diffed it against what
I had, and the only differences were whitespace changes.

I've synced them up now, so I suspect the way I applied the patch isn't
the issue. Just to be clear, I'm not supposed to be applying this on top of
Paul's change, right?


> Just in case, could you try the patch below? Of course, without other
> optimizations from Peter, this change makes cgroup_threadgroup_rwsem
> much worse than a plain rw_semaphore.
>
> Oleg.
>
> --- x/kernel/cgroup.c
> +++ x/kernel/cgroup.c
> @@ -5605,6 +5605,8 @@ int __init cgroup_init(void)
>  	BUG_ON(cgroup_init_cftypes(NULL, cgroup_dfl_base_files));
>  	BUG_ON(cgroup_init_cftypes(NULL, cgroup_legacy_base_files));
>
> +	rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
> +


So adding this does make a huge difference on top of Peter's patch. I'm
seeing sub-200us values for everything. The biggest spike in my basic
testing has been 138us.

I'm also not seeing synchronize_sched being called nearly as often,
and it doesn't seem to be called in the cgroup_procs_write path.
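
A rough way to picture why the extra rcu_sync_enter() would cut down on
synchronize_sched calls is the toy userspace model below. It is a sketch
under stated assumptions, not the kernel's rcu_sync code: the toy_* names
are hypothetical, a counter stands in for synchronize_sched, and it
assumes writes are spaced far enough apart that the state has gone back
to idle by the time the next writer arrives (the real code returns to
idle via a deferred callback). In this model, only an enter that finds
the count at zero has to wait for a grace period, so holding one
reference from init means later writers never do.

```c
#include <stdbool.h>

/* Toy userspace model of rcu_sync; NOT the kernel implementation. */
struct toy_rcu_sync {
	int gp_count;      /* outstanding toy_rcu_sync_enter() calls */
	int grace_periods; /* times a caller had to wait (synchronize_sched stand-in) */
};

static void toy_rcu_sync_enter(struct toy_rcu_sync *rss)
{
	/* Only the caller that moves the count off zero pays for a grace period. */
	if (rss->gp_count++ == 0)
		rss->grace_periods++;
}

static void toy_rcu_sync_exit(struct toy_rcu_sync *rss)
{
	/* Assumption: dropping to zero returns the model to idle immediately. */
	rss->gp_count--;
}

/*
 * Hypothetical helper: count the grace-period waits incurred by nr_writes
 * write-side lock/unlock pairs, with or without an unpaired enter at init.
 */
static int toy_count_grace_periods(bool enter_at_init, int nr_writes)
{
	struct toy_rcu_sync rss = { 0, 0 };
	int i;

	if (enter_at_init)
		toy_rcu_sync_enter(&rss); /* never paired with an exit */

	for (i = 0; i < nr_writes; i++) {
		toy_rcu_sync_enter(&rss); /* write-lock side */
		toy_rcu_sync_exit(&rss);  /* write-unlock side */
	}
	return rss.grace_periods;
}
```

With five spaced-out writes, the model waits five times without the
init-time enter and only once (at init) with it. The cost, as Oleg notes
above, lands on the read side instead.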

thanks
-john