Re: [PATCH] Drop reference added by grab_header

From: zhouchengming
Date: Thu Jan 05 2017 - 07:15:40 EST


On 2017/1/5 19:56, Hanjun Guo wrote:
On 2017/1/5 19:33, Zhou Chengming wrote:
Fixes CVE-2016-9191.

CVE-2016-9191 says that it's cgroup bug but turns out it's
not, I think you need to add more commit message to
explain it? For example, we got different calltrace stack
but all of them point to drop_sysctl_table() and it turns out
a reference count bug.

Thanks
Hanjun

Well, I put the calltrace here.

[ 5535.960522] Call Trace:
[ 5535.963265] [<ffffffff817cdaaf>] schedule+0x3f/0xa0
[ 5535.968817] [<ffffffff817d33fb>] schedule_timeout+0x3db/0x6f0
[ 5535.975346] [<ffffffff817cf055>] ? wait_for_completion+0x45/0x130
[ 5535.982256] [<ffffffff817cf0d3>] wait_for_completion+0xc3/0x130
[ 5535.988972] [<ffffffff810d1fd0>] ? wake_up_q+0x80/0x80
[ 5535.994804] [<ffffffff8130de64>] drop_sysctl_table+0xc4/0xe0
[ 5536.001227] [<ffffffff8130de17>] drop_sysctl_table+0x77/0xe0
[ 5536.007648] [<ffffffff8130decd>] unregister_sysctl_table+0x4d/0xa0
[ 5536.014654] [<ffffffff8130deff>] unregister_sysctl_table+0x7f/0xa0
[ 5536.021657] [<ffffffff810f57f5>] unregister_sched_domain_sysctl+0x15/0x40
[ 5536.029344] [<ffffffff810d7704>] partition_sched_domains+0x44/0x450
[ 5536.036447] [<ffffffff817d0761>] ? __mutex_unlock_slowpath+0x111/0x1f0
[ 5536.043844] [<ffffffff81167684>] rebuild_sched_domains_locked+0x64/0xb0
[ 5536.051336] [<ffffffff8116789d>] update_flag+0x11d/0x210
[ 5536.057373] [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
[ 5536.064186] [<ffffffff81167acb>] ? cpuset_css_offline+0x1b/0x60
[ 5536.070899] [<ffffffff810fce3d>] ? trace_hardirqs_on+0xd/0x10
[ 5536.077420] [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
[ 5536.084234] [<ffffffff8115a9f5>] ? css_killed_work_fn+0x25/0x220
[ 5536.091049] [<ffffffff81167ae5>] cpuset_css_offline+0x35/0x60
[ 5536.097571] [<ffffffff8115aa2c>] css_killed_work_fn+0x5c/0x220
[ 5536.104207] [<ffffffff810bc83f>] process_one_work+0x1df/0x710
[ 5536.110736] [<ffffffff810bc7c0>] ? process_one_work+0x160/0x710
[ 5536.117461] [<ffffffff810bce9b>] worker_thread+0x12b/0x4a0
[ 5536.123697] [<ffffffff810bcd70>] ? process_one_work+0x710/0x710
[ 5536.130426] [<ffffffff810c3f7e>] kthread+0xfe/0x120
[ 5536.135991] [<ffffffff817d4baf>] ret_from_fork+0x1f/0x40
[ 5536.142041] [<ffffffff810c3e80>] ? kthread_create_on_node+0x230/0x230

And one cgroup maintainer mentioned that "cgroup is trying to offline
a cpuset css, which takes place under cgroup_mutex. The offlining
ends up trying to drain active usages of a sysctl table which apprently
is not happening." The real reason is that proc_sys_readdir doesn't
drop reference added by grab_header when return from !dir_emit_dots path.

Thanks.


Reported-by: CAI Qian<caiqian@xxxxxxxxxx>
Tested-by: Yang Shukui<yangshukui@xxxxxxxxxx>
Signed-off-by: Zhou Chengming<zhouchengming1@xxxxxxxxxx>
---
fs/proc/proc_sysctl.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5d931bf..c4c90bd 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -718,7 +718,7 @@ static int proc_sys_readdir(struct file *file, struct dir_context *ctx)
ctl_dir = container_of(head, struct ctl_dir, header);

if (!dir_emit_dots(file, ctx))
- return 0;
+ goto out;

pos = 2;

@@ -728,6 +728,7 @@ static int proc_sys_readdir(struct file *file, struct dir_context *ctx)
break;
}
}
+out:
sysctl_head_finish(head);
return 0;
}



.