Re: [PATCH v2] proc: use ns_capable instead of capable for timerslack_ns

From: Eric W. Biederman
Date: Wed Oct 31 2018 - 00:32:11 EST


Benjamin Gordon <bmgordon@xxxxxxxxxx> writes:

> Access to timerslack_ns is controlled by a process having CAP_SYS_NICE
> in its effective capability set, but the current check looks in the root
> namespace instead of the process' user namespace. Since a process is
> allowed to do other activities controlled by CAP_SYS_NICE inside a
> namespace, it should also be able to adjust timerslack_ns.

Acked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

I don't see any fundamental problems with how the process's user
namespace is being accessed. You can race with setns, and that may
result in a descendant user namespace of the current user namespace
being set. But if you have permissions in the parent user namespace
you will have permissions over a child user namespace, so the race
can't affect the outcome of the ns_capable test.
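
To make the hierarchy argument concrete, here is a minimal userspace
sketch of the rule that capability over a user namespace extends to its
descendants. It is a simplified model, not the kernel code: the toy_*
types and toy_ns_capable() are hypothetical names invented for
illustration, and a single boolean stands in for CAP_SYS_NICE in the
effective set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a user namespace: only the parent link and the
 * uid of the creator matter for this sketch. */
struct toy_userns {
	const struct toy_userns *parent;	/* NULL for the initial namespace */
	unsigned int owner_uid;
};

/* Toy stand-in for a task's credentials. */
struct toy_cred {
	const struct toy_userns *user_ns;
	unsigned int euid;
	bool has_cap;	/* models CAP_SYS_NICE being in the effective set */
};

/* Walk from the target namespace up towards the root. The caller is
 * capable if the walk reaches the caller's own namespace while the
 * capability is held, or if the caller owns a namespace that is a
 * direct child of the caller's own namespace. */
static bool toy_ns_capable(const struct toy_cred *cred,
			   const struct toy_userns *target)
{
	assert(cred != NULL && target != NULL);

	for (const struct toy_userns *ns = target; ns; ns = ns->parent) {
		if (ns == cred->user_ns)
			return cred->has_cap;
		if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
			return true;
	}
	return false;
}
```

In this model a caller that passes the check against its own namespace
also passes it against any descendant, so a target that moves into a
descendant namespace between lookup and check gets the same answer.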

Also, while __task_cred(p) may change, it is guaranteed that there is a
valid one until __put_task_struct, which only happens when a process has
a zero refcount. The success of get_proc_task before these checks
already ensures that is not the case.

So from my perspective this looks like a reasonable change.

I don't know how this looks to people who understand the timer bits
and what timerslack does. I suspect it is reasonable, as there is no
permission check for changing yourself.
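
For comparison, a task can already adjust its own slack from userspace
with prctl(2) without holding CAP_SYS_NICE. The sketch below shows that
self-service path; set_own_timerslack_ns() is a hypothetical helper name,
and it assumes a Linux system where PR_SET_TIMERSLACK is available.

```c
#include <sys/prctl.h>

/* Set the calling thread's timer slack (in nanoseconds) and return the
 * value the kernel reports back, or -1 if either prctl() call fails. */
static long set_own_timerslack_ns(unsigned long slack_ns)
{
	if (prctl(PR_SET_TIMERSLACK, slack_ns, 0, 0, 0) != 0)
		return -1;

	/* PR_GET_TIMERSLACK returns the current slack as the result. */
	return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}
```

Calling set_own_timerslack_ns(75000) as an unprivileged user would be
expected to read back 75000 on a kernel that supports the option.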

Eric

> Signed-off-by: Benjamin Gordon <bmgordon@xxxxxxxxxx>
> Cc: John Stultz <john.stultz@xxxxxxxxxx>
> Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: "Serge E. Hallyn" <serge@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
> Cc: Oren Laadan <orenl@xxxxxxxxxxx>
> Cc: Ruchi Kandoi <kandoiruchi@xxxxxxxxxx>
> Cc: Rom Lemarchand <romlem@xxxxxxxxxxx>
> Cc: Todd Kjos <tkjos@xxxxxxxxxx>
> Cc: Colin Cross <ccross@xxxxxxxxxxx>
> Cc: Nick Kralevich <nnk@xxxxxxxxxx>
> Cc: Dmitry Shmidt <dimitrysh@xxxxxxxxxx>
> Cc: Elliott Hughes <enh@xxxxxxxxxx>
> Cc: Android Kernel Team <kernel-team@xxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
>
> Changes from v1:
> - Use the namespace of the target process instead of the file opener.
> Didn't carry over John Stultz' Acked-by since the changes aren't
> cosmetic.
>
> fs/proc/base.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index c78d8da09b52c..bdc093ba81dd3 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2385,10 +2385,13 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
>  		return -ESRCH;
>
>  	if (p != current) {
> -		if (!capable(CAP_SYS_NICE)) {
> +		rcu_read_lock();
> +		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
> +			rcu_read_unlock();
>  			count = -EPERM;
>  			goto out;
>  		}
> +		rcu_read_unlock();
>
>  		err = security_task_setscheduler(p);
>  		if (err) {
> @@ -2421,11 +2424,14 @@ static int timerslack_ns_show(struct seq_file *m, void *v)
>  		return -ESRCH;
>
>  	if (p != current) {
> -
> -		if (!capable(CAP_SYS_NICE)) {
> +		rcu_read_lock();
> +		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
> +			rcu_read_unlock();
>  			err = -EPERM;
>  			goto out;
>  		}
> +		rcu_read_unlock();
> +
>  		err = security_task_getscheduler(p);
>  		if (err)
>  			goto out;