Re: [PATCH v2 1/5] rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user

From: Solar Designer
Date: Wed Feb 16 2022 - 12:42:37 EST


On Wed, Feb 16, 2022 at 09:58:28AM -0600, Eric W. Biederman wrote:
> Solar Designer <solar@xxxxxxxxxxxx> wrote:
> > I'm not aware of anyone actually running into this issue and reporting
> > it. The systems that I personally know use suexec along with rlimits
> > still run older/distro kernels, so would not yet be affected.
> >
> > So my mention was based on my understanding of how suexec works, and
> > code review. Specifically, Apache httpd has the setting RLimitNPROC,
> > which makes it set RLIMIT_NPROC:
> >
> > https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc
> >
> > The above documentation for it includes:
> >
> > "This applies to processes forked from Apache httpd children servicing
> > requests, not the Apache httpd children themselves. This includes CGI
> > scripts and SSI exec commands, but not any processes forked from the
> > Apache httpd parent, such as piped logs."
> >
> > In code, there are:
> >
> > ./modules/generators/mod_cgid.c: ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
> > ./modules/generators/mod_cgi.c: ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
> > ./modules/filters/mod_ext_filter.c: rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc);
> >
> > For example, in mod_cgi.c this is in run_cgi_child().
> >
> > I think this means an httpd child sets RLIMIT_NPROC shortly before it
> > execs suexec, which is a SUID root program. suexec then switches to the
> > target user and execs the CGI script.
> >
> > Before 2863643fb8b9, the setuid() in suexec would set the flag, and the
> > target user's process count would be checked against RLIMIT_NPROC on
> > execve(). After 2863643fb8b9, the setuid() in suexec wouldn't set the
> > flag because setuid() is (naturally) called when the process is still
> > running as root (thus, has those limits bypass capabilities), and
> > accordingly execve() would not check the target user's process count
> > against RLIMIT_NPROC.
>
> In commit 2863643fb8b9 ("set_user: add capability check when
> rlimit(RLIMIT_NPROC) exceeds") capable calls were added to set_user to
> make it more consistent with fork. Unfortunately because of call site
> differences those capables calls were checking the credentials of the

s/capables/capable/

> user before set*id() instead of after set*id().
>
> This breaks enforcement of RLIMIT_NPROC for applications that set the
> rlimit and then call set*id() while holding a full set of
> capabilities. The capabilities are only changed in the new credential
> in security_task_fix_setuid().
>
> The code in apache suexec appears to follow this pattern.
>
> Commit 909cc4ae86f3 ("[PATCH] Fix two bugs with process limits
> (RLIMIT_NPROC)") where this check was added describes the targes of this
> capability check as:
>
> 2/ When a root-owned process (e.g. cgiwrap) sets up process limits and then
> calls setuid, the setuid should fail if the user would then be running
> more than rlim_cur[RLIMIT_NPROC] processes, but it doesn't. This patch
> adds an appropriate test. With this patch, and per-user process limit
> imposed in cgiwrap really works.
>
> So the original use case also of this check also appears to match the broken
> pattern.

Duplicate "also" - drop one.

> Restore the enforcement of RLIMIT_NPROC by removing the bad capable
> checks added in set_user. This unfortunately restores the
> inconsistencies state the code has been in for the last 11 years, but

s/inconsistencies/inconsistent/

> dealing with the inconsistencies looks like a larger problem.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/all/20210907213042.GA22626@xxxxxxxxxxxx/
> Link: https://lkml.kernel.org/r/20220212221412.GA29214@xxxxxxxxxxxx
> Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
> History-Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
> Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> ---
> kernel/sys.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/sys.c b/kernel/sys.c
> index ecc4cf019242..8dd938a3d2bf 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -480,8 +480,7 @@ static int set_user(struct cred *new)
> * failure to the execve() stage.
> */
> if (is_ucounts_overlimit(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) &&
> - new_user != INIT_USER &&
> - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
> + new_user != INIT_USER)
> current->flags |= PF_NPROC_EXCEEDED;
> else
> current->flags &= ~PF_NPROC_EXCEEDED;

Reviewed-by: Solar Designer <solar@xxxxxxxxxxxx>

Alexander