[RFC PATCH 0/6] RLIMIT_NPROC in ucounts fixups

From: Michal Koutný
Date: Mon Feb 07 2022 - 08:13:19 EST


This series is the result of looking deeper into the breakage of
tools/testing/selftests/rlimits/rlimits-per-userns.c after
https://lore.kernel.org/r/20220204181144.24462-1-mkoutny@xxxxxxxx/
is applied.

The description of the original problem that led to the RLIMIT_NPROC et al.
ucounts rewrite could be ambiguously interpreted as supporting either
of these cases:
- a never-fork service, or
- a service that forks (RLIMIT_NPROC-1) times.

The scenario is weird anyway given the existence of the pids controller.

The realization of that scenario relies not only on tracking the number of
processes per user_ns but also newly allows root to override the limit through
set*uid. The commit message didn't mention that, so it's unclear whether it
was intended too.

I also noticed that RLIMIT_NPROC enforcement in fork seems subject to a TOCTOU
race (check(nr_tasks),...,nr_tasks++), so the limit is rather advisory (but
that's not a new thing related to the ucounts rewrite).
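
To illustrate the check(nr_tasks),...,nr_tasks++ pattern mentioned above, here
is a minimal userspace sketch. It is not kernel code: struct task_counter and
all function names are hypothetical, chosen only for this illustration. The
racy variant is split into separate check and commit steps so that the window
between them is visible; the second variant increments first and undoes the
increment when the limit is exceeded.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-user task counter (not a kernel struct). */
struct task_counter {
	atomic_long nr_tasks;
	long limit;
};

/* Racy pattern, step 1: only look at the counter. */
static bool racy_check(struct task_counter *c)
{
	return atomic_load(&c->nr_tasks) < c->limit;
}

/*
 * Racy pattern, step 2: increment some time after the check.
 * Anyone else who passed racy_check() in the meantime increments too.
 */
static void racy_commit(struct task_counter *c)
{
	atomic_fetch_add(&c->nr_tasks, 1);
}

/*
 * Increment-then-check: take a slot first and give it back when the
 * previous value already reached the limit. No more than `limit`
 * callers can succeed, although a caller may fail spuriously while
 * another failing caller's increment is still in flight.
 */
static bool charge_inc_then_check(struct task_counter *c)
{
	long prev = atomic_fetch_add(&c->nr_tasks, 1);

	if (prev >= c->limit) {
		atomic_fetch_sub(&c->nr_tasks, 1);
		return false;
	}
	return true;
}
```

With limit set to 1, two callers that both run racy_check() before either
runs racy_commit() leave nr_tasks at 2, whereas two calls to
charge_inc_then_check() admit only the first one.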

This series is an RFC to discuss the relevance of the subtle changes that the
RLIMIT_NPROC to ucounts rewrite introduced.

Michal Koutný (6):
  set_user: Perform RLIMIT_NPROC capability check against new user
    credentials
  set*uid: Check RLIMIT_PROC against new credentials
  cred: Count tasks by their real uid into RLIMIT_NPROC
  ucounts: Allow root to override RLIMIT_NPROC
  selftests: Challenge RLIMIT_NPROC in user namespaces
  selftests: Test RLIMIT_NPROC in clone-created user namespaces

 fs/exec.c                                     |   2 +-
 include/linux/cred.h                          |   2 +-
 kernel/cred.c                                 |  29 ++-
 kernel/fork.c                                 |   2 +-
 kernel/sys.c                                  |  20 +-
 kernel/ucount.c                               |   3 +
 kernel/user_namespace.c                       |   2 +-
 .../selftests/rlimits/rlimits-per-userns.c    | 233 +++++++++++++++---
 8 files changed, 229 insertions(+), 64 deletions(-)

--
2.34.1