Re: [PATCH v1] sysctl: Allow change system v ipc sysctls inside ipc namespace

From: Alexey Gladkov
Date: Tue Aug 16 2022 - 11:49:02 EST


On Mon, Jul 25, 2022 at 11:16:07AM -0500, Eric W. Biederman wrote:
> Alexey Gladkov <legion@xxxxxxxxxx> writes:
>
> > Rootless containers are not allowed to modify kernel IPC parameters such
> > as kernel.msgmnb.
> >
> > It seems to me that we can allow customization of these parameters if
> > the user has CAP_SYS_RESOURCE in that ipc namespace.
> >
> > CAP_SYS_RESOURCE is already needed in order to overcome mqueue limits
> > (msg_max and msgsize_max).
>
>
> For changing the permissions on who can modify the SysV limits, I don't
> think this change is safe. I don't see anything that will prevent abuse
> if anyone can modify these limits. Replacing the ordinary unix DAC
> permission check with ns_capable will allow anyone to modify the limits.

All limits default to nearly the maximum value (ULONG_MAX). Limit values
are not inherited, and usage is counted separately in each ipc namespace
(shm_tot is not global; it lives in ipc_ns). In effect, the limits are
disabled by default, so they can only be reduced.
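
To illustrate the point, here is a minimal userspace sketch, not the
kernel code. The struct and function names (ipc_ns_model,
ipc_ns_model_init, model_shm_charge) are made up for this example, and
the default is assumed to match the uapi SHMALL definition. Each
namespace starts with its own counter and its own near-ULONG_MAX limit,
so lowering the limit in one namespace does not affect another.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of the per-namespace shm accounting fields. */
struct ipc_ns_model {
	unsigned long shm_ctlall;	/* limit on total pages, per namespace */
	unsigned long shm_tot;		/* pages in use, per namespace */
};

/* Assumed to mirror the uapi default: ULONG_MAX - (1UL << 24). */
#define MODEL_SHMALL (ULONG_MAX - (1UL << 24))

/* A new namespace gets fresh defaults; nothing comes from a parent. */
static void ipc_ns_model_init(struct ipc_ns_model *ns)
{
	ns->shm_ctlall = MODEL_SHMALL;
	ns->shm_tot = 0;
}

/* Charge pages to this namespace only; refuse if the limit is hit. */
static bool model_shm_charge(struct ipc_ns_model *ns, unsigned long pages)
{
	unsigned long new_tot = ns->shm_tot + pages;

	if (new_tot < ns->shm_tot)	/* counter wrapped around */
		return false;
	if (new_tot > ns->shm_ctlall)	/* over this namespace's limit */
		return false;
	ns->shm_tot = new_tot;
	return true;
}
```

With two such namespaces, setting shm_ctlall to 10 in the first one
makes a second charge there fail, while the other namespace still
accepts allocations against its own default.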

> That said there is RLIMIT_MSGQUEUE that limits the posix messages queues
> so those should be safe to allow anyone to modify their limits.
>
> The code in mqueue_get_inode is where that limiting happens.
>
> For the posix message queues all that should be needed is to change the
> owner of the sysctl files from the global root to the user namespace
> root. There are also two capable calls in ipc/mqueue.c that can
> probably be changed to ns_capable calls.
>
>
> The only posix message queue limit that I don't immediately see
> something that will prevent abuse of is /proc/sys/fs/mqueue/queues_max.
> That probably still runs into RLIMIT_MSGQUEUE somewhere but it was
> not immediately obvious at first glance.

Everything always ends up in mqueue_get_inode. In mqueue_create_attr we
check mq_queues_max and then call mqueue_get_inode almost immediately.
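
Roughly, the ordering looks like the userspace sketch below. This is a
simplified model under my own assumptions, not the code from
ipc/mqueue.c: the struct and function names are invented, the byte
count is reduced to maxmsg * msgsize, and the capability check is
passed in as a plain flag. It only shows that the queue count check
comes first and that the per-user RLIMIT_MSGQUEUE charge is still
applied afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of the per-namespace queue counters. */
struct mq_ns_model {
	unsigned int queues_count;
	unsigned int queues_max;
};

/* Hypothetical model of the per-user byte accounting. */
struct mq_user_model {
	unsigned long mq_bytes;		/* bytes already charged */
	unsigned long rlimit_msgqueue;	/* stands in for RLIMIT_MSGQUEUE */
};

/* Stand-in for the accounting step done in mqueue_get_inode. */
static int model_get_inode(struct mq_user_model *u,
			   unsigned long maxmsg, unsigned long msgsize)
{
	unsigned long bytes;

	if (msgsize && maxmsg > ULONG_MAX / msgsize)
		return -EINVAL;		/* product would overflow */
	bytes = maxmsg * msgsize;	/* simplified size estimate */

	if (u->mq_bytes + bytes < u->mq_bytes ||
	    u->mq_bytes + bytes > u->rlimit_msgqueue)
		return -EMFILE;		/* per-user limit exceeded */
	u->mq_bytes += bytes;
	return 0;
}

/* Stand-in for mqueue_create_attr: count check, then accounting. */
static int model_create(struct mq_ns_model *ns, struct mq_user_model *u,
			bool cap_sys_resource,
			unsigned long maxmsg, unsigned long msgsize)
{
	int err;

	if (ns->queues_count >= ns->queues_max && !cap_sys_resource)
		return -ENOSPC;
	ns->queues_count++;
	err = model_get_inode(u, maxmsg, msgsize);
	if (err)
		ns->queues_count--;	/* undo on failure */
	return err;
}
```

In this model, even a caller that bypasses queues_max still gets
-EMFILE once the per-user byte limit is exhausted.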

I suggest allowing root in the user namespace to change the limits of
its ipc namespace.

--

Alexey Gladkov (3):
  sysctl: Allow change system v ipc sysctls inside ipc namespace
  sysctl: Allow to change limits for posix messages queues
  docs: Add information about ipc sysctls limitations

 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++--
 ipc/ipc_sysctl.c                            | 34 ++++++++++++++++---
 ipc/mq_sysctl.c                             | 36 +++++++++++++++++++++
 3 files changed, 76 insertions(+), 8 deletions(-)

--
2.33.4