Re: [PATCH] capabilities: add a option PR_SET_CAPS for sys_prctl

From: Paul Moore
Date: Thu Oct 19 2023 - 14:57:53 EST


On Wed, Oct 18, 2023 at 8:20 AM Yunhui Cui <cuiyunhui@xxxxxxxxxxxxx> wrote:
>
> By infecting the container process, the already running container is
> cloned, which means that each process of the container forks
> independently. But the process in the container lacks some permissions
> that cannot be completed.
>
> For a container that is already running, we cannot modify the
> configuration and restart it to complete the permission elevation.
> Since capset() can only complete the setting of a subset of the
> capabilities of the process, it cannot meet the requirements for
> raising permissions. So an option is added to prctl() to complete it.

I'm having a difficult time understanding the description above, would
it be possible for you to explain the problem differently?

> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 82cb4210ba50..9a8dae2be801 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -246,6 +246,7 @@ struct prctl_mm_map {
> # define PR_SCHED_CORE_SHARE_FROM 3 /* pull core_sched cookie to pid */
> # define PR_SCHED_CORE_MAX 4
>
> +#define PR_SET_CAPS 63
> /* Clone and personalize thread */
> #define PR_PERSONALIZED_CLONE 1000
> /* Isolation eventfd & epollfd during fork */
> diff --git a/kernel/capability.c b/kernel/capability.c
> index 1444f3954d75..968edd8b3564 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -266,11 +270,17 @@ SYSCALL_DEFINE2(capset, cap_user_header_t, header, const cap_user_data_t, data)
> if (!new)
> return -ENOMEM;
>
> - ret = security_capset(new, current_cred(),
> - &effective, &inheritable, &permitted);
> - if (ret < 0)
> - goto error;
> -
> + if (!prctl) {
> + ret = security_capset(new, current_cred(),
> + &effective, &inheritable, &permitted);
> + if (ret < 0)
> + goto error;
> + } else {
> + ret = __capset(new, current_cred(),
> + &effective, &inheritable, &permitted);
> + if (ret < 0)
> + goto error;
> + }

It isn't clear to me why we would want to avoid the security_capset()
hook in the prctl case; the change to the task's capabilities is the
same, yes? If the goal of prctl(PR_SET_CAPS, ...) is to bypass the
security_capset() controls, you're going to need to do a much better
job of explaining why this is necessary.

> audit_log_capset(new, current_cred());
>
> return commit_creds(new);

--
paul-moore.com