Re: [PATCH v2 6/6] shmem: add support to ignore swap

From: Luis Chamberlain
Date: Tue Apr 18 2023 - 17:51:36 EST


On Tue, Apr 18, 2023 at 09:38:10AM +0200, Christian Brauner wrote:
> On Mon, Apr 17, 2023 at 10:50:59PM -0700, Hugh Dickins wrote:
> > On Thu, 9 Mar 2023, Luis Chamberlain wrote:
> >
> > > In doing experimentations with shmem having the option to avoid swap
> > > becomes a useful mechanism. One of the *raves* about brd over shmem is
> > > you can avoid swap, but that's not really a good reason to use brd if
> > > we can instead use shmem. Using brd has its own good reasons to exist,
> > > but just because "tmpfs" doesn't let you do that is not a great reason
> > > to avoid it if we can easily add support for it.
> > >
> > > I don't add support for reconfiguring incompatible options, but if
> > > we really wanted to we can add support for that.
> > >
> > > To avoid swap we use mapping_set_unevictable() upon inode creation,
> > > and put a WARN_ON_ONCE() stop-gap on writepages() for reclaim.
> >
> > I have one big question here, which betrays my ignorance:
> > I hope that you or Christian can reassure me on this.
> >
> > tmpfs has fs_flags FS_USERNS_MOUNT. I know nothing about namespaces,
> > nothing; but from overhearings, wonder if an ordinary user in a namespace
> > might be able to mount their own tmpfs with "noswap", and thereby evade
> > all accounting of the locked memory.
> >
> > That would be an absolute no-no for this patch; but I assume that even
> > if so, it can be easily remedied by inserting an appropriate (unknown
> > to me!) privilege check where the "noswap" option is validated.
>
> Oh, good catch. Thanks! So you would just need something like:
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 787e83791eb5..21ce9b26bb4d 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3571,6 +3571,10 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
>  		ctx->seen |= SHMEM_SEEN_INUMS;
>  		break;
>  	case Opt_noswap:
> +		if ((fc->user_ns != &init_user_ns) || !capable(CAP_SYS_ADMIN)) {
> +			return invalfc(fc,
> +				       "Turning off swap in unprivileged tmpfs mounts unsupported");
> +		}
>  		ctx->noswap = true;
>  		ctx->seen |= SHMEM_SEEN_NOSWAP;
>  		break;
>
> The fc->user_ns is the userns that the tmpfs mount will be mounted in, i.e.,
> fc->user_ns will become sb->s_user_ns if FS_USERNS_MOUNT is raised. So with the
> check above we require that the tmpfs instance must ultimately belong to the
> initial userns and that the caller has CAP_SYS_ADMIN in the initial userns
> (CAP_SYS_ADMIN guards swapon and swapoff) according to capabilities(7).
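
For anyone following along, here is a rough userspace model of the check
described above. It is only a sketch: the *_model types, the
caller_has_sys_admin flag (standing in for the result of
capable(CAP_SYS_ADMIN)) and parse_noswap_model() are hypothetical names made
up for illustration, not kernel code. The one thing taken from the kernel is
that invalfc() returns -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct user_namespace; not the real type. */
struct user_ns_model {
	int level;	/* 0 for the initial namespace, >0 for children */
};

/* Models &init_user_ns: there is exactly one initial namespace. */
static struct user_ns_model init_user_ns_model = { 0 };

/* Hypothetical stand-in for the bits of fs_context / shmem_options used. */
struct fs_context_model {
	const struct user_ns_model *user_ns;	/* becomes sb->s_user_ns */
	bool caller_has_sys_admin;		/* models capable(CAP_SYS_ADMIN) */
	bool noswap;
};

/*
 * Mirrors the proposed Opt_noswap branch: reject unless the mount belongs
 * to the initial userns AND the caller holds CAP_SYS_ADMIN there.
 */
static int parse_noswap_model(struct fs_context_model *fc)
{
	if (fc->user_ns != &init_user_ns_model || !fc->caller_has_sys_admin)
		return -EINVAL;	/* invalfc() logs and returns -EINVAL */

	fc->noswap = true;
	return 0;
}

/* Small self-check showing the intended outcomes. */
static void noswap_model_selfcheck(void)
{
	struct user_ns_model child = { 1 };
	struct fs_context_model root_mount = { &init_user_ns_model, true, false };
	struct fs_context_model userns_mount = { &child, true, false };

	assert(parse_noswap_model(&root_mount) == 0);
	assert(parse_noswap_model(&userns_mount) == -EINVAL);
}
```

Both conditions are needed: the pointer comparison covers a tmpfs mounted
inside a user namespace via FS_USERNS_MOUNT, and the capability test covers
an unprivileged caller in the initial namespace.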

Christian, mind sending this as a fix?

Luis