Re: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC

From: Jeff Xu
Date: Thu Jun 29 2023 - 00:33:46 EST


Hello!

On Wed, Jun 28, 2023 at 12:31 PM Dominique Martinet
<asmadeus@xxxxxxxxxxxxx> wrote:
>
> Dominique Martinet wrote on Wed, Jun 28, 2023 at 08:42:41PM +0900:
> > If flags already has either MFD_EXEC or MFD_NOEXEC_SEAL, you don't check
> > the sysctl at all.
> > [...repro snipped..]
> >
> > What am I missing?
>
> (Perhaps the intent is just to force people to use the flag so it is
> easier to check for memfd_create in seccomp or other LSM?
> But I don't see why such a check couldn't consider the absence of a flag
> as well, so I don't see the point.)
>
Yes. Part of the intent is to encourage app developers to migrate
their code to pass MFD_EXEC or MFD_NOEXEC_SEAL explicitly to
memfd_create(), if that answers your question.
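
For illustration, a minimal sketch of what such a migration might look
like in userspace. The helper name memfd_create_explicit() is
hypothetical, the fallback #defines assume the flag values from
include/uapi/linux/memfd.h, and the EINVAL retry assumes that kernels
without this series reject the unknown flags:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback definitions for older userspace headers (assumption: values
 * as in include/uapi/linux/memfd.h on kernels carrying this series). */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef MFD_EXEC
#define MFD_EXEC 0x0010U
#endif

/* Hypothetical helper: create a memfd with the exec policy stated
 * explicitly. Returns the fd, or -1 with errno set. */
static int memfd_create_explicit(const char *name, int want_exec)
{
	unsigned int policy = want_exec ? MFD_EXEC : MFD_NOEXEC_SEAL;
	int fd = memfd_create(name, MFD_CLOEXEC | policy);

	/* Older kernels reject unknown flags with EINVAL; retry without
	 * the policy flag so the caller still gets a usable fd. */
	if (fd < 0 && errno == EINVAL)
		fd = memfd_create(name, MFD_CLOEXEC);
	return fd;
}
```

A caller that never needs to execute from the memfd would pass
want_exec = 0; a caller that does would pass 1 and should be prepared
for the call to fail when the sysctl forbids executable memfds.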

>
> > BTW I find the current behaviour rather hard to use: setting this to 2
> > should still set NOEXEC by default in my opinion, just refuse anything
> > that explicitly requested EXEC.
>
> And I just noticed it's not possible to lower the value despite having
> CAP_SYS_ADMIN: what the heck?! I have never seen such a sysctl and it
> just forced me to reboot because I willy-nilly tested in the init pid
> namespace, and quite a few applications that don't require exec broke
> exactly as I described below.
>
> If the user has CAP_SYS_ADMIN there are more container escape methods
> than I can count, this is basically free pass to root on main namespace
> anyway, you're not protecting anything. Please let people set the sysctl
> to what they want.
>
Yama has a similar setting: for example, once ptrace_scope is set to
3 (YAMA_SCOPE_NO_ATTACH), it cannot be lowered at runtime.

Since this is a security feature, not allowing a downgrade at runtime
is part of the security design. I hope you understand.
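
To make the "no downgrade" behaviour concrete, here is a small
userspace model of a one-way setting. It is only a sketch: struct
ratchet and ratchet_set() are hypothetical names, and this is not the
sysctl handler from the patch.

```c
#include <errno.h>

/* Hypothetical model of a one-way ("ratchet") setting: the level may
 * be raised or left unchanged, but never lowered. */
struct ratchet {
	int level;	/* current level */
	int max;	/* highest valid level */
};

/* Returns 0 on success, -EINVAL if new_level is out of range, and
 * -EPERM if the caller tries to lower the level. */
static int ratchet_set(struct ratchet *r, int new_level)
{
	if (new_level < 0 || new_level > r->max)
		return -EINVAL;
	if (new_level < r->level)
		return -EPERM;
	r->level = new_level;
	return 0;
}
```

With this model, going from 0 to 2 succeeds, and a later attempt to go
back to 1 fails with -EPERM, whatever privileges the caller holds.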

> --
> Dominique Martinet | Asmadeus

Thanks!
-Jeff