Re: [PATCH v7 0/6] mm/memfd: introduce MFD_NOEXEC_SEAL and MFD_EXEC

From: Kees Cook
Date: Wed Dec 14 2022 - 13:54:50 EST


On Fri, Dec 09, 2022 at 04:04:47PM +0000, jeffxu@xxxxxxxxxxxx wrote:
> From: Jeff Xu <jeffxu@xxxxxxxxxx>
>
> Since Linux introduced the memfd feature, memfds have always had their
> execute bit set, and the memfd_create() syscall doesn't allow setting
> it differently.
>
> However, in a secure-by-default system such as ChromeOS (where all
> executables should come from the rootfs, which is protected by Verified
> boot), this executable nature of memfd opens a door for NoExec bypass
> and enables a "confused deputy" attack. E.g., in VRP bug [1], the cros_vm
> process created a memfd to share content with an external process,
> but the memfd was overwritten and used to execute arbitrary code
> and escalate to root. [2] lists more VRP bugs of this kind.
>
> On the other hand, executable memfds have legitimate uses: runc uses
> memfd's seal and executable features to copy the contents of a binary
> and then execute them. For such systems, we need a way to differentiate
> runc's use of executable memfds from an attacker's [3].
>
> To address the above, this patch set adds the following:
> 1> Let memfd_create() set the X bit at creation time.
> 2> Let a memfd be sealed against modification of its X bit.
> 3> A new pid namespace sysctl, vm.memfd_noexec, to control the behavior of
> the X bit. For example, if a container has vm.memfd_noexec=2, then
> memfd_create() without MFD_NOEXEC_SEAL will be rejected.
> 4> A new security hook in memfd_create(). This makes it possible for a new
> LSM to reject or allow executable memfds based on its security policy.
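
For readers following along, here is a rough userspace sketch of items 1
and 2 above: creating a memfd whose X bit is cleared and sealed. The
helper names are hypothetical, and the fallback value for MFD_NOEXEC_SEAL
is an assumption taken from the uapi header as later merged upstream; it
may not match what this v7 posting defines. On a kernel without the flag,
memfd_create() is expected to fail with EINVAL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Assumption: older userspace headers may lack this flag. The value below
 * follows include/uapi/linux/memfd.h as merged upstream, not necessarily v7.
 */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif

/*
 * Hypothetical helper: request a memfd that is created without the X bit
 * and sealed so the X bit cannot be set later.
 * Returns the fd, or -1 with errno set (EINVAL if the kernel does not
 * know the flag).
 */
static int create_noexec_memfd(const char *name)
{
	assert(name != NULL);
	return memfd_create(name, MFD_CLOEXEC | MFD_NOEXEC_SEAL);
}

/*
 * Hypothetical helper: report whether any execute bit is set on the fd.
 * Returns 1 if so, 0 if not, -1 if fstat() fails.
 */
static int memfd_has_exec_bit(int fd)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return -1;
	return (st.st_mode & 0111) != 0;
}
```

A caller would invoke create_noexec_memfd("demo"), treat EINVAL as "kernel
too old for the flag", and otherwise expect memfd_has_exec_bit() to
return 0 on the resulting fd.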

I think patches 1-5 look good to land. The LSM hook seems separable, and
could continue on its own. Thoughts?

(Which tree should the memfd changes go through?)

-Kees

--
Kees Cook