Re: [PATCH] mm/mmap: Map MAP_STACK to VM_STACK

From: Matthew Wilcox
Date: Wed Apr 19 2023 - 11:09:35 EST


On Wed, Apr 19, 2023 at 11:07:04AM -0400, Waiman Long wrote:
> On 4/18/23 23:46, Matthew Wilcox wrote:
> > On Tue, Apr 18, 2023 at 09:16:37PM -0400, Waiman Long wrote:
> > >  1) App runs creating lots of threads.
> > >  2) It mmap's 256K pages of anonymous memory.
> > >  3) It writes executable code to that memory.
> > >  4) It calls mprotect() with PROT_EXEC on that memory so
> > >     it can subsequently execute the code.
> > >
> > > The above mprotect() will fail if the mmap'd region's VMA gets merged with
> > > the VMA for one of the thread stacks.  That's because the default RHEL
> > > SELinux policy is to not allow executable stacks.
> > By the way, this is a daft policy. The policy you really want is
> > EXEC|WRITE is not allowed. A non-writable stack is useless, so it's
> > actually a superset of your current policy. Forbidding _simultaneous_
> > write and executable is just good programming. This way, you don't need
> > to care about the underlying VMA's current permissions, you just need
> > to do:
> >
> > 	if ((prot & (PROT_EXEC|PROT_WRITE)) == (PROT_EXEC|PROT_WRITE))
> > 		return -EACCES;
>
> I am not totally sure if the application changes the VMA to read-only first.
> Even if it does that, it highlights another possible issue when an anonymous
> VMA is merged with a stack VMA. Either the mprotect() to write-protect the
> VMA will fail or the application will segfault if it writes stuff to the
> stack. This particular issue is not related to SELinux. It provides another
> good reason why we should avoid merging a stack VMA with an anonymous VMA.

mprotect() will split the VMA into two VMAs, one that is
PROT_READ|PROT_WRITE and one that is PROT_READ|PROT_EXEC.