Re: [PATCH] mmap: add sysctl for controlling ~VM_MAYEXEC taint

From: Will Drewry
Date: Tue Aug 16 2011 - 18:35:54 EST


On Tue, Aug 16, 2011 at 4:54 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 15 Aug 2011 15:57:35 -0500
> Will Drewry <wad@xxxxxxxxxxxx> wrote:
>
>> This patch proposes a sysctl knob that allows a privileged user to
>> disable ~VM_MAYEXEC tainting when mapping in a vma from a MNT_NOEXEC
>> mountpoint.  It does not alter the normal behavior resulting from
>> attempting to directly mmap(PROT_EXEC) a vma (-EPERM) nor the behavior
>> of any other subsystems checking MNT_NOEXEC.
>>
>> It is motivated by a common /dev/shm, /tmp usecase. There are few
>> facilities for creating a shared memory segment that can be remapped in
>> the same process address space with different permissions.  Often, a
>> file in /tmp provides this functionality.  However, on distributions
>> that are more restrictive/paranoid, world-writeable directories are
>> often mounted "noexec".  The only workaround to support software that
>> needs this behavior is to either not use that software or remount /tmp
>> exec.
>
> Remounting /tmp would appear to have the same effect as altering this
> sysctl, so why not just remount /tmp?

The main difference is that you keep the primary protections of noexec
and give up only the secondary one:
1. exec still fails
2. mmap(PROT_EXEC) still fails

Only mprotect(PROT_EXEC) on an existing mapping starts to succeed.  This
means that with a common GNU-ish userspace, it's still not possible to
execute an arbitrary binary from /tmp or to use it as a preload or
dlopen() source. It's like half-noexec.
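
To make the two cases concrete, here is a rough userspace sketch that
probes a directory both ways: a direct mmap(PROT_EXEC), and an
mmap(PROT_READ|PROT_WRITE) followed by mprotect(PROT_EXEC).  It is not
part of the patch.  The names probe_exec_mapping() and struct exec_probe
are made up for illustration.  On a noexec mount today I'd expect both
attempts to fail (EPERM from mmap, and I believe EACCES from mprotect);
with the proposed sysctl flipped, only the second should start
succeeding.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical result record: 0 means the attempt worked, else errno. */
struct exec_probe {
	int mmap_exec_errno;     /* direct mmap(PROT_EXEC) */
	int mprotect_exec_errno; /* mmap(RW) then mprotect(PROT_EXEC) */
};

/*
 * Create an unlinked one-page file in 'dir' and try both ways of
 * getting an executable mapping of it.  Returns 0 if the probe itself
 * ran, -1 if the scratch file or the RW mapping could not be set up.
 */
int probe_exec_mapping(const char *dir, struct exec_probe *out)
{
	char path[4096];
	long page = sysconf(_SC_PAGESIZE);
	void *p;
	int fd, n;

	assert(dir != NULL && out != NULL);

	n = snprintf(path, sizeof(path), "%s/probeXXXXXX", dir);
	if (page <= 0 || n < 0 || n >= (int)sizeof(path))
		return -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path); /* the open fd keeps the file alive */
	if (ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}

	/* Case 1: ask for PROT_EXEC up front. */
	p = mmap(NULL, page, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
	out->mmap_exec_errno = (p == MAP_FAILED) ? errno : 0;
	if (p != MAP_FAILED)
		munmap(p, page);

	/* Case 2: map read/write, then try to add PROT_EXEC afterwards. */
	p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}
	out->mprotect_exec_errno =
		mprotect(p, page, PROT_READ | PROT_EXEC) ? errno : 0;

	munmap(p, page);
	close(fd);
	return 0;
}
```

Calling probe_exec_mapping("/tmp", &r) and printing the two errno
fields should be enough to see which half of noexec is in effect on a
given mount.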

>>  (E.g., https://bugs.gentoo.org/350336?id=350336)  Given that
>> the only recourse is using SysV IPC, the application programmer loses
>> many of the useful ABI features that they get using a mmap'd file (and
>> as such are often hesitant to explore that more painful path).
>>
>> With this patch, it would be possible to change the sysctl variable
>> such that mprotect(PROT_EXEC) would succeed.  In cases like the example
>> above, an additional userspace mmap-wrapper would be needed, but in
>> other cases, like how code.google.com/p/nativeclient mmap()s then
>> mprotect()s, the behavior would be unaffected.
>>
>> The tradeoff is a loss of defense in depth, but it seems reasonable when
>> the alternative is to disable the defense entirely.
>>
>> ...
>>
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -89,6 +89,9 @@
>>  /* External variables not in a header file. */
>>  extern int sysctl_overcommit_memory;
>>  extern int sysctl_overcommit_ratio;
>> +#ifdef CONFIG_MMU
>
> The ifdef isn't needed in the header and we generally omit it to avoid
> clutter.

Thanks - I'll remove it!

> afaict this feature could be made available on NOMMU systems?

When I poked around I didn't see VM_MAYEXEC being used on NOMMU
systems, but I may have just misread the code. I'll take another look.

>> +extern int sysctl_mmap_noexec_taint;
>
> The term "taint" has a specific meaning in the kernel (see
> add_taint()).  It's regrettable that this patch attaches a second
> meaning to that term.  Can we think of a better word to use?
>
> A better word would communicate the sense of the sysctl operation.  If
> a "taint" flag is set to true, I don't know whether that means that
> noexec is enabled or disabled.  Something like
> sysctl_mmap_noexec_override or sysctl_mmap_noexec_disable, perhaps.

Thanks for the good points and suggestions. Maybe something like
sysctl_mprotect_ignores_noexec would reflect this more closely, though
still not quite as accurately as your examples. (Or maybe
sysctl_mmap_noexec_propagates.)

> This patch forgot to document the new feature and its sysctl.
> Documentation/sysctl/vm.txt might be the right place.

I will add that along with the changes from your other comments.

Thanks!
will