Re: BTI interaction between seccomp filters in systemd and glibc mprotect calls, causing service failures

From: Kees Cook
Date: Thu Oct 22 2020 - 16:02:25 EST


On Thu, Oct 22, 2020 at 01:39:07PM +0300, Topi Miettinen wrote:
> But I think SELinux has a more complete solution (execmem) which can track
> the pages better than is possible with seccomp solution which has a very
> narrow field of view. Maybe this facility could be made available to
> non-SELinux systems, for example with prctl()? Then the in-kernel MDWX could
> allow mprotect(PROT_EXEC | PROT_BTI) in case the backing file hasn't been
> modified, the source filesystem isn't writable for the calling process and
> the file descriptor isn't created with memfd_create().

Right. The problem here is that systemd is attempting to mediate a
state change using only syscall details (i.e. with seccomp) instead of
a stateful analysis. Using a MAC is likely the only sane way to do that.
SELinux is a bit difficult to adjust "on the fly" the way systemd would
like to do things, and the more dynamic approach seen with SARA[1] isn't
yet in the kernel. Trying to enforce memory W^X protection correctly
via seccomp isn't really going to work well, as far as I can see.

Regardless, it makes sense to me to have the kernel load the executable
itself with BTI enabled by default. I prefer gaining Catalin's suggested
patch[2]. :)

[1] https://lore.kernel.org/kernel-hardening/1562410493-8661-1-git-send-email-s.mesoraca16@xxxxxxxxx/
[2] https://lore.kernel.org/linux-arm-kernel/20201022093104.GB1229@gaia/

--
Kees Cook