Re: BTI interaction between seccomp filters in systemd and glibc mprotect calls, causing service failures

From: Salvatore Mesoraca
Date: Sat Oct 24 2020 - 10:12:50 EST


On Sat, 24 Oct 2020 at 12:34, Topi Miettinen <toiwoton@xxxxxxxxx> wrote:
>
> On 23.10.2020 20.52, Salvatore Mesoraca wrote:
> > Hi,
> >
> > On Thu, 22 Oct 2020 at 23:24, Topi Miettinen <toiwoton@xxxxxxxxx> wrote:
> >> SARA looks interesting. What is missing is a prctl() to enable all W^X
> >> protections irrevocably for the current process, then systemd could
> >> enable it for services with MemoryDenyWriteExecute=yes.
> >
> > SARA actually has a procattr[0] interface to do just that.
> > There is also a library[1] to help using it.
>
> That means that /proc has to be available and writable at that point, so
> setting up procattrs has to be done before mount namespaces are set up.
> In general, it would be nice for sandboxing facilities in kernel if
> there would be a way to start enforcing restrictions only at next
> execve(), like setexeccon() for SELinux and aa_change_onexec() for
> AppArmor. Otherwise the exact order of setting up various sandboxing
> options can be very tricky to arrange correctly, since each option may
> have a subtle effect to the sandboxing features enabled later. In case
> of SARA, the operations done between shuffling the mount namespace and
> before execve() shouldn't be affected so it isn't important. Even if it
> did (a new sandboxing feature in the future would need trampolines or
> JIT code generation), maybe the procattr file could be opened early but
> it could be written closer to execve().

A new "apply on exec" procattr file seems reasonable and relatively easy to add.
As Kees pointed out, the main obstacle here is the fact that SARA is
not upstream :(

Salvatore