Re: [PATCH] x86, vsyscall: add CONFIG to control default

From: Andy Lutomirski
Date: Mon Aug 31 2015 - 17:24:39 EST


On Aug 31, 2015 1:13 PM, "Kees Cook" <keescook@xxxxxxxxxxxx> wrote:
>
> On Wed, Aug 12, 2015 at 7:23 PM, Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
> > On Wed, Aug 12, 2015 at 05:55:19PM -0700, Kees Cook wrote:
> >> Most modern systems can run with vsyscall=none. In an effort to provide
> >> a way for build-time defaults to lack legacy settings, this adds a new
> >> CONFIG to select the type of vsyscall mapping to use, similar to the
> >> existing "vsyscall" command line parameter.
> >>
> >> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> >
> > Seems reasonable to me. One question, though: is there *any* reason to
> > choose "native" over "emulate"? (Does "emulate" have a sufficient
> > performance penalty to matter, and do people running old glibc really
> > care about that performance while still not wanting to upgrade?)
> > If there is a reason, could you please document it in the
> > descriptions of the "native" and "emulate" options (as an upside and a
> > downside, respectively)? If there isn't, you might consider a patch to
> > remove "native".
>
> I think "native" is available out of an abundance of caution. Andy
> left it available, though I'm not sure if he had plans to remove
> "native" entirely.

Native adds almost no code and almost no maintenance burden -- it's
really just a PTE bit.
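
For readers following along, here is a minimal sketch of what "just a PTE bit" means, assuming the page is mapped user-readable in both modes and that emulate mode differs only by marking it non-executable. The enum, macro, and function names are hypothetical stand-ins for illustration, not the kernel's own identifiers; the bit positions follow the architectural x86-64 PTE layout.

```c
#include <stdint.h>

/* Hypothetical stand-ins for x86-64 PTE flag bits (architectural positions). */
#define PTE_PRESENT (1ULL << 0)   /* P: mapping is valid */
#define PTE_USER    (1ULL << 2)   /* U/S: accessible from user mode */
#define PTE_NX      (1ULL << 63)  /* XD/NX: instruction fetch faults */

/* Hypothetical mirror of the vsyscall=none|emulate|native choices. */
enum vsyscall_mode { VSYSCALL_NONE, VSYSCALL_EMULATE, VSYSCALL_NATIVE };

/* Return the flags for the vsyscall page; 0 means "not mapped at all". */
static uint64_t vsyscall_pte_flags(enum vsyscall_mode mode)
{
    uint64_t flags;

    if (mode == VSYSCALL_NONE)
        return 0;

    /* Both remaining modes leave the page readable from user space. */
    flags = PTE_PRESENT | PTE_USER;

    /* Assumption: emulate only adds NX, so a call into the page faults
     * and the kernel can emulate it; native lets the code run directly. */
    if (mode == VSYSCALL_EMULATE)
        flags |= PTE_NX;

    return flags;
}
```

Under these assumptions, vsyscall_pte_flags(VSYSCALL_NATIVE) and vsyscall_pte_flags(VSYSCALL_EMULATE) differ in exactly one bit, and the page stays readable either way.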

>
> Can someone from the x86 tree take this patch, or are there other
> things to improve?

It looks good to me.

I was thinking about how to control vsyscalls per process, and it's
not so easy. We can turn off emulation per process trivially (modulo
figuring out the ABI), but the Project Zero thing makes me think that
we want to be able to switch off *read* access.

For almost all purposes, we could just switch off read access globally
with no ill effects. The problem is that nasty little programs like
Pin will start crashing when run on old binaries.

We could allocate two copies of the top pud, switch them out in the
pgd depending on whether vsyscalls are on for the mm, and clear the
G bit. It's a bit of a departure from how things work now, and it'll
interact really weirdly with the fixmap code and anything else that
pokes at that part of the kernel page tables (e.g. Xen?). Hmm.
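
A rough toy model of the two-pud idea, purely illustrative: it assumes 4-level paging with 9 index bits per level, and it ignores the pmd and pte levels as well as the fixmap and Xen interactions. All struct and function names are hypothetical. The G bit matters because global TLB entries survive an address-space switch, so a mapping that is meant to differ per mm cannot be global.

```c
#include <stdbool.h>
#include <stdint.h>

#define VSYSCALL_ADDR 0xffffffffff600000ULL  /* fixed legacy vsyscall page */
#define IDX_MASK      0x1ffULL               /* 9 index bits per level */

/* Index of an address in the top-level (pgd) and next-level (pud) tables. */
static unsigned int pgd_index_of(uint64_t addr)
{
    return (unsigned int)((addr >> 39) & IDX_MASK);
}

static unsigned int pud_index_of(uint64_t addr)
{
    return (unsigned int)((addr >> 30) & IDX_MASK);
}

/* Toy tables: one top pud that reaches the vsyscall page, one that does not. */
struct toy_pud { uint64_t entry[512]; };
static struct toy_pud pud_with_vsyscall;
static struct toy_pud pud_without_vsyscall;

/* Toy per-process state; top_pud models the last slot of that mm's pgd. */
struct toy_mm {
    bool vsyscall_enabled;
    struct toy_pud *top_pud;
};

/* Point the mm's top pgd slot at whichever pud copy matches its setting. */
static void toy_select_top_pud(struct toy_mm *mm)
{
    mm->top_pud = mm->vsyscall_enabled ? &pud_with_vsyscall
                                       : &pud_without_vsyscall;
}
```

The vsyscall address lands in the last slot at both levels (index 511), which is why the sketch only swaps the top pud rather than anything lower in the address space.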

--Andy