Re: [PATCH 6/9] x86, pkeys: add pkey set/get syscalls

From: Ingo Molnar
Date: Mon Jul 11 2016 - 03:35:45 EST



* Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> On Jul 9, 2016 1:37 AM, "Ingo Molnar" <mingo@xxxxxxxxxx> wrote:
> >
> >
> > * Dave Hansen <dave@xxxxxxxx> wrote:
> >
> > > On 07/08/2016 12:18 AM, Ingo Molnar wrote:
> > >
> > > > So the question is, what is user-space going to do? Do any glibc patches
> > > > exist? What are the user-space library side APIs going to look like?
> > >
> > > My goal at the moment is to get folks enabled to the point that they can start
> > > modifying apps to use pkeys without having to patch their kernels.
> > > I don't have confidence that we can design good high-level userspace interfaces
> > > without seeing some real apps try to use the low-level ones and seeing how they
> > > struggle.
> > >
> > > I had some glibc code to do the pkey alloc/free operations, but those aren't
> > > necessary if we're doing it in the kernel. Other than getting the syscall
> > > wrappers in place, I don't have any immediate plans to do anything in glibc.
> > >
> > > Was there something you were expecting to see?
> >
> > Yeah, so (as you probably guessed!) I'm starting to have second thoughts about the
> > complexity of the alloc/free/set/get interface I suggested, and Mel's review
> > certainly strengthened that feeling.
> >
> > I have two worries:
> >
> > 1)
> >
> > A technical worry I have is that the 'pkey allocation interface' does not seem to
> > be taking the per thread property of pkeys into account - while that property
> > would be useful for apps. That is a limitation that seems unjustified.
> >
> > The reason for this is that we are storing the key allocation bitmap in the
> > mm_struct, in mm->context.pkey_allocation_map - while we should be storing it
> > in task_struct or thread_info.
>
> Huh? Doesn't this have to be per mm? Sure, PKRU is per thread, but
> the page tables are shared.

But the keys are not shared, and they carry meaningful per thread information.

mprotect_pkey()'s effects are per MM, but the system calls related to managing the
keys (alloc/free/get/set) operate on PKRU, which is a per CPU register that gets
context switched per thread.

Here's an example of how this could matter to applications:

- a 'writer thread' gets RW- rights via pkey index 1 to a specific data area
- a pool of 'reader threads' may get the same pkey index 1 with R-- rights to
  read the data area.

Same page tables, same index, two protections and two purposes.

With a global, per MM allocation of keys we'd have to use two indices: 1 and 2.

Depending on how scarce the index space turns out to be, making the key indices per
thread is probably the right model.

> There are still two issues that I think we need to address, though:
>
> 1. Signal delivery shouldn't unconditionally clear PKRU. That's what
> the current patches do, and it's unsafe. I'd rather set PKRU to the
> maximally locked down state on signal delivery (except for the
> PROT_EXEC key), although that might cause its own set of problems.

Right now the historic pattern for signal handlers is that they stack safely and
transparently on top of existing FPU-related state and save/restore it. In that
sense saving+clearing+restoring the pkeys state would be the correct approach,
following that pattern. There are two extra considerations:

- If we think of pkeys as a temporary register that can be used to gain or drop
  access to normally inaccessible memory regions then this makes sense, and in
  fact it's more secure: signal handlers cannot accidentally stomp on an
  encryption key or on a database area, unless they intentionally gain access to
  them.

- If we think of pkeys as permanent memory mappings that enhance existing MM
  permissions then it would be correct to let them leak into signal handler
  state. The global true-PROT_EXEC key would fall into this category.

So I agree, mostly: the correct approach is to save+clear+restore the first 14
pkey indices, and to leave alone the two 'global' indices.
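
In pseudo-code, the signal path would then do something like this (a sketch only -
exactly which indices count as 'global' and how the default key 0 is treated is
still the open question, so the preserve mask below is just a placeholder):

  /*
   * PKRU bits that survive signal delivery: the default key plus the two
   * 'global' keys - which indices those are is the open design question.
   */
  #define PKRU_PRESERVE_MASK        ((3U << 0*2) | (3U << 14*2) | (3U << 15*2))
  #define PKRU_ALL_ACCESS_DISABLE   0x55555555U   /* AD bit set for every key */

  static unsigned int pkru_signal_lockdown(unsigned int saved_pkru)
  {
          /* Lock down every thread-local key, keep the preserved keys' bits: */
          return (PKRU_ALL_ACCESS_DISABLE & ~PKRU_PRESERVE_MASK) |
                 (saved_pkru              &  PKRU_PRESERVE_MASK);
  }

Signal delivery would save PKRU into the signal frame (like the rest of the FPU
state), install pkru_signal_lockdown(saved_pkru), and sigreturn would restore the
saved value.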

> 2. When thread A allocates a pkey, how does it lock down thread B?

So see above: I think the temporary key space should be per thread, so there would
be no inter-thread interactions: each thread is responsible for its own key
management (via per thread management data in the library that implements it).
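
That library side per thread management data could be as simple as a thread-local
bitmap - all of the names below are made up for illustration:

  #include <errno.h>

  /*
   * Which indices are thread-local is assumed here: the default key 0 and
   * the 'global' keys are excluded from the allocatable range.
   */
  #define PKEY_FIRST_ALLOCATABLE   1
  #define PKEY_LAST_ALLOCATABLE    13

  static __thread unsigned int pkey_used_map;   /* one bit per allocatable key */

  static int thread_pkey_alloc(void)
  {
          int k;

          for (k = PKEY_FIRST_ALLOCATABLE; k <= PKEY_LAST_ALLOCATABLE; k++) {
                  if (!(pkey_used_map & (1U << k))) {
                          pkey_used_map |= 1U << k;
                          return k;
                  }
          }
          return -ENOSPC;
  }

  static void thread_pkey_free(int k)
  {
          pkey_used_map &= ~(1U << k);
  }

Each thread allocates and frees from its own map, so thread A handing out an index
has no effect on what thread B can hand out.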

Thanks,

Ingo