Re: [PATCH 6/9] x86, pkeys: add pkey set/get syscalls

From: Dave Hansen
Date: Mon Jul 11 2016 - 10:29:05 EST


On 07/11/2016 12:35 AM, Ingo Molnar wrote:
> * Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> mprotect_pkey()'s effects are per MM, but the system calls related to managing the
> keys (alloc/free/get/set) are fundamentally per CPU.
>
> Here's an example of how this could matter to applications:
>
> - 'writer thread' gets a RW- key into index 1 to a specific data area
> - a pool of 'reader threads' may get the same pkey index 1 R-- to read the data
> area.
>
> Same page tables, same index, two protections and two purposes.
>
> With a global, per MM allocation of keys we'd have to use two indices: index 1 and 2.

I'm not sure how this would work. A piece of data mapped at only one
virtual address can have only one key associated with it. For a data
area, you would need to indicate between threads which key they needed
in order to access the data. Both threads need to agree on the virtual
address *and* the key used for access.

Remember, PKRU is just a *bitmap*. The only place keys are stored is in
the page tables.
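
To make the "just a bitmap" point concrete, here is a minimal sketch that models PKRU as a plain 32-bit value with two bits per key (access-disable in the low bit of each pair, write-disable in the high bit). The helper names pkru_set_rights(), pkru_allows_read() and pkru_allows_write() are made up for illustration; they are not part of the proposed syscalls, and nothing here touches the real register:

```c
#include <assert.h>
#include <stdint.h>

#define NR_PKEYS 16   /* PKRU holds rights for 16 keys */
#define PKRU_AD  0x1u /* access-disable bit within a key's 2-bit field */
#define PKRU_WD  0x2u /* write-disable bit within a key's 2-bit field */

/* Hypothetical helper: return 'pkru' with the rights for 'key' replaced. */
static uint32_t pkru_set_rights(uint32_t pkru, unsigned key, uint32_t rights)
{
    assert(key < NR_PKEYS);
    assert((rights & ~(PKRU_AD | PKRU_WD)) == 0);

    unsigned shift = key * 2;                 /* two bits per key */
    pkru &= ~((PKRU_AD | PKRU_WD) << shift);  /* clear the old rights */
    pkru |= rights << shift;                  /* install the new ones */
    return pkru;
}

/* Reads are allowed unless the access-disable bit is set. */
static int pkru_allows_read(uint32_t pkru, unsigned key)
{
    assert(key < NR_PKEYS);
    return ((pkru >> (key * 2)) & PKRU_AD) == 0;
}

/* Writes need both the access-disable and write-disable bits clear. */
static int pkru_allows_write(uint32_t pkru, unsigned key)
{
    assert(key < NR_PKEYS);
    return ((pkru >> (key * 2)) & (PKRU_AD | PKRU_WD)) == 0;
}
```

Note that the model only records rights per key number. Which key guards a given page is not in this value at all; that lives in the page tables, which is why both threads have to agree on the key.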

Here's how this ends up looking in practice when we have an initializer,
a reader and a writer:

/* allocator: */
pkey = pkey_alloc();
data = mmap(PAGE_SIZE, PROT_NONE, ...);
pkey_mprotect(data, PROT_WRITE|PROT_READ, pkey);
metadata[data].pkey = pkey;

/* reader */
pkey_set(metadata[data].pkey, PKEY_DENY_WRITE); /* allow reads only */
readerfoo = *data;
pkey_set(metadata[data].pkey, PKEY_DENY_WRITE|PKEY_DENY_ACCESS); /* lock again */

/* writer */
pkey_set(metadata[data].pkey, 0); /* 0 == deny nothing */
*data = bar;
pkey_set(metadata[data].pkey, PKEY_DENY_WRITE|PKEY_DENY_ACCESS); /* lock again */


I'm also not sure what the indexes are that you're referring to.

> Depending on how scarce the index space turns out to be, making the key indices per
> thread is probably the right model.

Yeah, I'm totally confused about what you mean by indexes.

>> There are still two issues that I think we need to address, though:
>>
>> 1. Signal delivery shouldn't unconditionally clear PKRU. That's what
>> the current patches do, and it's unsafe. I'd rather set PKRU to the
>> maximally locked down state on signal delivery (except for the
>> PROT_EXEC key), although that might cause its own set of problems.
>
> Right now the historic pattern for signal handlers is that they safely and
> transparently stack on top of existing FPU related resources and do a save/restore
> of them. In that sense saving+clearing+restoring the pkeys state would be the
> correct approach that follows that pattern. There are two extra considerations:
>
> - If we think of pkeys as a temporary register that can be used to access/unaccess
> normally inaccessible memory regions then this makes sense, in fact it's more
> secure: signal handlers cannot accidentally stomp on an encryption key or on a
> database area, unless they intentionally gain access to them.
>
> - If we think of pkeys as permanent memory mappings that enhance existing MM
> permissions then it would be correct to let them leak into signal handler state.
> The global true-PROT_EXEC key would fall into this category.
>
> So I agree, mostly: the correct approach is to save+clear+restore the first 14
> pkey indices, and to leave alone the two 'global' indices.

The current scheme is the most permissive, but it has an important
property: it's the most _flexible_. You can implement almost any scheme
you want in userspace on top of it. The first userspace instruction of
the handler could easily be WRPKRU to fully lock down access in whatever
scheme a program wants.
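
As a rough sketch of what such a handler-side lockdown could compute, assuming the usual PKRU layout of two bits per key with the access-disable bit in the low position: the function below builds a value that denies all access through every key except a caller-chosen set. The name pkru_lockdown_except() and the exempt-mask convention are hypothetical, and the sketch only computes the value; a real handler would still have to load it with WRPKRU.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a PKRU value that sets the access-disable
 * bit for every key whose bit is NOT set in 'exempt_keys' (bit N of the
 * mask stands for key N). Exempt keys are left fully accessible.
 */
static uint32_t pkru_lockdown_except(uint16_t exempt_keys)
{
    uint32_t pkru = 0;

    for (unsigned key = 0; key < 16; key++) {
        if (exempt_keys & (1u << key))
            continue;                /* leave this key open */
        pkru |= 0x1u << (key * 2);   /* access-disable bit for this key */
    }
    return pkru;
}
```

For example, exempting only key 0 yields 0x55555554, and exempting nothing yields 0x55555555. A program that wants a different policy in its handlers only has to pass a different mask, which is the flexibility argument above.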