Re: [RFC][PATCH 6/7] x86, pkeys: add pkey set/get syscalls

From: Ingo Molnar
Date: Tue Feb 23 2016 - 01:46:16 EST



* Dave Hansen <dave@xxxxxxxx> wrote:

>
> From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>
> This establishes two more system calls for protection key management:
>
> unsigned long pkey_get(int pkey);
> int pkey_set(int pkey, unsigned long access_rights);
>
> The return value from pkey_get() and the 'access_rights' passed
> to pkey_set() are the same format: a bitmask containing
> PKEY_DENY_WRITE and/or PKEY_DENY_ACCESS, or nothing set at all.
>
> These can replace userspace's direct use of the new rdpkru/wrpkru
> instructions.
>
> With current hardware, the kernel can not enforce that it has
> control over a given key. But, this at least allows the kernel
> to indicate to userspace that userspace does not control a given
> protection key. This makes it more likely that situations like
> using a pkey after sys_pkey_free() can be detected.

So it's analogous to file descriptor open()/close() syscalls: the kernel does not
enforce that different libraries of the same process do not interfere with each
other's file descriptors - but in practice it's not a problem because everyone
uses open()/close().

Resources that a process uses don't per se 'need' kernel level isolation to be
useful.

> The kernel does _not_ enforce that this interface must be used for
> changes to PKRU, whether or not a key has been "allocated".

Nor does the kernel enforce that open() must be used to get a file descriptor, so
code can do the following:

close(100);

and can interfere with a library that is holding a file open - but in practice
it's rarely a problem, because closing a descriptor you did not open is
considered a bug in the caller.
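
To make the analogy concrete, here is a minimal sketch of that kind of
interference. The "library" and all function names (lib_init(), lib_write(),
rogue_close(), fd_interference_demo()) are hypothetical and only illustrate
the point; /dev/null is used so that the example has no side effects:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical "library" state: a descriptor the library believes it owns. */
static int lib_fd = -1;

static int lib_init(void)
{
	lib_fd = open("/dev/null", O_WRONLY);
	return lib_fd < 0 ? -1 : 0;
}

/* Returns 0 on success, or the errno value reported by write(). */
static int lib_write(void)
{
	assert(lib_fd >= 0);	/* lib_init() must have succeeded */

	if (write(lib_fd, "x", 1) == 1)
		return 0;
	return errno;
}

/* Unrelated code closing a descriptor it never opened. */
static void rogue_close(int fd)
{
	close(fd);
}

/* Returns the error the library sees after the interference. */
int fd_interference_demo(void)
{
	if (lib_init() != 0)
		return -1;
	if (lib_write() != 0)
		return -1;

	/* The kernel happily lets anyone in the process do this. */
	rogue_close(lib_fd);

	/* The library's next write now fails with EBADF. */
	return lib_write();
}
```

Nothing in the kernel stops rogue_close(); the only protection is the
convention that code closes only what it opened.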

One thing that is different is that file descriptors are generally plentiful,
while there are at most 16 pkeys - but I think that's still "large enough" to not
be an issue in practice.

We'll see ...

> This syscall interface could also theoretically be replaced with a pair of
> vsyscalls. The vsyscalls would just call WRPKRU/RDPKRU directly in situations
> where they are drop-in equivalents for what the kernel would be doing.

Indeed.
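
For reference, a rough userspace model of what such a pair of calls would be
doing. This is only a sketch under stated assumptions, not the code from the
patch: PKRU is simulated with a plain variable instead of RDPKRU/WRPKRU, the
SKETCH_* names and their values (0x1 for deny-access, 0x2 for deny-write) are
my assumptions, and the "allocated" bitmap merely stands in for whatever
bookkeeping the kernel does. The one hardware fact relied upon is the PKRU
layout: two bits per key, access-disable at bit 2*pkey and write-disable at
bit 2*pkey+1.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_NR_PKEYS		16
#define SKETCH_DENY_ACCESS	0x1UL	/* assumed value: PKRU AD bit */
#define SKETCH_DENY_WRITE	0x2UL	/* assumed value: PKRU WD bit */
#define SKETCH_RIGHTS_MASK	(SKETCH_DENY_ACCESS | SKETCH_DENY_WRITE)

static uint32_t sim_pkru;	/* stand-in for the real PKRU register */
static uint16_t sim_allocated;	/* one bit per key that was handed out */

/* A key is usable only if it is in range and has been allocated. */
static int sketch_key_ok(int pkey)
{
	return pkey >= 0 && pkey < SKETCH_NR_PKEYS &&
	       (sim_allocated & (1u << pkey));
}

/* Models pkey_get(): returns the rights bitmask, or -EINVAL. */
long sketch_pkey_get(int pkey)
{
	if (!sketch_key_ok(pkey))
		return -EINVAL;
	return (long)((sim_pkru >> (2 * pkey)) & SKETCH_RIGHTS_MASK);
}

/* Models pkey_set(): replaces the two PKRU bits of one key. */
int sketch_pkey_set(int pkey, unsigned long rights)
{
	unsigned int shift;

	if (!sketch_key_ok(pkey) || (rights & ~SKETCH_RIGHTS_MASK))
		return -EINVAL;

	shift = 2 * pkey;
	sim_pkru &= ~((uint32_t)SKETCH_RIGHTS_MASK << shift);
	sim_pkru |= (uint32_t)rights << shift;
	return 0;
}

/* Small walk-through of the intended behaviour. */
void sketch_selfcheck(void)
{
	int ret;

	sim_pkru = 0;
	sim_allocated = 0;

	/* Key 3 was never allocated: the call is refused. */
	ret = sketch_pkey_set(3, SKETCH_DENY_WRITE);
	assert(ret == -EINVAL);

	/* After "allocation" the same call succeeds. */
	sim_allocated |= 1u << 3;
	ret = sketch_pkey_set(3, SKETCH_DENY_WRITE);
	assert(ret == 0);

	/* The write-disable bit of key 3 is bit 7 of PKRU. */
	assert(sim_pkru == 0x80u);
	assert(sketch_pkey_get(3) == (long)SKETCH_DENY_WRITE);
}
```

The refusal for an unallocated key is the part that plain WRPKRU cannot
provide: it is how userspace gets told that it does not control a key.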

Thanks,

Ingo