Re: [PATCH 0/6] Memory Mapping (VMA) protection using PKU - set 1

From: Dave Hansen
Date: Wed May 17 2023 - 11:08:40 EST


On 5/17/23 03:51, Stephen Röttger wrote:
> On Wed, May 17, 2023 at 12:41 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>> Can't run arbitrary instructions, but can make (pretty) arbitrary syscalls?
>
> The threat model is that the attacker has arbitrary read/write, while other
> threads run in parallel. So whenever a regular thread performs a syscall and
> takes a syscall argument from memory, we assume that argument can be attacker
> controlled.
> Unfortunately, the line is a bit blurry as to which syscalls / syscall
> arguments we need to assume to be attacker controlled.

Ahh, OK. So, it's not that the *attacker* can make arbitrary syscalls.
It's that the attacker might leverage its arbitrary write to trick a
victim thread into turning what would otherwise be a good syscall into a
bad one with attacker-controlled content.

I guess that makes readv/writev-style interfaces, which take their
buffer descriptions from memory, a bad idea in this environment.
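
To make that concrete, here is a minimal Python sketch (not taken from the
patch set) of a writev() call whose iovec lives in attacker-writable memory.
The names IoVec, victim_writev and demo are made up for illustration, and the
"attacker" is simulated by a plain store in the same thread rather than a
racing thread. It assumes a Linux-style libc reachable through ctypes.

```python
import ctypes
import os


class IoVec(ctypes.Structure):
    """Mirror of struct iovec; it lives in ordinary writable memory."""
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


# Assumption: the process's libc exports writev() (true on Linux/glibc).
libc = ctypes.CDLL(None, use_errno=True)
libc.writev.argtypes = [ctypes.c_int, ctypes.POINTER(IoVec), ctypes.c_int]
libc.writev.restype = ctypes.c_ssize_t


def victim_writev(fd, iov):
    """The victim's 'good' syscall: the kernel reads iov from memory."""
    n = libc.writev(fd, ctypes.byref(iov), 1)
    if n < 0:
        raise OSError(ctypes.get_errno(), "writev failed")
    return n


def demo():
    public = ctypes.create_string_buffer(b"hello", 5)
    secret = ctypes.create_string_buffer(b"s3cr3t", 6)

    # Victim sets up a harmless write of the public buffer.
    iov = IoVec(ctypes.addressof(public), 5)

    # Simulated attacker: an arbitrary write redirects the iovec
    # before the victim reaches the syscall.
    iov.iov_base = ctypes.addressof(secret)
    iov.iov_len = 6

    r, w = os.pipe()
    try:
        victim_writev(w, iov)
        return os.read(r, 64)
    finally:
        os.close(r)
        os.close(w)
```

The victim never executes attacker-chosen instructions; it only issues the
writev() it always intended to issue, yet the bytes that leave the process
are the ones the attacker picked.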

>>> Sigreturn is a separate problem that we hope to solve by adding pkey
>>> support to sigaltstack
>>
>> What kind of support were you planning to add?
>
> We’d like to allow registering pkey-tagged memory as a sigaltstack. This would
> allow the signal handler to run isolated from other threads. Right now, the
> main reason this doesn’t work is that the kernel would need to change the pkru
> state before storing the register state on the stack.
>
>> I was thinking that an attacker with arbitrary write access would wait
>> until PKRU was on the userspace stack and *JUST* before the kernel
>> sigreturn code restores it to write a malicious value. It could
>> presumably do this with some asynchronous mechanism so that even if
>> there was only one attacker thread, it could change its own value.
>
> I’m not sure I follow the details. Can you give an example of an asynchronous
> mechanism to do this? E.g. would this be the kernel writing to the memory in a
> syscall?

I was thinking of aio(7), or any of the IORING_OP_* operations that can
write to memory.
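
As a rough illustration of the timing, here is a Python sketch in which a
read is queued against a memory location first and the data only lands
later, after the submitter has moved on. This is a stand-in, not io_uring or
aio: a helper thread blocked in readv() plays the role of the queued
operation. The frame layout, PKRU_OFFSET and the function name are
hypothetical; 0x55555554 is used as an example of a restrictive PKRU value
and 0 as the permissive value an attacker would want.

```python
import os
import struct
import threading

PKRU_OFFSET = 8          # hypothetical offset of saved PKRU in the frame
RESTRICTIVE = 0x55555554  # example: access disabled for keys 1-15
PERMISSIVE = 0x00000000   # example: all keys accessible


def demo_async_overwrite():
    # Stand-in for the signal frame the kernel saved on the user stack.
    frame = bytearray(16)
    struct.pack_into("<I", frame, PKRU_OFFSET, RESTRICTIVE)

    r, w = os.pipe()
    slot = memoryview(frame)[PKRU_OFFSET:PKRU_OFFSET + 4]

    # "Queue" a read whose destination is the saved-PKRU slot. Nothing is
    # written yet because the pipe is still empty.
    pending = threading.Thread(target=os.readv, args=(r, [slot]))
    pending.start()

    before = struct.unpack_from("<I", frame, PKRU_OFFSET)[0]

    # Later, the data becomes available and the kernel copies it into
    # the slot without the submitting thread doing anything further.
    os.write(w, struct.pack("<I", PERMISSIVE))
    pending.join()

    after = struct.unpack_from("<I", frame, PKRU_OFFSET)[0]
    slot.release()
    os.close(r)
    os.close(w)
    return before, after
```

The point of the sketch is only the ordering: the destination is chosen at
submission time, and the store happens at some later moment that the
attacker can try to line up with the window just before sigreturn restores
PKRU.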