Re: [RFC PATCH v2 18/19] heki: x86: Protect guest kernel memory using the KVM hypervisor

From: Madhavan T. Venkataraman
Date: Wed Nov 29 2023 - 14:47:14 EST




On 11/27/23 14:03, Peter Zijlstra wrote:
> On Mon, Nov 27, 2023 at 11:05:23AM -0600, Madhavan T. Venkataraman wrote:
>> Apologies for the late reply. I was on vacation. Please see my response below:
>>
>> On 11/13/23 02:54, Peter Zijlstra wrote:
>>> On Sun, Nov 12, 2023 at 09:23:25PM -0500, Mickaël Salaün wrote:
>>>> From: Madhavan T. Venkataraman <madvenka@xxxxxxxxxxxxxxxxxxx>
>>>>
>>>> Implement a hypervisor function, kvm_protect_memory() that calls the
>>>> KVM_HC_PROTECT_MEMORY hypercall to request the KVM hypervisor to
>>>> set specified permissions on a list of guest pages.
>>>>
>>>> Using the protect_memory() function, set proper EPT permissions for all
>>>> guest pages.
>>>>
>>>> Use the MEM_ATTR_IMMUTABLE property to protect the kernel static
>>>> sections and the boot-time read-only sections. This enables to make sure
>>>> a compromised guest will not be able to change its main physical memory
>>>> page permissions. However, this also disable any feature that may change
>>>> the kernel's text section (e.g., ftrace, Kprobes), but they can still be
>>>> used on kernel modules.
>>>>
>>>> Module loading/unloading, and eBPF JIT is allowed without restrictions
>>>> for now, but we'll need a way to authenticate these code changes to
>>>> really improve the guests' security. We plan to use module signatures,
>>>> but there is no solution yet to authenticate eBPF programs.
>>>>
>>>> Being able to use ftrace and Kprobes in a secure way is a challenge not
>>>> solved yet. We're looking for ideas to make this work.
>>>>
>>>> Likewise, the JUMP_LABEL feature cannot work because the kernel's text
>>>> section is read-only.
>>>
>>> What is the actual problem? As is the kernel text map is already RO and
>>> never changed.
>>
>> For the JUMP_LABEL optimization, the text needs to be patched at some point.
>> That patching requires a writable mapping of the text page at the time of
>> patching.
>>
>> In this Heki feature, we currently lock down the kernel text at the end of
>> kernel boot just before kicking off the init process. The lockdown is
>> implemented by setting the permissions of a text page to R_X in the extended
>> page table and not allowing write permissions in the EPT after that. So, jump label
>> patching during kernel boot is not a problem. But doing it after kernel
>> boot is a problem.
>
> But you see, that's exactly what the kernel already does with the normal
> permissions. They get set to RX after init and are never changed.
>
> See the previous patch, we establish a read-write alias and write there.
>
> You seem to lack basic understanding of how the kernel works in this
> regard, which makes me very nervous about you touching any of this.
>
> I must also say I really dislike your extra/random permssion calls all
> over the place. They don't really get us anything afaict. Why can't you
> plumb into the existing set_memory_*() family?

I have responded to your comments on your other email. Please read my
response there.

Thanks.

Madhavan