Re: [PATCH v4 12/12] KVM: Expose KVM_MEM_PRIVATE

From: Maciej S. Szmigiero
Date: Wed Feb 23 2022 - 13:33:10 EST


On 23.02.2022 13:00, Chao Peng wrote:
On Tue, Feb 22, 2022 at 02:16:46AM +0100, Maciej S. Szmigiero wrote:
On 17.02.2022 14:45, Chao Peng wrote:
On Tue, Jan 25, 2022 at 09:20:39PM +0100, Maciej S. Szmigiero wrote:
On 18.01.2022 14:21, Chao Peng wrote:
KVM_MEM_PRIVATE is not exposed by default but architecture code can turn
on it by implementing kvm_arch_private_memory_supported().

Also private memslot cannot be movable and the same file+offset can not
be mapped into different GFNs.

Signed-off-by: Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx>
Signed-off-by: Chao Peng <chao.p.peng@xxxxxxxxxxxxxxx>
---
(..)
static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, int id,
- gfn_t start, gfn_t end)
+ struct file *file,
+ gfn_t start, gfn_t end,
+ loff_t start_off, loff_t end_off)
{
struct kvm_memslot_iter iter;
+ struct kvm_memory_slot *slot;
+ struct inode *inode;
+ int bkt;
kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) {
if (iter.slot->id != id)
return true;
}
+ /* Disallow mapping the same file+offset into multiple gfns. */
+ if (file) {
+ inode = file_inode(file);
+ kvm_for_each_memslot(slot, bkt, slots) {
+ if (slot->private_file &&
+ file_inode(slot->private_file) == inode &&
+ !(end_off <= slot->private_offset ||
+ start_off >= slot->private_offset
+ + (slot->npages >> PAGE_SHIFT)))
+ return true;
+ }
+ }

That's a linear scan of all memslots on each CREATE (and MOVE) operation
with a fd - we just spent more than a year rewriting similar linear scans
into more efficient operations in KVM.

(..)
So linear scan is used before I can find a better way.

Another option would be to simply not check for overlap at add or move
time, declare such configuration undefined behavior under KVM API and
make sure in MMU notifiers that nothing bad happens to the host kernel
if it turns out somebody actually set up a VM this way (it could be
inefficient in this case, since it's not supposed to ever happen
unless there is a bug somewhere in the userspace part).

Specific to TDX case, SEAMMODULE will fail the overlapping case and then
KVM prints a message to the kernel log. It will not cause any other side
effect, it does look weird however. Yes warn that in the API document
can help to some extent.

So for the functionality you are adding this code for (TDX) this scan
isn't necessary and the overlapping case (not supported anyway) is safely
handled by the hardware (or firmware)?
Then I would simply remove the scan and, maybe, add a comment instead
that the overlap check is done by the hardware.

By the way, if a kernel log message could be triggered by (misbehaving)
userspace then it should be rate limited (if it isn't already).

Thanks,
Chao

Thanks,
Maciej