RE: [PATCH] selftests: kvm: fix overlapping addresses in memslot_perf_test

From: Duan, Zhenzhong
Date: Thu Jun 03 2021 - 23:35:54 EST




> -----Original Message-----
> From: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> Sent: Thursday, June 3, 2021 9:06 PM
> To: Andrew Jones <drjones@xxxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; Duan, Zhenzhong <zhenzhong.duan@xxxxxxxxx>
> Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in
> memslot_perf_test
>
> On 03.06.2021 14:37, Andrew Jones wrote:
> > On Thu, Jun 03, 2021 at 05:26:33AM +0000, Duan, Zhenzhong wrote:
> >>> -----Original Message-----
> >>> From: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> >>> Sent: Thursday, June 3, 2021 7:07 AM
> >>> To: Paolo Bonzini <pbonzini@xxxxxxxxxx>; Duan, Zhenzhong
> >>> <zhenzhong.duan@xxxxxxxxx>
> >>> Cc: linux-kernel@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; Andrew Jones
> >>> <drjones@xxxxxxxxxx>
> >>> Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in
> >>> memslot_perf_test
> >>>
> >>> On 30.05.2021 01:13, Maciej S. Szmigiero wrote:
> >>>> On 29.05.2021 12:20, Paolo Bonzini wrote:
> >>>>> On 28/05/21 21:51, Maciej S. Szmigiero wrote:
> >>>>>> On 28.05.2021 21:11, Paolo Bonzini wrote:
> >>>>>>> The memory that is allocated in vm_create is already mapped
> >>>>>>> close to GPA 0, because test_execute passes the requested memory
> >>>>>>> to prepare_vm.  This causes overlapping memory regions and the
> >>>>>>> test crashes.  For simplicity just move MEM_GPA higher.
> >>>>>>>
> >>>>>>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >>>>>>
> >>>>>> I am not sure that I understand the issue correctly, is
> >>>>>> vm_create_default() already reserving low GPAs (around
> >>>>>> 0x10000000) on some arches or run environments?
> >>>>>
> >>>>> It maps the number of pages you pass in the second argument, see
> >>>>> vm_create.
> >>>>>
> >>>>>    if (phy_pages != 0)
> >>>>>      vm_userspace_mem_region_add(vm,
> VM_MEM_SRC_ANONYMOUS,
> >>>>>                                  0, 0, phy_pages, 0);
> >>>>>
> >>>>> In this case:
> >>>>>
> >>>>>    data->vm = vm_create_default(VCPU_ID, mempages, guest_code);
> >>>>>
> >>>>> called here:
> >>>>>
> >>>>>    if (!prepare_vm(data, nslots, maxslots, tdata->guest_code,
> >>>>>                    mem_size, slot_runtime)) {
> >>>>>
> >>>>> where mempages is mem_size, which is declared as:
> >>>>>
> >>>>>          uint64_t mem_size = tdata->mem_size ? : MEM_SIZE_PAGES;
> >>>>>
> >>>>> but actually a better fix is just to pass a small fixed value (e.g.
> >>>>> 1024) to vm_create_default, since all other regions are added by
> >>>>> hand
> >>>>
> >>>> Yes, but the argument that is passed to vm_create_default()
> >>>> (mem_size in the case of the test) is not passed as phy_pages to
> vm_create().
> >>>> Rather, vm_create_with_vcpus() calculates some upper bound of extra
> >>>> memory that is needed to cover that much guest memory (including
> >>>> for its page tables).
> >>>>
> >>>> The biggest possible mem_size from memslot_perf_test is 512 MiB + 1
> >>>> page, according to my calculations this results in phy_pages of
> >>>> 1029
> >>>> (~4 MiB) in the x86-64 case and around 1540 (~6 MiB) in the s390x
> >>>> case (here I am not sure about the exact number, since s390x has
> >>>> some additional alignment requirements).
> >>>>
> >>>> Both values are well below 256 MiB (0x10000000UL), so I was
> >>>> wondering what kind of circumstances can make these allocations
> >>>> collide (maybe I am missing something in my analysis).
> >>>
> >>> I see now that there has been a patch merged last week called
> >>> "selftests: kvm: make allocation of extra memory take effect" by
> >>> Zhenzhong that now allocates also the whole memory size passed to
> >>> vm_create_default() (instead of just page tables for that much memory).
> >>>
> >>> The commit message of this patch says that "perf_test_util and
> >>> kvm_page_table_test use it to alloc extra memory currently", however
> >>> both kvm_page_table_test and lib/perf_test_util framework explicitly
> >>> add the required memory allocation by doing a
> >>> vm_userspace_mem_region_add() call for the same memory size that
> they pass to vm_create_default().
> >>>
> >>> So now they allocate this memory twice.
> >>>
> >>> @Zhenzhong: did you notice improper operation of either
> >>> kvm_page_table_test or perf_test_util-based tests
> >>> (demand_paging_test, dirty_log_perf_test,
> >>> memslot_modification_stress_test) before your patch?
> >> No
> >>
> >>>
> >>> They seem to work fine for me without the patch (and I guess other
> >>> people would have noticed earlier, too, if they were broken).
> >>>
> >>> After this patch not only these tests allocate their memory twice
> >>> but it is harder to make vm_create_default() allocate the right
> >>> amount of memory for the page tables in cases where the test needs
> >>> to explicitly use
> >>> vm_userspace_mem_region_add() for its allocations (because it wants
> >>> the allocation placed at a specific GPA or in a specific memslot).
> >>>
> >>> One has to basically open-code the page table size calculations from
> >>> vm_create_with_vcpus() in the particular test then, taking also into
> >>> account that vm_create_with_vcpus() will not only allocate the
> >>> passed memory size (calculated page tables size) but also behave
> >>> like it was allocating space for page tables for these page tables
> >>> (even though the passed memory size itself is supposed to cover them).
> >> Looks we have different understanding to the parameter
> extra_mem_pages of vm_create_default().
> >>
> >> In your usage, extra_mem_pages is only used for page table
> >> calculations, real extra memory allocation happens in the extra call of
> vm_userspace_mem_region_add().
> >
> > Yes, this is the meaning that kvm selftests has always had for
> > extra_mem_pages of vm_create_default(). If we'd rather have a
> > different meaning, that's fine, but we need to change all the callers
> > of the function as well.
>
> If we change the meaning of extra_mem_pages (keep the patch) it would be
> good to still have an additional parameter to vm_create_with_vcpus() for
> tests that have to allocate their memory on their own via
> vm_userspace_mem_region_add() for vm_create_with_vcpus() to just
> allocate the page tables for these manual allocations.
> Or a helper to calculate the required extra_mem_pages for them.
>
> > If we decide to leave vm_create_default() the way it was by reverting
> > this patch, then maybe we should consider renaming the parameter
> > and/or documenting the function.
>
> Adding a descriptive comment (and possibly renaming the parameter) seems
> like a much simpler solution to me that adapting these tests (and possibly
> adding the parameter or helper described above for them).

Agree, I prefer the simpler way.

I also think of an idea for custom slot0 memory, keep extra_mem_pages the original way, adding a global slot0_pages for custom slot0 memory. Maybe not a good choice as it's not thread safe, just for discussion. That is:
1. revert "selftests: kvm: make allocation of extra memory take effect"
2. add below patch
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -280,6 +280,9 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
void *guest_code);

+struct kvm_vm *vm_create_slot0(uint32_t vcpuid, uint64_t slot0_mem_pages,
+ uint64_t extra_mem_pages, void *guest_code);
+
/* Same as vm_create_default, but can be used for more than one vcpu */
struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_mem_pages,
uint32_t num_percpu_pages, void *guest_code,
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 63418df921f0..56b1225865d5 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -196,6 +196,7 @@ const struct vm_guest_mode_params vm_guest_mode_params[] = {
_Static_assert(sizeof(vm_guest_mode_params)/sizeof(struct vm_guest_mode_params) == NUM_VM_MODES,
"Missing new mode params?");

+uint64_t slot0_pages = DEFAULT_GUEST_PHY_PAGES;
/*
* VM Create
*
@@ -319,8 +320,8 @@ struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
* than N/x*2.
*/
uint64_t vcpu_pages = (DEFAULT_STACK_PGS + num_percpu_pages) * nr_vcpus;
- uint64_t extra_pg_pages = (extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
- uint64_t pages = DEFAULT_GUEST_PHY_PAGES + vcpu_pages + extra_pg_pages;
+ uint64_t extra_pg_pages = (slot0_pages + extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
+ uint64_t pages = slot0_pages + vcpu_pages + extra_pg_pages;
struct kvm_vm *vm;
int i;

@@ -358,9 +359,18 @@ struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_me
num_percpu_pages, guest_code, vcpuids);
}

+struct kvm_vm *vm_create_slot0(uint32_t vcpuid, uint64_t slot0_mem_pages,
+ uint64_t extra_mem_pages, void *guest_code)
+{
+ slot0_pages = slot0_mem_pages;
+ return vm_create_default_with_vcpus(1, extra_mem_pages, 0, guest_code,
+ (uint32_t []){ vcpuid });
+}
+
struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
void *guest_code)
{
+ slot0_pages = DEFAULT_GUEST_PHY_PAGES;
return vm_create_default_with_vcpus(1, extra_mem_pages, 0, guest_code,
(uint32_t []){ vcpuid });
}
@@ -626,6 +636,9 @@ void kvm_vm_free(struct kvm_vm *vmp)

/* Free the structure describing the VM. */
free(vmp);
+
+ /* Restore slot0 memory to default size for next VM creation */
+ slot0_pages = DEFAULT_GUEST_PHY_PAGES;
}

/*

Thanks
Zhenzhong