Re: [RFC PATCH v2 6/6] KVM: guest_memfd: selftest: Add test case for error_remove_page method

From: Sean Christopherson
Date: Thu Sep 21 2023 - 19:22:46 EST


On Thu, Sep 21, 2023, isaku.yamahata@xxxxxxxxx wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> This test case implements fault injection into guest memory by
> madvise(MADV_HWPOISON) for shared(conventional) memory region and
> KVM_GUEST_MEMORY_FAILURE for private gmem region. Once page is poisoned,
> free the poisoned page and try to run vcpu again to see a new zero page is
> assigned.

Thanks much for the test! I think for the initial merge it makes sense to leave
this out, mainly because I don't think we want a KVM-specific ioctl(). But I'll
definitely keep this around to do manual point testing.

> +#define BASE_DATA_SLOT 10
> +#define BASE_DATA_GPA ((uint64_t)(1ull << 32))
> +#define PER_CPU_DATA_SIZE ((uint64_t)(SZ_2M))
> +
> +enum ucall_syncs {
> + HWPOISON_SHARED,
> + HWPOISON_PRIVATE,
> +};
> +
> +static void guest_sync_shared(uint64_t gpa)

Probably guest_poison_{shared,private}(), or maybe just open code the GUEST_SYNC2()
calls. I added helpers in the other tests because the ucalls were a bit more
involved than passing the GPA.

However, I don't see any reason to do hypercalls and on-demand mapping/fallocate.
Just have two separate sub-tests, one for private and one for shared, each with
its own host code. I'm pretty sure the guest code can be the same, e.g. I believe
it would just boil down to:

static void guest_code(uint64_t gpa)
{
	uint64_t *addr = (void *)gpa;

	WRITE_ONCE(*addr, <some pattern>);

	/* Ask the host to poison the page. */
	GUEST_SYNC(EHWPOISON);

	/*
	 * Access the poisoned page. The host should see a SIGBUS or EHWPOISON
	 * and then truncate the page. After truncation, the page should be
	 * faulted back and read zeros, all before the read completes.
	 */
	GUEST_ASSERT_EQ(*(uint64_t *)gpa, 0);
	GUEST_DONE();
}
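
As a rough userspace illustration of only the "truncate, then fault back and
read zeros" step (no KVM and no poisoning involved), here is a minimal sketch.
It assumes a memfd-backed MAP_SHARED mapping is a fair stand-in for the shared
region, and punch_and_reread() is a made-up name, not a selftest helper:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only: write a pattern to a shared,
 * memfd-backed page, punch a hole over it, then read the page again.
 * The value read after the hole punch is returned via *after.
 * Returns 0 on success, -1 if any step fails.
 */
int punch_and_reread(uint64_t pattern, uint64_t *after)
{
	long page_size = sysconf(_SC_PAGESIZE);
	volatile uint64_t *addr;
	void *mem;
	int ret = -1;
	int fd;

	assert(page_size > 0);

	fd = memfd_create("punch-demo", 0);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, page_size))
		goto out_close;

	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mem == MAP_FAILED)
		goto out_close;

	addr = mem;
	*addr = pattern;
	if (*addr != pattern)
		goto out_unmap;

	/* Drop the backing page; the file size is left unchanged. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      0, page_size))
		goto out_unmap;

	/* This access faults in a fresh page from the file. */
	*after = *addr;
	ret = 0;

out_unmap:
	munmap(mem, page_size);
out_close:
	close(fd);
	return ret;
}
```

On a kernel with memfd and hole-punch support, the value read back should be
zero rather than the original pattern, which is the same property the guest
asserts on after the host truncates the poisoned page.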

> + if (uc.args[0] == HWPOISON_PRIVATE) {
> + int ret;
> +
> + inject_memory_failure(gmem_fd, gpa);
> + ret = _vcpu_run(vcpu);
> + TEST_ASSERT(ret == -1 && errno == EHWPOISON &&

Honestly, I'm kinda surprised the KVM code actually works :-)

> + run->exit_reason == KVM_EXIT_MEMORY_FAULT,
> + "exit_reason 0x%x",
> + run->exit_reason);
> + /* Discard the poisoned page and assign new page. */
> + vm_guest_mem_fallocate(vm, gpa, PAGE_SIZE, true);
> + } else {
> + uint8_t *hva = addr_gpa2hva(vm, gpa);
> + int r;
> +
> + r = madvise(hva, 8, MADV_HWPOISON);

Huh. TIL there's an MADV_HWPOISON. We've already talked about adding fbind(), so
adding an fadvise() seems like the obvious solution. Or maybe overload
fallocate() with a new flag? Regardless, I think we should add or extend a generic
fd-based syscall(), not throw in something KVM-specific.