RE: [RFC RFT PATCH 0/4] Handle set_memory_XXcrypted() errors in hyperv

From: Michael Kelley
Date: Thu Mar 07 2024 - 12:11:56 EST


From: Michael Kelley <mhklinux@xxxxxxxxxxx> Sent: Friday, March 1, 2024 11:00 AM
> >
> > IMPORTANT NOTE:
> > I don't have a setup to test tdx hyperv changes. These changes are compile
> > tested only. Previously Michael Kelley suggested some folks at MS might be
> > able to help with this.
>
> Thanks for doing these changes. Overall they look pretty good,
> modulo a few comments. The "decrypted" flag in the vmbus_gpadl
> structure is a good way to keep track of the encryption status of
> the associated memory.
>
> The memory passed to the gpadl (Guest Physical Address Descriptor
> List) functions may be allocated and freed directly by the driver, as in
> the netvsc and UIO cases. You've handled that case. But memory
> may also be allocated by vmbus_alloc_ring() and freed by
> vmbus_free_ring(). Your patch set needs an additional change
> to check the "decrypted" flag in vmbus_free_ring().
>
> In reviewing the code, I also see some unrelated memory freeing
> issues in error paths. They are outside the scope of your changes.
> I'll make a note of these for future fixing.
>
> For testing, I'll do two things:
>
> 1) Verify that the non-error paths still work correctly with the
> changes. That should be relatively straightforward as the
> changes are pretty much confined to the error paths.
>
> 2) Hack set_memory_encrypted() to always fail. I hope Linux
> still boots in that case, but just leaks some memory. Then if
> I unbind a Hyper-V synthetic device, that should exercise the
> path where set_memory_encrypted() is called. Failures
> should be handled cleanly, albeit while leaking the memory.
>
> I should be able to test in a normal VM, a TDX VM, and an
> SEV-SNP VM.
>

Rick --

Using your patches plus the changes in my comments, I've
done most of the testing described above. The normal
paths work, and when I hack set_memory_encrypted()
to fail, the error paths correctly do not free the memory.
I checked both the ring buffer memory and the additional
vmalloc memory allocated by the netvsc driver and the uio
driver. The memory status can be checked after the fact
via /proc/vmallocinfo and /proc/buddyinfo since these
are mostly large allocations. As expected, the drivers
output their own error messages after the failures to
tear down the GPADLs.
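
For illustration, the behavior checked above can be sketched in
userspace C. This is not the kernel code: struct demo_gpadl,
demo_set_memory_encrypted(), demo_teardown_gpadl(), demo_free_ring()
and the force_encrypt_failure switch are all hypothetical stand-ins.
Only the "decrypted" flag and the leak-on-failure behavior come from
the discussion above.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vmbus_gpadl; only the flag matters. */
struct demo_gpadl {
	void *buf;
	bool decrypted;	/* true while the host may still access buf */
};

/* Fault-injection switch, mimicking the hack that forces a failure. */
static bool force_encrypt_failure;

/* Stand-in for set_memory_encrypted(): 0 on success, nonzero on error. */
static int demo_set_memory_encrypted(void *addr)
{
	(void)addr;
	return force_encrypt_failure ? -1 : 0;
}

/* Teardown: try to re-encrypt and record the outcome in the flag. */
static int demo_teardown_gpadl(struct demo_gpadl *g)
{
	int ret = demo_set_memory_encrypted(g->buf);

	g->decrypted = (ret != 0);
	return ret;
}

/*
 * Free path: memory that is still decrypted must not go back to the
 * allocator, so it is deliberately leaked. Returns true if freed.
 */
static bool demo_free_ring(struct demo_gpadl *g)
{
	if (g->decrypted)
		return false;	/* leak rather than free shared memory */

	free(g->buf);
	g->buf = NULL;
	return true;
}
```

With force_encrypt_failure set, demo_teardown_gpadl() returns an
error and demo_free_ring() leaves the buffer allocated, which is the
userspace analogue of the allocations remaining visible after the
failure.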

I did not test the vmbus_disconnect() path since that
effectively kills the VM.

I tested in a normal VM, and in an SEV-SNP VM. I didn't
specifically test in a TDX VM, but given that Hyper-V CoCo
guests run with a paravisor, the guest sees the same thing
either way.

Michael