Re: [PATCH v1] drm/ttm: Refcount allocated tail pages

From: Christian König
Date: Thu Aug 18 2022 - 05:41:29 EST


Am 18.08.22 um 01:13 schrieb Dmitry Osipenko:
On 8/18/22 01:57, Dmitry Osipenko wrote:
On 8/15/22 18:54, Dmitry Osipenko wrote:
On 8/15/22 17:57, Dmitry Osipenko wrote:
On 8/15/22 16:53, Christian König wrote:
Am 15.08.22 um 15:45 schrieb Dmitry Osipenko:
[SNIP]
Well that comment sounds like KVM is doing the right thing, so I'm
wondering what exactly is going on here.
KVM actually doesn't hold the page reference, it takes the temporal
reference during page fault and then drops the reference once page is
mapped, IIUC. Is it still illegal for TTM? Or there is a possibility for
a race condition here?

Well the question is why does KVM grab the page reference in the first
place?

If that is to prevent the mapping from changing then yes that's illegal
and won't work. It can always happen that you grab the address, solve
the fault and then immediately fault again because the address you just
grabbed is invalidated.

If it's for some other reason than we should probably investigate if we
shouldn't stop doing this.
CC: +Paolo Bonzini who introduced this code

commit add6a0cd1c5ba51b201e1361b05a5df817083618
Author: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Date: Tue Jun 7 17:51:18 2016 +0200

KVM: MMU: try to fix up page faults before giving up

The vGPU folks would like to trap the first access to a BAR by setting
vm_ops on the VMAs produced by mmap-ing a VFIO device. The fault
handler
then can use remap_pfn_range to place some non-reserved pages in the
VMA.

This kind of VM_PFNMAP mapping is not handled by KVM, but follow_pfn
and fixup_user_fault together help supporting it. The patch also
supports
VM_MIXEDMAP vmas where the pfns are not reserved and thus subject to
reference counting.

@Paolo,
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F73e5ed8d-0d25-7d44-8fa2-e1d61b1f5a04%40amd.com%2FT%2F%23m7647ce5f8c4749599d2c6bc15a2b45f8d8cf8154&amp;data=05%7C01%7Cchristian.koenig%40amd.com%7Cecb0f0eb6c2d43afa15b08da80a625ff%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637963748360952327%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=52Dvisa7p%2BckmBxMvDJFScGSNy9JRPDdnPK05C%2F6n7A%3D&amp;reserved=0

If we need to bump the refcount only for VM_MIXEDMAP and not for
VM_PFNMAP, then perhaps we could add a flag for that to the kvm_main
code that will denote to kvm_release_page_clean whether it needs to put
the page?
The other variant that kind of works is to mark TTM pages reserved using
SetPageReserved/ClearPageReserved, telling KVM not to mess with the page
struct. But the potential consequences of doing this are unclear to me.

Christian, do you think we can do it?
Although, no. It also doesn't work with KVM without additional changes
to KVM.

Well my fundamental problem is that I can't fit together why KVM is grabing a page reference in the first place.

See the idea of the page reference is that you have one reference is that you count the reference so that the memory is not reused while you access it, e.g. for I/O or mapping it into different address spaces etc...

But none of those use cases seem to apply to KVM. If I'm not totally mistaken in KVM you want to make sure that the address space mapping, e.g. the translation between virtual and physical address, don't change while you handle it, but grabbing a page reference is the completely wrong approach for that.

Regards,
Christian.