Re: [PATCH] vfio/type1: Unpin zero pages

From: David Hildenbrand
Date: Wed Sep 07 2022 - 05:00:36 EST


On 07.09.22 01:30, Jason Gunthorpe wrote:
On Fri, Sep 02, 2022 at 10:32:01AM +0200, David Hildenbrand wrote:

So I wonder instead of continuing to fix trickiness around the zero
page whether it is a better idea to pursue allocating a normal
page from the beginning for pinned RO mappings?

That's precisely what I am working. For example, that's required to get
rid of FOLL_FORCE|FOLL_WRITE for taking a R/O pin as done by RDMA:

And all these issues are exactly why RDMA uses FOLL_FORCE and it is,
IMHO, a simple bug that VFIO does not.

I consider the BUG that our longterm page pinning behaves the way it currently does, not that we're not using the FOLL_FORCE flag here.

But it doesn't matter, I'm working on fixing/cleaning it up.


I do wonder if that's a real issue, though. One approach would be to
warn the VFIO users and allow for slightly exceeding the MEMLOCK limit
for a while. Of course, that only works if we assume that such pinned
zeropages are only extremely rarely longterm-pinned for a single VM
instance by VFIO.

I'm confused, doesn't vfio increment the memlock for every page of VA
it pins? Why would it matter if the page was COW'd or not? It is
already accounted for today as though it was a unique page.

IOW if we add FOLL_FORCE it won't change the value of the memlock.

I only briefly skimmed over the code Alex might be able to provide more details and correct me if I'm wrong:

vfio_pin_pages_remote() contains a comment:

"Reserved pages aren't counted against the user, externally pinned pages are already counted against the user."

is_invalid_reserved_pfn() should return "true" for the shared zeropage and prevent us from accounting it via vfio_lock_acct(). Otherwise, vfio_find_vpfn() seems to be in place to avoid double-accounting pages.

--
Thanks,

David / dhildenb