On Fri, Sep 02, 2022 at 10:32:01AM +0200, David Hildenbrand wrote:
So I wonder instead of continuing to fix trickiness around the zero
page whether it is a better idea to pursue allocating a normal
page from the beginning for pinned RO mappings?
That's precisely what I am working on. For example, that's required to get
rid of FOLL_FORCE|FOLL_WRITE for taking a R/O pin as done by RDMA:

And all these issues are exactly why RDMA uses FOLL_FORCE and it is,
IMHO, a simple bug that VFIO does not.

I do wonder if that's a real issue, though. One approach would be to
warn the VFIO users and allow for slightly exceeding the MEMLOCK limit
for a while. Of course, that only works if we assume that such pinned
zeropages are only extremely rarely longterm-pinned for a single VM
instance by VFIO.

I'm confused, doesn't vfio increment the memlock for every page of VA
it pins? Why would it matter if the page was COW'd or not? It is
already accounted for today as though it was a unique page.
IOW if we add FOLL_FORCE it won't change the value of the memlock.
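
To make the accounting argument concrete, here is a toy model of it. This is
not kernel code and is not taken from the VFIO source: struct pinned_page,
memlock_charge() and the two charge_*() helpers are made-up names, and the
only assumption encoded is the one stated above, that the charge is one unit
per page of VA pinned, whatever physical page ends up backing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one entry per page of VA that was pinned. */
struct pinned_page {
    bool is_shared_zeropage; /* true if the pin landed on the shared zero page */
};

/*
 * Charge one unit per page of VA pinned. The backing page is deliberately
 * ignored, which is the assumption under discussion, not verified behaviour.
 */
size_t memlock_charge(const struct pinned_page *pages, size_t npages)
{
    size_t charged = 0;

    for (size_t i = 0; i < npages; i++) {
        (void)pages[i].is_shared_zeropage; /* does not affect the charge */
        charged++;
    }
    return charged;
}

/* Four pages of VA, all still backed by the shared zero page (no COW). */
size_t charge_without_cow(void)
{
    struct pinned_page pages[4] = {
        { true }, { true }, { true }, { true },
    };
    return memlock_charge(pages, 4);
}

/* The same four pages of VA after each one got its own page (COW'd). */
size_t charge_with_cow(void)
{
    struct pinned_page pages[4] = {
        { false }, { false }, { false }, { false },
    };
    return memlock_charge(pages, 4);
}
```

Under that assumption both helpers return the same charge, so forcing the COW
up front changes what backs the VA but not the memlock value.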