Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

From: John Hubbard
Date: Fri Sep 18 2020 - 17:06:26 EST


On 9/18/20 1:40 PM, Peter Xu wrote:
> On Fri, Sep 18, 2020 at 02:32:40PM -0300, Jason Gunthorpe wrote:
>> On Fri, Sep 18, 2020 at 12:40:32PM -0400, Peter Xu wrote:
>>>
>>> Firstly in the draft patch mm->has_pinned is introduced and it's written to 1
>>> as long as FOLL_GUP is called once. It's never reset after set.
>>
>> Worth thinking about also adding FOLL_LONGTERM here, at least as long
>> as it is not a counter. That further limits the impact.
>
> But theoretically we should also trigger COW here for pages even with PIN &&
> !LONGTERM, am I right? Assuming that FOLL_PIN is already a corner case.


This note, plus Linus' comment about "I'm a normal process, I've never
done any special rdma page pinning", has me a little worried. Because
page_maybe_dma_pinned() is counting both short- and long-term pins,
actually. And that includes O_DIRECT callers.

O_DIRECT pins are short-term, and RDMA systems are long-term (and should
be setting FOLL_LONGTERM). But there's no way right now to discern
between them, once the initial pin_user_pages*() call is complete. All
we can do today is to count the number of FOLL_PIN calls, not the number
of FOLL_PIN | FOLL_LONGTERM calls.
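
To make that concrete, here is a toy userspace model (not kernel code) of
why the two kinds of pin become indistinguishable. It assumes the
small-page scheme in which each FOLL_PIN adds GUP_PIN_COUNTING_BIAS
(1U << 10) to the page refcount and page_maybe_dma_pinned() compares the
refcount against that bias. The toy_* names and the flag values are made
up for illustration only.

```c
/*
 * Userspace toy model, NOT kernel code. struct toy_page and the toy_*
 * helpers are hypothetical names; only the bias value is meant to
 * mirror the kernel's GUP_PIN_COUNTING_BIAS (1U << 10).
 */
#include <assert.h>
#include <stdbool.h>

#define TOY_FOLL_PIN       0x1u  /* illustrative flag values only */
#define TOY_FOLL_LONGTERM  0x2u
#define TOY_PIN_BIAS       (1u << 10)

struct toy_page {
	unsigned int refcount;	/* ordinary references and pins share this */
};

/* Pin a page: the flags matter at call time, but nothing records them. */
static void toy_pin(struct toy_page *page, unsigned int flags)
{
	(void)flags;	/* FOLL_LONGTERM leaves no trace on the page */
	page->refcount += TOY_PIN_BIAS;
}

/* Drop one pin, short-term or long-term; the page cannot tell which. */
static void toy_unpin(struct toy_page *page)
{
	assert(page->refcount >= TOY_PIN_BIAS);
	page->refcount -= TOY_PIN_BIAS;
}

/* "Maybe" because enough ordinary references would also reach the bias. */
static bool toy_maybe_dma_pinned(const struct toy_page *page)
{
	return page->refcount >= TOY_PIN_BIAS;
}
```

In this model, a page pinned with TOY_FOLL_PIN alone and a page pinned
with TOY_FOLL_PIN | TOY_FOLL_LONGTERM end up with identical refcounts,
so toy_maybe_dma_pinned() reports both the same way.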

The reason it's that way is that writeback and such can experience
problems regardless of the duration of the pin. There are ideas about
how to deal with the pins and the filesystem (layout leases...), but
there is still disagreement, which is why there are basically no
page_maybe_dma_pinned() callers yet.

Although I think we're getting closer to using it. There was a recent
attempt at using this stuff, from Chris Wilson. [1]


[1] https://lore.kernel.org/intel-gfx/20200624191417.16735-1-chris%40chris-wilson.co.uk/


thanks,
--
John Hubbard
NVIDIA