Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions

From: Jerome Glisse
Date: Wed Mar 13 2019 - 15:33:37 EST


On Wed, Mar 13, 2019 at 07:16:51PM +0000, Christopher Lameter wrote:
> On Tue, 12 Mar 2019, Jerome Glisse wrote:
>
> > > > This has been discussed extensively already. GUP usage is now widespread in
> > > > multiple drivers, removing that would regress userspace, i.e. break existing
> > > > applications. We all know what the rules for that are.
>
> You are still misstating the issue. In RDMA land GUP is widely used for
> anonymous memory and memory based filesystems. *Not* for real filesystems.

Then why are there bug reports, such as the one pointed out in the
cover letter? It means someone is doing GUP on a filesystem. Moreover,
looking at the RDMA drivers, I do not see anything that checks that the
VA passed to GUP belongs to a vma that is not backed by a regular file.
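
To make the question concrete, here is a rough userspace sketch of the
kind of check I am looking for and cannot find. It is only a model: the
struct, the enum and the function name are all hypothetical and none of
them exist in the kernel. It assumes the vmas are sorted by address and
do not overlap, and it treats tmpfs-like backing as acceptable, which is
the distinction being argued about in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_backing {
	MODEL_ANON,      /* anonymous memory */
	MODEL_SHMEM,     /* memory based filesystem */
	MODEL_REG_FILE,  /* file on a regular, disk backed filesystem */
};

struct model_vma {
	unsigned long start;         /* inclusive */
	unsigned long end;           /* exclusive */
	enum model_backing backing;
};

/*
 * Return true only if every byte of [addr, addr + len) is covered by a
 * vma that is not backed by a regular file. vmas[] must be sorted by
 * address and must not overlap.
 */
static bool model_range_pinnable(const struct model_vma *vmas, size_t nr,
				 unsigned long addr, unsigned long len)
{
	unsigned long end = addr + len;
	size_t i;

	assert(vmas != NULL || nr == 0);
	if (len == 0 || end < addr)      /* empty range or overflow */
		return false;

	for (i = 0; i < nr && addr < end; i++) {
		if (vmas[i].end <= addr)
			continue;        /* vma lies entirely below the cursor */
		if (vmas[i].start > addr)
			return false;    /* unmapped hole inside the range */
		if (vmas[i].backing == MODEL_REG_FILE)
			return false;    /* regular file: refuse the range */
		addr = vmas[i].end;      /* advance past this vma */
	}
	return addr >= end;              /* false if we ran out of vmas */
}
```

A range that spans an anonymous vma and a shmem vma passes, while any
range that touches a regular file vma or an unmapped hole is refused.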

>
> > > Because someone was able to get away with weird ways of abusing the system
> > > it is not an argument that we should continue to allow such things. In fact
> > > we have repeatedly ensured that the kernel works reliably by improving the
> > > kernel so that a proper failure is occurring.
> >
> > Drivers doing GUP on an mmap of a regular file is something that seems to
> > already have widespread users (in the RDMA devices at least). So they
> > are active users and they were never told that what they are doing
> > was illegal.
>
> Not true. Again please differentiate the use cases between regular
> filesystem and anonymous mappings.

Again, where does the bug come from? Where in RDMA is the check that
the VA belongs to a vma that is not backed by a file?

>
> > > Well swapout cannot occur if the page is pinned and those pages are also
> > > often mlocked.
> >
> > I would need to check the swapout code but i believe the write to disk
> > can happen before the pin checks happens. I believe the event flow is:
> > map read only, allocate swap, write to disk, try to free page which
> > checks for pin. So that you could write stale data to disk and the GUP
> > going away before you perform the pin checks.
>
> Allocate swap is a separate step that associates a swap entry to an
> anonymous page.
>
> > There are other things to take into account that need proper page
> > dirtying, like soft dirtiness for instance.
>
> RDMA mapped pages are all dirty all the time.

The point is that neither the pte dirty bit nor the soft dirty bit may
be accurate, because a GUP user does not update those bits. A GUP user
therefore needs to call set_page_dirty() or similar to properly report
page dirtiness.
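
As a rough illustration of that rule, here is a small userspace model.
It is not kernel code: struct model_page and model_unpin_pages() are
hypothetical stand-ins. In the kernel the equivalent step would be
something like calling set_page_dirty_lock() on each page the device
wrote to before dropping the reference taken by GUP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not kernel code. */
struct model_page {
	int  pins;   /* references held by the GUP caller */
	bool dirty;  /* what writeback and dirty tracking get to see */
};

/*
 * Drop one pin on each page. Device writes bypass the page tables, so
 * no dirty bit was set on the caller's behalf: if the device may have
 * written to the pages, the caller has to mark them dirty itself
 * before letting go of them.
 */
static void model_unpin_pages(struct model_page **pages, size_t nr,
			      bool device_wrote)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		assert(pages[i] != NULL && pages[i]->pins > 0);
		if (device_wrote)
			pages[i]->dirty = true;  /* report dirtiness first */
		pages[i]->pins--;                /* then drop the pin */
	}
}
```

The ordering matters in the model as it does in the argument above: the
page is reported dirty while the pin is still held, not after.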

> > Well RDMA driver maintainer seems to report that this has been a valid
> > and working workload for their users.
>
> No they don't.
>
> Could you please get up to date on the discussion before posting?

Again, why is there a bug report? Where is the code in RDMA that checks
that the VA does not belong to a vma that is backed by a file?

As much as I would like this use case not to exist, I fear it does, and
it has been upstream for a while. This also very much applies to
O_DIRECT, whether you like it or not.

Cheers,
Jérôme