Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

From: Andrew Morton
Date: Mon Apr 11 2005 - 19:17:59 EST


Roland Dreier <roland@xxxxxxxxxxx> wrote:
>
> Troy> Do we even need the mlock in userspace then?
>
> Yes, because the kernel may go through and unmap pages from userspace
> while trying to swap. Since we have the page locked in the kernel,
> the physical page won't go anywhere, but userspace might end up with a
> different page mapped at the same virtual address.

That shouldn't happen. If get_user_pages() has elevated the refcount on a
page then the following can happen:

- The VM may decide to add the page to swapcache (if it's not mmapped
from a file).

- Once the page is backed by either swapcache or a (mmapped) file, the VM
may decide to unmap the application's pte's. A later minor fault by the
app will cause the same physical page to be remapped.

- The VM may decide to try to write the page to its backing file or swap.
If it does, the page is still in core, but is now clean.

- Once all pte's are unmapped and the page is clean, the VM may decide to
try to reclaim the page. The VM will then see the elevated refcount and
will bail out, leaving the page in core.

- If your code was doing a read-from-disk (modifying memory), then your
code should run set_page_dirty() or set_page_dirty_lock() against the
page before dropping the refcount which get_user_pages() added. Once the
page is dirty, the VM can't reclaim it until it has been written to
swap or its mmapped backing file.

IOW: while the page has an elevated refcount from get_user_pages(), that
physical page is 100% pinned. Once you've done the
set_page_dirty+put_page(), the page is again under control of the VM.
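
To make the sequence above concrete, here is a toy userspace model of it.
This is not kernel code: struct toy_page and every toy_*() function are
hypothetical names invented for illustration, and the reclaim check is a
deliberate simplification of what the VM does. It only assumes that a
refcount of 1 means "held by the pagecache/swapcache alone" and that
anything above that is an extra pin such as the one get_user_pages() adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names here are hypothetical, not kernel APIs. */
struct toy_page {
	int  refcount;  /* 1 = held only by swapcache/pagecache (assumption) */
	int  mapcount;  /* number of pte's currently mapping the page */
	bool dirty;     /* contents differ from the backing store */
	bool in_core;   /* physical page still resident */
};

/* Stands in for the reference get_user_pages() takes. */
void toy_pin(struct toy_page *p)
{
	p->refcount++;
}

/* Stands in for set_page_dirty() (if we wrote to memory) + put_page(). */
void toy_unpin(struct toy_page *p, bool wrote_to_page)
{
	if (wrote_to_page)
		p->dirty = true;
	p->refcount--;
}

/* The VM unmaps the application's pte's. */
void toy_unmap(struct toy_page *p)
{
	p->mapcount = 0;
}

/* A later minor fault remaps the same physical page if it is resident. */
bool toy_minor_fault(struct toy_page *p)
{
	if (!p->in_core)
		return false;
	p->mapcount = 1;
	return true;
}

/* The VM writes the page to swap or its backing file; it is now clean. */
void toy_writeback(struct toy_page *p)
{
	p->dirty = false;
}

/* Reclaim succeeds only for an unmapped, clean, unpinned page. */
bool toy_try_reclaim(struct toy_page *p)
{
	if (p->mapcount > 0 || p->dirty)
		return false;
	if (p->refcount > 1)
		return false;   /* elevated refcount: bail out */
	p->in_core = false;
	return true;
}
```

In this model, a pinned page that has been unmapped and written back
still fails toy_try_reclaim(), and a minor fault gets the same page
back. After toy_unpin() with the dirty flag set, reclaim keeps failing
until toy_writeback() has run, and only then succeeds.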

There should be no need to run mlock() from userspace.
