Re: RDMA memory registration (was: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation)

From: Libor Michalek
Date: Fri Apr 29 2005 - 12:25:35 EST


On Fri, Apr 29, 2005 at 09:45:50AM -0700, Roland Dreier wrote:
> Is there anything wrong with the following plan?
>
> 1) For memory registration, use get_user_pages() in the kernel. Use
> locked_vm and RLIMIT_MEMLOCK to limit the amount of memory pinned
> by a given process. One disadvantage of this is that the
> accounting will overestimate the amount of pinned memory if a
> process pins the same page twice, but this doesn't seem that bad to
> me -- it errs on the side of safety.

I think the overestimate will be fine in practice. If a process is
locking a lot of memory it will most likely be in big chunks, so not
much page overlap there. If the process is locking lots of tiny buffers
with lots of page overlap, the total locked amount will most likely be
small. Although it is odd that you could end up with a total locked
amount larger then the number of physical pages in the system...

> 2) For fork() support:
>
> a) Extend mprotect() with PROT_DONTCOPY so processes can avoid
> copy-on-write problems.
>
> b) (maybe someday?) Add a VM_ALWAYSCOPY flag and extend mprotect()
> with PROT_ALWAYSCOPY so processes can mark pages to be
> pre-copied into child processes, to handle the case where only
> half a page is registered.
>
> I believe this puts the code that must be trusted into the kernel and
> gives userspace primitives that let apps handle the rest.

I'm assuming that for libibverbs memory registration you plan on hiding
the mprotect in the library? Without reference counting at the kernel
level this could yield unexpected results in a perfectly legitimate app.

For example if the app is managing a buffer it will pass to another
device, but also want's to move data in/out with RDMA hardware, the user
marks it themselves with DONTCOPY, registers with libibverbs, performs
IO, unregisters with libibverbs. At this point the user expects the buffer
to have DONTCOPY set, but it does not because of the unregister... Not
that it's likely, but it's a valid thing to do. However, since I don't
have a better suggestion, I'm in favour of using mprotect as you outlined.


-Libor
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/